Home
OT Extension, Incrementally Constructed

The purpose of this post is to give an intuitive description of Oblivious Transfer (OT) Extension, incrementally building up from a basic understanding of OT (and cryptographic hash functions and PRGs). The scheme covered here is the semi-honest version of the classic IKNP technique, which forms the basis of nearly every modern OT extension protocol.

Whom it's for: Generally anyone trying to understand and use OT Extension, from students/researchers who plan to engage in research in the area (or already know it and just want another perspective), to engineers who are building systems based on OT.

The description in this post is not succinct, proven formally, optimized for performance, or secure against active attacks. Instead, it intends to explain the mechanics of the IKNP technique, so that understanding its security, efficiency, implementation complexity, and hardening for malicious security through other literature becomes much more accessible.


Background

Oblivious Transfer (OT) is one of the most fundamental, yet powerful building blocks in secure distributed computation. Composing OTs cleverly can yield some really useful secure systems, from custom protocols like Threshold ECDSA and Private Set Intersection, all the way to general purpose MPC itself. There are simple Diffie-Hellman like techniques that can be used to construct OT, at roughly the same complexity as a simple key exchange. However, most applications consume large amounts of OT—distributed ECDSA signing for example can invoke over a thousand OT instances to sign a single message, and computing such a large volume of exponentiations may be too expensive in many situations.

We have it on good authority (thanks to Impagliazzo and Rudich) that OT will inherently require public key operations to instantiate. However, this statement doesn't preclude amortizing the cost of such public key operations across many instances. Think of hybrid encryption, in which public key encryption is used to establish a key for a symmetric key encryption scheme. Hybrid encryption pays off when encrypting long messages—one can achieve the effect of PKE for an l-bit message at the cost of encrypting a k-bit symmetric key with PKE and l>>k bits of data with SKE.

OT Extension is this incredible idea (originally conceived by Beaver) to "extend" a few OTs to arbitrarily many OT instances at low amortized cost, just as a hybrid encryption extends the effect of a single PKE to a large volume of data. While Beaver did give a nice construction to show that such OT extension is indeed feasible with just PRGs, a follow-up work by Ishai, Kilian, Nissim, and Petrank (IKNP) gave a more practical alternative to Beaver's construction, and forms the basis for most modern OT extension protocols. The one exception is Silent OT Extension which (due to Lawrence Roy, personal communication) can be interpreted as Beaver's approach instantiated with an LPN-based PRG, garbled with the one-hot approach of Heath and Kolesnikov. We focus on IKNP in this post as it's based on more general/conservative primitives, and more likely to underlie presently deployed systems.

The protocol description in the original IKNP paper is succinct and elegant, however for the longest time I did not feel like I had a handle on the intuition behind the technique. One can of course verify that the algorithm is correct (and even perhaps write a simulation based security proof), however the technique itself seemed like complete magic to me—completely disconnected from anything else known about OT. It was when I read a paper by Wolf and Wullschleger about how OT is "symmetric" (published a couple of years after IKNP) that the IKNP technique really clicked for me. The recipe for OT extension in this post is therefore a re-derivation of the IKNP result in bite-sized chunks, with the symmetricity of OT as the guiding principle.

The plan is to work our way up to OT Extension by asking and answering a sequence of increasingly complex but natural questions. Our starting point is k-bit OT, which allows a receiver to obtain exactly one out of two k-bit long messages that belong to a sender, while keeping the sender oblivious to which one was chosen. Our end goal is to construct l instances of k-bit OT for an arbitrarily large integer l, while consuming only k "actual" instances of k-bit OT—hence the extension from k to l instances. The only cryptographic objects we will need are hash functions and PRGs, which are substantially cheaper to use than the public key machinery necessary to create l OT instances naively from scratch.


Warmup: Sender Message Extension

A well-known gadget is increasing the length of the messages that the sender is able to transfer. We start with k-bit OT as below:


OT R: b∈{0,1} S: m0∈{0,1}k , m1∈{0,1}k b mb m0 , m1 Sorry, your browser does not support inline SVG. None of the diagrams in this blog will be rendered.

The sender S inputs messages m0, m1, each of which are k-bits long, and the receiver R obtains mb with no knowledge of m1-b. However, we wish to enable the sender to transfer l-bit messages, for some l>k. This is quite straightforward given a pseudorandom generator G:{0,1}k→{0,1}l, as the sender simply uses the OT to transfer seeds for G, and uses these seeds to encrypt m0, m1 and send them outside of the OT.

OT b sb s0 , s1 s0←{0,1}k , s1←{0,1}k G(s0)⊕m0 , G(s1)⊕m1 Sorry, your browser does not support inline SVG. None of the diagrams in this blog will be rendered.

R is able to compute G(sb) and obtain mb from the corresponding ciphertext. Security follows from the fact that since R knows nothing about s1-b, the opposite ciphertext is effectively a (pseudorandom) one-time pad encryption of m1-b. This simple mechanism enables S to send artbitrarily long messages at essentially no overhead.

The core idea behind the IKNP OT Extension protocol is to translate the benefits of this mechanism to the receiver's side; the ability to send arbitrarily long messages inexpensively will translate to the ability to execute arbitrarily many OT instances inexpensively.


Step 1: OT Role Reversal

The natural first step in translating the benefits of one role to the other, is to understand how roles can be "swapped". In particular, we begin by asking:

Given an invocation of 1-bit OT with fixed roles, how can we swap the roles of the parties?

This can be visualized as follows:


OT S: m0∈{0,1}, m1∈{0,1} R: b∈{0,1} Sorry, your browser does not support inline SVG. None of the diagrams in this blog will be rendered.

Observe that in the diagram above, the arrows are in the "wrong" place; S has the receiver's interface to OT, and R the sender's. The task at hand is to make use of the OT interface with this configuration, but still deliver mb obliviously to R while keeping m1-b hidden.

Wolf and Wullschleger gave an elegant solution to this problem, which is presented below:


OT Δ = m0 ⊕ m1 Δ p = r⊕bΔ r, r⊕b r ←{0,1} m0⊕ p mb = r⊕(m0⊕ p) = r⊕(m0⊕(r⊕bΔ)) = m0⊕bΔ Sorry, your browser does not support inline SVG. None of the diagrams in this blog will be rendered.

It might take a minute to walk through and understand the solution above; it's a cute information-theoretic trick, and the derivation at the bottom right should help see why it works. The idea is that S sends to R a ciphertext m0⊕p, where p=r⊕bΔ holds by virtue of the OT. Since R chose r itself, R can "XOR out" r from the ciphertext, leaving it with r⊕(m0⊕p) = r⊕(m0⊕(r⊕bΔ)) = m0⊕bΔ. This is very convenient, as Δ is by definition the value that relates the messages: m0⊕Δ=m1. Therefore, the value m0⊕bΔ obtained by the receiver is m1 when b=1, and simply m0 when b=0.

With this mechanism in place, we turn our attention to scaling the idea beyond 1-bit OT.


Step 2: k-bit OT Role Reversal

We first consider how to scale to larger messages. From here on, we will use "TO" as shorthand for OT with the roles reversed. As per this terminology, Step 1 established how to construct 1-bit TO from 1-bit OT. Constructing k-bit TO from k-bit OT seems challenging, so we'll start with a simpler task:

Given k invocations of 1-bit OT, how can we construct a single k-bit TO?

There's no magic to this; simply repeating the solution in Step 1 k times (once for each bit of the messages) with the same choice bit b in each repetition will solve the problem. In particular, if m0,j and m1,j represent the jth bits of m0 and m1 respectively, the jth OT is used to obtain mb,j as follows:


OTj Δj = m0,j ⊕ m1,j Δj pj = rj⊕bΔj rj, rj⊕b rj ←{0,1} m0,j⊕ pj mb,j = rj⊕(m0,j⊕ pj) = rj⊕(m0,j⊕(rj⊕bΔj)) = m0,j⊕bΔj Sorry, your browser does not support inline SVG. None of the diagrams in this blog will be rendered.

The diagram above is essentially identical to the previous step, with indices added in.


Step 3: OT→TO, for Many Correlated k-bit Messages

We now attempt to construct TOs for many messages at the same time, with only a fixed number of OTs—this is where the "extension" happens. However to make the task easier, we relax the problem a bit: we assume that all of the message pairs are correlated. In particular, we refer to a given a set of messages {mi 0,mi 1}i∈[l] as "correlated" if Δ=mi 0⊕ mi 1 is the same for each pair of messages. We therefore ask:

Given k invocations of l-bit OT, how can we construct l instances of k-bit TO, under the assumption that the sender's messages are correlated?

Once more, there is no real magic here. The fact that each message pair is correlated lets us reuse Δj in a straightforward way, and simply attaching the correct indices to the solution in the previous step will answer the above question.

First, imagine that we have kl invocations of 1-bit OT. This permits a very simple solution, i.e. for each message pair mi 0,mi 1 execute an OT for their jth bits mi 0,j,mi 1,j to obliviously deliver the message corresponding to the receiver's choice bit bi:


OTji Δi j = mi0,j ⊕ mi1,j Δij pij = rij⊕biΔij rij, rij⊕bi rij ←{0,1} mi0,j⊕ pij mib,j = rij⊕(mi0,j⊕ pij) Sorry, your browser does not support inline SVG. None of the diagrams in this blog will be rendered.

Once more, this is essentially a naive repetition of the solution from the previous step with the appropriate indices included. Observe that since the messages are correlated, it holds that for each j∈[k], Δi j is the same for every i∈[l]—i.e.∀ j∈[k], i∈[l]: Δij = Δj. Therefore, for each j∈[k], every OT i j instance shares the same choice bit Δj across every i∈[l], meaning that these l separate 1-bit OTs can be combined into a single l-bit OT instance OTj.

As an example when l=3, the jth 3-bit OT has the following structure:

OTj Δj p1j || p2j || p3j r1j||r2j||r3j ,   r1j⊕b1|| r2j⊕b2|| r3j⊕b3 m10,j⊕ p1j m1b,j = r1j⊕(m10,j⊕ p1j) m20,j⊕ p2j m2b,j = r2j⊕(m20,j⊕ p2j) m30,j⊕ p3j m3b,j = r3j⊕(m30,j⊕ p3j)

Step 4: Correlated k-bit Messages to Any k-bit Messages

We will adjust our OT abstraction to "correlated" OT to better capture the notion that we constructed in the previous step. In a correlated OT, one party supplies pairs of messages that share the same correlation, and the other receives one out of each pair obliviously. We can further simplify this description to say that one party supplies a set of k-bit messages {mi}i∈[l] and a k-bit correlation Δ, and the other receives {mi⊕biΔ}i∈[l]. This is visualized below.


ΔOT R: bi∈{0,1} S: Δ∈{0,1}k , mi∈{0,1}k (b1,...,bl) m1 ⊕ b1Δ m2 ⊕ b2Δ . . . ml ⊕ blΔ Δ , (m1,m2,...,ml) Sorry, your browser does not support inline SVG. None of the diagrams in this blog will be rendered.

The above visualization essentially corresponds to l instances of k-bit ΔOT. We know from Step 3 that l instances of k-bit ΔOT can be obtained from k instances of l-bit TO.

Now, we consider how to use these ΔOTs to construct our desired final object. In particular, we ask:

Given l instances of k-bit ΔOT, how can we construct l instances of k-bit OT?

Recall that in each instance of ΔOT, the sender has two messages (mi, mi⊕Δ) of which the receiver knows one (mi⊕biΔ). Now, imagine that all of the sender's inputs to ΔOT are sampled uniformly at the start of the protocol—perhaps we could use (mi, mi⊕Δ) as encryption keys for a message pair (μi0, μi1)? This is the right idea, provided that the encryption method accounts for the fact that key pairs are correlated. For instance, if the receiver learns both (μi0, μi1) of the ith OT instance through the course of a higher level protocol, this should not compromise an unrelated message μj1-bj of an independent jth OT instance. The encryption method must therefore protect Δ from the receiver in each instance, even when the receiver knows both messages for that instance. Observe that simple one-time pad encryption (i.e. message⊕key) does not suffice, as knowledge of the message also reveals the key, and allowing the receiver to learn both keys (mi, mi⊕Δ) for any i will be catastrophic—Δ will be compromised, and consequently every (mj, mj⊕Δ) even for i≠j, as the receiver already knows one of either mj or mj⊕Δ. We therefore use (H(mi), H(mi⊕Δ)) as encryption keys instead, so that in case the receiver obtains both (μi0i1), a suitably chosen cryptographic hash function H will keep Δ safe. The exact property that such a hash function must satisfy is correlation robustness, however for the purpose of this post think of H:{0,1}*↦{0,1}k as a random oracle.

For completeness, we give the answer to the question posed in this step below, assuming that the receiver has choice bits {bi}i∈[l] and the sender has message pairs {(μi0i1)}i∈[l].


ΔOT R: bi∈{0,1} S: Δ←{0,1}k , {mi←{0,1}k}i∈[l] (b1,...,bl) pb1 = m1 ⊕ b1Δ pb2 = m2 ⊕ b2Δ . . . pbl = ml ⊕ bl Δ Δ , (m1,m2,...,ml) μ10⊕H(m1) ,    μ11⊕H(m1⊕Δ) μ20⊕H(m2) ,    μ21⊕H(m2⊕Δ) . . . μl0⊕H(ml) ,    μl1⊕H(ml ⊕Δ) μib = (μib⊕H(mi ⊕bΔ) ) ⊕ H(pbi)   ∀ i∈[l] Sorry, your browser does not support inline SVG. None of the diagrams in this blog will be rendered.

Security follows from the fact that in the each instance, the message μb that the receiver can decrypt is encrypted by H(p=m⊕bΔ), and the opposite message μ1-b is effectively encrypted by H(p⊕Δ). Note that in the absence of any information about Δ, the value H(p⊕Δ) appears completely random, and therefore the message μ1-b that it encrypts stays hidden. Moreover, even if the receiver were to learn μ1-b from some higher level protocol, the ciphertext only reveals H(p⊕Δ) and not p⊕Δ directly—this ensures that Δ itself stays unexposed through the lifetime of the system.


Putting it All Together

We now have all the tools necessary to build OT extension. Starting with k instances of k-bit OT, we can construct l instances of k-bit TO for any arbitrary l. The sequence is:

  1. k instances of l-bit OT from k instances of k-bit OT via the Warmup construction.
  2. l instances of k-bit ΔTO from k instances of l-bit OT via the Step 3 construction.
  3. l instances of k-bit TO from l instances of k-bit ΔTO via the Step 4 construction.

We therefore have semi-honest OT extension from k to l instances using only cryptographic hash functions and PRGs, as per the IKNP technique. Measuring the concrete efficiency requires assembling all of the building blocks, but here's a rough accounting of the costs that each step adds after the initial k instances of k-bit OTs:

  1. The Warmup construction induces k pairs of PRG evaluations to derive k pairs of l bit pseudorandom strings, and transmits k pairs of l bit ciphertexts from R to S.
  2. The Step 3 construction transmits k sets of l single-bit ciphertexts from S to R.
  3. The Step 4 construction adds l pairs of H evaluations and transmits as many k-bit ciphertexts from S to R.

In total, the OT extension as described above induces 2kl bits of information transmitted from R to S, 3kl bits from S to R, and 2l Hash and 2k PRG invocations on each side. If we assume l>>k, this brings the amortized per-OT cost to 2k+3k bits (R→S + S→R) and 2 H evaluations. Contrast this with a naive per-OT cost of 2k+4k bits and 2 exponentiations (for the barebones Bellare Micali OT), and the computation savings become obvious. Concretely, on modern laptop-grade hardware with standard instantiations, evaluating H costs a fraction of a microsecond, whereas an exponentiation costs tens of microseconds—over an order of magnitude in difference. Note that these are only back-of-the-envelope indicators, as implementation-specific bottlenecks might arise in different places. As an example, Asharov, Lindell, Schneider, and Zohner found that using a cache-efficient matrix transposition algorithm noticeably improved their performance.

Also note that though it may appear that there's a tradeoff in terms of increased bandwidth, this isn't actually the case; there are straightforward optimizations that become easy to see once the OT extension protocol is written down in its entirety (without abstractions) that bring the bandwidth cost down to k+2k bits per OT, i.e. the same as a naive OT.


Conclusion and Further Reading

The writeup above is intended to build intuition for the technique, but of course it is possible to express the algorithm more succinctly; indeed the original IKNP paper presented it in only four lines (with no abstractions) using a matrix transposition. The technique still underlies most modern OT extension protocols, and it was only recently due to Roy that the semi-honest version of the protocol has seen any significant improvement. In the same paper, Roy gives a simple and efficient check that the parties can perform in order to guarantee security against malicious corruptions. There's a distilled writeup in Keller, Orsini, and Scholl's paper that serves as a good reference as well.