Home/Magazine Archive/March 2024 (Vol. 67, No. 3)/Indistinguishability Obfuscation from Well-Founded.../Full Text

Research Highlights
## Indistinguishability Obfuscation from Well-Founded Assumptions

At least since the initial public proposal of public-key cryptography based on computational hardness conjectures, cryptographers have contemplated the possibility of a "one-way compiler" that translates computer programs into "incomprehensible" but equivalent forms. And yet, the search for such a "one-way compiler" remained elusive for decades. We examine a formalization of this concept with the notion of indistinguishability obfuscation *(iO)*. Roughly speaking, *iO* requires that the compiled versions of any two equivalent programs (with the same size and running time) be indistinguishable from any efficient adversary. Finally, we show how to construct *iO* in such a way that we can prove the security of our *iO* scheme based on well-studied computational hardness conjectures in cryptography.

Consider the polynomial *f*_{1}(*x, y*) ∈ ℤ[*x, y*] that is computed as follows:

Alternatively, contemplate the polynomial *f*_{2}(x, *y)* ∈ ℤ[*x, y*] that is computed via:

A calculation shows that *f*_{1} and *f*_{2} are in fact the same polynomial, computed in two different ways. Indeed, the expressions *f*_{1} and *f*_{2} above are special cases of *arithmetic circuits*, which precisely represent "ways to compute a polynomial."

What if we wanted to hide all implementation choices made when creating such an arithmetic circuit for a particular polynomial? An easy way to do that would be to first convert our polynomial into a canonical form and then implement the canonical form as an arithmetic circuit. Indeed, the description of *f*_{2} above can be seen as a canonical representation of the polynomial as a sum of monomials with regard to a natural monomial ordering. However, as this example illustrates, canonical forms can be substantially more complex than other implementations of the same polynomial. For polynomials in *n* variables, the loss in efficiency can be exponential in *n.* This would often make computing the canonical form—or indeed, even writing it down—infeasible.

**1.1. A pseudo-canonical form**

Given that computing, canonical forms can be infeasible, what is there to do? Here, following,^{7} we draw an analogy to the notion of pseudo-randomness. When truly random values are not available, we can instead aim to produce values that "look random" by means of a pseudo-random generator. That is, we require that no efficient algorithm can distinguish between truly random values and the output of our pseudo-random generator.

Now, for two arithmetic circuits *g*_{1} and *g*_{2} that compute the same underlying polynomial, a true canonical form Canonical(*g*_{1}) would be identical to the canonical form of Canonical(*g*_{2}). Instead, we would ask that a pseudo-canonical form PseudoCanonical(*g*_{1}) would simply be indistinguishable from the pseudo-canonical form PseudoCanonical(*g*_{2}), to all efficient algorithms that were given *g*_{1} and *g*_{2} as well. Observe that unless there are actual efficiently computable canonical forms for all arithmetic circuits—which we do not believe to be true—it must be that such a PseudoCanonical operator is randomized, and outputs a *probability distribution* over arithmetic circuits computing the same polynomial.

**1.2. The computing lens**

The classic theory of computation tells us that general computer programs can be converted into equivalent polynomials (albeit over finite fields, which we will focus on implicitly in the sequel). So the pseudo-canonicalization question posed above is equivalent to the pseudo-canonicalization question for general computer programs. Indeed, the question of hiding implementation details within a computer program has a long history, dating at least as far back as the groundbreaking 1976 work of Diffie and Hellman^{13} introducing the concept of public-key cryptography. Historically, this problem has been called "program obfuscation," albeit it was typically discussed in an ill-defined form. Discussed in these vague terms, it was folklore that truly secure program obfuscation would have revolutionary applications to computing, especially for securing intellectual property. The work of Barak et al.^{7} gave a formal treatment of this problem and proved the impossibility of strong forms of general-purpose program obfuscation. This work also formalized the pseudo-canonicalization problem discussed above via the notion of *indistinguishability obfuscation (iO)*. Writing now in the language of Boolean circuits, we define the problem below:

*Definition 1.* (INDISTINGUISHABILITY OBFUSCATOR (IO) FOR CIRCUITS.^{7}) A probabilistic polynomial-time algorithm *(iO)* is called a secure indistinguishability obfuscator for polynomial-sized circuits if the following holds:

**Completeness**: For every λ ∈ ℕ, every circuit*C*with input length*n*, every input*x*∈ {0, 1}^{n}, we have that

**Indistinguishability**: For every two ensembles {*C*_{0, λ}}_{λ∈ℤ+}and {*C*_{1, λ}}_{λ∈ℤ+}of polynomial-sized circuits that have the same size, input length, and output length, and are functionally equivalent, that is, ∀λ ∈ ℤ^{+},*C*_{0, l}(*x*) =*C*_{1},_{l}(*x*) for every input*x*, the distributions*iO*(1^{λ}, (7_{0,λ}) and*iO*(1^{λ}, C_{1,λ}) are*computationally indistinguishable*, denoted as:

It means or every efficient polynomial-time algorithm *D*, for every constant *c* > 0, there exists a constant λ_{0} ∈ ℤ^{+}, such that for all *l* > *l*_{0}, we have:

As we discuss in Section 1.4, indeed *iO* as a formalization of pseudo-canonicalization lived up to the folklore promise of software obfuscation: there was, and still is, a large research community studying novel applications of *iO.*

In contrast, demonstrating the feasibility of constructing *iO* proved far more challenging. Often one expects that theory will lag behind the practice, and given the folklore promise of software obfuscation, one might expect that over the years perhaps clever programmers had come up with heuristic approaches to software obfuscation that resisted the attack. The reality is the opposite. Indeed, in 2021 the third annual White Box Cryptography contest was held to evaluate heuristic methods for software obfuscation, and every one of the 97 submitted obfuscations was broken before the contest ended.^{1}

A large body of theoretical work, starting with the pioneering work of Garg et al.^{14} has attempted to construct *iO* using mathematical tools. However, before the result^{18} by the authors of this article, all previous mathematical approaches to constructing *iO* relied on new, unproven mathematical assumptions, many of which turned out to be false (see Jain et al.^{18} for a list of prior constructions.)

We would like to build *iO* whose security rests upon cryptographic hardness assumptions that have stood the test of time, have a long history of study, and are widely believed to be true. The main result of our works^{18,19} is the construction of an *iO* scheme from three well-studied assumptions. We discuss this in more detail next.

THEOREM 1. Under the following assumptions^{18,19,a}:

*the Learning Parity with Noise (*LPN*) assumption over general prime fields*ℤ_{p}*with polynomially many*LPN*samples and error rate*1/*k*LPN^{δ}, where k is the dimension of the*secret, and δ*> 0*is any constant;**the existence of a Boolean Pseudo-Random Generator (PRG) in*NC^{0}*with stretch n*^{1+t},*where n is the length of the*PRG*seed, and t*> 0*is any constant;**the Decision Linear (*DLIN*) assumption on symmetric bilinear groups of prime order.*

*Indistinguishability obfuscation for all polynomial-size circuits exists.*

The three assumptions above (discussed further in Section 1.3) are based on computational problems with a long history of study, rooted in complexity, coding, and number theory. Further, they were introduced for building basic cryptographic primitives (such as public key encryption), and have been used for realizing a variety of cryptographic goals that have nothing to do with *iO.*

**1.3. Assumptions in more detail**

We now describe each of the assumptions we need in more detail and briefly survey their history.

**The** DLIN **Assumption**: The Decisional Linear assumption (DLIN) is stated as follows: For an appropriate *l*-bit prime *p*, two groups G and G_{T} are chosen of order *p* such that there exists an efficiently computable nontrivial symmetric bilinear map *e* : G × G → G_{T}. A canonical generator *g* for G is also computed. Following the tradition of cryptography, we describe the groups above using multiplicative notation, even though they are cyclic. The DLIN assumption requires that the following computational indistinguishability holds:

This assumption was first introduced in the 2004 work of Boneh et al.^{10} and instantiated using appropriate elliptic curves. Since then DLIN and assumptions implied by DLIN have seen extensive use in a wide variety of applications throughout cryptography, such as Identity-Based Encryption, Attribute-Based Encryption, Functional Encryption for degree 2 polynomials, Non-Interactive Zero-Knowledge, etc.

**The Existence of** PRG**s in NC ^{0}**: The assumption of the existence of a Boolean Pseudo-Random Generator PRG in NC

Pseudorandom generators are a fundamental primitive in their own right and have vast applications throughout cryptography. PRGs in NC^{0} are tightly connected to the fundamental topic of Constraint Satisfaction Problems (CSPs) in complexity theory and were first proposed for cryptographic use by Goldreich^{16} 20 years ago. The complexity theory and cryptography communities have jointly developed a rich body of literature on the cryptanalysis and theory of constant-locality Boolean PRGs.

**LPN over large fields**: The Learning Parity with Noise LPN assumption over finite fields ℤ_{p} is a decoding problem. The standard LPN assumption with respect to subexponential size modulus *p*, dimension *ℓ*, sample complexity *n*, and a noise rate *r* = 1/*ℓ ^{δ}* for some

Above e ← *D _{r}* is a generalized Bernoulli distribution,

The origins of the LPN assumption date all the way back to the 1950s (e.g. Gilbert^{15}): it has long been known that random linear codes possessed remarkably strong minimum distance properties. However, since then, very little progress has been made inefficiently decoding random linear codes under random errors. The LPN over fields assumption above formalizes this, and was introduced over ℤ_{2} for cryptographic uses in 1994,^{9} and formally defined for general finite fields and parameters in 2009,^{17} under the name "Assumption 2."

While in Ishai et al.^{17} the assumption was used when the error rate was assumed to be a constant, in fact, polynomially low error (in fact *δ* = 1/2) has an even longer history in the LPN literature: it was used by Alekhnovitch^{2} to construct public-key encryption with the field ℤ_{2}, and used to build public-key encryption over ℤ_{p} in 2015.^{5} The exact parameter settings that we describe above, with both general fields and inverse polynomial error rate corresponding to an arbitrarily small constant *δ* > 0 was explicitly posed by Boyle,^{11} in the context of building efficient secure two-party and multi-party protocols for arithmetic computations.

Recently, the LPN assumption has led to a wide variety of applications. A comprehensive review of known attacks on LPN over large fields, for the parameter settings we are interested in, was given in Boyle.^{11,12} For our parameter setting, the running time of all known attacks is Ω(2^{ℓ1-δ}), for any choice of the constant *δ* ∈(0, 1) and for any polynomial number of samples *n(ℓ)*.

Finally, we mention that except for the DLIN assumption, the other two assumptions, LPN and PRG in NC^{0}, have search-to-decision reductions.

**1.4. Applications of iO**

The notion of *iO* occupies an intriguing and influential position in complexity theory and cryptography. Interestingly, if NP ⊆ BPP, then *iO* exists for the class of all polynomial size circuits, because if NP ⊆ BPP, then it is possible to efficiently compute a canonical form for any function computable by a polynomial-size circuit. This is in contrast to most powerful cryptographic objects, which typically imply the existence of one-way functions. A large body of work has shown that *iO* plus one-way functions imply a vast array of cryptographic objects, so much so that *iO* has been conjectured to be a "central hub"^{24} for cryptography. In addition, an impressive list of fascinating new cryptographic objects is only known under *iO* or related objects.

We highlight a small subset of these implications (see Jain et al.^{18} and the references therein): *multiparty noninteractive key exchange* in the plain model, *zero-knowledge succinct noninteractive arguments* (ZK-SNARGs) for any NP language, *multilinear maps* with bounded polynomial multilinear degrees, *witness encryption* for any NP language, *functional encryption* supporting all polynomial-sized circuits, and *fully homomorphic encryption schemes* (FHE) for unbounded-depth polynomial-size circuits.

**1.5. Open problems**

Our workplaces *iO* on firm foundations with respect to the assumptions is based on, thereby answering the main feasibility question for the primitive (until we resolve the P vs. NP question). However, there are many important open questions that remain to be answered. One of the main questions is whether there is a construction that is very simple to describe and efficient to implement in practice. Our construction is quite complex, relies on a number of steps, and is recursive in nature. As a consequence, the actual asymptotic complexity is some large polynomial. Simplifying and finding more efficient paths to building *iO* from well-studied assumptions is one of the most important problems that must be solved, toward the vision of using *iO* as a universal tool.

In the remaining of the article, we describe a very high level overview of the main technical ideas implying *iO.* For simplicity of exposition, we choose the simplest path to *iO* that we are aware of. This overview is based on a combination of ideas from Jain et al.,^{18,19} which, however, would require one additional assumption to the three stated above (See Theorem 1)—namely, the Learning With Errors (LWE)^{23} assumption. Since the exact technical reasons for needing LWE is not important for this overview, and this assumption is actually unnecessary,^{19} we do not discuss it in detail here.

**2.1. Preliminaries**

Let's start with introducing some basic notation. Let size(*X*) indicates the length of the binary description of an object *X* (e.g., a string, a circuit, or truth table). Throughout, we consider Boolean functions or circuits or algorithms mapping *n*-bit binary strings to *m*-bit binary strings, for some *n, m* ∈ ℤ^{+}. Let time(*A, x*) denotes the running time of an algorithm (or circuit) *A* on an input *x* (in the case of a circuit *C*, time(*C, x*) is the same as size(*C*)). We say that an algorithm or circuit is efficient if its running time is bounded by a (fixed) polynomial in the length of the input, that is, time(*C, x*) = size(*x*)^{c} for some positive integer *c* ∈ ℤ^{+}. When we only care about the existence of a constant and the concrete value is not important, we write *O*(1) in place of that constant, e.g., time(*C, x*) = size(*x*)^{O(1)} (following the big-O notation in complexity theory).

Restating our goal, we want to design an efficient *randomized* algorithm, called the obfuscator *O*, that given a Boolean circuit *C*: {0, 1}^{n} → {0, 1} with size *s* ≤ *n*^{O(1)}, referred to as the original circuit, outputs another Boolean circuit *Ĉ*: {0, 1}^{n} → {0, 1}, called the obfuscated circuit, such that the following three properties hold:

- Correctness: the obfuscated circuit
*Ĉ*is*functionally equivalent*to the original circuit*C*, denoted as*Ĉ*≡*C*, meaning that for every*x*∈ {0, 1}^{n},*Ĉ(x*) =*C(x*). Correctness must hold no matter what random coins the obfuscator*O*used. - Efficiency: the obfuscator is efficient, meaning
*O*runs in time polynomial in the size of the original circuit, namely, time(*O*,*C*) = size(C)^{O(1)}. - Security: the obfuscated circuit
*Ĉ*hides the implementation details in the original circuit*C.*That is, for every two,*equally-sized*and*functionally-equivalent*circuits*C*_{0}and*C*_{1}(i.e., size(*C*_{0}) = size(*C*_{1}) and*C*_{0}°*C*_{1}), the obfuscated circuits*Ĉ*_{0}and*Ĉ*_{1}are*computationally hard to distinguish*(as defined in Definition 1; see also Jain et al.^{18}).i

Next, we give an informal overview of how to construct *iO* from well-studied assumptions.

**2.2. Simplifying the task of iO**

Perhaps the simplest starting point is the following: If there is no restriction on the time the obfuscator *O* can take, then there is an extremely intuitive obfuscator: the obfuscated circuit is the *truth table* of the original circuit. The truth table TT_{C} of a Boolean circuit *C* is an array indexed by inputs, where TT_{C}[*x*] = *C(x*). It can be computed in time 2^{n} · *s* if the input length is *n* and the circuit size is *s.* Perfect security comes from the fact that for any two functionally equivalent circuits *C*_{0} ≡ *C*_{1}, their truth tables are identical TT_{C0} = TT_{C1} and hence impossible to distinguish.

To put simply, a truth table is a *canonical form* of all circuits producing it. While outputting the truth table satisfies the correctness and security requirement of *iO*, the obvious flaw with this is that the obfuscator is far from efficient: The running time of an *iO* scheme should be *s*^{O(1)}, rather than 2^{n} · *s.* The input length *n* of a Boolean circuit can be close to its size *s*, and hence the time to compute a truth table is exponentially large! This inefficiency is likely inherent, as efficient methods of finding canonical forms of circuits imply the collapse of the polynomial hierarchy, which is widely conjectured to be false.

Therefore, to improve efficiency, we seek canonical forms that fool computationally limited adversaries. Naturally, we start with a more humble goal:

**Simplification 1: Obfuscation with non-trivial exponential efficiency.** *iO* with non-trivial efficiency was studied in Lin et al.^{21} Surprisingly, they showed that a very modest improvement on efficiency – captured in the notion of exponential-efficiency *iO* or *xiO* for short – is actually enough to construct completely efficient (polynomial time) *iO.*^{b}

*xiO*is an obfuscator*O*whose running time is still "trivial" (2^{n}·*s*)^{O(1)}, but outputs an obfuscated circuit*Ĉ*of "non-trivial" size 2^{n(1–∈)}·*s*^{O(1)}for some*∈*> 0.

We can think of *xiO* (as well as *iO*) as a special kind of encryption method, where the obfuscated circuit *Ĉ* is a "ciphertext" of the original circuit *C*, also denoted as spCT(*C*) = *Ĉ*, satisfying that

- the ciphertext spCT(
*C*) hides all information about*C*, except that it lets anyone with access to it learn the truth table of*C.*This is unlike normal encryption which reveals no information about the encrypted message. - The size of the ciphertext spCT(
*C*) is 2^{(1–∈)·n}for some 0 <*∈*< 1 So it can be viewed as a (slightly) compressed version of the truth table (that reveals no other information of*C*to computationally limited adversaries).

The such special encryption scheme is known as *functional encryption*, which controls precisely which information of the encrypted message is revealed, and hides all other information. This notion is tightly connected with *xiO* and *iO*, and, in fact, the implication of *xiO* to *iO* goes via the notion of functional encryption.

When viewing spCT(*C*) as a compressed version of the truth table. It becomes clear why even slight compression is powerful enough to imply *iO*: The idea is keeping compressing iteratively until the size of the special ciphertext becomes polynomial. The works of Ananth and Jain,^{3} Bitansky and Vaikuntanathan,^{8} and Lin et al.^{21} turn this high-level idea into an actual proof that *xiO* implies *iO*^{c} and allows us to focus on constructing *xiO*, or equivalently the special encryption described above.

**Simplification 2: It suffices to obfuscate simple circuits.** Unfortunately, despite the efficiency relaxation, it is still unclear how to obfuscate general Boolean circuits, which can be complex. Naturally, we ask:

It turns out that it suffices to focus on an extremely simple class of circuits C = NC^{0}, where NC^{0} is the set of all circuits with constant *output locality*, meaning every output bit depends on a constant number of input bits. To do so, we will rely on two cryptographic tools, *randomized encodings* and *Pseudo-Random Generators (PRGs)*.

**Randomized encoding in** NC^{0}. A Randomized Encoding (RE) scheme consists of two efficient algorithms (RE, Decode). It gives a way to represent a complex circuit *C*(·), by a much simpler *randomized* circuit RE_{C}(·; ·) := RE(*C*, ·; ·), such that,

- Correctness: for every input
*x*, the output*p*of RE(_{x}*C, x; r)*produced using uniformly random coins*r*encodes the correct output: In other words, there exists an efficient decoding algorithm Decode such that Decode(*p*) =*C(x*). - Security:
*p*reveals no information of_{x}*C*beyond the output of*x*to efficient adversaries. - Simplicity: RE is a simple circuit by some measure of simplicity. Classic works
^{6,25}showed that RE can be simply a NC^{0}circuit, when assuming the existence of PRGs in NC^{0}like we are.

The correctness and security of the randomized encoding suggest that, instead of directly obfuscating a general circuit *C*, we can alternatively obfuscate a circuit *D* that on input *x* outputs an encoding *p _{x}*, which reveals only

To make the above idea goes through, there are, however, a few wrinkles to be ironed out. The issue is that the security of randomized encoding only holds if an encoding *p _{x}* is generated using fresh random coins. There are concrete attacks that learn information of

Such a circuit *D* has size at least 2^{n}. In particular, we cannot hope to "compress" the random coins *r*_{1}, *r*_{2},···, *r*_{2n} into 2^{n(1–∈)} space, which is the target size of the obfuscated circuit. To resolve this problem, we will use a Pseudo-Random Generator.

**Pseudo-random generator in** NC^{0}. Recall that a pseudo-random generator is a Boolean function PRG: {0, 1}^{n} → {0, 1}^{m} that takes as input a uniformly random seed sd ∈ {0, 1}^{n}, and produces a polynomially longer output *r* ∈ {0, 1}^{m}, where *m = n*^{1+r} for some constant *r* > 0, such that, *r* is indistinguishable to a uniformly random *m*-bit string to computationally limited adversaries. There are several proposals of pseudo-random generators that can be computed by an NC^{0} function.

Equipped with a pseudo-random generator in NC^{0}, we can replace uniformly random coins *r*_{1}, *r*_{2},···, *r*_{2n} with pseudorandom coins expanded from a much shorter seed sd of length roughly 2^{n/(1+ρ)} = 2^{n(1–∈′)} for some *∈*′ ∈(0, 1).^{d} This gives a variant of the circuit *D* above:

where PRG(sd) expands to output *r*_{1},···, *r*_{2n} and *r _{x}* = PRG

Moving forward, it suffices to devise a way to encrypt *C*, sd into a ciphertext spCT(*C*, sd), from which we can expand out the truth table of *D¢*, while hiding all other information of *C* and sd.

Note that the special encryption only needs to hide (*C*, sd), instead of the entire description of circuit *D′* since RE and PRG are public algorithms. Moreover, the truth table of *D′* can be computed from (*C*, sd) via a function *G*_{RE, PRG} in NC^{0}.

**2.3. Special encryption for NC ^{0} mappings**

In our works,^{18, 19} we constructed the needed special encryption for general NC^{0} mappings *G*: {0, 1}^{l} → {0, 1}^{m}, where ciphertext spCT(*X*) reveals the output of *G* on *X* while hiding all other information of *X*, such that size(spCT(*X*)) *~* size(*X*) + *m*^{1–δ} for some *δ* > 0. In this subsection, we describe the first half of the ideas behind our construction, which connects the special encryption with bilinear pairing groups and a new object called a structured-seed pseudo-random generator. The second half of the ideas are explained in Section 2.5, describing the construction of a structured-seed pseudo-random generator.

**Connection with bilinear pairing groups.** As a starting point, suppose that the function *G* is really simple: Simple enough so that it can be computed by a degree-2 polynomial mapping over the finite field ℤ_{p} for some prime modulus *p.* Namely,

Then, the special encryption can be implemented using bilinear pairing groups, as shown in Ananth and Sahai,^{4} and Lin.^{20}

**Bilinear pairing groups.** At a high-level, pairing groups allows for computing quadratic polynomials over secret encoded values, and reveal only whether the output is zero or not. More specifically, it consists of cyclic groups G_{1}, G_{2}, and G_{T} with generators *g*_{1}, *g*_{2}, and *g _{T}*, respectively, and all of order

- Encode: For every group G
_{i}, one can compute for*x*∈ ℤ_{p.}The group element is viewed as an*encoding*of*x*in the group G_{i.} - Group Operation: In each group G
_{i}, one can perform the group operation to get , corresponding to "homomorphic" addition modulo*p*in the exponent. Following the tradition of cryptography, we write the group operation multiplicatively. - Bilinear Pairing: Given two source group elements and , one can efficiently compute a target group element , using the so-called pairing operation . This corresponds to "homomorphic" multiplication modulo
*p*in the exponent. However, after multiplication, we obtain an element in the target group that cannot be paired anymore. - Zero Testing: Given a group element , one can always tell if the encoded value is
*x*= 0, by comparing the group element with the identity in*G*Similarly, one can do equality testing to see if for any_{i}.*c.*

Combining the above abilities gives a "rudimentary" special encryption scheme that supports the evaluation of degree-2 polynomials. A ciphertext of includes encodings of every element *X _{l}* in both source groups, . Given these, one can "homomorphically" compute a quadratic mapping

This fulfills the correctness of the special encryption. What about security? The ciphertext must hide all information about *X* beyond what's revealed by *Q(x)*, for which we resort to the security of pairing groups. For simplicity of this overview, let's gain some security intuition by assuming the strongest hardness assumptions pertaining to the pairing groups, known as the generic group model. Think of encoding as black boxes and the only way of extracting information (of the encoded values) is by performing (a combination of) the above four operations. In this model, given *g ^{x}* for a secret and random

**Challenges beyond Degree 2.** Unfortunately, the mapping *G*_{RE,PRG} we care about can only be computed by polynomials of a degree much larger than 2. It is known that a Boolean function with output locality *l* can be computed by a multilinear degree *l* polynomial over ℤ_{p.} However, the locality of Boolean PRG we need is at least 5,^{22} and known randomized encodings have locality at least 4.^{6}

**Key idea: The preprocessing model.** To overcome this challenge, our first idea is to use preprocessing of inputs to help reduce the degree of computation. Instead of directly computing *G(X)* in one shot, we separate it into two steps: First, the input is preprocessed *X¢* = pre(*X;* *r*) in a randomized way using fresh random coins *r*, then the output is computed from the preprocessed input *y = Q(X¢*). The idea is that the preprocessing can perform complex transformations on the input in order to help the computation later. The only constraint is that the preprocessing should not increase the size of its input too much, that is, size(*X¢*) ~ si*ze(X)+m*^{1–δ}, for some *δ* > 0. As such, it suffices to encrypt the preprocessed input spCT(*X¢*), from which one can recover the desired output *y* = *G(X)*, by evaluating a (hopefully) simpler function *Q.* Unfortunately, because of the restriction on the size of *X¢*, it is unclear how preprocessing alone can help.

Our second idea is to further relax the preprocessing model to allow the preprocessed input to contain a public part and a secret part *X¢* = (*P, S*). Importantly, the public part *P* should reveal no information about *X* to computationally limited adversaries. (In contrast, *P, S* together reveals *X* completely.) Moreover, we allow the second stage computation *Q(P, S*) to have an arbitrary constant degree in *P* and only restricts its degree on *S* to 2, that is,

where *a _{j,k}, b_{k}, g* are constant degree polynomials over ℤ

In Jain et al.^{18,19} we show how to compute any NC^{0} Boolean function *G* in such a preprocessing model, assuming the Learning Parity with Noises assumption over general fields, which completes the puzzle of *iO.* When applied to specific cryptographic tools, our techniques give interesting new objects. For instance, it converts any PRG in NC^{0} into what we call structured-seed PRG. Given a preprocessed seed *(P, S)* = PRG(sd; *r¢*), the structured-seed PRG expands out a polynomially longer output *r* = sPRG(*P, S)*, where the computation has only degree-2 in the private seed *S*, and the output *r* is pseudo-random given the public seed *P.* In the next section, we describe how to do preprocessing in the context of constructing structured-seed PRG. The same ideas can be extended to handle general NC^{0} functions.

**2.4. Definition of structured-seed PRG**

*Definition 2.* (SYNTAX OF STRUCTURED-SEED PSEUDORANDOM GENERATORS (sPRG).) Let *t* > 1. A structured seed Boolean PRG (sPRG) with polynomial stretch *t*, is defined by the following PPT algorithms:

- IdSamp(1
^{n},*p*): The index sampling algorithm takes as input the input length parameter*n*, and a prime*p.*It samples a function index*I.* - SdSamp(
*I*) jointly samples two strings, a public seed and a private seed, sd = (*P, S*) which are both vectors of dimension*ℓ*_{sd}=*O(n)*over ℤ_{p.} - Eval(
*I*, sd) computes a string in {0, 1}^{m},*m*=*n*^{t}.

Looking ahead, the prime *p*, that we choose is set to the order of the bilinear group which has a bit length of *n*^{Θ(1)}.

*Definition 3.* (SECURITY OF sPRG.) A structured-seed Boolean PRG, sPRG, satisfies security if for any constant *r* > 0, any function *p*: ℕ → ℤ that takes as input a number *k* ∈ ℕ and outputs a *k ^{r}* bit prime

where *I* is sampled by IdSamp(1^{n}, *p*), sd by SdSamp(*I*), and *r* sampled randomly from {0, 1}^{m(n)}.

*Definition 4.* (COMPLEXITY AND DEGREE OF sPRG.) Let *r* > 0, *d*_{1}, *d*_{2} ∈ ℕ be any constant, and *p* : ℕ → ℤ be any function that maps an integer *k* into a *k ^{r}* bit prime

- Eval(
*I, (P, S)*) = (*g*_{I,1}(*P, S*), ···,*g*)), and,_{I,m}(P, S - The maximum degree of each
*g*over_{I,j}*P*is*d*_{1}, and the maximum degree of*g*over_{I,j}*S*is*d*_{2}.

We remark that the above definition generalizes the standard notion of families of PRGs in two aspects: 1) the seed consists of a public part and a private part, jointly sampled and arbitrarily correlated, and 2) the seed may not be uniform. Therefore, we obtain the standard notion as a special case.

*Definition 5.* (PSEUDO-RANDOM GENERATORS, DEGREE, AND LOCALITY.) A (uniform-seed) Boolean PRG (PRG) is an sPRG with a seed sampling algorithm SdSamp(*I*) that outputs a public seed *P* that is an empty string and a uniformly random private seed *S* ← {0, 1}^{n.} Let *d, c* ∈ ℕ. The PRG has multilinear degree *d* if for every *n* ∈ ℕ and *I* in the support of IdSamp(1^{n}), we have that Eval(*I*, sd) can be written as an *m(n*)-tuple of degree-*d* polynomials over ℤ in *S.* It has constant locality *c* if for every *n* ∈ ℕ and *I* in the support of IdSamp(1^{n}), every output bit of Eval(*I*, sd) depends on at most *c* bits of *S.*

In what follows next we will give an overview of our construction of sPRG from the LPN assumption and the existence of PRG in NC^{0}. For ease of exposition, we will actually use Goldreich's PRG candidate^{16} which is the most well-known conjectured PRG in NC^{0}. See Jain et al.^{18} for the construction based on any PRG in NC^{0}.

*Definition 6.* (GOLDREICH'S PRG.) A Goldreich PRG of locality *c* mapping *n* bits to *m* bits is described using a predicate *f* : {0, 1}^{c} → {0, 1} and a hypergraph *H = {Q*_{1}, ···, *Q _{m}*} where each

We start with a Goldreich PRG PRG = (IdSamp, Eval) with stretch *t*, associated with a *d*-local predicate *f* : {0, 1}^{d} → {0, 1} and a randomly sampled hypergraph *H*, i.e., *I* = (*f, H*). Observe that on any input ** σ** ∈{0, 1}

Our first idea toward this is that we can "encrypt" the seed ** s** using LPN samples over ℤ

where *ℓ* is set to be the dimension for the LPN samples. It is set so that is a distribution that samples randomly from *p* with probability *r*, and 0 otherwise. Finally, *r = ℓ*^{–d}.

It follows directly from LPN assumption that *( A, b)* is pseudorandom and hides

Above we first interpret *f _{i}* over ℤ into the field ℤ

The reason *G*^{(1)} is interesting because ** e** is sparse. With probability, 1−2

This gives us a nice candidate for sPRG that satisfies almost all properties! The dimension of *S* and *P* is *O(n)* which is sublinear in *m*, and it can be computed by a degree (deg *d*, deg 2) polynomial *G*^{(1)}. We would be done if we could somehow force the output to be correct on all the *m* coordinates. For the rest of the overview, we refer to the indices *i* ∈[*m*], such that *f _{i}*(

To correct errors, we further modify the polynomial and include more pre-processed information in the private seeds. We describe a sequence of ideas that lead to the final correction method, starting with two wrong ideas that illustrate the difficulties we will overcome.

- The first wrong idea is correcting by adding the difference Corr =
–*y**y**¢*between the correct and erroneous outputs,= Eval*y*_{I}() and*s*¢ = Eval*y*_{I}(+*s*); we refer to Corr as the*e**correction vector.*To obtain the correct output, evaluation can compute the polynomial map . The problem is that Corr must be included in the seed, but it is as long as the output and would destroy expansion. Thus, we have to make use of the fact that Corr is sparse. - To fix expansion, the second wrong idea is adding correction only for bad outputs, so that the seed only stores non-zero entries in Corr. Recall that, Corr is sparse with at most
*T*non-zero elements. More precisely, the*j*'th output can be computed as if output*j*is bad and without adding Corr_{j}otherwise. This fixes expansion, but now the evaluation polynomial depends on the location of bad outputs. If these locations are included in public information, this would leak information about the location of LPN noises, and jeopardize security. If on the other hand the locations are included in the private seed, then it is unclear how to maintain the requirement that the polynomial map computes only a degree two polynomial in the private seed.

These two wrong ideas illustrate the tension between the expansion and security of our sPRG. Our construction takes care of both, by *compressing* the correction vector Corr to be polynomially shorter than the output and stored in the seed, and *expanding* it back during evaluation in a way that is oblivious of the location of bad output bits. This is possible thanks to the sparsity of the correction vector and the allowed degree two computation on the private seed. We first illustrate our idea with the help of a simple case.

**Simple Case 1: Much fewer than** bad **outputs.** Suppose hypothetically that the number of bad outputs is bounded by *z* which is much smaller than . Thus, if we convert Corr into a matrix,^{e} it has low-rank *z.* We can then factorize Corr into two matrixes **U** and **V** of dimensions and , respectively, such that Corr = **UV**, and compute the correct output as follows: ∀*j* ∈ [*m*],

where (*k _{j}, l_{j}*) is the corresponding index of the output bit

While the idea above works for fewer than bad outputs, it does not work for the case we are in. We have *T* = Θ(*mℓ*^{–d}) bad outputs. Nevertheless, we show that a similar idea works for this case.

*T* bad **outputs.** The above method however cannot handle more than bad outputs, whereas the actual number of bad outputs can be up to *T* = Ω(*m/ℓ ^{d}*), much larger than since

where *i _{j}* is the block that output

This completely solves our problem except that we need to ensure that the bad outputs are well spread out in the manner described above. Our main observation here is that this is ensured due to the fact that in a Goldreich's PRG candidate the input dependence hypergraph *Q*_{1}, ···, *Q _{m}* is randomly chosen. Therefore, once we fix the location of the non-zero errors inside

**Step 1: Assign outputs.**We partition the outputs into*B*buckets, via a mapping*f*_{bkt}: [*m*] → [*B*]. The number of buckets is set to*B = m/ℓ*and the number of elements in each bucket is set to be^{d}*ℓ*so that they exactly form a partition of^{d}*m.*The mapping*f*_{bkt}simply divides*m*by*B*, and outputs the remainder. Since the erroris chosen to be from the LPN error distribution and the hypergraph*e**H*of the PRG is randomly chosen, by a Chernoff-style argument, we can show that in each bucket out of*ℓ*output bits, at most^{d}*t*of them are bad, except with probability 1−2^{−tΩ(1)}. We will set*t = ℓ*for a tiny constant^{r}*r*> 0.**Step 2: Compress the buckets.**Next, we organize each bucket*i*into a matrix**M**_{i}of dimension*ℓ*^{d/2}×*ℓ*^{d/2}and then compute its factorization**M**_{i}=**U**_{i}**V**_{i}, where**U**_{i},**V**_{i}are matrices of dimensions*ℓ*^{d/2}×*t*and*t*×*ℓ*^{d/2}, respectively. To form matrix**M**_{i}, we use another mapping*f*_{ind}: [m] → [*ℓ*^{d/2}] × [*ℓ*^{d/2}] to assign each output bit*j*to an index (*k*) in the matrix of the bucket_{j}, l_{j}*i*it is assigned to. This assignment must guarantee that no two output bits in the same bucket (assigned according to_{j}*f*_{bkt}) have the same index. One such way to compute*f*_{ind}(*j*) is to divide*j*∈ [*m*] by*B.*The remainder is set as*f*_{bkt}(*j*), and the quotient is divided further by*ℓ*^{d/2}. The quotient and the remainder from this division are set as the resulting indices (*k*). Once we have this, (_{j}, l_{j}**M**_{i})_{k,l}is set to Corr_{j}if there is*j*such that*f*_{bkt}(*j*) =*i*and*f*_{ind}(*j*) = (*k, l)*. Since every matrix**M**_{i}has at most*t*non-zero entries, we can factor them and compute the correct output as: ∀*j*∈ [*m*],

*G*^{(2)}is expanding, because the number of field elements in**U.**'s and**V**_{i}'s are much smaller than*m*, namely: 2*tℓ*^{δ/2}*B = O(mℓ*^{–δ/2+ρ}) ≪*m.*We set*I¢ = (I,***A**, f_{bkt},*f*_{ind}).**Step 3: Zeroize if uneven buckets.**Finally, to deal with the low probability event that some bucket contains more than*t*bad outputs, we introduce a new variable called flag. If this occurs, our sPRG sets*P*and*S*as all zero vectors. In this case, the evaluation always outputs 0. This gives us our candidate sPRG. For security, observe that the polynomial map*G*^{(2)}is independent of the location of LPN noises. With probability 1−2^{−nΩ(1)}, the evaluation results in outputTherefore, by the LPN over ℤ*y*._{p}assumption, the seedof PRG is hidden and the security of PRG ensures that the output is pseudorandom when it is not all zero (which occurs with a subexponentially small probability). We now proceed to the formal construction and proof.*s*

This completes the overview of our construction of structured seed PRG, as well as *iO*; see Jain et al.^{18, 19} for the formal constructions and security proofs. In conclusion, *iO* is an extremely powerful cryptographic tool and our works have established its feasibility based on well-studied assumptions.

1. WhibOxContest21—CHES 2021 Challenge. 2021. https://contest2021.whibox.io/.

2. Alekhnovich, M. More on average case vs approximation complexity. In *44 ^{th} FOCS* (Oct. 2003), IEEE Computer Society Press, NY, 298–307.

3. Ananth, P., Jain, A. Indistinguishability obfuscation from compact functional encryption. In *CRYPTO 2015, Part I.* R. Gennaro and M. J. B. Robshaw, eds. Volume 9215 of *LNCS* (Aug. 2015). Springer, Heidelberg, 308–326.

4. Ananth, P., Sahai, A. Projective arithmetic functional encryption and indistinguishability obfuscation from degree-5 multilinear maps. In *EUROCRYPT 2017, Part I.* J.-S. Coron and J. B. Nielsen, eds. Volume 10210 of *LNCS* (Apr./May 2017). Springer, Heidelberg, 152–181.

5. Applebaum, B., Avron, J., Brzuska, C. Arithmetic cryptography: Extended abstract. In *ITCS 2015.* T. Roughgarden, ed. (Jan. 2015). ACM, NY, 143–151.

6. Applebaum, B., Ishai, Y., Kushilevitz, E. Cryptography in NC^{0}. In *FOCS* (2004). IEEE Computer Society, NY, 166–175.

7. Barak, B., Goldreich, O., Impagliazzo, R., Rudich, S., Sahai, A., Vadhan, S.P., et al. On the (im)possibility of obfuscating programs. In *CRYPTO 2001.* J. Kilian, ed. Volume 2139 of *LNCS* (Aug. 2001). Springer, Heidelberg, 1–18.

8. Bitansky, N., Vaikuntanathan, V. Indistinguishability obfuscation from functional encryption. In *56 ^{th} FOCS.* V. Guruswami, ed. (Oct. 2015), IEEE Computer Society Press, NY, 171–190.

9. Blum, A., Furst, M.L., Kearns, M.J., Lipton, R.J. Cryptographic primitives based on hard learning problems. In *CRYPTO'93.* D. R. Stinson, ed. Volume 773 of *LNCS* (Aug. 1994). Springer, Heidelberg, 278–291.

10. Boneh, D., Boyen, X., Shacham, H. Short group signatures. In *CRYPTO 2004.* M. Franklin, ed. Volume 3152 of *LNCS* (Aug. 2004). Springer, Heidelberg, 41–55.

11. Boyle, E., Couteau, G., Gilboa, N., Ishai, Y. Compressing vector OLE. In *ACM CCS 2018.* D. Lie, M. Mannan, M. Backes and X. Wang, eds. (Oct. 2018), ACM Press, NY, 896–912.

12. Boyle, E., Couteau, G., Gilboa, N., Ishai, Y., Kohl, L., et al. Correlated pseudorandom functions from variable-density LPN. In *61 ^{st} FOCS* (Nov. 2020). IEEE Computer Society Press, NY, 1069–1080.

13. Diffie, W., Hellman, M.E. New directions in cryptography. *IEEE Trans. Inform. Theory 22*, 6 (1976), 644–654.

14. Garg, S., Gentry, C., Halevi, S., Raykova, M., Sahai, A., Waters, B. Candidate indistinguishability obfuscation and functional encryption for all circuits. In *54 ^{th} FOCS* (Oct. 2013). IEEE Computer Society Press, NY, 40–49.

15. Gilbert, E.N. A comparison of signalling alphabets. *Bell Syst Tech. J. 31*, 3 (1952), 504–522.

16. Goldreich, O. Candidate one-way functions based on expander graphs. *Electron. Colloq. Comput Complexity (ECCC) 7*, 90 (2000).

17. Ishai, Y., Prabhakaran, M., Sahai, A. Founding cryptography on oblivious transfer—efficiently. In *CRYPTO 2008.* D. Wagner, ed. Volume 5157 of *LNCS* (Aug. 2008). Springer, Heidelberg, 572–591.

18. Jain, A., Lin, H., Sahai, A. Indistinguishability obfuscation from well-founded assumptions. In *STOC21: 53 ^{rd} Annual ACM SIGACT Symposium on Theory of Computing, Virtual Event, Italy, June 21–25, 2021* S. Khuller and V. V. Williams, eds (2021). ACM, NY, 60–73.

19. Jain, A., Lin, H., Sahai, A. Indistinguishability obfuscation from LPN over F_{p}, DLIN, and PRGs in *NC*^{0}. In *EUROCRYPT 2022, Part I.* O. Dunkelman and S. Dziembowski, eds. Volume 13275 of *LNCS* (May/June 2022). Springer, Heidelberg, 670–699.

20. Lin, H. Indistinguishability obfuscation from SXDH on 5-linear maps and locality-5 PRGs. In *CRYPTO 2017, Part I.* J. Katz and H. Shacham, eds. Volume 10401 of *LNCS* (Aug. 2017). Springer, Heidelberg, 599–629.

21. Lin, H., Pass, R., Seth, K., Telang, S. Indistinguishability obfuscation with non-trivial efficiency. In *PKC 2016, Part II.* C.-M. Cheng, K.-M. Chung, G. Persiano and B.-Y. Yang, eds. Volume 9615 of *LNCS* (Mar. 2016). Springer, Heidelberg, 447–462.

22. Mossel, E., Shpilka, A., Trevisan, L. On e-biased generators in NC0. In *44 ^{th} FOCS* (Oct. 2003). IEEE Computer Society Press, NY, 136–145.

23. Regev, O. On lattices, learning with errors, random linear codes, and cryptography. In *STOC* (2005), 84–93.

24. Sahai, A., Waters, B. How to use indistinguishability obfuscation: deniable encryption, and more. In *46 ^{th} ACM STOC.* D. B. Shmoys, ed. (May/June 2014). ACM Press, NY, 475–484.

25. Yao, A.C.-C. How to generate and exchange secrets (extended abstract). In *FOCS* (1986), 162–167.

a. For technical reasons, we need to hardness of these assumptions to be such that no polynomial-time adversaries have beyond sub-exponentially small advantage in breaking the hardness of the underlying problems.

b. This is the step where the additional assumption of Learning With Errors (LWE) is used. However, as we showed in Jain et al.^{19} this assumption is unnecessary for i*O.*

c. This implication, however, comes with a quantitative weakening in the security.

d. More precisely, the length of sd is (2^{n}*s*^{O(1)})^{1/(1+r)}, since each *r _{x}* is a

e. Any injective mapping from a vector to a matrix that is efficient to compute and invert will do.

To view the accompanying Technical Perspective, visit doi.acm.org/10.1145/3632568

The original version of this paper was published in the *Proceedings of the 53 ^{rd} Annual ACM SIGACT Symposium on Theory of Computing* (June 2021), 60–73.

© 2024 Copyright held by the owner/author(s). Publication rights licensed to ACM.

Request permission to publish from permissions@acm.org

The Digital Library is published by the Association for Computing Machinery. Copyright © 2024 ACM, Inc.

No entries found