03/13

BrownAppliedCryptography · Mar 14, 2024 · 6e5bcb4 · 6e5bcb4
1 parent 30212ca
commit 6e5bcb4
Show file tree

Hide file tree

Showing 3 changed files with 227 additions and 1 deletion.
diff --git a/lectures/2024-03-13.tex b/lectures/2024-03-13.tex
@@ -0,0 +1,226 @@
+%!TEX root = ../notes.tex
+% Scribe: Sudatta Hor
+\section{March 13, 2024}
+\label{20240313}
+
+This lecture we cover Succinct Non-Interactive Arguments (SNARGs) and introduce secure multi-party computation.
+
+\subsection{Circuit Satisfiability}
+Recall the circuit satisfiability problem. The language considers an arbitrary boolean circuit which consists of \textsf{AND}, \textsf{XOR} gates. The input are certain values $x$ for input values, and witnesses $w$ are the rest of the wires. The satisfiability problem is whether there exists some $w$ to make the circuit evaluate to $1$. Since the input can be any boolean circuit, this is adaptable and widely used in implementation.
+
+
+This circuit model is considered a lot.
+\begin{example}[Pre-Image of Hash Function]
+    The function is $\mathrm{C}(x,w) = H(w) - x + 1$. The circuit will output $1$ on $w$ such that $H(w) = x$. $w$ here is the pre-image of $x$.
+
+    This allows us to, say, represent SHA as a boolean circuit to prove the pre-image of a hash function.
+\end{example}
+
+The intuition of the zero-knowledge proof is similar. Let's say the prover has some input values. The prover will commit to the bit of every wire.
+
+For example, when a verifier asks to confirm a certain \textsf{XOR} gate, the prover will perform a \emph{small} zero-knowledge proof to prove that that gate was computed correctly. Composing commitments and using sigma protocols from before will allow us to gain the functionality we want.
+
+Let's say
+\begin{align*}
+    c_1 = \mathrm{Com}(x) \\
+    c_2 = \mathrm{Com}(y) \\
+    c_3 = \mathrm{Com}(z)
+\end{align*}
+and $x = y\oplus z$. Using a sigma-\textsf{OR} protocol, we can prove
+\[(y = 0, z = 0, x = 0)\ \mathsf{OR}\ (y = 0, z = 1, x = 1)\ \mathsf{OR}\ \cdots\]
+This allows us to do ZK-proofs for circuit satisfiability.
+
+\subsubsection{Proof Systems for Circuit Satisfiability}
+We discuss the proof systems so far for circuit satisfiability.
+
+\begin{center}
+    \begin{tabular}{c|c|c|c}
+        & NP & $\Sigma$-Protocol & (Fiat-Shamir) NIZK \\
+        Zero-Knowledge & No & Yes & Yes\\
+        Non-Interactive & Yes & No & Yes\\
+        Communication & $O(|w|)$ & ? & ? \\
+        Verifier's computation & $O(|C|)$ & ? & ?\\
+    \end{tabular}
+\end{center}
+
+The na\"ive proof is to reveal witness $w$. This is not zero-knowledge, but is non-interactive. Using $\Sigma$-protocols, we have zero-knowledge but not non-interaction. Using the Fiat-Shamir heuristic, we get both zero-knowledge and non-interaction.
+
+For the easiest \textsf{NP} proof, communication requires $O(|w|)$ complexity and the verifier verifies in $O(|c|)$ (linear in number of gates) complexity. For $\Sigma$-protocols, communication requires a commitment to each wire, which is $O(|c|\cdot \lambda)$ (needs a factor of $\lambda$ security parameter), and the verifier also verifies in $O(|c|\cdot \lambda)$. This is the same for NIZK.
+
+\emph{Can we make this proof system more succinct?} Can we have communication and verification complexity to be \emph{sublinear} in $|c|$ and $|w|$?
+
+\subsection{Succinct Non-Interactive Argument (SNARG)}
+This brings us to succinct arguments, which are seemingly not quite possible.
+\begin{definition}[Succinct Non-Interactive Arguments]
+    A non-interactive proof/argument system is \ul{succinct} if
+    \begin{itemize}
+        \item The proof $\pi$ is of length $|\pi| = \mathrm{poly}(\lambda, \log |c|)$.
+        \item The verifier runs in time $\mathrm{poly}(\lambda, |x|, \log|c|)$.
+    \end{itemize}
+\end{definition}
+Additionally, SNARKs are Succinct Non-Interactive Arguments of Knowledge. A zk-SNARG or zk-SNARK additionally guarantees zero-knowledge property.
+
+\emph{Why succinct proofs?} Here are some examples where we might want succinct proofs.
+\begin{example}[Verifiable Computation]
+    The client sends some $x$ to the server, along with function $f$. The server sends back $y = f(x)$ and a proof. The client wants to check if the computation was done correctly.
+
+    \pseudocodeblock{
+        \textbf{Server} \< \< \textbf{Client}\\
+            \< \sendmessageleft*{x} \< \\
+            \< \sendmessageleft*{\text{compute }f} \< \\
+            \< \sendmessageright*{y} \< \text{Check }y = f(x)
+    }
+
+    If we did not have succinct proofs, then the client would still have to run the function again to verify the output. Note this allows interactions, so this is not the go-to example.
+\end{example}
+\begin{example}[Anonymous Transactions on Blockchains]
+    We think of the blockchain as a public ledger. Say Alice wants to send 2 Bitcoin to Bob, Alice will sign the transaction using her signing key and add that transaction onto the ledger. All transactions are public, you know which addresses sent to which addresses.
+
+    \pseudocodeblock{
+        \textbf{Alice's Account A} \< \< \textbf{Bob's Account B}\\
+        \text{vk}_A \text{ (public)}, \text{sk}_A \text{ (private)} \< \sendmessageright*{2 \text{BTC (Bitcoin)}} \< \text{vk}_B \text{ (public)}, \text{sk}_B \text{ (private)}\\
+        \< \< \\[][\hline]
+        \textit{Transaction}\< \< \\
+        \sigma = \text{Sign}_{\text{sk}_A} (\text{vk}_A, \text{vk}_B, \text{2 BTC}) \< \sendmessageright*{\sigma}\< \\
+        \< \< \\[][\hline]
+        \textit{Anonymous Transaction} \< \< \\
+        \text{Com}(\sigma) \< \sendmessageright*{ }\< \\
+        \text{NIZK: valid transaction} \< \< \\
+    }
+
+    There's a lot of work to make transactions anonymous. We'll hide a transaction and hide it, and use a NIZK to prove that it is a valid transaction. We want these proofs to be non-interactive and succinct (we don't want users to spend too long doing verification). This is a major application of SNARK and zk-SNARK
+\end{example}
+
+\emph{Is it possible?} This remains as the large problem. Even in the na\"ive \textsf{NP} situation, we need to send the entire witness $w$ and check the entire witness.
+
+Enter probabilistically checkable proofs (PCP):
+
+The prover prepares a proof and the verifier will only need to check certain bits of the proof.
+
+\begin{theorem}[PCP Theorem, Informally]
+    Every \textsf{NP} language has a PCP where the verifier reads only a \emph{constant} number of bits of the proof, to gain constant soundness.
+\end{theorem}
+
+The intuition is for the prover to commit the entire proof, the verifier checks certain bits, and the prover opens commitments.
+
+
+The problem with this is that the first round message is not succinct (the commitment is just as long).
+
+Instead of committing linearly, we'll use a Merkle Tree, and only send the commitment/hash of the root node. We build up a binary tree where each node is the hash of its branches. Opening particular bits, the prover will send the root-to-leaf path along with siblings to prove that this opening was correct. This size will grow logarithmically with the size of the tree.
+
+\Graphic{images/2023-03-16/merkle.png}{0.7}
+
+We hash values in a tree format, with each parent node being the hash of its children. We only send the \emph{root note}. Whenever the verifier requests a certain bit, we send the path from the root to the bit (revealing all hashes, and siblings) to verify that this is indeed.
+
+It's very difficult to change any bit. If we changed a bit, at some point up the path of the tree we'll have found a collision for a hash. That is to say, a specific bit being correct is predicated on whether the path to the root is valid and the root hash matches.
+
+\emph{Can we make this hiding?} Right now, we don't guarantee the hiding property. If we only had one layer, every bit would be revealed. How can we modify this algorithm to ensure that each bit is hiding?
+
+One solution would be to add a random string $r$ as a sibling to every leaf. However, this would require us to reveal all siblings when we're verifying a certain leaf node. We can easily modify this to \emph{salt} \emph{every} leaf node. We can add some random $r_i$ to the hash of \emph{every} bit that hides those bits.
+
+Now, instead of sending a commitment of the entire proof, we send a Merkle Tree of the commitment of the proof. Then, when requested for certain bits $i, j, k$, we'll open those commitments as paths on the tree.
+
+\emph{Is this zero-knowledge?} Note that in the PCP theorem, we did not have the zero-knowledge property. Our solution is that when opening commitments, we can instead provide ZK proofs for our `reveals' instead of the actual bits themselves. Asymptotically, this still preserves our succinctness property.
+
+Theoretically, this lets us construct zk-SNARGs. In practice, there are more efficient ways to construct them, but we will not cover them now.
+
+\subsection{Secure Multi-Party Computation}
+
+\subsubsection{2-Party Computation}
+
+We've seen this before, but it is when some parties want to compute the output of some function on their individual inputs, without revealing their own inputs.
+
+\begin{example}
+    Alice and Bob just returned from a date, and want to figure out if they each want a second date. Alice has some choice bit $x$ and Bob some choice bit $y$. They want to jointly compute $f(x, y) = x\land y$.
+\end{example}
+\begin{example}
+    Alice and Bob want to compare riches (who is richer?). They compute
+    \[f(x, y) = \begin{cases}
+            0 & \text{if }x>y    \\
+            1 & \text{otherwise}
+        \end{cases}\]
+\end{example}
+\begin{example}
+    Alice and Bob meet for the first time and want to see if they have friends in common. They have sets of friends $X,Y$, and compute
+    \[f(X, Y) = X\cap Y.\]
+    There are variants of this which only give cardinality of $X\cap Y$, etc.
+\end{example}
+
+In general, this is when two parties have inputs $x, y$ and want to compute some function $f(x, y)$ on them.
+
+\pseudocodeblock{
+    \textbf{Alice}\< \< \textbf{Bob}\\
+    \text{Knows x} \< \< \text{Knows y}\\
+    \< \< \\
+    \< \sendmessageright*{\text{Does computation}} \< \\
+    \< \sendmessageleft*{ }\< \\
+    \< \sendmessageright*{ }\< \\
+    \< \sendmessageleft{bottom={z=f(x,y)}}\< \\
+}
+
+Use cases include:
+\begin{itemize}
+    \item Password breach alert (Chrome/Firefox/Azure/iOS Keychain) runs a set intersection on your passwords and server leaked passwords.
+    \item Privacy-preserving contact tracing for COVID-19 (Apple and Google). We want to know if we have contact but not who had contact with.
+    \item Ads conversion measurements/personalized advertising (Google/Meta). We want to match conversions without either party knowing who converted.
+\end{itemize}
+
+\subsubsection{Multiple Parties!}
+The general case of this is Secure Multi-Party Computation (MPC)
+
+\Graphic{images/2023-03-16/mpc.png}{0.4}
+
+This is when we have some parties $P_i$ and want to compute input on $f(x_i, \cdots, x_n)$.
+
+Here are some applications:
+\begin{itemize}
+    \item Privacy-Preserving Inventory Matching (J.P. Morgan). We have stocks we want to buy/sell but we don't want to reveal which exact ones.
+    \item Setup Ceremony to securely generate CRS (Zcash).
+    \item Distribute Key Management (Unbound / Coinbase). Instead of holding a secret key on a single device, distribute it among multiple devices/servers. 
+    \item Federated learning (used in Google Keyboard Search Suggestion). We want to run machine learning, federated amongst multiple devices. However, we don't want to leak the actual training data from users.
+    \item Auctions (Danish sugar beet auction). Nobody should reveal their bid in the clear.
+    \item Also deployed in Boston area to analyze the wage gap between genders without revealing the individual salaries.
+\end{itemize}
+Some applications are still in the works:
+\begin{itemize}
+    \item Study/Analysis on Medical Data. Every institution has limited data, but they cannot openly share that data due to regulations. How could they jointly do analysis on this data without revealing the data.
+    \item Fraud Detection (banks). Users might have cards at multiple banks, they want to jointly detect fraud but do not want to share their transactions.
+\end{itemize}
+
+When we normally talk about cryptography, we talk about `slowing down' the system (crypto makes everything slower). In the case of MPC, though, we've enabled new features that were not otherwise possible without these tools.
+
+\subsection{Setting}
+Our setting is that we have $n$ parties $P_1, \dots, P_n$ with private inputs $x_1, \dots, x_n$. They want to jointly compute $f(x_1, \dots, x_n)$.
+
+In terms of communication infrastructure: we usually assume point-to-point channels between each pair $(p_i, p_j)$. We know how to do this (key exchange, authenticated encryption, etc). Sometimes, we also assume a reliable broadcast channel where every other party gets information.
+
+There is a single adversary that can ``corrupt'' a subset of the parties (at most $t$).
+
+\emph{What properties do we want out of this system?} Here are some common security properties we might want:
+\begin{description}
+    \item[Correctness.] The function is computed correctly.
+    \item[Privacy.] Only the output is revealed.
+    \item[Independence of Inputs.] Parties cannot choose their inputs depending on others' inputs.
+\end{description}
+Also with security guarantees:
+\begin{description}
+    \item[Security with Abort.] The adversary may ``abort'' the protocol. This prevents honest parties from receiving the output. This is the weakest model.
+    \item[Fairness.] If one party receives the output, then all parties will receive the output.
+    \item[Guaranteed Output Delivery (GOD):] Honest parties \emph{always} receive output. Even if adversarial parties leave, the honest parties will simply continue the protocol.
+\end{description}
+
+We also have some characterizations of adversaries:
+\begin{itemize}
+    \item Allowed adversarial behavior:
+          \begin{itemize}
+              \item Semi-honest (or passive/honest-but-curious): They follow the protocol description honestly, but they try to extract more information by inspecting the transcript. This is the weaker model.
+              \item Malicious/active: These adversaries can deviate arbitrarily from protocol description.
+          \end{itemize}
+    \item Adversary's computing power:
+          \begin{itemize}
+              \item Unbounded computing power: this gives us information-theoretic (IT) security.
+              \item PPT bounded: this gives us computational security.
+          \end{itemize}
+\end{itemize}
+
+If you're interested, you can look into the literature of how to define security for MPCs. The idea is similar to that of ZK proofs---everything an adversary can do (see the transcript) can be simulated by a simulator who only has the input and output.
diff --git a/notes.pdf b/notes.pdf
diff --git a/notes.tex b/notes.tex
@@ -97,7 +97,7 @@
 \include{lectures/2024-03-04.tex}
 \include{lectures/2024-03-06.tex}
 \include{lectures/2024-03-11.tex}
-% \include{lectures/2023-03-14.tex}
+\include{lectures/2024-03-13.tex}
 % \include{lectures/2023-03-16.tex}
 % \include{lectures/2023-03-21.tex}
 % \include{lectures/2023-03-23.tex}