Background
What are zero-knowledge proofs?
HollowDB utilizes zero-knowledge proofs within to provide a zero-knowledge authentication scheme. So, what are they?
Zero-Knowledge Proofs
Zero-Knowledge Proofs (ZKPs) is a method that allows one to prove that a given statement is true, without revealing any other information other than the statement itself! We usually refer to the proving party as the Prover, and the verifying party as the Verifier.
Example statements are:
“I know the solution to some puzzle”, which the prover must prove without showing the solution itself.
“I know some such that ”, which the prover must prove without revealing what is.
“I know the private key that corresponds to some public key”, which the prover must prove without revealing the private key.
Regarding the last example, if you have been into the Web3 for some time you might think “can’t we do that by signing a message with some private key and use ecrecover
to get the public key?” and you would be right! Indeed, that is a zero-knowledge proof (although a sub-class of it called honest-verifier ZKP).
Anatomy of a ZKP
Looking at a zero-knowledge protocol as a very high-level diagram, we have the following flow:
A prover has some secret inputs (a witness) that they would like to keep secret, and they might also have some public inputs. They feed these into an algebraic circuit, which is really like an electric circuit but instead of electricity, it works on non-negative integers (i.e. elements of a finite field) and you can only do addition and multiplication.
As a result, they get the output of this computation, along with a proof. Note that the output is not really necessary too, you could also have a proof without giving any outputs, which is a way of saying “hey I have ran this circuit that you have told me to, and I got no errors”. An example of this is a Sudoku solution prover circuit, where a public puzzle is provided and the user feeds their secret solution to the circuit. The circuit then makes sure the solution is valid, and basically compiles without failures if indeed it is valid.
Hashing
Before we move on, we also need to describe what “hashing” is, which is used extensively in the zero-knowledge proofs of HollowDB. Hashing is simply a function that takes some arbitrary input, and outputs a fixed-length output. We refer to the input as preimage, and the output as digest or hash.
We expect the following properties from a hash function :
Given some hash such that it should be really hard to find what is. This is called preimage resistance.
Given an input , it should be really hard to find another input such that . This is called second-preimage resistance.
Given two inputs and , it should be really unlikely that . This is called collision resistance.
The output of the hash function should appear “random”, i.e. it should be distributed as even as possible. In doing so, even just a slight change in the input should completely change the output. This is commonly referred to as avalanche effect.
Note that the input size is arbitrary but the output size is fixed, is that a problem for the properties above? Well it certainly could be; however, in practice the output length of these hash functions are pretty big, such as 256-bits or 512-bits. There are possible outputs for a 256-bit output, which is a lot more than the number of atoms in the world.
So how does hashing relate to zero-knowledge proofs? Imagine that you wrote the entire hash function as an arithmetic circuit, and you provide the preimage as the secret input. The output will be the digest, and you will have a proof that you know what preimage resulted in this digest. In other words, you can prove the statement “I know some such that for a publicly known " by simply writing the entire hash function as a circuit.
There are many different hash functions with varying security levels and output lengths, and the most important thing to note is that not all of them are circuit-friendly. What this means is that, some hash functions (e.g. SHA256) are really costly to implement with a circuit. A higher cost means more gates and more constraints, thus requiring a longer proving time and circuit-setup time. Thankfully, there are friendlier hash functions, a well-known one being the Poseidon hash.
Last updated