Proof of (IPFS) Content

Olaf van Wijk
Apr 20, 2024

Using Risc0’s zkVM & Zero-Knowledge Proofs

A bit of background

IPFS is quite prevalent in all things decentralized, from NFT information to entire decentralized websites and even complete libraries.

IPFS hashes are often used in smart contracts to reference some state, signed messages, or, as stated above, the contents of NFTs. However, that was always the end of it. There was no way to make any on-chain claims about the content behind these references. That had to change!

I have been doing Circom ZKPs for a while now, and proving Merkle trees is the go-to method to prove any kind of inclusion. TBH, for most use cases this is still FAR superior to using IPFS content references, but... in the end IPFS content is no different. They are trees. Not Merkle trees in the traditional sense, but trees nonetheless.

Now, a problem we quickly face with commonly used ZKP methods is that we need quite some prior knowledge of the tree structure and proof sizes when we build the circuits. But with IPFS we usually don’t have control over how someone else structures such data.

To the rescue: Risc0, which allows us to use plain Rust to build any proof easily with libraries that already exist! And for IPFS this mainly means Protobuf interactions!

The result

Well, this is fairly straightforward:

A zero-knowledge proof built with Risc0, using Bonsai as the prover, capable of proving that any subselection of bytes is part of any IPFS file! Yes! ANY IPFS file! I have tested it with CSV files larger than 22GB(!!). Want to prove the existence of a single line and reference it inside a smart contract? No problem! Totally possible! (Using Bonsai, the entire process takes ~70–140 seconds, depending on the size of the subselection and whether it crosses multiple IPFS blocks.)

The process

After some trial and error, this turned out to be quite simple:

As the ‘host’ (non-ZKP-related code, could be any language):

  1. Select the start and end byte position of the subselection within the IPFS file.
  2. Retrieve the IPFS file with your local IPFS node.
  3. Find the IPFS blocks containing the subselection.
  4. Produce a proof by hashing according to IPFS’s hashing methods based on protobuf messages.
  5. Transform this proof into a single byte array that allows for ‘branched’ hashing, similar to Merkle proofs.
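
Step 3 above can be sketched in a few lines. This is only an illustration and not the code from the repository: the function name `blocks_for_range`, the half-open `[start, end)` convention, and the idea of passing in a list of leaf block sizes are all assumptions made for this example. In practice the sizes would come from the file's DAG (256 KiB per leaf is the usual default for fixed-size chunking, but a file can be chunked differently).

```python
from typing import List, Tuple

# Assumption: fixed-size chunking with 256 KiB leaves; other chunkers exist.
DEFAULT_CHUNK_SIZE = 262144


def blocks_for_range(block_sizes: List[int], start: int, end: int) -> List[Tuple[int, int, int]]:
    """Return (block_index, local_start, local_end) for every leaf block that
    overlaps the half-open byte range [start, end) of the file."""
    total = sum(block_sizes)
    if not (0 <= start < end <= total):
        raise ValueError("range must satisfy 0 <= start < end <= file size")

    hits = []
    offset = 0  # file offset at which the current block begins
    for index, size in enumerate(block_sizes):
        block_start, block_end = offset, offset + size
        # Two half-open ranges overlap when each starts before the other ends.
        if block_start < end and start < block_end:
            local_start = max(start, block_start) - block_start
            local_end = min(end, block_end) - block_start
            hits.append((index, local_start, local_end))
        offset = block_end
    return hits
```

A subselection that straddles a block boundary simply yields two entries, one per block, each with offsets relative to its own block.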

Using the prover:

Just hash the content and the tree in the correct order to produce the IPFS-encoded content hash. Return the content hash and the subselection of the content as part of the receipt and voilà! There you have it! A proof of content! How cool is that!
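
For reference, here is a minimal sketch of what an "IPFS-encoded content hash" looks like for a v0 identifier: the SHA-256 digest of a block, prefixed with the multihash bytes `0x12 0x20`, then base58btc-encoded. The helper names `base58btc_encode` and `cid_v0` are made up for this example, and `block` is assumed to be the already serialized block bytes (the protobuf-encoded node), not the raw file content.

```python
import hashlib

BASE58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def base58btc_encode(data: bytes) -> str:
    """Encode bytes with the Bitcoin base58 alphabet."""
    number = int.from_bytes(data, "big")
    encoded = ""
    while number > 0:
        number, remainder = divmod(number, 58)
        encoded = BASE58_ALPHABET[remainder] + encoded
    # Each leading zero byte is represented by a leading '1'.
    leading_zeros = len(data) - len(data.lstrip(b"\x00"))
    return "1" * leading_zeros + encoded


def cid_v0(block: bytes) -> str:
    """Return the v0 content identifier of an already serialized block."""
    digest = hashlib.sha256(block).digest()
    # Multihash prefix: 0x12 = sha2-256, 0x20 = digest length of 32 bytes.
    multihash = bytes([0x12, 0x20]) + digest
    return base58btc_encode(multihash)
```

Because the multihash always starts with the same two bytes, every identifier produced this way is 46 characters long and begins with `Qm`.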

The content that goes to the prover has a simple structure but is a single stream of bytes that is recursively hashed like this:

hash(hash(hash(ipfs-block[n...[subselection]...m]), ipfs-blocks...), ipfs-blocks...)
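
The nesting above can be sketched as a loop. This is a simplified illustration under stated assumptions, not the byte format used in the repository: each level of the proof is modelled as a hypothetical `(before, after)` pair holding the bytes of the parent block that surround the child's digest, and plain SHA-256 is used at every level. In a real IPFS node the surrounding bytes would be the protobuf framing and the sibling links.

```python
import hashlib
from typing import List, Tuple


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def root_digest(
    selection: bytes,
    leaf_prefix: bytes,
    leaf_suffix: bytes,
    path: List[Tuple[bytes, bytes]],
) -> bytes:
    """Recompute the root digest from a subselection and its branch.

    leaf_prefix / leaf_suffix: bytes of the leaf block around the selection.
    path: one (before, after) pair per parent level, leaf-side first; each
    parent block is rebuilt as before + child_digest + after.
    """
    digest = sha256(leaf_prefix + selection + leaf_suffix)
    for before, after in path:
        digest = sha256(before + digest + after)
    return digest
```

The verifier only ever sees the subselection, its surrounding bytes, and one pair per level, so the work grows with the depth of the tree rather than with the size of the file. Changing a single byte of the selection changes every digest on the way up and therefore the root.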

Now, to be fair, since the structure of IPFS is known, it would be possible to automatically construct a set of fixed circuits that could prove different content sizes, but using Risc0 this is a breeze!

Limits & the code

First of all, this test was done using IPFS v0, so its possible use is quite limited, as it doesn’t cover all the IPFS constructions in use nowadays. This also means it doesn’t do folders, but there is no reason why it couldn’t.

Secondly, this was one of my first times trying out Rust (definitely not the last!), so the code is quite messy, poorly documented, and generally in poor shape. There is not a lot of code, so I am pretty sure a seasoned Rust developer can make sense of it! So before you look: please accept my sincere apologies, but I didn’t have time to make this a nice, clean open-source library ;)

The code can be found here: https://github.com/ovanwijk/risc0-ipfs

How useful is this?

Not sure. If you have control over the data you want to reference on-chain, you will always be better off using more efficient methods. However! If you want to prove something about data structures you have no control over, this is very useful! Want to quote the IPFS-hosted Wikipedia on-chain? Well, this has got you covered.

And what if every blockchain hosted each of its blocks on IPFS and wanted to reference contents across chains? Then this could actually turn out to be quite useful as a basis for cross-chain messaging.

Why publish this?

I was planning to show this at ETH Istanbul but couldn’t make it at the last minute. I have no intention of using any of this in any product I am working on, but it would be a waste not to show it, as I think it is quite a nice concept. The code has just been collecting digital dust since then. However, I might use these concepts in a hackathon when I have time. Beyond that, anyone is free to use it as they see fit. If you ever build a multi-million-dollar project with it, some credit is welcome of course ;).

If you have a cool (hackathon) idea, want me to walk you through the code, or want to adopt the method, feel free to drop me a DM on Twitter: https://twitter.com/ovanwijk

Cheers!
