Buterin: Insights into cross-L2 reads for wallets and other use cases
Author: Vitalik Buterin. Translation: Vernacular Blockchain
In my previous article on the three transitions, I outlined some of the key reasons why it is valuable to start explicitly thinking about L1 + cross-L2 support, wallet security, and privacy as necessary basic features of the ecosystem stack, rather than building each of these things as plugins that can be designed individually by single wallets.
This post will focus more directly on the technical aspects of one particular subproblem: how to more easily read L1 from L2, L2 from L1, or one L2 from another L2. Solving this problem is crucial for implementing an asset/keystore separation architecture, but it also has valuable use cases in other areas, most notably optimizing reliable cross-L2 calls, including use cases like moving assets between L1 and L2.

Contents:
- What is the goal?
- What does a cross-chain proof look like?
- What kinds of proof schemes can we use?
  - Merkle proofs
  - ZK-SNARKs
  - Special-purpose KZG proofs
  - Verkle trees
  - Proof aggregation
  - Direct state reads
- How does L2 learn a recent Ethereum state root?
- Wallets on non-L2 chains
- Privacy protection
- Summary
1. What is the goal?
Once L2s become mainstream, users will have assets on many L2s, and possibly on L1 as well. Once smart contract wallets (multisig, social recovery, or otherwise) become mainstream, the keys needed to access a given account will change over time, and old keys will need to stop being valid. Once both of those things happen, users will need a way to change the keys that have access to many accounts living in many different places, without making an extremely large number of transactions.

In particular, we need a way to handle counterfactual addresses: addresses that have not yet been "registered" on-chain in any way, but which still need to receive and securely hold funds. We all rely on counterfactual addresses: the first time you use Ethereum, you can generate an ETH address that someone can use to pay you, without "registering" the address on-chain (which would require paying a transaction fee, and hence already holding some ETH). With EOAs, all addresses start off as counterfactual addresses. With smart contract wallets, counterfactual addresses are still possible, largely thanks to CREATE2, which allows you to have an ETH address that can only be filled by a smart contract whose code matches a particular hash.
EIP-1014 (CREATE2) Address Calculation Algorithm
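As a reminder of the mechanics, a CREATE2 address is fully determined by the deployer address, a salt, and the hash of the initcode, which is what makes counterfactual smart-contract addresses possible. A minimal sketch of the formula follows; note that real Ethereum uses Keccak-256, while the Python standard library only ships SHA3-256 (which pads differently), so this shows the structure but does not produce real mainnet addresses:

```python
# Sketch of the EIP-1014 derivation:
#   address = keccak256(0xff ++ deployer ++ salt ++ keccak256(init_code))[12:]
# SHA3-256 is used below as a stdlib stand-in for Keccak-256 (outputs are
# illustrative only, not real Ethereum addresses).
import hashlib

def sketch_hash(data: bytes) -> bytes:
    return hashlib.sha3_256(data).digest()  # stand-in for Keccak-256

def create2_address(deployer: bytes, salt: bytes, init_code: bytes) -> bytes:
    assert len(deployer) == 20 and len(salt) == 32
    preimage = b"\xff" + deployer + salt + sketch_hash(init_code)
    return sketch_hash(preimage)[12:]  # keep the last 20 bytes

addr = create2_address(b"\x11" * 20, b"\x00" * 32, b"\x60\x00")
print(addr.hex())
```

Because the initcode hash is part of the preimage, the address commits to the wallet's initial code (and hence its initial verification key) before anything is deployed.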
However, smart contract wallets introduce a new challenge: the possibility of access keys changing. The address, which is a hash of the initcode, can only contain the wallet's initial verification key. The current verification key is stored in the wallet's storage, but that storage record does not magically propagate to other L2s. If a user has many addresses on many L2s, including addresses that the L2 they live on does not know about (because they are counterfactual), then there seems to be only one way to let users change their keys: an asset/keystore separation architecture. Each user has (i) a "keystore contract" (on either L1 or one particular L2), which stores the verification key for all of their wallets along with the rules for changing it, and (ii) "wallet contracts" on many chains, which read cross-chain to get the verification key.
There are two ways to achieve this:
1. Lightweight version (check only to update keys): each wallet stores the verification key locally, and contains a function that can be called to check a cross-chain proof of the keystore's current state and update its locally stored verification key to match. When a wallet is used for the first time on a particular L2, calling this function to fetch the current verification key from the keystore is mandatory.
- Pros: cross-chain proofs are used sparingly, so it does not matter much if they are expensive. All funds can only be spent with the current key, so this remains safe.
2. What does cross-chain proof look like?
To show the full complexity, we will explore the most difficult case: the keystore is on one L2, and the wallet is on a different L2. If the keystore is on L1, only half of this design is needed.
Assume the keystore is on Linea and the wallet is on Kakarot. A full proof of the wallet key includes:
3. What types of proof schemes can we use?
There are five main options:
- Merkle proofs
- General-purpose ZK-SNARKs
- Special-purpose proofs (eg. with KZG)
- Verkle proofs
- Direct state reads, requiring no proof
"Aggregation" refers to combining all the proofs supplied by users within each block into one big meta-proof. This is possible for SNARKs and KZG, but not for Merkle branches (you can combine Merkle branches a little bit, but it only saves you log(txs per block) / log(total number of keystores), perhaps 15–30% in practice, so it's probably not worth the cost). Aggregation only becomes worthwhile once the scheme has a substantial number of users, so realistically a version-1 implementation could leave aggregation out and introduce it in version 2.
4. How will Merkle proofs work?
This one is simple: directly follow the diagram in the previous section. More precisely, each "proof" (assuming the maximum-difficulty case of proving one L2 into another L2) would contain:

- A Merkle branch proving the state root of the keystore-holding L2, given the most recent Ethereum state root that the wallet-holding L2 knows about. The keystore-holding L2's state root is stored in a known storage slot of a known address (the contract on L1 representing the L2), so the path through the tree can be hardcoded.
- A Merkle branch proving the current verification key, given the state root of the keystore-holding L2. Here again, the verification key is stored in a known storage slot of a known address, so the path can be hardcoded.

Unfortunately, Ethereum state proofs are complicated, though libraries exist for verifying them, and if you use those libraries this mechanism is not too complicated to implement.

The bigger problem is cost. Merkle proofs are long, and Patricia trees are unfortunately ~3.9x longer than necessary (to be precise: an ideal Merkle proof into a tree holding N objects is 32 * log2(N) bytes long, and because Ethereum's Patricia trees have 16 children per node, proofs for those trees are 32 * 15 * log16(N) ~= 125 * log2(N) bytes long). In a state with roughly 250 million (~2²⁸) accounts, this makes each proof 125 * 28 = 3500 bytes, or about 56,000 gas, plus extra costs for decoding and verifying hashes.

Two proofs together would cost around 100,000 to 150,000 gas (not including signature verification, if these are used per-transaction) — significantly more than the current base cost of 21,000 gas per transaction. But the disparity gets worse if the proof is being verified on L2. Computation inside an L2 is cheap, because the computation is done off-chain and in an ecosystem with far fewer nodes than L1. Data, on the other hand, has to be posted to L1. So the comparison is not 21,000 gas vs ~150,000 gas; it is 21,000 L2 gas vs ~150,000 L1 gas.
We can calculate what this means by looking at the comparison between the L1 gas cost and the L2 gas cost:
Currently, on L1 a simple send is 15–25x more expensive than on L2, and a token swap is 20–50x more expensive. Simple sends are relatively data-heavy, while swaps are much more computation-heavy. Hence, swaps are a better benchmark for approximating the cost of L1 computation versus L2 computation. Taking all this into account, if we assume a 30x cost ratio between L1 computation and L2 computation, this seems to imply that putting a Merkle proof on L2 would cost the equivalent of perhaps fifty regular transactions. Using a binary Merkle tree would cut costs by roughly 4x, but even then, the cost is in most cases too high — and if we are willing to pay the price of no longer being compatible with Ethereum's current hexary state tree, we might as well look for even better options.
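The proof-size arithmetic above can be checked in a few lines. The constants here are the article's own figures: 16 gas per non-zero calldata byte, and a state of ~2²⁸ accounts:

```python
import math

CALLDATA_GAS_PER_BYTE = 16   # L1 cost of a non-zero calldata byte
N_ACCOUNTS = 2 ** 28         # ~250 million accounts, per the text

def hexary_patricia_proof_bytes(n: int) -> float:
    # 15 sibling hashes of 32 bytes at each of log16(n) levels
    return 32 * 15 * math.log(n, 16)

def binary_merkle_proof_bytes(n: int) -> float:
    # ideal binary tree: one 32-byte sibling hash per level
    return 32 * math.log2(n)

patricia = hexary_patricia_proof_bytes(N_ACCOUNTS)   # 3360 bytes exactly;
                                                     # the text rounds 32*15*log16
                                                     # to 125*log2, giving 3500
gas_one_proof = patricia * CALLDATA_GAS_PER_BYTE     # ~54k gas of data alone
ratio = patricia / binary_merkle_proof_bytes(N_ACCOUNTS)  # 15/4 = 3.75x overhead
print(round(patricia), round(gas_one_proof), round(ratio, 2))
```

The 3.75x data overhead of the hexary tree versus an ideal binary tree is where the "~4x savings from binary Merkle trees" figure comes from.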
5. How will the ZK-SNARK proof work?
Using ZK-SNARKs is also conceptually easy to understand: you simply replace the Merkle proofs in the diagram above with a ZK-SNARK proving that those Merkle proofs exist. A ZK-SNARK costs ~400,000 gas of computation to verify, and takes roughly 400 bytes (compare: 21,000 gas and 100 bytes for a basic transaction, with the bytes reducible to ~25 in the future with compression). Hence, in terms of computation, a ZK-SNARK costs 19x the cost of a basic transaction today, and in terms of data, it costs 4x a basic transaction today and 16x a future basic transaction. Those numbers are a big improvement over Merkle proofs, but they are still fairly expensive. There are two ways to improve on this: (i) special-purpose KZG proofs, or (ii) aggregation, similar to ERC-4337 aggregation but using fancier math. We can look into both.
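The ratios quoted above follow directly from the stated numbers:

```python
SNARK_VERIFY_GAS, SNARK_BYTES = 400_000, 400       # quoted ZK-SNARK costs
BASIC_TX_GAS, BASIC_TX_BYTES = 21_000, 100         # basic transaction today
COMPRESSED_TX_BYTES = 25                           # possible future size

compute_ratio = SNARK_VERIFY_GAS / BASIC_TX_GAS        # ~19x compute
data_ratio_today = SNARK_BYTES / BASIC_TX_BYTES        # 4x data today
data_ratio_future = SNARK_BYTES / COMPRESSED_TX_BYTES  # 16x data after compression
print(round(compute_ratio), data_ratio_today, data_ratio_future)
```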
6. How will special-purpose KZG proofs work?
Warning: this section is more mathematical than the others. That is because we are going beyond general-purpose tools and building something special-purpose and cheaper, so we have to go "under the hood" more. If deep math is not your thing, skip straight to the next section.

First, a recap of how KZG commitments work. We can represent a set of data [D_1 ... D_n] with a KZG commitment to a polynomial derived from that data: specifically, the polynomial P where P(w) = D_1, P(w²) = D_2 ... P(wⁿ) = D_n. Here, w is a "root of unity": a value where w^N = 1 for some evaluation domain size N (all of this is done in a finite field).

To "commit" to P, we create an elliptic curve point com(P) = P₀ * G + P₁ * S₁ + ... + P_k * S_k, where:
- G is the generator point of the curve
- P_i is the i-th coefficient of the polynomial P
- S_i is the i-th point in the trusted setup

To prove that P(z) = a, we create a quotient polynomial Q = (P - a) / (X - z), and create a commitment com(Q) to it. It is only possible to create such a polynomial if P(z) actually equals a. To verify the proof, we check the equation Q * (X - z) = P - a by doing an elliptic curve pairing check on the proof com(Q) and the commitment com(P): we check e(com(Q), com(X - z)) ?= e(com(P) - com(a), com(1)).

Some key properties to understand include:
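The central algebraic fact — that the quotient (P - a) / (X - z) divides evenly exactly when P(z) = a — can be checked concretely. Below is a toy sketch over a small prime field: it has no elliptic curves, pairings, or trusted setup, so it is not KZG itself, just the divisibility property KZG relies on:

```python
P_MOD = 97  # small prime modulus, for illustration only

def poly_eval(coeffs, x):
    """Evaluate a polynomial (lowest-degree coefficient first) at x, mod P_MOD."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % P_MOD
    return acc

def divide_by_linear(coeffs, z):
    """Synthetic division of a polynomial by (X - z).
    Returns (quotient coefficients, remainder); the remainder equals P(z)."""
    acc = 0
    quotient = []
    for c in reversed(coeffs):       # process highest-degree coefficient first
        acc = (acc * z + c) % P_MOD
        quotient.append(acc)
    remainder = quotient.pop()       # final accumulator value is P(z)
    return list(reversed(quotient)), remainder

P = [5, 3, 2]                        # P(X) = 5 + 3X + 2X^2
z = 4
a = poly_eval(P, z)                  # the honestly claimed evaluation
shifted = [(P[0] - a) % P_MOD] + P[1:]   # the polynomial P - a
Q, rem = divide_by_linear(shifted, z)
print(rem)  # 0: division is exact precisely because the claim P(z) = a is true
```

Lying about the evaluation (claiming a + 1 instead of a) leaves a nonzero remainder, so no valid quotient polynomial — and hence no valid com(Q) — can be constructed.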
Therefore, a full proof requires:
7. How will the Verkle tree work?
Verkle trees essentially involve stacking KZG commitments (or IPA commitments, which can be more efficient and use simpler cryptography): to store 2⁴⁸ values, you make a KZG commitment to a list of 2²⁴ values, each of which is itself a KZG commitment to 2²⁴ values. Verkle trees are being strongly considered for the Ethereum state tree, because Verkle trees can hold key-value maps, not just lists (basically, you can create a tree of size 2²⁵⁶ but start it empty, only filling in specific parts of the tree once you actually need to fill them).
What a Verkle tree looks like. In practice, you might give each node a width of 256 == 2⁸ for IPA-based trees, or 2²⁴ for KZG-based trees. Proofs in Verkle trees are somewhat longer than KZG proofs; they can be a few hundred bytes long. They are also difficult to verify, especially if you try to aggregate many proofs into one. Realistically, Verkle trees should be thought of as being like Merkle trees, but more viable without SNARKing (because of lower data costs) and cheaper with SNARKing (because of lower prover costs). The biggest advantage of Verkle trees is the possibility of harmonizing data structures: Verkle proofs could be used directly over the L1 or L2 state, without overlay structures, using the exact same mechanism for L1 and L2. Once quantum computers become an issue, or once proving Merkle branches becomes efficient enough, Verkle trees could be replaced in-place with a binary hash tree using a suitable SNARK-friendly hash function.
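The width/depth trade-off described above is easy to quantify; the widths 2⁸ and 2²⁴ and the 2⁴⁸-value state size are the figures from the text:

```python
def tree_depth(total_values: int, node_width: int) -> int:
    """Number of commitment levels needed so node_width ** depth >= total_values.
    An integer loop avoids floating-point log rounding errors."""
    depth, capacity = 0, 1
    while capacity < total_values:
        capacity *= node_width
        depth += 1
    return depth

STATE_SIZE = 2 ** 48
print(tree_depth(STATE_SIZE, 2 ** 24))  # 2 levels for a KZG-based tree
print(tree_depth(STATE_SIZE, 2 ** 8))   # 6 levels for an IPA-based tree (width 256)
```

Wider nodes mean shallower trees (and hence fewer commitments per proof), at the cost of more work per node to open each commitment.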
8. Aggregation
If N users make N transactions (or, more realistically, N ERC-4337 UserOperations) that need to prove N cross-chain claims, we can save a lot of gas by aggregating those proofs: the builder that combines those transactions into a block, or into a bundle that goes into a block, can create a single proof that proves all of those claims simultaneously. This could mean:
Diagram from the BLS wallet implementation post showing the workflow for BLS aggregate signatures in an early version of ERC-4337. The workflow for aggregating cross-chain proofs may look very similar.
9. Direct state reading
A final possibility, usable only for L2 reading L1 (and not L1 reading L2), is to modify L2s so that they can make direct static calls to contracts on L1. This could be done with an opcode or precompile that allows calls into L1, where you provide the destination address, gas, and calldata, and it returns the output — though because these calls are static, they cannot actually change any L1 state. L2s already have to be aware of L1 to process deposits, so there is nothing fundamental preventing such a thing from being implemented; it is mostly a technical implementation challenge (see: this RFP from Optimism to support static calls into L1). Note that if the keystore is on L1, and the L2 integrates L1 static-call functionality, then no proofs are needed at all! However, if the L2 does not integrate L1 static calls, or if the keystore is on an L2 (which may eventually have to happen, once L1 gets too expensive for users to use even a little), then proofs will be required.
10. How does L2 learn the most recent Ethereum state root?
All of the schemes above require the L2 to access either the recent L1 state root or the entire recent L1 state. Fortunately, all L2s already have some functionality to access the recent L1 state, because they need it to process messages from L1 to L2, most notably deposits. Indeed, if an L2 has a deposit feature, then you can use that L2 as-is to move L1 state roots into a contract on the L2: just have a contract on L1 call the BLOCKHASH opcode, and pass the result to L2 as a deposit message. The full block header can then be received on the L2 side, and its state root extracted. But it would be much better for every L2 to have an explicit way to access either the full recent L1 state or recent L1 state roots directly. The main challenge in optimizing how L2s receive recent L1 state roots is achieving safety and low latency at the same time:
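The BLOCKHASH-and-deposit flow above can be sketched as follows. All names here are hypothetical (there is no standard `L2StateRootOracle` contract), and stdlib SHA3-256 stands in for Keccak-256; the point is only the trust flow: the L1→L2 bridge delivers a block hash obtained via BLOCKHASH, and anyone can then supply the matching header so the L2 side can extract the state root from it:

```python
import hashlib
from dataclasses import dataclass

def h(data: bytes) -> bytes:
    # stand-in for Keccak-256 (stdlib SHA3-256; Ethereum's padding differs)
    return hashlib.sha3_256(data).digest()

@dataclass
class BlockHeader:
    # real headers have many more fields; two suffice for the sketch
    parent_hash: bytes
    state_root: bytes

    def block_hash(self) -> bytes:
        return h(self.parent_hash + self.state_root)

class L2StateRootOracle:
    """Hypothetical L2 contract that learns L1 state roots via deposit messages."""
    def __init__(self):
        self.trusted_hashes = set()
        self.latest_state_root = None

    def receive_deposit_message(self, l1_block_hash: bytes):
        # Delivered by the canonical L1->L2 bridge; on L1 the sending
        # contract obtained this value from the BLOCKHASH opcode.
        self.trusted_hashes.add(l1_block_hash)

    def submit_header(self, header: BlockHeader):
        # Anyone may submit the full header; it is authenticated by its hash.
        assert header.block_hash() in self.trusted_hashes, "unknown block hash"
        self.latest_state_root = header.state_root

header = BlockHeader(parent_hash=b"\x00" * 32, state_root=b"\xab" * 32)
oracle = L2StateRootOracle()
oracle.receive_deposit_message(header.block_hash())  # via the deposit bridge
oracle.submit_header(header)
print(oracle.latest_state_root == b"\xab" * 32)  # True
```

Note that the header itself never travels through the bridge; only its 32-byte hash does, which is what keeps the deposit message cheap.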
Blocks of an L2 chain can depend not only on previous L2 blocks, but also on an L1 block. If L1 reverts past such a link, the L2 reverts too. Notably, this is also how earlier (pre-Dank) versions of sharding were envisioned to work; see here for the code.
11. How much connection to Ethereum does another chain need to hold wallets whose keystores are rooted on Ethereum or an L2?
Surprisingly, not that much. It does not even actually need to be a rollup: if it is an L3 or a validium, then it is fine to hold wallets there, as long as you hold keystores either on L1 or on a ZK rollup. What you really need is for the chain to have direct access to Ethereum state roots, plus a technical and social commitment to be willing to reorg if Ethereum reorgs, and to hard fork if Ethereum hard forks. An interesting research problem is figuring out to what extent it is possible for a chain to have this form of connection to multiple other chains (eg. Ethereum and Zcash). Doing it naively is possible: your chain could agree to reorg if Ethereum or Zcash reorgs (and hard fork if either hard forks), but then your node operators and your community more broadly carry double the technical and political dependencies. Hence such a technique could be used to connect to a few other chains, but at increasing cost. Schemes based on ZK bridges have attractive technical properties, but their key weakness is that they are not robust to 51% attacks or hard forks. There may be smarter solutions as well.
12. Privacy protection
Ideally, we also want to preserve privacy. If you have many wallets managed by the same keystore, then we want to ensure that:
The main challenge with this approach currently is that aggregation requires the aggregator to create a recursive SNARK, which today is quite slow. With KZG, we can use this work on non-index-revealing KZG proofs (see also: a more formalized version of that work in the Caulk paper) as a starting point. However, aggregation of blinded proofs is an open problem requiring more attention. Reading L1 directly from inside L2, unfortunately, does not preserve privacy, though implementing direct-read functionality is still very useful, both to minimize latency and because of its utility for other applications.
13. Summary
To have cross-chain social recovery wallets, the most realistic workflow is a wallet that maintains a keystore in one place and wallets in many places, where the wallet reads the keystore either (i) to update its local view of the verification key, or (ii) as part of the process of verifying each transaction.

- A key ingredient that makes this possible is cross-chain proofs. We need to optimize these proofs hard. Either ZK-SNARKs (while we wait for Verkle proofs) or a custom-made KZG solution seem like the best options.
- In the longer term, aggregation protocols — where bundlers generate aggregate proofs as part of creating a bundle of all the UserOperations submitted by users — will be necessary to minimize costs. This should probably be integrated into the ERC-4337 ecosystem, though changes to ERC-4337 will likely be required.
- L2s should be optimized to minimize the latency of reading L1 state (or at least state roots) from within the L2. L2s reading L1 state directly is ideal and would save on proof space.
- Wallets can live not just on L2s; you can also put wallets on chains with lower levels of connection to Ethereum (L3s, or even separate chains that merely agree to include Ethereum state roots and to reorg or hard fork when Ethereum does).
- However, keystores should be either on L1 or on a high-security ZK-rollup L2. Keeping them on L1 saves a lot of complexity, though in the long run even that may become too expensive, hence the need for keystores on L2s.
- Preserving privacy will require additional work and will make some options more difficult. But we should probably move toward privacy-preserving solutions anyway, and at the very least make sure that whatever we propose is forward-compatible with preserving privacy.