Hi everyone,
In this discussion, I'm only interested in the size of a block when sent across
the wire during propagation. It's desirable in this context to have small block
sizes (in bytes) so that blocks can traverse the network quickly without
causing too much additional traffic. As most nodes will generally receive the
data included in a transaction twice (once when the transaction is propagated,
and again when it appears in a block), some obvious optimizations are possible.
As a reminder, a Grin block is composed of the following:
- A header, including the proof of work, some Merkle-ish roots, a reference to
the previous block hash, etc. The header is always required in full.
- A set of inputs as a direct link to previous outputs.
- A set of outputs, containing a commitment and a range proof.
- Transaction kernels, containing a signature, a public key and the explicit
fee.
The largest piece of data by a good margin right now is the range proof (~5 KB
each). Therefore we can introduce a first optimization:
- Block gets propagated sans range proof.
- Upon reception, a node checks whether it has all the range proofs for all
outputs included in the block.
- If any are missing, the node requests them from the peer that sent the block.
- Validation can proceed, skipping the range proofs that were already known
(assuming they were already validated by the transaction pool).
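The missing-proof check in the steps above could look roughly like this. This is a minimal sketch: `ProofCache`, `OutputId`, and the 32-byte ids are illustrative assumptions, not Grin's actual types.

```rust
use std::collections::HashMap;

// Hypothetical types for illustration; Grin's real structures differ.
type OutputId = [u8; 32]; // identifies an output (e.g. its commitment)
type RangeProof = Vec<u8>; // ~5 KB each in practice

/// Range proofs already seen and validated via the transaction pool.
struct ProofCache {
    proofs: HashMap<OutputId, RangeProof>,
}

impl ProofCache {
    /// Given the outputs of a block received without range proofs,
    /// return the ids whose proofs must be requested from the peer.
    fn missing_proofs(&self, block_outputs: &[OutputId]) -> Vec<OutputId> {
        block_outputs
            .iter()
            .filter(|id| !self.proofs.contains_key(*id))
            .cloned()
            .collect()
    }
}

fn main() {
    let mut cache = ProofCache { proofs: HashMap::new() };
    // The pool has already validated the proof for one output.
    cache.proofs.insert([1u8; 32], vec![0u8; 5000]);
    // A block arrives referencing two outputs; only one proof is missing
    // and needs to be fetched from the sending peer.
    let missing = cache.missing_proofs(&[[1u8; 32], [2u8; 32]]);
    assert_eq!(missing, vec![[2u8; 32]]);
}
```

Proofs resolved from the cache can then be skipped during block validation, since the pool validated them at relay time.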
The second largest piece of data is composed of the transaction kernels (~100
bytes each). We can't fully avoid sending those as they're not "anchored" on
anything, like the range proof is anchored on an output. However we could send
a shortened version of their hash. In this scheme, when sent across the wire, a
block would only contain the first 16 bytes (or maybe even 12) of each
kernel's full hash. Upon reception, an exchange similar to the one for range
proofs would occur. Note that this requires that the pool maintain a mapping
from those truncated hashes to the actual kernels.
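A sketch of that truncated-hash mapping on the pool side. The 16-byte choice mirrors the text; `Kernel`, `KernelHash`, and `KernelIndex` are placeholder names, not the actual implementation.

```rust
use std::collections::HashMap;

// Hypothetical placeholders; the real kernel hash would be the hash of
// the fully serialized kernel (signature, public key, fee).
type KernelHash = [u8; 32];
type ShortId = [u8; 16]; // only these bytes travel in the block

#[derive(Clone)]
struct Kernel {
    fee: u64, // stand-in for the full kernel contents (~100 bytes)
}

/// Truncate a full kernel hash to its first 16 bytes for the wire.
fn short_id(full: &KernelHash) -> ShortId {
    let mut id = [0u8; 16];
    id.copy_from_slice(&full[..16]);
    id
}

/// Mapping the pool would maintain from truncated hashes to kernels,
/// so short ids in an incoming block resolve without a round trip.
struct KernelIndex {
    by_short_id: HashMap<ShortId, Kernel>,
}

impl KernelIndex {
    fn insert(&mut self, hash: &KernelHash, kernel: Kernel) {
        self.by_short_id.insert(short_id(hash), kernel);
    }

    /// `None` means the full kernel must be requested from the peer.
    fn resolve(&self, id: &ShortId) -> Option<&Kernel> {
        self.by_short_id.get(id)
    }
}

fn main() {
    let mut index = KernelIndex { by_short_id: HashMap::new() };
    let hash = [7u8; 32];
    index.insert(&hash, Kernel { fee: 10 });
    assert!(index.resolve(&short_id(&hash)).is_some());
    assert!(index.resolve(&[0u8; 16]).is_none());
}
```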
Beyond this, optimizations become non-trivial. We do *not* want to introduce
some transaction-level hash as transactions are disaggregated in blocks by
design. We could introduce something like invertible Bloom filters for inputs
or outputs, but those items are still rather small (32 to 40 bytes each), so
the trade-off of space savings over complexity doesn't seem worth it at this
point.
Both optimizations add a couple of protocol-level messages (a request/response
pair for each data type) and some additional interfaces and logic for the
existence checks on the transaction pool. That adds a little complexity, but it
stays contained and fairly manageable. The first optimization is by far the
most interesting; I'm thinking of the second one more as a "nice to have".
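The added protocol messages could be sketched as two request/response pairs. These names and shapes are purely illustrative, not Grin's actual p2p message types.

```rust
// Illustrative wire messages; payload types are simplified byte vectors.
enum Message {
    GetRangeProofs { output_ids: Vec<[u8; 32]> }, // request proofs by output
    RangeProofs { proofs: Vec<Vec<u8>> },         // full proofs in reply
    GetKernels { short_ids: Vec<[u8; 16]> },      // request by truncated hash
    Kernels { kernels: Vec<Vec<u8>> },            // full kernels in reply
}

/// A peer handling a request only ever answers with the matching
/// response type; anything else is a protocol error.
fn response_kind(request: &Message) -> Option<&'static str> {
    match request {
        Message::GetRangeProofs { .. } => Some("RangeProofs"),
        Message::GetKernels { .. } => Some("Kernels"),
        _ => None, // responses don't themselves get responses
    }
}

fn main() {
    let req = Message::GetKernels { short_ids: vec![[0u8; 16]] };
    assert_eq!(response_kind(&req), Some("Kernels"));
}
```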
Does it all sound reasonable? Am I missing something?
- Igno
--
Mailing list: https://launchpad.net/~mimblewimble
Post to : [email protected]
Unsubscribe : https://launchpad.net/~mimblewimble
More help : https://help.launchpad.net/ListHelp