Hi everyone,

In this discussion, I'm only interested in the size of a block when sent across 
the wire during propagation. It's desirable in this context to have small block 
sizes (in bytes) so that blocks can traverse the network quickly without 
causing too much additional traffic. As most nodes will generally receive the 
data included in a transaction twice (once when the transaction itself is 
propagated, and again when it appears in a block), some obvious optimizations 
are possible.

As a reminder, a Grin block is composed of the following:

- A header, including the proof of work, some Merkle-ish roots, a reference to 
the previous block hash, etc. The full header is always required.

- A set of inputs, each a direct reference to a previous output.

- A set of outputs, containing a commitment and a range proof.

- Transaction kernels, containing a signature, a public key and the explicit 
fee.
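
To make the discussion concrete, here's a hypothetical sketch of that block 
layout; the field names and fixed sizes below are illustrative assumptions, 
not Grin's actual types:

```rust
// Illustrative stand-ins for the cryptographic primitives; sizes are
// approximate and the names are made up for this sketch.
type Hash = [u8; 32];
type Commitment = [u8; 33]; // Pedersen commitment
type Signature = [u8; 64];
type PublicKey = [u8; 33];

struct BlockHeader {
    previous: Hash,    // reference to the previous block hash
    output_root: Hash, // one of the Merkle-ish roots
    kernel_root: Hash,
    pow_nonce: u64,    // stand-in for the proof of work
}

struct Input {
    commitment: Commitment, // direct reference to a previous output
}

struct Output {
    commitment: Commitment,
    range_proof: Vec<u8>, // ~5 KB each: the dominant cost on the wire
}

struct TxKernel {
    excess: PublicKey, // the kernel's public key
    signature: Signature,
    fee: u64, // explicit fee
}

struct Block {
    header: BlockHeader,
    inputs: Vec<Input>,
    outputs: Vec<Output>,
    kernels: Vec<TxKernel>,
}
```

Even with a handful of outputs, the range proofs dwarf everything else, which 
motivates the first optimization below.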


The largest piece of data by a good margin right now is the range proof (~5 KB 
each). Therefore we can introduce a first optimization:

- The block gets propagated without its range proofs.

- Upon reception, a node checks whether it already has the range proofs for 
all outputs included in the block.

- If any are missing, the node requests them from the peer that sent the block.

- Validation can proceed, skipping the range proofs that were already known 
(assuming they were already validated by the transaction pool).
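
The reception step could look something like the following minimal sketch; the 
pool type and method names here are assumptions for illustration, not Grin's 
actual API:

```rust
use std::collections::HashMap;

type Commitment = [u8; 33];
type RangeProof = Vec<u8>;

struct TxPool {
    // Range proofs already seen (and validated) during tx propagation.
    proofs: HashMap<Commitment, RangeProof>,
}

impl TxPool {
    /// Split a block's output commitments into proofs we already hold
    /// (validation can skip them) and proofs to request from the peer
    /// that sent the block.
    fn partition_proofs(
        &self,
        outputs: &[Commitment],
    ) -> (Vec<Commitment>, Vec<Commitment>) {
        let mut known = Vec::new();
        let mut missing = Vec::new();
        for c in outputs {
            if self.proofs.contains_key(c) {
                known.push(*c);
            } else {
                missing.push(*c);
            }
        }
        (known, missing)
    }
}
```

In the common case where all transactions in the block went through the pool, 
the `missing` set is empty and no extra round-trip is needed.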


The second largest piece of data is composed of the transaction kernels (~100 
bytes each). We can't fully avoid sending those, as they're not "anchored" to 
anything the way a range proof is anchored to an output. However, we could send 
a shortened version of their hash. In this scheme, when sent across the wire, a 
block would only contain the first 16 bytes (or maybe even 12) of the hash of 
the whole kernel. Upon reception, an exchange similar to the one for range 
proofs would occur. Note that this requires that the pool maintain a mapping 
from those truncated hashes to the actual kernels.
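
That mapping could be sketched as below. The 16-byte prefix comes from the 
scheme above; taking it from an already-computed 32-byte kernel hash (rather 
than hashing here) is a simplifying assumption:

```rust
use std::collections::HashMap;

type KernelHash = [u8; 32];
type ShortId = [u8; 16];

/// First 16 bytes of the full kernel hash, as sent in the block.
fn short_id(full: &KernelHash) -> ShortId {
    let mut id = [0u8; 16];
    id.copy_from_slice(&full[..16]);
    id
}

#[derive(Default)]
struct KernelIndex {
    by_short_id: HashMap<ShortId, Vec<u8>>, // short id -> full kernel bytes
}

impl KernelIndex {
    /// Called as the pool accepts a transaction's kernels.
    fn insert(&mut self, hash: KernelHash, kernel_bytes: Vec<u8>) {
        self.by_short_id.insert(short_id(&hash), kernel_bytes);
    }

    /// On block reception: which short ids must be fetched from the peer?
    fn missing(&self, ids: &[ShortId]) -> Vec<ShortId> {
        ids.iter()
            .filter(|id| !self.by_short_id.contains_key(*id))
            .copied()
            .collect()
    }
}
```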

Beyond this, optimizations become non-trivial. We do *not* want to introduce 
some transaction-level hash as transactions are disaggregated in blocks by 
design. We could introduce something like invertible Bloom filters for inputs 
or outputs, but inputs and outputs are themselves rather small (32 to 40 bytes 
each), so the space saving doesn't seem worth the added complexity at this 
point.

Both optimizations add a couple of protocol-level messages (request/response 
for each data type) and some additional interfaces and logic for the existence 
checks on the transaction pool. That adds a bit of complexity, but it stays 
contained and fairly manageable. The first optimization 
is by far the most interesting. I'm thinking of the second one more as a "nice 
to have".

Does it all sound reasonable? Am I missing something?

- Igno
-- 
Mailing list: https://launchpad.net/~mimblewimble
Post to     : [email protected]
Unsubscribe : https://launchpad.net/~mimblewimble
More help   : https://help.launchpad.net/ListHelp
