Hi Eric, Thanks, but I get the impression that the similarity is rather superficial.
To address your points: > (1) higher than necessary storage space requirement due to storing the > indexing data required for correlate the spends, and Hmm. No. Spends are simply scanned in the spend-tree (full tree, prunable, fully 5.6gb), or caught by the spend-index (bit index, non-prunable, fully 180mb). Neither impose significant storage requirements. > 2) higher than necessary validation complexity and cost in terms of > computing the spent-ness (including spender height) of an output. > > With the exception of de-linking (not deleted) in the case of reorgs, the > entire store is append only, implemented in a small set of memory > mapped file I guess this is the key difference. As the spend-tree stores the spend information in a tree structure, no reorgs are required, and the resulting code is actually much less complex. Bitcrust simply scans the tree. Although earlier designs used a skip-list, it turns out that accompanied by a spent-index lagging a few blocks behind, raw scanning is faster then anything even though it needs to scan ~5 blocks times ~4000 inputs before reaching the first spent-index, the actual scan is highly cache efficient and little more then a "REP SCASQ", reaching sub-microsecond per input on each core *including* the lookup in the spend index. > I don't follow this part, maybe you could clarify. A spends index > grows with the size of the spend set (forever) as it cannot be pruned, > which certainly exceeds the size of the UTXO set (unless nothing is > spent). The advantage is that you don't have to keep rewriting the > store when you use a spends set (because the store can be append only). My point is, that the spend tree grows per *input* of a transaction instead of per *output* of a transaction, because this is what is scanned on order validation. The spend tree can be pruned because the spend index (~200mb) catches early spends. Disregarding the baseload script validation, the peak load order validation of bitcrust is more negatively effected by a transaction with many inputs than by a transaction of many outputs. I encourage you to check out the results at https://bitcrust.org Regards, Tomas On Fri, Apr 7, 2017, at 01:38, Eric Voskuil wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA256 > > On 04/06/2017 03:12 PM, Tomas via bitcoin-dev wrote: > > Hi Tomas, > > > I have been working on a bitcoin implementation that uses a > > different approach to indexing for verifying the order of > > transactions. Instead of using an index of unspent outputs, double > > spends are verified by using a spend-tree where spends are scanned > > against spent outputs instead of unspent outputs. > > This is the approach that genjix used in libbitcoin version2. With the > exception of de-linking (not deleted) in the case of reorgs, the > entire store is append only, implemented in a small set of memory > mapped files. The downsides to the approach are: > > (1) higher than necessary storage space requirement due to storing the > indexing data required for correlate the spends, and > > (2) higher than necessary validation complexity and cost in terms of > computing the spent-ness (including spender height) of an output. > > His implementation used a hash table, so performance-wise it did quite > well and would theoretically outperform a tree, O(1) vs. O(log2(N)). > > > This allows for much better concurrency, as not only blocks, but > > also individual inputs can be verified fully in parallel. > > I was successful in parallelizing input validation (across the inputs > of an unconfirmed tx and across the set of all inputs in a block) > using the v2 store. However, it is not the case that the spends > approach is necessary for concurrency. > > To resolve the above two problems the version3 store does not use a > spends table/index. Nor does it store any table of UTXOs. Yet > validation is highly parallelized. Instead of additional indexes it > uses the tx hash table, augmented with 32 bits per output for spender > height. So there is a O(1) cost of finding the tx and a O(N) cost of > finding the spender height where N is the number of outputs in the tx. > But because the number of outputs in a tx is bounded (by block size) > this is constant time in the number of transactions. > > This works out much faster than the spends table, and without the > storage cost or complexity disadvantages. It also scales with > available hardware, as the memory mapped files become in-memory hash > tables. For low memory machines we found it was important to implement > an opaque UTXO cache to limit paging, but for higher end systems zero > cache is optimal. > > > I am sharing this not only to ask for your feedback, but also to > > call for a clear separation of protocol and implementations: As > > this solution, reversing the costs of outputs and inputs, seems to > > have excellent performance characteristics (as shown in the test > > results), updates to the protocol addressing the UTXO growth, might > > not be worth considering *protocol improvements* and it might be > > best to address these concerns as implementation details. > > I don't follow this part, maybe you could clarify. A spends index > grows with the size of the spend set (forever) as it cannot be pruned, > which certainly exceeds the size of the UTXO set (unless nothing is > spent). The advantage is that you don't have to keep rewriting the > store when you use a spends set (because the store can be append only). > > Feel free to message me if you'd like to discuss in more detail, or to > continue on the libbitcoin mailing list (copied). > > e > -----BEGIN PGP SIGNATURE----- > Version: GnuPG v2.0.22 (GNU/Linux) > > iQEcBAEBCAAGBQJY5tFpAAoJEDzYwH8LXOFOcMgH/2mw5iOvUYNwvZ2z0KKTSUOA > Pd8d5mKoWvd94QxhQ+RyTbkEkMhHl75+zcBgRsfUTtZlBIe/Z0+OgVIN6ibEw+WD > w7k3HqgQi9gLgydEelxTAX+z3dJ24n4kCCdKAmZbBuK+Yr/7AViugbEqYemKepku > pRWZZS74MUvrYesc0xPn4Ao3DTzMjjY0K2mkuqV8jlwdfZjlAQX9pTx+iSCuMhkd > HJ8w7s8QnjVnUeOlLe29mZwaFJPyOTLJMqgDE6s2sXacAy5QQbVCatygvDQ8A/wC > ktBnKPFb2lGX3bGKu/KwABegBy/hyec+NP0wFR+0MVivCwTK1+SjeHu5MNOSVlM= > =tfVj > -----END PGP SIGNATURE----- _______________________________________________ bitcoin-dev mailing list bitcoin-dev@lists.linuxfoundation.org https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev