Hi Bruno,

1. Isn't this exactly the same issue that WriteBatchWithIndex transactions have, whereby exceeding (or being likely to exceed) the configured memory needs to trigger an early commit?
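To make that early-commit behaviour concrete, here's a minimal, self-contained sketch. The class and field names are hypothetical, and this is not the actual Streams or RocksDB API; it just models a transaction buffer that forces a commit once its uncommitted data would exceed a configured memory bound:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch (not the real Streams/RocksDB API): a
// WriteBatchWithIndex-style buffer that commits early once the
// uncommitted data would exceed a configured memory bound.
public class EarlyCommitSketch {
    static final long MAX_UNCOMMITTED_BYTES = 64; // tiny bound, for illustration

    static long uncommittedBytes = 0;
    static int commits = 0;
    static final List<byte[]> buffer = new ArrayList<>();

    // Buffer a write; commit early if the bound would be exceeded.
    static void put(byte[] key, byte[] value) {
        long entrySize = key.length + value.length;
        if (uncommittedBytes + entrySize > MAX_UNCOMMITTED_BYTES) {
            commit(); // early commit: bounds memory at the cost of a shorter atomic span
        }
        buffer.add(value);
        uncommittedBytes += entrySize;
    }

    static void commit() {
        buffer.clear();
        uncommittedBytes = 0;
        commits++;
    }

    public static void main(String[] args) {
        byte[] key = new byte[8], value = new byte[24]; // 32 bytes per entry
        for (int i = 0; i < 5; i++) {
            put(key, value);
        }
        commit(); // final commit, e.g. at the end of the commit interval
        System.out.println("commits=" + commits);
    }
}
```

With a 64-byte bound and 32-byte entries, two early commits fire during the five puts, plus the final one. The point is that the memory bound, not the commit interval alone, decides when data becomes visible.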
2. This is one of my big concerns. Ultimately, any approach based on cracking open RocksDB internals and using them in ways they're not really designed for is likely to have some unforeseen performance or consistency issues.

3. What's your motivation for removing these early commits? While not ideal, I think they're a decent compromise to ensure consistency whilst maintaining good, predictable performance. All three of your suggested ideas seem *very* complicated, and might actually make behaviour less predictable for users as a consequence.

I'm a bit concerned that the scope of this KIP is growing out of control. While it's good to discuss ideas for future improvements, I think it's important to narrow the scope down to a design that achieves the most pressing objective: constant-sized restorations during dirty close/unexpected errors. Any design that this KIP produces can ultimately be changed in the future, especially if the bulk of it is internal behaviour.

I'm going to spend some time next week trying to re-work the original WriteBatchWithIndex design to remove the newTransaction() method, such that it's just an implementation detail of RocksDBStore. That way, if we want to replace WBWI with something else in the future, like the SST file management outlined by John, we can do so with little or no API change.

Regards,
Nick