On Mon, Nov 19, 2012 at 6:43 AM, Stefan Fuhrmann <stefan.fuhrm...@wandisco.com> wrote:
> A crashed writer process may leave a corrupt protorev and / or
> other incomplete files. There is no atomic incremental change
> here. The caller (client) using the crashed process is supposed
> to detect the crash and abandon the transaction.
I don't think that "supposed to" is documented anywhere (unless you want to count our client's behavior as documentation). After dealing with the stuck being_written flag that led to rep_write_cleanup, I don't think we should assume it going forward.

It may not be worth dealing with in fsfs, but in fsfs2 I think we should make representation operations atomic, even those on the protorev files (or whatever their equivalent is), especially since these operations are usually driven across separate HTTP requests when we're using DAV.

Beyond improving repo consistency in some edge cases, that would also allow us to support parallelizing the PUTs. There may be other issues with doing that which I'm not aware of off the top of my head, but at least in fsfs it's absolutely not supported, since the protorev file is locked for the entire duration of the PUT.
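
To make the idea concrete, here is a rough sketch of one way the two goals could fit together. It is not how fsfs works and not a proposal for the actual fsfs2 format. The ProtoRev class, its stage() and commit() methods, and the in-process lock standing in for the on-disk protorev lock are all hypothetical. Each writer streams its representation into a private temp file without holding any lock, and the shared file is locked only for the short final append.

```python
import os
import tempfile
import threading


class ProtoRev:
    """Hypothetical stand-in for a shared proto-revision file."""

    def __init__(self, path):
        self.path = path
        # Assumption: an in-process lock stands in for the on-disk lock.
        self._lock = threading.Lock()
        open(path, "ab").close()  # make sure the file exists

    def stage(self, chunks):
        """Write one representation to a private temp file; no lock held."""
        fd, tmp = tempfile.mkstemp(dir=os.path.dirname(self.path) or ".")
        with os.fdopen(fd, "wb") as f:
            for chunk in chunks:
                f.write(chunk)
            f.flush()
            os.fsync(f.fileno())
        return tmp

    def commit(self, tmp):
        """Append a staged representation; return its (offset, length)."""
        with open(tmp, "rb") as f:
            data = f.read()
        with self._lock:  # held only for the append, not the whole upload
            with open(self.path, "r+b") as out:
                out.seek(0, os.SEEK_END)
                offset = out.tell()
                try:
                    out.write(data)
                    out.flush()
                    os.fsync(out.fileno())
                except BaseException:
                    # Roll back a partial append if the write raised.
                    out.truncate(offset)
                    raise
        os.remove(tmp)
        return offset, len(data)
```

A writer that dies while staging leaves only an orphaned temp file behind, and the shared file is untouched. A hard crash in the middle of the append itself would still need a recovery step, for example truncating back to the last recorded offset, which this sketch does not show.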