> As for the problem...the fact that you're using 4.2 would seem
> to make the algorithm:
>     open(<zero as new file>)
>     write(whatever we have as history)
>     close(set eof to where we are).
>
> What file system are you on? Is it local or networked?

Local, ext3.

> one way for it to be zero is if the last bash exiting had no history,
> cuz the zero/truncate of each open can zero the file from any previous
> bash being left.

I thought of that too, but it's not the case for me. Even after the
failure has wiped the old history, my new shells have at least 1-2
commands kicking around. So I could imagine my nice 500-line history
turning into a 2-line one, but not a zero-length one.

> I can also see the possibility of some kernel or file system routine
> waiting after you issue the close call so that it doesn't have to zero
> the area where data is arriving. I.e. it might only zero the file beyond
> the valid text AFTER some delay (5 seconds?) OR might wait until the file
> is closed, so if you completely overwrite the old file with text, the
> kernel won't have to zero anything out. If so, that would be a big bug.

When you're truncating a file to a shorter length, some filesystems do
indeed delay freeing the blocks in hopes of reusing them. But the length
is set to zero the moment the O_TRUNC happens, and likewise if you write
n bytes, the length is immediately increased by n. There are certain
races on some filesystems that could cause those n bytes to be incorrect
(e.g., garbage), but that generally happens only after a system crash.
There's a paper on this from a few years back; I'd have to review it to
be sure, but my recollection is that you can't get zero-length files in
the absence of system or hardware failures. (However, I'm not sure
whether they used SMPs...)

Still, I suppose it could be a kernel bug. Maybe I'll have to write a
better test program and let it run overnight.
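Something like the following would be a starting point (just a sketch:
the file name, writer count, and 4K buffer are arbitrary choices of
mine, and the open flags are what I'd guess 4.2 uses based on the
algorithm quoted above). It races several writers that each do the
truncate-write-close dance, then checks whether the file can ever be
left zero-length once everyone has closed:

/*
 * Sketch of a stress test: several processes repeatedly rewrite the
 * same file the way bash 4.2 appears to -- open with O_TRUNC, write
 * the whole "history", close -- and after every round we check
 * whether the file was left zero-length after all closes.
 */
#include <stdio.h>
#include <string.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/stat.h>
#include <sys/wait.h>

#define WRITERS 4
#define ROUNDS  10000
#define PATH    "histtest.tmp"      /* arbitrary scratch file */

static void writer(void)
{
    char buf[4096];                 /* stand-in for the history text */
    memset(buf, 'x', sizeof(buf));

    int fd = open(PATH, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0) { perror("open"); _exit(1); }
    if (write(fd, buf, sizeof(buf)) != sizeof(buf))
        { perror("write"); _exit(1); }
    if (close(fd) < 0) { perror("close"); _exit(1); }
    _exit(0);
}

int main(void)
{
    for (int round = 0; round < ROUNDS; round++) {
        /* Race several writers against each other. */
        for (int i = 0; i < WRITERS; i++)
            if (fork() == 0)
                writer();
        for (int i = 0; i < WRITERS; i++)
            wait(NULL);

        /* Every writer has closed; the file must not be empty. */
        struct stat st;
        if (stat(PATH, &st) < 0) { perror("stat"); return 1; }
        if (st.st_size == 0) {
            printf("round %d: zero-length file after all closes!\n",
                   round);
            return 1;
        }
    }
    unlink(PATH);
    printf("%d rounds, never saw a zero-length file\n", ROUNDS);
    return 0;
}

Note that a reader could legitimately see a zero length in the window
between a writer's open(O_TRUNC) and its write; the check here is only
made after all the closes, which is the case that would have to fail to
explain a history file that stays empty.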
> in the case of write...close to a non-pre-zeroed record, the operation
> becomes a read-modify-write. Thing is, if proc3 goes out for the
> partial buffer (~4k is likely), it may have been completely zeroed
> from proc2 closing where proc3 wants to write.

No generic Linux filesystem that I'm aware of zeroes discarded data at
any time; it's too expensive. And the partial buffer would still be in
memory at that point, since the processes can exit much faster than the
buffer could be written to disk. So I don't think that's it.

> (multi-threaded ops on real multiple execution units do the darnedest
> things).

Ain't that the truth!
-- 
Geoff Kuenning   ge...@cs.hmc.edu   http://www.cs.hmc.edu/~geoff/

An Internet that is not Open represents a potentially grave risk to
freedoms of many sorts -- freedom of speech and other civil liberties,
freedom of commerce, and more -- and that openness is what we must so
diligently work to both preserve and expand.
        -- Lauren Weinstein