Julian Foad wrote: > On Thu, 2010-02-18, Neels J Hofmeyr wrote: >> Great, moving forward fast on pristine design questions! > > Hi Neels. > > Did you start working the new knowledge into a document? Lots of stuff > was said in this thread and it would be useful to see where we are at.
thanks for poking :) > I have a couple of comments. > > > THE PRISTINE-WRITE API > > I was thinking about the "write" API and how an API designed around a > stream is surely better than one designed around "tell me the path where > I can put a file". > > The objection to "give me a writable stream and I'll write to it" was > that the stream close handler wouldn't know whether the stream was > closed after a successful write or because of an error. We can rectify > this by adding a second step: require the caller to call a special > "commit" API to close the stream on success, and any other way of > closing the stream would abandon the new content. > > // Caller passes the checksum if it knows it, or passes NULL and > // the store in that case will calculate the checksum. > stream = pristine_start_new_file(expected_checksum); > > [caller writes to stream] > > // Now the store commits the new text, verifies the checksum > // (if it was given an expected one) and returns it. > new_checksum = pristine_close_new_file(stream); Let's also look at the advantages of the current design of svn_wc__db_pristine_get_tempdir() and _install(): pseudocode: [[[ pristine_get_tempdir(&tempdir) /* Make tempfile */ svn_stream_open_unique(&tempstream, &tempfile_path, tempdir) /* Write and checksum */ tempstream = svn_stream_checksummed2(tempstream, &write_checksum) write to and close tempstream pristine_install(tempfile_path, write_checksum) ]]] When there is only a write-stream available, some options get lost, while _get_tempdir() and _install() offer the whole palette of possibilities. For convenience, we can offer a wrapper for the first three instructions above. But additionally, we can offer the possibility to determine a file location *on the same file system device* as the pristine store. This is useful when we can provide some external tool or specialized function X with a destination path in advance. We create a tempfile in pristine_temp_dir() and pass that location to X. X writes or OS-copies or <whatever>s to the final device, possibly saving one copy-across-fs-devices operation per pristine. Also, X may already have a checksum tee in its implementation, obsoleting the need to calculate again. In that case it's nice that pristine_install() doesn't care where the checksum came from. (Also, if we are going to validate the checksum on "every" read, there is no need to be paranoid about correctness of the checksum when writing -- we anyway have to "expect" corruptions to occur after any successful write.) Note that your proposed usage is a subset of the usage provided by _get_temp_dir() and _install(). Your way does KISS some API complexity goodbye when thinking in human terms, but the OS will find the _get_temp_dir() and _install() simpler in some cases :P The simpler usage of above pseudocode would be: pseudocode: [[[ /* convenience function provided by API or whomever. */ svn_error_t * pristine_get_write_stream(stream_t **write_stream, svn_checksum_t *write_checksum, const char **temp_path) { pristine_get_tempdir(&tempdir) /* Make tempfile */ svn_stream_open_unique(&tempstream, &tempfile_path, tempdir) /* Write checksum to the checksum location provided by caller */ *write_stream = svn_stream_checksummed2(tempstream, write_checksum) } ... /* calling code */ { pristine_get_write_stream(&stream, &write_checksum, &stream_temp_path) write to and close stream if (had priori checksum and does not match write_checksum): bail pristine_install(stream_temp_path, write_checksum) } ]]] This is similar to what you propose, when considering that your implementation would also need a baton to pass the stream temp path along (right?). So IMHO it boils down to: Do you think it is better to opaquify such baton for the sacrifice of not being able to get a write destination on the target device in advance? I vote for removing _pristine_write() and for keeping _get_temp_dir() and _install(), plus adding a convenience function in the spirit of above pseudocode to make the stream-only case sit comfortably. Yay? > > Now let's examine the ways in which a caller might want to give new > content to the store: > > 1. Caller asks for a writable stream and pushes content to that, then > calls a "commit" function. > > 2. Caller has a readable stream and wants the store to pull from that. > > 3. Caller has a (non-temporary) file and wants the store to read from > that file. > > 4. Caller has to create a temporary file for reasons beyond its > control (output of an external tool perhaps) and wants the store to take > the entire file by an atomic move if possible. This is the case where it > would be more efficient if it know where to put the file in the first > place. > > The caller can easily implement 2 and 3 in terms of an API that provides > 1, so that just leaves 1 and 4 that are worthwhile as an API. > > I feel that (1) is by far the more important one to have, and (4) is a > specialist optimisation. ...which may nevertheless come up and is not that different in its implementation, IMHO. As above. Internal API. > > > VERIFYING CHECKSUMS > > I didn't read everything you were discussing but I got worried by > hearing about providing options for the caller to request checksums to > be verified or not per call. That sounds like too much complexity. I'm > sure we should start with a global compile-time verification enable > switch, and if we really find we need more fine-grained control then we > should consider how to provide it then. It might not need an API flag: > for example we might decide it should automatically verify on the first > read and once in every hundred reads, or all sorts of internal > possibilities like that. Yes, during pristine_read(), we will want to switch on checksum validation by default, until we find the need to do otherwise. We argued so far that the API could checksum along internally, perform the checksum validation on stream close, and just ignore the calculated checksum for partial reads, which does not happen a lot. The check mode, which I presume you are worried about, comes in when we want to determine whether we have a valid pristine or not in various situations. Previously, I ended up having these checkmodes: _usable: "The pristine exists in both DB and file system and size and mtime match up" _known: "Take any one indication that a pristine exists at face value, do that in any way, but fast please." _valid: "In addition to checking DB and filesystem, read the whole pristine file and checksum it." Thinking about it now, we can lose _valid. That can be done by checking _usable and pumping the stream provided by pristine_read(). Or *only* a pristine_read() and stream pump, even. We were also ambiguous about _known. Some argue that it's too special, while it makes sense to me (in case of many checks done for heuristics) and Greg says he remembers to have known such a use case to exist, heh. But I now think we should drop the enum svn_wc__db_checkmode_t entirely. Let's provide the _usable check as a single function in 1.7. If we find the need, we can add a faster check in a later release, or we can even make this single check function faster by never checking mtime and len (except when _read()ing). Having gone that far, one could argue that we need no _check function at all, we can just have pristine_read(). If it returns a stream, it indicates existence and usability, just close it and be merry. Since _usable will anyway cause disk I/O when we decide to check mtime and size via fstat, the argument that we want a check function that causes no disk I/O is half-hearted. On the other hand, if we were only having a _read() API function, we would want to have a convenience function that closes the stream for those callers that just want _usable. In this case, the convenience function would provide practically the same API as providing a separate _check() function in the first place, just slightly less optimal (causing file preread instead of just fstat traffic). It would also blur the option of easily making all the check() use cases faster by a central handle. Some more pseudocode to illustrate: No check() function: pristine_read(stream_t **stream): if (static_check_mtime_and_len() == match): *stream = <open checksumming stream> convenience_pristine_check(svn_boolean_t *exists): pristine_read(&stream) *exists = (stream != NULL) if (stream): <close stream> <-- no need to have opened it With check() function in the API: pristine_read(stream_t **stream): /* same as above */ pristine_check(svn_boolean_t *exists): *exists = (static_check_mtime_and_len() == match) I still favor the explicit check() function included in the API and keep with my above suggestion for 1.7. We can provide a second, faster check function later or even make the single check function faster. > > >> The one thing left now is: >>> Can someone explain a motivation for even creating a database row before >>> the pristine file is moved into place in the pristine store? I currently >>> don't see why it can't be way simpler. > [...] > > I would just write it down the way you think it should be in the main > flow of your document, and mention outstanding questions like this in > notes. > > "Simultaneous or multi-threaded clients" would be my first reaction to > that particular question. You guys already agreed on what I think in the other two posts, i.e. that we just let people write concurrently. With pristine_install(), there isn't anything that can go wrong concerning concurrency -- assuming a 'mv' is atomic that's obvious, but even if not, the concurrent processes should anyway write the same values to the same file offsets. (not given when we have compressed pristines! see below) Q: Is there any deadlock / error with concurrent OS 'mv's to the same location? I am assuming not, but am not 100% sure. If there turns out to be one, we can still create a lock-file write lock thang (which could also serve to prevent two separate processes from downloading the same pristine twice, if we choose to cover that). However, I am now also thinking about database consistency. If one concurrent process wins the database race, while another wins the file system race, there may be a mismatching mtime in the database (questionable if the mtime resolution is no less than a whole eternal second, but it still remains possible; also, mtime currently does not exist in the schema but possibly should). Would this problem go away if we mv the pristine file into place while the DB transaction is open? Would SQLite mutex the two processes apart for free? (talking just about the final 'mv' during _install()) The same problem as with mtime is more aggravating with compression. If one process decides to compress the pristine while the other decides to write it out plainly, we have a concurrency problem in *both* DB and on disk. We could avoid this by having absolute per-pristine-store rules in place about when a pristine is compressed, e.g. compress all those being larger than a certain size. Alternatively, we could append a .gz/.bz2/.zip suffix to the checksum filename for compressed files, doing away with concurrency problems in the file system. It would then not matter which process wins the DB race, but the pristine would exist twice (once compressed, once uncompressed). We have not put any thought in implementing compression yet. I think we should drop compression from 1.7, and plan to provide an opaque compression mechanism behind pristine_install() based on absolute rules. Or maybe not, because disk space is too cheap to bother and no-one has called for a compressed pristine store because we already put some kind of high-water mark in place. Who knows. (I don't like pristine compression. If we're optimizing for disk space, why store it in the first place. It makes a lot of things way more complex.) Either way, it would be nice to not have a "locked" state in the pristine database. E.g. if we allow the database row to be able to say "I'm busy writing this file", then we need to decide what the other process should do in that case. Wait indefinitely? I'd prefer not to have that, but rather have the option for lock-file based optimization to prevent the same content from being downloaded twice at the same time, emphasis on *option*. Lock files are cleaned up when a process dies. Not so the DB row. <Reminding myself that the need for MD5 sums forces us to have some storage over and above the pristine itself. Still dreaming of a DB-less pristine store like echo "content" > SHA1SUMSHA1SUMSHA1SUM touch MD5SUMMD5SUM ln -s MD5SUMMD5SUM SHA1SUMSHA1SUMSHA1SUM.md5 but that's probably nonsense. Having a DB is also better for implementing future features.> Comments welcome! As you can see, reading this thread again now, a little while later, has brought some more things to the surface. I shall cast a newer version of the pristine API design doc I started, but my time for today is *so* over :) *neels feels the early morning breath of a two-year-old down his sleepy neck ~Neels
signature.asc
Description: OpenPGP digital signature