I am investigating the following DAV + FSFS server error message which we at WANdisco have seen rarely, but in at least two real installations, and am sharing it here in case anyone can shed light on it. It happens during a DAV 'MERGE':
Can't open file '[...]/db/transactions/props': No such file or directory FSFS is trying to open an invalid path; it should be '[...]/db/transactions/<txn-id>/props'. From the source code, I can see how this must happen at a low level: the txn-id path component will be omitted if and only if something passes a NULL transaction-id string to fs_fs.c:get_txn_proplist(). But why and from where? The error might be triggered by the WANdisco software sending commands in the wrong sequence, or something like that, and so to that extent it may not be interesting to the general community. In one case, detailed below, it happened after a DELETE was sent, more than an hour into a very long commit. In another case, it happened after a start-commit hook failed. The server might be receiving a DAV MERGE request when it isn't expecting one. Can anyone comment on that, given this limited information? Here are some Apache access log entries from one occurrence, with paths anonymized and the client Id string (always "SVN/1.6.6 (r40053) neon/0.28.3") omitted for brevity: [[[ 127.0.0.1 - UserK [16/Dec/2009:20:25:17 -0800] "MKACTIVITY /svn/RepoJ/!svn/act/ActivityA HTTP/1.1" 201 312 "-" 127.0.0.1 - UserK [16/Dec/2009:23:20:45 -0800] "PROPPATCH /svn/RepoJ/!svn/wrk/ActivityA/branches/BranchD/PathG HTTP/1.1" 207 582 "-" 127.0.0.1 - UserM [17/Dec/2009:00:12:45 -0800] "MERGE /svn/RepoJ/branches/BranchH/PathC HTTP/1.1" 200 953 "-" 127.0.0.1 - UserM [17/Dec/2009:00:12:59 -0800] "MERGE /svn/RepoJ/branches/BranchP/PathC HTTP/1.1" 200 965 "-" 127.0.0.1 - UserK [17/Dec/2009:00:21:19 -0800] "DELETE /svn/RepoJ/!svn/act/ActivityA HTTP/1.1" 204 - "-" 127.0.0.1 - UserK [17/Dec/2009:00:32:45 -0800] "MKACTIVITY /svn/RepoJ/!svn/act/ActivityB HTTP/1.1" 201 312 "-" 127.0.0.1 - UserK [17/Dec/2009:00:39:58 -0800] "MERGE /svn/RepoJ/branches/BranchD/PathJ HTTP/1.1" 500 279 "-" 127.0.0.1 - UserK [17/Dec/2009:00:39:59 -0800] "PROPPATCH /svn/RepoJ/!svn/wrk/ActivityB/branches/BranchD/PathE HTTP/1.1" 207 569 "-" 127.0.0.1 - UserK [17/Dec/2009:00:39:59 -0800] "MERGE /svn/RepoJ/branches/BranchD/PathJ HTTP/1.1" 500 279 "-" ]]] and the corresponding error log: [[[ [Thu Dec 17 00:39:58 2009] [error] [client 127.0.0.1] Could not MERGE resource "/svn/RepoJ/!svn/act/ActivityA" into "/svn/RepoJ/branches/BranchD/PathJ". [500, #0] [Thu Dec 17 00:39:58 2009] [error] [client 127.0.0.1] There was a problem opening the transaction specified by this activity. [500, #2] [Thu Dec 17 00:39:58 2009] [error] [client 127.0.0.1] Can't open file '/svn/repos/RepoJ/db/transactions/props': No such file or directory [500, #2] ]]] The error appears to match the second-last MERGE in the access log. I don't have a copy of the body of the MERGE request in this case. The error refers to an activity Id that was the subject of the DELETE that was logged 18 minutes earlier. A comment from one of our guys looking at the logs: "We are seeing 22,600 changes for Activity A and no merge at the end (just a DELETE). Then we see about 931 commands for Activity B and a MERGE that fails for Activity A." Is this a case where the client side shouldn't be sending a MERGE at this point? Even if that is the primary problem, I'm still interested in Subversion's response to this MERGE request. I want to assert(txn-id != NULL) in the relevant FSFS functions, as in the attached patch. I determined the comments in the attached patch after manually passing NULL instead of the txn-id. However, that won't get me far on its own. To try to reproduce the error, do you think I should try sending a sequence of DAV commands similar to the sequence shown above? Any other ideas on what might be happening or how to diagnose this? - Julian
Add assertions and comments in FSFS where the txn-id must be non-null. Errors have occasionally been seen that indicate it must have been null in at least one of these places. This is a patch in progress, to help with debugging, not necessarily to be committed. * subversion/libsvn_fs_fs/fs_fs.c (path_txn_dir, get_txn_proplist, svn_fs_fs__change_txn_props, svn_fs_fs__get_txn, commit_body, svn_fs_fs__txn_proplist): Add assertions and comments. --This line, and those below, will be ignored-- Index: subversion/libsvn_fs_fs/fs_fs.c =================================================================== --- subversion/libsvn_fs_fs/fs_fs.c (revision 956813) +++ subversion/libsvn_fs_fs/fs_fs.c (working copy) @@ -317,6 +317,7 @@ path_revprops(svn_fs_t *fs, svn_revnum_t static APR_INLINE const char * path_txn_dir(svn_fs_t *fs, const char *txn_id, apr_pool_t *pool) { + SVN_ERR_ASSERT(txn_id != NULL); return svn_dirent_join_many(pool, fs->path, PATH_TXNS_DIR, apr_pstrcat(pool, txn_id, PATH_EXT_TXN, NULL), NULL); @@ -4584,6 +4585,12 @@ get_txn_proplist(apr_hash_t *proplist, { svn_stream_t *stream; + /* The following error has been seen several times in real life: + * "Can't open file '[...]/db/transactions/props': No such file or directory" + * The correct path would be: '[...]/db/transactions/<txn-id>/props'. + * The only way that can happen is if txn_id is NULL here. */ + SVN_ERR_ASSERT(txn_id != NULL); + /* Open the transaction properties file. */ SVN_ERR(svn_stream_open_readonly(&stream, path_txn_props(fs, txn_id, pool), pool, pool)); @@ -4622,6 +4629,8 @@ svn_fs_fs__change_txn_props(svn_fs_txn_t int i; svn_error_t *err; + /* ### Passing txn-id NULL here -> incorrect results but not an error msg. */ + SVN_ERR_ASSERT(txn->id != NULL); err = get_txn_proplist(txn_prop, txn->fs, txn->id, pool); /* Here - and here only - we need to deal with the possibility that the transaction property file doesn't yet exist. The rest of the @@ -4669,6 +4678,8 @@ svn_fs_fs__get_txn(transaction_t **txn_p txn = apr_pcalloc(pool, sizeof(*txn)); txn->proplist = apr_hash_make(pool); + /* ### Passing txn-id NULL here -> "Can't open 'transactions/props'" */ + SVN_ERR_ASSERT(txn_id != NULL); SVN_ERR(get_txn_proplist(txn->proplist, fs, txn_id, pool)); root_id = svn_fs_fs__id_txn_create("0", "0", txn_id, pool); @@ -6040,6 +6051,8 @@ commit_body(void *baton, apr_pool_t *poo apr_hash_t *proplist = apr_hash_make(pool); svn_sqlite__stmt_t *stmt; + /* ### Passing txn-id NULL here -> ??? (no effect seen yet) */ + SVN_ERR_ASSERT(cb->txn->id != NULL); SVN_ERR(get_txn_proplist(proplist, cb->fs, cb->txn->id, pool)); SVN_ERR(svn_sqlite__get_statement(&stmt, ffd->revprop_db, @@ -7111,6 +7124,9 @@ svn_fs_fs__txn_proplist(apr_hash_t **tab apr_pool_t *pool) { apr_hash_t *proplist = apr_hash_make(pool); + + /* ### Passing txn-id NULL here -> "Can't open 'transactions/props'" */ + SVN_ERR_ASSERT(txn->id != NULL); SVN_ERR(get_txn_proplist(proplist, txn->fs, txn->id, pool)); *table_p = proplist;