Ivan Zhakov wrote on Tue, Jul 19, 2011 at 21:54:19 +0400: > On Tue, Jul 19, 2011 at 21:46, Daniel Shahaf <d...@daniel.shahaf.name> wrote: > > C. Michael Pilato wrote on Tue, Jul 19, 2011 at 09:45:09 -0400: > >> On 07/19/2011 03:49 AM, Ivan Zhakov wrote: > >> > On Mon, Jul 18, 2011 at 23:50, Daniel Shahaf <d...@daniel.shahaf.name> > >> > wrote: > >> >> Ivan notes on IRC that this fixes issues with non-ASCII lock tokens. > >> >> (At least, he reports errors in 'svn ls -v' over http://; but for me > >> >> 'ls' and 'info' work fine over svn://.) > >> >> > >> >> Are lock tokens even permitted to be non-ASCII? > >> >> (both backends generate ASCII-only lock tokens (in the same manner)) > >> >> > >> >> If they are, are they required to be in UTF-8? Or can they be, say, > >> >> arbitrary octet strings? > >> >> > >> > I didn't find explicit specification for lock token encoding, but it > >> > handled everywhere as normal utf-8 string. > >> > > >> > That's why I decided to implement minimum patch just to decide lock > >> > token returned from hook output to utf-8 string. > >> > > >> > But I'm fine to require lock token to be ASCII only string. > >> > >> Having this as an open question is unacceptable, so we need to make a > >> decision about it. To my knowledge, ASCII is the only thing used for lock > >> tokens in the code we ship. There's no telling what encoding could be in > > > > For the record, our code generates tokens in the form > > "opaquelocktoken:$UUID". > > > >> use if folks are taking advantage of the pre-lock-hook token dictation > >> feature -- Apache will give them a "C" locale environment, but sure anyone > >> who has advanced enough configuration to require taking advantage of this > >> feature is also setting the locale from inside those hooks. > >> > >> FWIW, I think UTF-8 is a fine choice, especially if we've been internally > >> handling this string as UTF-8 already. But let's document the fact > >> somewhere, please (preferably in the pre-lock-hook template comments). > >> > > > > Looking again, svn_fs.h:2059 documents that lock tokens are URI's. > > (and, consequently, are printable ASCII) > > > > We also need them to be XML-safe for their use in ra_dav LOCK responses. > > > > So, I'm thinking that: > > > > * We should validate the supplied lock tokens, not for being well-formed > > UTF-8 as in Ivan's change, but for being ASCII-only. > > > > * We should move the validation to the FS layer. > > > > Thoughts? > > > I'm fine with this approach, but I think we should leave explicit > validation of lock tokens in pre-lock hook handling for better > diagnostic.
I don't think that's necessary, given that we don't (and likely won't) do anything between calling the pre-lock hook and calling svn_fs_lock(). Anyway, untested patch: [[[ Index: subversion/libsvn_fs_fs/lock.c =================================================================== --- subversion/libsvn_fs_fs/lock.c (revision 1147872) +++ subversion/libsvn_fs_fs/lock.c (working copy) @@ -950,7 +950,7 @@ svn_fs_fs__generate_lock_token(const char **token, want to use the fs UUID + some incremented number? For now, we generate a URI that matches the DAV RFC. We could change this to some other URI scheme someday, if we wish. */ - *token = apr_pstrcat(pool, "opaquelocktoken:", + *token = apr_pstrcat(pool, "opaquelocktoken:", /* ### */ svn_uuid_generate(pool), (char *)NULL); return SVN_NO_ERROR; } Index: subversion/libsvn_fs/fs-loader.c =================================================================== --- subversion/libsvn_fs/fs-loader.c (revision 1147872) +++ subversion/libsvn_fs/fs-loader.c (working copy) @@ -28,6 +28,7 @@ #include <apr_md5.h> #include <apr_thread_mutex.h> #include <apr_uuid.h> +#include <apr_uri.h> #include "svn_types.h" #include "svn_dso.h" @@ -1303,6 +1304,31 @@ svn_fs_lock(svn_lock_t **lock, svn_fs_t *fs, const return svn_error_create (SVN_ERR_XML_UNESCAPABLE_DATA, NULL, _("Lock comment contains illegal characters")); + } + + /* Enforce that the token be an XML-safe URI. */ + if (token) + { + apr_uri_t uri; + apr_status_t status; + + status = apr_uri_parse(pool, token, &uri); + if (status) + return svn_error_createf(SVN_ERR_FS_BAD_LOCK_TOKEN, + svn_error_wrap_apr(status, NULL), + _("Can't parse token '%s' as a URI"), + token); + + if (strcmp(uri.scheme, "opaquelocktoken")) + return svn_error_createf(SVN_ERR_FS_BAD_LOCK_TOKEN, NULL, + _("Lock token URI '%s' has bad scheme; " + "expected '%s'"), + token, "opaquelocktoken"); + + if (! svn_xml_is_xml_safe(token, strlen(token))) + return svn_error_create( + SVN_ERR_FS_BAD_LOCK_TOKEN, NULL, + _("Lock token URI is not XML-safe")); } if (expiration_date < 0) Index: subversion/libsvn_fs_base/lock.c =================================================================== --- subversion/libsvn_fs_base/lock.c (revision 1147872) +++ subversion/libsvn_fs_base/lock.c (working copy) @@ -250,7 +250,7 @@ svn_fs_base__generate_lock_token(const char **toke want to use the fs UUID + some incremented number? For now, we generate a URI that matches the DAV RFC. We could change this to some other URI scheme someday, if we wish. */ - *token = apr_pstrcat(pool, "opaquelocktoken:", + *token = apr_pstrcat(pool, "opaquelocktoken:", /* ### */ svn_uuid_generate(pool), (char *)NULL); return SVN_NO_ERROR; } ]]] I'll commit something along these lines (after testing that it works and so on) if I don't hear otherwise.