It's all described and discussed here: http://wiki.apache.org/subversion/UnicodeComposition
This branch is only exploring the client-side effects. The server needs to adjust to make the whole thing bullet-proof.

-- Brane

On 12.11.2012 17:11, Bert Huijben wrote:
>
>> -----Original Message-----
>> From: br...@apache.org [mailto:br...@apache.org]
>> Sent: maandag 12 november 2012 16:37
>> To: comm...@subversion.apache.org
>> Subject: svn commit: r1408325 - /subversion/branches/wc-collate-path/subversion/libsvn_subr/sqlite.c
>>
>> Author: brane
>> Date: Mon Nov 12 15:36:47 2012
>> New Revision: 1408325
>>
>> URL: http://svn.apache.org/viewvc?rev=1408325&view=rev
>> Log:
>> On the wc-collate-path branch: Enable GLOB and LIKE operator
>> replacements.
> Completely unrelated to this patch, but I'm still wondering what your total
> approach/plan on this branch will be.
>
> I can see that we handle this collate in sqlite (even though this breaks
> using a plain sqlite3 as tool on wc.db, etc.), but the
> notes/unicode-composition-for-filenames describes several other problems that
> need a fix at the same time in order not to break at least some current
> subversion users.
>
> One of these things is that we use hashtables to represent all nodes in a
> directory in several places. In some cases we get this from the working copy,
> in some cases from the db and in even other cases from the repository. Some
> of these may be normalized in some way, while others are not (especially with
> our compatibility guarantees within 1.X)
>
> I'm afraid that just getting wc.db compatible with normalization will just
> shift the problem one layer, while still not fixing the real problem. Erik
> Huelsmann thoroughly investigated this problem space some years ago and he
> documented that fixing the wc library is not enough for fixing the generic
> case. And if we are not fixing the generic case, I'm wondering if we should
> really work on a major slowdown of every common operation.
>
> We currently have a binary format, that can be used as a hash key, so many
> comparison and lookup operations are constant time.
> I'm not sure how they are after installing the collate handling.
>
>
> If we leave the generic case, there are easier ways to resolve this issue.
> One such thing would be to make apr (or a wrapper in Subversion) normalize
> the on disk paths in the other direction and deny (on the server) the
> non-normalized paths. This would eliminate the slowdown on most use cases
> that don't have a problem right now, and keep the code clean for future
> problems.
>
> If we have to check for collate handling everywhere in libsvn_wc and
> libsvn_client we make it much harder for outside developers to create patches
> and even fewer core subversion developers would dare touch these layers.
>
>
>
> I'm glad somebody is finally looking into these issues, but I think we should
> look at the full picture before we can talk about getting this back on trunk.
>
> Bert
>
>

-- 
Branko Čibej
Director of Subversion | WANdisco | www.wandisco.com
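
For readers following the hash-key concern in the quoted message, here is a minimal sketch of why composed and decomposed spellings of one file name defeat a plain hash lookup. It is written in Python rather than Subversion's C, and the table contents and the helper names (lookup_raw, lookup_normalized) are made up for illustration; nothing here is taken from the branch itself.

```python
import unicodedata

# Two spellings of the same visible file name.
composed = "caf\u00e9.txt"      # NFC: single code point U+00E9
decomposed = "cafe\u0301.txt"   # NFD: 'e' followed by U+0301 (combining acute)

# A directory listing keyed by the raw name, as a plain hash table would be.
# The value is a placeholder, not real working-copy data.
entries = {composed: "node from the repository"}


def lookup_raw(table, name):
    """Exact-match lookup: constant time, but blind to composition."""
    return table.get(name)


def lookup_normalized(table, name):
    """Hypothetical helper: compare NFC forms of both sides.

    It scans every key, so it trades the constant-time lookup for a
    linear one. That is the kind of slowdown the quoted message worries about.
    """
    wanted = unicodedata.normalize("NFC", name)
    for key, value in table.items():
        if unicodedata.normalize("NFC", key) == wanted:
            return value
    return None
```

With this table, lookup_raw finds the composed name but misses the decomposed one, while lookup_normalized finds both. Normalizing names once, at the point where they enter the table, would keep lookups constant time; that is roughly the "normalize in the other direction and deny non-normalized paths" alternative mentioned above, under the assumption that every producer of names can be made to agree on one form.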