On 29 nov 2013, at 21:09, Branko Čibej <br...@wandisco.com> wrote:

> On 29.11.2013 20:42, Ivan Zhakov wrote:
>> On 29 November 2013 22:22, <br...@apache.org> wrote:
>>> Author: brane
>>> Date: Fri Nov 29 18:22:00 2013
>>> New Revision: 1546619
>>>
>>> URL: http://svn.apache.org/r1546619
>>> Log:
>>> * branches/fsfs-ucsnorm/BRANCH-README: New file.
>>>
>>> Added:
>>>     subversion/branches/fsfs-ucsnorm/BRANCH-README   (with props)
>>>
>>> Added: subversion/branches/fsfs-ucsnorm/BRANCH-README
>>> URL:
>>> http://svn.apache.org/viewvc/subversion/branches/fsfs-ucsnorm/BRANCH-README?rev=1546619&view=auto
>>> ==============================================================================
>>> --- subversion/branches/fsfs-ucsnorm/BRANCH-README (added)
>>> +++ subversion/branches/fsfs-ucsnorm/BRANCH-README [UTF-8] Fri Nov 29
>>> 18:22:00 2013
>>> @@ -0,0 +1,66 @@
>>> +The purpose of this [fsfs-ucsnorm] branch is to implement two optional
>>> +checks related to Unicode normalisation to FSFS.
>>> +
>>> +
>>> +Option: Prevent name collisions
>>> +===============================
>>> +
>>> +If this option is enabled, FSFS will reject operations that would
>>> +create two different representations of the same name in the same
>>> +directory. This will prevent situations where a user could see more
>>> +than one form of the name in a directory listing:
>> Nice feature, but why in FS layer? May be it's better to implement
>> this feature on svn_repos layer?
>
> It's not, for at least two reasons:
>
>  1. Users of the FS API must have the same constraints as repository
>     clients, otherwise the whole thing falls on its face.
>  2. The repos layer cannot implement this optimally; at a rough guess,
>     it would have to double the number of lookups performed:
>      - The node cache is an FSFS implementation detail, and this option
>        will affect how cache keys are generated.
>      - Likewise for actual lookups into the on-disk representation.

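To make the quoted option concrete, here is a minimal sketch of a collision check that still preserves names as they were given. It is only an illustration, not the branch's code: the `Directory` class and its methods are hypothetical, and the choice of NFC as the comparison form is my assumption (the thread does not say which form FSFS would use).

```python
import unicodedata


class Directory:
    """Toy directory (hypothetical, not a Subversion API).

    Names are stored exactly as given, but lookups and collision checks
    use a normalized key, so two spellings of one name cannot coexist.
    """

    def __init__(self):
        # Maps normalized key -> name exactly as it was first stored.
        self._entries = {}

    @staticmethod
    def _key(name):
        # Assumption: NFC is used as the comparison form.
        return unicodedata.normalize("NFC", name)

    def add(self, name):
        key = self._key(name)
        if key in self._entries:
            raise ValueError(
                f"name collides with existing entry {self._entries[key]!r}"
            )
        self._entries[key] = name

    def lookup(self, name):
        # Returns the stored (original) form, or None if absent.
        return self._entries.get(self._key(name))


composed = "\u00c5.txt"     # 'Å' as a single code point
decomposed = "A\u030a.txt"  # 'A' followed by COMBINING RING ABOVE

d = Directory()
d.add(decomposed)           # stored as given, not rewritten

try:
    d.add(composed)         # same name in another form: rejected
    collided = False
except ValueError:
    collided = True
```

The stored name is never rewritten; only the key used for comparison is normalized, which is also why the option would affect how cache keys are generated.
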
Just want to say that, in my opinion, the design described in BRANCH-README since r1546640 looks very good.

You might remember from back when I did some specification work (in the wiki) that I am a strong proponent of the "normalization-preserving" (n-p) approach to the problem. I believe n-p makes many of the issues with existing repositories much easier to manage; in most cases they go away completely, unless there are actual normalization conflicts. For example, the issue Bert raised on 2013-11-24 regarding mergeinfo is not a problem with n-p (I guess, without having thought too much about it).

Thanks for working on normalization,
/Thomas Å.