On Sun, Dec 9, 2012 at 11:42 AM, Branko Čibej <br...@wandisco.com> wrote: > On 09.12.2012 09:14, Daniel Shahaf wrote: >> Johan Corveleyn wrote on Sun, Dec 09, 2012 at 01:02:55 +0100: >>> Last week a colleague managed to commit a non-LF-normalized >>> svn:eol-style=native file in our repository again. As explained in >>> issue #4065 [1], this causes all kinds of problems. >>> >>> I suspect there might be a bug in SVNKit, some edge-case where it >>> doesn't correctly translate the to-be-committed files to >>> LF-termination. But regardless, I'd like to protect my repository. I >>> don't fully control the clients that people use, and those clients can >>> always have bugs, ... >>> >>> So ... how hard would it be to fix issue #4065, making the server >>> enforce the right eol-ness for correct operation? It's a genuine >>> question, I have no idea :-). >>> >> for i in $(svnlook changed -t $TXN $REPOS); do >> if propget = native || propget = LF ; then >> svnlook cat -t $TXN $REPOS $i | xxd -c1 -p | grep -i 0d >> fi >> done >> >> And writing that made me realise... "contains CR" is such a simple >> condition, that you don't need the fulltext for it --- you would be able >> to implement it by looking at the "literal new text" segments within >> svndiff streams directly. However, I'm not quite sure whether this >> observation is the key to a simple implementation or to an >> unnecessarily-complicated one. >> >>> [1] http://subversion.tigris.org/issues/show_bug.cgi?id=4065 - server >>> should enforce LF normalization for svn:eol-style=native files > > The server does not look at the contents of files, or the value of > properties. It does not even know that properties /have/ semantics. The > only reasonable place to do this would be in a pre-commit hook. Anything > else would require a major change in the design of the server and/or > libsvn_repos. > > If the client encounters non-normalized files with svn:eol-style=native > and does /not/ properly normalize them regardless, that's a client bug. > Though of course the file would immediately show up as modified as soon > as it was updated.
Bah, that just sounds like the client (any client) should paper over these "corrupt files". I don't think that's a good solution. These files are present in the repository in a corrupt form, not following the expected invariant that they should have LF line endings. If you 'svnlook cat' such a file, it will have CRLF line endings, while normal svn:eol-style=native files will always have LF line endings when 'svnlook cat'ed. Any "client" that diffs that change ('svnlook diff', 'svn diff -c $REV $REPOS', ...), will see a full file diff. They would all have to normalize that "corrupt file" on the fly to give a good diff like it should have been if it weren't corrupted. Note that the Apache repository itself already contains such corruptions (see [1], the user thread started by Konstantin Kolinko, which prompted me to file the issue). Any large repository will have them, because of the diversity of svn clients and bugs they may contain. These buggy clients ruin the experience for everyone else. I'd like there to be some better protection against this problem, centrally. A pre-commit hook is an option, and I'll probably have to go that route if there is no other way. But I see two problems with it: 1) It's dog slow. Doing an "svnlook pg" and then potentially "svnlook cat" for every changed file can easily make the pre-commit hook take longer than the standard client-side timeout of 30 seconds (for neon, dunno about the serf timeout). Just imagine doing a catch-up merge touching 1000 files. Parsing both the properties and the content from one "svnlook diff" call might be a workaround, but yuck. 2) Am I the only one who wants to protect his repository against this corruption? Judging from [1], I don't think so. It doesn't make sense that everyone starts writing this pre-commit hook, for something that IMHO is an obvious anti-corruption protection. I think every repository out there deserves to be protected against this. [1] http://svn.haxx.se/users/archive-2011-10/0287.shtml -- Johan