Daniel Berlin wrote:
Complete alphabetical order is not in the cards for diff, at least for
diffs involving server side (diffs that are client side are easily
sorted by filename).
This is because it would require losing the "streaminess" of the
protocol (unlike cvs, the client and the server in svn are really
separate, and the client just gets a stream of results.  Sorting would
require at least holding all the directory entries in the server at
once, before sending them to the client, if not worse),

Huh? I don't get this. You sort filenames *in* the server *before* you generate diffs. And you do the sorting within each directory, i.e. early, before you do anything else. What does the "streaming protocol" have to do with this? Just generate the stream in order, that's all.

How to do this depends on the internal data structures of the
repository, and I realize that the possibility of renames does
complicate those data structures.  But there has to be a data structure
and api to navigate the file hierarchy - a "tree walker".  That
tree walker should sort by filename before doing anything with
the contents of a directory.
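
As a rough illustration only, here is a minimal Python sketch of such a tree walker. It walks a local filesystem, not an svn repository, and the name `walk_sorted` is made up for the example; nothing here reflects svn's actual data structures or API. The point is that it is a generator: it holds only one directory's entries at a time and still produces a stream, just an ordered one.

```python
import os

def walk_sorted(root):
    """Yield file paths under root, sorting each directory before visiting it."""
    # Read one directory's entries and sort them by name (plain code-point
    # order here; the choice of sort order is a separate question).
    with os.scandir(root) as it:
        entries = sorted(it, key=lambda e: e.name)
    for entry in entries:
        if entry.is_dir(follow_symlinks=False):
            # Recurse lazily, so results keep streaming out in order.
            yield from walk_sorted(entry.path)
        else:
            yield entry.path
```

A caller would consume it like any other stream, e.g. `for path in walk_sorted("."): print(path)`, and would never need the whole tree in memory.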

Yes, the sorting does cost some time, but sorting a directory is
pretty fast.  You can do much better than quicksort: Put each filename
in one of 27 buckets, one for each of A/a to Z/z, and one for "other",
and the

as well as there being locale issues

We're talking about filenames for source code. At least for gcc, it's fine to hardcode a case-folding "code point" sort order.
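
A hardcoded, locale-independent order of that kind could look roughly like the Python sketch below. The name `fold_key` and the tie-breaking rule are assumptions made for the example, and this is not claimed to reproduce what GNU ls does, since its order depends on the locale.

```python
def fold_key(name):
    # Compare case-insensitively by code point; fall back to the raw name
    # so that names differing only in case still get a fixed order.
    return (name.casefold(), name)

names = ["Makefile.in", "ChangeLog", "configure", "README", "aclocal.m4"]
print(sorted(names, key=fold_key))
# prints ['aclocal.m4', 'ChangeLog', 'configure', 'Makefile.in', 'README']
```

A plain code-point sort would instead put all the capitalized names first (ChangeLog, Makefile.in, README) and the lowercase ones after them.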

Preferably, the sort algorithm should match 'ls'.  (Specifically
GNU ls - IIRC BSD ls doesn't case-fold, which I think is wrong.
It's especially ridiculous on a case-folding Mac, but I couldn't
convince the BSD people of that.)

(the server would have to know the client's locale
to sort the files so they appeared in the alphabetical order you
expect).

No problem. The client's request can include the current LOCALE value. However, I'm not sure that's desirable. Obviously the charset and language used for filenames cannot be client-dependent. The sort order could be client-dependent, but since it might not match the server language, I don't think it makes sense. If I'm a German speaker working with a repository containing English filenames, the sort order should be English, regardless of my LOCALE.

In other words, so far the cost of trying to do it has
outweighed the benefit of having diffs appear in some well-specified
order.

Having output in a well-specified order is very important. How else would I be able to compare two 'diff' runs? How would I write a regression test for 'svn diff'? Of course I can postprocess the output, but it's much more convenient and efficient to just sort each directory before diffing each file.

More generally, any listing that humans are expected to see should
be sorted.  If you put it in random order people will wonder if there
is a meaning to the order.  An 'ls' that doesn't by default sort the
output is obviously Wrong.

Now I'm not suggesting this is a show-stopper issue, or that you should
be responsible for fixing it.  But clearly, if svn output is not by
default in a predictable order, that is most definitely a serious
(but not critical) bug.  (It's ok to have a "don't sort" option to
speed things up, but it shouldn't be the default.)
--
        --Per Bothner
[EMAIL PROTECTED]   http://per.bothner.com/
