Daniel Berlin wrote:
Complete alphabetical order is not in the cards for diff, at least for
diffs involving server side (diffs that are client side are easily
sorted by filename).
This is because it would require losing the "streaminess" of the
protocol (unlike cvs, the client and the server in svn are really
separate, and the client just gets a stream of results. Sorting would
require at least holding all the directory entries in the server at
once, before sending them to the client, if not worse),
Huh? I don't get this. You sort filenames *in* the server *before*
you generate diffs. And you do the sorting within each directory;
i.e. early before you do anything else. What does the "streaming
protocol" have to do with this? Do generate the stream in order,
that's all?
How to do this depends on the internal data structures of the
repository, and I realize that the possibility of renames does
complicate those data structures. But there has to be a data structure
and API to navigate the file hierarchy - a "tree walker". That
tree walker should sort by filename before doing anything with
the contents of a directory.
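To make that concrete, here is a minimal sketch in Python of a tree walker that sorts each directory before descending. It walks a plain filesystem with os.scandir, which is only a stand-in: the real svn repository API is different, and the name walk_sorted is made up for illustration. The point is that the output is still a stream, and only one directory's entries (plus those of its ancestors) are held in memory at a time.

```python
import os
from typing import Iterator


def walk_sorted(root: str) -> Iterator[str]:
    """Yield file paths under root, sorting each directory's entries first.

    Hypothetical stand-in for a repository tree walker; a real server
    would iterate its own directory nodes instead of calling os.scandir.
    """
    # Read and sort a single directory. Nothing below it is loaded yet.
    with os.scandir(root) as it:
        entries = sorted(it, key=lambda e: e.name)

    for entry in entries:
        if entry.is_dir(follow_symlinks=False):
            # Descend in order; results keep streaming out as they are found.
            yield from walk_sorted(entry.path)
        else:
            yield entry.path
```

A caller would consume it as "for path in walk_sorted(top): diff_one(path)", where diff_one is whatever produces the per-file diff, so the first results go out before the rest of the tree has been read.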
Yes, the sorting does cost some time, but sorting a directory is
pretty fast. You can do much better than quicksort: Put each filename
in one of 27 buckets, one for each of A/a to Z/z, and one for "other",
and the
as well as there being locale issues
We're talking about filenames for source code. At least for gcc,
it's fine to hardcode a case-folding "code point" sort order.
Preferably, the sort algorithm should match 'ls'. (Specifically
GNU ls - IIRC BSD ls doesn't case-fold, which I think is wrong.
It's especially ridiculous on a case-folding Mac, but I couldn't
convince the BSD people of that.)
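As a rough illustration of a hardcoded, locale-independent case-folding order, here is a sketch of a sort key in Python. The name fold_key and the sample filenames are made up for the example, and it does not try to reproduce the exact collation of any particular 'ls'. It compares case-folded code points first and uses the original spelling only as a tie-breaker, so the result is the same on every client.

```python
from typing import Tuple


def fold_key(name: str) -> Tuple[str, str]:
    """Sort key: case-folded code points, then the raw name to break ties."""
    # The second element keeps the order deterministic when two names
    # differ only in case (e.g. "README" and "readme").
    return (name.casefold(), name)


# Illustrative names only.
names = ["Makefile", "README", "configure", "ChangeLog", "aclocal.m4"]
ordered = sorted(names, key=fold_key)
print(ordered)
# ['aclocal.m4', 'ChangeLog', 'configure', 'Makefile', 'README']
```

A plain code point sort of the same list would instead put all the capitalized names ahead of the lowercase ones.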
(the server would have to know the client's locale
to sort the files so they appeared in the alphabetical order you
expect).
No problem. The client's request can include the current LOCALE value.
However, I'm not sure that's desirable. Obviously the charset and
language used for filenames cannot be client-dependent. The sort
order could be client-dependent, but since it might not match the
server language, I don't think it makes sense. If I'm a German
speaker working with a repository containing English filenames, the
sort order should be English, regardless of my LOCALE.
In other words, so far the cost of trying to do it has
outweighed the benefit of having diffs appear in some well-specified
order.
Having output in a well-specified order is very important. How else
would I be able to compare two 'diff' runs? How would I
write a regression test for 'svn diff'? Of course I can postprocess
the output, but it's much more convenient and efficient to just sort
each directory before diffing each file.
More generally, any listing that humans are expected to see should
be sorted. If you put it in random order people will wonder if there
is a meaning to the order. An 'ls' that doesn't by default sort the
output is obviously Wrong.
Now I'm not suggesting this is a show-stopper issue, or that you should
be responsible for fixing it. But clearly, if svn output is not by
default in a predictable order, that is most definitely a serious
(but not critical) bug. (It's ok to have a "don't sort" option to
speed things up, but it shouldn't be the default.)
--
--Per Bothner
[EMAIL PROTECTED] http://per.bothner.com/