Thanks Paul.

Yes, it seems the real solution to my problem is to do the following
(as you suggested):

use "sort -k 1,1 -t '|'"

This ensures that I sort on the first field only, whereas "sort -k1 -t '|'"
does not, as much as I wanted it to. ;) With "-k1" the key starts at field 1
but runs to the end of the line, so the other columns still influence the
order. Since I was joining on only the first field, I should have been sorting
on only the first field. So perhaps the only real mismatch in my usage is that
"join" works on the first field by default (as far as I can tell from
join --help) while "sort" does not. But I guess this makes sense, since "sort"
is used for much more (including some bizarre use cases) than just as a
pre-step to "join".
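
A minimal sketch of the difference, using made-up two-line data (the records
are hypothetical, and LC_ALL=C is set only so the byte order is predictable):

```shell
# Hypothetical sample data: first fields are "ab" and "a".
input='ab|1
a|2'

# -k1: the key starts at field 1 and runs to the end of the line,
# so the "|" separator and later fields take part in the comparison.
whole_line=$(printf '%s\n' "$input" | LC_ALL=C sort -t '|' -k1)

# -k 1,1: the key starts and ends at field 1.
first_field=$(printf '%s\n' "$input" | LC_ALL=C sort -t '|' -k 1,1)

printf '%s\n' '-k1:' "$whole_line" '-k 1,1:' "$first_field"
```

With this data, -k1 puts "ab|1" first because "b" sorts before "|" in the C
locale, while -k 1,1 puts "a|2" first, which is the order "join -t '|'"
expects.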

I'll read up on the coreutils manual next time.

Thanks for being patient with me and for the great feedback. :)

--Randall


-----Original Message-----
From: Paul Eggert [mailto:egg...@cs.ucla.edu] 
Sent: Friday, January 21, 2011 1:26 AM
To: Randall Lewis
Cc: 7878-d...@debbugs.gnu.org
Subject: Re: bug#7878: "sort" bug--inconsistent single-column sorting 
influenced by other columns?

On 01/20/2011 11:29 PM, Randall Lewis wrote:
> Also, who would've thought that the default "sort" would be incompatible with 
> "join" and that you would need to write the command like this every time you 
> wanted to use "join"?
> 
> LC_ALL=C sort test1.txt

No, "sort" and "join" use the same collating sequence by default.

It sounds like you have a different problem: you
weren't sorting by the same field that you were
joining on.  For example, if you want to use plain
"join" then you need to sort via "sort -k 1b,1".
Or, if you want to use "join -t '|'" then you
also need to use "sort -k 1,1 -t '|'".
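
A minimal sketch of that sort-then-join sequence, assuming two hypothetical
files left.txt and right.txt with '|'-separated fields (the file names and
contents are made up, and the locale is pinned only to keep the example
reproducible):

```shell
# Hypothetical input files in a scratch directory.
tmpdir=$(mktemp -d)
printf '%s\n' 'ab|x' 'a|y' > "$tmpdir/left.txt"
printf '%s\n' 'a|1' 'ab|2' > "$tmpdir/right.txt"

# Sort both files on the first field only, with the same separator
# and the same locale that join will use.
LC_ALL=C sort -t '|' -k 1,1 "$tmpdir/left.txt"  > "$tmpdir/left.sorted"
LC_ALL=C sort -t '|' -k 1,1 "$tmpdir/right.txt" > "$tmpdir/right.sorted"

# Join on the first field (the default join field).
joined=$(LC_ALL=C join -t '|' "$tmpdir/left.sorted" "$tmpdir/right.sorted")
printf '%s\n' "$joined"

rm -rf "$tmpdir"
```

Each output line holds the join field followed by the remaining fields from
the first file and then the second.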

This is documented in the coreutils manual.

It may be that "LC_ALL=C sort" worked around your
problem on your particular test case, but it won't
work in general.


