Thanks Paul. Yes, it would seem that the true solution to my problem is doing the following (as you suggested):
use "sort -k 1,1 -t '|'"

This ensures that I sort on the first field--whereas "sort -k1 -t '|'" does not, as much as I wanted it to. ;) Since I was joining on only the first field I should've only been sorting on the first field.

So, perhaps the only logical conflict with my usage here is that "join" works on the first field by default (as far as I can tell from join --help) while "sort" does not. But I guess this makes sense since "sort" is used for much more (bizarre use cases) than just as a pre-step to "join."

I'll read up on the coreutils manual next time.

Thanks for being patient with me and for the great feedback. :)

--Randall

-----Original Message-----
From: Paul Eggert [mailto:egg...@cs.ucla.edu]
Sent: Friday, January 21, 2011 1:26 AM
To: Randall Lewis
Cc: 7878-d...@debbugs.gnu.org
Subject: Re: bug#7878: "sort" bug--inconsistent single-column sorting influenced by other columns?

On 01/20/2011 11:29 PM, Randall Lewis wrote:
> Also, who would've thought that the default "sort" would be incompatible with
> "join" and that you would need to write the command like this every time you
> wanted to use "join"?
>
> LC_ALL=C sort test1.txt

No, "sort" and "join" use the same collating sequence by default. It sounds like you have a different problem: you weren't sorting by the same field that you were joining on.

For example, if you want to use plain "join" then you need to sort via "sort -k 1b,1". Or, if you want to use "join -t '|'" then you also need to use "sort -k 1,1 -t '|'". This is documented in the coreutils manual.

It may be that "LC_ALL=C sort" worked around your problem on your particular test case, but it won't work in general.
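
For reference, the difference between the two sort keys discussed above can be sketched roughly as follows. This is only an illustration and not taken from the thread: the file names and sample rows are made up, and LC_ALL=C is set purely so that the byte ordering is reproducible (it is not the fix in itself, as noted above).

```shell
# Hypothetical sample data; LC_ALL=C only makes the ordering reproducible.
export LC_ALL=C

tmpdir=$(mktemp -d)
printf 'ab|1\na|2\n' > "$tmpdir/left.txt"
printf 'ab|y\na|x\n' > "$tmpdir/right.txt"

# -k1: the key runs from field 1 to the end of the line, so the '|'
# separator and the later fields take part in the comparison.
whole_line=$(sort -t '|' -k1 "$tmpdir/left.txt")

# -k 1,1: the key is field 1 only, which is what "join -t '|'" expects.
first_field=$(sort -t '|' -k 1,1 "$tmpdir/left.txt")

# Sort both inputs on the join field, then join on it.
sort -t '|' -k 1,1 "$tmpdir/left.txt"  > "$tmpdir/left.sorted"
sort -t '|' -k 1,1 "$tmpdir/right.txt" > "$tmpdir/right.sorted"
joined=$(join -t '|' "$tmpdir/left.sorted" "$tmpdir/right.sorted")

printf -- '-k1:\n%s\n' "$whole_line"      # ab|1 then a|2
printf -- '-k 1,1:\n%s\n' "$first_field"  # a|2 then ab|1
printf -- 'joined:\n%s\n' "$joined"       # a|2|x then ab|1|y

rm -rf "$tmpdir"
```

With "-k1" the row "ab|1" comes first because 'b' sorts before '|' in the C locale, even though the field "a" on its own sorts before "ab". That is the kind of ordering that is not sorted on the join field.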