bug#9780: sort -u throws out non-duplicates

2012-08-17 Thread Paul Eggert
On 08/17/2012 10:40 PM, Jim Meyering wrote: > I've adjusted your commit log to look like this. > Is that ok with you? Sure, that all looks good. Thanks for doing that.

bug#9780: sort -u throws out non-duplicates

2012-08-17 Thread Jim Meyering
Paul Eggert wrote: > OK, I scratched my head for a bit and came up with the following > further patch, which addresses the issues that I mentioned. > > Subject: [PATCH] sort: simpler fix for sort -u data-loss bug > > * src/sort.c (overlap): Remove. > (fillbuf): Do not try to copy saved lines, as th

bug#9780: sort -u throws out non-duplicates

2012-08-17 Thread Jim Meyering
Paul Eggert wrote: > OK, I scratched my head for a bit and came up with the following > further patch, which addresses the issues that I mentioned. ... > Subject: [PATCH] sort: simpler fix for sort -u data-loss bug > > * src/sort.c (overlap): Remove. > (fillbuf): Do not try to copy saved lines, as

bug#9780: sort -u throws out non-duplicates

2012-08-17 Thread Paul Eggert
On 08/17/2012 01:31 PM, Jim Meyering wrote: > So we definitely have a *second* bug here. Yes, I noticed. It definitely counts as a double-ouch. I'm glad the bug report prompted us to read this code more carefully. My latest patch should fix both bugs.

bug#9780: sort -u throws out non-duplicates

2012-08-17 Thread Paul Eggert
OK, I scratched my head for a bit and came up with the following further patch, which addresses the issues that I mentioned. From ac405d343c379096c7ed51b481d5ed08ee18d6e0 Mon Sep 17 00:00:00 2001 From: Paul Eggert Date: Fri, 17 Aug 2012 13:26:00 -0700 Subject: [PATCH] sort: simpler fix for sort

bug#9780: sort -u throws out non-duplicates

2012-08-17 Thread Jim Meyering
Paul Eggert wrote: > On 08/17/2012 12:53 PM, Jim Meyering wrote: >> safe_text is initially NULL and we enter that block >> only when we're about to fread into a buffer that overlaps >> the current saved_line.text buffer. > > Sorry, I wasn't clear enough. I was worried about the > case when saved_l

bug#9780: sort -u throws out non-duplicates

2012-08-17 Thread Paul Eggert
On 08/17/2012 12:53 PM, Jim Meyering wrote: > safe_text is initially NULL and we enter that block > only when we're about to fread into a buffer that overlaps > the current saved_line.text buffer. Sorry, I wasn't clear enough. I was worried about the case when saved_line.text does not overlap the

bug#9780: sort -u throws out non-duplicates

2012-08-17 Thread Jim Meyering
Paul Eggert wrote: > On 08/17/2012 12:36 PM, Jim Meyering wrote: >> The first time the safe_text buffer is allocated >> it will have to be disjoint from the line.text buffer >> and from the buffer into which we're about to fread. >> Thereafter, regardless of reallocation, overlap should >> always

bug#9780: sort -u throws out non-duplicates

2012-08-17 Thread Paul Eggert
On 08/17/2012 12:36 PM, Jim Meyering wrote: > The first time the safe_text buffer is allocated > it will have to be disjoint from the line.text buffer > and from the buffer into which we're about to fread. > Thereafter, regardless of reallocation, overlap should > always be false. I haven't though

bug#9780: sort -u throws out non-duplicates

2012-08-17 Thread Jim Meyering
Paul Eggert wrote: > On 08/16/2012 02:03 PM, Jim Meyering wrote: >> * src/sort.c (saved_line): New static/global, renamed and moved from... >> (write_unique): ...here. > > I see a couple of problems with this patch. Pedantically, > the behavior of 'overlap' is undefined on hosts that > use a segme

bug#9780: sort -u throws out non-duplicates

2012-08-17 Thread Paul Eggert
On 08/16/2012 02:03 PM, Jim Meyering wrote: > * src/sort.c (saved_line): New static/global, renamed and moved from... > (write_unique): ...here. I see a couple of problems with this patch. Pedantically, the behavior of 'overlap' is undefined on hosts that use a segmented architecture, because '<=

bug#9780: sort -u throws out non-duplicates

2012-08-16 Thread Jim Meyering
Jim Meyering wrote: ... > In case anyone is chomping at the bit, here's a preliminary patch: > > Here's a smaller test case that appears to be host/nproc-independent: > It should print two lines: 1, then 7. > Without this patch, it prints only "7". > > (yes 7|head -11; echo 1)|sort --parallel=1

bug#9780: sort -u throws out non-duplicates

2012-08-16 Thread Jim Meyering
Jim Meyering wrote: > Jim Meyering wrote: >> Jim Meyering wrote: >> ... >>> Here's a smaller test case that appears to be host/nproc-independent: >>> It should print two lines: 1, then 7. >>> Without this patch, it prints only "7". >>> >>> (yes 7|head -11; echo 1)|sort --parallel=1 -S32b -u >>>

bug#9780: sort -u throws out non-duplicates

2012-08-16 Thread Bernhard Voelker
On 08/16/2012 10:09 AM, Jim Meyering wrote: >> FYI, here's the required test: >> > >> > (yes 7|head -10; echo 1)|sed 's/^/1 /'|sort -k2,2 --p=1 -S32b -u >> > >> > Without the if (key) { ... } part of my patch, it would fail. >> > I had to tweak the number of '7's (s/11/10) in the input to make

bug#9780: sort -u throws out non-duplicates

2012-08-16 Thread Jim Meyering
Jim Meyering wrote: > Jim Meyering wrote: > ... >> Here's a smaller test case that appears to be host/nproc-independent: >> It should print two lines: 1, then 7. >> Without this patch, it prints only "7". >> >> (yes 7|head -11; echo 1)|sort --parallel=1 -S32b -u >> >> Of course, it needs more/

bug#9780: sort -u throws out non-duplicates

2012-08-16 Thread Jim Meyering
Jim Meyering wrote: ... > Here's a smaller test case that appears to be host/nproc-independent: > It should print two lines: 1, then 7. > Without this patch, it prints only "7". > > (yes 7|head -11; echo 1)|sort --parallel=1 -S32b -u > > Of course, it needs more/better comments, NEWS and > test
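
The test case quoted above can be wrapped in a small self-checking script. This is only a sketch, not the test that was added to coreutils: it assumes a GNU sort that accepts --parallel and -S, the variable names are made up for illustration, and head -n 11 is used as the portable spelling of head -11.

```shell
#!/bin/sh
# Input from the thread: eleven lines containing "7", then one line "1".
# A correct sort -u should print two lines: 1, then 7.
# (Assumes GNU sort; --parallel and -S32b are taken from the quoted command.)
expected=$(printf '1\n7\n')
actual=$( (yes 7 | head -n 11; echo 1) | sort --parallel=1 -S32b -u )

if [ "$actual" = "$expected" ]; then
  echo "ok: both unique lines survived"
else
  echo "FAIL: sort -u dropped a non-duplicate line"
  printf 'got:\n%s\n' "$actual"
fi
```

According to the message, an affected sort prints only "7" for this input, so the script would take the FAIL branch there.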

bug#9780: sort -u throws out non-duplicates

2012-08-15 Thread Jim Meyering
Jim Meyering wrote: > Paul Eggert wrote: >> Thanks very much for that test case; I've confirmed the bug on >> my platform with the latest 'sort'. If nobody else gets to it >> I will try to take a look at it when I find the time >> (most likely in a week or so). > > Yes, thanks again! > That is a s

bug#9780: sort -u throws out non-duplicates

2012-08-14 Thread Jim Meyering
Paul Eggert wrote: > Thanks very much for that test case; I've confirmed the bug on > my platform with the latest 'sort'. If nobody else gets to it > I will try to take a look at it when I find the time > (most likely in a week or so). Yes, thanks again! That is a serious bug. It has been around

bug#9780: sort -u throws out non-duplicates

2012-08-14 Thread Paul Eggert
Thanks very much for that test case; I've confirmed the bug on my platform with the latest 'sort'. If nobody else gets to it I will try to take a look at it when I find the time (most likely in a week or so).

bug#9780: sort -u throws out non-duplicates

2011-10-18 Thread Pádraig Brady
On 10/18/2011 09:48 AM, Bernhard Rosenkraenzer wrote: > On Mon, 17 Oct 2011 20:22:52 -0600, Eric Blake wrote: >> On 10/17/2011 06:59 PM, Bernhard Rosenkraenzer wrote: >> Thanks for the report. Unfortunately, you did not provide enough >> information to reproduce this - for example, what platform a

bug#9780: sort -u throws out non-duplicates

2011-10-18 Thread Jim Meyering
Bernhard Rosenkraenzer wrote: > On Mon, 17 Oct 2011 20:22:52 -0600, Eric Blake wrote: >> On 10/17/2011 06:59 PM, Bernhard Rosenkraenzer wrote: >> Thanks for the report. Unfortunately, you did not provide enough >> information to reproduce this - for example, what platform are you >> running on? >

bug#9780: sort -u throws out non-duplicates

2011-10-18 Thread Bernhard Rosenkraenzer
On Mon, 17 Oct 2011 20:22:52 -0600, Eric Blake wrote: On 10/17/2011 06:59 PM, Bernhard Rosenkraenzer wrote: Thanks for the report. Unfortunately, you did not provide enough information to reproduce this - for example, what platform are you running on? Fairly current Linux -- kernel 3.1-rc9, eg

bug#9780: sort -u throws out non-duplicates

2011-10-17 Thread Eric Blake
tag 9780 moreinfo thanks On 10/17/2011 06:59 PM, Bernhard Rosenkraenzer wrote: [bero@matterhorn tmp]$ wget http://bero.eu/java-source-list [...] [bero@matterhorn tmp]$ tr ' ' '\n' Thanks for the report. Unfortunately, you did not provide enough information to reproduce this - for example, wh

bug#9780: sort -u throws out non-duplicates

2011-10-17 Thread Bernhard Rosenkraenzer
On Tue, 18 Oct 2011 01:59:12 +0100, Bernhard Rosenkraenzer wrote: Note the missing .../java/java/security/cert/X509Certificate.java The problem occurs (at least) with sort from coreutils 8.12, 8.13 and 8.14. This is locale related... Seems to happen in any non-C locale. [bero@matterhorn ~]$

bug#9780: sort -u throws out non-duplicates

2011-10-17 Thread Bernhard Rosenkraenzer
[bero@matterhorn tmp]$ wget http://bero.eu/java-source-list [...] [bero@matterhorn tmp]$ tr ' ' '\n' X509Certificate libcore/luni/src/main/java/java/security/cert/X509Certificate.java libcore/luni/src/main/java/javax/security/cert/X509Certificate.java This is correct... [bero@matterhorn tmp]$ t