On 05/06/2013 07:54 PM, Cojocaru Alexandru wrote:
> On Mon, 29 Apr 2013 00:59:20 +0100
> Pádraig Brady <p...@draigbrady.com> wrote:
> 
>> So I reinstated the bit vector which was a little tricky
>> to do while maintaining performance, but it works very well.
> I think it works because we are avoiding a memory access
> inside `next_item' this way.
> 
> With this patch I try to keep the CPU benefits for `--output-d'
> and when large ranges are specified, even without the bitarray.
> 
> Because of the sentinel, the maximum supported line length is now
> `(size_t)-1 - 1' rather than `(size_t)-1'. Is this an issue?
> 
> PS: This patch also fixes a small bug inside `set_fields'.

It's always best to have separate changes.
I've split the fix out (attached) with an associated test.

thanks,
Pádraig.
From b54b47f954c9b97bdb2dbbf51ead908ccb3a4f13 Mon Sep 17 00:00:00 2001
From: Cojocaru Alexandru <xo...@gmx.com>
Date: Tue, 7 May 2013 13:01:46 +0100
Subject: [PATCH] cut: fix handling of overlapping ranges

This issue was introduced in commit v8.21-43-g3e466ad

* src/cut.c (set_fields): Process all range pairs when merging.
* tests/misc/cut-huge-range.sh: Add a test for this edge case.
Also fix an issue where we could miss reported errors due
to truncation of the 'err' file.
---
 src/cut.c                    |    6 +++---
 tests/misc/cut-huge-range.sh |    8 +++++++-
 2 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/src/cut.c b/src/cut.c
index b347b30..9501b3a 100644
--- a/src/cut.c
+++ b/src/cut.c
@@ -496,9 +496,9 @@ set_fields (const char *fieldstr)
           if (rp[j].lo <= rp[i].hi)
             {
               rp[i].hi = MAX (rp[j].hi, rp[i].hi);
-              memmove (rp + j, rp + j + 1,
-                       (n_rp - j - 1) * sizeof (struct range_pair));
-              --n_rp;
+              memmove (rp + j, rp + j + 1, (n_rp - j - 1) * sizeof *rp);
+              n_rp--;
+              j--;
             }
           else
             break;
diff --git a/tests/misc/cut-huge-range.sh b/tests/misc/cut-huge-range.sh
index 887197a..9905cd7 100755
--- a/tests/misc/cut-huge-range.sh
+++ b/tests/misc/cut-huge-range.sh
@@ -27,7 +27,13 @@ getlimits_
 
 # Up to and including coreutils-8.21, cut would allocate possibly needed
 # memory upfront.  Subsequently memory is allocated as required.
-(ulimit -v 20000; : | cut -b1-$INT_MAX > err 2>&1) || fail=1
+(ulimit -v 20000; : | cut -b1-$INT_MAX >> err 2>&1) || fail=1
+
+# Ensure ranges are merged correctly when large range logic is in effect
+echo 1 > exp
+(dd bs=1MB if=/dev/zero count=1; echo '1') |
+cut -b1-1000000,2-3,4-5,1000001 2>>err | tail -c2 > out || fail=1
+compare exp out || fail=1
 
 compare /dev/null err || fail=1
 
-- 
1.7.7.6
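
To see why the extra `j--' matters, here is a minimal standalone sketch of
the merge loop. It is not the code from src/cut.c: `merge_ranges' is a
hypothetical helper name, the struct is a simplified stand-in for
`struct range_pair', and the input is assumed to be already sorted by `lo'.
After memmove() shifts the tail left, the next unexamined pair sits at
index j, so the loop's `++j' would step over it unless j is decremented first.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for the range_pair struct used by cut.  */
struct range_pair
{
  size_t lo;
  size_t hi;
};

/* Hypothetical helper: merge overlapping ranges in RP, which holds
   N_RP entries and is assumed to be sorted by LO.
   Returns the number of entries remaining after merging.  */
static size_t
merge_ranges (struct range_pair *rp, size_t n_rp)
{
  size_t i, j;

  assert (rp != NULL || n_rp == 0);

  for (i = 0; i < n_rp; ++i)
    {
      for (j = i + 1; j < n_rp; ++j)
        {
          if (rp[j].lo <= rp[i].hi)
            {
              /* Absorb rp[j] into rp[i].  */
              if (rp[i].hi < rp[j].hi)
                rp[i].hi = rp[j].hi;

              /* Close the gap left by the absorbed entry.  */
              memmove (rp + j, rp + j + 1, (n_rp - j - 1) * sizeof *rp);
              n_rp--;

              /* The entry now at index j has not been examined yet;
                 step back so the loop's ++j lands on it.  */
              j--;
            }
          else
            break;
        }
    }

  return n_rp;
}
```

With the ranges from the new test (1-1000000, 2-3, 4-5, 1000001), dropping
the `j--' line leaves 4-5 as a separate entry after 2-3 is absorbed, because
the inner loop moves on to 1000001 and breaks. With it, only 1-1000000 and
1000001 remain.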
