On 05/06/2013 07:54 PM, Cojocaru Alexandru wrote:
> On Mon, 29 Apr 2013 00:59:20 +0100
> Pádraig Brady <p...@draigbrady.com> wrote:
>
>> So I reinstated the bit vector which was a little tricky
>> to do while maintaining performance, but it works very well.
> I think it works because we are avoiding a memory access
> inside `next_item' this way.
>
> With this patch I try to keep the CPU benefits for `--output-d'
> and when large ranges are specified, even without the bitarray.
>
> Because of the sentinel now the max line len supported will be
> `(size_t)-1 - 1' and no more `(size_t)-1'. Is this an issue?
>
> PS: This patch also fix a little bug inside `set_fields'.

It's always best to have separate changes.
I've split the fix out (attached) with an associated test.

thanks,
Pádraig.

From b54b47f954c9b97bdb2dbbf51ead908ccb3a4f13 Mon Sep 17 00:00:00 2001
From: Cojocaru Alexandru <xo...@gmx.com>
Date: Tue, 7 May 2013 13:01:46 +0100
Subject: [PATCH] cut: fix handling of overlapping ranges

This issue was introduced in commit v8.21-43-g3e466ad

* src/cut.c (set_fields): Process all range pairs when merging.
* tests/misc/cut-huge-range.sh: Add a test for this edge case.
Also fix an issue where we could miss reported errors due
to truncation of the 'err' file.
---
 src/cut.c                    |    6 +++---
 tests/misc/cut-huge-range.sh |    8 +++++++-
 2 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/src/cut.c b/src/cut.c
index b347b30..9501b3a 100644
--- a/src/cut.c
+++ b/src/cut.c
@@ -496,9 +496,9 @@ set_fields (const char *fieldstr)
           if (rp[j].lo <= rp[i].hi)
             {
               rp[i].hi = MAX (rp[j].hi, rp[i].hi);
-              memmove (rp + j, rp + j + 1,
-                       (n_rp - j - 1) * sizeof (struct range_pair));
-              --n_rp;
+              memmove (rp + j, rp + j + 1, (n_rp - j - 1) * sizeof *rp);
+              n_rp--;
+              j--;
             }
           else
             break;
diff --git a/tests/misc/cut-huge-range.sh b/tests/misc/cut-huge-range.sh
index 887197a..9905cd7 100755
--- a/tests/misc/cut-huge-range.sh
+++ b/tests/misc/cut-huge-range.sh
@@ -27,7 +27,13 @@ getlimits_
 
 # Up to and including coreutils-8.21, cut would allocate possibly needed
 # memory upfront. Subsequently memory is allocated as required.
-(ulimit -v 20000; : | cut -b1-$INT_MAX > err 2>&1) || fail=1
+(ulimit -v 20000; : | cut -b1-$INT_MAX >> err 2>&1) || fail=1
+
+# Ensure ranges are merged correctly when large range logic is in effect
+echo 1 > exp
+(dd bs=1MB if=/dev/zero count=1; echo '1') |
+cut -b1-1000000,2-3,4-5,1000001 2>>err | tail -c2 > out || fail=1
+compare exp out || fail=1
 
 compare /dev/null err || fail=1
 
-- 
1.7.7.6
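
For anyone following along, here is a minimal standalone sketch of why the extra `j--' matters. It is not the code from src/cut.c: the struct layout, `compare_ranges' and `merge_ranges' are hypothetical stand-ins, and it assumes the pairs are sorted by start before merging. After the memmove, the next pair slides into slot j; without stepping j back, the loop increment skips over it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the range pairs handled in set_fields.  */
struct range_pair
{
  size_t lo;
  size_t hi;
};

/* Order pairs by their start position.  */
static int
compare_ranges (const void *a, const void *b)
{
  size_t a_lo = ((const struct range_pair *) a)->lo;
  size_t b_lo = ((const struct range_pair *) b)->lo;
  return (a_lo > b_lo) - (a_lo < b_lo);
}

/* Sort RP by start, merge overlapping pairs in place,
   and return the number of pairs that remain.  */
size_t
merge_ranges (struct range_pair *rp, size_t n_rp)
{
  if (n_rp == 0)
    return 0;
  assert (rp != NULL);

  qsort (rp, n_rp, sizeof *rp, compare_ranges);

  for (size_t i = 0; i < n_rp; i++)
    {
      for (size_t j = i + 1; j < n_rp; j++)
        {
          if (rp[j].lo <= rp[i].hi)
            {
              /* Pair j overlaps pair i: absorb it, then close the gap.  */
              if (rp[j].hi > rp[i].hi)
                rp[i].hi = rp[j].hi;
              memmove (rp + j, rp + j + 1, (n_rp - j - 1) * sizeof *rp);
              n_rp--;
              /* The following pair now sits in slot j, so step back
                 to re-examine that slot on the next iteration.  */
              j--;
            }
          else
            break;
        }
    }
  return n_rp;
}
```

With the ranges from the new test (1-1000000, 2-3, 4-5, 1000001) this leaves two pairs, 1-1000000 and 1000001-1000001. If the `j--' is dropped, 4-5 is skipped after 2-3 is removed and survives as a separate pair inside the large range.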