On Sun, 09 Dec 2012 21:45:03 +0100
Jim Meyering <j...@meyering.net> wrote:

> Thanks for the patch.
> This is large enough that you'll have to file a copyright assignment.
> For details, see the "Copyright assignment" section in the file
> named HACKING.

Fine.


> Have you considered performance in the common case?
> I suspect that a byte or field number larger than 1000 is
> not common.  That is why, in the FIXME comment above,
> I suggested to use an adaptive approach.  I had the feeling
> (don't remember if I profiled it) that testing a bit per
> input field would be more efficient than an in-range test.

Yes, that was the first thing I checked, and there is no performance loss.


> If you construct test cases and gather timing data, please do so
> in a reproducible manner and include details when you report back,
> so we can compare on different types of systems.

Here are my benchmarks:

OS:       Parabola GNU/linux-libre (linux-libre v3.6.8-1)
Compiler: GCC 4.7.2
Cflags:   -O2
LANG:     C
CPU:      Intel Core Duo  (1.86 GHz) (L1 Cache 64KiB) (L2 Cache 2MiB)
Main memory:
 - Bank 0: DIMM DRAM Synchronous (1GiB) (width 64 bits)
 - Bank 1: DIMM DRAM Synchronous (1GiB) (width 64 bits)

NOTE: information gathered with `lshw'.


Summary (elapsed time of the first of three runs; see the attached file for
complete data):

### small ranges
cut-pre: 0:01.84
cut-post: 0:01.36
cut-split: 0:01.25

### bigger ranges
cut-pre: 0:11.74
cut-post: 0:09.20
cut-split: 0:07.91 ***

### fields
cut-pre: 0:02.90
cut-post: 0:02.68
cut-split: 0:02.85

### --output-delimiter
cut-pre: 0:02.90
cut-post: 0:02.74
cut-split: 0:02.80
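
To read the summary in relative terms, the elapsed times can be converted
to seconds and compared. The following is only a minimal sketch and not part
of the patch: the helper names elapsed_to_seconds and percent_faster are
made up for illustration, and it assumes a POSIX awk is available.

```shell
# Convert an elapsed time as printed by /usr/bin/time (M:SS.ss) to seconds.
elapsed_to_seconds() {
  awk -v t="$1" 'BEGIN {
    n = split(t, p, ":"); s = 0
    for (i = 1; i <= n; i++) s = s * 60 + p[i]
    printf "%.2f\n", s
  }'
}

# Print how much less elapsed time "new" ($2) needs than "base" ($1), in percent.
percent_faster() {
  base=$(elapsed_to_seconds "$1")
  new=$(elapsed_to_seconds "$2")
  awk -v b="$base" -v n="$new" 'BEGIN { printf "%.1f\n", (b - n) / b * 100 }'
}

percent_faster 0:01.84 0:01.25   # small ranges, cut-pre vs cut-split
percent_faster 0:11.74 0:07.91   # bigger ranges, cut-pre vs cut-split
```

For both byte-range cases this works out to roughly 32% less elapsed time
for cut-split than for cut-pre.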


NOTES:
 cut-pre is the current implementation and was compiled from commit ec48beadf.
 cut-post was compiled after applying the above patch to commit ec48beadf.
 cut-split was compiled after applying the `split-print_kth' patch to commit ec48beadf.


The main advantage comes from splitting `print_kth' into two
separate functions, so that `print_kth' now does fewer checks.


Best regards,
Cojocaru Alexandru

bash$ ./cut-pre 2> /dev/null # try not to count caching of shared libraries

### small ranges
bash$ for i in `seq 1 1000000`; do echo "abcdfeg" >> big-file; done

bash$ for i in 1 2 3; do /usr/bin/time ./cut-pre -b1,3 big-file > /dev/null; echo; done
1.72user 0.11system 0:01.84elapsed 99%CPU (0avgtext+0avgdata 568maxresident)k
0inputs+0outputs (0major+168minor)pagefaults 0swaps

1.75user 0.08system 0:01.84elapsed 99%CPU (0avgtext+0avgdata 572maxresident)k
0inputs+0outputs (0major+168minor)pagefaults 0swaps

1.76user 0.07system 0:01.84elapsed 99%CPU (0avgtext+0avgdata 568maxresident)k
0inputs+0outputs (0major+167minor)pagefaults 0swaps


bash$ for i in 1 2 3; do /usr/bin/time ./cut-post -b1,3 big-file > /dev/null; echo; done
1.23user 0.12system 0:01.36elapsed 99%CPU (0avgtext+0avgdata 560maxresident)k
0inputs+0outputs (0major+165minor)pagefaults 0swaps

1.25user 0.09system 0:01.36elapsed 99%CPU (0avgtext+0avgdata 560maxresident)k
0inputs+0outputs (0major+165minor)pagefaults 0swaps

1.25user 0.09system 0:01.36elapsed 99%CPU (0avgtext+0avgdata 556maxresident)k
0inputs+0outputs (0major+164minor)pagefaults 0swaps


bash$ for i in 1 2 3; do /usr/bin/time ./cut-split -b1,3 big-file > /dev/null; echo; done
1.15user 0.09system 0:01.25elapsed 99%CPU (0avgtext+0avgdata 572maxresident)k
0inputs+0outputs (0major+168minor)pagefaults 0swaps

1.15user 0.08system 0:01.25elapsed 99%CPU (0avgtext+0avgdata 568maxresident)k
0inputs+0outputs (0major+167minor)pagefaults 0swaps

1.14user 0.10system 0:01.25elapsed 99%CPU (0avgtext+0avgdata 568maxresident)k
0inputs+0outputs (0major+167minor)pagefaults 0swaps


### bigger ranges
bash$ yes $(for i in $(seq 1 100000); do echo -n a; done) | dd of=big-lines ibs=100001 count=10000 iflag=fullblock
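
The same input can also be produced without dd, which makes it easy to vary
the line count and line length on other systems. This is a sketch under
assumptions: make_lines is a hypothetical helper, not something from the
patch or from coreutils, and it relies only on printf, tr, yes and head.

```shell
# make_lines COUNT LEN: print COUNT lines, each consisting of LEN "a" bytes.
make_lines() {
  # Pad an empty string to LEN spaces, then turn every space into "a".
  line=$(printf "%${2}s" '' | tr ' ' 'a')
  # yes repeats the line forever; head stops it after COUNT lines.
  yes "$line" | head -n "$1"
}

# Small demonstration: two lines of five bytes each.
make_lines 2 5

# To recreate the file used above:
#   make_lines 10000 100000 > big-lines
```

The resulting file should have the same content as the one written by the
dd command above: 10000 lines of 100000 bytes plus a newline each.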

bash$ for i in 1 2 3; do /usr/bin/time ./cut-pre -b50-100,101-105,9999 big-lines > /dev/null; echo; done
11.01user 0.70system 0:11.74elapsed 99%CPU (0avgtext+0avgdata 572maxresident)k
0inputs+0outputs (0major+168minor)pagefaults 0swaps

11.02user 0.70system 0:11.74elapsed 99%CPU (0avgtext+0avgdata 576maxresident)k
0inputs+0outputs (0major+169minor)pagefaults 0swaps

11.04user 0.66system 0:11.73elapsed 99%CPU (0avgtext+0avgdata 572maxresident)k
0inputs+0outputs (0major+168minor)pagefaults 0swaps


bash$ for i in 1 2 3; do /usr/bin/time ./cut-post -b50-100,101-105,9999 big-lines > /dev/null; echo; done
8.65user 0.52system 0:09.20elapsed 99%CPU (0avgtext+0avgdata 560maxresident)k
0inputs+0outputs (0major+165minor)pagefaults 0swaps

8.59user 0.58system 0:09.20elapsed 99%CPU (0avgtext+0avgdata 556maxresident)k
0inputs+0outputs (0major+164minor)pagefaults 0swaps

8.53user 0.65system 0:09.21elapsed 99%CPU (0avgtext+0avgdata 560maxresident)k
0inputs+0outputs (0major+165minor)pagefaults 0swaps


bash$ for i in 1 2 3; do /usr/bin/time ./cut-split -b50-100,101-105,9999 big-lines > /dev/null; echo; done
7.22user 0.66system 0:07.91elapsed 99%CPU (0avgtext+0avgdata 576maxresident)k
0inputs+0outputs (0major+169minor)pagefaults 0swaps

7.26user 0.61system 0:07.90elapsed 99%CPU (0avgtext+0avgdata 572maxresident)k
0inputs+0outputs (0major+168minor)pagefaults 0swaps

7.24user 0.64system 0:07.91elapsed 99%CPU (0avgtext+0avgdata 572maxresident)k
0inputs+0outputs (0major+168minor)pagefaults 0swaps


### fields
bash$ yes "a:b:c:d:e" | dd of=fields ibs=10 count=1000000 iflag=fullblock

bash$ for i in 1 2 3; do /usr/bin/time ./cut-pre -f2,3 -d: fields > /dev/null; echo; done
2.82user 0.06system 0:02.90elapsed 99%CPU (0avgtext+0avgdata 568maxresident)k
0inputs+0outputs (0major+167minor)pagefaults 0swaps

2.80user 0.05system 0:02.87elapsed 99%CPU (0avgtext+0avgdata 568maxresident)k
0inputs+0outputs (0major+167minor)pagefaults 0swaps

2.79user 0.05system 0:02.85elapsed 99%CPU (0avgtext+0avgdata 572maxresident)k
0inputs+0outputs (0major+168minor)pagefaults 0swaps


bash$ for i in 1 2 3; do /usr/bin/time ./cut-post -f2,3 -d: fields > /dev/null; echo; done
2.58user 0.09system 0:02.68elapsed 99%CPU (0avgtext+0avgdata 556maxresident)k
0inputs+0outputs (0major+164minor)pagefaults 0swaps

2.63user 0.05system 0:02.69elapsed 99%CPU (0avgtext+0avgdata 560maxresident)k
0inputs+0outputs (0major+165minor)pagefaults 0swaps

2.61user 0.07system 0:02.69elapsed 99%CPU (0avgtext+0avgdata 556maxresident)k
0inputs+0outputs (0major+164minor)pagefaults 0swaps


bash$ for i in 1 2 3; do /usr/bin/time ./cut-split -f2,3 -d: fields > /dev/null; echo; done
2.79user 0.05system 0:02.85elapsed 99%CPU (0avgtext+0avgdata 572maxresident)k
0inputs+0outputs (0major+168minor)pagefaults 0swaps

2.61user 0.06system 0:02.69elapsed 99%CPU (0avgtext+0avgdata 572maxresident)k
0inputs+0outputs (0major+168minor)pagefaults 0swaps

2.63user 0.09system 0:02.73elapsed 99%CPU (0avgtext+0avgdata 568maxresident)k
0inputs+0outputs (0major+167minor)pagefaults 0swaps



### --output-delimiter

bash$ for i in 1 2 3; do /usr/bin/time ./cut-pre -f2,3 -d: --output-d=' ' fields > /dev/null; echo; done

2.82user 0.06system 0:02.90elapsed 99%CPU (0avgtext+0avgdata 572maxresident)k
0inputs+0outputs (0major+168minor)pagefaults 0swaps

2.81user 0.06system 0:02.88elapsed 99%CPU (0avgtext+0avgdata 572maxresident)k
0inputs+0outputs (0major+168minor)pagefaults 0swaps

2.80user 0.05system 0:02.86elapsed 99%CPU (0avgtext+0avgdata 572maxresident)k
0inputs+0outputs (0major+168minor)pagefaults 0swaps


bash$ for i in 1 2 3; do /usr/bin/time ./cut-post -f2,3 -d: --output-d=' ' fields > /dev/null; echo; done
2.67user 0.05system 0:02.74elapsed 99%CPU (0avgtext+0avgdata 556maxresident)k
0inputs+0outputs (0major+164minor)pagefaults 0swaps

2.60user 0.09system 0:02.70elapsed 99%CPU (0avgtext+0avgdata 556maxresident)k
0inputs+0outputs (0major+164minor)pagefaults 0swaps

2.59user 0.08system 0:02.68elapsed 99%CPU (0avgtext+0avgdata 556maxresident)k
0inputs+0outputs (0major+164minor)pagefaults 0swaps


bash$ for i in 1 2 3; do /usr/bin/time ./cut-split -f2,3 -d: --output-d=' ' fields > /dev/null; echo; done
2.75user 0.04system 0:02.80elapsed 99%CPU (0avgtext+0avgdata 568maxresident)k
0inputs+0outputs (0major+167minor)pagefaults 0swaps

2.63user 0.05system 0:02.69elapsed 99%CPU (0avgtext+0avgdata 572maxresident)k
0inputs+0outputs (0major+168minor)pagefaults 0swaps

2.61user 0.08system 0:02.70elapsed 99%CPU (0avgtext+0avgdata 576maxresident)k
0inputs+0outputs (0major+169minor)pagefaults 0swaps

NOTES:
 cut-pre is the current implementation and was compiled from commit ec48beadf.
 cut-post was compiled after applying the above patch to commit ec48beadf.
 cut-split was compiled after applying the attached patch to commit ec48beadf.
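
Since the three runs of a command do not always agree (the cut-split runs on
the `fields' input range from 2.69 to 2.85 seconds), it can help to extract
every elapsed time rather than a single run. A minimal sketch, assuming the
GNU /usr/bin/time default output format shown above and a POSIX awk;
elapsed_times is a hypothetical helper name.

```shell
# elapsed_times: read /usr/bin/time output on stdin and print each
# "M:SS.ss elapsed" value converted to seconds, one per line.
elapsed_times() {
  awk '{
    for (i = 1; i <= NF; i++)
      if ($i ~ /^[0-9:.]+elapsed$/) {
        t = $i; sub(/elapsed$/, "", t)
        n = split(t, p, ":"); s = 0
        for (j = 1; j <= n; j++) s = s * 60 + p[j]
        printf "%.2f\n", s
      }
  }'
}

# The three cut-split runs on the "fields" input, sorted fastest first.
printf '%s\n' \
  '2.79user 0.05system 0:02.85elapsed 99%CPU' \
  '2.61user 0.06system 0:02.69elapsed 99%CPU' \
  '2.63user 0.09system 0:02.73elapsed 99%CPU' | elapsed_times | sort -n
```

Because /usr/bin/time writes to standard error, a real run would need
something like `2>&1 > /dev/null' on the timed command before piping its
output into elapsed_times.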

Attachment: split-print_kth