On 3/2/12 6:47 AM, Jean-François Gagné wrote:

> uname output: Linux xxxxxxxx 2.6.32-5-amd64 #1 SMP Tue Jun 14 09:42:28 UTC
> 2011 x86_64 GNU/Linux
> Machine Type: x86_64-pc-linux-gnu
>
> Bash Version: 4.1
> Patch Level: 5
> Release Status: release
>
> Description:
> When reading data with the 'read' builtin from a redirection, read has
> unexpected behavior after reading 2G of data.
>
> Repeat-By:
>
> yes "0123456789abcdefghijklmnopqrs" | head -n 100000000 > file
> while read line; do file=${line:0:10}; echo $file; done < file | uniq -c
>
> results in
>
> 71582790 0123456789
>        1 mnopqrs
>        3 0123456789
>        1 mnopqrs
>        3 0123456789
>        1 mnopqrs
>        3 0123456789
>        1 mnopqrs
>        3 0123456789
> ...
>
> So the problem happens after reading 71.582.790 x30 = 2.147.483.700 bytes of
> data, just a little over 2^31.

Compile and run the attached program. If it prints out `4', which it does
on all of the Debian systems I've tried, file offsets are limited to 32
bits, and accessing files greater than 2 GB is going to be unreliable.

Chet

--
``The lyf so short, the craft so long to lerne.'' - Chaucer
``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, ITS, CWRU    c...@case.edu    http://cnswww.cns.cwru.edu/~chet/

#include <unistd.h>
#include <stdlib.h>
#include <stdio.h>

/* Print the size of off_t in bytes: 4 means 32-bit file offsets. */
int
main(int c, char **v)
{
	printf("%d\n", (int)sizeof(off_t));
	exit(0);
}