I noticed large realloc()s still being done by fread_file() for the common case of a regular file. As well as being inefficient, it could needlessly fail for large files.
$ truncate -s1M t.f $ ltrace ./a.out < t.f __libc_start_main(0x8048624, 1, 0xbfd86c34, 0x8048940, 0x8048930 <unfinished ...> fileno(0x7b7420) = 0 __fxstat(3, 0, 0xbfd86acc) = 0 ftello(0x7b7420, 0xbfd86acc, 0xb78c22ac, 0xf63d4e2e, 3) = 0 malloc(1048577) = 0xb77c0008 fread(0xb77c0008, 1, 1048576, 0x7b7420) = 1048576 realloc(0xb77c0008, 1572865) = 0xb763f008 fread(0xb773f008, 1, 524288, 0x7b7420) = 0 __errno_location() = 0xb78c1688 ferror(0x7b7420) = 0 realloc(0xb763f008, 1048577) = 0xb763f008 After applying the attached patch I get: $ ltrace ./a.out < t.f __libc_start_main(0x8048624, 1, 0xbfa14994, 0x8048940, 0x8048930 <unfinished ...> fileno(0x7b7420) = 0 __fxstat(3, 0, 0xbfa1482c) = 0 ftello(0x7b7420, 0xbfa1482c, 0xb783a2ac, 0xf63d4e2e, 3) = 0 malloc(1048578) = 0xb7738008 fread(0xb7738008, 1, 1048577, 0x7b7420) = 1048576 __errno_location() = 0xb7839688 ferror(0x7b7420) = 0 On a related note I was wondering if we should fall back to increasing the buffer by +=BUFSIZ rather than *=2 when we get ENOMEM? If we do then we'd need to be more careful with the overflow checking with lines like the following: if (size + BUFSIZ + 1 > alloc) cheers, Pádraig.
diff --git a/lib/read-file.c b/lib/read-file.c index 0a15c5a..4ea38f7 100644 --- a/lib/read-file.c +++ b/lib/read-file.c @@ -65,7 +65,8 @@ fread_file (FILE * stream, size_t * length) return NULL; } - alloc = alloc_off + 1; + /* 1 > file size so eof immediately detected. */ + alloc = alloc_off + 2; buf = malloc (alloc); if (!buf) @@ -120,8 +121,10 @@ fread_file (FILE * stream, size_t * length) if (ferror (stream)) break; - /* Shrink the allocated memory if possible. */ - if (size + 1 < alloc) + /* Shrink the allocated memory if possible, + but don't bother about the extra byte + allocated to detect the eof. */ + if (size < alloc - 2) { char *smaller_buf = realloc (buf, size + 1); if (smaller_buf != NULL)