I'm not sure what's causing this, but depending on the compression program used, the bz2 module sometimes stops reading before the end of the file.
I compressed my files with pbzip2 and read through them using Python's bz2 module. The file object always reports EOF long before the actual end of the data. If I compress the files with bzip2 instead of pbzip2, everything is fine. My files are generally big (several GB), so decompressing them first is not practical, and it is a little unfortunate that I can't use pbzip2, because it's usually much faster than bzip2.
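
A possible explanation, offered as an assumption rather than a confirmed diagnosis: pbzip2 writes its output as several bz2 streams concatenated together, and the bz2 module in Python versions before 3.3 stops after the first stream. The sketch below shows one way to read such a file in chunks without decompressing it to disk, by starting a fresh `bz2.BZ2Decompressor` whenever a stream ends. The function name `iter_multistream_bz2` and the `chunk_size` parameter are made up for illustration.

```python
import bz2
import io


def iter_multistream_bz2(fileobj, chunk_size=1 << 20):
    """Yield decompressed chunks from a binary file object that may hold
    several concatenated bz2 streams (hypothetical helper, not a stdlib API)."""
    decomp = bz2.BZ2Decompressor()
    while True:
        chunk = fileobj.read(chunk_size)
        if not chunk:
            return
        while chunk:
            try:
                data = decomp.decompress(chunk)
            except EOFError:
                # The previous stream ended exactly at a chunk boundary,
                # so this chunk starts a new stream.
                decomp = bz2.BZ2Decompressor()
                continue
            if data:
                yield data
            # unused_data holds bytes found after the end of a stream.
            leftover = decomp.unused_data
            if leftover:
                decomp = bz2.BZ2Decompressor()
            chunk = leftover


# Small in-memory demo: two concatenated streams stand in for pbzip2 output.
sample = bz2.compress(b"first ") + bz2.compress(b"second")
result = b"".join(iter_multistream_bz2(io.BytesIO(sample), chunk_size=16))
```

For a real file, the same generator could be fed `open(path, "rb")`. On Python 3.3 and later this workaround should not be needed, since `bz2.BZ2File` and `bz2.decompress()` handle multi-stream input themselves.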