I need to write a "fast" file reader in python for binary files structured as:
… x[0] y[0] z[0] x[1] y[1] z[1] … where c[k] is the k-th element from sequence c. As mentioned, the file is binary -- spaces above are just for visualization. Each element, c[k], is a 16-bit int. I can assume I know the number of sequences in the file a priori. Files are stored and processed on a WinXP machine (in case Endian-ness matters). I'm looking for suggestions on how to tackle this problem, and how to design my reader class. Or if anyone has sample code: that would be appreciated, too. Some questions I have to start with: (i) should I use generators for iterating over the file? (ii) how can I handle the 16-bit word aspect of the binary data? (iii) ultimately, the data will need to be processed in chunks of M-values at a time... I assume this means I need some form of buffered io wrapper, but I'm not sure where to start with this. Thanks! Marcus Ps -- this seems like a general stream processing problem… if anyone can recommend good refs (web) on stream processing....
-- http://mail.python.org/mailman/listinfo/python-list