On 10Mar2015 04:01, Paulo da Silva <p_s_d_a_s_i_l_v_a...@netcabo.pt> wrote:
On 10-03-2015 00:55, Dave Angel wrote:
On 03/09/2015 08:45 PM, Paulo da Silva wrote:
What is the best way to read a file that begins with a few text lines
and whose remainder is a binary stream?
[...]
Generally speaking, you can treat a piece of a binary (input) file as an
encoded string, so you want to open the file as binary, locate the part
that's text, and then explicitly decode the string from that.
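
As a rough sketch of that advice, assuming the number of header lines is
known in advance: a file opened in binary mode still supports readline(),
which returns bytes up to and including b'\n', so the text lines can be read
and decoded one at a time and the remainder taken with read(). The function
name read_header_and_payload, its parameters, and the in-memory sample data
are hypothetical, chosen only for illustration.

```python
import io

def read_header_and_payload(stream, header_lines, encoding="ascii"):
    """Read `header_lines` text lines, then return the remaining bytes.

    `stream` is assumed to be a binary file object, e.g. open(path, "rb").
    """
    headers = []
    for _ in range(header_lines):
        raw = stream.readline()      # bytes, up to and including b"\n"
        if not raw:
            raise ValueError("file ended before all header lines were read")
        # Decode explicitly, then drop the line ending (LF or CRLF).
        headers.append(raw.decode(encoding).rstrip("\r\n"))
    payload = stream.read()          # everything after the header lines
    return headers, payload

# In-memory stand-in for a real file opened with open(path, "rb").
demo = io.BytesIO(b"P6\n2 1\n255\n" + bytes([255, 0, 0, 0, 0, 255]))
headers, payload = read_header_and_payload(demo, header_lines=3)
```

This gives one call for the text part (readline() plus decode()) and one for
the bytes (read()), without opening the file twice.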
That's what I did. However, I was hoping for a simpler and more efficient
way. For example, one command to read text and another to read
bytes.
For .pnm photo files I read the entire file (I needed it in memory
anyway), split a copy on b'\n', got the header fields and
then used the remaining original bytes as the photo pixels.
But this is very tricky! I am on Linux, but if I ran this program on
Windows I would need to change it to also "eat" the '\r'.
If you're in Python 3 (recommended!) and you're parsing the headers as text,
you should be converting your split binary into strings anyway. So you can just
use .strip() or .rstrip(); either will remove trailing '\r' and '\n', so it will
work on both UNIX and Windows.
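
A minimal sketch of that, assuming the whole file is already in memory and
has a fixed number of header lines (real .pnm headers can also contain
comments, which this ignores). The helper name split_headers and the sample
bytes are made up for the example. Passing a maxsplit to bytes.split() keeps
any newline bytes inside the pixel data from being split.

```python
def split_headers(data, header_lines, encoding="ascii"):
    """Split off `header_lines` text lines; return (headers, remaining bytes)."""
    # maxsplit leaves any b"\n" bytes inside the binary part untouched.
    parts = data.split(b"\n", header_lines)
    if len(parts) <= header_lines:
        raise ValueError("fewer header lines than expected")
    # strip() removes a trailing '\r' if one is there, so LF and CRLF
    # headers decode to the same strings.
    headers = [p.decode(encoding).strip() for p in parts[:header_lines]]
    return headers, parts[header_lines]

pixels = bytes([10, 13, 10, 0, 0, 255])   # deliberately contains \n and \r
unix_file = b"P6\n2 1\n255\n" + pixels
windows_file = b"P6\r\n2 1\r\n255\r\n" + pixels

unix_headers, unix_pixels = split_headers(unix_file, 3)
windows_headers, windows_pixels = split_headers(windows_file, 3)
```

Both calls return the same header strings and the same pixel bytes, so no
platform-specific branch is needed.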
In the .pnm case the headers don't have special chars. They fit into
ASCII. But in a file that has them it would also be difficult to compute
the consumed length.
I presume you're gathering the headers in "binary" mode and decoding each to a
string. So you know the consumed length from the binary half; that they're
different lengths after decoding to strings is then irrelevant.
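
To illustrate, here is a small sketch assuming a single UTF-8 header line
followed by binary data (the sample bytes are invented). The offset is
measured on the bytes before decoding, so multi-byte characters do not
disturb it.

```python
# "caf\u00e9" is "café": 4 characters, but 5 bytes in UTF-8.
data = "caf\u00e9\n".encode("utf-8") + b"\x00\x01\x02"

# Find the end of the header in the *bytes*; this is the consumed length.
end = data.index(b"\n") + 1

# Decode only the header slice; the decoded length is not used for offsets.
header = data[:end].decode("utf-8").rstrip("\r\n")
payload = data[end:]

print(end, len(header), payload)
```

Searching for b'\n' in the raw bytes is safe for UTF-8 and other
ASCII-compatible encodings; an encoding such as UTF-16 would need different
handling.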
Cheers,
Cameron Simpson <c...@zip.com.au>
These are my principles, and if you don't like them, I have others.
- Groucho Marx