On 10Mar2015 22:38, Paulo da Silva <p_s_d_a_s_i_l_v_a...@netcabo.pt> wrote:
On 10-03-2015 04:14, Cameron Simpson wrote:
On 10Mar2015 04:01, Paulo da Silva <p_s_d_a_s_i_l_v_a...@netcabo.pt> wrote:
But this is very tricky! I am on linux, but if I ran this program on
windows I needed to change it to "eat" also the '\r'.

If you're in Python 3 (recommended!) and you're parsing the headers as
text, you should be converting your split binary into strings anyway. So
you can just use .strip() or rstrip(); either will remove trailing '\r'
and '\n', so it will work in both UNIX and Windows.

I didn't know strip removes \r.

The documentation for str.strip says it strips "whitespace" by default. In the string module doco it says:

 string.whitespace
   A string containing all ASCII characters that are considered
   whitespace.  This includes the characters space, tab, linefeed,
   return, formfeed, and vertical tab.
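
A quick way to check this at the interpreter is the sketch below. The header text is made up for illustration. Note that str.rstrip() with no argument actually strips a somewhat wider set of Unicode whitespace than string.whitespace, but '\r' and '\n' are covered either way.

```python
import string

# Hypothetical header line with each line-ending style, as it would look
# after decoding.
unix_line = "Content-Type: text/plain\n"
windows_line = "Content-Type: text/plain\r\n"

# Both '\r' and '\n' are listed in string.whitespace.
cr_and_lf_are_whitespace = "\r" in string.whitespace and "\n" in string.whitespace

# rstrip() with no argument removes all trailing whitespace, so both
# variants reduce to the same text.
unix_clean = unix_line.rstrip()
windows_clean = windows_line.rstrip()
print(unix_clean == windows_clean)

# bytes objects have the same method; there it strips ASCII whitespace.
raw_clean = b"Content-Type: text/plain\r\n".rstrip()
print(raw_clean)
```

If you only want to drop the line ending and keep other trailing whitespace, pass the characters explicitly: line.rstrip("\r\n").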

[...]
I presume you're gathering the headers in "binary" mode and decoding
each to a string. So you know the consumed length from the binary half;
that they're different lengths after decoding to strings is then
irrelevant.
You are right.
I am still a little confused about python3.

In this context the main point is that Python 3 has a nice clean separation of str (text) and bytes (octet-sized small ints). In general that makes it easier to work in contexts like this, because you are never confused about which one you are dealing with.

Since binary files (returning bytes from reads) also have a convenient readline method looking for byte 10 ('\n'), this makes your current task tractable: read "binary" lines, getting bytes objects ending in byte 10, then decode each bytes object into a str object based on the text encoding (typically utf-8, or iso8859-1 or ascii for some protocols/formats not thinking strongly about bytes vs text).

Once decoded, you can then work on them as text without worrying about their former binary encoding.
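
As a rough sketch of that approach, and not your actual program: the function name read_headers, the sample data, and the rule that a blank line ends the headers are all assumptions made for the example. It counts the consumed length on the bytes side, before decoding, so it does not matter that the decoded strings are a different length.

```python
import io


def read_headers(fp, encoding="utf-8"):
    """Read header lines from a binary file until a blank line or EOF.

    Returns (headers, consumed): the decoded lines without their line
    endings, and the number of bytes read from fp.
    """
    headers = []
    consumed = 0
    while True:
        raw = fp.readline()    # bytes, ending in b'\n' unless at EOF
        consumed += len(raw)   # count bytes, not decoded characters
        if not raw:
            break              # EOF
        line = raw.decode(encoding).rstrip("\r\n")
        if not line:
            break              # blank line: end of headers (assumed format)
        headers.append(line)
    return headers, consumed


# Made-up input: two headers with Windows line endings, a blank line,
# then a binary payload. The first header holds a 2-byte UTF-8 character.
data = b"Name: caf\xc3\xa9\r\nLength: 5\r\n\r\nBODY!"
fp = io.BytesIO(data)
headers, consumed = read_headers(fp)
position = fp.tell()   # file position now matches the consumed byte count
body = fp.read()       # the rest is still untouched binary data
print(headers, consumed, body)
```

The same function should work on a real file opened with open(path, "rb"), since BytesIO and binary files share the readline interface.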

Cheers,
Cameron Simpson <c...@zip.com.au>

Institutions will try to preserve the problem to which they are the solution.
- Clay Shirky, 2012
--
https://mail.python.org/mailman/listinfo/python-list
