On 10Mar2015 22:38, Paulo da Silva <p_s_d_a_s_i_l_v_a...@netcabo.pt> wrote:
> On 10-03-2015 04:14, Cameron Simpson wrote:
>> On 10Mar2015 04:01, Paulo da Silva <p_s_d_a_s_i_l_v_a...@netcabo.pt> wrote:
>>> But this is very tricky! I am on linux, but if I ran this program on
>>> windows I needed to change it to "eat" also the '\r'.
>> If you're in Python 3 (recommended!) and you're parsing the headers as
>> text, you should be converting your split binary into strings anyway. So
>> you can just use .strip() or rstrip(); either will remove trailing '\r'
>> and '\n', so it will work in both UNIX and Windows.
> I didn't know strip removes \r.
The documentation for str.strip says it strips "whitespace" by default. In the
string module documentation it says:
string.whitespace
A string containing all ASCII characters that are considered
whitespace. This includes the characters space, tab, linefeed,
return, formfeed, and vertical tab.
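A quick way to convince yourself, as a minimal sketch (the header text here is made up for illustration):

```python
import string

# Both '\r' and '\n' are listed as whitespace.
assert "\r" in string.whitespace
assert "\n" in string.whitespace

# The same hypothetical header line with UNIX and Windows line endings.
unix_line = "Subject: hello\n"
dos_line = "Subject: hello\r\n"

# rstrip() with no argument removes all trailing whitespace,
# so both variants reduce to the same string.
unix_clean = unix_line.rstrip()
dos_clean = dos_line.rstrip()
print(unix_clean == dos_clean)
```

Note that rstrip() also removes trailing spaces and tabs, which is usually what you want for header lines.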
[...]
>> I presume you're gathering the headers in "binary" mode and decoding
>> each to a string. So you know the consumed length from the binary half;
>> that they're different lengths after decoding to strings is then
>> irrelevant.
> You are right.
> I am still a little confused about python3.
In this context the main point is that python 3 has a nice clean separation of
str (as text) and bytes (as octet sized small ints). In general that makes it
easier to work with in contexts like this because you are never confused about
which you are dealing with.
Since binary files (returning bytes from reads) also have a convenient readline
method looking for byte 10 ('\n'), this makes your current task tractable: read
"binary" lines, getting bytes objects ending in byte 10, then decode each
bytes object into a str based on the text encoding (typically utf-8, or
iso8859-1 or ascii for some protocols/formats not thinking strongly about bytes
vs text).
Once decoded, you can then work on them as text without worrying about their
former binary encoding.
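Something like this, as a rough sketch only. The function name read_headers, the blank-line terminator and the sample data are my assumptions, not anything from your program:

```python
import io

def read_headers(fp, encoding="utf-8"):
    """Read header lines from a binary file up to a blank line.

    Hypothetical helper: returns (headers, consumed), where headers is a
    list of decoded, stripped lines and consumed is the byte count read.
    """
    headers = []
    consumed = 0
    while True:
        raw = fp.readline()       # bytes, ending in b'\n' unless at EOF
        consumed += len(raw)      # count bytes *before* decoding
        line = raw.decode(encoding).rstrip()  # drops '\n' and any '\r'
        if not line:              # blank line or EOF ends the headers
            break
        headers.append(line)
    return headers, consumed

# Made-up sample: two headers with Windows line endings, then a body.
data = "Name: Zo\u00eb\r\nSize: 3\r\n\r\nabc".encode("utf-8")
fp = io.BytesIO(data)
headers, consumed = read_headers(fp)
body = fp.read()                  # whatever follows the headers, as bytes
print(headers, consumed, body)
```

The consumed count comes from the bytes objects, so it stays correct even when a decoded line has fewer characters than bytes (as with the non-ASCII name above).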
Cheers,
Cameron Simpson <c...@zip.com.au>
Institutions will try to preserve the problem to which they are the solution.
- Clay Shirky, 2012