Brad Miller <bonel...@gmail.com> added the comment:
On Thu, Mar 26, 2009 at 4:29 PM, Barry A. Warsaw <rep...@bugs.python.org> wrote:
>
> Barry A. Warsaw <ba...@python.org> added the comment:
>
> I propose that you only document the getitem header access API. I.e.
> the thing that info() gives you can be used to access the message
> headers via message['content-type']. That's an API common to both
> rfc822.Messages (the ultimate base class of mimetools.Message) and
> email.message.Message.
>
As I've found myself in the awkward position of having to explain the new
3.0 API to my students, I've thought about this and have some
ideas/questions.
I'm also willing to help with the documentation or any enhancements.
>>> x = urllib.request.urlopen('http://knuth.luther.edu/python/test.html')
>>> x['Date']
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'addinfourl' object is unsubscriptable
I wish I knew what an addinfourl object was.
>>> x.info()['Date']
'Fri, 27 Mar 2009 00:41:34 GMT'
>>> x.headers['Date']
'Fri, 27 Mar 2009 00:41:34 GMT'
>>> x.headers.keys()
['Date', 'Server', 'Last-Modified', 'ETag', 'Accept-Ranges',
'Content-Length', 'Connection', 'Content-Type']
Using x.headers over x.info() makes the most sense to me, but I don't know
that I can give any good rationale. Which would we want to document?
>>> x.headers['Content-Type']
'text/html; charset=ISO-8859-1'
I guess technically this is correct, since the charset is part of the
Content-Type header in HTTP, but it does make life difficult for what I think
will be a pretty common use case in this new urllib: read from the URL (as
bytes) and then decode the bytes into a string using the appropriate
character set.
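A minimal sketch of that use case, assuming the headers object behaves like an email.message.Message and using get_content_charset(), one of the three calls discussed next. The helper name decode_body and the UTF-8 fallback are my own assumptions, not part of urllib, and a hand-built Message stands in for x.headers so the sketch runs without a network connection.

```python
import email.message


def decode_body(raw_bytes, headers, default="utf-8"):
    """Decode raw_bytes using the charset named in the Content-Type header.

    `decode_body` and the `default` fallback are hypothetical; urllib does
    not provide this helper.
    """
    charset = headers.get_content_charset() or default
    return raw_bytes.decode(charset)


# Stand-in for x.headers (assumption: same interface as email.message.Message).
headers = email.message.Message()
headers["Content-Type"] = "text/html; charset=ISO-8859-1"

# Stand-in for x.read(): bytes encoded in the charset the header declares.
raw = "caf\xe9".encode("iso-8859-1")
text = decode_body(raw, headers)
```

With a real response this would presumably be called as decode_body(x.read(), x.headers).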
As you follow this road, you have the confusing option of these three calls:
>>> x.headers.get_charset()
>>> x.headers.get_content_charset()
'iso-8859-1'
>>> x.headers.get_charsets()
['iso-8859-1']
I think it should be a bug that get_charset() does not return anything in
this case. It is not at all clear why get_content_charset() and
get_charset() should have different behavior.
Brad
>
> ----------
> nosy: +barry
>
> _______________________________________
> Python tracker <rep...@bugs.python.org>
> <http://bugs.python.org/issue4773>
> _______________________________________
>
----------
Added file: http://bugs.python.org/file13430/unnamed
_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue4773>
_______________________________________