[issue4773] HTTPMessage not documented and has inconsistent API across 2.6/3.0

Brad Miller Thu, 26 Mar 2009 18:00:34 -0700

Brad Miller <[email protected]> added the comment:

On Thu, Mar 26, 2009 at 4:29 PM, Barry A. Warsaw <[email protected]> wrote:


>
> Barry A. Warsaw <[email protected]> added the comment:
>
> I propose that you only document the getitem header access API.  I.e.
> the thing that info() gives you can be used to access the message
> headers via message['content-type'].  That's an API common to both
> rfc822.Messages (the ultimate base class of mimetools.Message) and
> email.message.Message.
>
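
For illustration, the item-access API proposed above can be sketched with the
standard email package alone. The header text below is made up for the
example, and the parsed object is a plain email.message.Message, not the
object that urlopen() returns.

```python
import email

# Hypothetical raw header block, parsed into an email.message.Message.
raw = (
    "Date: Fri, 27 Mar 2009 00:41:34 GMT\n"
    "Content-Type: text/html; charset=ISO-8859-1\n"
    "\n"
)
msg = email.message_from_string(raw)

# Item access ignores the case of the header name.
content_type = msg['content-type']
date = msg['DATE']
# A header that is absent yields None instead of raising KeyError.
missing = msg['X-Not-There']

print(content_type)
print(date)
print(missing)
```

Because a missing header comes back as None, callers that need a hard failure
have to check for it themselves.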

Having found myself in the awkward position of explaining the new 3.0 API to
my students, I've thought about this and have some ideas and questions.
I'm also willing to help with the documentation or any enhancements.

>>> x = urllib.request.urlopen('http://knuth.luther.edu/python/test.html')
>>> x['Date']
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'addinfourl' object is unsubscriptable

I wish I knew what an addinfourl object was.

>>> x.info()['Date']
'Fri, 27 Mar 2009 00:41:34 GMT'

>>> x.headers['Date']
'Fri, 27 Mar 2009 00:41:34 GMT'

>>> x.headers.keys()
['Date', 'Server', 'Last-Modified', 'ETag', 'Accept-Ranges',
'Content-Length', 'Connection', 'Content-Type']

Using x.headers rather than x.info() makes the most sense to me, but I don't
know that I can give a good rationale.  Which would we want to document?
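
As a rough check of how the two spellings relate, the sketch below compares
them on a response object. To avoid touching the network it opens a data: URL,
which is an assumption of this example and not the page used in the session
above; the header names it reports therefore differ from a real HTTP response.

```python
import urllib.request

# A data: URL stands in for a real page so that no network access is needed.
with urllib.request.urlopen('data:text/html;charset=ISO-8859-1,hello') as x:
    # info() hands back the very object stored in the headers attribute.
    same_object = x.info() is x.headers
    content_type = x.headers['Content-Type']
    names = x.headers.keys()

print(same_object)
print(content_type)
print(names)
```

If the two always refer to one object, documenting the attribute and
mentioning info() as an alias would cover both.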

>>> x.headers['Content-Type']
'text/html; charset=ISO-8859-1'

I guess technically this is correct since the charset is part of the
Content-Type header in HTTP but it does make life difficult for what I think
will be a pretty common use case in this new urllib:  read from the url (as
bytes) and then decode them into a string using the appropriate character
set.
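
A minimal sketch of that use case, assuming the headers object offers
get_content_charset() as it does in the session further down. The helper name
read_text and its default argument are hypothetical, and a data: URL replaces
the real page so the example runs without a network.

```python
import urllib.request

def read_text(response, default='utf-8'):
    """Read the body as bytes and decode it with the declared charset.

    Falls back to `default` when Content-Type names no charset.
    """
    charset = response.headers.get_content_charset() or default
    return response.read().decode(charset)

# 'caf%E9' is the ISO-8859-1 encoding of "café", percent-escaped.
with urllib.request.urlopen('data:text/plain;charset=ISO-8859-1,caf%E9') as x:
    text = read_text(x)

print(text)
```

The fallback matters because a server may omit the charset parameter
entirely, in which case get_content_charset() returns None.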

As you follow this road, you face a confusing choice among these three calls:

>>> x.headers.get_charset()
>>> x.headers.get_content_charset()
'iso-8859-1'
>>> x.headers.get_charsets()
['iso-8859-1']

I think it should be a bug that get_charset() does not return anything in
this case.  It is not at all clear why get_content_charset() and
get_charset() should have different behavior.
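
The difference can be reproduced without a network using a bare
email.message.Message, which is an assumption standing in for the headers
object above. As far as the email package documents it, get_charset() reports
the Charset object attached to the payload through set_charset(), while
get_content_charset() reads the charset parameter of the Content-Type header.

```python
from email.message import Message

msg = Message()
msg['Content-Type'] = 'text/html; charset=ISO-8859-1'

# No Charset object was ever attached to the payload, so this is None.
payload_charset = msg.get_charset()
# Parsed from the Content-Type header and lower-cased.
header_charset = msg.get_content_charset()
# One entry per part; a non-multipart message has a single part.
all_charsets = msg.get_charsets()

print(payload_charset)
print(header_charset)
print(all_charsets)
```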

Brad

>
> ----------
> nosy: +barry
>
> _______________________________________
> Python tracker <[email protected]>
> <http://bugs.python.org/issue4773>
> _______________________________________
>

----------
Added file: http://bugs.python.org/file13430/unnamed


-- 
Brad Miller
Assistant Professor, Computer Science
Luther College
