[issue46337] urllib.parse: Allow more flexibility in schemes and URL resolution behavior
karl added the comment: Just to note that IANA maintains a list of officially registered schemes: https://www.iana.org/assignments/uri-schemes/uri-schemes.xhtml In addition, Wikipedia has a list of unofficial schemes: https://en.wikipedia.org/wiki/List_of_URI_schemes#Unofficial_but_common_URI_schemes -- nosy: +karlcow ___ Python tracker <https://bugs.python.org/issue46337> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue13294] http.server - HEAD request when no resource is defined.
New submission from karl : A very simple HTTP server:

```python
#!/usr/bin/python3
import http.server
from os import chdir

# CONFIG
ROOTPATH = '/Your/path/'
PORT = 8000

# CODE
def run(server_class=http.server.HTTPServer,
        server_handler=http.server.SimpleHTTPRequestHandler):
    server_address = ('', PORT)
    httpd = server_class(server_address, server_handler)
    httpd.serve_forever()

class MyRequestHandler(http.server.SimpleHTTPRequestHandler):
    def do_GET(self):
        pass

if __name__ == '__main__':
    chdir(ROOTPATH)
    print("server started on PORT: " + str(PORT))
    run(server_handler=MyRequestHandler)
```

Let's start the server:

% python3 serveur1.py
server started on PORT: 8000

And let's do a GET request with curl:

% curl -v http://localhost:8000/
* About to connect() to localhost port 8000 (#0)
* Trying ::1... Connection refused
* Trying 127.0.0.1... connected
* Connected to localhost (127.0.0.1) port 8000 (#0)
> GET / HTTP/1.1
> User-Agent: curl/7.21.4 (universal-apple-darwin11.0) libcurl/7.21.4 OpenSSL/0.9.8r zlib/1.2.5
> Host: localhost:8000
> Accept: */*
>
* Empty reply from server
* Connection #0 to host localhost left intact
curl: (52) Empty reply from server
* Closing connection #0

The server sends nothing, because do_GET is overridden to do nothing and I haven't defined anything in case of errors. So far so good. Now let's do a HEAD request on the same resource:

% curl -vsI http://localhost:8000/
* About to connect() to localhost port 8000 (#0)
* Trying ::1... Connection refused
* Trying 127.0.0.1... connected
* Connected to localhost (127.0.0.1) port 8000 (#0)
> HEAD / HTTP/1.1
> User-Agent: curl/7.21.4 (universal-apple-darwin11.0) libcurl/7.21.4 OpenSSL/0.9.8r zlib/1.2.5
> Host: localhost:8000
> Accept: */*
>
* HTTP 1.0, assume close after body
< HTTP/1.0 200 OK
HTTP/1.0 200 OK
< Server: SimpleHTTP/0.6 Python/3.1.2
Server: SimpleHTTP/0.6 Python/3.1.2
< Date: Sun, 30 Oct 2011 14:19:00 GMT
Date: Sun, 30 Oct 2011 14:19:00 GMT
< Content-type: text/html; charset=utf-8
Content-type: text/html; charset=utf-8
< Content-Length: 346
Content-Length: 346
<
* Closing connection #0

The server shows the request in its log:

localhost - - [30/Oct/2011 10:19:00] "HEAD / HTTP/1.1" 200 -

and answers it. I would suggest that the default behavior for HEAD be similar to the one for GET here, i.e. nothing; or that the library be modified so that, for any resource not yet defined, the server answers 403 Forbidden. I could submit a patch in the next few days. -- components: Library (Lib) messages: 146639 nosy: karlcow, orsenthil priority: normal severity: normal status: open title: http.server - HEAD request when no resource is defined. type: feature request versions: Python 3.1, Python 3.2 ___ Python tracker <http://bugs.python.org/issue13294> ___
[issue13295] html5 template for Lib/http/server.py
New submission from karl : The code has a set of old HTML templates. Here is a patch to change it to very simple html5 templates. -- components: Library (Lib) files: server-html5.patch keywords: patch messages: 146641 nosy: karlcow, orsenthil priority: normal severity: normal status: open title: html5 template for Lib/http/server.py type: feature request versions: Python 3.1, Python 3.2 Added file: http://bugs.python.org/file23554/server-html5.patch ___ Python tracker <http://bugs.python.org/issue13295> ___
[issue13295] html5 template for Lib/http/server.py
karl added the comment: Ezio, Martin, HTML 3.2 and HTML 4.01 are not outdated; they have stable specifications. That said, their doctypes have no influence at all in browsers. The html5 doctype was chosen because it is the minimal string of characters that puts browsers into strict rendering mode (see Quirks Mode in CSS). The W3C validator is the only tool implementing an SGML parser able to understand HTML 3.2 and HTML 4.01. Note also that the W3C validator includes an html5 validator, if the concern is the validity of the output. -- ___ Python tracker <http://bugs.python.org/issue13295> ___
[issue13295] html5 template for Lib/http/server.py
karl added the comment: Yup. It doesn't bring anything except putting the output in line with the reality of browser implementations. You may close it, I don't mind. -- ___ Python tracker <http://bugs.python.org/issue13295> ___
[issue13294] http.server - HEAD request when no resource is defined.
karl added the comment: Eric, two possible solutions to explore: either HEAD reports exactly the same thing as a GET without the body (that is the role of HEAD), which indeed means adding support for HEAD; or we create a catch-all answer for any unknown or unimplemented method, with a "501 Method not implemented" response from the server. Right now HEAD returns something :) I still need to propose a patch; the day job gets in the way :) -- ___ Python tracker <http://bugs.python.org/issue13294> ___
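The second option could be sketched like this (the handler name here is hypothetical, not the actual patch). Note that BaseHTTPRequestHandler already answers 501 for any method that has no matching do_<METHOD> handler, so the catch-all mostly amounts to not inheriting SimpleHTTPRequestHandler's default do_HEAD:

```python
import http.server

class StrictHandler(http.server.SimpleHTTPRequestHandler):
    # Hypothetical sketch: instead of inheriting the default do_HEAD,
    # answer 501 so an unimplemented method is explicit. Methods with
    # no do_<METHOD> at all already get a 501 from the base class.
    def do_HEAD(self):
        self.send_error(501, "Method not implemented")
```

A HEAD request against a server built with this handler then gets a clean 501 instead of the inherited 200.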
[issue2193] Cookie Colon Name Bug
karl added the comment: @Luke did you have the opportunity to look at http://greenbytes.de/tech/webdav/rfc6265.html ? If there is something in that document which doesn't match reality, it would be good to have feedback about it. -- ___ Python tracker <http://bugs.python.org/issue2193> ___
[issue6098] xml.dom.minidom incorrectly claims DOM Level 3 conformance
karl added the comment: The source of 3.1/lib/python3.1/xml/dom/__init__.py is correct: === minidom -- A simple implementation of the Level 1 DOM with namespace support added (based on the Level 2 specification) and other minor Level 2 functionality. === Even the Level 2 implementation is partial. Some of the Level 3 implementation is based on a 9 April 2002 Working Draft; comments like this one are in the code: # Node interfaces from Level 3 (WD 9 April 2002) Note that a big change in the xml.dom code will be needed to implement Web DOM Core in the future; maybe an xml.dom.webdomcore would be welcome. http://www.w3.org/TR/domcore/ The request is valid. -- nosy: +karlcow ___ Python tracker <http://bugs.python.org/issue6098> ___
[issue5762] AttributeError: 'NoneType' object has no attribute 'replace'
karl added the comment: The markup that triggers the error described earlier in the comments is a root element carrying an empty default namespace declaration; the same markup without xmlns="" serializes correctly. So the mistake really occurs when xmlns="". I have checked, and such markup is conformant according to the XML specification: xmlns="" or bar="" are conformant on the root element. XML Namespaces are defined in another specification, http://www.w3.org/TR/REC-xml-names/. In the section on namespace defaulting, http://www.w3.org/TR/REC-xml-names/#defaulting, the specification is clear: "The attribute value in a default namespace declaration MAY be empty. This has the same effect, within the scope of the declaration, of there being no default namespace." The "if data:" guard proposed earlier in the comments solves the issue. I have attached a unit testcase as requested by Mark Lawrence (BreamoreBoy). -- keywords: +patch nosy: +karlcow Added file: http://bugs.python.org/file19239/test-minidom-xmlns.patch ___ Python tracker <http://bugs.python.org/issue5762> ___
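A minimal reproduction sketch (the element and attribute names are just examples): on affected versions, serializing a document whose root carries xmlns="" raised the AttributeError from this issue's title; with the proposed guard it round-trips cleanly:

```python
from xml.dom.minidom import parseString

# xmlns="" (an empty default namespace declaration) is legal XML.
doc = parseString('<root xmlns="" bar=""/>')
print(doc.documentElement.toxml())
```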
[issue2193] Cookie Colon Name Bug
karl added the comment: The rules for parsing and for setting cookies are different. A server should always produce strict cookies, so the production rules should follow the specification. Adam Barth is working right now on an update of the "HTTP State Management Mechanism" specification; see http://tools.ietf.org/html/draft-ietf-httpstate-cookie The name production rules are still the "token" rules defined in RFC 2616. Which characters browsers ignore or not varies from browser to browser. (The IETF server is down right now, and I can't link to the appropriate section for parsing the values.) -- nosy: +karlcow ___ Python tracker <http://bugs.python.org/issue2193> ___
[issue2193] Cookie Colon Name Bug
karl added the comment: Ah, the server is back. The rules for user agents are defined here: http://tools.ietf.org/html/draft-ietf-httpstate-cookie#section-5 -- ___ Python tracker <http://bugs.python.org/issue2193> ___
[issue2193] Cookie Colon Name Bug
karl added the comment: John: Ah sorry if I misunderstood. The bug seems to be about the cookie name and which characters are legal in it. What I was trying to say is that the processing of the cookie name differs depending on whether you are a client or a server, *and* that there is a specification being developed by Adam Barth (with browser vendors) to obsolete RFC 2109.

In the case of a server sending Set-Cookie: Name=Value to the client, the production rules must always be strict. That is, when the module is used for creating a cookie, the "colon" character is indeed forbidden. The "token" syntax for valid and invalid characters is now the one defined in RFC 2616. It means that any US-ASCII character is authorized EXCEPT: the control characters (octets 0-31) and DEL (octet 127); the characters "(", ")", "<", ">", "@", ",", ";", ":", "\", "/", "[", "]", "?", "=", "{", "}"; the double quote character itself; US-ASCII SP (octet 32); and the horizontal tab (octet 9).

If you use the Cookie module for a client, it is no longer the same story. In the case of a client storing the value of a cookie sent by a server, see section "5.2. The Set-Cookie Header", http://tools.ietf.org/html/draft-ietf-httpstate-cookie-20#section-5.2 quote: "If the user agent does not ignore the Set-Cookie header field in its entirety, the user agent MUST parse the field-value of the Set-Cookie header field as a set-cookie-string (defined below). NOTE: The algorithm below is more permissive than the grammar in Section 4.1. For example, the algorithm strips leading and trailing whitespace from the cookie name and value (but maintains internal whitespace), whereas the grammar in Section 4.1 forbids whitespace in these positions. User agents use this algorithm so as to interoperate with servers that do not follow the recommendations in Section 4." /quote Then the algorithm is described.

Which means that what the server parses back will not necessarily be what the server generated. Section 5.4 describes how the Cookie header should be sent to the server, with an algorithm for what the server will receive. John, do you think there is a missing algorithm for parsing the value of the Cookie header when sent by the client? -- ___ Python tracker <http://bugs.python.org/issue2193> ___
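On the production side, today's http.cookies module does enforce strict names: an illegal character in a cookie name raises CookieError. A small illustration (the cookie names are just examples):

```python
from http.cookies import SimpleCookie, CookieError

c = SimpleCookie()
c['session_id'] = 'abc123'      # a legal token name is accepted
print(c.output())

try:
    c['bad key'] = 'x'          # a separator character (here: space) is rejected
except CookieError as exc:
    print('rejected:', exc)
```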
[issue2193] Cookie Colon Name Bug
karl added the comment: Agreed. :) Hence my question about parsing rules for libraries: is interoperability a goal here? -- ___ Python tracker <http://bugs.python.org/issue2193> ___
[issue2193] Cookie Colon Name Bug
karl added the comment: @aclover see my comment http://bugs.python.org/issue2193#msg125423 Adam Barth works for Google on Chrome. The RFC is being written in cooperation with other browser developers. If you have comments about this RFC, you are welcome to raise them on Freenode in #whatwg. -- ___ Python tracker <http://bugs.python.org/issue2193> ___
[issue18182] xml.dom.createElement() does not take implicit namespaces into account
karl added the comment: The current specification, as of today, documents https://dom.spec.whatwg.org/#dom-document-createelementns

If you run this in the browser console:

```js
var nsdoc = 'http://foo.bar/zoo';
var xmldoc = document.implementation.createDocument(nsdoc, 'Zoo', null);
var cpd = document.createElementNS(nsdoc, 'Compound');
var chimp = document.createElementNS(nsdoc, 'Chimp');
cpd.appendChild(chimp);
xmldoc.documentElement.appendChild(cpd);
/* serializing */
var docserializer = new XMLSerializer();
var flatxml = docserializer.serializeToString(xmldoc);
flatxml
```

you get a document entirely in the http://foo.bar/zoo namespace:

<Zoo xmlns="http://foo.bar/zoo"><Compound><Chimp/></Compound></Zoo>

But if you run the same thing with createElement instead of createElementNS:

```js
var nsdoc = 'http://foo.bar/zoo';
var xmldoc = document.implementation.createDocument(nsdoc, 'Zoo', null);
var cpd = document.createElement('Compound');
var chimp = document.createElement('Chimp');
cpd.appendChild(chimp);
xmldoc.documentElement.appendChild(cpd);
/* serializing */
var docserializer = new XMLSerializer();
var flatxml = docserializer.serializeToString(xmldoc);
flatxml
```

you get:

<Zoo xmlns="http://foo.bar/zoo"><Compound xmlns="http://www.w3.org/1999/xhtml"><Chimp/></Compound></Zoo>

createElement puts the elements in the XHTML namespace http://www.w3.org/1999/xhtml, which is a completely different beast. I don't think there is an issue here, and we can close this bug safely. -- nosy: +karlcow ___ Python tracker <https://bugs.python.org/issue18182> ___
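For comparison, a sketch of a similar construction with Python's minidom (reusing the example namespace URI and names from above): createElementNS records the namespace on the node, but minidom's serializer does not emit xmlns declarations for you, which is essentially the limitation this issue is about:

```python
from xml.dom.minidom import getDOMImplementation

NS = 'http://foo.bar/zoo'
impl = getDOMImplementation()
doc = impl.createDocument(NS, 'Zoo', None)

cpd = doc.createElementNS(NS, 'Compound')
doc.documentElement.appendChild(cpd)

# The namespace is tracked on the node object...
print(cpd.namespaceURI)  # http://foo.bar/zoo
# ...but no xmlns attribute is serialized unless you set one yourself.
print(doc.documentElement.toxml())
```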
[issue22377] %Z in strptime doesn't match EST and others
karl added the comment: I created a PR following the recommendations of p-ganssle https://github.com/python/cpython/pull/16507 -- nosy: +karlcow ___ Python tracker <https://bugs.python.org/issue22377> ___
[issue19683] test_minidom has many empty tests
karl added the comment: @zach.ware @r.david.murray So I was looking at that issue. There is a lot of work. I have a couple of questions, because there are different categories.

# Empty tests for existing functions.

This seems straightforward, as filling them in would complete the module. Example:

```python
def testGetAttributeNode(self):
    pass
```

https://github.com/python/cpython/blob/3e04cd268ee9a57f95dc78d8974b21a6fac3f666/Lib/test/test_minidom.py#L412 which refers to `getAttributeNode`: https://github.com/python/cpython/blob/3e04cd268ee9a57f95dc78d8974b21a6fac3f666/Lib/xml/dom/minidom.py#L765-L768 https://github.com/python/cpython/blob/3e04cd268ee9a57f95dc78d8974b21a6fac3f666/Lib/test/test_minidom.py#L285-L294

# Tests without any logical reference in the module.

This is puzzling because I'm not sure which DOM feature they should be testing. For example:

```python
def testGetAttrList(self):
    pass
```

https://github.com/python/cpython/blob/3e04cd268ee9a57f95dc78d8974b21a6fac3f666/Lib/test/test_minidom.py#L383-L384

Or maybe this is just supposed to test Element.attributes returning a list of attributes, such as the `NamedNodeMap [ def="ghi", jkl="mno" ]` returned by a browser:

```python
>>> import xml.dom.minidom
>>> from xml.dom.minidom import parse, Node, Document, parseString
>>> from xml.dom.minidom import getDOMImplementation
>>> dom = parseString("<abc/>")
>>> el = dom.documentElement
>>> el.setAttribute("def", "ghi")
>>> el.setAttribute("jkl", "mno")
>>> el.attributes
```

or is it supposed to test something like:

```python
>>> el.attributes.items()
[('def', 'ghi'), ('jkl', 'mno')]
```

This is slightly confusing, and the missing docstrings are not making it easier.

# Tests which do not really test the module(?)

I think for example of this one, which tests that `del` works, but which doesn't have anything to do with the DOM:

```python
def testDeleteAttr(self):
    dom = Document()
    child = dom.appendChild(dom.createElement("abc"))
    self.confirm(len(child.attributes) == 0)
    child.setAttribute("def", "ghi")
    self.confirm(len(child.attributes) == 1)
    del child.attributes["def"]
    self.confirm(len(child.attributes) == 0)
    dom.unlink()
```

https://github.com/python/cpython/blob/3e04cd268ee9a57f95dc78d8974b21a6fac3f666/Lib/test/test_minidom.py#L285-L294

especially when there is a function for it, `removeAttribute` (https://www.w3.org/TR/2000/REC-DOM-Level-2-Core-20001113/core.html#ID-6D6AC0F9), which is tested just below that test: https://github.com/python/cpython/blob/3e04cd268ee9a57f95dc78d8974b21a6fac3f666/Lib/test/test_minidom.py#L296-L305 So I guess these should be removed, or am I missing something in the testing logic?

# Missing docstrings.

Both the testing module and the module itself lack a lot of docstrings. Would it be good to fix this too, probably in a separate commit?

# DOM Level 2

So the module's intent is to implement DOM Level 2, but does that make sense in light of https://dom.spec.whatwg.org/ ? Should minidom try to follow the current DOM spec? -- nosy: +karlcow ___ Python tracker <https://bugs.python.org/issue19683> ___
[issue19683] test_minidom has many empty tests
karl added the comment: err… erratum to my previous comment. The module docstring actually says: """ Simple implementation of the Level 1 DOM. Namespaces and other minor Level 2 features are also supported. """ https://github.com/python/cpython/blob/c65119d5bfded03f80a9805889391b66fa7bf551/Lib/xml/dom/minidom.py#L1-L3 https://www.w3.org/TR/REC-DOM-Level-1/ -- ___ Python tracker <https://bugs.python.org/issue19683> ___
[issue9004] datetime.utctimetuple() should not set tm_isdst flag to 0
karl added the comment: @gaurav The pull request https://github.com/python/cpython/pull/10870 has been closed in favor of https://github.com/python/cpython/pull/15773 which has already been merged. So we can probably close here. -- message_count: 7.0 -> 8.0 nosy: +karlcow nosy_count: 7.0 -> 8.0 pull_requests: +16145 pull_request: https://github.com/python/cpython/pull/15773 ___ Python tracker <https://bugs.python.org/issue9004> ___
[issue1375011] http.cookies, Cookie.py: Improper handling of duplicate cookies
karl added the comment: Relevant spec https://tools.ietf.org/html/rfc6265 -- nosy: +karlcow ___ Python tracker <https://bugs.python.org/issue1375011> ___
[issue44423] copy2 / sendfile fails on linux with large file
New submission from karl : When copying a large file, e.g.

-rwxrwxr-x 1 1002 1001 5359338160 Feb 9 2019 xxx_file_xxx.mdx

copy2 / sendfile / fastcopy fails with:

```
Traceback (most recent call last):
  File "/usr/lib/python3.8/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "/usr/lib/python3.8/multiprocessing/pool.py", line 48, in mapstar
    return list(map(*args))
  File "/usr/local/lib/python3.8/dist-packages/pybcpy/diff_bak_copy.py", line 212, in _init_copy_single
    shutil.copy2(f, dest_path)
  File "/usr/lib/python3.8/shutil.py", line 432, in copy2
    copyfile(src, dst, follow_symlinks=follow_symlinks)
  File "/usr/lib/python3.8/shutil.py", line 272, in copyfile
    _fastcopy_sendfile(fsrc, fdst)
  File "/usr/lib/python3.8/shutil.py", line 169, in _fastcopy_sendfile
    raise err
  File "/usr/lib/python3.8/shutil.py", line 149, in _fastcopy_sendfile
    sent = os.sendfile(outfd, infd, offset, blocksize)
OSError: [Errno 75] Value too large for defined data type: 'xxx_file_xxx.mdx' -> 'dest/xxx_file_xxx.mdx'
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/usr/local/lib/python3.8/dist-packages/pybcpy/__main__.py", line 433, in <module>
    main_func()
  File "/usr/local/lib/python3.8/dist-packages/pybcpy/__main__.py", line 425, in main_func
    args.func(args)
  File "/usr/local/lib/python3.8/dist-packages/pybcpy/__main__.py", line 75, in cmd_init
    ) = dbak.init_backup_repo(tarmode=args.tar)
  File "/usr/local/lib/python3.8/dist-packages/pybcpy/diff_bak_copy.py", line 231, in init_backup_repo
    files = p.map(self._init_copy_single, ifiles)
  File "/usr/lib/python3.8/multiprocessing/pool.py", line 364, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
  File "/usr/lib/python3.8/multiprocessing/pool.py", line 771, in get
    raise self._value
OSError: [Errno 75] Value too large for defined data type: 'xxx_file_xxx.mdx' -> 'dest/xxx_file_xxx.mdx'
```

Reference to the calling code: https://github.com/kr-g/pybcpy/blob/master/pybcpy/diff_bak_copy.py -- messages: 395862 nosy: kr-g priority: normal severity: normal status: open title: copy2 / sendfile fails on linux with large file type: crash versions: Python 3.8 ___ Python tracker <https://bugs.python.org/issue44423> ___
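Until the root cause is identified, a user-level workaround can be sketched as follows (a hypothetical helper, not part of shutil): fall back to a plain buffered copy when the sendfile() fast path fails with EOVERFLOW (errno 75):

```python
import errno
import shutil

def copy2_with_fallback(src, dst):
    """Hypothetical workaround sketch: if shutil.copy2's fast path fails
    with EOVERFLOW (errno 75, as in this report), retry with a plain
    buffered copy via shutil.copyfileobj and re-apply the metadata."""
    try:
        shutil.copy2(src, dst)
    except OSError as exc:
        if exc.errno != errno.EOVERFLOW:
            raise
        with open(src, 'rb') as fsrc, open(dst, 'wb') as fdst:
            shutil.copyfileobj(fsrc, fdst)
        shutil.copystat(src, dst)
```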
[issue44423] copy2 / sendfile fails on linux with large file
karl added the comment: could not reproduce the error -- stage: -> resolved status: open -> closed ___ Python tracker <https://bugs.python.org/issue44423> ___
[issue25479] Increase unit test coverage for abc.py
Change by karl : -- keywords: +patch nosy: +karlcow nosy_count: 2.0 -> 3.0 pull_requests: +22875 pull_request: https://github.com/python/cpython/pull/24034 ___ Python tracker <https://bugs.python.org/issue25479> ___
[issue25479] Increase unit test coverage for abc.py
karl added the comment: @iritkatriel GitHub PR done: https://github.com/python/cpython/pull/24034 -- ___ Python tracker <https://bugs.python.org/issue25479> ___
[issue4643] cgitb.html fails if getattr call raises exception
Change by karl : -- nosy: +karlcow nosy_count: 4.0 -> 5.0 pull_requests: +22878 stage: test needed -> patch review pull_request: https://github.com/python/cpython/pull/24038 ___ Python tracker <https://bugs.python.org/issue4643> ___
[issue4643] cgitb.html fails if getattr call raises exception
karl added the comment: Converted into GitHub PR https://github.com/python/cpython/pull/24038 -- ___ Python tracker <https://bugs.python.org/issue4643> ___
[issue4643] cgitb.html fails if getattr call raises exception
karl added the comment: > The getattr call here has a default value, so it should not raise > AttributeError. It should also not raise any other exception because a valid > implementation of __getattr__ should raise only AttributeError: True, but the intent of the patch here is probably to surface a meaningful error about the script being inspected rather than an error in the cgitb library itself, since cgitb is a traceback tool. A bit like an HTML validator which keeps processing in order to produce something meaningful. Diving into previous issues about scanvars: https://bugs.python.org/issue966992 and the similar https://bugs.python.org/issue1047397 The last comment there is basically this issue. There is a patch which is a lot better for this bug and which addresses the issues here: https://github.com/python/cpython/pull/15094 It has not been merged yet; not sure why, or whether anything is missing. This bug could probably be closed as a duplicate of https://bugs.python.org/issue966992 and someone needs to push the latest bits for https://github.com/python/cpython/pull/15094 What do you think @iritkatriel? I will close my PR. -- ___ Python tracker <https://bugs.python.org/issue4643> ___
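The point quoted above can be illustrated with a minimal sketch (the class is hypothetical): getattr's default only swallows AttributeError, so a misbehaving __getattr__ can still blow up inside a traceback formatter:

```python
class Misbehaving:
    # A broken __getattr__: a valid implementation should raise
    # only AttributeError, but this one raises something else.
    def __getattr__(self, name):
        raise RuntimeError("boom")

obj = Misbehaving()
try:
    getattr(obj, "missing", None)  # the default value does not help here
except RuntimeError as exc:
    print("escaped:", exc)
```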
[issue41748] HTMLParser: parsing error
karl added the comment: Ezio, TL;DR: testing in browsers and adding two tests for this issue. Should I create a PR just for the tests? https://github.com/python/cpython/blame/63298930fb531ba2bb4f23bc3b915dbf1e17e9e1/Lib/test/test_htmlparser.py#L479-L485

A: comma without spaces
---
Test for browsers: data:text/html,<div class=bar,baz=asd>text</div>
Serializations:
* Firefox, Gecko (86.0a1 (2020-12-28) (64-bit))
* Edge, Blink (Version 89.0.752.0 (Version officielle) Canary (64 bits))
* Safari, WebKit (Release 117 (Safari 14.1, WebKit 16611.1.7.2))
Same serialization in these 3 rendering engines:
<div class="bar,baz=asd">text</div>

Adding:

```python
def test_comma_between_unquoted_attributes(self):
    # bpo 41748
    self._run_check('<div class=bar,baz=asd>',
                    [('starttag', 'div', [('class', 'bar,baz=asd')])])
```

❯ ./python.exe -m test -v test_htmlparser
…
test_comma_between_unquoted_attributes (test.test_htmlparser.HTMLParserTestCase) ... ok
…
Ran 47 tests in 0.168s
OK
== Tests result: SUCCESS ==
1 test OK.
Total duration: 369 ms
Tests result: SUCCESS

So this is working as expected for the first test.

B: comma with spaces
---
Test for browsers: data:text/html,<div class=bar ,baz=asd>text</div>
Serializations:
* Firefox, Gecko (86.0a1 (2020-12-28) (64-bit))
* Edge, Blink (Version 89.0.752.0 (Version officielle) Canary (64 bits))
* Safari, WebKit (Release 117 (Safari 14.1, WebKit 16611.1.7.2))
Same serialization in these 3 rendering engines:
<div class="bar" ,baz="asd">text</div>

Adding:

```python
def test_comma_with_space_between_unquoted_attributes(self):
    # bpo 41748
    self._run_check('<div class=bar ,baz=asd>',
                    [('starttag', 'div', [('class', 'bar'), (',baz', 'asd')])])
```

❯ ./python.exe -m test -v test_htmlparser

This is failing:

== FAIL: test_comma_with_space_between_unquoted_attributes (test.test_htmlparser.HTMLParserTestCase)
--
Traceback (most recent call last):
  File "/Users/karl/code/cpython/Lib/test/test_htmlparser.py", line 493, in test_comma_with_space_between_unquoted_attributes
    self._run_check('<div class=bar ,baz=asd>',
  File "/Users/karl/code/cpython/Lib/test/test_htmlparser.py", line 95, in _run_check
    self.fail("received events did not match expected events" +
AssertionError: received events did not match expected events
Source: '<div class=bar ,baz=asd>'
Expected: [('starttag', 'div', [('class', 'bar'), (',baz', 'asd')])]
Received: [('data', '')]
--

I started to look into the code of parser.py, which I'm not familiar with (yet): https://github.com/python/cpython/blob/63298930fb531ba2bb4f23bc3b915dbf1e17e9e1/Lib/html/parser.py#L42-L52 Do you have a suggestion to fix it? -- nosy: +karlcow ___ Python tracker <https://bugs.python.org/issue41748> ___
[issue41748] HTMLParser: comma in attribute values with/without space
Change by karl : -- title: HTMLParser: parsing error -> HTMLParser: comma in attribute values with/without space ___ Python tracker <https://bugs.python.org/issue41748> ___
[issue41748] HTMLParser: comma in attribute values with/without space
karl added the comment: Ah! This is fixing it:

```diff
diff --git a/Lib/html/parser.py b/Lib/html/parser.py
index 6083077981..790666 100644
--- a/Lib/html/parser.py
+++ b/Lib/html/parser.py
@@ -44,7 +44,7 @@
   (?:\s*=+\s*                # value indicator
     (?:'[^']*'               # LITA-enclosed value
       |"[^"]*"               # LIT-enclosed value
-      |(?!['"])[^>\s]*       # bare value
+      |(?!['"])[^>]*         # bare value
     )
     (?:\s*,)*                # possibly followed by a comma
   )?(?:\s|/(?!>))*
```

Ran 48 tests in 0.175s
OK
== Tests result: SUCCESS ==
-- ___ Python tracker <https://bugs.python.org/issue41748> ___
[issue41748] HTMLParser: comma in attribute values with/without space
Change by karl : -- keywords: +patch pull_requests: +22904 stage: test needed -> patch review pull_request: https://github.com/python/cpython/pull/24072 ___ Python tracker <https://bugs.python.org/issue41748> ___
[issue28937] str.split(): allow removing empty strings (when sep is not None)
Change by karl : -- nosy: +karlcow ___ Python tracker <https://bugs.python.org/issue28937> ___
[issue42821] HTMLParser: subsequent duplicate attributes should be ignored
New submission from karl : This came up while working on issue 41748.

browser input: data:text/html,<div class=bar class=foo>text</div>
browser output: <div class="bar">text</div>

Actual HTMLParser output (see https://github.com/python/cpython/pull/24072#discussion_r551158342):
('starttag', 'div', [('class', 'bar'), ('class', 'foo')])

Expected HTMLParser output:
('starttag', 'div', [('class', 'bar')])

-- components: Library (Lib) messages: 384308 nosy: karlcow priority: normal severity: normal status: open title: HTMLParser: subsequent duplicate attributes should be ignored versions: Python 3.10 ___ Python tracker <https://bugs.python.org/issue42821> ___
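A sketch of the expected behavior (a hypothetical helper, not part of the HTMLParser API): keep only the first occurrence of each attribute name, as browsers do:

```python
def dedupe_attrs(attrs):
    # Keep only the first occurrence of each attribute name,
    # matching what browsers do with duplicate attributes.
    seen = {}
    for name, value in attrs:
        seen.setdefault(name, value)  # later duplicates are ignored
    return list(seen.items())

print(dedupe_attrs([('class', 'bar'), ('class', 'foo')]))  # [('class', 'bar')]
```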
[issue25258] HtmlParser doesn't handle void element tags correctly
karl added the comment: The parsing rules for tokenization of HTML are at https://html.spec.whatwg.org/multipage/parsing.html#tokenization In the stack of open elements, there are specific rules for certain elements: https://html.spec.whatwg.org/multipage/parsing.html#special From a DOM point of view, there is indeed no difference between <img src="somewhere"> and <img src="somewhere"/>: https://software.hixie.ch/utilities/js/live-dom-viewer/?%3C!DOCTYPE%20html%3E%0A%3Cimg%20src%3D%22somewhere%22%3E%3Cimg%20src%3D%22somewhere%22%2F%3E -- nosy: +karlcow ___ Python tracker <https://bugs.python.org/issue25258> ___
[issue25258] HtmlParser doesn't handle void element tags correctly
karl added the comment: I wonder if the confusion comes from the name. HTMLParser is more of a tokenizer than a full HTML parser, but that's probably a detail. It doesn't create a DOM tree you can access, but it could help you build one (which is not the same thing as a DOM Document object). https://html.spec.whatwg.org/multipage/parsing.html#overview-of-the-parsing-model

> Implementations that do not support scripting do not have to actually create
> a DOM Document object, but the DOM tree in such cases is still used as the
> model for the rest of the specification.

-- ___ Python tracker <https://bugs.python.org/issue25258> ___
[issue19683] test_minidom has many empty tests
karl added the comment: @zach.ware @r.david.murray I'm going through the source currently. I see that the test file is using:

    class MinidomTest(unittest.TestCase):
        def confirm(self, test, testname="Test"):
            self.assertTrue(test, testname)

Is there a specific reason to use this form instead of directly using self.assertEqual or similar methods for new tests, or for reorganizing some of the existing ones? I see that it is used, for example, for giving a testname, but in

    def testAAA(self):
        dom = parseString("<abc/>")
        el = dom.documentElement
        el.setAttribute("spam", "jam2")
        self.confirm(el.toxml() == '<abc spam="jam2"/>', "testAAA")

testAAA is not specifically helping. :)

-- ___ Python tracker <https://bugs.python.org/issue19683> ___
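For comparison, a hypothetical rewrite of such a test with assertEqual, which reports both values on failure and makes the testname label unnecessary (a sketch, not actual test_minidom code):

```python
import unittest
from xml.dom.minidom import parseString

class MinidomExampleTest(unittest.TestCase):
    def test_set_attribute(self):
        dom = parseString("<abc/>")
        el = dom.documentElement
        el.setAttribute("spam", "jam2")
        # On failure, assertEqual prints both sides of the comparison,
        # so no label argument like "testAAA" is needed.
        self.assertEqual(el.toxml(), '<abc spam="jam2"/>')

# Run the single test programmatically instead of via unittest.main().
result = unittest.TestResult()
unittest.defaultTestLoader.loadTestsFromTestCase(MinidomExampleTest).run(result)
print(result.wasSuccessful())  # True
```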
[issue19683] test_minidom has many empty tests
karl added the comment: These methods are not used anywhere in the code. https://github.com/python/cpython/blob/5c30145afb6053998e3518befff638d207047f00/Lib/xml/dom/minidom.py#L71-L80 What was the purpose when they were created… hmm, maybe blame would give a clue. Ah, they were added a long time ago https://github.com/python/cpython/commit/73678dac48e5858e40cba6d526970cba7e7c769c#diff-365c30899ded02b18a2d8f92de47af6ca213eefe7883064c8723598da600ea42R83-R88 but never used? Or was it in the spirit of reserving the name for future use? https://developer.mozilla.org/en-US/docs/Web/API/Node/firstChild -- ___ Python tracker <https://bugs.python.org/issue19683> ___
[issue19683] test_minidom has many empty tests
karl added the comment: Ah no. They ARE used, through defproperty and minicompat.py:

    get = getattr(klass, ("_get_" + name))

-- ___ Python tracker <https://bugs.python.org/issue19683> ___
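For context, a condensed sketch of the defproperty pattern (modeled on, not copied from, Lib/xml/dom/minicompat.py); the indirect getattr lookup is exactly why a plain text search misses the _get_* call sites:

```python
def defproperty(klass, name, doc):
    # Look up the accessor indirectly -- the kind of call that hides
    # the usage of _get_firstChild, _get_childNodes, ... from grep.
    get = getattr(klass, "_get_" + name)
    def set(self, value, name=name):
        raise AttributeError("attribute '%s' is read-only" % name)
    setattr(klass, name, property(get, set, doc=doc))

class Node:
    def _get_firstChild(self):
        return "first-child-node"

defproperty(Node, "firstChild", doc="The first child of this node.")
print(Node().firstChild)  # first-child-node
```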
[issue19683] test_minidom has many empty tests
Change by karl : -- pull_requests: +22980 stage: needs patch -> patch review pull_request: https://github.com/python/cpython/pull/24152 ___ Python tracker <https://bugs.python.org/issue19683> ___
[issue41748] HTMLParser: comma in attribute values with/without space
karl added the comment: Status: the PR should be ready and complete https://github.com/python/cpython/pull/24072 and will eventually be merged at some point. Thanks to ezio.melotti for the wonderful guidance. -- ___ Python tracker <https://bugs.python.org/issue41748> ___
[issue36661] Missing dataclass decorator import in dataclasses module docs
karl added the comment: This should be closed. The PR has been merged and the doc is now up to date. -- nosy: +karlcow ___ Python tracker <https://bugs.python.org/issue36661> ___
[issue40236] datetime.datetime.strptime get day error
karl added the comment: Same on macOS 10.15.6 (19G73).

Python 3.8.3 (v3.8.3:6f8c8320e9, May 13 2020, 16:29:34)
[Clang 6.0 (clang-600.0.57)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import datetime
>>> datetime.datetime.strptime("2024-0-3 00:00:00", "%Y-%W-%w %H:%M:%S")
datetime.datetime(2024, 1, 3, 0, 0)
>>> datetime.datetime.strptime("2024-1-3 00:00:00", "%Y-%W-%w %H:%M:%S")
datetime.datetime(2024, 1, 3, 0, 0)

Also https://pubs.opengroup.org/onlinepubs/007908799/xsh/strptime.html

Note that ISO 8601 doesn't have this issue:

    %V - ISO 8601 week of the year as a decimal number [01, 53].

https://en.wikipedia.org/wiki/ISO_week_date

-- nosy: +karlcow ___ Python tracker <https://bugs.python.org/issue40236> ___
[issue40236] datetime.datetime.strptime get day error
karl added the comment: Also this.

>>> import datetime
>>> d0 = datetime.datetime.strptime("2024-0-3 00:00:00", "%Y-%W-%w %H:%M:%S")
>>> d0.strftime("%Y-%W-%w %H:%M:%S")
'2024-01-3 00:00:00'
>>> d1 = datetime.datetime.strptime("2024-1-3 00:00:00", "%Y-%W-%w %H:%M:%S")
>>> d1.strftime("%Y-%W-%w %H:%M:%S")
'2024-01-3 00:00:00'
>>> d2301 = datetime.datetime.strptime("2023-0-1 00:00:00", "%Y-%W-%w %H:%M:%S")
>>> d2311 = datetime.datetime.strptime("2023-1-1 00:00:00", "%Y-%W-%w %H:%M:%S")
>>> d2301
datetime.datetime(2022, 12, 26, 0, 0)
>>> d2311
datetime.datetime(2023, 1, 2, 0, 0)
>>> d2311.strftime("%Y-%W-%w %H:%M:%S")
'2023-01-1 00:00:00'
>>> d2301.strftime("%Y-%W-%w %H:%M:%S")
'2022-52-1 00:00:00'

Week 0 of 2023 became week 52 of 2022 (which is correct but might lead to surprises).

-- ___ Python tracker <https://bugs.python.org/issue40236> ___
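An aside relative to the report above: the ISO 8601 week directives added in Python 3.6 (%G, %V, %u) sidestep the week-0 ambiguity, since ISO weeks are numbered 01-53 and round-trip cleanly:

```python
import datetime

# %G = ISO year, %V = ISO week (01-53), %u = ISO weekday (1 = Monday).
# ISO week 1 of 2023 starts on Monday 2023-01-02.
d = datetime.datetime.strptime("2023-01-1 00:00:00", "%G-%V-%u %H:%M:%S")
print(d)                       # 2023-01-02 00:00:00
print(d.strftime("%G-%V-%u"))  # 2023-01-1 -- round-trips, unlike %W week 0
```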
[issue42104] xml.etree should support contains() function
New submission from karl : In XPath 1.0 the function contains() is available:

> Function: boolean contains(string, string)
> The contains function returns true if the first argument string contains the
> second argument string, and otherwise returns false.

In https://www.w3.org/TR/1999/REC-xpath-19991116/#function-contains

```
<p class="doc">One attribute: doc</p>
<p class="doc test">Two Attributes: doc test</p>
<p class="test">One Attribute: test</p>
```

Currently, we can do this:

```
>>> from lxml import etree
>>> root = etree.fromstring("""
... <root>
...   <p class="doc">One attribute</p>
...   <p class="doc test">Two Attributes: doc test</p>
...   <p class="doc2 test">Two Attributes: doc2 test</p>
... </root>
... """)
>>> elts = root.xpath("//p[@class='doc']")
>>> elts, etree.tostring(elts[0])
([<Element p at 0x...>], b'<p class="doc">One attribute</p>\n ')
```

One way of extracting the list of 2 elements which contain the attribute value doc with XPath is:

```
>>> root.xpath("//p[contains(@class, 'doc')]")
[<Element p at 0x...>, <Element p at 0x...>]
>>> [etree.tostring(elt) for elt in root.xpath("//p[contains(@class, 'doc')]")]
[b'<p class="doc">One attribute: doc</p>\n ', b'<p class="doc test">Two Attributes: doc test</p>\n ']
```

There is no easy way to extract all elements containing a "doc" value in a multi-valued attribute in Python 3.10 with xml.etree, which is quite common in HTML.

```
>>> import xml.etree.ElementTree as ET
>>> root = ET.fromstring("""
... <root>
...   <p class="doc">One attribute: doc</p>
...   <p class="doc test">Two Attributes: doc test</p>
...   <p class="test">One Attribute: test</p>
... </root>
... """)
>>> root.xpath("//p[contains(@class, 'doc')]")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'xml.etree.ElementTree.Element' object has no attribute 'xpath'
```

-- components: Library (Lib) messages: 379185 nosy: karlcow priority: normal severity: normal status: open title: xml.etree should support contains() function type: enhancement versions: Python 3.10 ___ Python tracker <https://bugs.python.org/issue42104> ___
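Until such support exists, the usual workaround is to filter in Python; a sketch using sample markup assumed from the report (not code from the original message):

```python
import xml.etree.ElementTree as ET

# Sample markup assumed from the report: class attributes 'doc',
# 'doc test' and 'test'.
root = ET.fromstring("""
<root>
  <p class="doc">One attribute: doc</p>
  <p class="doc test">Two Attributes: doc test</p>
  <p class="test">One Attribute: test</p>
</root>
""")

# Emulate //p[contains(@class, 'doc')]; splitting on whitespace also
# avoids the substring pitfall where a plain contains() would match
# an unrelated token such as class="doc2".
matches = [p for p in root.iter('p') if 'doc' in p.get('class', '').split()]
print([p.get('class') for p in matches])  # ['doc', 'doc test']
```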
[issue24772] Smaller viewport shifts the "expand left menu" character into the text
karl added the comment: I'm at the Mozilla All Hands this week. I'll check next week whether my solution still makes sense, and will make a pull request and/or propose another solution. Thanks for the reminder; adding it to my calendar. -- ___ Python tracker <https://bugs.python.org/issue24772> ___
[issue24772] Smaller viewport shifts the "expand left menu" character into the text
karl added the comment: So I had time to look at it today, and it would probably be better to solve https://bugs.python.org/issue23312, which would make this one redundant and would actually provide a solution for many people. -- ___ Python tracker <https://bugs.python.org/issue24772> ___
[issue23312] google thinks the docs are mobile unfriendly
karl added the comment: This issue should probably be addressed now on https://github.com/python/python-docs-theme -- nosy: +karlcow ___ Python tracker <https://bugs.python.org/issue23312> ___
[issue23312] google thinks the docs are mobile unfriendly
karl added the comment: I created https://github.com/python/python-docs-theme/issues/30 -- ___ Python tracker <https://bugs.python.org/issue23312> ___
[issue8136] urllib.unquote decodes percent-escapes with Latin-1
karl added the comment: #8143 was fixed.

Python 2.7.10 (default, Feb 7 2017, 00:08:15)
[GCC 4.2.1 Compatible Apple LLVM 8.0.0 (clang-800.0.34)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import urllib
>>> urllib.unquote(u"%CE%A3")
u'\xce\xa3'

What should become of this one?

-- nosy: +karlcow ___ Python tracker <http://bugs.python.org/issue8136> ___
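For contrast, Python 3's urllib.parse.unquote decodes percent-escapes as UTF-8 by default and takes an explicit encoding argument, so the Latin-1 behaviour shown above is opt-in (an aside, not part of the original comment):

```python
from urllib.parse import unquote

# Default is UTF-8: %CE%A3 is the UTF-8 encoding of GREEK CAPITAL LETTER SIGMA.
print(unquote("%CE%A3"))                      # Σ
# Opting back into Latin-1 reproduces the 2.x result u'\xce\xa3' shown above.
print(unquote("%CE%A3", encoding="latin-1"))  # Î£
```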
[issue15799] httplib client and statusline
New submission from karl: The current parsing of the HTTP status line seems strange with regard to its definition in HTTP. http://hg.python.org/cpython/file/3.2/Lib/http/client.py#l307

Currently the code is:

    version, status, reason = line.split(None, 2)

>>> status1 = "HTTP/1.1 200 OK"
>>> status2 = "HTTP/1.1 200 "
>>> status3 = "HTTP/1.1 200"
>>> status1.split(None, 2)
['HTTP/1.1', '200', 'OK']
>>> status2.split(None, 2)
['HTTP/1.1', '200']
>>> status3.split(None, 2)
['HTTP/1.1', '200']

According to the production rules of HTTP/1.1bis, only status1 and status2 are valid:

    status-line = HTTP-version SP status-code SP reason-phrase CRLF
    — http://tools.ietf.org/html/draft-ietf-httpbis-p1-messaging-20#section-3.1.2

with

    reason-phrase = *( HTAB / SP / VCHAR / obs-text )

aka zero or more characters. I'm also not sure what the expected ValueErrors are with the additional parsing rules, which seem even more bogus. A first modification would be:

>>> status1.split(' ', 2)
['HTTP/1.1', '200', 'OK']
>>> status2.split(' ', 2)
['HTTP/1.1', '200', '']

which is correct for the first two, with an empty reason-phrase. The third one is still no good:

>>> status3.split(' ', 2)
['HTTP/1.1', '200']

An additional check could be done with len(status.split(' ', 2)) == 3, which will return False in the third case. Do you want me to create a patch and a test for it?

-- messages: 169293 nosy: karlcow priority: normal severity: normal status: open title: httplib client and statusline type: enhancement versions: Python 3.2 ___ Python tracker <http://bugs.python.org/issue15799> ___
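For illustration, a strict check of the quoted grammar could look like this (a sketch, not the http.client implementation; the is_valid_status_line helper is hypothetical):

```python
import re

# status-line = HTTP-version SP status-code SP reason-phrase
# reason-phrase = *( HTAB / SP / VCHAR / obs-text ), i.e. possibly empty,
# but the SP before it is mandatory.
STATUS_LINE = re.compile(r'HTTP/\d\.\d [0-9]{3} [\t\x20-\x7e\x80-\xff]*')

def is_valid_status_line(line):
    return STATUS_LINE.fullmatch(line) is not None

print(is_valid_status_line("HTTP/1.1 200 OK"))  # True
print(is_valid_status_line("HTTP/1.1 200 "))    # True (empty reason-phrase)
print(is_valid_status_line("HTTP/1.1 200"))     # False (missing second SP)
```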
[issue15799] httplib client and statusline
karl added the comment: OK. Status lines 1 and 2 are valid; the third one is invalid and should trigger raise BadStatusLine(line). The code at line 318 is bogus, as it will happily parse the third line without raising an exception. http://hg.python.org/cpython/file/3.2/Lib/http/client.py#l318 -- ___ Python tracker <http://bugs.python.org/issue15799> ___
[issue15799] httplib client and statusline
karl added the comment: Fair enough. It could be a warning when:

* there is more than one space between the HTTP version and the status code
* the space after the status code is missing

I'm not advocating for being strict only. I'm advocating for giving developers the tools to assess that things are right and to choose whether or not to ignore problems, so they don't have to patch the libraries or rewrite modules when writing code that needs to be strict about validating responses and requests. :)

ps: I haven't checked yet whether the server counterpart of httplib is strict about the production rule.

-- ___ Python tracker <http://bugs.python.org/issue15799> ___
[issue15799] httplib client and statusline
karl added the comment: So what do we do with it? Do I create a patch, or do we close this bug? :) No hard feelings about it. -- ___ Python tracker <http://bugs.python.org/issue15799> ___
[issue21325] Missing Generic EXIF library for images in the standard library
New submission from karl: There is room for a consistent and good EXIF library in the Python standard library. -- components: Library (Lib) messages: 216978 nosy: karlcow priority: normal severity: normal status: open title: Missing Generic EXIF library for images in the standard library type: enhancement ___ Python tracker <http://bugs.python.org/issue21325> ___
[issue15851] Lib/robotparser.py doesn't accept setting a user agent string, instead it uses the default.
karl added the comment: Mark, the code uses urllib to demonstrate the issue with Wikipedia and other sites which block python-urllib user agents, because that string is used by many spam harvesters. The proposal is about giving robotparser a way to set the user agent. -- ___ Python tracker <http://bugs.python.org/issue15851> ___
[issue15851] Lib/robotparser.py doesn't accept setting a user agent string, instead it uses the default.
karl added the comment: Note that one of the proposals is to just document in https://docs.python.org/3/library/urllib.robotparser.html the approach made in msg169722 (available in 3.4+):

    robotparser.URLopener.version = 'MyVersion'

-- ___ Python tracker <http://bugs.python.org/issue15851> ___
[issue15851] Lib/robotparser.py doesn't accept setting a user agent string, instead it uses the default.
karl added the comment:

→ python
Python 2.7.5 (default, Mar 9 2014, 22:15:05)
[GCC 4.2.1 Compatible Apple LLVM 5.0 (clang-500.0.68)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import robotparser
>>> rp = robotparser.RobotFileParser('http://somesite.test.site/robots.txt')
>>> rp.read()
>>>

Let's check the server logs:

127.0.0.1 - - [23/Jun/2014:08:44:37 +0900] "GET /robots.txt HTTP/1.0" 200 92 "-" "Python-urllib/1.17"

Robotparser in 2.* was by default using the Python-urllib/1.17 user agent, which is traditionally blocked by many sysadmins. A solution has already been proposed above; this is the proposed test for 3.4:

import urllib.robotparser
import urllib.request

opener = urllib.request.build_opener()
opener.addheaders = [('User-agent', 'MyUa/0.1')]
urllib.request.install_opener(opener)
rp = urllib.robotparser.RobotFileParser('http://localhost:')
rp.read()

The issue is not about changing the lib anymore, but just about documenting how to change the RobotFileParser default UA. We can change the title of this issue if it's confusing, or close it and open a new one for documenting what makes it easier. :)

Currently robotparser.py imports the urllib user agent: http://hg.python.org/cpython/file/7dc94337ef67/Lib/urllib/request.py#l364 It's a common failure we encounter when using urllib in general, including robotparser.

As for Wikipedia, they fixed their server-side user agent sniffing and do not filter python-urllib anymore.

GET /robots.txt HTTP/1.1
Accept: */*
Accept-Encoding: gzip, deflate, compress
Host: en.wikipedia.org
User-Agent: Python-urllib/1.17

HTTP/1.1 200 OK
Accept-Ranges: bytes
Age: 3161
Cache-control: s-maxage=3600, must-revalidate, max-age=0
Connection: keep-alive
Content-Encoding: gzip
Content-Length: 5208
Content-Type: text/plain; charset=utf-8
Date: Sun, 22 Jun 2014 23:59:16 GMT
Last-modified: Tue, 26 Nov 2013 17:39:43 GMT
Server: Apache
Set-Cookie: GeoIP=JP:Tokyo:35.6850:139.7514:v4; Path=/; Domain=.wikipedia.org
Vary: X-Subdomain
Via: 1.1 varnish, 1.1 varnish, 1.1 varnish
X-Article-ID: 19292575
X-Cache: cp1065 miss (0), cp4016 hit (1), cp4009 frontend hit (215)
X-Content-Type-Options: nosniff
X-Language: en
X-Site: wikipedia
X-Varnish: 2529666795, 2948866481 2948865637, 4134826198 4130750894

Many other sites still do. :)

-- versions: +Python 3.4 -Python 3.5 ___ Python tracker <http://bugs.python.org/issue15851> ___
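An alternative to installing a global opener is to subclass RobotFileParser and fetch robots.txt with an explicit User-Agent; a sketch modeled on, but not copied from, the stdlib read() (the UARobotFileParser name and the example.test host are made up):

```python
import urllib.error
import urllib.request
import urllib.robotparser

class UARobotFileParser(urllib.robotparser.RobotFileParser):
    """RobotFileParser that fetches robots.txt with a custom User-Agent."""
    def __init__(self, url='', user_agent='MyUa/0.1'):
        super().__init__(url)
        self.user_agent = user_agent

    def read(self):
        req = urllib.request.Request(self.url,
                                     headers={'User-Agent': self.user_agent})
        try:
            f = urllib.request.urlopen(req)
        except urllib.error.HTTPError as err:
            # Mirror the stdlib's handling of error responses.
            if err.code in (401, 403):
                self.disallow_all = True
            elif 400 <= err.code < 500:
                self.allow_all = True
        else:
            self.parse(f.read().decode('utf-8').splitlines())

# Offline demo: feed rules directly instead of calling read() over the network.
rp = UARobotFileParser('http://example.test/robots.txt', user_agent='MyUa/0.1')
rp.parse(['User-agent: *', 'Disallow: /private/'])
print(rp.can_fetch('MyUa/0.1', 'http://example.test/private/page'))  # False
```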
[issue12455] urllib2 forces title() on header names, breaking some requests
karl added the comment: Mark, I'm happy to follow up. I would be in favor of removing any capitalization and not changing headers at all, whatever they are, because it doesn't matter per spec: browsers do not care about the capitalization, and I haven't identified Web compatibility issues regarding it. That said, it seems that Cal in msg139512 had an issue; I would love to know which server/API had this behavior, to file a bug at http://webcompat.com/ So… where do we stand? A feature, or removing anything which modifies the capitalization of headers? -- ___ Python tracker <http://bugs.python.org/issue12455> ___
[issue5550] [urllib.request]: Comparison of HTTP headers should be insensitive to the case
karl added the comment: @Mark, yup, I can do that. I just realized that since my contribution there is now a PSF Contributor Agreement. This is signed. I need to dive into the code again to remember where things fail. -- ___ Python tracker <http://bugs.python.org/issue5550> ___
[issue15873] datetime: add ability to parse RFC 3339 dates and times
karl added the comment: I had the issue today: I needed to parse a date in the following format

    2014-04-04T23:59:00+09:00

and could not with strptime. I see a discussion from March 2014, http://code.activestate.com/lists/python-ideas/26883/ but no follow-up. For references: http://www.w3.org/TR/NOTE-datetime http://tools.ietf.org/html/rfc3339 -- nosy: +karlcow ___ Python tracker <http://bugs.python.org/issue15873> ___
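An aside relative to this 2014 report: since Python 3.7 this exact format parses out of the box:

```python
import datetime

# fromisoformat (Python 3.7+) handles this shape directly, and strptime's
# %z accepts a colon in the UTC offset since 3.7 as well.
d = datetime.datetime.fromisoformat("2014-04-04T23:59:00+09:00")
d2 = datetime.datetime.strptime("2014-04-04T23:59:00+09:00",
                                "%Y-%m-%dT%H:%M:%S%z")
print(d.utcoffset())  # 9:00:00
print(d == d2)        # True
```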
[issue15873] datetime: add ability to parse RFC 3339 dates and times
karl added the comment: On closer inspection, Anders Hovmöller's proposal doesn't work: https://github.com/boxed/iso8601 At least not for the fractional-seconds part. In http://tools.ietf.org/html/rfc3339#section-5.6, it is defined as:

    time-secfrac = "." 1*DIGIT

In http://www.w3.org/TR/NOTE-datetime, same thing:

    s = one or more digits representing a decimal fraction of a second

Anders considers it to be only six digits; it can be more or it can be less. :) Will comment on GitHub too. -- ___ Python tracker <http://bugs.python.org/issue15873> ___
[issue15873] datetime: add ability to parse RFC 3339 dates and times
karl added the comment: Noticed some people doing the same thing https://github.com/tonyg/python-rfc3339 http://home.blarg.net/~steveha/pyfeed.html https://wiki.python.org/moin/WorkingWithTime -- ___ Python tracker <http://bugs.python.org/issue15873> ___
[issue15873] datetime: add ability to parse RFC 3339 dates and times
karl added the comment: After inspection, the best library for parsing RFC 3339 style dates is definitely https://github.com/tonyg/python-rfc3339/ Main code at https://github.com/tonyg/python-rfc3339/blob/master/rfc3339.py -- ___ Python tracker <http://bugs.python.org/issue15873> ___
[issue5550] [urllib.request]: Comparison of HTTP headers should be insensitive to the case
karl added the comment: OK, this is an attempt at solving the issue with lowercasing. I find my get_header a bit complicated, but tell me if you have a better idea. :) I'll modify the patches. I have tried to run the tests on the Mac here, but I currently have an issue:

→ ./python.exe -V
Python 3.5.0a0
Traceback (most recent call last):
  File "./Tools/scripts/patchcheck.py", line 6, in <module>
    import subprocess
  File "/Users/karl/code/cpython/Lib/subprocess.py", line 353, in <module>
    import signal
ImportError: No module named 'signal'
make: *** [patchcheck] Error 1

-- Added file: http://bugs.python.org/file36695/issue-5550-3.patch ___ Python tracker <http://bugs.python.org/issue5550> ___
[issue5550] [urllib.request]: Comparison of HTTP headers should be insensitive to the case
Changes by karl : Removed file: http://bugs.python.org/file36695/issue-5550-3.patch ___ Python tracker <http://bugs.python.org/issue5550> ___
[issue5550] [urllib.request]: Comparison of HTTP headers should be insensitive to the case
karl added the comment: And I made a typo in patch 3. Submitting patch 4; sorry about that. -- Added file: http://bugs.python.org/file36698/issue-5550-4.patch ___ Python tracker <http://bugs.python.org/issue5550> ___
[issue17322] urllib.request add_header() currently allows trailing spaces (and other weird stuff)
karl added the comment: Just a follow-up to give the now-stable RFC for HTTP/1.1 header field-name parsing: http://tools.ietf.org/html/rfc7230#section-3.2.4 -- ___ Python tracker <http://bugs.python.org/issue17322> ___
[issue5550] [urllib.request]: Comparison of HTTP headers should be insensitive to the case
karl added the comment: OK, my tests are OK.

→ ./python.exe -m unittest -v Lib/test/test_urllib2net.py
test_close (Lib.test.test_urllib2net.CloseSocketTest) ... ok
test_custom_headers (Lib.test.test_urllib2net.OtherNetworkTests) ... FAIL
test_file (Lib.test.test_urllib2net.OtherNetworkTests) ...
test_ftp (Lib.test.test_urllib2net.OtherNetworkTests) ... skipped "Resource 'ftp://gatekeeper.research.compaq.com/pub/DEC/SRC/research-reports/00README-Legal-Rules-Regs' is not available"
test_headers_case_sensitivity (Lib.test.test_urllib2net.OtherNetworkTests) ... ok
test_redirect_url_withfrag (Lib.test.test_urllib2net.OtherNetworkTests) ... ok
test_sites_no_connection_close (Lib.test.test_urllib2net.OtherNetworkTests) ... ok
test_urlwithfrag (Lib.test.test_urllib2net.OtherNetworkTests) ... ok
test_ftp_basic (Lib.test.test_urllib2net.TimeoutTest) ... ok
test_ftp_default_timeout (Lib.test.test_urllib2net.TimeoutTest) ... ok
test_ftp_no_timeout (Lib.test.test_urllib2net.TimeoutTest) ... ok
test_ftp_timeout (Lib.test.test_urllib2net.TimeoutTest) ... ok
test_http_basic (Lib.test.test_urllib2net.TimeoutTest) ... ok
test_http_default_timeout (Lib.test.test_urllib2net.TimeoutTest) ... ok
test_http_no_timeout (Lib.test.test_urllib2net.TimeoutTest) ... ok
test_http_timeout (Lib.test.test_urllib2net.TimeoutTest) ... ok

I wonder if test_custom_headers fails because of my modifications.

-- ___ Python tracker <http://bugs.python.org/issue5550> ___
[issue22478] tests for urllib2net are in bad shapes
New submission from karl:

→ ./python.exe -V
Python 3.4.2rc1+
→ hg tip
changeset: 92532:6dcc96fa3970
tag: tip
parent: 92530:ad45c2707006
parent: 92531:8eb4eec8626c
user: Benjamin Peterson
date: Mon Sep 22 22:44:21 2014 -0400
summary: merge 3.4 (#22459)

When working on issue #5550, I realized that some tests are currently failing. Here is the log of running:

→ ./python.exe -m unittest -v Lib/test/test_urllib2net.py
test_close (Lib.test.test_urllib2net.CloseSocketTest) ... ok
test_custom_headers (Lib.test.test_urllib2net.OtherNetworkTests) ... FAIL
test_file (Lib.test.test_urllib2net.OtherNetworkTests) ...
test_ftp (Lib.test.test_urllib2net.OtherNetworkTests) ... skipped "Resource 'ftp://gatekeeper.research.compaq.com/pub/DEC/SRC/research-reports/00README-Legal-Rules-Regs' is not available"
test_redirect_url_withfrag (Lib.test.test_urllib2net.OtherNetworkTests) ... ok
test_sites_no_connection_close (Lib.test.test_urllib2net.OtherNetworkTests) ... ok
test_urlwithfrag (Lib.test.test_urllib2net.OtherNetworkTests) ... ok
test_ftp_basic (Lib.test.test_urllib2net.TimeoutTest) ... ok
test_ftp_default_timeout (Lib.test.test_urllib2net.TimeoutTest) ... ok
test_ftp_no_timeout (Lib.test.test_urllib2net.TimeoutTest) ... ok
test_ftp_timeout (Lib.test.test_urllib2net.TimeoutTest) ... ok
test_http_basic (Lib.test.test_urllib2net.TimeoutTest) ... ok
test_http_default_timeout (Lib.test.test_urllib2net.TimeoutTest) ... ok
test_http_no_timeout (Lib.test.test_urllib2net.TimeoutTest) ... ok
test_http_timeout (Lib.test.test_urllib2net.TimeoutTest) ... ok

======================================================================
ERROR: test_file (Lib.test.test_urllib2net.OtherNetworkTests) (url='file:/Users/karl/code/cpython/%40test_61795_tmp')
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/karl/code/cpython/Lib/test/test_urllib2net.py", line 243, in _test_urls
    f = urlopen(url, req, TIMEOUT)
  File "/Users/karl/code/cpython/Lib/test/test_urllib2net.py", line 33, in wrapped
    return _retry_thrice(func, exc, *args, **kwargs)
  File "/Users/karl/code/cpython/Lib/test/test_urllib2net.py", line 23, in _retry_thrice
    return func(*args, **kwargs)
  File "/Users/karl/code/cpython/Lib/urllib/request.py", line 447, in open
    req = Request(fullurl, data)
  File "/Users/karl/code/cpython/Lib/urllib/request.py", line 267, in __init__
    origin_req_host = request_host(self)
  File "/Users/karl/code/cpython/Lib/urllib/request.py", line 250, in request_host
    host = _cut_port_re.sub("", host, 1)
TypeError: expected string or buffer

======================================================================
ERROR: test_file (Lib.test.test_urllib2net.OtherNetworkTests) (url=('file:///nonsensename/etc/passwd', None, <class 'urllib.error.URLError'>))
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/karl/code/cpython/Lib/test/test_urllib2net.py", line 243, in _test_urls
    f = urlopen(url, req, TIMEOUT)
  File "/Users/karl/code/cpython/Lib/test/test_urllib2net.py", line 33, in wrapped
    return _retry_thrice(func, exc, *args, **kwargs)
  File "/Users/karl/code/cpython/Lib/test/test_urllib2net.py", line 23, in _retry_thrice
    return func(*args, **kwargs)
  File "/Users/karl/code/cpython/Lib/urllib/request.py", line 447, in open
    req = Request(fullurl, data)
  File "/Users/karl/code/cpython/Lib/urllib/request.py", line 267, in __init__
    origin_req_host = request_host(self)
  File "/Users/karl/code/cpython/Lib/urllib/request.py", line 250, in request_host
    host = _cut_port_re.sub("", host, 1)
TypeError: expected string or buffer

======================================================================
FAIL: test_custom_headers (Lib.test.test_urllib2net.OtherNetworkTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/karl/code/cpython/Lib/test/test_urllib2net.py", line 186, in test_custom_headers
    self.assertEqual(request.get_header('User-agent'), 'Test-Agent')
AssertionError: 'Python-urllib/3.4' != 'Test-Agent'
- Python-urllib/3.4
+ Test-Agent

----------------------------------------------------------------------
Ran 16 tests in 124.879s

FAILED (failures=1, errors=2, skipped=1)

-- components: Tests messages: 227417 nosy: karlcow priority: normal severity: normal status: open title: tests for urllib2net are in bad shapes versions: Python 3.5 ___ Python tracker <http://bugs.python.org/issue22478> ___
[issue5550] [urllib.request]: Comparison of HTTP headers should be insensitive to the case
karl added the comment: Opened issue #22478 for the tests failing. Not related to my modification. -- ___ Python tracker <http://bugs.python.org/issue5550> ___
[issue22478] tests for urllib2net are in bad shapes
karl added the comment: OK, let's see.

→ ./python.exe -m unittest -v Lib.test.test_urllib2net.OtherNetworkTests.test_custom_headers
test_custom_headers (Lib.test.test_urllib2net.OtherNetworkTests) ... FAIL

======================================================================
FAIL: test_custom_headers (Lib.test.test_urllib2net.OtherNetworkTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/karl/code/cpython/Lib/test/test_urllib2net.py", line 186, in test_custom_headers
    self.assertEqual(request.get_header('User-agent'), 'Test-Agent')
AssertionError: 'Python-urllib/3.4' != 'Test-Agent'
- Python-urllib/3.4
+ Test-Agent

----------------------------------------------------------------------
Ran 1 test in 0.551s

FAILED (failures=1)

→ ./python.exe
Python 3.4.2rc1+ (3.4:8eb4eec8626c+, Sep 23 2014, 21:53:11)
[GCC 4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.51)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import urllib.request
>>> url = 'http://127.0.0.1/'
>>> opener = urllib.request.build_opener()
>>> request = urllib.request.Request(url)
>>> request.header_items()
[]
>>> request.headers
{}
>>> request.add_header('User-Agent', 'Test-Agent')
>>> request.headers
{'User-agent': 'Test-Agent'}
>>> request.header_items()
[('User-agent', 'Test-Agent')]
>>> opener.open(request)
<http.client.HTTPResponse object at 0x...>
>>> request.get_header('User-agent'), 'Test-Agent'
('Test-Agent', 'Test-Agent')
>>> request.header_items()
[('User-agent', 'Test-Agent'), ('Host', '127.0.0.1')]
>>> request.headers
{'User-agent': 'Test-Agent'}

OK, so far so good. And my server recorded:

127.0.0.1 - - [24/Sep/2014:17:07:41 +0900] "GET / HTTP/1.1" 200 9897 "-" "Test-Agent"

Let's do it the way the test has been designed.

→ ./python.exe
Python 3.4.2rc1+ (3.4:8eb4eec8626c+, Sep 23 2014, 21:53:11)
[GCC 4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.51)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import urllib.request
>>> url = 'http://127.0.0.1/'
>>> opener = urllib.request.build_opener()
>>> request = urllib.request.Request(url)
>>> request.header_items()
[]
>>> opener.open(request)
<http.client.HTTPResponse object at 0x...>
>>> request.header_items()
[('User-agent', 'Python-urllib/3.4'), ('Host', '127.0.0.1')]
>>> request.has_header('User-agent')
True
>>> request.add_header('User-Agent', 'Test-Agent')
>>> opener.open(request)
<http.client.HTTPResponse object at 0x...>
>>> request.get_header('User-agent'), 'Test-Agent'
('Python-urllib/3.4', 'Test-Agent')
>>> request.add_header('Foo', 'bar')
>>> request.header_items()
[('User-agent', 'Test-Agent'), ('Host', '127.0.0.1'), ('Foo', 'bar')]
>>> opener.open(request)
<http.client.HTTPResponse object at 0x...>
>>> request.header_items()
[('User-agent', 'Test-Agent'), ('Host', '127.0.0.1'), ('Foo', 'bar')]
>>> request.get_header('User-agent'), 'Test-Agent'
('Python-urllib/3.4', 'Test-Agent')
>>> request.headers
{'User-agent': 'Test-Agent', 'Foo': 'bar'}

And the server recorded:

127.0.0.1 - - [24/Sep/2014:17:12:52 +0900] "GET / HTTP/1.1" 200 9897 "-" "Python-urllib/3.4"
127.0.0.1 - - [24/Sep/2014:17:12:52 +0900] "GET / HTTP/1.1" 200 9897 "-" "Python-urllib/3.4"
127.0.0.1 - - [24/Sep/2014:17:14:15 +0900] "GET / HTTP/1.1" 200 9897 "-" "Python-urllib/3.4"

So it seems that the User-Agent is immutable once it has been set the first time. It is not in the same dictionary:

>>> request.unredirected_hdrs
{'User-agent': 'Python-urllib/3.4', 'Host': '127.0.0.1'}

-- ___ Python tracker <http://bugs.python.org/issue22478> ___
[issue22478] tests for urllib2net are in bad shapes
karl added the comment: Ah! The User-Agent (or anything else in unredirected_hdrs) will not be updated if it has already been set once.
https://hg.python.org/cpython/file/064f6baeb6bd/Lib/urllib/request.py#l1154

>>> headers = dict(request.unredirected_hdrs)
>>> headers
{'User-agent': 'Python-urllib/3.4', 'Host': '127.0.0.1'}
>>> request.headers
{'User-agent': 'Test-Agent', 'Foo': 'cool'}
>>> headers.update(dict((k, v) for k, v in request.headers.items() if k not in headers))
>>> headers
{'User-agent': 'Python-urllib/3.4', 'Host': '127.0.0.1', 'Foo': 'cool'}

--
___ Python tracker <http://bugs.python.org/issue22478> ___
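The merge above can be reproduced without a live Request object. A minimal, self-contained restatement of the merge done around request.py#l1154 (the dictionaries are hard-coded stand-ins for the request state shown in the session):

```python
# Headers already present in unredirected_hdrs are never overridden by
# headers added later via add_header(); only brand-new keys get through.
unredirected_hdrs = {'User-agent': 'Python-urllib/3.4', 'Host': '127.0.0.1'}
request_headers = {'User-agent': 'Test-Agent', 'Foo': 'cool'}

headers = dict(unredirected_hdrs)
headers.update((k, v) for k, v in request_headers.items() if k not in headers)
print(headers)
# {'User-agent': 'Python-urllib/3.4', 'Host': '127.0.0.1', 'Foo': 'cool'}
```

This is why the server keeps logging "Python-urllib/3.4" even after add_header('User-Agent', 'Test-Agent').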
[issue5550] [urllib.request]: Comparison of HTTP headers should be insensitive to the case
karl added the comment: OK, after fixing my repo (thanks, orsenthil) I got the tests running properly. The inspection order of the two dictionaries was not right, so I had to modify the patch a bit.

→ ./python.exe -m unittest -v Lib.test.test_urllib2net.OtherNetworkTests.test_headers_case_sensitivity
test_headers_case_sensitivity (Lib.test.test_urllib2net.OtherNetworkTests) ... ok
--
Ran 1 test in 0.286s

OK

→ ./python.exe -m unittest -v Lib.test.test_urllib2net.OtherNetworkTests.test_custom_headers
test_custom_headers (Lib.test.test_urllib2net.OtherNetworkTests) ... ok
--
Ran 1 test in 0.575s

OK

New patch: issue5550-5.patch. Unlinking issue5550-4.patch.

--
Added file: http://bugs.python.org/file36717/issue5550-5.patch
___ Python tracker <http://bugs.python.org/issue5550> ___
[issue5550] [urllib.request]: Comparison of HTTP headers should be insensitive to the case
Changes by karl : Removed file: http://bugs.python.org/file36698/issue-5550-4.patch ___ Python tracker <http://bugs.python.org/issue5550> ___
[issue15799] httplib client and statusline
karl added the comment: Let's close this.

>>> "HTTP/1.1   301 ".split(None, 2)
['HTTP/1.1', '301']
>>> "HTTP/1.1   301 ".split(' ', 2)
['HTTP/1.1', '', ' 301 ']

I think it would be nice to have a way to warn without stopping, but the last comment from r.david.murray makes sense too. :)

--
resolution: -> not a bug
status: open -> closed
___ Python tracker <http://bugs.python.org/issue15799> ___
[issue17319] http.server.BaseHTTPRequestHandler send_response_only doesn't check the type and value of the code.
karl added the comment: Here is where this is defined in the new RFC:
http://tools.ietf.org/html/rfc7230#section-3.1.2

status-line = HTTP-version SP status-code SP reason-phrase CRLF

Things to enforce:

status-code = 3DIGIT

Response status codes are now defined in http://tools.ietf.org/html/rfc7231#section-6 with something important:

HTTP status codes are extensible. HTTP clients are not required to understand the meaning of all registered status codes, though such understanding is obviously desirable. However, a client MUST understand the class of any status code, as indicated by the first digit, and treat an unrecognized status code as being equivalent to the x00 status code of that class, with the exception that a recipient MUST NOT cache a response with an unrecognized status code. For example, if an unrecognized status code of 471 is received by a client, the client can assume that there was something wrong with its request and treat the response as if it had received a 400 (Bad Request) status code. The response message will usually contain a representation that explains the status.

That should help. The full registry of status codes is defined at http://www.iana.org/assignments/http-status-codes/http-status-codes.xhtml

@dmi.baranov In the patch:

+def _is_valid_status_code(code):
+    return isinstance(code, int) and 0 <= code <= 999

Maybe there is a missing check that len(str(code)) == 3.

--
___ Python tracker <http://bugs.python.org/issue17319> ___
[issue18119] urllib.FancyURLopener does not treat URL fragments correctly
karl added the comment: This is the correct behavior. A GET to http://example.com/foo with a response containing 302 and Location: /bar#test must trigger http://example.com/bar#test

Location is defined in http://tools.ietf.org/html/rfc7231#section-7.1.2

7.1.2. Location

The "Location" header field is used in some responses to refer to a specific resource in relation to the response. The type of relationship is defined by the combination of request method and status code semantics.

Location = URI-reference

The field value consists of a single URI-reference. When it has the form of a relative reference ([RFC3986], Section 4.2), the final value is computed by resolving it against the effective request URI ([RFC3986], Section 5).

A bit further in the spec:

If the Location value provided in a 3xx (Redirection) response does not have a fragment component, a user agent MUST process the redirection as if the value inherits the fragment component of the URI reference used to generate the request target (i.e., the redirection inherits the original reference's fragment, if any).

For example, a GET request generated for the URI reference "http://www.example.org/~tim" might result in a 303 (See Other) response containing the header field:

Location: /People.html#tim

which suggests that the user agent redirect to "http://www.example.org/People.html#tim"

Likewise, a GET request generated for the URI reference "http://www.example.org/index.html#larry" might result in a 301 (Moved Permanently) response containing the header field:

Location: http://www.example.net/index.html

which suggests that the user agent redirect to "http://www.example.net/index.html#larry", preserving the original fragment identifier.

--
nosy: +karlcow
___ Python tracker <http://bugs.python.org/issue18119> ___
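A quick check of how the stdlib handles the two cases quoted above (plain urllib.parse, independent of the FancyURLopener code path): urljoin resolves the relative Location and keeps its fragment, but it does not implement the MUST-inherit rule for the second case, so a client has to reattach the original fragment itself.

```python
from urllib.parse import urljoin, urldefrag

# Case 1: Location carries its own fragment; urljoin keeps it.
print(urljoin('http://www.example.org/~tim', '/People.html#tim'))
# http://www.example.org/People.html#tim

# Case 2: Location has no fragment; RFC 7231 says the redirect inherits
# the original one, but urljoin alone does not do that.
target = urljoin('http://www.example.org/index.html#larry',
                 'http://www.example.net/index.html')
print(target)  # http://www.example.net/index.html

# Reattaching the original fragment by hand:
_, frag = urldefrag('http://www.example.org/index.html#larry')
if frag and not urldefrag(target)[1]:
    target = target + '#' + frag
print(target)  # http://www.example.net/index.html#larry
```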
[issue18119] urllib.FancyURLopener does not treat URL fragments correctly
karl added the comment: Takahashi-san, ah sorry, I misunderstood which part you were talking about. I wrongly assumed you were talking about navigation. Yes, for the request which is sent to the server it should be as in http://tools.ietf.org/html/rfc7230#section-5.3.1

So, refactoring your example.

1st request:

GET /foo HTTP/1.1
Accept: text/html
Host: example.com

Server response:

HTTP/1.1 302 Found
Location: /bar#test

Second request must be:

GET /bar HTTP/1.1
Accept: text/html
Host: example.com

As for the navigation context, it is indeed handled by the piece of code taking charge of the document after it is parsed, not the one doing the HTTP request. (Putting it here just so that people understand.)

(To be tested.) For a server receiving an invalid request-line, http://tools.ietf.org/html/rfc7230#section-3.1.1 says:

Recipients of an invalid request-line SHOULD respond with either a 400 (Bad Request) error or a 301 (Moved Permanently) redirect with the request-target properly encoded. A recipient SHOULD NOT attempt to autocorrect and then process the request without a redirect, since the invalid request-line might be deliberately crafted to bypass security filters along the request chain.

--
___ Python tracker <http://bugs.python.org/issue18119> ___
[issue18119] urllib.FancyURLopener does not treat URL fragments correctly
karl added the comment: In class urlopen_HttpTests
https://hg.python.org/cpython/file/4f314dedb84f/Lib/test/test_urllib.py#l191

there is a test for invalid redirects:

def test_invalid_redirect(self):
https://hg.python.org/cpython/file/4f314dedb84f/Lib/test/test_urllib.py#l247

and one for fragments:

def test_url_fragment(self):
https://hg.python.org/cpython/file/4f314dedb84f/Lib/test/test_urllib.py#l205

which refers to http://bugs.python.org/issue11703 (code in https://hg.python.org/cpython/file/d5688a94a56c/Lib/urllib.py)

--
___ Python tracker <http://bugs.python.org/issue18119> ___
[issue18119] urllib.FancyURLopener does not treat URL fragments correctly
karl added the comment: OK, I fixed the code. The issue is here:
https://hg.python.org/cpython/file/1e1c6e306eb4/Lib/urllib/request.py#l656

newurl = urlunparse(urlparts)

Basically it reinjects the fragment into the new URL. The fix is easy:

if urlparts.fragment:
    urlparts = list(urlparts)
    urlparts[5] = ""
newurl = urlunparse(urlparts)

I was trying to write a test for it, but failed. Could someone help me with the test so I can complete the patch? Added the code patch only.

--
keywords: +patch
Added file: http://bugs.python.org/file36832/issue18119-code-only.patch
___ Python tracker <http://bugs.python.org/issue18119> ___
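The fix can be exercised on its own with urllib.parse. A standalone sketch of the same logic, outside HTTPRedirectHandler (the helper name is hypothetical):

```python
from urllib.parse import urlparse, urlunparse

def strip_fragment(url):
    # Same idea as the fix: drop the fragment (index 5 of the 6-tuple)
    # before rebuilding the URL used for the redirected request.
    urlparts = urlparse(url)
    if urlparts.fragment:
        urlparts = list(urlparts)
        urlparts[5] = ""
    return urlunparse(urlparts)

print(strip_fragment('http://example.com/bar#test'))  # http://example.com/bar
print(strip_fragment('http://example.com/bar'))       # http://example.com/bar
```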
[issue11448] docs for HTTPConnection.set_tunnel are ambiguous
karl added the comment: ooops right, my bad. s/on port 8080. We first/on port 8080, we first/ better? -- ___ Python tracker <http://bugs.python.org/issue11448> ___
[issue747320] rfc2822 formatdate functionality duplication
karl added the comment: Eric, what do you recommend to move forward with this bug and its patches? I need guidance. Do you have an example for "(A minor thing: I would use “attribute” instead of “variable” in the docstrings.)"? Also, which code base should I use? A lot of water has gone under the bridge in one year. :) -- ___ Python tracker <http://bugs.python.org/issue747320> ___
[issue3791] bsddb not completely removed
karl added the comment: On the Mac version there is an issue with the Python version installed by default.

Python 2.5.1 (r251:54863, Jan 13 2009, 10:26:13)
[GCC 4.0.1 (Apple Inc. build 5465)] on darwin

  File "/System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/dbhash.py", line 5, in
    import bsddb
  File "/System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/bsddb/__init__.py", line 51, in
    import _bsddb
ImportError: No module named _bsddb

--
components: +Macintosh -Windows
nosy: +karlcow
versions: +Python 2.5 -Python 3.0
___ Python tracker <http://bugs.python.org/issue3791> ___
[issue747320] rfc2822 formatdate functionality duplication
karl added the comment: http://tools.ietf.org/html/draft-ietf-httpbis-p2-semantics-22#section-7.1.1 quoting from HTTP/1.1 bis:

Prior to 1995, there were three different formats commonly used by servers to communicate timestamps. For compatibility with old implementations, all three are defined here. The preferred format is a fixed-length and single-zone subset of the date and time specification used by the Internet Message Format [RFC5322].

HTTP-date = IMF-fixdate / obs-date

An example of the preferred format is

Sun, 06 Nov 1994 08:49:37 GMT    ; IMF-fixdate

Examples of the two obsolete formats are

Sunday, 06-Nov-94 08:49:37 GMT   ; obsolete RFC 850 format
Sun Nov  6 08:49:37 1994         ; ANSI C's asctime() format

A recipient that parses a timestamp value in an HTTP header field MUST accept all three formats. A sender MUST generate the IMF-fixdate format when sending an HTTP-date value in a header field.

What http.server.BaseHTTPRequestHandler.date_time_string is currently doing:

>>> import time
>>> timestamp = time.time()
>>> weekdayname = ['Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat', 'Sun']
>>> monthname = [None, 'Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']
>>> year, month, day, hh, mm, ss, wd, y, z = time.gmtime(timestamp)
>>> s = "%s, %02d %3s %4d %02d:%02d:%02d GMT" % (weekdayname[wd], day, monthname[month], year, hh, mm, ss)
>>> s
'Mon, 25 Feb 2013 19:26:34 GMT'

What email.utils.formatdate is doing:

>>> import email.utils
>>> email.utils.formatdate(timeval=None, localtime=False, usegmt=True)
'Mon, 25 Feb 2013 19:40:04 GMT'
>>> import time
>>> ts = time.time()
>>> email.utils.formatdate(timeval=ts, localtime=False, usegmt=True)
'Mon, 25 Feb 2013 19:51:50 GMT'

I created a patch:

s = email.utils.formatdate(timestamp, False, True)

I didn't touch the log method, which has a different format that is anyway not compatible with email.utils.

--
keywords: +patch
nosy: +karlcow
Added file: http://bugs.python.org/file29240/server.patch
___ Python tracker <http://bugs.python.org/issue747320> ___
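As a sanity check of the replacement, email.utils.formatdate with usegmt=True produces exactly the IMF-fixdate form; here it is run against the spec's own example instant (a hypothetical snippet, not part of the patch):

```python
import email.utils

# 784111777 is the epoch second for the IMF-fixdate example in the spec:
# Sun, 06 Nov 1994 08:49:37 GMT
ts = 784111777
imf = email.utils.formatdate(ts, localtime=False, usegmt=True)
print(imf)  # Sun, 06 Nov 1994 08:49:37 GMT
```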
[issue7370] BaseHTTPServer reinventing rfc822 date formatting
karl added the comment: I think it is now fixed by my patch in http://bugs.python.org/issue747320 -- nosy: +karlcow ___ Python tracker <http://bugs.python.org/issue7370> ___
[issue747320] rfc2822 formatdate functionality duplication
karl added the comment: Made a mistake in the previous server.patch; use server2.patch instead. -- Added file: http://bugs.python.org/file29241/server2.patch ___ Python tracker <http://bugs.python.org/issue747320> ___
[issue747320] rfc2822 formatdate functionality duplication
Changes by karl : Removed file: http://bugs.python.org/file29240/server.patch ___ Python tracker <http://bugs.python.org/issue747320> ___
[issue11448] docs for HTTPConnection.set_tunnel are ambiguous
karl added the comment: This is a possible additional example for set_tunnel, a modification of python3.3/html/_sources/library/http.client.txt. Hope it helps. -- nosy: +karlcow Added file: http://bugs.python.org/file29243/http.client.patch ___ Python tracker <http://bugs.python.org/issue11448> ___
[issue17302] HTTP/2.0 - Implementations/Testing efforts
New submission from karl: Are there plans to develop an HTTP/2.0 library in parallel with the specification development? It will not be ready for years, but it would be good to have an evolving implementation. Or should it be done outside of python.org? Reference: https://github.com/http2

--
components: Library (Lib)
messages: 183086
nosy: karlcow
priority: normal
severity: normal
status: open
title: HTTP/2.0 - Implementations/Testing efforts
versions: Python 3.4
___ Python tracker <http://bugs.python.org/issue17302> ___
[issue17302] HTTP/2.0 - Implementations/Testing efforts
karl added the comment: Agreed on HTTP/1.1; is there a plan to fix it too? ;) Because the current http.server seems to be untouchable without breaking stuff all around. :) -- ___ Python tracker <http://bugs.python.org/issue17302> ___
[issue12921] http.server.BaseHTTPRequestHandler.send_error and trailing newline
karl added the comment: Testing your code in Listing 1:

→ curl -sI http://localhost:9000/
HTTP/1.0 501 Unsupported method ('HEAD')
Server: BaseHTTP/0.6 Python/3.3.0
Date: Tue, 26 Feb 2013 23:38:32 GMT
Content-Type: text/html;charset=utf-8
Connection: close

So this is normal, per http://tools.ietf.org/html/draft-ietf-httpbis-p2-semantics-22#section-6.6.2, except that it would be better to use "501 Not Implemented", though the prose is optional. The Content-Type is also kind of useless. That would deserve another bug. And:

→ curl http://localhost:9000/
Server: BaseHTTP/0.6 Python/3.3.0
Date: Tue, 26 Feb 2013 23:39:46 GMT
Content-Type: text/html;charset=utf-8
Connection: close

[HTML error page]
Error response
Error code: 500
Message: Traceback (most recent call last): File "server.py", line 9, in do_GET assert(False) AssertionError.
Error code explanation: 500 - Server got itself in trouble.

OK. The server is answering with HTTP/1.0 and then a traceback… which has nothing to do here. We can see that in more detail with telnet:

→ telnet localhost 9000
Trying ::1...
telnet: connect to address ::1: Connection refused
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
GET / HTTP/1.1
Host: localhost:9000

HTTP/1.0 500 Traceback (most recent call last): File "server.py", line 9, in do_GET assert(False) AssertionError
Server: BaseHTTP/0.6 Python/3.3.0
Date: Tue, 26 Feb 2013 23:49:04 GMT
Content-Type: text/html;charset=utf-8
Connection: close

[HTML error page]
Error response
Error code: 500
Message: Traceback (most recent call last): File "server.py", line 9, in do_GET assert(False) AssertionError.
Error code explanation: 500 - Server got itself in trouble.

Note that when not sending the traceback, with the following code:

#!/usr/bin/env python3.3
import http.server
import traceback

class httphandler(http.server.BaseHTTPRequestHandler):
    def do_GET(self):
        try:
            assert(False)
        except:
            self.send_error(500)

if __name__ == '__main__':
    addr = ('', 9000)
    http.server.HTTPServer(addr, httphandler).serve_forever()

everything is working well:

→ telnet localhost 9000
Trying ::1...
telnet: connect to address ::1: Connection refused
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
GET / HTTP/1.1
Host: localhost:9000

HTTP/1.0 500 Internal Server Error
Server: BaseHTTP/0.6 Python/3.3.0
Date: Tue, 26 Feb 2013 23:51:46 GMT
Content-Type: text/html;charset=utf-8
Connection: close

[HTML error page]
Error response
Error code: 500
Message: Internal Server Error.
Error code explanation: 500 - Server got itself in trouble.

Connection closed by foreign host.

I'm looking at http://hg.python.org/cpython/file/3.3/Lib/http/server.py#l404

For the second part of your message: I don't think the two issues should be mixed. Maybe open another bug report.

--
nosy: +karlcow
___ Python tracker <http://bugs.python.org/issue12921> ___
[issue12921] http.server.BaseHTTPRequestHandler.send_error and trailing newline
karl added the comment: OK, I understand a bit better now. self.send_error(code, msg) is used for:

* the body
* the HTTP header
* and the log

That's bad, very bad. I do not think it should be used for the HTTP header at all.

--
___ Python tracker <http://bugs.python.org/issue12921> ___
[issue12921] http.server.BaseHTTPRequestHandler.send_error and trailing newline
karl added the comment: OK, I modified the code of server.py so that the server doesn't send the private message but the one already assigned by the library, as it should. If there is a need for customization, there should be two separate variables, though that could lead to the same issues. After the modifications, this is what I get:

→ telnet localhost 9000
Trying ::1...
telnet: connect to address ::1: Connection refused
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
GET / HTTP/1.1
Host: localhost:9000

HTTP/1.0 500 Internal Server Error
Server: BaseHTTP/0.6 Python/3.3.0
Date: Wed, 27 Feb 2013 00:21:21 GMT
Content-Type: text/html;charset=utf-8
Connection: close

[HTML error page]
Error response
Error code: 500
Message: Traceback (most recent call last): File "server.py", line 11, in do_GET assert(False) AssertionError.
Error code explanation: 500 - Server got itself in trouble.

Connection closed by foreign host.

I joined the patch: server.issue12921.patch

--
keywords: +patch
Added file: http://bugs.python.org/file29255/server.issue12921.patch
___ Python tracker <http://bugs.python.org/issue12921> ___
[issue17302] HTTP/2.0 - Implementations/Testing efforts
karl added the comment: Read the thread. Thanks, Antoine; I understand better now. I'm still discovering how the community really works. Trying to fix a few things in the meantime:

http://bugs.python.org/issue12921
http://bugs.python.org/issue747320
http://bugs.python.org/issue11448
http://bugs.python.org/issue7370 (maybe this one is a duplicate of 747320)

This one, http://bugs.python.org/issue15799, which is still open, makes me think that it might be good for the new class to have strict production rules, and some parsing rules with a warning mode, if the user cares. Thanks again for the context, Antoine. Maybe we should close this bug as postponed?

--
___ Python tracker <http://bugs.python.org/issue17302> ___
[issue17319] http.server.BaseHTTPRequestHandler send_response_only doesn't check the type and value of the code.
New submission from karl:

def send_response_only(self, code, message=None)
http://hg.python.org/cpython/file/3.3/Lib/http/server.py#l448

There is no type checking on code, nor whether the code is appropriate. Let's take:

==
#!/usr/bin/env python3.3
import http.server

class HTTPHandler(http.server.BaseHTTPRequestHandler):
    "A very simple server"
    def do_GET(self):
        self.send_response(200)
        self.send_header('Content-type', 'text/plain')
        self.end_headers()
        self.wfile.write(bytes('Response body\n\n', 'latin1'))

if __name__ == '__main__':
    addr = ('', 9000)
    http.server.HTTPServer(addr, HTTPHandler).serve_forever()
==

A request is working well:

→ http GET localhost:9000
HTTP/1.0 200 OK
Server: BaseHTTP/0.6 Python/3.3.0
Date: Thu, 28 Feb 2013 04:00:44 GMT
Content-type: text/plain

Response body

And the server log is:

127.0.0.1 - - [27/Feb/2013 23:00:44] "GET / HTTP/1.1" 200 -

Then let's try:

==
#!/usr/bin/env python3.3
import http.server

class HTTPHandler(http.server.BaseHTTPRequestHandler):
    "A very simple server"
    def do_GET(self):
        self.send_response(999)
        self.send_header('Content-type', 'text/plain')
        self.end_headers()
        self.wfile.write(bytes('Response body\n\n', 'latin1'))

if __name__ == '__main__':
    addr = ('', 9000)
    http.server.HTTPServer(addr, HTTPHandler).serve_forever()
==

The response is:

→ http GET localhost:9000
HTTP/1.0 999
Server: BaseHTTP/0.6 Python/3.3.0
Date: Thu, 28 Feb 2013 03:55:54 GMT
Content-type: text/plain

Response body

and the server log is:

127.0.0.1 - - [27/Feb/2013 22:55:12] "GET / HTTP/1.1" 999 -

And finally:

==
#!/usr/bin/env python3.3
import http.server

class HTTPHandler(http.server.BaseHTTPRequestHandler):
    "A very simple server"
    def do_GET(self):
        self.send_response('foobar')
        self.send_header('Content-type', 'text/plain')
        self.end_headers()
        self.wfile.write(bytes('Response body\n\n', 'latin1'))

if __name__ == '__main__':
    addr = ('', 9000)
    http.server.HTTPServer(addr, HTTPHandler).serve_forever()
==

The response is then:

→ http GET localhost:9000
HTTPConnectionPool(host='localhost', port=9000): Max retries exceeded with url: /

and the server log is:

127.0.0.1 - - [27/Feb/2013 22:56:39] "GET / HTTP/1.1" foobar -
Exception happened during processing of request from ('127.0.0.1', 53917)
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.3/lib/python3.3/socketserver.py", line 306, in _handle_request_noblock
    self.process_request(request, client_address)
  File "/Library/Frameworks/Python.framework/Versions/3.3/lib/python3.3/socketserver.py", line 332, in process_request
    self.finish_request(request, client_address)
  File "/Library/Frameworks/Python.framework/Versions/3.3/lib/python3.3/socketserver.py", line 345, in finish_request
    self.RequestHandlerClass(request, client_address, self)
  File "/Library/Frameworks/Python.framework/Versions/3.3/lib/python3.3/socketserver.py", line 666, in __init__
    self.handle()
  File "/Library/Frameworks/Python.framework/Versions/3.3/lib/python3.3/http/server.py", line 400, in handle
    self.handle_one_request()
  File "/Library/Frameworks/Python.framework/Versions/3.3/lib/python3.3/http/server.py", line 388, in handle_one_request
    method()
  File "../25/server.py", line 8, in do_GET
    self.send_response('foobar')
  File "/Library/Frameworks/Python.framework/Versions/3.3/lib/python3.3/http/server.py", line 444, in send_response
    self.send_response_only(code, message)
  File "/Library/Frameworks/Python.framework/Versions/3.3/lib/python3.3/http/server.py", line 459, in send_response_only
    (self.protocol_version, code, message)).encode(
TypeError: %d format: a number is required, not str

Both the error messages and the server logs are not very helpful. Shall we fix it? What do others think? I guess there should be test cases too? I'm happy to write unit test cases.
[issue17307] HTTP PUT request Example
karl added the comment: Senthil: about the PUT/POST distinction, I would say: the POST and PUT methods differ only by the intent of the enclosed representation. In the case of a POST, the target resource (URI) on the server decides what the meaning of the enclosed representation in the POST request is. In a PUT request, the enclosed representation is meant to replace the state of the target resource (URI) on the server. That is why PUT is idempotent.

About the code:

* http.client

I would remove the following comment, because the term "file" is confusing in HTTP terms:

# This will create a resource file with contents of BODY

or I would say more exactly:

# This creates an HTTP message
# with the content of BODY as the enclosed representation
# for the resource http://localhost:8080/foobar

>>> import http.client
>>> BODY = "Some data"
>>> conn = http.client.HTTPConnection("localhost", 8080)
>>> conn.request("PUT", "/foobar", BODY)  # The actual PUT request
>>> response = conn.getresponse()
>>> print(response.status, response.reason)

Maybe it would be good to display the message which is sent, so people can understand what goes on the wire.

* urllib

The client code for urllib doesn't create any challenge. I would add a Content-Type, but no hard feelings about it. On the other hand, the server code makes me a bit uncomfortable. It leads people into believing that this is the way you should reply to a PUT, which is not really the case.

1. If the resource did not exist and has been successfully created, the server MUST answer 201.
2. If the resource already exists and has been successfully replaced/modified, then the server SHOULD answer either 200 or 204 (depending on the design choice).

There are plenty of other cases depending on the constraints. See http://tools.ietf.org/html/draft-ietf-httpbis-p2-semantics-22#section-4.3.4 for the details.

If we keep the server code, I would like to have a note saying that it is not usable as-is in production code.

Does that make sense? :)

--
nosy: +karlcow
___ Python tracker <http://bugs.python.org/issue17307> ___
[issue12455] urllib2 forces title() on header names, breaking some requests
karl added the comment: Note that HTTP header field names are case-insensitive. See http://tools.ietf.org/html/draft-ietf-httpbis-p1-messaging#section-3.2

Each HTTP header field consists of a case-insensitive field name followed by a colon (":"), optional whitespace, and the field value.

Basically the author of a request can set them to whatever he/she wants. But we should, IMHO, respect the author's intent. It might happen that someone chooses a specific casing to deal with broken servers and/or proxies. So a set/get/send cycle should not modify the headers at all.

--
nosy: +karlcow
___ Python tracker <http://bugs.python.org/issue12455> ___
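To illustrate why forcing str.title() on field names loses the author's chosen casing (a plain-Python demonstration, independent of urllib2's internals):

```python
# str.title() capitalizes the first letter of each alphanumeric run and
# lowercases the rest, so deliberately-cased header names get mangled.
for name in ('SOAPAction', 'X-WSSE', 'DNT'):
    print(name, '->', name.title())
# SOAPAction -> Soapaction
# X-WSSE -> X-Wsse
# DNT -> Dnt
```

A picky server that matches header names byte-for-byte (which it should not do, but some do) will then fail to recognize the mangled name.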
[issue17322] urllib.request add_header() currently allows trailing spaces
New submission from karl: For HTTP header field-name parsing, see http://tools.ietf.org/html/draft-ietf-httpbis-p1-messaging-22#section-3.2.4

No whitespace is allowed between the header field-name and colon. In the past, differences in the handling of such whitespace have led to security vulnerabilities in request routing and response handling. A server MUST reject any received request message that contains whitespace between a header field-name and colon with a response code of 400 (Bad Request). A proxy MUST remove any such whitespace from a response message before forwarding the message downstream.

In Python 3.3 currently:

>>> import urllib.request
>>> req = urllib.request.Request('http://www.example.com/')
>>> req.add_header('FoO ', 'Yeah')
>>> req.header_items()
[('Foo ', 'Yeah'), ('User-agent', 'Python-urllib/3.3'), ('Host', 'www.example.com')]

The space has not been removed, so we should fix that at least. This is a bug. I'm not familiar with the specific security issues mentioned in the spec.

Note that many other things can be done too. :/

>>> req.add_header('FoO \n blah', 'Yeah')
>>> req.add_header('Foo:Bar\nFoo2', 'Yeah')
>>> req.header_items()
[('Foo:bar\nfoo2', 'Yeah'), ('Foo \n blah', 'Yeah'), ('Foo ', 'Yeah'), ('User-agent', 'Python-urllib/3.3'), ('Host', 'www.example.com')]

I will look into making a patch tomorrow.

--
components: Library (Lib)
messages: 183234
nosy: karlcow, orsenthil
priority: normal
severity: normal
status: open
title: urllib.request add_header() currently allows trailing spaces
versions: Python 3.3
___ Python tracker <http://bugs.python.org/issue17322> ___
[issue17322] urllib.request add_header() currently allows trailing spaces (and other weird stuff)
Changes by karl : -- title: urllib.request add_header() currently allows trailing spaces -> urllib.request add_header() currently allows trailing spaces (and other weird stuff) ___ Python tracker <http://bugs.python.org/issue17322> ___
[issue17322] urllib.request add_header() currently allows trailing spaces (and other weird stuff)
karl added the comment: Yet another one: leading spaces. :(

>>> req = urllib.request.Request('http://www.example.com/')
>>> req.header_items()
[]
>>> req.add_header(' Foo3', 'Ooops')
>>> req.header_items()
[(' foo3', 'Ooops')]
>>> req.headers
{' foo3': 'Ooops'}

--
___ Python tracker <http://bugs.python.org/issue17322> ___
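A field-name validator along the lines of what a fix could use (a sketch against the RFC 7230 token grammar; the helper name is hypothetical). It rejects every malformed name shown in this issue:

```python
import re

# RFC 7230 token: tchar = "!" / "#" / "$" / "%" / "&" / "'" / "*" / "+"
#                       / "-" / "." / "^" / "_" / "`" / "|" / "~" / DIGIT / ALPHA
_TOKEN_RE = re.compile(r"[!#$%&'*+\-.^_`|~0-9A-Za-z]+\Z")

def is_valid_field_name(name):
    return _TOKEN_RE.match(name) is not None

print(is_valid_field_name('Foo'))            # True
print(is_valid_field_name('FoO '))           # False: trailing space
print(is_valid_field_name(' Foo3'))          # False: leading space
print(is_valid_field_name('Foo:Bar\nFoo2'))  # False: colon and newline
```

add_header could call such a check and raise ValueError on a non-token name, which would cover the trailing-space, leading-space, and header-injection cases at once.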