[issue36372] Flawed handling of URL redirect

2019-03-19 Thread Brrr Grrr


New submission from Brrr Grrr:

I'm unable to use `urlopen` to open 
'https://www.annemergmed.com/article/S0196-0644(99)70271-4/abstract' with 
Python 3.7. I believe the cause is flawed handling of URL redirects, possibly 
stemming from flawed URL parsing.

```python
>>> from sys import version
>>> version
'3.7.2 (default, Dec 25 2018, 03:50:46) \n[GCC 7.3.0]'
>>> from urllib.request import urlopen
>>> urlopen('https://www.annemergmed.com/article/S0196-0644(99)70271-4/abstract')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.7/urllib/request.py", line 222, in urlopen
    return opener.open(url, data, timeout)
  File "/usr/lib/python3.7/urllib/request.py", line 531, in open
    response = meth(req, response)
  File "/usr/lib/python3.7/urllib/request.py", line 641, in http_response
    'http', request, response, code, msg, hdrs)
  File "/usr/lib/python3.7/urllib/request.py", line 563, in error
    result = self._call_chain(*args)
  File "/usr/lib/python3.7/urllib/request.py", line 503, in _call_chain
    result = func(*args)
  File "/usr/lib/python3.7/urllib/request.py", line 755, in http_error_302
    return self.parent.open(new, timeout=req.timeout)
  File "/usr/lib/python3.7/urllib/request.py", line 531, in open
    response = meth(req, response)
  File "/usr/lib/python3.7/urllib/request.py", line 641, in http_response
    'http', request, response, code, msg, hdrs)
  File "/usr/lib/python3.7/urllib/request.py", line 563, in error
    result = self._call_chain(*args)
  File "/usr/lib/python3.7/urllib/request.py", line 503, in _call_chain
    result = func(*args)
  File "/usr/lib/python3.7/urllib/request.py", line 755, in http_error_302
    return self.parent.open(new, timeout=req.timeout)
  File "/usr/lib/python3.7/urllib/request.py", line 531, in open
    response = meth(req, response)
  File "/usr/lib/python3.7/urllib/request.py", line 641, in http_response
    'http', request, response, code, msg, hdrs)
  File "/usr/lib/python3.7/urllib/request.py", line 563, in error
    result = self._call_chain(*args)
  File "/usr/lib/python3.7/urllib/request.py", line 503, in _call_chain
    result = func(*args)
  File "/usr/lib/python3.7/urllib/request.py", line 755, in http_error_302
    return self.parent.open(new, timeout=req.timeout)
  File "/usr/lib/python3.7/urllib/request.py", line 531, in open
    response = meth(req, response)
  File "/usr/lib/python3.7/urllib/request.py", line 641, in http_response
    'http', request, response, code, msg, hdrs)
  File "/usr/lib/python3.7/urllib/request.py", line 563, in error
    result = self._call_chain(*args)
  File "/usr/lib/python3.7/urllib/request.py", line 503, in _call_chain
    result = func(*args)
  File "/usr/lib/python3.7/urllib/request.py", line 755, in http_error_302
    return self.parent.open(new, timeout=req.timeout)
  File "/usr/lib/python3.7/urllib/request.py", line 531, in open
    response = meth(req, response)
  File "/usr/lib/python3.7/urllib/request.py", line 641, in http_response
    'http', request, response, code, msg, hdrs)
  File "/usr/lib/python3.7/urllib/request.py", line 563, in error
    result = self._call_chain(*args)
  File "/usr/lib/python3.7/urllib/request.py", line 503, in _call_chain
    result = func(*args)
  File "/usr/lib/python3.7/urllib/request.py", line 755, in http_error_302
    return self.parent.open(new, timeout=req.timeout)
  File "/usr/lib/python3.7/urllib/request.py", line 531, in open
    response = meth(req, response)
  File "/usr/lib/python3.7/urllib/request.py", line 641, in http_response
    'http', request, response, code, msg, hdrs)
  File "/usr/lib/python3.7/urllib/request.py", line 563, in error
    result = self._call_chain(*args)
  File "/usr/lib/python3.7/urllib/request.py", line 503, in _call_chain
    result = func(*args)
  File "/usr/lib/python3.7/urllib/request.py", line 755, in http_error_302
    return self.parent.open(new, timeout=req.timeout)
  File "/usr/lib/python3.7/urllib/request.py", line 531, in open
    response = meth(req, response)
  File "/usr/lib/python3.7/urllib/request.py", line 641, in http_response
    'http', request, response, code, msg, hdrs)
  File "/usr/lib/python3.7/urllib/request.py", line 563, in error
    result = self._call_chain(*args)
  File "/usr/lib/python3.7/urllib/request.py", line 503, in _call_chain
    result = func(*args)
  File "/usr/lib/python3.7/urllib/request.py", line 755, in http_error_302
    return self.parent.open(new, timeout=req.timeout)
  File "/usr/lib/python3.7/
```

[issue36372] Flawed handling of URL redirect

2019-03-19 Thread Brrr Grrr


Brrr Grrr added the comment:

Note that the `requests` package, for example, has no trouble reading this 
URL. For unrelated reasons, though, I don't want to use that package for this 
task.

```python
>>> import requests
>>> requests.__version__
'2.21.0'
>>> requests.get('https://www.annemergmed.com/article/S0196-0644(99)70271-4/abstract')

```

--

___
Python tracker 
<https://bugs.python.org/issue36372>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue36372] Flawed handling of URL redirect

2019-03-19 Thread Brrr Grrr


Brrr Grrr added the comment:

This error is not due to cookies either. I tried `HTTPCookieProcessor` with no 
luck. Cookies help with opening certain other URLs but evidently not with this 
one.
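
For reference, the cookie attempt described above could look roughly like the following. This is a minimal sketch, not the exact code that was run: the helper name `make_cookie_opener` is hypothetical, and the actual network call is left commented out.

```python
from http.cookiejar import CookieJar
from urllib.request import HTTPCookieProcessor, build_opener


def make_cookie_opener():
    """Return an opener that stores and resends cookies, plus its cookie jar.

    `make_cookie_opener` is a hypothetical helper name used for illustration.
    """
    jar = CookieJar()
    # HTTPCookieProcessor records Set-Cookie headers in `jar` and sends the
    # stored cookies back on later requests, including redirected ones.
    opener = build_opener(HTTPCookieProcessor(jar))
    return opener, jar


opener, jar = make_cookie_opener()
# The request itself needs network access, so it is not executed here:
# opener.open('https://www.annemergmed.com/article/S0196-0644(99)70271-4/abstract')
```

The opener still includes the default redirect handler, so the default redirect limit continues to apply.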




[issue36372] Flawed handling of URL redirect

2019-03-19 Thread Brrr Grrr


Brrr Grrr added the comment:

That's not to say that cookies are not needed for this URL. They may very well 
be needed, via `HTTPCookieProcessor`. My point is that cookies alone won't 
solve this issue.




[issue36372] Flawed handling of URL redirect

2019-03-19 Thread Brrr Grrr


Brrr Grrr added the comment:

I have now tried a custom `HTTPRedirectHandler` with `max_redirections = 20` 
(the default is 10). This workaround addresses the issue, although it doesn't 
rule out a cleaner fix.
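
The workaround can be sketched as follows. This is a minimal illustration under stated assumptions: the class name `PatientRedirectHandler` and the helper `make_opener` are hypothetical, and the network call is left commented out.

```python
from urllib.request import HTTPRedirectHandler, build_opener


class PatientRedirectHandler(HTTPRedirectHandler):
    """Redirect handler that tolerates a longer redirect chain.

    The class name is hypothetical; only the attribute below matters.
    """
    # The stdlib default for this class attribute is 10.
    max_redirections = 20


def make_opener():
    # Passing a subclass of a default handler to build_opener makes it
    # replace the stock HTTPRedirectHandler instead of adding a second one.
    return build_opener(PatientRedirectHandler)


opener = make_opener()
# The request itself needs network access, so it is not executed here:
# opener.open('https://www.annemergmed.com/article/S0196-0644(99)70271-4/abstract')
```

If cookies turn out to be needed as well, an `HTTPCookieProcessor` instance can be passed to the same `build_opener` call alongside the handler class.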
