[issue22114] You cannot call communicate() safely after receiving EAGAIN

2014-07-31 Thread Amrith Kumar

New submission from Amrith Kumar:

Environment:
- Linux (Ubuntu 14.04 LTS, x64) running Python 2.7.
- Code uses eventlet.green.subprocess

Code establishes subprocess as:

   subprocess.Popen(command,
stdin=subprocess.PIPE,
stdout=subprocess.PIPE,
stderr=subprocess.PIPE, ...)

It then calls communicate()

communicate() throws EAGAIN

The code calls communicate() again

This fails with "ValueError: I/O operation on closed file"

This exception comes from eventlet/greenio.py because an operation (flush) has 
been attempted on a closed file.

The complete explanation of this failure is this.

When communicate() is called in this way (WITH NO INPUT but with all three 
handles stdin, stdout, stderr), code in communicate() bypasses the "1 handle 
optimization" and goes directly to _communicate()

_communicate() will first send any input on stdin along, and flush and close 
stdin(). 

>From that point forward, if anything to do with stderr and stdout returns 
>EAGAIN, an attempt to call communicate() again will fail because stdin has now 
>been closed().

This is because the code in _communicate() preemptively closes stdin if there 
is no input but does not reset stdin.

   def _communicate(self, input):
if self.stdin:
# Flush stdio buffer.  This might block, if the user has
# been writing to .stdin in an uncontrolled fashion.
self.stdin.flush()
if not input:
self.stdin.close()

The fix for this is relatively straightforward (conceptually).

   def _communicate(self, input):
if self.stdin:
# Flush stdio buffer.  This might block, if the user has
# been writing to .stdin in an uncontrolled fashion.
self.stdin.flush()
if not input:
self.stdin.close()
self.stdin = None # This should take care of it.

However, a partial workaround from the client perspective is this.

1. If you have no input, set stdin to None
2. If you do have input and you get EAGAIN, you cannot safely retry because 
your input may have already been completely or partially sent to the 
subprocess; you have to assume failure.


Backtrace:

Traceback (most recent call last):
File 
"/opt/stack/trove/trove/tests/unittests/guestagent/test_mongodb_manager.py", 
line 58, in test_prepare_from_backup
self._prepare_dynamic(backup_id='backup_id_123abc')
File 
"/opt/stack/trove/trove/tests/unittests/guestagent/test_mongodb_manager.py", 
line 91, in _prepare_dynamic
backup_info=backup_info)
File "trove/guestagent/datastore/mongodb/manager.py", line 66, in 
prepare
operating_system.update_owner('mongodb', 'mongodb', mount_point)
File "trove/guestagent/common/operating_system.py", line 109, in 
update_owner
run_as_root=True, root_helper="sudo")
File "trove/common/utils.py", line 278, in execute_with_timeout
return execute(*args, **kwargs)
File "trove/openstack/common/processutils.py", line 186, in execute
result = obj.communicate()
File "/usr/lib/python2.7/subprocess.py", line 799, in communicate
return self._communicate(input)
File "/usr/lib/python2.7/subprocess.py", line 1396, in _communicate
self.stdin.flush()
File 
"/opt/stack/trove/.tox/py27/local/lib/python2.7/site-packages/eventlet/greenio.py",
 line 419, in _operationOnClosedFile
raise ValueError("I/O operation on closed file")

ValueError: I/O operation on closed file

--
components: IO
messages: 224398
nosy: amrith
priority: normal
severity: normal
status: open
title: You cannot call communicate() safely after receiving EAGAIN
versions: Python 2.7

___
Python tracker 
<http://bugs.python.org/issue22114>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue22114] You cannot call communicate() safely after receiving EAGAIN

2014-07-31 Thread Amrith Kumar

Amrith Kumar added the comment:

I'll supply a patch for this proposed change.

--

___
Python tracker 
<http://bugs.python.org/issue22114>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue22114] You cannot call communicate() safely after receiving EAGAIN

2014-07-31 Thread Amrith Kumar

Amrith Kumar added the comment:

I see three comments, one from r.david.murray, one from haypo and one from 
pitrou. I'll try and address all three.

r.david.murray:

The code in question is in 
https://github.com/openstack/oslo-incubator/blob/master/openstack/common/processutils.py#L177-L189

note that we're catching EAGAIN and EINTR.

I have not been able to isolate this down to a simple repro without the rest of 
this paraphernalia but I'm trying.

So, we are 'catching' EAGAIN or EINTR here and we're trying to handle it to the 
best of our ability. However, if the underlying layer is not setup to handle a 
retry, our best efforts will be fruitless.

That is what is happening here. 

The reason for this code (ignoring the retry of 20) was put in place exactly 
because a call to communicate() received an EAGAIN.

The issue therefore is that in order for the higher level to properly handle 
this, communicate() should be setup to handle a second call, which it currently 
is not. 

haypo and pitrou: that may be true; I'm not competent to comment on that.

But, as pointed out in earlier comment (and modulo this may be eventlet 
specific), just catching more exceptions isn't the answer.

if the descriptor is closed, the thing that communicate/_communicate() call 
should be able to handle that situation. And this bug illustrates that at least 
eventlet doesn't handle that.

However, I submit to you that this is NOT an eventlet issue. Here's why.

The failure here is that a flush call is being attempted on a closed 
descriptor. I believe that the implementation of flush (in eventlet) is 
legitimately throwing an exception indicating that the state machine was 
violated (cannot flush on closed descriptor). 

The close() was invoked by subprocess.py after it finished doing what it 
thought it had to do with stdin on the first invocation. therefore I believe it 
must be the responsibility of subprocess.py to make sure that when invoked 
again, it doesn't step on itself.

Either that, or subprocess.py's communicate() implementation should indicate 
that it can only be called once, capture all exceptions that would point a user 
to retry (such as EAGAIN and EINTR) and mask them and return some EFATAL.

--
components: +IO -Library (Lib)

___
Python tracker 
<http://bugs.python.org/issue22114>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue22114] You cannot call communicate() safely after receiving an exception (EINTR or EAGAIN)

2014-07-31 Thread Amrith Kumar

Changes by Amrith Kumar :


--
title: You cannot call communicate() safely after receiving EAGAIN -> You 
cannot call communicate() safely after receiving an exception (EINTR or EAGAIN)

___
Python tracker 
<http://bugs.python.org/issue22114>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue22114] You cannot call communicate() safely after receiving an exception (EINTR or EAGAIN)

2014-07-31 Thread Amrith Kumar

Amrith Kumar added the comment:

haypo; no argument from me. I just updated the description to make it more 
exact.

I may have received EINTR or EAGAIN.

In either event, I need to debug some more and try and do a clean repro and 
then figure out a proper fix.

In the meanwhile, I proposed a fix for the openstack code; if you are so 
inclined, please take a look.

https://review.openstack.org/#/c/110933/

--

___
Python tracker 
<http://bugs.python.org/issue22114>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue22114] You cannot call communicate() safely after receiving an exception (EINTR or EAGAIN)

2014-07-31 Thread Amrith Kumar

Amrith Kumar added the comment:

r.david.murray, I guess the issue we have is that something (hypothesis: 
eventlet) is causing the blocking API to communicate() to throw an EAGAIN.

This (https://bugs.launchpad.net/nova/+bug/1271331) is the bug that was fixed a 
while ago in OpenStack that introduced the notion of calling communicate() 
multiple times.

--

___
Python tracker 
<http://bugs.python.org/issue22114>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue22114] You cannot call communicate() safely after receiving an exception (EINTR or EAGAIN)

2014-08-02 Thread Amrith Kumar

Amrith Kumar added the comment:

After some debugging and reading code in python's subprocess.py (v2.7), here's 
what I'm seeing.

(a) the error has nothing to do with eventlet and monkey-patching.
(b) the code in _communicate_with_select() and potentially 
_communicate_with_poll() are the problem.

What's the problem?
---

The code in _communicate_with_select() correctly sets up to handle the read and 
write calls without blocking on any of them. It does this by establishing the 
two (read and write) lists of descriptors and calls select without no timeout 
specified. 

When select returns, and indicates that a socket is in (for example) the read 
list, what that means is that an attempt to read() will not block. However, it 
is possible on some systems (Linux for sure) that 

(a) a socket is non-blocking, and
(b) a call to select indicates that the socket is ready to read, and
(c) a call to read the socket returns an error EAGAIN (aka EWOULDBLOCK).

Callers of read() and write() on non-blocking sockets should be prepared to 
handle this situation. 

The python code in _communicate_with_select() is not.

It assumes that if select() returns that a read fd is ready for read, that a 
call to read it will produce 0 or more bytes. The calls to read() for stdout 
and stderr are not guarded with exception handlers. However, if a socket is 
setup as non-blocking, any call can produce EWOULDBLOCK, EAGAIN, ...

Adding some debugging code it was possible to recreate the problem and show 
that the backtrace was (in this case it happened with python 2.6)

Traceback (most recent call last):
[...]
  File "trove/openstack/common/processutils.py", line 186, in execute
result = obj.communicate()
  File "/usr/lib64/python2.6/subprocess.py", line 732, in communicate
stdout, stderr = self._communicate(input, endtime)
  File "/usr/lib64/python2.6/subprocess.py", line 1318, in _communicate
stdout, stderr = self._communicate_with_select(input, endtime)
  File "/usr/lib64/python2.6/subprocess.py", line 1483, in 
_communicate_with_select
data = os.read(self.stdout.fileno(), 1024)
OSError: [Errno 11] Resource temporarily unavailable

The correct fix for this is to make _communicate_with_select() and maybe 
_communicate_with_poll() properly handle the read() and write() calls and be 
better prepared to handle a thrown condition of EAGAIN from os.read or os.write.

--

___
Python tracker 
<http://bugs.python.org/issue22114>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue22114] You cannot call communicate() safely after receiving an exception (EINTR or EAGAIN)

2014-08-05 Thread Amrith Kumar

Amrith Kumar added the comment:

The issue turned out to be something funky in monkey patching.

os.read() shouldn't be getting called if the monkey patch was correct in the 
first place.

--
resolution:  -> not a bug
status: open -> closed

___
Python tracker 
<http://bugs.python.org/issue22114>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28841] urlparse.urlparse() parses invalid URI without generating an error (examples provided)

2016-11-30 Thread Amrith Kumar

New submission from Amrith Kumar:

The urlparse library incorrectly accepts a URI which specifies an invalid host 
identifier.

Example:

http://www.example.com:/abc

Looking at the URI specifications, this is an invalid URI.

By definition, you are supposed to specify a "hostport" and a hostport is 
defined as:

https://www.w3.org/Addressing/URL/uri-spec.html

hostport
host [ : port ]

The BNF indicates that : is only valid if a port is also specified.

See current behavior; I submit to you that this should generate an exception.

https://gist.github.com/anonymous/8504f160ff90649890b5a2a82f8028b0

--
components: Library (Lib)
messages: 282086
nosy: amrith
priority: normal
severity: normal
status: open
title: urlparse.urlparse() parses invalid URI without generating an error 
(examples provided)
type: behavior
versions: Python 2.7, Python 3.4

___
Python tracker 
<http://bugs.python.org/issue28841>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28841] urlparse.urlparse() parses invalid URI without generating an error (examples provided)

2016-11-30 Thread Amrith Kumar

Amrith Kumar added the comment:

As requested ...

>>> urlparse.urlparse('http://www.google.com:/abc')
ParseResult(scheme='http', netloc='www.google.com:', path='/abc', params='', 
query='', fragment='')

I submit to you that this should generate an error.

--

___
Python tracker 
<http://bugs.python.org/issue28841>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28841] urlparse.urlparse() parses invalid URI without generating an error (examples provided)

2016-11-30 Thread Amrith Kumar

Amrith Kumar added the comment:

Eric, Martin,

I'm sure this is an interpretation (and I'll be fair and give you both) that a 
URL of the form:

http://hostname.domain.tld:/path

should be held invalid. The rationale is per the section of the spec you quoted.

The other side of the argument is that the hostname and port are defined as:

hostname [ : port ] 

where port is defined as

*DIGIT

This implies that 0 digits is also valid.

I submit to you that the ambiguity would be removed if they:

- removed the paragraph telling people not to emit a : if there was no port, or
- changing the port definition to 1*DIGIT

Absent that, I believe that the paragraph implies that the intent was 1*DIGIT.

And therefore a : with no following number should generate an error.

I raise this because the behavior of urlparse() (the way it does things now) is 
being cited as the reason why other routines in OpenStack should handle this 
when the reason we were getting URL's with a : and no port was in fact an 
oversight.

So, in hearing the rationalization as being that "urlparse() does it, so ..." I 
figured I'd come to the source and see if we could make urlparse() do things 
unambiguously.

Now, if the reason it does this is because someone who came before me made the 
argument that a : with no port should be accepted, I guess I'm out of luck.

--

___
Python tracker 
<http://bugs.python.org/issue28841>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com