[Python-Dev] potential argparse problem: bad mix of parse_known_args and prefix matching

2013-11-26 Thread Eli Bendersky
Hello,

argparse does prefix matching as long as there are no conflicts. For
example:

argparser = argparse.ArgumentParser()
argparser.add_argument('--sync-foo', action='store_true')
args = argparser.parse_args()

If I pass "--sync" to this script, it recognizes it as "--sync-foo". This
behavior is quite surprising although I can see the motivation for it. At
the very least it should be much more explicitly documented (AFAICS it's
barely mentioned in the docs).

If there's another argument registered, say "--sync-bar" the above will
fail due to a conflict.

Now comes the nasty part. When using "parse_known_args" instead of
"parse_args", the above happens too - --sync is recognized for --sync-foo
and captured by the parser. But this is wrong! The whole idea of
parse_known_args is to parse the known args, leaving unknowns alone. This
prefix matching harms more than it helps here because maybe the program
we're actually acting as a front-end for (and hence using parse_known_args)
knows about --sync and wants to get it.

Unless I'm missing something, this is a bug. But I'm also not sure whether
we can do anything about it at this point, as existing code *may* be
relying on it. The right thing to do would be to disable this prefix
matching when parse_known_args is called.

Again, at the very least this should be documented (for parse_known_args
not less than a warning box, IMHO).

Eli
___
Python-Dev mailing list
[email protected]
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] potential argparse problem: bad mix of parse_known_args and prefix matching

2013-11-26 Thread Guido van Rossum
I think matching on the shortest unique prefix is common for command line
parsers in general, not just argparse. I believe optparse did this too, and
even the venerable getopt does! I think all this originated in the original
(non-Python) GNU standard for long option parsing. All that probably
explains why the docs hardly touch upon it.

As to why parse_known_args also does this, I can see the reasoning behind
this behavior: to the end user, "--sync" is a valid option, so it would be
surprising if it didn't get recognized under certain conditions.

I suppose you were badly bitten by this recently? Can you tell us more
about what happened?


On Tue, Nov 26, 2013 at 9:30 AM, Eli Bendersky  wrote:

> Hello,
>
> argparse does prefix matching as long as there are no conflicts. For
> example:
>
> argparser = argparse.ArgumentParser()
> argparser.add_argument('--sync-foo', action='store_true')
> args = argparser.parse_args()
>
> If I pass "--sync" to this script, it recognizes it as "--sync-foo". This
> behavior is quite surprising although I can see the motivation for it. At
> the very least it should be much more explicitly documented (AFAICS it's
> barely mentioned in the docs).
>
> If there's another argument registered, say "--sync-bar" the above will
> fail due to a conflict.
>
> Now comes the nasty part. When using "parse_known_args" instead of
> "parse_args", the above happens too - --sync is recognized for --sync-foo
> and captured by the parser. But this is wrong! The whole idea of
> parse_known_args is to parse the known args, leaving unknowns alone. This
> prefix matching harms more than it helps here because maybe the program
> we're actually acting as a front-end for (and hence using parse_known_args)
> knows about --sync and wants to get it.
>
> Unless I'm missing something, this is a bug. But I'm also not sure whether
> we can do anything about it at this point, as existing code *may* be
> relying on it. The right thing to do would be to disable this prefix
> matching when parse_known_args is called.
>
> Again, at the very least this should be documented (for parse_known_args
> not less than a warning box, IMHO).
>
> Eli
>
> ___
> Python-Dev mailing list
> [email protected]
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe:
> https://mail.python.org/mailman/options/python-dev/guido%40python.org
>
>


-- 
--Guido van Rossum (python.org/~guido)
___
Python-Dev mailing list
[email protected]
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] potential argparse problem: bad mix of parse_known_args and prefix matching

2013-11-26 Thread R. David Murray
On Tue, 26 Nov 2013 09:30:10 -0800, Eli Bendersky  wrote:
> Hello,
> 
> argparse does prefix matching as long as there are no conflicts. For
> example:
> 
> argparser = argparse.ArgumentParser()
> argparser.add_argument('--sync-foo', action='store_true')
> args = argparser.parse_args()
> 
> If I pass "--sync" to this script, it recognizes it as "--sync-foo". This
> behavior is quite surprising although I can see the motivation for it. At
> the very least it should be much more explicitly documented (AFAICS it's
> barely mentioned in the docs).
> 
> If there's another argument registered, say "--sync-bar" the above will
> fail due to a conflict.
> 
> Now comes the nasty part. When using "parse_known_args" instead of
> "parse_args", the above happens too - --sync is recognized for --sync-foo
> and captured by the parser. But this is wrong! The whole idea of
> parse_known_args is to parse the known args, leaving unknowns alone. This
> prefix matching harms more than it helps here because maybe the program
> we're actually acting as a front-end for (and hence using parse_known_args)
> knows about --sync and wants to get it.
> 
> Unless I'm missing something, this is a bug. But I'm also not sure whether
> we can do anything about it at this point, as existing code *may* be
> relying on it. The right thing to do would be to disable this prefix
> matching when parse_known_args is called.
> 
> Again, at the very least this should be documented (for parse_known_args
> not less than a warning box, IMHO).

I'm not sure whether or not it is a bug, but there are a number of
bugs involving argument prefix matching in the tracker.  Including one
to be able to disable it, that has a patch but didn't get into 3.4
(issue 14910).

This topic seems more suited to the bug tracker than python-dev.

Stephen hasn't been very active lately.  There is at least one
non-committer who has been, but no committers have made time to review
the patches and take action (they are on my list, but there is a lot
of stuff ahead of them).  So if you'd like to help out in that regard,
that would be great :)

--David
___
Python-Dev mailing list
[email protected]
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] potential argparse problem: bad mix of parse_known_args and prefix matching

2013-11-26 Thread Eli Bendersky
On Tue, Nov 26, 2013 at 9:38 AM, Guido van Rossum  wrote:

> I think matching on the shortest unique prefix is common for command line
> parsers in general, not just argparse. I believe optparse did this too, and
> even the venerable getopt does! I think all this originated in the original
> (non-Python) GNU standard for long option parsing. All that probably
> explains why the docs hardly touch upon it.
>
> As to why parse_known_args also does this, I can see the reasoning behind
> this behavior: to the end user, "--sync" is a valid option, so it would be
> surprising if it didn't get recognized under certain conditions.
>
> I suppose you were badly bitten by this recently? Can you tell us more
> about what happened?
>

Sure. We have a Python script that serves as a gateway to another program.
That other program has a "--sync" option. The gateway script has a
"--sync-foo" option. When the gateway script is invoked with "--sync", we'd
expect it to pass it to the program; instead, it matches it to its own
"--sync-foo" and consumes the option.

Practically, this means a big caveat on exactly the use case
parse_known_args was designed for: whenever I have a Python script using
argparse and passing unknown arguments to other programs, I have to
manually make sure there are no common prefixes between any commands to
avoid this problem.

Frankly I don't see how the current behavior can be seen as the intended
one.

Eli





>
>
> On Tue, Nov 26, 2013 at 9:30 AM, Eli Bendersky  wrote:
>
>> Hello,
>>
>> argparse does prefix matching as long as there are no conflicts. For
>> example:
>>
>> argparser = argparse.ArgumentParser()
>> argparser.add_argument('--sync-foo', action='store_true')
>> args = argparser.parse_args()
>>
>> If I pass "--sync" to this script, it recognizes it as "--sync-foo". This
>> behavior is quite surprising although I can see the motivation for it. At
>> the very least it should be much more explicitly documented (AFAICS it's
>> barely mentioned in the docs).
>>
>> If there's another argument registered, say "--sync-bar" the above will
>> fail due to a conflict.
>>
>> Now comes the nasty part. When using "parse_known_args" instead of
>> "parse_args", the above happens too - --sync is recognized for --sync-foo
>> and captured by the parser. But this is wrong! The whole idea of
>> parse_known_args is to parse the known args, leaving unknowns alone. This
>> prefix matching harms more than it helps here because maybe the program
>> we're actually acting as a front-end for (and hence using parse_known_args)
>> knows about --sync and wants to get it.
>>
>> Unless I'm missing something, this is a bug. But I'm also not sure
>> whether we can do anything about it at this point, as existing code *may*
>> be relying on it. The right thing to do would be to disable this prefix
>> matching when parse_known_args is called.
>>
>> Again, at the very least this should be documented (for parse_known_args
>> not less than a warning box, IMHO).
>>
>> Eli
>>
>> ___
>> Python-Dev mailing list
>> [email protected]
>> https://mail.python.org/mailman/listinfo/python-dev
>> Unsubscribe:
>> https://mail.python.org/mailman/options/python-dev/guido%40python.org
>>
>>
>
>
> --
> --Guido van Rossum (python.org/~guido)
>
___
Python-Dev mailing list
[email protected]
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] potential argparse problem: bad mix of parse_known_args and prefix matching

2013-11-26 Thread Glenn Linderman
Eli did give his use case... a front end for a program that has a 
parameter "--sync", and a front end preprocessor of some sort was trying 
to use "--sync-foo" as an argument, and wanted "--sync" to be left in 
the parameters to send on to the back end program.


Design of the front-end might better be aware of back end parameters and 
not conflict, but the documentation could be improved, likely.


It might also be possible to add a setting to disable the prefix 
matching feature, for code that prefers it not be done. Whether that is 
better done as a global setting, or a per-parameter setting I haven't 
thought through. But both the constructor and the parameter definitions 
already accept a variable number of named parameters, so I would think 
it would be possible to add another, and retain backward compatibility 
via an appropriate default.


On 11/26/2013 9:38 AM, Guido van Rossum wrote:
I think matching on the shortest unique prefix is common for command 
line parsers in general, not just argparse. I believe optparse did 
this too, and even the venerable getopt does! I think all this 
originated in the original (non-Python) GNU standard for long option 
parsing. All that probably explains why the docs hardly touch upon it.


As to why parse_known_args also does this, I can see the reasoning 
behind this behavior: to the end user, "--sync" is a valid option, so 
it would be surprising if it didn't get recognized under certain 
conditions.


I suppose you were badly bitten by this recently? Can you tell us more 
about what happened?



On Tue, Nov 26, 2013 at 9:30 AM, Eli Bendersky > wrote:


Hello,

argparse does prefix matching as long as there are no conflicts.
For example:

argparser = argparse.ArgumentParser()
argparser.add_argument('--sync-foo', action='store_true')
args = argparser.parse_args()

If I pass "--sync" to this script, it recognizes it as
"--sync-foo". This behavior is quite surprising although I can see
the motivation for it. At the very least it should be much more
explicitly documented (AFAICS it's barely mentioned in the docs).

If there's another argument registered, say "--sync-bar" the above
will fail due to a conflict.

Now comes the nasty part. When using "parse_known_args" instead of
"parse_args", the above happens too - --sync is recognized for
--sync-foo and captured by the parser. But this is wrong! The
whole idea of parse_known_args is to parse the known args, leaving
unknowns alone. This prefix matching harms more than it helps here
because maybe the program we're actually acting as a front-end for
(and hence using parse_known_args) knows about --sync and wants to
get it.

Unless I'm missing something, this is a bug. But I'm also not sure
whether we can do anything about it at this point, as existing
code *may* be relying on it. The right thing to do would be to
disable this prefix matching when parse_known_args is called.

Again, at the very least this should be documented (for
parse_known_args not less than a warning box, IMHO).

Eli



___
Python-Dev mailing list
[email protected]
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] potential argparse problem: bad mix of parse_known_args and prefix matching

2013-11-26 Thread Eli Bendersky
On Tue, Nov 26, 2013 at 9:46 AM, R. David Murray wrote:

> On Tue, 26 Nov 2013 09:30:10 -0800, Eli Bendersky 
> wrote:
> > Hello,
> >
> > argparse does prefix matching as long as there are no conflicts. For
> > example:
> >
> > argparser = argparse.ArgumentParser()
> > argparser.add_argument('--sync-foo', action='store_true')
> > args = argparser.parse_args()
> >
> > If I pass "--sync" to this script, it recognizes it as "--sync-foo". This
> > behavior is quite surprising although I can see the motivation for it. At
> > the very least it should be much more explicitly documented (AFAICS it's
> > barely mentioned in the docs).
> >
> > If there's another argument registered, say "--sync-bar" the above will
> > fail due to a conflict.
> >
> > Now comes the nasty part. When using "parse_known_args" instead of
> > "parse_args", the above happens too - --sync is recognized for --sync-foo
> > and captured by the parser. But this is wrong! The whole idea of
> > parse_known_args is to parse the known args, leaving unknowns alone. This
> > prefix matching harms more than it helps here because maybe the program
> > we're actually acting as a front-end for (and hence using
> parse_known_args)
> > knows about --sync and wants to get it.
> >
> > Unless I'm missing something, this is a bug. But I'm also not sure
> whether
> > we can do anything about it at this point, as existing code *may* be
> > relying on it. The right thing to do would be to disable this prefix
> > matching when parse_known_args is called.
> >
> > Again, at the very least this should be documented (for parse_known_args
> > not less than a warning box, IMHO).
>
> I'm not sure whether or not it is a bug, but there are a number of
> bugs involving argument prefix matching in the tracker.  Including one
> to be able to disable it, that has a patch but didn't get into 3.4
> (issue 14910).
>
> This topic seems more suited to the bug tracker than python-dev.
>
> Stephen hasn't been very active lately.  There is at least one
> non-committer who has been, but no committers have made time to review
> the patches and take action (they are on my list, but there is a lot
> of stuff ahead of them).  So if you'd like to help out in that regard,
> that would be great :)
>

Being able to disable prefix matching is certainly a good idea. I'll nosy
myself on Issue 14910 and will try to find some time for it.

Eli
___
Python-Dev mailing list
[email protected]
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] potential argparse problem: bad mix of parse_known_args and prefix matching

2013-11-26 Thread Eli Bendersky
On Tue, Nov 26, 2013 at 9:53 AM, Glenn Linderman wrote:

>  Eli did give his use case... a front end for a program that has a
> parameter "--sync", and a front end preprocessor of some sort was trying to
> use "--sync-foo" as an argument, and wanted "--sync" to be left in the
> parameters to send on to the back end program.
>
> Design of the front-end might better be aware of back end parameters and
> not conflict, but the documentation could be improved, likely.
>
> It might also be possible to add a setting to disable the prefix matching
> feature, for code that prefers it not be done. Whether that is better done
> as a global setting, or a per-parameter setting I haven't thought through.
> But both the constructor and the parameter definitions already accept a
> variable number of named parameters, so I would think it would be possible
> to add another, and retain backward compatibility via an appropriate
> default.
>
>
FWIW I'm not advocating a breaking behavior change here - I fully realize
the ship has sailed. I'm interested in mitigation actions, though. Making
the documentation explain this explicitly + adding an option to disable
prefix matching (in 3.5 since we're past the 3.4 features point) will go a
long way for alleviating this gotcha.

Eli



>
> On 11/26/2013 9:38 AM, Guido van Rossum wrote:
>
>  I think matching on the shortest unique prefix is common for command
> line parsers in general, not just argparse. I believe optparse did this
> too, and even the venerable getopt does! I think all this originated in the
> original (non-Python) GNU standard for long option parsing. All that
> probably explains why the docs hardly touch upon it.
>
>  As to why parse_known_args also does this, I can see the reasoning behind
> this behavior: to the end user, "--sync" is a valid option, so it would be
> surprising if it didn't get recognized under certain conditions.
>
>  I suppose you were badly bitten by this recently? Can you tell us more
> about what happened?
>
>
> On Tue, Nov 26, 2013 at 9:30 AM, Eli Bendersky  wrote:
>
>>   Hello,
>>
>>  argparse does prefix matching as long as there are no conflicts. For
>> example:
>>
>> argparser = argparse.ArgumentParser()
>> argparser.add_argument('--sync-foo', action='store_true')
>> args = argparser.parse_args()
>>
>>  If I pass "--sync" to this script, it recognizes it as "--sync-foo".
>> This behavior is quite surprising although I can see the motivation for it.
>> At the very least it should be much more explicitly documented (AFAICS it's
>> barely mentioned in the docs).
>>
>> If there's another argument registered, say "--sync-bar" the above will
>> fail due to a conflict.
>>
>> Now comes the nasty part. When using "parse_known_args" instead of
>> "parse_args", the above happens too - --sync is recognized for --sync-foo
>> and captured by the parser. But this is wrong! The whole idea of
>> parse_known_args is to parse the known args, leaving unknowns alone. This
>> prefix matching harms more than it helps here because maybe the program
>> we're actually acting as a front-end for (and hence using parse_known_args)
>> knows about --sync and wants to get it.
>>
>>  Unless I'm missing something, this is a bug. But I'm also not sure
>> whether we can do anything about it at this point, as existing code *may*
>> be relying on it. The right thing to do would be to disable this prefix
>> matching when parse_known_args is called.
>>
>>  Again, at the very least this should be documented (for parse_known_args
>> not less than a warning box, IMHO).
>>
>> Eli
>>
>
>
> ___
> Python-Dev mailing list
> [email protected]
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe:
> https://mail.python.org/mailman/options/python-dev/eliben%40gmail.com
>
>
___
Python-Dev mailing list
[email protected]
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] potential argparse problem: bad mix of parse_known_args and prefix matching

2013-11-26 Thread Paul Moore
On 26 November 2013 18:16, Eli Bendersky  wrote:
> FWIW I'm not advocating a breaking behavior change here - I fully realize
> the ship has sailed. I'm interested in mitigation actions, though. Making
> the documentation explain this explicitly + adding an option to disable
> prefix matching (in 3.5 since we're past the 3.4 features point) will go a
> long way for alleviating this gotcha.

Prefix matching is pretty much the norm in my experience, so I don't
think it should surprise people in general. That's not to say that
being more explicit in the docs wouldn't be a good thing anyway. For
the "passing unknown parameters through" case, I think having a means
to disable prefix matching is good (and it's a reasonably important
use case).

It *does* seem to me to be obvious from the name that an alternative
method named "parse_known_args" is the way to disable prefix matching.
So I think it needs to be clearly documented that it *isn't* and
precisely what purpose it is intended to serve. And it definitely
doesn't do that at the moment.

My feeling is that the following changes would be good:

1. Clarify in the documentation of parse_known_args that the function
does not behave any differently from parse_args in terms of what
arguments it accepts (and explicitly give an example showing that
prefix arguments *are* consumed). This can be done for 3.4 as it
doesn't change functionality.
2. Add a keyword argument to parse_known_args that allows the user to
disable prefix matching (I'd suggest match_prefix with a default of
True). This would be for 3.5.

Paul
___
Python-Dev mailing list
[email protected]
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] potential argparse problem: bad mix of parse_known_args and prefix matching

2013-11-26 Thread Robert Kern

On 2013-11-26 18:16, Eli Bendersky wrote:


FWIW I'm not advocating a breaking behavior change here - I fully realize the
ship has sailed. I'm interested in mitigation actions, though. Making the
documentation explain this explicitly + adding an option to disable prefix
matching (in 3.5 since we're past the 3.4 features point) will go a long way for
alleviating this gotcha.


There is already the One Obvious Way to handle situations like yours: the user 
uses "--" to mark that the remaining arguments are pure arguments and not --options.


  parent-script --verbose -- ./child_script.py --no-verbose --etc

This is a common idiom across many argument processors. parse_known_args() is 
not a good solution for this situation even if you mitigate the prefix issue. 
Exact matches of --options are still possible.


--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
 that is made terrible by our own mad attempt to interpret it as though it had
 an underlying truth."
  -- Umberto Eco

___
Python-Dev mailing list
[email protected]
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Accepting PEP 3154 for 3.4?

2013-11-26 Thread Larry Hastings

On 11/18/2013 07:48 AM, Guido van Rossum wrote:
Clearly the framing is the weakest point of the PEP (== elicits the 
most bikeshedding).


Indeed--it is still ongoing:

   http://bugs.python.org/issue19780



//arry/
___
Python-Dev mailing list
[email protected]
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com