[Python-ideas] Re: Fwd: Re: Fwd: re.findfirst()

MRAB Fri, 06 Dec 2019 11:54:47 -0800

On 2019-12-06 18:24, Andrew Barnert via Python-ideas wrote:

On Dec 6, 2019, at 09:51, Random832 wrote:


If match objects are too hard to use, maybe they should be made more 
user-friendly? What about adding str and iterable semantics to match objects so 
it can be used as str(re.search(...)); tuple(re.search(...)); a, b = 
re.search(...)?


That’s a clever idea, and it might work.

1. Match objects are also returned by re.match, and you wouldn't expect that to look for more matches.

2. What would tuple(re.search(...)) do? Wouldn't it do the same as tuple(re.findall(...))?

3. a, b = re.search(...) would fail if it didn't return exactly 2 matches, and it would keep looking after the second match for a third match because that's how assigning from an iterator currently works - it's iterated until it's exhausted.
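To make point 3 concrete, here is a minimal sketch of how this behaves today, assuming a made-up sample string and pattern (neither comes from the thread). It uses groups() as the explicit spelling for unpacking one match, and re.finditer to show what happens when an iterator of matches yields more items than there are names to assign to.

```python
import re

# Hypothetical sample data, chosen only for illustration.
text = "x=1, y=2, z=3"
pattern = r"(\w)=(\d)"

# One match, unpacked explicitly through groups().
m = re.search(pattern, text)
name, value = m.groups()

# findall returns one tuple per match when the pattern has 2+ groups.
all_pairs = re.findall(pattern, text)

# Unpacking an iterator of matches into two names fails here,
# because the iterator still has a third match to give.
try:
    a, b = re.finditer(pattern, text)
except ValueError as exc:
    error = str(exc)

print(name, value)
print(all_pairs)
print(error)
```

So whether a, b = re.search(...) means "the groups of one match" or "the first two matches" is exactly the ambiguity being raised here.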

For iteration, the only question is what it returns when there’s only one 
capture group. If you do that with the findall entries you’ll get a tuple of 
the characters in the string, rather than a single-element tuple. I don’t think 
that’s behavior anyone would actually want for tuple(match) if we were 
designing the whole re module API from scratch. But would it be too 
inconsistent if you didn’t do it that way?
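A short sketch of the inconsistency, using a hypothetical input string: with a single capture group, findall returns plain strings rather than 1-tuples, so calling tuple() on an entry splits it into characters, while groups() on a match object always returns a tuple.

```python
import re

# Hypothetical input, for illustration only.
text = "key=alpha key=beta"

# One capture group: findall returns a list of strings.
one_group = re.findall(r"key=(\w+)", text)

# Two capture groups: findall returns a list of tuples.
two_groups = re.findall(r"(\w+)=(\w+)", text)

# tuple() on a one-group entry iterates over the string's characters.
chars = tuple(one_group[0])

# groups() on a match object is a 1-tuple in the same situation.
single = re.search(r"key=(\w+)", text).groups()

print(one_group)
print(two_groups)
print(chars)
print(single)
```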

For string, str(match) already works, and sometimes provides useful debugging info. 
At the REPL this is probably no big deal (it’s easier to dump the repr than the str 
anyway), but what about logs? For example, I’ve got a parse error on a request, and 
my logs tell me the last successful match was <_sre.SRE_Match object; span=(21137, 
21142), match='alpha'>, so I know to look around 21137 characters into the request 
to find the problem. After upgrading Python, the logs would just say alpha, which 
wouldn’t help me. I’d have to go change the code to log %r instead of %s (or, maybe, 
stop being so hacky and explicitly log the span and groups, and also log where the 
failed search started rather than guessing from the previous one, and make the parser 
give useful errors in the first place, etc.) before I could debug future requests. 
You’re not supposed to even rely on repr being consistent across Python 
implementations and versions, much less on str being developer- rather than 
user-friendly, but sometimes people do, and sometimes we all have to deal with their 
code. I don’t think this is a huge objection, but it is worth figuring out how often 
and how badly people would be affected.
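As a rough illustration of the logging concern, with a made-up request string: str(match) currently falls back to the repr, so %s and %r produce the same line. Logging the span and the matched text explicitly would not depend on either one. The format_match helper below is hypothetical, not part of the re module.

```python
import re

# Hypothetical request body, for illustration only.
request = "xx alpha yy"
m = re.search(r"alpha", request)

# Match objects define no __str__ of their own, so str() falls back to repr().
same_today = str(m) == repr(m)

def format_match(match):
    """Hypothetical helper: describe a match without relying on str/repr."""
    start, end = match.span()
    return f"span=({start}, {end}) text={match.group(0)!r}"

line = format_match(m)
print(same_today)
print(line)
```

A log line built this way would keep the position information even if str(match) were changed to return only the matched text.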

