[issue2650] re.escape should not escape underscore

2011-04-10 Thread Ezio Melotti
Changes by Ezio Melotti: resolution: -> fixed; stage: needs patch -> committed/rejected; status: open -> closed

[issue2650] re.escape should not escape underscore

2011-04-10 Thread Roundup Robot
Roundup Robot added the comment: New changeset dda33191f7f5 by Ezio Melotti in branch 'default': #2650: re.escape() no longer escapes the "_". http://hg.python.org/cpython/rev/dda33191f7f5

[issue2650] re.escape should not escape underscore

2011-04-03 Thread Ezio Melotti
Ezio Melotti added the comment: Georg, do you think a versionchanged note should be added for this? The change is minor and the patch updates the documentation to reflect the change.

[issue2650] re.escape should not escape underscore

2011-03-25 Thread Ezio Melotti
Ezio Melotti added the comment: The attached patch (issue2650.diff) adds '_' to the list of chars that are not escaped. -- keywords: +patch Added file: http://bugs.python.org/file21390/issue2650.diff

[issue2650] re.escape should not escape underscore

2011-03-25 Thread Ezio Melotti
Ezio Melotti added the comment: I did a few more tests and using a re.sub seems indeed slower (the implementation is just 4 lines though, and it's more readable): wolf@hp:~/dev/py/3.1$ ./python -m timeit -s 'import re,string; escape_pattern = re.compile("([^\x00a-zA-Z0-9])")' 'escape_pattern.
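A comparison of this kind can also be run from a script with the standard timeit module. The following is a minimal sketch, not the code that was benchmarked in the thread: the names escape_loop and escape_sub are hypothetical, the special handling of the NUL character is left out, and absolute timings will differ between machines and Python versions.

```python
import re
import string
import timeit

# Characters left unescaped in this sketch (an assumption, not the stdlib list).
_SAFE = frozenset(string.ascii_letters + string.digits)

def escape_loop(s):
    """Escape every character outside _SAFE with a per-character loop."""
    return "".join(c if c in _SAFE else "\\" + c for c in s)

# Precompiled pattern matching any single character outside the safe range.
_PATTERN = re.compile(r"([^a-zA-Z0-9])")

def escape_sub(s):
    """Same rule, expressed as one regular-expression substitution."""
    return _PATTERN.sub(r"\\\1", s)

sample = "!@#$%^&*()" * 8

# Both variants must agree before their speed is worth comparing.
assert escape_loop(sample) == escape_sub(sample)

for fn in (escape_loop, escape_sub):
    best = min(timeit.repeat(lambda: fn(sample), number=1000, repeat=3))
    print(f"{fn.__name__}: {best:.4f} s per 1000 calls")
```

The equality check matters more than the numbers: a benchmark of two functions that do not produce the same output says little about either.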

[issue2650] re.escape should not escape underscore

2011-03-25 Thread Roundup Robot
Roundup Robot added the comment: New changeset d52b1faa7b11 by Ezio Melotti in branch '2.7': #2650: Refactor re.escape and its tests. http://hg.python.org/cpython/rev/d52b1faa7b11

[issue2650] re.escape should not escape underscore

2011-03-25 Thread Roundup Robot
Roundup Robot added the comment: New changeset 1402c719b7cf by Ezio Melotti in branch '3.1': #2650: Refactor the tests for re.escape. http://hg.python.org/cpython/rev/1402c719b7cf New changeset 9147f7ed75b3 by Ezio Melotti in branch '3.1': #2650: Add tests with non-ascii chars for re.escape. ht

[issue2650] re.escape should not escape underscore

2011-03-14 Thread Ezio Melotti
Ezio Melotti added the comment: re.escape and its tests can be refactored in 2.7/3.1, the '_' can be added to the list of chars that are not escaped in 3.3. I'll put together a patch and fix this unless someone thinks that the '_' should be escaped in 3.3 too. -- assignee: -> ezio.m

[issue2650] re.escape should not escape underscore

2011-03-14 Thread SilentGhost
SilentGhost added the comment: I think these are two different questions: 1. What to escape 2. What to do about poor performance of the re.escape when re.sub is used In my opinion, there isn't any justifiable reason to escape non-meta characters: it doesn't affect matching; escaped strings a

[issue2650] re.escape should not escape underscore

2011-03-13 Thread Ezio Melotti
Ezio Melotti added the comment: I took a look at what other languages do, and it turned out that: Perl escapes [^A-Za-z_0-9] [0]; .NET escapes the metachars and whitespace [1]; Java escapes the metachars or escape sequences [2]; Ruby escapes the metachars [3]. It might be OK to exclude _ from
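The Perl rule quoted above (escape everything matching [^A-Za-z_0-9]) is easy to express in Python. The sketch below is an illustration of that ASCII rule only; the function name quotemeta_like is hypothetical, and Perl's real quotemeta has additional behaviour for non-ASCII text that is not modelled here.

```python
import re

# Any single character that is not an ASCII letter, digit, or underscore.
_NON_WORD = re.compile(r"([^A-Za-z_0-9])")

def quotemeta_like(s):
    """Prefix every non-word character with a backslash."""
    return _NON_WORD.sub(r"\\\1", s)

print(quotemeta_like("a_b.c"))  # a_b\.c
```

Under this rule the underscore is left alone while the dot is escaped, which is the behaviour the issue asks re.escape to adopt.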

[issue2650] re.escape should not escape underscore

2011-03-12 Thread SilentGhost
SilentGhost added the comment: Here is the latest patch for test_re incorporating review suggestions by Ezio and some improvements along the way. -- Added file: http://bugs.python.org/file21096/test_re.diff

[issue2650] re.escape should not escape underscore

2011-02-23 Thread SilentGhost
Changes by SilentGhost: Removed file: http://bugs.python.org/file20389/test_re.diff

[issue2650] re.escape should not escape underscore

2011-02-23 Thread SilentGhost
Changes by SilentGhost: Added file: http://bugs.python.org/file20860/test_re.diff

[issue2650] re.escape should not escape underscore

2011-01-13 Thread A.M. Kuchling
Changes by A.M. Kuchling: nosy: -akuchling

[issue2650] re.escape should not escape underscore

2011-01-13 Thread yeswanth
yeswanth added the comment: @James test results for py3k python -m timeit -s "$(printf "import re\ndef escape(s):\n return re.sub('([][.^$*+?{}\\|()])', '\\\1', s)")" 'escape("!@#$%^&*()!@#$%^&*()!@#$%^&*()!@#$%^&*()!@#$%^&*()!@#$%^&*()!@#$%^&*()!@#$%^&*()")' 10 loops, best of 3: 17.1

[issue2650] re.escape should not escape underscore

2011-01-13 Thread James Y Knight
James Y Knight added the comment: Right you are, it seems that python's regexp implementation is terribly slow when doing replacements with a substitution in them. (fixing the broken test, as you pointed out changed the timing to 97.6 usec vs the in-error-reported 18.3usec.) Oh well. I still

[issue2650] re.escape should not escape underscore

2011-01-13 Thread SilentGhost
SilentGhost added the comment: James, I think the setup statement should have been: "import re\ndef escape(s):\n return re.sub(r'([][.^$*+?{}\\|()])', r'\\\1', s)")" note the raw string literals. The timings that I got after applying file20388 (http://bugs.python.org/file20388/issue2650.dif
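The effect of the missing raw-string prefix can be seen by inspecting the two replacement strings directly. This is a small sketch in Python 3; the pattern is the one quoted in the comment above, and the function name escape is taken from the same setup statement.

```python
import re

plain = '\\\1'   # non-raw: a backslash followed by the control character \x01
raw = r'\\\1'    # raw: four characters, read by re as "literal backslash, group 1"

assert plain == '\\' + '\x01'
assert len(plain) == 2
assert len(raw) == 4

def escape(s):
    # With raw literals the metacharacter class and the group reference
    # reach the re module unchanged.
    return re.sub(r'([][.^$*+?{}\\|()])', r'\\\1', s)

print(escape("a.b"))  # a\.b
```

Without the r prefix, Python itself consumes the `\1` as an octal escape before the re module sees it, so the substitution no longer inserts the matched character and the timing measures different work.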

[issue2650] re.escape should not escape underscore

2011-01-13 Thread James Y Knight
James Y Knight added the comment: Show your speed test? Looks 2.5x faster to me. But I'm running this on python 2.6, so I guess it's possible that the re module's speed was decimated in Py3k. python -m timeit -s "$(printf "import re\ndef escape(s):\n return re.sub('([][.^$*+?{}\\|()])', '\\\

[issue2650] re.escape should not escape underscore

2011-01-13 Thread SilentGhost
SilentGhost added the comment: The naïve version of the proposed code was about 3 times slower than the existing version. However, the test, I think, is valuable enough, so I'm reinstating it. -- Added file: http://bugs.python.org/file20389/test_re.diff

[issue2650] re.escape should not escape underscore

2011-01-13 Thread SilentGhost
Changes by SilentGhost: Removed file: http://bugs.python.org/file20388/issue2650.diff

[issue2650] re.escape should not escape underscore

2011-01-13 Thread SilentGhost
SilentGhost added the comment: Here is the patch, including adjustment to the test. -- Added file: http://bugs.python.org/file20388/issue2650.diff

[issue2650] re.escape should not escape underscore

2011-01-12 Thread SilentGhost
Changes by SilentGhost: nosy: +SilentGhost

[issue2650] re.escape should not escape underscore

2011-01-12 Thread Antoine Pitrou
Antoine Pitrou added the comment:
> As James said I have written the patch using only regular expressions.
> This is going to be my first patch. I need help writing the test for it
You will find the current tests in Lib/test/test_re.py. To execute them, run: $ ./python -m test.regrtest -v te

[issue2650] re.escape should not escape underscore

2011-01-12 Thread yeswanth
yeswanth added the comment: As James said, I have written the patch using only regular expressions. This is going to be my first patch. I need help writing the test for it.

[issue2650] re.escape should not escape underscore

2011-01-11 Thread yeswanth
Changes by yeswanth: nosy: +swamiyeswanth

[issue2650] re.escape should not escape underscore

2011-01-08 Thread Antoine Pitrou
Antoine Pitrou added the comment: James, could you propose a proper patch? Even better if you also give a couple of timing results, just for the record? -- versions: +Python 3.2 -Python 2.7, Python 3.1

[issue2650] re.escape should not escape underscore

2011-01-08 Thread Georg Brandl
Georg Brandl added the comment: The loop looks strange to me too, not to mention inefficient compared with a regex replacement done in C. -- nosy: +georg.brandl

[issue2650] re.escape should not escape underscore

2011-01-07 Thread James Y Knight
James Y Knight added the comment: I just ran into the impl of escape after being surprised that '/' was being escaped, and then was completely amazed that it wasn't just implemented as a one-line re.subn. Come on, a loop for string replacement? This is *in* the freaking re module for pete's s

[issue2650] re.escape should not escape underscore

2010-11-25 Thread Matthew Barnett
Matthew Barnett added the comment: Re the regex module (issue #2636), would a good compromise be: regex.escape(user_input, special_only=True) to maintain compatibility? -- nosy: +mrabarnett

[issue2650] re.escape should not escape underscore

2009-09-12 Thread Björn Lindqvist
Björn Lindqvist added the comment: In my app, I need to transform the regexp created from user input so that it matches unicode characters with their ascii equivalents. For example, if someone searches for "el nino", that should match the string "el ñino". Similarly, searching for "el ñino" shou

[issue2650] re.escape should not escape underscore

2009-04-29 Thread Ezio Melotti
Changes by Ezio Melotti: nosy: +ezio.melotti

[issue2650] re.escape should not escape underscore

2008-09-28 Thread Jeffrey C. Jacobs
Changes by Jeffrey C. Jacobs <[EMAIL PROTECTED]>: nosy: +timehorse

[issue2650] re.escape should not escape underscore

2008-09-28 Thread Jeffrey C. Jacobs
Changes by Jeffrey C. Jacobs <[EMAIL PROTECTED]>: versions: +Python 2.7, Python 3.1 -Python 2.6, Python 3.0

[issue2650] re.escape should not escape underscore

2008-06-28 Thread Morten Lied Johansen
Morten Lied Johansen <[EMAIL PROTECTED]> added the comment: In my particular case, we were passing the regex on to a database which has regex support syntactically equal to Python, so it seemed natural to use re.escape to make sure we weren't matching against the pattern we really wanted. The

[issue2650] re.escape should not escape underscore

2008-06-28 Thread Antoine Pitrou
Antoine Pitrou <[EMAIL PROTECTED]> added the comment: > The escaped regexp is not utf-8 (why should it be?) I suppose it is annoying if you want to print the escaped regexp for debugging purposes. Anyway, I suppose someone should really decide if improving re.escape is worth it, and if not, clo

[issue2650] re.escape should not escape underscore

2008-06-26 Thread Amaury Forgeot d'Arc
Amaury Forgeot d'Arc <[EMAIL PROTECTED]> added the comment: The escaped regexp is not utf-8 (why should it be?), but it still matches the same bytes in the searched text, which has to be utf-8 encoded anyway: >>> text = u"été".encode('utf-8') >>> regexp = u"é".encode('utf-8') >>> re.findall(rege
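The claim that an escaped, UTF-8 encoded pattern still matches the same bytes can be checked directly. The sketch below restates the interactive session above in Python 3 syntax (bytes objects instead of Python 2 byte strings); how many backslashes re.escape inserts depends on the Python version, but the match result should not.

```python
import re

E_ACUTE = "\u00e9"  # the character "é"

text = (E_ACUTE + "t" + E_ACUTE).encode("utf-8")   # "été" as UTF-8 bytes
regexp = re.escape(E_ACUTE.encode("utf-8"))        # escaped two-byte sequence

# Escaped or not, each byte of the pattern matches itself literally,
# so both occurrences of the encoded character are found.
matches = re.findall(regexp, text)
print(matches)
```

The escaped pattern may not be valid UTF-8 on versions that put a backslash before each byte, which is the readability concern raised elsewhere in the thread; matching is unaffected.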

[issue2650] re.escape should not escape underscore

2008-06-26 Thread Morten Lied Johansen
Morten Lied Johansen <[EMAIL PROTECTED]> added the comment: One issue with the current implementation, which I can't see has been commented on here, is that it kills utf8 characters (and probably those of every other multi-byte character encoding). A é character in a utf8 encoded string w

[issue2650] re.escape should not escape underscore

2008-06-14 Thread Antoine Pitrou
Antoine Pitrou <[EMAIL PROTECTED]> added the comment: Talking about performance, why use a loop to escape special characters when you could use a regular expression to escape them all at once? -- nosy: +pitrou

[issue2650] re.escape should not escape underscore

2008-05-08 Thread A.M. Kuchling
A.M. Kuchling <[EMAIL PROTECTED]> added the comment: I haven't assessed the patch, but wouldn't mind to see it applied to an alpha release or to 3.0; +0 from me. Given that the next 2.6 release is planned to be a beta, though, the release manager would have to rule. Note that I don't think thi

[issue2650] re.escape should not escape underscore

2008-05-08 Thread Russ Cox
Russ Cox <[EMAIL PROTECTED]> added the comment: On Thu, May 8, 2008 at 12:12 PM, Alexander Belopolsky <[EMAIL PROTECTED]> wrote: > > Alexander Belopolsky <[EMAIL PROTECTED]> added the comment: > > On Thu, May 8, 2008 at 11:45 AM, Russ Cox <[EMAIL PROTECTED]> wrote: > .. >> My argument is only th

[issue2650] re.escape should not escape underscore

2008-05-08 Thread Alexander Belopolsky
Alexander Belopolsky <[EMAIL PROTECTED]> added the comment: On Thu, May 8, 2008 at 11:45 AM, Russ Cox <[EMAIL PROTECTED]> wrote: .. > My argument is only that Python should behave the same in > this respect as other systems that use substantially the same > regular expressions. > This is not

[issue2650] re.escape should not escape underscore

2008-05-08 Thread Russ Cox
Russ Cox <[EMAIL PROTECTED]> added the comment: > You don't need to get so defensive. I did not raise a performance > problem, I was simply responding to Rafael's "AFAIK the lookup on > dictionaries is faster than on lists" comment. I did not say that you > *should* rewrite your patch the way I

[issue2650] re.escape should not escape underscore

2008-05-08 Thread Alexander Belopolsky
Alexander Belopolsky <[EMAIL PROTECTED]> added the comment: On Thu, May 8, 2008 at 10:36 AM, Russ Cox <[EMAIL PROTECTED]> wrote: .. > The title of this issue (#2650) is "re.escape should not escape underscore", > not "re.escape is too slow and too easy to read". > Neither does the title say "r

[issue2650] re.escape should not escape underscore

2008-05-08 Thread Russ Cox
Russ Cox <[EMAIL PROTECTED]> added the comment: > Lorenz's patch uses a set, not a list for special characters. Set > lookup is as fast as dict lookup, but a set takes less memory because it > does not have to store dummy values. More importantly, use of frozenset > instead of dict makes the

[issue2650] re.escape should not escape underscore

2008-05-08 Thread Alexander Belopolsky
Alexander Belopolsky <[EMAIL PROTECTED]> added the comment: Lorenz's patch uses a set, not a list for special characters. Set lookup is as fast as dict lookup, but a set takes less memory because it does not have to store dummy values. More importantly, use of frozenset instead of dict makes
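The set-versus-dict comparison can be made concrete with a short sketch. The list of metacharacters below is an assumption made for illustration and is not taken from any of the attached patches.

```python
# Hypothetical list of metacharacters, chosen for this sketch only.
METACHARS = r"\.^$*+?{}[]|()"

special_set = frozenset(METACHARS)
special_dict = dict.fromkeys(METACHARS)  # every value is a dummy None

# Membership tests are hash lookups in both containers.
assert "." in special_set and "." in special_dict
assert "_" not in special_set and "_" not in special_dict

# A frozenset cannot be changed after creation: it has no add() method.
try:
    special_set.add("_")
except AttributeError:
    print("frozenset is immutable")
```

The dict version carries a value slot that is never read, whereas the frozenset states directly that only membership matters.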

[issue2650] re.escape should not escape underscore

2008-05-07 Thread Rafael Zanella
Rafael Zanella <[EMAIL PROTECTED]> added the comment: AFAIK the lookup on dictionaries is faster than on lists. Patch added, mainly a compilation of the previous patches with an expanded test. -- nosy: +zanella Added file: http://bugs.python.org/file10215/re_patch.diff

[issue2650] re.escape should not escape underscore

2008-04-28 Thread Lorenz Quack
Lorenz Quack <[EMAIL PROTECTED]> added the comment:
>> The loop in escape should really use enumerate
>> instead of "for i in range(len(pattern))".
>
> It needs i to edit s[i].
enumerate(iterable) returns a tuple for each element in iterable containing the index and the element itself. I attach
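The point about enumerate can be illustrated with a small loop-based escaper. This is a sketch only, not the patch attached to the issue: the name escape_loop and the _SAFE set are hypothetical, and the special handling of the NUL character is omitted.

```python
import string

# Hypothetical set of characters to leave unescaped (an assumption for this sketch).
_SAFE = frozenset(string.ascii_letters + string.digits + "_")

def escape_loop(pattern):
    s = list(pattern)
    # enumerate yields (index, character) pairs, so the index is still
    # available for editing s[i] without range(len(pattern)).
    for i, c in enumerate(pattern):
        if c not in _SAFE:
            s[i] = "\\" + c
    return "".join(s)

print(escape_loop("a_b.c"))  # a_b\.c
```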

[issue2650] re.escape should not escape underscore

2008-04-24 Thread Russ Cox
Russ Cox <[EMAIL PROTECTED]> added the comment:
> The loop in escape should really use enumerate
> instead of "for i in range(len(pattern))".
It needs i to edit s[i].
> Instead of using a loop, can't the test just
> use "self.assertEqual(re.escape(same), same)?"
Done.
> Also, please add tes

[issue2650] re.escape should not escape underscore

2008-04-23 Thread Benjamin Peterson
Benjamin Peterson <[EMAIL PROTECTED]> added the comment: Thanks. The loop in escape should really use enumerate instead of "for i in range(len(pattern))". Instead of using a loop, can't the test just use "self.assertEqual(re.escape(same), same)?" Also, please add tests for what re.escape should
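A test along the suggested lines might look like the sketch below. It is not the test that was committed to Lib/test/test_re.py: the class and method names are hypothetical, and the first test assumes an interpreter in which the underscore is no longer escaped (3.3 and later, according to this thread).

```python
import re
import string
import unittest

class EscapeTests(unittest.TestCase):
    def test_word_characters_unchanged(self):
        # One assertEqual on the whole string replaces a per-character loop.
        same = string.ascii_letters + string.digits + "_"
        self.assertEqual(re.escape(same), same)

    def test_escaped_pattern_matches_literal(self):
        # Whatever gets escaped, the result must match the original character.
        for c in map(chr, range(128)):
            self.assertIsNotNone(re.fullmatch(re.escape(c), c))

# Run the two tests explicitly so the sketch also works outside a test runner.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(EscapeTests)
result = unittest.TextTestRunner(verbosity=2).run(suite)
```

The second test checks the property callers rely on rather than the exact set of escaped characters, so it should keep passing if that set changes again.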

[issue2650] re.escape should not escape underscore

2008-04-23 Thread Russ Cox
Changes by Russ Cox <[EMAIL PROTECTED]>: keywords: +patch Added file: http://bugs.python.org/file10080/re.patch

[issue2650] re.escape should not escape underscore

2008-04-17 Thread Benjamin Peterson
Benjamin Peterson <[EMAIL PROTECTED]> added the comment: Would you like to work on a patch?

[issue2650] re.escape should not escape underscore

2008-04-17 Thread Russ Cox
Russ Cox <[EMAIL PROTECTED]> added the comment: > It seems that escape is pretty dumb. The documentation says that > re.escape escapes all non-alphanumeric characters, and it does that > faithfully. It would seem more useful to have a list of meta-characters > and just escape those. This is more

[issue2650] re.escape should not escape underscore

2008-04-17 Thread Benjamin Peterson
Benjamin Peterson <[EMAIL PROTECTED]> added the comment: It seems that escape is pretty dumb. The documentation says that re.escape escapes all non-alphanumeric characters, and it does that faithfully. It would seem more useful to have a list of meta-characters and just escape those. This is mor

[issue2650] re.escape should not escape underscore

2008-04-17 Thread Guido van Rossum
Changes by Guido van Rossum <[EMAIL PROTECTED]>: versions: +Python 2.6, Python 3.0 -Python 2.5

[issue2650] re.escape should not escape underscore

2008-04-17 Thread Guido van Rossum
Changes by Guido van Rossum <[EMAIL PROTECTED]>: keywords: +easy

[issue2650] re.escape should not escape underscore

2008-04-17 Thread Russ Cox
Changes by Russ Cox <[EMAIL PROTECTED]>: components: +Regular Expressions

[issue2650] re.escape should not escape underscore

2008-04-17 Thread Russ Cox
New submission from Russ Cox <[EMAIL PROTECTED]>:

    import re
    print re.escape("_")

Prints \_ but should be _. This behavior differs from Perl and other systems: _ is an identifier character and as such does not need to be escaped. -- messages: 65585 nosy: rsc severity: normal status: ope
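For reference, the reported behaviour can be checked with a short script. This is a sketch in Python 3 syntax (the report uses the Python 2 print statement); on interpreters that include the change made for this issue (3.3 and later, according to the thread) the underscore comes back unescaped.

```python
import re

# After the change discussed in this issue, "_" is returned as-is.
escaped = re.escape("_")
print(escaped)

# Escaped or not, the resulting pattern matches a literal underscore,
# which is why the extra backslash was harmless but unnecessary.
assert re.fullmatch(escaped, "_") is not None
```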