[issue46410] TypeError when parsing regexp with unicode named character sequence escape

2022-01-18 Thread Matthew Barnett
Matthew Barnett added the comment: They're not supported in string literals either: Python 3.10.1 (tags/v3.10.1:2cd268a, Dec 6 2021, 19:10:37) [MSC v.1929 64 bit (AMD64)] on win32 Type "help", "copyright", "credits" or "license" for more inf

[issue46515] Benefits Of Phool Makhana

2022-01-25 Thread Matthew Barnett
Change by Matthew Barnett : -- stage: -> resolved status: open -> closed ___ Python tracker <https://bugs.python.org/issue46515> ___ ___ Python-bugs-list

[issue46627] Regex hangs indefinitely

2022-02-03 Thread Matthew Barnett
Matthew Barnett added the comment: That pattern has: (?P[^]]+)+ Is that intentional? It looks wrong to me. -- ___ Python tracker <https://bugs.python.org/issue46

[issue46825] slow matching on regular expression

2022-02-22 Thread Matthew Barnett
Matthew Barnett added the comment: The expression is a repeated alternative where the first alternative is a repeat. Repeated repeats can result in a lot of attempts and backtracking and should be avoided. Try this instead: (0|1(01*0)*1
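A rough illustration of the point about repeated repeats. The patterns below are hypothetical stand-ins, not the expression from the report: a repeat nested inside another repeat accepts the same strings as a single repeat, but on a failing input the nested form may try many ways of dividing the run between the inner and outer repeat before giving up.

```python
import re

# Hypothetical patterns chosen for illustration only.
nested = re.compile(r"(?:a+)+b")   # a repeat inside a repeat
flat = re.compile(r"a+b")          # same set of matches, one repeat

matching = "a" * 16 + "b"
failing = "a" * 16                 # no trailing "b", so both patterns must fail

nested_ok = nested.fullmatch(matching) is not None
flat_ok = flat.fullmatch(matching) is not None

# Both fail, but `nested` can backtrack over many splits of the "a" run first.
nested_fail = nested.fullmatch(failing)
flat_fail = flat.fullmatch(failing)
```

The input is kept short on purpose; lengthening `failing` is where the difference in running time would be expected to show up.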

[issue13169] Regular expressions with 0 to 65536 repetitions raises OverflowError

2011-10-13 Thread Matthew Barnett
Matthew Barnett added the comment: The quantifiers use 65535 to represent no upper limit, so ".{0,65535}" is equivalent to ".*". For example: >>> re.match(".*", "x" * 10).span() (0, 10) >>> re.match(".{0,65535}", "

[issue13169] Regular expressions with 0 to 65536 repetitions raises OverflowError

2011-10-14 Thread Matthew Barnett
Matthew Barnett added the comment: The limit is an implementation detail. The pattern is compiled into codes which are then interpreted, and it just happens that the codes are (usually) 16 bits, giving a range of 0..65535, but it uses 65535 to represent no limit and doesn't warn i

[issue13592] repr(regex) doesn't include actual regex

2011-12-13 Thread Matthew Barnett
Matthew Barnett added the comment: In reply to Ezio, the repr of a large string, list, tuple or dict is also long. The repr of a compiled regex should probably also show the flags, but should it just be the numeric value? -- ___ Python tracker

[issue13592] repr(regex) doesn't include actual regex

2011-12-13 Thread Matthew Barnett
Matthew Barnett added the comment: Actually, one possibility that occurs to me is to provide the flags within the pattern. The .pattern attribute gives the original pattern, but repr could give the flags in-line at the start of the pattern: >>> # Assuming Python 3. >>>

[issue13592] repr(regex) doesn't include actual regex

2011-12-22 Thread Matthew Barnett
Matthew Barnett added the comment: I'm just adding this to the regex module and I've come up against a possible issue. The regex module supports named lists, which could be very big. Should the entire contents of those lists also be shown in the repr? They would have to be if the

[issue13652] Creating lambda functions in a loop has unexpected results when resolving variables used as arguments

2011-12-22 Thread Matthew Barnett
Matthew Barnett added the comment: That's not a bug. This might help to explain what's going on: What do (lambda) function closures capture in Python? http://stackoverflow.com/questions/2295290/what-do-lambda-function-closures-capture-in-python -- nosy: +
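A minimal sketch of the behaviour being described, assuming the usual case of lambdas created in a loop (the variable names are illustrative): a closure looks up the loop variable when it is called, not when it is defined, so binding the current value through a default argument is one common workaround.

```python
# Late binding: each lambda reads `i` at call time, after the loop has finished.
late = [lambda: i for i in range(3)]
late_results = [f() for f in late]

# A default argument is evaluated at definition time, so each lambda keeps its own value.
bound = [lambda i=i: i for i in range(3)]
bound_results = [f() for f in bound]

print(late_results)   # [2, 2, 2]
print(bound_results)  # [0, 1, 2]
```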

[issue12177] re.match raises MemoryError

2011-05-25 Thread Matthew Barnett
Matthew Barnett added the comment: This also raises MemoryError: re.match(r'()*?1', 'a1') but none of these do: re.match(r'()+1', 'a1') re.match(r'()*1', 'a1') -- nosy: +mrabarnett ___

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2011-07-11 Thread Matthew Barnett
Matthew Barnett added the comment: The new regex implementation is hosted here: https://code.google.com/p/mrab-regex-hg/ The span of m['a_thing'] is m.span('a_thing'), if that helps. The named groups are listed on the pattern object, which can be accessed via m.re: >

[issue12671] urlopen returning empty string

2011-07-31 Thread Matthew Barnett
New submission from Matthew Barnett : Someone over at StackOverflow had a problem with urlopen in Python 3.2.1: http://stackoverflow.com/questions/6892573/problem-with-urlopen/6892843#6892843 This is the code: from urllib.request import urlopen f = urlopen('http://online.ws

[issue12671] urlopen returning empty string

2011-07-31 Thread Matthew Barnett
Matthew Barnett added the comment: Just been told this bug has already been reported as issue #12576. -- resolution: -> duplicate ___ Python tracker <http://bugs.python.org/issu

[issue12671] urlopen returning empty string

2011-07-31 Thread Matthew Barnett
Changes by Matthew Barnett : -- status: open -> closed ___ Python tracker <http://bugs.python.org/issue12671> ___ ___ Python-bugs-list mailing list Unsubscri

[issue12728] Python re lib fails case insensitive matches on Unicode data

2011-08-12 Thread Matthew Barnett
Changes by Matthew Barnett : -- nosy: +mrabarnett ___ Python tracker <http://bugs.python.org/issue12728> ___ ___ Python-bugs-list mailing list Unsubscribe:

[issue12729] Python lib re cannot handle Unicode properly due to narrow/wide bug

2011-08-12 Thread Matthew Barnett
Changes by Matthew Barnett : -- nosy: +mrabarnett ___ Python tracker <http://bugs.python.org/issue12729> ___ ___ Python-bugs-list mailing list Unsubscribe:

[issue12730] Python's casemapping functions are untrustworthy due to narrow/wide build issues

2011-08-12 Thread Matthew Barnett
Changes by Matthew Barnett : -- nosy: +mrabarnett ___ Python tracker <http://bugs.python.org/issue12730> ___ ___ Python-bugs-list mailing list Unsubscribe:

[issue12731] python lib re uses obsolete sense of \w in full violation of UTS#18 RL1.2a

2011-08-12 Thread Matthew Barnett
Changes by Matthew Barnett : -- nosy: +mrabarnett ___ Python tracker <http://bugs.python.org/issue12731> ___ ___ Python-bugs-list mailing list Unsubscribe:

[issue12732] Can't portably use Unicode in Python identifiers

2011-08-12 Thread Matthew Barnett
Changes by Matthew Barnett : -- nosy: +mrabarnett ___ Python tracker <http://bugs.python.org/issue12732> ___ ___ Python-bugs-list mailing list Unsubscribe:

[issue12733] Request for grapheme support in Python re lib

2011-08-12 Thread Matthew Barnett
Changes by Matthew Barnett : -- nosy: +mrabarnett ___ Python tracker <http://bugs.python.org/issue12733> ___ ___ Python-bugs-list mailing list Unsubscribe:

[issue12734] Request for property support in Python re lib

2011-08-12 Thread Matthew Barnett
Changes by Matthew Barnett : -- nosy: +mrabarnett ___ Python tracker <http://bugs.python.org/issue12734> ___ ___ Python-bugs-list mailing list Unsubscribe:

[issue12735] request full Unicode collation support in std python library

2011-08-12 Thread Matthew Barnett
Changes by Matthew Barnett : -- nosy: +mrabarnett ___ Python tracker <http://bugs.python.org/issue12735> ___ ___ Python-bugs-list mailing list Unsubscribe:

[issue12736] Request for python casemapping functions to use full not simple casemaps per Unicode's recommendation

2011-08-12 Thread Matthew Barnett
Changes by Matthew Barnett : -- nosy: +mrabarnett ___ Python tracker <http://bugs.python.org/issue12736> ___ ___ Python-bugs-list mailing list Unsubscribe:

[issue12729] Python lib re cannot handle Unicode properly due to narrow/wide bug

2011-08-12 Thread Matthew Barnett
Matthew Barnett added the comment: In a narrow build, a codepoint in the astral plane is encoded as a surrogate pair. I could implement a workaround for it in the regex module, but I think that the proper place to fix it is in the language as a whole, perhaps by implementing PEP 393 ("Fle

[issue12729] Python lib re cannot handle Unicode properly due to narrow/wide bug

2011-08-13 Thread Matthew Barnett
Matthew Barnett added the comment: There are occasions when you want to do string slicing, often of the form: pos = my_str.index(x) endpos = my_str.index(y) substring = my_str[pos : endpos] To me that suggests that if UTF-8 is used then it may be worth profiling to see whether caching the

[issue12729] Python lib re cannot handle Unicode properly due to narrow/wide bug

2011-08-13 Thread Matthew Barnett
Matthew Barnett added the comment: You're right about starting the second search from where the first finished. Caching the position would be an advantage there. The memory cost of extra pointers wouldn't be so bad if UTF-8 took less space than the current format. Regex isn'
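A small sketch of the slicing idiom under discussion, with the second search resumed from where the first one finished. The sample string and delimiters are made up for illustration; `str.index` accepts an optional start position.

```python
my_str = "key=value; other=thing"   # hypothetical sample data
x, y = "=", ";"                      # hypothetical delimiters

pos = my_str.index(x)
endpos = my_str.index(y, pos)        # resume at pos instead of rescanning from 0
substring = my_str[pos:endpos]

print(substring)  # =value
```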

[issue12749] lib re cannot match non-BMP ranges (all versions, all builds)

2011-08-14 Thread Matthew Barnett
Matthew Barnett added the comment: On a narrow build, "\N{MATHEMATICAL SCRIPT CAPITAL A}" is stored as 2 code units, and neither re nor regex recombine them when compiling a regex or looking for a match. regex supports \xNN, \u and \U and \N{XYZ} itself, so they can be

[issue12729] Python lib re cannot handle Unicode properly due to narrow/wide bug

2011-08-14 Thread Matthew Barnett
Matthew Barnett added the comment: Have a look here: http://98.245.80.27/tcpc/OSCON2011/gbu/index.html -- ___ Python tracker <http://bugs.python.org/issue12

[issue12729] Python lib re cannot handle Unicode properly due to narrow/wide bug

2011-08-15 Thread Matthew Barnett
Matthew Barnett added the comment: For what it's worth, I've had an idea about string storage, roughly based on how *nix stores data on disk. If a string is small, point to a block of codepoints. If a string is medium-sized, point to a block of pointers to codepoint blocks. If a

[issue12753] \N{...} neglects formal aliases and named sequences from Unicode charnames namespace

2011-08-19 Thread Matthew Barnett
Matthew Barnett added the comment: For the "Line_Break" property, one of the possible values is "Inseparable", with 2 permitted aliases, the shorter "IN" (which is reasonable) and "Inseperable" (ouch!). -- _

[issue12789] re.Scanner don't support more then 2 groups on regex

2011-08-20 Thread Matthew Barnett
Matthew Barnett added the comment: Even if this bug is fixed, it still won't work as you expect, and this is why. The Scanner function accepts a list of 2-tuples. The first item of the tuple is a regex and the second is a function. For example: re.Scanner([(r"\d+", number)

[issue12736] Request for python casemapping functions to use full not simple casemaps per Unicode's recommendation

2011-08-27 Thread Matthew Barnett
Matthew Barnett added the comment: There are some oddities in Unicode case-folding. Under full case-folding, both "\N{LATIN CAPITAL LETTER SHARP S}" and "\N{LATIN SMALL LETTER SHARP S}" fold to "ss", which means that those codepoints match each other. Howe

[issue12736] Request for python casemapping functions to use full not simple casemaps per Unicode's recommendation

2011-08-28 Thread Matthew Barnett
Matthew Barnett added the comment: The regex module currently uses simple case-folding, although I'm working towards full case-folding, as listed in http://www.unicode.org/Public/UNIDATA/CaseFolding.txt. -- ___ Python tracker

[issue2636] Adding a new regex module (compatible with re)

2011-09-01 Thread Matthew Barnett
Matthew Barnett added the comment: The regex module supports nested sets and set operations, eg. r"[[a-z]--[aeiou]]" (the letters from 'a' to 'z', except the vowels). This means that literal '[' in a set needs to be escaped. For example, re module s

[issue2636] Adding a new regex module (compatible with re)

2011-09-01 Thread Matthew Barnett
Matthew Barnett added the comment: I think I need a show of hands. Should the default be old behaviour (like re) or new behaviour? (It might be old now, new later.) Should there be a NEW flag (as at present), or an OLD flag, or a VERSION parameter (0=old, 1=new, 2

[issue2636] Adding a new regex module (compatible with re)

2011-09-02 Thread Matthew Barnett
Matthew Barnett added the comment: The least disruptive change would be to have a NEW flag for the new behaviour, as at present, and an OLD flag for the old behaviour. Currently the default is old behaviour, but in the future it will be new behaviour. The differences would be: Old

[issue2636] Adding a new regex module (compatible with re)

2011-09-02 Thread Matthew Barnett
Matthew Barnett added the comment: So, VERSION0 and VERSION1, with "(?V0)" and "(?V1)" in the pattern? -- ___ Python tracker <http://bu

[issue7951] Should str.format allow negative indexes when used for __getitem__ access?

2010-08-11 Thread Matthew Barnett
Matthew Barnett added the comment: I agree with Kamil and Germán. I would've expected negative indexes for sequences to work. Negative indexes for fields is a different matter. -- ___ Python tracker <http://bugs.python.org/i

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-08-14 Thread Matthew Barnett
Matthew Barnett added the comment: issue2636-20100814.zip is a new version of the regex module. I've added default Unicode word boundaries and renamed the Pattern and Match classes. Over to you, Alex. :-) -- Added file: http://bugs.python.org/file18532/issue2636-2010081

[issue7255] "Default" word boundaries for Unicode data?

2010-08-14 Thread Matthew Barnett
Matthew Barnett added the comment: These have been added to the new 'regex' module. See issue #2636 or PyPI at: http://pypi.python.org/pypi/regex -- nosy: +mrabarnett ___ Python tracker <http://bugs.python.

[issue7255] "Default" word boundaries for Unicode data?

2010-08-15 Thread Matthew Barnett
Matthew Barnett added the comment: If you're on Windows (x86, 32-bit) then compilation isn't necessary - just use the appropriate _regex.pyd. -- ___ Python tracker <http://bugs.python.

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-08-15 Thread Matthew Barnett
Matthew Barnett added the comment: issue2636-20100816.zip is a new version of the regex module. Unfortunately I came across a bug in the handling of sets. More unit tests added. -- Added file: http://bugs.python.org/file18541/issue2636-20100816.zip

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-08-23 Thread Matthew Barnett
Matthew Barnett added the comment: issue2636-20100824.zip is a new version of the regex module. More speedups. Getting towards Perl speed now, depending on the regex. :-) -- Added file: http://bugs.python.org/file18621/issue2636-20100824.zip

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-09-11 Thread Matthew Barnett
Matthew Barnett added the comment: issue2636-20100912.zip is a new version of the regex module. More speedups. I've been comparing the speed against Perl wherever possible. In some cases Perl is lightning fast, probably because regex is built into the language and it doesn't hav

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-09-12 Thread Matthew Barnett
Matthew Barnett added the comment: Another flag? Hmm. How about this instead: if a scoped flag appears at the end of a regex (and would therefore normally have no effect) then it's treated as though it's at the start of the regex. Thus: foo(?i) is treated like:

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-09-12 Thread Matthew Barnett
Matthew Barnett added the comment: The tests for re include these regexes: a.b(?s) a.*(?s)b I understand what Georg said previously about some people preferring to put them at the end, but I personally wouldn't do that because some regex implementations support scoped inline

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-09-12 Thread Matthew Barnett
Matthew Barnett added the comment: OK, so would it be OK if there was, say, a NEW (N) flag which made the inline flags (?flags) scoped and allowed splitting on zero-width matches? -- ___ Python tracker <http://bugs.python.org/issue2

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-09-12 Thread Matthew Barnett
Matthew Barnett added the comment: issue2636-20100913.zip is a new version of the regex module. I've removed the ZEROWIDTH flag and added the NEW flag, which turns on the new behaviour such as splitting on zero-width matches and positional flags. If the NEW flag isn't turned o

[issue1708652] Exact matching

2010-09-17 Thread Matthew Barnett
Matthew Barnett added the comment: Does this request still stand? If so then I'll add it to the new regex module. -- ___ Python tracker <http://bugs.python.org/issu

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-09-17 Thread Matthew Barnett
Matthew Barnett added the comment: issue2636-20100918.zip is a new version of the regex module. I've added 'pos' and 'endpos' arguments to regex.sub and regex.subn and refactored a little. I can't think of any other features that need to be added or see any mor

[issue1708652] Exact matching

2010-09-18 Thread Matthew Barnett
Matthew Barnett added the comment: '$' matches at the end of the string or at a newline at the end of a string (if multiline mode isn't turned on). '\Z' matches only at the end of the string. If not even the OP is convinced of the need, then I have
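The difference can be seen with the standard `re` module on a string that ends in a newline (the sample text is illustrative):

```python
import re

text = "abc\n"

# '$' also matches just before a newline that ends the string.
dollar = re.search(r"abc$", text)

# '\Z' matches only at the very end of the string, so the trailing newline blocks it.
strict = re.search(r"abc\Z", text)

print(dollar is not None)  # True
print(strict is not None)  # False
```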

[issue2027] Module containing C implementations of common text algorithms

2010-09-20 Thread Matthew Barnett
Matthew Barnett added the comment: I've started on a module called 'texttools'. So far it has Levenshtein and Porter (both coded in C). If there's interest I'll put it on PyPI. Suggestions for other additions? -- nosy: +mrabarnett _

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-09-21 Thread Matthew Barnett
Matthew Barnett added the comment: I use Python 3, where len("\U00010337") == 2 on a narrow build. Yes, wide Unicode on a narrow build is a problem: >>> regex.findall("\\U00010337", "a\U00010337bc") [] >>> regex.findall("(?i)\\U00010337"

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-10-08 Thread Matthew Barnett
Matthew Barnett added the comment: issue2636-20101009.zip is a new version of the regex module. It appears from a posting in python-list and a closer look at the docs that string positions in the 're' module are limited to 32 bits, even on 64-bit builds. I think it's because

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-10-14 Thread Matthew Barnett
Matthew Barnett added the comment: I am not able to build or test a 64-bit version. The update was to the source files to ensure that if it is compiled for 64 bits then the string positions will also be 64-bit. This change was prompted by a poster who tried to use the re module of a 64-bit

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-10-29 Thread Matthew Barnett
Matthew Barnett added the comment: That's a bug. I'll fix it as soon as I've reinstalled the SDK. -- ___ Python tracker <http://bugs.py

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-10-29 Thread Matthew Barnett
Matthew Barnett added the comment: issue2636-20101029.zip is a new version of the regex module. I've also added to the unit tests. -- Added file: http://bugs.python.org/file19419/issue2636-20101029.zip ___ Python tracker <http://bugs.py

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-10-29 Thread Matthew Barnett
Matthew Barnett added the comment: issue2636-20101030.zip is a new version of the regex module. I've also added yet more to the unit tests. -- Added file: http://bugs.python.org/file19422/issue2636-20101030.zip ___ Python tracker

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-10-30 Thread Matthew Barnett
Matthew Barnett added the comment: issue2636-20101030a.zip is a new version of the regex module. This bug was a bit more difficult to fix, but I think it's OK now! -- Added file: http://bugs.python.org/file19435/issue2636-20101030a.zip ___ P

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-11-01 Thread Matthew Barnett
Matthew Barnett added the comment: issue2636-20101101.zip is a new version of the regex module. I hope it's finally fixed this time! :-) -- Added file: http://bugs.python.org/file19456/issue2636-20101101.zip ___ Python tracker

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-11-01 Thread Matthew Barnett
Matthew Barnett added the comment: issue2636-20101102.zip is a new version of the regex module. -- Added file: http://bugs.python.org/file19460/issue2636-20101102.zip ___ Python tracker <http://bugs.python.org/issue2

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-11-02 Thread Matthew Barnett
Matthew Barnett added the comment: issue2636-20101102a.zip is a new version of the regex module. msg120204 relates to issue #1519638 "Unmatched group in replacement". In 'regex' an unmatched group is treated as an empty string in a replacement template. This behaviour is

[issue10328] re.sub[n] doesn't seem to handle /Z replacements correctly in all cases

2010-11-05 Thread Matthew Barnett
Matthew Barnett added the comment: It's a bug caused by trying to avoid getting stuck when a zero-width match is found. Basically the fix is to advance one character after a zero-width match, but that doesn't always give the correct result. There are a number of related issues

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-11-05 Thread Matthew Barnett
Matthew Barnett added the comment: issue2636-20101106.zip is a new version of the regex module. Fix for issue 10328, which regex also shared. -- Added file: http://bugs.python.org/file19514/issue2636-20101106.zip ___ Python tracker <h

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-11-11 Thread Matthew Barnett
Matthew Barnett added the comment: It looks like a similar problem to msg116252 and msg116276. -- ___ Python tracker <http://bugs.python.org/issue2636> ___ ___

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-11-13 Thread Matthew Barnett
Matthew Barnett added the comment: issue2636-20101113.zip is a new version of the regex module. It now supports Unicode 6.0.0. -- Added file: http://bugs.python.org/file19597/issue2636-20101113.zip ___ Python tracker <http://bugs.python.

[issue11307] re engine exhaustively explores more than necessary

2011-02-24 Thread Matthew Barnett
Matthew Barnett added the comment: It's a known issue (see issue #1662581, for example). There's a new implementation at PyPI which doesn't have this problem: http://pypi.python.org/pypi/regex -- nosy: +mrabarnett ___ Python

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2011-03-14 Thread Matthew Barnett
Matthew Barnett added the comment: @Gregory: I've added you to the project. I'm currently trying to fix a problem with iterators shared across threads. As a temporary measure, the current release on PyPI doesn't enable multithreading for them. The mrab-regex-hg project doe

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2011-03-15 Thread Matthew Barnett
Matthew Barnett added the comment: I've fixed the problem with iterators for both Python 3 and Python 2. They can now be shared safely across threads. I've updated the release on PyPI. -- ___ Python tracker <http://bugs.python.

[issue6210] Exception Chaining missing method for suppressing context

2011-03-16 Thread Matthew Barnett
Matthew Barnett added the comment: I've been looking through the list of current keywords and the best syntax I could come up with for suppressing the context is: try: x / y except ZeroDivisionError as e: raise as Exception( 'Invalid value for y' ) T
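For comparison, the spelling that Python itself later adopted for suppressing the context is `raise ... from None` (PEP 409, Python 3.3), not the syntax proposed in this comment. A minimal sketch; the function name `divide` and the choice of `ValueError` are illustrative.

```python
def divide(x, y):
    try:
        return x / y
    except ZeroDivisionError:
        # `from None` hides the original ZeroDivisionError in the traceback.
        raise ValueError("Invalid value for y") from None


try:
    divide(1, 0)
except ValueError as exc:
    caught = exc

print(caught.__suppress_context__)  # True
print(caught.__cause__)             # None
```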

[issue11665] Regexp findall freezes

2011-03-25 Thread Matthew Barnett
Matthew Barnett added the comment: Alex is correct. This part: [^<>]* can match an empty string, and it's nested with a repeated group. It stalls, repeatedly matching an empty string. Incidentally, my regex implementation (available on Py

[issue11733] Implement a `Counter.elements_count` method

2011-03-31 Thread Matthew Barnett
Matthew Barnett added the comment: The name isn't meaningful to me. My preference would be for something like "total_count". -- nosy: +mrabarnett ___ Python tracker <http://bugs.pyt

[issue11775] `bool(Counter({'a': 0})) is True`

2011-04-05 Thread Matthew Barnett
Matthew Barnett added the comment: It depends on what kind of object it's like. If it's like a dict then your example is clearly not empty, but if it's like a set then it /is/ empty, in which case it's empty if: all(count == 0 for count in my_counter.values
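The two readings can be put side by side. This is only a sketch of the distinction drawn above, using the example from the issue title; the variable names are illustrative.

```python
from collections import Counter

my_counter = Counter({'a': 0})

# Dict-like view: there is a key, so the counter is truthy (not empty).
dict_like_empty = not my_counter

# Set-like (multiset) view: empty when every count is zero.
set_like_empty = all(count == 0 for count in my_counter.values())

print(dict_like_empty)  # False
print(set_like_empty)   # True
```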

[issue11947] re.IGNORECASE does not match literal "_" (underscore)

2011-04-28 Thread Matthew Barnett
Matthew Barnett added the comment: help(re.sub) says: sub(pattern, repl, string, count=0) and re.IGNORECASE has a value of 2. Therefore this: re.sub("_", "X", subject, re.IGNORECASE) is telling it to replace at most 2 occurrences of "_".
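A short sketch of the mix-up, with a made-up subject string. Passing `re.IGNORECASE` in the fourth position has the same effect as `count=2`, which is written out explicitly here; the keyword form `flags=` avoids the confusion.

```python
import re

subject = "a_b_c_d"   # hypothetical input

# Equivalent to re.sub("_", "X", subject, re.IGNORECASE): the flag's value, 2, is used as count.
limited = re.sub("_", "X", subject, count=int(re.IGNORECASE))

# Passing the flag by keyword replaces every occurrence.
intended = re.sub("_", "X", subject, flags=re.IGNORECASE)

print(limited)   # aXbXc_d
print(intended)  # aXbXcXd
```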

[issue11947] re.IGNORECASE does not match literal "_" (underscore)

2011-04-28 Thread Matthew Barnett
Matthew Barnett added the comment: I don't know how much code that might break. It might not be that much; I can't remember when I last used re.sub without the default count. -- ___ Python tracker <http://bugs.python.o

[issue11957] re.sub confusion between count and flags args

2011-05-06 Thread Matthew Barnett
Matthew Barnett added the comment: Something like "" may be more Pythonic. -- ___ Python tracker <http://bugs.python.org/issue11957> ___ ___ Python-b

[issue12078] re.sub() replaces only several matches

2011-05-14 Thread Matthew Barnett
Matthew Barnett added the comment: Argument 4 of re.sub is the maximum number of replacements, NOT flags: Help on function sub in module re: sub(pattern, repl, string, count=0, flags=0) Return the string obtained by replacing the leftmost non-overlapping occurrences of the pattern in

[issue12130] regex 0.1.20110514 findall overlapped not working with 'start of string' expression

2011-05-20 Thread Matthew Barnett
Matthew Barnett added the comment: Replied to the regex bug tracker. -- ___ Python tracker <http://bugs.python.org/issue12130> ___ ___ Python-bugs-list mailin

[issue7132] Regexp: capturing groups in repetitions

2010-11-18 Thread Matthew Barnett
Matthew Barnett added the comment: Earlier this week I discovered that .Net supports repeated capture and its API suggested a much cleaner approach than what Perl offered, so I'll be adding it to the regex module at: http://pypi.python.org/pypi/regex The new methods will follo

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-11-19 Thread Matthew Barnett
Matthew Barnett added the comment: issue2636-20101120.zip is a new version of the regex module. The match object now supports additional methods which return information on all the successful matches of a repeated capture group. The API was inspired by that of .Net: matchobject.captures

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-11-20 Thread Matthew Barnett
Matthew Barnett added the comment: issue2636-20101121.zip is a new version of the regex module. The captures didn't work properly with lookarounds or atomic groups. -- Added file: http://bugs.python.org/file19723/issue2636-20101121.zip ___ P

[issue1859] textwrap doesn't linebreak on "\n"

2010-11-22 Thread Matthew Barnett
Matthew Barnett added the comment: I'd be interested in having a go if I knew what the desired behaviour was, ie unit tests to confirm what was 'correct'. How should it handle line breaks? Should it treat them like any other whitespace as at present, should it honour them, o

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-11-23 Thread Matthew Barnett
Matthew Barnett added the comment: issue2636-20101123.zip is a new version of the regex module. Oops, sorry, the weird behaviour of msg11 was a bug. :-( -- Added file: http://bugs.python.org/file19786/issue2636-20101123.zip ___ Python tracker

[issue1859] textwrap doesn't linebreak on "\n"

2010-11-23 Thread Matthew Barnett
Matthew Barnett added the comment: textwrap_2010-11-23.diff is my attempt to provide a fix, if it's wanted/needed. -- Added file: http://bugs.python.org/file19791/textwrap_2010-11-23.diff ___ Python tracker <http://bugs.python.org/i

[issue10532] A bug related to matching the empty string

2010-11-25 Thread Matthew Barnett
Matthew Barnett added the comment: The spans say this: >>> for m in re.finditer('((.d.)*)*', 'adb'): print(m.span()) (0, 3) (3, 3) There's a non-empty match followed by an empty match. IMHO, not a bug. -- nosy: +mrabarnett
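The same check written as a small script, collecting the spans into a list. It assumes a current Python 3 `re` module, where an empty match may directly follow a non-empty one.

```python
import re

# One non-empty match over "adb", then an empty match at the end of the string.
spans = [m.span() for m in re.finditer('((.d.)*)*', 'adb')]
print(spans)
```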

[issue2650] re.escape should not escape underscore

2010-11-25 Thread Matthew Barnett
Matthew Barnett added the comment: Re the regex module (issue #2636), would a good compromise be: regex.escape(user_input, special_only=True) to maintain compatibility? -- nosy: +mrabarnett ___ Python tracker <http://bugs.python.org/issue2

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-11-29 Thread Matthew Barnett
Matthew Barnett added the comment: issue2636-20101130.zip is a new version of the regex module. Added 'special_only' keyword parameter (default False) to regex.escape. When True, regex.escape escapes only 'special' characters, such as '?'. -- Adde

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-12-06 Thread Matthew Barnett
Matthew Barnett added the comment: issue2636-20101207.zip is a new version of the regex module. It includes additional checks against pathological regexes. -- Added file: http://bugs.python.org/file19965/issue2636-20101207.zip ___ Python tracker

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-12-10 Thread Matthew Barnett
Matthew Barnett added the comment: issue2636-20101210.zip is a new version of the regex module. I've extended the additional checks of the previous version. It has been tested with Python 2.5 to Python 3.2b1. -- Added file: http://bugs.python.org/file20001/issue2636-2010121

[issue10704] Regex 0.1.20101210 Python 3.1 install problem Mac OS X 10.6.5

2010-12-14 Thread Matthew Barnett
Matthew Barnett added the comment: I use Windows XP, so I can't help with MacOS X. From the error log it looks like it doesn't like the sources for Python either! -- ___ Python tracker <http://bugs.pytho

[issue10703] Regex 0.1.20101210

2010-12-14 Thread Matthew Barnett
Matthew Barnett added the comment: The regex module is intended to replace the re module, so its default behaviour is the same: in Python 2, regexes default to matching ASCII, and in Python 3, they default to matching Unicode. If you want to use a regex on a Unicode string in Python 2 then

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-12-23 Thread Matthew Barnett
Matthew Barnett added the comment: issue2636-20101224.zip is a new version of the regex module. Case-insensitive matching is now faster. The matching functions and methods now accept a keyword argument to release the GIL during matching to enable other Python threads to run concurrently

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-12-24 Thread Matthew Barnett
Matthew Barnett added the comment: I've been trying to push the history to Launchpad, completely without success; it just won't authenticate (no such account, even though I can log in!). I doubt that the history would be much use to

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-12-24 Thread Matthew Barnett
Matthew Barnett added the comment: It does have an SSH key. It's probably something simple that I'm missing. I think that the only change I'm likely to make is to a support script I use; it currently uses hard-coded paths, etc,

[issue6210] Exception Chaining missing method for suppressing context

2010-12-27 Thread Matthew Barnett
Changes by Matthew Barnett : -- nosy: +mrabarnett ___ Python tracker <http://bugs.python.org/issue6210> ___ ___ Python-bugs-list mailing list Unsubscribe:

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-12-27 Thread Matthew Barnett
Matthew Barnett added the comment: issue2636-20101228.zip is a new version of the regex module. Sorry for the delay, the fix took me a bit longer than I expected. :-) -- Added file: http://bugs.python.org/file20176/issue2636-20101228.zip ___ Python

[issue6210] Exception Chaining missing method for suppressing context

2010-12-27 Thread Matthew Barnett
Matthew Barnett added the comment: Regarding syntax, I'm undecided between: raise with new_exception and: raise new_exception with caught_exception I think that the second form is clearer: try: ... exception SomeException as ex: raise SomeOtherExce

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-12-28 Thread Matthew Barnett
Matthew Barnett added the comment: issue2636-20101228a.zip is a new version of the regex module. It now compiles the pattern quickly. -- Added file: http://bugs.python.org/file20182/issue2636-20101228a.zip ___ Python tracker <h

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-12-28 Thread Matthew Barnett
Matthew Barnett added the comment: issue2636-20101229.zip is a new version of the regex module. It now compiles the pattern quickly. -- Added file: http://bugs.python.org/file20185/issue2636-20101229.zip ___ Python tracker <http://bugs.python.
