Re: re.sub(): replace longest match instead of leftmost match?

2011-12-19 Thread Ian Kelly
On Mon, Dec 19, 2011 at 4:15 PM, wrote: > On Dec 16, 11:49 am, John Gordon wrote: >> I'm working with IPv6 CIDR strings, and I want to replace the longest >> match of "(:|$)+" with ":".  But when I use re.sub() it replaces >> the leftmost match, even if there is a longer match later in t

Re: re.sub(): replace longest match instead of leftmost match?

2011-12-19 Thread ting
On Dec 16, 11:49 am, John Gordon wrote: > I'm working with IPv6 CIDR strings, and I want to replace the longest > match of "(:|$)+" with ":".  But when I use re.sub() it replaces > the leftmost match, even if there is a longer match later in the string. Typically this means that your regu

Re: re.sub(): replace longest match instead of leftmost match?

2011-12-19 Thread Duncan Booth
MRAB wrote: > On 16/12/2011 21:04, John Gordon wrote: >> In Devin >> Jeanpierre writes: >> >>> You could use re.finditer to find the longest match, and then >>> replace it manually by hand (via string slicing). >> >>> (a match is the longest if (m.end() - m.start()) is the largest -- >>> s

Re: re.sub(): replace longest match instead of leftmost match?

2011-12-16 Thread Terry Reedy
On 12/16/2011 1:36 PM, Roy Smith wrote: What you want is an IPv6 class which represents an address in some canonical form. It would have constructors which accept any of the RFC-2373 defined formats. It would also have string formatting methods to convert the internal form into any of these fo

Re: re.sub(): replace longest match instead of leftmost match?

2011-12-16 Thread MRAB
On 16/12/2011 21:04, John Gordon wrote: In Devin Jeanpierre writes: You could use re.finditer to find the longest match, and then replace it manually by hand (via string slicing). (a match is the longest if (m.end() - m.start()) is the largest -- so, max(re.finditer(...), key=lambda

Re: re.sub(): replace longest match instead of leftmost match?

2011-12-16 Thread John Gordon
In Roy Smith writes: > Having done quite a bit of IPv6 work, my opinion here is that you're > trying to do The Wrong Thing. > What you want is an IPv6 class which represents an address in some > canonical form. It would have constructors which accept any of the > RFC-2373 defined formats.

Re: re.sub(): replace longest match instead of leftmost match?

2011-12-16 Thread John Gordon
In Ian Kelly writes: > >>> I'm also looking for a regexp that will remove leading zeroes in each > >>> four-digit group, but will leave a single zero if the group was all > >>> zeroes. > pattern = r'\b0{1,3}([1-9a-f][0-9a-f]*|0)\b' > re.sub(pattern, r'\1', string, flags=re.IGNORECASE) Perfect
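A minimal sketch of the leading-zero pattern quoted above, applied to a sample address. The function name and the sample address are illustrative assumptions, not part of the thread.

```python
import re

# Pattern quoted in the thread: drop up to three leading zeros from a group,
# keeping a single 0 when the whole group is zeros.
pattern = r'\b0{1,3}([1-9a-f][0-9a-f]*|0)\b'

def strip_leading_zeros(address):
    # Hypothetical helper name; wraps the re.sub call shown in the message.
    return re.sub(pattern, r'\1', address, flags=re.IGNORECASE)

shortened = strip_leading_zeros('2001:0db8:0000:0000:0000:ff00:0042:8329')
print(shortened)
```

Groups that do not start with a zero are left alone because 0{1,3} must match at a word boundary.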

Re: re.sub(): replace longest match instead of leftmost match?

2011-12-16 Thread John Gordon
In Devin Jeanpierre writes: > You could use re.finditer to find the longest match, and then replace > it manually by hand (via string slicing). > (a match is the longest if (m.end() - m.start()) is the largest -- > so, max(re.finditer(...), key=lambda m: (m.end() - m.start())) I ended up

Re: re.sub(): replace longest match instead of leftmost match?

2011-12-16 Thread Roy Smith
In article , John Gordon wrote: > I'm working with IPv6 CIDR strings, and I want to replace the longest > match of "(:|$)+" with ":". But when I use re.sub() it replaces > the leftmost match, even if there is a longer match later in the string. > > I'm also looking for a regexp that wi

Re: re.sub(): replace longest match instead of leftmost match?

2011-12-16 Thread MRAB
On 16/12/2011 17:57, Ian Kelly wrote: On Fri, Dec 16, 2011 at 10:36 AM, MRAB wrote: On 16/12/2011 16:49, John Gordon wrote: According to the documentation on re.sub(), it replaces the leftmost matching pattern. However, I want to replace the *longest* matching pattern, which is not nece

Re: re.sub(): replace longest match instead of leftmost match?

2011-12-16 Thread Ian Kelly
On Fri, Dec 16, 2011 at 10:57 AM, Ian Kelly wrote: > On Fri, Dec 16, 2011 at 10:36 AM, MRAB wrote: >> On 16/12/2011 16:49, John Gordon wrote: >>> >>> According to the documentation on re.sub(), it replaces the leftmost >>> matching pattern. >>> >>> However, I want to replace the *longest* matchin

Re: re.sub(): replace longest match instead of leftmost match?

2011-12-16 Thread Ian Kelly
On Fri, Dec 16, 2011 at 10:36 AM, MRAB wrote: > On 16/12/2011 16:49, John Gordon wrote: >> >> According to the documentation on re.sub(), it replaces the leftmost >> matching pattern. >> >> However, I want to replace the *longest* matching pattern, which is >> not necessarily the leftmost match.  

Re: re.sub(): replace longest match instead of leftmost match?

2011-12-16 Thread MRAB
On 16/12/2011 16:49, John Gordon wrote: According to the documentation on re.sub(), it replaces the leftmost matching pattern. However, I want to replace the *longest* matching pattern, which is not necessarily the leftmost match. Any suggestions? I'm working with IPv6 CIDR strings, and I want

Re: re.sub(): replace longest match instead of leftmost match?

2011-12-16 Thread Devin Jeanpierre
You could use re.finditer to find the longest match, and then replace it manually by hand (via string slicing). (a match is the longest if (m.end() - m.start()) is the largest -- so, max(re.finditer(...), key=lambda m: (m.end() - m.start())) -- Devin P.S. does anyone else get bothered by how it'
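The finditer-plus-slicing suggestion can be sketched roughly as follows. The helper name replace_longest and the zero-run pattern are assumptions made for illustration; the thread does not show the pattern that was finally used.

```python
import re

def replace_longest(pattern, repl, text):
    # Collect every non-overlapping match, then pick the one with the largest span.
    matches = list(re.finditer(pattern, text))
    if not matches:
        return text
    longest = max(matches, key=lambda m: m.end() - m.start())
    # Splice the replacement in by slicing around the chosen match.
    return text[:longest.start()] + repl + text[longest.end():]

# Hypothetical pattern: a run of two or more colon-separated zero groups.
compressed = replace_longest(r'\b0(?::0)+\b', '', '1:0:0:2:0:0:0:3')
print(compressed)
```

When several matches share the maximum length, max() returns the first of them, so ties resolve to the leftmost match.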

Re: re.sub: escaping capture group followed by numeric(s)

2010-09-17 Thread Jon Clements
On 17 Sep, 19:59, Peter Otten <__pete...@web.de> wrote: > Jon Clements wrote: > > (I reckon this is probably a question for MRAB and is not really > > Python specific, but anyhow...) > > > Absolutely basic example: re.sub(r'(\d+)', r'\1', 'string1') > > > I've been searching around and I'm sure it'

Re: re.sub: escaping capture group followed by numeric(s)

2010-09-17 Thread Peter Otten
Jon Clements wrote: > (I reckon this is probably a question for MRAB and is not really > Python specific, but anyhow...) > > Absolutely basic example: re.sub(r'(\d+)', r'\1', 'string1') > > I've been searching around and I'm sure it'll be obvious when it's > pointed out, but how do I use the abo

Re: re.sub: escaping capture group followed by numeric(s)

2010-09-17 Thread MRAB
On 17/09/2010 19:21, Jon Clements wrote: Hi All, (I reckon this is probably a question for MRAB and is not really Python specific, but anyhow...) Absolutely basic example: re.sub(r'(\d+)', r'\1', 'string1') I've been searching around and I'm sure it'll be obvious when it's pointed out, but how
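The replies are truncated here, so the following is only a sketch of one documented way to follow a group reference with a literal digit: the \g<number> form of the replacement syntax. The appended "0" is an illustrative choice.

```python
import re

# r'\10' would be read as a reference to group 10, which does not exist here.
# \g<1> names group 1 unambiguously, so the trailing 0 is a literal digit.
result = re.sub(r'(\d+)', r'\g<1>0', 'string1')
print(result)
```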

Re: re.sub and variables

2010-08-12 Thread MRAB
Steven D'Aprano wrote: On Thu, 12 Aug 2010 14:33:28 -0700, fuglyducky wrote: if anyone happens to know about passing a variable into a regex that would be great. The same way you pass anything into any string. Regexes are ordinary strings. If you want to construct a string from a variable t

Re: re.sub and variables

2010-08-12 Thread Steven D'Aprano
On Thu, 12 Aug 2010 14:33:28 -0700, fuglyducky wrote: > if anyone happens to know about > passing a variable into a regex that would be great. The same way you pass anything into any string. Regexes are ordinary strings. If you want to construct a string from a variable t = "orl", you can do an
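A small sketch of building a pattern from a variable, as described above. The surrounding literal text and the use of re.escape are assumptions added for illustration; re.escape only matters when the variable may contain regex metacharacters.

```python
import re

t = "orl"

# A regex is an ordinary string, so it can be assembled like any other string.
pattern = "w" + re.escape(t) + "d"

result = re.sub(pattern, "WORLD", "hello world")
print(result)
```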

Re: re.sub and variables

2010-08-12 Thread John Machin
On Aug 13, 7:33 am, fuglyducky wrote: > On Aug 12, 2:06 pm, fuglyducky wrote: > > > > > I have a function that I am attempting to call from another file. I am > > attempting to replace a string using re.sub with another string. The > > problem is that the second string is a variable. When I get t

Re: re.sub and variables

2010-08-12 Thread fuglyducky
On Aug 12, 2:06 pm, fuglyducky wrote: > I have a function that I am attempting to call from another file. I am > attempting to replace a string using re.sub with another string. The > problem is that the second string is a variable. When I get the > output, it shows the variable name rather than t

Re: re.sub unexpected behaviour

2010-07-06 Thread Javier Collado
Thanks for your answers. They helped me to realize that I was mistakenly using match.string (the whole string) when I should be using match.group(0) (the whole match). Best regards, Javier -- http://mail.python.org/mailman/listinfo/python-list
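The difference between the two attributes can be seen in a short sketch. The pattern and input string are illustrative assumptions.

```python
import re

def whole_string(match):
    # match.string is the entire input passed to the regex function.
    return match.string

def whole_match(match):
    # match.group(0) is only the text matched by the pattern.
    return match.group(0)

expanded = re.sub(r'\d', whole_string, 'a1b2')
unchanged = re.sub(r'\d', whole_match, 'a1b2')
print(expanded, unchanged)
```

Each digit is replaced by the full input in the first call, while the second call substitutes every match with itself.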

Re: re.sub unexpected behaviour

2010-07-06 Thread Steven D'Aprano
On Tue, 06 Jul 2010 19:10:17 +0200, Javier Collado wrote: > Hello, > > Let's imagine that we have a simple function that generates a > replacement for a regular expression: > > def process(match): > return match.string > > If we use that simple function with re.sub using a simple pattern an

Re: re.sub unexpected behaviour

2010-07-06 Thread Thomas Jollans
On 07/06/2010 07:10 PM, Javier Collado wrote: > Hello, > > Let's imagine that we have a simple function that generates a > replacement for a regular expression: > > def process(match): > return match.string > > If we use that simple function with re.sub using a simple pattern and > a string

Re: re.sub question (regular expressions)

2009-10-20 Thread Chris Seberino
On Oct 16, 9:51 am, MRAB wrote: > What do you mean "blow up"? It worked for me in Python v2.6.2. My bad. False alarm. This was one of those cases where a bug in another area appears like a bug in a different area. Thanks for the help. cs

Re: re.sub question (regular expressions)

2009-10-16 Thread MRAB
Chris Seberino wrote: What does this line do?... input_ = re.sub("([a-zA-Z]+)", '"\\1"', input_) Why don't you try it? Does it remove parentheses from words? e.g. (foo) -> foo ??? No, it puts quotes around them. I'd like to replace [a-zA-Z] with \w but \w makes it blow up. In other word
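As a quick check of the answer above, this sketch runs the \w variant of the substitution. The helper name and sample input are assumptions for illustration.

```python
import re

def quote_words(text):
    # Wrap every run of word characters in double quotes; parentheses are untouched.
    return re.sub(r"(\w+)", r'"\1"', text)

quoted = quote_words("(foo) bar")
print(quoted)
```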

Re: re.sub question (regular expressions)

2009-10-16 Thread Jean-Michel Pichavant
Chris Seberino wrote: What does this line do?... input_ = re.sub("([a-zA-Z]+)", '"\\1"', input_) Does it remove parentheses from words? e.g. (foo) -> foo ??? I'd like to replace [a-zA-Z] with \w but \w makes it blow up. In other words, re.sub("(\w+)", '"\\1"', input_) blows up. Why? cs

Re: re.sub do not replace portion of match

2009-10-03 Thread Duncan Booth
J Wolfe wrote: > Hi, > > Is there a way to flag re.sub not to replace a portion of the string? > > I have a very long string that I want to add two new line's to rather > than one, but keep the value X: > > string = "testX.\n.today" <-- note X is a value > string = re.sub("test...

Re: re.sub do not replace portion of match

2009-10-02 Thread J Wolfe
Thanks Duncan, I did look at that, but it was kinda greek to me. Thanks for pulling out the part I was looking for that should do the trick. Jonathan > http://www.python.org/doc/current/library/re.html#re.sub > > > Backreferences, such as \6, are replaced with the substring matched by > > group
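A sketch of keeping part of a match with a backreference, assuming the goal described in the question (turn one newline into two while preserving the value X). The exact pattern in the original message is truncated, so this pattern is a guess at the intent.

```python
import re

text = "test5.\n.today"  # "5" stands in for the value X

# Capture the part to keep, then re-emit it with \1 followed by two newlines.
doubled = re.sub(r"(test.\.)\n", r"\1\n\n", text)
print(repr(doubled))
```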

Re: re.sub and named groups

2009-02-11 Thread Rhodri James
On Wed, 11 Feb 2009 21:05:53 -, Paul McGuire wrote: On Feb 4, 10:51 am, "Emanuele D'Arrigo" wrote: Hi everybody, I'm having a ball with the power of regular expression Don't forget the ball you can have with the power of ordinary Python strings, string methods, and string interpolati

Re: re.sub and named groups

2009-02-11 Thread Paul McGuire
On Feb 4, 10:51 am, "Emanuele D'Arrigo" wrote: > Hi everybody, > > I'm having a ball with the power of regular expression Don't forget the ball you can have with the power of ordinary Python strings, string methods, and string interpolation! originalString = "spam:%(first)s ham:%(second)s" print

Re: re.sub and named groups

2009-02-11 Thread Shawn Milochik
> > Book recommendation: _Mastering Regular Expressions_, Jeffrey Friedl > -- > Aahz (a...@pythoncraft.com) <*> http://www.pythoncraft.com/ I wholeheartedly second this! The third edition is out now.

Re: re.sub and named groups

2009-02-10 Thread Aahz
In article <4c7158d2-5663-46b9-b950-be81bd799...@z6g2000pre.googlegroups.com>, Emanuele D'Arrigo wrote: > >I'm having a ball with the power of regular expression but I stumbled >on something I don't quite understand: Book recommendation: _Mastering Regular Expressions_, Jeffrey Friedl -- Aahz (a

Re: re.sub and named groups

2009-02-04 Thread Yapo Sébastien
> Hi everybody, > > I'm having a ball with the power of regular expression but I stumbled > on something I don't quite understand: > > theOriginalString = "spam:(?P.*) ham:(?P.*)" > aReplacementPattern = "\(\?P.*\)" > aReplacementString= "foo" > re.sub(aReplacementPattern , aReplacementString, the

Re: re.sub and named groups

2009-02-04 Thread Emanuele D'Arrigo
On Feb 4, 5:17 pm, MRAB wrote: > You could use the lazy form "*?" which tries to match as little as > possible, eg "\(\?P.*?\)" where the ".*?" matches: > spam:(?P.*) ham:(?P.*) > giving "spam:foo ham:(?P.*)". A-ha! Of course! That makes perfect sense! Thank you! Problem solved! Ciao! Manu --
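The greedy versus lazy behaviour can be sketched as below. The group names first and second are assumptions: the names were lost from the archived text, and these are borrowed from another reply in the same thread.

```python
import re

original = "spam:(?P<first>.*) ham:(?P<second>.*)"

# Greedy .* runs to the last closing parenthesis, swallowing both groups.
greedy = re.sub(r"\(\?P<\w+>.*\)", "foo", original)

# Lazy .*? stops at the first closing parenthesis, so each group is replaced separately.
lazy = re.sub(r"\(\?P<\w+>.*?\)", "foo", original)

print(greedy)
print(lazy)
```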

Re: re.sub and named groups

2009-02-04 Thread MRAB
Emanuele D'Arrigo wrote: > Hi everybody, > > I'm having a ball with the power of regular expression but I stumbled > on something I don't quite understand: > > theOriginalString = "spam:(?P.*) ham:(?P.*)" > aReplacementPattern = "\(\?P.*\)" > aReplacementString= "foo" > re.sub(aReplacementPattern

Re: re.sub() problem (regular expression)

2007-12-13 Thread Rick Dooling
On Dec 13, 9:00 pm, Davy <[EMAIL PROTECTED]> wrote: > > What's "\1" and the whole re.sub() mean? > Read about backreferences here: http://www.regular-expressions.info/brackets.html Also see the entry on parentheses here: http://docs.python.org/lib/re-syntax.html rick

RE: re.sub

2007-10-16 Thread DiPierro, Massimo
Thank you this answers my question. I wanted to make sure it was actually designed this way. Massimo From: Tim Chase [EMAIL PROTECTED] Sent: Tuesday, October 16, 2007 1:38 PM To: DiPierro, Massimo Cc: python-list@python.org; Berthiaume, Andre Subject: Re

Re: re.sub

2007-10-16 Thread Tim Chase
> Let me show you a very bad consequence of this... > > a=open('file1.txt','rb').read() > b=re.sub('x',a,'x') > open('file2.txt','wb').write(b) > > Now if file1.txt contains a \n or \" then file2.txt is not the > same as file1.txt while it should be. That's functioning as designed. If you want
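The reply is cut off before its suggestion, so this is only a sketch of one way to insert text verbatim: pass a function as the replacement, since its return value is used without escape processing. The payload string is an illustrative stand-in for the file contents.

```python
import re

payload = 'a\\nb'  # four characters: a, backslash, n, b

# As a template string, the backslash-n pair is processed into a newline.
processed = re.sub('x', payload, 'x')

# A callable replacement is inserted as-is, so the backslash survives.
literal = re.sub('x', lambda m: payload, 'x')

print(repr(processed))
print(repr(literal))
```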

Re: re.sub

2007-10-16 Thread Tim Chase
> Even stranger > > >>> re.sub('a', '\\n','bab') > 'b\nb' > >>> print re.sub('a', '\\n','bab') > b > b That's to be expected. When not using a print statement, the raw evaluation prints the representation of the object. In this case, the representation is 'b\nb'. When you use the print st
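The distinction between the representation and the printed value can be checked directly. This sketch uses Python 3's print function rather than the print statement shown in the thread.

```python
import re

result = re.sub('a', '\\n', 'bab')

# The result holds three characters: b, a real newline, b.
print(len(result))

# repr() shows the escaped form; print() writes the newline itself.
print(repr(result))
print(result)
```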

RE: re.sub

2007-10-16 Thread DiPierro, Massimo
d be. Massimo From: Tim Chase [EMAIL PROTECTED] Sent: Tuesday, October 16, 2007 1:20 PM To: DiPierro, Massimo Cc: python-list@python.org; Berthiaume, Andre Subject: Re: re.sub > Even stranger > > >>> re.sub('a', '\\n',

RE: re.sub

2007-10-16 Thread DiPierro, Massimo
It is the first line that is wrong, the second follows from the first, I agree. From: Tim Chase [EMAIL PROTECTED] Sent: Tuesday, October 16, 2007 1:20 PM To: DiPierro, Massimo Cc: python-list@python.org; Berthiaume, Andre Subject: Re: re.sub > E

RE: re.sub

2007-10-16 Thread DiPierro, Massimo
From: [EMAIL PROTECTED] [EMAIL PROTECTED] On Behalf Of Chris Mellon [EMAIL PROTECTED] Sent: Tuesday, October 16, 2007 1:12 PM To: python-list@python.org Subject: Re: re.sub On 10/16/07, Massimo Di Pierro <[EMAIL PROTECTED]> wrote: > Even stranger > > >>> re.sub('

Re: re.sub

2007-10-16 Thread Chris Mellon
On 10/16/07, Massimo Di Pierro <[EMAIL PROTECTED]> wrote: > Even stranger > > >>> re.sub('a', '\\n','bab') > 'b\nb' > >>> print re.sub('a', '\\n','bab') > b > b > You called print, so instead of getting an escaped string literal, the string is being printed to your terminal, which is printing th

Re: re.sub

2007-10-16 Thread Massimo Di Pierro
Even stranger >>> re.sub('a', '\\n','bab') 'b\nb' >>> print re.sub('a', '\\n','bab') b b Massimo On Oct 16, 2007, at 1:54 AM, DiPierro, Massimo wrote: > Shouldn't this > print re.sub('a','\\n','bab') > b > b > > output > > b\nb > > instead? > > Massimo > > On Oct 16, 2007, at 1:34 AM, G

Re: re.sub does not replace all occurences

2007-08-07 Thread Christoph Krammer
Neil Cerutti schrieb: > In other words, the fourth argument to sub is count, not a set of > re flags. I knew it had to be something very stupid. Thanks a lot.

Re: re.sub does not replace all occurences

2007-08-07 Thread Neil Cerutti
On 2007-08-07, Christoph Krammer <[EMAIL PROTECTED]> wrote: > Hello everybody, > > I wanted to use re.sub to strip all HTML tags out of a given string. I > learned that there are better ways to do this without the re module, > but I would like to know why my code is not working. I use the > followi

Re: re.sub does not replace all occurences

2007-08-07 Thread Marc 'BlackJack' Rintsch
On Tue, 07 Aug 2007 10:28:24 -0700, Christoph Krammer wrote: > Hello everybody, > > I wanted to use re.sub to strip all HTML tags out of a given string. I > learned that there are better ways to do this without the re module, > but I would like to know why my code is not working. I use the > foll

Re: re.sub and empty groups

2007-01-16 Thread harvey . thomas
Hugo Ferreira wrote: > Hi! > > I'm trying to do a search-replace in places where some groups are > optional... Here's an example: > > >> re.match(r"Image:([^\|]+)(?:\|(.*))?", "Image:ola").groups() > ('ola', None) > > >> re.match(r"Image:([^\|]+)(?:\|(.*))?", "Image:ola|").groups() > ('ola', '')

Re: re.sub and re.MULTILINE

2007-01-09 Thread nyenyec
Paddy wrote: > Check the arguments to re.sub. > > >>> re.sub('(?m)^foo', 'bar', '\nfoo', count=0) > '\nbar' > > - Paddy. Duh! :) I appreciate it, thanks. -- nyenyec
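The argument mix-up can be sketched as follows. The count=int(re.MULTILINE) call is an explicit stand-in, written for illustration, for what passing the flag as the fourth positional argument amounts to.

```python
import re

text = "\nfoo"

# The fourth positional parameter of re.sub is count, so the flag's integer
# value ends up as a replacement limit and multiline matching stays off.
as_count = re.sub("^foo", "bar", text, count=int(re.MULTILINE))

# Passing the flag by keyword, or inline as (?m), enables multiline matching.
by_keyword = re.sub("^foo", "bar", text, flags=re.MULTILINE)
inline = re.sub("(?m)^foo", "bar", text)

print(repr(as_count), repr(by_keyword), repr(inline))
```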

Re: re.sub and re.MULTILINE

2007-01-08 Thread Paddy
nyenyec wrote: > I feel like a complete idiot but I can't figure out why re.sub won't > match multiline strings: > > This works: > >>> re.search("^foo", "\nfoo", re.MULTILINE) > <_sre.SRE_Match object at 0x6c448> > > This doesn't. No replacement: > >>> re.sub("^foo", "bar", "\nfoo", re.MULTILINE)

Re: re.sub() backreference bug?

2006-08-17 Thread jeff emminger
thanks - that's the trick. On 8/17/06, Tim Chase <[EMAIL PROTECTED]> wrote: > Looks like you need to be using "raw" strings for your > replacements as well: > > s = re.sub(r'([A-Z]+)([A-Z][a-z])', r"\1_\2", s) > s = re.sub(r'([a-z\d])([A-Z])', r"\1_\2", s) > > This should allow the backslashes to

Re: re.sub() backreference bug?

2006-08-17 Thread Tim Chase
> Tim's given you the solution to the problem: with the re module, > *always* use raw strings in regexes and substitution strings. "always" is so...um...carved in stone. One can forego using raw strings if one prefers having one's strings looked like they were trampled by a stampede of creatu

Re: re.sub() backreference bug?

2006-08-17 Thread John Machin
[EMAIL PROTECTED] wrote: > using this code: > > import re > s = 'HelloWorld19-FooBar' > s = re.sub(r'([A-Z]+)([A-Z][a-z])', "\1_\2", s) > s = re.sub(r'([a-z\d])([A-Z])', "\1_\2", s) > s = re.sub('-', '_', s) > s = s.lower() > print "s: %s" % s > > i expect to get: > hello_world19_foo_bar > > but i

Re: re.sub() backreference bug?

2006-08-17 Thread Tim Chase
> s = re.sub(r'([A-Z]+)([A-Z][a-z])', "\1_\2", s) > s = re.sub(r'([a-z\d])([A-Z])', "\1_\2", s) > i expect to get: > hello_world19_foo_bar > > but instead i get: > hell☺_☻orld19_fo☺_☻ar Looks like you need to be using "raw" strings for your replacements as well: s = re.sub(r'([A-Z]+)([A-Z][a-z
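For reference, here is the snippet from the question with raw strings used for the replacements, written for Python 3. The first assertion shows why the non-raw version produced control characters.

```python
import re

# In a normal string literal, "\1" is the single character with code 1,
# not a backslash followed by the digit 1.
assert "\1" == "\x01"

s = 'HelloWorld19-FooBar'
# Raw strings keep the backslashes so re sees \1 and \2 as group references.
s = re.sub(r'([A-Z]+)([A-Z][a-z])', r'\1_\2', s)
s = re.sub(r'([a-z\d])([A-Z])', r'\1_\2', s)
s = re.sub('-', '_', s)
s = s.lower()
print("s: %s" % s)
```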

Re: re.sub problem

2006-03-31 Thread RunLevelZero
Glad I could help.

Re: re.sub problem

2006-03-31 Thread veracon
Thanks a lot! Compiling with re.DOTALL did fix my problem for the most part; there still are a few problems with my code, but I think I can fix those myself. Again, thanks! > Okay I just woke up and haven't had enough coffee so if I'm off here > please forgive me. Are you saying that if there is

Re: re.sub problem

2006-03-31 Thread RunLevelZero
Okay I just woke up and haven't had enough coffee so if I'm off here please forgive me. Are you saying that if there is an empty line then it borks? If so just use re.S ( re.DOTALL ) and that should take care of it. It will treat the ( . ) special. Otherwise it ignores new lines.
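A short sketch of the re.DOTALL suggestion. The open and close markers are hypothetical, since the original poster's pattern does not appear in these previews.

```python
import re

text = "<open>first\n\nsecond<close>"

# Without DOTALL, "." does not match a newline, so the search fails.
without_flag = re.search(r'<open>(.*)<close>', text)

# With DOTALL (alias re.S), "." also matches newlines.
with_flag = re.search(r'<open>(.*)<close>', text, re.DOTALL)

print(without_flag)
print(repr(with_flag.group(1)))
```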

Re: re.sub problem

2006-03-31 Thread veracon
Actually, it happens in general when there is more than one linebreak between the open and close statements; not only when there are empty lines.