Re: Curious to see alternate approach on a search/replace via regex

2013-02-15 Thread Serhiy Storchaka
On 08.02.13 03:08, Ian Kelly wrote: I think what we're seeing here is that the time needed to look up the compiled regular expression in the cache is a significant fraction of the time needed to actually execute it. There is a bug issue for this. See http://bugs.python.org/issue16389 . -- http

Re: Curious to see alternate approach on a search/replace via regex

2013-02-08 Thread Ian Kelly
On Fri, Feb 8, 2013 at 4:43 AM, Steven D'Aprano wrote: > Ian Kelly wrote: > Surely that depends on the size of the pattern, and the size of the data > being worked on. Naturally. > Compiling the pattern "s[ai]t" doesn't take that much work, it's only six > characters and very simple. Applying it

Re: Curious to see alternate approach on a search/replace via regex

2013-02-08 Thread Steven D'Aprano
Ian Kelly wrote: > On Thu, Feb 7, 2013 at 10:57 PM, rh wrote: >> On Thu, 7 Feb 2013 18:08:00 -0700 >> Ian Kelly wrote: >> >>> Which is approximately 30 times slower, so clearly the regular >>> expression *is* being cached. I think what we're seeing here is that >>> the time needed to look up th

Re: Curious to see alternate approach on a search/replace via regex

2013-02-08 Thread Peter Otten
Serhiy Storchaka wrote: > On 07.02.13 11:49, Peter Otten wrote: >> ILLEGAL = "-:./?&=" >> try: >> TRANS = string.maketrans(ILLEGAL, "_" * len(ILLEGAL)) >> except AttributeError: >> # python 3 >> TRANS = dict.fromkeys(map(ord, ILLEGAL), "_") > > str.maketrans() D'oh. ILLEGAL = "-:

Re: Curious to see alternate approach on a search/replace via regex

2013-02-08 Thread Nick Mellor
Hi RH, It's essential to know about regex, of course, but often there's a better, easier-to-read way to do things in Python. One of Python's aims is clarity and ease of reading. Regex is complex, potentially inefficient and hard to read (as well as being the only reasonable way to do things so

Re: Curious to see alternate approach on a search/replace via regex

2013-02-08 Thread Ian Kelly
On Thu, Feb 7, 2013 at 10:57 PM, rh wrote: > On Thu, 7 Feb 2013 18:08:00 -0700 > Ian Kelly wrote: > >> Which is approximately 30 times slower, so clearly the regular >> expression *is* being cached. I think what we're seeing here is that >> the time needed to look up the compiled regular express

Re: Curious to see alternate approach on a search/replace via regex

2013-02-07 Thread Dave Angel
On 02/07/2013 06:13 PM, rh wrote: On Fri, 08 Feb 2013 09:45:41 +1100 Steven D'Aprano wrote: But since you don't demonstrate any actual working code, you could be correct, or you could be timing it wrong. Without seeing your timing code, my guess is that you are doing it wrong. Timing code is

Re: Curious to see alternate approach on a search/replace via regex

2013-02-07 Thread Steven D'Aprano
Ian Kelly wrote: > On Thu, Feb 7, 2013 at 4:59 PM, Steven D'Aprano > wrote: >> Oh, one last thing... pulling out "re.compile" outside of the function >> does absolutely nothing. You don't even compile anything. It basically >> looks up that a compile function exists in the re module, and that's a

Re: Curious to see alternate approach on a search/replace via regex

2013-02-07 Thread Ian Kelly
On Thu, Feb 7, 2013 at 5:55 PM, Ian Kelly wrote: > Whatever caching is being done by re.compile, that's still a 24% > savings by moving the compile calls into the setup. On the other hand, if you add an re.purge() call to the start of t1 to clear the cache: >>> t3 = Timer(""" ... re.purge() ...
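The cache effect discussed in this message can be reproduced with a small timing sketch. This is not the timing code from the thread (which is cut off above); the sample string, the repeat counts and the `best_times` helper are assumptions made for illustration, and only the pattern "s[ai]t" is taken from the discussion. It compares a module-level `re.sub` call that hits the cache, the same call after `re.purge()`, and a precompiled pattern.

```python
import re
from timeit import Timer

# Hypothetical sample data; only the pattern comes from the thread.
SETUP = "import re; s = 'sat sit set'"

timers = {
    # Module-level call: the compiled pattern is looked up in re's cache.
    "cached": Timer("re.sub('s[ai]t', 'x', s)", SETUP),
    # Clearing the cache first forces a full recompile on every call.
    "purged": Timer("re.purge(); re.sub('s[ai]t', 'x', s)", SETUP),
    # Compiling once in the setup skips the cache lookup entirely.
    "precompiled": Timer("p.sub('x', s)", SETUP + "; p = re.compile('s[ai]t')"),
}


def best_times(number=2000, repeat=3):
    """Return the best (minimum) total time in seconds for each variant."""
    return {
        name: min(timer.repeat(repeat=repeat, number=number))
        for name, timer in timers.items()
    }


if __name__ == "__main__":
    for name, seconds in best_times().items():
        print("%-12s %.6f s" % (name, seconds))
```

Absolute numbers depend on the interpreter version and the machine, so no expected figures are given here; the purged variant should be the slowest by a wide margin.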

Re: Curious to see alternate approach on a search/replace via regex

2013-02-07 Thread Steven D'Aprano
rh wrote: > On Fri, 08 Feb 2013 09:45:41 +1100 > Steven D'Aprano wrote: > >> rh wrote: >> >> > I am using 2.7.3 and I put the re.compile outside the function and >> > it performed faster than urlparse. I don't print out the data. >> >> I find that hard to believe. re.compile caches its results

Re: Curious to see alternate approach on a search/replace via regex

2013-02-07 Thread Steven D'Aprano
rh wrote: > I am using 2.7.3 and I put the re.compile outside the function and it > performed faster than urlparse. I don't print out the data. I find that hard to believe. re.compile caches its results, so except for the very first time it is called, it is very fast -- basically a function call

Re: Curious to see alternate approach on a search/replace via regex

2013-02-07 Thread Serhiy Storchaka
On 07.02.13 11:49, Peter Otten wrote: ILLEGAL = "-:./?&=" try: TRANS = string.maketrans(ILLEGAL, "_" * len(ILLEGAL)) except AttributeError: # python 3 TRANS = dict.fromkeys(map(ord, ILLEGAL), "_") str.maketrans() -- http://mail.python.org/mailman/listinfo/python-list
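A minimal Python 3 sketch of the `str.maketrans()` suggestion follows. `ILLEGAL` and `TRANS` are the names from the quoted code; the `flatten` helper and the sample input (the thread's example URL with the scheme already stripped) are assumptions added for illustration.

```python
# Characters to replace, as in the quoted code.
ILLEGAL = "-:./?&="

# str.maketrans builds the table directly in Python 3; no try/except needed.
TRANS = str.maketrans(ILLEGAL, "_" * len(ILLEGAL))


def flatten(text):
    """Replace every character listed in ILLEGAL with an underscore."""
    return text.translate(TRANS)


if __name__ == "__main__":
    print(flatten("alongnameofasite1234567.com/q?sports=run&a=1&b=1"))
    # prints alongnameofasite1234567_com_q_sports_run_a_1_b_1
```

The table is built once at import time, so each call to `flatten` is a single `str.translate` pass.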

Re: Curious to see alternate approach on a search/replace via regex

2013-02-07 Thread Demian Brecht
On 2013-02-06 7:04 PM, "Steven D'Aprano" wrote: >I dispute those results. I think you are mostly measuring the time to >print the result, and I/O is quite slow. Good call, hadn't even considered that. >My tests show that using urlparse >is 33% faster than using regexes, and far more understanda

Re: Curious to see alternate approach on a search/replace via regex

2013-02-07 Thread Nick Mellor
Hi RH, translate methods might be faster (and a little easier to read) for your use case. Just precompute and re-use the translation table punct_flatten. Note that the translate method has changed somewhat for Python 3 due to the separation of text from bytes. This is a Python 3 version. from u

Re: Curious to see alternate approach on a search/replace via regex

2013-02-07 Thread Chris Angelico
On Thu, Feb 7, 2013 at 10:08 PM, jmfauth wrote: > The future is bright for ... ascii users. > > jmf So you're admitting to being not very bright? *ducks* Seriously jmf, please don't hijack threads just to whine about contrived issues of Unicode performance yet again. That horse is dead. Go fork

Re: Curious to see alternate approach on a search/replace via regex

2013-02-07 Thread jmfauth
On 7 fév, 04:04, Steven D'Aprano wrote: > On Wed, 06 Feb 2013 13:55:58 -0800, Demian Brecht wrote: > > Well, an alternative /could/ be: > > ... > py> s = 'http://alongnameofasite1234567.com/q?sports=run&a=1&b=1' > py> assert u2f(s) == mangle(s) > py> > py> from timeit import Timer > py> setup = 'f

Re: Curious to see alternate approach on a search/replace via regex

2013-02-07 Thread Peter Otten
rh wrote: > I am curious to know if others would have done this differently. And if so > how so? > > This converts a url to a more easily managed filename, stripping the > http protocol off. > > This: > > http://alongnameofasite1234567.com/q?sports=run&a=1&b=1 > > becomes this: > > alongname

Re: Curious to see alternate approach on a search/replace via regex

2013-02-06 Thread MRAB
On 2013-02-06 21:41, rh wrote: I am curious to know if others would have done this differently. And if so how so? This converts a url to a more easily managed filename, stripping the http protocol off. This: http://alongnameofasite1234567.com/q?sports=run&a=1&b=1 becomes this: alongnameofasi

Re: Curious to see alternate approach on a search/replace via regex

2013-02-06 Thread Demian Brecht
python -m cProfile [script_name].py http://docs.python.org/2/library/profile.html#module-cProfile Demian Brecht http://demianbrecht.github.com On 2013-02-06 2:30 PM, "richard_hubbe11" wrote: >I see that urlparse uses split and not re at all and, in my tests, >urlparse >completes in less ti

Re: Curious to see alternate approach on a search/replace via regex

2013-02-06 Thread Demian Brecht
Well, an alternative /could/ be: from urlparse import urlparse parts = urlparse('http://alongnameofasite1234567.com/q?sports=run&a=1&b=1') print '%s%s_%s' % (parts.netloc.replace('.', '_'), parts.path.replace('/', '_'), parts.query.replace('&', '_').replace('=', '_') ) Although wit
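The snippet above is Python 2 (`urlparse` module, `print` statement). A rough Python 3 equivalent, assuming the same joining scheme as the quoted code, might look like this; the function name `url_to_filename` is hypothetical.

```python
from urllib.parse import urlparse  # Python 3 home of the old urlparse module


def url_to_filename(url):
    """Build a filename-like string from the host, path and query of a URL."""
    parts = urlparse(url)
    return "%s%s_%s" % (
        parts.netloc.replace(".", "_"),
        parts.path.replace("/", "_"),
        parts.query.replace("&", "_").replace("=", "_"),
    )


if __name__ == "__main__":
    print(url_to_filename("http://alongnameofasite1234567.com/q?sports=run&a=1&b=1"))
    # prints alongnameofasite1234567_com_q_sports_run_a_1_b_1
```

Like the original, this always appends an underscore before the query part, so a URL without a query string ends in a trailing underscore.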

Re: Curious to see alternate approach on a search/replace via regex

2013-02-06 Thread Roy Smith
In article , rh wrote: > I am curious to know if others would have done this differently. And if so > how so? > > This converts a url to a more easily managed filename, stripping the > http protocol off. I would have used the urlparse module. http://docs.python.org/2/library/urlparse.html --

mmap regex search replace

2009-04-03 Thread David Pratt
Hi. I have a circumstance where I have to search and replace a block of text in a very large file. I have written some pseudocode to locate the text and print the span of text to be removed and replaced by a new block. Can someone advise what to do to remove the text span and insert with the
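One possible approach to the question above, sketched under assumptions since the poster's pseudocode is not shown: map the file read-only, let `re.search` find the span, then stream everything before the span, the new block, and everything after it into a temporary file that replaces the original. The names `replace_block`, `_copy_range` and `CHUNK` are hypothetical, and the pattern must be a bytes pattern because an mmap exposes bytes.

```python
import mmap
import os
import re
import tempfile

CHUNK = 1 << 20  # copy in 1 MiB slices so the whole file is never held in memory


def _copy_range(mm, out, start, end):
    """Write mm[start:end] to out in CHUNK-sized pieces."""
    for pos in range(start, end, CHUNK):
        out.write(mm[pos:min(pos + CHUNK, end)])


def replace_block(path, pattern, replacement):
    """Replace the first match of a bytes pattern in the file at path.

    Returns True if a match was found and the file was rewritten.
    """
    directory = os.path.dirname(os.path.abspath(path))
    with open(path, "rb") as src, \
            mmap.mmap(src.fileno(), 0, access=mmap.ACCESS_READ) as mm:
        match = re.search(pattern, mm)  # re accepts bytes-like objects such as mmap
        if match is None:
            return False
        start, end = match.span()
        fd, tmp_path = tempfile.mkstemp(dir=directory)
        with os.fdopen(fd, "wb") as out:
            _copy_range(mm, out, 0, start)       # text before the span
            out.write(replacement)               # the new block
            _copy_range(mm, out, end, len(mm))   # text after the span
    # The map is closed before the swap, which some platforms require.
    os.replace(tmp_path, path)
    return True
```

Writing a new file avoids shifting data inside the map when the replacement differs in length from the matched span. An empty file cannot be mapped, so callers would need to handle that case separately.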

Search & Replace in MS Word Puzzle

2006-12-09 Thread Ola K
Hi guys, I wrote a script that works *almost* perfectly, and this lack of perfection simply puzzles me. I simply cannot point the whys, so any help on it will be appreciated. I paste it all here, the string at the beginning explains what it does: '''A script for MS Word which does the following:

Re: Search & Replace

2006-10-27 Thread DataSmash
Really appreciate all the different answers and learning tips!

Re: Search & Replace

2006-10-27 Thread Frederic Rentsch
DataSmash wrote: > Hello, > I need to search and replace 4 words in a text file. > Below is my attempt at it, but this code appends > a copy of the text file within itself 4 times. > Can someone help me out. > Thanks! > > # Search & Replace > file = open("

Re: Search & Replace

2006-10-26 Thread Paddy
DataSmash wrote: > Hello, > I need to search and replace 4 words in a text file. > Below is my attempt at it, but this code appends > a copy of the text file within itself 4 times. > Can someone help me out. > Thanks! > > # Search & Replace > file = open("

Re: Search & Replace

2006-10-26 Thread Bruno Desthuilliers
DataSmash a écrit : > Hello, > I need to search and replace 4 words in a text file. > Below is my attempt at it, but this code appends > a copy of the text file within itself 4 times. > Can someone help me out. > Thanks! > > # Search & Replace > file = open("te

Re: Search & Replace

2006-10-26 Thread Marc 'BlackJack' Rintsch
ments first and rebind `text` to the string with the replacements each time, and *then* write the result *once* to the file. > # Search & Replace > file = open("text.txt", "r") > text = file.read() > file.close() > > file = open("text.txt", "
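The advice above (apply each replacement to the text, rebind the name each time, then write the result once) can be sketched as follows. Only the two replacement pairs visible elsewhere in this thread are used; the remaining two words are not shown in the snippets, so they are left out. The helper names are hypothetical.

```python
# Only the two pairs visible in the thread; the other two are not shown there.
REPLACEMENTS = [
    ("Left_RefAddr", "FromLeft"),
    ("Left_NonRefAddr", "ToLeft"),
]


def replace_all(text, replacements=REPLACEMENTS):
    """Apply each (old, new) pair in turn, rebinding text every time."""
    for old, new in replacements:
        text = text.replace(old, new)
    return text


def rewrite_file(path, replacements=REPLACEMENTS):
    """Read the file, apply all replacements in memory, write the result once."""
    with open(path, "r") as handle:
        text = handle.read()
    with open(path, "w") as handle:
        handle.write(replace_all(text, replacements))
```

Because `write` is called a single time with the fully replaced text, the file ends up with one copy of its contents rather than one copy per replacement.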

Re: Search & Replace

2006-10-26 Thread Tim Chase
> Below is my attempt at it, but this code appends > a copy of the text file within itself 4 times. > Can someone help me out. [snip] > file = open("text.txt", "w") > file.write(text.replace("Left_RefAddr", "FromLeft")) > file.write(text.replace("Left_NonRefAddr", "ToLeft")) > file.write(text.repla

Search & Replace

2006-10-26 Thread DataSmash
Hello, I need to search and replace 4 words in a text file. Below is my attempt at it, but this code appends a copy of the text file within itself 4 times. Can someone help me out. Thanks! # Search & Replace file = open("text.txt", "r") text = file.read() file.close() fi

Re: Search & Replace with RegEx

2005-07-12 Thread [EMAIL PROTECTED]
thanks for the comments + help. i think i got it working, although it's not pretty: ## import os import re theRegEx = '.*abs:.*\.*.' p = re.compile(theRegEx, re.IGNORECASE) fileToSearch = 'compreg.dat' print "File to perform search-and-replace on: " + fileToSea

Re: Search & Replace with RegEx

2005-07-12 Thread Thomas Guettler
Am Tue, 12 Jul 2005 01:11:44 -0700 schrieb [EMAIL PROTECTED]: > Hi Pythonistas, > > Here's my problem: I'm using a version of MOOX Firefox > (http://moox.ws/tech/mozilla/) that's been modified to run completely > from a USB Stick. It works fine, except when I install or uninstall an > extension,

Re: Search & Replace with RegEx

2005-07-12 Thread George Sakkis
<[EMAIL PROTECTED]> wrote: [snipped] > For example, after installing a new extension, I change in compreg.dat > > lines such as: > > abs:J:\Firefox\Firefox_Data\Profiles\default.uyw\extensions\{0538E3E3-7E9B-4d49-8831-A227C80A7AD3}\components\nsForecastfox.js,18590 > abs:J:\Firefox\Firefo

Search & Replace with RegEx

2005-07-12 Thread [EMAIL PROTECTED]
Hi Pythonistas, Here's my problem: I'm using a version of MOOX Firefox (http://moox.ws/tech/mozilla/) that's been modified to run completely from a USB Stick. It works fine, except when I install or uninstall an extension, in which case I then have to physically edit the compreg.dat file in my pro

Re: search/replace in Python (solved)

2005-05-28 Thread Leif K-Brooks
Vamsee Krishna Gomatam wrote: > text = re.sub( "([^<]*)", r' href="http://www.google.com/search?q=\1";>\1', text ) But see what happens when text contains spaces, or quotes, or ampersands, or...

Re: search/replace in Python (solved)

2005-05-28 Thread Vamsee Krishna Gomatam
Leif K-Brooks wrote: > Oliver Andrich wrote: > > > For real-world use you'll want to URL encode and entityify the text: > > import cgi > import urllib > > def google_link(text): > text = text.group(1) > return '%s' % (cgi.escape(urllib.quote(text)), >

Re: search/replace in Python

2005-05-28 Thread Leif K-Brooks
Oliver Andrich wrote: > re.sub(r"(.*)",r" href=http://www.google.com/search?q=\1>\1", text) For real-world use you'll want to URL encode and entityify the text: import cgi import urllib def google_link(text): text = text.group(1) return '%s' % (cgi.escape(urllib.quote(text)),
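The quoted code is Python 2, and `cgi.escape` and `urllib.quote` are no longer available under those names in current Python 3. A hedged Python 3 sketch of the same idea follows. The markup in the archived messages did not survive, so the `<google>...</google>` marker used here is an assumption, as are the `MARKER` and `linkify` names.

```python
import html
import re
from urllib.parse import quote_plus

# Hypothetical marker tag; the archive stripped the markup used in the thread.
MARKER = re.compile(r"<google>([^<]*)</google>")


def google_link(match):
    """Turn one matched phrase into an HTML link to a Google search."""
    phrase = match.group(1)
    # URL-encode the phrase for the query string, then HTML-escape the
    # attribute value and the visible link text separately.
    url = "http://www.google.com/search?q=" + quote_plus(phrase)
    return '<a href="%s">%s</a>' % (html.escape(url, quote=True),
                                    html.escape(phrase))


def linkify(text):
    """Replace every marked phrase in text with a search link."""
    return MARKER.sub(google_link, text)


if __name__ == "__main__":
    print(linkify("see <google>tom & jerry</google>"))
    # prints see <a href="http://www.google.com/search?q=tom+%26+jerry">tom &amp; jerry</a>
```

Passing a function to `sub` rather than a template string is what makes the per-match encoding possible, which addresses the spaces, quotes and ampersands concern raised in the follow-up.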

Re: search/replace in Python

2005-05-28 Thread John Machin
Vamsee Krishna Gomatam wrote: > Hello, > I'm having some problems understanding Regexps in Python. I want > to replace "PHRASE" with > "http://www.google.com/search?q=PHRASE>PHRASE" in a block of > text. How can I achieve this in Python? Sorry for the naive question but > the documentation is

search/replace in Python

2005-05-28 Thread Vamsee Krishna Gomatam
Hello, I'm having some problems understanding Regexps in Python. I want to replace "PHRASE" with "http://www.google.com/search?q=PHRASE>PHRASE" in a block of text. How can I achieve this in Python? Sorry for the naive question but the documentation is really bad :-( Regards, GVK -- http

Re: search/replace in Python

2005-05-27 Thread Oliver Andrich
Hi, 2005/5/28, Vamsee Krishna Gomatam <[EMAIL PROTECTED]>: > Hello, > I'm having some problems understanding Regexps in Python. I want > to replace "PHRASE" with > "http://www.google.com/search?q=PHRASE>PHRASE" in a block of > text. How can I achieve this in Python? Sorry for the naive que