On 08.02.13 03:08, Ian Kelly wrote:
I think what we're seeing here is that
the time needed to look up the compiled regular expression in the
cache is a significant fraction of the time needed to actually execute
it.
There is an open bug report about this: http://bugs.python.org/issue16389
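The cache-hit cost is easy to observe directly. A minimal timing sketch (the character class here is an assumption, chosen to match the URL-mangling task discussed in this thread; absolute numbers will vary by machine):

```python
import re
import timeit

URL = "http://alongnameofasite1234567.com/q?sports=run&a=1&b=1"
PATTERN = re.compile(r"[-:./?&=]")  # precompiled once; pattern is illustrative

# Module-level re.sub goes through re's internal cache lookup on every call
t_lookup = timeit.timeit(lambda: re.sub(r"[-:./?&=]", "_", URL), number=10000)

# A precompiled pattern object skips the lookup entirely
t_direct = timeit.timeit(lambda: PATTERN.sub("_", URL), number=10000)

print(t_lookup, t_direct)
```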
On Fri, Feb 8, 2013 at 4:43 AM, Steven D'Aprano
wrote:
> Ian Kelly wrote:
> Surely that depends on the size of the pattern, and the size of the data
> being worked on.
Naturally.
> Compiling the pattern "s[ai]t" doesn't take that much work, it's only six
> characters and very simple. Applying it
Ian Kelly wrote:
> On Thu, Feb 7, 2013 at 10:57 PM, rh wrote:
>> On Thu, 7 Feb 2013 18:08:00 -0700
>> Ian Kelly wrote:
>>
>>> Which is approximately 30 times slower, so clearly the regular
>>> expression *is* being cached. I think what we're seeing here is that
>>> the time needed to look up the compiled regular expression in the
>>> cache is a significant fraction of the time needed to actually execute
>>> it.
Serhiy Storchaka wrote:
> On 07.02.13 11:49, Peter Otten wrote:
>> ILLEGAL = "-:./?&="
>> try:
>>     TRANS = string.maketrans(ILLEGAL, "_" * len(ILLEGAL))
>> except AttributeError:
>>     # python 3
>>     TRANS = dict.fromkeys(map(ord, ILLEGAL), "_")
>
> str.maketrans()
D'oh.
ILLEGAL = "-:
Hi RH,
It's essential to know about regex, of course, but often there's a better,
easier-to-read way to do things in Python.
One of Python's aims is clarity and ease of reading.
Regex is complex, potentially inefficient and hard to read (as well as being
the only reasonable way to do things sometimes).
On Thu, Feb 7, 2013 at 10:57 PM, rh wrote:
> On Thu, 7 Feb 2013 18:08:00 -0700
> Ian Kelly wrote:
>
>> Which is approximately 30 times slower, so clearly the regular
>> expression *is* being cached. I think what we're seeing here is that
>> the time needed to look up the compiled regular expression in the
>> cache is a significant fraction of the time needed to actually execute
>> it.
On 02/07/2013 06:13 PM, rh wrote:
On Fri, 08 Feb 2013 09:45:41 +1100
Steven D'Aprano wrote:
But since you don't demonstrate any actual working code, you could be
correct, or you could be timing it wrong. Without seeing your timing
code, my guess is that you are doing it wrong. Timing code is
Ian Kelly wrote:
> On Thu, Feb 7, 2013 at 4:59 PM, Steven D'Aprano
> wrote:
>> Oh, one last thing... pulling out "re.compile" outside of the function
>> does absolutely nothing. You don't even compile anything. It basically
>> looks up that a compile function exists in the re module, and that's a
On Thu, Feb 7, 2013 at 5:55 PM, Ian Kelly wrote:
> Whatever caching is being done by re.compile, that's still a 24%
> savings by moving the compile calls into the setup.
On the other hand, if you add an re.purge() call to the start of t1 to
clear the cache:
>>> t3 = Timer("""
... re.purge()
...
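Filled out, the purge comparison might look like the following sketch (the timed statements are assumptions, since the original snippet is truncated above):

```python
from timeit import Timer

# Purging before each compile forces a full recompilation every time
t_purged = Timer("re.purge(); re.compile(r'[-:./?&=]')",
                 setup="import re").timeit(number=1000)

# Without the purge, every call after the first is a cache hit
t_cached = Timer("re.compile(r'[-:./?&=]')",
                 setup="import re").timeit(number=1000)

print(t_purged, t_cached)
```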
rh wrote:
> On Fri, 08 Feb 2013 09:45:41 +1100
> Steven D'Aprano wrote:
>
>> rh wrote:
>>
>> > I am using 2.7.3 and I put the re.compile outside the function and
>> > it performed faster than urlparse. I don't print out the data.
>>
>> I find that hard to believe. re.compile caches its results
rh wrote:
> I am using 2.7.3 and I put the re.compile outside the function and it
> performed faster than urlparse. I don't print out the data.
I find that hard to believe. re.compile caches its results, so except for
the very first time it is called, it is very fast -- basically a function
call
On 07.02.13 11:49, Peter Otten wrote:
ILLEGAL = "-:./?&="
try:
    TRANS = string.maketrans(ILLEGAL, "_" * len(ILLEGAL))
except AttributeError:
    # python 3
    TRANS = dict.fromkeys(map(ord, ILLEGAL), "_")
str.maketrans()
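In other words, on Python 3 the table can be built with str.maketrans directly, and the conversion is then a single method call (a sketch using the example URL from this thread):

```python
ILLEGAL = "-:./?&="

# str.maketrans maps each illegal character's code point to "_"
TRANS = str.maketrans(ILLEGAL, "_" * len(ILLEGAL))

url = "http://alongnameofasite1234567.com/q?sports=run&a=1&b=1"
print(url.translate(TRANS))
# → http___alongnameofasite1234567_com_q_sports_run_a_1_b_1
```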
--
http://mail.python.org/mailman/listinfo/python-list
On 2013-02-06 7:04 PM, "Steven D'Aprano"
wrote:
>I dispute those results. I think you are mostly measuring the time to
>print the result, and I/O is quite slow.
Good call, hadn't even considered that.
>My tests show that using urlparse
>is 33% faster than using regexes, and far more understandable.
Hi RH,
translate methods might be faster (and a little easier to read) for your use
case. Just precompute and re-use the translation table punct_flatten.
Note that the translate method has changed somewhat for Python 3 due to the
separation of text from bytes. Here is a Python 3 version.
from u
On Thu, Feb 7, 2013 at 10:08 PM, jmfauth wrote:
> The future is bright for ... ascii users.
>
> jmf
So you're admitting to being not very bright?
*ducks*
Seriously jmf, please don't hijack threads just to whine about
contrived issues of Unicode performance yet again. That horse is dead.
Go fork
On 7 fév, 04:04, Steven D'Aprano wrote:
> On Wed, 06 Feb 2013 13:55:58 -0800, Demian Brecht wrote:
> > Well, an alternative /could/ be:
>
> ...
> py> s = 'http://alongnameofasite1234567.com/q?sports=run&a=1&b=1'
> py> assert u2f(s) == mangle(s)
> py>
> py> from timeit import Timer
> py> setup = 'f
rh wrote:
> I am curious to know if others would have done this differently. And if so
> how so?
>
> This converts a url to a more easily managed filename, stripping the
> http protocol off.
>
> This:
>
> http://alongnameofasite1234567.com/q?sports=run&a=1&b=1
>
> becomes this:
>
> alongname
On 2013-02-06 21:41, rh wrote:
I am curious to know if others would have done this differently. And if so
how so?
This converts a url to a more easily managed filename, stripping the
http protocol off.
This:
http://alongnameofasite1234567.com/q?sports=run&a=1&b=1
becomes this:
alongnameofasi
python -m cProfile [script_name].py
http://docs.python.org/2/library/profile.html#module-cProfile
Demian Brecht
http://demianbrecht.github.com
On 2013-02-06 2:30 PM, "richard_hubbe11"
wrote:
>I see that urlparse uses split and not re at all and, in my tests,
>urlparse
>completes in less time.
Well, an alternative /could/ be:
from urlparse import urlparse
parts = urlparse('http://alongnameofasite1234567.com/q?sports=run&a=1&b=1')
print '%s%s_%s' % (parts.netloc.replace('.', '_'),
                   parts.path.replace('/', '_'),
                   parts.query.replace('&', '_').replace('=', '_'))
Although wit
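For reference, that snippet can be wrapped into a function (ported to the Python 3 import location and print; the function name is made up for illustration):

```python
from urllib.parse import urlparse  # "from urlparse import urlparse" on Python 2

def url_to_filename(url):
    # Same replacements as the snippet above, wrapped in a function
    parts = urlparse(url)
    return "%s%s_%s" % (parts.netloc.replace(".", "_"),
                        parts.path.replace("/", "_"),
                        parts.query.replace("&", "_").replace("=", "_"))

print(url_to_filename("http://alongnameofasite1234567.com/q?sports=run&a=1&b=1"))
# → alongnameofasite1234567_com_q_sports_run_a_1_b_1
```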
In article ,
rh wrote:
> I am curious to know if others would have done this differently. And if so
> how so?
>
> This converts a url to a more easily managed filename, stripping the
> http protocol off.
I would have used the urlparse module.
http://docs.python.org/2/library/urlparse.html
--