Please explain collections.defaultdict(lambda: 1)

2007-11-06 Thread metaperl.com
I'm reading http://norvig.com/spell-correct.html

and do not understand the expression listed in the subject which is
part of this function:

def train(features):
    model = collections.defaultdict(lambda: 1)
    for f in features:
        model[f] += 1
    return model


Per http://docs.python.org/lib/defaultdict-examples.html

It seems that there is a default factory which initializes each key to
1. So by the end of train(), each member of the dictionary model will
have value >= 1

But why wouldn't he set the value to zero and then increment it each
time a "feature" (actually a word) is encountered? It seems that each
model value would be 1 more than it should be.
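
A minimal sketch of the difference, not taken from the essay: the helper
below takes the starting value as a parameter (the `default` argument and
the sample sentence are assumptions for illustration). With a default of 1,
every count is indeed one higher than the number of occurrences, but a word
never seen in training still reads back as 1 rather than 0, which seems to
be the smoothing effect the essay is after.

```python
import collections

def train(features, default):
    # `default` is the starting value handed to every unseen key.
    model = collections.defaultdict(lambda: default)
    for f in features:
        model[f] += 1
    return model

words = "the cat saw the dog".split()

smoothed = train(words, 1)   # same behaviour as the train() quoted above
plain = train(words, 0)      # the zero-based variant asked about

print(smoothed["the"], plain["the"])      # 3 2
print(smoothed["cat"], plain["cat"])      # 2 1
# Looking up a word that never occurred returns the default
# (and, as a side effect, stores it in the dict).
print(smoothed["zebra"], plain["zebra"])  # 1 0
```

The relative ordering of the seen words is the same in both models; only
the treatment of unseen words differs.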

-- 
http://mail.python.org/mailman/listinfo/python-list


creating an (inefficient) alternating regular expression from a list of options

2008-09-09 Thread metaperl.com
Pyparsing has a really nice feature that I want in PLY. I want to
specify a list of strings and have them converted to a regular
expression.

A Perl module which does an aggressively optimizing job of this is
Regexp::List -
http://search.cpan.org/~dankogai/Regexp-Optimizer-0.15/lib/Regexp/List.pm

I really don't care if the expression is optimal. So the goal is
something like:

vowel_regexp = oneOf("a aa i ii u uu".split())  # yielding r'(aa|a|uu|u|ii|i)'

Is there a public module available for this purpose?




Re: creating an (inefficient) alternating regular expression from a list of options

2008-09-18 Thread metaperl.com
On Sep 9, 9:23 am, [EMAIL PROTECTED] wrote:
>     >> I really don't care if the expression is optimal. So the goal is
>     >> something like:
>
>     >> vowel_regexp = oneOf("a aa i ii u uu".split())  # yielding r'(aa|a|uu|u|ii|i)'
>
>     >> Is there a public module available for this purpose?
>
> Check Ka-Ping Yee's rxb module:
>
>    http://lfw.org/python/

Ok, but its either() function suffers from the possibility of putting a
shorter match before a longer one:

def either(*alternatives):
    options = []
    for option in alternatives:
        options.append(makepat(option).regex)
    return Pattern('\(' + string.join(options, '|') + '\)')
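
To illustrate the ordering concern with the standard re module (this is a
sketch using re directly, not rxb, and the sample string is an assumption):
alternation tries the alternatives left to right and takes the first one
that matches, so a shorter option listed first can shadow a longer one.

```python
import re

text = "aa"

# Shorter alternative listed first: the engine takes the first
# alternative that matches, so only one "a" is consumed.
short_first = re.match(r"(a|aa)", text)
print(short_first.group())   # a

# Longer alternative listed first: the whole string is matched.
long_first = re.match(r"(aa|a)", text)
print(long_first.group())    # aa
```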


> Also, check PyPI to see if
> someone has already updated rxb for use with re.

No one has. A PyPI search for rxb returns no results:
http://pypi.python.org/pypi?%3Aaction=search&term=rxb&submit=search





Re: creating an (inefficient) alternating regular expression from a list of options

2008-09-18 Thread metaperl.com
On Sep 9, 12:42 pm, Fredrik Lundh <[EMAIL PROTECTED]> wrote:

>
> you may also want to do re.escape on all the words, to avoid surprises
> when the choices contain special characters.

yes, thank you very much:

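import re

def oneOf(s):
    # Reverse-sort so a longer alternative precedes any shorter prefix of it.
    alts = sorted(s.split(), reverse=True)
    # Escape regex metacharacters in each alternative.
    alts = [re.escape(alt) for alt in alts]
    return "|".join(alts)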
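
A possible variant with a usage example, as a sketch only: the name one_of,
the choice to accept a list rather than a string, and the sort by length are
assumptions made here for illustration. Sorting longest first is another way
to keep a shorter alternative from shadowing a longer one.

```python
import re

def one_of(words):
    # Longest first, so no alternative is shadowed by a shorter prefix.
    alts = sorted(words, key=len, reverse=True)
    # Escape each word so regex metacharacters are matched literally.
    return "|".join(re.escape(w) for w in alts)

pattern = one_of("a aa i ii u uu".split())
print(pattern)                        # aa|ii|uu|a|i|u
print(re.findall(pattern, "aaiuu"))   # ['aa', 'i', 'uu']

# With escaping, "." only matches a literal dot.
print(re.match(one_of(["a.b"]), "axb"))   # None
```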
