On Jul 17, 5:40 pm, Paul Rubin <http://phr...@nospam.invalid> wrote:
> Zac Burns <zac...@gmail.com> writes:
> > An example use case for this would be for something like a large table
> > of regular expressions that would be iterated over trying to match in
> > some string. If some regular expressions are statistically more
> > successful, then the iteration will generally be short.
>
> Generally if you are matching against a bunch of regexps, they will
> tend to match overlapping sets, so you want to control precisely what
> order they are tested in.  Having stuff bubble around based on hit
> counts doesn't sound good.

As a corollary, if you happen to be matching non-overlapping sets,
then it is a red flag that there could be some other (better) way to
dispatch than a linear regexp search.  For instance, if all your
regexps start with a different keyword, then you should dispatch on
that keyword using a dictionary.  A regexp can be used after
dispatching to extract parameters if necessary.
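
A minimal sketch of what I mean, assuming a made-up command syntax
where every line starts with a distinct keyword.  The command names,
the patterns, and the dispatch() helper are all hypothetical, just for
illustration:

```python
import re

# Hypothetical commands: each line starts with a distinct keyword.
# Maps keyword -> regexp used to extract parameters after dispatching.
PATTERNS = {
    "move": re.compile(r"move\s+(?P<x>-?\d+)\s+(?P<y>-?\d+)$"),
    "say": re.compile(r"say\s+(?P<text>.+)$"),
    "quit": re.compile(r"quit$"),
}

def dispatch(line):
    """Return (keyword, params) for a recognized line, else None."""
    line = line.strip()
    parts = line.split(None, 1)
    if not parts:
        return None  # empty line
    keyword = parts[0]
    # One dictionary lookup instead of a linear scan over every regexp.
    pattern = PATTERNS.get(keyword)
    if pattern is None:
        return None  # unknown keyword
    # Only the regexp for this keyword is ever tried.
    match = pattern.match(line)
    if match is None:
        return None  # right keyword, malformed parameters
    return keyword, match.groupdict()
```

The lookup cost doesn't depend on how many commands there are or on
what order they were added in, so there is nothing to reorder by hit
count.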

It's possible to have disjoint regexps without a simple dispatch
criterion, but I'd guess that's less common.


Carl Banks
-- 
http://mail.python.org/mailman/listinfo/python-list