Rob Coops wrote:
> I have to agree with Shawn: as you follow the tree branch by branch to
> the final match, the total number of checks required should on average be
> lower than it is when checking every single regex, which you would have to
> do for every string that does not match anything else.
> 
> However, due to a URI now or very soon (lost track of that a bit) being able
> to consist of any international character, the list will be a lot longer than
> just 'a' to 'Z' and '0' to '9', so checking one character at a time could
> easily be way more work than one would initially think.
> It might therefore be better to match the first two, three or however many
> characters to reduce the initial number of regular expressions one has to go
> over for every single string. Then on the next step it might again pay to
> match a set of characters rather than just one at a time, as this would
> reduce the number of checks required to decide whether a match is possible,
> and whether to go down the next branch or toss the string back with a
> "does not match" response.
> 
> I would say that, depending on how often you are going to do the checking and
> how dynamic your list of regular expressions is, you might want to spend a
> lot more time on finding the fastest programmatic way to order this hash of
> hashes than you will have to spend worrying about the time needed to match
> any number of strings using the resulting hash.

Well, if you really want speed, use C.  Use Perl just to work out the
algorithm.  Which is why I chose to do it one character at a time.  In
C, comparing strings is only slightly faster than doing one character at
a time.

Also, I would put all my nodes in one vast array.  That way I can use
indexes rather than pointers to link them.  I can then store the whole
thing in a file without serializing and de-serializing it every time I
write and read it.  And I would add a caching mechanism to it so I could
pretend the entire thing was in memory all along.
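
A minimal sketch of that layout in C, to make the idea concrete. It is
only an illustration under some assumptions: it stores literal strings
rather than regular expressions, it matches one byte at a time, and
every name in it (Node, MAX_NODES, trie_insert, trie_contains,
trie_save, trie_load) is made up for this example. The caching layer is
left out.

```c
#include <stdint.h>
#include <stdio.h>

#define MAX_NODES 1024   /* hypothetical fixed capacity for the sketch */
#define NO_NODE   0u     /* nothing ever links to the root, so 0 means "no link" */

/* Nodes link to each other by array index, not by pointer. */
typedef struct {
    uint8_t  ch;            /* byte matched by this node */
    uint8_t  is_end;        /* 1 if a stored string ends here */
    uint32_t first_child;   /* index of first child, NO_NODE if none */
    uint32_t next_sibling;  /* index of next sibling, NO_NODE if none */
} Node;

static Node     nodes[MAX_NODES];  /* the one vast array; nodes[0] is the root */
static uint32_t node_count = 1;

/* Walk the sibling list under parent looking for byte ch. */
static uint32_t find_child(uint32_t parent, uint8_t ch) {
    uint32_t i;
    for (i = nodes[parent].first_child; i != NO_NODE; i = nodes[i].next_sibling)
        if (nodes[i].ch == ch) return i;
    return NO_NODE;
}

/* Add a string one byte at a time. Returns 0, or -1 if the array is full. */
int trie_insert(const char *s) {
    uint32_t cur = 0;
    for (; *s; s++) {
        uint32_t child = find_child(cur, (uint8_t)*s);
        if (child == NO_NODE) {
            if (node_count >= MAX_NODES) return -1;
            child = node_count++;
            nodes[child].ch = (uint8_t)*s;
            nodes[child].is_end = 0;
            nodes[child].first_child = NO_NODE;
            nodes[child].next_sibling = nodes[cur].first_child;
            nodes[cur].first_child = child;
        }
        cur = child;
    }
    nodes[cur].is_end = 1;
    return 0;
}

/* Returns 1 if exactly this string was inserted, else 0. */
int trie_contains(const char *s) {
    uint32_t cur = 0;
    for (; *s; s++) {
        cur = find_child(cur, (uint8_t)*s);
        if (cur == NO_NODE) return 0;
    }
    return nodes[cur].is_end;
}

/* Indexes stay valid on disk, so the array is written as raw bytes. */
int trie_save(const char *path) {
    FILE *f = fopen(path, "wb");
    int ok;
    if (!f) return -1;
    ok = fwrite(&node_count, sizeof node_count, 1, f) == 1
      && fwrite(nodes, sizeof(Node), node_count, f) == node_count;
    return (fclose(f) == 0 && ok) ? 0 : -1;
}

/* Read the array back. A failed read may leave nodes[] partly overwritten. */
int trie_load(const char *path) {
    FILE *f = fopen(path, "rb");
    uint32_t n = 0;
    int ok;
    if (!f) return -1;
    ok = fread(&n, sizeof n, 1, f) == 1 && n >= 1 && n <= MAX_NODES
      && fread(nodes, sizeof(Node), n, f) == n;
    fclose(f);
    if (ok) node_count = n;
    return ok ? 0 : -1;
}
```

Typical use would be a few trie_insert() calls, then trie_save() to a
file and trie_load() in a later run, with trie_contains() doing the
lookups. Because the file is a raw dump of the struct array, it is only
safe to read back on a machine with the same byte order and struct
layout as the one that wrote it.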


-- 
Just my 0.00000002 million dollars worth,
  Shawn

Programming is as much about organization and communication
as it is about coding.

I like Perl; it's the only language where you can bless your
thingy.
