Rob Coops wrote:
> I have to agree with Shawn: as you follow the tree branch by branch to
> the final match, the total number of checks required should on average be
> lower than when checking every single regex, which you would have to
> do for every string that does not match anything else.
>
> However, because a URI can now, or very soon will (I lost track of that a
> bit), consist of any international character, the list will be a lot longer
> than just 'a' to 'Z' and '0' to '9', so checking one character at a time
> could easily be far more work than one would initially think.
> It might therefore be better to match the first two, three or however many
> characters to reduce the initial number of regular expressions one has to go
> over for every single string. Then, in the next step, it might again pay to
> match a set of characters rather than just one at a time, as this would
> reduce the number of checks required to decide whether a match is possible,
> and whether to go down the next branch or toss the string back with a "does
> not match" response.
>
> I would say that, depending on how often you are going to do the checking
> and how dynamic your list of regular expressions is, you might want to spend
> a lot more time on finding the fastest programmatic way to order this hash
> of hashes than you will have to care about the resulting time needed to
> match any number of strings using this resulting hash.
Well, if you really want speed, use C, and use Perl just to work out the
algorithm. That is why I chose to do it one character at a time: in C,
comparing whole strings is only slightly faster than comparing one character
at a time.

Also, I would put all my nodes in one vast array. That way I can use indexes
rather than pointers to link them. I can then store the whole thing in a file
without serializing and de-serializing it every time I write and read it. And
I would add a caching mechanism to it so I could pretend the entire thing was
in memory all along.

--
Just my 0.00000002 million dollars worth,
Shawn

Programming is as much about organization and communication
as it is about coding.

I like Perl; it's the only language where you can bless your thingy.
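
For illustration only, here is a rough sketch of the "all nodes in one array, linked by indexes" idea in C. It is one possible reading of the suggestion, not a finished design: the names (Node, Trie, trie_insert, trie_longest_prefix, trie_save, trie_load), the fixed MAX_NODES capacity and the first-child/next-sibling layout are all assumptions made for the example, and it stores plain string prefixes rather than regular expressions. The caching layer is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define MAX_NODES 1024  /* hypothetical fixed capacity for this sketch */
#define NO_NODE   0     /* slot 0 is the root; nothing links to it, so 0 means "no link" */

typedef struct {
    unsigned char ch;      /* byte matched by this node */
    unsigned char is_end;  /* 1 if a stored key ends here */
    uint32_t child;        /* index of first child, or NO_NODE */
    uint32_t sibling;      /* index of next sibling, or NO_NODE */
} Node;

typedef struct {
    Node nodes[MAX_NODES]; /* every node lives in this one array */
    uint32_t used;         /* number of slots in use, root included */
} Trie;

static void trie_init(Trie *t) {
    memset(t, 0, sizeof *t);  /* also zeroes padding, so the file image is stable */
    t->used = 1;              /* reserve slot 0 for the root */
}

/* Walk the sibling chain under parent looking for byte ch. */
static uint32_t find_child(const Trie *t, uint32_t parent, unsigned char ch) {
    assert(parent < t->used);
    uint32_t i = t->nodes[parent].child;
    while (i != NO_NODE && t->nodes[i].ch != ch)
        i = t->nodes[i].sibling;
    return i;
}

/* Insert key one character at a time; returns 0 on success, -1 if the array is full. */
static int trie_insert(Trie *t, const char *key) {
    uint32_t cur = 0;
    for (; *key; key++) {
        unsigned char ch = (unsigned char)*key;
        uint32_t next = find_child(t, cur, ch);
        if (next == NO_NODE) {
            if (t->used >= MAX_NODES)
                return -1;
            next = t->used++;
            t->nodes[next].ch = ch;
            t->nodes[next].sibling = t->nodes[cur].child; /* push onto front of chain */
            t->nodes[cur].child = next;
        }
        cur = next;
    }
    t->nodes[cur].is_end = 1;
    return 0;
}

/* Length of the longest stored key that is a prefix of s, or 0 if none. */
static size_t trie_longest_prefix(const Trie *t, const char *s) {
    uint32_t cur = 0;
    size_t len = 0, best = 0;
    while (s[len]) {
        cur = find_child(t, cur, (unsigned char)s[len]);
        if (cur == NO_NODE)
            break;
        len++;
        if (t->nodes[cur].is_end)
            best = len;
    }
    return best;
}

/* Because links are indexes, the raw bytes can be written and read back as-is.
   The file is only valid on a machine with the same struct layout and byte order. */
static int trie_save(const Trie *t, FILE *fp) {
    return fwrite(t, sizeof *t, 1, fp) == 1 ? 0 : -1;
}

static int trie_load(Trie *t, FILE *fp) {
    return fread(t, sizeof *t, 1, fp) == 1 ? 0 : -1;
}
```

Typical use would be to call trie_init once, trie_insert for each known prefix, and trie_longest_prefix for each incoming string; trie_save and trie_load then move the whole structure to and from disk in a single call each. The sibling chain keeps each node small, at the price of a linear scan per level, which matters more as the alphabet grows.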