In article <3efc283f-419d-41b6-ad20-c2901c3b9...@googlegroups.com>, rusi <rustompm...@gmail.com> wrote:
> The classic data structure for this is the trie: > General idea: http://en.wikipedia.org/wiki/Trie > In python: > http://stackoverflow.com/questions/11015320/how-to-create-a-trie-in-python/ I agree that a trie fits the problem description well. If I were coding this up in C or C++, that's probably the data structure I'd use, with each node containing an array of pointers to nodes, and the array indexed by the ascii value of the next character (this doesn't translate well to unicode). Lookup speed would be awesomely fast. The problem is, that doesn't make a whole lot of sense in Python. The cited implementation uses dicts at each level. By the time you've done that, you might as well just throw all the data into one big dict and use the full search string as the key. It would be a lot less code, and probably run faster. Still, as an exercise in learning about a useful (and under-appreciated) data structure, implementing this as a trie sounds like a wonderful idea. -- https://mail.python.org/mailman/listinfo/python-list