On Jan 8, 2008, at 3:34 PM, James Youngman wrote:
On Jan 8, 2008 1:32 PM, Eric Blake <[EMAIL PROTECTED]> wrote:
(I may later factor out the two_way static functions into a
parameterized
header, in order to share code with strstr, strcasestr, ...;
however, note
that strstr can never achieve sublinear performance, because it
costs O(n)
to find the NUL byte that terminates the haystack, and it is not
safe to
read beyond the NUL).
It may be generally useful for applications to have some way of
indicating that they are searching for a constant needle in a large
number of haystacks. In such cases, the preparation only needs to be
done once at all. In the specific case of findutils, locate will
typically search for a single needle in one or two million haystacks.
Of course, locate also always knows the length of both the needle and
the haystack, too.
In this specific case of findutils, wouldn't it be even more
efficient to use a data structure that orders data according to there
common prefixes (such as Tries [http://en.wikipedia.org/wiki/Trie])?
--
Benoit Sigoure aka Tsuna
EPITA Research and Development Laboratory