Hi Eric, * Eric Blake wrote on Sun, Jan 06, 2008 at 02:23:33AM CET: > > I'd appreciate any reviews before checking it in.
Here's a rough glance at it. FWIW, the diff is not very readable (there was a patch to diffutils out there for --more-readable). > +/* We use the Two-Way string matching algorithm, which guarantees > + linear complexity with constant space. Additionally, for long > + needles, we also use a bad character shift table similar to the > + Boyer-Moore algorithm to acheive sub-linear performance. s/acheive/achieve/ (2 instances in the code) > +/* Peform a critical factorization of NEEDLE, of length NEEDLE_LEN. s/Peform/Perform/ > +static void * > +two_way_short_needle (const unsigned char *haystack, size_t haystack_len, > + const unsigned char *needle, size_t needle_len) > +{ [...] > + > + /* Perform the search. Each iteration compares the right half > + first. */ > + if (memcmp (needle, needle + period, suffix) == 0) > + { In order to test the code in both blocks following this test, you may want to also test with a periodic needle in test-memmem.c: { const char input[] = "ABC ABCDAB ABCDABCDABDE"; const char *result = memmem (input, strlen (input), "ABCDABCD", 8); ASSERT (result == input + 11); } This still won't give you full code coverage (but for the other corner cases I'd probably need to go and read the description of the algorithm ;-) [...] > diff --git a/tests/test-memmem.c b/tests/test-memmem.c > index 976f293..e900e1c 100644 > --- a/tests/test-memmem.c > +++ b/tests/test-memmem.c > @@ -152,5 +152,32 @@ main (int argc, char *argv[]) > free (haystack); > } > > + /* Check that a some long needles not present in a haystack can be s/a some // > + handled with sublinear speed. */ Nice work, BTW! Cheers, Ralf