Hi Eric,

* Eric Blake wrote on Sun, Jan 06, 2008 at 02:23:33AM CET:
> 
> I'd appreciate any reviews before checking it in.

Here's a rough glance at it.  FWIW, the diff is not very readable
(there was a patch to diffutils out there for --more-readable).

> +/* We use the Two-Way string matching algorithm, which guarantees
> +   linear complexity with constant space.  Additionally, for long
> +   needles, we also use a bad character shift table similar to the
> +   Boyer-Moore algorithm to acheive sub-linear performance.

s/acheive/achieve/  (2 instances in the code)

> +/* Peform a critical factorization of NEEDLE, of length NEEDLE_LEN.

s/Peform/Perform/

> +static void *
> +two_way_short_needle (const unsigned char *haystack, size_t haystack_len,
> +                   const unsigned char *needle, size_t needle_len)
> +{
[...]
> +
> +  /* Perform the search.  Each iteration compares the right half
> +     first.  */
> +  if (memcmp (needle, needle + period, suffix) == 0)
> +    {

In order to test the code in both blocks following this test, you may
want to also test with a periodic needle in test-memmem.c:

  {
    const char input[] = "ABC ABCDAB ABCDABCDABDE";
    const char *result = memmem (input, strlen (input), "ABCDABCD", 8);
    ASSERT (result == input + 11);
  }

This still won't give you full code coverage (but for the other corner
cases I'd probably need to go and read the description of the
algorithm ;-)

[...]
> diff --git a/tests/test-memmem.c b/tests/test-memmem.c
> index 976f293..e900e1c 100644
> --- a/tests/test-memmem.c
> +++ b/tests/test-memmem.c
> @@ -152,5 +152,32 @@ main (int argc, char *argv[])
>        free (haystack);
>    }
>  
> +  /* Check that a some long needles not present in a haystack can be

s/a some //

> +     handled with sublinear speed.  */

Nice work, BTW!

Cheers,
Ralf


Reply via email to