On Sat, May 30, 2020 at 5:06 PM David Malcolm <dmalc...@redhat.com> wrote: > On Sat, 2020-05-30 at 13:40 +0000, Pip Cet via Gcc-patches wrote: > > I think we should just omit the triangle inequality test from the > > self-test, as in the attached patch. > > I like the idea,
Thanks! > but can you update the comment so that it explains why > it's not a metric? (better to capture that in a comment in the source > rather than just in the mailing list archives). How's this?
From 2a759aba3639dd782c28c727746c0e6913390fa6 Mon Sep 17 00:00:00 2001 From: Pip Cet <pip...@gmail.com> Date: Sat, 30 May 2020 13:39:09 +0000 Subject: [PATCH] Don't test the triangle inequality in the spellcheck.c self-test. * spellcheck.c (test_data): Add problematic strings. (test_metric_conditions): Don't test the triangle inequality condition, which our distance function does not satisfy. --- gcc/ChangeLog | 6 ++++++ gcc/spellcheck.c | 22 ++++++++-------------- 2 files changed, 14 insertions(+), 14 deletions(-) diff --git a/gcc/ChangeLog b/gcc/ChangeLog index 4fc37369d39..e565d8ecadf 100644 --- a/gcc/ChangeLog +++ b/gcc/ChangeLog @@ -1,3 +1,9 @@ +2020-05-30 Pip Cet <pip...@gmail.com> + + * spellcheck.c (test_data): Add problematic strings. + (test_metric_conditions): Don't test the triangle inequality + condition, which our distance function does not satisfy. + 2020-05-28 Nicolas Bértolo <nicolasbert...@gmail.com> * Makefile.in: don't look for libiberty in the "pic" subdirectory diff --git a/gcc/spellcheck.c b/gcc/spellcheck.c index 7891260a258..ab17657e8dc 100644 --- a/gcc/spellcheck.c +++ b/gcc/spellcheck.c @@ -438,13 +438,17 @@ static const char * const test_data[] = { "food", "boo", "1234567890123456789012345678901234567890123456789012345678901234567890" + "abc", + "ac", + "ca", }; /* Verify that get_edit_distance appears to be a sane distance function, - i.e. the conditions for being a metric. This is done directly for a - small set of examples, using test_data above. This is O(N^3) in the size - of the array, due to the test for the triangle inequality, so we keep the - array small. */ + even though it doesn't satisfy the conditions for being a metric. (This + is because the triangle inequality fails to hold: the distance between + "ca" and "ac" is 1, and so is the distance between "abc" and "ac", but + the distance between "abc" and "ca" is 3. Algorithms that calculate the + true Levenshtein-Damerau metric are much more expensive.) */ static void test_metric_conditions () @@ -468,16 +472,6 @@ test_metric_conditions () edit_distance_t dist_ji = get_edit_distance (test_data[j], test_data[i]); ASSERT_EQ (dist_ij, dist_ji); - - /* Triangle inequality. */ - for (int k = 0; k < num_test_cases; k++) - { - edit_distance_t dist_ik - = get_edit_distance (test_data[i], test_data[k]); - edit_distance_t dist_jk - = get_edit_distance (test_data[j], test_data[k]); - ASSERT_TRUE (dist_ik <= dist_ij + dist_jk); - } } } } -- 2.27.0.rc0