On 6/12/19 9:59 AM, Richard Biener wrote: > On Tue, Jun 11, 2019 at 9:02 PM Jason Merrill <ja...@redhat.com> wrote: >> >> On 6/11/19 9:16 AM, Martin Liška wrote: >>> On 6/11/19 2:27 PM, Jason Merrill wrote: >>>> On 6/11/19 3:41 AM, Martin Liška wrote: >>>>> On 6/10/19 8:21 PM, Jason Merrill wrote: >>>>>> On Mon, Jun 10, 2019 at 3:08 AM Martin Liška <mli...@suse.cz> wrote: >>>>>>> On 6/7/19 11:43 PM, Jason Merrill wrote: >>>>>>>> On Fri, Jun 7, 2019 at 8:14 AM Martin Liška <mli...@suse.cz> wrote: >>>>>>>>> On 6/7/19 2:09 PM, Richard Biener wrote: >>>>>>>>>> On Fri, Jun 7, 2019 at 2:03 PM Martin Liška <mli...@suse.cz> wrote: >>>>>>>>>>> On 6/7/19 10:57 AM, Richard Biener wrote: >>>>>>>>>>>> On Mon, Jun 3, 2019 at 3:35 PM Martin Liška <mli...@suse.cz> wrote: >>>>>>>>>>>>> On 6/1/19 12:06 AM, Jeff Law wrote: >>>>>>>>>>>>>> On 5/22/19 3:13 AM, Martin Liška wrote: >>>>>>>>>>>>>>> On 5/21/19 1:51 PM, Richard Biener wrote: >>>>>>>>>>>>>>>> On Tue, May 21, 2019 at 1:02 PM Martin Liška <mli...@suse.cz> >>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> On 5/21/19 11:38 AM, Richard Biener wrote: >>>>>>>>>>>>>>>>>> On Tue, May 21, 2019 at 12:07 AM Jeff Law <l...@redhat.com> >>>>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> On 5/13/19 1:41 AM, Martin Liška wrote: >>>>>>>>>>>>>>>>>>>> On 11/8/18 9:56 AM, Martin Liška wrote: >>>>>>>>>>>>>>>>>>>>> On 11/7/18 11:23 PM, Jeff Law wrote: >>>>>>>>>>>>>>>>>>>>>> On 10/30/18 6:28 AM, Martin Liška wrote: >>>>>>>>>>>>>>>>>>>>>>> On 10/30/18 11:03 AM, Jakub Jelinek wrote: >>>>>>>>>>>>>>>>>>>>>>>> On Mon, Oct 29, 2018 at 04:14:21PM +0100, Martin Liška >>>>>>>>>>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>>>>>>>>>> +hashtab_chk_error () >>>>>>>>>>>>>>>>>>>>>>>>> +{ >>>>>>>>>>>>>>>>>>>>>>>>> + fprintf (stderr, "hash table checking failed: " >>>>>>>>>>>>>>>>>>>>>>>>> + "equal operator returns true for a pair " >>>>>>>>>>>>>>>>>>>>>>>>> + "of values with a different hash value"); >>>>>>>>>>>>>>>>>>>>>>>> BTW, either use internal_error here, or at least if >>>>>>>>>>>>>>>>>>>>>>>> using fprintf >>>>>>>>>>>>>>>>>>>>>>>> terminate with \n, in your recent mail I saw: >>>>>>>>>>>>>>>>>>>>>>>> ...different hash valueduring RTL pass: vartrack >>>>>>>>>>>>>>>>>>>>>>>> ^^^^^^ >>>>>>>>>>>>>>>>>>>>>>> Sure, fixed in attached patch. >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> Martin >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> + gcc_unreachable (); >>>>>>>>>>>>>>>>>>>>>>>>> +} >>>>>>>>>>>>>>>>>>>>>>>> Jakub >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> 0001-Sanitize-equals-and-hash-functions-in-hash-tables.patch >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> From 0d9c979c845580a98767b83c099053d36eb49bb9 Mon Sep >>>>>>>>>>>>>>>>>>>>>>> 17 00:00:00 2001 >>>>>>>>>>>>>>>>>>>>>>> From: marxin <mli...@suse.cz> >>>>>>>>>>>>>>>>>>>>>>> Date: Mon, 29 Oct 2018 09:38:21 +0100 >>>>>>>>>>>>>>>>>>>>>>> Subject: [PATCH] Sanitize equals and hash functions in >>>>>>>>>>>>>>>>>>>>>>> hash-tables. >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> --- >>>>>>>>>>>>>>>>>>>>>>> gcc/hash-table.h | 40 >>>>>>>>>>>>>>>>>>>>>>> +++++++++++++++++++++++++++++++++++++++- >>>>>>>>>>>>>>>>>>>>>>> 1 file changed, 39 insertions(+), 1 deletion(-) >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> diff --git a/gcc/hash-table.h b/gcc/hash-table.h >>>>>>>>>>>>>>>>>>>>>>> index bd83345c7b8..694eedfc4be 100644 >>>>>>>>>>>>>>>>>>>>>>> --- a/gcc/hash-table.h >>>>>>>>>>>>>>>>>>>>>>> +++ b/gcc/hash-table.h >>>>>>>>>>>>>>>>>>>>>>> @@ -503,6 +503,7 @@ private: >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> value_type *alloc_entries (size_t n >>>>>>>>>>>>>>>>>>>>>>> CXX_MEM_STAT_INFO) const; >>>>>>>>>>>>>>>>>>>>>>> value_type *find_empty_slot_for_expand (hashval_t); >>>>>>>>>>>>>>>>>>>>>>> + void verify (const compare_type &comparable, >>>>>>>>>>>>>>>>>>>>>>> hashval_t hash); >>>>>>>>>>>>>>>>>>>>>>> bool too_empty_p (unsigned int); >>>>>>>>>>>>>>>>>>>>>>> void expand (); >>>>>>>>>>>>>>>>>>>>>>> static bool is_deleted (value_type &v) >>>>>>>>>>>>>>>>>>>>>>> @@ -882,8 +883,12 @@ hash_table<Descriptor, Allocator> >>>>>>>>>>>>>>>>>>>>>>> if (insert == INSERT && m_size * 3 <= m_n_elements >>>>>>>>>>>>>>>>>>>>>>> * 4) >>>>>>>>>>>>>>>>>>>>>>> expand (); >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> - m_searches++; >>>>>>>>>>>>>>>>>>>>>>> +#if ENABLE_EXTRA_CHECKING >>>>>>>>>>>>>>>>>>>>>>> + if (insert == INSERT) >>>>>>>>>>>>>>>>>>>>>>> + verify (comparable, hash); >>>>>>>>>>>>>>>>>>>>>>> +#endif >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> + m_searches++; >>>>>>>>>>>>>>>>>>>>>>> value_type *first_deleted_slot = NULL; >>>>>>>>>>>>>>>>>>>>>>> hashval_t index = hash_table_mod1 (hash, >>>>>>>>>>>>>>>>>>>>>>> m_size_prime_index); >>>>>>>>>>>>>>>>>>>>>>> hashval_t hash2 = hash_table_mod2 (hash, >>>>>>>>>>>>>>>>>>>>>>> m_size_prime_index); >>>>>>>>>>>>>>>>>>>>>>> @@ -930,6 +935,39 @@ hash_table<Descriptor, Allocator> >>>>>>>>>>>>>>>>>>>>>>> return &m_entries[index]; >>>>>>>>>>>>>>>>>>>>>>> } >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> +#if ENABLE_EXTRA_CHECKING >>>>>>>>>>>>>>>>>>>>>>> + >>>>>>>>>>>>>>>>>>>>>>> +/* Report a hash table checking error. */ >>>>>>>>>>>>>>>>>>>>>>> + >>>>>>>>>>>>>>>>>>>>>>> +ATTRIBUTE_NORETURN ATTRIBUTE_COLD >>>>>>>>>>>>>>>>>>>>>>> +static void >>>>>>>>>>>>>>>>>>>>>>> +hashtab_chk_error () >>>>>>>>>>>>>>>>>>>>>>> +{ >>>>>>>>>>>>>>>>>>>>>>> + fprintf (stderr, "hash table checking failed: " >>>>>>>>>>>>>>>>>>>>>>> + "equal operator returns true for a pair " >>>>>>>>>>>>>>>>>>>>>>> + "of values with a different hash value\n"); >>>>>>>>>>>>>>>>>>>>>>> + gcc_unreachable (); >>>>>>>>>>>>>>>>>>>>>>> +} >>>>>>>>>>>>>>>>>>>>>> I think an internal_error here is probably still better >>>>>>>>>>>>>>>>>>>>>> than a simple >>>>>>>>>>>>>>>>>>>>>> fprintf, even if the fprintf is terminated with a \n :-) >>>>>>>>>>>>>>>>>>>>> Fully agree with that, but I see a lot of build errors >>>>>>>>>>>>>>>>>>>>> when using internal_error. >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> The question then becomes can we bootstrap with this >>>>>>>>>>>>>>>>>>>>>> stuff enabled and >>>>>>>>>>>>>>>>>>>>>> if not, are we likely to soon? It'd be a shame to put >>>>>>>>>>>>>>>>>>>>>> it into >>>>>>>>>>>>>>>>>>>>>> EXTRA_CHECKING, but then not be able to really use >>>>>>>>>>>>>>>>>>>>>> EXTRA_CHECKING >>>>>>>>>>>>>>>>>>>>>> because we've got too many bugs to fix. >>>>>>>>>>>>>>>>>>>>> Unfortunately it's blocked with these 2 PRs: >>>>>>>>>>>>>>>>>>>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87845 >>>>>>>>>>>>>>>>>>>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87847 >>>>>>>>>>>>>>>>>>>> Hi. >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> I've just added one more PR: >>>>>>>>>>>>>>>>>>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90450 >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> I'm sending updated version of the patch that provides a >>>>>>>>>>>>>>>>>>>> disablement for the 3 PRs >>>>>>>>>>>>>>>>>>>> with a new function disable_sanitize_eq_and_hash. >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> With that I can bootstrap and finish tests. However, I've >>>>>>>>>>>>>>>>>>>> done that with a patch >>>>>>>>>>>>>>>>>>>> limits maximal number of checks: >>>>>>>>>>>>>>>>>>> So rather than call the disable_sanitize_eq_and_hash, can >>>>>>>>>>>>>>>>>>> you have its >>>>>>>>>>>>>>>>>>> state set up when you instantiate the object? It's not a >>>>>>>>>>>>>>>>>>> huge deal, >>>>>>>>>>>>>>>>>>> just thinking about loud. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> So how do we want to go forward, particularly the >>>>>>>>>>>>>>>>>>> EXTRA_EXTRA checking >>>>>>>>>>>>>>>>>>> issue :-) >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> There is at least one PR where we have a table where >>>>>>>>>>>>>>>>>> elements _in_ the >>>>>>>>>>>>>>>>>> table are never compared against each other but always >>>>>>>>>>>>>>>>>> against another >>>>>>>>>>>>>>>>>> object (I guess that's usual even), but the setup is in a >>>>>>>>>>>>>>>>>> way that the >>>>>>>>>>>>>>>>>> comparison function only works with those. With the patch >>>>>>>>>>>>>>>>>> we verify >>>>>>>>>>>>>>>>>> hashing/comparison for something that is never used. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> So - wouldn't it be more "correct" to only verify >>>>>>>>>>>>>>>>>> comparison/hashing >>>>>>>>>>>>>>>>>> at lookup time, using the object from the lookup and verify >>>>>>>>>>>>>>>>>> that against >>>>>>>>>>>>>>>>>> all other elements? >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> I don't a have problem with that. Apparently this changes >>>>>>>>>>>>>>>>> fixes >>>>>>>>>>>>>>>>> PR90450 and PR87847. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Changes from previous version: >>>>>>>>>>>>>>>>> - verification happens only when an element is searched (not >>>>>>>>>>>>>>>>> inserted) >>>>>>>>>>>>>>>>> - new argument 'sanitize_eq_and_hash' added for >>>>>>>>>>>>>>>>> hash_table::hash_table >>>>>>>>>>>>>>>>> - new param has been introduced hash-table-verification-limit >>>>>>>>>>>>>>>>> in order >>>>>>>>>>>>>>>>> to limit number of elements that are compared within a >>>>>>>>>>>>>>>>> table >>>>>>>>>>>>>>>>> - verification happens only with flag_checking >= 2 >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> I've been bootstrapping and testing the patch right now. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Looks like I misremembered the original patch. The issue isn't >>>>>>>>>>>>>>>> comparing random two elements in the table. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> That it fixes PR90450 is because LIM never calls >>>>>>>>>>>>>>>> find_slot_with_hash >>>>>>>>>>>>>>>> without INSERTing. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> There's updated version of the patch where I check all find >>>>>>>>>>>>>>> operations >>>>>>>>>>>>>>> (both w/ and w/o insertion). >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Patch can bootstrap on x86_64-linux-gnu and survives regression >>>>>>>>>>>>>>> tests >>>>>>>>>>>>>>> except for: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> $ ./xgcc -B. >>>>>>>>>>>>>>> /home/marxin/Programming/gcc/gcc/testsuite/gcc.dg/torture/pr63941.c >>>>>>>>>>>>>>> -O2 -c >>>>>>>>>>>>>>> hash table checking failed: equal operator returns true for a >>>>>>>>>>>>>>> pair of values with a different hash value >>>>>>>>>>>>>>> during GIMPLE pass: lim >>>>>>>>>>>>>>> /home/marxin/Programming/gcc/gcc/testsuite/gcc.dg/torture/pr63941.c: >>>>>>>>>>>>>>> In function ‘fn1’: >>>>>>>>>>>>>>> /home/marxin/Programming/gcc/gcc/testsuite/gcc.dg/torture/pr63941.c:6:1: >>>>>>>>>>>>>>> internal compiler error: in hashtab_chk_error, at >>>>>>>>>>>>>>> hash-table.h:1019 >>>>>>>>>>>>>>> 6 | fn1 () >>>>>>>>>>>>>>> | ^~~ >>>>>>>>>>>>>>> 0x6c5725 hashtab_chk_error >>>>>>>>>>>>>>> /home/marxin/Programming/gcc/gcc/hash-table.h:1019 >>>>>>>>>>>>>>> 0xe504ea hash_table<mem_ref_hasher, false, >>>>>>>>>>>>>>> xcallocator>::verify(ao_ref* const&, unsigned int) >>>>>>>>>>>>>>> /home/marxin/Programming/gcc/gcc/hash-table.h:1040 >>>>>>>>>>>>>>> 0xe504ea hash_table<mem_ref_hasher, false, >>>>>>>>>>>>>>> xcallocator>::find_slot_with_hash(ao_ref* const&, unsigned int, >>>>>>>>>>>>>>> insert_option) >>>>>>>>>>>>>>> /home/marxin/Programming/gcc/gcc/hash-table.h:960 >>>>>>>>>>>>>>> 0xe504ea gather_mem_refs_stmt >>>>>>>>>>>>>>> /home/marxin/Programming/gcc/gcc/tree-ssa-loop-im.c:1501 >>>>>>>>>>>>>>> 0xe504ea analyze_memory_references >>>>>>>>>>>>>>> /home/marxin/Programming/gcc/gcc/tree-ssa-loop-im.c:1625 >>>>>>>>>>>>>>> 0xe504ea tree_ssa_lim >>>>>>>>>>>>>>> /home/marxin/Programming/gcc/gcc/tree-ssa-loop-im.c:2646 >>>>>>>>>>>>>>> 0xe504ea execute >>>>>>>>>>>>>>> /home/marxin/Programming/gcc/gcc/tree-ssa-loop-im.c:2708 >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Richi: it's after your recent patch. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> For some reason I don't see PR87847 issue any longer. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> May I install the patch with disabled sanitization in >>>>>>>>>>>>>>> tree-ssa-loop-im.c ? >>>>>>>>>>>>>> Don't we still need to deal with the naked fprintf when there's a >>>>>>>>>>>>>> failure. ie, shouldn't we be raising it with a gcc_assert or >>>>>>>>>>>>>> somesuch? >>>>>>>>>>>>> >>>>>>>>>>>>> Good point, I've just adjusted that. >>>>>>>>>>>>> >>>>>>>>>>>>> Patch can bootstrap on x86_64-linux-gnu and survives regression >>>>>>>>>>>>> tests. >>>>>>>>>>>>> >>>>>>>>>>>>> Ready to be installed? >>>>>>>>>>>> >>>>>>>>>>>> Ugh, the cselib one is really bad. But I don't hold my breath for >>>>>>>>>>>> anyone >>>>>>>>>>>> fixing it ... >>>>>>>>>>> >>>>>>>>>>> Yes :D It's been some time and there's no interest in the PR. >>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> One question - there's unconditional >>>>>>>>>>>> >>>>>>>>>>>> + if (m_sanitize_eq_and_hash) >>>>>>>>>>>> + verify (comparable, hash); >>>>>>>>>>>> >>>>>>>>>>>> which will read a global variable and have (possibly not inline) >>>>>>>>>>>> call >>>>>>>>>>>> to verify on a common path even with checking disabled. So I think >>>>>>>>>>>> we want to compile this checking feature out for !CHECKING_P >>>>>>>>>>>> or at least make the if __builtin_expect (..., 0), ::verify not >>>>>>>>>>>> inlined and marked pure () (thus, !CHECKING_P is simplest ;)). >>>>>>>>>>> >>>>>>>>>>> Fixed. May I install the patch? The cselib issue can be solved >>>>>>>>>>> later.. >>>>>>>>>> >>>>>>>>>> You missed the second occurance >>>>>>>>>> >>>>>>>>>> - m_searches++; >>>>>>>>>> + if (m_sanitize_eq_and_hash) >>>>>>>>>> + verify (comparable, hash); >>>>>>>>> >>>>>>>>> Yep ;) I've just install the patch. >>>>>>>> >>>>>>>> This is breaking my build: >>>>>>>> >>>>>>>> /home/jason/gt/gcc/hash-map.h:123:71: error: no matching function for >>>>>>>> call to ‘hash_table<hash_map<mem_alloc_d\ >>>>>>>> escription<mem_usage>::mem_location_hash, mem_usage*, >>>>>>>> simple_hashmap_traits<default_hash_traits<mem_alloc_desc\ >>>>>>>> ription<mem_usage>::mem_location_hash>, mem_usage*> >::hash_entry, >>>>>>>> false, xcallocator>::hash_table(size_t&, bo\ >>>>>>>> ol&, bool&, mem_alloc_origin, const char*&, int&, const char*&)’ >>>>>>>> : m_table (n, ggc, gather_mem_stats, HASH_MAP_ORIGIN >>>>>>>> PASS_MEM_STAT) {} >>>>>>>> >>>>>>>> Looks like this needs to be updated to pass an argument to the new >>>>>>>> sanitize_eq_and_hash parameter. >>>>>>>> >>>>>>>> Jason >>>>>>> >>>>>>> Sorry for the breakage, I've just fixed that in r272104. >>>>>> >>>>>> Thanks. I'm also seeing a massive compile time hit from this: A >>>>>> constexpr testcase that I've been looking at went from compiling in 13 >>>>>> seconds to 78 seconds, 6 times as long. I would expect template-heavy >>>>>> code to see similar problems when sanitization is enabled for those >>>>>> hash tables. Could we keep the parameter low or 0 by default, and >>>>>> just do occasional sanitize runs with it explicitly enabled? >>>>> >>>>> Makes sense to me. Can you please provide a test-case which I can measure? >>>> >>>> This is the one I've been looking at: >>>> >>>> struct Int { >>>> constexpr Int(int v): v(v) {} >>>> constexpr Int& operator+=(Int b) { this->v += b.v; return *this; } >>>> constexpr Int& operator++() { ++this->v; return *this; } >>>> private: >>>> friend constexpr bool operator<(Int a, Int b) { return a.v < b.v; } >>>> int v; >>>> }; >>>> constexpr int f(int n) { >>>> Int i = {0}; >>>> Int k = {0}; >>>> k = 0; >>>> for (; k<10000; ++k) { >>>> i += k; >>>> } >>>> return n; >>>> } >>>> >>>> template<int N> struct S { >>>> static constexpr int sm = S<N-1>::sm+f(N); >>>> }; >>>> template<> struct S<0> { >>>> static constexpr int sm = 0; >>>> }; >>>> constexpr int r = S<20>::sm; >>>> >>>> Jason >>> >>> For the test-case provided I see: >>> >>> $ time g++ time.cc -c --param hash-table-verification-limit=100 >>> >>> real 0m1.855s >>> user 0m1.829s >>> sys 0m0.025s >>> >>> $ time g++ time.cc -c --param hash-table-verification-limit=0 >>> >>> real 0m1.275s >>> user 0m1.219s >>> sys 0m0.052s >>> >>> $ time g++-9 time.cc -c >>> >>> real 0m0.939s >>> user 0m0.827s >>> sys 0m0.109s >>> >>> So it's slower, but I can't confirm the huge slowdown you see. >>> Is it due to r272144? >> >> Hmm, I wonder if this is because of the >> --enable-gather-detailed-mem-stats hash tables. > > I wonder if we can reduce the overhead by making > hash-tables/maps using predefined traits not perform > the checking? Thus make [the default] whether to check or not > to check part of the traits? Surely the basic pointer-hash/map > stuff is OK and needs no extra checking.
Interesting idea! I can prepare a patch. Right now I'm testing a patch that removes sanitization for hash-tables (and maps) that track memory allocations. Martin > > Richard. > >> Jason