On Sat, Apr 11, 2020 at 4:11 AM Tom Lane <t...@sss.pgh.pa.us> wrote: > Konstantin Knizhnik <k.knizh...@postgrespro.ru> writes: > > On 25.03.2020 13:36, Richard Guo wrote: > >> I tried this recipe on different PostgreSQL versions, starting from > >> current master and going backwards. I was able to reproduce this issue > >> on all versions above 8.4. In 8.4 version, we do not output information > >> on hash buckets/batches. But manual inspection with gdb shows in 8.4 we > >> also have the dangling pointer for HashState->hashtable. I didn't check > >> versions below 8.4 though. > > > I can propose the following patch for the problem. > > I looked at this patch a bit, and I don't think it goes far enough. > What this issue is really pointing out is that EXPLAIN is not considering > the possibility of a Hash node having had several hashtable instantiations > over its lifespan. I propose what we do about that is generalize the > policy that show_hash_info() is already implementing (in a rather half > baked way) for multiple workers, and report the maximum field values > across all instantiations. We can combine the code needed to do so > with the code for the parallelism case, as shown in the 0001 patch > below. >
I looked through 0001 patch and it looks good to me. At first I was wondering if we need to check whether HashState.hashtable is not NULL in ExecShutdownHash() before we decide to allocate save space for HashState.hinstrument. And then I convinced myself that that's not necessary since HashState.hinstrument and HashState.hashtable cannot be both NULL there. > > In principle we could probably get away with back-patching 0001, > at least into branches that already have the HashState.hinstrument > pointer. I'm not sure it's worth any risk though. A much simpler > fix is to make sure we clear the dangling hashtable pointer, as in > 0002 below (a simplified form of Konstantin's patch). The net > effect of that is that in the case where a hash table is destroyed > and never rebuilt, EXPLAIN ANALYZE would report no hash stats, > rather than possibly-garbage stats like it does today. That's > probably good enough, because it should be an uncommon corner case. > Yes it's an uncommon corner case. But I think it may still surprise people that most of the time the hash stat shows well but sometimes it does not. Thanks Richard