On Sat, Oct 17, 2020 at 12:15 PM Pavel Stehule <pavel.steh...@gmail.com> wrote:
>
> On Sat, Oct 17, 2020 at 0:11, Anastasia Lubennikova
> <a.lubennik...@postgrespro.ru> wrote:
>>
>> On 16.10.2020 12:07, Julien Rouhaud wrote:
>>
>> On Fri, Oct 16, 2020 at 16:12, Pavel Stehule <pavel.steh...@gmail.com>
>> wrote:
>>>
>>> On Fri, Oct 16, 2020 at 9:43, <e.sokol...@postgrespro.ru> wrote:
>>>>
>>>> Hi, hackers.
>>>> For some distributions of data in tables, different loops in nested loop
>>>> joins can take different amounts of time and process different numbers
>>>> of entries. This makes the average statistics returned by EXPLAIN ANALYZE
>>>> not very useful for a DBA.
>>>> To fix it, here is a patch that adds printing of min and max statistics
>>>> for time and rows across all loops in Nested Loop to EXPLAIN ANALYSE.
>>>> Please don't hesitate to share any thoughts on this topic!
>>>
>>> +1
>>>
>>> This is a great feature - the current limited format can sometimes be
>>> pretty messy.
>>
>> +1, this can be very handy!
>>
>> Cool.
>> I have added your patch to the commitfest, so it won't get lost.

Thanks! I'll also try to review it next week.

>> https://commitfest.postgresql.org/30/2765/
>>
>> I will review the code next week. Unfortunately, I cannot give any feedback
>> about the usability of this feature.
>>
>> The user-visible change is:
>>
>> -   ->  Nested Loop (actual rows=N loops=N)
>> +   ->  Nested Loop (actual min_rows=0 rows=0 max_rows=0 loops=2)
>
> This interface is ok - there is not too much space for creativity.

Yes, I also think it's ok. We should also consider usability for tools like
explain.depesz.com; I don't know if the current output is best. I'm adding
Depesz and Pierre, who are both working on this kind of tool, for additional
input.

> I can imagine displaying variance or average - but I am afraid of a very bad
> performance impact.

The original counter (rows here) is already an average, right? Variance could
be nice too. Instrumentation will already spam gettimeofday() calls for
nested loops, so I don't think that computing variance would add that much
overhead?
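
To make the overhead question a bit more concrete, here is a minimal sketch of
how min, max, average and variance could all be maintained with a constant
amount of work per loop, using Welford's online update. It is written in
Python only for brevity and is not taken from the patch or from PostgreSQL's
instrumentation code; the LoopStats class, its field names and the sample row
counts are all hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class LoopStats:
    """Running per-loop statistics for one plan node (hypothetical name)."""

    loops: int = 0
    min_rows: float = math.inf
    max_rows: float = -math.inf
    mean_rows: float = 0.0
    m2: float = 0.0  # running sum of squared deviations from the mean

    def add_loop(self, rows: float) -> None:
        """Fold the row count of one finished loop into the running stats."""
        self.loops += 1
        self.min_rows = min(self.min_rows, rows)
        self.max_rows = max(self.max_rows, rows)
        # Welford's update: no per-loop history has to be kept.
        delta = rows - self.mean_rows
        self.mean_rows += delta / self.loops
        self.m2 += delta * (rows - self.mean_rows)

    def variance(self) -> float:
        """Population variance of rows across all loops seen so far."""
        return self.m2 / self.loops if self.loops else 0.0


# Made-up row counts for four loops of one inner node.
stats = LoopStats()
for rows in (0, 10, 2, 4):
    stats.add_loop(rows)

print(
    f"min_rows={stats.min_rows} rows={stats.mean_rows} "
    f"max_rows={stats.max_rows} variance={stats.variance()} loops={stats.loops}"
)
```

Under these assumptions the extra cost per loop is two comparisons and a
handful of floating-point operations, which should be small next to the timing
calls mentioned above. Whether that also holds inside the real executor would
need to be measured.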