Hi

On Tue, Jul 6, 2021 at 16:07, Daniel Gustafsson <dan...@yesql.se> wrote:

> Looking at this I like the idea in principle, but I'm not convinced that
> auto_explain is the right tool for this.  auto_explain is for identifying
> slow queries, and what you are proposing is to identify queries with a
> certain "shape" (for lack of a better term) even if they aren't slow as per
> the log_min_duration setting.  If log_min_duration is deemed too crude due
> to query volume then sample_rate is the tool.  If sample_rate is also
> discarded, then pg_stat_statements seems a better option.
>

I don't think pg_stat_statements can be used for this: a) it doesn't look at
the execution plan, so checking plans there would add significant overhead
compared to the current pg_stat_statements, which works only with the query
tree (AST), and b) pg_stat_statements keeps one entry per query (AST), but one
query can produce several different execution plans, and distinguishing them
is outside the scope of pg_stat_statements.

>
> Also, why just sequential scans (apart from it being this specific
> usecase)?  If the idea is to track aspects of execution which are deemed
> slow, then tracking for example spills etc would be just as valid.  Do you
> have thoughts on that?
>

Yes, I have thought about it more. Sometimes bitmap index scans are
problematic too, and index scans inside nested loops can also be a problem.

For my last customer I had to detect queries with large bitmap index scans. I
can do it with a combination of pg_stat_statements and log checking, but that
workflow is not very friendly.

My current idea is to have an extension that can be configured to watch
generally specified executor nodes.

Sometimes I can say: I need to know all queries that do a seq scan over table
tabx where more than N tuples are processed. In other cases it can be
interesting to know which queries use index x for a bitmap index scan.
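As a rough illustration only (untested, with placeholder names), the
ExecutorEnd part from the sketch above could walk the instrumented plan tree
like this:

#include "postgres.h"
#include "executor/executor.h"
#include "executor/instrument.h"
#include "nodes/nodeFuncs.h"
#include "parser/parsetree.h"
#include "utils/lsyscache.h"

typedef struct seqscan_check_context
{
    QueryDesc  *queryDesc;
    double      tuple_limit;        /* the configured "N" */
} seqscan_check_context;

/* report sequential scans that returned more tuples than the limit */
static bool
seqscan_walker(PlanState *ps, seqscan_check_context *ctx)
{
    if (ps == NULL)
        return false;

    if (IsA(ps->plan, SeqScan) && ps->instrument != NULL)
    {
        Index           scanrelid = ((Scan *) ps->plan)->scanrelid;
        RangeTblEntry  *rte = rt_fetch(scanrelid,
                                       ctx->queryDesc->plannedstmt->rtable);

        /* make sure the per-loop counters are folded into the totals */
        InstrEndLoop(ps->instrument);

        if (ps->instrument->ntuples > ctx->tuple_limit)
            elog(LOG, "seq scan on \"%s\" produced %.0f tuples",
                 get_rel_name(rte->relid), ps->instrument->ntuples);
    }

    return planstate_tree_walker(ps, seqscan_walker, (void *) ctx);
}

The hook would just fill a seqscan_check_context and call
seqscan_walker(queryDesc->planstate, &ctx); the same walk could also match
BitmapIndexScan or IndexScan nodes against a particular index, which is the
more general behaviour I have in mind.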


>
> That being said, a few comments on the patch:
>
> -       (auto_explain_log_min_duration >= 0 && \
> +       ((auto_explain_log_min_duration >= 0 || auto_explain_log_seqscan != -1) && \
> Is there a reason to not follow the existing code and check for >= 0?
>
> +       DefineCustomIntVariable("auto_explain.log_seqscan",
> It's only a matter of time before another node is proposed for logging, and
> then we'll be stuck adding log_XXXnode GUCs.  Is there a more future-proof
> way to do this?
>
> +       "Sets the minimum tuples produced by sequantial scans which plans
> will be logged",
> s/sequantial/sequential/
>
> -       es->analyze = (queryDesc->instrument_options && auto_explain_log_analyze);
> +       es->analyze = (queryDesc->instrument_options && (auto_explain_log_analyze || auto_explain_log_seqscan != -1));
> Turning on ANALYZE when log_analyze isn't set to True is a no-no IMO.
>
> + * Colllect relations where log_seqscan limit was exceeded
> s/Colllect/Collect/
>
> +       if (*relnames.data != '\0')
> +               appendStringInfoString(&relnames, ",");
> This should use appendStringInfoChar instead.
>
> +       (errmsg("duration: %.3f ms, over limit seqscans: %s, plan:\n%s",
> The "over limit" part is superfluous since it otherwise wouldn't be
> logged.  If
> we're prefixing something wouldn't it be more helpful to include the limit,
> like: "seqscans >= %d tuples returned:".  I'm not a fan of "seqscans" but
> spelling it out is also quite verbose and this is grep-able.
>
> Documentation and tests are also missing
>

Unfortunately, this idea is not well prepared. My patch was a proof of
concept, but I don't think it is a good starting point. Maybe it needs some
tracing API at the executor level and a tool like "perf top", but for the
executor. Post-execution analysis is not a good direction: it has big
overhead, and, above all, it is not friendly in critical situations. I need a
handy tool like perf, but for executor nodes, and I don't know how to do that
effectively.

Thank you for your review and your time, but I think it is better to remove
this patch from the commitfest. I have no idea how to design this feature
well :-/

Regards

Pavel




>
> --
> Daniel Gustafsson               https://vmware.com/
>
>
