RE: [EXT] Re: Troubleshooting a long running delete statement

Dirschel, Steve Wed, 06 Oct 2021 13:27:19 -0700


On 10/6/21 14:00, Dirschel, Steve wrote:
Question:

How would one troubleshoot this issue in Postgres as to why the delete was
running so long? My background is Oracle and there are various statistics I
may look at:
· One could estimate the number of logical reads the delete should do
based on expected number of rows to delete, expected logical reads against the
table per row, expected logical reads against each index per row.
· One could look in V$SQL and see how many logical reads the query was
actually doing.
· One could look at V$SESS_IO and see how many logical reads the
session was doing.

In this case you would see the query was doing way more logical reads that
expected and then try and think of scenarios that would cause that.

Here is what I could see in Postgres:
· When I did an explain on the delete I could see it was full scanning
the table. I did a full scan of the table interactively in less than 1 second
so the long runtime was not due to the full tablescan.
· I could not find the query in pg_stat_statements to see how many
shared block reads/hits the query was doing to see if the numbers were
extremely high. Based on documentation queries do not show up in
pg_stat_statements until after they complete.
· pg_stat_activity showed wait_event_type and wait_event were null for
the session every time I looked. So the session was continually using CPU.

I started looking at table definitions (indexes, FK's, etc.) and comparing to
Oracle and noticed some indexes missing. I then could see the table being
deleted from was a child table with a FK pointing to a parent table. Finally I
was able to see that the parent table was missing an index on the FK column so
for every row being deleted from the child it was full scanning the parent.
All makes sense after the fact but I'm looking for a more methodical way to
come to that conclusion by looking at database statistics.

Are there other statistics in Postgres I may have looked at to methodically
come to the conclusion that the problem was the missing index on the parent FK
column?

Thanks

This e-mail is for the sole use of the intended recipient and contains
information that may be privileged and/or confidential. If you are not an
intended recipient, please notify the sender by return e-mail and delete this
e-mail and any attachments. Certain required legal entity disclosures can be
accessed on our website:
https://www.thomsonreuters.com/en/resources/disclosures.html

Hi Steve,

First, check whether you have any triggers on the table. The best way of doing
it is to use information_schema.triggers. I have seen triggers introduce some
"mysterious" functionality in Oracle as well. Second, check constraints. Is the
table you're deleting from the parent table of a foreign key constraint(s)? If
the constraints are defined with "ON DELETE CASCADE", you maybe deleting more
than you think. If it is not defined with "ON DELETE CASCADE" or "ON DELETE SET
NULL", you would get an error. If that passes the muster, then check the
processes doing the most of IO using "iotop" or "atop". I like the latter. You
can then check what the busy processes are doing using strace -e trace=file
and, for good measure, 'perf top".

Regards

Mladen Gogala

Database Consultant

Tel: (347) 321-1217

https://dbwhisperer.wordpress.com<https://urldefense.com/v3/__https:/dbwhisperer.wordpress.com__;!!GFN0sa3rsbfR8OLyAw!N_47EusVVgJfrjPtfvI46dinpPTLwBfl4RygI-qPX8gLb8p6-A2bQvhm19pFKaZBEU1iQwfOLA$>

Thanks for the reply and I hope I’m replying to this e-mail correctly at the
bottom of the chain. We are running on AWS aurora postgres. I assume strace
-e isn’t an option given we don’t have access to the server or are you aware of
a method I could still do that without server access?

Regards

Steve

RE: [EXT] Re: Troubleshooting a long running delete statement

Reply via email to