On 10/6/21 14:00, Dirschel, Steve wrote:
Question:
How would one troubleshoot this issue in Postgres as to why the delete
was running so long? My background is Oracle and there are various
statistics I may look at:
* One could estimate the number of logical reads the delete should
do based on expected number of rows to delete, expected logical
reads against the table per row, expected logical reads against
each index per row.
* One could look in V$SQL and see how many logical reads the query
was actually doing.
* One could look at V$SESS_IO and see how many logical reads the
session was doing.
In this case you would see the query was doing way more logical reads
that expected and then try and think of scenarios that would cause that.
Here is what I could see in Postgres:
* When I did an explain on the delete I could see it was full
scanning the table. I did a full scan of the table interactively
in less than 1 second so the long runtime was not due to the full
tablescan.
* I could not find the query in pg_stat_statements to see how many
shared block reads/hits the query was doing to see if the numbers
were extremely high. Based on documentation queries do not show
up in pg_stat_statements until after they complete.
* pg_stat_activity showed wait_event_type and wait_event were null
for the session every time I looked. So the session was
continually using CPU.
I started looking at table definitions (indexes, FK's, etc.) and
comparing to Oracle and noticed some indexes missing. I then could
see the table being deleted from was a child table with a FK pointing
to a parent table. Finally I was able to see that the parent table
was missing an index on the FK column so for every row being deleted
from the child it was full scanning the parent. All makes sense after
the fact but I'm looking for a more methodical way to come to that
conclusion by looking at database statistics.
Are there other statistics in Postgres I may have looked at to
methodically come to the conclusion that the problem was the missing
index on the parent FK column?
Thanks
This e-mail is for the sole use of the intended recipient and contains
information that may be privileged and/or confidential. If you are not
an intended recipient, please notify the sender by return e-mail and
delete this e-mail and any attachments. Certain required legal entity
disclosures can be accessed on our website:
https://www.thomsonreuters.com/en/resources/disclosures.html
Hi Steve,
First, check whether you have any triggers on the table. The best way of
doing it is to use information_schema.triggers. I have seen triggers
introduce some "mysterious" functionality in Oracle as well. Second,
check constraints. Is the table you're deleting from the parent table of
a foreign key constraint(s)? If the constraints are defined with "ON
DELETE CASCADE", you maybe deleting more than you think. If it is not
defined with "ON DELETE CASCADE" or "ON DELETE SET NULL", you would get
an error. If that passes the muster, then check the processes doing the
most of IO using "iotop" or "atop". I like the latter. You can then
check what the busy processes are doing using strace -e trace=file and,
for good measure, 'perf top".
Regards
--
Mladen Gogala
Database Consultant
Tel: (347) 321-1217
https://dbwhisperer.wordpress.com