On Nov 15, 2021, at 10:50 PM, Thomas Munro <thomas.mu...@gmail.com> wrote:
> On Tue, Nov 16, 2021 at 5:43 PM Robert Creager <robe...@spectralogic.com> wrote:
>> One CPU is pegged, the data has been sent over STDIN, so Postgres is
>> not waiting for more, there are no other queries running using this
>> select:
>
> So PostgreSQL is eating 100% CPU, with no value shown in
> wait_event_type, and small numbers of system calls are counted. In
> that case, is there an interesting user stack that jumps out with a
> profiler during the slowdown (or the kernel version, stack())?
>
> sudo dtrace -n 'profile-99 /arg0/ { @[ustack()] = count(); } tick-10s { exit(0); }'

I set up a monitoring script that runs the dtrace stack sampler you sent once a minute against the top CPU-consuming Postgres process. Now I wait until we reproduce it.

    #!/usr/local/bin/bash

    while [[ true ]]; do
       DATE=$(date "+%d-%H:%M:%S")
       PID=$(top -b | grep postgres | head -n 1 | awk '{print $1}')
       echo "${DATE} ${PID}"
       dtrace -n 'profile-99 /pid == '$PID'/ { @[ustack()] = count(); } tick-10s { exit(0); }' > dtrace/dtrace_${DATE}.txt
       sleep 60
    done

Presuming this is the type of output you are expecting:

    CPU     ID                    FUNCTION:NAME
      0  58709                        :tick-10s

                  postgres`AtEOXact_LargeObject+0x11
                  postgres`CommitTransaction+0x127
                  postgres`CommitTransactionCommand+0xf2
                  postgres`PostgresMain+0x1fef
                  postgres`process_startup_packet_die
                  postgres`0x73055b
                  postgres`PostmasterMain+0xf36
                  postgres`0x697837
                  postgres`_start+0x100
                  `0x80095f008
                    1

                  postgres`printtup+0xf3
                  postgres`standard_ExecutorRun+0x136
                  postgres`PortalRunSelect+0x10f
                  postgres`PortalRun+0x1c8
                  postgres`PostgresMain+0x1f94
                  postgres`process_startup_packet_die
                  postgres`0x73055b
                  postgres`PostmasterMain+0xf36
                  postgres`0x697837
                  postgres`_start+0x100
                  `0x80095f008
                    1
    ...
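
Once a few of these files have accumulated, it may help to collapse them into a per-function tally rather than reading each stack by hand. The following is only a rough sketch: the function name summarize_leaf_frames is made up for illustration, and it assumes every stack in the dtrace output is a run of frame lines followed by a line that holds nothing but the sample count, as in the output above. It sums the counts by the topmost frame of each stack, with the instruction offset stripped.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of dtrace or Postgres): sum sample counts per
# leaf frame across one or more dtrace ustack() aggregation outputs.
# Assumes each stack is a run of frame lines followed by a count-only line.
summarize_leaf_frames() {
    awk '
        # A line holding only a number closes the current stack.
        /^[[:space:]]*[0-9]+[[:space:]]*$/ {
            if (leaf != "") total[leaf] += $1
            leaf = ""
            next
        }
        # A frame line contains a backtick (module`function+offset).
        /`/ {
            if (leaf == "") {
                leaf = $1
                sub(/\+0x[0-9a-f]+$/, "", leaf)   # drop the instruction offset
            }
        }
        END { for (f in total) printf "%d %s\n", total[f], f }
    ' "$@" | sort -k1,1nr -k2
}
```

With the monitoring script above, it could be run as `summarize_leaf_frames dtrace/dtrace_*.txt | head`; with no file arguments it reads standard input. Grouping by leaf frame alone discards the call path, so the raw files are still worth keeping for any frame that stands out.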