pgsql 10.23 , different systems, same table , same plan, different Buffers: shared hit
Dear All I have a weird problem, I am trying to improve performance on this query : SELECT text('f...@bar.com') from mail_vessel_addressbook where text('f...@bar.com') ~* address_regex limit 1; The first system (linux) is a linux hosted in a cloud, kernel 3.16.0-4-amd64, 32GB mem, SSD, 4 x Intel(R) Xeon(R) CPU E7-4860 v2 @ 2.60GHz , The second (freebsd) system, used as test, is my local FreeBSD 13.1-RELEASE workstation, 32GB mem, ZFS/magnetic disks ,16 x AMD Ryzen 7 5800X 3800.16-MHz . Overall my workstation is faster, but my issue is not plain speed. The problem is as follows : *FreeBSD* postgres@[local]/dynacom=# explain (analyze,buffers) SELECT text('f...@bar.com') from mail_vessel_addressbook where text('f...@bar.com') ~* address_regex limit 1; QUERY PLAN -- Limit (cost=0.42..5.11 rows=1 width=32) (actual time=96.705..96.706 rows=1 loops=1) Buffers: shared hit=71 -> Index Only Scan using mail_vessel_addressbook_address_regex_idx on mail_vessel_addressbook (cost=0.42..2912.06 rows=620 width=32) (actual time=96.704..96.705 rows=1 loops=1) Filter: ('f...@bar.com'::text ~* address_regex) Rows Removed by Filter: 14738 Heap Fetches: 0 Buffers: shared hit=71 Planning time: 0.082 ms Execution time: 96.725 ms (9 rows) Time: 97.038 ms postgres@[local]/dynacom=# *Linux* dynacom=# explain (analyze,buffers) SELECT text('f...@bar.com') from mail_vessel_addressbook where text('f...@bar.com') ~* address_regex limit 1; QUERY PLAN -- Limit (cost=0.42..5.12 rows=1 width=32) (actual time=1768.725..1768.727 rows=1 loops=1) Buffers: shared hit=530 -> Index Only Scan using mail_vessel_addressbook_address_regex_idx on mail_vessel_addressbook (cost=0.42..2913.04 rows=620 width=32) (actual time=1768.724..1768.725 rows=1 loops=1) Filter: ('f...@bar.com'::text ~* address_regex) Rows Removed by Filter: 97781 Heap Fetches: 0 Buffers: shared hit=530 Planning time: 1.269 ms Execution time: 1768.998 ms (9 rows) The file in FreeBSD came by pg_dump from the linux system, I am puzzled why this huge difference in Buffers: shared hit. All table/index sizes are identical on both systems, I did vacuum full on the linux one, and also did vacuum freeze on both. I analyzed both, reindexed both (several times). Still the FreeBSD seems to access about 7 times less number of blocks from shared_buffers than linux : 71 vs 530 . There is no bloat , I tested with newly fresh table in both systems as well. Thank you for any help.
Re: Inconsistent query performance based on relation hit frequency
On 6/27/24 03:50, Laura Hausmann wrote: Heya, I hope the title is somewhat descriptive. I'm working on a decentralized social media platform and have encountered the following performance issue/quirk, and would like to ask for input, since I'm not sure I missed anything. I'm running PostgreSQL 16.2 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 13.2.1 20230801, 64-bit, running on an Arch Linux box with 128GB of RAM & an 8c16t Ryzen 3700X CPU. Disk is a NVME RAID0. Postgres configuration: https://paste.depesz.com/s/iTv I'm using autovacuum defaults & am running a manual VACUUM ANALYZE on the entire database nightly. The relevant database parts consist of a table with posts (note), a table with users (user), and a table with follow relationships (following). The query in question takes the most recent n (e.g. 50) posts, filtered by the users follow relations. The note table on my main production instance grows by about 200k entries per week. Schema & tuple counts: https://paste.depesz.com/s/cfI Here's the shortest query I can reproduce the issue with: https://paste.depesz.com/s/RoC Specifically, it works well for users that follow a relatively large amount of users (https://explain.depesz.com/s/tJnB), and is very slow for users that follow a low amount of users / users that post infrequently (https://explain.depesz.com/s/Mtyr). From what I can tell, this is because this query causes postgres to scan the note table from the bottom (most recent posts first), discarding anything by users that are not followed. Curiously, rewriting the query like this (https://paste.depesz.com/s/8rN) causes the opposite problem, this query is fast for users with a low following count (https://explain.depesz.com/s/yHAz#query), and slow for users with a high following count (https://explain.depesz.com/s/1v6L, https://explain.depesz.com/s/yg3N). These numbers are even further apart (to the point of 10-30s query timeouts) in the most extreme outlier cases I've observed, and on lower-end hardware. I've sidestepped the issue by running either of these queries based on a heuristic that checks whether there are more than 250 matching posts in the past 7 days, recomputed once per day for every user, but it feels more like a hack than a proper solution. I'm able to make the planner make a sensible decision in both cases by setting enable_sort = off, but that tanks performance for the rest of my application, is even more of a hack, and doesn't seem to work in all cases. I've been able to reproduce this issue with mock data (https://paste.depesz.com/s/CnY), though it's not generating quite the same query plans and is behaving a bit differently. Before deep dive into everybody's favorite topic you may simplify your query : select o.* from objects o where o."userId" = :userid UNION select o.* from objects o where o."userId" IN (SELECT r."followeeId" FROM relationships r WHERE r."followerId"= :userid) postgres@[local]/laura=# explain (analyze, buffers) select o.* from objects o where o."userId" = 1 UNION select o.* from objects o where o."userId" IN (SELECT r."followeeId" FROM relati onships r WHERE r."followerId"=1) ORDER BY id DESC ; QUERY PLAN - --- Sort (cost=8622.04..8767.98 rows=58376 width=40) (actual time=1.041..1.053 rows=314 loops=1) Sort Key: o.id DESC Sort Method: quicksort Memory: 39kB Buffers: shared hit=1265 -> HashAggregate (cost=3416.92..4000.68 rows=58376 width=40) (actual time=0.900..1.006 rows=314 loops=1) Group Key: o.id, o."userId", o.data Batches: 1 Memory Usage: 1585kB Buffers: shared hit=1265 -> Append (cost=0.42..2979.10 rows=58376 width=40) (actual time=0.024..0.816 rows=314 loops=1) Buffers: shared hit=1265 -> Index Scan using "objects_userId_idx" on objects o (cost=0.42..3.10 rows=17 width=21) (actual time=0.003..0.003 rows=0 loops=1) Index Cond: ("userId" = 1) Buffers: shared hit=3 -> Nested Loop (cost=0.70..2684.12 rows=58359 width=21) (actual time=0.020..0.794 rows=314 loops=1) Buffers: shared hit=1262 -> Index Only Scan using "relationships_followerId_followeeId_idx" on relationships r (cost=0.28..7.99 rows=315 width=4) (actual time=0.011..0.030 rows=315 loops= 1) Index Cond: ("followerId" = 1) Heap Fetches: 0 Buffers: shared hit=3 -> Index Scan using "objects_userId_idx" on objects o_1 (cost=0.42..6.65 rows=185 width=21) (actual time=0.002..0.002 rows=1 loops=315)
Re: tds_fdw : Severe performance degradation from postgresql 10.23 to 16.4
On 11/12/24 21:37, Tom Lane wrote: Achilleas Mantzios writes: Our sysadm created the system debian Debian GNU/Linux 12 (bookworm)with postgres as user 1000. Now at some point we realized that whenever we run a pgsql cluster with another user (I found that after spending two good days testing), the above query runs in about 1 second. With user postgres 1000 in 30 seconds. As you saw the perf output are completely different. Don't recall details offhand, but in some situations where the calling SQL user doesn't have permissions to read particular columns, the planner will not consult statistics for those columns. That can lead to a different, less optimal plan being used. Maybe something like that is happening here? Thank you Tom, In all runs of this query the user is postgres (superuser). Besides, this is just a simple query run on a FOREIGN TABLE living in a MS SQL Server. regards, tom lane
Re: tds_fdw : Severe performance degradation from postgresql 10.23 to 16.4
Dear All false alert, I run strace and it was obvious the slow one was producing huge debug output, while the fast one did not. It was not even a tds_fdw issue. It was freetds. Turned out we have forgotten enabled debugging inside the freetds configuration. You will ask me we did I get this effect of debugging only when run with user "postgres" ? Because /tmp/freetds.log belonged to postgres! I changed owner to nobody:nogroup and dont even need to restart postgres. !! I am sorry for all the noise. Thank you so much for your prompts !! On 11/12/24 23:46, Tomas Vondra wrote: On 11/12/24 20:37, Tom Lane wrote: Achilleas Mantzios writes: Our sysadm created the system debian Debian GNU/Linux 12 (bookworm)with postgres as user 1000. Now at some point we realized that whenever we run a pgsql cluster with another user (I found that after spending two good days testing), the above query runs in about 1 second. With user postgres 1000 in 30 seconds. As you saw the perf output are completely different. Don't recall details offhand, but in some situations where the calling SQL user doesn't have permissions to read particular columns, the planner will not consult statistics for those columns. That can lead to a different, less optimal plan being used. Maybe something like that is happening here? I don't know, the query is pretty trivial, and the estimates seemed exactly the same in both cases. And it shouldn't affect how the query gets planned on the MSSQL side. But this seems really strange: Planning Time: 14029.724 ms ... Execution Time: 15102.803 ms It's not about the execution, it's about the planning. I have no idea why should the planning take this long, except maybe for waiting for a lock, or something like that. But that's not really consistent with the profile ... it's weird. I'm not familiar with tds_fdw, but I see there are a bunch of table options [1] that might affect this, namely: * use_remote_estimate * local_tuple_estimate * row_estimate_method (defaults to 'execute') Are you sure these are set to the same value on both machines? Wild random guesses: 1) Could you try running the query with jit=off? 2) Did you run ANALYZE on the foreign table? Could matter when not using remote estimates (use_remote_estimate=false). 3) Could it be some sort of memory pressure/swapping? But that would look different in the profile, AFAIK. regards [1] https://github.com/tds-fdw/tds_fdw/blob/master/ForeignTableCreation.md
Re: Slow performance of collate "en_US.utf8"
Hi Alexey On 2/27/25 15:54, Alexey Borschev wrote: Hi everyone! I see poor performance of text sorting of collate "en_US.utf8" in PG 17.4. Test query: explain (analyze, costs, buffers, verbose) SELECT ('БвЁжЫйяЙSёYz&$' || gen.id) collate "en_US.utf8" FROM generate_series(1, 1) AS gen(id) order by 1 desc; I've got execution time like: Execution Time: 73.068 ms Same poor result with ru_RU.UTF8. With other collations time is much better: explain (analyze, costs, buffers, verbose) select ('БвЁжЫйяЙSёYz&$' || gen.id) collate "C" from generate_series(1, 1) AS gen(id) order by 1 desc; Execution Time: 4.792 ms explain (analyze, costs, buffers, verbose) SELECT ('БвЁжЫйяЙSёYz&$' || gen.id) collate "C.utf8" FROM generate_series(1, 1) AS gen(id) order by 1 desc; Execution Time: 7.473 ms explain (analyze, costs, buffers, verbose) select ('БвЁжЫйяЙSёYz&$' || gen.id) collate "und-x-icu" from generate_series(1, 1) AS gen(id) order by 1 desc; Execution Time: 13.282 ms Yes, collate C is fastest, ICU collations is ~ 2 slower then C, but "en_US.utf8" is ~ 10x slower! I suspect it is some performance issue over there. Can someone of PG hackers reproduce this please? 1) This PG17 instance was installed with default options, and initdb got en_US.utf8 as system default collation and created PG cluster with it. It seems like most PG databases are created this way with en_US.utf8 by default. It seems you initialized the cluster with libc as the locale provider. Have you tried with icu ? 2) Typical text\varchar columns are created with DB default en_US.utf8 and performs poor. explain (analyze, costs, buffers, verbose) select ('БвЁжЫйяЙSёYz&$' || gen.id) from generate_series(1, 1) AS gen(id) order by 1 desc; Execution Time: 73.600 ms 3) The index search operations are also slower with en_US.utf8, but the difference is not as high. Please see attached file with test table and indexes, with tests on index performance. System Details: postgres@borschev-pg-copydb1:~$ uname -a Linux borschev-pg-copydb1 6.1.0-7-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.1.20-1 (2023-03-19) x86_64 GNU/Linux postgres@borschev-pg-copydb1:~$ cat /etc/issue Debian GNU/Linux trixie/sid \n \l select version(); PostgreSQL 17.4 (Debian 17.4-1.pgdg110+2) on x86_64-pc-linux-gnu, compiled by gcc (Debian 10.2.1-6) 10.2.1 20210110, 64-bit SELECT * FROM pg_config(); |name |setting | |-|--| |BINDIR |/usr/lib/postgresql/17/bin | |DOCDIR |/usr/share/doc/postgresql-doc-17 | |HTMLDIR |/usr/share/doc/postgresql-doc-17 | |INCLUDEDIR |/usr/include/postgresql | |PKGINCLUDEDIR |/usr/include/postgresql | |INCLUDEDIR-SERVER|/usr/include/postgresql/17/server | |LIBDIR |/usr/lib/x86_64-linux-gnu | |PKGLIBDIR |/usr/lib/postgresql/17/lib | |LOCALEDIR |/usr/share/locale | |MANDIR |/usr/share/postgresql/17/man | |SHAREDIR |/usr/share/postgresql/17 | |SYSCONFDIR |/etc/postgresql-common | |PGXS |/usr/lib/postgresql/17/lib/pgxs/src/makefiles/pgxs.mk | |CONFIGURE | '--build=x86_64-linux-gnu' '--prefix=/usr' '--includedir=${prefix}/include' '--mandir=${prefix}/share/man' '--infodir=${prefix}/share/info' '--sysconfdir=/etc' '--localstatedir=/var' '--disable-option-checking' '--disable-silent-rules' '--libdir=${prefix}/lib/x86_64-linux-gnu' '--run