Re: found xmin from before relfrozenxid on pg_catalog.pg_authid
Hi everyone, I just hit the same issue on 9.6.8:

2018-05-15 11:52:01 MSK 33558 @ from [vxid:317/92895305 txid:0] [] ERROR:  found xmin 2808837517 from before relfrozenxid 248712603
2018-05-15 11:52:01 MSK 33558 @ from [vxid:317/92895305 txid:0] [] CONTEXT:  automatic vacuum of table "template0.pg_catalog.pg_authid"

Additional details:

1) bt_index_check reports no errors on either index.

2) SELECT pg_truncate_visibility_map('pg_authid'::regclass); followed by VACUUM FREEZE doesn't help.

3) Six rows in total are affected:

template1=# select oid, ctid, xmin, xmax, cmin, cmax from pg_authid where xmin::text::bigint > (select relfrozenxid::text::bigint from pg_class where relname='pg_authid');
    oid    |  ctid  |    xmin    | xmax | cmin | cmax
-----------+--------+------------+------+------+------
 183671986 | (0,90) | 3039161773 |    0 |    0 |    0
 183106699 | (1,48) | 2576823237 |    0 |    0 |    0
 183921770 | (1,50) | 3265971811 |    0 |    0 |    0
 183921914 | (1,52) | 3266122344 |    0 |    0 |    0
 187988173 | (1,58) | 4258893789 |    0 |    0 |    0
 182424977 | (1,62) | 2808837517 |    0 |    0 |    0
(6 rows)

They sit on two pages in total.

4)
template1=# select relfrozenxid from pg_catalog.pg_class where relname = 'pg_authid';
 relfrozenxid
--------------
   2548304492
(1 row)

5) The rows themselves look perfectly valid and correspond to actual database users.

6) No database/server crash has happened in the last few years, and there are no disk errors or problems.

I suspect it could be related to the combination of the vacuum skip-locked-pages patch + freeze + shared catalog, but I cannot prove it yet.

Looking for a possible course of action. Probably the simplest fix is to drop and recreate the 6 affected users, but for now I'm willing to spend some time researching this issue.

-- 
Maxim Boguk
Senior Postgresql DBA
http://dataegret.com/

Phone RU: +7 985 433
Phone UA: +380 99 143
Phone AU: +61 45 218 5678

LinkedIn: http://www.linkedin.com/pub/maksym-boguk/80/b99/b1b
Skype: maxim.boguk

"Доктор, вы мне советовали так не делать, но почему мне по-прежнему больно когда я так делаю ещё раз?"
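[Editor's aside: comparing xid values via xid::text::bigint, as in the query above, ignores transaction-id wraparound. A minimal wraparound-aware sketch of the same check on pg_authid, using age():]

-- Tuples whose xmin is older (in wraparound-aware terms) than the table's relfrozenxid.
SELECT oid, ctid, xmin
FROM pg_authid
WHERE age(xmin) > age((SELECT relfrozenxid
                       FROM pg_class
                       WHERE oid = 'pg_authid'::regclass));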
Re: found xmin from before relfrozenxid on pg_catalog.pg_authid
Hi Andres,

> > Looking for a possible course of action. Probably the simplest fix is to
> > drop and recreate the 6 affected users, but for now I'm willing to spend
> > some time researching this issue.
>
> Could you use pageinspect to get the infomasks for the affected tuples?
>
> Greetings,
>
> Andres Freund

For a sample:

postgres=# vacuum pg_catalog.pg_authid;
ERROR:  found xmin 2894889518 from before relfrozenxid 248712603

select ctid, xmin, xmax, cmin, cmax from pg_catalog.pg_authid where xmin::text::bigint=2894889518;
  ctid  |    xmin    | xmax | cmin | cmax
--------+------------+------+------+------
 (1,26) | 2894889518 |    0 |    0 |    0
(1 row)

postgres=# SELECT * FROM heap_page_items(get_raw_page('pg_authid', 1)) where t_ctid::text='(1,26)';
-[ RECORD 1 ]---------------
lp          | 26
lp_off      | 4656
lp_flags    | 1
lp_len      | 144
t_xmin      | 2894889518
t_xmax      | 0
t_field3    | 0
t_ctid      | (1,26)
t_infomask2 | 32779
t_infomask  | 10507
t_hoff      | 32
t_bits      | 1100
t_oid       | 189787727

Any new role created in the database is instantly affected by this issue.

At the same time:

select relfrozenxid from pg_class where relname='pg_authid';
 relfrozenxid
--------------
   2863429136

So it's interesting where the value in "from before relfrozenxid 248712603" comes from.

-- 
Maxim Boguk
Senior Postgresql DBA
http://dataegret.com/

Phone RU: +7 985 433
Phone UA: +380 99 143
Phone AU: +61 45 218 5678

LinkedIn: http://www.linkedin.com/pub/maksym-boguk/80/b99/b1b
Skype: maxim.boguk

"Доктор, вы мне советовали так не делать, но почему мне по-прежнему больно когда я так делаю ещё раз?"
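[Editor's aside: the t_infomask value above can be decoded by hand against the flag constants in src/include/access/htup_details.h (values below are those constants for the 9.6 era; newer releases, 13+ if I recall correctly, can use pageinspect's heap_tuple_infomask_flags() instead). A minimal sketch, whose output matches Andres's decoding in the next message:]

-- Which infomask flags are set in t_infomask = 10507 (flag values from htup_details.h).
WITH flag(name, mask) AS (VALUES
    ('HEAP_HASNULL',        x'0001'::int),
    ('HEAP_HASVARWIDTH',    x'0002'::int),
    ('HEAP_HASEXTERNAL',    x'0004'::int),
    ('HEAP_HASOID',         x'0008'::int),
    ('HEAP_XMIN_COMMITTED', x'0100'::int),
    ('HEAP_XMIN_INVALID',   x'0200'::int),
    ('HEAP_XMAX_COMMITTED', x'0400'::int),
    ('HEAP_XMAX_INVALID',   x'0800'::int),
    ('HEAP_XMAX_IS_MULTI',  x'1000'::int),
    ('HEAP_UPDATED',        x'2000'::int))
SELECT name FROM flag WHERE 10507 & mask <> 0;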
Re: found xmin from before relfrozenxid on pg_catalog.pg_authid
On Tue, May 22, 2018 at 9:47 PM, Andres Freund wrote:

> Hi,
>
> On 2018-05-22 21:30:43 +0300, Maxim Boguk wrote:
> > For a sample:
> >
> > postgres=# vacuum pg_catalog.pg_authid;
> > ERROR:  found xmin 2894889518 from before relfrozenxid 248712603
> >
> > select ctid, xmin, xmax, cmin, cmax from pg_catalog.pg_authid where
> > xmin::text::bigint=2894889518;
> >   ctid  |    xmin    | xmax | cmin | cmax
> > --------+------------+------+------+------
> >  (1,26) | 2894889518 |    0 |    0 |    0
> >
> > postgres=# SELECT * FROM heap_page_items(get_raw_page('pg_authid', 1))
> > where t_ctid::text='(1,26)';
> > -[ RECORD 1 ]---------------
> > lp          | 26
> > lp_off      | 4656
> > lp_flags    | 1
> > lp_len      | 144
> > t_xmin      | 2894889518
> > t_xmax      | 0
> > t_field3    | 0
> > t_ctid      | (1,26)
> > t_infomask2 | 32779
> > t_infomask  | 10507
> > t_hoff      | 32
> > t_bits      | 1100
> > t_oid       | 189787727
>
> So this row has, if I didn't screw up decoding, the following infomask
> bits set:
>   HEAP_HASNULL
>   HEAP_HASVARWIDTH
>   HEAP_HASOID
>   HEAP_XMIN_COMMITTED
>   HEAP_XMAX_INVALID
>   HEAP_UPDATED
>
> So it's not been previously frozen, which I was wondering about.
> Which indeed makes it
>
> > Any new role created in the database is instantly affected by this issue.
>
> What's txid_current()?
>
> > At the same time:
> >
> > select relfrozenxid from pg_class where relname='pg_authid';
> >  relfrozenxid
> > --------------
> >    2863429136
> >
> > So it's interesting where the value in "from before relfrozenxid 248712603"
> > comes from.
>
> Hm. That's indeed odd. Could you get a backtrace of the error with "bt full"?
>
> Greetings,
>
> Andres Freund

select txid_current();
 txid_current
--------------
  41995913769

About the gdb bt - it's tricky because it is the mission-critical master DB of a huge project.
I'll try to promote a backup replica and check whether the issue persists there; if it does, we will have a playground for a while, but it will take some time to arrange.

-- 
Maxim Boguk
Senior Postgresql DBA
http://dataegret.com/

Phone RU: +7 985 433
Phone UA: +380 99 143
Phone AU: +61 45 218 5678

LinkedIn: http://www.linkedin.com/pub/maksym-boguk/80/b99/b1b
Skype: maxim.boguk

"Доктор, вы мне советовали так не делать, но почему мне по-прежнему больно когда я так делаю ещё раз?"
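[Editor's aside: txid_current() returns a 64-bit, epoch-qualified counter, so to compare it with 32-bit xids such as relfrozenxid the epoch has to be stripped, which is what Andres does in the next message. A small sketch of that arithmetic, using the value above:]

-- Split an epoch-qualified transaction id into epoch and 32-bit xid.
SELECT 41995913769 >> 32                      AS epoch,
       41995913769 & ((1::bigint << 32) - 1)  AS xid_32bit;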
Re: found xmin from before relfrozenxid on pg_catalog.pg_authid
On Tue, May 22, 2018 at 10:30 PM, Andres Freund wrote:

> Hi,
>
> On 2018-05-22 22:18:15 +0300, Maxim Boguk wrote:
> > On Tue, May 22, 2018 at 9:47 PM, Andres Freund wrote:
> > > > select relfrozenxid from pg_class where relname='pg_authid';
> > > >  relfrozenxid
> > > > --------------
> > > >    2863429136
>
> > select txid_current();
> >  txid_current
> > --------------
> >   41995913769
>
> So that's an xid of 3341208114, if you leave the epoch out. What's
> autovacuum_freeze_max_age set to in that cluster?

postgres=# show autovacuum_freeze_max_age;
 autovacuum_freeze_max_age
---------------------------
 200000000

(the default value, I think)

> Can you show pg_controldata output, and relminmxid from that cluster?

postgres@db:~$ /usr/lib/postgresql/9.6/bin/pg_controldata -D /var/lib/postgresql/9.6/main
pg_control version number:            960
Catalog version number:               201608131
Database system identifier:           6469368654711450114
Database cluster state:               in production
pg_control last modified:             Tue 22 May 2018 10:20:14 PM MSK
Latest checkpoint location:           CCB5/F9C37950
Prior checkpoint location:            CCB0/43F316B0
Latest checkpoint's REDO location:    CCB1/6706BD88
Latest checkpoint's REDO WAL file:    0001CCB10067
Latest checkpoint's TimeLineID:       1
Latest checkpoint's PrevTimeLineID:   1
Latest checkpoint's full_page_writes: on
Latest checkpoint's NextXID:          9:3341161759
Latest checkpoint's NextOID:          190071899
Latest checkpoint's NextMultiXactId:  59416233
Latest checkpoint's NextMultiOffset:  215588532
Latest checkpoint's oldestXID:        2814548646
Latest checkpoint's oldestXID's DB:   16400
Latest checkpoint's oldestActiveXID:  3341161759
Latest checkpoint's oldestMultiXid:   54264778
Latest checkpoint's oldestMulti's DB: 16400
Latest checkpoint's oldestCommitTsXid: 2814548646
Latest checkpoint's newestCommitTsXid: 3341161758
Time of latest checkpoint:            Tue 22 May 2018 10:05:16 PM MSK
Fake LSN counter for unlogged rels:   0/1
Minimum recovery ending location:     0/0
Min recovery ending loc's timeline:   0
Backup start location:                0/0
Backup end location:                  0/0
End-of-backup record required:        no
wal_level setting:                    replica
wal_log_hints setting:                on
max_connections setting:              2000
max_worker_processes setting:         8
max_prepared_xacts setting:           0
max_locks_per_xact setting:           64
track_commit_timestamp setting:       on
Maximum data alignment:               8
Database block size:                  8192
Blocks per segment of large relation: 131072
WAL block size:                       8192
Bytes per WAL segment:                16777216
Maximum length of identifiers:        64
Maximum columns in an index:          32
Maximum size of a TOAST chunk:        1996
Size of a large-object chunk:         2048
Date/time type storage:               64-bit integers
Float4 argument passing:              by value
Float8 argument passing:              by value
Data page checksum version:           0

postgres=# select datname, datfrozenxid, datminmxid from pg_database order by datname;
  datname  | datfrozenxid | datminmxid
-----------+--------------+------------
 **        |   2815939794 |   54265194
 postgres  |   2863429136 |   54280819
 template0 |   3148297669 |   59342106
 template1 |   2816765546 |   59261794

> I might be daft here, but it's surely curious that the relfrozenxid from
> the error and pg_catalog are really different (number of digits):
> catalog: 2863429136
> error:    248712603
>
> > About the gdb bt - it's tricky because it is the mission-critical master
> > DB of a huge project.
> > I'll try to promote a backup replica and check whether the issue persists
> > there; if it does, we will have a playground for a while, but it will
> > take some time to arrange.
>
> You should be ok to just bt that in the running cluster, but I
> definitely understand if you don't want to do that... I'd appreciate if
> you set up a playground, because this seems like something that'll
> reappear.

My gdb kung-fu is exceptionally rusty now, so I'd better play in a sandbox.
And if the error doesn't exist on a freshly promoted replica, that will give us useful information as well.

-- 
Maxim Boguk
Senior Postgresql DBA
http://dataegret.com/

Phone RU: +7 985 433
Phone UA: +380 99 143
Phone AU: +61 45 218 5678

LinkedIn: http://www.linkedin.com/pub/maksym-boguk/80/b99/b1b
Skype: maxim.boguk

"Доктор, вы мне советовали так не делать, но почему мне по-прежнему больно когда я так делаю ещё раз?"
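[Editor's aside: for anyone triaging a similar wraparound-related report, the per-database and per-table freeze horizons and their ages can be checked with standard catalog queries; a minimal sketch, not specific to this cluster:]

-- How far each database's datfrozenxid lags behind the current transaction counter.
SELECT datname,
       datfrozenxid,
       age(datfrozenxid) AS xid_age
FROM pg_database
ORDER BY age(datfrozenxid) DESC;

-- The same idea per table inside the current database (relfrozenxid lives in pg_class).
SELECT oid::regclass AS relation,
       relfrozenxid,
       age(relfrozenxid) AS xid_age
FROM pg_class
WHERE relkind IN ('r', 'm', 't')
ORDER BY age(relfrozenxid) DESC
LIMIT 10;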
Re: found xmin from before relfrozenxid on pg_catalog.pg_authid
> > About the gdb bt - it's tricky because it is the mission-critical master
> > DB of a huge project.
> > I'll try to promote a backup replica and check whether the issue persists
> > there; if it does, we will have a playground for a while, but it will
> > take some time to arrange.
>
> You should be ok to just bt that in the running cluster, but I
> definitely understand if you don't want to do that... I'd appreciate if
> you set up a playground, because this seems like something that'll
> reappear.
>
> Greetings,
>
> Andres Freund

OK, this issue is reproducible on the promoted replica, so now I have my playground.
I will provide gdb info in a few hours.

-- 
Maxim Boguk
Senior Postgresql DBA
http://dataegret.com/

Phone RU: +7 985 433
Phone UA: +380 99 143
Phone AU: +61 45 218 5678

LinkedIn: http://www.linkedin.com/pub/maksym-boguk/80/b99/b1b
Skype: maxim.boguk

"Доктор, вы мне советовали так не делать, но почему мне по-прежнему больно когда я так делаю ещё раз?"
Re: found xmin from before relfrozenxid on pg_catalog.pg_authid
On Thu, May 24, 2018 at 12:38 PM, Maxim Boguk wrote:

>> > About the gdb bt - it's tricky because it is the mission-critical master
>> > DB of a huge project.
>> > I'll try to promote a backup replica and check whether the issue persists
>> > there; if it does, we will have a playground for a while, but it will
>> > take some time to arrange.
>>
>> You should be ok to just bt that in the running cluster, but I
>> definitely understand if you don't want to do that... I'd appreciate if
>> you set up a playground, because this seems like something that'll
>> reappear.
>>
>> Greetings,
>>
>> Andres Freund

bt full:

#0  errmsg_internal (fmt=0x555b62e6eb70 "found xmin %u from before relfrozenxid %u")
    at /build/postgresql-9.6-9VrjLn/postgresql-9.6-9.6.8/build/../src/backend/utils/error/elog.c:828
        edata = <optimized out>
        __func__ = "errmsg_internal"
#1  0x555b62ad1cb9 in heap_prepare_freeze_tuple (tuple=<optimized out>, relfrozenxid=relfrozenxid@entry=248720453,
    relminmxid=relminmxid@entry=53644256, cutoff_xid=3485221679, cutoff_multi=<optimized out>,
    frz=frz@entry=0x555b640d1988, totally_frozen_p=0x7ffca32c0e90 "\001")
    at /build/postgresql-9.6-9VrjLn/postgresql-9.6-9.6.8/build/../src/backend/access/heap/heapam.c:6685
        changed = <optimized out>
        freeze_xmax = 0 '\000'
        xid = <optimized out>
        totally_frozen = <optimized out>
        __func__ = "heap_prepare_freeze_tuple"
#2  0x555b62bfd2b5 in lazy_scan_heap (aggressive=0 '\000', nindexes=2, Irel=0x555b64095948,
    vacrelstats=<optimized out>, options=26, onerel=0x555b64029498)
    at /build/postgresql-9.6-9VrjLn/postgresql-9.6-9.6.8/build/../src/backend/commands/vacuumlazy.c:1090
        tuple_totally_frozen = 1 '\001'
        itemid = 0x7f2b7bc5d4fc
        buf = 18138842
        page = <optimized out>
        offnum = 26
        maxoff = 26
        hastup = 1 '\001'
        nfrozen = 0
        freespace = <optimized out>
        all_frozen = 1 '\001'
        tupgone = 0 '\000'
        prev_dead_count = 0
        all_visible_according_to_vm = 0 '\000'
        all_visible = 1 '\001'
        has_dead_tuples = 0 '\000'
        visibility_cutoff_xid = 3490221678
        relname = 0x555b640296a8 "pg_authid"
        relminmxid = 53644256
        tups_vacuumed = 0
        indstats = 0x555b64095928
        relfrozenxid = 248720453
        skipping_blocks = 0 '\000'
        frozen = 0x555b640d1988
        initprog_val = {1, 2, 582}
        vacuumed_pages = 0
        num_tuples = 80
        nkeep = 0
        ru0 = {tv = {tv_sec = 1527175061, tv_usec = 739743}, ru = {ru_utime = {tv_sec = 0, tv_usec = 6}, ru_stime = {tv_sec = 1, tv_usec = 416000}, {ru_maxrss = 9704, __ru_maxrss_word = 9704}, {ru_ixrss = 0, __ru_ixrss_word = 0}, {ru_idrss = 0, __ru_idrss_word = 0}, {ru_isrss = 0, __ru_isrss_word = 0}, {ru_minflt = 33982, __ru_minflt_word = 33982}, {ru_majflt = 0, __ru_majflt_word = 0}, {ru_nswap = 0, __ru_nswap_word = 0}, {ru_inblock = 0, __ru_inblock_word = 0}, {ru_oublock = 8, __ru_oublock_word = 8}, {ru_msgsnd = 0, __ru_msgsnd_word = 0}, {ru_msgrcv = 0, __ru_msgrcv_word = 0}, {ru_nsignals = 0, __ru_nsignals_word = 0}, {ru_nvcsw = 38446, __ru_nvcsw_word = 38446}, {ru_nivcsw = 2, __ru_nivcsw_word = 2}}}
        vmbuffer = 8763411
        empty_pages = 0
        nunused = 0
        i = <optimized out>
        next_unskippable_block = <optimized out>
        buf = {data = 0x1, len = -1557393520, maxlen = 32764, cursor = -1557393616}
        nblocks = 2
        blkno = <optimized out>
        tuple = {t_len = 144, t_self = {ip_blkid = {bi_hi = 0, bi_lo = 1}, ip_posid = 26}, t_tableOid = 1260, t_data = 0x7f2b7bc5e6b0}
        initprog_index = {0, 1, 5}
#3  lazy_vacuum_rel (onerel=onerel@entry=0x555b64029498, options=options@entry=1, params=params@entry=0x7ffca32c11b0,
    bstrategy=<optimized out>)
    at /build/postgresql-9.6-9VrjLn/postgresql-9.6-9.6.8/build/../src/backend/commands/vacuumlazy.c:253
        vacrelstats = <optimized out>
        Irel = 0x555b64095948
        nindexes = 2
        ru0 = {tv = {tv_sec = 93850993708032, tv_usec = 23936}, ru = {ru_utime = {tv_sec = 128, tv_usec = 93851007694584}, ru_stime = {tv_sec = 0, tv_usec = 0}, {ru_maxrss = 140723046059424, __ru_maxrss_word = 140723046059424}, {ru_ixrss = 93850989549501, __ru_ixrss_word = 93850989549501}, {ru_idrss = 16, __ru_idrss_word = 16}, {ru_isrss = 93851007694584, __ru_isrss_word = 93851007694584}, {ru_minflt = 140723046059472, __ru_minflt_word = 140723046059472}, {ru_majflt = 93850989559601, __ru_majflt_word = 93850989559601}, {ru_nswap = 93850994010032, __ru_nswap_word = 93850994010032}, {ru_inblock = 139833259949608, __ru_inblock_word = 139833259949608}, {ru_oublock = 93850994010032, __ru_oublock_word = 9385099401003
Re: Missing query plan for auto_explain.
On Tue, Aug 30, 2022 at 1:38 PM Matheus Martin <matheus.mar...@voidbridge.com> wrote:

> Our Postgres recently started reporting considerably different execution
> times for the same query. When executed from our JDBC application the
> Postgres logs report an average execution time of 1500 ms, but when the
> query is manually executed through `psql` it doesn't take longer than 50 ms.
>
> With a view to investigating the discrepancy in the plan, we enabled
> `auto_explain` in `session_preload_libraries` with
> `auto_explain.log_min_duration = '1s'`. All application servers were bounced
> to ensure new connections were created and picked up the changes. However,
> this trouble query does not have an explain plan printed, even when its
> execution time exceeds the threshold (other queries do, though).
>
> Does anyone have ideas of why the explain plan is not being printed?
>
> Sample log entry for the trouble query executed from the application:
> ```
> Aug 26 09:11:33 db-931 postgres[8106]: [66-1] 2022-08-26 09:11:33 GMT [8106]:
> [5-1] db=betwave,user=betwave_app_readonly_user LOG: duration: 1423.481 ms
> bind :
> ```

My understanding of how auto_explain works is that it deals only with execution calls, but in your case the "duration: 1423.481 ms" is on the BIND call, i.e. before query execution. At least in my understanding, auto_explain cannot work and will not help in the case of a slow BIND call (because that is when the query is planned but not executed).

According to the documentation: "Query planning typically occurs when the Bind message is processed. If the prepared statement has no parameters, or is executed repeatedly, the server might save the created plan and re-use it during subsequent Bind messages for the same prepared statement."

It is hard to say what the reason for the slow planning is, but one likely reason (there could be others) is JIT work. Do you have JIT enabled?

-- 
Maxim Boguk
Senior Postgresql DBA
https://dataegret.com/

Phone UA: +380 99 143
Phone AU: +61 45 218 5678

LinkedIn: http://www.linkedin.com/pub/maksym-boguk/80/b99/b1b
Skype: maxim.boguk

"Доктор, вы мне советовали так не делать, но почему мне по-прежнему больно когда я так делаю ещё раз?"
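[Editor's aside: a minimal sketch of the settings discussed in this exchange, for checking whether JIT could be contributing; values are illustrative, and setting the auto_explain parameter assumes the library is loaded and the session has sufficient privileges:]

SHOW jit;                                   -- is JIT enabled at all?
SHOW jit_above_cost;                        -- plans costlier than this get JIT-compiled
SET jit = off;                              -- temporarily disable JIT in this session to compare timings
SET auto_explain.log_min_duration = '1s';   -- auto_explain threshold; per the reply above it covers execution, not planning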
Re: Missing query plan for auto_explain.
On Thu, Sep 8, 2022 at 1:18 PM Matheus Martin wrote:

> We do have JIT enabled (`jit=on`) with `jit_above_cost=10`.
>
> I am sorry but I don't quite understand what role JIT plays in
> this situation with `auto_explain`. Could you please elaborate on that?

In your log the time is spent not during the execution stage (where auto_explain can help) but during the bind/planning stage (i.e. generation of the plan). So you have a problem not with slow query execution but with slow query planning, and auto_explain cannot help in that case. JIT is one possible explanation of a planning-stage slowdown.

Can you run explain (analyze, costs, buffers, timing) of your query, check how long the planning stage took, and check whether JIT was used or not (and how much time was spent on JIT if it was used)?

-- 
Maxim Boguk
Senior Postgresql DBA
https://dataegret.com/

Phone UA: +380 99 143
Phone AU: +61 45 218 5678
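[Editor's aside: a sketch of the suggested diagnostic run. The actual application query is not shown in the thread, so a trivial stand-in is used below; with EXPLAIN ANALYZE, recent versions print "Planning Time", "Execution Time" and, when JIT fired, a "JIT" section with per-phase timings:]

-- Illustrative stand-in query; substitute the real application query here.
EXPLAIN (ANALYZE, COSTS, BUFFERS, TIMING)
SELECT count(*) FROM pg_class;

-- For comparison, the same run with JIT disabled in the session.
SET jit = off;
EXPLAIN (ANALYZE, COSTS, BUFFERS, TIMING)
SELECT count(*) FROM pg_class;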
Unexpectedly huge memory usage (over 180x of work_mem) during hash join... confirmed by TopMemoryContext results (postgresql 13.10)
hit=1795287, temp written=247932
              ->  Parallel Seq Scan on vacancy v  (cost=0.00..414267.15 rows=23473845 width=8) (actual time=0.011..5221.328 rows=18198520 loops=4)
                    Buffers: shared hit=1795287
        ->  Parallel Hash  (cost=297333.92..297333.92 rows=15801812 width=8) (actual time=4352.140..4352.141 rows=12280833 loops=4)
              Buckets: 1048576  Batches: 128  Memory Usage: 23296kB
              Buffers: shared hit=1393158, temp written=167644
              ->  Parallel Seq Scan on account_service_package_entry aspe  (cost=0.00..297333.92 rows=15801812 width=8) (actual time=0.015..2520.045 rows=12280833 loops=4)
                    Buffers: shared hit=1393158
  ->  Parallel Hash  (cost=258846.24..258846.24 rows=11429325 width=4) (actual time=3424.467..3424.468 rows=8908616 loops=4)
        Buckets: 1048576  Batches: 64  Memory Usage: 30080kB
        Buffers: shared hit=900350, temp written=103252
        ->  Parallel Seq Scan on employer_service es  (cost=0.00..258846.24 rows=11429325 width=4) (actual time=0.010..2087.361 rows=8908616 loops=4)
              Filter: (price = 0)
              Rows Removed by Filter: 3624132
              Buffers: shared hit=900350
Planning:
  Buffers: shared hit=236
Planning Time: 25.494 ms
Execution Time: 541359.085 ms

There are no big misestimations in rows, but during moments of high memory usage each parallel worker used over 5GB of RAM (so over 20GB of RAM in total for 3 workers + the leader).

Memory context results of a single worker:

TopMemoryContext: 3357032 total in 16 blocks; 46064 free (16 chunks); 3310968 used
...
  TopPortalContext: 8192 total in 1 blocks; 7656 free (0 chunks); 536 used
  PortalHoldContext: 24632 total in 2 blocks; 7384 free (0 chunks); 17248 used
  PortalContext: 540672 total in 69 blocks; 8568 free (10 chunks); 532104 used:
    ExecutorState: 1082567568 total in 124072 blocks; 4073608 free (7018 chunks); 1078493960 used
      HashTableContext: 74808 total in 5 blocks; 25008 free (3 chunks); 49800 used
        HashBatchContext: 8192 total in 1 blocks; 7928 free (0 chunks); 264 used
      HashTableContext: 74808 total in 5 blocks; 25056 free (3 chunks); 49752 used
        HashBatchContext: 8192 total in 1 blocks; 7928 free (0 chunks); 264 used
      HashTableContext: 40960 total in 3 blocks; 12328 free (6 chunks); 28632 used
        HashBatchContext: 8192 total in 1 blocks; 7928 free (0 chunks); 264 used
      HashTableContext: 4156883208 total in 124074 blocks; 146264 free (30 chunks); 4156736944 used
        HashBatchContext: 8192 total in 1 blocks; 7928 free (0 chunks); 264 used
      HashTableContext: 8192 total in 1 blocks; 4376 free (7 chunks); 3816 used
        HashBatchContext: 8192 total in 1 blocks; 7928 free (0 chunks); 264 used
      ExprContext: 8192 total in 1 blocks; 7928 free (0 chunks); 264 used
      ...
      ExprContext: 8192 total in 1 blocks; 7928 free (0 chunks); 264 used
  Relcache by OID: 16384 total in 2 blocks; 1424 free (2 chunks); 14960 used
  CacheMemoryContext: 1163704 total in 14 blocks; 197832 free (0 chunks); 965872 used
    relation rules: 24576 total in 6 blocks; 4896 free (1 chunks); 19680 used: pg_settings
    index info: 2048 total in 2 blocks; 528 free (1 chunks); 1520 used: pg_toast_2619_index
    48 more child contexts containing 91136 total in 83 blocks; 27632 free (45 chunks); 63504 used
  PrivateRefCount: 8192 total in 1 blocks; 2616 free (0 chunks); 5576 used
  MdSmgr: 8192 total in 1 blocks; 3880 free (2 chunks); 4312 used
  LOCALLOCK hash: 32768 total in 3 blocks; 16824 free (8 chunks); 15944 used
  Timezones: 104120 total in 2 blocks; 2616 free (0 chunks); 101504 used
  ErrorContext: 8192 total in 1 blocks; 7928 free (4 chunks); 264 used
Grand total: 5245755200 bytes in 248596 blocks; 4995280 free (7224 chunks); 5240759920 used

Also during query execution there was a huge number of small temp files:

find ~postgres/13/main/base/pgsql_tmp/ -type f | wc -l
1459383

There are two questions:
1) Is this behavior normal/expected?
2) Is there any way to limit the memory usage of such queries to prevent OOM at random times (memory usage of almost 200x work_mem surprised me a lot)?

PS: crash recovery of a database with a few million files in pgsql_tmp takes ages (however, this is fixed in newer versions, so it's not a critical issue).

-- 
Maxim Boguk
Senior Postgresql DBA
https://dataegret.com/

Phone UA: +380 99 143
Phone AU: +61 45 218 5678
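[Editor's aside: a sketch of the session-level knobs that influence, but do not hard-cap, hash join memory on PostgreSQL 13; values are illustrative, not recommendations for this particular workload:]

SET work_mem = '32MB';
SET hash_mem_multiplier = 2.0;              -- PG 13+: hash tables may use work_mem times this multiplier
SET max_parallel_workers_per_gather = 2;    -- fewer workers means fewer per-worker hash tables in memory
SET enable_parallel_hash = off;             -- fall back to non-shared hash tables if parallel hash misbehaves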
Re: Query planner / Analyse statistics bad estimate rows=1 with maximum statistics 10000 on PostgreSQL 10.2
> Cond: (school_id = 36)
>  Planning time: 0.334 ms
>  Execution time: 6.542 ms
> (4 rows)
>
> db=> select school_id, count(*) from common_student group by school_id
> order by count(*) limit 6;
>  school_id | count
> -----------+-------
>          1 |    50
>         88 |   161
>         53 |   252
>         94 |   422
>         31 |   434
>         68 |   454
> (6 rows)
>
> dvpjxbzc=> select school_id, count(*) from common_student where school_id
> = 36 group by school_id;
>  school_id | count
> -----------+-------
>         36 |  1388
> (1 row)
>
> db=> explain analyse select * from common_student where school_id = 1;
>                                                       QUERY PLAN
> ------------------------------------------------------------------------------------------------------------------------
>  Bitmap Heap Scan on common_student  (cost=4.91..243.76 rows=64 width=385) (actual time=0.128..0.267 rows=50 loops=1)
>    Recheck Cond: (school_id = 1)
>    Heap Blocks: exact=16
>    ->  Bitmap Index Scan on "unique common_student"  (cost=0.00..4.90 rows=64 width=0) (actual time=0.110..0.110 rows=50 loops=1)
>          Index Cond: (school_id = 1)
>  Planning time: 0.177 ms
>  Execution time: 0.414 ms
> (7 rows)
>
> db=> select VERSION();
>                                    version
> -----------------------------------------------------------------------------
>  PostgreSQL 10.2 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.2, 64-bit
> (1 row)

-- 
Maxim Boguk
Senior Postgresql DBA
https://dataegret.com/

Phone RU: +7 985 433
Phone UA: +380 99 143
Phone AU: +61 45 218 5678

LinkedIn: http://www.linkedin.com/pub/maksym-boguk/80/b99/b1b
Skype: maxim.boguk

"Доктор, вы мне советовали так не делать, но почему мне по-прежнему больно когда я так делаю ещё раз?"
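[Editor's aside: a sketch of the usual checks for this kind of row-count misestimate; the table and column names are taken from the quoted output above, and the 10000 target matches the thread subject. This illustrates the mechanics, not a confirmed fix for the reporter's case:]

-- Raise the per-column statistics target and refresh statistics.
ALTER TABLE common_student ALTER COLUMN school_id SET STATISTICS 10000;
ANALYZE common_student;

-- Inspect what the planner now knows about the column's value distribution.
SELECT n_distinct, most_common_vals, most_common_freqs
FROM pg_stats
WHERE tablename = 'common_student' AND attname = 'school_id';

-- Re-check the estimate for the value that was previously estimated as rows=1.
EXPLAIN SELECT * FROM common_student WHERE school_id = 36;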