autovacuum_freeze_max_age on append-only tables
Hi All,

I'm attempting to mimic a new feature in version 13 where INSERTs will trigger vacuum on an append-only table. I'm using v11 and configuring autovacuum_freeze_max_age to a value representing some number of minutes' worth of inserts on a table containing the current day's events. I'm looking to understand the details of how the vacuum operates and what to expect and plan for.

I first ran into an issue when a script attempted to alter the table to change the value of autovacuum_freeze_max_age while a vacuum was running. I know there is a lock conflict while the vacuum is running, but I was under the impression that autovacuum_vacuum_cost_limit would limit the time blocked. The ALTER hung much longer than I expected.

I'm apparently needing an education on how this "to avoid wraparound" vacuum differs from any other. I've seen it referenced as "more aggressive" but I'd like details. An upgrade to 13 is "right around the corner".

Pointers to documentation I might have missed are appreciated.

-Senor
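P.S. A sketch of the kind of per-table setting and the statement that hung (table name and value are placeholders, not the production ones):

-- Illustrative only: per-table storage parameter on the append-only events table.
ALTER TABLE current_day_events
    SET (autovacuum_freeze_max_age = 50000000);   -- placeholder value

-- This is the kind of statement that blocked: ALTER TABLE ... SET needs a lock
-- that conflicts with the lock the running VACUUM holds on the table.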
Re: autovacuum_freeze_max_age on append-only tables
Thank you both, Laurenz and Peter.

Laurenz - It was an article you posted a couple of years ago introducing the v13 feature that got me thinking about the insert-only situation I had.

Peter - I had been improperly holding anti-wraparound and aggressive in my mind as related in a way they are not. You cleared that up.

Two last questions (maybe):

Are the autovacuum_vacuum_cost_* settings handled any differently for 'to avoid wraparound' vacuums? I understand that it won't give up a lock, but I was expecting it to still back off due to cost and allow the query with the conflicting lock to proceed.

Is there any benefit to manually running a vacuum every so many inserts as opposed to using autovacuum_freeze_max_age? And in this case, should it be a VACUUM FREEZE? Rows are never updated or deleted except for the occasional rollback due to dropped network connections.

Thanks again
-Senor

On 4/21/2022 6:35, Laurenz Albe wrote:
On Wed, 2022-04-20 at 23:06 +, senor wrote:
> I'm apparently needing an education on how this "to avoid wraparound" vacuum differs from any other.
> I've seen it referenced as "more aggressive" but I'd like details.

The difference is twofold, as far as I know:
- it will not skip any pages just because it happens not to get a lock on them
- it will refuse to die if the lock it holds on the table conflicts with a user lock

Unless you are in the habit of taking strong locks on the table, you shouldn't notice a difference. Anti-wraparound VACUUM is a routine activity and does not interfere with DML, just like a normal VACUUM.

Yours,
Laurenz Albe
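P.S. For the second question, the sort of manual pass I have in mind is roughly this (table name is a placeholder; just a sketch):

-- Check how old the table's relfrozenxid is before deciding to run a pass.
SELECT oid::regclass AS table_name, age(relfrozenxid) AS xid_age
FROM pg_class
WHERE relname = 'current_day_events';   -- placeholder name

-- Freeze the append-only table in one manual pass.
VACUUM (FREEZE, VERBOSE) current_day_events;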
Re: autovacuum_freeze_max_age on append-only tables
Thanks David.

In any case, the cost-based stuff throttles I/O only (per the docs at least), but even while sleeping it still holds its lock. And it won't be kicked off of the lock by other processes.

I don't see where it is documented that the autovacuum cost settings are altered during the anti-wraparound vacuum, so I presume it will still sleep by default.

I knew I was misunderstanding something. I had picked up the impression that the vacuum process's cost_delay released the lock for the period, not that it just does nothing. Seems like that would be worth mentioning in the docs.

I'm learning this from the inside out in the tradition of "well, someone has to do it". I'm sure I'm not alone.

-Senor
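P.S. Next time the ALTER blocks, this is roughly the check I plan to run to see who is waiting behind whom (a sketch; column list trimmed):

-- Backends that are blocked, and the pids blocking them (pg_blocking_pids is available in 9.6+).
SELECT pid,
       pg_blocking_pids(pid) AS blocked_by,
       wait_event_type,
       wait_event,
       left(query, 60) AS query
FROM pg_stat_activity
WHERE cardinality(pg_blocking_pids(pid)) > 0;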
empty pg_stat_progress_vacuum
Hi all,

I'm not seeing any records in pg_stat_progress_vacuum even though there are always around three autovacuum operations listed in pg_stat_activity, and at least one of them is "(to prevent wraparound)". They are in state 'active' and state_change is within the last 3 hours. When logging vacuums I see one that just completed stating the elapsed time was 9 hours. It seems to be trying to catch up from a long-running transaction. From the vacuum logs I can say it's progressing, although it's not working on the tables with the oldest xids.

I'm mainly wanting to understand why I'm not seeing processes in pg_stat_progress_vacuum. If I rapidly refresh I occasionally see an entry for a very small table. A manually started vacuum didn't show up either.

Pg version 11.4. Rebooted a couple of days ago, which may have cleared a hung transaction. Logs were not enabled at the time. Disk IO is fluctuating to a degree that it doesn't seem like any bottleneck. I'm not convinced the RAID is performing as well as it should, but I don't think it's bad enough to be of concern. I have 2 other 11.4 installations where I've never had an issue relying on pg_stat_progress_vacuum.

stats_temp_directory = '/var/run/postgresql' and directory contents are updating

 autovacuum                          | on
 autovacuum_analyze_scale_factor     | 0.1
 autovacuum_analyze_threshold        | 50
 autovacuum_freeze_max_age           | 2
 autovacuum_max_workers              | 6
 autovacuum_multixact_freeze_max_age | 4
 autovacuum_naptime                  | 4
 autovacuum_vacuum_cost_delay        | 20
 autovacuum_vacuum_cost_limit        | 1500
 autovacuum_vacuum_scale_factor      | 0.2
 autovacuum_vacuum_threshold         | 50
 autovacuum_work_mem                 | -1
 work_mem                            | 10240

Thanks for any hints and recommendations,
Senor
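P.S. This is roughly how I'm comparing the two views (a sketch):

-- Autovacuum workers from pg_stat_activity, with their progress row (if any) from
-- pg_stat_progress_vacuum. Workers with no progress row show NULLs on the right side.
SELECT a.pid, a.state, a.state_change, p.phase,
       p.heap_blks_scanned, p.heap_blks_total,
       left(a.query, 60) AS query
FROM pg_stat_activity a
LEFT JOIN pg_stat_progress_vacuum p ON p.pid = a.pid
WHERE a.backend_type = 'autovacuum worker'
   OR a.query LIKE 'autovacuum:%';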
autovacuum hung on simple tables
Hi All,

I'm still trying to get a better understanding of the autovacuum process. This is a different postgres installation than my previous posts and it is confusing me in new ways. Still 11.4, running on CentOS 7, and 8 nvme in software raid.

This issue started with postgres "...not accepting commands to avoid wraparound...". On this server I was able to stop all access to the DB and dedicate resources to only postgres. I thought I could allow autovacuum to do its thing with a ton of workers.

I think everything boils down to 2 questions:

1. Can autovacuum or manual vacuum be coerced into dealing with the oldest tables first?
1a. Where might I find advice on configuring postgres resources for maximum cpu & memory maintenance use? In other words, the quickest path out of "not accepting commands" land, besides increasing autovacuum_freeze_max_age.
2. What can cause autovacuum to stall? Could an associated toast table or index be the cause?

It appeared that autovacuum was not choosing the tables with the oldest xmin, so I produced an ordered list of the oldest tables with:

SELECT oid::regclass, age(relfrozenxid)
FROM pg_class
WHERE relkind IN ('r', 't', 'm')
  AND age(relfrozenxid) > 20
ORDER BY 2 DESC

The list contained over 6000 tables from pg_toast. They all belonged to daily reports tables. The reports are created daily and not touched again.

Most of the autovacuums that did start seem to be hung, never completing even on the simplest tables. The newest 2 autovacuums in the list are completing about one every couple of seconds. CPU and disk IO are nearly idle. An example table is shown here:

phantom=# select
phantom-#   pg_size_pretty(pg_total_relation_size(relid)) as total_size,
phantom-#   pg_size_pretty(pg_relation_size(relid, 'main')) as relation_size_main,
phantom-#   pg_size_pretty(pg_relation_size(relid, 'fsm')) as relation_size_fsm,
phantom-#   pg_size_pretty(pg_relation_size(relid, 'vm')) as relation_size_vm,
phantom-#   pg_size_pretty(pg_relation_size(relid, 'init')) as relation_size_init,
phantom-#   pg_size_pretty(pg_table_size(relid)) as table_size,
phantom-#   pg_size_pretty(pg_total_relation_size(relid) - pg_relation_size(relid)) as external_size
phantom-# from
phantom-#   pg_catalog.pg_statio_user_tables
phantom-# where
phantom-#   relname like 'report_user_439';
 total_size | relation_size_main | relation_size_fsm | relation_size_vm | relation_size_init | table_size | external_size
------------+--------------------+-------------------+------------------+--------------------+------------+---------------
 80 kB      | 8192 bytes         | 24 kB             | 8192 bytes       | 0 bytes            | 48 kB      | 72 kB
(1 row)

I scripted a vacuum loop using the oldest table list. It's extremely slow but it was making better progress than autovacuum was.

Using ps I see that there were as many worker processes as defined with autovacuum_max_workers, but pg_stat_activity consistently showed 19. I killed the script thinking there might be a conflict. I saw no difference after 30 minutes, so I restarted the script. I never saw anything in pg_stat_progress_vacuum.
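The loop is essentially this psql pattern (a sketch of the idea, not the actual script; the age cutoff below is a placeholder):

-- Generate one VACUUM FREEZE per old table and run them with \gexec, oldest first.
SELECT format('VACUUM (FREEZE, VERBOSE) %s', oid::regclass)
FROM pg_class
WHERE relkind IN ('r', 't', 'm')
  AND age(relfrozenxid) > 200000000      -- placeholder cutoff
ORDER BY age(relfrozenxid) DESC
\gexec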
vacuum settings:

                name                 | setting
-------------------------------------+---------
 autovacuum                          | on
 autovacuum_analyze_scale_factor     | 0.1
 autovacuum_analyze_threshold        | 50
 autovacuum_freeze_max_age           | 2
 autovacuum_max_workers              | 40
 autovacuum_multixact_freeze_max_age | 4
 autovacuum_naptime                  | 4
 autovacuum_vacuum_cost_delay        | 0
 autovacuum_vacuum_cost_limit        | 5000
 autovacuum_vacuum_scale_factor      | 0.2
 autovacuum_vacuum_threshold         | 50
 autovacuum_work_mem                 | -1
 log_autovacuum_min_duration         | 0
 vacuum_cleanup_index_scale_factor   | 0.1
 vacuum_cost_delay                   | 0
 vacuum_cost_limit                   | 200
 vacuum_cost_page_dirty              | 20
 vacuum_cost_page_hit                | 1
 vacuum_cost_page_miss               | 10
 vacuum_defer_cleanup_age            | 0
 vacuum_freeze_min_age               | 5000
 vacuum_freeze_table_age             | 15000
 vacuum_multixact_freeze_min_age     | 500
 vacuum_multixact_freeze_table_age   | 15000

I'm now thinking that autovacuum getting hung up is what caused the issue to begin with. I see nothing but the successful vacuums from the script and my own fat-fingering commands in the postgres logs (set at info).

Any hints are appreciated.
Senor
Re: autovacuum hung on simple tables
I'm a little late getting back to this but still have no solution.

I 100% agree that updating postgres is best. The usual excuses for not updating apply. It will be done when it is allowed to be done. It remains 11.4 for now. I read through the changelogs up through 11.18 and didn't see anything obviously related. The underlying cause could be pretty involved and something I wouldn't recognize.

Thank you Laurenz Albe for reminding me about an important clue. I had inadvertently executed a vacuum freeze on a table that pg_stat_activity indicated was currently being autovacuumed. The manual vacuum succeeded while leaving the previous autovacuum still showing active in pg_stat_activity.

Manual vacuum always completes and never stalls, but it also often does not appear in pg_stat_progress_vacuum unless it's a longer process. It appears the process completes the vacuum but does not register that fact. relfrozenxid of the main table is what would be expected, but an associated toast table was still very old. Cancelling all pending vacuums of the table and manually running vacuum completes in a few seconds, and both the main relation and the toast are updated as expected, with the last vacuum time updated in pg_stat_all_tables. pg_stat_progress_vacuum never had any entry.

Autovacuum and autovacuum analyze both get hung. I often see the same table listed multiple times in pg_stat_activity with different pids and:

state: active
backend_type: autovacuum worker
wait_event_type: blank
state_change: 10-15 seconds after backend_start - about how long it takes to manually run vacuum on most tables.

(The check I use to watch these workers is sketched in the P.S. at the end of this message.)

What exactly does autovacuum rely on in the stats file? I ran strace on a hung autovacuum process and saw a repeated read of /run/postgresql/db_16384.stat (tempfs). The file is 740MB, which is about the same as other similar installations I've reviewed. I'm lacking in overall experience in this, though.

One final oddity: I ran a query for the oldest relfrozenxid and redirected to file. The query took around a minute. A few seconds after it finished, I queried for 'active' in pg_stat_activity and the oldest-relfrozenxid query was still listed. A few seconds later it had cleared.

Can a corrupted stats file prevent autovacuum from reading/writing?

> I scripted a vacuum loop using the oldest table list. It's extremely slow but it was
> making better progress than autovacuum was.
>
> Using ps I see that there were as many worker processes as defined with autovacuum_max_workers
> but pg_stat_activity consistently showed 19. I killed the script thinking there might be a conflict.
> I saw no difference after 30 minutes so restarted script.

I am not sure what exactly you are actually doing here, but you should know that there can only be one VACUUM process per table. If there is already an anti-wraparound autovacuum running on the table, a manual VACUUM will simply be blocked until the autovacuum worker is done.

> Never saw anything in pg_stat_progress_vacuum.

Now that would be weird, except if VACUUM cannot get the required lock on the table.

> vacuum settings:
>  name                         | setting
> ------------------------------+---------
>  autovacuum                   | on
>  autovacuum_freeze_max_age    | 2
>  autovacuum_max_workers       | 40
>  autovacuum_naptime           | 4
>  autovacuum_vacuum_cost_delay | 0
>  autovacuum_vacuum_cost_limit | 5000
>  autovacuum_work_mem          | -1
>  vacuum_freeze_min_age        | 5000
>  vacuum_freeze_table_age      | 15000
>
> I'm now thinking that autovacuum getting hung up is what caused the issue to begin with.
> I see nothing but the successful vacuums from the script and my own fat-fingering commands
> in the postgres logs (set at info).

Sorry about the reply formatting. I tried using Outlook web in Edge. Maybe that was a mistake.

Thanks,
Senor
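P.S. The pg_stat_activity check referenced above is roughly this (a sketch; column list trimmed):

-- Autovacuum workers, how long they have been in their current state, and what they wait on.
SELECT pid,
       now() - state_change AS in_state_for,
       wait_event_type,
       wait_event,
       left(query, 60) AS query
FROM pg_stat_activity
WHERE backend_type = 'autovacuum worker'
ORDER BY state_change;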
vacuum TOAST tables
Hi All,

PostgreSQL v11.19-1PGDG.rhel7 (upgrading to 12+ is so far only talk)
CentOS 7.9

I see various recommendations for queries to return the tables most in need of a vacuum that exclude the TOAST tables with something like:

...AND n.nspname NOT IN ('pg_toast')

My real issue is autovacuum not keeping up in spite of seeming to have the resources to do so. My question at the moment is whether it is best to include TOAST tables when scripting a manual vacuum of the oldest tables.

I'm currently querying for a list of the oldest X number of tables and feeding it to 5-20 "threads" while monitoring resources. If it's in the pg_toast namespace (which is all of them), I execute a vacuum freeze on the main table, repeating this as necessary. All are TOAST, and they belong to sets of tables that are created over a day and never updated after. These tables are months old. I've asked the developers to initiate vacuums, so at the moment I'm more interested in understanding best practice in this scenario.

If I understand correctly, autovacuum handles tables and their associated TOAST tables separately, but a manual vacuum will also vacuum the TOAST. When manually vacuuming, does it matter whether it's the main table or the TOAST?

I'm not where I have access, but this is close to what I'm using. I've toggled between using a limit and relfrozenxid greater than X. I want to use the least resource-intensive combination of query and vacuum.

SELECT c.oid::regclass, age(c.relfrozenxid), o.relname, o.relnamespace
FROM pg_class c
LEFT JOIN pg_class o on (c.relfilenode != 0 AND c.relfilenode = o.reltoastrelid)
JOIN pg_namespace n on c.relnamespace = n.oid
WHERE c.relkind IN ('r', 't', 'p')
  AND n.nspname IN ('public','pg_toast')
  AND age(c.relfrozenxid) > ${max_age}
ORDER BY 2 DESC

I've posted before about these same systems. It'll get to age(datfrozenxid) > 2,000,000,000 and is not able to keep up until I get it back down to under ~6. Then it starts humming along as if I "cleared" something.

I appreciate any advice.

Thanks
Senor
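P.S. The database-level number I'm watching comes from something like this:

-- Oldest datfrozenxid per database; this is the age that eventually trips the wraparound warnings.
SELECT datname, age(datfrozenxid) AS xid_age
FROM pg_database
ORDER BY 2 DESC;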
Re: vacuum TOAST tables
Thank you Laurenz.

Current settings:

                name                 | setting | unit |       source
-------------------------------------+---------+------+--------------------
 autovacuum                          | on      |      | default
 autovacuum_analyze_scale_factor     | 0.1     |      | default
 autovacuum_analyze_threshold        | 50      |      | default
 autovacuum_freeze_max_age           | 2       |      | default
 autovacuum_max_workers              | 15      |      | configuration file
 autovacuum_multixact_freeze_max_age | 4       |      | default
 autovacuum_naptime                  | 1       | s    | configuration file
 autovacuum_vacuum_cost_delay        | 20      | ms   | default
 autovacuum_vacuum_cost_limit        | 3000    |      | configuration file
 autovacuum_vacuum_scale_factor      | 0.2     |      | default
 autovacuum_vacuum_threshold         | 50      |      | default
 autovacuum_work_mem                 | -1      | kB   | default
 maintenance_work_mem                | 2097152 | kB   | configuration file
 max_parallel_maintenance_workers    | 2       |      | default
 vacuum_cleanup_index_scale_factor   | 0.1     |      | default
 vacuum_cost_delay                   | 0       | ms   | default
 vacuum_cost_limit                   | 200     |      | default
 vacuum_cost_page_dirty              | 20      |      | default
 vacuum_cost_page_hit                | 1       |      | default
 vacuum_cost_page_miss               | 10      |      | default
 vacuum_defer_cleanup_age            | 0       |      | default
 vacuum_freeze_min_age               | 5000    |      | default
 vacuum_freeze_table_age             | 15000   |      | default
 vacuum_multixact_freeze_min_age     | 500     |      | default
 vacuum_multixact_freeze_table_age   | 15000   |      | default

I've compared 'vacuums per hour' with autovacuum_vacuum_cost_delay at 0 and at the default with no change. My gut says there's an issue with the stats collection. It seems like autovacuum is not correctly updating the stats. As I mentioned in previous posts, I often see long-running active autovacuums in pg_stat_activity on small tables and nothing in pg_stat_progress_vacuum.

I found one reference (which I can't find now) to a similar issue where the suspected problem was a huge and unintended number of tables. I eliminated over half and it did improve. I'm working to convince the powers that be to upgrade to pg15 for the new stats. Not holding my breath.

I'll get back to the other thread when I get time to focus on it. For now my question is answered and I'll modify my scripts.

Your help is appreciated. If you have additional ideas, I'm all ears.

Thanks,
Senor

From: Laurenz Albe
Sent: Thursday, April 20, 2023 11:39 PM
To: senor; pgsql-general@lists.postgresql.org
Subject: Re: vacuum TOAST tables

On Fri, 2023-04-21 at 04:37 +, senor wrote:
> PostgreSQL v11.19-1PGDG.rhel7 (upgrading to 12+ is so far only talk)
> CentOS 7.9
>
> If I understand correctly, autovacuum handles tables and their associated TOAST tables separately
> but a manual vacuum will also vacuum the TOAST.

That is correct.

> When manually vacuuming does it matter whether it's the main table or TOAST?

It makes a difference. As superuser you can directly VACUUM a toast table, and that will not VACUUM the table it belongs to. However, if you VACUUM the main table, both tables will be vacuumed, which is more work. So if you only need VACUUM on the toast table, doing that directly will be cheaper.

> I've posted before about these same systems. It'll get to age(datfrozenxid) > 2,000,000,000 and
> is not able to keep up until I get it back down to under ~6. Then it starts humming along
> as if I "cleared" something.

That sounds unsavory. Did you set "autovacuum_freeze_max_age" to an extremely high value?

Yours,
Laurenz Albe
--
Cybertec | https://www.cybertec-postgresql.com
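P.S. Based on the above, the pattern I'll put in the scripts looks roughly like this (both relation names below are made-up examples, just to illustrate the difference):

-- Vacuuming only the toast relation (as superuser) does not touch its parent table.
VACUUM (FREEZE, VERBOSE) pg_toast.pg_toast_123456;   -- made-up toast relation name

-- Vacuuming the parent processes both the parent and its toast table (more work).
VACUUM (FREEZE, VERBOSE) public.some_report_table;   -- made-up table name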
pg_upgrade --jobs
The pg_upgrade --jobs option is not passed as an argument when it calls pg_dump. I haven't found anything in the docs or forums mentioning a reason for not supporting it under certain circumstances, other than possibly for pre-9.2. The pg_upgrade docs page states that it allows multiple CPUs to be used for dump and reload of schemas. Some databases I'm upgrading have 500,000+ tables, and running with a single process is greatly increasing the upgrade time.

I am also using the --link option. I have tried "--jobs 20", "--jobs=20", placing this option first and last, and many other variations.

I am upgrading 9.2.4 to 9.6.12 on CentOS 6. Varying hardware but all with 32+ CPU cores.

su - postgres -c "/usr/pgsql-9.6/bin/pg_upgrade --jobs=20 --link \
  --old-bindir=/usr/pgsql-9.2/bin/ \
  --new-bindir=/usr/pgsql-9.6/bin/ \
  --old-datadir=/var/lib/pgsql/9.2/data/ \
  --new-datadir=/var/lib/pgsql/9.6/data/"

I feel like there's a simple reason I've missed but this seems pretty straightforward. A secondary plan would be to find instructions for doing the same as "pg_upgrade --link" manually so I can run "pg_dump --jobs 20".

Any assist is appreciated.
Thanks,
S. Cervesa
Re: pg_upgrade --jobs
Thank you for responding. I did see that note and should have included that as part of my question. Since pg_upgrade is in control of how it is calling pg_dump, is there a reason pg_upgrade cannot use the directory output format when calling pg_dump? Is the schema-only operation incompatible?

From: Adrian Klaver
Sent: Saturday, April 6, 2019 1:52 PM
To: senor; pgsql-general@lists.postgresql.org
Subject: Re: pg_upgrade --jobs

On 4/6/19 11:44 AM, senor wrote:
> The pg_upgrade --jobs option is not passed as an argument when it calls pg_dump. I haven't found
> anything in docs or forums mentioning a reason for not supporting under certain circumstances
> other than possibly for pre-9.2. The pg_upgrade docs page states that it allows multiple CPUs to
> be used for dump and reload of schemas. Some databases I'm upgrading have 500,000+ tables and
> running with a single process is greatly increasing the upgrade time.
>
> I am also using the --link option.
> I have tried "--jobs 20", "--jobs=20", placing this option first and last and many other variations.
> I am upgrading 9.2.4 to 9.6.12 on CentOS 6.
> Varying hardware but all with 32+ CPU cores.
>
> su - postgres -c "/usr/pgsql-9.6/bin/pg_upgrade --jobs=20 --link \
>   --old-bindir=/usr/pgsql-9.2/bin/ \
>   --new-bindir=/usr/pgsql-9.6/bin/ \
>   --old-datadir=/var/lib/pgsql/9.2/data/ \
>   --new-datadir=/var/lib/pgsql/9.6/data/"
>
> I feel like there's a simple reason I've missed but this seems pretty straight forward.

https://www.postgresql.org/docs/9.6/app-pgdump.html

"--jobs=njobs

Run the dump in parallel by dumping njobs tables simultaneously. This option reduces the time of the dump but it also increases the load on the database server. You can only use this option with the directory output format because this is the only output format where multiple processes can write their data at the same time."

> A secondary plan would be to find instructions for doing the same as "pg_upgrade --link" manually
> so I can run "pg_dump --jobs 20".
> Any assist is appreciated.
> Thanks,
> S. Cervesa

--
Adrian Klaver
adrian.kla...@aklaver.com
Re: pg_upgrade --jobs
Thanks Tom. I suppose "pg_dump can only parallelize data dumping" answers my original question as "expected behavior", but I would like to understand the reason better. My knowledge of Postgres and other DBMSs is at a casual admin level with the occasional deep dive on specific errors or analysis. I'm not averse to getting into the code.

Before my OP I searched for reasons that the schema-only option would prevent pg_dump from being able to run multiple jobs and didn't find anything that I understood to confirm it either way. Is the limitation simply the state of development to date, or is there something about dumping the schemas that conflicts with parallelism? I'm willing to do some studying if provided links to relevant articles.

The --link option to pg_upgrade would be so much more useful if it weren't still bound to serially dumping the schemas of half a million tables. As already mentioned, if there is an alternate process that mimics pg_upgrade but allows for parallelism, I'm open to that.

Thanks all

From: Tom Lane
Sent: Saturday, April 6, 2019 3:02 PM
To: senor
Cc: pgsql-general@lists.postgresql.org
Subject: Re: pg_upgrade --jobs

senor writes:
> Since pg_upgrade is in control of how it is calling pg_dump, is there a reason pg_upgrade cannot
> use the directory output format when calling pg_dump? Is the schema-only operation incompatible?

Well, there's no point in it. pg_dump can only parallelize data dumping, and there's none to be done in the --schema-only case that pg_upgrade uses.

Also, since pg_upgrade *does* use parallelism across multiple pg_dump calls (if you've got multiple databases in the cluster), it'd be a bit problematic to have another layer of parallelism below that, if it did indeed do anything. You don't want "--jobs=10" to suddenly turn into 100 sessions.

regards, tom lane
Re: pg_upgrade --jobs
Thanks Tom for the explanation. I assumed it was my ignorance of how the schema was handled that was making this look like a problem that had already been solved and I was missing something.

I fully expected the "You're Doing It Wrong" part. That is out of my control but not beyond my influence.

I suspect I know the answer to this but have to ask. Using a simplified example where there are 100K sets of 4 tables, each representing the output of a single job, are there any shortcuts to upgrading that would circumvent exporting the entire schema? I'm sure a different DB design would be better but that's not what I'm working with.

Thanks

From: Ron
Sent: Saturday, April 6, 2019 4:57 PM
To: pgsql-general@lists.postgresql.org
Subject: Re: pg_upgrade --jobs

On 4/6/19 6:50 PM, Tom Lane wrote:
senor <mailto:frio_cerv...@hotmail.com> writes:
[snip]
> The --link option to pg_upgrade would be so much more useful if it weren't still bound to
> serially dumping the schemas of half a million tables.

To be perfectly blunt, if you've got a database with half a million tables, You're Doing It Wrong.

Heavy (really heavy) partitioning?

--
Angular momentum makes the world go 'round.
Re: pg_upgrade --jobs
Thank you Adrian. I'm not sure if I can provide as much as you'd need for a definite answer, but I'll give you what I have.

The original scheduled downtime for one installation was 24 hours. By 21 hours it had not completed the pg_dump schema-only, so it was returned to operation.

The amount of data per table varies widely. Some daily tables are 100-200GB, and the thousands of reports tables with stats are much smaller. I'm not connected to check now but I'd guess 1GB max.

We chose to use the --link option partly because some servers do not have the disk space to copy. The time necessary to copy 1-2TB was also going to be an issue.

The vast majority of activity is on current-day inserts and stats reports of that data. All previous days and existing reports are read only.

As is all too common, the DB usage grew with no redesign, so it is a single database on a single machine with a single schema.

I get the impression there may be an option of getting the schema dump while in service, but possibly not in this scenario. Plan B is to drop a lot of tables and deal with imports later.

I appreciate the help.

From: Adrian Klaver
Sent: Sunday, April 7, 2019 8:19 AM
To: senor; pgsql-general@lists.postgresql.org
Subject: Re: pg_upgrade --jobs

On 4/6/19 5:47 PM, senor wrote:
> Thanks Tom for the explanation. I assumed it was my ignorance of how the schema was handled that
> was making this look like a problem that had already been solved and I was missing something.
>
> I fully expected the "You're Doing It Wrong" part. That is out of my control but not beyond my influence.
>
> I suspect I know the answer to this but have to ask. Using a simplified example where there are
> 100K sets of 4 tables, each representing the output of a single job, are there any shortcuts to
> upgrading that would circumvent exporting the entire schema? I'm sure a different DB design would
> be better but that's not what I'm working with.

An answer is going to depend on more information:

1) What is the time frame for moving from one version to another? Both the setup and the actual downtime.
2) There are 500,000+ tables, but what is the amount of data involved?
3) Are all the tables active?
4) How are the tables distributed across databases in the cluster and schemas in each database?

> Thanks
>
> From: Ron
> Sent: Saturday, April 6, 2019 4:57 PM
> To: pgsql-general@lists.postgresql.org
> Subject: Re: pg_upgrade --jobs
>
> On 4/6/19 6:50 PM, Tom Lane wrote:
> senor <mailto:frio_cerv...@hotmail.com> writes:
> [snip]
> > The --link option to pg_upgrade would be so much more useful if it weren't still bound to
> > serially dumping the schemas of half a million tables.
>
> To be perfectly blunt, if you've got a database with half a million tables, You're Doing It Wrong.
>
> Heavy (really heavy) partitioning?
>
> --
> Angular momentum makes the world go 'round.

--
Adrian Klaver
adrian.kla...@aklaver.com
Re: pg_upgrade --jobs
I just noticed I missed Sherrylyn's post. I did some reading about Slony and believe it would be useful if I had the time to dig in. As pointed out, it's not an out-of-the-box solution. It is included on the TODO list, though. For now I can only dream of the 86-second downtime.

Thanks

From: Sherrylyn Branchaw
Sent: Sunday, April 7, 2019 6:43 AM
To: senor
Cc: pgsql-general@lists.postgresql.org
Subject: Re: pg_upgrade --jobs

> are there any shortcuts to upgrading that would circumvent exporting the entire schema?

By "shortcuts," do you mean you want to minimize the time and energy you put into the upgrade, or that you want to minimize database downtime?

If you mean downtime, I was able to upgrade a customer-facing database with ~350,000 tables from Postgres 9.0 to 9.6 last year with only 86 seconds of downtime, using Slony, but I had to make many custom modifications to Slony and test thoroughly beforehand, and it was not for the faint of heart, the pressed for time, or the inexperienced. There may be better ways (and if so, I would be curious to learn about them), but Slony was the tool with which I was most familiar at the time.

This method does, of course, require exporting the entire schema, but because our only constraint was to minimize customer downtime, and the database was online while the schema was being exported, we didn't care how long it took. Your constraints may be different.

For those reading: we do know that 350,000 tables is Doing It Wrong, and we're getting rid of them, but we decided being on an EOLed version of Postgres was worse and should be fixed first.

Sherrylyn
Optimize pg_dump schema-only
Hi All,

I'm looking for advice on optimizing the pg_dump portion of "pg_upgrade --link". Since this schema-only dump can't take advantage of parallel processing with jobs, I'm looking for any preparation or configuration settings that can improve speed.

9.2 to 9.6
CentOS 6 / 64-bit
512GB

I see only one CPU of 32 doing anything, and it's often at 100%. Disk IO is minimal. Memory use varies but there is always plenty to spare.

During the upgrade I'm running:

Only the upgrade - no other services
work_mem = 50MB
maintenance_work_mem = 2048MB
shared_buffers = 30GB
max_locks_per_transaction = 4096
autovacuum = off
autovacuum_freeze_max_age = 15   # Had previous issues with vacuum (to prevent wrap)

Truthfully, I thought I had increased work_mem until starting this email. But increasing it is just a guess unless I get advice to do so here. I'm at a knowledge level where I can only guess at the relevance of vacuum, analyze or any other preparatory actions I can complete before taking postgres offline for the upgrade. My feeling is that the bottleneck is the backend and not pg_dump. School me on that if needed please.

Any advice and explanation is appreciated.

- Senor
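P.S. To get a rough sense of how much schema the dump has to walk, I'm counting catalog objects with something like this (just a sketch):

-- Object counts per relation kind; a very large pg_class is what pg_dump --schema-only has to traverse.
SELECT relkind, count(*)
FROM pg_class
GROUP BY relkind
ORDER BY count(*) DESC;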
Re: Optimize pg_dump schema-only
I know from a previous post to the community that pg_dump --schema-only is single threaded and the --jobs option cannot benefit pg_dump in my case (single DB, single schema, 10+ tables). Using pg_upgrade with the --link option is very fast except for the pg_dump portion of the upgrade, which takes days.

I think I am looking for advice on how to optimize the postgres process for best performance of a single thread making whatever queries pg_dump --schema-only does. Obviously, that's not a common concern for DB tuning. At the moment, all I'm reasonably sure about is that a reduction in the number of tables should reduce the time needed, but that is not the reality I'm faced with.

Thanks,
Senor

On 4/28/2019 14:08, Ron wrote:
On 4/28/19 3:21 PM, senor wrote:
> Hi All,
> I'm looking for advice for optimizing the pg_dump portion of "pg_upgrade --link". Since this
> schema only dump can't take advantage of parallel processing with jobs I'm looking for any
> preparation or configuration settings that can improve speed.
>
> 9.2 to 9.6
> CentOS 6/64bit
> 512GB
>
> I see only one CPU of 32 doing anything and it's often at 100%. Disk IO is minimal. Memory use
> varies but always plenty to spare.

"pg_dump --schema-only" is single-threaded.
Re: Optimize pg_dump schema-only
I'm afraid it is 9.2.4. I'm using pg_upgrade from 9.6 and that is using pg_dump from 9.6.

I noticed on 2 installations with similar table counts (~200,000), schema and hardware that one was done in hours and the other didn't finish over the weekend. Keeping tabs on pg_stat_activity indicated pg_dump was still processing and nothing else was running.

Would you say that updating to 9.2.24 would be beneficial before upgrading to 9.6? An update is pretty quick and could be worth the time if there aren't additional requirements prior to starting the upgrade.

Thank you.
Senor

On 4/28/2019 18:19, Tom Lane wrote:
> senor writes:
>> I know from a previous post to the community that pg_dump --schema-only is single threaded and
>> the --jobs option cannot benefit pg_dump in my case (single DB, Single schema, 10+ tables).
>> Using pg_upgrade with the --link option is very fast except for the pg_dump portion of the
>> upgrade which takes days.
> One simple question is whether the source server is the last available minor release (9.2.24 I
> believe). If not, you may be missing performance fixes that would help. pg_dump -s on 100K
> tables should not take "days", so I'm guessing you're hitting some O(N^2) behavior somewhere,
> and it might be something we fixed.
>
> Likewise make sure that pg_dump is the newest available in the destination release series.
>
> regards, tom lane
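P.S. For reference, the pg_stat_activity check I keep running while the dump is going is roughly this (a sketch):

-- Longest-running non-idle backends; the schema dump's session shows up here while it runs.
SELECT pid,
       application_name,
       now() - query_start AS runtime,
       state,
       left(query, 60) AS query
FROM pg_stat_activity
WHERE state <> 'idle'
ORDER BY runtime DESC;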
vacuum visibility relevance
Hi All,

Simplified scenario: a 40+ GB table with 20+ indexes receiving log-event-type records in an append-only fashion. One of the columns is a sequence ID. PG version 11.4.

If I start a vacuum on this table when the sequence ID is 1 million, and the table continues to accumulate new events, when the vacuum completes should I be thinking about the state of the table and indexes as being vacuumed up to 1 million? Or is vacuum also operating on records inserted after the vacuum started? Is there any reason to think about this differently when the vacuum is manual as opposed to auto?

I attempted to deal with the append-only issue corrected in PG13 by setting autovacuum_freeze_max_age low enough to trigger vacuum, but that has been such a wild card I abandoned it. I'm now trying to find a formula to trigger a manual vacuum. There are stats-gathering processes pulling from the most recently added data all the time, so my priority is to keep performance high for those. Secondly, there's a ton of memory allotted, so running vacuum freeze before table pages age out of memory seems like a good idea.

I am having trouble with autovacuum keeping up and I suspect there's a communication problem with the stats collector, but I have put off looking into that because updating to PG15 was "right around the corner". Meanwhile, I have had to run multiple-thread scripts to keep up with vacuum. Manual vacuums always work and update stats correctly, but autovacuums hang and occupy workers doing nothing.

I realize these are 2 completely different issues. I would love to have a better understanding of the initial question, but it feels like I may not have to be very aware of how that works if I knew what was causing the latter issue.

Any hints, pointers and references are always appreciated.

- Senor
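P.S. The inputs I have to work with for a "vacuum now" formula look roughly like this (a sketch; the table name is a placeholder, and note that the inserted-since-last-vacuum counter only exists in 13+, so on 11 I'm stuck with these columns):

-- Per-table numbers a manual-vacuum trigger could be based on in v11.
SELECT s.relname,
       s.n_live_tup,
       s.last_vacuum,
       s.last_autovacuum,
       age(c.relfrozenxid) AS xid_age
FROM pg_stat_user_tables s
JOIN pg_class c ON c.oid = s.relid
WHERE s.relname = 'event_log';   -- placeholder table name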
txid_current vs xact_commit stats
Hi All;

I was under the impression that all transactions must end with a commit or a rollback, but watching stats doesn't support this. Web searches tend to return info on what a transaction is or the ratio of commits to rollbacks. I found nothing contradicting what I think I know.

I've sampled pg_stat_database.xact_commit, pg_stat_database.xact_rollback and txid_current() at intervals on a few independent clusters and see that commits increase anywhere from 50% to 300% of the rate of transaction-ID increase. Rollbacks remain very near zero for all clusters. Each cluster tends to stay consistently within a range (i.e. 120-130% or 50-70%).

I've seen strange issues with the stats collector missing updates and causing problems with autovacuum, but that wouldn't explain more commits than transactions. All clusters receive many inserts (~10-100) in single transactions, but AFAIK this still counts as a single commit. Many other tables are created while processing the inserted data, and much of that is done within transaction blocks. I'm not aware of anything very sophisticated in the application handling this, but I could be wrong.

I'm probably missing something fundamental. I know what I know but I'm not a DBA.

PG version 11 & 12 on Linux

Any hints and references appreciated.

Thanks,
Senor
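P.S. The sampling itself is just this, run at intervals and diffed (one caveat: calling txid_current() itself assigns a new transaction ID):

-- One sample; consecutive samples are diffed to get rates.
SELECT now() AS sample_time,
       txid_current() AS current_txid,
       xact_commit,
       xact_rollback
FROM pg_stat_database
WHERE datname = current_database();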
sequence on daily log partitioned table
Hi All,

I'm mainly interested in understanding how this works, or why it can't, as opposed to just solving the problem. AI just told me this can't be done without a trigger, but I'd like to confirm in case I just asked the wrong question.

I want to have a partitioned log table receiving input from sensor equipment. The partitions would cover some time range like an hour or a day. In all cases I want an ID column to default to a nextval from a sequence, but starting over for each day. If I set the default on the partitioned table, I would need to alter it at midnight - not optimal. A default set on the partition hasn't worked, and I'm not sure why this is a problem. I've only ever used simplistic partitioning, but this doesn't seem like an unusual need.

I'm looking for the underlying strategy that prevents use of the sequence configured on the partition. A reference to something describing the decision flow would be fantastic. I've been experimenting on version 13 out of convenience, but if a newer version has relevant changes, I'm interested.

I'm not interested in discussions over whether the data volume is enough to justify partitioning.

Thanks in advance,
Senor
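P.S. This is the shape of what I tried (all names are made up). The behavior I'm describing is that the partition's default is applied only when inserting into the partition directly, not when the row is routed through the parent:

CREATE TABLE sensor_log (
    id      bigint,
    ts      timestamptz NOT NULL,
    payload text
) PARTITION BY RANGE (ts);

-- One partition per day, each with its own sequence set as the column default.
CREATE SEQUENCE sensor_log_20240601_id_seq;
CREATE TABLE sensor_log_20240601
    PARTITION OF sensor_log
    FOR VALUES FROM ('2024-06-01') TO ('2024-06-02');
ALTER TABLE sensor_log_20240601
    ALTER COLUMN id SET DEFAULT nextval('sensor_log_20240601_id_seq');

-- Routed through the parent: the partition's default is not applied, so id is NULL here.
INSERT INTO sensor_log (ts, payload) VALUES ('2024-06-01 12:00+00', 'via parent');

-- Inserted into the partition directly: the partition's default is applied.
INSERT INTO sensor_log_20240601 (ts, payload) VALUES ('2024-06-01 12:01+00', 'direct');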
Linked directory or explicit reference
Hi All;

When doing an initial install of PostgreSQL on a RHEL 7 or 8 derived OS via rpm, what are the pros, cons and recommendations of these 2 procedures for utilizing a second disk? The secondary SSD or RAID is mounted at /disk2.

Option #1
1. install the rpm, which creates the basic user and home
2. create symlink /var/lib/pgsql/15/data --> /disk2/data
3. initdb with no special options

Or Option #2
1. install the rpm, which creates the basic user and home
2. initdb with --pgdata=/disk2/data, probably using the included 'postgresql-12-setup' script

I also link /var/lib/pgsql/data --> ../15/data so automation can reference postgresql.conf without knowing the version (legacy stuff).

The install is automated with a bash script which handles several options, including whether there is a second disk for the DB. Scripting the install with or without the second disk is straightforward, but I'm concerned with either scenario causing unforeseen differences. I don't think there's a benefit to using a tablespace here, but I have no experience with it. The systemd service is configured with a dependency on the disk mount, so I don't think there are different risks for starting postgres with a missing data directory.

I've run postgres in both scenarios and not had any issues. I'm interested in comments from others on their experience using these or other options.

Thanks,
Senor
Re: Linked directory or explicit reference
On 4/30/2024 17:31, Ron Johnson wrote:
> On Tue, Apr 30, 2024 at 7:00 PM Senor Cervesa wrote:
>> Hi All;
>> When doing an initial install of PostgreSQL on RHEL 7 or 8 derived OS via rpm, what are pros,
>> cons and recommendations of these 2 procedures for utilizing a second disk? Secondary SSD or
>> RAID mounted at /disk2.
>>
>> Option #1
>> 1. install the rpm which creates basic user and home
>> 2. Create symlink /var/lib/pgsql/15/data --> /disk2/data
>> 3. initdb with no special options
>>
>> Or Option #2
>> 1. install the rpm which creates basic user and home
>> 2. initdb with --pgdata=/disk2/data
>> Probably using included 'postgresql-12-setup' script
>>
>> I also link /var/lib/pgsql/data --> ../15/data so automation can reference postgresql.conf
>> without knowing version (legacy stuff).
>
> In my experience, the PgBackRest restore feature does not like symlinks.

I hadn't considered that, and it's the kind of experience feedback I'm looking for. It won't be an issue for me though.

>> The install is automated with a bash script which handles several options including whether
>> there is a second disk for DB. Scripting the install with or without the second disk is
>> straight forward but I'm concerned with either scenario causing unforeseen differences. I don't
>> think there's a benefit to using tablespace here but I have no experience with it. The systemd
>> service is configured with a dependency on the disk mount so I don't think there are different
>> risks for starting postgres with missing data directory.
>>
>> I've run postgres in both scenarios and not had any issues. I'm interested in comments from
>> others on their experience using these or other options.
>
> Is the mount point just "/disk2" when using "--pgdata=/disk2/data"? I've gotten "directory not
> empty" errors when the mount point is "/Database/x.y/data".

When linked, it looks like:

[root@test110 pgsql]# ll /var/lib/pgsql/15/data
lrwxrwxrwx. 1 root root 12 May 1 05:21 /var/lib/pgsql/15/data -> /disk2/data/

I'm not sure what would trigger "directory not empty". When running initdb there is nothing under data. I could see a problem with a symlink throwing that message as a catch-all though. I haven't run across any problems yet.

Thank you Ron Johnson for the feedback.
vacuum an all frozen table
Hi All,

I'd like to understand what's happening here and whether there is anything I can do to improve the situation.

PostgreSQL v11.22 (yeah, I know. Needs upgrade)

The following 3 autovacuum log entries show a vacuum of an append-only table that has not had any changes since the end of 5/10/2024. There is only 1 page not skipped in each instance, yet it takes over 1100 seconds to complete. The visibility map shows all frozen. The associated TOAST table is similar in numbers except that it completes in sub-second times.

I understand that the vacuum is occurring due to the age of pg_class.relfrozenxid for the table, but what exactly is it referring to in these cases? Can that also be frozen or similar? Should I add autovacuum_freeze_max_age=4 or higher to the relopts to reduce vacuum frequency?

2024-05-17 09:56:57.167 GMT "" "" LOG: automatic aggressive vacuum of table "workdb1.public.log_entry_20240510": index scans: 0
    pages: 0 removed, 53551748 remain, 0 skipped due to pins, 53551747 skipped frozen
    tuples: 0 removed, 242384013 remain, 0 are dead but not yet removable, oldest xmin: 3245896267
    buffer usage: 107117593 hits, 123159244 misses, 3 dirtied
    avg read rate: 856.853 MB/s, avg write rate: 0.000 MB/s
    system usage: CPU: user: 151.07 s, system: 638.29 s, elapsed: 1122.92 s

2024-05-18 23:20:37.900 GMT "" "" LOG: automatic aggressive vacuum of table "workdb1.public.log_entry_20240510": index scans: 0
    pages: 0 removed, 53551748 remain, 0 skipped due to pins, 53551747 skipped frozen
    tuples: 0 removed, 242384013 remain, 0 are dead but not yet removable, oldest xmin: 3445717402
    buffer usage: 107105521 hits, 123171316 misses, 3 dirtied
    avg read rate: 839.611 MB/s, avg write rate: 0.000 MB/s
    system usage: CPU: user: 166.46 s, system: 611.40 s, elapsed: 1146.09 s

2024-05-20 19:11:29.519 GMT "" "" LOG: automatic aggressive vacuum of table "workdb1.public.log_entry_20240510": index scans: 0
    pages: 0 removed, 53551748 remain, 0 skipped due to pins, 53551747 skipped frozen
    tuples: 0 removed, 242384013 remain, 0 are dead but not yet removable, oldest xmin: 3645738812
    buffer usage: 107095847 hits, 123180990 misses, 2 dirtied
    avg read rate: 744.513 MB/s, avg write rate: 0.000 MB/s
    system usage: CPU: user: 195.13 s, system: 694.13 s, elapsed: 1292.59 s

workdb1=# SELECT * from pg_visibility('log_entry_20240510'::regclass)
workdb1-#   where all_visible IS NOT true OR all_frozen IS NOT true OR pd_all_visible IS NOT true;
 blkno | all_visible | all_frozen | pd_all_visible
-------+-------------+------------+----------------
(0 rows)

Thank you for any insights,
Senor
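P.S. For reference, this is how I'm checking the ages and the current per-table options for the main table and its TOAST table (a sketch):

-- Main table and its TOAST table, with relfrozenxid ages and per-table reloptions.
SELECT c.oid::regclass      AS main_table,
       age(c.relfrozenxid)  AS main_age,
       t.oid::regclass      AS toast_table,
       age(t.relfrozenxid)  AS toast_age,
       c.reloptions
FROM pg_class c
LEFT JOIN pg_class t ON t.oid = c.reltoastrelid
WHERE c.relname = 'log_entry_20240510';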