Trying to understand odd trigger behavior
I think I know what is happening, but I wanted to see if my understanding is correct. I have a perl after insert trigger for a table with a non-null column element, and I am getting an occasional error when the trigger executes, for printing a null value, which is $_TD->{new}{element}. However, I do the insert with an on conflict that converts it into an update. And I have a before update trigger that blocks changing the record, but returns null if the old and new records are the same. My theory is that the insert trigger is firing even though the action got changed to an update, and that because the update trigger cancelled the update, there is no new record. So I should just test that $_TD->{new} is defined before doing stuff. Does that sound correct?
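Something like this minimal sketch is what I mean (the function, trigger, and table names here are made up; only the defined test matters):

  CREATE OR REPLACE FUNCTION log_element() RETURNS trigger AS $$
      # If the row never materialized (e.g. the update was suppressed),
      # skip logging instead of dying on an undefined value.
      return unless defined $_TD->{new};
      elog(NOTICE, "element: $_TD->{new}{element}");
      return;
  $$ LANGUAGE plperl;

  CREATE TRIGGER log_element_ai AFTER INSERT ON mytable
      FOR EACH ROW EXECUTE PROCEDURE log_element();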
Re: Trying to understand odd trigger behavior
On Fri, Jun 15, 2018 at 08:54:52 +0200, Laurenz Albe wrote:

Absolutely, but it should be easy to run a few tests with only a single row insert that confirms your theory.

Thanks.
Re: Trying to understand odd trigger behavior
On Fri, Jun 15, 2018 at 08:54:52 +0200, Laurenz Albe wrote:

Bruno Wolff III wrote: My theory is that the insert trigger is firing even though the action got changed to an update, and that because the update trigger cancelled the update, there is no new record. So I should just test that $_TD->{new} is defined before doing stuff. Does that sound correct?

Absolutely, but it should be easy to run a few tests with only a single row insert that confirms your theory.

It looks like something else is going on. I tried checking whether $_TD->{new}{element} was defined, and printing some extra information to the local syslog service (instead of to the remote elastic server) when it wasn't, and I was still seeing the issue occasionally without getting the extra info. Also, reinserting duplicate data doesn't seem to be a reliable way to reproduce the problem. I didn't see it at all doing it by hand to a small number of records.

I only see this on the trigger that sends data to the elastic server and not the one that sends it to the local syslog server. If the issue is data dependent, it would always hit the elastic server trigger first. So that doesn't rule out both triggers having the same issue, but it does suggest that I may want to look at stuff specific to the elastic trigger with extra scrutiny.

I probably need to narrow things down myself, as looking at the whole system is going to be too much effort when this might be user error and not a bug. And it isn't self-contained, as it is part of a system that pulls log data from a third party. That would make it hard for someone else to test.
Re: Trying to understand odd trigger behavior
On Thu, Jun 14, 2018 at 23:04:24 -0500, Bruno Wolff III wrote:

I have a perl after insert trigger for a table with a non-null column element, and I am getting an occasional error when the trigger executes, for printing a null value, which is $_TD->{new}{element}. However, I do the insert with an on conflict that converts it into an update. And I have a before update trigger that blocks changing the record, but returns null if the old and new records are the same.

I think I misunderstood the error I was getting. I don't think it is the value being printed that is null, but rather the socket instance I am trying to print to. I haven't tested this yet, but it makes sense, and my other theory didn't match later evidence.
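For anyone hitting this later, the kind of guard I mean is sketched below. The host, port, and $_SHARED key are made up, and the real function does more than this; it has to be plperlu since it does network I/O:

  CREATE OR REPLACE FUNCTION send_to_elastic() RETURNS trigger AS $$
      use IO::Socket::INET;
      # Cache the connection across calls; a failed connect returns
      # undef, and printing to undef is what I now suspect the error was.
      $_SHARED{sock} //= IO::Socket::INET->new(
          PeerAddr => 'elastic.example.com',
          PeerPort => 9200,
      );
      if (defined $_SHARED{sock}) {
          $_SHARED{sock}->print("$_TD->{new}{element}\n");
      } else {
          elog(WARNING, 'could not connect to elastic; row not sent');
      }
      return;
  $$ LANGUAGE plperlu;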
Postgres 15 upgrades and template1 public schema
I noticed when I did an upgrade from Postgres 14 to 15 that the public schema in template1 was still owned by postgres instead of pg_database_owner. I was expecting it to change because the release notes said that new database clusters would have that. But shouldn't new clusters use what is set in template1?
Re: Postgres 15 upgrades and template1 public schema
On Wed, Oct 19, 2022 at 23:30:58 +0200, Thomas Kellerer wrote:

Bruno Wolff III wrote on 19.10.2022 at 22:36: I noticed when I did an upgrade from Postgres 14 to 15 that the public schema in template1 was still owned by postgres instead of pg_database_owner. I was expecting it to change because the release notes said that new database clusters would have that. But shouldn't new clusters use what is set in template1?

This is explained in the release notes: The change applies to new database clusters and to newly-created databases in existing clusters. Upgrading a cluster or restoring a database dump will preserve public's existing permissions.

How do new databases in pre-existing clusters get the new public schema security if it doesn't come from template1?
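A quick way to compare (a sketch) is to run this in template1 and in a freshly created database:

  -- Show the owner and the ACL of the public schema.
  SELECT nspname, nspowner::regrole AS owner, nspacl
  FROM pg_namespace
  WHERE nspname = 'public';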
Re: Postgres 15 upgrades and template1 public schema
On Wed, Oct 19, 2022 at 19:59:52 -0400, Tom Lane wrote:

The release notes could probably use some tweaking here. It looks to me like pg_dumpall (and hence pg_upgrade) will adjust the ownership and permissions of template1's public schema to match what was in the old installation, but it doesn't touch template0. Hence, whether a "newly-created database in an existing cluster" has the old or new properties of the public schema will depend on whether you clone it from template1 or template0. That definitely needs to be explained, and maybe we should recommend that DBAs consider manually changing what's in template1.

This answers my question about what is actually happening. I think expanding the release notes section on this a bit could be helpful for other people.
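For anyone else in this situation, the manual change would presumably look something like the following sketch, matching the new-cluster defaults as I understand them (owner pg_database_owner, CREATE revoked from PUBLIC):

  \connect template1
  ALTER SCHEMA public OWNER TO pg_database_owner;
  REVOKE CREATE ON SCHEMA public FROM PUBLIC;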
Re: Postgres 15 upgrades and template1 public schema
On Wed, Oct 19, 2022 at 19:59:52 -0400, Tom Lane wrote:

The release notes could probably use some tweaking here. [...] That definitely needs to be explained, and maybe we should recommend that DBAs consider manually changing what's in template1.

I didn't see any changes related to this in the first draft of the 15.1 release notes. Since I got the impression from the above that there might be a change, I'm sending this as a reminder.
Should pg 11 use a lot more memory building an spgist index?
While reloading a database cluster to move from 10.5 to 11, I'm getting out of memory crashes that I didn't see when doing reloads on pg 10. The statement flagged in the log is this:

  2018-10-23 16:44:34.815 CDT [126839] STATEMENT: ALTER TABLE ONLY public.iplocation ADD CONSTRAINT overlap EXCLUDE USING spgist (network WITH &&);

iplocation has 4398722 rows. This is geolite data where the networks are a partial covering of the total address space with no overlaps. Should I expect to have to make config changes to make this work?
Re: Should pg 11 use a lot more memory building an spgist index?
On Wed, Oct 24, 2018 at 09:33:48 +0100, Tom Lane wrote:

Bruno Wolff III writes: While reloading a database cluster to move from 10.5 to 11, I'm getting out of memory crashes that I didn't see when doing reloads on pg 10. The statement flagged in the log is this: 2018-10-23 16:44:34.815 CDT [126839] STATEMENT: ALTER TABLE ONLY public.iplocation ADD CONSTRAINT overlap EXCLUDE USING spgist (network WITH &&);

Hm, there's a fair amount of new code in SP-GiST in v11, so maybe you've hit a memory leak in that. Can you create a self-contained test case?

I'll try. I think I should only need the geolite data to cause the problem, and I can share that publicly. So far the problem seems to be happening consistently. I'll work on this at the office, but probably won't get it done until the afternoon. If I have a substantial database dump file to provide for reproducing this, do you prefer it on a web server somewhere? I expect that mailing very large attachments to the lists is a bad idea.
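A fully synthetic case might look like this sketch (non-overlapping /24 networks; it is much smaller than the real data, so it may not be enough to trigger the problem):

  CREATE TABLE iplocation_test (network inet NOT NULL);

  INSERT INTO iplocation_test
  SELECT format('10.%s.%s.0/24', i / 256, i % 256)::inet
  FROM generate_series(0, 65535) AS i;

  ALTER TABLE ONLY iplocation_test
      ADD CONSTRAINT overlap_test EXCLUDE USING spgist (network WITH &&);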
Re: Should pg 11 use a lot more memory building an spgist index?
On Tue, Oct 23, 2018 at 20:23:14 -0500, Bruno Wolff III wrote:

While reloading a database cluster to move from 10.5 to 11, I'm getting out of memory crashes that I didn't see when doing reloads on pg 10. The statement flagged in the log is this: 2018-10-23 16:44:34.815 CDT [126839] STATEMENT: ALTER TABLE ONLY public.iplocation ADD CONSTRAINT overlap EXCLUDE USING spgist (network WITH &&); iplocation has 4398722 rows.

I'm trying to reproduce this on my desktop, but so far it isn't producing the same issue. The database is still loading after 9 hours, but it looks like it got past the point where the problem index was created. (I'm not sure how to check if the index has really finished being created, but \d shows it as existing.) I'll know better tomorrow. My workstation is using the Fedora version of postgresql 11, which might have some relevant difference from the pgdg version for rhel7. I still have a number of things I can try, but they might take significant time to run and I might not get a reasonable test case for a while.
Re: Should pg 11 use a lot more memory building an spgist index?
It looks like it got past creating the exclude constraint, based on the ordering of commands in the dump file. However, creating a more normal spgist index is taking a very long time with a lot of disk wait time. CPU usage seems pretty low for the amount of time it has been working on building that index, but maybe that is normal for building indexes. I used almost the default postgresql.conf on my workstation. I bumped up the limits in a few places on the server, which could have allowed a lot more memory to be used, especially if the index creation was parallelized. While the load is running I'll see if I can tell if there is a memory leak with this index build. Once it finishes, I can dump a specific table and test building the exclude spgist index with some different settings to see if I can reproduce the out of memory error with a lot less data than is in the whole database.
Re: Should pg 11 use a lot more memory building an spgist index?
On Wed, Oct 24, 2018 at 10:21:11 +0100, Tom Lane wrote:

Bruno Wolff III writes: If I have a substantial database dump file to provide for reproducing this, do you prefer it on a web server somewhere? I expect that mailing very large attachments to the lists is a bad idea.

No, don't do that. If you can make sample data available for download, or point to some accessible dataset somewhere, that'd work.

I have something that seems to produce it on rhel7. Fedora isn't working well either, but the difference may be due to postgresql.conf being different or some difference in the Fedora build.

http://wolff.to/iplocation is a bit under 400MB. It should download at about 1MB/sec. It is a plain text dump of the iplocation table with the ALTER TABLE for the constraint / exclude index added at the end. http://wolff.to/postgresql.conf is the config file I use on the server.

The server has the following installed (but you don't need plperl for the test):
  postgresql11-plperl-11.0-1PGDG.rhel7.x86_64
  postgresql11-libs-11.0-1PGDG.rhel7.x86_64
  postgresql11-docs-11.0-1PGDG.rhel7.x86_64
  postgresql11-11.0-1PGDG.rhel7.x86_64
  postgresql11-server-11.0-1PGDG.rhel7.x86_64

The output of:
  createdb -U postgres test
  psql -U postgres -f iplocation test
is:
  SET
  SET
  SET
  SET
  SET
   set_config
  ------------

  (1 row)
  SET
  SET
  SET
  SET
  SET
  CREATE TABLE
  ALTER TABLE
  COPY 4398722
  psql:iplocation:4398789: ERROR:  out of memory
  DETAIL:  Failed on request of size 6264 in memory context "PortalContext".

It is certainly possible that my postgresql.conf file is bad and that I just got away with it under 10.5. The server is a VM with 32GB of memory allocated to it. I set vm.overcommit_memory = 2 to avoid the OOM killer after upgrading to 11. Before that I didn't have a problem.

On Fedora with a more vanilla postgresql.conf the exclude constraint built fine, but creating a normal spgist index on an IP address column is taking forever (near a day now) for a table with a lot of rows (I think around 150 million). That ran in a reasonable amount of time on the server.
Re: Should pg 11 use a lot more memory building an spgist index?
On Fri, Oct 26, 2018 at 10:16:09 +0100, Tom Lane wrote:

Bruno Wolff III writes: I have something that seems to produce it on rhel7. Fedora isn't working well either, but the difference may be due to postgresql.conf being different or some difference in the Fedora build.

Hmm, in my hands this produces the same size leak (~28GB) in either v10 or v11. In HEAD, somebody's made it even worse (~43GB). So this is certainly pretty broken, but I'm not sure why it seems worse to you in v11 than before.

As a short-term workaround, could I create the index first and use insert statements, each in their own transaction, to get the table loaded with the index? Is the issue on Fedora taking very long to build a normal spgist index for network addresses worth pursuing separately, or is it likely to be the same underlying cause? I don't really need to get this working there, as that was just to help with testing.

I could also try adjusting memory limits temporarily. If the leak is 28GB on a 32GB system, I might be able to get the index built if less memory is tied up in other things. My workstation also has 32GB, and the exclude index did build there with the postgresql.conf having lower memory limits.
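The reordered load I have in mind is sketched below, with the column list trimmed to just the network column:

  -- Create the table and the exclusion constraint before loading ...
  CREATE TABLE iplocation (network inet NOT NULL);
  ALTER TABLE ONLY iplocation
      ADD CONSTRAINT overlap EXCLUDE USING spgist (network WITH &&);

  -- ... then insert the rows, so the index is maintained incrementally
  -- instead of being built in one pass at the end.
  INSERT INTO iplocation (network) VALUES ('10.0.0.0/8');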
Re: Should pg 11 use a lot more memory building an spgist index?
On Fri, Oct 26, 2018 at 13:44:07 +0100, Tom Lane wrote:

Bruno Wolff III writes: As a short-term workaround, could I create the index first and use insert statements, each in their own transaction, to get the table loaded with the index?

Yes; it might also be that you don't even need to break it up into separate statements.

It was time to refresh the geolite data anyway, so I tried this. I needed to set vm.overcommit_memory back to 0 to avoid an error, but the load went OK without the oom killer doing anything. So things are fully working again. Thanks for your help.

Is the issue on Fedora taking very long to build a normal spgist index for network addresses worth pursuing separately, or is it likely to be the same underlying cause?

This issue only applies if it was an exclusion constraint. If you saw slowness or bloat with a plain index, that would be worth investigating separately.

I'll start a separate thread if I get something reasonable to ask about. The current dataset is probably a lot larger than needed to demonstrate the issue. The difference might be due to configuration or how Fedora built it. In the end I'll probably ask why it is slower in one case as opposed to the other, and it might not even be a real bug.
Can any_value be used like first_value in an aggregate?
For example, is an output of 10 guaranteed in the following?

  bruno=> select any_value(x order by x desc) from generate_series(1,10) as x;
   any_value
  -----------
          10
  (1 row)

The use case is that I want to return a value of one column that is paired with the maximum value of another column in each group when using GROUP BY. (There aren't going to be any NULLs in the involved columns.) Thanks.
Re: Can any_value be used like first_value in an aggregate?
On Tue, Jun 25, 2024 at 13:08:45 -0400, Tom Lane wrote:

Not really. It will work that way in simple cases, but I think the behavior stops being predictable if the input gets large enough to induce the planner to use parallel aggregation. In any case, the example shown isn't amazingly efficient, since it'll still perform a sort to meet the ORDER BY spec.

Thanks.

The use case is that I want to return a value of one column that is paired with the maximum value of another column in each group when using GROUP BY.

Use window functions (i.e. first_value). This is what they're for, and they are smart enough to do just one sort for functions sharing a common window spec.

If I do that, I'd need to do it as a subselect inside of a group by. I'm thinking distinct on may work and be a better way to do it. The actual use case is a set of triplets returned from a query, where I want one row for each distinct value in the first column, paired with the value in the second column for which the third column is the largest. The second and third columns are effectively dependent on each other, so there won't be any ambiguity. Thanks for getting me thinking about some other ways to approach the problem.
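For the record, the window-function version might look like this sketch, with made-up names (a is the grouping column, c the column being maximized, b the paired value):

  SELECT DISTINCT
         a,
         first_value(b) OVER (PARTITION BY a ORDER BY c DESC) AS b_at_max_c
  FROM t;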
Re: Can any_value be used like first_value in an aggregate?
On Wed, Jun 26, 2024 at 09:50:59 +0200, Laurenz Albe wrote:

On Tue, 2024-06-25 at 14:11 -0500, Bruno Wolff III wrote: The actual use case is a set of triplets returned from a query, where I want one row for each distinct value in the first column, paired with the value in the second column for which the third column is the largest. The second and third columns are effectively dependent on each other, so there won't be any ambiguity.

Try

  SELECT DISTINCT ON (first_column)
         first_column, second_column, third_column
  FROM the_table
  ORDER BY first_column, third_column DESC;

That approach worked for that part of the query. Thanks.
COPY documentation with regard to references constraints
I was unable to find any documentation clarifying whether, when using COPY to load a table whose rows reference other rows in the same table, I had to worry about the ordering of the input. What I found (in 17) is that even if the references constraint wasn't set as deferrable, I could have rows for children before their parents in the COPY data and things still worked.
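A self-contained version of what I tried (a sketch; the table name is made up):

  CREATE TABLE node (
      id     integer PRIMARY KEY,
      parent integer REFERENCES node (id)
  );

  -- The child row comes before its parent in the input, and the
  -- non-deferrable FK still passes, because the check runs after the
  -- whole COPY statement:
  COPY node (id, parent) FROM stdin;
  2	1
  1	\N
  \.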
Re: COPY documentation with regard to references constraints
On Thu, Oct 31, 2024 at 13:15:34 +0100, Dominique Devienne wrote:

AFAIK, DEFERRABLE is for temporary violations crossing statement boundaries, to postpone the enforcement of FKs to COMMIT time of the transaction. A single COPY is just one statement, so whether a temporary violation for self-referential FKs would occur would be implementation-dependent, and thus neither logical nor predictable. Applies to other statement types too.

Thanks. I needed to look more carefully under DEFERRABLE, where it says the following: This controls whether the constraint can be deferred. A constraint that is not deferrable will be checked immediately after every command. That is pretty clear.
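For contrast, a violation that crosses statement boundaries really does need a DEFERRABLE constraint, as in this sketch (table name made up):

  CREATE TABLE node2 (
      id     integer PRIMARY KEY,
      parent integer REFERENCES node2 (id) DEFERRABLE
  );

  BEGIN;
  SET CONSTRAINTS ALL DEFERRED;
  INSERT INTO node2 VALUES (2, 1);   -- parent doesn't exist yet
  INSERT INTO node2 VALUES (1, NULL);
  COMMIT;                            -- the FK is checked here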