Re: constant crashing
Jack: On Sun, 14 Apr 2024 at 22:18, jack wrote:

> The CSV files are being produced by another system, a Windows app on a
> Windows machine. I then copy them to a USB key and copy them onto the ubuntu
> machine. The data is then imported via the COPY command.
>
> COPY master (field01,field02..fieldX) FROM '/data/file.text' DELIMITER E'\t'
>
> The fields are tab delimited.

Then call them tab-delimited files, not CSV. It may be important. You have a tendency to send your interpretation of information, which hampers people trying to help you. As an example, only about three messages above did we see the "signal 11" in the server logs which indicates a probable bug. Signal 11 is normally due to stray pointers or corrupted memory (although faulty RAM can lead to it too), which hints at a bug, not a resource problem.

> But importing the data works. I can get all the data into a single table
> without any problems. The issue is only when I start to update the single
> table. And that is why I started using smaller temporary tables for each CSV
> file, to do the updates in the smaller tables before I move them all to a
> single large table.
>
> After all the data is loaded and updated, I run php programs on the large
> table to generate reports. All of which works well EXCEPT for performing the
> updates on the data. And I do not want to use perl or any outside tool. I
> want it all done in SQL because I am required to document all my steps so that
> someone else can take over, so everything needs to be as simple as possible.

I suggested perl (or any similar thing, php should be equally fine) as it normally makes your importing / debugging much easier / lighter. On the documentation side, you can have the data and some transformations in a big SQL file, but having some steps in a php file will have them documented too. But if you want to do text processing in SQL, go ahead; the only problems are going to be making it harder to debug and probably harder to document.
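As an illustration of the kind of lightweight scripted preprocessing suggested above, here is a minimal Python sketch that checks a tab-delimited export before handing it to COPY. The file path and expected column count are hypothetical examples, not anything from this thread:

```python
"""Pre-validate a tab-delimited export before feeding it to COPY.
A minimal sketch; file names and column counts are made-up examples."""
import csv

def validate_tab_file(path, expected_cols):
    """Return a list of (line_number, problem) tuples for a tab-delimited file."""
    problems = []
    with open(path, newline="", encoding="utf-8", errors="replace") as f:
        reader = csv.reader(f, delimiter="\t")
        for lineno, row in enumerate(reader, start=1):
            if len(row) != expected_cols:
                problems.append((lineno, f"expected {expected_cols} fields, got {len(row)}"))
            elif any("\x00" in field for field in row):
                problems.append((lineno, "NUL byte in field (COPY will reject it)"))
    return problems
```

Running this over each file before the COPY step catches ragged rows and NUL bytes early, with a line number to report, instead of letting COPY fail partway through an import.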
Now people know the signal 11 stuff and the problem can probably be tracked.

Francisco Olarte.
efficiency random values / sequential ID values in indexes
Hi,

Is there any appreciable difference in using random values or sequential values in indexes?

In a multi-tenanted application there is a choice: the single ID field's value is either totally random (a UUID), or the numbers are created with a prefix of the tenant. Since all access will be within a tenant only, will it make any performance difference between using purely random values vs tenant-prefixed sequential values?

thanks
Sanjay
Re: Failing streaming replication on PostgreSQL 14
On Mon, Apr 15, 2024 at 2:53 AM Nicolas Seinlet wrote:
> Hello everyone,
>
> Since I moved some clusters from PostgreSQL 12 to 14, I noticed random
> failures in streaming replication. I say "random" mostly because I haven't
> got the source of the issue.
>
> I'm using the Ubuntu/cyphered ZFS/PostgreSQL combination. I'm using Ubuntu
> LTS (20.04, 22.04) and the provided ZFS/PostgreSQL for each LTS
> (PostgreSQL 12 on Ubuntu 20.04 and 14 on 22.04).
>
> The streaming replication of PostgreSQL is configured with
> `primary_conninfo 'host=main_server port=5432 user=replicant
> password=a_very_secure_password sslmode=require
> application_name=replication_postgresql_app'`, no replication slot nor
> restore command, and the WAL is configured with `full_page_writes = off
> wal_init_zero = off wal_recycle = off`.
>
> If this works like a charm on PostgreSQL 12, it's sometimes failing with
> PostgreSQL 14. As we also changed the OS, maybe the issue lies somewhere
> else.
>
> When the issue is detected, the WAL on the primary is correct. A piece of
> the WAL is wrong on the secondary. Only some bytes. Some bytes later, the
> WAL is again correct. Stopping PostgreSQL on the secondary, removing the
> wrong WAL file, and restarting PostgreSQL solves the issue.
>
> We've added another secondary and noticed the issue can appear on one of
> the secondaries, not both at the same time.
>
> What can I do to detect the origin of this issue?

1. Minor version number?
2. Using replication_slots?
3. Error message(s)?
Re: efficiency random values / sequential ID values in indexes
On Mon, Apr 15, 2024 at 6:05 AM Sanjay Minni wrote:
> Hi
>
> Is there any appreciable difference in using random values or sequential
> values in indexes?
>
> In a multi-tenanted application there is a choice that the single field
> ID's value is totally random / UUID or the numbers are created with a
> prefix of the tenant. Since all access will be within a tenant only, will
> it make any performance difference between using purely random values vs
> tenant-prefixed values?

Two benefits of tenant-prefixed IDs:

1. In a non-partitioned table, it gives your index "locality of data": all of customer X's record pointers are in *this* subtree. That makes buffers more efficient when a customer runs reports. Bonus points if you then regularly CLUSTER the table on that index.

2. It makes table partitioning by tenant much easier. That also enhances locality of data.

Just make sure that the ID field is BIGINT...
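The locality argument above can be sketched in a few lines: tenant-prefixed sequential keys sort into contiguous per-tenant runs (which is what a B-tree stores), while random UUID keys scatter each tenant across the whole key space. A toy illustration, not a benchmark; the tenant numbers and the 10^12 prefix multiplier are made up for the example:

```python
"""Toy comparison of random UUID keys vs tenant-prefixed sequential keys.
The prefix scheme (tenant_id * 10**12 + seq) is an invented example that
still fits comfortably in a BIGINT."""
import uuid

def tenant_prefixed_id(tenant_id, seq):
    # e.g. tenant 42, row 7 -> 42000000000007
    return tenant_id * 10**12 + seq

# Random keys: a tenant's rows land anywhere in sorted (B-tree) order.
random_keys = sorted((uuid.uuid4(), tenant) for tenant in (1, 2, 3) for _ in range(100))

# Prefixed keys: sorted order groups each tenant into one contiguous run.
prefixed_keys = sorted(tenant_prefixed_id(t, s) for t in (1, 2, 3) for s in range(100))
tenant_of = [k // 10**12 for k in prefixed_keys]
assert tenant_of == sorted(tenant_of)  # runs of 1s, then 2s, then 3s
```

With the prefixed scheme, a per-tenant report touches one contiguous index subtree instead of pages spread across the whole index, which is the buffer-efficiency point made above.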
Re: Failing streaming replication on PostgreSQL 14
On Monday, April 15th, 2024 at 14:36, Ron Johnson wrote:
> On Mon, Apr 15, 2024 at 2:53 AM Nicolas Seinlet wrote:
> > Hello everyone,
> >
> > Since I moved some clusters from PostgreSQL 12 to 14, I noticed random
> > failures in streaming replication. I say "random" mostly because I haven't
> > got the source of the issue.
> >
> > I'm using the Ubuntu/cyphered ZFS/PostgreSQL combination. I'm using Ubuntu
> > LTS (20.04, 22.04) and the provided ZFS/PostgreSQL for each LTS
> > (PostgreSQL 12 on Ubuntu 20.04 and 14 on 22.04).
> >
> > The streaming replication of PostgreSQL is configured with
> > `primary_conninfo 'host=main_server port=5432 user=replicant
> > password=a_very_secure_password sslmode=require
> > application_name=replication_postgresql_app'`, no replication slot nor
> > restore command, and the WAL is configured with `full_page_writes = off
> > wal_init_zero = off wal_recycle = off`.
> >
> > If this works like a charm on PostgreSQL 12, it's sometimes failing with
> > PostgreSQL 14. As we also changed the OS, maybe the issue lies somewhere
> > else.
> >
> > When the issue is detected, the WAL on the primary is correct. A piece of
> > the WAL is wrong on the secondary. Only some bytes. Some bytes later, the
> > WAL is again correct. Stopping PostgreSQL on the secondary, removing the
> > wrong WAL file, and restarting PostgreSQL solves the issue.
> >
> > We've added another secondary and noticed the issue can appear on one of
> > the secondaries, not both at the same time.
> >
> > What can I do to detect the origin of this issue?
>
> 1. Minor version number?
> 2. Using replication_slots?
> 3. Error message(s)?

Hi,

1. PostgreSQL 14.11
2. No, no replication slot nor restore command. As we've understood the replication slot, it's a mechanism to keep on the primary side everything needed for the secondary to recover. Will this make the primary acknowledge that the secondary received the good WAL file?
3.
incorrect resource manager data checksum

Looking at the WAL files with xxd gives the following diff. The bad one:

006c9160: 0a6e 7514 5030 2e31 0e35 016c 0f07 0009 2f62 6568 6100 7669 6f72 3a6e 6f72 be6d  .nu.P0.1.5.l/beha.vior:nor.m
006c9180:
006c91a0:
006c91c0: 437a 4263 7500 7273 6f72 3a70 6f69 0302 4503 9023 3237 3665 3720 323b 223e 5527  CzBcu.rsor:poi..E..#276e7 2;">U'

The good one contains the same 1st and 4th lines, but the 2nd and 3rd lines contain the correct values, as if a packet was missed.

Thanks for helping,
Nicolas.
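For comparisons like the xxd diff above, a short script can report exactly which byte ranges of the two copies of a WAL segment differ. A minimal sketch; the file paths in the usage comment are placeholders, not real paths from this thread:

```python
"""Locate the differing byte ranges between two copies of a file,
e.g. the primary's and the standby's copy of a WAL segment."""

def diff_ranges(data_a, data_b):
    """Return a list of (start, end) offsets where the two byte strings differ."""
    ranges = []
    start = None
    for i, (x, y) in enumerate(zip(data_a, data_b)):
        if x != y and start is None:
            start = i                      # a differing run begins
        elif x == y and start is not None:
            ranges.append((start, i))      # the run ends just before i
            start = None
    if start is not None:
        ranges.append((start, min(len(data_a), len(data_b))))
    return ranges

# Usage sketch (paths are hypothetical placeholders):
# with open("primary/segment", "rb") as f:
#     good = f.read()
# with open("standby/segment", "rb") as f:
#     bad = f.read()
# for s, e in diff_ranges(good, bad):
#     print(f"differs at 0x{s:08x}..0x{e:08x} ({e - s} bytes)")
```

This makes "only some bytes are wrong, then it's correct again" quantifiable: the output shows whether the corruption is one contiguous run (suggesting a missed or damaged chunk in transit) or scattered single bytes.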
Re: constant crashing - hardware issue and thank you
It seems that the hardware may in fact be the issue. I ran the exact same process during the night on a much slower machine (4-core Xeon E5-1620 v4, 32 GB RAM) and it worked fine. It did not crash and completed all the updates on 113 million records successfully. I will be bringing the i9 in for a checkup to find the problem. Thank you all for your help with this issue.
No error message/code for commands after libpq pipeline abortion
Hello,

I am writing a Lua client library for PostgreSQL using libpq that supports pipelining in an async/effect-based context. I understand that when an error occurs, all subsequent commands also fail (where PQgetResult returns PGRES_PIPELINE_ABORTED). However, only the first command has an error message via PQresultErrorMessage and an error code via PQresultErrorField(..., PG_DIAG_SQLSTATE) set. All other, subsequently obtained results lack an error message or an SQLSTATE.

I would expect something like "ERROR: command failed due to previous failure in pipeline", but I don't get anything like that. Why is that? Is it intentional? Is it documented somewhere?

Further question: How do I deal with it with regard to the user of my library? Should I make up my own error message? Which error code could I set? I didn't find any appropriate error code in Appendix A of the documentation:

https://www.postgresql.org/docs/16/errcodes-appendix.html

My current approach is to just set "*" as an error code, which doesn't feel good:

https://github.com/JanBeh/neumond/blob/9889c5fc19dbb23dff483e4a1cb2e4ef3ab74085/pgeff.c#L332-L345

Maybe I'm just overlooking something?

Kind regards,
Jan Behrens
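One common wrapper-level answer to the question above is to synthesize an error for the aborted results yourself, reusing the SQLSTATE of the first real failure. The sketch below models that in plain Python (not libpq); all class and field names here are invented for illustration, and the statuses are strings standing in for libpq's result-status constants:

```python
"""Sketch of a client-library pattern: give PGRES_PIPELINE_ABORTED results
a synthetic error that points back to the first real failure. All names
are invented for illustration; this is not libpq API."""

class PipelineError(Exception):
    def __init__(self, message, sqlstate):
        super().__init__(message)
        self.sqlstate = sqlstate

def resolve_results(raw_results):
    """raw_results: list of (status, message, sqlstate) tuples, as a wrapper
    might collect them from successive PQgetResult calls."""
    first_error = None
    resolved = []
    for status, message, sqlstate in raw_results:
        if status == "PGRES_FATAL_ERROR":
            first_error = PipelineError(message, sqlstate)
            resolved.append(first_error)
        elif status == "PGRES_PIPELINE_ABORTED":
            # libpq leaves message/SQLSTATE empty here, so reuse the first
            # failure's details with an explanatory prefix.
            resolved.append(PipelineError(
                f"command skipped due to earlier pipeline failure: {first_error}",
                first_error.sqlstate if first_error else None))
        else:
            resolved.append((status, message))
    return resolved
```

Whether inheriting the original SQLSTATE is better than a made-up sentinel like "*" is a design choice; at least it gives callers a real, documented code to branch on.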
Re: (When) can a single SQL statement return multiple result sets?
On 11.04.24 01:02, Tom Lane wrote:
> > And if not, why can't I write a stored procedure or function that
> > returns multiple result sets?
>
> [ shrug... ] Lack of round tuits, perhaps. We don't have any mechanism
> today whereby a stored procedure could say "please ship this resultset
> off to the client, but I want to continue afterwards". But you can do
> that in other RDBMSes and probably somebody will be motivated to make
> it possible in Postgres.

The development of this feature was the subject of this thread:

https://www.postgresql.org/message-id/flat/6e747f98-835f-2e05-cde5-86ee444a7...@2ndquadrant.com

But it has not concluded successfully yet.
Re: (When) can a single SQL statement return multiple result sets?
On Wed, 10 Apr 2024 19:02:48 -0400 Tom Lane wrote:
> Jan Behrens writes:
> > While writing a PostgreSQL client library for Lua supporting
> > Pipelining (using PQsendQueryParams), I have been wondering if there
> > are any single SQL commands that return multiple result sets.
>
> Right now, I don't think so. I believe the current protocol design
> intends to support that, and I think this may trace back to some
> ancient idea at Berkeley that if you select from an inheritance
> hierarchy where the child tables aren't all alike, you should be
> able to see all the child data, which'd require changing tuple
> descriptors midstream. But our current interpretation of SQL
> SELECT forbids that.

I thought multiple result sets are supported for commands like PQexec, where "multiple queries sent in a single PQexec call" are explicitly supported, and which then return multiple result sets. This, however, doesn't apply to pipelining, because PQexec is not available in pipeline mode.

> > Here, "DELETE FROM magic" returns multiple result sets, even though it
> > is only a single SQL statement.
>
> Right, so it's kind of a case that you have to support. We're not
> likely to rip out rules anytime soon, even if they're a bit
> deprecated.

As it seems to be a corner case that rarely occurs in practice, I was considering simply not supporting this case in my client library. I don't know which SQL error code I could return in that case, though. Maybe "0A000" (feature_not_supported) or "21000" (cardinality_violation). I'm not sure if either of those is a good choice. Any better idea?

> > The case outlined above seems to be a somewhat special case. I haven't
> > found any other way to return multiple results (other than sending
> > several semicolon-separated statements, which is not supported by
> > PQsendQueryParams). So is there any (other) case where I reasonably
> > should expect several result sets returned by PQgetResult (before
> > PQgetResult returns NULL)?
> > Wouldn't it make sense to disallow such
> > behavior altogether?
>
> No. For one thing, there's too much overlap between what you're
> suggesting and pipelined queries.

Which question was the "no" the answer to? I'm not sure I understand.

> regards, tom lane

Regards,
Jan Behrens
Re: Failing streaming replication on PostgreSQL 14
On 2024-Apr-15, Nicolas Seinlet wrote:
> I'm using the Ubuntu/cyphered ZFS/PostgreSQL combination. I'm using
> Ubuntu LTS (20.04 22.04) and provided ZFS/PostgreSQL with LTS
> (PostgreSQL 12 on Ubuntu 20.04 and 14 on 22.04).

What exactly is "cyphered ZFS"? Can you reproduce the problem with some other filesystem? If it's something very unusual, it might well be a bug in the filesystem.

--
Álvaro Herrera        Breisgau, Deutschland        https://www.EnterpriseDB.com/
"Find a bug in a program, and fix it, and the program will work today.
Show the program how to find and fix a bug, and the program will work
forever" (Oliver Silfridge)
Re: (When) can a single SQL statement return multiple result sets?
On Mon, Apr 15, 2024 at 10:24 AM Jan Behrens wrote:
> On Wed, 10 Apr 2024 19:02:48 -0400 Tom Lane wrote:
> >
> > > Here, "DELETE FROM magic" returns multiple result sets, even though it
> > > is only a single SQL statement.
> >
> > Right, so it's kind of a case that you have to support. We're not
> > likely to rip out rules anytime soon, even if they're a bit
> > deprecated.
>
> As it seems to be a corner case that rarely occurs in practice, I was
> considering to simply not support this case in my client library. I
> don't know which SQL error code I could return in that case though.
> Maybe "0A000" (feature_not_supported) or
> "21000" (cardinality_violation). Not sure if either of those is a good
> choice. Any better idea?

If you are asking whether "rules" can be ignored or error-walled in terms of your library design, I'd say yes. 100% yes. The main caveat would then be the proposed multi-resultset stored procedure feature, which might break the "one result per semicolon" assumption you might be chasing. Since that feature has some basis in the standard, I'd be balancing risk/reward against it if I were you.

merlin