Fwd: "Go" (lang) standard driver
-- Forwarded message - From: Tony Shelver Date: Tue, 18 Aug 2020 at 09:33 Subject: Re: "Go" (lang) standard driver To: Edson Richter A quick Google search found https://github.com/lib/pq. It has 6.1K stars, so I would guess it is fairly well used. On Tue, 18 Aug 2020 at 05:56, Edson Richter wrote: > Is there any standard PostgreSQL driver for the Go language? > Could members of this group share their experiences with Go + PostgreSQL > development? > > Thanks, > > E.R. >
Re: "Go" (lang) standard driver
Hi Edson, On Tue, 18 Aug 2020 at 09:36, Tony Shelver wrote: > -- Forwarded message - > From: Tony Shelver > Date: Tue, 18 Aug 2020 at 09:33 > Subject: Re: "Go" (lang) standard driver > To: Edson Richter > > > A quick Google search found https://github.com/lib/pq. It has 6.1K stars, > so I would guess it is fairly well used. > I have used it successfully and it performed perfectly. Go for it. > On Tue, 18 Aug 2020 at 05:56, Edson Richter > wrote: > >> Is there any standard PostgreSQL driver for the Go language? >> Could members of this group share their experiences with Go + PostgreSQL >> development? >> >> Thanks, >> >> E.R. >> > Olivier >
Re: passing linux user to PG server as a variable ?
David Gauthier wrote: > I can avoid the error by just throwing a namespace in there... > atletx7-reg036:/home/dgauthie[ 214 ] --> setenv PGOPTIONS "-c 'os.user=$USER' " > But once in, "show os.user" is undefined. It's documented to work [1], but you need to remove these single quotes. For instance:

$ env PGOPTIONS="-c os.user=$USER" psql -d postgres
psql (12.4 (Debian 12.4-1.pgdg90+1))
Type "help" for help.

postgres=> show "os.user";
 os.user
---------
 daniel
(1 row)

If characters need to be quoted in the value, backslash should be used as the quote character. Since it's also a quote character for the shell, two levels of quoting are needed:

$ env PGOPTIONS="-c os.user=user\\ is\\ $USER" psql -d postgres
psql (12.4 (Debian 12.4-1.pgdg90+1))
Type "help" for help.

postgres=> show "os.user";
    os.user
----------------
 user is daniel
(1 row)

[1] https://www.postgresql.org/docs/current/config-setting.html

Best regards, -- Daniel Vérité PostgreSQL-powered mailer: https://www.manitou-mail.org Twitter: @DanielVerite
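As a postscript, a minimal sketch of reading the same custom setting back from SQL with the standard GUC accessors (the second argument to current_setting makes it return NULL instead of raising an error when the setting is absent; 'someuser' is a placeholder value):

SELECT current_setting('os.user', true);          -- read the custom setting, NULL if unset
SELECT set_config('os.user', 'someuser', false);  -- change it for the rest of the session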
BUG? Slave doesn't reconnect to the master
Hi all. I have found some strange behaviour in Postgres which I believe is a bug. First of all, let me explain the situation. I created a test bed (not sure that's the right name for it) to test high-availability clusters based on Pacemaker and PostgreSQL. The test bed consists of 12 virtual machines (on VirtualBox) running on a MacBook Pro, forming 4 HA clusters with different structures. All 4 HA clusters are constantly tested in a loop: failures of various kinds are simulated, failover is awaited, the cluster is repaired, and so on. For simplicity I'll describe only one HA cluster: 3 virtual machines, with the master on one and a sync slave and an async slave on the others. The PostgreSQL service is provided by float IPs pointing to the working master and slaves; the slaves are connected to the master's float IP too. When Pacemaker detects a failure, for instance on the master, it promotes a master on the other node with the lowest-latency WAL and switches the float IPs, so the third node remains a sync slave. My company has decided to release this project as open source; I am now finishing the formalities.

It almost works fine, but sometimes, rather rarely, I have observed that a slave does not reconnect to the new master after a failure. The first case is PostgreSQL-STOP, where I freeze postgres on the master with a STOP signal (kill -STOP) to simulate a hang. The slave does not reconnect to the new master, with these errors in the log:

18:02:56.236 [3154] FATAL: terminating walreceiver due to timeout
18:02:56.237 [1421] LOG: record with incorrect prev-link 0/1600DDE8 at 0/1A00DE10

What is strange is that the error about the incorrect WAL record is raised after the connection is terminated. This can be worked around by turning off the WAL receiver timeout; with that, PostgreSQL-STOP works fine, but the problem still exists in another test. ForkBomb simulates an out-of-memory situation. In this case, too, a slave sometimes does not reconnect to the new master, with errors in the log:

10:09:43.99 [1417] FATAL: could not receive data from WAL stream: server closed the connection unexpectedly This probably means the server terminated abnormally before or while processing the request.
10:09:43.992 [1413] LOG: invalid record length at 0/D8014278: wanted 24, got 0

The exact final error message (the last row in the log) varies between runs, by the way. What I expect as correct behaviour: the PostgreSQL slave must reconnect to the master IP (the float IP) after wal_retrieve_retry_interval.
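For reference, a minimal sketch of the workaround mentioned above, applied on the standby (wal_receiver_timeout and wal_retrieve_retry_interval are standard settings; 0 disables the walreceiver timeout, and 5s is the default retry interval, shown only for context):

ALTER SYSTEM SET wal_receiver_timeout = 0;            -- disable the walreceiver timeout
ALTER SYSTEM SET wal_retrieve_retry_interval = '5s';  -- the default; how long to wait before retrying
SELECT pg_reload_conf();                              -- both settings are reloadable without a restart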
Point in time recovery
Hello Team, I want to know the best way to ensure/verify that a point-in-time recovery has completed successfully after a crash and restore. Thanks,
Re: Point in time recovery
Search the log file for errors? Query the database(s) to verify that the latest data is there? On 8/18/20 5:10 AM, Daulat Ram wrote: Hello Team, I want to know the best way to ensure/verify that a point-in-time recovery has completed successfully after a crash and restore. Thanks, -- Angular momentum makes the world go 'round.
Re: "Go" (lang) standard driver
Greetings, * Olivier Gautherot (ogauthe...@gautherot.net) wrote: > On Tue, 18 Aug 2020 at 09:36, Tony Shelver wrote: > > -- Forwarded message - > > From: Tony Shelver > > Date: Tue, 18 Aug 2020 at 09:33 > > Subject: Re: "Go" (lang) standard driver > > To: Edson Richter > > > > > > A quick Google search found https://github.com/lib/pq. It has 6.1K stars, > > so I would guess it is fairly well used. > > I have used it successfully and it performed perfectly. Go for it. It also says that it's not actively developed and recommends pgx instead... Thanks, Stephen
Re: Point in time recovery
Greetings, Please don't top-post. * Ron (ronljohnso...@gmail.com) wrote: > Search the log file for errors? Sadly, this doesn't actually work if you're really doing PITR - you need to look for the specific message saying "recovery stopping before commit" or similar, e.g.:

2020-08-18 12:55:31.240 UTC [16149] LOG: recovery stopping before commit of transaction 495, time 2020-08-18 12:55:17.227359+00

If you don't see a message along those lines, and you actually specified a recovery target, chances are good that, for whatever reason, PG didn't find that recovery target and instead just replayed to the end of WAL, which isn't what you want if you're specifying a recovery target. You can see how to take backups, perform restores, and see messages like the above in the PG log here: https://learn.crunchydata.com/pg-administration/courses/basic-postgresql-for-dbas-12/pgbackrest/ (which is what I just did to get at the above message, hence the rather recent timestamp ;) > Query the database(s) to verify that the latest data is there? I do generally recommend regular backup and restore procedures that export the data from the database and then re-import it into a clean system to verify that everything is consistent. > On 8/18/20 5:10 AM, Daulat Ram wrote: > >I want to know the best way to ensure/verify that a point-in-time > >recovery has completed successfully after a crash and restore. If you're coming from a crash, then that's an entirely different thing and you need to make sure you replay all WAL that existed at the time of the crash - you can't do PITR to a point before then or the database system won't be consistent. Thanks, Stephen
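As a supplement, a minimal sketch of queries that can help confirm where replay ended (all three are standard core functions; the replay timestamp should line up with the recovery target you specified):

SELECT pg_is_in_recovery();              -- true while the server is still replaying WAL
SELECT pg_last_wal_replay_lsn();         -- last WAL location replayed during recovery
SELECT pg_last_xact_replay_timestamp();  -- commit time of the last replayed transaction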
Database logins taking longer and longer, showing up as "authentication" in ps(1)
Hi, We have a PostgreSQL 11.3¹ running on an Ubuntu 16.04 server, which sometimes exhibits a behaviour I can't quite understand: simply logging into the database starts to take minutes to complete. We have 60 processes (workers) running on different machines accessing the database, that all grab jobs from a queue and update rows in a table after doing some calculations (which vary in time from <1s to perhaps a minute, many of them fast). Sometimes new database logins slow down, from usually taking <0.05s to taking minutes. This is for psql as a normal user using Kerberos, for psql as the postgres superuser, for the web-application logging into the database, for everything. What I see in the output of ps(1) is a lot of postgres processes in state "authentication", when this problem occurs. It "feels" like the logins are waiting for something - when e.g. psql finally logs in, running queries seem normal. The only "special" thing I can think of is that the workers use advisory locks on rows in the table they are updating (as there can be multiple jobs in flight at the same time, for the same row). I can imagine that a pattern of jobs and workers could make the locking cause those workers to wait and slow each other down, but I don't understand how it can affect the time it takes for new database logins. When this happens, if I stop the 60 workers, the login times are back to normal within seconds, and all the postgres processes showing "authentication" switch away from that state. If I start the workers again, the login problem appears within a couple of minutes. If I start a smaller number of workers, say 5, the problem doesn't appear. Could this be lock contention on a database table somehow affecting logins? I have tried searching for this problem, and found some older references to things² that seem to have been fixed in the meantime. Is anybody else seeing this? The server is configured with max_connections 1200 and currently has ~400 connections. It has 1.5 TB of memory and 112 cores (including hyperthreading) - the load is not high when I see this problem, between 20 and 30. Any ideas and suggestions are welcome - if there is some relevant information I can provide, I will be happy to try. Best regards, Adam ¹ We should plan to upgrade to 11.9, yes. ² E.g. https://www.postgresql.org/message-id/CAJw8uJQGoobesbPCMbxj6Vb4nv9D-GgvZ%2B7pK%2Bfckbb4DqJEAg%40mail.gmail.com and https://www.postgresql.org/message-id/20140312232810.634.7997%40wrigleys.postgresql.org -- "We don't care about crashing our cars / Pedal to the metal and our heads among the stars" Adam Sjøgren a...@koldfront.dk
Re: Database logins taking longer and longer, showing up as "authentication" in ps(1)
On Tue, 2020-08-18 at 14:53 +0200, Adam Sjøgren wrote: > We have a PostgreSQL 11.3¹ running on an Ubuntu 16.04 server, which > sometimes exhibits a behaviour I can't quite understand: simply logging > into the database starts to take minutes to complete. > > We have 60 processes (workers) running on different machines accessing > the database, that all grab jobs from a queue and update rows in a table > after doing some calculations (which vary in time from <1s to perhaps a > minute, many of them fast). > > Sometimes new database logins slow down, from usually taking <0.05s to > taking minutes. This is for psql as a normal user using Kerberos, for > psql as the postgres superuser, for the web-application logging into the > database, for everything. I would suspect that the problem is with the authentication method. How is authentication configured? I'd "strace" a connection that is hanging in "authenticating" and see what the process does. Yours, Laurenz Albe -- Cybertec | https://www.cybertec-postgresql.com
Re: Database logins taking longer and longer, showing up as "authentication" in ps(1)
Greetings, * Adam Sjøgren (a...@koldfront.dk) wrote: > We have a PostgreSQL 11.3¹ running on an Ubuntu 16.04 server, which > sometimes exhibits a behaviour I can't quite understand: simply logging > into the database starts to take minutes to complete. > > We have 60 processes (workers) running on different machines accessing > the database, that all grab jobs from a queue and update rows in a table > after doing some calculations (which vary in time from <1s to perhaps a > minute, many of them fast). > > Sometimes new database logins slow down, from usually taking <0.05s to > taking minutes. This is for psql as a normal user using Kerberos, for > psql as the postgres superuser, for the web-application logging into the > database, for everything. When in doubt, blame DNS. Alternatively, in your case, the issue might be the KDC taking forever to issue a ticket for the service. Note that, in both cases, this is a client-side issue. Once the ticket has been acquired for the PG service, the actual authentication on the server side should be very fast and not depend on things external (though you might check if you have log_hostname on..). Thanks, Stephen
Re: Database logins taking longer and longer, showing up as "authentication" in ps(1)
Laurenz writes: > On Tue, 2020-08-18 at 14:53 +0200, Adam Sjøgren wrote: >> >> We have 60 processes (workers) running on different machines accessing >> the database, that all grab jobs from a queue and update rows in a table >> after doing some calculations (which vary in time from <1s to perhaps a >> minute, many of them fast). >> >> Sometimes new database logins slow down, from usually taking <0.05s to >> taking minutes. This is for psql as a normal user using Kerberos, for >> psql as the postgres superuser, for the web-application logging into the >> database, for everything. > > I would suspect that the problem is with the authentication method. Interesting, how do you suspect the interaction plays out? > How is authentication configured? Primarily using gss (for the web-application, delegated credentials), md5 (system users), and a few trust (for legacy access). > I'd "strace" a connection that is hanging in "authenticating" and > see what the process does. I did that one of the previous times I saw this problem, and saw stuff like this:

2020-06-10 11:22:14 strace: Process 101559 attached
2020-06-10 11:22:14 futex(0x7f46df29e138, FUTEX_WAIT_BITSET|FUTEX_CLOCK_REALTIME, 0, NULL, ) = 0
...
2020-06-10 11:24:09 futex(0x7f46df29e138, FUTEX_WAIT_BITSET|FUTEX_CLOCK_REALTIME, 0, NULL, ) = 0
2020-06-10 11:24:09 futex(0x7f46df2a2e38, FUTEX_WAKE, 1) = 1
2020-06-10 11:24:09 brk(0x55c197084000) = 0x55c197084000
2020-06-10 11:24:09 brk(0x55c1970b) = 0x55c1970b
2020-06-10 11:24:09 open("global/pg_filenode.map", O_RDONLY) = 6

(where the elided part is futex() calls). The slow login happens regardless of whether I log in with Kerberos or as the postgres superuser with psql. Best regards, Adam -- "Sometimes I feel like going home, walking in the boardroom / Sometimes I feel like I wanna go, walk into the ballroom" Adam Sjøgren a...@koldfront.dk
Re: Database logins taking longer and longer, showing up as "authentication" in ps(1)
On Tue, Aug 18, 2020, 8:04 PM Stephen Frost wrote: > Greetings, > > * Adam Sjøgren (a...@koldfront.dk) wrote: > > We have a PostgreSQL 11.3¹ running on an Ubuntu 16.04 server, which > > sometimes exhibits a behaviour I can't quite understand: simply logging > > into the database starts to take minutes to complete. > > > > We have 60 processes (workers) running on different machines accessing > > the database, that all grab jobs from a queue and update rows in a table > > after doing some calculations (which vary in time from <1s to perhaps a > > minute, many of them fast). > > > > Sometimes new database logins slow down, from usually taking <0.05s to > > taking minutes. This is for psql as a normal user using Kerberos, for > > psql as the postgres superuser, for the web-application logging into the > > database, for everything. > > When in doubt, blame DNS. > > Alternatively, in your case, the issue might be the KDC taking forever > to issue a ticket for the service. > > Note that, in both cases, this is a client-side issue. If it is a client-side issue, how do you explain the behaviour that the OP is not facing the problem when there are only a few clients? Regards, Seenu. > Once the ticket > has been acquired for the PG service, the actual authentication on the > server side should be very fast and not depend on things external > (though you might check if you have log_hostname on..). > > Thanks, > > Stephen >
Re: Database logins taking longer and longer, showing up as "authentication" in ps(1)
Stephen Frost writes: > * Adam Sjøgren (a...@koldfront.dk) wrote: >> Sometimes new database logins slow down, from usually taking <0.05s to >> taking minutes. This is for psql as a normal user using Kerberos, for >> psql as the postgres superuser, for the web-application logging into the >> database, for everything. > When in doubt, blame DNS. The trouble with outside-the-database explanations is that it's not clear why stopping the worker processes would make such issues go away. I think looking into pg_stat_activity and pg_locks might be interesting, although it's possible the delay is before the new session has made any entries there. Also, obtaining stack traces from a few of the stuck processes would likely be quite informative. regards, tom lane
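For reference, a minimal sketch of such a look at the two standard system views (the column choice and grouping here are illustrative; adapt filters as needed):

SELECT pid, backend_type, state, wait_event_type, wait_event, backend_start
FROM pg_stat_activity
ORDER BY backend_start DESC;

SELECT locktype, relation::regclass, mode, granted, count(*)
FROM pg_locks
GROUP BY 1, 2, 3, 4
ORDER BY count(*) DESC;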
Re: Database logins taking longer and longer, showing up as "authentication" in ps(1)
Stephen writes: > * Adam Sjøgren (a...@koldfront.dk) wrote: >> >> We have 60 processes (workers) running on different machines accessing >> the database, that all grab jobs from a queue and update rows in a table >> after doing some calculations (which vary in time from <1s to perhaps a >> minute, many of them fast). >> >> Sometimes new database logins slow down, from usually taking <0.05s to >> taking minutes. This is for psql as a normal user using Kerberos, for >> psql as the postgres superuser, for the web-application logging into the >> database, for everything. > > When in doubt, blame DNS. I'd love to! However, I don't see any DNS outages on our local network correlating with whether I run 60 workers or 5 workers. > Alternatively, in your case, the issue might be the KDC taking forever > to issue a ticket for the service. If that were the cause, logging in as the 'postgres' superuser (not using Kerberos) locally on the server should be fast regardless, right? > (though you might check if you have log_hostname on..). It's off:

$ grep hostname /etc/postgresql/11/main/postgresql.conf
#log_hostname = off

Thanks for replying! Best regards, Adam -- "Probabilistic algorithms don't appeal to me. (This is a question of aesthetics, not practicality.) So later, I figured out how to remove the probability and turn it into a deterministic algorithm." Adam Sjøgren a...@koldfront.dk
import XML
Hi all, I want to import an XML file into a PG database table. I've found functions to get the XML content of a cell after importing an XML file with the pg_get_file function. But I want to explode the XML content into columns. How can I do this? PG 10 under Ubuntu 18. -- Greetings, Pascal CROZET DBA - Qualis Consulting
Re: import XML
On Tue, Aug 18, 2020 at 8:27 AM PASCAL CROZET <pascal.cro...@qualis-consulting.com> wrote: > I want to import an XML file into a PG database table. > I've found functions to get the XML content of a cell after importing an XML > file with the pg_get_file function. > But I want to explode the XML content into columns. How can I do this? > In short, you need to import the content as a document value first, then use functions to explode that document. You should find the documentation helpful: https://www.postgresql.org/docs/10/functions-xml.html Specifically, "xmltable". Keep in mind that "content as a document value" can just be a literal. David J.
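A minimal sketch of xmltable in action, using a hypothetical <rows><row>...</row></rows> layout (adapt the XPath expressions and column list to your actual document):

SELECT xt.*
FROM (VALUES ('<rows>
                 <row><id>1</id><name>alpha</name></row>
                 <row><id>2</id><name>beta</name></row>
               </rows>'::xml)) AS d(doc),
     XMLTABLE('/rows/row' PASSING d.doc
              COLUMNS id   int  PATH 'id',
                      name text PATH 'name') AS xt;

The literal here stands in for whatever document value you loaded; each matched row element becomes one result row with typed columns.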
Re: Index tuple deduplication limitations in pg13
On Mon, Aug 17, 2020 at 11:44 PM Matthias van de Meent wrote: > But, if the ordering of operator-class equal tuples is already > system-defined, could the physical ordering of index tuples in a btree > (with deduplication enabled for "unsafe" opclasses) be updated from > [index_columns, tid] to [index_columns, > image_compare(non_datum_equal_columns), tid], giving a stable sorting > of opclass-equal and image-equal values and enabling safe consistent > deduplication? The issue isn't the physical ordering. The issue is that we cannot allow the implementation to destroy semantic differences among equal datums. We avoid deduplication with cases where two equal datums are visibly different. For example, it would not be okay if we forgot that your numeric datum was originally input as '5.000', and output '5' later on. If we wanted to fix this for numeric, we'd have to invent a new numeric datatype (called numeric2, say). That probably isn't as hard as it sounds, since it could be part of the same B-Tree operator family as numeric. It could also be implicitly cast to numeric. -- Peter Geoghegan
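A quick illustration of that semantic difference in plain SQL (the literals are arbitrary):

SELECT '5.000'::numeric = '5'::numeric AS opclass_equal,  -- true: numeric comparison ignores display scale
       '5.000'::numeric::text AS a,                       -- '5.000': the display scale survives output
       '5'::numeric::text AS b;                           -- '5'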
Re: Index tuple deduplication limitations in pg13
On Tue, Aug 18, 2020 at 9:44 AM Peter Geoghegan wrote: > If we wanted to fix this for numeric, we'd have to invent a new > numeric datatype (called numeric2, say). That probably isn't as hard > as it sounds, since it could be part of the same B-Tree operator > family as numeric. It could also be implicitly cast to numeric. I forgot to say: numeric2 would be just like numeric, except in one specific way: it wouldn't care about display scale. The user would be giving up on display scale by choosing numeric2 over numeric. The "5 vs 5.000" information would always be lost by design, so there'd be nothing for deduplication to break. Deduplication could then be enabled. -- Peter Geoghegan
Re: Index tuple deduplication limitations in pg13
On Tue, Aug 18, 2020 at 11:52 AM Matthias van de Meent wrote: > Deduplication does not need to destroy semantic differences? 'equal' > can (in my book) mean: > - 'opclass-equal', that is, the opclass returns true for an equality check > - 'binary equal' or 'datum-equal' (? maybe incorrect term), that is, > the unprocessed on-disk representations (datum image is the right term, > I believe?) of the compared values are indistinguishable. > > Runs of 'binary equal' datums can be freely deduplicated [0] when found. > [0] > Inserting a row with TID ntid into a deduplicated index can > encounter a posting list of opclass-equal but not datum-image-equal > tuples where the lowest TID of the posting list is less than ntid, and > ntid is less than the highest TID of the posting list. This would > require a posting list split to accommodate the new tuple's index entry > in order not to lose data. But you can't do that easily, because it breaks subtle assumptions about posting list splits and space utilization. In particular, it means that you can no longer think of a posting list split as rewriting an incoming new item such that you can more or less pretend that there was no overlap in the first place -- code like _bt_split and nbtsplitloc.c relies on this. Introducing new special cases to nbtsplitloc.c is very unappealing. More concretely, if you introduce a posting list split like this then you need three copies of the key -- the original, the new, and a second copy of the original. That's much more complicated. -- Peter Geoghegan
Orphaned relations after crash/sigkill during CREATE TABLE
Postgres 12.4. I was directed in Slack to mention here that we're being impacted by Postgres leaving orphaned files in /base/ after a crash while a CREATE TABLE is being run in a transaction. The issue is the same as the reproduction steps listed here [1], that is:

- Start a CREATE TABLE transaction for a large table
- Terminate the process via kill -9
- Files are left in /base that no longer have any relfilenode references, such that `pg_database_size()` grows to be very large while total table+index size remains constant

However, in our particular case, we're using a managed/cloud Postgres server and our `CREATE TABLE` transaction was being terminated by the OOM killer. Using a managed service, we don't have filesystem access to go and clear out these orphaned files. This caused our total db size to grow from 40GB of table+index data to 4TB on-disk (but still only 40GB of table+index data, the other ~3.95TB being orphaned CREATE TABLE files). I realize (per a blog post from Robert Haas [2] and from Slack conversation) that this is a known issue, but I was directed here from Slack to just mention that we were impacted by it, and have no resolution due to not having filesystem access, and no method internal to Postgres to deal with these orphaned relations. (Our recourse currently is to do something like a pg_dump/pg_restore onto a clean instance in order to escape the orphaned files.) -Jason [1] https://github.com/bdrouvot/pg_orphaned#example-1 [2] https://rhaas.blogspot.com/2020/05/dont-manually-modify-postgresql-data.html
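For anyone wanting to observe the gap from SQL alone, a rough sketch comparing on-disk database size with the sizes of known relations (both functions are built-ins; the difference between the two numbers approximates the orphaned space):

SELECT pg_size_pretty(pg_database_size(current_database()))     AS on_disk,
       pg_size_pretty(sum(pg_total_relation_size(oid))::bigint) AS accounted_for
FROM pg_class
WHERE relkind IN ('r', 'm');  -- tables and matviews; includes their indexes and TOAST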
Re: Orphaned relations after crash/sigkill during CREATE TABLE
On 8/18/20 12:35 PM, Jason Myers wrote: Postgres 12.4. I was directed in Slack to mention here that we're being impacted by Postgres leaving orphaned files in /base/ after a crash while a CREATE TABLE is being run in a transaction. The issue is the same as the reproduction steps listed here [1], that is:

- Start a CREATE TABLE transaction for a large table
- Terminate the process via kill -9
- Files are left in /base that no longer have any relfilenode references, such that `pg_database_size()` grows to be very large while total table+index size remains constant

So from [1] you are using CREATE TABLE AS. Have you tried:

BEGIN;
CREATE TABLE some_table AS SELECT some_data FROM other_table LIMIT 1 WITH NO DATA;
COMMIT;

The above gets you the table structure, but no data.

BEGIN;
INSERT INTO some_table SELECT * FROM other_table;
COMMIT;

The above populates the table. I have not tested, but I'm going to assume that if you kill the above, the problem would not happen, or would be fixable by DELETE FROM some_table / TRUNCATE some_table.

However, in our particular case, we're using a managed/cloud Postgres server and our `CREATE TABLE` transaction was being terminated by the OOM killer. Using a managed service, we don't have filesystem access to go and clear out these orphaned files. This caused our total db size to grow from 40GB of table+index data to 4TB on-disk (but still only 40GB of table+index data, the other ~3.95TB being orphaned CREATE TABLE files). I realize (per a blog post from Robert Haas [2] and from Slack conversation) that this is a known issue, but I was directed here from Slack to just mention that we were impacted by it, and have no resolution due to not having filesystem access, and no method internal to Postgres to deal with these orphaned relations. (Our recourse currently is to do something like a pg_dump/pg_restore onto a clean instance in order to escape the orphaned files.) -Jason [1] https://github.com/bdrouvot/pg_orphaned#example-1 [2] https://rhaas.blogspot.com/2020/05/dont-manually-modify-postgresql-data.html -- Adrian Klaver adrian.kla...@aklaver.com
Re: Index tuple deduplication limitations in pg13
On Tue, Aug 18, 2020 at 11:52 AM Matthias van de Meent wrote: > Given that the above could work, the current btree tuple ordering is > not optimized for opclass-equal but datum image-distinct values: > ordering of opclass-equal values is currently determined only by tid, > with as an example current ordering ['0.0', '0', '0.00', '0', '0.0', > '0']. It would be more optimized for deduplication if that was stored > as e.g. ['0', '0', '0', '0.0', '0.0', '0.00'], which is why I > suggested to add an ordering by the datum image before the tid > ordering. Additionally, this extra ordering also prevents the problem > of [0] by never attempting an insertion of non-equal image datums in a > posting list of otherwise equal values, as it would be ordered either > before or after the posting list, never inside the list. Yeah, that would work, but at the cost of making numeric totally unusable. Now you cannot rely on unique enforcement detecting that '0' is a duplicate of '0.0'. In fact, even the most trivial use of the = operator will be broken in the presence of different display scales. It's a non-starter. The numeric2 design that I sketched is a bit ugly, but I can see no better way. A three-way posting list split (i.e. the other design that you sketched) is a special case that is very hard to test, very complicated, and of little value in the grand scheme of things. -- Peter Geoghegan
Re: Orphaned relations after crash/sigkill during CREATE TABLE
> On Tue, Aug 18, 2020 at 3:49 PM Adrian Klaver wrote: > > Have you tried: > BEGIN; > CREATE TABLE some_table AS SELECT some_data FROM other_table LIMIT 1 WITH NO DATA; > COMMIT; > > The above gets you the table structure, but no data. > > BEGIN; > INSERT INTO some_table SELECT * FROM other_table; > COMMIT; > > The above populates the table. Thanks -- we were indeed creating and populating the new table all in a single transaction. I'll see if we can split this into two transactions so that the table structure is committed quickly. I think you're right that this would mostly sidestep the issue. -Jason p.s. Apologies if this is formatted wrong, this is my first mailing list post.
Re: Orphaned relations after crash/sigkill during CREATE TABLE
On 8/18/20 1:19 PM, Jason Myers wrote: > On Tue, Aug 18, 2020 at 3:49 PM Adrian Klaver <adrian.kla...@aklaver.com> wrote: > Have you tried: > BEGIN; > CREATE TABLE some_table AS SELECT some_data FROM other_table LIMIT 1 WITH NO DATA; > COMMIT; > > The above gets you the table structure, but no data. > > BEGIN; > INSERT INTO some_table SELECT * FROM other_table; > COMMIT; > > The above populates the table. Thanks -- we were indeed creating and populating the new table all in a single transaction. I'll see if we can split this into two transactions so that the table structure is committed quickly. I think you're right that this would mostly sidestep the issue. Assuming the table you are pulling from is fairly static, it could also allow you to regulate the amount of data you transfer at any one time into the new table. -Jason p.s. Apologies if this is formatted wrong, this is my first mailing list post. -- Adrian Klaver adrian.kla...@aklaver.com
Re: "Go" (lang) standard driver
I have been using pgx and it seems to be working well. If you're just starting, I recommend using the v4 version. On Tue, Aug 18, 2020 at 5:53 AM Stephen Frost wrote: > Greetings, > > * Olivier Gautherot (ogauthe...@gautherot.net) wrote: > > On Tue, 18 Aug 2020 at 09:36, Tony Shelver wrote: > > > -- Forwarded message - > > > From: Tony Shelver > > > Date: Tue, 18 Aug 2020 at 09:33 > > > Subject: Re: "Go" (lang) standard driver > > > To: Edson Richter > > > > > > A quick Google search found https://github.com/lib/pq. It has 6.1K stars, > > > so I would guess it is fairly well used. > > > > I have used it successfully and it performed perfectly. Go for it. > > It also says that it's not actively developed and recommends pgx > instead... > > Thanks, > > Stephen > -- T: @Thaumion IG: Thaumion scot...@gmail.com
Re: Index tuple deduplication limitations in pg13
On Tue, 18 Aug 2020 at 18:44, Peter Geoghegan wrote: > > On Mon, Aug 17, 2020 at 11:44 PM Matthias van de Meent > wrote: > > But, if the ordering of operator-class equal tuples is already > > system-defined, could the physical ordering of index tuples in a btree > > (with deduplication enabled for "unsafe" opclasses) be updated from > > [index_columns, tid] to [index_columns, > > image_compare(non_datum_equal_columns), tid], giving a stable sorting > > of opclass-equal and image-equal values and enabling safe consistent > > deduplication? > > The issue isn't the physical ordering. The issue is that we cannot > allow the implementation to destroy semantic differences among equal > datums. Deduplication does not need to destroy semantic differences? 'equal' can (in my book) mean: - 'opclass-equal', that is, the opclass returns true for an equality check - 'binary equal' or 'datum-equal' (? maybe incorrect term), that is, the unprocessed on-disk representations (datum image is the right term, I believe?) of the compared values are indistinguishable. Runs of 'binary equal' datums can be freely deduplicated [0] when found. > We avoid deduplication with cases where two equal datums are > visibly different. For example, it would not be okay if we forgot that > your numeric datum was originally input as '5.000', and output '5' > later on. I agree on that point. But isn't this display scale also stored in the 'datum image' [1], so that it can be distinguished for two opclass-equal values while deduplicating, putting only binary-equal values in a posting list? E.g. all values generated as '0.000'::numeric can be compressed into one posting tuple without information loss, as the datum image is effectively the same for all of these (unless I have missed something), and distinct from the datum image of '0'::numeric. With enough equal values, you would eventually have a posting list for each distinct datum image of the equal values. Given that the above could work, the current btree tuple ordering is not optimized for opclass-equal but datum-image-distinct values: ordering of opclass-equal values is currently determined only by tid, with as an example current ordering ['0.0', '0', '0.00', '0', '0.0', '0']. It would be more optimized for deduplication if that was stored as e.g. ['0', '0', '0', '0.0', '0.0', '0.00'], which is why I suggested adding an ordering by the datum image before the tid ordering. Additionally, this extra ordering also prevents the problem of [0] by never attempting an insertion of non-equal image datums in a posting list of otherwise equal values, as it would be ordered either before or after the posting list, never inside the list. > If we wanted to fix this for numeric, we'd have to invent a new > numeric datatype (called numeric2, say). That probably isn't as hard > as it sounds, since it could be part of the same B-Tree operator > family as numeric. It could also be implicitly cast to numeric. > > -- > Peter Geoghegan

[0] Inserting a row with TID ntid into a deduplicated index can encounter a posting list of opclass-equal but not datum-image-equal tuples where the lowest TID of the posting list is less than ntid, and ntid is less than the highest TID of the posting list. This would require a posting list split to accommodate the new tuple's index entry in order not to lose data.

[1] # a table with 1000 distinct values, each duplicated 700 times, but split into 7x100 'datum-equal' value groups that should be deduplicatable.

SELECT (n % 1000 || '.' || repeat('0', n % 7))::numeric AS num
INTO numbers
FROM generate_series(1, 700000) series(n);
CREATE INDEX numeric_index ON numbers (num);
ANALYZE numbers;
SELECT num FROM numbers ORDER BY num LIMIT 700;
-- this is an index-only scan, and the results show numeric scale, so the information must be stored in the index.
Re: Index tuple deduplication limitations in pg13
On Tue, 18 Aug 2020 at 22:00, Peter Geoghegan wrote: > > On Tue, Aug 18, 2020 at 11:52 AM Matthias van de Meent > wrote: > > Given that the above could work, the current btree tuple ordering is > > not optimized for opclass-equal but datum image-distinct values: > > ordering of opclass-equal values is currently determined only by tid, > > with as an example current ordering ['0.0', '0', '0.00', '0', '0.0', > > '0']. It would be more optimized for deduplication if that was stored > > as e.g. ['0', '0', '0', '0.0', '0.0', '0.00'], which is why I > > suggested to add an ordering by the datum image before the tid > > ordering. Additionally, this extra ordering also prevents the problem > > of [0] by never attempting an insertion of non-equal image datums in a > > posting list of otherwise equal values, as it would be ordered either > > before or after the posting list, never inside the list. > > Yeah, that would work, but at the cost of making numeric totally > unusable. Now you cannot rely on unique enforcement detecting that '0' > is a duplicate of '0.0'. In fact, even the most trivial use of the = > operator will be broken in the presence of different display scales. > It's a non-starter. Would this extra ordering not effectively be an extra tiebreaker in the ordering, applied before the TID? I do not know the full implications of that, but I believe that would not result in the limitations that you are mentioning. - Matthias
Re: Index tuple deduplication limitations in pg13
On Tue, Aug 18, 2020 at 1:31 PM Matthias van de Meent wrote: > Would this extra ordering not effectively be an extra tiebreaker in > the ordering, applied before the TID? I do not know the full > implications of that, but I believe that would not result in the > limitations that you are mentioning. You could probably do it that way, but again you end up with a lot of new complexity. Not to mention overhead that would have to be paid by everyone. It would require code that supported the old way (even if it was added to Postgres 13) for pg_upgrade, that would also be hard to test. And it might defeat certain future optimizations based on heap TID being the only tiebreaker. Having two types of equality might have to bleed into the optimizer. It's a question of engineering trade-offs. I don't think that it's worth it. -- Peter Geoghegan
Re: "Go" (lang) standard driver
I can recommend sqlx: https://github.com/jmoiron/sqlx On Tue, Aug 18, 2020 at 4:56 PM Scottix wrote: > I have been using pgx and it seems to be working well. > If you're just starting, I recommend using the v4 version. > On Tue, Aug 18, 2020 at 5:53 AM Stephen Frost wrote: >> Greetings, >> * Olivier Gautherot (ogauthe...@gautherot.net) wrote: >> > On Tue, 18 Aug 2020 at 09:36, Tony Shelver wrote: >> > > -- Forwarded message - >> > > From: Tony Shelver >> > > Date: Tue, 18 Aug 2020 at 09:33 >> > > Subject: Re: "Go" (lang) standard driver >> > > To: Edson Richter >> > > A quick Google search found https://github.com/lib/pq. It has 6.1K stars, >> > > so I would guess it is fairly well used. >> > I have used it successfully and it performed perfectly. Go for it. >> It also says that it's not actively developed and recommends pgx >> instead... >> Thanks, >> Stephen > -- T: @Thaumion IG: Thaumion scot...@gmail.com
postgres materialized view refresh in parallel
I created a materialized view in parallel by setting max_parallel_workers_per_gather to 4. I could see parallel sessions kicking off and creating the mview much faster than a serial execution (10 minutes parallel vs. over an hour serial). Now I want the refresh of the mview to take about the same time as the create; I am okay with it taking a little longer than the create, to execute the extra steps a refresh involves. But what I am noticing is that the refresh is not running in parallel at all, even with max_parallel_workers_per_gather set to 4. Does PostgreSQL 11.6 not support refreshing a materialized view in parallel, the way it does when creating one?
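For reference, a minimal sketch of the sequence described above; my_mview and some_big_table are hypothetical placeholders:

SET max_parallel_workers_per_gather = 4;   -- session-level setting used above
CREATE MATERIALIZED VIEW my_mview AS
    SELECT * FROM some_big_table;          -- hypothetical source; this SELECT can use a parallel plan
REFRESH MATERIALIZED VIEW my_mview;        -- the step observed to run serially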