Fwd: "Go" (lang) standard driver

2020-08-18 Thread Tony Shelver
-- Forwarded message -
From: Tony Shelver 
Date: Tue, 18 Aug 2020 at 09:33
Subject: Re: "Go" (lang) standard driver
To: Edson Richter 


A quick Google search found https://github.com/lib/pq. It has 6.1K stars, so
I would guess it is fairly well used.

On Tue, 18 Aug 2020 at 05:56, Edson Richter 
wrote:

> Is there any standard PostgreSQL driver for the Go language?
> Could members of this group share their experiences with Go + PostgreSQL
> development?
>
> Thanks,
>
> E.R.
>


Re: "Go" (lang) standard driver

2020-08-18 Thread Olivier Gautherot
Hi Edson,

Le mar. 18 août 2020 à 09:36, Tony Shelver  a écrit :

> -- Forwarded message -
> From: Tony Shelver 
> Date: Tue, 18 Aug 2020 at 09:33
> Subject: Re: "Go" (lang) standard driver
> To: Edson Richter 
>
>
> A quick Google search found https://github.com/lib/pq. It has 6.1K stars,
> so I would guess it is fairly well used.
>

I have used it successfully and it performed perfectly. Go for it.


> On Tue, 18 Aug 2020 at 05:56, Edson Richter 
> wrote:
>
>> Is there any standard PostgreSQL driver for the Go language?
>> Could members of this group share their experiences with Go + PostgreSQL
>> development?
>>
>> Thanks,
>>
>> E.R.
>>
>
Olivier

>


Re: passing linux user to PG server as a variable ?

2020-08-18 Thread Daniel Verite
David Gauthier wrote:

> I can avoid the error by just throwing a namespace in there...
> atletx7-reg036:/home/dgauthie[ 214 ] --> setenv PGOPTIONS "-c
> 'os.user=$USER' "
> But once in, "show os.user" is undefined.

It's documented to work [1], but you need to remove these single
quotes. For instance:

$ env PGOPTIONS="-c os.user=$USER" psql -d postgres
psql (12.4 (Debian 12.4-1.pgdg90+1))
Type "help" for help.

postgres=> show "os.user";
 os.user 
---------
 daniel
(1 row)

If characters in the value need to be quoted, backslash should be
used as the escape character. Since backslash is also a quote
character for the shell, two levels of quoting are needed:

$ env PGOPTIONS="-c os.user=user\\ is\\ $USER" psql -d postgres
psql (12.4 (Debian 12.4-1.pgdg90+1))
Type "help" for help.

postgres=> show "os.user";
    os.user     
----------------
 user is daniel
(1 row)
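
The value can also be read with current_setting(); a minimal example,
continuing the session above:

postgres=> SELECT current_setting('os.user');
 current_setting 
-----------------
 user is daniel
(1 row)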


[1] https://www.postgresql.org/docs/current/config-setting.html


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: https://www.manitou-mail.org
Twitter: @DanielVerite




BUG? Slave don't reconnect to the master

2020-08-18 Thread Олег Самойлов
Hi all.

I found some strange behaviour in Postgres, which I recognise as a bug. First 
of all, let me explain the situation.

I created a "test bed" (not sure what the right term is) to test high 
availability clusters based on Pacemaker and PostgreSQL. The test bed consists 
of 12 virtual machines (on VirtualBox) running on a MacBook Pro, formed into 
4 HA clusters with different structures. All 4 HA clusters are constantly 
tested in a loop: failures of different natures are simulated, failover is 
awaited, the cluster is repaired, and so on. For simplicity I'll describe only 
one HA cluster: 3 virtual machines, with the master on one and a sync and an 
async slave on the others. The PostgreSQL service is provided through floating 
IPs pointing to the working master and slaves, and the slaves connect to the 
master's floating IP as well. When Pacemaker detects a failure, for instance 
on the master, it promotes a master on the other node with the lowest WAL 
latency and switches the floating IPs, so the third node remains a sync slave. 
My company decided to release this project as open source; I am now finishing 
the formalities.

It almost works fine, but sometimes, rather rarely, I find that a slave doesn't 
reconnect to the new master after a failure. The first case is PostgreSQL-STOP, 
where I `kill` postgres on the master with the STOP signal to simulate a 
freeze. The slave doesn't reconnect to the new master, with these errors in 
the log:

18:02:56.236 [3154] FATAL:  terminating walreceiver due to timeout
18:02:56.237 [1421] LOG:  record with incorrect prev-link 0/1600DDE8 at 0/1A00DE10

What is strange is that the error about the incorrect WAL record is raised 
after the connection is terminated. Well, this can be worked around by turning 
off the WAL receiver timeout. With that, PostgreSQL-STOP works fine, but the 
problem still exists with another test. ForkBomb simulates an out-of-memory 
situation. In this case a slave sometimes doesn't reconnect to the new master 
either, with errors in the log:

10:09:43.99 [1417] FATAL:  could not receive data from WAL stream: server closed the connection unexpectedly
	This probably means the server terminated abnormally
	before or while processing the request.
10:09:43.992 [1413] LOG:  invalid record length at 0/D8014278: wanted 24, got 0

The last error message (the last row in the log) varied between runs, by the way.

What I expect as the right behaviour: the PostgreSQL slave must reconnect to 
the master IP (the floating IP) after wal_retrieve_retry_interval.
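
For reference, these are the settings involved in the workaround (a sketch;
both are standard postgresql.conf GUCs, the values are what I would try):

  # on the slave
  wal_receiver_timeout = 0           # workaround: disable the walreceiver timeout
  wal_retrieve_retry_interval = 5s   # delay before retrying WAL retrieval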



Point in time recovery

2020-08-18 Thread Daulat Ram
Hello Team,

I want to know the best way to ensure/verify that a point-in-time recovery 
has completed successfully after a crash and restore.

Thanks,



Re: Point in time recovery

2020-08-18 Thread Ron


Search the log file for errors?

Query the database(s) to verify that the latest data is there?

On 8/18/20 5:10 AM, Daulat Ram wrote:


Hello Team,

I want to know the best way to ensure/verify that a point-in-time 
recovery has completed successfully after a crash and restore.


Thanks,



--
Angular momentum makes the world go 'round.


Re: "Go" (lang) standard driver

2020-08-18 Thread Stephen Frost
Greetings,

* Olivier Gautherot (ogauthe...@gautherot.net) wrote:
> Le mar. 18 août 2020 à 09:36, Tony Shelver  a écrit :
> > -- Forwarded message -
> > From: Tony Shelver 
> > Date: Tue, 18 Aug 2020 at 09:33
> > Subject: Re: "Go" (lang) standard driver
> > To: Edson Richter 
> >
> >
> > A quick Google search found https://github.com/lib/pq. It has 6.1K stars,
> > so I would guess it is fairly well used.
> 
> I have used it successfully and it performed perfectly. Go for it.

It also says that it's not actively developed and recommends pgx
instead...

Thanks,

Stephen




Re: Point in time recovery

2020-08-18 Thread Stephen Frost
Greetings,

Please don't top-post.

* Ron (ronljohnso...@gmail.com) wrote:
> Search the log file for errors?

Sadly, this doesn't actually work if you're really doing PITR: you need
to look for the specific message saying "recovery stopping before
commit" or similar, eg:

2020-08-18 12:55:31.240 UTC [16149] LOG:  recovery stopping before commit of 
transaction 495, time 2020-08-18 12:55:17.227359+00

If you don't see a message along those lines, and you actually specified
a recovery target, chances are good that, for whatever reason, PG didn't
find that recovery target and instead just replayed to the end of WAL,
which isn't what you want if you're specifying a recovery target.
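
For reference, a PITR restore on PG 12 is driven by settings along these
lines (a sketch only; the GUC names are standard, the values and the
pgbackrest stanza name are examples):

# postgresql.conf, with an empty recovery.signal file in the data directory
restore_command = 'pgbackrest --stanza=demo archive-get %f "%p"'
recovery_target_time = '2020-08-18 12:55:00+00'
recovery_target_action = 'promote'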

You can see how to take backups, perform restores, and see messages like
the above in the PG log here:

https://learn.crunchydata.com/pg-administration/courses/basic-postgresql-for-dbas-12/pgbackrest/

(which is what I just did to get at the above message, hence the rather
recent timestamp ;)

> Query the database(s) to verify that the latest data is there?

I do generally recommend regular backup and restore procedures that
export the data from the database and then re-import it into a clean
system to verify that everything is consistent.

> On 8/18/20 5:10 AM, Daulat Ram wrote:
> >I want to know the best way to ensure/verify that a point-in-time
> >recovery has completed successfully after the crash and the restore.

If you're coming from a crash, then that's an entirely different thing:
you need to make sure you replay all WAL that existed at the time of
the crash. You can't do PITR to a point before then, or the database
system won't be consistent.

Thanks,

Stephen




Database logins taking longer and longer, showing up as "authentication" in ps(1)

2020-08-18 Thread Adam Sjøgren
  Hi,


We have a PostgreSQL 11.3¹ running on an Ubuntu 16.04 server, which
sometimes exhibits a behaviour I can't quite understand: simply logging
into the database starts to take minutes to complete.

We have 60 processes (workers) running on different machines accessing
the database, that all grab jobs from a queue and update rows in a table
after doing some calculations (which vary in time from <1s to perhaps a
minute, many of them fast).

Sometimes new database logins slow down, from usually taking <0.05s to
taking minutes. This is for psql as a normal user using Kerberos, for
psql as the postgres superuser, for the web-application logging into the
database, for everything.

What I see in the output of ps(1) is a lot of postgres processes in
state "authentication", when this problem occurs. It "feels" like the
logins are waiting for something - when e.g. psql finally logs in,
running queries seem normal.

The only "special" thing I can think of is that the workers use advisory
locks on rows in the table they are updating (as there can be multiple
jobs in flight at the same time, for the same row).

I can imagine that a pattern of jobs and workers could make the locking
cause those workers to wait and slow each other down, but I don't
understand how it can affect the time it takes for new database logins.

When this happens, if I stop the 60 workers, the login times are back to
normal within seconds, and all the postgres processes showing
"authentication" switch away from that state. If I start the workers
again, the login problem appears within a couple of minutes.

If I start a smaller number of workers, say 5, the problem doesn't
appear.

Could this be lock contention on a database table somehow affecting
logins?

I have tried searching for this problem, and found some older references
to things² that seem to have been fixed in the meantime. Is anybody else
seeing this?

The server is configured with max_connections 1200 and currently has
~400 connections. It has 1.5 TB of memory and 112 cores (including
hyperthreading) - the load is not high when I see this problem, between
20 and 30.

Any ideas and suggestions are welcome - if there is some relevant
information I can provide, I will be happy to try.


  Best regards,

Adam


¹ We should plan to upgrade to 11.9, yes.

² E.g. 
https://www.postgresql.org/message-id/CAJw8uJQGoobesbPCMbxj6Vb4nv9D-GgvZ%2B7pK%2Bfckbb4DqJEAg%40mail.gmail.com
   and 
https://www.postgresql.org/message-id/20140312232810.634.7997%40wrigleys.postgresql.org

-- 
 "We don't care about crashing our cars                 Adam Sjøgren
  Pedal to the metal and our heads among the stars"   a...@koldfront.dk





Re: Database logins taking longer and longer, showing up as "authentication" in ps(1)

2020-08-18 Thread Laurenz Albe
On Tue, 2020-08-18 at 14:53 +0200, Adam Sjøgren wrote:
> We have a PostgreSQL 11.3¹ running on an Ubuntu 16.04 server, which
> sometimes exhibits a behaviour I can't quite understand: simply logging
> into the database starts to take minutes to complete.
> 
> We have 60 processes (workers) running on different machines accessing
> the database, that all grab jobs from a queue and update rows in a table
> after doing some calculations (which vary in time from <1s to perhaps a
> minute, many of them fast).
> 
> Sometimes new database logins slow down, from usually taking <0.05s to
> taking minutes. This is for psql as a normal user using Kerberos, for
> psql as the postgres superuser, for the web-application logging into the
> database, for everything.

I would suspect that the problem is with the authentication method.

How is authentication configured?

I'd "strace" a connection that is hanging in "authenticating" and
see what the process does.

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com





Re: Database logins taking longer and longer, showing up as "authentication" in ps(1)

2020-08-18 Thread Stephen Frost
Greetings,

* Adam Sjøgren (a...@koldfront.dk) wrote:
> We have a PostgreSQL 11.3¹ running on an Ubuntu 16.04 server, which
> sometimes exhibits a behaviour I can't quite understand: simply logging
> into the database starts to take minutes to complete.
> 
> We have 60 processes (workers) running on different machines accessing
> the database, that all grab jobs from a queue and update rows in a table
> after doing some calculations (which vary in time from <1s to perhaps a
> minute, many of them fast).
> 
> Sometimes new database logins slow down, from usually taking <0.05s to
> taking minutes. This is for psql as a normal user using Kerberos, for
> psql as the postgres superuser, for the web-application logging into the
> database, for everything.

When in doubt, blame DNS.

Alternatively, in your case, the issue might be the KDC taking forever
to issue a ticket for the service.

Note that, in both cases, this is a client-side issue.  Once the ticket
has been acquired for the PG service, the actual authentication on the
server side should be very fast and not depend on things external
(though you might check if you have log_hostname on..).

Thanks,

Stephen




Re: Database logins taking longer and longer, showing up as "authentication" in ps(1)

2020-08-18 Thread Adam Sjøgren
Laurenz writes:

> On Tue, 2020-08-18 at 14:53 +0200, Adam Sjøgren wrote:
>>
>> We have 60 processes (workers) running on different machines accessing
>> the database, that all grab jobs from a queue and update rows in a table
>> after doing some calculations (which vary in time from <1s to perhaps a
>> minute, many of them fast).
>> 
>> Sometimes new database logins slow down, from usually taking <0.05s to
>> taking minutes. This is for psql as a normal user using Kerberos, for
>> psql as the postgres superuser, for the web-application logging into the
>> database, for everything.
>
> I would suspect that the problem is with the authentication method.

Interesting; how do you suspect that interaction plays out?

> How is authentication configured?

Primarily using gss (for the web-application, delegated credentials),
md5 (system users), and a few trust (for legacy access).

> I'd "strace" a connection that is hanging in "authenticating" and
> see what the process does.

I did that one of the previous times I have seen this problem, and saw
stuff like this:

  2020-06-10 11:22:14  strace: Process 101559 attached
  2020-06-10 11:22:14  futex(0x7f46df29e138, FUTEX_WAIT_BITSET|FUTEX_CLOCK_REALTIME, 0, NULL, ) = 0
  ...
  2020-06-10 11:24:09  futex(0x7f46df29e138, FUTEX_WAIT_BITSET|FUTEX_CLOCK_REALTIME, 0, NULL, ) = 0
  2020-06-10 11:24:09  futex(0x7f46df2a2e38, FUTEX_WAKE, 1) = 1
  2020-06-10 11:24:09  brk(0x55c197084000) = 0x55c197084000
  2020-06-10 11:24:09  brk(0x55c1970b) = 0x55c1970b
  2020-06-10 11:24:09  open("global/pg_filenode.map", O_RDONLY) = 6

(where the elided part is futex() calls).

The slow login is regardless of whether I login with Kerberos or as the
postgres superuser with psql.


  Best regards,

Adam

-- 
 "Sometimes I feel like going home, walking in the      Adam Sjøgren
  boardroom                                           a...@koldfront.dk
  Sometimes I feel like I wanna go, walk into the
  ballroom"




Re: Database logins taking longer and longer, showing up as "authentication" in ps(1)

2020-08-18 Thread Srinivasa T N
On Tue, Aug 18, 2020, 8:04 PM Stephen Frost  wrote:

> Greetings,
>
> * Adam Sjøgren (a...@koldfront.dk) wrote:
> > We have a PostgreSQL 11.3¹ running on an Ubuntu 16.04 server, which
> > sometimes exhibits a behaviour I can't quite understand: simply logging
> > into the database starts to take minutes to complete.
> >
> > We have 60 processes (workers) running on different machines accessing
> > the database, that all grab jobs from a queue and update rows in a table
> > after doing some calculations (which vary in time from <1s to perhaps a
> > minute, many of them fast).
> >
> > Sometimes new database logins slow down, from usually taking <0.05s to
> > taking minutes. This is for psql as a normal user using Kerberos, for
> > psql as the postgres superuser, for the web-application logging into the
> > database, for everything.
>
> When in doubt, blame DNS.
>
> Alternatively, in your case, the issue might be the KDC taking forever
> to issue a ticket for the service.
>
> Note that, in both cases, this is a client-side issue.


If it is a client-side issue, how do you explain the behaviour that the OP
does not face the problem when there are only a few clients?

Regards,
Seenu.

>   Once the ticket
> has been acquired for the PG service, the actual authentication on the
> server side should be very fast and not depend on things external
> (though you might check if you have log_hostname on..).
>
> Thanks,
>
> Stephen
>


Re: Database logins taking longer and longer, showing up as "authentication" in ps(1)

2020-08-18 Thread Tom Lane
Stephen Frost  writes:
> * Adam Sjøgren (a...@koldfront.dk) wrote:
>> Sometimes new database logins slow down, from usually taking <0.05s to
>> taking minutes. This is for psql as a normal user using Kerberos, for
>> psql as the postgres superuser, for the web-application logging into the
>> database, for everything.

> When in doubt, blame DNS.

The trouble with outside-the-database explanations is that it's not
clear why stopping the worker processes would make such issues go away.

I think looking into pg_stat_activity and pg_locks might be interesting,
although it's possible the delay is before the new session has made
any entries there.
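
For example, something along these lines (these are the standard system
views; run them while the problem is occurring):

  -- what each backend is doing and what it is waiting on
  SELECT pid, state, wait_event_type, wait_event, backend_start
  FROM pg_stat_activity
  ORDER BY backend_start DESC;

  -- any locks that have been requested but not granted
  SELECT pid, locktype, relation::regclass, mode, granted
  FROM pg_locks
  WHERE NOT granted;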

Also, obtaining stack traces from a few of the stuck processes would
likely be quite informative.

regards, tom lane




Re: Database logins taking longer and longer, showing up as "authentication" in ps(1)

2020-08-18 Thread Adam Sjøgren
Stephen writes:

> * Adam Sjøgren (a...@koldfront.dk) wrote:
>> 
>> We have 60 processes (workers) running on different machines accessing
>> the database, that all grab jobs from a queue and update rows in a table
>> after doing some calculations (which vary in time from <1s to perhaps a
>> minute, many of them fast).
>> 
>> Sometimes new database logins slow down, from usually taking <0.05s to
>> taking minutes. This is for psql as a normal user using Kerberos, for
>> psql as the postgres superuser, for the web-application logging into the
>> database, for everything.
>
> When in doubt, blame DNS.

I'd love to! However I don't see any DNS outages on our local network
correlating with whether I run 60 workers or 5 workers.

> Alternatively, in your case, the issue might be the KDC taking forever
> to issue a ticket for the service.

If that was the cause, logging in as the 'postgres' superuser (not using
Kerberos) locally on the server should be fast regardless, right?

> (though you might check if you have log_hostnames on..).

It's off:

  $ grep hostname /etc/postgresql/11/main/postgresql.conf 
  #log_hostname = off


Thanks for replying!


  Best regards,

Adam

-- 
 "Probabilistic algorithms don't appeal to me. (This    Adam Sjøgren
  is a question of aesthetics, not practicality.) So  a...@koldfront.dk
  later, I figured out how to remove the probability
  and turn it into a deterministic algorithm."




import XML

2020-08-18 Thread PASCAL CROZET
Hi all,

I want to import an XML file into a PG database table.
I've found functions to get the XML content of a cell after importing an XML 
file with the pg_read_file function.
But I want to explode the XML content into columns. How can I do this?

PG 10 under Ubuntu 18

_
Greetings, Pascal CROZET
DBA - Qualis Consulting
_


Re: import XML

2020-08-18 Thread David G. Johnston
On Tue, Aug 18, 2020 at 8:27 AM PASCAL CROZET <
pascal.cro...@qualis-consulting.com> wrote:

> I want to import an XML file into a PG database table.
> I've found functions to get the XML content of a cell after importing an XML
> file with the pg_read_file function.
> But I want to explode the XML content into columns. How can I do this?
>

In short, you need to import the content as a document value first, then use
functions to explode that document.

You should find the documentation helpful.

https://www.postgresql.org/docs/10/functions-xml.html

Specifically, "xmltable"

Keep in mind that "content as a document value" can just be a literal.
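
An untested sketch (the element names and the literal document are made up
for illustration):

SELECT x.id, x.name
FROM xmltable('/items/item'
              PASSING '<items><item><id>1</id><name>one</name></item></items>'::xml
              COLUMNS id int PATH 'id',
                      name text PATH 'name') AS x;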

David J.


Re: Index tuple deduplication limitations in pg13

2020-08-18 Thread Peter Geoghegan
On Mon, Aug 17, 2020 at 11:44 PM Matthias van de Meent
 wrote:
> But, if the ordering of operator-class equal tuples is already
> system-defined, could the physical ordering of index tuples in a btree
> (with deduplication enabled for "unsafe" opclasses) be updated from
> [index_columns, tid] to [index_columns,
> image_compare(non_datum_equal_columns), tid], giving a stable sorting
> of opclass-equal and image-equal values and enabling safe consistent
> deduplication?

The issue isn't the physical ordering. The issue is that we cannot
allow the implementation to destroy semantic differences among equal
datums. We avoid deduplication with cases where two equal datums are
visibly different. For example, it would not be okay if we forgot that
your numeric datum was originally input as '5.000', and output '5'
later on.
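
For example (the two values compare as equal, yet the display scale survives
output):

postgres=# SELECT '5.000'::numeric = '5'::numeric AS equal,
postgres-#        '5.000'::numeric AS a, '5'::numeric AS b;
 equal |   a   | b 
-------+-------+---
 t     | 5.000 | 5
(1 row)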

If we wanted to fix this for numeric, we'd have to invent a new
numeric datatype (called numeric2, say). That probably isn't as hard
as it sounds, since it could be part of the same B-Tree operator
family as numeric. It could also be implicitly cast to numeric.

-- 
Peter Geoghegan




Re: Index tuple deduplication limitations in pg13

2020-08-18 Thread Peter Geoghegan
On Tue, Aug 18, 2020 at 9:44 AM Peter Geoghegan  wrote:
> If we wanted to fix this for numeric, we'd have to invent a new
> numeric datatype (called numeric2, say). That probably isn't as hard
> as it sounds, since it could be part of the same B-Tree operator
> family as numeric. It could also be implicitly cast to numeric.

I forgot to say: numeric2 would be just like numeric, except in one
specific way: it wouldn't care about display scale. The user would be
giving up on display scale by choosing numeric2 over numeric. The "5
vs 5.000" information would always be lost by design, so there'd be
nothing for deduplication to break. Deduplication could then be
enabled.

-- 
Peter Geoghegan




Re: Index tuple deduplication limitations in pg13

2020-08-18 Thread Peter Geoghegan
On Tue, Aug 18, 2020 at 11:52 AM Matthias van de Meent
 wrote:
> Deduplication does not need to destroy semantic differences? 'equal'
> can (in my book) mean:
> - 'opclass-equal', that is the opclass returns true for an equality check
> - 'binary equal' or 'datum-equal' (? maybe incorrect term), that is
> the unprocessed on-disk representations (datum image is the right term
> I believe?) of the compared values are indistinguishable.
>
> Runs of 'binary equal' datums can be freely deduplicated [0] when found.

> [0]
> Inserting a row, with TID ntid, into a deduplicated index can
> encounter a posting list of opclass-equal but not datum image-equal
> tuples where the lowest TID of the posting list is less than ntid, and
> ntid is less than the highest TID of the posting list. This would
> require a posting list split to accommodate the new tuple's index entry
> in order to not lose data.

But you can't do that easily, because it breaks subtle assumptions
about posting list splits and space utilization. In particular, it
means that you can no longer think of a posting list split as
rewriting an incoming new item such that you can more or less pretend
that there was no overlap in the first place -- code like _bt_split
and nbtsplitloc.c relies on this. Introducing new special cases to
nbtsplitloc.c is very unappealing.

More concretely, if you introduce a posting list split like this then
you need three copies of the key -- the original, the new, and a
second copy of the original. That's much more complicated.

--
Peter Geoghegan




Orphaned relations after crash/sigkill during CREATE TABLE

2020-08-18 Thread Jason Myers
Postgres 12.4

I was directed in Slack to mention here that we're being impacted by
Postgres leaving orphaned files in /base/ after a crash while a CREATE
TABLE is being run in a transaction.

The issue is the same as the reproduction steps listed here [1], that is:

- Start a CREATE TABLE transaction for a large table
- Terminate the process via kill -9
- Files are left in /base that have no filenode references anymore, such
that `pg_database_size()` grows to be very large while total table+index
size remains constant

However, in our particular case, we're using a managed/cloud Postgres server
and our `CREATE TABLE` transaction was terminated by the OOM killer.
Using a managed service, we don't have filesystem access to go and clear
out these orphaned files.  This caused our total db size to grow from 40GB
of table+index data to 4TB on-disk (but still only 40GB of table+index
data, the other ~3.95TB being orphaned CREATE TABLE files)
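
(For what it's worth, the gap can be watched from SQL alone; a rough sketch,
since pg_relation_size() only counts files that still have catalog entries:

SELECT pg_size_pretty(pg_database_size(current_database())) AS on_disk,
       pg_size_pretty(sum(pg_relation_size(oid))::bigint)   AS tracked
FROM pg_class;

The difference between the two is approximately the orphaned space, plus
auxiliary forks such as the free space and visibility maps.)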

I realize (per a blog post from Robert Haas [2] and from Slack
conversation) that this is a known issue, but was directed here from Slack
to mention that we were impacted by it, and have no resolution due to
not having filesystem access, and no method internal to Postgres to deal
with these orphaned relations.  (Our recourse currently is to do something
like a pg_dump/pg_restore onto a clean instance in order to escape the
orphaned files.)

-Jason

[1] https://github.com/bdrouvot/pg_orphaned#example-1
[2]
https://rhaas.blogspot.com/2020/05/dont-manually-modify-postgresql-data.html


Re: Orphaned relations after crash/sigkill during CREATE TABLE

2020-08-18 Thread Adrian Klaver

On 8/18/20 12:35 PM, Jason Myers wrote:

Postgres 12.4

I was directed in Slack to mention here that we're being impacted by 
Postgres leaving orphaned files in /base/ after a crash while a 
CREATE TABLE is being run in a transaction.


The issue is the same as the reproduction steps listed here [1], that is:

- Start a CREATE TABLE transaction for a large table
- Terminate the process via kill -9
- Files are left in /base that have no filenode references anymore, such 
that `pg_database_size()` grows to be very large while total table+index 
size remains constant


So from [1] you are using CREATE TABLE AS. Have you tried with:

BEGIN;
CREATE TABLE some_table AS SELECT some_data FROM other_table LIMIT 1 WITH 
NO DATA;
COMMIT;

The above gets you the table structure, but no data.

BEGIN;
INSERT into some_table SELECT * FROM other_table;
COMMIT;

The above populates the table. I have not tested it, but I'm going to assume 
that if you kill the above, the problem would not happen, or would be fixable 
by DELETE FROM some_table / TRUNCATE some_table;





However, in our particular case, we're using a managed/cloud Postgres 
server and our `CREATE TABLE` transaction was terminated by the 
OOM killer.  Using a managed service, we don't have filesystem access to 
go and clear out these orphaned files.  This caused our total db size to 
grow from 40GB of table+index data to 4TB on-disk (but still only 40GB 
of table+index data, the other ~3.95TB being orphaned CREATE TABLE files)


I realize (per a blog post from Robert Haas [2] and from Slack 
conversation) that this is a known issue, but was directed here from 
Slack to mention that we were impacted by it, and have no 
resolution due to not having filesystem access, and no method 
internal to Postgres to deal with these orphaned relations.  (Our 
recourse currently is to do something like a pg_dump/pg_restore onto a 
clean instance in order to escape the orphaned files.)


-Jason

[1] https://github.com/bdrouvot/pg_orphaned#example-1
[2] 
https://rhaas.blogspot.com/2020/05/dont-manually-modify-postgresql-data.html





--
Adrian Klaver
adrian.kla...@aklaver.com




Re: Index tuple deduplication limitations in pg13

2020-08-18 Thread Peter Geoghegan
On Tue, Aug 18, 2020 at 11:52 AM Matthias van de Meent
 wrote:
> Given that the above could work, the current btree tuple ordering is
> not optimized for opclass-equal but datum image-distinct values:
> ordering of opclass-equal values is currently determined only by tid,
> with as an example current ordering ['0.0', '0', '0.00', '0', '0.0',
> '0']. It would be more optimized for deduplication if that was stored
> as e.g. ['0', '0', '0', '0.0', '0.0', '0.00'], which is why I
> suggested to add an ordering by the datum image before the tid
> ordering. Additionally, this extra ordering also prevents the problem
> of [0] by never attempting an insertion of non-equal image datums in a
> posting list of otherwise equal values, as it would be ordered either
> before or after the posting list, never inside the list.

Yeah, that would work, but at the cost of making numeric totally
unusable. Now you cannot rely on unique enforcement detecting that '0'
is a duplicate of '0.0'. In fact, even the most trivial use of the =
operator will be broken in the presence of different display scales.
It's a non-starter.

The numeric2 design that I sketched is a bit ugly, but I can see no
better way. A three-way posting list split (i.e. the other design that
you sketched) is a special case that is very hard to test, very
complicated, and of little value in the grand scheme of things.

-- 
Peter Geoghegan




Re: Orphaned relations after crash/sigkill during CREATE TABLE

2020-08-18 Thread Jason Myers
On Tue, Aug 18, 2020 at 3:49 PM Adrian Klaver  wrote:

> Have you tried with:
>
> BEGIN;
> CREATE TABLE some_table AS SELECT some_data FROM other_table LIMIT 1 WITH
> NO DATA;
> COMMIT;
>
> The above gets you the table structure, but no data.
>
> BEGIN;
> INSERT INTO some_table SELECT * FROM other_table;
> COMMIT;
>
> The above populates the table

Thanks -- we were indeed creating and populating the new table all in a
single transaction.

I'll see if we can split this into two transactions so that the table
structure is committed quickly.  I think you're right that this would
mostly sidestep the issue.

-Jason

p.s. Apologies if this is formatted wrong, this is my first mailing list
post.


Re: Orphaned relations after crash/sigkill during CREATE TABLE

2020-08-18 Thread Adrian Klaver

On 8/18/20 1:19 PM, Jason Myers wrote:
 > On Tue, Aug 18, 2020 at 3:49 PM Adrian Klaver  wrote:
 >
 > Have you tried with:
 > BEGIN;
 > CREATE TABLE some_table AS SELECT some_data FROM other_table LIMIT 1 WITH
 > NO DATA;
 > COMMIT;
 >
 > The above gets you the table structure, but no data.
 >
 > BEGIN;
 > INSERT INTO some_table SELECT * FROM other_table;
 > COMMIT;
 >
 > The above populates the table

Thanks -- we were indeed creating and populating the new table all in a 
single transaction.


I'll see if we can split this into two transactions so that the table 
structure is committed quickly.  I think you're right that this would 
mostly sidestep the issue.


Assuming the table you are pulling from is fairly static, it could also 
allow you to regulate the amount of data you transfer at any one time 
into the new table.
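
Something like this, batch by batch (a sketch; the id column and the range
are hypothetical, assuming an indexed integer key):

BEGIN;
INSERT INTO some_table
SELECT * FROM other_table
WHERE id > 0 AND id <= 100000;   -- repeat with the next id range
COMMIT;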




-Jason

p.s. Apologies if this is formatted wrong, this is my first mailing list 
post.



--
Adrian Klaver
adrian.kla...@aklaver.com




Re: "Go" (lang) standard driver

2020-08-18 Thread Scottix
I have been using pgx and it seems to be working well.

If you're just starting, I recommend using the v4 version.

On Tue, Aug 18, 2020 at 5:53 AM Stephen Frost  wrote:

> Greetings,
>
> * Olivier Gautherot (ogauthe...@gautherot.net) wrote:
> > Le mar. 18 août 2020 à 09:36, Tony Shelver  a écrit
> :
> > > -- Forwarded message -
> > > From: Tony Shelver 
> > > Date: Tue, 18 Aug 2020 at 09:33
> > > Subject: Re: "Go" (lang) standard driver
> > > To: Edson Richter 
> > >
> > >
> > > A quick Google search found https://github.com/lib/pq. It has 6.1K
> > > stars, so I would guess it is fairly well used.
> >
> > I have used it successfully and it performed perfectly. Go for it.
>
> It also says that it's not actively developed and recommends pgx
> instead...
>
> Thanks,
>
> Stephen
>


-- 
T: @Thaumion
IG: Thaumion
scot...@gmail.com


Re: Index tuple deduplication limitations in pg13

2020-08-18 Thread Matthias van de Meent
On Tue, 18 Aug 2020 at 18:44, Peter Geoghegan  wrote:
>
> On Mon, Aug 17, 2020 at 11:44 PM Matthias van de Meent
>  wrote:
> > But, if the ordering of operator-class equal tuples is already
> > system-defined, could the physical ordering of index tuples in a btree
> > (with deduplication enabled for "unsafe" opclasses) be updated from
> > [index_columns, tid] to [index_columns,
> > image_compare(non_datum_equal_columns), tid], giving a stable sorting
> > of opclass-equal and image-equal values and enabling safe consistent
> > deduplication?
>
> The issue isn't the physical ordering. The issue is that we cannot
> allow the implementation to destroy semantic differences among equal
> datums.

Deduplication does not need to destroy semantic differences? 'equal'
can (in my book) mean:
- 'opclass-equal', that is the opclass returns true for an equality check
- 'binary equal' or 'datum-equal' (? maybe incorrect term), that is
the unprocessed on-disk representations (datum image is the right term
I believe?) of the compared values are indistinguishable.

Runs of 'binary equal' datums can be freely deduplicated [0] when found.

> We avoid deduplication with cases where two equal datums are
> visibly different. For example, it would not be okay if we forgot that
> your numeric datum was originally input as '5.000', and output '5'
> later on.

I agree on that point. But, isn't this display scale also stored in
the 'datum image' [1] and could therefore be distinguished against for
two opclass-equal values whilst deduplicating, only putting binary
equal values in a posting list? e.g. all values generated through
'0.000'::numeric -values can be compressed into one posting tuple
without information loss, as the datum image is effectively the same
for all of these (unless I have missed something), and distinct from
the datum image of '0'::numeric. With enough equal values, you would
eventually have a posting lists for each distinct datum image of equal
values.

Given that the above could work, the current btree tuple ordering is
not optimized for opclass-equal but datum image-distinct values:
ordering of opclass-equal values is currently determined only by tid,
with as an example current ordering ['0.0', '0', '0.00', '0', '0.0',
'0']. It would be more optimized for deduplication if that was stored
as e.g. ['0', '0', '0', '0.0', '0.0', '0.00'], which is why I
suggested to add an ordering by the datum image before the tid
ordering. Additionally, this extra ordering also prevents the problem
of [0] by never attempting an insertion of non-equal image datums in a
posting list of otherwise equal values, as it would be ordered either
before or after the posting list, never inside the list.

> If we wanted to fix this for numeric, we'd have to invent a new
> numeric datatype (called numeric2, say). That probably isn't as hard
> as it sounds, since it could be part of the same B-Tree operator
> family as numeric. It could also be implicitly cast to numeric.
>
> --
> Peter Geoghegan

[0]
Inserting a row, with TID ntid, into a deduplicated index can
encounter a posting list of opclass-equal but not datum image-equal
tuples where the lowest TID of the posting list is less than ntid, and
ntid is less than the highest TID of the posting list. This would
require a posting list split to accommodate the new tuple's index entry
in order to not lose data.

[1]
# a table with 1000 distinct values, each duplicated 700 times, but
split into 7x100 'datum-equal' value groups that should be
deduplicatable.
SELECT  (n % 1000 || '.' || repeat('0', n%7))::numeric AS num INTO
numbers FROM generate_series(1, 700000) series(n);
CREATE INDEX numeric_index ON numbers (num);
ANALYZE numbers;
SELECT num FROM numbers ORDER BY num LIMIT 700; -- this is an
index-only scan, and results show numeric scale, so the information
must be stored in the index.




Re: Index tuple deduplication limitations in pg13

2020-08-18 Thread Matthias van de Meent
On Tue, 18 Aug 2020 at 22:00, Peter Geoghegan  wrote:
>
> On Tue, Aug 18, 2020 at 11:52 AM Matthias van de Meent
>  wrote:
> > Given that the above could work, the current btree tuple ordering is
> > not optimized for opclass-equal but datum image-distinct values:
> > ordering of opclass-equal values is currently determined only by tid,
> > with as an example current ordering ['0.0', '0', '0.00', '0', '0.0',
> > '0']. It would be more optimized for deduplication if that was stored
> > as e.g. ['0', '0', '0', '0.0', '0.0', '0.00'], which is why I
> > suggested to add an ordering by the datum image before the tid
> > ordering. Additionally, this extra ordering also prevents the problem
> > of [0] by never attempting an insertion of non-equal image datums in a
> > posting list of otherwise equal values, as it would be ordered either
> > before or after the posting list, never inside the list.
>
> Yeah, that would work, but at the cost of making numeric totally
> unusable. Now you cannot rely on unique enforcement detecting that '0'
> is a duplicate of '0.0'. In fact, even the most trivial use of the =
> operator will be broken in the presence of different display scales.
> It's a non-starter.

Would this extra ordering not effectively be an extra tiebreaker in
the ordering, applied before the TID? I do not know the full
implications of that, but I believe that would not result in the
limitations that you are mentioning.

- Matthias




Re: Index tuple deduplication limitations in pg13

2020-08-18 Thread Peter Geoghegan
On Tue, Aug 18, 2020 at 1:31 PM Matthias van de Meent
 wrote:
> Would this extra ordering not effectively be an extra tiebreaker in
> the ordering, applied before the TID? I do not know the full
> implications of that, but I believe that would not result in the
> limitations that you are mentioning.

You could probably do it that way, but again you end up with a lot of
new complexity. Not to mention overhead that would have to be paid by
everyone. It would require code that supported the old way (even if it
was added to Postgres 13) for pg_upgrade, that would also be hard to
test. And it might defeat certain future optimizations based on heap
TID being the only tiebreaker. Having two types of equality might have
to bleed into the optimizer.

It's a question of engineering trade-offs. I don't think that it's worth it.

-- 
Peter Geoghegan




Re: "Go" (lang) standard driver

2020-08-18 Thread David Gallagher
I can recommend sqlx:
https://github.com/jmoiron/sqlx


On Tue, Aug 18, 2020 at 4:56 PM Scottix  wrote:

> I have been using pgx and it seems to be working well.
>
> If you're just starting, I recommend using the v4 version.
>
> On Tue, Aug 18, 2020 at 5:53 AM Stephen Frost  wrote:
>
>> Greetings,
>>
>> * Olivier Gautherot (ogauthe...@gautherot.net) wrote:
>> > Le mar. 18 août 2020 à 09:36, Tony Shelver  a écrit :
>> > > -- Forwarded message -
>> > > From: Tony Shelver 
>> > > Date: Tue, 18 Aug 2020 at 09:33
>> > > Subject: Re: "Go" (lang) standard driver
>> > > To: Edson Richter 
>> > >
>> > > A quick Google search found https://github.com/lib/pq. It has 6.1K
>> > > stars, so I would guess it is fairly well used.
>> >
>> > I have used it successfully and it performed perfectly. Go for it.
>>
>> It also says that it's not actively developed and recommends pgx
>> instead...
>>
>> Thanks,
>>
>> Stephen
>>
>
> --
> T: @Thaumion
> IG: Thaumion
> scot...@gmail.com
>
>
>


postgres materialized view refresh in parallel

2020-08-18 Thread Ayub M
I created a materialized view in parallel by setting
max_parallel_workers_per_gather to 4. I could see parallel sessions kicking
off and creating the mview much faster than serial execution (10 mins in
parallel vs over an hour serial).

Now I want the refresh of the mview to take around the same time as the
create; I am okay with it taking a little longer than the create, to execute
the extra steps a refresh performs. But what I am noticing is that the
refresh is not running in parallel at all, even with
max_parallel_workers_per_gather set to 4. Does PostgreSQL 11.6 not support
refreshing mviews in parallel, as it does when creating the mview?
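
For reference, what I am doing boils down to this (a sketch; the table and
view names, and the query itself, are made-up placeholders):

SET max_parallel_workers_per_gather = 4;

-- the creation can use a parallel plan (finished in ~10 mins here)
CREATE MATERIALIZED VIEW mv AS
SELECT col, count(*) FROM big_table GROUP BY col;

-- the refresh of the same view did not use parallel workers (> 1 hour)
REFRESH MATERIALIZED VIEW mv;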