pldbgapi error

2019-05-20 Thread Prakash Ramakrishnan
Hi Team,

We are trying to install the pldbgapi extension on a PostgreSQL 11.3 server,
but we are getting the error below. Please advise:

==> make USE_PGXS=1
gcc -Wall -Wmissing-prototypes -Wpointer-arith
-Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute
-Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard
-O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions
-fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches
-m64 -mtune=generic -fPIC
-I/usr/pgsql-11/lib/pgxs/src/makefiles/../../src/pl/plpgsql/src -I. -I./
-I/usr/pgsql-11/include/server -I/usr/pgsql-11/include/internal
-D_GNU_SOURCE -I/usr/include/libxml2  -I/usr/include  -c -o
plpgsql_debugger.o plpgsql_debugger.c
plpgsql_debugger.c: In function ‘plpgsql_send_vars’:
plpgsql_debugger.c:263:16: error: ‘TRUE’ undeclared (first use in this
function)
isArg = TRUE;
^
plpgsql_debugger.c:263:16: note: each undeclared identifier is reported
only once for each function it appears in
plpgsql_debugger.c:244:4: warning: enumeration value ‘PLPGSQL_DTYPE_ROW’
not handled in switch [-Wswitch]
switch( estate->datums[i]->dtype )
^
plpgsql_debugger.c:244:4: warning: enumeration value ‘PLPGSQL_DTYPE_REC’
not handled in switch [-Wswitch]
plpgsql_debugger.c:244:4: warning: enumeration value
‘PLPGSQL_DTYPE_RECFIELD’ not handled in switch [-Wswitch]
plpgsql_debugger.c:244:4: warning: enumeration value
‘PLPGSQL_DTYPE_ARRAYELEM’ not handled in switch [-Wswitch]
plpgsql_debugger.c:244:4: warning: enumeration value
‘PLPGSQL_DTYPE_PROMISE’ not handled in switch [-Wswitch]
plpgsql_debugger.c: In function ‘find_datum_by_name’:
plpgsql_debugger.c:484:9: error: ‘PLPGSQL_DTYPE_EXPR’ undeclared (first use
in this function)
case PLPGSQL_DTYPE_EXPR:
 ^
plpgsql_debugger.c:462:3: warning: enumeration value
‘PLPGSQL_DTYPE_PROMISE’ not handled in switch [-Wswitch]
   switch( frame->datums[i]->dtype )
   ^
plpgsql_debugger.c: In function ‘print_rec’:
plpgsql_debugger.c:575:9: error: ‘PLpgSQL_rec’ has no member named ‘tupdesc’
  if (tgt->tupdesc == NULL)
 ^
plpgsql_debugger.c:578:29: error: ‘PLpgSQL_rec’ has no member named
‘tupdesc’
  for( attNo = 0; attNo < tgt->tupdesc->natts; ++attNo )
 ^
plpgsql_debugger.c:580:36: error: ‘PLpgSQL_rec’ has no member named ‘tup’
   char * extval = SPI_getvalue( tgt->tup, tgt->tupdesc, attNo + 1 );
^
plpgsql_debugger.c:580:46: error: ‘PLpgSQL_rec’ has no member named
‘tupdesc’
   char * extval = SPI_getvalue( tgt->tup, tgt->tupdesc, attNo + 1 );
  ^
In file included from /usr/pgsql-11/include/server/postgres.h:46:0,
 from plpgsql_debugger.c:12:
plpgsql_debugger.c:582:51: error: ‘PLpgSQL_rec’ has no member named
‘tupdesc’
   dbg_send( "v:%s.%s:%s\n", var_name, NameStr(
tgt->tupdesc->attrs[attNo]->attname ), extval ? extval : "NULL" );
   ^
/usr/pgsql-11/include/server/c.h:576:25: note: in definition of macro
‘NameStr’
 #define NameStr(name) ((name).data)


-- 
Thanks,
Prakash.R


Bug in documentation (trim(...))?

2019-05-20 Thread Johann Spies
The following illustrates the problem (removing double quotes from a
string):

js=# select regexp_replace(regexp_replace( '"University of Cape Town"', '^"', ''),'"$', '');
     regexp_replace
-------------------------
 University of Cape Town
(1 row)

js=# trim(both '"' from '"University of Cape Town"');
ERROR:  syntax error at or near "trim"
LINE 1: trim(both '"' from '"University of Cape Town"');

Example from the documentation:

js=# trim(both 'xyz' from 'yxTomxx');
ERROR:  syntax error at or near "trim"
LINE 1: trim(both 'xyz' from 'yxTomxx');
        ^
I would think that trim, if it worked in this case, would be a cheaper
function than calling regexp_replace twice.

Or is there another way?

Regards
Johann
-- 
Because experiencing your loyal love is better than life itself,
my lips will praise you.  (Psalm 63:3)


pg_upgrade can result in early wraparound on databases with high transaction load

2019-05-20 Thread Jason Harvey
Hello,

This week I upgraded one of my large (2.8TB), high-volume databases from 9
to 11. The upgrade itself went fine. About two days later, we unexpectedly
hit transaction ID wraparound. What was perplexing about this was that the
age of our oldest `datfrozenxid` was only 1.2 billion - far away from where
I'd expect a wraparound. Curiously, the wraparound error referred to a
mysterious database of `OID 0`:

UPDATE ERROR:  database is not accepting commands to avoid wraparound data
loss in database with OID 0

We were able to recover after a few hours by greatly speeding up our vacuum
on our largest table.

In a followup investigation I uncovered the reason we hit the wraparound so
early, and also the cause of the mysterious OID 0 message. When pg_upgrade
executes, it calls pg_resetwal to set the next transaction ID. Within
pg_resetwal is the following code:
https://github.com/postgres/postgres/blob/6cd404b344f7e27f4d64555bb133f18a758fe851/src/bin/pg_resetwal/pg_resetwal.c#L440-L450

This sets the controldata to have a fake database (OID 0) on the brink of
transaction wraparound. Specifically, after pg_upgrade is run, wraparound
will occur within around 140 million transactions (provided the autovacuum
doesn't finish first). I confirmed by analyzing our controldata before and
after the upgrade that this was the cause of our early wraparound.

Given the size and heavy volume of our database, we tend to complete a
vacuum in the time it takes around 250 million transactions to execute.
With our tunings this tends to be rather safe and we stay well away from
the wraparound point under normal circumstances.

Unfortunately we had no obvious way of knowing that the upgrade would place
our database upon the brink of wraparound. In fact, since this info is only
persisted in the controldata, the only way to discover this state to my
knowledge would be to inspect the controldata itself. Other standard means
of monitoring for wraparound risk involve watching `pg_database` or
`pg_class`, which in this case tells us nothing helpful since the fake
database present in the controldata is not represented in those stats.
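For reference, the standard per-database check looks like this (a minimal
sketch; as noted above, it cannot surface the fake OID-0 entry, which exists
only in the controldata):

-- Age of the oldest datfrozenxid per database; values climbing toward
-- autovacuum_freeze_max_age indicate growing wraparound pressure.
SELECT datname, age(datfrozenxid) AS xid_age
FROM pg_database
ORDER BY xid_age DESC;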

I'd like to suggest that either the pg_upgrade->pg_resetwal behaviour be
adjusted, or the pg_upgrade documentation highlight this potential
scenario. I'm happy to contribute code and/or documentation pull requests
to accomplish this.

Thank you,
Jason Harvey
reddit.com


Re: Bug in documentation (trim(...))?

2019-05-20 Thread Francisco Olarte
On Mon, May 20, 2019 at 12:09 PM Johann Spies  wrote:
> The following illustrates the problem (removing double quotes from a string):

Missing "select " in your examples, is it intentional?

> js=# trim(both '"' from '"University of Cape Town"');
> ERROR:  syntax error at or near "trim"
> LINE 1: trim(both '"' from '"University of Cape Town"');
> js=# trim(both 'xyz' from 'yxTomxx');
> ERROR:  syntax error at or near "trim"
> LINE 1: trim(both 'xyz' from 'yxTomxx');
> ^
> I would think that trim, if it worked in this case, would be a cheaper
> function than calling regexp_replace twice.

( copied from your mail AFTER "select " ):

$ select trim(both '"' from '"University of Cape Town"');
          btrim
-------------------------
 University of Cape Town
(1 row)

$ select trim(both 'xyz' from 'yxTomxx');
 btrim
-------
 Tom
(1 row)
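As the btrim column header hints, trim(both ... from ...) is ordinary syntax
sugar for the btrim function, which can also be called directly; a minimal
equivalent of the first example:

$ select btrim('"University of Cape Town"', '"');
          btrim
-------------------------
 University of Cape Town
(1 row)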

Francisco Olarte.




Re: Bug in documentation (trim(...))?

2019-05-20 Thread Johann Spies
Thanks!

I was a bit stupid.

Johann


On Mon, 20 May 2019 at 12:18, Francisco Olarte 
wrote:

> On Mon, May 20, 2019 at 12:09 PM Johann Spies 
> wrote:
> > The following illustrates the problem (removing double quotes from a
> string):
>
> Missing "select " in your examples, is it intentional?
>
> > js=# trim(both '"' from '"University of Cape Town"');
> > ERROR:  syntax error at or near "trim"
> > LINE 1: trim(both '"' from '"University of Cape Town"');
> > js=# trim(both 'xyz' from 'yxTomxx');
> > ERROR:  syntax error at or near "trim"
> > LINE 1: trim(both 'xyz' from 'yxTomxx');
> > ^
> > I would think that trim, if it worked in this case, would be a cheaper
> > function than calling regexp_replace twice.
>
> ( copied from your mail AFTER "select " ):
>
> $ select trim(both '"' from '"University of Cape Town"');
>           btrim
> -------------------------
>  University of Cape Town
> (1 row)
>
> $ select trim(both 'xyz' from 'yxTomxx');
>  btrim
> -------
>  Tom
> (1 row)
>
> Francisco Olarte.
>


-- 
Because experiencing your loyal love is better than life itself,
my lips will praise you.  (Psalm 63:3)


Re: pldbgapi error

2019-05-20 Thread Ian Barwick

On 5/20/19 7:07 PM, Prakash Ramakrishnan wrote:
> Hi Team,
>
> We are trying to install the pldbgapi extension on a PostgreSQL 11.3 server but
we are getting the error below. Please advise:
>
> ==> make USE_PGXS=1
> gcc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement 
-Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing 
-fwrapv -fexcess-precision=standard -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 
-fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 
-grecord-gcc-switches -m64 -mtune=generic -fPIC 
-I/usr/pgsql-11/lib/pgxs/src/makefiles/../../src/pl/plpgsql/src -I. -I./ 
-I/usr/pgsql-11/include/server -I/usr/pgsql-11/include/internal  -D_GNU_SOURCE 
-I/usr/include/libxml2  -I/usr/include  -c -o plpgsql_debugger.o plpgsql_debugger.c
> plpgsql_debugger.c: In function ‘plpgsql_send_vars’:
> plpgsql_debugger.c:263:16: error: ‘TRUE’ undeclared (first use in this 
function)
>  isArg = TRUE;
>  ^
> plpgsql_debugger.c:263:16: note: each undeclared identifier is reported only 
once for each function it appears in
> plpgsql_debugger.c:244:4: warning: enumeration value ‘PLPGSQL_DTYPE_ROW’ not 
handled in switch [-Wswitch]
>  switch( estate->datums[i]->dtype )
>  ^
> plpgsql_debugger.c:244:4: warning: enumeration value ‘PLPGSQL_DTYPE_REC’ not 
handled in switch [-Wswitch]
> plpgsql_debugger.c:244:4: warning: enumeration value ‘PLPGSQL_DTYPE_RECFIELD’ 
not handled in switch [-Wswitch]
> plpgsql_debugger.c:244:4: warning: enumeration value 
‘PLPGSQL_DTYPE_ARRAYELEM’ not handled in switch [-Wswitch]
> plpgsql_debugger.c:244:4: warning: enumeration value ‘PLPGSQL_DTYPE_PROMISE’ 
not handled in switch [-Wswitch]
> plpgsql_debugger.c: In function ‘find_datum_by_name’:
> plpgsql_debugger.c:484:9: error: ‘PLPGSQL_DTYPE_EXPR’ undeclared (first use 
in this function)
>  case PLPGSQL_DTYPE_EXPR:(...)

It looks like you are trying to compile some extension code which has not been
updated to be compatible with the PostgreSQL 11 source. Check to see if there's
updated code available.

Regards

Ian Barwick

--
 Ian Barwick   https://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: pldbgapi error

2019-05-20 Thread Prakash Ramakrishnan
Hi Ian,

Thanks for replying. Can you please help me solve this problem? Which code is
supported for PostgreSQL 11? Can you please share the link or the file?

Regards,
Prakash.R

On Mon, May 20, 2019 at 5:25 PM Ian Barwick 
wrote:

> On 5/20/19 7:07 PM, Prakash Ramakrishnan wrote:
>  > Hi Team,
>  >
>  > We are trying to install the pldbgapi extension on a PostgreSQL 11.3
> server but we are getting the error below. Please advise:
>  >
>  > ==> make USE_PGXS=1
>  > gcc -Wall -Wmissing-prototypes -Wpointer-arith
> -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute
> -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard
> -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions
> -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches
> -m64 -mtune=generic -fPIC
> -I/usr/pgsql-11/lib/pgxs/src/makefiles/../../src/pl/plpgsql/src -I. -I./
> -I/usr/pgsql-11/include/server -I/usr/pgsql-11/include/internal
> -D_GNU_SOURCE -I/usr/include/libxml2  -I/usr/include  -c -o
> plpgsql_debugger.o plpgsql_debugger.c
>  > plpgsql_debugger.c: In function ‘plpgsql_send_vars’:
>  > plpgsql_debugger.c:263:16: error: ‘TRUE’ undeclared (first use in this
> function)
>  >  isArg = TRUE;
>  >  ^
>  > plpgsql_debugger.c:263:16: note: each undeclared identifier is reported
> only once for each function it appears in
>  > plpgsql_debugger.c:244:4: warning: enumeration value
> ‘PLPGSQL_DTYPE_ROW’ not handled in switch [-Wswitch]
>  >  switch( estate->datums[i]->dtype )
>  >  ^
>  > plpgsql_debugger.c:244:4: warning: enumeration value
> ‘PLPGSQL_DTYPE_REC’ not handled in switch [-Wswitch]
>  > plpgsql_debugger.c:244:4: warning: enumeration value
> ‘PLPGSQL_DTYPE_RECFIELD’ not handled in switch [-Wswitch]
>  > plpgsql_debugger.c:244:4: warning: enumeration value
> ‘PLPGSQL_DTYPE_ARRAYELEM’ not handled in switch [-Wswitch]
>  > plpgsql_debugger.c:244:4: warning: enumeration value
> ‘PLPGSQL_DTYPE_PROMISE’ not handled in switch [-Wswitch]
>  > plpgsql_debugger.c: In function ‘find_datum_by_name’:
>  > plpgsql_debugger.c:484:9: error: ‘PLPGSQL_DTYPE_EXPR’ undeclared (first
> use in this function)
>  >  case PLPGSQL_DTYPE_EXPR:(...)
>
> It looks like you are trying to compile some extension code which has not
> been updated to be compatible with the PostgreSQL 11 source. Check to see
> if there's updated code available.
>
> Regards
>
> Ian Barwick
>
> --
>   Ian Barwick   https://www.2ndQuadrant.com/
>   PostgreSQL Development, 24x7 Support, Training & Services
>


-- 
Thanks,
Prakash.R
PostgreSQL - Offshore DBA support TCS / Nielsen Infrastructure Team On call
: +91-8939599426


Re: pldbgapi error

2019-05-20 Thread Ian Barwick

On 5/20/19 8:57 PM, Prakash Ramakrishnan wrote:



On Mon, May 20, 2019 at 5:25 PM Ian Barwick <ian.barw...@2ndquadrant.com> wrote:

On 5/20/19 7:07 PM, Prakash Ramakrishnan wrote:
  > Hi Team,
  >
  > We are trying to install the pldbgapi extension on a PostgreSQL 11.3
server but we are getting the error below. Please advise:
  >
  > ==> make USE_PGXS=1
  > gcc -Wall -Wmissing-prototypes -Wpointer-arith 
-Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute 
-Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -O2 -g 
-pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong 
--param=ssp-buffer-size=4 -grecord-gcc-switches -m64 -mtune=generic -fPIC 
-I/usr/pgsql-11/lib/pgxs/src/makefiles/../../src/pl/plpgsql/src -I. -I./ 
-I/usr/pgsql-11/include/server -I/usr/pgsql-11/include/internal  -D_GNU_SOURCE 
-I/usr/include/libxml2  -I/usr/include  -c -o plpgsql_debugger.o plpgsql_debugger.c
  > plpgsql_debugger.c: In function ‘plpgsql_send_vars’:
  > plpgsql_debugger.c:263:16: error: ‘TRUE’ undeclared (first use in this 
function)
  >          isArg = TRUE;
  >                  ^
  > plpgsql_debugger.c:263:16: note: each undeclared identifier is reported 
only once for each function it appears in
  > plpgsql_debugger.c:244:4: warning: enumeration value 
‘PLPGSQL_DTYPE_ROW’ not handled in switch [-Wswitch]
  >      switch( estate->datums[i]->dtype )
  >      ^
  > plpgsql_debugger.c:244:4: warning: enumeration value 
‘PLPGSQL_DTYPE_REC’ not handled in switch [-Wswitch]
  > plpgsql_debugger.c:244:4: warning: enumeration value 
‘PLPGSQL_DTYPE_RECFIELD’ not handled in switch [-Wswitch]
  > plpgsql_debugger.c:244:4: warning: enumeration value 
‘PLPGSQL_DTYPE_ARRAYELEM’ not handled in switch [-Wswitch]
  > plpgsql_debugger.c:244:4: warning: enumeration value 
‘PLPGSQL_DTYPE_PROMISE’ not handled in switch [-Wswitch]
  > plpgsql_debugger.c: In function ‘find_datum_by_name’:
  > plpgsql_debugger.c:484:9: error: ‘PLPGSQL_DTYPE_EXPR’ undeclared (first 
use in this function)
  >      case PLPGSQL_DTYPE_EXPR:(...)

It looks like you are trying to compile some extension code which has not been
updated to be compatible with the PostgreSQL 11 source. Check to see if there's
updated code available.

>

Hi Ian,

Thanks for replying. Can you please help me solve this problem? Which code is
supported for PostgreSQL 11? Can you please share the link or the file?


This is a simple two-step process:

1) locate the source code repository for "pldbgapi" - I don't know what this
   is, and a quick web search doesn't reveal any obvious repositories (but
   presumably you obtained the source from somewhere)
2) check if the repository contains code which is compatible with PostgreSQL 11.


HTH

Regards

Ian Barwick



--
 Ian Barwick   https://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: pldbgapi error

2019-05-20 Thread Daniel Verite
Prakash Ramakrishnan wrote:

> Which code is supported for PostgreSQL 11? Can you please share the link
> or the file?

It appears indeed from the error messages that you're trying
to compile an old version.

The latest version here:
git://git.postgresql.org/git/pldebugger.git
does compile successfully for me with PG11 (Debian 9).
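Once the current source builds and installs (make USE_PGXS=1, then make
USE_PGXS=1 install), the extension is enabled per database in the usual way.
A sketch; note the server typically also needs shared_preload_libraries =
'plugin_debugger' and a restart before the debugger is usable:

-- run in the database you want to debug in
CREATE EXTENSION pldbgapi;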


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite




Re: Loading table with indexed jsonb field is stalling

2019-05-20 Thread Adrian Klaver

On 5/20/19 9:55 AM, Will Hartung wrote:
Please reply to list also.
Ccing list.



On Sat, May 18, 2019 at 6:55 AM Adrian Klaver  wrote:


In addition to what Tim asked:

Is the jsonb field the only field in the table?

Can we see the table schema?


Did you see Tom Lane's post?




  # \d eis_entity
                               Table "public.eis_entity"
           Column            |           Type           | Collation | Nullable | Default
 ----------------------------+--------------------------+-----------+----------+---------
  eis_entity_key             | numeric                  |           | not null |
  eis_identifier_domain_key  | numeric                  |           | not null |
  global_id                  | character varying(80)    |           | not null |
  local_id                   | character varying(80)    |           | not null |
  eis_status_key             | numeric                  |           | not null |
  eis_entity_type_key        | numeric                  |           | not null |
  eis_entity_domain_key      | numeric                  |           | not null |
  all_trait_hash             | numeric(10,0)            |           |          |
  searchable_trait_hash      | numeric(10,0)            |           |          |
  created_date               | timestamp with time zone |           |          |
  traits                     | jsonb                    |           |          |
  current_revision           | numeric(8,0)             |           | not null |
  current_version_label      | character varying(10)    |           | not null |

 Indexes:
     "eis_ent_traits" gin (traits jsonb_path_ops)




--
Adrian Klaver
adrian.kla...@aklaver.com




Re: Loading table with indexed jsonb field is stalling

2019-05-20 Thread Will Hartung
I sent these earlier, but they didn’t get to the list, so I’m resending — sorry 
about that.

> On May 17, 2019, at 7:53 PM, Tim Cross  wrote:
> 
> Which version of postgres?

 PostgreSQL 10.8 (Ubuntu 10.8-0ubuntu0.18.04.1) on x86_64-pc-linux-gnu, 
compiled by gcc (Ubuntu 7.3.0-16ubuntu3) 7.3.0, 64-bit

> 
> How are you loading the data? (application, psql, pg_restore) using
> (insert, copy)?


I have a SQL file I’m using filled with statements like:

\copy eis_entity from program 'gzcat /u2/hdhdata/xxx/out1000.txt.gz';
select now();
\copy eis_entity from program 'gzcat /u2/hdhdata/xxx/out1001.txt.gz';
select now();

Thanks



Re: Loading table with indexed jsonb field is stalling

2019-05-20 Thread Will Hartung
I sent this earlier, but it did not make it to the list, so I’m resending it.

> On May 18, 2019, at 6:55 AM, Adrian Klaver  wrote:

> Is the jsonb field the only field in the table?
> 
> Can we see the table schema?

# \d eis_entity
Table "public.eis_entity"
  Column   |   Type   | Modifiers 
---+--+---
 eis_entity_key| numeric  | not null
 eis_identifier_domain_key | numeric  | not null
 global_id | character varying(80)| not null
 local_id  | character varying(80)| not null
 eis_status_key| numeric  | not null
 eis_entity_type_key   | numeric  | not null
 eis_entity_domain_key | numeric  | not null
 all_trait_hash| numeric(10,0)| 
 searchable_trait_hash | numeric(10,0)| 
 created_date  | timestamp with time zone | 
 traits| jsonb| 
 current_revision  | numeric(8,0) | not null
 current_version_label | character varying(10)| not null
Indexes:
"eis_ent_traits" gin (traits jsonb_path_ops)

Thanks.



Re: Loading table with indexed jsonb field is stalling

2019-05-20 Thread Will Hartung
I sent this earlier, but it did not make it to the list, so I’m resending it.

> On May 18, 2019, at 11:02 AM, Tom Lane  wrote:
> 
> In addition to the questions about what PG version you're using, is
> the backend process that's doing the load actually consuming CPU time,
> or is it just sitting?

It’s consuming both CPU and disk space.

Thanks.





INSERT where not exists with foreign key

2019-05-20 Thread Chuck Martin
My Google foo isn't working on this question, probably because I don't
understand the question well enough. I'm using Postgres 11.3 on Centos 7.
I'm trying to insert a record in table A with a foreign key to table B, but
only where there is not already a foreign key in A to B. So assume this
simple structure:

Table A
A.keyInteger
A.something text
A.Bkey Integer [foreign key to table B, column B.key]

Table B (or View C)
B.keyInteger
[more columns]

Thinking that it might simplify matters, I created a view to table B that
only includes records with no foreign key in Table A. But still, after
reading the documentation and Googling, I can't work out what seems like it
should be simple. Among the many things that didn't work is:

INSERT INTO A(something,A.Bkey)

VALUES ('text',
  (SELECT C.key FROM C));

But this didn't work because the subquery returned more than one value. Of
course I want it to return all values, but just one per insert.

I can do this outside of Postgres, but would like to learn how to do this
with SQL.

Chuck Martin
Avondale Software


Re: Loading table with indexed jsonb field is stalling

2019-05-20 Thread Tom Lane
Will Hartung  writes:
>> On May 18, 2019, at 11:02 AM, Tom Lane  wrote:
>> In addition to the questions about what PG version you're using, is
>> the backend process that's doing the load actually consuming CPU time,
>> or is it just sitting?

> It’s consuming both CPU and disk space.

Hmm, so not a deadlock (plus you're running a version that has that
bug fixed).  It sounds like you might have hit an actual bug, perhaps
one that causes it to endlessly re-split an index page, or something
along that line.  But it's going to be hard to find the cause without
a test case.  I don't suppose you can anonymize your data to the
point where you'd have a publishable test case?

regards, tom lane




Re: Loading table with indexed jsonb field is stalling

2019-05-20 Thread Adrian Klaver

On 5/20/19 11:05 AM, Will Hartung wrote:

I sent this earlier, but it did not make it to the list, so I’m resending it.


On May 18, 2019, at 11:02 AM, Tom Lane  wrote:

In addition to the questions about what PG version you're using, is
the backend process that's doing the load actually consuming CPU time,
or is it just sitting?


It’s consuming both CPU and disk space.


What do the below show:

1) ps ax | grep postgres

2) As superuser:
select * from pg_stat_activity ;


Thanks.






--
Adrian Klaver
adrian.kla...@aklaver.com




Re: Loading table with indexed jsonb field is stalling

2019-05-20 Thread Will Hartung


> On May 20, 2019, at 11:13 AM, Adrian Klaver  wrote:
> 
> What do the below show:
> 
> 1) ps ax | grep postgres
> 
> 2) As superuser:
>   select * from pg_stat_activity ;

I’ll restart the process and report back.

Thanks.







Re: Loading table with indexed jsonb field is stalling

2019-05-20 Thread Will Hartung


> On May 20, 2019, at 11:12 AM, Tom Lane  wrote:
> 
> Hmm, so not a deadlock (plus you're running a version that has that
> bug fixed).  It sounds like you might have hit an actual bug, perhaps
> one that causes it to endlessly re-split an index page, or something
> along that line.  But it's going to be hard to find the cause without
> a test case.  I don't suppose you can anonymize your data to the
> point where you'd have a publishable test case?

That would take some time, thankfully it seems reproducible with a few million 
rows.

I’ve moved the files around a little bit (not aggressively) and it still 
stalled, so I don’t think it’s something specific to the data. But we’ll have 
to play with it and see.

Thanks.





Re: INSERT where not exists with foreign key

2019-05-20 Thread Adrian Klaver

On 5/20/19 11:11 AM, Chuck Martin wrote:
My Google foo isn't working on this question, probably because I don't 
understand the question well enough. I'm using Postgres 11.3 on Centos 
7. I'm trying to insert a record in table A with a foreign key to table 
B, but only where there is not already a foreign key in A to B. So 
assume this simple structure:


Table A
A.key    Integer
A.something     text
A.Bkey     Integer [foreign key to table B, column B.key]

Table B (or View C)
B.key    Integer
[more columns]

Thinking that it might simplify matters, I created a view to table B 
that only includes records with no foreign key in Table A. But still, 
after reading the documentation and Googling, I can't work out what 
seems like it should be simple. Among the many things that didn't work is:


INSERT INTO A(something,A.Bkey)

VALUES ('text',
               (SELECT C.key FROM C));

But this didn't work because the subquery returned more than one value. 
Of course I want it to return all values, but just one per insert.


I can do this outside of Postgres, but would like to learn how to do 
this with SQL.


Some examples that you can modify:
https://www.postgresql.org/docs/11/sql-insert.html

INSERT INTO films SELECT * FROM tmp_films WHERE date_prod < '2004-05-07';

WITH upd AS (
  UPDATE employees SET sales_count = sales_count + 1 WHERE id =
(SELECT sales_person FROM accounts WHERE name = 'Acme Corporation')
RETURNING *
)
INSERT INTO employees_log SELECT *, current_timestamp FROM upd;





Chuck Martin
Avondale Software



--
Adrian Klaver
adrian.kla...@aklaver.com




Re: INSERT where not exists with foreign key

2019-05-20 Thread Adrian Klaver

On 5/20/19 11:11 AM, Chuck Martin wrote:
My Google foo isn't working on this question, probably because I don't 
understand the question well enough. I'm using Postgres 11.3 on Centos 
7. I'm trying to insert a record in table A with a foreign key to table 
B, but only where there is not already a foreign key in A to B. So 
assume this simple structure:


Table A
A.key    Integer
A.something     text
A.Bkey     Integer [foreign key to table B, column B.key]

Table B (or View C)
B.key    Integer
[more columns]

Thinking that it might simplify matters, I created a view to table B 
that only includes records with no foreign key in Table A. But still, 
after reading the documentation and Googling, I can't work out what 
seems like it should be simple. Among the many things that didn't work is:


INSERT INTO A(something,A.Bkey)

VALUES ('text',
               (SELECT C.key FROM C));

But this didn't work because the subquery returned more than one value. 
Of course I want it to return all values, but just one per insert.


I can do this outside of Postgres, but would like to learn how to do 
this with SQL.


A more concrete example:

create table parent_tbl(id integer primary key, fld_1 varchar)

create table child_tbl(child_id serial primary key, fk_id integer 
references parent_tbl, child_fld_1 varchar);


insert into parent_tbl values (1, 'test');
insert into parent_tbl values (2, 'dog');

insert into child_tbl(fk_id, child_fld_1)  values (1, 'cat');
insert into child_tbl(fk_id, child_fld_1)  values (2, 'fish');

insert into child_tbl(fk_id, child_fld_1)  select id, 'rabbit' from 
parent_tbl where id not in(select child_id from child_tbl);


select * from child_tbl;
 child_id | fk_id | child_fld_1
--+---+-
1 | 1 | cat
2 | 2 | fish
3 | 3 | rabbit

insert into parent_tbl values (4, 'parrot');
insert into parent_tbl values (5, 'lion');

insert into child_tbl(fk_id, child_fld_1)  select id, 'rabbit' from 
parent_tbl where id not in(select child_id from child_tbl);


select * from child_tbl;
 child_id | fk_id | child_fld_1
--+---+-
1 | 1 | cat
2 | 2 | fish
3 | 3 | rabbit
4 | 4 | rabbit
5 | 5 | rabbit
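
Mapped back onto the A/B naming from the original question, the same idea can
also be written with NOT EXISTS (a sketch, assuming a.bkey is the foreign key
column referencing b.key):

INSERT INTO a (something, bkey)
SELECT 'text', b.key
FROM b
WHERE NOT EXISTS (SELECT 1 FROM a WHERE a.bkey = b.key);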




Chuck Martin
Avondale Software



--
Adrian Klaver
adrian.kla...@aklaver.com




Re: Refresh Publication takes hours and doesn't finish

2019-05-20 Thread PegoraroF10
I asked this some time ago but got no responses, so I'm asking again.
pg_publication_tables is a view that is used to refresh a publication, but as
we have 15,000 tables, it takes hours and doesn't complete. If I change that
view I get an immediate result. The question is: can I change that view?
Is there any problem with changing those system views?

The original view is:
create view pg_catalog.pg_publication_tables as
SELECT p.pubname, n.nspname AS schemaname, c.relname AS tablename
FROM pg_publication p,
     (pg_class c JOIN pg_namespace n ON ((n.oid = c.relnamespace)))
WHERE (c.oid IN (SELECT pg_get_publication_tables.relid
                 FROM pg_get_publication_tables((p.pubname)::text)
                      pg_get_publication_tables(relid)));
This way it takes 45 minutes to respond.

I changed it to:
create or replace view pg_catalog.pg_publication_tables as
SELECT p.pubname, n.nspname AS schemaname, c.relname AS tablename
FROM pg_publication p
     inner join pg_get_publication_tables(p.pubname) pt on true
     inner join pg_class c on pt.relid = c.oid
     inner join pg_namespace n ON (n.oid = c.relnamespace);
This one takes just one or two seconds.



--
Sent from: http://www.postgresql-archive.org/PostgreSQL-general-f1843780.html




Re: Loading table with indexed jsonb field is stalling

2019-05-20 Thread Will Hartung

> On May 20, 2019, at 11:13 AM, Adrian Klaver  wrote:

> What do the below show:
> 
> 1) ps ax | grep postgres

$ ps -ax | grep postgres
 1171 ?S  0:04 /usr/lib/postgresql/10/bin/postgres -D 
/var/lib/postgresql/10/main -c 
config_file=/etc/postgresql/10/main/postgresql.conf
 1420 ?Ds 0:21 postgres: 10/main: checkpointer process   
 1421 ?Ss 0:38 postgres: 10/main: writer process   
 1422 ?Ss 0:22 postgres: 10/main: wal writer process   
 1423 ?Ss 0:04 postgres: 10/main: autovacuum launcher process   
 1424 ?Ss 0:04 postgres: 10/main: stats collector process   
 1425 ?Ss 0:00 postgres: 10/main: bgworker: logical replication 
launcher   
15917 tty1 S  0:00 su - postgres
16300 ?Rs10:39 postgres: 10/main: postgres willtest 
10.35.60.64(54594) COPY
16444 ?Ss 0:08 postgres: 10/main: autovacuum worker process   
willtest waiting
16633 tty1 S+ 0:00 /usr/lib/postgresql/10/bin/psql
16641 ?Ss 0:00 postgres: 10/main: postgres postgres [local] idle


> 2) As superuser:
>   select * from pg_stat_activity ;

(columns trimmed for readability: usesysid, application_name, client_hostname,
client_port, query_start and state_change are omitted below)

 datid | datname  |  pid  | usename  | client_addr |         backend_start         |          xact_start           | wait_event_type |     wait_event      | state  | backend_xid | backend_xmin |                 query                  |    backend_type
-------+----------+-------+----------+-------------+-------------------------------+-------------------------------+-----------------+---------------------+--------+-------------+--------------+----------------------------------------+---------------------
       |          |  1425 | postgres |             | 2019-05-17 12:00:17.659235-07 |                               | Activity        | LogicalLauncherMain |        |             |              |                                        | background worker
       |          |  1423 |          |             | 2019-05-17 12:00:17.658936-07 |                               | Activity        | AutoVacuumMain      |        |             |              |                                        | autovacuum launcher
 16384 | willtest | 16444 |          |             | 2019-05-20 12:16:14.564982-07 | 2019-05-20 12:16:14.641913-07 | Lock            | page                | active |         624 |          623 | autovacuum: ANALYZE public.eis_entity  | autovacuum worker
 13051 | postgres | 16889 | postgres |             | 2019-05-20 13:44:50.84062-07  | 2019-05-20 13:46:17.209382-07 |                 |                     | active |             |          623 | select * from pg_stat_activity;        | client backend
 16384 | willtest | 16300 | postgres | 10.35.60.64 | 2019-05-20 11:24:59.865383-07 | 2019-05-20 12:15:42.494372-07 | LWLock          | WALWriteLock        | active |         623 |          612 | COPY  eis_entity FROM STDIN ;          | client backend
       |          |  1421 |          |             | 2019-05-17 12:00:17.557683-07 |                               | Activity        | BgWriterMain        |        |             |              |                                        | background writer
       |          |  1420 |          |             | 2019-05-17 12:00:17.557992-07 |                               |                 |                     |        |             |              |                                        | checkpointer
       |          |  1422 |          |             | 2019-05-17 12:00:17.554268-07 |                               |                 |                     |        |             |              |                                        | walwriter
(8 rows)
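
The COPY backend (pid 16300) is waiting on WALWriteLock and the autovacuum
worker on a page lock. One way to see directly who blocks whom, as a sketch:

-- pg_blocking_pids() covers heavyweight-lock waits only; LWLock waits such
-- as WALWriteLock will show an empty blocker list.
SELECT pid, pg_blocking_pids(pid) AS blocked_by,
       wait_event_type, wait_event, state, query
FROM pg_stat_activity
WHERE wait_event IS NOT NULL;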


It’s been running for

Re: Refresh Publication takes hours and doesn't finish

2019-05-20 Thread Fabrízio de Royes Mello
On Mon, 20 May 2019 at 17:18, PegoraroF10  wrote:
>
> I asked this some time ago but got no responses, so I'm asking again.
> pg_publication_tables is a view that is used to refresh a publication, but
> as we have 15,000 tables, it takes hours and doesn't complete. If I change
> that view I get an immediate result. The question is: can I change that
> view? Is there any problem with changing those system views?
>

Do you really need a single publication with that many relations? If you can
split it into several publications your life should be easier.

>
> The original view is:
> create view pg_catalog.pg_publication_tables as
> SELECT p.pubname, n.nspname AS schemaname, c.relname AS tablename
> FROM pg_publication p,
>      (pg_class c JOIN pg_namespace n ON ((n.oid = c.relnamespace)))
> WHERE (c.oid IN (SELECT pg_get_publication_tables.relid
>                  FROM pg_get_publication_tables((p.pubname)::text)
>                       pg_get_publication_tables(relid)));
> This way it takes 45 minutes to respond.
>

I really don't know why it was done that way, since pg_get_publication_tables
doesn't have any special behavior beyond returning the relations assigned to a
publication.


>
> I changed it to:
> create or replace view pg_catalog.pg_publication_tables as
> SELECT p.pubname, n.nspname AS schemaname, c.relname AS tablename
> FROM pg_publication p
>      inner join pg_get_publication_tables(p.pubname) pt on true
>      inner join pg_class c on pt.relid = c.oid
>      inner join pg_namespace n ON (n.oid = c.relnamespace);
> This one takes just one or two seconds.
>

Even better, you can go directly to the system catalogs:

 SELECT p.pubname,
n.nspname AS schemaname,
c.relname AS tablename
   FROM pg_publication p
 JOIN pg_publication_rel pr ON pr.prpubid = p.oid
 JOIN pg_class c ON c.oid = pr.prrelid
 JOIN pg_namespace n ON n.oid = c.relnamespace;

To change it, you'll first need to set "allow_system_table_mods=on" and
restart PostgreSQL.

Regards,

--
   Fabrízio de Royes Mello Timbira - http://www.timbira.com.br/
   PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento


Re: Re: Refresh Publication takes hours and doesn't finish

2019-05-20 Thread Tom Lane
PegoraroF10  writes:
> I asked this some time ago but got no responses, so I'm asking again.
> pg_publication_tables is a view that is used to refresh a publication, but as
> we have 15,000 tables, it takes hours and doesn't complete. If I change that
> view I get an immediate result. The question is: can I change that view?
> Is there any problem with changing those system views?

> The original view is:
> create view pg_catalog.pg_publication_tables as
> SELECT p.pubname, n.nspname AS schemaname, c.relname AS tablename
> FROM pg_publication p,
>      (pg_class c JOIN pg_namespace n ON ((n.oid = c.relnamespace)))
> WHERE (c.oid IN (SELECT pg_get_publication_tables.relid
>                  FROM pg_get_publication_tables((p.pubname)::text)
>                       pg_get_publication_tables(relid)));
> This way it takes 45 minutes to respond.

> I changed it to:
> create or replace view pg_catalog.pg_publication_tables as
> SELECT p.pubname, n.nspname AS schemaname, c.relname AS tablename
> FROM pg_publication p
>      inner join pg_get_publication_tables(p.pubname) pt on true
>      inner join pg_class c on pt.relid = c.oid
>      inner join pg_namespace n ON (n.oid = c.relnamespace);
> This one takes just one or two seconds.

Hmm ... given that pg_get_publication_tables() shouldn't return any
duplicate OIDs, it does seem unnecessarily inefficient to put it in
an IN-subselect condition.  Peter, is there a reason why this isn't
a straight lateral join?  I get a much saner-looking plan from

FROM pg_publication P, pg_class C
-JOIN pg_namespace N ON (N.oid = C.relnamespace)
-   WHERE C.oid IN (SELECT relid FROM pg_get_publication_tables(P.pubname));
+JOIN pg_namespace N ON (N.oid = C.relnamespace),
+LATERAL pg_get_publication_tables(P.pubname)
+   WHERE C.oid = pg_get_publication_tables.relid;
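
Assembled against the original definition, that rewrite would read roughly as
follows (a sketch, not a committed patch):

SELECT P.pubname, N.nspname AS schemaname, C.relname AS tablename
FROM pg_publication P, pg_class C
     JOIN pg_namespace N ON (N.oid = C.relnamespace),
     LATERAL pg_get_publication_tables(P.pubname)
WHERE C.oid = pg_get_publication_tables.relid;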

regards, tom lane




Re: Loading table with indexed jsonb field is stalling

2019-05-20 Thread Ron

On 5/20/19 4:14 PM, Will Hartung wrote:
[snip]
Also to note, I tried just loading the table with no indexes, and I was
getting a solid 22MB/s via iostat of raw data load (just to prove that the
I/O system, while certainly not extraordinary, was functional).


I think you answered this earlier, but does the same stalling happen when 
indexes are dropped?



--
Angular momentum makes the world go 'round.




Re: Re: Refresh Publication takes hours and doesn't finish

2019-05-20 Thread Fabrízio de Royes Mello
On Mon, 20 May 2019 at 18:30, Tom Lane  wrote:
>
> Hmm ... given that pg_get_publication_tables() shouldn't return any
> duplicate OIDs, it does seem unnecessarily inefficient to put it in
> an IN-subselect condition.  Peter, is there a reason why this isn't
> a straight lateral join?  I get a much saner-looking plan from
>
> FROM pg_publication P, pg_class C
> -JOIN pg_namespace N ON (N.oid = C.relnamespace)
> -   WHERE C.oid IN (SELECT relid FROM
pg_get_publication_tables(P.pubname));
> +JOIN pg_namespace N ON (N.oid = C.relnamespace),
> +LATERAL pg_get_publication_tables(P.pubname)
> +   WHERE C.oid = pg_get_publication_tables.relid;
>

And why not just JOIN directly with pg_publication_rel?
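
(One likely reason the view doesn't: a publication created FOR ALL TABLES
stores no rows in pg_publication_rel, so membership has to be expanded through
pg_get_publication_tables(). A quick check, as a sketch:)

CREATE PUBLICATION all_pub FOR ALL TABLES;
-- returns no rows even though every table is published
SELECT pr.prrelid
FROM pg_publication_rel pr
     JOIN pg_publication p ON p.oid = pr.prpubid
WHERE p.pubname = 'all_pub';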

Regards,

--
   Fabrízio de Royes Mello Timbira - http://www.timbira.com.br/
   PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento


Re: Loading table with indexed jsonb field is stalling

2019-05-20 Thread Will Hartung


> On May 20, 2019, at 2:36 PM, Ron  wrote:

> I think you answered this earlier, but does the same stalling happen when 
> indexes are dropped?

No, the data loads fine.

The way I originally stumbled upon this was that I had offloaded the data for
some other kinds of testing.

There are several indexes in production, but for my local testing, I only had 
one on the pk.

Then, one day, I wanted to look for something, and decided I’ll build the json 
index “real quick”.

Being me, it wasn’t until a few days later I’m like “why is my machine busy, 
what’s that grinding sound” (I rarely hear my drive, most work is done on SSD).

Then I realized that the index was still being built, several days later.

“Huh”

And I left it again.

I was, again, not really paying attention, and didn’t realize until 2 weeks 
after I had started that it was STILL running.

At that point I killed it.

Then I decided to break the file up and load it in chunks, see “how far” it was 
getting, etc. It did not take long for it to get stuck.

I tried it on different versions of PG, now on different OS’s (albeit one is in 
a VM). It wasn’t a crisis, just a curiosity.

But we had a scary event a couple of weeks ago that may have forced us to reload
the table, but, thankfully, didn’t, and we recovered. But it got me thinking
about poking at it again.

A colleague is trying to pg_restore our production DB for a conversion test to 
AWS Aurora, and it’s been stuck for 43 hours on a 68M row table. So, it’s
systemic with our data, and I believe that we can not reload this table 
currently. Which is scary.

And here we are.






Re: Loading table with indexed jsonb field is stalling

2019-05-20 Thread Ron

On 5/20/19 4:48 PM, Will Hartung wrote:

On May 20, 2019, at 2:36 PM, Ron  wrote:
I think you answered this earlier, but does the same stalling happen when 
indexes are dropped?

No, the data loads fine.

The way I originally stumbled upon this was that I had off loaded the data for 
some other kinds of testing.

There are several indexes in production, but for my local testing, I only had 
one on the pk.


And it pathologically loads even when there's just a PK on the numeric field?


Then, one day, I wanted to look for something, and decided I’ll build the json 
index “real quick”.

Being me, it wasn’t until a few days later I’m like “why is my machine busy, 
what’s that grinding sound” (I rarely hear my drive, most work is done on SSD).

Then I realized that the index was still being built, several days later.

“Huh”

And I left it again.

I was, again, not really paying attention, and didn’t realize until 2 weeks 
after I had started that it was STILL running.

At that point I killed it.

Then I decided to break the file up and load it in chunks, see “how far” it was 
getting, etc. It did not take long for it to get stuck.

I tried it on different versions of PG, now on different OS’s (albeit one is in 
a VM). It wasn’t a crisis, just a curiosity.

But we had a scary event couple of weeks ago that may have forced us to reload 
the table, but, thankfully, didn’t and we recovered. But got me to thinking 
about poking at it again.


Drop all indexes, load data, recreate indexes?
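
(Using the definitions that appear elsewhere in the thread, that would look
something like this sketch:)

DROP INDEX eis_ent_traits;
-- bulk load, e.g.: \copy eis_entity from program 'gzcat /u2/hdhdata/xxx/out1000.txt.gz';
CREATE INDEX eis_ent_traits ON public.eis_entity
    USING gin (traits jsonb_path_ops);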


A colleague is trying to pg_restore our production DB for a conversion test to 
AWS Aurora, and it’s been stuck for 43 hours on a 68M row table. So, it’s
systemic with our data, and I believe that we can not reload this table 
currently. Which is scary.


But pg_restore only creates indexes AFTER the data loads.

--
Angular momentum makes the world go 'round.




Re: Loading table with indexed jsonb field is stalling

2019-05-20 Thread Will Hartung


> On May 20, 2019, at 2:55 PM, Ron  wrote:

> And it pathologically loads even when there's just a PK on the numeric field?

Yea, that works fine.

> Drop all indexes, load data, recreate indexes?

No, I use the incremental load as it gives a much better indication when the 
process has gone off the deep end.

> But pg_restore only creates indexes AFTER the data loads.

Yes. Demonstrating that it doesn’t work either incrementally or all at once.

(No, I have no confidence that the Aurora build will ever finish in any 
reasonable amount of time.)






Re: Loading table with indexed jsonb field is stalling

2019-05-20 Thread Ron

On 5/20/19 5:43 PM, Will Hartung wrote:

On May 20, 2019, at 2:55 PM, Ron  wrote:

And it pathologically loads even when there's just a PK on the numeric field?

Yea, that works fine.


Drop all indexes, load data, recreate indexes?

No, I use the incremental load as it gives a much better indication when the 
process has gone off the deep end.


That's fine for tracing errors in a test environment, but.


But pg_restore only creates indexes AFTER the data loads.

Yes. Demonstrating that it doesn’t work either incrementally or all at once.

(No, I have no confidence that the Aurora build will ever finish in any 
reasonable amount of time.)


I'm confused.  You wrote above that loading without indexes and with just 
the PK works just fine; if you *really* need it loaded in Aurora or 
production, just drop the indexes beforehand?


--
Angular momentum makes the world go 'round.


Re: Loading table with indexed jsonb field is stalling

2019-05-20 Thread Adrian Klaver

On 5/20/19 3:43 PM, Will Hartung wrote:



On May 20, 2019, at 2:55 PM, Ron  wrote:



And it pathologically loads even when there's just a PK on the numeric field?


Yea, that works fine.


Drop all indexes, load data, recreate indexes?


No, I use the incremental load as it gives a much better indication when the 
process has gone off the deep end.


But pg_restore only creates indexes AFTER the data loads.


Yes. Demonstrating that it doesn’t work either incrementally or all at once.


So the issue is building the index(es), not loading the data, correct?



(No, I have no confidence that the Aurora build will ever finish in any 
reasonable amount of time.)








--
Adrian Klaver
adrian.kla...@aklaver.com




Re: Loading table with indexed jsonb field is stalling

2019-05-20 Thread Will Hartung

> On May 20, 2019, at 4:39 PM, Adrian Klaver  wrote:
> 
> So the issue is building the index(s) not loading the data, correct?

Correct.

Since I have been exploring this, I have not been able to successfully create a 
gin jsonb_path_ops index on a jsonb field either as a new index on a table with 
existing data, or as a pre-existing index on an empty table and loading the 
data in.

The create index does not finish on the existing data, and loading data does 
not finish when the index pre-exists.

We currently have a 68M row table loaded in an Aurora instance on AWS (I do not 
know the instance/machine size), and the index has been building for 48 hours 
now.

The pg_restore is currently working on that index:


CREATE INDEX eis_ent_traits
  ON public.eis_entity
  USING gin
  (traits jsonb_path_ops);




Re: Loading table with indexed jsonb field is stalling

2019-05-20 Thread Will Hartung

> On May 20, 2019, at 4:27 PM, Ron  wrote:
> 
> I'm confused.  You wrote above that loading without indexes and with just the 
> PK works just fine; if you really need it loaded in Aurora or production, 
> just drop the indexes beforehand?

Because the jsonb_path_ops index does not rebuild, and it’s the heart of that 
table.

The table is essentially worthless without that index to us.




Re: Loading table with indexed jsonb field is stalling

2019-05-20 Thread Adrian Klaver

On 5/20/19 4:49 PM, Will Hartung wrote:


On May 20, 2019, at 4:39 PM, Adrian Klaver  wrote:


So the issue is building the index(s) not loading the data, correct?


Correct.

Since I have been exploring this, I have not been able to successfully 
create a gin jsonb_path_ops index on a jsonb field either as a new index 
on a table with existing data, or as a pre-existing index on an empty 
table and loading the data in.


The create index does not finish on the existing data, and loading data 
does not finish when the index pre-exists.


Well looks like you are down to Tom's suggestion of creating a test 
case. Given that it seems confined to the jsonb field and corresponding 
index, I would think that is all that is needed for the test case. Start 
with some smaller subset, say 10,000 rows and work up till you start 
seeing an issue.




We currently have a 68M row table loaded in an Aurora instance on AWS (I 
do not know the instance/machine size), and the index has been building 
for 48 hours now.


The pg_restore is currently working on that index:


CREATE INDEX eis_ent_traits
   ON public.eis_entity
   USING gin
   (traits jsonb_path_ops);





--
Adrian Klaver
adrian.kla...@aklaver.com




Re: Loading table with indexed jsonb field is stalling

2019-05-20 Thread Ron

On 5/20/19 6:51 PM, Will Hartung wrote:
On May 20, 2019, at 4:27 PM, Ron  wrote:


I'm confused.  You wrote above that loading without indexes and with just 
the PK works just fine; if you *really* need it loaded in Aurora or 
production, just drop the indexes beforehand?


Because the jsonb_path_ops index does not rebuild, and it’s the heart of 
that table.


The table is essentially worthless without that index to us.


Clarified.

Are there a sufficiently small number of elements in each traits object that 
you can do something like this, on the UNINDEXED table?

SELECT traits->element1, traits->element2, count(*)
from eis_entry
group by traits->element1, traits->element2;

Sure it would run for a long time, but might show you where the problem lies.


--
Angular momentum makes the world go 'round.


Re: Loading table with indexed jsonb field is stalling

2019-05-20 Thread Will Hartung


> On May 20, 2019, at 5:15 PM, Ron  wrote:
> 
> Are there a sufficiently small number of elements in each traits object that 
> you can do something like this, on the UNINDEXED table?
> SELECT traits->element1, traits->element2, count(*)
> from eis_entry
> group by traits->element1, traits->element2;
> 
> Sure it would run for a long time, but might show you where the problem lies.

I don’t quite understand what you’re trying to achieve here. Is the intent 
essentially to access each individual element within the json on the select 
statement? There’s at least 50 elements in this thing, and the structure is not 
flat, nor is it static and consistent. Similar, sure. Perfectly congruent, no.






Re: Loading table with indexed jsonb field is stalling

2019-05-20 Thread Will Hartung


> On May 20, 2019, at 5:14 PM, Adrian Klaver  wrote:
> 
> Well looks like you are down to Tom's suggestion of creating a test case. 
> Given that it seems confined to the jsonb field and corresponding index, I 
> would think that is all that is needed for the test case. Start with some 
> smaller subset, say 10,000 rows and work up till you start seeing an issue.

This will take quite some work, and I wouldn’t attempt it with less than 5M 
rows to load.






Re: Loading table with indexed jsonb field is stalling

2019-05-20 Thread Tom Lane
Will Hartung  writes:
>> On May 20, 2019, at 5:14 PM, Adrian Klaver  wrote:
>> Well looks like you are down to Tom's suggestion of creating a test case. 
>> Given that it seems confined to the jsonb field and corresponding index, I 
>> would think that is all that is needed for the test case. Start with some 
>> smaller subset, say 10,000 rows and work up till you start seeing an issue.

> This will take quite some work, and I wouldn’t attempt it with less than 5M 
> rows to load.

Well, you're the only one who's seen this problem, and none of the
rest of us have any idea how to reproduce it.  So if you want something
to get done in a timely fashion, it's up to you to show us a test case.

My guess is that it wouldn't be that hard to anonymize your data to
the point where it'd be OK to show to someone else.  It's unlikely
that the problem depends on the *exact* data you've got --- though it
might depend on string lengths and the number/locations of duplicates.
But you should be able to substitute random strings for the original
values while preserving that.
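
For instance, something along these lines preserves lengths and duplicate
locations for top-level string values while discarding the actual text; a
rough sketch, assuming traits is a JSON object (nested values would need the
same treatment applied recursively):

-- Replace each top-level string value with an md5-derived string of the same
-- length; identical inputs map to identical outputs, so duplicates survive.
-- Strings longer than 256 characters are truncated to 256.
UPDATE eis_entity
SET traits = (
    SELECT jsonb_object_agg(key,
             CASE jsonb_typeof(value)
               WHEN 'string' THEN to_jsonb(substr(repeat(md5(value::text), 8),
                                           1, length(value #>> '{}')))
               ELSE value
             END)
    FROM jsonb_each(traits)
);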

regards, tom lane




Re: Loading table with indexed jsonb field is stalling

2019-05-20 Thread Adrian Klaver

On 5/20/19 5:22 PM, Will Hartung wrote:



On May 20, 2019, at 5:14 PM, Adrian Klaver  wrote:

Well looks like you are down to Tom's suggestion of creating a test case. Given 
that it seems confined to the jsonb field and corresponding index, I would 
think that is all that is needed for the test case. Start with some smaller 
subset, say 10,000 rows and work up till you start seeing an issue.


This will take quite some work, and I wouldn’t attempt it with less than 5M 
rows to load.



From this post:
https://www.postgresql.org/message-id/CAKMEDdxE95SC76wadMzrH454HpvCV4tOdEmuZRO%3DpwArEw6YVg%40mail.gmail.com

You started seeing longer times by the time you got to 500,000 rows and 
then it fell off by 1.1 million.


Creating the data:

CREATE TABLE AS gin_test SELECT traits FROM eis_entity LIMIT 100;

Then:

pg_dump -t gin_test







--
Adrian Klaver
adrian.kla...@aklaver.com




Re: Loading table with indexed jsonb field is stalling

2019-05-20 Thread Adrian Klaver

On 5/20/19 5:33 PM, Adrian Klaver wrote:

On 5/20/19 5:22 PM, Will Hartung wrote:


On May 20, 2019, at 5:14 PM, Adrian Klaver 
 wrote:


Well looks like you are down to Tom's suggestion of creating a test 
case. Given that it seems confined to the jsonb field and 
corresponding index, I would think that is all that is needed for the 
test case. Start with some smaller subset, say 10,000 rows and work 
up till you start seeing an issue.


This will take quite some work, and I wouldn’t attempt it with less 
than 5M rows to load.




 From this post:
https://www.postgresql.org/message-id/CAKMEDdxE95SC76wadMzrH454HpvCV4tOdEmuZRO%3DpwArEw6YVg%40mail.gmail.com 



You started seeing longer times by the time you got to 500,000 rows and 
then it fell off by 1.1 million.


Creating the data:

CREATE TABLE AS gin_test SELECT traits FROM eis_entity LIMIT 100;


Correction:

CREATE TABLE gin_test AS SELECT traits FROM eis_entity LIMIT 100;



Then:

pg_dump -t gin_test










--
Adrian Klaver
adrian.kla...@aklaver.com




Re: Loading table with indexed jsonb field is stalling

2019-05-20 Thread Ron

On 5/20/19 7:21 PM, Will Hartung wrote:

On May 20, 2019, at 5:15 PM, Ron  wrote:

Are there a sufficiently small number of elements in each traits object that 
you can do something like this, on the UNINDEXED table?
SELECT traits->element1, traits->element2, count(*)
from eis_entry
group by traits->element1, traits->element2;

Sure it would run for a long time, but might show you where the problem lies.

I don’t quite understand what you’re trying to achieve here. Is the intent 
essentially to access each individual element within the json on the select 
statement? There’s at least 50 elements in this thing,


Like I said earlier... "Are there a sufficiently small number of elements in 
each traits object...?"  But obviously there aren't.


--
Angular momentum makes the world go 'round.