[GENERAL] INSERT ON CONFLICT of "wide" table: target lists can have at most 1664 entries

2016-12-04 Thread Justin Pryzby
Our application INSERTs data from external sources, and infrequently UPDATEs
the previously-inserted data (currently, it first SELECTs to determine whether
to UPDATE).

I'm implementing unique indices to allow "upsert" (and pg_repack and..), but
running into a problem when the table has >830 columns (we have some tables
which are at the 1600 column limit, and have previously worked around that
limit using arrays or multiple tables).

I tried to work around the upsert problem by using pygresql inline=True
(instead of default PREPAREd statements) but both have the same issue.

I created a test script which demonstrates the problem (attached).

It seems to me that there's currently no way to "upsert" such a wide table?

I see:
./src/include/access/htup_details.h:#define MaxTupleAttributeNumber 1664
/* 8 * 208 */

./src/backend/parser/parse_node.c-  /*
./src/backend/parser/parse_node.c-   * Check that we did not produce too many resnos; at the very least we
./src/backend/parser/parse_node.c-   * cannot allow more than 2^16, since that would exceed the range of a
./src/backend/parser/parse_node.c:   * AttrNumber. It seems safest to use MaxTupleAttributeNumber.
./src/backend/parser/parse_node.c-   */
./src/backend/parser/parse_node.c:  if (pstate->p_next_resno - 1 > MaxTupleAttributeNumber)
./src/backend/parser/parse_node.c-      ereport(ERROR,
./src/backend/parser/parse_node.c-              (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
./src/backend/parser/parse_node.c:               errmsg("target lists can have at most %d entries",
./src/backend/parser/parse_node.c:                      MaxTupleAttributeNumber)));
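
(If both target lists count against that limit, the numbers line up: with id plus 832
data columns, the INSERT target list has 833 entries and the ON CONFLICT ... DO UPDATE
SET list another 833, i.e. 1666 > 1664, which would explain why the failure starts just
above 830 columns.)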

Thanks in advance for any suggestions.

Justin
#! /bin/sh
set -e

n=831
t=wide_upsert

psql -c "DROP TABLE IF EXISTS $t"

cols='id int'
vals='0'
sets='id=0'
for a in `seq -w 0 $n`
do
c="c$a int"
cols="$cols, c$a int"

vals="$vals, \$1"
sets="$sets, c$a=\$1"
done

echo $cols
psql -c "CREATE TABLE $t ($cols, PRIMARY KEY (id))"

set -x
psql -c "PREPARE x AS INSERT INTO $t VALUES ($vals) ON CONFLICT (id) DO UPDATE SET $sets; EXECUTE x(0)"



Re: [GENERAL] unnest-like pg_stats.most_common_values and pg_stats.most_common_freqs

2017-11-17 Thread Justin Pryzby
On Fri, Nov 17, 2017 at 09:32:23PM +0100, Luca Ferrari wrote:
> Hi all,
> maybe this is trivial, but I need an hint on a way to see a table form
> of the MCVs and MCFs out of pg_stats with a query. Is it possible to
> get a set of rows each with a most common value on one column and the
> corresponding column on the the other? (assuming I can cast the array
> of MCVs to the right type array)

I think you want something like this ?

postgres=# SELECT schemaname, tablename, attname, 
unnest(histogram_bounds::text::text[]), histogram_bounds FROM pg_stats LIMIT 9;
 pg_catalog | pg_pltemplate | tmplname    | plperl      | {plperl,plperlu,plpgsql,plpython2u,plpython3u,plpythonu,pltcl,pltclu}
 pg_catalog | pg_pltemplate | tmplname    | plperlu     | {plperl,plperlu,plpgsql,plpython2u,plpython3u,plpythonu,pltcl,pltclu}
 pg_catalog | pg_pltemplate | tmplname    | plpgsql     | {plperl,plperlu,plpgsql,plpython2u,plpython3u,plpythonu,pltcl,pltclu}
 pg_catalog | pg_pltemplate | tmplname    | plpython2u  | {plperl,plperlu,plpgsql,plpython2u,plpython3u,plpythonu,pltcl,pltclu}
 pg_catalog | pg_pltemplate | tmplname    | plpython3u  | {plperl,plperlu,plpgsql,plpython2u,plpython3u,plpythonu,pltcl,pltclu}
 pg_catalog | pg_pltemplate | tmplname    | plpythonu   | {plperl,plperlu,plpgsql,plpython2u,plpython3u,plpythonu,pltcl,pltclu}
 pg_catalog | pg_pltemplate | tmplname    | pltcl       | {plperl,plperlu,plpgsql,plpython2u,plpython3u,plpythonu,pltcl,pltclu}
 pg_catalog | pg_pltemplate | tmplname    | pltclu      | {plperl,plperlu,plpgsql,plpython2u,plpython3u,plpythonu,pltcl,pltclu}
 pg_catalog | pg_pltemplate | tmplhandler | plperl_call_handler | {plperl_call_handler,plperlu_call_handler,plpgsql_call_handler,plpython2_call_handler,plpython3_call_handler,plpython_call_handler,pltcl_call_handler,pltclu_call_handler}
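
For the most_common_vals/most_common_freqs pair from the original question, something
like this should work (a sketch, not verified here; the two arrays have equal length,
so the two unnest() calls in the target list stay in step):

postgres=# SELECT schemaname, tablename, attname,
   unnest(most_common_vals::text::text[]) AS most_common_value,
   unnest(most_common_freqs) AS most_common_freq
   FROM pg_stats LIMIT 9;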

Justin




[GENERAL] SELECT x'00000000F'::int leading zeros causes "integer out of range"

2017-02-24 Thread Justin Pryzby
Is this expected behavior ?

This works:

ts=# SELECT x'0000000F'::int;
int4|15

.. but an additional leading zero causes it to fail:

ts=# SELECT x'00000000F'::int;
ERROR:  22003: integer out of range
LOCATION:  bittoint4, varbit.c:1575


|/* Check that the bit string is not too long */
|if (VARBITLEN(arg) > sizeof(result) * BITS_PER_BYTE)
|   ereport(ERROR,
|   (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
|errmsg("integer out of range")));

Justin




Re: [GENERAL] How to add columns to view with dependencies

2017-04-16 Thread Justin Pryzby
On Sun, Apr 16, 2017 at 08:02:54PM -0700, Guyren Howe wrote:
> Seems like a simple question, but I’ve never found a good answer to this and 
> similar issues.
> 
> I would think it was safe to let me add columns to a view on which other 
> views depend, but Postgres won’t let me.
> 
> I can imagine ways of sort-of dealing with this. I might maintain a SQL file 
> with views to create in a suitable order, Then I could drop all views, edit 
> the definition of one, then run the file, but this is awfully tedious.
> 
> What is best practice in this situation?

If you're not re-ordering existing columns, you can use CREATE OR REPLACE VIEW
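
A minimal sketch (hypothetical view and table names), appending a column at the end of
the select list:

CREATE OR REPLACE VIEW v AS
    SELECT a, b,   -- the existing output columns: same names, types, and order
           c       -- new columns may only be appended at the end
    FROM t;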

Justin




[GENERAL] PG96 pg_restore connecting to PG95 causes ERROR: unrecognized configuration parameter "idle_in_transaction_session_timeout"

2017-05-05 Thread Justin Pryzby
When doing a dump+restore upgrade, it's commonly recommended to use the later
version of pg_restore:

https://www.postgresql.org/docs/current/static/upgrading.html
"It is recommended that you use the pg_dump and pg_dumpall programs from the
newer version of PostgreSQL, to take advantage of enhancements that might have
been made in these programs. Current releases of the dump programs can read
data from any server version back to 7.0."

In the immediate case, I was loading data from PG95 dumps into a PG95 server (not
an upgrade), using PG96 pg_restore, and getting:

ERROR:  unrecognized configuration parameter 
"idle_in_transaction_session_timeout"

I can't see anybody has raised that issue before.  Should pg_restore check the
remote server version and avoid sending commands not expected to be understood?

(Yes, I know and use pg_upgrade, however I'm currently migrating a DB between
servers and this procedure will allow doing so with ~30min
downtime...pg_upgrade to 9.6 will be done afterwards, which is why PG96
pg_upgrade is installed).

Thanks,
Justin




Re: [GENERAL] Partitioning and Table Inheritance

2017-05-12 Thread Justin Pryzby
On Mon, May 08, 2017 at 10:12:18AM -0700, Paul A Jungwirth wrote:
> I'm working on a problem where partitioning seems to be the right
> approach, but we would need a lot of partitions (say 10k or 100k).
> Everywhere I read that after ~100 child tables you experience
> problems. I have a few questions about that:

We use partitioning, previously one child per month (with history of 1-6
years); I tried using one child per day, and that caused issues.

For us, planning time is pretty unimportant (~1sec would be acceptable 99% of
the time) but I recall seeing even more than that.  I changed to using daily
granularity for only our largest tables, which seems to be working fine for the
last ~9months.  So the issue isn't just "number of children" but "total number
of tables".  I believe the problem may have been due to large
pg_statistic/pg_attribute and similar tables taking more than a few 100MBs, and
potentially no longer fitting in buffer cache.

> 3. Is it true that query planning time should scale linearly as I add
> more child tables?

I believe it's understood to be super-linear:
https://www.postgresql.org/message-id/26761.1483989025%40sss.pgh.pa.us
https://www.postgresql.org/message-id/4188.1298960419%40sss.pgh.pa.us

Justin




Re: [GENERAL] Help: Installing 9.6 breaks local connections to 9.2 on Centos 6.9

2017-05-15 Thread Justin Pryzby
On Mon, May 15, 2017 at 12:55:48PM -0700, Ken Tanzer wrote:
> Hi.  On a Centos 6.9 server (in the cloud with Rackspace), I'm wanting to
> install PGDG 9.6 alongside the already-running 9.2.  After installing the
> 9.6 packages (and even before doing an initdb), I am no
> longer able to make a local connection to the 9.2 server.  Instead I get
> the message:

See eg.
https://www.postgresql.org/message-id/21044.1326496...@sss.pgh.pa.us
https://www.postgresql.org/message-id/0a21bc93-7b9c-476e-aaf4-0ff71708e...@elevated-dev.com

I'm guessing you upgraded the client libraries, which probably changed the
(default) socket path.

Your options are to specify the path to the socket (maybe in /tmp for the running
PG92?), change to a TCP connection, or set the server option
unix_socket_directories.

Justin




Re: [GENERAL] Current best practice for maximum shared_buffers settings on big hardware?

2017-05-24 Thread Justin Pryzby
On Wed, May 24, 2017 at 08:24:15AM -0400, Bill Moran wrote:
> ... I tried allocating 64G to shared buffers and we had a bunch of problems
> with inconsistent performance, including "stall" periods where the database
> would stop responding for 2 or 3 seconds.  After trying all sorts of tuning
> options that didn't help, the problem finally went away after reducing
> shared_buffers to 32G. I speculated, at the time, that the shared buffer code
> hit performance issues managing that much memory, but I never had the
> opportunity to really follow up on it.

I think you were hitting an issue related to "kernel shared memory" and maybe
"transparent huge pages".

I was able to work around similar issues with ~32GB allocations to QEMU/KVM guests
running on something like kernel 3.13.  I didn't spend time to narrow down the
problem, and I don't know if the behavior is better with recent kernels.

/sys/kernel/mm/ksm/run=2
... and maybe also:
/sys/kernel/mm/transparent_hugepage/defrag=madvise
/sys/kernel/mm/ksm/merge_across_nodes=0 

Justin




Re: [GENERAL] Deadlock with single update statement?

2017-06-10 Thread Justin Pryzby
On Sat, Jun 10, 2017 at 03:16:26PM -0400, Tom Lane wrote:
> Rob Nikander  writes:
> >> On Jun 10, 2017, at 10:34 AM, Tom Lane  wrote:
> >> […] but it'd be better to adjust the query to ensure a deterministic
> >> update order.
> 
> > Thank you for the answer. Since `update` has no `order by` clause, I’m 
> > guessing there’s no way to do this with just the `update` statement, and 
> > that I should use `select … order by … for update’ for this.
> 
> Yeah, that's one easy answer.  You can probably force it with a sub-select
> in the UPDATE, as well, but it will take more thought.

I have a question about this ... I'm already using SELECT .. FOR UPDATE,
prefixed with "SET synchronize_seqscans=off".

..using "ORDER BY ctid" since our SELECT statements for a given table may
differ (previously I'd tried using "ORDER BY 1,2,...n" for each key column).

And yet I still see deadlocks.

Example:

session_id|593be2ff.68f7
log_time|2017-06-10 01:16:37.786-11
pid|26871
detail|Process 26871 waits for ShareLock on transaction 13693505; blocked by 
process 26646.
Process 26646 waits for ShareLock on transaction 13693504; blocked by process 
26871.
Process 26871: SELECT db_column_name,table_name FROM 
huawei_m2000_counter_details ORDER BY ctid FOR UPDATE
Process 26646: SELECT db_column_name,table_name FROM 
huawei_m2000_counter_details ORDER BY ctid FOR UPDATE
session_line|2923
message|deadlock detected

session_id|593be2ff.68f7
log_time|2017-06-10 01:16:27.633-11
pid|26871
detail|
session_line|2917
message|statement: BEGIN

session_id|593be2ff.68f7
log_time|2017-06-10 01:16:27.638-11
pid|26871
detail|
session_line|2918
message|statement: SET synchronize_seqscans=off

session_id|593be2ff.68f7
log_time|2017-06-10 01:16:27.64-11
pid|26871
detail|
session_line|2919
message|statement: SELECT db_column_name,table_name FROM 
huawei_m2000_counter_details ORDER BY ctid FOR UPDATE

session_id|593be2ff.68f7
log_time|2017-06-10 01:16:28.994-11
pid|26871
detail|Process holding the lock: 29467. Wait queue: 26871, 26646.
session_line|2920
message|process 26871 still waiting for ShareLock on transaction 13693494 after 
1000.070 ms

session_id|593be2ff.68f7
log_time|2017-06-10 01:16:36.786-11
pid|26871
detail|
session_line|2921
message|process 26871 acquired ShareLock on transaction 13693494 after 8791.608 
ms

session_id|593be2ff.68f7
log_time|2017-06-10 01:16:37.786-11
pid|26871
detail|Process holding the lock: 26646. Wait queue: .
session_line|2922
message|process 26871 detected deadlock while waiting for ShareLock on 
transaction 13693505 after 1000.080 ms

2nd process:

session_id|593be2fd.6816
log_time|2017-06-10 01:16:28.947-11
pid|26646
detail|
session_line|2301
message|statement: BEGIN

session_id|593be2fd.6816
log_time|2017-06-10 01:16:28.949-11
pid|26646
detail|
session_line|2302
message|statement: SET synchronize_seqscans=off

session_id|593be2fd.6816
log_time|2017-06-10 01:16:28.949-11
pid|26646
detail|
session_line|2303
message|statement: SELECT db_column_name,table_name FROM 
huawei_m2000_counter_details ORDER BY ctid FOR UPDATE

session_id|593be2fd.6816
log_time|2017-06-10 01:16:29.956-11
pid|26646
detail|Process holding the lock: 29467. Wait queue: 26871, 26646.
session_line|2304
message|process 26646 still waiting for ShareLock on transaction 13693494 after 
1000.076 ms

session_id|593be2fd.6816
log_time|2017-06-10 01:16:36.786-11
pid|26646
detail|
session_line|2305
message|process 26646 acquired ShareLock on transaction 13693494 after 7829.560 
ms

session_id|593be2fd.6816
log_time|2017-06-10 01:16:37.833-11
pid|26646
detail|
session_line|2306
message|statement: RESET synchronize_seqscans

Thanks in advance for any clue or insight.

Justin




[GENERAL] workaround for column cross-correlation

2017-06-12 Thread Justin Pryzby
I know PG 10 will have support "CREATE STATISTICS.." for this..

..but I wondered if there's a recommended workaround in earlier versions ?

We had two issues:

1) improper estimate caused poor plans (nested loops due to ::date, GROUP BY,
cross-column stats, and maybe more).
2) memory explosion in hash join (due to poor estimate?) caused OOM.

I tried ROW/ARRAY comparison for the correlated columns which seems to do what
I want, acting as a single comparison, rather than two, independent
comparisons.
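
For example, instead of two independent equality conditions between the correlated
columns (a and b here stand for the two sides of the self-join shown below):

   ... AND a.sect_id = b.neigh_sect_id AND a.neigh_sect_id = b.sect_id

the join condition becomes a single row comparison:

   ... AND ROW(a.sect_id, a.neigh_sect_id) = ROW(b.neigh_sect_id, b.sect_id)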

->  Merge Join  (cost=57811.81..62290.04 rows=114606 width=58) (actual time=5174.556..5698.323 rows=204672 loops=1)
      Merge Cond: ((eric_enodeb_cellrelation_metrics.start_time = eric_enodeb_cellrelation_metrics_1.start_time) AND ((ROW(eric_enodeb_cellrelation_metrics.sect_id, eric_enodeb_cellrelation_metrics.neigh_sect_id)) = (ROW(eric_enodeb_cellrelation_metrics_1.neigh_sect_id, eric_enodeb_cellrelation_metrics_1.sect_id))))

Thanks,
Justin




Re: [GENERAL] workaround for column cross-correlation

2017-06-12 Thread Justin Pryzby
On Mon, Jun 12, 2017 at 08:46:57PM -0700, Jeff Janes wrote:
> On Mon, Jun 12, 2017 at 8:17 PM, Justin Pryzby  wrote:
> 
> > I know PG 10 will have support "CREATE STATISTICS.." for this..
> >
> > ..but I wondered if there's a recommended workaround in earlier versions ?
> 
> Not without seeing the query

With my workaround:
ts=# explain ANALYZE SELECT t1.sect_id, t1.start_time as period, sum (1) FROM 
enodeb_ncell_view t1, enodeb_ncell_view inc
WHERE ((t1.start_time >= '2017-04-30 00:00:00' AND t1.start_time < '2017-05-01 
00:00:00'))
AND ((inc.start_time >= '2017-04-30 00:00:00' AND inc.start_time < '2017-05-01 
00:00:00'))
AND t1.start_time = inc.start_time
AND ROW((t1.sect_id,t1.neigh_sect_id))= ROW((inc.neigh_sect_id,inc.sect_id))
GROUP BY t1.sect_id, period;
HashAggregate  (cost=63149.59..63371.74 rows=22215 width=10) (actual time=80092.652..80097.521 rows=22464 loops=1)
...

Without:
ts=# explain ANALYZE SELECT t1.sect_id, t1.start_time as period, sum (1) FROM 
enodeb_ncell_view t1, enodeb_ncell_view inc
WHERE ((t1.start_time >= '2017-04-30 00:00:00' AND t1.start_time < '2017-05-01 
00:00:00'))
AND ((inc.start_time >= '2017-04-30 00:00:00' AND inc.start_time < '2017-05-01 
00:00:00'))
AND t1.start_time = inc.start_time
AND t1.sect_id=inc.neigh_sect_id AND t1.neigh_sect_id=inc.sect_id
GROUP BY t1.sect_id, period;
GroupAggregate  (cost=57847.32..62265.54 rows=402 width=10) (actual time=694.186..952.744 rows=22464 loops=1)
...

This is a small inner subquery of a larger report - sum(1) is a placeholder for
other aggregates I've stripped out.

> > 2) memory explosion in hash join (due to poor estimate?) caused OOM.
> 
> As far as I know, the only way a hash join should do this is if the join
> includes a huge number of rows with exactly the same 32 bit hash codes.
> Otherwise, it should spill to disk without causing OOM.  Hash aggregates,
> on the other hand, are a different matter.

That's almost certainly what I meant.

   ->  Subquery Scan on data_rx_enb  (cost=3409585.76..3422861.74 rows=663799 width=20) (actual time=510475.987..512069.064 rows=2169821 loops=1)
         ->  HashAggregate  (cost=3409585.76..3416223.75 rows=663799 width=16) (actual time=510475.984..511650.337 rows=2169821 loops=1)
               Group Key: eric_enodeb_cell_metrics_1.site_id, eric_enodeb_cell_metrics_1.start_time

Thanks,
Justin




[GENERAL] inheritance children with integer columns of differing width

2017-06-19 Thread Justin Pryzby
I wondered if anyone had considered allowing inheritance children to have
different column types than the parent (and each other).

I'm thinking of the trivial (?) case of smallint/int/bigint.  Reason is that
when we load data which exceeds the threshold for the current data type we have
to promote the column, rewriting the table, which can take a very long time,
and use very large amount of space.  We've had to start uninheriting all but
the most recent children before ALTERing to make it more reasonable (and then
separately ALTER+reinherit each child) - it's especially painful when a key
column grows beyond "int", and many tables need to be altered all at once..
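
For reference, the dance looks roughly like this (hypothetical parent t with key
column id and historic child t_2016):

ALTER TABLE t_2016 NO INHERIT t;
ALTER TABLE t ALTER COLUMN id TYPE bigint;      -- rewrites the parent and still-attached children
ALTER TABLE t_2016 ALTER COLUMN id TYPE bigint; -- each historic child rewritten separately, later
ALTER TABLE t_2016 INHERIT t;                   -- types must match again before re-inheriting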

It seems to me this is what would happen if one were to UNION ALL the children,
although I see the plan differs with differering type:

pryzbyj=# create table ii(i bigint);
pryzbyj=# create table i(i int);
pryzbyj=# explain SELECT * FROM ii UNION ALL SELECT * FROM i;
Append  (cost=0.00..110.80 rows=4540 width=6)
  ->  Seq Scan on ii  (cost=0.00..31.40 rows=2140 width=8)
  ->  Subquery Scan on "*SELECT* 2"  (cost=0.00..58.00 rows=2400 width=4)
->  Seq Scan on i  (cost=0.00..34.00 rows=2400 width=4)

pryzbyj=# alter table i ALTER i TYPE bigint;
ALTER TABLE
pryzbyj=# explain SELECT * FROM ii UNION ALL SELECT * FROM i;
Append  (cost=0.00..62.80 rows=4280 width=8)
  ->  Seq Scan on ii  (cost=0.00..31.40 rows=2140 width=8)
  ->  Seq Scan on i  (cost=0.00..31.40 rows=2140 width=8)

If it were allowed for children to have int columns with differing widths, then
to promote an int column, we would uninherit the historic children, ALTER the
parent (and most recent tables), and then reinherit the children (unless ALTER
on its own avoided rewriting tables in such a case).

Justin




[GENERAL] huge RAM use in multi-command ALTER of table heirarchy

2017-07-06 Thread Justin Pryzby
I've seen this before while doing SET STATISTICS on a large number of columns
using xargs, but it just came up again while doing ADD of a large number of columns.
Memory use seems to be roughly linear in the number of children but superlinear WRT
columns.  I think it has to do with catalog updates / cache invalidations scaling with
ALTERs*children*columns?

32 cols and 2 children=> 12MB
256 cols and 11 children  => 74MB
256 cols and 111 children => 582MB
512 cols and 11 children  => 229MB

(in our "huge" case, there were ~1600 columns and maybe even more children)

I was testing with this command
PGHOST=/tmp PGPORT= sh -ec 'for maxcols in 512 ; do 
~/src/postgresql.install/bin/postgres -D ~/src/postgres.dat -c port= & 
sleep 4; cols=$(for d in `seq 1 $maxcols`; do echo "ADD c$d int,"; done 
|xargs); PGOPTIONS="-c client_min_messages=warning" psql postgres -qc "DROP 
TABLE t CASCADE" || [ $? -eq 1 ]; psql postgres -qc "CREATE TABLE t()"; for c 
in `seq 1 11`; do psql postgres -qc "CREATE TABLE c$c() INHERITS(t)"; done; for 
d in `seq 1 $maxcols`; do echo "ALTER TABLE t ADD c$d int;"; done 
|PGOPTIONS="-c client_min_messages=DEBUG3 -c log_statement_stats=on" psql 
postgres -c "ALTER TABLE t ${cols%,}" 2>/tmp/pg.err2; 
~/src/postgresql.install/bin/pg_ctl -swD ~/src/postgres.dat stop; done'

..and log_statement_stats with a variation on the getrusage patch here:
https://www.postgresql.org/message-id/20170615145824.GC15684%40telsasoft.com

Justin




Re: [GENERAL] Partitioning (constraint exclusion involving joins)

2017-07-25 Thread Justin Pryzby
On Tue, Jul 25, 2017 at 06:21:43PM +0530, Krithika Venkatesh wrote:
> I have a table that is partitioned on a numeric column (ID).
> 
> Partitioning works when I query the table with no joins.
> 
> SELECT * FROM TABLE A a WHERE ID IN (SELECT ID FROM TABLE B b WHERE
> CREATED_TS = CURRENT_TIMESTAMP)
> 
> Partitioning doesn't work when I do join.
> 
> SELECT A.* FROM TABLE A a INNER JOIN TABLE B b ON  a.ID = b.ID.

I think you mean "constraint exclusion doesn't work when you do a join",

which is because it only works on simple constant values that are known before the
planner gets to see them:

main=# explain SELECT COUNT(1) FROM eric_enodeb_metrics WHERE start_time>now(); 
-- -'999 minutes'::interval;
                                                  QUERY PLAN
---------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=62.44..62.45 rows=1 width=8)
   ->  Append  (cost=0.00..62.40 rows=14 width=0)
         ->  Seq Scan on eric_enodeb_metrics  (cost=0.00..0.00 rows=1 width=0)
               Filter: (start_time > now())
         ->  Index Only Scan using eric_enodeb_201607_idx on eric_enodeb_201607  (cost=0.42..4.44 rows=1 width=0)
               Index Cond: (start_time > now())
         ->  Index Only Scan using eric_enodeb_201608_idx on eric_enodeb_201608  (cost=0.42..4.44 rows=1 width=0)
               Index Cond: (start_time > now())
         ->  Index Only Scan using eric_enodeb_201609_idx on eric_enodeb_201609  (cost=0.42..4.44 rows=1 width=0)
               Index Cond: (start_time > now())
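
By contrast, with a literal constant the planner can exclude partitions at plan time,
e.g. (a sketch of the same query with a hard-coded date):

main=# explain SELECT COUNT(1) FROM eric_enodeb_metrics WHERE start_time > '2017-07-01';

where only the parent and the children whose CHECK constraints overlap that range
would appear under the Append node.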

https://www.postgresql.org/docs/current/static/ddl-partitioning.html
|The following caveats apply to constraint exclusion:
| Constraint exclusion only works when the query's WHERE clause contains
|constants (or externally supplied parameters). For example, a comparison
|against a non-immutable function such as CURRENT_TIMESTAMP cannot be optimized,
|since the planner cannot know which partition the function value might fall
|into at run time.
[..]


.. and see an early mail on its implementation, here:
https://www.postgresql.org/message-id/1121251997.3970.237.camel@localhost.localdomain

Justin




Re: [GENERAL] partitioning question

2017-07-30 Thread Justin Pryzby
On Mon, Jul 31, 2017 at 09:15:29AM +1000, Alex Samad wrote:
> Hi
> 
> I was about to partition a large (?) approx 3T of data 2B rows into
> partition tables but broken up into YYYYMM ...
> 
> Now I have been reading about limiting the number of partitions otherwise
> it could slow down the parser.
> 
> My reasoning for limiting to YYYYMM was that most of the request would be
> monthly based.
> 
> Should I be making the partitioning based on YYYY instead and have lots
> more indexes.
> 
> If I have an index on the timestamp field will it help limiting to YYYYMM ?

The major advantages of partitions are enumerated here:
https://www.postgresql.org/docs/10/static/ddl-partitioning.html#ddl-partitioning-overview

For your case, it might be that seq scans of an entire "monthly" partition turn
out to be very advantageous, compared with an index scan (or a seq scan of the
entire 3TB of data).

Also, DROPing the oldest partition every month is commonly much more
efficient than DELETEing it..
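
For example (hypothetical monthly child and timestamp column):

DROP TABLE mytable_201707;
-- vs:
DELETE FROM mytable WHERE ts >= '2017-07-01' AND ts < '2017-08-01';

The DELETE leaves dead tuples behind for vacuum and bloats the indexes; the DROP is
nearly instant.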

There are warnings like these:

|All constraints on all partitions of the master table are examined during
|constraint exclusion, so large numbers of partitions are likely to increase
|query planning time considerably. Partitioning using these techniques will work
|well with up to perhaps a hundred partitions; don't try to use many thousands
|of partitions.

Unless you have 100s of years of data I don't think it would be a problem.

For us, having hundreds of partitions hasn't been an issue (planning time is
insignificant for our analytic report queries).  But there's an overhead to
partitions and at some point the cost becomes significant.  (Actually, I think
one cost which *did* hit us, while experimenting with *daily* partition
granularity of every table, was probably due to very large pg_statistics and
pg_attributes tables, which no longer fit in buffer cache).

Justin




Re: [GENERAL] partitioning question

2017-07-30 Thread Justin Pryzby
On Mon, Jul 31, 2017 at 10:25:54AM +1000, Alex Samad wrote:
> I note that you link to P10 and I am currently looking at 9.6.  The changes
> do look nice for partitioning for p10.

Yes sorry, pg10 is beta - avoid using it except for testing purposes.

> I will add currently we don't delete anything, we will keep adding to it.
> 
> Also I am thinking my insert trigger becomes a lot smaller and easier if I
> leave it at yearly.
Note: the trigger function can either be a static function updated monthly (to
handle the next month), preferably with the most recent months tested first (so
a typical newly-inserted row only goes through one if/case test).

Alternately, the trigger function can dynamically compute the table into which
to insert using plpgsql "format()" similar to here:
https://www.postgresql.org/docs/9.1/static/plpgsql-statements.html#PLPGSQL-QUOTE-LITERAL-EXAMPLE
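
A rough sketch of the dynamic variant (all names made up; the static version is the
same idea with hard-coded child names in an IF/ELSIF chain):

CREATE OR REPLACE FUNCTION mytable_insert_trigger() RETURNS trigger AS $$
BEGIN
    EXECUTE format('INSERT INTO %I SELECT ($1).*',
                   'mytable_' || to_char(NEW.ts, 'YYYYMM')) USING NEW;
    RETURN NULL;  -- the row was routed to a child, so skip the insert into the parent
END
$$ LANGUAGE plpgsql;

CREATE TRIGGER mytable_insert BEFORE INSERT ON mytable
    FOR EACH ROW EXECUTE PROCEDURE mytable_insert_trigger();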

Justin




Re: [GENERAL] Count column with name 'count' returns multiple rows. Why?

2017-08-18 Thread Justin Pryzby
On Fri, Aug 18, 2017 at 01:01:45PM +0200, Rob Audenaerde wrote:
> I don't understand why this query:
> 
>select count(base.*) from mytable base;
> 
> does return multiple rows.
> 
>select count(1) from mytable base;
> 
> returns the proper count.
> 
> There is a column with the name 'count'.
> 
> Can anyone please explain this behaviour?

https://www.postgresql.org/docs/9.6/static/sql-expressions.html#SQL-EXPRESSIONS-FUNCTION-CALLS
https://www.postgresql.org/docs/9.6/static/rowtypes.html#ROWTYPES-USAGE




Re: [GENERAL] Count column with name 'count' returns multiple rows. Why?

2017-08-18 Thread Justin Pryzby
On Fri, Aug 18, 2017 at 10:47:37PM +0200, Peter J. Holzer wrote:
> On 2017-08-18 06:37:15 -0500, Justin Pryzby wrote:
> > On Fri, Aug 18, 2017 at 01:01:45PM +0200, Rob Audenaerde wrote:
> > > I don't understand why this query:
> > > 
> > >select count(base.*) from mytable base;
> > > 
> > > does return multiple rows.
> > > 
> > >select count(1) from mytable base;
> > > 
> > > returns the proper count.
> > > 
> > > There is a column with the name 'count'.
> > > 
> > > Can anyone please explain this behaviour?
> > 
> > https://www.postgresql.org/docs/9.6/static/sql-expressions.html#SQL-EXPRESSIONS-FUNCTION-CALLS
> > https://www.postgresql.org/docs/9.6/static/rowtypes.html#ROWTYPES-USAGE
> 
> Maybe I overlooked it, but I don't see anything in those pages which
> explains why «count» is parsed as a column name in the first example and
> as a function name in the second.
> 
> Nor do I see what «count(base.*)» is supposed to mean. It seems to be
> completely equivalent to just writing «count», but the part in
> parentheses is not ignored: It has to be either the table name or the
> table name followed by «.*». Everything else I tried either led to a
> syntax error or to «count» being recognized as a function. So apparently
> columnname open-parenthesis tablename closed-parenthesis is a specific
> syntactic construct, but I can't find it documented anywhere.

| Another special syntactical behavior associated with composite values is that
|we can use functional notation for extracting a field of a composite value. The
|simple way to explain this is that the notations field(table) and table.field
|are interchangeable. For example, these queries are equivalent:


| Tip: Because of this behavior, it's unwise to give a function that takes a
|single composite-type argument the same name as any of the fields of that
|composite type. If there is ambiguity, the field-name interpretation will be
|preferred, so that such a function could not be called without tricks. One way
|to force the function interpretation is to schema-qualify the function name,
|that is, write schema.func(compositevalue).

pryzbyj=# select base.count from s91 base;
 count 
---
 1
 2
 3
(3 rows)

pryzbyj=# select pg_catalog.count(base.*) from s91 base;
count | 3




Re: [GENERAL] "Canceling authentication due to timeout" with idle transaction and reindex

2017-09-15 Thread Justin Pryzby
On Fri, Sep 15, 2017 at 12:25:58PM +0200, s19n wrote:

> 1. with "\set AUTOCOMMIT off" in my psqlrc, issue a
> "SELECT * FROM pg_stat_activity;" and leave it there
This probably obtains a read lock on some shared, system tables/indices..

> 2. in a different connection, issue a database REINDEX (of any database
> different from 'postgres')
.. and this waits to get an EXCLUSIVE lock on those tables/indexes, but has to
wait on the read lock;

> * Any further attempt to create new connections to the server, to any
> database, does not succeed and leads to a "FATAL: canceling authentication
> due to timeout" in the server logs.
.. and logins are apparently waiting on the reindex (itself waiting to get
exclusive) lock.

You can look at the locks (granted vs waiting) in SELECT * FROM pg_locks

But you might consider: 1) looping around tables/indexes rather than "REINDEX
DATABASE", and setting a statement_timeout=9s for each REINDEX statement;
and/or 2) using pg_repack, though I don't think it handles system tables.

Justin




Re: [GENERAL] "Canceling authentication due to timeout" with idle transaction and reindex

2017-09-15 Thread Justin Pryzby
On Fri, Sep 15, 2017 at 06:49:06AM -0500, Ron Johnson wrote:
> On 09/15/2017 06:34 AM, Justin Pryzby wrote:
> [snip]
> >But you might consider: 1) looping around tables/indices rather than "REINDEX
> >DATABASE", and then setting a statement_timeout=9s for each REINDEX 
> >statement;
> 
> Is there a way to do that within psql?  (Doing it from bash is trivial, but
> I'd rather do it from SQL.)

Not that I know of, but it wouldn't help me, since our script also calls pg_repack
(for indexes on system and some other tables), and also has logic to handle
historic partition tables differently.

Justin




Re: [GENERAL] Any known issues Pg 9.3 on Ubuntu Xenial kernel 4.4.0?

2017-09-20 Thread Justin Pryzby
On Wed, Sep 20, 2017 at 01:14:14PM -0500, Jerry Sievers wrote:
> Be curious to hear of issues encountered and particular to eager to know
> if disabling any kernel 4.x features helped.

What was the old kernel/OS ?  wheezy / kernel 3.x ?

Perhaps try these ideas ?
https://www.postgresql.org/message-id/20170718180152.GE17566%40telsasoft.com

Justin




Re: [GENERAL] core system is getting unresponsive because over 300 cpu load

2017-10-10 Thread Justin Pryzby
On Tue, Oct 10, 2017 at 01:40:07PM -0700, pinker wrote:
> Hi to all!
> 
> We've got problem with a very serious repetitive incident on our core
> system. Namely, cpu load spikes to 300-400 and the whole db becomes
> unresponsive. From db point of view nothing special is happening, memory
> looks fine, disks io's are ok and the only problem is huge cpu load. Kernel
> parameters that are increasing with load are always the same:

> * disabled transparent huge pages (they were set before unfortunately to
> 'always')

Did you also try disabling KSM ?
echo 2 |sudo tee /sys/kernel/mm/ksm/run

I believe for us that was affecting a postgres VM(QEMU/KVM) and maybe not
postgres itself.  Worth a try ?

https://www.postgresql.org/message-id/20170718180152.GE17566%40telsasoft.com

Justin




Re: [GENERAL] Determine size of table before it's committed?

2017-10-11 Thread Justin Pryzby
On Wed, Oct 11, 2017 at 10:43:26AM -0300, Seamus Abshere wrote:
> I've had an `INSERT INTO x SELECT FROM [...]` query running for more
> then 2 days.
> 
> Is there a way to see how big x has gotten? Even a very rough estimate
> (off by a gigabyte) would be fine.

On linux:

Run ps -fu postgres (or SELECT pid, query FROM pg_stat_activity) and
look at: ls -l /proc/PID/fd

writing to XX.22 (segment files are 1GB each) means it's written ~22GB.

You can also SELECT relfilenode FROM pg_class WHERE oid='x'::regclass (or
relname='x').

Or try using strace (but beware I've seen its interruption to syscalls change
the behavior of the program being straced).

Justin




Re: [GENERAL] Preventing psql from attempting to access ~/.pgpass file.

2017-10-16 Thread Justin Pryzby
On Tue, Oct 17, 2017 at 09:06:59AM +0300, Allan Kamau wrote:
> Is there a way to instruct psql not to try reading ~/.pgpass file?

https://www.postgresql.org/docs/current/static/libpq-envars.html
PGPASSFILE behaves the same as the passfile connection parameter.
passfile
Specifies the name of the file used to store passwords (see Section 33.15). 
Defaults to ~/.pgpass, or %APPDATA%\postgresql\pgpass.conf on Microsoft 
Windows. (No error is reported if this file does not exist.)

https://www.postgresql.org/docs/9.6/static/libpq-envars.html
PGPASSFILE specifies the name of the password file to use for lookups. If not 
set, it defaults to ~/.pgpass (see Section 31.15).

verifying it doesn't access the default:
pryzbyj@pryzbyj:~$ echo quit |PGPASSFILE=/nonextant strace psql 2>&1 |grep -E 
'nonex|pgpass'
stat("/nonextant", 0x7fffbd13c9f0)  = -1 ENOENT (No such file or directory)
pryzbyj@pryzbyj:~$ 




Re: [GENERAL] Weird performance difference

2017-10-20 Thread Justin Pryzby
On Fri, Oct 20, 2017 at 03:08:26PM -0800, Israel Brewster wrote:
> Summary: the following query takes around 12 seconds on my test machine. On 
> my production machine, it's at half an hour and counting. What's going on?
> 
> which, when run on my test server, has this explain analyze output: 
> https://explain.depesz.com/s/4piv . Around 
> 12 second runtime, which isn't too bad (in the grand scheme of things), 
> although there is probably room for improvement.

Are these casts to ::date really needed?  (Alternately, do you have an
index on column::date ?)
|WHERE outtime::date>='2017-01-01'
|ON outtime::date BETWEEN oag_schedules.startdate
|AND outtime::date BETWEEN oag_batches.eff_from
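
If outtime is timestamp without time zone, an expression index would let those
predicates use an index directly, e.g. (a sketch; the cast from timestamptz to date
is not immutable, so this only works for the timezone-less type):

CREATE INDEX CONCURRENTLY legdetail_outtime_date ON legdetail ((outtime::date));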

The problem is clearly here:

 Merge Join  (cost=30,604.12..31,301.12 ROWS=1 width=76) (actual time=1,153.883..9,812.434 ROWS=3,420,235 loops=1)
    Merge Cond: (((oag_schedules.flightnum)::text = (legdetail.flightnum)::text) AND ((oag_schedules.origin)::text = (legdetail.legfrom)::text))
    Join Filter: (((legdetail.outtime)::date >= oag_schedules.startdate) AND ((legdetail.outtime)::date <= COALESCE(oag_schedules.enddate, 'infinity'::date)) AND (date_part('isodow'::text, ((legdetail.outtime)::date)::timestamp without time zone) = ANY ((oag_schedules.frequency)::double precision[])))
    ROWS REMOVED BY JOIN FILTER: 6822878

Can you send "explain" (not explain analyze) for the production server?

And \d for those tables.

And/or EXPLAIN ANALYZE for a query with shorter date range on production (to
confirm it has a similar problem in rowcount estimation).

You can try munging the query to move/change the "Join Filter" components of
the query to see which one is contributing most to the rowcount estimate being
off by a factor of 3e6.

Justin




[GENERAL] parray_gin and \d errors in PG10

2017-10-22 Thread Justin Pryzby
After installing parray_gin extension and pg_upgrading another instance,
\d is failing like so:

[pryzbyj@database ~]$ psql ts -c '\d pg_class'
ERROR:  operator is not unique: "char"[] @> unknown
LINE 6:   (stxkind @> '{d}') AS ndist_enabled,
   ^
HINT:  Could not choose a best candidate operator. You might need to add 
explicit type casts.

[pryzbyj@database ~]$ psql ts -c '\d pg_class' -E
[...]
* QUERY **
SELECT oid, stxrelid::pg_catalog.regclass, 
stxnamespace::pg_catalog.regnamespace AS nsp, stxname,
  (SELECT pg_catalog.string_agg(pg_catalog.quote_ident(attname),', ')
   FROM pg_catalog.unnest(stxkeys) s(attnum)
   JOIN pg_catalog.pg_attribute a ON (stxrelid = a.attrelid AND
a.attnum = s.attnum AND NOT attisdropped)) AS columns,
  (stxkind @> '{d}') AS ndist_enabled,
  (stxkind @> '{f}') AS deps_enabled
FROM pg_catalog.pg_statistic_ext stat WHERE stxrelid = '1259'
ORDER BY 1;
**

ERROR:  operator is not unique: "char"[] @> unknown
LINE 6:   (stxkind @> '{d}') AS ndist_enabled,
   ^
HINT:  Could not choose a best candidate operator. You might need to add 
explicit type casts.

Thankfully this is still working:
ts=# \do @>

                                      List of operators
   Schema   | Name | Left arg type | Right arg type | Result type |              Description
------------+------+---------------+----------------+-------------+----------------------------------------
 pg_catalog | @>   | aclitem[]     | aclitem        | boolean     | contains
 pg_catalog | @>   | anyarray      | anyarray       | boolean     | contains
 pg_catalog | @>   | anyrange      | anyelement     | boolean     | contains
 pg_catalog | @>   | anyrange      | anyrange       | boolean     | contains
 pg_catalog | @>   | box           | box            | boolean     | contains
 pg_catalog | @>   | box           | point          | boolean     | contains
 pg_catalog | @>   | circle        | circle         | boolean     | contains
 pg_catalog | @>   | circle        | point          | boolean     | contains
 pg_catalog | @>   | jsonb         | jsonb          | boolean     | contains
 pg_catalog | @>   | path          | point          | boolean     | contains
 pg_catalog | @>   | polygon       | point          | boolean     | contains
 pg_catalog | @>   | polygon       | polygon        | boolean     | contains
 pg_catalog | @>   | tsquery       | tsquery        | boolean     | contains
 public     | @>   | hstore        | hstore         | boolean     | 
 public     | @>   | text[]        | text[]         | boolean     | text array contains compared by strict
(15 rows)

This query works fine when adding cast to text[]:

ts=# SELECT oid, stxrelid::pg_catalog.regclass, 
stxnamespace::pg_catalog.regnamespace AS nsp, stxname,
  (SELECT pg_catalog.string_agg(pg_catalog.quote_ident(attname),', ')
   FROM pg_catalog.unnest(stxkeys) s(attnum)
   JOIN pg_catalog.pg_attribute a ON (stxrelid = a.attrelid AND
a.attnum = s.attnum AND NOT attisdropped)) AS columns,
  (stxkind @> '{d}'::text[]) AS ndist_enabled,
  (stxkind @> '{f}'::text[]) AS deps_enabled
FROM pg_catalog.pg_statistic_ext stat WHERE stxrelid = '1259'
ORDER BY 1;
 oid | stxrelid | nsp | stxname | columns | ndist_enabled | deps_enabled 
-----+----------+-----+---------+---------+---------------+--------------
(0 rows)

Is this to be considered an issue with parray_gin or with psql ?

I don't think that's an urgent problem to fix, but if someone has a workaround
for \d I would appreciate if you'd pass it along :)

Thanks in advance
Justin




Re: [GENERAL] parray_gin and \d errors in PG10

2017-10-22 Thread Justin Pryzby
On Sun, Oct 22, 2017 at 02:36:12PM -0400, Tom Lane wrote:
> Justin Pryzby  writes:
> > After installing parray_gin extension and pg_upgrading another instance,
> > \d is failing like so:
> 
> > [pryzbyj@database ~]$ psql ts -c '\d pg_class'
> > ERROR:  operator is not unique: "char"[] @> unknown
> > LINE 6:   (stxkind @> '{d}') AS ndist_enabled,

> match the anyarray operator.  Possibly we could use
> 
>   (stxkind @> '{d}'::pg_catalog."char"[])
> 
> That works for me without parray_gin installed, but I wonder whether
> it fails due to ambiguity if you do have parray_gin installed.  In
> principle this'd still match the text[] @> text[] operator, and I'm
> not sure whether we have an ambiguity resolution rule that would
> prefer one over the other.

ts=# SELECT oid, stxrelid::pg_catalog.regclass, 
stxnamespace::pg_catalog.regnamespace AS nsp, stxname,
  (SELECT pg_catalog.string_agg(pg_catalog.quote_ident(attname),', ')
   FROM pg_catalog.unnest(stxkeys) s(attnum)
   JOIN pg_catalog.pg_attribute a ON (stxrelid = a.attrelid AND
a.attnum = s.attnum AND NOT attisdropped)) AS columns,
  (stxkind @> '{d}'::pg_catalog."char"[]) AS ndist_enabled,
  (stxkind @> '{d}'::pg_catalog."char"[]) AS deps_enabled
FROM pg_catalog.pg_statistic_ext stat WHERE stxrelid = '1259'
ORDER BY 1;
ERROR:  operator is not unique: "char"[] @> "char"[]
LINE 6:   (stxkind @> '{d}'::pg_catalog."char"[]) AS ndist_enabled,
   ^
HINT:  Could not choose a best candidate operator. You might need to add 
explicit type casts.

Justin




Re: [GENERAL] Why does a SELECT query cause "dirtied" buffers?

2017-10-27 Thread Justin Pryzby
On Fri, Oct 27, 2017 at 09:24:40PM +0200, Thomas Kellerer wrote:
> Under which situation does a SELECT query change a block?

https://wiki.postgresql.org/wiki/Hint_Bits




Re: [GENERAL] UPDATE syntax change (column-list UPDATE syntax fails with single column)

2017-10-30 Thread Justin Pryzby
On Mon, Oct 30, 2017 at 12:29:03PM -0400, Adam Brusselback wrote:
> I have some queries that were working in 9.6 which suddenly broke when
> moving to 10.
> 
> Digging in, the error i'm getting is: ERROR:  source for a
> multiple-column UPDATE item must be a sub-SELECT or ROW() expression

> So there was a change made, and you now cannot use the multi-column
> syntax if you're only updating a single column.  Was this intentional?

I found the same while testing during beta:
https://www.postgresql.org/message-id/flat/20170719174507.GA19616%40telsasoft.com#20170719174507.ga19...@telsasoft.com

Justin




Re: [GENERAL] explain analyze output: 0 rows, 1M loops

2017-11-01 Thread Justin Pryzby
On Wed, Nov 01, 2017 at 12:19:21PM -0700, David G. Johnston wrote:
> On Wed, Nov 1, 2017 at 11:59 AM, Scott Marlowe 
> wrote:
> 
> > So some of my output from an explain analyze here has a line that says
> > this:
> >
> > Index Scan using warranty_order_item_warranty_order_id_idx on
> > warranty_order_item woi_1 (cost=0.57..277.53 rows=6 width=137) (actual
> > time=0.110..0.111 rows=0 loops=1,010,844)
> >
> 
> Not my strong suit but, I'm pretty sure that reads: "The index was queried
> 1M+ times and none of those inqueries resulted in a record being found".
> IIUC I'd be wondering why some form of hash join wasn't used...

Except that:

https://www.postgresql.org/docs/current/static/using-explain.html
"... the loops value reports the total number of executions of the node, and
the actual time and ROWS VALUES SHOWN ARE AVERAGES PER-EXECUTION."

Justin




Re: [GENERAL] Possible bug: could not open relation with OID [numbers] SQL State: XX000

2017-11-01 Thread Justin Pryzby
On Wed, Nov 01, 2017 at 04:11:07PM -0400, Adam Brusselback wrote:
> I have something going on, and i'm not sure what is causing it.  I
> recently upgraded our development environment to PG10, and the error
> in the subject appeared with one of my analytical functions.

What relation is that ?  I guess it's harder to know since it's within a
function, but could you add NOTICE for all the relations you're outputting ?

Something like
ts=# SELECT 'alarms'::regclass::oid;
oid | 19575

Also, if you have log_statement=all (and maybe log_destination=stderr,csvlog),
can you send the log fragment for the line with error_severity='ERROR' ?
https://www.postgresql.org/docs/current/static/runtime-config-logging.html#runtime-config-logging-csvlog

Justin




Re: [GENERAL] EXPLAIN command just hangs...

2017-11-02 Thread Justin Pryzby
On Thu, Nov 02, 2017 at 08:51:23PM +, Rhhh Lin wrote:
> However, this query will run for days without completing. I suspect it has to 
> do with the timestamp predicate and lack of using an appropriate index access 
> path. This is what I need to verify/establish.

Perhaps the timestamp index is badly fragmented, and perhaps it would help to
reindex/cluster/pg_repack..

> So I try and perform a simple 'EXPLAIN ' in order to check what the 
> planner has for the execution of this query.
> And after approx. six hours waiting, nothing has returned. It is still 
> executing, but has not given me back my prompt (I can see the session is 
> still active).My understanding is that the simple EXPLAIN version does not 
> actually execute the query, so I do not understand why this is also 
> performing poorly/hanging/stuck? Any ideas?

Is explain "wait"ing ?  If you do "ps -fu postgres |grep EXPLAIN" does it say
"EXPLAIN waiting" ?

Or, if you "ps uww ThePID" does it show lots of CPU(or RAM) ?

If you do "SELECT * FROM pg_stat_activity WHERE pid=??" (from "ps" or from
SELECT pg_backend_pid() before starting "explain") does it show "active" state
or waiting ?

If it's waiting, you can see what it's waiting ON by looking at pg_locks..
Maybe like: SELECT c.query, * FROM pg_locks a JOIN pg_locks b USING(relation)
JOIN pg_stat_activity c ON b.pid=c.pid WHERE a.pid=?? (from ps)

Justin




Re: [GENERAL] EXPLAIN command just hangs...

2017-11-02 Thread Justin Pryzby
On Thu, Nov 02, 2017 at 09:13:05PM +, Rhhh Lin wrote:
> Yes, it may be an issue with the index, but I'd like to have some evidence
> towards that before dropping and recreating (It does not appear that bloat is
> a problem here or dead tuples either).

Why do you say those aren't an issue?  Just curious.

Have you vacuum or reindexed (or pg_repack) ?

How large are the table and index? \dt+ and \di+

> The reason I am very suspect of the timestamp column makeup is that if I
> remove that predicate from the EXPLAIN command and the actual query, both
> complete within seconds without issue. So I do know where the issue is (I
> just dont know what the issue is!).

It could be that you're hitting selfuncs.c:get_actual_variable_range() and the
extremes of the index point to many dead tuples (as I see Tom suggests).

You could strace the backend and see if it's reading (or writing??)
consecutively (hopefully with ample OS readahead) or randomly (without).

Justin




Re: [GENERAL] checkpoint and recovering process use too much memory

2017-11-02 Thread Justin Pryzby
On Fri, Nov 03, 2017 at 01:43:32AM +, tao tony wrote:
> I had an asynchronous streaming replication HA cluster.  Each node had 64G 
> memory.  pg is 9.6.2 and deployed on centos 6.
> 
> Last month the database was killed by OS kernel for OOM,the checkpoint 
> process was killed.

If you still have logs, was it killed during a large query?  Perhaps one using
a hash aggregate?

> I noticed the checkpoint process occupied memory for more than 20GB, and it was 
> growing every day.  In the hot-standby node, the recovering process occupied 
> memory as big as the checkpoint process.

"resident" RAM of a postgres subprocess is often just the fraction of
shared_buffers it has read/written.  The checkpointer must necessarily read all dirty
pages from s-b and write them out to disk (by way of the page cache), so that's why its
RSS is nearly 32GB.  And the recovery process is continuously writing into s-b.

> Now In the standby node,checkpoint and recovering process  used more then 
> 50GB memory as below,and I worried someday the cluster would be killed by OS 
> again.
> 
>PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEMTIME+  COMMAND
> 167158 postgres  20   0 34.9g  25g  25g S  0.0 40.4  46:36.86 postgres: 
> startup process   recovering 00040855004B
> 167162 postgres  20   0 34.9g  25g  25g S  0.0 40.2  17:58.38 postgres: 
> checkpointer process
> 
> shared_buffers = 32GB

Also, what is work_mem ?

Justin




Re: [GENERAL] EXPLAIN command just hangs...

2017-11-03 Thread Justin Pryzby
On Fri, Nov 03, 2017 at 09:12:02PM +, Rhhh Lin wrote:
> I checked for dead tuples against that particular table initially as I have 
> seen performance problems before in a related 'busy' environment which needed 
> its frequency of vacuuming to be increased. So I have a query to check for 
> table with dead tuples and this table is not showing any.
>
> I also came across the suggestion that bloat might be an issue on the 
> database and how to identify and address it and it does not appear to be 
> evident here also, so thats my thinking as to why these are not factors.

It'd be helpful if you'd paste the commands+output as you run them: "\dt+, \di+,
ps, vacuum, dead tuples, etc"

> I have vacuumed. I have not reindexed as it is a prod environment and I see 
> that... "REINDEX locks out writes but not reads of the index's parent 
> table.", so I may have to arrange this to avoid any interruptions (Although 
> currently, accessing this table seems completely problematic anyway!).

Perhaps you could look into pg_repack?  Note that by default it will kill
long-running transactions if it needs to in order to (briefly) obtain a
super-exclusive lock.

> The table is 691MB and the composite index(PK) is 723 MB.

It'd be useful to see the pg_stat_user_tables.* and pg_class.reltuples and
relpages for that table.  Also output from VACUUM VERBOSE or autovacuum logs,
if you have them (but note that vacuum does different work every time it's
re-run).
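
For example (substituting the real table name):

SELECT reltuples, relpages FROM pg_class WHERE relname='mytable';
SELECT * FROM pg_stat_user_tables WHERE relname='mytable';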

> My thinking now is I may need to export this data out to a staging area 
> whereby I can attempt to "play" with it without any repercussions...

I imagine that maybe this is related to the pattern of activity on that table
(specifically around the extreme values of its indexed columns).  So it'll be
hard to reproduce, and dumping and reloading the table (or just reindexing it
without reloading it at all) will probably temporarily improve or resolve the
issue.

You could try *adding* a new index on the timestamp column alone (CREATE INDEX
CONCURRENTLY).  Re-creating the index might conceivably be the solution in
the end, and it's what pg_repack does behind the scenes.
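
Something like (made-up table/column names):

CREATE INDEX CONCURRENTLY mytable_ts_idx ON mytable (ts);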

Justin




[GENERAL] dump time increase by 1h with new kernel

2009-09-27 Thread Justin Pryzby
When we upgraded from linux-2.6.24 to linux-2.6.27, our pg_dump
duration increased by 20%.  My first attempt at resolution was to boot
with elevator=deadline.  However that's actually the default IO
scheduler in both kernels.

The two dmesg's are at:
https://www.norchemlab.com/tmp/linux-2.6.24-22.45-server
https://www.norchemlab.com/tmp/linux-2.6.27-14.41-server

The database partition is: xfs / lvm / aic79xx / scsi.

Booting back into the .24 kernel brings the pg_dump down to 5 hours
(rather than 6, for daily 20GB output compressed by pg_dump -Fc).

Does anyone know what might be different which could cause such a
drastic change?

Thanks,
Justin
