Re: system catalog permissions

2018-02-26 Thread Tom Lane
PropAAS DBA  writes:
> We have a client which is segmenting their multi-tenant cluster 
> (PostgreSQL 9.6) by schema, however if one of their clients connects via 
> pgadmin they see ALL schemas, even the ones they don't have access to 
> read. I assume pgadmin is pulling the list from the system catalogs.

> What's the right/best practice approach? revoke all from public on 
> specific system catalog tables? Which tables?

Messing with the system catalog permissions is likely to break stuff
you'd rather not break.

PG generally doesn't assume that anything in the system catalogs is
sensitive.  If you don't want user A looking at user B's catalog
entries, give them separate databases, not just separate schemas.
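
For example, a minimal per-tenant setup might look like this (names are
hypothetical):

    CREATE DATABASE tenant_a OWNER tenant_a_role;
    REVOKE CONNECT ON DATABASE tenant_a FROM PUBLIC;
    GRANT CONNECT ON DATABASE tenant_a TO tenant_a_role;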

regards, tom lane



Re: Unexpected behavior with transition tables in update statement trigger

2018-02-27 Thread Tom Lane
Thomas Munro  writes:
> Here's a new version with tuplestore_select_read_pointer() added in
> another place where it was lacking, and commit message.  Moving to
> -hackers, where patches go.

Pushed, along with a regression test based on your example.
Unfortunately, this came in a bit too late for this week's releases :-(

        regards, tom lane



Re: index-only-scan when there is an index on all columns

2018-02-27 Thread Tom Lane
Hadi Moshayedi  writes:
> I am wondering why is it not using index-only-scan (which would use the
> cache better) and instead it does a bitmap scan?

Never experiment on an empty table and assume that the resulting plan
is the same as you'd get on a large table.

In this case, not only don't you have any meaningful amount of data
loaded, but the planner can see that none of the table's pages are
marked all-visible, meaning that the "index-only" scan would degrade
to a regular indexscan, which is how it gets costed.  And on a single-page
table, an indexscan is going to have a hard time beating other
alternatives.
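
For instance (an untested sketch with a made-up table), load some real data and
vacuum so the visibility map gets set, and an index-only scan becomes plausible:

    CREATE TABLE t (a int);
    CREATE INDEX ON t (a);
    INSERT INTO t SELECT g FROM generate_series(1, 1000000) g;
    VACUUM ANALYZE t;    -- marks pages all-visible and updates the stats
    EXPLAIN SELECT a FROM t WHERE a BETWEEN 1000 AND 2000;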

        regards, tom lane



Re: index-only-scan when there is an index on all columns

2018-02-27 Thread Tom Lane
"David G. Johnston"  writes:
> If one runs vacuum on a table (small or otherwise) that is currently
> choosing an index scan as its best plan how likely is it that post-vacuum
> an index-only plan would be chosen if the index type and column presence
> conditions are met?

Offhand I think it would always prefer IOS over regular indexscan if the
table is mostly all-visible.  The problem in this example was that other
choices dominate both.

> Also, I recall discussion that select statements will touch the visibility
> map (hence causing write I/O even in a read-only query) but [1] indicates
> that only vacuum will set them ddl will clear them.

Hm, I don't recall that, but I've not been involved in the last few rounds
of hacking on that mechanism.

regards, tom lane



Re: Posgresql Log: lots of parse statements

2018-03-01 Thread Tom Lane
Vikas Sharma  writes:
> But why are there so many parse statement occurrences for one query?

A "parse" log entry is recorded when the client sends a Parse protocol
message.  So the answer to that question needs to be sought in your
client application's logic.

> In the log I can see these parse statement occurrences about 400 times in a
> day, and every time taking longer than 15 secs.

That seems like a mighty long time for parse analysis.  Maybe you're
having difficulties with something taking exclusive locks, thereby
blocking other queries?

    regards, tom lane



Re: SQL syntax

2018-03-02 Thread Tom Lane
Adrian Klaver  writes:
> On 03/02/2018 04:36 PM, Dale Seaburg wrote:
>> To finish off the WHERE clause, I need to look at the first 2 letters, 
>> like "D:".  My question is how do I specify in the WHERE clause, to look 
>> at the first 2 characters in the Image_Filename column?  What is the 
>> correct SQL syntax for looking at just a portion of a column?

> SELECT "Image_Filename" FROM "Instruments" WHERE "ImageFilename" LIKE 'D:%';

Another way is to use the substring() function:

SELECT "Image_Filename" FROM "Instruments"
  WHERE substring("ImageFilename", 1, 2) = 'D:';

or if you want to use the SQL committee's COBOLish syntax:

SELECT "Image_Filename" FROM "Instruments"
  WHERE substring("ImageFilename" FROM 1 FOR 2) = 'D:';

Depending on what you're doing, either the pattern-match way or the
substring way might be more convenient.  The performance implications
are different too, though that won't matter to you unless you're dealing
with so much data that you want to create a specialized index to make
queries of this form faster.
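
If you do get to that point, either form can be supported by an index ---
roughly like this (untested, using the column names from your examples):

    -- for the LIKE 'D:%' form, when the database collation isn't C:
    CREATE INDEX ON "Instruments" ("ImageFilename" text_pattern_ops);
    -- for the substring() form, a matching expression index:
    CREATE INDEX ON "Instruments" (substring("ImageFilename", 1, 2));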

regards, tom lane



Re: pg_upgrade fails to restore function

2018-03-03 Thread Tom Lane
"=?gb18030?B?Um9nZXIgU3Vubnk=?=" <947035...@qq.com> writes:
> During the pg_upgrade from PostgreSQL 9.4 to PostgreSQL 10.0, the 
> error¡±[archiver (db)] Error while PROCESSING TOC:¡± happened, The upgrade 
> failed. OS is RHEL 6.5 , The error is as below:
> pg_restore: connecting to database for restorepg_restore: 
> creating pg_largeobject "pg_largeobject"pg_restore: creating 
> pg_largeobject_metadata "pg_largeobject_metadata"pg_restore: creating SCHEMA 
> "bo_user0300"pg_restore: creating SCHEMA "d0007359_bex2018"pg_restore: 
> creating SCHEMA "d0007359_dan2014"   omit -- omit -- omit 
> -- 
> pg_restore: creating PROCEDURE "zchuo_ult18.kp_delkakeihyo(character)"
 
> pg_restore: creating FOREIGN DATA WRAPPER "dblink_fdw"
 
> pg_restore: [archiver (db)] Error while PROCESSING TOC:
>  pg_restore: [archiver (db)] Error from TOC entry 470023; 2328 777028 FOREIGN 
> DATA WRAPPER dblink_fdw postgrespg_restore: [archiver (db)] could not execute 
> query: ERROR:  function dblink_fdw_validator(text[], oid) does not exist 
>Command was: CREATE FOREIGN DATA WRAPPER "dblink_fdw" VALIDATOR 
> "dblink_fdw_validator";


Is that really the first error, or did you "omit" some?  What it
looks like is that dblink_fdw_validator had previously failed to
be created, probably because you don't have dblink.so installed
in the destination database.  Or maybe it's not the right version
of dblink.so.

regards, tom lane



Re: Q: text palloc() size vs. SET_VARSIZE()

2018-03-04 Thread Tom Lane
Albrecht Dreß  writes:
>  text *t = PG_GETARG_TEXT_PP(0);
>  size_t out_len = 0U;

>  // allocate to the max. possible output size
>  text *new_t = (text *) palloc(VARSIZE_ANY_EXHDR(t) + VARHDRSZ);

>  // copy data to VARDATA(new_t), and count bytes in out_len

>  // set output size which is out_len <= VARSIZE_ANY_EXHDR(t)
>  SET_VARSIZE(new_t, out_len + VARHDRSZ);
>  PG_RETURN_TEXT_P(new_t);

That code looks fine to me.

>  From the docs, for me it is not clear whether the value assigned using 
> SET_VARSIZE() must be the *exact* size of the newly allocated return value, 
> or just the length of the text plus the header size.  IOW would the code 
> above create a memory leak if out_len < VARSIZE_ANY_EXHDR(t)?

No memory leak.  Your returned value would have some wasted memory at
the end of its palloc chunk, but function result values don't normally
live long enough that that's worth worrying about.

You could repalloc the result down to minimum size if you felt like it,
but I think it'd largely be a waste of cycles.  There are lots of similar
examples in the core backend, and few if any bother with a repalloc.

regards, tom lane



Re: momjian.us is down?

2018-03-04 Thread Tom Lane
Igal  writes:
> On 03/04/2018 07:24 PM, Adrian Klaver wrote:
>> On 03/04/2018 05:53 PM, Igal wrote:
>>> I am trying to connect to http://momjian.us/ but I get connection 
>>> timeout (from Firefox):

> Thanks, I'm not sure if something else was fixed, but now it works.  I 
> don't think that it had to do with caching as I couldn't even ping it 
> before, and it was like that for me (connecting from AT&T network in 
> California) for the whole weekend.

Dunno if it's related, but large parts of the US Northeast were without
power over the weekend due to storm damage.

regards, tom lane



Re: Crash and core on 10.1 and 10.2

2018-03-08 Thread Tom Lane
Kelly Burkhart  writes:
> Hello, I've had two core dumps in the last couple of weeks.  The most
> recent, yesterday was on version 10.2:

> (gdb) bt
> #0  0x7f317a043886 in get_next_seq () from /lib64/libc.so.6
> #1  0x7f317a044acc in strcoll_l () from /lib64/libc.so.6
> #2  0x007ced5f in varstrfastcmp_locale ()
> #3  0x0081b6fb in qsort_ssup ()
> #4  0x0081d8e1 in tuplesort_performsort ()

Hm.  If you'd just showed this one, my thoughts might bend towards a bug
in our sort abbreviation logic, which is relatively new ...

> (gdb) bt
> #0  0x7f6e1f09d8ea in get_next_seq () from /lib64/libc.so.6
> #1  0x7f6e1f09eacc in strcoll_l () from /lib64/libc.so.6
> #2  0x007cf70b in varstr_cmp ()
> #3  0x0075f25b in compareJsonbContainers ()
> #4  0x0075d8f2 in jsonb_eq ()

... but this stack trace is not going anywhere near that code.  The
common factor is just strcoll_l(), raising the possibility that you're
dealing with a glibc bug, or perhaps corrupted locale data on your
machine.  Are you up-to-date on glibc patches?

regards, tom lane



Re: circular wait not triggering deadlock ?

2018-03-08 Thread Tom Lane
Justin Pryzby  writes:
> Running pg10.2, I have a handful of maintenance jobs run in the middle of the
> night, which appear to have gotten stuck waiting on each other..

> ts=# SELECT granted, pid, mode, virtualtransaction, virtualxid FROM pg_locks 
> WHERE virtualxid='22/4317099' ORDER BY 1,2,3;
>  granted |  pid  |     mode      | virtualtransaction | virtualxid
> ---------+-------+---------------+--------------------+------------
>  f       | 20488 | ShareLock     | 19/28401734        | 22/4317099
>  t       |  6471 | ExclusiveLock | 22/4317099         | 22/4317099

PID 20488 is evidently waiting for PID 6471 to finish its transaction.
What's that one doing?

> Is it a bug that this isn't caught by a deadlock detector and cancelled?

We did find a bug recently with concurrent CREATE INDEX CONCURRENTLY's
all waiting for each other to commit.

regards, tom lane



Re: circular wait not triggering deadlock ?

2018-03-08 Thread Tom Lane
Justin Pryzby  writes:
> On Thu, Mar 08, 2018 at 01:57:06PM -0500, Tom Lane wrote:
>> PID 20488 is evidently waiting for PID 6471 to finish its transaction.
>> What's that one doing?

> Um, I thought I had kept track of all two pids but looks not..
> query| SELECT pg_export_snapshot();

pg_export_snapshot doesn't block for other transactions, though.
Further down, you have output that confirms that:

>   6471 | idle in transaction | psql  | SELECT pg_export_snapshot();

That SELECT is the most recently completed command, not the current one.
So now the question is what the connected application is waiting for
before committing.

    regards, tom lane



Re: What is the meaning of pg_restore output?

2018-03-08 Thread Tom Lane
Adrian Klaver  writes:
> When I restore using 10.2 I see:
> pg_restore: creating ACL "public.TABLE wl_week"
> Do you see something similar?

> My suspicion is that this might have something to do with the commits below:

Yeah, this evidently changed in commits 3eb9a5e7c et al, which made ACL
restore go through restore_toc_entry().  I think I figured that the
"creating ACL" message that restore_toc_entry() would emit made the
dedicated "setting owner and privileges" message redundant.  It had also
been misleading for quite some time, maybe forever: restore of that TOC
entry would set the object's privileges all right, but if it ever had
anything to do with setting the object's ownership, it was a really long
time ago.

regards, tom lane



Re: pg/tcl performance related

2018-03-09 Thread Tom Lane
wolfg...@alle-noten.de writes:
> Now, for pl/tcl, I could use either

> create function check(text) returns text as $$
> set data {
>   # the data value
> }
> foreach d $data {
># the work
> }
> $$ language pltcl;
> or
> create function check(text) returns text as $$
> if ![info exists GD(data)] {
>set GD(data) {
> # the data value
>   }
> }
> foreach d $GD(data) {
>   # the work
> }
> $$ language pltcl; 

> Does this make any difference - i.e. how often will parsing the data happen?

Obviously, in the first case the "set data" will happen every time you
call the function.  The cost of that should be about the same as it
would be in standalone Tcl.  PL/Tcl builds a Tcl code object for the
function body on first call in a session, and thereafter reuses that
code object, so that you shouldn't have extra parsing overhead per
se ... but execution of the command costs something too.

regards, tom lane



Re: Feature request: min/max for macaddr type

2018-03-09 Thread Tom Lane
Herwin Weststrate  writes:
> ... I tried
> adding some functions for macaddr_smaller/macaddr_larger to implement
> the min/max. The code for it was very trivial, but then I got completely
> lost in pg_proc.h/pg_aggregate.h. I fail to find the documentation for
> this files as well, is there any reference I could use to finish my work?

Well, you'd need two pg_proc entries for macaddr_smaller/macaddr_larger,
and two more for the min and max aggregates, and then two pg_aggregate
entries as well.  Looking at the existing entries related to some other
datatype (inet, perhaps) ought to let you move forward.

However, if you're thinking of submitting a patch ... it's too late
for any more feature submissions for v11, and I expect that by the
time v12 opens, the bootstrap data representation will have changed,
cf the discussion at
https://www.postgresql.org/message-id/flat/cajvsvgwo48jbbwxkjz_ybfygyw-m9ywxnpdxjbuosdc9ou_...@mail.gmail.com

So, while you'll still need the same six new catalog entries, what
you actually have to write is going to look completely different.
You might want to stick this project on the back burner for a month
or so, until that dust settles.

    regards, tom lane



Re: Ability to create tables

2018-03-09 Thread Tom Lane
Ron Johnson  writes:
> Even though I revoked the  CREATE priv on role ABCREADONLY, it's still able 
> to create tables.  What can I do to prevent this?

> $ psql -c 'revoke create on database "ABC123" from "ABCREADONLY";'

That revokes the ability to create new schemas within that database
(which I suspect the role did not have anyway).  What you need is
to remove its ability to create objects within the public schema
within that database.  By default, that ability is granted to PUBLIC,
so that "revoke create on schema public from "ABCREADONLY";" won't
help either.  What you have to do is "revoke create on schema public
from public", and then grant it back to just the roles that should have
it.

If you don't want the role creating temp tables either, you need to
revoke its TEMP right on the database (which *is* a database-level
privilege).  Again, this'll involve disallowing that to PUBLIC,
since that default grant is how it's getting the privilege.
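
Concretely, the sequence might look about like this ("app_owner" here is just a
placeholder for whichever roles should keep the privilege):

    REVOKE CREATE ON SCHEMA public FROM PUBLIC;
    GRANT CREATE ON SCHEMA public TO app_owner;
    REVOKE TEMP ON DATABASE "ABC123" FROM PUBLIC;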

regards, tom lane



Re: wrong message when trying to create an already existing index

2018-03-10 Thread Tom Lane
legrand legrand  writes:
> I thought that those messages were using the relation's relkind:
> ..
> wouldn't it be easier to read for beginners?

I doubt it would be an improvement.  Consider this example:

regression=# create table t1 (f1 int);
CREATE TABLE
regression=# create materialized view mv1 as select * from t1;
SELECT 0
regression=# create index mv1 on t1 (f1);
ERROR:  relation "mv1" already exists

You seem to be proposing that the error should read either

ERROR:  index "mv1" already exists

which would be a lie, or

ERROR:  materialized view "mv1" already exists

which while accurate seems to me to be *more* confusing not less.
A person who did not understand that these relation types all
share the same namespace would probably not get enlightened
this way.  Using the generic term "relation" is just as accurate,
and it might help somebody understand that the problem is exactly
that relations of different types share the same namespace.

regards, tom lane



Re: Question on corruption (PostgreSQL 9.6.1)

2018-03-13 Thread Tom Lane
Andy Halsall  writes:
>  db=# select * from x where col_a = 4675635;
>   col_a  | col_b | col_c | col_d | col_e | last_modified
>  --------+-------+-------+-------+-------+---------------
>          |       |       |       |       |
>  (1 row)

> Row 4675635 is very odd - NULL columns and at the same time retrievable by a 
> value in col_a.

Doesn't seem particularly surprising if col_a is indexed.  That query
would choose an indexscan plan, which would normally not bother to
re-verify the index condition against heap tuples found via the index.

If you're continuing to use this damaged database, it might be a good
idea to try to REINDEX all your indexes.  That'd be particularly
useful for primary/unique indexes, since if corruption has led to
any apparent duplicate rows, the reindex would fail and complain.
But in any case it'd clean up heap-vs-index inconsistencies like the
above, as well as repairing any cases where the corruption was in an
index rather than heap.

Another test I'd strongly recommend is to see if you can pg_dumpall
and reload into a spare server.  That might catch forms of data
corruption that reindexing would not, such as violated CHECK constraints.

regards, tom lane



Re: UPSERT on a view

2018-03-13 Thread Tom Lane
Melvin Davidson  writes:
> On Mon, Mar 12, 2018 at 2:26 PM, Steven Roth  wrote:
>> Why does the following code raise an error?
>> 
>> CREATE TABLE ttest (x integer);
>> CREATE VIEW vtest AS SELECT x FROM ttest;
>> CREATE FUNCTION vtest_insert() RETURNS trigger LANGUAGE plpgsql AS $$
>> BEGIN
>> RAISE 'foo' USING ERRCODE='unique_violation';
>> END $$;
>> CREATE TRIGGER vtest_insert INSTEAD OF INSERT ON vtest
>> FOR EACH ROW EXECUTE PROCEDURE vtest_insert();
>> INSERT INTO vtest VALUES (1) ON CONFLICT DO NOTHING;
>> 
>> This code raises the error 'foo', even though the insert says DO NOTHING
>> and the error type is unique_violation.  Why?

> *Your problem is that A. TRIGGERS are meant for TABLES, not views*
> *and *
> *B. You CANNOT insert into a VIEW.*

No, instead-of triggers are supposed to do exactly that.  The OP's problem
is that he imagines that ON CONFLICT works by catching arbitrary errors and
checking to see if they are labeled unique_violation.  Doesn't work like
that; if it did, it'd be the wrong thing, because we would not know
whether or how to roll back any other side effects the trigger had had.

>> More generally:  how can one write trigger functions for a view (that is
>> not automatically updatable), such that INSERT ... ON CONFLICT DO UPDATE
>> will work with the correct semantics?

Don't think it's possible unfortunately.  The trigger would have to take
care of the situation, but it doesn't have any way to know that the
calling query has an ON CONFLICT rule.

regards, tom lane



Re: UPSERT on a view

2018-03-13 Thread Tom Lane
I wrote:
>> On Mon, Mar 12, 2018 at 2:26 PM, Steven Roth  wrote:
>>> More generally:  how can one write trigger functions for a view (that is
>>> not automatically updatable), such that INSERT ... ON CONFLICT DO UPDATE
>>> will work with the correct semantics?

> Don't think it's possible unfortunately.  The trigger would have to take
> care of the situation, but it doesn't have any way to know that the
> calling query has an ON CONFLICT rule.

BTW, I don't suppose it solves your problem, but ON CONFLICT does work
with auto-updatable views:

regression=# CREATE TABLE ttest (x integer primary key);
CREATE TABLE
regression=# CREATE VIEW vtest AS SELECT x FROM ttest;
CREATE VIEW
regression=# INSERT INTO vtest VALUES (1);
INSERT 0 1
regression=# INSERT INTO vtest VALUES (1);
ERROR:  duplicate key value violates unique constraint "ttest_pkey"
DETAIL:  Key (x)=(1) already exists.
regression=# INSERT INTO vtest VALUES (1) ON CONFLICT DO NOTHING;
INSERT 0 0

The query rewriter knows how to translate the ON CONFLICT to apply
to the underlying table along with the rest of the command.

regards, tom lane



Re: Is there a way to create a functional index that tables tableoid column as an arg?

2018-03-13 Thread Tom Lane
Ryan Murphy  writes:
> Hi David!  Thanks for the reply.
>> tableoid might be an exception to
>> this, but it does not really seem like a useful column to index,
>> giving it would be indexing the same value for each record in the
>> table.

> Unless you're using inheritance - then tableoid may vary.  That's the case
> I'm interested in.

Uh, no, not within a single table ... and indexes only cover one table.

regards, tom lane



Re: Primary key gist index?

2018-03-14 Thread Tom Lane
Paul Jungwirth  writes:
> On 03/14/2018 06:19 AM, Jeremy Finzel wrote:
>> Hello!  From all that I can tell, it is not possible using a btree_gist 
>> index as a primary key.  If so, why not?  I have a table with this gist 
>> index which truly ought to be its primary key.  as_of_date is of range 
>> date type:
>> 
>> EXCLUDE USING gist (id WITH =, as_of_date WITH &&)

> Technically I think an exclusion constraint (or at least this one) 
> fulfills the formal requirements of a primary key (is unique, isn't 
> null), but maybe there are other primary-key duties it doesn't meet, 
> like defining foreign keys that reference it.

I think a key point is that an exclusion constraint might or might not
provide behavior that could be construed as uniqueness.  The logic for
PRIMARY KEY hasn't got nearly enough knowledge to tell whether particular
operators used in a particular way in a GIST index will behave in a way
that would support calling that a primary key.  b-tree indexes, on the
other hand, have basically only one behavior, so they're easy.

Also, as you mention, extrapolating behavior that's not really equality
to situations like foreign keys gets pretty interesting pretty fast.
An exclusion constraint using && might ensure that no two values in the
column are identical, but it would not be enough to ensure that a proposed
FK row can't && with more than one PK row.  So what then?
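
For reference, the sort of table the OP describes would look roughly like this
(untested sketch; assumes the btree_gist extension for the "id WITH =" part):

    CREATE EXTENSION IF NOT EXISTS btree_gist;
    CREATE TABLE versioned (
        id int NOT NULL,
        as_of_date daterange NOT NULL,
        EXCLUDE USING gist (id WITH =, as_of_date WITH &&)
    );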

regards, tom lane



Re: wiki Disk Usage, table size: ERROR: could not open relation with OID 0

2018-03-14 Thread Tom Lane
Michael Paquier  writes:
> On Wed, Mar 14, 2018 at 04:17:54PM +0100, Hans Schou wrote:
>> I got the message
>> ERROR: could not open relation with OID 0
>> when running the "General Table Size Information" from
>> https://wiki.postgresql.org/wiki/Disk_Usage

>> I'm running version 9.1.9 so it should be working according to the
>> wiki.

> You should update and upgrade.  9.1 has fallen out of community support
> 1 year and a half ago, and 9.1.9 is utterly outdated.

The query does fail on < 9.2, because on rows with no reltoastrelid
it will call pg_total_relation_size(0), and we didn't make those
functions forgiving of bogus OIDs until 9.2.

Given that pre-9.2 is well out of support I didn't feel like complicating
the query to handle that; what I did do was change the labeling to say
"works with >= 9.2" instead of "works with >= 9.0".  But hey, it's a wiki;
if you feel more ambitious, edit away.

regards, tom lane



Re: SELECT .. FOR UPDATE: find out who locked a row

2018-03-15 Thread Tom Lane
Melvin Davidson  writes:
> Yes, Stephen, I certainly understand making changes to system catalogs
> _when necessary_.  That being said, the first change was the renaming of
> pid to procpid in pg_stat_activity.  However, I contend that was more
> because someone felt that it was more to make the column names
> consistent across catalogs, rather than necessity.

Please read all of
https://www.postgresql.org/message-id/flat/201106091554.p59Fso314146%40momjian.us
where this was discussed to death (and rejected), and then read all of
https://www.postgresql.org/message-id/flat/CAKq0gvK8PzMWPv19_o7CGg8ZQ0G%2BUuAWor5RrAg0SOmWTqqLwg%40mail.gmail.com
which is the thread in which the change was agreed to after all
(on the grounds that we were breaking backwards compatibility of
the view anyway with respect to other, more important, columns).

If you still feel that we make incompatible changes without adequate
consideration, that's your right, but you might want to consider
speaking up in some reasonable time frame, not six years later.
This could have been objected to as late as 9.2 beta, so it's not
like you need to be drinking from the pgsql-hackers firehose continually
in order to weigh in.  But 9.2 is not just released, it's EOL, so it's
really kinda late to be objecting.

    regards, tom lane



Re: pentaho timestamp issue

2018-03-18 Thread Tom Lane
PropAAS DBA  writes:
> Pinging the list on the off chance someone has dealt with this and knows 
> of a workaroud. We have pentaho pointing to a PostgreSQL v10.3 database, 
> getting this error:
> 2018/03/18 15:06:37 - INPUT STEP -  Fact.0 - Bad value for type 
> timestamp : 0001-02-04 17:00:04-06:59:56

Hmm.  Presumably, this is coming from something that thinks that 1 AD
is outside the reasonable range of timestamps.  Assuming you agree that
such a value shouldn't appear in your application, I'd look for timestamps
getting put into the database using to_timestamp() with a format string
that doesn't really match the data, causing the year field to be truncated
or misinterpreted.

        regards, tom lane



Re: You might be able to move the set-returning function into a LATERAL FROM item.

2018-03-20 Thread Tom Lane
Alexander Farber  writes:
> I am trying to fetch a history/protocol of a game with:

> SELECT
> CASE WHEN JSONB_TYPEOF(tiles) = 'array' THEN
> JSONB_ARRAY_ELEMENTS(tiles) ELSE NULL END AS x

I think you could push the conditionality into a plpgsql function,
something like (untested)

create function jsonb_elements_if_array(j jsonb) returns setof jsonb as $$
begin
  if jsonb_typeof(j) = 'array' then
return query select jsonb_array_elements(j);
  end if;
end$$
strict immutable language plpgsql;

Note that this gives *no* elements, rather than a single NULL value,
if the input isn't an array --- but that seems to me to make more sense
than your existing code anyhow.  If you disagree, add "else return next
null::jsonb".

regards, tom lane



Re: FDW Foreign Table Access: strange LOG message

2018-03-20 Thread Tom Lane
Adrian Klaver  writes:
> On 03/20/2018 11:52 AM, Albrecht Dreß wrote:
>> I use Postgres 10.3 on a Debian Stretch system with foreign tables, and 
>> noticed strange LOG messages when accessing them.
>> [time stamp/pid] user@my_db LOG:  could not receive data from client: 
>> Connection reset by peer

> My suspicion is it has to do with this:
> postgres_fdw establishes a connection to a foreign server during the 
> first query that uses a foreign table associated with the foreign 
> server. This connection is kept and re-used for subsequent queries in 
> the same session.

Perhaps.  It's not entirely clear if these complaints are about the
original user session or the sub-session opened by postgres_fdw.
(Albrecht, if you're not sure either, enabling log_connections and
log_disconnections might make it clearer.)

I don't see any such log messages when testing postgres_fdw here,
which is odd; why are my results different?

If these are about the FDW connections, maybe the answer is that
postgres_fdw ought to establish a backend-exit callback in which
it can shut down its connections gracefully.  If it's doing that
now, I sure don't see where.

regards, tom lane



Re: Foreign Key Validation after Reference Table Ownership Change

2018-03-21 Thread Tom Lane
"David G. Johnston"  writes:
> On Wed, Mar 21, 2018 at 8:30 AM, Battuello, Louis <
> louis.battue...@etasseo.com> wrote:
>> So, user_2 needs usage on the schema containing its newly owned reference
>> table even though user_1 is performing the insert on a table in the other
>> schema? Interesting. I though the validation was only dependent on user_1's
>> ACL.

> ​It was the only thing that made sense, given the error, though I agree it
> seems a bit odd.​

The point is you can't resolve a name like "schema_1.something" unless
you have USAGE on schema_1.  So the RI-checking query, which is run as
the owner of the table, fails at parse time.
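
So the fix is along the lines of (using the names from upthread):

    GRANT USAGE ON SCHEMA schema_1 TO user_2;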

regards, tom lane



Re: Foreign Key Validation after Reference Table Ownership Change

2018-03-21 Thread Tom Lane
Louis Battuello  writes:
>> The point is you can't resolve a name like "schema_1.something" unless
>> you have USAGE on schema_1.  So the RI-checking query, which is run as
>> the owner of the table, fails at parse time.

> That certainly makes sense for user_2 that owns the reference table and is 
> blocked by not having usage on the reference table’s schema.

> But, user_1 owns both schemas and has usage on both but no longer owns the 
> reference table in one schema. Why is user_1’s insert on the referencing 
> table failing? Is the validation of the FK no longer done as user_1?

Exactly, it's done as the owner of the referencing table.  (I don't recall
whether that's uniformly true for all types of FK-enforcement queries,
but evidently it's true for this case.)

regards, tom lane



Re: FDW Foreign Table Access: strange LOG message

2018-03-22 Thread Tom Lane
Albrecht Dreß  writes:
> No messages are printed whilst the session is open.  However, when I quit the 
> session by typing “\q” /after/ entering the command above, the LOG message 
> appears.  It does not appear when I query tables from my_db exclusively, 
> though, i.e. when I do not access the Foreign Table.

It finally occurred to me what is a likely explanation for why you're
seeing these messages but I don't when I test postgres_fdw locally:
probably, you are encrypting connections to the foreign server with SSL.
I can reproduce "could not receive data from client: Connection reset by
peer" if I kill -9 a psql client when it's using SSL, but not otherwise.

Anyway, it's clearly not very nice that postgres_fdw makes no attempt
to do a graceful shutdown of its remote connection.  I don't know that
this rises to the level of a bug, exactly, but if somebody wanted to
send in a patch it'd probably get accepted.

regards, tom lane



Re: FDW Foreign Table Access: strange LOG message

2018-03-22 Thread Tom Lane
Michael Paquier  writes:
> On Thu, Mar 22, 2018 at 06:55:12PM -0400, Tom Lane wrote:
>> Anyway, it's clearly not very nice that postgres_fdw makes no attempt
>> to do a graceful shutdown of its remote connection.  I don't know that
>> this rises to the level of a bug, exactly, but if somebody wanted to
>> send in a patch it'd probably get accepted.

> I won't qualify that as a bug, this is mainly noise.  Still I agree that
> it would be cleaner to make more efforts in finishing the connections
> when the session goes down.  You would roughly just need to register an
> on_shmem_exit callback which loops through all the connections to finish
> them.

That was my basic thought.  Not sure whether on_proc_exit is better than
on_shmem_exit, but one or the other.

> Now would we want to slow down the session shutdown just for
> that?  I am less sure particularly if there is lag between the remote
> and the local servers.

Yeah, I think the sticky part might be what to do if the shutdown attempt
blocks.  I vaguely recall that PQfinish just fires off an 'X' protocol
message and doesn't wait for an answer, so maybe there's not huge risk
there, but you don't really want to risk any delay.  (The worst-case
scenario is that you've gotten SIGTERM from init because a system
shutdown is beginning, and you don't have much time before it brings
the hammer down with SIGKILL, preventing a clean database shutdown
and thereby risking data corruption if anything is wrong with the
filesystem sync semantics.  I am not sure if it's more or less dangerous
for this purpose if the "remote" session is another backend in the same
cluster, and therefore also subject to the same shutdown signal.)

regards, tom lane



Re: FDW Foreign Table Access: strange LOG message

2018-03-23 Thread Tom Lane
Albrecht Dreß  writes:
> A different, confusing point (which is closer to a “bug” IMHO) is that 
> connections to localhost are actually encrypted by default.  This is 
> basically useless and just a waste of cpu cycles – if a malicious user may 
> somehow tap (tcpdump) lo, there is a different problem which can not be 
> mitigated by encryption…

I agree that it's not very useful to do that, but it'd be tough for us to
make it not happen by default --- that requires knowing an awful lot about
the local network topology.  Not sure that we'd want to assume that
"localhost" is safe, and we'd certainly not know what to do for
connections that use the host's name.

Note that in most scenarios, "local" connections travel over a Unix socket
not TCP, and in that case we don't encrypt.
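
If the loopback SSL overhead bothers you, you could turn it off per foreign
server, or point the server entry at the socket directory --- something like
this (server name and path are placeholders; use SET rather than ADD if the
option is already defined on the server):

    ALTER SERVER loopback OPTIONS (ADD sslmode 'disable');
    -- or:
    ALTER SERVER loopback OPTIONS (SET host '/var/run/postgresql');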

regards, tom lane



Re: connection dropped from the backend server

2018-03-27 Thread Tom Lane
Adrian Klaver  writes:
> On 03/27/2018 04:07 PM, armand pirvu wrote:
>> As long as the connection stays up yes data gets fine across
>> In pg_stat_activity I see the node ip address where tail -f piped into psql 
>> happens

> So what does the rest of that record show? In particular for:

I wonder how often data gets put into the pipe.  If it's "not very often",
maybe the connection from psql to the server is timing out due to
inactivity?  This would be the fault of a firewall or something in
between.  You could probably fix it by enabling (more aggressive) TCP
keepalive settings.
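
A sketch of what that might look like on the server side (values are only
illustrative; libpq clients can also pass keepalives_idle etc. in the
connection string):

    ALTER SYSTEM SET tcp_keepalives_idle = 60;
    ALTER SYSTEM SET tcp_keepalives_interval = 10;
    ALTER SYSTEM SET tcp_keepalives_count = 5;
    SELECT pg_reload_conf();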

    regards, tom lane



Re: Query Crashes PG 10.3 using partitions, works on 9.6

2018-03-28 Thread Tom Lane
Cory Tucker  writes:
> I was issuing a query on both databases to cleanup some duplicates in
> preparation of applying new indexes.  On the 9.6 database with all the data
> in one table, the query runs fine in about 6 min.  On 10.3, with a work_mem
> setting of 1GB the query runs for about 7 minutes and then gets terminated
> with an out of memory error.

Hm, this seems a bit excessive:

  MessageContext: 1333788672 total in 169 blocks; 2227176 free (9 chunks); 1331561496 used

and this is really grim:

65678 more child contexts containing 47607478048 total in 2577 blocks; 12249392 free (446 chunks); 47595228656 used

and this is just silly:

2018-03-28 19:20:33.264 UTC [10580] cory@match ERROR:  out of memory
2018-03-28 19:20:33.264 UTC [10580] cory@match DETAIL:  Failed on request of size 1610612736.

Can you extract a self-contained test case that uses unreasonable amounts
of memory?  It seems from this trace that the wheels are coming off in
at least two places, but identifying exactly where is impossible without
more info.

If you can't make a publishable test case, capturing a stack trace from
the point of the OOM error (set the breakpoint at errfinish) would
probably be enough info to figure out what is trying to grab 1.6GB in one
bite.  But it won't help us find out why so many empty ExprContexts are
getting created.

    regards, tom lane



Re: Query Crashes PG 10.3 using partitions, works on 9.6

2018-03-28 Thread Tom Lane
Cory Tucker  writes:
>> Can you extract a self-contained test case that uses unreasonable amounts
>> of memory?  It seems from this trace that the wheels are coming off in
>> at least two places, but identifying exactly where is impossible without
>> more info.

> I will try to make a test case.  The data in this table is PII so I can't
> just export it, but I might be able to replicate using fake data.

My guess is that the specific data is not critical at all, just the
table properties.  You might need to take care that the same EXPLAIN
plan is selected, which might require fake data that has roughly
similar properties as to number of distinct values and so on.

    regards, tom lane



Re: Bad Query Plans on 10.3 vs 9.6

2018-03-29 Thread Tom Lane
David Rowley  writes:
> On 29 March 2018 at 18:26, Cory Tucker  wrote:
>> The plan on 9.6 v 10.3 are effectively identical except in 9.6 the planner
>> decides to use an index only scan on the primary key and in 10.3 it does a
>> sequential scan.  The problem is the sequential scan is for a table of 75M
>> rows and 25 columns so its quiet a lot of pages it has to traverse.

> How certain are you that all the indexes match on each instance?

Another possibility is that 10.3 sees the index-only scan as too expensive
because it thinks most of the table isn't all-visible.  Comparing
pg_class.relallvisible values might be informative.
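
E.g., run this on both servers (the table name is whatever yours is):

    SELECT relname, relpages, reltuples, relallvisible
    FROM pg_class
    WHERE relname = 'your_table';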

    regards, tom lane



Re: Bad Query Plans on 10.3 vs 9.6

2018-03-29 Thread Tom Lane
Cory Tucker  writes:
> relallvisible has a value of 0 for that table on both databases.

That would result in IOS being estimated at the same cost as a regular
indexscan, I believe, or very close to that anyway.

Is the 10.3 plan parallelized at all?  It's possible that the planner
thinks a parallel seqscan is faster than a nonparallel indexscan
(AFAIR, v10 doesn't have parallel indexscan).

The other likely explanation is simply that indexscanning a partitioned
table is not considered, or not correctly costed.  I'm not very sure what
the state of that code is, but certainly all the v10 partitioning logic is
still pretty wet behind the ears.

        regards, tom lane



Re: How to get an inclusive interval when using daterange

2018-04-02 Thread Tom Lane
hmidi slim  writes:
> When I insert a data into this table I use this query:
> insert into availability values ('product x', daterange('2018-02-02',
> '2018-03-01', '[]'));

> So I get a period like this: [2018-02-02, 2018-03-02)

Yup.

> In my app I tried to subtract a day from the period when I got it from
> database.I'm using daterange and not tsrange because the daterange gives me
> less execution time when I make tests with a huge number of data.So is
> there any way to get an inclusive interval with daterange or I have to use
> tsrange to get inclusive intervals?

No, daterange will always canonicalize a range into '[)' format.
This is explained (perhaps not with adequate clarity) in
https://www.postgresql.org/docs/10/static/rangetypes.html#RANGETYPES-DISCRETE

The key reason why is to make it clearer which range specifications
are equal.  For instance, it's not really clear whether
['2018-02-02','2018-03-01'] and ['2018-02-02','2018-03-02') represent
the same set of values --- they do if it's a daterange, but not if it's
a tsrange.  Canonicalizing makes equal ranges look equal.
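
For example:

    SELECT daterange('2018-02-02','2018-03-01','[]')
         = daterange('2018-02-02','2018-03-02','[)');   -- true: same set of dates
    SELECT upper(daterange('2018-02-02','2018-03-01','[]')) - 1;
         -- 2018-03-01, if you want the inclusive upper bound back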

regards, tom lane



Re: Seems like there is an issue with reltuples showing twice the number of rows

2018-04-03 Thread Tom Lane
Tomas Vondra  writes:
> On 04/03/2018 11:14 AM, Ranjith Ramachandra wrote:
>> it returns
>>  reltuples  | n_live_tup | n_dead_tup
>> ------------+------------+------------
>> 2.7209e+06 |    1360448 |    1360448
>> 
>> If I run analyze main_csv_0f247511f5c247c8813ba3cec90c6ae1_yf34fbb38d
>> and I run the same query again,
>>   reltuples  | n_live_tup | n_dead_tup
>> ------------+------------+------------
>>  1.36045e+06 |    1360448 |    1360448
>> 
>> But after some time the value goes back to being double the value.

> There was a difference between VACUUM and ANALYZE in handling recently
> dead rows (essentially deleted rows that can't be removed yet), causing
> similar changes to reltuples. Essentially if you do VACUUM and ANALYZE,
> it may set reltuples to rather different estimates. That is fixed now
> and should be in the next minor release.

No, I think this is the *other* thing we fixed recently: VACUUM thinks
it should set reltuples to total tuples (live + dead) whereas ANALYZE
counts only live tuples.  We did not risk back-patching that.

The question I'd ask about this case is why is there persistently 100%
bloat?  Those dead tuples should've gotten reclaimed by autovacuum.
Perhaps an open prepared transaction, or some such?

> It's probably better to use n_live_tup instead, though. I'd say that's
> closer to the "live tuples" definition.

Yeah, you might be better off looking at that, particularly since it
updates on-the-fly not just after a vacuum or analyze.
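
For instance, something like this shows the live/dead counts together with the
last (auto)vacuum times:

    SELECT relname, n_live_tup, n_dead_tup, last_vacuum, last_autovacuum
    FROM pg_stat_user_tables
    WHERE relname = 'main_csv_0f247511f5c247c8813ba3cec90c6ae1_yf34fbb38d';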

regards, tom lane



Re: Extension make installcheck: include update/insert feedback?

2018-04-04 Thread Tom Lane
Paul Jungwirth  writes:
> I've noticed that if my test code does an INSERT or DELETE, the usual 
> `INSERT 0 1` and `UPDATE 2` messages don't appear in the *.out files, 
> even though those otherwise mirror psql. I thought maybe there was some 
> psql switch that turns those on/off, but I couldn't find one.

That's because pg_regress launches psql with the -q option (as well as -a).

I think you might be able to override that within a particular test script
by fooling with QUIET, or whichever psql variable it is that that switch
sets.
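
Something like this at the top of the test script, perhaps (untested):

    \set QUIET off
    INSERT INTO some_table VALUES (1);   -- the .out file should now record "INSERT 0 1"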

        regards, tom lane



Re: ERROR: unexpected chunk number 0 (expected 1) for toast value 76753264 in pg_toast_10920100

2018-04-05 Thread Tom Lane
a...@novozymes.com (Adam Sjøgren) writes:
> Here's a statement which currently gives an unexpected chunk error:

>   efamroot@kat efam=# SELECT * FROM efam.sendreference WHERE id = '189909908';
>   ERROR:  unexpected chunk number 0 (expected 1) for toast value 1698936148 
> in pg_toast_10919630

OK ...

> And when I run the suggested query, I get:

>   efamroot@kat efam=# select chunk_id, chunk_seq, ctid, xmin, xmax, 
> length(chunk_data) from pg_toast.pg_toast_10919630 where chunk_id = 
> 1698936148 order by 1,2;
>    chunk_id  | chunk_seq |     ctid     |    xmin    | xmax | length
>   -----------+-----------+--------------+------------+------+--------
>   1698936148 |         0 | (52888694,2) | 1511390221 |    0 |   1996
>   1698936148 |         1 | (52888694,4) | 1511390221 |    0 |   1148
>   (2 rows)

Hmph.  So if you EXPLAIN that query, does it show it's doing it as an
indexscan?  I'd expect so, but it's always good to make sure.

Assuming it does say that, then the other test I had in mind would
involve "set enable_indexscan = 0", then repeat the EXPLAIN to make
sure that you now get a seqscan plan (you might need to turn off
enable_bitmapscan too), then do the query again and see whether the
results are the same.
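
In other words, roughly:

    SET enable_indexscan = off;
    SET enable_bitmapscan = off;    -- if it still picks a bitmap scan
    EXPLAIN
    SELECT chunk_id, chunk_seq, ctid, xmin, xmax, length(chunk_data)
    FROM pg_toast.pg_toast_10919630
    WHERE chunk_id = 1698936148 ORDER BY 1, 2;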

regards, tom lane



Re: Docker + postgreSQL : OOM killing in a large Group by operation

2018-04-05 Thread Tom Lane
Jorge Daniel  writes:
> I have a problem with a query that grabs a bunch of rows and then does an
> aggregate operation; at that moment it gets killed by the OOM-killer, and I
> don't know why.  The engine starts using tmpfiles as expected, and then tries
> to work in memory and gets killed.

> SELECT count(*)
>  FROM "changelog_change_transaction"
>  INNER JOIN "changelog_change_stats"
>    ON ( changelog_change_stats.changelog_change_transactionid =
>         changelog_change_transaction.changelog_change_transactionid )
>  LEFT OUTER JOIN "changelog_change_group"
>    ON ( changelog_change_transaction.changelog_change_groupid =
>         changelog_change_group.changelog_change_groupid )
> WHERE ( changelog_change_group.companyid = 40
>      OR changelog_change_group.companyid = 1
>      OR changelog_change_group.companyid = 53
>      OR changelog_change_group.companyid IS NULL )
>   AND changelog_change_transaction.started_at > '2017-04-21'
> GROUP BY "changelog_change_transaction"."changelog_change_transactionid",
>          "changelog_change_transaction"."epoch",
>          "changelog_change_transaction"."changelog_change_groupid",
>          "changelog_change_transaction"."started_at",
>          "changelog_change_transaction"."duration_microseconds",
>          "changelog_change_transaction"."changed_items",
>          "changelog_change_transaction"."xmin"
> ;

Why are you grouping on xmin?

> For sure it's the GROUP BY that causes this OOM (when I removed it, the
> query finished ok), so I've changed the query plan to avoid the HashAggregate.
> But the explain still shows:

That's because type XID doesn't have sort support, only hash support,
so hash aggregation is the only way to do the query at all.

regards, tom lane



Re: ERROR: unexpected chunk number 0 (expected 1) for toast value 76753264 in pg_toast_10920100

2018-04-05 Thread Tom Lane
a...@novozymes.com (Adam Sjøgren) writes:
>> [... still waiting for the result, I will return with what it said
>> when the server does ...]

> It did eventually finish, with the same result:

Huh.  So what we have here, apparently, is that regular MVCC snapshots
think there is exactly one copy of the 1698936148/0 row, but TOAST fetches
think there is more than one.  This is darn odd, not least because we
never do UPDATEs in toast tables, only inserts and deletes, so there
certainly shouldn't be update chains there.

It seems like you've got some corner case wherein SnapshotToast sees a row
that isn't visible according to MVCC --- probably a row left over from
some previous cycle of life.  That is, I'm imagining the OID counter
wrapped around and we've reused a toast OID, but for some reason there's
still a row in the table with that OID.  I'm not sure offhand how we could
get into such a state.  Alvaro, does this ring any bells (remembering that
this is 9.3)?

regards, tom lane



Re: algo for canceling a deadlocked transaction

2018-04-09 Thread Tom Lane
Christophe Pettus  writes:
>> On Apr 9, 2018, at 07:33, Thomas Poty  wrote:
>> ok, and long  answer ? is it random?

> It's not literally random, but from the application point of view, it's not 
> predictable.  For example, it's not always the one that opened first, or any 
> other consistent measure.

It's whichever one runs the deadlock detector first after the circular
wait becomes established.  For instance:

* Process A takes lock L1

* Process B takes lock L2

* Process A tries to take lock L2, blocks

* Process B tries to take lock L1, blocks (now a deadlock exists)

Process A will run the deadlock detector one deadlock_timeout after
blocking.  If that happens before B has blocked, then A will see
no deadlock and will go back to waiting.  In that case, when B's
own deadlock_timeout expires and it runs the deadlock detector,
it will see the deadlock and fix it by canceling its own wait.
On the other hand, if B started to wait less than one deadlock_timeout
after A did, then A will be first to observe the deadlock and it will
cancel itself, not B.

So you can't predict it unless you have a lot of knowledge about
the timing of events.  You could probably make it more predictable
by making deadlock_timeout either very short or very long, but
neither of those are desirable things to do.
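
A minimal way to see this for yourself, with two psql sessions and two
throwaway tables t1/t2:

    -- session A:
    BEGIN; SELECT * FROM t1 FOR UPDATE;   -- takes lock L1
    -- session B:
    BEGIN; SELECT * FROM t2 FOR UPDATE;   -- takes lock L2
    -- session A:
    SELECT * FROM t2 FOR UPDATE;          -- blocks waiting for B
    -- session B:
    SELECT * FROM t1 FOR UPDATE;          -- blocks waiting for A; deadlock now exists
    -- after deadlock_timeout, whichever backend runs the detector first and
    -- sees the cycle reports "ERROR:  deadlock detected" and cancels its own wait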

regards, tom lane



Re: [GENERAL] missing public on schema public

2018-04-11 Thread Tom Lane
Stephen Frost  writes:
> * Tom Lane (t...@sss.pgh.pa.us) wrote:
>> Yeah, I can reproduce this.  I suspect it got broken by Stephen's hacking
>> around with default ACLs.  A simple example is

> Yes, it's related to the work I did with pg_dump's ACL handling, because
> we're no longer just always including the whole revoke/grant set of ACLs
> for everything in the output.

I see that this is listed on the open items for v11, but it's hard to
justify it being there, because the bug exists only in 9.6 and 10.
(We fixed it in HEAD as part of the pg_dump-vs-pg_dumpall refactoring.)

I gather, from the lack of any progress since November, that you're
probably not going to fix it in the back branches.  I'm not excited
about working on it either, but I dislike leaving such a bug unfixed.

In any case, I think it should be removed from the v11 list.  The "older
bugs" section is meant to capture pre-existing bugs that we might possibly
fix as part of v11 stabilization, and this isn't that.

regards, tom lane



Re: Commands linked to pg_wrapper not working with non-root users

2018-04-11 Thread Tom Lane
Jonathan Leroy - Inikup  writes:
> Here's my issue : when I'm logged as an user which is not root or
> postgresql, I can't use any of the commands linked to pg_wrapper:

> user1@server1:~ $ /usr/bin/psql --version
> Error: Invalid data directory

There's no error message with exactly that spelling in PG proper,
so it must be coming out of the wrapper script.  At a guess, the
wrapper script thinks it should be able to examine the data directory,
but it can't because of restrictive file permissions.  Why your
setup is triggering that when it works for other people, I can't say.

regards, tom lane



Re: cursor "x" does not exist

2018-04-14 Thread Tom Lane
Bob Jones  writes:
> My apologies if I'm being incredibly stupid here, but I've reviewed
> what the docs have to say about naming portals and I still can't see
> where I'm going wrong here ?

I think you're forgetting to double-quote an upper case identifier.

> FETCH ALL IN B;
> ERROR:  cursor "b" does not exist

The cursor is named "B" not "b", but B without quotes folds to the latter.
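
That is:

    FETCH ALL IN "B";   -- double-quoted, matches the cursor actually named "B"
    FETCH ALL IN B;     -- unquoted B folds to b, hence 'cursor "b" does not exist'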

regards, tom lane



Re: client_encoding issue with SQL_ASCII on 8.3 to 10 upgrade

2018-04-16 Thread Tom Lane
Keith Fiske  writes:
> Running into an issue with helping a client upgrade from 8.3 to 10 (yes, I
> know, please keep the out of support comments to a minimum, thanks :).

> The old database was in SQL_ASCII and it needs to stay that way for now
> unfortunately. The dump and restore itself works fine, but we're now
> running into issues with some data returning encoding errors unless we
> specifically set the client_encoding value to SQL_ASCII.

I'm guessing you might be hitting this 9.1 change:

* Have psql set the client encoding from the operating system locale
  by default (Heikki Linnakangas)

  This only happens if the PGCLIENTENCODING environment variable is
  not set.

I think the previous default was to set client encoding equal to the
server encoding.

> Looking at the 8.3 database, it has the client_encoding value set to UTF8
> and queries seem to work fine. Is this just a bug in the old 8.3 not
> enforcing encoding properly?

Somewhere along the line we made SQL_ASCII -> something else conversions
check that the data was valid per the other encoding, even though no
actual data change happens.

> The other thing I noticed on the 10 instance was that, while the LOCALE was
> set to SQL_ASCII,

You mean encoding, I assume.

> the COLLATE and CTYPE values for the restored databases
> were en_US.UTF-8. Could this be having an affect?

This is not a great idea, no.  You could be getting strange misbehaviors
in e.g. string comparison, because strcoll() will expect UTF8 data and
will likely not cope well with data that isn't valid in that encoding.

If you can't sanitize the encoding of your data, I'd suggest running
with lc_collate and lc_ctype set to "C".
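
For a freshly created database that would look something like this (template0
is needed when the encoding/locale differ from the template's):

    CREATE DATABASE target_db TEMPLATE template0
        ENCODING 'SQL_ASCII' LC_COLLATE 'C' LC_CTYPE 'C';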

regards, tom lane



Re: client_encoding issue with SQL_ASCII on 8.3 to 10 upgrade

2018-04-16 Thread Tom Lane
Keith Fiske  writes:
> On Mon, Apr 16, 2018 at 12:09 PM, Tom Lane  wrote:
>> This is not a great idea, no.  You could be getting strange misbehaviors
>> in e.g. string comparison, because strcoll() will expect UTF8 data and
>> will likely not cope well with data that isn't valid in that encoding.

> And pg_controldata was able to show that the CTYPE and COLLATE were UTF8 on
> the old system. If that's the case, do you still think it's a good idea to
> set the COLLATE and CTYPE to "C"?

Well, if the customer's been happy with the behavior of the system so far,
maybe it's all right.  But this is sure the first thing I'd look at if
there are any gripes about its behavior with non-UTF8 strings.  I'd be
especially worried about this if you try to migrate the database to any
new platform, as it's a bet about the behavior of libc not PG itself.

regards, tom lane



Re: Can PostgreSQL create new WAL files instead of reusing old ones?

2018-04-17 Thread Tom Lane
Alvaro Herrera  writes:
> David Pacheco wrote:
>> tl;dr: We've found that under many conditions, PostgreSQL's re-use of old WAL
>> files appears to significantly degrade query latency on ZFS.  The reason is
>> complicated and I have details below.  Has it been considered to make this
>> behavior tunable, to cause PostgreSQL to always create new WAL files
>> instead of re-using old ones?

> I don't think this has ever been proposed, because there was no use case
> for it.  Maybe you want to work on a patch for it?

I think possibly the OP doesn't understand why it's designed that way.
The point is not really to "recycle old WAL files", it's to avoid having
disk space allocation occur during the critical section where we must
PANIC on failure.  Now, of course, that doesn't really work if the
filesystem is COW underneath, because it's allocating fresh disk space
anyway even though semantically we're overwriting existing data.
But what I'd like to see is a fix that deals with that somehow, rather
than continue to accept the possibility of ENOSPC occurring inside WAL
writes on these file systems.  I have no idea what such a fix would
look like :-(

regards, tom lane



Re: Tricking the optimizer

2018-04-18 Thread Tom Lane
Vitaliy Garnashevich  writes:
> Is there any trick to craft the query in such a way, so that to make the 
> optimizer believe that the filters would not remove any rows, and all 
> rows will likely be returned by the query?

If you don't mind writing some C code, you could create a dummy operator
that just returns its boolean argument, and attach a selectivity estimator
to it that returns 1.0.  Then you'd write the query as, perhaps,

WHERE ~~~(securityFilter1) AND ...

where ~~~ could be read as "likely()".  I wouldn't recommend using such a
wrapper for any WHERE clause that the optimizer had any intelligence about
at all, because it would defeat all of that.  But in this case you're
not getting any wins anyhow, so burying an opaque subselect in another
layer of opacity won't hurt.

regards, tom lane



Re: pg_upgrade: when the --old-bindir requires "additional" $libdir/ plugins?

2018-04-18 Thread Tom Lane
Pavel Raiskup  writes:
> . and it seems like the hstore.so was somewhat intimately integrated into
> OP's database so the '/usr/bin/pg_dump --schema-only --binary-upgrade
> --format=custom' called through 'pg_upgrade' failed with:
>   pg_dump: [archiver (db)] query failed: ERROR:  could not access file
>   "$libdir/hstore": No such file or directory
> Which means that the dump from old datadir, with old server (without
> hstore.so packaged) failed.  But playing with hstore.so a bit, the upgrade
> always worked smoothly for me even without the "old" hstore.so

Hi Pavel,

There are certainly plenty of reasons why extension .so's might be needed
during pg_dump, even in a binary-upgrade situation.  The first example
that comes to mind is that an hstore-type constant appearing in a view
definition would require hstore_out() to be invoked while dumping the view
definition.

I don't remember anymore whether I'd set up the postgresql-update package
to include the contrib modules for the old server version.  If I didn't,
it was an oversight :-(.

regards, tom lane



Re: Inconsistent compilation error

2018-04-18 Thread Tom Lane
r...@raf.org writes:
> I have a stored function with code that looks like:

> raise notice '% %', 'blah_history.original_id', r;

> But I'm getting this compilation error when it tries to load this
> function:

> ERROR:  too many parameters specified for RAISE

That is ... weird.  The code is checking that the number of % symbols in
the string literal matches the number of comma-separated expressions after
the literal, and it sure looks like those match.

> Which looks fine. The really wierd thing is that this happens when done on a
> debian9 host but when I load the function from another host (my macos laptop)
> with the same function into the same database, it works fine.

Weirder yet.

The only idea that comes to mind is that '%%', with nothing between the
percent signs, means a literal '%' character not two parameter markers.
If the contents of the string literal were '%%' then this is exactly
the error message you'd get.

So here's a long-shot theory: what's in your source code is not a plain
space but something weird like a no-break space, and when you transfer
that text from machine A to machine B, the no-break space is getting
dropped.
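
One way to check that theory is to hunt for no-break spaces (U+00A0) in the
stored function source on the server where the function did load, e.g.
(untested):

    SELECT proname
    FROM pg_proc
    WHERE prosrc LIKE '%' || chr(160) || '%';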

regards, tom lane



Re: Problem with trigger makes Detail record be invalid

2018-04-19 Thread Tom Lane
Adrian Klaver  writes:
> On 04/19/2018 10:55 AM, PegoraroF10 wrote:
>> Is this a bug, or is it my responsibility to check that trigger result?

> Without seeing exactly what the trigger function on Detail is doing that 
> is not answerable.

I think the OP is complaining because his misimplemented trigger can break
the consistency of the foreign key constraint.  That is not a bug, it's
an intentional design decision: triggers are lower-level than foreign key
enforcement queries, and fire during such queries.  It's easy to construct
examples where people would be very unhappy if this were not so, because
then FK-driven updates would not be seen by the table's triggers.  It does
mean that you have to be careful when writing a trigger.

(I'm not sure that this issue is adequately documented, though.
I'd have expected to find something about it in triggers.sgml and/or
create_trigger.sgml, but in a quick look neither of them mentions foreign
keys.)

    regards, tom lane



Re: Problem with trigger makes Detail record be invalid

2018-04-19 Thread Tom Lane
Fabrízio de Royes Mello  writes:
> 2018-04-19 15:57 GMT-03:00 Tom Lane :
>> (I'm not sure that this issue is adequately documented, though.
>> I'd have expected to find something about it in triggers.sgml and/or
>> create_trigger.sgml, but in a quick look neither of them mentions foreign
>> keys.)

> We don't have it properly documented... at the time I answered this
> question on pt-br stackoverflow I noticed the lack o documentation and
> unfortunately I completely forgot to propose a small patch for it.

It strikes me that there are basically two things a trigger could do to
break FK consistency:

1. Turn an FK-commanded update into a no-op by returning NULL.

2. Change the content of the FK-related columns during an FK-commanded
update.

Both of these apply only to BEFORE ROW triggers, of course.
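
To make case #1 concrete, a hypothetical sketch (table and function names
are made up; assume "detail" has an ON DELETE CASCADE FK to "master"):

    create function detail_swallow() returns trigger
    language plpgsql as $$
    begin
      return null;  -- cancels the row operation, including FK-commanded ones
    end $$;

    create trigger swallow before delete on detail
      for each row execute procedure detail_swallow();

Deleting a master row now "succeeds" while silently leaving its detail
rows behind.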

It might not be unreasonable or unduly expensive to throw an error for
case #1.  I don't think I want to get into the expense of checking
for case #2, but covering case #1 would be enough to catch all of the
reports of this type of problem that I can remember.

IIRC, you can also break FK consistency with poorly-thought-out rules,
but given that rules are close-to-deprecated, I'm not very concerned
about sanding down rough edges in that case.

(But if you feel like writing a documentation patch, please do, because
we'd not be likely to back-patch a behavioral change like this even
if we get around to making it.)

regards, tom lane



Re: PGAdmin4 debugger - unable to call ltree functions

2018-04-26 Thread Tom Lane
"Ian Bell"  writes:
> I just tried your suggestion and it generates the same error.  For
> verification, here is the function run in the debugger.

I just realized that "syntax error at position N" is *not* the exact
spelling you would get from a syntax error in PG's main parser, which
I think is what we were all assuming this'd trace down to somehow.

What it is is the (very poorly chosen) message you get if either
ltree_in() or lquery_in() doesn't like what it's been handed.
So basically, the problem here is that TestLtree() is being called
with a string that isn't a valid ltree representation.  You haven't
shown us what you are doing on that end, but for example I can
duplicate the error message with

# select text2ltree(' ');
ERROR:  syntax error at position 0

regards, tom lane



Re: PGAdmin4 debugger - unable to call ltree functions

2018-04-26 Thread Tom Lane
"Ian Bell"  writes:
> My function TestLtree() is being executed in the PGAdmin debugger.   The
> debugger requires I define the argument being passed, I set this to 'a.b.c'
> and I have confirmed in the debugger that this is indeed what is passed into
> the function.

> I do not understand what you mean by "You haven't shown us what you are
> doing on that end".   I am only using the debugger to test a simple function
> - nothing more.

Well, I'd say that *something* is calling that function with an argument
different from what you expect.  It's entirely possible that this is a
bug in PGAdmin's debugger.  Again, you'd be best off discussing it on the
pgadmin lists rather than here.

regards, tom lane



Re: Locks analysis after-the-fact

2018-04-27 Thread Tom Lane
Olleg Samoylov  writes:
> On 2018-04-27 10:55, Olivier Macchioni wrote:
>> Does anyone have an idea on how to process in such a case?

> Log statments too. :)

Another idea is to get the app to set application_name differently for
each session (and be sure to include %a in log_line_prefix).  It might
not be practical in this particular case, but it's a good tool to keep
in mind.

    regards, tom lane



Re: Query function arg data types ONLY (no arg names)

2018-04-28 Thread Tom Lane
Jeremy Finzel  writes:
> It appears that neither pg_get_function_arguments
> nor pg_get_function_identity_arguments could be used for this.  I want to
> get function argument data types from the catalog by ordinal position,
> without the argument name.

Something involving pg_proc.proargtypes::regtype[] might be useful.
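
For instance, to get just the input argument types by ordinal position
(a sketch; 'substr' stands in for whatever function you're after):

    select p.oid::regprocedure as func, a.ord, a.argtype
      from pg_proc p,
           unnest(p.proargtypes::regtype[]) with ordinality as a(argtype, ord)
     where p.proname = 'substr'
     order by p.oid, a.ord;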

    regards, tom lane



Re: pg_stat_statements : how to catch non successfully finished statements ?

2018-04-28 Thread Tom Lane
legrand legrand  writes:
> So the only solution is to had queryId to ErrorData in this hook
> or create a new hook fired on ERROR and containing queryId ?

I see no particular need for a new hook.  What's needed here is for
pgss_ExecutorRun (and maybe some other existing functions in
pg_stat_statements) to go ahead and record the statement when they catch
an error thrown out of standard_ExecutorRun, rather than just updating
the module's nested_level variable and re-throwing.

The hard part here is that you have to be really careful what you do in
a PG_CATCH block, because the only thing you know for sure about the
backend's state is that it's not good.  Catalog fetches are right out,
and anything that might itself throw an error had best be avoided as
well.  (Which, among other things, means that examining executor state
would be a bad idea, and I'm not even sure you'd want to traverse the plan
tree.)

I'm not convinced that it's practical for pg_stat_statements to make a new
shared hashtable entry under those constraints.  But figuring out how to
minimize the risks around that is the stumbling block, not lack of a hook.

regards, tom lane



Re: Surprised by index choice for count(*)

2018-05-01 Thread Tom Lane
Rob Sargent  writes:
> Should I be?  I would have thought the pk would have been chosen v. 
> function index?

If I'm reading this correctly, the PK index contains uuids while
the fpv index contains float4s, meaning the latter is probably half
the size.  So scanning it is a lot cheaper, at least according to
the planner's cost model.

        regards, tom lane



Re: extension dependencies with 'requires'

2018-05-01 Thread Tom Lane
Eric Hanson  writes:
> I'm trying to author an extension and running into troubles with
> dependencies.  The extension depends on the uuid-ossp, pgcrypto and
> postgres_fdw extensions, but I would like the dependencies to be installed
> in the public schema, though the extension itself lives in its own schema.
> Is there a way to use CREATE EXTENSION ... CASCADE and specify in the
> control file which schema the dependencies are installed in?

Afraid not.  CASCADE will pass down the same target-schema option to the
child CREATE EXTENSION operations that appeared in the initial command.

        regards, tom lane



Re: Query planner riddle (array-related?)

2018-05-04 Thread Tom Lane
Markus  writes:
> I'm running Postgresql 9.6.7 on Debian stretch -- and I'm trying to
> understand a query plan, with any hint where to gain further insight
> welcome.

Well, you say

>   select count(*) from gaia.dr2light where parallax>50;
> gives 5400 rows in no time.

but the planner thinks there are 12991627 such rows:

>  ->  Bitmap Heap Scan on dr2light  (cost=243173.69..25288015.74 
> rows=12991627 width=132)
>Recheck Cond: (parallax > '50'::double precision)
>->  Bitmap Index Scan on dr2light_parallax  
> (cost=0.00..239925.78 rows=12991627 width=0)
>  Index Cond: (parallax > '50'::double precision)

So my first instinct would be to try to get that estimate more in
line with reality.  Maybe you need to increase the statistics target
for that column.

Also, this sort of thing is usually much easier to diagnose from
EXPLAIN ANALYZE output.  All we can see from these queries is that
the planner picked what it thought was the lowest-cost plan.  Without
actual rowcounts it's very hard to guess why the estimates were wrong.
You happened to provide one actual-rowcount number that maybe was
enough to diagnose the issue; but if the above doesn't do the trick,
we're going to need to see EXPLAIN ANALYZE to guess what else is up.

regards, tom lane




>->  Hash  (cost=118285.38..118285.38 rows=551038 width=1523)
>  ->  Seq Scan on dr2epochflux  (cost=0.00..118285.38 rows=551038 
> width=1523)


> And here's the bad plan (for query 1):

> --
>  Nested Loop  (cost=0.58..4801856.96 rows=4229 width=1647)
>->  Seq Scan on dr2epochflux  (cost=0.00..118285.38 rows=551038 width=1523)
>->  Index Scan using dr2light_pkey on dr2light  (cost=0.58..8.49 rows=1 
> width=132)
>  Index Cond: (source_id = dr2epochflux.source_id)
>  Filter: (parallax > '50'::double precision)

> If I enable_seqscan=0, it comes up with this for query 1:

> -
>  Nested Loop  (cost=1.00..4810726.18 rows=4229 width=1647)
>->  Index Scan using dr2epochflux_pkey on dr2epochflux  
> (cost=0.42..127154.60 rows=551038 width=1523)
>->  Index Scan using dr2light_pkey on dr2light  (cost=0.58..8.49 rows=1 
> width=132)
>  Index Cond: (source_id = dr2epochflux.source_id)
>  Filter: (parallax > '50'::double precision)

> -- which in reality appears to be a good deal faster than the "bad"
> plan, though still much, much slower than the "good plan".

> Both tables are ANALYZE-d, and they should be reasonably VACUUMED.

> Is there anything I can do to make it easier for the planner to see the
> light?

>-- Markus




Re: Known Bugs on Postgres 9.5

2018-05-04 Thread Tom Lane
Anudeep Gudipelli  writes:
>  I would like to know the known bugs for v9.5 and also v9.6, is there any 
> place where I can check?

See release notes at

https://www.postgresql.org/docs/current/static/release.html

regards, tom lane



Re: Query planner riddle (array-related?)

2018-05-07 Thread Tom Lane
Markus  writes:
> Ah... yeah, the parallax distribution is fairly sharply peaked around
> 0, so >50 might be severely off.
> So, I've run
>   alter table gaia.dr2light alter parallax set statistics 1000;
>   analyze gaia.dr2light;
> With this, the query plans converge to trivial variations of

>  Hash Join  (cost=253856.23..4775113.84 rows=422 width=1647) (actual 
> time=1967.095..2733.109 rows=18 loops=1)
>Hash Cond: (dr2light.source_id = dr2epochflux.source_id)
>->  Bitmap Heap Scan on dr2light  (cost=24286.88..4385601.28 rows=1297329 
> width=132) (actual time=3.113..19.346 rows=5400 loops=1)
>->  Hash  (cost=118285.38..118285.38 rows=551038 width=1523) (actual 
> time=1885.177..1885.177 rows=550737 loops=1)

> While that's a reasonable plan and fast enough, I'd still like to
> keep the box from seqscanning dr2epochflux with its large arrays and
> use that table's index on source_id.  If I read the query plan right,
> part of the problem is that it still massively overestimates the
> result of parallax>50 (1297329 rather than 5400).  Is there anything
> I can do to improve that estimate?

Raise the parallax stats target still higher, perhaps.  I think we let you
set it as high as 10000.
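
That is, along the lines of

    alter table gaia.dr2light alter parallax set statistics 10000;
    analyze gaia.dr2light;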

> But even with that suboptimal estimate, postgres, under the
> assumption of something not too far from a uniform distribution on
> source_id, should end up estimating the cardinality of the end result
> as something like
> (selectivity on dr2light)*(cardinality of dr2epochflux),
> and hence roughly (1297329/1.6e9*5e5)=405 rows to be drawn from
> dr2epochflux.  It would seem a lot smarter to just pull these few 1e2
> rows using the source_id index on dr2epochflux than seqscanning that
> table, no?

No.  Given the above estimates, it would have to do 1297329 index lookups
in dr2epochflux, which is not going to be a win compared to 1297329 probes
into an in-memory hash table.  Even with a dead-accurate estimate of 5400
dr2light rows to be joined, I don't think an inner indexscan is
necessarily a better plan than a hash.  It's the number of probes that
matter, not the number of successful probes.

(It's not clear to me why so few of the dr2light rows have join partners,
but the planner does seem to understand that most of them don't.)

You say you're worried about "large arrays" in dr2epochflux; but if they
are large enough to be toasted out-of-line, it's really a nonissue.  Only
the toast pointers would be read during the seqscan or stored in the hash.

regards, tom lane



Re: Run external command as part of an sql statement ?

2018-05-07 Thread Tom Lane
"David G. Johnston"  writes:
> On Mon, May 7, 2018 at 2:35 PM, David Gauthier 
> wrote:
>> I want to be able to run a shell command like this from within a stored
>> procedure.  Is there a way to do this ?

> In core, you can probably use the untrusted version of Perl, Python, or
> Tcl to accomplish your goal.  SQL and pl/pgSQL do not provide that
> capability.

Depending on what you want to do, COPY TO/FROM PROGRAM might be a
serviceable option.  But, just like the untrusted-PL variants, you
need to be superuser.  Keep in mind that the program will run as
the database server owner (which is the reason for the superuser
restriction).
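
For instance (a rough sketch, assuming a Unix-like server; the paths and
commands here are made up):

    -- capture a program's output into a table
    create temp table dir_listing (line text);
    copy dir_listing from program 'ls -1 /var/lib';

    -- or feed query output to a program
    copy (select now()) to program 'cat >> /tmp/pg_copy_demo.txt';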

regards, tom lane



Re: How to manipulate tuples in C functions?

2018-05-07 Thread Tom Lane
"=?ISO-8859-1?B?YQ==?=" <372660...@qq.com> writes:
> As far as I know, composite type and rows (trigger functions) are passed 
> between PostgreSQL and C functions as tuple. However, I rarely find things 
> relating on how to manipulate a tuple under C from documentation. 

A lot of low-level stuff like that is only really documented in the server
source code.  You shouldn't hesitate to look around in the server source
and crib from functions that do something related to what you want.

> So if anyone can tell me how to answer the above question? Or simply give an 
> example of iterator that go through all entries of a tuple? Thank you so 
> much!!

I'd suggest record_out() in src/backend/utils/adt/rowtypes.c as a
prototypical example of disassembling an arbitrary tuple passed as
a composite-type argument.  (Note that the environment for trigger
functions is a bit different, mostly for historical reasons.)

regards, tom lane



Re: Enhancement to psql command, feedback.

2018-05-09 Thread Tom Lane
Stephen Frost  writes:
> Greetings,
> * John McKown (john.archie.mck...@gmail.com) wrote:
>> Again, this is just a discussion point. And I'm quite willing to admit
>> defeat if most people don't think that it is worth the effort.

> For my 2c, at least, I do think it'd be kind of neat to have, but we'd
> need a fool-proof way to realize that's how we're being called and,
> ideally, that would be something we could detect without having to have
> special flags for psql which anyone writing such a script would have to
> be aware of.
> Do you know if there's a way to detect that we're being called this
> way..?

Actually, I'd say that's exactly what *not* to do.  It's generally
important that a script act the same whether or not it was invoked
with a shortcut.  For instance, just because you had one of these
magic lines at the top, you'd not want it to not work if called
via \include.

So my take on it is that this is a request to ignore the first line
if it starts with "#!" (and yes, I'd insist on checking both characters).
I do not see that as noticeably more dangerous than the existing kluge
to ignore a UTF BOM character at the start of the file.

The concerns about whether psql would get invoked with a desirable
set of options if you tried to do this seem more worrisome, but if
that does work out usefully, I think this is a reasonable proposal.

regards, tom lane



Re: Domain based on TIMEZONE WITH TIME ZONE

2018-05-10 Thread Tom Lane
Ben Hood  writes:
> So the question is not how does the timestamp get stored, rather, is it an 
> anti-pattern to use Postgres as a linter for apps that forget to use UTC 
> exclusively?

Well, using a domain to enforce additional constraints on a field's value
is certainly not an anti-pattern in itself.  But you have to realize that
the processing consists of first creating a value of the base type and
then applying the constraint expressions of the domain to it.  This means
you cannot check any details that are lost in the input conversion,
because you don't have access to the original input string, only the
stored value.

As others have explained, Postgres' TIMESTAMP WITH TIME ZONE type doesn't
preserve the input's timezone specification (if any) but forcibly rotates
to UTC and stores just a scalar UTC value.  So you can't use a domain to
check anything about whether the input had a timezone field and if so what
it was.
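
A quick illustration of that information loss:

    select '2018-05-10 12:00:00+02'::timestamptz
         = '2018-05-10 10:00:00+00'::timestamptz;   -- true: same stored instant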

(This behavior is nonstandard --- the SQL spec evidently expects the
timezone to be stored explicitly in some fashion --- but I don't foresee
us changing it; we've accumulated too much backwards-compatibility
baggage now.)

If you're sufficiently intent on having checking of that sort, you could
invent your own datatype with your own input function, and then make it
binary-compatible with timestamptz so that you don't need to provide much
else besides the I/O functions.  varchar(n) has the same sort of
relationship with text, so there's precedent ...

regards, tom lane



Re: Is there any C functions that convert the entry to string?

2018-05-10 Thread Tom Lane
"=?ISO-8859-1?B?YQ==?=" <372660...@qq.com> writes:
> Saying that I am writing a C function that may facing varies of types. Is 
> there a postgresql function that will automatically call the relative 
> to_string function and transform them to string??

You might look at the format() function (a/k/a text_format).

    regards, tom lane



Re: Selecting strict, immutable text for a composite type.

2018-05-10 Thread Tom Lane
Steven Lembark  writes:
> Q: What is the syntax for a strict, immutable function in 
>SQL that returns text suitable for use with either 
>creating a "select *" view or a GIST index?

Your example works just fine for me ...

regression=# select lat_lng_text('(42,54.5)');
 lat_lng_text 
--
 42-54.5
(1 row)

Maybe you should show a more concrete example of what's not working.

Side comment: I think you need to rethink that text representation,
because it'll be at best ugly with a negative longitude.  Is there a
reason not to just use the default record representation (with parens
and a comma)?

    regards, tom lane



Re: Selecting strict, immutable text for a composite type.

2018-05-10 Thread Tom Lane
[ please keep the list cc'd ]

Steven Lembark  writes:
> On Thu, 10 May 2018 11:52:48 -0400
> Tom Lane  wrote:
>> Maybe you should show a more concrete example of what's not working.

> The problem is with gists telling me that they cannot index
> the type. This works for enums, just not the composite type.

Oh, well, they can't.  There's no GiST opclass covering arbitrary
composite types.  This doesn't seem very surprising to me given
the lack of operators that such an opclass might accelerate.

What are you expecting an index on such a column to do for you?
If you just want a uniqueness constraint, plain btree can handle it.

regards, tom lane



Re: Selecting strict, immutable text for a composite type.

2018-05-10 Thread Tom Lane
Steven Lembark  writes:
> On Thu, 10 May 2018 14:41:26 -0400
> Tom Lane  wrote:
>> Steven Lembark  writes:
>>> The problem is with gists telling me that they cannot index
>>> the type. This works for enums, just not the composite type.  

>> Oh, well, they can't.  There's no GiST opclass covering arbitrary
>> composite types.  This doesn't seem very surprising to me given
>> the lack of operators that such an opclass might accelerate.

> But I thought that they could include functions of composite
> types that were indexable (e.g., text)?

Yeah, but that's not what you did.

I think you could make that work with

exclude using gist (
  lat_lng_text(location) with =,
  effective with &&
)

but it's not going to apply the function without you telling it to.

regards, tom lane



Re: How to migrate database from 10.1 to 9.1

2018-05-13 Thread Tom Lane
"Andrus"  writes:
> Database is created in
> "PostgreSQL 10.1 on x86_64-pc-mingw64, compiled by gcc.exe (Rev5, Built 
> by MSYS2 project) 4.9.2, 64-bit"
> and contains tables with autogenerated primary keys like
> ...
> if this database is restored to Postgres 9.1 from custom backup using
> ...
> automatic primary key generation stops working.

This is a bit unfortunate, because it's not obvious to the naked eye that
a SERIAL column (I assume you actually defined the column as SERIAL?) is
using any new-in-v10 syntax.  But it is.  If you'd looked at the error
output you'd have noticed that the CREATE SEQUENCE command is getting a
syntax error:

psql:testv10.dump:58: ERROR:  syntax error at or near "AS"
LINE 2: AS integer
^

You'd need to edit the dump output to remove the "AS integer" clause
in order to get this dump to load into a pre-v10 server.
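
Roughly, the offending part of the dump looks like this (sequence name
made up); it's the "AS integer" line that has to go:

    CREATE SEQUENCE public.mytable_id_seq
        AS integer
        START WITH 1
        INCREMENT BY 1
        NO MINVALUE
        NO MAXVALUE
        CACHE 1;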

Alternatively, I believe creating the column as BIGSERIAL would result in
dump files that would reload into older servers without adjustments.

regards, tom lane



Re: Rapid disk usage spikes when updating large tables with GIN indexes

2018-05-14 Thread Tom Lane
Jonathan Marks  writes:
> One recurring, and predictable, issue that we have experienced regularly for 
> multiple years is that inserting or updating rows in any table with GIN 
> indexes results in extremely large drops in free disk space — i.e. inserting 
> 10k rows with a total size of 10GB can result in the temporary loss of 
> several hundred gigabytes of free disk space over 2-3 hours (and it could be 
> more — we try to keep a 10-15% buffer of free disk space so that often 
> represents almost all available disk space). Once we stop the operation, free 
> disk space rapidly recovers, which makes us believe that this occurs due to 
> logs, or some kind of temporary table. Our work_mem and maintenance_work_mem 
> settings are pretty large (12GB and 62GB, respectively). The database’s size 
> on disk scarcely budges during this process.

I'm not following exactly what you mean by "the database’s size on
disk scarcely budges" --- how does that square with the free disk space
dropping?  (IOW, what are you measuring?)

If you're not including WAL space in the "database size", then perhaps
a plausible theory is that the space consumption comes from a burst of
WAL output, and that the space is freed after the WAL has been dumped
off to secondary servers or archived or whatever you do with it.
If you do none of those things, it'd reduce to being an issue of how
long till the next checkpoint.

Assuming this theory is accurate, probably your use of fastupdate = off
is a contributing factor, as that causes a lot more "churn" in the
internals of the GIN indexes during updates, and correspondingly more
WAL output to log the changes.  But you probably don't want to revisit
that decision if you're happy with performance otherwise.

If you are archiving or streaming WAL, then probably what you want to
look at is getting rid of bottlenecks in that, so that it can keep up
with these WAL-generation spikes better.

If you're not, the only suggestion I can think of is to try twiddling
your checkpoint parameters to alleviate the space spikes.  Reducing
the checkpoint interval would do that, but you have to be very wary
of going too far; a short checkpoint interval results in more full-page
images being emitted to WAL and thus can actually increase your WAL
space consumption.  Depending on what parameters you're using now,
maybe even an increase would be better.

regards, tom lane



Re: Rapid disk usage spikes when updating large tables with GIN indexes

2018-05-14 Thread Tom Lane
[ please keep the list cc'd ]

Jonathan Marks  writes:
> Thanks for your quick reply. Here’s a bit more information:
> 1) to measure the “size of the database” we run something like `select 
> datname, pg_size_pretty(pg_database_size(datname)) from pg_database;` I’m not 
> sure if this includes WAL size.
> 2) I’ve tried measuring WAL size with `select sum(size) from pg_ls_waldir();` 
> — this also doesn’t budge.
> 3) Our current checkpoint_timeout is 600s with a checkpoint_completion_target 
> of 0.9 — what does that suggest?

Hmph.  Your WAL-size query seems on point, and that pretty much destroys
my idea about a WAL emission spike.

pg_database_size() should include all regular and temporary tables/indexes
in the named DB.  It doesn't include WAL (but we've eliminated that), nor
cluster-wide tables such as pg_database (but those seem pretty unlikely
to be at issue), nor non-relation temporary files such as sort/hash temp
space.  At this point I think we have to focus our attention on what might
be creating large temp files.  I do not see anything in the GIN index code
that could do that, especially not if you have fastupdate off.  I wonder
whether there is something about the particular bulk-insertion queries
you're using that could result in large temp files --- which'd make the
apparent correlation with GIN index use a mirage, but we're running out
of other ideas.  You could try enabling log_temp_files to see if there's
anything to that.
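
For reference, something like this turns it on cluster-wide; a setting
of 0 logs every temp file regardless of size:

    alter system set log_temp_files = 0;
    select pg_reload_conf();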

In the grasping-at-straws department: are you quite sure that the extra
disk space consumption is PG's to begin with, rather than something
outside the database entirely?

regards, tom lane



Re: 10.4 upgrade, function markings, and template0

2018-05-14 Thread Tom Lane
Dominic Jones  writes:
> The PostgreSQL 10.4 upgrade involves changes to some function markings (see 
> release notes, E.1.2, second and third bullet points for specifics). One way 
> to make these changes is to use `ALTER FUNCTION` to change the functions in 
> your existing databases. While this was mainly straightforward, I'm unclear 
> on whether the `template0` database must be changed in this manner or if it 
> is automatically updated when the upgrade is applied. The documentation 
> indicates that in general you shouldn't manually change the `template0` 
> database.

Yes, you'd need to fix it in template0 as well, or you risk
subsequently-created databases not having the fix.  See previous
minor releases where we've given more painstaking detail about
applying catalog corrections, e.g. 9.6.4:

https://www.postgresql.org/docs/current/static/release-9-6-4.html

I didn't bother with spelling it all out in full detail this time,
which maybe was a mistake, but I felt that probably most users
wouldn't need to bother with these changes at all (unlike the case
where a catalog correction is security-related).
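
For those who do want to fix template0, the recipe from those older
notes boils down to roughly this (a sketch -- double-check it against
the release notes themselves):

    alter database template0 with allow_connections true;
    \c template0
    -- ... apply the ALTER FUNCTION corrections here ...
    \c postgres
    alter database template0 with allow_connections false;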

regards, tom lane



Re: 10.4 upgrade, function markings, and template0

2018-05-14 Thread Tom Lane
Adrian Klaver  writes:
> On 05/14/2018 02:02 PM, Tom Lane wrote:
>> I didn't bother with spelling it all out in full detail this time,
>> which maybe was a mistake, but I felt that probably most users
>> wouldn't need to bother with these changes at all (unlike the case
>> where a catalog correction is security-related).

> Well what is nice about the news release is you can cut and past the 
> entire list of commands and do the updates en masse.

It'd be nice to have some more-automated way of doing this type of
correction.  Ordinary scripting doesn't look very promising, because
I don't see an easy way to deal with the need to connect to every
database in the cluster; that seems to depend on a lot of local
characteristics about usernames and authentication.

Maybe it'd be worth building some sort of infrastructure that would
allow this to be done at a lower level.  It's not hard to imagine
an autovacuum-like or bgworker-based thingy that could run around
and apply a given SQL script in every database, bypassing the usual
worries about authentication and connections-disabled databases.
That seems like a lot of work for a need that only comes up once in
awhile, but perhaps it'd have more applications than just catalog
corrections.

regards, tom lane



Re: Rapid disk usage spikes when updating large tables with GIN indexes

2018-05-16 Thread Tom Lane
Jonathan Marks  writes:
> We turned on log_temp_files and since the last stats reset (about a week ago) 
> we’re seeing 0 temp files altogether (grabbing that info from 
> pg_stat_database).

Hm.

> Another thread we found suggested pg_subtrans — this seems less likely 
> because we’ve been able to replicate this across many different types of 
> connections etc. but thought it might be a potential source.

We're running out of other ideas, so maybe, but it's kind of hard to
credit large numbers of gigabytes going into pg_subtrans et al.
Still, you could easily adapt your WAL-size query to track the sizes
of other DB subdirectories and see if anything springs out.  Since
I'm really feeling a bit baffled at this point, I'd suggest watching
all of them:

pg_commit_ts/
pg_dynshmem/
pg_logical/mappings/
pg_logical/snapshots/
pg_multixact/members/
pg_multixact/offsets/
pg_logical/
pg_multixact/
pg_notify/
pg_replslot/
pg_serial/
pg_snapshots/
pg_stat/
pg_stat_tmp/
pg_subtrans/
pg_tblspc/
pg_twophase/
pg_wal/
pg_xact/
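
One way to watch those from SQL (a sketch -- needs superuser, is not
recursive into subdirectories, and can occasionally error if a file
vanishes mid-scan):

    select dir, sum((pg_stat_file(dir || '/' || f)).size) as bytes
      from unnest(array['pg_wal', 'pg_xact', 'pg_subtrans',
                        'pg_multixact/members', 'pg_multixact/offsets',
                        'pg_replslot', 'pg_logical']) as d(dir),
           lateral pg_ls_dir(dir) as l(f)
     group by dir
     order by bytes desc;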

    regards, tom lane



Re: postgres=# VACUUM FULL pg_statistic => ERROR: missing chunk number 0 for toast value .. in pg_toast_2619

2018-05-19 Thread Tom Lane
Justin Pryzby  writes:
> I'll defer fixing this for awhile in case someone wants me to save a copy of
> the relation/toast/index.  From last time, I recall this just needs the right
> combination of REINDEX/VACUUM/ANALYZE, and the only complication was me
> needing to realize the right combination of affected DB(s).

If you could come up with such a sequence that causes the problem
reproducibly, that would be of huge interest, and probably lead to
a fix promptly.  But I don't think that we can do much by looking
at the post-mortem state --- once the toast rows are gone, they're
gone, especially if the table's been vacuumed since.

        regards, tom lane



Re: postgres=# VACUUM FULL pg_statistic => ERROR: missing chunk number 0 for toast value .. in pg_toast_2619

2018-05-19 Thread Tom Lane
Justin Pryzby  writes:
> [pryzbyj@database ~]$ while :; do for db in `psql postgres -Atc "SELECT 
> datname FROM pg_database WHERE datallowconn"`; do for t in pg_statistic 
> pg_attrdef pg_constraint; do echo "$db.$t..."; 
> PGOPTIONS=-cstatement_timeout='9s' psql $db -qc "VACUUM FULL $t"; done; done; 
> done

> ...
> postgres.pg_statistic...
> postgres.pg_attrdef...
> postgres.pg_constraint...
> template1.pg_statistic...
> template1.pg_attrdef...
> template1.pg_constraint...
> ts.pg_statistic...
> ERROR:  canceling statement due to statement timeout
> ts.pg_attrdef...
> ts.pg_constraint...
> postgres.pg_statistic...
> ERROR:  missing chunk number 0 for toast value 3372855171 in pg_toast_2619

Hm, so was the timeout error happening every time through on that table,
or just occasionally, or did you provoke it somehow?  I'm wondering how
your 9s timeout relates to the expected completion time.

I don't have any test DBs with anywhere near large enough stats to
require 9s to vacuum pg_statistic, but I'm trying this with a
much-reduced value of statement_timeout, and so far no failures ...

regards, tom lane



Re: Aggregate functions with FROM clause and ROW_COUNT diagnostics

2018-05-21 Thread Tom Lane
"David G. Johnston"  writes:
>> Is it possible to obtain the
>> first ROW_COUNT (after SELECT) without performing it twice?

> Not directly, no.  You should execute the inner query to a temporary table
> then perform your counting and json_agg from that.

OP could do something like

declare r record;
...
select json_agg(_) as j, count(*) as c INTO r FROM (
  SELECT foo, bar, baz ...
  FROM t1, t2, t3 WHERE ...) AS _;

This would be slightly more expensive than doing only the one aggregate,
but it should beat anything involving a temp table.

        regards, tom lane



Re: Installing PostgreSQL as non-root by Extracting the binaries from RPMs

2018-05-21 Thread Tom Lane
Thiagarajan Lakshminarayanan  writes:
> Thanks Benjamin.  Actually, we don't use Docker in our environment. We just
> use a standard RHEL OS install. Will there be any issues if I install
> PostgreSQL by extracting the RPM binaries?

What exactly is your purpose here?  Is it just that you want to run a
separate data directory somewhere?  You don't need a separate set of
binaries for that, just point PGDATA to where you want it and run the
server as the owner of that data directory.

(Depending on the details, you might need to teach SELinux to allow
this usage.  I don't remember exactly what cues SELinux to treat a
particular process as being subject to its restrictions, but use of
the system binaries is at least part of it.)

If you want to be able to build and run a server without any root
privileges at all, you might be best off just building from source.
Maybe I'm too used to doing that, but I'm not very clear on what
you gain by extracting the result of somebody else's build -- which,
almost by definition, is not exactly what you want.

    regards, tom lane



Re: Aggregate functions with FROM clause and ROW_COUNT diagnostics

2018-05-21 Thread Tom Lane
Alexey Dokuchaev  writes:
> On Mon, May 21, 2018 at 10:45:27AM -0400, Tom Lane wrote:
>> OP could do something like
>>  select json_agg(_) as j, count(*) as c INTO r FROM (

> Thanks, I've arrived at the same solution (using local RECORD) eventually.
> It works as intended, but I still need to assign OUT parameters by hand:

> retcode := tmp.c;
> result := tmp.j;

Actually, it should work to do

select json_agg(_), count(*) INTO result, retcode FROM ...

        regards, tom lane



Re: Aggregate functions with FROM clause and ROW_COUNT diagnostics

2018-05-21 Thread Tom Lane
Alexey Dokuchaev  writes:
> Quick reality check
> question: are count(*) vs. count(_) equivalent above?

Only if _ is guaranteed non-null ... which, as a rowtype result, it
probably is.  But I'd use count(*) if you're concerned about speed.

        regards, tom lane



Re: source of connection fails at pg startup?

2018-05-22 Thread Tom Lane
Stuart McGraw  writes:
> When I start my postgresql server I get 11 messages reporting that "password
> authentication failed for user 'postgres'" spaced about ~.5sec apart.

Sounds like the trace of something probing the postmaster to see if it's
ready yet.  Pre-v10 versions of pg_ctl did exactly that, but with a
1-second wait interval, so this couldn't be pg_ctl itself (even if you
hadn't specified this is v10).

> This is on an Ubuntu-18.04 machine with postgresql-10.3 from Ubuntu.  As 
> distributed
> the pg_hba.conf line mentioned used "peer" authentication method, I have 
> changed to
> "md5".  When I change back to "peer" the error messages go away.

In that case, whatever is doing it is running as the postgres user.

Putting all this together, I'd bet on the connections coming from an
Ubuntu-specific startup script.  Poke around in their PG start script
for something like a pg_isready call in a loop with an 0.5 second wait.

I imagine that undoing that would be rather difficult, even if you
wanted to run with a locally-modified script.  They probably had a
reason why they didn't want to leave it to pg_ctl to do the waiting.

Personally, my recommendation would be to go back to "peer" auth,
at least for local connections by postgres.  There is no reason
to think that passwords are a more secure approach: password
management is a hard problem, especially for automated connections
like these.

regards, tom lane



Re: pg_multixact/members growing

2018-05-22 Thread Tom Lane
Tiffany Thang  writes:
> Our pg_multixact/members directory has been growing to more than 18GB over
> the last couple of months. According to the documentation, the files in
> there are used to support row locking by multiple transactions and when all
> tables in all databases are eventually scanned by VACUUM, the older
> multixacts are removed. In our case, the files are not removed.

Hmm.  What does pg_controldata tell you about NextMultiXactId,
NextMultiOffset, oldestMultiXid, oldestMulti's DB?
Are pg_clog/ or pg_subtrans/ or pg_multixact/offsets/ getting large?
Is there anything at all in pg_twophase/?  Is this system a replication
master, and if so are any of its slaves lagging behind?

> Any
> suggestions what I should do to purge the files automatically? Can old
> files since the last reboot be manually removed?

I wouldn't do that.  Much safer to figure out what's blocking automatic
cleanup so you can fix the root cause.

    regards, tom lane



Re: pg_multixact/members growing

2018-05-23 Thread Tom Lane
Tiffany Thang  writes:
> Where do I find pg_controldata? I could not locate it on the file system.

Hmm, should be one of the installed PG executables.

> pg_clog/ or pg_subtrans/ or pg_multixact/offsets/  are getting larger too
> but by only a few hundreds MBs.

This is consistent with the idea that something is holding back the
oldest-transaction horizon.  (However, I'd think you'd also be having
serious problems with table bloat if that were true, so it's a bit
mystifying.)  Did you check for unclosed prepared transactions?

select * from pg_prepared_xacts;

        regards, tom lane



Re: Parameter placeholders, $n vs ?

2018-05-24 Thread Tom Lane
"David G. Johnston"  writes:
> On Thursday, May 24, 2018, Lele Gaifax  wrote:
>> So the questions: is the '?' style placeholder a supported variant? and
>> if so, should the ParamRef doc tell something about that?

> PostgreSQL's Prepare statement doesn't accept question mark as a parameter
> symbol, and cannot ever because it is already a valid operator symbol.

To enlarge on that a bit:

* PG's core parser certainly does not accept ? as a parameter symbol.
I speculate that you fed the input through some frontend that converts
? to $n (JDBC, perhaps)?

* The only thing in the core code that would print a ParamRef in any
sort of symbolic form is _outParamRef, and it's easily seen by inspection
to be incapable of omitting the "number" field ... not to mention that
it doesn't emit the JSON-esque representation you're showing us.  So
that behavior must also be due to some non-core pretty-printing code you
haven't identified to us.  No idea what rules that might have for deciding
to omit "number".
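
For comparison, the form the core parser does accept -- a trivial
sketch with a made-up table:

    prepare get_user (int) as
      select * from users where id = $1;
    execute get_user (42);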

regards, tom lane



Re: Syndicating PostgreSQL mailing list to Discourse

2018-05-25 Thread Tom Lane
Adrian Klaver  writes:
> On 05/25/2018 01:03 PM, Erlend Sogge Heggen wrote:
>> If our mirror archive gets enough traction we'd like to make it possible 
>> to sign up to the forum and seamlessly interact with the mailing list, 

> Well 'seamlessly works' would be the important part. In the past when 
> this was tried(can't remember the who) the posts lagged noticeably and 
> content went missing.

TBH, the existing services that try to do that provide a remarkably
unfriendly experience on this side, and haven't shown much interest in
improving that (I'm thinking of Nabble in particular).  So my initial
reaction is "thanks but no thanks, we don't need another of those".
But maybe you can do it a lot better than they have.

regards, tom lane



Re: How to drop a value from an ENUM?

2018-05-29 Thread Tom Lane
Torsten Förtsch  writes:
> I am absolutely sure a certain value of one of my ENUM types is not used in
> the entire database. Now I am asked to drop that value. Unfortunately,
> there is no ALTER TYPE DROP VALUE.

Yup.

> On my development box I tried
> delete from pg_enum
>  where enumtypid='my_type_name'::regtype
>and enumlabel='my_label'
> It worked and I could not find any adverse effects.
> Given the value is not used anywhere, is this a save command?

No.  If it were, we'd have an ALTER DROP VALUE command.

The key problem that is hard to fix here is that, even if today you have
no live rows containing that value, it may still be present in indexes.
In a btree, for example, the value might've migrated up into upper index
pages as a page boundary value.  Once that's happened, it's likely to
persist indefinitely, even if the live occurrences in the underlying table
get deleted and vacuumed away.

Now, even if that's happened, you might be okay, because of the
optimizations that typically allow enum value comparisons to be done
without consulting pg_enum.  But if you're in the habit of altering
enums, it's that much more likely that you would have done an ALTER TYPE
that defeats those optimizations; so I wouldn't rely on this.  Sooner
or later you're going to get burnt by complaints about an invalid enum
value (not sure of the exact wording) when you access certain parts
of the index.

You could maybe get around all of that by reindexing any indexes
containing the altered enum type after you're certain that all
entries of the unwanted enum value are dead and vacuumed away.
But it's not terribly safe.
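
If you do go that route anyway, the rebuild step would look something
like this (names are hypothetical):

    -- only after all rows using the unwanted label are gone and vacuumed away
    vacuum my_table;
    reindex index my_table_my_enum_col_idx;
    -- or, more bluntly: reindex table my_table;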

regards, tom lane



Re: Insert UUID GEN 4 Value

2018-05-31 Thread Tom Lane
Adrian Klaver  writes:
> My suspicion is it had to do with this from a post upstream:
> "When I tried it, I am getting an error: Invalid input syntax for UUID: 
> uuid_generate_v4(),"

Yeah.  We haven't seen the actual query yet, but I'm betting the OP
really wrote

INSERT ... , 'uuid_generate_v4()', ...

not

INSERT ... , uuid_generate_v4(), ...

There might be some layers of client-side parameter mangling obscuring
what's actually happening.

regards, tom lane



Re: RPM Packaging Question - Fedora 28 & Postgis

2018-06-01 Thread Tom Lane
John Woltman  writes:
>  I'm attempting to install PostGIS2 (specifically postgis2_95 for Postgres
> 9.5) on Fedora 28.  When I run ``sudo dnf install postgis23_95`` I get the
> the following the error:

> Error:
>   Problem: conflicting requests
>  - nothing provides geos36 >= 3.6.2 needed by postgis23_95

> Fedora 28 provides GEOS 3.6.1.  Is my issue something that I should bring
> up to the Fedora maintainers or to whoever's in charge of the Postgres Yum
> repos?

The latter; specifically, whoever produced the allegedly-F28-compatible
postgis RPM you're trying to install.

regards, tom lane



Re: Code of Conduct plan

2018-06-03 Thread Tom Lane
Two years ago, there was considerable discussion about creating a
Code of Conduct for the Postgres community, as a result of which
the core team announced a plan to create an exploration committee
to draft a CoC [1].  That process has taken far longer than expected,
but the committee has not been idle.  They worked through many comments
and many drafts to produce a version that seems acceptable in the view
of the core team.  This final(?) draft can be found at

https://wiki.postgresql.org/wiki/Code_of_Conduct

We are now asking for a final round of community comments.
Please send any public comments to the pgsql-general list (only).
If you wish to make a private comment, you may send it to
c...@postgresql.org.

The initial membership of the CoC committee will be announced separately,
but shortly.

Unless there are substantial objections, or nontrivial changes as a result
of this round of comments, we anticipate making the CoC official as of
July 1 2018.

regards, tom lane

[1] https://www.postgresql.org/message-id/56a8516b.8000...@agliodbs.com



Re: Code of Conduct plan

2018-06-03 Thread Tom Lane
Benjamin Scherrey  writes:
> Is there some archive of the discussion that brought on this effort and the
> considerations of the committee itself? I wish I had seen the earlier
> announcements in 2016 as I would have definitely participated.

If you poke around in our mailing list archives for early 2016 (Jan/Feb),
you'll find a number of threads about it.  Mostly on the -general list,
IIRC.

> Another more specific factual question - have there been incidents within
> the active Postgresql community where behaviour by individuals who are
> participants in the community have conducted themselves in a manner that
> brought on the actual need for such a code of conduct to exist in the first
> place?

I believe there were a couple of unfortunate incidents at conferences.
Now, conferences are generally expected to have their own CoCs and enforce
them themselves; this CoC is meant more to cover on-line interactions.
You could argue that we shouldn't create such a CoC until something bad
happens on-line; but I'd prefer to think that having a CoC might prevent
that from ever happening at all, which is surely better.

In any case, we went over all these sorts of arguments at excruciating
length in 2016.  It's quite clear to the core team that a majority of
the community wants a CoC.  I don't think any useful purpose will be
served by re-litigating that point.

regards, tom lane



Re: Code of Conduct plan

2018-06-04 Thread Tom Lane
Benjamin Scherrey  writes:
> I did go back and read through the 2016 content rather thoroughly. But
> where has all the discussion been going on for the last two years?

It's been private, mostly either (a) the exploration committee responding
to comments that were received at PGCon 2016 [1] or privately, or (b) the
core team arguing among ourselves whether we were prepared to support the
draft yet.  I'm embarrassed to admit that a whole lot of the delay has
been due to (b).  Core did finally resolve our differences in in-person
meetings at PGCon 2018, which is why you're seeing this now rather than
some other time.

Anyway, the core discussions certainly aren't going to be made public,
and I doubt that Stacey has any intention of publishing the exploration
committee's private mail either.  If you compare the current draft to
what was available in 2016, I don't think you'll find any differences
that are so substantive as to require public defense.  We tried to make
the wording simpler and less intimidating, but that's about it.

regards, tom lane

[1] https://wiki.postgresql.org/wiki/Coc_qa_pgcon2016



Re: Microsoft buys GitHub, is this a threat to open-source

2018-06-04 Thread Tom Lane
Michael Nolan  writes:
> Microsoft has bought GitHub for $7.5 billion, is this a threat to the open
> source community?

A fair question, but one that seems entirely off-topic for the Postgres
lists, since we don't depend on GitHub.  (And that's a thing for which
I'm very glad today.)

    regards, tom lane



Re: How to get postmaster shut down time in postgres?

2018-06-04 Thread Tom Lane
"David G. Johnston"  writes:
> On Mon, Jun 4, 2018 at 6:44 AM, pavan95  wrote:
>> The main reason for my requirement is to find the time swing between server
>> stop and start.

> Not all server stops are logged/evented (i.e., crashes), though by
> definition all successful starts are (or at least can be).

Yeah, the server-crash case is the one that makes this requirement
hard to solve internally to the server.  In a normal ("smart")
shutdown, there might be a case for letting something happen just
before we begin the final shutdown sequence, but there's no such
hook at present.  Anyway, that still leaves a definitional question
--- are you looking for the time that shutdown begins, or when it's
done?  By definition, the latter is too late to make a table entry.

One thing you might consider is running pg_controldata just before you
start the server, and noting its report of "pg_control last modified".
That would either be the shutdown-done point in a non-crash case, or
(probably) the last checkpoint time in a crash case.  As others mentioned,
checking for the time of the last message in the postmaster log would
likely provide a more accurate result, but it's also a lot more
complicated.

regards, tom lane



Re: Code of Conduct plan

2018-06-04 Thread Tom Lane
"Joshua D. Drake"  writes:
> On 06/03/2018 11:29 AM, Tom Lane wrote:
>> We are now asking for a final round of community comments.

> Thanks for all the efforts on this. It is nice to see us explicitly 
> moving toward modernizing our community policies and creating an openly 
> inclusive community. There are a couple of notes I have about this:

> I think we need language that explicitly states that this is about 
> participation within postgresql.org only. It is not postgresql.org's 
> mission or purpose to police actions outside of their domain.

Actually, it's intentional that we are not saying that.  The idea is
that any interaction between PG community members is subject to the CoC,
whether it takes place in postgresql.org infrastructure or not, so long as
there is not another CoC that takes precedence (such as a conference's
CoC).  The reason for this is an unfortunate situation that took place in
the FreeBSD community awhile back [1], wherein one community member was
abusing another via Twitter, and their existing CoC failed to cover that
because it had been explicitly written to cover only community-run forums.
So we're trying to learn from that mistake, and make sure that if such a
situation ever came up here, the CoC committee would have authority to
act.

IIRC, the earliest drafts did have language about like what you suggest
here, but we took it out after the FreeBSD case was pointed out.

> There is no language that protects different political or social views. 
> In today's climate it is important especially as we are a worldwide 
> project. Something simple like the following should be enough:
> "Examples of personal characteristics include, but are not limited to 
> age, race, national origin or ancestry, religion, political affiliation, 
> social class, gender, or sexual orientation."

We've gone back and forth on how long the "examples of personal
characteristics" list ought to be; it was longer before, and some folks
didn't like that.  (For onlookers who don't feel like checking the current
draft, JD has added "political affiliation" and "social class" to the
present text.  The May 2016 draft had seventeen entries and was
undoubtedly way too far in the other direction.)  In the end, since it's
just examples anyway, I'm inclined to keep it short.  We can and will
tweak the text in future if actual problems arise and somebody argues
that their hurtful conduct wasn't proscribed.

In the end, whether reasonable things happen is going to depend on
the reasonableness of the CoC committee members.  That's part of the
reason that we've set it up so that that committee is distinct from,
but answerable to, the core team.  Core will take action if the CoC
committee is seen to be getting out of hand --- though I think that
that's very unlikely to happen.

regards, tom lane

[1] 
https://www.postgresql.org/message-id/ca59563a-a97b-4ffc-a414-9888392f5...@justatheory.com
(The linked-to discussion unfortunately seems to be 404 now, so I'm
relying on David's summary.)



Re: Code of Conduct plan

2018-06-04 Thread Tom Lane
Alvaro Herrera  writes:
> On 2018-Jun-05, Gavin Flower wrote:
>> If we are all adults, then we don't need a CoC.

> "We're all adults" is wishful thinking.  Some old people are just kids
> who aged but didn't actually mature.

I'm sure we'd all be ecstatic if the CoC committee never actually has
anything to do.  The point of this exercise is to make new people ---
particularly women and others who have often felt disadvantaged in
technical communities --- feel safe and welcome here.

Also: we *have* had cases where women who had been contributors left
because of harassment, and I'd like to ensure that doesn't happen again.

regards, tom lane


