Is pg_restore in 10.6 working?

2018-11-12 Thread David
I have some experience with different versions of Postgres, but I'm just
getting around to using pg_restore, and it's not working for me at all.

I can connect with psql, and pg_dump works, but pg_restore never does
anything when I call it.  It never even prompts for a password.

Here is my pg_hba.conf:
# "local" is for Unix domain socket connections only
local   all all ident
map=super
# IPv4 local connections:
hostall all 0.0.0.0/0   md5
# IPv6 local connections:
hostall all ::1/128 ident
# Allow replication connections from localhost, by a user with the
# replication privilege.
local   replication all peer
hostreplication all 127.0.0.1/32ident
hostreplication all ::1/128 ident

And my pg_ident.conf:
# MAPNAME       SYSTEM-USERNAME         PG-USERNAME
super           ec2-user                postgres

I can connect with psql either of these two ways:
psql -U postgres
or
psql -h ip-172-31-62-127.ec2.internal -p 5432 -U postgres -W postgres
(Yes, it's an AWS server)

This pg_dump command works:
pg_dump -U postgres -f predata.sql -F p -v  -d prod_data

But a matching pg_restore command does nothing.
pg_restore -U postgres -f predata.sql -v

I'm running 10.6.

thank you


Re: Is pg_restore in 10.6 working?

2018-11-12 Thread David
I'm not following your question.  The pre-data and post-data sections each
go to an individual file, but the data section goes to a directory.  I can
restore the files using psql, but it is the restore of the directory that
is hanging.

On Mon, Nov 12, 2018 at 2:28 PM Rob Sargent  wrote:

>
> On 11/12/18 11:44 AM, Tom Lane wrote:
> > David  writes:
> >> I have some experience with different versions of Postgres, but I'm just
> >> getting around to using pg_restore, and it's not working for me at all.
> >> ...
> >> But a matching pg_restore command does nothing.
> >> pg_restore -U postgres -f predata.sql -v
> > This command expects to read from stdin and write to predata.sql, so
> > it's not surprising that it's just sitting there.  What you want
> > is something along the lines of
> >
> > pg_restore -U postgres -d dbname -v <predata.sql
> >
> >   regards, tom lane
> >
>
> In this case, does the "General options" -f make sense? restoring to a
> file?
>
>
>


Re: Is pg_restore in 10.6 working?

2018-11-12 Thread David
Thanks, Adrian.  It's running now.

On Mon, Nov 12, 2018 at 5:05 PM Adrian Klaver 
wrote:

> On 11/12/18 1:58 PM, David wrote:
>
> Please reply to list also.
> Ccing list
>
> > Yes, that's what I get for writing emails while I'm doing 5 other things
> > at the same time.  So, let me try this again.
> > pg_dump is working when I use the following:
> > pg_dump -U postgres -F d -f /pgbackup/prod/data -v --section=data
> prod_data
> > To be clear, prod_data is the name of the database.
> > This works fine, I get /pgbackup/prod/data created and populated by
> > compressed files, as advertised.
> >
> > How do I restore this?  Is there a specific combination of command line
> > options for this format?
> > I've tried -d prod_data and -f /pgbackup/prod/data -F d, but I get an
> error:
> > options -d and -f cannot be used together.
> >
> > So I take -d off the command line and I get
> > [directory archiver] no output directory specified.
> > and if I use this I get nothing at all
> > pg_restore -U postgres -f /pgbackup/prod/data -v
> >
> > So I'm confused.
>
> Enlightenment:
>
> https://www.postgresql.org/docs/10/app-pgrestore.html
>
> -f filename
> --file=filename
>
>  Specify output file for generated script, or for the listing when
> used with -l. Default is the standard output.
>
> -d dbname
> --dbname=dbname
>
>  Connect to database dbname and restore directly into the database.
>
>
> filename
>
>  Specifies the location of the archive file (or directory, for a
> directory-format archive) to be restored. If not specified, the standard
> input is used.
>
>
> So something like:
>
> pg_restore -U postgres -v -d prod_data /pgbackup/prod/data
>
> >
> > thanks again.
> >
> >
> >
> > On Mon, Nov 12, 2018 at 4:39 PM Adrian Klaver <adrian.kla...@aklaver.com> wrote:
> >
> > On 11/12/18 11:39 AM, David wrote:
> >  > I'm not following your question.  The pre-data and post-data
> > sections
> >  > each go to an individual file, but the data section goes to a
> >  > directory.  I can restore the files using psql, but it is the
> > restore of
> >  > the directory that is hanging.
> >
> > That is not what you showed in your OP:
> >
> > This pg_dump command works:
> > pg_dump -U postgres -f predata.sql -F p -v  -d prod_data
> >
> > But a matching pg_restore command does nothing.
> > pg_restore -U postgres -f predata.sql -v
> >
> > We would need to see the commands for data section to be able to
> > comment
> > further.
> >
> >  >
> >  > On Mon, Nov 12, 2018 at 2:28 PM Rob Sargent <robjsarg...@gmail.com>
> >  > wrote:
> >  >
> >  >
> >  > On 11/12/18 11:44 AM, Tom Lane wrote:
> >  >  > David <dlbarro...@gmail.com> writes:
> >  >  >> I have some experience with different versions of
> > Postgres, but
> >  > I'm just
> >  >  >> getting around to using pg_restore, and it's not working
> > for me
> >  > at all.
> >  >  >> ...
> >  >  >> But a matching pg_restore command does nothing.
> >  >  >> pg_restore -U postgres -f predata.sql -v
> >  >  > This command expects to read from stdin and write to
> > predata.sql, so
> >  >  > it's not surprising that it's just sitting there.  What
> > you want
> >  >  > is something along the lines of
> >  >  >
> >  >  > pg_restore -U postgres -d dbname -v <predata.sql
> >  >  >
> >  >  >   regards, tom lane
> >  >  >
> >  >
> >  > In this case, does the "General options" -f make sense?
> > restoring to
> >  > a file?
> >  >
> >  >
> >
> >
> > --
> > Adrian Klaver
> > adrian.kla...@aklaver.com
> >
>
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com
>
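
To summarise the fix: with pg_restore, -f names an output file for a generated
script, while the archive to read is given as a plain argument (or on stdin)
and -d names the database to restore into. A minimal sketch for the
directory-format dump discussed above (paths and database name taken from the
thread):

# dump only the data section to a directory-format archive
pg_dump -U postgres -F d -f /pgbackup/prod/data -v --section=data prod_data

# restore it directly into the database
pg_restore -U postgres -v -d prod_data /pgbackup/prod/data

# plain-format dumps such as predata.sql are SQL scripts; replay them with psql instead
psql -U postgres -d prod_data -f predata.sql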


Re: Why is tuple_percent so low?

2018-02-27 Thread David Rowley
On 27 February 2018 at 18:03, Sam Saffron  wrote:
> 1. Where is all my missing space, is this in page alignment stuff and
> per-page overhead?

Yes, tuples are MAXALIGNed when copied onto the page. That appears to
be 8 bytes on your system. You're just 2 bytes over the 8-byte
boundary. You'll notice the table becoming more compact if you somehow
could do away with the dsecs column.

> 2. Are there any other schemes I can look at for storing this data to
> have a more efficient yet easily queryable / updateable table.

Hard to say without learning about the use case.  Some sort of
de-normalisation may help to allow fewer tuples, e.g. storing one
column in an array, but may be more painful from a design point of
view. It may also cause pain from a dead-tuple point of view if you
have to UPDATE large tuples frequently.

> Keep in mind these tables get huge and in many of our cases will span
> 10-20GB just to store this information.

I guess it's all relative, but that does not seem that large. Is this
causing some sort of performance problems you're trying to solve? or
is this completely down to disk space consumption?  If it's down to
performance then you might be able to solve that problem with an
index.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
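
To make the alignment arithmetic concrete, here is a rough sketch of the
per-row cost on a typical 64-bit build (the 24-byte aligned tuple header and
4-byte item pointer are standard; the 58-byte data width is only an
illustrative figure, not taken from the original table):

SELECT 24                                    -- tuple header, already MAXALIGNed
     + 4                                     -- item pointer in the page's line pointer array
     + 8 * ceil(58 / 8.0) AS bytes_per_row;  -- 58 bytes of column data padded up to 64

Trimming just two bytes of column data would bring the padded width from 64
back down to 56, saving a full 8 bytes per row.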



Re: index-only-scan when there is an index on all columns

2018-02-27 Thread David Rowley
On 28 February 2018 at 11:11, Andres Freund  wrote:
> I'm fairly certain that only vacuum and table rewrites like cluster sets
> all-visible,

I don't think the pages are set all visible again after a rewrite.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: Is there a continuous backup for pg ?

2018-03-02 Thread David Steele
Hi Gary,

On 3/2/18 2:05 PM, Gary M wrote:
> Hi,
> 
> I have an unusual requirement for schema based, live backup of a 24/7
> database processing 100K inserts/updates per hour. The data store is
> around 100TB.
> 
> The requirement is supporting an incremental backup of 10 minute
> windows.  Replication is not considered backup from malicious action.
> 
> Are there any best practices or solutions that can meet these
> requirements ? 
pgBackRest is specifically designed to handle very large clusters and
high WAL rates.  Backup, restore, and archiving can be run in parallel
to speed operations.  The new version 2 has been optimized to make
archive-push even faster than version 1 and we will be releasing an
optimized archive-get soon.

You would be best off achieving your 10-minute windows with daily
incremental backups and then recovery with PITR to the required time.
PITR allows you to specify any time for recovery.
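
As a rough sketch of that workflow (command names as documented by pgBackRest;
"main" is a hypothetical stanza name and the timestamp is only an example):

# daily incremental backup of the stanza
pgbackrest --stanza=main --type=incr backup

# point-in-time recovery to a chosen moment inside a 10-minute window
pgbackrest --stanza=main --delta --type=time "--target=2018-03-02 12:10:00" restore

The WAL segments archived between backups are what make any intermediate
target time reachable.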


Regards,
-- 
-David
da...@pgmasters.net



Best options for new PG instance

2018-03-05 Thread David Gauthier
Hi:

I'm going to be requesting a PG DB instance (v9.6.7) from an IT dept in a
large corp setting.  I was wondering if anyone could comment on the
pros/cons of getting this put on a virtual machine vs hard metal ?  Locally
mounted disk vs nfs ?

Thanks !


Can I grant permissions to specific set of uids (linux) ?

2018-03-06 Thread David Gauthier
Hi:

I'd like to grant select, insert, update, delete to a table for a specific
set of uids (linux).  All others get select only.  Can the DB authenticate
the current linux user and grant access based on the fact that they are
logged in (IOW, no passwords) ?  Is this possible ? If so, how ?

Thanks for any help
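
A minimal sketch of one common approach: peer authentication (the server
trusts the local operating-system user name) plus a group role holding the
write privileges. Role and table names here are only placeholders:

# pg_hba.conf: local connections authenticated by OS user name, no password
local   all   all   peer

-- each writer's database role matches their linux login and joins a group role
CREATE ROLE writers NOLOGIN;
CREATE ROLE alice LOGIN IN ROLE writers;    -- linux user "alice"
GRANT SELECT ON mytable TO PUBLIC;
GRANT SELECT, INSERT, UPDATE, DELETE ON mytable TO writers;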


Re: Barman versus pgBackRest

2018-03-09 Thread David Steele
Hi Thomas,

On 3/6/18 2:53 PM, Thomas Poty wrote:
> Hello Community,
> 
> I hesitate to use barman or pgBackRest.  I have found a lot of common
> points between them and a few differences:

To be clear, I'm the original author and primary maintainer of
pgBackRest.  I'll let the Barman folks speak to their strengths, but I'm
happy to address your points below.

> About pgBarman, I like :
> - be able restore on a remote server from the backup server

This is a good feature, and one that has been requested for pgBackRest. You
can do this fairly trivially with ssh, however, so it generally hasn't
been a big deal for people.  Is there a particular reason you need this
feature?

> - use replication slots for backingup wal on the backup server.

Another good feature.  We have not added it yet because pgBackRest was
originally written for very high-volume clusters (100K+ WAL per day) and
our parallel async feature answers that need much better.  We recommend
a replicated standby for more up-to-date data.

Even so, we are looking at adding support for replication slots to
pgBackRest.  We are considering a hybrid scheme that will use
replication to keep the WAL archive as up to date as possible, while
doing bulk transfer with archive_command.

> About pgBackRest, I like :
> 
> - real differential backup.
> - lots of options
> - option for backingup if PostgreSQL is already in backup mode
> 
> 
> I would like to have :
> - advices or feedbach about using pgBackrest or barman.
> - pros and cons of these solutions

I'll stick with some of the major pgBackRest pros:

- Parallel backup including compression and checksums
- Encryption
- S3 support
- Parallel archive
- Delta restore
- Page checksum validation
- Backup resume

More about features here: https://pgbackrest.org

> - differences that I would not have seen.

pgBackRest is used in some very demanding environments and we are
constantly answering the needs of our users with features and
performance improvements, e.g. the enormous improvements to archive-push
speed in the 2.0 release.

I'd be happy to answer any specific questions you have about pgBackRest.

Regards,
-- 
-David
da...@pgmasters.net



Re: Is there a way to create a functional index that takes tableoid column as an arg?

2018-03-11 Thread David Rowley
On 11 March 2018 at 07:53, Ryan Murphy  wrote:
> I am using table inheritance and have e.g. the following tables:
>
> create table animal ( ... );
> create table dog ( ... ) inherits (animal);
> create table cat ( ... ) inherits (animal);
> create table person ( ... ) inherits (animal);
> create table musician ( ... ) inherits (person);
> create table politician ( ... ) inherits (person);
>
> Now I have a query that gets all the "animal"'s except for those that are
> "person"'s.
>
> select * from only animal
>
> won't cut it, because it leaves out the dogs and cats.
>
> select *, tableoid::regclass relname from animal
> where relname != 'person'::regclass
>
> also won't cut it because it leaves out the musicians and politicians.
>
> So I have created an immutable function is_a_kind_of(tbl regclass,
> parent_tbl regclass) that returns true iff tbl is identical with, or
> directly or indirectly inherits from, parent_tbl.  For example:
>
> is_a_kind_of('person','person') => true
> is_a_kind_of('person','animal') => true
> is_a_kind_of('musician','person') => true
> is_a_kind_of('animal','person') => false
> is_a_kind_of('dog','person') => false
>
> No problems so far.  Now my query works:
>
> select *,tableoid from "animal"
> where not is_a_kind_of(tableoid::regclass::text, 'person')
>
> This query is somewhat slow though - I'd like to index the is_a_kind_of()
> call.  And Postgres supports functional indexes!  So I try:
> create index animal_is_person on animal (
> is_a_kind_of(tableoid::regclass, 'person') );
>
> ERROR:  index creation on system columns is not supported
>
> I see that this is because "tableoid" is a system column. Does anyone know
> any workaround for this?  So close yet so far away!

I don't think having an index on the tableoid would help you here
anyway.  Previous versions did allow indexes on system columns, but
that was disallowed recently due to the fact that the value of most
system columns are changed internally by postgres and the indexes
wouldn't correctly be updated.  tableoid might be an exception to
this, but it does not really seem like a useful column to index,
given it would be indexing the same value for each record in the
table.

The reason it's slow is that PostgreSQL is executing the function once
per input row. In this case, that must be done because you're passing
in tableoid as a function parameter, and that could change from one
row to the next (once the Append moves to the next subplan).

What you'd want to do instead is create a set-returning function that
finds all the inherited children and returns them all. This will allow
PostgreSQL to execute the function just once at the start of the
query, then join the results to this function.

Something like:

create or replace function get_inherited_tables (prelid oid) returns
setof oid as $$
with recursive c as (
   select inhrelid,inhparent from pg_inherits where inhparent = prelid
   union all
   select i.inhrelid,i.inhparent from pg_inherits i inner join c
on i.inhparent = c.inhrelid
)
select prelid union all select inhrelid from c;
$$ language sql stable;

then have your queries do:

-- get all animals that are persons
select ... from animal where tableoid in (select
get_inherited_tables('person'::regclass));

-- get all animals that are not persons
select ... from animal where tableoid not in (select
get_inherited_tables('person'::regclass));

Just be careful around search_paths and your use of regclass. In this
case, if "animal" was not in the first schema in search_path, but
someone created another table called "person" that was in the first
schema listed in search_path, then the query would not do what you
want. You might want to consider prefixing the input parameter into
get_inherited_tables with the schema name too.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: primary key and unique index

2018-03-23 Thread David Rowley
On 23 March 2018 at 20:55, Thomas Poty  wrote:
> In MySQL a  "show create table" gives me :
>  ...
>   PRIMARY KEY (`ID`,`CountryCode`,`LanguageCode`),
>   UNIQUE KEY `unique_my table_4` (`ID`,`CountryCode`,`LanguageCode`),
> ...
>
> So, In PostgreSQL, does it make sense to create a primary key AND a unique
> index based on the same columns?
> Is PostgreSQL smart enough to use the unique index created for the primary
> key.

Doing this makes no sense in PostgreSQL.  I'm struggling to imagine
why it would in MySQL.
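
For illustration, a sketch showing that the primary key already comes with its
own unique index (the table name is hypothetical):

create table t (id int, countrycode text, languagecode text,
                primary key (id, countrycode, languagecode));
select indexname, indexdef from pg_indexes where tablename = 't';
-- shows one unique index, "t_pkey", on (id, countrycode, languagecode);
-- a second unique index on the same columns would only duplicate it and add
-- write overhead.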


-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: pg_stat_statements: password in command is not obfuscated

2018-03-23 Thread David Rowley
On 24 March 2018 at 10:30, legrand legrand  wrote:
> It seems that passwords used in commands are not removed when caught by
> pg_stat_statements
> (they are not "normalized" being utility statements)
>
> exemple:
> alter role tt with password '123';
>
> select query from public.pg_stat_statements
> where query like '%password%';
>
> query
> 
> alter role tt with password '123';
>
> Do you think its a bug ?

If it is, then it's not a bug in pg_stat_statements. log_statement =
'ddl' would have kept a record of the same thing.

Perhaps the best fix would be a documentation improvement to mention
the fact and that it's best not to use plain text passwords in
CREATE/ALTER ROLE. Passwords can be passed already md5-hashed.
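
For example, a sketch of passing the hash instead of the plain text (the
stored md5 format is 'md5' followed by md5(password || rolename)):

-- compute the hash outside the server (e.g. echo -n '123tt' | md5sum),
-- or in a scratch session:
SELECT 'md5' || md5('123' || 'tt') AS hashed_password;
-- then use the resulting value so the plain-text password never appears in a statement:
ALTER ROLE tt WITH PASSWORD 'md5<value from above>';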

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: Bad Query Plans on 10.3 vs 9.6

2018-03-28 Thread David Rowley
On 29 March 2018 at 18:26, Cory Tucker  wrote:
> Hello all.  I'm migrating a database from PG 9.6 to 10.3 and have noticed a
> particular query that is performing very badly compared to its 9.6
> counterpart.
>
> The plan on 9.6 v 10.3 are effectively identical except in 9.6 the planner
> decides to use an index only scan on the primary key and in 10.3 it does a
> sequential scan.  The problem is the sequential scan is for a table of 75M
> rows and 25 columns so its quiet a lot of pages it has to traverse.

How certain are you that all the indexes match on each instance?

It would be useful to see psql's \d output for each table in question.

Another option for you to consider would be to get rid of the OR
clause in the query entirely and have two separate CTEs and INSERT
INTO your orphaned_matches table having SELECTed from both CTEs with a
UNION ALL. A record already deleted won't appear in the 2nd branch of
the UNION ALL result. However, that still might not fix your issue
with the index not being used, but you may find the anti-joined
version of the query is faster anyway.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: Bad Query Plans on 10.3 vs 9.6

2018-03-29 Thread David Rowley
On 30 March 2018 at 03:21, Cory Tucker  wrote:
>> Another possibility is that 10.3 sees the index-only scan as too expensive
>> because it thinks most of the table isn't all-visible.  Comparing
>> pg_class.relallvisible values might be informative.

> I'm happy to try to dig into this one more, however, I'm not familiar with
> this value.  What should I be looking for here?

Each table in your database has an entry in the pg_class table. Something like:

SELECT relallvisible from pg_class where oid = 'build.household'::regclass;

would show you the value, however, I think a problem here is unlikely
since that would just control the likelihood of an index-only-scan vs
an index-scan. You're getting a Seq-scan, which I imagine is going to
be quite a bit more expensive than even an index scan.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



hardcode password in connect string

2018-04-13 Thread David Gauthier
Hi:

PG v9.5.2 on RHEL

I like to use an alias to connect to my favorite DBs but don't like to
enter passwords.  I used to just disable passwords (working behind a
firewall), but this one is different.  I see nothing in the interactive
connect string where I can enter the password...

psql -h thehost -U memyselfi mydb

Missing something like "-p mypassword"

Is there a way I can stick the pw in the linux alias definition ?

Just thinking something like this may be available since hardcoded
passwords are supported in perl/dbi, pg driver

Thanks !


Re: hardcode password in connect string

2018-04-13 Thread David Gauthier
PGPASSWORD env var works fine.
Thanks !


On Fri, Apr 13, 2018 at 2:46 PM, James Keener  wrote:

> Is setting it as an environment variable an option?
> https://www.postgresql.org/docs/9.1/static/libpq-envars.html
>
> Alternatively, a service file? https://www.postgresql.org/
> docs/9.0/static/libpq-pgservice.html
>
> Jim
>
>
> On April 13, 2018 2:43:01 PM EDT, David Gauthier 
> wrote:
>>
>> Hi:
>>
>> PG v9.5.2 on RHEL
>>
>> I like to use an alias to connect to my favorite DBs but don't like to
>> enter passwords.  I used to just disable passwords (working behind a
>> firewall), but this one is different.  I see nothing in the interactive
>> connect string where I can enter the password...
>>
>> psql -h thehost -U memyselfi mydb
>>
>> Missing something like "-p mypassword"
>>
>> Is there a way I can stick the pw in the linux alias definition ?
>>
>> Just thinking something like this may be available since hardcoded
>> passwords are supported in perl/dbi, pg driver
>>
>> Thanks !
>>
>
> --
> Sent from my Android device with K-9 Mail. Please excuse my brevity.
>
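
For completeness, a sketch of both approaches using the values from the
question (the alias syntax assumes a bash-like shell):

# password in the alias via the environment variable
alias mydb='PGPASSWORD=mypassword psql -h thehost -U memyselfi mydb'

# or keep it out of the alias entirely with a ~/.pgpass file
# format: hostname:port:database:username:password
echo 'thehost:5432:mydb:memyselfi:mypassword' >> ~/.pgpass
chmod 600 ~/.pgpass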


Re: Barman versus pgBackRest

2018-04-13 Thread David Steele

Hi Thomas,

On 4/11/18 3:14 AM, Thomas Poty wrote:


Sorry for answering only now, but I only just saw your answer.

 > To be clear, I'm the original author and primary maintainer of
pgBackRest.

I am very happy to see guys like you take the time to answer me. Thank you


You are welcome.  Users are the reason I work on this project.


 >This a good feature, and one that has been requested for pgBackRest. You
 >can do this fairly trivially with ssh, however, so it generally hasn't
 >been a big deal for people.  Is there a particular reason you need this
 >feature?

The reason is probably psychological, but I like the idea of a single
connection point from which to restore DBs in different locations.


I am very impatient to see "replication slot" support and the "remote 
restore" feature added.


Remote restore is likely to land first, though neither feature is 
currently at the top of the list.  Unfortunately, we have limited 
resources and must prioritize.


Regards,
--
-David
da...@pgmasters.net



Can PostgreSQL create new WAL files instead of reusing old ones?

2018-04-17 Thread David Pacheco
tl;dr: We've found that under many conditions, PostgreSQL's re-use of old
WAL
files appears to significantly degrade query latency on ZFS.  The reason is
complicated and I have details below.  Has it been considered to make this
behavior tunable, to cause PostgreSQL to always create new WAL files
instead of
re-using old ones?

Context: we're running a large fleet of PostgreSQL shards.  Each shard
consists
of a primary, a synchronous standby, and an asynchronous standby using
chained
replication.  For this problem, we can consider only the primary and
synchronous standby.

PostgreSQL: 9.6.3
OS: illumos (SmartOS, mixed versions, but all from 2017 or later)
FS: ZFS over mirrored HDDs (not SSDs), using a record size of 8K to match
PostgreSQL's record size.  We have an SSD log device for completing
synchronous
writes very quickly.
WAL files are 16MB each, and we're keeping a very large number of segments.
(There's likely a lot of improvement for WAL tuning here.)

Since we're using an 8K record size, when PostgreSQL writes small (or
non-aligned) records to the WAL files, ZFS has to read the old contents in
order to write the new 8K record (read-modify-write).  If that's in cache,
that's not a big deal.  But when PostgreSQL decides to reuse an old WAL file
whose contents have been evicted from the cache (because they haven't been
used
in hours), this turns what should be a workload bottlenecked by synchronous
write
performance (that's well-optimized with our SSD log device) into a random
read
workload (that's much more expensive for any system).

What's significantly worse is that we saw this on synchronous standbys.
When
that happened, the WAL receiver was blocked on a random read from disk, and
since it's single-threaded, all write queries on the primary stop until the
random read finishes.  This is particularly bad for us when the sync is
doing
other I/O (e.g., for an autovacuum or a database backup) that causes disk
reads
to take hundreds of milliseconds.

Reusing old WAL files seems like an optimization intended for filesystems
that
allocate disk blocks up front.  With copy-on-write, it doesn't seem to make
much sense.  If instead of using an old WAL file, PostgreSQL instead just
created a new one, there would be no random reads required to complete these
operations, and we believe we'd completely eliminate our latency outliers.

Thanks,
Dave


Re: Can PostgreSQL create new WAL files instead of reusing old ones?

2018-04-17 Thread David Pacheco
On Tue, Apr 17, 2018 at 11:57 AM, Tom Lane  wrote:

> Alvaro Herrera  writes:
> > David Pacheco wrote:
> >> tl;dr: We've found that under many conditions, PostgreSQL's re-use of
> old
> >> WAL
> >> files appears to significantly degrade query latency on ZFS.  The
> reason is
> >> complicated and I have details below.  Has it been considered to make
> this
> >> behavior tunable, to cause PostgreSQL to always create new WAL files
> >> instead of re-using old ones?
>
> > I don't think this has ever been proposed, because there was no use case
> > for it.  Maybe you want to work on a patch for it?
>
> I think possibly the OP doesn't understand why it's designed that way.
> The point is not really to "recycle old WAL files", it's to avoid having
> disk space allocation occur during the critical section where we must
> PANIC on failure.  Now, of course, that doesn't really work if the
> filesystem is COW underneath, because it's allocating fresh disk space
> anyway even though semantically we're overwriting existing data.
> But what I'd like to see is a fix that deals with that somehow, rather
> than continue to accept the possibility of ENOSPC occurring inside WAL
> writes on these file systems.  I have no idea what such a fix would
> look like :-(



I think I do understand, but as you've observed, recycling WAL files to
avoid allocation relies on the implementation details of the filesystem --
details that I'd expect not to be true of any copy-on-write filesystem.  On
such systems, there may not be a way to avoid ENOSPC in special critical
sections.  (And that's not necessarily such a big deal -- to paraphrase a
colleague, ensuring that the system doesn't run out of space does not seem
like a particularly surprising or heavy burden for the operator.  It's
great that PostgreSQL can survive this event better on some systems, but
the associated tradeoffs may not be worthwhile for everybody.)  And given
that, it seems worthwhile to provide the operator an option where they take
on the risk that the database might crash if it runs out of space (assuming
the result isn't data corruption) in exchange for a potentially tremendous
improvement in tail latency and overall throughput.

To quantify this: in a recent incident, transaction latency on the primary
was degraded about 2-3x (from a p90 of about 45ms to upwards of 120ms, with
outliers exceeding 1s).  Over 95% of the outliers above 1s spent over 90%
of their time blocked on synchronous replication (based on tracing with
DTrace).  On the synchronous standby, almost 10% of the WAL receiver's wall
clock time was spent blocked on disk reads in this read-modify-write path.
The rest of the time was essentially idle -- there was plenty of headroom
in other dimensions (CPU, synchronous write performance).

Thanks,
Dave


Re: Inconsistent compilation error

2018-04-18 Thread David Rowley
On 19 April 2018 at 13:28, Melvin Davidson  wrote:
> It is not fine. You have specifed TWO percent signs (%) which requires TWO
> argumenrts,
> but you have only provided ONE -> r.

I'm confused about this statement.

Did you perhaps overlook the fact that there are in fact two
arguments? I'm unsure why you're considering
'blah_history.original_id' not to be an argument here.


-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Backup Strategy Advise

2018-04-24 Thread David Gauthier
Hi:  I need some advise on how best to backup a PG DB.
PG 9.5.2 on RHEL6

The requirement is to be able to restore the DB after catastrophic failure
and lose no more than the last 15 minutes worth of data.  Also, we would
like to be able to do the backups on-line (no down time).  There is no need
for PITR other than the point in time being the latest possible.

Typically, I would think doing a weekly full backup, daily incremental
backups and turn on journaling to capture what goes on since the last
backup.  When DB recovery is needed, restore up to the last daily, then
reply the journal to restore up to the last time the journal was flushed to
disk (checkpoint=15 minutes).  I'm not quite sure if something like this is
possible with PG.  I've read about the WAL file and wonder if it could be
used together with the on-line logical backups (pg_dump) to achieve the 15
minute requirement without needing downtime for physical backups..

Any advise?

Thanks in Advance.
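
The WAL-based approach hinted at above is continuous archiving plus periodic
base backups. A minimal sketch only (the copy destination is hypothetical, and
on 9.5 wal_level must be at least 'archive'):

# postgresql.conf
wal_level = archive
archive_mode = on
archive_command = 'cp %p /backup/wal/%f'   # must be a reliable, durable copy
archive_timeout = 900                      # force a WAL switch at least every 15 minutes

# weekly online base backup
pg_basebackup -U postgres -D /backup/base/$(date +%F) -Ft -z

Restoring then means unpacking the latest base backup and replaying the
archived WAL with restore_command, optionally up to a recovery_target_time.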


Run external command as part of an sql statement ?

2018-05-07 Thread David Gauthier
Hi:

At the psql prompt, I can do something like...
   "select  \! id -nu"
...to get the uid of whoever's running psql.

I want to be able to run a shell command like this from within a stored
procedure.  Is there a way to do this ?

Thanks
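
One way to do this (a sketch, not the only option) is an untrusted procedural
language such as PL/PerlU, which requires superuser to create the function.
Note that the command runs on the database server as the server's OS user,
not as the psql caller the way \! does:

CREATE EXTENSION IF NOT EXISTS plperlu;

CREATE OR REPLACE FUNCTION shell_uid() RETURNS text AS $$
    my $out = `id -nu`;   # runs on the server, as the server's OS user
    chomp $out;
    return $out;
$$ LANGUAGE plperlu;

SELECT shell_uid();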


Re: What is the C function to modify attribute

2018-05-16 Thread David Rowley
On 17 May 2018 at 15:36, a <372660...@qq.com> wrote:
> I am writing a trigger that will record all adjustment of the row into the
> last column of the row. I have done a iterator to produce text record and
> the last step would be add the text record into the last column (which
> designed as an array of user defined type, with the output function display
> the abstraction of update and a user defined function is used to retrieve
> detail update info).
>
> Now I would like to find a way that will allow me to update the last column
> in C efficiently.
>
> Can anyone help? Thank you so much!!

You might find https://www.postgresql.org/docs/10/static/triggers.html useful.

There's a complete example of a trigger function written in C there.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: limit and query planner

2018-06-05 Thread David Rowley
On 6 June 2018 at 07:17, armand pirvu  wrote:
> So since
> select count(*) from sp_i2birst_reg_staging_test;
> count
> ---
>  6860
> and
> select count(*) from sp_i2birst_reg_staging_test where evt_id = 'ACSF17'
> and status=0 ;
> count
> ---
>  4239
>
> That means to me I fetch almost the whole table and then I fall in the case 
> you described seqscan-and-sort wins over indexscan .

The planner simply assumes that 1 in (6860.0 / 4239.0) rows matches
your WHERE clause. Since you want 10 rows from the LIMIT, it thinks
it'll just need to read 17 rows from the heap to answer your query.

> explain analyze select * from sp_i2birst_reg_staging_test where evt_id = 
> 'ACSF17'
> and status=1;
>  QUERY PLAN
>  Index Scan using fooidx on sp_i2birst_reg_staging_test  (cost=0.28..202.91 
> rows=500 width=519) (actual time=0.097..0.527 rows=500 loops=1)
>Index Cond: (((evt_id)::text = 'ACSF17'::text) AND (status = 1))
>  Planning time: 1.024 ms
>  Execution time: 0.766 ms
> this gets 500 rows out of 6860
>
> explain analyze select * from sp_i2birst_reg_staging_test where evt_id = 
> 'ACSF17'
> and status=1 limit 10;
>
>QUERY PLAN
>  Limit  (cost=0.28..4.33 rows=10 width=519) (actual time=0.073..0.105 rows=10 
> loops=1)
>->  Index Scan using fooidx on sp_i2birst_reg_staging_test  
> (cost=0.28..202.91 rows=500 width=519) (actual time=0.072..0.101 rows=10 
> loops=1)
>  Index Cond: (((evt_id)::text = 'ACSF17'::text) AND (status = 1))
>  Planning time: 0.280 ms
>  Execution time: 0.173 ms
>
> Back to the original
>
> explain analyze select * from sp_i2birst_reg_staging_test where evt_id = 
> 'ACSF17'
> and status=0 limit 1 ;
>
> QUERY PLAN
>  Limit  (cost=0.00..0.13 rows=1 width=519) (actual time=0.021..0.021 rows=1 
> loops=1)
>->  Seq Scan on sp_i2birst_reg_staging_test  (cost=0.00..548.40 rows=4239 
> width=519) (actual time=0.019..0.019 rows=1 loops=1)
>  Filter: (((evt_id)::text = 'ACSF17'::text) AND (status = 0))
>  Rows Removed by Filter: 1
>  Planning time: 0.286 ms
>  Execution time: 0.110 ms

For the more restrictive status, the planner thinks more rows will
need to be looked at, which increases the cost of the seqscan, and the
planner favours the index scan.  You can see the planner estimates 500
rows will match the status=1 query. So it thinks ceil(1 * (6860 / 500.0)
* 10) = 138 rows will need to be looked at in the seqscan plan. That's
obviously more costly than 17 rows. So the index scan begins to look
more favourable.

The planner always assumes the rows are evenly distributed, which is
not always the case. If your ten rows were at the end of the heap,
then the seqscan would be a bad plan. In this case "Rows Removed by
Filter" would be high.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Doing a \set through perl DBI ?

2018-06-06 Thread David Gauthier
Hi:

Is there a way to do the equivalent of a "\set foo 1" through perl dbi ?
I tried...
$dbh->do("\\set foo 1");
and got a syntax error

Of course, I'd also have to be able to access the value of foo once its
set.   I'm guessing the usual way ??? (select :foo)


Thanks for any help !


Re: Doing a \set through perl DBI ?

2018-06-06 Thread David Gauthier
I think I found my own answer.  I wanted to use the current linux user's
uid as part of a query (again, this is a perl/DBI script).  I was thinking
I might be able to set a variable into the DB session somehow using \set
through DBI to accomplish this.  The solution that finally occurred to me
was to create a temporary table and have the perl script shove the uid into
a column in that table, then query against that.  Sort of like using a temp
table to store variables set from outside.
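
An alternative to the temp-table trick is a custom session setting, which also
lives for the life of the connection (the 'myapp.uid' name is arbitrary; any
prefixed name works, and the table and column below are only placeholders):

-- from the client, once per connection:
SELECT set_config('myapp.uid', '1234', false);
-- then reference it inside later queries:
SELECT * FROM some_table WHERE owner_uid = current_setting('myapp.uid')::int;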

On Wed, Jun 6, 2018 at 11:46 AM, David G. Johnston <
david.g.johns...@gmail.com> wrote:

> On Wednesday, June 6, 2018, David Gauthier 
> wrote:
>
>> Hi:
>>
>> Is there a way to do the equivalent of a "\set foo 1" through perl dbi ?
>> I tried...
>> $dbh->do("\\set foo 1");
>> and got a syntax error
>>
>> Of course, I'd also have to be able to access the value of foo once its
>> set.   I'm guessing the usual way ??? (select :foo)
>>
>>
> No.  Meta commands are psql client program only.  You'd need to explain
> why you want this to get reasonable server-side suggestions.
>
> David J.
>
>


Re: Can PostgreSQL create new WAL files instead of reusing old ones?

2018-06-21 Thread David Pacheco
On Wed, Jun 20, 2018 at 10:35 AM, Jerry Jelinek 
wrote:

> As Dave described in his original email on this topic, we'd like to avoid
> recycling WAL files since that can cause performance issues when we have a
> read-modify-write on a file that has dropped out of the cache.
>
> I have implemented a small change to allow WAL recycling to be disabled.
> It is visible at:
> https://cr.joyent.us/#/c/4263/
>
> I'd appreciate getting any feedback on this.
>
> Thanks,
> Jerry
>
>

For reference, there's more context in this thread from several months ago:
https://www.postgresql.org/message-id/flat/CACukRjO7DJvub8e2AijOayj8BfKK3XXBTwu3KKARiTr67M3E3w%40mail.gmail.com#cacukrjo7djvub8e2aijoayj8bfkk3xxbtwu3kkaritr67m3...@mail.gmail.com

I'll repeat the relevant summary here:

tl;dr: We've found that under many conditions, PostgreSQL's re-use of old
> WAL files appears to significantly degrade query latency on ZFS.  The
> reason is
> complicated and I have details below.  Has it been considered to make this
> behavior tunable, to cause PostgreSQL to always create new WAL files
> instead of re-using old ones?


Thanks,
Dave


Re: help understanding create statistic

2018-06-28 Thread David Rowley
On 28 June 2018 at 21:38, Luca Ferrari  wrote:
> CREATE INDEX idx_year
> ON expenses ( EXTRACT( year FROM day ) );
>
> why is the planner not choosing to use such index on a 'year' raw query?
>
> EXPLAIN SELECT * FROM expenses
> WHERE year = 2016;

The expression in the where clause must match the indexed expression.
You'd need to add an index on just (year) for that to work.

> Am I misunderstaing this functional dependency?

Yeah, the statistics are just there to drive the planner's costing.
They won't serve as proof for anything else.

All you've done by creating those stats is to allow better estimates
for queries such as:

SELECT * FROM expenses WHERE day = '2018-06-28' and year = 2018;

> stxdependencies | {"3 => 5": 1.00}

It would appear that "3" is the attnum for day and "5" is year. All
that tells the planner is that on the records sampled during analyze
is that each "day" had about exactly 1 year.

There's nothing then to stop you going and adding a record with the
day '2017-01-01' and the year 2018. The stats will remain the same
until you analyze the table again.

If those stats didn't exist, the planner would have multiplied the
selectivity estimates of each item in the WHERE clause individually.
So if about 10% of records had year=2018, and 0.1% had '2018-06-28',
then the selectivity would have been 0.1 *  0.001.   With a functional
dependency of 1, the selectivity just becomes 0.001.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: Convert Existing Table to a Partition Table in PG10

2018-07-01 Thread David Rowley
On 1 July 2018 at 10:15, Clifford Snow  wrote:
> I also leaned that my range partition value I used on a timestamp needed to
> have fractional seconds. I used a range of 2017-01-01 00:00:00 to
> 2017-23:59:59 which failed when I attempted to add a record that had a
> timestamp of 2017-23:59:59. Adding a fractional second to the range solved
> the problem.

Please be aware that with RANGE partitions the upper bound is
non-inclusive. The lower bound is inclusive.

If you want a 2017 partition, then FOR VALUES FROM ('2017-01-01') TO
('2018-01-01') will allow all 2017 timestamps and only 2017
timestamps.

You've no need to consider precision of the type and how many 9's you
add to anything here.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: FK v.s unique indexes

2018-07-03 Thread David Rowley
On 3 July 2018 at 19:30, Rafal Pietrak  wrote:
> tst=# create table test1(load bigint, a int, b int, c bool) partition by
> list (c);
> CREATE TABLE
> tst=# create table test1_true  partition of test1 for values in (true);
> CREATE TABLE
> tst=# create table test1_false  partition of test1 for values in (false);
> CREATE TABLE
> tst=# create unique index load ON test1 (load,a,b,c);
> CREATE INDEX
> tst=# create table info_text1 (load text, a int, b int, c bool, info
> text,  foreign key (load,a,b,c) references test1(load,a,b,c)) ;
> ERROR:  cannot reference partitioned table "test1"
> --
>
> Why is this forbidden?

I don't think there were any actual roadblocks, it was more of just
not enough time in the cycle to make it work due to a few technical
details that required extra effort to make work.

Alvaro managed to simplify the problem and allow foreign keys to be
defined on partitioned tables and get that into PG11.

So it was a case of 50% is better than 0%, which I very agree with.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: pg_dump out of memory

2018-07-03 Thread David Rowley
On 4 July 2018 at 14:43, Andy Colson  wrote:
> I moved a physical box to a VM, and set its memory to 1Gig.  Everything
> runs fine except one backup:
>
>
> /pub/backup# pg_dump -Fc -U postgres -f wildfire.backup wildfire
>
> pg_dump: Dumping the contents of table "ofrrds" failed: PQgetResult() failed.
> pg_dump: Error message from server: ERROR:  out of memory
> DETAIL:  Failed on request of size 1073741823.
> pg_dump: The command was: COPY public.ofrrds (id, updateddate, bytes) TO
> stdout;

There will be less memory pressure on the server if the pg_dump was
performed from another host. When running pg_dump locally the 290MB
bytea value will be allocated in both the backend process pg_dump is
using and pg_dump itself. Running the backup remotely won't require
the latter to be allocated on the server.

> I've been reducing my memory settings:
>
> maintenance_work_mem = 80MB
> work_mem = 5MB
> shared_buffers = 200MB

You may also get it to work by reducing shared_buffers further.
work_mem won't have any effect, neither will maintenance_work_mem.

Failing that, the suggestions of more RAM and/or swap look good.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



As a table owner, can I grant "grant" ?

2018-07-05 Thread David Gauthier
Postgres 9.5.2 on linux

Given that I am not superuser, but DO have createrole...
Can I grant some other role the ability to grant access to a table I
created ?

For Example:  I create a table called foo.  As the creator/owner of this
table, I seem to have the ability to grant select,insert,update,delete
(etc...) to other roles.  I would like to grant some other role the ability
to grant access to my table.

Thanks in Advance !
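
A sketch of the usual mechanism for this, grant options ("otherrole" is a
placeholder):

-- as the owner of foo:
GRANT SELECT, INSERT, UPDATE, DELETE ON foo TO otherrole WITH GRANT OPTION;
-- otherrole can now pass those same privileges on to further roles.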


Re: How to set array element to null value

2018-07-09 Thread David Fetter
On Mon, Jul 09, 2018 at 03:28:45PM +0530, Brahmam Eswar wrote:
> I'm trying to reset array element to null.

You can do this in SQL as follows:

SELECT ARRAY(
SELECT CASE e WHEN 'ABC' THEN NULL ELSE e END
FROM UNNEST(x) _(e)
)

This should really be going to pgsql-general because it is about how
to use PostgreSQL, not how to change PostgreSQL.

Best,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate



timestamp (military) at time zone without the suffix

2018-07-11 Thread David Gauthier
Hi:

I would like to get the utc timestamp, 24-hr clock (military time), without
the time zone suffix.

Below commands were run nearly at the same time...

sqfdev=> select now()::timestamp(0)  ;
 now
-
 2018-07-11 15:27:12
(1 row)

...then immediately...

sqfdev=> select now()::timestamp(0) at time zone 'utc' ;
timezone

 2018-07-11 11:27:12-04
(1 row)


15:27:12 makes sense (it's about 3:30 in the afternoon EST).
11:27:12 doesn't make sense.  UTC is 5 hours ahead.  I would have expected
either 20:27 (if it stuck to military time, which I want), or 08:27 (P.M.,
non-military time)

And I want to get rid of the -04 suffix.

Is there a way to do this ?

Thanks !


Re: timestamp (military) at time zone without the suffix

2018-07-11 Thread David Gauthier
OK, the "to_char" gets rid of the timezone extension.  But the times still
don't make sense.

UTC should be 5 hours ahead, not behind.  It should be EST plus 5 hours (or
4 for DST), not minus.  That's why I said I expected 20:27 .


When I go to store this in a DB, I want to store the UTC time.  How d I do
that ?

insert into foo (dt) values (localtimestamp(0) at time zone 'utc') ???


On Wed, Jul 11, 2018 at 3:45 PM, David G. Johnston <
david.g.johns...@gmail.com> wrote:

> On Wednesday, July 11, 2018, David Gauthier 
> wrote:
>
>> Hi:
>>
>> I would like to get the utc timestamp, 24-hr clock (military time),
>> without the time zone suffix.
>>
>> Below commands were run nearly at the same time...
>>
>> sqfdev=> select now()::timestamp(0)  ;
>>  now
>> -
>>  2018-07-11 15:27:12
>> (1 row)
>>
>> ...then immediately...
>>
>> sqfdev=> select now()::timestamp(0) at time zone 'utc' ;
>> timezone
>> 
>>  2018-07-11 11:27:12-04
>> (1 row)
>>
>>
>> 15:27:12 makes sense (it's a bout 3:30 in the afternoon EST).
>> 11:27:12 doesn't make sense.  UTC is 5 hours ahead.
>>
>
> Apparently it's only four hours ahead of your server's time zone setting.
>
>
>>
>   I would have expected either 20:27 (if it stuck to military time, which
>> I want), or 08:27 (P.M., non-military time)
>>
>> And I want to get rid of the -04 suffix.
>>
>> Is there a way to do this ?
>>
>
> Specify an appropriate format string with the to_char function.
>
>  https://www.postgresql.org/docs/10/static/functions-formatting.html
>
> David J.
>
>


Re: timestamp (military) at time zone without the suffix

2018-07-11 Thread David Gauthier
 Thanks Everyone, they all work, but TL's seems to be the simplest...
select current_timestamp(0) at time zone 'utc'

I'm kinda stuck with the timestamp data type (vs timestamptz).  Wondering
if I can stick with that.

One last question...
I want to store the current UTC date/time in the DB.  Does PG
unconditionally store something like UTC, then let the queries figure out
how they want to look at it (with "at time zone" and "to_char()" etc...) ?
Or do I have to intentionally store the UTC value somehow?

Right now the code is just inserting and updating records using
"localtimestamp(0)".




On Wed, Jul 11, 2018 at 4:11 PM, David G. Johnston <
david.g.johns...@gmail.com> wrote:

> On Wednesday, July 11, 2018, David Gauthier 
> wrote:
>
>> OK, the "to_char" gets rid of the timezone extension.  But the times
>> still don't make sense.
>>
>> When I go to store this in a DB, I want to store the UTC time.  How d I
>> do that ?
>>
>
> Use the data type that represents exactly that, timestamptz.  Using the
> timestamp data type is generally not what you want even if you can get the
> manipulation logic figured out.
>
> David J.
>
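
Tying the two points together, a small sketch (table and column names are
illustrative): timestamptz stores an absolute instant, and AT TIME ZONE 'utc'
renders it as a plain 24-hour timestamp with no zone suffix:

CREATE TABLE events (id serial PRIMARY KEY, created_at timestamptz DEFAULT now());
INSERT INTO events DEFAULT VALUES;
-- display in UTC, without the "-04" style suffix, whatever the session TimeZone is:
SELECT created_at AT TIME ZONE 'utc' AS created_utc FROM events;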


sorting/comparing column values in non-alphanumeric sorting ways ?

2018-07-11 Thread David Gauthier
Hi:

I have a table listing tools and tool versions for a number of different
tool configurations.  Something like this...

create table tv (tool text, tcfg1 text, tcfg2 text, tcfg3 text, highestver
text);

insert into tv (tool,tcfg1,tcfg2,tcfg3) values
('tool_a','1.0.5b','1.0.10','1.0.9');

I want to load the 'highestver' column with the highest version of tcfg1-3.

This won't work...
update tv set highestver = greatest(tcfg1,tcfg2,tcfg3)
...because it thinks 1.0.9 is greater than 1.0.10

Is there a way to get this to work right ?

Thanks in Advance !
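
One approach (a sketch; it assumes the version strings are purely dotted
numbers, so a suffix like the '5b' in the sample data would need to be
stripped or handled first) is to compare them as integer arrays rather than as
text:

SELECT v
FROM unnest(ARRAY['1.0.5','1.0.10','1.0.9']) AS v
ORDER BY string_to_array(v, '.')::int[] DESC
LIMIT 1;    -- returns 1.0.10, since arrays compare element by element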


Re: Optimizing execution of expensive subqueries

2018-07-14 Thread David Rowley
On 12 July 2018 at 02:24, Mathieu Fenniak  wrote:
> I'm currently looking at a query that is generally selecting a bunch of
> simple columns from a table, and also performing some subqueries to
> aggregate related data, and then sorting by one of the simple columns and
> paginating the result.
>
> eg.
>
> SELECT
>   tbl.field1, tbl.field2, tbl.field3, ...,
>   (SELECT SUM(Duration) FROM anothertbl WHERE anothertbl.UserId = tbl.UserId
> AND anothertbl.ThingyId = 1) as Thingy1Sum,
>   ... repeat for multiply thingies ...
> FROM
>   tbl
> ORDER BY tbl.field1 LIMIT 20
>
> I'm finding that if "tbl" contains hundreds of thousands of rows, the
> subqueries are being executed hundreds of thousands of times.  Because of
> the sorting and pagination, this is appears to be unnecessary, and the
> result is slow performance.  (PostgreSQL 9.5.9 server)

You've got two choices.

1) You can add a btree index on field1 so that the executor does not
need to examine all records before taking the top-20, or;
2) move the subquery out of the target list and instead make it a LEFT
JOIN adding an appropriate GROUP BY clause.

#2 might not be a great option since it may require building groups
that don't get used, but it would likely be the bast option if you
didn't have a LIMIT clause, or the LIMIT was a larger percentage of
the total records.
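
A sketch of option 2 using the table and column names from the question (here
the GROUP BY is pushed into a derived table so only one row per UserId is
joined):

SELECT tbl.field1, tbl.field2, tbl.field3, s.thingy1sum
FROM tbl
LEFT JOIN (SELECT UserId, SUM(Duration) AS thingy1sum
           FROM anothertbl
           WHERE ThingyId = 1
           GROUP BY UserId) s ON s.UserId = tbl.UserId
ORDER BY tbl.field1
LIMIT 20;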

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: Finding out why parallel queries not avoided

2018-07-21 Thread David Rowley
On 21 July 2018 at 20:15, Didier Carlier  wrote:
> explain select count(*) from calendar c1, calendar c2, measure m where
>  c1.stddate='2015-01-01' and c2.stddate='2015-12-31' and m.fromdateid 
> >=c1.calendarid and m.fromdateid < c2.calendarid;
>   QUERY PLAN
> --
>  Aggregate  (cost=5073362.73..5073362.74 rows=1 width=8)
>->  Nested Loop  (cost=8718.47..4988195.81 rows=34066770 width=0)
>  ->  Index Scan using calendar_stddate_unique on calendar c2  
> (cost=0.28..2.30 rows=1 width=4)
>Index Cond: (stddate = '2015-12-31 00:00:00+01'::timestamp 
> with time zone)
>  ->  Nested Loop  (cost=8718.19..4647525.81 rows=34066770 width=4)
>->  Index Scan using calendar_stddate_unique on calendar c1  
> (cost=0.28..2.30 rows=1 width=4)
>  Index Cond: (stddate = '2015-01-01 
> 00:00:00+01'::timestamp with time zone)
>->  Bitmap Heap Scan on measure m  (cost=8717.91..4306855.81 
> rows=34066770 width=4)
>  Recheck Cond: ((fromdateid >= c1.calendarid) AND 
> (fromdateid < c2.calendarid))
>  ->  Bitmap Index Scan on idx_measure_fromdate  
> (cost=0.00..201.22 rows=34072527 width=0)
>Index Cond: ((fromdateid >= c1.calendarid) AND 
> (fromdateid < c2.calendarid))
>
> Both queries return the same answers but I don't see why the second one 
> doesn't use parallel query.

You'd likely be better of writing the query as:

select count(*) from measure where fromdateid >= (select calendarid
from calendar where stddate = '2015-01-01') and fromdateid < (select
calendarid from calendar where stddate = '2015-12-31');

The reason you get the poor nested loop plan is that nested loop is
the only join method that supports non-equijoin.

Unsure why you didn't get a parallel plan. Parallel in pg10 supports a
few more plan shapes than 9.6 did. Unsure what version you're using.


-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Need text wrapping inside a column to avoid record wrapping (tabular output of 'select')

2018-07-25 Thread David Gauthier
Hi:

psql (9.6.7, server 9.1.9)  on RHEL6

In order to avoid record wrapping in the tabular output of a "select"
statement, I need to limit the width of certain columns.  For those
columns, I would like to have text wrapping so as not to lose any
information (IOW, I don't want to simply truncate and lose info).

Example:

name   |age  |life_story
|favorite_color
---+-+--+--
madame marie   | 123 | She was born a long time ago, blah, blah,| yellow
   | | blah, blah, blah, blah, blah, blah, blah,|
   | | blah, more-more-mmore-more-more, |
   | | andmore-andmore-andmore-andmore, blah,   |
   | | blah, blah, blah, blah, blah, blah, blah |
   | | and now she's 123 years old  |
---+-+--+---

... that sort of thing.  Also, line breaks in the "life_story" where they
make sense, like on spaces, as you see in lines 3&4 of that column

Is there an easy way to configure PG to do this ?

Thanks in Advance !


Re: Need text wrapping inside a column to avoid record wrapping (tabular output of 'select')

2018-07-25 Thread David Gauthier
That does it.  Good enough, despite the non-white space wrapping thing.
Thanks !

On Wed, Jul 25, 2018 at 12:53 PM, Adrian Klaver 
wrote:

> On 07/25/2018 09:40 AM, David Gauthier wrote:
>
>> Hi:
>>
>> psql (9.6.7, server 9.1.9)  on RHEL6
>>
>> In order to avoid record wrapping in the tabular output of a "select"
>> statement, I need to limit the width of certain columns.  For those
>> columns, I would like to have text wrapping so as not to lose any
>> information (IOW, I don't want to simply truncate and lose info).
>>
>> Example:
>>
>> name   |age  |life_story
>> |favorite_color
>> ---+-+--
>> +--
>> madame marie   | 123 | She was born a long time ago, blah, blah,| yellow
>> | | blah, blah, blah, blah, blah, blah, blah,|
>> | | blah, more-more-mmore-more-more, |
>> | | andmore-andmore-andmore-andmore, blah,   |
>> | | blah, blah, blah, blah, blah, blah, blah |
>> | | and now she's 123 years old  |
>> ---+-+--
>> +---
>>
>> ... that sort of thing.  Also, line breaks in the "life_story" where they
>> make sense, like on spaces, as you see in lines 3&4 of that column
>>
>> Is there an easy way to configure PG to do this ?
>>
>
> In psql:
>
> create table wrap_test(name varchar, age integer, life_story varchar,
> favorite_color varchar);
>
> insert into wrap_test values ('madame marie', 123, 'She was born a long
> time ago, blah, blah, yellow eblah, blah, blah, blah, blah, blah, blah,
> blah, more-more-mmore-more-more, andmore-andmore-andmore-andmore, blah,
> blah, blah, blah, blah, blah, blah, blah and now she''s 123 years old',
> 'yellow');
>
> \pset format wrapped
>
> select * from wrap_test ;
>
> I am not showing output as email client distorts it.
>
>
>> Thanks in Advance !
>>
>
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com
>
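
A small follow-on sketch: by default the wrapped format wraps to the terminal
width, and \pset columns sets an explicit target width when that isn't what
you want (100 is just an example):

\pset format wrapped
\pset columns 100
select * from wrap_test;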


Re: Restore relhaspkey in PostgreSQL Version 11 Beta

2018-07-30 Thread David Rowley
On 31 July 2018 at 11:11, Melvin Davidson  wrote:
>> If you want stability use information_schema which we'll try very hard to
>> not ever break.
> Of course. Would you be so kind as to point out where in the
> information_schema  it
> indicates if a table has a primary key or not. Oh wait, now I remember...no
> place.

With all due respect Sir, you're making a fool of yourself here.  I'd
suggest that before you debate or argue with people that you ensure
that you're correct. This can often be hard to do on the spot, but
excuses dwindle a bit more when the communication is asynchronous via
email.

It's not that difficult to find information_schema.table_constraints
and see that constraint_type has "PRIMARY KEY"

>>Keeping random atavistic things around, would slow us down, which will be a
>>price everybody is paying.
> Random atavistic things? I hardly think relhaspkey is random. It's been
> there since version 7.2.
> Exactly how does keeping it around slow you/us down?

Well, it's been known that some people misuse relhaspkey. For example,
in [1], someone is recommending to use relhaspkey to check for tables
which don't have a PRIMARY KEY constraint.  This was the wrong advice
as the flag could remain set after the primary key was dropped from
the table and before any vacuum took place on the table. The correct
advice should have been the same as what Tom mentioned above, by
checking for a pg_index record for the table with indisprimary as
true.  Alternatively, another useful response would have been to check
information_schema, which would have provided an SQL standard way to
check.
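
For reference, a sketch of the check done the recommended way (restricted to
ordinary and partitioned tables outside the system schemas):

SELECT c.oid::regclass AS table_name
FROM pg_class c
WHERE c.relkind IN ('r', 'p')
  AND c.relnamespace NOT IN ('pg_catalog'::regnamespace, 'information_schema'::regnamespace)
  AND NOT EXISTS (SELECT 1 FROM pg_index i
                  WHERE i.indrelid = c.oid AND i.indisprimary);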

Now, in regards to [1]. I rather hope that you can sympathize with the
decision to remove the column a little as the person who made the
incorrect recommendation in [1] was none other than you yourself.  So
it seems that you've only assisted in contributing to the columns
removal by not only misusing it yourself but also instructing others,
publically to do the same.

Now, in regards to your general tone here. It appears you're under the
assumption that the column was removed for some malicious reason in
order to break people's scripts, but I can assure you, there was no
malicious intent involved. However, the column *was* removed exactly
in order to break queries.  The reason being it was most likely the
queries were already broken and we deemed the problem big enough to
remove the misleading column in order to let people know their queries
were broken.  Your argument to put the column back carries very little
weight, as it appears your script is trying to determine which tables
have no primary key incorrectly. So I'd recommend that, instead of
expending some keystrokes in replying to this email, that instead, you
spend them fixing your broken code. Tom has kindly given you a very
good starting point too.

Personally, if I had been using a query like yours, I'd be thanking
Peter for highlighting it was broken for me.

If you'd like something else to read, please also look at [2]. I
imagine this is the sort of thing that Andres is talking about.

[1] 
https://www.postgresql.org/message-id/CANu8FiyQsQg7bF3FPT+FU=kK=wjhfewpp+6qe9fxf6yxr+w...@mail.gmail.com
[2] 
https://www.postgresql.org/message-id/flat/20140317185255.20724.49675%40wrigleys.postgresql.org

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: Template0 datfrozenxid age is 160million and progressing

2018-08-01 Thread David Rowley
On 2 August 2018 at 02:31, Rijo Roy  wrote:
> I agree.. But unfortunately it is the business call and we cannot alter it..
> I am planning to convince them by keeping autovacuum_analyze_threshold to a
> high value so that auto analyse will not kick in very often leaving the
> autovacuum to do its job..
> Please advise.

If autovacuum is disabled, it'll still kick in for any anti-wraparound
work that needs to be performed.

This is also mentioned in the docs:

"Note that even when this parameter is disabled, the system will
launch autovacuum processes if necessary to prevent transaction ID
wraparound. See Section 24.1.5 for more information."

https://www.postgresql.org/docs/10/static/runtime-config-autovacuum.html
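
For monitoring purposes, the per-database transaction ID age can be checked
with something like:

select datname, age(datfrozenxid) from pg_database order by 2 desc;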

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: Weird behaviour of the planner

2018-08-01 Thread David Rowley
On 2 August 2018 at 02:48, Guillaume Lelarge  wrote:
> EXPLAIN (ANALYZE) SELECT DISTINCT * FROM gleu2;
>
>QUERY PLAN
> -
>  Unique  (cost=12005.97..12049.20 rows=1 width=1430) (actual
> time=20055.294..20323.348 rows=1 loops=1)
>->  Sort  (cost=12005.97..12006.30 rows=132 width=1430) (actual
> time=20055.290..20105.738 rows=6 loops=1)
>  Sort Key: (... 130 columns ...)
>  Sort Method: external sort  Disk: 84464kB
>  ->  Seq Scan on gleu2  (cost=0.00..12001.32 rows=132 width=1430)
> (actual time=0.109..114.142 rows=6 loops=1)
>  Planning time: 10.012 ms
>  Execution time: 20337.854 ms
> (7 rows)
>
> That looks quite good. The total cost is 12049, so I expect this plan to
> have the smaller cost as it's the choosen plan. Now, I'm disabling Sort, and
> here is what I get:
>
> SET enable_sort TO off;
> EXPLAIN (ANALYZE) SELECT DISTINCT * FROM gleu2;
>
>QUERY PLAN
> -
>  HashAggregate  (cost=12044.22..12044.23 rows=1 width=1430) (actual
> time=508.342..508.343 rows=1 loops=1)
>Group Key: (... 130 columns ...)
>->  Seq Scan on gleu2  (cost=0.00..12001.32 rows=132 width=1430) (actual
> time=0.036..57.088 rows=6 loops=1)
>  Planning time: 1.335 ms
>  Execution time: 508.529 ms
> (5 rows)

When the planner adds a new path it compares the cost not exactly, but
with a 'fuzz factor'.  It's very likely that the hashagg path did not
make it as it was not fuzzily any cheaper than the unique path. By
default, this fuzz factor is 1%.

It seems in your case the costs don't quite match reality which is
quite likely due to the poor row estimates on "gleu2".  Has that table
been analyzed recently? or is there some reason that auto-vacuum is
not getting to it?
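
You can check with something like:

select relname, last_vacuum, last_autovacuum, last_analyze, last_autoanalyze
from pg_stat_user_tables
where relname = 'gleu2';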

There's a bit more reading of what I'm talking about in
https://github.com/postgres/postgres/blob/master/src/backend/optimizer/util/pathnode.c#L141

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: ProjectSet

2018-08-02 Thread David Rowley
On 2 August 2018 at 21:17, Oliver Kohll  wrote:
> Is there an explanation of ProjectSet anywhere?

Plan node types and what they each do are not very well documented
outside of the source code.

ProjectSet appears when the SELECT list or ORDER BY clause of the query
contains set-returning functions.  They basically just execute the
set-returning function(s) for each input tuple until none of the functions
return any more records.

Simple Example:

EXPLAIN SELECT generate_series(1,2);
   QUERY PLAN
-
 ProjectSet  (cost=0.00..5.02 rows=1000 width=4)
   ->  Result  (cost=0.00..0.01 rows=1 width=0)
(2 rows)

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: ProjectSet

2018-08-02 Thread David Rowley
On 3 August 2018 at 01:16, Oliver Kohll  wrote:
> Ah thanks David, at least I know what it is now. I don't think I have any
> set returning functions though will double check in case there's a joined
> view that has one. Perhaps it could be the something to do with cross
> product which similarly creates multiple rows on the right for each row on
> the left side.

postgres=# select proretset from pg_proc where proname = 'regexp_matches';
 proretset
---
 t
 t
(2 rows)

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Instead trigger on a view to update base tables ?

2018-08-07 Thread Day, David
In a view of three joined tables, I install an INSTEAD OF trigger function on
the view.  The function contains a list of fields/columns associated with each
base table.
When an update operation occurs, I am successfully generating the target list
of columns altered on each base table (comparing OLD vs NEW) and attempting
some dynamic SQL generation in my trigger function.
 

I am taking the list of modified fields on the view and attempting an update
on the appropriate base tables.
In this sample case "language_preference" was modified on the view and should
update the admin.user base table.

EXECUTE format(' UPDATE admin.user SET (%I) = ( SELECT %I FROM $1 )  WHERE id = 
$2)', USER_SETTING, USER_SETTING )
USING NEW,  NEW.id;

When this executes my exception handler generates "err syntax error at or near 
\"$1\"

The formatted statement on my base table (admin.user) that is throwing this
when executed would be:
UPDATE admin.user SET (language_preference) = ( SELECT language_preference FROM 
$1 ) WHERE id = $2)"

Feel like I'm close but missing something fundamental.

I also see an update variant:

UPDATE [ ONLY ] table_name [ * ] [ [ AS ] alias ]
SET 
  ( column_name [, ...] ) = [ ROW ] ( { expression | DEFAULT } [, ...] )

which I thought might be applicable, but I'm still googling for a sample
implementation.


Thanks for any guidance on this method or better methods to update the base 
tables.


Regards


Dave Day






'Identifier' columns

2018-08-13 Thread David Favro
A couple of questions about auto-assigned identifier columns, 
forgive my ignorance, I'm used to other methods to create IDs...


1. If creating a new application [i.e. no "legacy" reasons to do 
anything] using PostgreSQL 10, when creating an "auto-assigned 
integer ID" column, what are the advantages/disadvantages of using 
the 'SERIAL' datatype [or equivalent explicitly created SEQUENCE w/ 
nextval() used as default for column] versus the SQL-standard 
'integer GENERATED AS IDENTITY'?  All other things being equal, it 
would seem a no-brainer to follow the standard.


2. When using the SQL-standard 'integer GENERATED AS IDENTITY' 
column, after inserting a row, what is the recommended method to 
find the ID of the just-inserted row?  Is there no SQL-standard way?  
The docs seem to imply (without explicitly stating) that a SEQUENCE 
is used behind the scenes hence 'currval()' could be used, but I 
didn't see in the docs any mention of what the underlying sequence's 
name is, or how to specify a name.  Perhaps 'lastval()' would work, 
but not in all cases and in any event it has a sloppy feel to me.


Thank you in advance for any advice that can be offered.

-- David



Re: During promotion, new master tries to archive same segment twice

2018-08-15 Thread David Steele
Hi Phil,

On 8/15/18 4:25 PM, Phil Endecott wrote:
> 
> Questions:
> 
> - Can all of this be blamed on the failure of the first attempt 
> to promote due to the wrong permissions on recovery.conf?

Yes, it looks that way.

> - Should my archive_command detect the case where it is asked to 
> write the same file again with the same contents, and report 
> success in that case?

Yes.  pgBackRest has done this for years and it saves a *lot* of headaches.

> - Is this a bug?

I don't think so.  There are a number of cases where the same WAL
segment can be pushed more than once, especially after failures where
Postgres is not sure that the command completed.  The archive command
should handle this gracefully.

Regards,
-- 
-David
da...@pgmasters.net



Re: During promotion, new master tries to archive same segment twice

2018-08-16 Thread David Steele
On 8/16/18 4:37 AM, Phil Endecott wrote:
> David Steele wrote:
>> On 8/15/18 4:25 PM, Phil Endecott wrote:
>>> - Should my archive_command detect the case where it is asked to
>>> write the same file again with the same contents, and report success
>>> in that case?
>>
>> Yes.
> 
>> There are a number of cases where the same WAL
>> segment can be pushed more than once, especially after failures where
>> Postgres is not sure that the command completed.  The archive command
>> should handle this gracefully.
> 
> Hmm, OK.  Here's what the current docs say:
> 
> Section 25.3.1:
> 
> "The archive command should generally be designed to refuse to
> overwrite any pre-existing archive file. This is an important
> safety feature to preserve the integrity of your archive in case
> of administrator error (such as sending the output of two
> different servers to the same archive directory).
> 
> It is advisable to test your proposed archive command to ensure
> that it indeed does not overwrite an existing file, and that it
> returns nonzero status in this case."
> 
> And section 26.2.9:
> 
> "When continuous WAL archiving is used in a standby, there
> are two different scenarios: the WAL archive can be shared
> between the primary and the standby, or the standby can
> have its own WAL archive.  When the standby has its own WAL
> archive, set archive_mode to always, and the standby will call
> the archive command for every WAL segment it receives, whether
> it's by restoring from the archive or by streaming replication.
> The shared archive can be handled similarly, but the
> archive_command must test if the file being archived exists
> already, and if the existing file has identical contents.
> This requires more care in the archive_command, as it must be
> careful to not overwrite an existing file with different contents,
> but return success if the exactly same file is archived twice.
> And all that must be done free of race conditions, if two
> servers attempt to archive the same file at the same time."
> 
> So you're saying that that's wrong, and that I must always
> handle the case when the same WAL segment is written twice.

Seems like an omission in section 25.3.1 rather than a problem in 26.2.9.

Duplicate WAL is possible in *all* cases.  A trivial example is that
Postgres calls archive_command and it succeeds but an error happens
(e.g. network) right before Postgres is notified.  It will wait a bit
and try the same WAL segment again.

> I'll file a bug against the documentation.

OK.

>> pgBackRest has done this for years and it saves a *lot* of headaches.
> 
> The system to which I am sending the WAL files is a rsync.net
> account.  I use it because of its reliability, but methods for
> transferring files are limited largely to things like scp and
> rsync.

Rsync and scp are not good tools to use for backup because there is no
guarantee of durability, i.e. the file is not synced to disk before
success is returned.  rsync.net may have durability guarantees but you
should verify that with them.

Even so, crafting a safe archive_command using these tools is going to
be very tricky.

Regards,
-- 
-David
da...@pgmasters.net



Can Pg somehow recognize/honor linux groups to control user access ?

2018-08-22 Thread David Gauthier
Hi:

The title says it all.  I need to control who can gain access to a DB
based on a linux user group.  I can set up a generic role and password, but
also want to prevent users who are not in a specific linux group from
accessing the DB.  For code that works with the DB, this is easy (just
chmod the group on the code file(s)).  But is there a way to add an
additional gauntlet that checks membership in the linux group if, for
example, they were trying to get in using psql at the linux prompt ?

There are a couple hundred in the linux group and the list of names changes
constantly. I suppose creating a DB role per user in the linux group may be
possible if something like a cron was maintaining this (creating/dropping
uid based roles as the group membership changes) then give everyone the
same password.  But does that prevent someone outside the linux group from
just logging in with someone else's uid and the generic password?

I'm hoping that this is a common need and that someone has a good solution.

Thanks in Advance for any help!


unorthodox use of PG for a customer

2018-08-24 Thread David Gauthier
Hi Everyone:

I'm going to throw this internal customer request out for ideas, even
though I think it's a bit crazy.  I'm on the brink of telling him it's
impractical and/or inadvisable.  But maybe someone has a solution.

He's writing a script/program that runs on a workstation and needs to write
data to a DB.  This process also sends work to a batch system on a server
farm external to the workstation that will create multiple, parallel
jobs/processes that also have to write to the DB as well. The workstation
may have many of these jobs running at the same time.  And there are 58
workstations which all have/use locally mounted disks for this work.

At first blush, this is easy.  Just create a DB on a server and have all
those clients work with it.  But he's also adamant about having the DB on
the same server(s) that ran the script AND on the locally mounted disk.  He
said he doesn't want the overhead, dependencies and worries of anything
like an external DB with a DBA, etc... . He also wants this to be fast.

My first thought was SQLite.  Apparently, they now have some sort of
multiple, concurrent write ability.  But there's no way those batch jobs on
remote machines are going to be able to get at the locally mounted disk on
the workstation. So I dismissed that idea.  Then I thought about having 58
PG installs, one per workstation, each serving all the jobs pertaining to
that workstation.  That could work.  But 58 DB instances ?  If he didn't
like the ideal of one DBA, 58 can't be good.  Still, the DB would be on the
workstation which seems to be what he wants.

I can't think of anything better.  Does anyone have any ideas?

Thanks in Advance !


Re: unorthodox use of PG for a customer

2018-08-24 Thread David Gauthier
I tried to convince him of the wisdom of one central DB.  I'll try again.

>>So are the 58 database(stores) on the workstation going to be working
with data independent to each or is the data shared/synced between
instances?

No, 58 workstations, each with its own DB.  There's a concept of a
"workarea" (really a dir with a lot of stuff in it) where the script runs.
He wants to tie all the runs for any one workarea together and is stuck on
the idea that there should be a separate DB per workarea.  I told him you
could just stick all the data in the same table just with a "workarea"
column to distinguish between the workareas.  He likes the idea of a
separate DB per workarea.  He just doesn't get it.


>>I'm no expert, but I've dozens of PostgreSQL databases running mostly
without manual maintenance for years.

Ya, I've sort of had the same experience with PG DBs.  Like the Eveready
bunny, they just keep on running.  But these workstations are pretty
volatile as they keep overloading them and crash them.  Of course any DB
running would die too and have to be restarted/recovered.  So the place
for  the DB is really elsewhere, on an external server that wouldn't be
subject to this volatility and crashing.  I told him about transactions and
how you could prevent partial writing of data sets.

So far, I'm not hearing of anything that looks like a solution given the
constraints he's put on this.  Don't get me wrong, he's a very smart and
sharp software engineer.  Very smart.  But for some reason, he doesn't like
the client/server DB model which would work so nicely here.  I'm just
trying to make sure I didn't miss some sort of solution, PG or not, that
would work here.

Thanks for your interest and input everyone !




On Fri, Aug 24, 2018 at 2:39 PM Edson Carlos Ericksson Richter <
rich...@simkorp.com.br> wrote:

> Em 24/08/2018 15:18, David Gauthier escreveu:
> > Hi Everyone:
> >
> > I'm going to throw this internal customer request out for ideas, even
> > though I think it's a bit crazy.  I'm on the brink of telling him it's
> > impractical and/or inadvisable.  But maybe someone has a solution.
> >
> > He's writing a script/program that runs on a workstation and needs to
> > write data to a DB.  This process also sends work to a batch system on
> > a server farm external to the workstation that will create multiple,
> > parallel jobs/processes that also have to write to the DB as well. The
> > workstation may have many of these jobs running at the same time.  And
> > there are 58 workstation which all have/use locally mounted disks for
> > this work.
> >
> > At first blush, this is easy.  Just create a DB on a server and have
> > all those clients work with it.  But he's also adamant about having
> > the DB on the same server(s) that ran the script AND on the locally
> > mounted disk.  He said he doesn't want the overhead, dependencies and
> > worries of anything like an external DB with a DBA, etc... . He also
> > wants this to be fast.
> > My first thought was SQLite.  Apparently, they now have some sort of
> > multiple, concurrent write ability.  But there's no way those batch
> > jobs on remote machines are going to be able to get at the locally
> > mounted disk on the workstation. So I dismissed that idea. Then I
> > thought about having 58 PG installs, one per workstation, each serving
> > all the jobs pertaining to that workstation.  That could work.  But 58
> > DB instances ?  If he didn't like the ideal of one DBA, 58 can't be
> > good.  Still, the DB would be on the workstation which seems to be
> > what he wants.
> > I can't think of anything better.  Does anyone have any ideas?
> >
> > Thanks in Advance !
> >
>
> I'm no expert, but I've dozens of PostgreSQL databases running mostly
> without manual maintenance for years, just do the backups, and you are
> fine.
> In any way, if you need any kind of maintenance, you can program it in
> your app (even backup, restore and vacuum) - it is easy to throw
> administrative commands thru the available interfaces.
> And if the database get out of access, no matter if it is centralized or
> remote: you will need someone phisically there to fix it.
> AFAIK, you don't even PostgreSQL installer - you can run it embed if you
> wish.
>
> Just my2c,
>
> Edson
>
>
>


Re: Will there ever be support for Row Level Security on Materialized Views?

2018-08-27 Thread David Rowley
On 28 August 2018 at 09:58, Ken Tanzer  wrote:
> On Tue, Aug 14, 2018 at 5:24 PM Ken Tanzer  wrote:
>> Basically, I'm wondering whether materialized views are likely to ever 
>> support row-level security.
>
> Hi.  Just wanted to follow up on my own email and see if anyone could answer 
> this.

You might think there's some master project-wide list of things that
are to be implemented that we all work from, but there's not. There is a
todo list [1] that might serve as some sort of guide for a new person
who wishes to contribute something, who's looking for inspiration as
to what to contribute, but I don't think that list ever goes much
beyond that.

New features normally appear for one of two reasons:

1) A developer gets inspired to make something happen; or
2) Someone pays a developer to make something happen.

So I guess in this case either 1 or 2 has not happened, or they have
but the correct people have either not seen your email or they just
simply don't want to answer.

If you're desperate for the feature, I'd recommend looking into the
possibility of going with 1 or 2.

[1] https://wiki.postgresql.org/wiki/Todo



Re: Erroneous behavior of primary key

2018-08-27 Thread David Rowley
On 28 August 2018 at 07:50, Daniel J Peacock  wrote:
> I've got an odd situation with a table that has a varchar(255) as the
> primary key that is getting key values from an Elasticsearch engine.  What
> I'm finding is that even though there is a primary key on the table, I am
> getting duplicated keys in the data and the constraint is not blocking
> these.

How do you know they're duplicated?

>  When I do a "select <column>, count(*) from <table> group by <column>
>  having count(*) > 1" I get no results.  Yet, when I search the
> table for a value that is like a key I know to be duplicated, I get multiple
> results.  When I select from the table where field is equal to the
> duplicated field I get one result.  I verified that they are distinct row
> with ctid.  I also created a clone of the table with CTAS and then tried to
> create a unique index on the id varchar field but that failed with
> "duplicate keys found".  I'm stumped as to what could be the problem.
> The only thing that I can think of is that the primary key is somehow
> corrupt.  I've noticed this behavior on other tables on this database.
> What could be causing this sort of problem?

If the index is corrupt then you might find that:

set enable_indexscan = 0;
set enable_indexonlyscan = 0;
select <column>, count(*) from <table> group by <column> having count(*) > 1;

would return some rows.  You should also verify the above query does
use a Seq Scan by performing an EXPLAIN on the query.

There are issues that have been fixed in previous releases which could
have caused an index to get corrupted in this way, so it's quite
useful to know which version of PostgreSQL you're running here and if
you've paid attention to the release notes when you've previously
upgraded. For example, see [1].

[1] https://www.postgresql.org/docs/9.6/static/release-9-6-2.html

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: Will there ever be support for Row Level Security on Materialized Views?

2018-08-27 Thread David Rowley
On 28 August 2018 at 11:06, Ken Tanzer  wrote:
> Or perhaps they "simply don't want to answer." Well I wondered about that 
> too, since the last person who asked about this didn't get a response either. 
>  Is it a particularly stupid question?  Is there some history that makes it a 
> touchy subject?  Or is it just the Bermuda Triangle of this mailing list? :)

I can't imagine why it would be a touchy subject.  Generally, if
you're working on a feature its best not to keep it a secret as if
someone else does the same, then you end up with redundant work being
done.

If I had to guess what's going on here then I'd say that nobody has
been sufficiently motivated to work on this yet. If that's the case,
everyone who reads your email is not the person working on this
feature, so can't answer your question.   I just answered to suggest
the reasons why you might not be getting an answer.


-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Autovacuum degrades all other operations by keeping all buffers dirty?

2018-08-31 Thread David Pacheco
Hello,

We've been struggling with some major performance issues related to autovacuum.
(I know this is a common problem.)  For a while, we believed this was primarily
related to I/O contention, but recent observations from our monitoring make me
wonder if there's a deeper issue here, so I'm looking for some better
understanding.

From reading the 9.6.3 source, it looks like the autovacuum process itself is
single-threaded, and it reads pages essentially linearly from the relation
(possibly skipping some).  When the autovacuum process needs to modify a page,
it doesn't write it directly, but rather marks the buffer dirty.  The page will
be written later, either by the checkpointer (according to its configuration, in
terms of time and WAL), the bgwriter (according to its configuration, in terms
of write count and sleep time), or else some other backend process that requires
a free buffer (if the count of non-dirty buffers reaches zero).  Is this
accurate?

In our case, we found that when autovacuum runs, the number of dirty buffers
written by regular backends shoots from a handful to as much as 300 buffers per
second.  (We have 200 backends on most databases.)  More specifically: prior to
recent autovacuums starting, databases did under 1000 buffer allocations per
second.  Most of those were done by the checkpointer.  Individual backends did
just a handful.  After autovacuum started, buffer allocations rose to between
2,000 and 8,000 per second.  The checkpointer handled many of these, but so did
individual backends (see above).  The bgwriter peaked around 25 buffer writes
per second.  So it seems like the spike in buffers written by normal backends
could explain the significant degradation that we see in average latency and
overall throughput (which can be as much as 50%).

It looks to me like the autovacuum process is effectively generating work (in
the form of async writes) that's being distributed implicitly to the various
backend processes, creating latency for any other query that happens to require
a buffer (including read-only queries).  Maybe we can improve this by tuning the
bgwriter.  But if that's single-threaded, presumably there's an upper bound to
how many buffer writes it can handle?  Is there anything to guarantee that the
bgwriter will keep up with the work that's being generated?  That is, for some
workloads, wouldn't it be possible that autovacuum could always generate work
faster than the bgwriter can do it, and you'd always have some pretty major
degradation to all other queries?

I've drawn a lot of inferences here, and I'm sure there are some mistakes.  I'd
appreciate any clarification, correction, or confirmation!

Thanks in advance,
Dave


Re: Barman versus pgBackRest

2018-09-04 Thread David Steele
On 9/4/18 11:24 AM, Joshua D. Drake wrote:
> On 09/04/2018 07:52 AM, Ron wrote:
>> On 09/04/2018 09:24 AM, Joshua D. Drake wrote:
>>> On 09/04/2018 07:14 AM, Ron wrote:
>>>>
>>>> That was about barman, in the barman group.  This is asking about
>>>> pgbackrest...  :)
>>>>
>>>> So: does pgbackrest have this ability which barman does not have?
>>>> The "--db-include" option seems to indicate that you can restore a
>>>> single db, but does indicate whether or not you can rename it.
>>>
>>> https://pgbackrest.org/configuration.html#section-restore/option-db-include
>>>
>>>
>>
>> Which implies that you can't do it?
> 
> You can restore a single database and then issue a simple ALTER DATABASE
> command to change the DB name.

This will work, but I don't think it's what Ron is getting at.

To be clear, it is not possible to restore a database into an *existing*
cluster using pgBackRest selective restore.  This is a limitation of
PostgreSQL file-level backups.

To do what Ron wants you would need to restore it to a new cluster, then
use pg_dump to logically dump and restore it to whatever cluster you
want it in.  This still saves time since there is less to restore but is
obviously not ideal.

Regards,
-- 
-David
da...@pgmasters.net



Re: Barman versus pgBackRest

2018-09-04 Thread David Steele
On 9/4/18 11:55 AM, Ron wrote:
> On 09/04/2018 10:51 AM, David Steele wrote:
> [snip]
>> This will work, but I don't think it's what Ron is getting at.
>>
>> To be clear, it is not possible to restore a database into an *existing*
>> cluster using pgBackRest selective restore.  This is a limitation of
>> PostgreSQL file-level backups.
>>
>> To do what Ron wants you would need to restore it to a new cluster, then
>> use pg_dump to logically dump and restore it to whatever cluster you
>> want it in.  This still saves time since there is less to restore but is
>> obviously not ideal.
> 
> That's exactly what I'm referring to.
> 
> Presumably I could restore it to a new cluster on the same VM via initdb
> on a different port and PGDATA directory?

Definitely.  No need to initdb since all the required files will be
restored by pgBackRest.  You'll just need to create an empty directory
to restore into.

Regards,
-- 
-David
da...@pgmasters.net



Re: Barman versus pgBackRest

2018-09-04 Thread David Steele
Hi Ron,

On 9/4/18 7:41 AM, Ron wrote:
> On 03/09/2018 08:56 AM, David Steele wrote:
> [snip]
>>> About pgBarman, I like :
>>> - be able restore on a remote server from the backup server
>> This is a good feature, and one that has been requested for pgBackRest. You
>> can do this fairly trivially with ssh, however, so it generally hasn't
>> been a big deal for people.  Is there a particular reason you need this
>> feature?
> 
> (Sorry to dredge up this old thread.)
> 
> Do you just change the IP address of the "restore target"?

[I'll assume you wanted to hear about pgBackRest here since we discussed
it down thread.]

Generally restores are done from the database server, but if you want to
run a restore from the backup server you can run it via ssh:

ssh user@pg-server pgbackrest [...]

> 
>>> - use replication slots for backingup wal on the backup server.
>> Another good feature.  We have not added it yet because pgBackRest was
>> originally written for very high-volume clusters (100K+ WAL per day) and
>> our parallel async feature answers that need much better.  We recommend
>> a replicated standby for more up-to-date data.
> 
> Every N minutes you copy the WAL files to the backup server?
> 
> 

-- 
-David
da...@pgmasters.net



Re: Barman versus pgBackRest

2018-09-04 Thread David Steele
On 9/4/18 7:41 AM, Ron wrote:
> On 03/09/2018 08:56 AM, David Steele wrote:
> 
>>> - use replication slots for backingup wal on the backup server.
>> Another good feature.  We have not added it yet because pgBackRest was
>> originally written for very high-volume clusters (100K+ WAL per day) and
>> our parallel async feature answers that need much better.  We recommend
>> a replicated standby for more up-to-date data.
> 
> Every N minutes you copy the WAL files to the backup server?

[Accidentally hit send on the previous post, here's the rest...]

WAL segments are transferred whenever Postgres indicates that a segment
is finished via the archive_command.  Async archiving "looks ahead" to
find WAL segments that are ready to archive.

You can use archive_timeout to force Postgres to push a WAL segment
every N seconds for clusters that have idle time.
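
For example, assuming a one-minute ceiling on archiving delay is acceptable,
this can be changed without a restart:

alter system set archive_timeout = '60s';
select pg_reload_conf();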


Regards,
-- 
-David
da...@pgmasters.net



Re: Autovacuum degrades all other operations by keeping all buffers dirty?

2018-09-05 Thread David Pacheco
On Fri, Aug 31, 2018 at 3:50 PM, Andres Freund  wrote:

> On 2018-08-31 19:31:47 -0300, Alvaro Herrera wrote:
> > On 2018-Aug-31, David Pacheco wrote:
> >
> > > From reading the 9.6.3 source, it looks like the autovacuum process
> > > itself is single-threaded, and it reads pages essentially linearly
> > > from the relation (possibly skipping some).  When the autovacuum
> > > process needs to modify a page, it doesn't write it directly, but
> > > rather marks the buffer dirty.  The page will be written later,
> >
> > Unless there's some bug, there is a BufferAccessStrategy that only lets
> > a few dozen buffers go unwritten before the autovac worker process
> > itself is forced to write some.
>
> I've not re-checked, but I'm not sure that's true if the buffer is
> already in s_b, which it'll be for many workloads.
>


Does that mean this analysis from above is accurate?

> It looks to me like the autovacuum process is effectively generating work
> (in the form of async writes) that's being distributed implicitly to the
> various backend processes, creating latency for any other query that
> happens to require a buffer (including read-only queries).


Thanks,
Dave


nested query problem

2018-09-06 Thread David Gauthier
Hi:

I'm having trouble with this query...

select sr.sqf_id, sr.status, fse.perl_sub_name, fse.end_datetime
  from
sqf_runs sr,
(select perl_sub_name, end_datetime from flow_step_events_view where
sqf_id = sr.sqf_id order by 2 limit 1) fse
  where sr.userid='foo';

ERROR:  invalid reference to FROM-clause entry for table "sr"
LINE 4: ...atetime from flow_step_events_view where sqf_id = sr.sqf_id ...
 ^
HINT:  There is an entry for table "sr", but it cannot be referenced from
this part of the query.


If this is indeed a foul, how can I accomplish the same thing ?

Thanks in Advance !


Re: nested query problem

2018-09-06 Thread David Gauthier
Not quite.  This returns one value. In the actual "sqf_runs" table, there
are many records with user_id = 'foo'.  I want one line for each where the
fse.perl_sub_name and fse.end_datetime values are the latest values found
in the flow_step_events_view view where the sqf_ids match.

On Thu, Sep 6, 2018 at 5:10 PM Paul Jungwirth 
wrote:

> On 09/06/2018 01:59 PM, David Gauthier wrote:
> > I'm having trouble with this query...
> >
> > select sr.sqf_id, sr.status, fse.perl_sub_name, fse.end_datetime
> >from
> >  sqf_runs sr,
> >  (select perl_sub_name, end_datetime from flow_step_events_view
> > where sqf_id = sr.sqf_id order by 2 limit 1) fse
> >where sr.userid='foo';
> >
> > ERROR:  invalid reference to FROM-clause entry for table "sr"
> > LINE 4: ...atetime from flow_step_events_view where sqf_id = sr.sqf_id
> ...
> >   ^
> > HINT:  There is an entry for table "sr", but it cannot be referenced
> > from this part of the query.
>
> This calls for a lateral join:
>
>  SELECT  sr.sqf_id, sr.status, fse.perl_sub_name, fse.end_datetime
>  FROMsqf_runs sr
>  LEFT OUTER JOIN LATERAL (
>SELECT  perl_sub_name, end_datetime
>FROMflow_step_events_view fsev
>WHERE   fsev.sqf_id = sr.sqf_id
>ORDER BY 2
>LIMIT 1
>  ) fse
>  ON  true
>  WHERE   sr.userid = 'foo'
>  ;
>
> It's nearly what you had already, but `LATERAL` lets the subquery
> reference columns in the other tables.
>
> A lateral join is conceptually a lot like running your subquery in for
> loop, looping over all the rows produced by the rest of the query. It
> doesn't have to produce 1 row for each iteration, but saying `LIMIT 1`
> ensures that here.
>
> The `ON true` is just pro forma because you can't have a join without an
> `ON` clause.
>
> You might prefer an INNER JOIN LATERAL, depending on your needs.
>
> --
> Paul  ~{:-)
> p...@illuminatedcomputing.com
>
>


Re: nested query problem

2018-09-06 Thread David Gauthier
Wow, I take that back.  I thought there were many recs with "foo" but there
was only one.
When I ran this against a value that actually had multiple records, it ran
fine.

Sorry for that.
And Thanks for this query !


On Thu, Sep 6, 2018 at 5:15 PM David Gauthier 
wrote:

> Not quite.  This returns one value. In the actual "sqf_runs" table, there
> are many records with user_id = 'foo'.  I want one line for each where the
> fse.perl_sub_name and fse.end_datetime values are the latest values found
> in the flow_step_events_view view where the sqf_ids match.
>
> On Thu, Sep 6, 2018 at 5:10 PM Paul Jungwirth 
> wrote:
>
>> On 09/06/2018 01:59 PM, David Gauthier wrote:
>> > I'm having trouble with this query...
>> >
>> > select sr.sqf_id, sr.status, fse.perl_sub_name, fse.end_datetime
>> >from
>> >  sqf_runs sr,
>> >  (select perl_sub_name, end_datetime from flow_step_events_view
>> > where sqf_id = sr.sqf_id order by 2 limit 1) fse
>> >where sr.userid='foo';
>> >
>> > ERROR:  invalid reference to FROM-clause entry for table "sr"
>> > LINE 4: ...atetime from flow_step_events_view where sqf_id = sr.sqf_id
>> ...
>> >   ^
>> > HINT:  There is an entry for table "sr", but it cannot be referenced
>> > from this part of the query.
>>
>> This calls for a lateral join:
>>
>>  SELECT  sr.sqf_id, sr.status, fse.perl_sub_name, fse.end_datetime
>>  FROMsqf_runs sr
>>  LEFT OUTER JOIN LATERAL (
>>SELECT  perl_sub_name, end_datetime
>>FROMflow_step_events_view fsev
>>WHERE   fsev.sqf_id = sr.sqf_id
>>ORDER BY 2
>>LIMIT 1
>>  ) fse
>>  ON  true
>>  WHERE   sr.userid = 'foo'
>>  ;
>>
>> It's nearly what you had already, but `LATERAL` lets the subquery
>> reference columns in the other tables.
>>
>> A lateral join is conceptually a lot like running your subquery in for
>> loop, looping over all the rows produced by the rest of the query. It
>> doesn't have to produce 1 row for each iteration, but saying `LIMIT 1`
>> ensures that here.
>>
>> The `ON true` is just pro forma because you can't have a join without an
>> `ON` clause.
>>
>> You might prefer an INNER JOIN LATERAL, depending on your needs.
>>
>> --
>> Paul  ~{:-)
>> p...@illuminatedcomputing.com
>>
>>


Re: pgbackrest when data/base is symlinked to another volume

2018-09-07 Thread David Steele

Hi Ron,

On 9/6/18 11:21 PM, Ron wrote:


Will pgbackrest properly backup and restore the cluster if data/base, 
data/pg_xlog and data/pg_log are symlinks?


PGDATA=/var/lib/pgsql/9.6/data
$PGDATA/base -> /Database/9.6/base
$PGDATA/pg_log -> /Database/9.6/pg_log
$PGDATA/pg_xlog -> /Database/9.6/pg_xlog


Yes, this will work.  Note that restore does not recreate symlinks by 
default so you'll need to specify --link-all to enable symlink creation.


See 
https://pgbackrest.org/configuration.html#section-restore/option-link-all 
for details.


Using symlinks in this way will make management of your clusters more 
difficult, mostly because systems need more provisioning before restores 
can be performed.  In general I'd recommend against it unless there are 
performance considerations.


Also, you might consider using log_directory to relocate log files 
rather than a symlink.  This will exclude log files from your backup 
which is usually preferable -- primary logs restored to a standby are 
out of context and can cause confusion.
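
For example (assuming the logging collector is in use -- log_directory only
takes effect when logging_collector = on):

alter system set logging_collector = on;   -- needs a restart
alter system set log_directory = '/Database/9.6/pg_log';
select pg_reload_conf();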


Regards,
--
-David
da...@pgmasters.net



Re: pgbackrest when data/base is symlinked to another volume

2018-09-08 Thread David Steele

On 9/7/18 8:47 PM, Ron wrote:

On 09/07/2018 05:22 PM, David Steele wrote:


On 9/6/18 11:21 PM, Ron wrote:


Will pgbackrest properly backup and restore the cluster if data/base, 
data/pg_xlog and data/pg_log are symlinks?


PGDATA=/var/lib/pgsql/9.6/data
$PGDATA/base -> /Database/9.6/base
$PGDATA/pg_log -> /Database/9.6/pg_log
$PGDATA/pg_xlog -> /Database/9.6/pg_xlog


Yes, this will work.  Note that restore does not recreate symlinks by 
default so you'll need to specify --link-all to enable symlink creation.


See 
https://pgbackrest.org/configuration.html#section-restore/option-link-all 
for details.


Using symlinks in this way will make management of your clusters more 
difficult, mostly because systems need more provisioning before 
restores can be performed.  In general I'd recommend against it unless 
there are performance considerations.


Now that I'm thinking more about what you wrote... "data" isn't on it's 
own partition.  data/*base* has it's own partition.


What's the recommended method for putting *base**/* on a partition 
different from data/?  Or is that not recommended?


All the user data goes in base so there's really no need to separate it 
out of data.  Typically pg_wal and tablespaces are relocated onto 
different devices for performance (or to get more space).  If the 
partitions are on the same device then there's no performance benefit, 
just admin hassle.


--
-David
da...@pgmasters.net



Re: Volume partitioning (was Re: pgbackrest when data/base is symlinked to another volume)

2018-09-10 Thread David Steele

Hi Ron

On 9/8/18 4:26 PM, Ron wrote:


Googled "postgresql disk partitioning" and "postgresql volume 
partitioning" without much success.


Is the best practice volume partitioning:
/Database/9.6/data
/Database/9.6/data/pg_log
/Database/9.6/data/pg_xlog

where /var/lib/pgsql/9.6 (on RHEL6) is a symlink to /Database/9.6/data 
and PGDATA=/Database/9.6/data


*or *

/Database/9.6/data/base
/Database/9.6/data/pg_log
/Database/9.6/data/pg_xlog

where PGDATA=/var/lib/pgsql/9.6/data and base, pg_log and pg_xlog are 
symlinks to the partitions?


This is really a matter of preference, but simpler is better, so the 
first one looks better to me.


In general I prefer to keep everything in /var/lib/pgsql/9.6:

/var/lib/pgsql/9.6/data
/var/lib/pgsql/9.6/data/pg_xlog -> /var/lib/pgsql/9.6/wal

Then use the log_directory setting to put logs in:

/var/log/pgsql or /var/lib/pgsql/9.6/log

pgBackRest will expect to be pointed at a real directory (pg-path) and 
expect the data_directory in PostgreSQL to match.


Regards,
--
-David
da...@pgmasters.net



Re: Code of Conduct plan

2018-09-14 Thread David Wall

On 9/14/18 7:52 AM, James Keener wrote:


I fail to see how that makes everyone here part of a community anymore 
than I'm part of the "community" of regulars at a bar I walk into for 
the first time.


As I said, the rules can and should apply within the list, but 
applying them outside the list is odd and wreaks of authoritarianism.


Jim
In the 20 years I've been using PG, I've not noted any bizarre "list 
speech" except this discussion that suggests others should monitor 
people's behavior wherever they are, and report any "infraction" to PG, 
so PG can boot them.  I'm with those who think that idea is 
diametrically opposed to open source's freedom.  What next, monitor what 
apps people are using their DB for and decide if the "community" 
approves of its character or not?


David



Convert interval to hours

2018-09-14 Thread David Gauthier
Hi:

In perl/DBI, I have code that's getting me an "age" which returns something
like... "-17 days -08:29:35".  How can I convert that to a number of hours
(as a float I would presume) ?
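
For reference, one approach that should work is to extract the epoch seconds
from the interval and divide by 3600, e.g.

select extract(epoch from interval '-17 days -08:29:35') / 3600.0 as hours;
-- roughly -416.49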

Thanks


Re: Setting up continuous archiving

2018-09-26 Thread David Steele

On 9/26/18 8:20 AM, Yuri Kanivetsky wrote:


I'm trying to compile a basic set of instruction needed to set up
continuous archiving and to recover from a backup. I'm running
PostgreSQL 9.3 on Debian Stretch system.


This is an incredibly complex topic and it is very difficult to 
implement correctly.


My advice is that you use a mature backup solution like pgBackRest or 
barman rather than rolling your own.


--
-David
da...@pgmasters.net



How can I get and handle the status of sql statements that run in plpgsql ?

2018-10-02 Thread David Gauthier
Hi:
psql (9.6.7, server 9.5.2) on linux

How does one get the status of an sql statement executed in plpgsql?  If
that status is cryptic, how can that be translated to something which
someone could understand?   Finally, how can I effectively do a start
transaction and either rollback or commit based on the results of the sql
statements run?

create or replace function xfer_savings_to_checking(acct_no text, howmuch
float)
 returns text as $$

  declare
savings_balance float;
checking_balance float;
  begin

select balance into savings_balance from savings_acct_info where acct =
acct_no;
sql_status = <status of the above statement>
if(something_went_wrong) then
  raise exception 'select savings statement was bad "%"',sql_status;
  return 'error';
end if;

select balance into checking_balance from checking_acct_info where acct
= acct_no;
sql_status = <status of the above statement>
if(something_went_wrong) then
  raise exception 'select checking statement was bad "%"',sql_status;
  return 'error';
end if;

if(howmuch > savings_balance) then
  raise notice 'Hey, you dont have that much to xfer !  You only have
%',savings_balance;
  return 'error';
end if;

start transaction;

  update savings_acct_info set balance = balance - howmuch where acct =
acct_no;
  sql_status = <status of the above statement>
  if(something_went_wrong) then
raise exception 'updating savings acct "%"',sql_status;
rollback;
return 'error';
  end if;

  update checking_acct_info set balance = balance + howmuch where acct
= acct_no;
  sql_status = <status of the above statement>
  if(something_went_wrong) then
raise exception 'updating checking acct "%"',sql_status;
rollback;
return 'error';
  end if;

commit;

  end;
$$ language plpgsql;

Of course I don't know what the <get sql status> and
"something_went_wrong" pieces look like, or if they even make sense with how
this sort of thing should be properly handled in plpgsql.  Also, in my
trials, it appears that plpgsql doesn't like "start transaction".  So how
is that piece done ?

Thanks in Advance for any help !


Re: How can I get and handle the status of sql statements that run in plpgsql ?

2018-10-03 Thread David Gauthier
Thanks Adrian and Christopher !

So the transaction will be rolled back automatically if there's a problem.
Got it !

Question:  How do I detect when it is appropriate to raise notice so as to
be able to see the error message?  Or will that automatically be sent to
stdout if an error occurs?

Question: Also, how can I detect how many records were operated on by, say,
an update statement?  In the example I gave, I would want to make sure that
the transfer amount was deducted from the savings AND that the amount was
added to the checking.  Both need to be 1, else I want to...
1) send a message (raise notice)
2) rollback somehow.




On Tue, Oct 2, 2018 at 5:27 PM Adrian Klaver 
wrote:

> On 10/2/18 1:47 PM, David Gauthier wrote:
> > Hi:
> > psql (9.6.7, server 9.5.2) on linux
> >
> > How does one get the status of an sql statement executed in plpgsql?  If
> > that status is cryptic, how can that be translated to something which
> > someone could understand?   Finally, how can I effectively do a start
> > transaction and either rollback or commit based on the results of the
> > sql statements run?
> >
>
> >
> > Of course I don't know what the  and
> > "something_went_wrong" pieces look like, or they even make sense with
> > how this sort of thing shold be properly handled in plpgsql.  Also, in
>
> The below(read to bottom of the page) might help:
>
>
> https://www.postgresql.org/docs/10/static/plpgsql-control-structures.html#PLPGSQL-ERROR-TRAPPING
>
>
> > my trials, it appears that plpgsql doesn't like "start transaction".  So
> > how is that piece done ?
> >
> > Thanks in Advance for any help !
> >
>
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com
>


Re: How can I get and handle the status of sql statements that run in plpgsql ?

2018-10-03 Thread David Gauthier
I found "no_data" here...
https://www.postgresql.org/docs/10/static/errcodes-appendix.html

update blah, blah...
if(no_data) then
  raise exception "update failed to update anything";
end if

Is that how you detect if nothing was updated and how to make sure the
thing returns and rolls back ?
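
For what it's worth, a minimal PL/pgSQL sketch of that pattern (n_rows here
would be a declared integer variable) might look like:

update checking_acct_info set balance = balance + howmuch where acct = acct_no;
if not found then
  raise exception 'no checking row was updated for acct %', acct_no;
end if;

-- or, when an exact count is needed:
get diagnostics n_rows = row_count;
if n_rows <> 1 then
  raise exception 'expected 1 row updated, got %', n_rows;
end if;

An unhandled raise exception aborts the function and the surrounding
transaction's work is rolled back.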

On Wed, Oct 3, 2018 at 11:46 AM David Gauthier 
wrote:

> Thanks Adrian and Christopher !
>
> So the transaction will be rolled back automatically if there's a
> problem.  Got it !
>
> Question:  How do I detect when it is appropriate to raise notice so as to
> be able to see the error message?  Or will that automatically be sent to
> stdout if an error occurs?
>
> Question: Also, how can I detect how many records were operated on by,
> say, an update statement?  In the example I gave, I would want to make sure
> that the transfer amount was deducted from the savings AND that the amount
> was added to the checking.  Both need to be 1, else I want to...
> 1) send a message (raise notice)
> 2) rollback somehow.
>
>
>
>
> On Tue, Oct 2, 2018 at 5:27 PM Adrian Klaver 
> wrote:
>
>> On 10/2/18 1:47 PM, David Gauthier wrote:
>> > Hi:
>> > psql (9.6.7, server 9.5.2) on linux
>> >
>> > How does one get the status of an sql statement executed in plpgsql?
>> If
>> > that status is cryptic, how can that be translated to something which
>> > someone could understand?   Finally, how can I effectively do a start
>> > transaction and either rollback or commit based on the results of the
>> > sql statements run?
>> >
>>
>> >
>> > Of course I don't know what the  and
>> > "something_went_wrong" pieces look like, or they even make sense with
>> > how this sort of thing shold be properly handled in plpgsql.  Also, in
>>
>> The below(read to bottom of the page) might help:
>>
>>
>> https://www.postgresql.org/docs/10/static/plpgsql-control-structures.html#PLPGSQL-ERROR-TRAPPING
>>
>>
>> > my trials, it appears that plpgsql doesn't like "start transaction".
>> So
>> > how is that piece done ?
>> >
>> > Thanks in Advance for any help !
>> >
>>
>>
>> --
>> Adrian Klaver
>> adrian.kla...@aklaver.com
>>
>


Re: Postgres 11, partitioning with a custom hash function

2018-10-03 Thread David Rowley
On 4 October 2018 at 16:22, Harry B  wrote:
> I am still having trouble reconciling what happens under the HASH
> partitioning!. If I have text column forming the basis of PARTITIONED BY
> HASH, the HASH value used in the partitioning setup does not seem to match
> to `hashtext()` of that value

It won't match. The hash partition hash is seeded with a special constant
(HASH_PARTITION_SEED); see [1].

You could likely roll your own hash ops. See [2] for an example. This
can then be used to create a hash partitioned table like [3].

[1] 
https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/backend/partitioning/partbounds.c#l2056
[2] 
https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/test/regress/sql/insert.sql#l241
[3] 
https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/test/regress/sql/hash_part.sql#l10
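
A rough sketch of what that could look like for a text column, modelled on
the regression test in [2] (the names here are made up; hashtextextended()
is the built-in 64-bit text hash in v11):

create function my_text_hash(v text, seed int8) returns int8
  as $$ select hashtextextended(v, seed) $$ language sql immutable;

create operator class my_text_hash_ops for type text using hash as
  operator 1 =, function 2 my_text_hash(text, int8);

create table t (id text) partition by hash (id my_text_hash_ops);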

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: Postgres 11, partitioning with a custom hash function

2018-10-04 Thread David Rowley
On 5 October 2018 at 06:18, Harry B  wrote:
>
> Thank you David! These helped me create an operator class.
> However, there still seems to be a 'off-by-a-fixed-N' difference between the
> hash value returned and how PG selects the partition.

hmm, actually, this is probably due to the hash_combine64() call in
compute_partition_hash_value(). This seems to combine the hash value
with 0 regardless of whether there's another partition key column to hash.
If you could somehow do the reverse of what hash_combine64() will do
to your hash before returning it to the function then you might get
somewhere, but that does not look possible since it appears to be
throwing away some bits.

It may not have been a great choice to decide to have
compute_partition_hash_value() do this unnecessary combine, but it's
likely a few months too late to change that now.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: Postgres 11, partitioning with a custom hash function

2018-10-04 Thread David Rowley
On 5 October 2018 at 09:43, Harry B  wrote:
> Now the big question: How scared should I be relying on this? I don't mind
> it breaking on major version upgrades (which would mean I need to dump &
> restore my entire set), but how likely is it to change unannounced in a
> minor/security release? Unless of course, you break it in a way that makes
> custom-hash function impossible.

I don't see how we could possibly change it once v11 is out the door.
Such a change would break pg_upgrade and I imagine we want that to
work for a long time to come yet, at least until there is some other
reason that is worthy of breaking it. The bar is likely set pretty
high for that.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: Advice on logging strategy

2018-10-12 Thread David Steele

On 10/11/18 11:26 AM, Mike Martin wrote:


This results in enormous log files which take ages to import using copy 
because each execute statement is logged with the parameters chosen


Is there any way around this?

I cant find any way to filter dml statements


pgAudit (https://github.com/pgaudit/pgaudit) gives you fine-grain 
control over what is logged by command type, table, or user as well as a 
lot more detail.
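
For example (a sketch only; check the pgAudit documentation for the exact
settings), logging DDL while leaving out the DML that is flooding the logs
could look something like:

# postgresql.conf -- shared_preload_libraries requires a restart
shared_preload_libraries = 'pgaudit'
pgaudit.log = 'ddl'

followed by CREATE EXTENSION pgaudit; in the database.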


--
-David
da...@pgmasters.net



Re: Setting up continuous archiving

2018-10-15 Thread David Steele

On 10/15/18 5:09 PM, Benoit Lobréau wrote:


By the way, do/can they both use streaming to receive WAL records? Or
streaming is only for standby servers. For backups you have only
file-based log shipping?

barman supports streaming but it's not as magical as one might think.
See pgbarman's documentation for how to manage .partial files.

pgbackrest archives only wal files when postgres uses the 
archive_command. You also have the option to do paralllel async wal 
push/get.
It can be useful if you write wals quicker than you can archive them or 
if you want to restore more quickly.


We have focused on archive_command because the performance is much 
better because it can be parallelized.



Then, I suppose they both don't support partial PITR
(http://docs.pgbarman.org/release/2.4/#scope), where there are
standalone backups that extends to points in time for which there are
no WAL files. I'm not sure if this matters, but I assume that it might
be effective in terms of disk space.

It's a hot backup so you have to have WAL files so that your backup is 
consistent at the end of the backup.
You can build something like what you describe with pgbackrest, 
archive-copy and a copy of the backup directory to another place.


The --repo1-retention-archive-type and --repo1-retention-archive options 
allow you do keep WAL for a smaller number of backups in order to save 
space.


https://pgbackrest.org/configuration.html#section-repository/option-repo-retention-archive

So, for example:

[global]
repo1-retention-full=4
# The following option is the default but included for clarity
repo1-retention-archive-type=full
repo1-retention-archive=2

This configuration will retain 4 full backups but only keep PITR WAL for 
2 of them, i.e. the WAL generated between backups.  The WAL required to 
make a backup consistent is always retained so the 2 older backups can 
be played to consistency but no further.



I'd like to be able to handle db failure as fast as possible. Ideally,
automatically. Which probably means either repmgr, or corosync +
pacemaker + PAF. Is that what you mean by HA here? 



yes. You can also look into Patroni as an alternative. It's a matter of 
preference and requirements.


pgBackRest certainly works with Pacemaker/Corosync and Patroni.

If your standby is lagging too far behind (and you dont use slots) you 
can use pgbackrest's archive to fill the gap in wal files and catchup 
with the master.


Recovering WAL out of the archive is safer than using slots.  Since the 
WAL is transferred over the network compressed it can also save a lot of 
bandwidth.


Regards,
--
-David
da...@pgmasters.net



Re: Help with list partitioning on expression

2018-10-21 Thread David Rowley
On 19 October 2018 at 02:49, Dinko Papak  wrote:
> I have created table partitioned by list on expression using timestamp
> column, so for each timestamp value function returns int and each partition
> table is for single value (range of timestamp for which function result is
> the same). This helps me to simplify querying as I do not need to always
> write date ranges. Querying (partition pruning) works nice, however when I
> attach new partition it seems to always scan whole table, although I do have
> necessary check constraint on partitioned table. I have tried to make
> timestamp column both null and not null. Also, it takes longer to attach
> partition then to add constraint itself although per my understanding those
> 2 operations should do the same scan.

It's not all that obvious, but if you have PARTITION BY LIST
(extract(minute FROM ts)) and try to attach a partition like:

CREATE TABLE mypartition PARTITION OF parted FOR VALUES IN (1);

then the partition constraint is actually (extract(minute FROM ts) IS
NOT NULL AND extract(minute FROM ts) = 1). If your CHECK constraint is
just checking `extract(minute FROM ts) = 1` then the current code in
PartConstraintImpliedByRelConstraint() is not smart enough to know
that `extract(minute FROM ts) = 1` is strict and cannot match nulls.
Perhaps that could be improved, but that's how it is today.

Likely you'll have better luck with a check constraint that explicitly
checks the function IS NOT NULL.
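
For example, a constraint along these lines (adjusted to the real expression
and partition value) should allow ATTACH PARTITION to skip the scan:

alter table mypartition
  add constraint mypartition_minute_check
  check (extract(minute from ts) is not null and extract(minute from ts) = 1);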

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: Aggregated join vs. aggregate in column?

2019-07-12 Thread David Rowley
On Fri, 12 Jul 2019 at 19:32, Durumdara  wrote:
> 2.) I may relocate this section as join...
>
> select
> request.*, s.max_s_date
> from request
> left join
> (
> select schedule.product_id, max(s_date) as max_s_date from schedule
> where schedule.ok = True
> group by  schedule.product_id
>  ) s on (s.product_id = request.product_id)
> ...
>
> But I really don't know what the hell will happen in query optimizer with 
> this method.
>
> a.)
> Optimizer is clever, and it calculates the aggregates only in the needed rows.
> So it find the request.product_id-s, and execute the "s" query only in these 
> rows.
>
> b.)
> Or it isn't enough wise, it executes the "s" subquery on whole schedule, and 
> later joins to main table.
> The schedule table is big in customers' database, so this is worst case. :-(

I'm afraid for the particular query above, the answer is closer to b)
However, that's only going to be a problem if there are many more
distinct product_id records in "schedule". If you were to add a WHERE
clause to the outer query that did WHERE request.product_id = X, then
that qual would be pushed down into the subquery.  This qual pushing
only works for equality. So if you changed WHERE
request.product_id = X to WHERE request.product_id IN (X, Y), then the
qual would not be pushed down to the subquery.
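
For illustration, a variant where the qual can be pushed down (a sketch; 42
just stands in for an actual product_id value):

select
request.*, s.max_s_date
from request
left join
(
select schedule.product_id, max(s_date) as max_s_date from schedule
where schedule.ok = True
group by  schedule.product_id
 ) s on (s.product_id = request.product_id)
where request.product_id = 42;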

> I asked this because sometimes I need to get more result columns in the 
> select, but I couldn't retreive more...
>
> (
> select max(s_date) as s_date from schedule
> where schedule.product_id = request.product_id and schedule.ok = True
>  ) as max_s_date,   <=== only one column
>
> So sometimes the join is better.
>
> But if the optimizer isn't enough wise, I can get these values only "WITH" 
> queries (select the main rows to temp, run subselects with only these 
> records, return the mix of main and subselects in one query).

If you don't want to repeat the same subquery in the SELECT list then
you could perform a CROSS JOIN LATERAL. For example:

select
request.*, s.max_s_date, s.max_s1_date
from request
cross join lateral (
select max(s_date) as max_s_date, max(s1_date) as max_s1_date
from schedule
where schedule.ok = True
  and schedule.product_id = request.product_id) s;

In this case, the subquery will be executed once per output row, so if
you have some restrictive WHERE clause on the outer query then the
subquery will be executed fewer times.

With a bit of training, you should be able to see what the query
planner has done for yourself by using the EXPLAIN command:
https://www.postgresql.org/docs/current/sql-explain.html
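
For instance, just to show the usage (a sketch reusing the lateral query above):

explain (analyze)
select
request.*, s.max_s_date, s.max_s1_date
from request
cross join lateral (
select max(s_date) as max_s_date, max(s1_date) as max_s1_date
from schedule
where schedule.ok = True
  and schedule.product_id = request.product_id) s;

The ANALYZE option actually runs the query and shows, for each plan node, how
many times it was executed (the "loops" figure), so you can see how often the
lateral subquery was evaluated.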

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: PGbackrest capacity related question

2019-07-22 Thread David Steele

On 7/22/19 5:08 PM, chiru r wrote:

Hi  All,

I am using pgbackrest on my production workloads and it is working as 
expected.

However, our DB keeps growing.
We currently have two disks, each 4 TB in size. About 3.8 TB of data has 
already been loaded into the DB, and our /pgback mount point is 4 TB.
So if the DB grows beyond 4 TB, how do we add another file system to 
store backups under "repo-path"?


pgBackRest does not have multi-volume support because there are a number 
of ways to accomplish this using ZFS, Linux volumes, etc.  You'll need 
to expand the storage for the repo using one of these methods or reduce 
your retention settings.


Regards,
--
-David
da...@pgmasters.net




Re: lead() with arrays - strange behaviour

2019-08-08 Thread David Rowley
On Thu, 8 Aug 2019 at 21:06, Thomas Kellerer  wrote:
> The following statement tries to find the overlapping values in id_list 
> between the current row and the next row:
>
> select id,
>id_list,
>lead(id_list) over (order by id) as next_list,
>array(select unnest(id_list) intersect select unnest(lead(id_list) 
> over (order by id))) as common_ids
> from sample_data;
>
> The above returns:
>
> id | id_list | next_list | common_ids
> ---+-+---+---
>  1 | {1,2,3} | {2,3,4}   | {}
>  2 | {2,3,4} | {4,5,6}   | {}
>  3 | {4,5,6} |   | {}
>
> The empty array for "common_ids" is obviously incorrect.

I think you're confused with what the SELECT with the empty FROM
clause does here.  In your subquery "id_list" is just a parameter from
the outer query. LEAD(id_list) OVER (ORDER BY id) is never going to
return anything since those are both just effectively scalar values,
to which there is no "next" value.
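
One way around that (a sketch, untested against your data) is to compute the
lead() in a derived table first, then do the intersect in the outer query:

select id,
       id_list,
       next_list,
       array(select unnest(id_list) intersect select unnest(next_list)) as common_ids
from (select id,
             id_list,
             lead(id_list) over (order by id) as next_list
      from sample_data) t;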

--
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: FW: Undelivered Mail Returned to Sender

2019-08-11 Thread David Rowley
On Sun, 11 Aug 2019 at 06:53, stan  wrote:
>
> I apologize for asking, what I suspect will turn out to be a newbie
> question, but I have managed to get myself quite confused on this.
>
> I am defining a view as follows
>
>
> CREATE OR REPLACE view purchase_view as
> select
> project.proj_no ,
> qty ,
> mfg_part.mfg_part_no ,
> mfg.name as m_name ,
> mfg_part.descrip as description ,
> (
> SELECT
> name
> FROM
> vendor
> WHERE
> bom_item.vendor_key =
> (
> SELECT
> vendor_key
> FROM
> mfg_vendor_relationship
> WHERE
> bom_item.mfg_key = mfg_key
> AND
> prefered = TRUE
> AND
> bom_item.project_key = project_key
>
> )
> )
> as v_name ,

> ERROR:  more than one row returned by a subquery used as an expression
>
> Can someone please enlighten me as to the error of my ways?

Looks to me like your WHERE clause is wrong in the subquery. "WHERE
bom_item.vendor_key ="  surely that should be just "WHERE vendor_key =
" (assuming that's the primary key column of the vendor table).

Also, you've mentioned you've only a single record in the
mfg_vendor_relationship, so the error can't be due to multiple records
matching in the mfg_vendor_relationship table. However, given the
unique constraint on that table includes 3 columns and you're just
filtering on 2 of them, then it would only take some rows in there
with the same mfg_key and project_key values but a different
vendor_key to get the same error from that part of the query.  If that
shouldn't happen, then perhaps your UNIQUE constraint should not
contain the vendor_key column.  You'd have to explain what you need in
more detail for someone to be able to help you fix that.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Transaction state on connection Idle/Open/Failed

2019-08-16 Thread David Wall
In JDBC (latest version), I'm using the 
org.postgresql.jdbc.PgConnection.getTransactionState() that returns an 
enumeration of IDLE, OPEN or FAILED.


I am familiar with IDLE, meaning the connection has a new transaction 
started, but isn't doing anything.  We think of this as the "normal" 
state for our connections.


How does OPEN compare to IDLE?  If we stop using a connection that's in 
the OPEN state, is something wrong?  That is, does it have pending 
updates that require a commit/rollback?


Same for FAILED.  If we no longer need a connection in the FAILED state, 
is something wrong?  Does it have pending updates that require a 
commit/rollback (assuming in this state rollback makes more sense).


I'm really trying to find out what it means for the connection to be in 
OPEN or FAILED states compared to IDLE.


Thanks,
David





Rename a column if not already renamed.?

2019-08-20 Thread Day, David
I have a function used by my patch files which attempts to determine if the old 
column name exists; if so, it proceeds with the rename.


CREATE OR REPLACE FUNCTION sys.rename_column(schema_table_ regclass, old_name_ 
text, new_name text)
 RETURNS boolean
 LANGUAGE plpgsql
AS $function$
BEGIN

IF EXISTS(SELECT 1 FROM pg_attribute
  WHERE attrelid = schema_table_
  AND attname = old_name_
  AND NOT attisdropped) THEN
EXECUTE format ('ALTER TABLE %s RENAME %s TO %s', schema_table_, old_name_, 
new_name);
RETURN TRUE;
ELSE
   RETURN FALSE;
END IF;

END;
$function$;


This seems to function correctly except:

If I am upgrading a site from Postgres 9.6 to version 11.3 and the column has 
already been renamed in the 9.6 branch, the above existence test fails
and an exception is generated.

It all seems to work correctly if I repeat this same patch in the 11.3 branch.

The function definition is the same for both branches.

I suspect I am overlooking some fundamental issue here.
Anyone with a thought.


Thanks


Dave Day











RE: Rename a column if not already renamed.?

2019-08-20 Thread Day, David
Thanks for the feedback.

The error is something like "column already exists". A quoting issue is 
unlikely, and as I mentioned it works as expected if I'm repeating the 
patch in 11.3, and I assume it would if I were in 9.6.

I suspect it has something to do with the content, or lack of it, in the 
pg_attribute table following an upgrade.

-Original Message-
From: Luca Ferrari [mailto:fluca1...@gmail.com] 
Sent: Tuesday, August 20, 2019 3:41 PM
To: Day, David 
Cc: pgsql-gene...@postgresql.org
Subject: Re: Rename a column if not already renamed.?

On Tue, Aug 20, 2019 at 9:07 PM Day, David  wrote:
> EXECUTE format ('ALTER TABLE %s RENAME %s TO %s', schema_table_, 
> old_name_, new_name);

Hard to say without the error, but any chance there is a quoting problem?
EXECUTE format ('ALTER TABLE %s RENAME %I TO %I', schema_table_, old_name_, 
new_name);

Luca


RE: Rename a column if not already renamed.?

2019-08-21 Thread Day, David
I agree the function could be improved to deal with both the old and new names 
existing simultaneously.
That is almost certainly the root cause, and one that I would confirm if the 
tester and site were currently available to me.

Our work flow  for this scenario is something like:

1.  9.6 pg_dump takes a snapshot of our  9.6  database.
2.  Postgres is upgraded/freshly installed to  11.3..
3.  The 9.6 database is restored using the version 11 pg_restore tool.

4. Once our application process starts up, it sees there is a patch available 
in its old branch that is one greater than its restored 9.6 content.
That happens to be a merge patch which resets the expectations.
It attempts to apply all patches in the new branch since the point of 
divergence and runs into my current issue. 
 
It occurs to me I could simply put an exception handler in the rename column 
function and I would likely proceed merrily along.
But curiosity is killing me and the cat. What is causing the old name to 
persist in the pg_attribute table after the rename?

Would a stale function referencing the old column name be a contributor?


Regards


Dave Day




-Original Message-
From: Tom Lane [mailto:t...@sss.pgh.pa.us] 
Sent: Tuesday, August 20, 2019 4:57 PM
To: Day, David 
Cc: Luca Ferrari ; pgsql-gene...@postgresql.org
Subject: Re: Rename a column if not already renamed.?

"Day, David"  writes:
> The error is something like column already exists and

Not sure about the workflow this function is used within, but maybe you need to 
consider what to do when both the old and new column names exist.
Because that sure sounds like what is happening.

regards, tom lane






RE: Rename a column if not already renamed.?

2019-08-21 Thread Day, David
Restoring into 11.3 instance the 9.6 dump  ? ->  yes.

For the upgrade scenario, I confirmed that both the old column name and the new 
name are in the pg_attribute table at the time the patch attempts the rename.
Why both is a big question.
However,
it is easy enough to re-write the column rename function to deal with the 
simultaneous possibility.  I will include the redefined function in the merge 
patch and see how it goes.
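
For example, one possible shape (a sketch only, not the actual patch):

CREATE OR REPLACE FUNCTION sys.rename_column(schema_table_ regclass, old_name_ 
text, new_name text)
 RETURNS boolean
 LANGUAGE plpgsql
AS $function$
BEGIN

-- Only rename when the old column still exists and the new one does not.
IF EXISTS(SELECT 1 FROM pg_attribute
          WHERE attrelid = schema_table_
          AND attname = old_name_
          AND NOT attisdropped)
   AND NOT EXISTS(SELECT 1 FROM pg_attribute
          WHERE attrelid = schema_table_
          AND attname = new_name
          AND NOT attisdropped) THEN
    EXECUTE format('ALTER TABLE %s RENAME %I TO %I', schema_table_, old_name_, new_name);
    RETURN TRUE;
ELSE
    RETURN FALSE;
END IF;

END;
$function$;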

I'll update the thread after some further exploration.

Thanks all for your assistance.


-Original Message-
From: Adrian Klaver [mailto:adrian.kla...@aklaver.com] 
Sent: Wednesday, August 21, 2019 11:47 AM
To: Day, David ; Tom Lane 
Cc: Luca Ferrari ; pgsql-gene...@postgresql.org
Subject: Re: Rename a column if not already renamed.?

On 8/21/19 7:52 AM, Day, David wrote:
> I agree the function could be improved to deal with both old and new name 
> existing simultaneously.
> That is almost certainly the root  cause, and one that I would confirm if the 
> tester and site were currently available to me.
> 
> Our work flow  for this scenario is something like:
> 
> 1.  9.6 pg_dump takes a snapshot of our  9.6  database.
> 2.  Postgres is upgraded/freshly installed to  11.3..
> 3.  The 9.6 database is restored using the version 11 pg_restore tool.

In 3) you are restoring to the new 11.3 instance, correct?

> 
> 4. Once our application process starts up, it sees there is a patch available 
> in it's old branch that is one greater then it's restored  9.6 content.
> That happens to be a merge patch which resets the expectations.
> It attempts to apply all patches in the new branch since the point of 
> divergence and runs into my current issue.
>   
> It occurs to me I could simply put an exception handler in the rename column 
> function and I would likely proceed merrily along.
> But curiosity is killing me and the cat. What is causing the old name to 
> persist in the pg_attribute table after the rename. ?

If you are indeed working on the new instance pg_attribute would have no 
'memory' of the dropped column.  It would seem to me to come down to what is 
passed into sys.rename_column() as old_name_, new_name.

> 
> Would a stale function referencing the old column name be a contributor?
> 
> 
> Regards
> 
> 
> Dave Day
> 
> 
> 
> 
> -Original Message-
> From: Tom Lane [mailto:t...@sss.pgh.pa.us]
> Sent: Tuesday, August 20, 2019 4:57 PM
> To: Day, David 
> Cc: Luca Ferrari ; pgsql-gene...@postgresql.org
> Subject: Re: Rename a column if not already renamed.?
> 
> "Day, David"  writes:
>> The error is something like column already exists and
> 
> Not sure about the workflow this function is used within, but maybe you need 
> to consider what to do when both the old and new column names exist.
> Because that sure sounds like what is happening.
> 
>   regards, tom lane
> 
> 
> 
> 
> 


--
Adrian Klaver
adrian.kla...@aklaver.com


For SELECT statement (just a reading one, no 'FOR UPDATE'), is COMMIT or ROLLBACK preferred?

2019-08-25 Thread David Wall
Using the latest PostgreSQL, does it matter if my code does a ROLLBACK 
or a COMMIT on a non-modifying SELECT statement?  My impression is 
they'd be the same, as nothing is changed and therefore there's nothing 
to commit or roll back, but I wondered if there was any difference in how 
they are processed by Postgres?


Thanks,
David





Re: For SELECT statement (just a reading one, no 'FOR UPDATE'), is COMMIT or ROLLBACK preferred?

2019-08-25 Thread David Wall

On 8/25/19 12:40 PM, Rob Sargent wrote

On Aug 25, 2019, at 1:09 PM, David Wall  wrote:

Using the latest PostgreSQL, does it matter if my code does a ROLLBACK or a 
COMMIT on an non-modifying SELECT statement?  My impression is they'd be the 
same as nothing is changed and therefore there's nothing to commit or rollback, 
but wondered if there was any difference in how they are processed by Postgres?

Thanks,
David




In interactive psql, both issue a warning that there is no current transaction. 
 What is your auto-commit setting and how is your code sent to the server?



We are accessing it via JDBC, and so we send SQL via PreparedStatements 
against a Connection, and the connection is not auto-commit.  By 
default, the connection has a BEGIN TRANSACTION in place, so after all 
requests we do, we need to commit/rollback.  The main issue is that if 
we do a SELECT and get a ResultSet that has no rows, if we do a commit 
or a rollback, it seems reasonable that these are identical as no 
changes were made.  My inclination is to do a Connection.commit() on the 
connection because it wasn't in error or anything even if no rows were 
found, but wondered if a Connection.rollback() has any difference 
(positive/negative) in such a scenario.  We have SELECT sql statements 
that sometimes do a rollback after such queries because, even though 
finding no rows is fine for SQL, it may be an issue in the application 
that expects there to be at least one row.  So we're trying to determine 
if there's actually any difference between commit/rollback after SELECT 
statements (with rows returned or not), a bit like if there's any 
difference for an UPDATE statement that returns zero rows were updated.








RE: Rename a column if not already renamed.?

2019-08-29 Thread Day, David
Hi,

Finally resolved this.  Bottom line: some bad analysis on my part.
The scenario was that changes were ported from trunk back to a branch, and then 
that branch was rolled back into trunk.

Altering the rename_column function to check that the old and new names did not 
exist simultaneously was necessary for the merge process to complete.

I ended up with an additional patch in trunk, relevant only to an upgraded 
system, to DROP IF EXISTS the old column name that had been re-added by a trunk 
patch when the branch rolled forward.
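
i.e. something along the lines of (illustrative names only):

ALTER TABLE some_schema.some_table DROP COLUMN IF EXISTS old_column;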

Obviously nothing to do with 9.6 -> 11.3 postgres upgrade.


Again thanks to all for assistance


Dave



-Original Message-
From: Day, David 
Sent: Wednesday, August 21, 2019 2:58 PM
To: 'Adrian Klaver' ; Tom Lane 
Cc: Luca Ferrari ; pgsql-gene...@postgresql.org
Subject: RE: Rename a column if not already renamed.?

Restoring into 11.3 instance the 9.6 dump  ? ->  yes.

For the upgrade scenario, I  confirmed that both old column name and new name 
are in the pg_attribute table at the time the patch attempts to rename it.
Why both  is a big question.
However,
It is easy enough to re-write the column rename function to deal with the 
simultaneous possibility.  I will include the redefined function in the merge 
patch and see how it goes.

I'll update the thread after some further exploration.

Thanks all for your assistance.


-Original Message-
From: Adrian Klaver [mailto:adrian.kla...@aklaver.com] 
Sent: Wednesday, August 21, 2019 11:47 AM
To: Day, David ; Tom Lane 
Cc: Luca Ferrari ; pgsql-gene...@postgresql.org
Subject: Re: Rename a column if not already renamed.?

On 8/21/19 7:52 AM, Day, David wrote:
> I agree the function could be improved to deal with both old and new name 
> existing simultaneously.
> That is almost certainly the root  cause, and one that I would confirm if the 
> tester and site were currently available to me.
> 
> Our work flow  for this scenario is something like:
> 
> 1.  9.6 pg_dump takes a snapshot of our  9.6  database.
> 2.  Postgres is upgraded/freshly installed to  11.3..
> 3.  The 9.6 database is restored using the version 11 pg_restore tool.

In 3) you are restoring to the new 11.3 instance, correct?

> 
> 4. Once our application process starts up, it sees there is a patch available 
> in it's old branch that is one greater then it's restored  9.6 content.
> That happens to be a merge patch which resets the expectations.
> It attempts to apply all patches in the new branch since the point of 
> divergence and runs into my current issue.
>   
> It occurs to me I could simply put an exception handler in the rename column 
> function and I would likely proceed merrily along.
> But curiosity is killing me and the cat. What is causing the old name to 
> persist in the pg_attribute table after the rename. ?

If you are indeed working on the new instance pg_attribute would have no 
'memory' of the dropped column.  It would seem to me to come down to what is 
passed into sys.rename_column() as old_name_, new_name.

> 
> Would a stale function referencing the old column name be a contributor?
> 
> 
> Regards
> 
> 
> Dave Day
> 
> 
> 
> 
> -Original Message-
> From: Tom Lane [mailto:t...@sss.pgh.pa.us]
> Sent: Tuesday, August 20, 2019 4:57 PM
> To: Day, David 
> Cc: Luca Ferrari ; pgsql-gene...@postgresql.org
> Subject: Re: Rename a column if not already renamed.?
> 
> "Day, David"  writes:
>> The error is something like column already exists and
> 
> Not sure about the workflow this function is used within, but maybe you need 
> to consider what to do when both the old and new column names exist.
> Because that sure sounds like what is happening.
> 
>   regards, tom lane
> 
> 
> 
> 
> 


--
Adrian Klaver
adrian.kla...@aklaver.com


How to reformat output of "age()" function

2019-09-11 Thread David Gauthier
How can I change the default output of the "age" function to be, for
example, in minutes?

E.g.
dvdb=> select age('2019-09-11 09:00:00','2019-09-09 11:00:00');
       age
-----------------
 1 day 22:00:00
(1 row)

I want the equivalent of that time delta in minutes.

Thanks in Advance


Re: How to reformat output of "age()" function

2019-09-11 Thread David Gauthier
Thanks a lot!

On Wed, Sep 11, 2019 at 12:34 PM Francisco Olarte 
wrote:

> David:
>
> On Wed, Sep 11, 2019 at 5:39 PM David Gauthier 
> wrote:
> > How can I change the default output of the "age" function to be, for
> example, in minutes?
> > E.g.
> > dvdb=> select age('2019-09-11 09:00:00','2019-09-09 11:00:00');
> >   age
> > -----------------
> >  1 day 22:00:00
> > (1 row)
> > I want the equivalent of that time delta in minutes.
>
> Some answers have already been posted, but also consider what you are
> doing. Intervals have three components, months, days, seconds for a
> reason, ( ignoring leap seconds for now ) not all days have 24h (
> daylight saving time changes ) and not all months have 28/29/30/31
> days. IIRC interval normalization for epoch assumes all months have 30
> days, all days have 24 hours.
>
> If you want to know the elapsed minutes between two timestamps, it
> might be better to do it directly, extract the epoch from both (
> seconds ), subtract, and divide by 60, truncating/rounding if you need to.
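>
> For example, a minimal sketch of that approach, using the timestamps from
> the first message (wrap it in floor()/round() as needed):
>
> select floor(
>   ( extract(epoch from '2019-09-11 09:00:00'::timestamp)
>   - extract(epoch from '2019-09-09 11:00:00'::timestamp) ) / 60
> ) as minutes;   -- 2760, i.e. '1 day 22:00:00' expressed in minutes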
>
> This is what happens in one case on my timezone ( Europe/Madrid ):
>
>
> test=# select '2019.11.20 20:00:00'::timestamptz as end, '2019.06.10
> 10:00:00'::timestamptz as start;
>           end           |         start
> ------------------------+------------------------
>  2019-11-20 20:00:00+01 | 2019-06-10 10:00:00+02
> (1 row)
>
> test=# select age('2019.11.20 20:00:00'::timestamptz, '2019.06.10
> 10:00:00'::timestamptz) as interval;
>         interval
> -------------------------
>  5 mons 10 days 10:00:00
> (1 row)
>
> test=# select extract(epoch from age('2019.11.20
> 20:00:00'::timestamptz, '2019.06.10 10:00:00'::timestamptz)) as
> interval_seconds;
>  interval_seconds
> ------------------
>          13860000
> (1 row)
>
> test=# select extract(epoch from '2019.11.20 20:00:00'::timestamptz) -
> extract(epoch from '2019.06.10 10:00:00'::timestamptz) as
> elapsed_seconds;
>  elapsed_seconds
> -----------------
>         14122800
> (1 row)
>
> Francisco Olarte.
>


Web GUI for PG table ?

2019-09-12 Thread David Gauthier
Hi:

We're considering replacing a windows AccessDB based system with PG.
Access was chosen because of its GUI to its tables (looks and behaves like
a SS). But performance can be volatile given the fact that the AccessDB
front-ends and back-end are at different sites 1000+ miles apart.  The
belief is that a web-based approach where the DB server and web server are
at the same site would be better.

Question is, would such a "spreadsheet-like" GUI have to be built from
scratch or is there something that can be leveraged?  Things to consider
are record locking, constraint pull-downs, triggers updating values, etc...
.  We already have a nice PG server doing other work for us and would like
to use that if possible.


Re: Web GUI for PG table ?

2019-09-12 Thread David Gauthier
Many good visualization options but I need one that runs on the web AND
allows insert/update/delete records.


On Thu, Sep 12, 2019 at 10:42 AM Adrian Klaver 
wrote:

> On 9/12/19 7:08 AM, David Gauthier wrote:
> > Hi:
> >
> > We're considering replacing a windows AccessDB based system with PG.
> > Access was chosen because of it's GUI to its tables (looks and behaves
> > like a SS). But performance can be volatile given the fact that the
> > AccessDB front-ends and back-end are at different sites 1000+ miles
> > apart.  The belief is that a web-based approach where the DB server and
> > web server are at the same site would be better.
> > Question is, would such a "spreadsheet-like" GUI have to be built from
> > scratch or is there something that can be leveraged?  Things to consider
> > are record locking, constraint pull-downs, triggers updating values,
> > etc... .  We already have a nice PG server doing other work for us and
> > would like to use that if possible.
>
> I've been using Tabulator(http://tabulator.info/) a JavaScript library
> for displaying data in tabular form. It does not care about the backend
> just that it gets data.
>
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com
>


Re: pgbackrest restore to new location?

2019-09-17 Thread David Steele
On 9/17/19 7:23 AM, Luca Ferrari wrote:
> On Tue, Sep 17, 2019 at 12:00 PM Ron  wrote:
>> The real problem is that after doing that, "pg_ctl start -D
>> /path/to/new/data" fails with "PANIC: could not locate a valid checkpoint
>> record".

Sounds like backup_label is missing or has been deleted.

The easiest way to restore to a new location is just to copy
pgbackrest.conf from the primary (or create a new one) with the same
stanza and then alter pg1-path in pgbackrest.conf or at the command line.

Regards,

-- 
-David
da...@pgmasters.net




Re: pgbackrest restore to new location?

2019-09-18 Thread David Steele
On 9/17/19 10:03 PM, Stephen Frost wrote:
> 
> That said- it brings up a pretty serious issue that should be discussed,
> and that's nuking this:
> 
> HINT:  If you are not restoring from a backup, try removing the file 
> ".../backup_label".
> 
> That hint is absolutely wrong these days when many tools have been
> updated to use the non-exclusive backup method and it just ends up
> getting people into trouble and, worse, can result in them having
> corrupted clusters.
> 
> I'll get a patch into the next commitfest to remove it.  The exclusive
> method has been deprecated for quite a few releases and we should stop
> giving bad advice on the assumption that people are using it.

We updated the error message and hints in c900c152, but it was not
back-patched.

-- 
-David
da...@pgmasters.net





Re: pgbackrest - question about restoring cluster to a new cluster on same server

2019-09-18 Thread David Steele
On 9/18/19 6:59 PM, Ron wrote:
> 
> Scenario: there's data corruption on production server, so we need to do
> a PITR restore from "a few days ago" of the cluster holding the prod
> databases to a second cluster on that same VM in order to try and find
> the missing data and load it back into the prod cluster.
> 
> Other than putting a high I/O load on the LUN where repo-path is located
> (from both writing WALs to it and reading the backed up files), will
> there be any problems when "pg_ctl start" processes recovery.conf and
> applies the WAL files to the new cluster while the prod cluster is
> writing new WAL files.
> 
> Does my question make sense?

It does, but the answer lies outside of pgBackRest.  "Can the repo
storage handle the load of archive-push and archive-get at the same
time" is really a question of storage and network throughput.

pgBackRest compresses everything by default which goes a long way
towards increasing throughput, but ultimately we don't control the
bandwidth.

Having said that, if the storage and network throughput are sufficient,
restoring and recovering a standby using pgBackRest will not impact the
primary as a direct pg_basebackup will.

Regards,

-- 
-David
da...@pgmasters.net




Re: pgbackrest - question about restoring cluster to a new cluster on same server

2019-09-18 Thread David Steele
On 9/18/19 9:40 PM, Ron wrote:
> 
> I'm concerned with one pgbackrest process stepping over another one and
> the restore (or the "pg_ctl start" recovery phase) accidentally
> corrupting the production database by writing WAL files to the original
> cluster.

This is not an issue unless you seriously game the system.  When a
cluster is promoted it selects a new timeline and all WAL will be
archived to the repo on that new timeline.  It's possible to promote a
cluster without a timeline switch by tricking it but this is obviously a
bad idea.

So, if you promote the new cluster and forget to disable archive_command
there will be no conflict because the clusters will be generating WAL on
separate timelines.

In the case of a future failover a higher timeline will be selected so
there still won't be a conflict.

Unfortunately, that dead WAL from the rogue cluster will persist in the
repo until a PostgreSQL upgrade because expire doesn't know when it can 
be removed since it has no context.  We're not quite sure how to handle
this but it seems a relatively minor issue, at least as far as
consistency is concerned.

If you do have a split-brain situation where two primaries are archiving
on the same timeline then first-in wins.  WAL from the losing primary
will be rejected.

Regards,
-- 
-David
da...@pgmasters.net




  1   2   3   4   5   6   7   8   9   10   >