RPMs from postgresql.org break CentOS/RHEL RPMs

2020-03-19 Thread Peter Krefting

Hi!

I have a system where one piece of software runs a database using the 
PostgreSQL 9.2 packages included in the CentOS/RHEL 7 release. 
Alongside it, I am trying to install another piece of software that 
uses the PostgreSQL 12 packages found at download.postgresql.org 
(EL-7-x86_64). However, I am running into problems when both are 
installed at the same time: the postgresql.org packages cause the 
CentOS packages to fail at run-time:


psql: symbol lookup error: psql: undefined symbol: pqsignal
psql: symbol lookup error: psql: undefined symbol: pqsignal

# ldd /usr/bin/psql |grep pq
libpq.so.5 => /usr/pgsql-12/lib/libpq.so.5 (0x7f93fcbe2000)
# /usr/bin/psql --version
psql (PostgreSQL) 9.2.24

How do I resolve this issue, to allow both programs to run on the same 
host?


--
\\// Peter - http://www.softwolves.pp.se/




Re: PostgreSQL 10 not archiving some WAL files

2020-03-19 Thread Norberto Dellê
>On Wed, Mar 18, 2020 at 10:57:22AM -0300, Norberto Dellê wrote:
> >> This setup worked very well for a long time, but since we upgraded
> >> PostgreSQL to newer versions (10.x), sometimes it just skips archiving
> some
> >> wal files. This seems to happen mainly when the server is shut down.
> >> There's no messages about errors in the logs, and also no corresponding
> >> .done file in archive_status, it just goes to the next wal file.
> >>
> >> I would like to know if any of you had or has a problem like this, and
> if
> >> you had, how you solved it.
> >
> >Hmm.  I don't recall seeing any of that.  This would mean that we are
> >either missing the generation of some .ready file, or that some .done
> >file gets generated when they should not in archive_status/.  What
> >kind of server shutdown are you doing?  Immediate, so that recovery
> >happens at the follow-up startup?  Or is that a clean service
> >shutdown?
> >--
> >Michael
> >...
>
> Hi Michael
>
> Almost daily the server is shut down, because my client wants to save
> energy.
> In the logs, it appears that the service was not shut down properly.
> Here are the last lines of one log file:
>
> 2020-03-16 18:10:03.130 -04 [1964] LOG:  received fast shutdown request
> 2020-03-16 18:10:03.132 -04 [2948] ERROR:  canceling statement due to user
> request
> 2020-03-16 18:10:03.192 -04 [1964] LOG:  aborting any active transactions
> 2020-03-16 18:10:03.194 -04 [1964] LOG:  worker process: logical
> replication launcher (PID 2948) exited with exit code 1
> 2020-03-16 18:10:03.210 -04 [2188] FATAL:  terminating connection due to
> administrator command
> 2020-03-16 18:10:03.210 -04 [6688] FATAL:  terminating connection due to
> administrator command
> 2020-03-16 18:10:03.210 -04 [3348] FATAL:  terminating connection due to
> administrator command
> 2020-03-16 18:10:03.210 -04 [6356] FATAL:  terminating connection due to
> administrator command
> 2020-03-16 18:10:03.210 -04 [5736] FATAL:  terminating connection due to
> administrator command
> 2020-03-16 18:10:03.210 -04 [4028] FATAL:  terminating connection due to
> administrator command
> 2020-03-16 18:10:03.211 -04 [2552] FATAL:  terminating connection due to
> administrator command
> 2020-03-16 18:10:03.212 -04 [4280] FATAL:  terminating connection due to
> administrator command
> 2020-03-16 18:10:03.210 -04 [2336] FATAL:  terminating connection due to
> administrator command
> 2020-03-16 18:10:03.227 -04 [6564] FATAL:  terminating connection due to
> administrator command
> 2020-03-16 18:10:03.255 -04 [2884] LOG:  shutting down
>
> When the service is started the next day, this is what Postgres logs:
>
> 2020-03-17 07:47:26.721 -04 [2272] LOG:  database system was interrupted;
> last known up at 2020-03-16 17:56:20 -04
> 2020-03-17 07:47:34.974 -04 [2884] FATAL:  the database system is starting
> up
> 2020-03-17 07:47:39.378 -04 [2272] LOG:  database system was not properly
> shut down; automatic recovery in progress
> 2020-03-17 07:47:39.631 -04 [2272] LOG:  redo starts at 38/F30F5108
> 2020-03-17 07:47:40.038 -04 [3052] FATAL:  the database system is starting
> up
> 2020-03-17 07:47:40.126 -04 [2272] LOG:  redo done at 38/F4001738
> 2020-03-17 07:47:40.207 -04 [2272] LOG:  last completed transaction was at
> log time 2020-03-16 18:06:53.269798-04
> 2020-03-17 07:47:42.590 -04 [2080] LOG:  database system is ready to
> accept connections
>
> In this shutdown/startup process, this PostgreSQL instance skipped
> archiving a WAL file.
> Another thing I noticed today is that in another instance, a skipped WAL
> file was finally archived about a week later.
> Both of the instances I mentioned also have streaming replication
> configured. But I don't think that is related, because there are other
> instances that skip archiving WAL files and do not have streaming
> replication configured.
>
> Norberto
>
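
(For anyone chasing a similar gap, two quick things to look at -- assuming
PostgreSQL 10 or later, where the WAL directory is pg_wal -- are the archiver
statistics and the archive_status directory:

SELECT archived_count, last_archived_wal,
       failed_count, last_failed_wal, last_failed_time
FROM pg_stat_archiver;

ls -1 "$PGDATA/pg_wal/archive_status" | tail

A segment the archiver still intends to copy shows up there as a .ready file,
while a .done file means the server considers it archived.)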


Re: RPMs from postgresql.org break CentOS/RHEL RPMs

2020-03-19 Thread Michael Paquier
On Thu, Mar 19, 2020 at 01:41:55PM +0100, Peter Krefting wrote:
> I have a system installed where one software is running a database using the
> PostgreSQL 9.2 packages included in the CentOS/RHEL 7 release. On the side,
> I am trying to install another software that is using the PostgreSQL 12
> packages found at download.postgresql.org (EL-7-x86_64). However, I am
> running into problems when both are installed at the same time, the
> postgresql.org ones causes the CentOS packages to fail at run-time:
> 
> psql: symbol lookup error: psql: undefined symbol: pqsignal
> psql: symbol lookup error: psql: undefined symbol: pqsignal
> 
> # ldd /usr/bin/psql |grep pq
>   libpq.so.5 => /usr/pgsql-12/lib/libpq.so.5 (0x7f93fcbe2000)
> # /usr/bin/psql --version
> psql (PostgreSQL) 9.2.24
> 
> How do I resolve this issue, to allow both programs to run on the same host?

You need to update the package bundling Postgres 12's libpq to the
latest version available: pqsignal() was removed from libpq by
mistake, and it was put back with this commit after we got similar
reports of breakage causing linking issues like yours:
commit: 06a367c382d0a3595238eff2e777222dbc91911b
author: Tom Lane 
date: Thu, 10 Oct 2019 14:24:56 -0400
Put back pqsignal() as an exported libpq symbol.

Here is the thread of the discussion:
https://www.postgresql.org/message-id/e1g5vmt-0003k1...@gemulon.postgresql.org
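
In practice that means pulling the current PGDG build of libpq and checking
that the symbol is exported again; on EL-7 that would be roughly (the package
name postgresql12-libs is an assumption here):

yum update postgresql12-libs
nm -D /usr/pgsql-12/lib/libpq.so.5 | grep pqsignal

Once pqsignal shows up in the dynamic symbol table again, the distribution's
9.2 psql can resolve it even though it loads the newer libpq.so.5.
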
--
Michael




RE: PG12 autovac issues

2020-03-19 Thread Justin King
Hi Andres-

Thanks for the reply, answers below.

On Tue, Mar 17, 2020 at 8:19 PM Andres Freund  wrote:
>
> Hi,
>
> On 2020-03-17 17:18:57 -0500, Justin King wrote:
> > As you can see in this table, there are only ~80K rows, but billions
> > of updates.  What we have observed is that the frozenxid reaches the
> > 200M mark fairly quickly because of the amount of activity.
>
> And each of those updates is in a separate transaction? Is that
> required? I.e. any chance to perform multiple of those updates in one
> transaction?
>
> Have you considered just increasing the vacuum limit? It's mostly there
> because it can increase space usage a bit, but given today's systems it's
> not usually a problem unless you have hundreds of postgres instances
> on a single system.  It's pretty common to just increase that to 1.8
> billion or so.

We have considered increasing the limit, but as I mentioned, the
problem that we're seeing is that (effectively) an autovac starts on a
system database (postgres, template1) and never completes, or
deadlocks, or something.  This completely stops autovacs from running
until we manually intervene and run a VACUUM FREEZE -- at which point,
the autovacs resume.  If we increase the vacuum limit and this
situation happens, we're going to be in real trouble.

> From a single stats snapshot we can't actually understand the actual xid
> consumption - is it actually the xid usage that triggers the vacuums?

We have looked at this and the xid consumption averages around 1250
xid/sec -- this is when we see the "aggressive" autovac kick off in
the logs.  What I don't understand is why these xid's are being
consumed at this rate on the databases with no activity (postgres,
template1).

>
>
> What makes you think it is a problem that you have all these vacuums? If
> you actually update that much, and you have indexes, you're going want a
> lot of vacuums?
>

I actually don't think there's a problem with the vacuums (I was
mostly pointing out that they are very regular and not problematic).
The main problem I am having is that something is causing the
autovacuums to completely stop and require manual intervention to
resume -- and it seems to be when the "postgres" or "template1"
database hits the autovacuum_freeze_max_age.

>
> > What is interesting is that this happens with the 'postgres' and
> > 'template1' databases as well and there is absolutely no activity in
> > those databases.
>
> That's normal. They should be pretty darn quick in v12?

Yes, a manual VACUUM FREEZE of either database takes less than 1
second -- which is why it's perplexing that the autovac starts and
never seems to complete and prevents other autovacs from running.

>
> Greetings,
>
> Andres Freund




Re: Re: PG12 autovac issues

2020-03-19 Thread Justin King
On Wed, Mar 18, 2020 at 10:13 AM Adrian Klaver
 wrote:
>
> On 3/18/20 6:57 AM, Justin King wrote:
> Please reply to list also
> Ccing list
>
>
> >>> Here are the settings, these are the only ones that are not set to
> >>> default with the exception of a few tables that have been overridden
> >>> with a different value due to lots of updates and few rows:
> >>
> >> And those values are?
> >
> > Thanks for the response, hopefully this will help:
>
> The below is helpful, but what I was referring to above was the settings
> for the overridden tables.

Ah, apologies, I missed that.  They are:

alter table production.tita set
(autovacuum_analyze_scale_factor = 0, autovacuum_vacuum_scale_factor =
0, autovacuum_vacuum_threshold = 10, autovacuum_analyze_threshold
= 5);

>
> >
> > postgres=# select name,setting from pg_settings where name like '%vacuum%';
> > name = setting
> > autovacuum = on
> > autovacuum_analyze_scale_factor = 0.1
> > autovacuum_analyze_threshold = 2500
> > autovacuum_freeze_max_age = 2
> > autovacuum_max_workers = 8
> > autovacuum_multixact_freeze_max_age = 4
> > autovacuum_naptime = 15
> > autovacuum_vacuum_cost_delay = 20
> > autovacuum_vacuum_cost_limit = -1
> > autovacuum_vacuum_scale_factor = 0.2
> > autovacuum_vacuum_threshold = 500
> > autovacuum_work_mem = -1
> > log_autovacuum_min_duration = 0
> > vacuum_cleanup_index_scale_factor = 0.1
> > vacuum_cost_delay = 0
> > vacuum_cost_limit = 1000
> > vacuum_cost_page_dirty = 20
> > vacuum_cost_page_hit = 1
> > vacuum_cost_page_miss = 10
> > vacuum_defer_cleanup_age = 0
> > vacuum_freeze_min_age = 5000
> > vacuum_freeze_table_age = 15000
> > vacuum_multixact_freeze_min_age = 500
> > vacuum_multixact_freeze_table_age = 15000
> >
> >>
> >> More below.
> >>
> >>>
> >>> autovacuum = on
> >>> log_autovacuum_min_duration = 0
> >>> autovacuum_max_workers = 8
> >>> autovacuum_naptime = 15s
> >>> autovacuum_vacuum_threshold = 500
> >>> autovacuum_analyze_threshold = 2500
> >>> vacuum_cost_limit = 1000
> >>
> >> Are either of the below set > 0?:
> >>
> >> vacuum_cost_delay
> >>
> >> autovacuum_vacuum_cost_delay
> >>
> >>>
> >>> We want fairly aggressive autovacs to keep table bloat limited -- the
> >>> application latency suffers if it has to wade through dead tuples and
> >>> staying near realtime is important in our environment.
> >>>
> >>> ** Also, it should be noted that the autovacuum_analyze_threshold is
> >>> probably an incorrect value, we likely intended that to be 250 and
> >>> just have now realized it after poking more at the configuration.
> >>>
> 
> >>
> >>
> >>
> >> --
> >> Adrian Klaver
> >> adrian.kla...@aklaver.com
>
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com




Re: Fwd: PG12 autovac issues

2020-03-19 Thread Justin King
On Wed, Mar 18, 2020 at 1:40 PM Michael Lewis  wrote:
>
> Do you have default fillfactor set on this table? If not, I would wonder if 
> reducing it to 50% or even 20% would allow many more HOT updates that would 
> reduce bloat.

I don't believe we have a default fillfactor, but I'm still trying to
understand why autovacs would completely stop -- that seems like a
bug.  Especially since there was no change between PG10 and PG12 and
this problem never existed there.

> Also, is there any period of lower activity on your system that you could 
> schedule a vacuum freeze for daily or weekly? I believe having frozen pages 
> would also mean all the autovacuums would be able to skip more pages and 
> therefore be faster.
>
> >> autovacuum_vacuum_cost_delay = 20
>
> This was changed to 2ms in PG12. You should reduce that most likely.

Actually, we set that back from 2ms > 20ms to replicate what PG10 was
doing just in case it had an impact (spoiler: it did not).




Re: Fwd: PG12 autovac issues

2020-03-19 Thread Michael Lewis
On Thu, Mar 19, 2020 at 9:31 AM Justin King  wrote:

> On Wed, Mar 18, 2020 at 1:40 PM Michael Lewis  wrote:
> >
> > Do you have default fillfactor set on this table? If not, I would wonder
> if reducing it to 50% or even 20% would allow many more HOT updates that
> would reduce bloat.
>
> I don't believe we have a default fillfactor, but I'm still trying to
> understand why autovacs would completely stop -- that seems like a
> bug.  Especially since there was no change between PG10 and PG12 and
> this problem never existed there.
>

Is there any reason to not schedule vacuum freeze for each db daily? Just
curious.


Re: Fwd: PG12 autovac issues

2020-03-19 Thread Justin King
On Thu, Mar 19, 2020 at 11:02 AM Michael Lewis  wrote:
>
> On Thu, Mar 19, 2020 at 9:31 AM Justin King  wrote:
>>
>> On Wed, Mar 18, 2020 at 1:40 PM Michael Lewis  wrote:
>> >
>> > Do you have default fillfactor set on this table? If not, I would wonder 
>> > if reducing it to 50% or even 20% would allow many more HOT updates that 
>> > would reduce bloat.
>>
>> I don't believe we have a default fillfactor, but I'm still trying to
>> understand why autovacs would completely stop -- that seems like a
>> bug.  Especially since there was no change between PG10 and PG12 and
>> this problem never existed there.
>
>
> Is there any reason to not schedule vacuum freeze for each db daily? Just 
> curious.

No, not really -- it just feels like a bandaid for something that
should be (and was) working already.  That may be where I'm headed,
but I just thought I'd reach out to the community to see if anyone has
seen this, or could think of a reason why it might have started after
moving from PG10 to PG12, and whether it is a possible regression.




Could postgres12 support millions of sequences? (like 10 million)

2020-03-19 Thread pabloa98
Hello,

My schema requires a counter for each combination of 2 values. Something
like:

CREATE TABLE counter(
group INT NOT NULL,
element INT NOT NULL,
seq_number INT NOT NULL default 0,
CONSTRAINT PRIMARY KEY (group, element)
);

For each entry in counter, aka for each (group, element) pair, the model
requires a seq_number.

If I use a table "counter", I could still have counter collisions between 2
transactions. I need true sequence behavior. Is that possible using a
table like "counter", where the counter is incremented outside of the
transaction so that it performs like a sequence, without race conditions
between concurrent transactions?

The other option is to create sequences for each new pair of (group,
element) using triggers. There are millions of pairs. So this approach will
generate millions of sequences.

How would a PostgreSQL database behave with millions of sequences in a
schema? Would it degrade its performance? Is there any negative impact?

Regards

Pablo


Re: Could postgres12 support millions of sequences? (like 10 million)

2020-03-19 Thread Rob Sargent



> On Mar 19, 2020, at 3:36 PM, pabloa98  wrote:
> 
> Hello,
> 
> My schema requires a counter for each combination of 2 values. Something like:
> 
> CREATE TABLE counter(
> group INT NOT NULL,
> element INT NOT NULL,
> seq_number INT NOT NULL default 0,
> CONSTRAINT PRIMARY KEY (group, element)
> );
> 
> For each entry in counter, aka for each (group, element) pair, the model 
> requires a seq_number.
> 
> If I use a table "counter", I could still have counter collisions between 2 
> transactions. I need truly sequence behavior. Is that possible by using a 
> table like "counter" table, where the counter could be increased out of the 
> transaction so it performs as a sequence without having race conditions 
> between concurrent transactions?
> 
> The other option is to create sequences for each new pair of (group, element) 
> using triggers. There are millions of pairs. So this approach will generate 
> millions of sequences.
> 
> How a PostgreSQL database would behave having millions of sequences in a 
> schema? Would it degrade its performance? Is there any negative impact?
> 
> Regards
> 
> Pablo
> 

To clarify, are you hoping for consecutive numbers as each row is added to
the table, i.e. “serial”?

What is the intention of “seq_number”?
> 
> 





Re: Invalid byte sequence errors on DB restore

2020-03-19 Thread Samuel Smith

On 3/15/20 11:18 PM, Samuel Smith wrote:
My current DB backup routine is just to call pg_dump and pipe to gzip. 
We recently started to get a failure to restore (which is basically just 
using psql -f on the pg_dump file) with the following errors:


invalid byte sequence for encoding "UTF8": 0xa0
  and
invalid byte sequence for encoding "UTF8": 0xd7 0x20


This is on a pg 9.2.24 instance. Any tips to troubleshoot?

Regards,
Samuel Smith




Our issue actually turned out to be that a couple of our production 
databases had the encoding set to SQL_ASCII while all of our development 
servers had UTF-8. This meant that in some cases, when we restored a 
production database into development (for testing or bug hunting), the 
backup file failed to parse. A similar issue to this 
blog post: 
https://www.endpoint.com/blog/2017/07/21/postgres-migrating-sqlascii-to-utf-8


So our fix was to dump the affected production databases using the 
LATIN1 encoding for pg_dump followed by destroying and recreating the 
database and setting its new encoding to UTF-8. Then we could restore 
the data using the pg_dump file with LATIN1 encoding.
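
For reference, a rough sketch of that procedure (database name and file name
are placeholders; -T template0 is used so the new encoding is accepted, and
depending on the cluster locale you may also need explicit --locale options):

pg_dump --encoding=LATIN1 mydb | gzip > mydb_latin1.sql.gz
dropdb mydb
createdb -T template0 -E UTF8 mydb
gunzip -c mydb_latin1.sql.gz | psql mydb

The restore converts cleanly because the dump sets client_encoding to LATIN1
at the top, so the server transcodes the data to UTF-8 on the way in.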


Regards,
Samuel Smith




Re: Could postgres12 support millions of sequences? (like 10 million)

2020-03-19 Thread pabloa98
On Thu, Mar 19, 2020 at 2:50 PM Rob Sargent  wrote:

>
>
> > On Mar 19, 2020, at 3:36 PM, pabloa98  wrote:
> >
> > Hello,
> >
> > My schema requires a counter for each combination of 2 values. Something
> like:
> >
> > CREATE TABLE counter(
> > group INT NOT NULL,
> > element INT NOT NULL,
> > seq_number INT NOT NULL default 0,
> > CONSTRAINT PRIMARY KEY (group, element)
> > );
> >
> > For each entry in counter, aka for each (group, element) pair, the model
> requires a seq_number.
> >
> > If I use a table "counter", I could still have counter collisions
> between 2 transactions. I need truly sequence behavior. Is that possible by
> using a table like "counter" table, where the counter could be increased
> out of the transaction so it performs as a sequence without having race
> conditions between concurrent transactions?
> >
> > The other option is to create sequences for each new pair of (group,
> element) using triggers. There are millions of pairs. So this approach will
> generate millions of sequences.
> >
> > How a PostgreSQL database would behave having millions of sequences in a
> schema? Would it degrade its performance? Is there any negative impact?
> >
> > Regards
> >
> > Pablo
> >
>
> To clarify, are you hoping for consecutive numbers as the each row is
> added to the table, i.e. “serial”?
>
> What is the intension of “seq_number”?
> >
> >
>
>
The idea is to have something like a serial sequence, but for each pair of
(group, element).

So when we insert rows into another table, we could have something like:


group, element, event_id, ...
1, 1, 1
1, 1, 2
1, 1, 3
2, 1, 1
1, 1, 4
1, 3, 1
1, 1, 5
1, 3, 2
2, 1, 2
2, 1, 3

The 3rd column is the sequence number we get from the appropriate sequence
created by the trigger.

I want to implement a variation of https://stackoverflow.com/a/30204854 and
that will generate millions of sequences.


Re: Could postgres12 support millions of sequences? (like 10 million)

2020-03-19 Thread Rob Sargent


> On Mar 19, 2020, at 4:13 PM, pabloa98  wrote:
> 
> 
> 
> On Thu, Mar 19, 2020 at 2:50 PM Rob Sargent wrote:
> 
> 
> > On Mar 19, 2020, at 3:36 PM, pabloa98 wrote:
> > 
> > Hello,
> > 
> > My schema requires a counter for each combination of 2 values. Something 
> > like:
> > 
> > CREATE TABLE counter(
> > group INT NOT NULL,
> > element INT NOT NULL,
> > seq_number INT NOT NULL default 0,
> > CONSTRAINT PRIMARY KEY (group, element)
> > );
> > 
> > For each entry in counter, aka for each (group, element) pair, the model 
> > requires a seq_number.
> > 
> > If I use a table "counter", I could still have counter collisions between 2 
> > transactions. I need truly sequence behavior. Is that possible by using a 
> > table like "counter" table, where the counter could be increased out of the 
> > transaction so it performs as a sequence without having race conditions 
> > between concurrent transactions?
> > 
> > The other option is to create sequences for each new pair of (group, 
> > element) using triggers. There are millions of pairs. So this approach will 
> > generate millions of sequences.
> > 
> > How a PostgreSQL database would behave having millions of sequences in a 
> > schema? Would it degrade its performance? Is there any negative impact?
> > 
> > Regards
> > 
> > Pablo
> > 
> 
> To clarify, are you hoping for consecutive numbers as the each row is added 
> to the table, i.e. “serial”?
> 
> What is the intension of “seq_number”?
> > 
> > 
> 
> 
> the idea is to have like a serial sequence, but for each pair of (group, 
> element).
> 
> so that when we insert rows in another table, we could have something like:
> 
> 
> group, element, event_id, ...
> 1, 1, 1
> 1, 1, 2
> 1, 1, 3
> 2, 1, 1
> 1, 1, 4
> 1, 3, 1
> 1, 1, 5
> 1, 3, 2
> 2, 1, 2
> 2, 1, 3
> 
> The 3rd column is the sequence number we get from the appropriate sequence 
> created by the trigger.
>  
> I want to implement a variation of https://stackoverflow.com/a/30204854 
>  and that will generate millions of 
> sequences.
> 
> 
Then I don’t think group/element can be a PRIMARY KEY.




How does pg_basebackup manage to create a snapshot of the filesystem?

2020-03-19 Thread Dennis Jacobfeuerborn
Hi,
I'm currently trying to understand how backups work. In the
documentation in section "25.2. File System Level Backup" it says that
filesystem level backups can only be made when the database is offline,
yet pg_basebackup seems to do just that and works while the database is
online. Am I misunderstanding something here or does pg_basebackup use
some particular features of Postgres to accomplish this?

Regards,
  Dennis




Passwordcheck configuration

2020-03-19 Thread Dave Hughes
Hello,
I have a requirement to set some password complexity for our database such
as length of password, upper case, lower case, special characters,
expiration limit, reuse, etc.

I saw there was a module you can use for this called passwordcheck.  Seems
easy to install, but I don't see how you can configure it for your specific
needs?  The documentation (
https://www.postgresql.org/docs/11/passwordcheck.html) says it can be
configured using CrackLib, but again I don't see "how" you go about setting
this up.

Does anyone know of any documentation that shows you "how" to set up
specific requirements?

Thanks in advance,
Dave Hughes


Re: How does pg_basebackup manage to create a snapshot of the filesystem?

2020-03-19 Thread Christophe Pettus



> On Mar 19, 2020, at 15:19, Dennis Jacobfeuerborn  
> wrote:
> I'm currently trying to understand how backups work. In the
> documentation in section "25.2. File System Level Backup" it says that
> filesystem level backups can only be made when the database if offline
> yet pg_basebackup seems to do just that but works while the database is
> online. Am I misunderstanding something here or does pg_basebackup use
> some particular features of Postgres to accomplish this?

pg_basebackup does, indeed, take an inconsistent copy of the file system while 
it is running; what allows it to bring the database back up to consistency is 
the write-ahead log segments that are created while pg_basebackup is running.  
That's why it is important to have all of the WAL segments created during the 
run (which is what --wal-method=stream provides you).
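
As a concrete example, a self-contained base backup that bundles the WAL it
needs might be taken with something along these lines (the path is a
placeholder):

pg_basebackup -D /backups/base_20200319 -X stream -c fast -P

-X stream is the short form of --wal-method=stream and writes the WAL produced
during the copy into pg_wal inside the backup, -c fast requests an immediate
checkpoint rather than a spread one, and -P prints progress.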
--
-- Christophe Pettus
   x...@thebuild.com





Re: Could postgres12 support millions of sequences? (like 10 million)

2020-03-19 Thread pabloa98
On Thu, Mar 19, 2020 at 3:17 PM Rob Sargent  wrote:

>
>
> On Mar 19, 2020, at 4:13 PM, pabloa98  wrote:
>
>
>
> On Thu, Mar 19, 2020 at 2:50 PM Rob Sargent  wrote:
>
>>
>>
>> > On Mar 19, 2020, at 3:36 PM, pabloa98  wrote:
>> >
>> > Hello,
>> >
>> > My schema requires a counter for each combination of 2 values.
>> Something like:
>> >
>> > CREATE TABLE counter(
>> > group INT NOT NULL,
>> > element INT NOT NULL,
>> > seq_number INT NOT NULL default 0,
>> > CONSTRAINT PRIMARY KEY (group, element)
>> > );
>> >
>> > For each entry in counter, aka for each (group, element) pair, the
>> model requires a seq_number.
>> >
>> > If I use a table "counter", I could still have counter collisions
>> between 2 transactions. I need truly sequence behavior. Is that possible by
>> using a table like "counter" table, where the counter could be increased
>> out of the transaction so it performs as a sequence without having race
>> conditions between concurrent transactions?
>> >
>> > The other option is to create sequences for each new pair of (group,
>> element) using triggers. There are millions of pairs. So this approach will
>> generate millions of sequences.
>> >
>> > How a PostgreSQL database would behave having millions of sequences in
>> a schema? Would it degrade its performance? Is there any negative impact?
>> >
>> > Regards
>> >
>> > Pablo
>> >
>>
>> To clarify, are you hoping for consecutive numbers as the each row is
>> added to the table, i.e. “serial”?
>>
>> What is the intension of “seq_number”?
>> >
>> >
>>
>>
> the idea is to have like a serial sequence, but for each pair of (group,
> element).
>
> so that when we insert rows in another table, we could have something like:
>
>
> group, element, event_id, ...
> 1, 1, 1
> 1, 1, 2
> 1, 1, 3
> 2, 1, 1
> 1, 1, 4
> 1, 3, 1
> 1, 1, 5
> 1, 3, 2
> 2, 1, 2
> 2, 1, 3
>
> The 3rd column is the sequence number we get from the appropriate sequence
> created by the trigger.
>
> I want to implement a variation of https://stackoverflow.com/a/30204854
> and that will generate millions of sequences.
>
>
> Then I don’t thing group/element can be a PRIMARY KEY
>
>
Thank you for your answer. Let me explain:

Table "counter" will have triggers that create a sequence when a new
row is inserted, called counter_1_1_seq, counter_2_1_seq, counter_1_3_seq,
etc., that will be used to insert values in **another** table. It will be
used for nothing else. When we insert millions of group/element pairs, the
trigger on that table will generate millions of sequences.

My question is how PostgreSQL will behave. Could it deal with millions of
sequences? What about system operations such as vacuum, etc.?

Pablo
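
For concreteness, a minimal sketch of that trigger arrangement (illustrative
only; the column is named grp here because group is a reserved word, and
EXECUTE FUNCTION needs v11 or later):

CREATE TABLE counter(
    grp        INT NOT NULL,   -- "group" in the original description
    element    INT NOT NULL,
    seq_number INT NOT NULL DEFAULT 0,
    PRIMARY KEY (grp, element)
);

CREATE OR REPLACE FUNCTION create_counter_sequence() RETURNS trigger AS $$
BEGIN
    -- one sequence per (grp, element) pair, e.g. counter_1_1_seq
    EXECUTE format('CREATE SEQUENCE IF NOT EXISTS %I MINVALUE 1 CACHE 1',
                   format('counter_%s_%s_seq', NEW.grp, NEW.element));
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER counter_create_seq
    AFTER INSERT ON counter
    FOR EACH ROW EXECUTE FUNCTION create_counter_sequence();

Inserts into the event table would then call
nextval(format('counter_%s_%s_seq', grp, element)) to obtain the per-pair
code; the catalog and cache overhead discussed later in the thread applies to
every one of those sequences.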


Re: PG12 autovac issues

2020-03-19 Thread Andres Freund
Hi,

On 2020-03-19 10:23:48 -0500, Justin King wrote:
> > From a single stats snapshot we can't actually understand the actual xid
> > consumption - is it actually the xid usage that triggers the vacuums?
> 
> We have looked at this and the xid consumption averages around 1250
> xid/sec -- this is when we see the "aggressive" autovac kick off in
> the logs.  What I don't understand is why these xid's are being
> consumed at this rate on the databases with no activity (postgres,
> template1).

The xid counter is global across all databases.


> > What makes you think it is a problem that you have all these vacuums? If
> > you actually update that much, and you have indexes, you're going want a
> > lot of vacuums?

> I actually don't think there's a problem with the vacuums (I was
> mostly pointing out that they are very regular and not problematic).
> The main problem I am having is that something is causing the
> autovacuums to completely stop and require manual intervention to
> resume -- and it seems to be when the "postgres" or "template1"
> database hits the autovacuum_freeze_max_age.

Did you look at pg_stat_activity for those autovacuums to see whether
they're blocked on something?
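
Something along these lines would show whether a worker is sitting on a lock
or some other wait event (backend_type is available in v10 and later):

SELECT pid, backend_type, wait_event_type, wait_event, xact_start, query
FROM pg_stat_activity
WHERE backend_type = 'autovacuum worker';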


> > > What is interesting is that this happens with the 'postgres' and
> > > 'template1' databases as well and there is absolutely no activity in
> > > those databases.
> >
> > That's normal. They should be pretty darn quick in v12?
> 
> Yes, a manual VACUUM FREEZE of either database takes less than 1
> second -- which is why it's perplexing that the autovac starts and
> never seems to complete and prevents other autovacs from running.

One big difference between a manual VACUUM and autovacuum is that with
the default settings VACUUM is not throttled, but autovacuum is.

What are your vacuum_cost_delay, autovacuum_vacuum_cost_delay,
vacuum_cost_limit, autovacuum_vacuum_cost_limit, vacuum_cost_page_hit,
vacuum_cost_page_miss set to?

Greetings,

Andres Freund




Re: PG12 autovac issues

2020-03-19 Thread Justin King
On Thu, Mar 19, 2020 at 5:35 PM Andres Freund  wrote:
>
> Hi,
>
> On 2020-03-19 10:23:48 -0500, Justin King wrote:
> > > From a single stats snapshot we can't actually understand the actual xid
> > > consumption - is it actually the xid usage that triggers the vacuums?
> >
> > We have looked at this and the xid consumption averages around 1250
> > xid/sec -- this is when we see the "aggressive" autovac kick off in
> > the logs.  What I don't understand is why these xid's are being
> > consumed at this rate on the databases with no activity (postgres,
> > template1).
>
> The xid counter is global across all databases.

Then what does the "age" value represent for each database in this
case?  Perhaps I'm misunderstanding what I'm looking at?

postgres=#  SELECT datname, age(datfrozenxid),
current_setting('autovacuum_freeze_max_age') FROM pg_database;
  datname  |age| current_setting
---+---+-
 postgres  | 100937449 | 2
 template1 |  50244438 | 2
 template0 | 160207297 | 2
 feedi | 150147602 | 2


>
> > > What makes you think it is a problem that you have all these vacuums? If
> > > you actually update that much, and you have indexes, you're going want a
> > > lot of vacuums?
>
> > I actually don't think there's a problem with the vacuums (I was
> > mostly pointing out that they are very regular and not problematic).
> > The main problem I am having is that something is causing the
> > autovacuums to completely stop and require manual intervention to
> > resume -- and it seems to be when the "postgres" or "template1"
> > database hits the autovacuum_freeze_max_age.
>
> Did you look at pg_stat_activity for those autovacuums to see whether
> they're blocked on something?

This is not something we've done yet but will next time it occurs.

> > > > What is interesting is that this happens with the 'postgres' and
> > > > 'template1' databases as well and there is absolutely no activity in
> > > > those databases.
> > >
> > > That's normal. They should be pretty darn quick in v12?
> >
> > Yes, a manual VACUUM FREEZE of either database takes less than 1
> > second -- which is why it's perplexing that the autovac starts and
> > never seems to complete and prevents other autovacs from running.
>
> One big difference between a manual VACUUM and autovacuum is that with
> the default settings VACUUM is not throttled, but autovacuum is.
>
> What are your vacuum_cost_delay, autovacuum_vacuum_cost_delay,
> vacuum_cost_limit, autovacuum_vacuum_cost_limit, vacuum_cost_page_hit,
> vacuum_cost_page_miss set to?

Here are all the vacuum related values for the server:

postgres=# select name,setting from pg_settings where name like '%vacuum%';
autovacuum = on
autovacuum_analyze_scale_factor = 0.1
autovacuum_analyze_threshold = 2500
autovacuum_freeze_max_age = 2
autovacuum_max_workers = 8
autovacuum_multixact_freeze_max_age = 4
autovacuum_naptime = 15
autovacuum_vacuum_cost_delay = 20
autovacuum_vacuum_cost_limit = -1
autovacuum_vacuum_scale_factor = 0.2
autovacuum_vacuum_threshold = 500
autovacuum_work_mem = -1
log_autovacuum_min_duration = 0
vacuum_cleanup_index_scale_factor = 0.1
vacuum_cost_delay = 0
vacuum_cost_limit = 1000
vacuum_cost_page_dirty = 20
vacuum_cost_page_hit = 1
vacuum_cost_page_miss = 10
vacuum_defer_cleanup_age = 0
vacuum_freeze_min_age = 5000
vacuum_freeze_table_age = 15000
vacuum_multixact_freeze_min_age = 500
vacuum_multixact_freeze_table_age = 15000

I know the database is busy, so the throttling makes sense, but it
seems like it would complete eventually. We see blocked autovacs for
many hours.

> Greetings,
>
> Andres Freund




Re: Passwordcheck configuration

2020-03-19 Thread Tom Lane
Dave Hughes  writes:
> I have a requirement to set some password complexity for our database such
> as length of password, upper case, lower case, special characters,
> expiration limit, reuse, etc.

Usually, if you have to do something like that, we recommend setting PG to
use PAM authentication and configuring the restrictions on the PAM side.
The only native capability in that direction is that you can set a
password expiration date.
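
A minimal sketch of that kind of setup, assuming a PAM service file named
postgresql under /etc/pam.d/ in which the actual complexity rules
(pam_pwquality or similar) are configured:

# pg_hba.conf
host    all    all    10.0.0.0/8    pam    pamservice=postgresql

Keep in mind that PAM needs the password itself at authentication time, so
you will want TLS on those connections.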

Note that it's widely believed that this sort of thing makes you LESS
secure, not more.  Quite aside from the well-established fact that forced
password changes are bad from a human-factors standpoint, you can't check
any of those other points unless the password is sent to the server as
cleartext.  That creates its own set of vulnerabilities, and I don't
know of anybody who considers it good practice.

> I saw there was a module you can use for this called passwordcheck.  Seems
> easy to install, but I don't see how you can configure it for you specific
> needs?

passwordcheck hasn't got any out-of-the-box configurability.  It's mainly
meant as sample code that people could modify if they have a mind to.

(I seem to recall some recent discussion about deprecating/removing
passwordcheck altogether, but I can't find it right now.)

regards, tom lane




Re: Could postgres12 support millions of sequences? (like 10 million)

2020-03-19 Thread David G. Johnston
On Thu, Mar 19, 2020 at 3:33 PM pabloa98  wrote:

> Table "counter" will have triggers that will create a sequence with a new
> row is inserted called counter_1_1_seq, counter_2_1_seq, counter_1_3_seq,
> etc that will be used to insert values in **another** table. It will be
> used for nothing else. When we insert millions of pairs group/element, the
> trigger in that table will generate millions of sequences.
>
> My question is how PostgreSQL will behave. Could it deal with millions of
> sequences? What about system operations as vacuum, etc?
>
>
First, it sounds like you care about there being no gaps in the records you
end up saving.  If that is the case then sequences will not work for you.

.
.
.

If you are still reading because you can live with some gaps - then does
having one sequence per pair really matter?

Regardless, at this scale you probably should setup a performance test as
part of your continuous integration/test infrastructure, and let it answer
the "which performs better" question.

However, one other consideration with sequences: do you care that
PostgreSQL will cache/pin (i.e., no release) every single sequence you
touch for the lifetime of the session? (I do not think DISCARD matters here
but I'm just guessing)

A big determinant would seem to be how your sessions would go about using
the sequences.  You've described the model requirement but haven't described
the process by which the model will be used; and without that information
useful comments pertaining to alternative implementations are difficult to
formulate.

David J.


Re: PG12 autovac issues

2020-03-19 Thread Andres Freund
Hi,

On 2020-03-19 18:07:14 -0500, Justin King wrote:
> On Thu, Mar 19, 2020 at 5:35 PM Andres Freund  wrote:
> >
> > Hi,
> >
> > On 2020-03-19 10:23:48 -0500, Justin King wrote:
> > > > From a single stats snapshot we can't actually understand the actual xid
> > > > consumption - is it actually the xid usage that triggers the vacuums?
> > >
> > > We have looked at this and the xid consumption averages around 1250
> > > xid/sec -- this is when we see the "aggressive" autovac kick off in
> > > the logs.  What I don't understand is why these xid's are being
> > > consumed at this rate on the databases with no activity (postgres,
> > > template1).
> >
> > The xid counter is global across all databases.
> 
> Then what does the "age" value represent for each database in this
> case?  Perhaps I'm misunderstanding what I'm looking at?
> 
> postgres=#  SELECT datname, age(datfrozenxid),
> current_setting('autovacuum_freeze_max_age') FROM pg_database;
>   datname  |age| current_setting
> ---+---+-
>  postgres  | 100937449 | 2
>  template1 |  50244438 | 2
>  template0 | 160207297 | 2
>  feedi | 150147602 | 2

Look at datfrozenxid without the age(). age(xid) computes how "old" xid
is compared to the "next" xid to be assigned. Until vacuum comes around
and performs work, pg_database.datfrozenxid / pg_class.relfrozenxid are
constant, since they represent the values actually present in the
table.  But if xids are being consumed, their "age" increases, because
they're further and further in the past relative to the "newest" xids.
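
To see both side by side:

SELECT datname, datfrozenxid, age(datfrozenxid)
FROM pg_database
ORDER BY age(datfrozenxid) DESC;

Watched over time, datfrozenxid only jumps forward when vacuum advances it,
while age(datfrozenxid) climbs steadily as the cluster consumes xids.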


> > One big difference between a manual VACUUM and autovacuum is that with
> > the default settings VACUUM is not throttled, but autovacuum is.
> >
> > What are your vacuum_cost_delay, autovacuum_vacuum_cost_delay,
> > vacuum_cost_limit, autovacuum_vacuum_cost_limit, vacuum_cost_page_hit,
> > vacuum_cost_page_miss set to?
> 
> Here are all the vacuum related values for the server:
> 
> postgres=# select name,setting from pg_settings where name like '%vacuum%';
> autovacuum = on
> autovacuum_analyze_scale_factor = 0.1
> autovacuum_analyze_threshold = 2500
> autovacuum_freeze_max_age = 2
> autovacuum_max_workers = 8
> autovacuum_multixact_freeze_max_age = 4
> autovacuum_naptime = 15
> autovacuum_vacuum_cost_delay = 20
> autovacuum_vacuum_cost_limit = -1
> autovacuum_vacuum_scale_factor = 0.2
> autovacuum_vacuum_threshold = 500
> autovacuum_work_mem = -1
> log_autovacuum_min_duration = 0
> vacuum_cleanup_index_scale_factor = 0.1
> vacuum_cost_delay = 0
> vacuum_cost_limit = 1000
> vacuum_cost_page_dirty = 20
> vacuum_cost_page_hit = 1
> vacuum_cost_page_miss = 10
> vacuum_defer_cleanup_age = 0
> vacuum_freeze_min_age = 5000
> vacuum_freeze_table_age = 15000
> vacuum_multixact_freeze_min_age = 500
> vacuum_multixact_freeze_table_age = 15000
> 
> I know the database is busy, so the throttling makes sense, but it
> seems like it would complete eventually.

The cost delay/limit are way too long/small, respectively, for a busy
postgres instance.
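
For reference, one possible adjustment in that direction (the numbers are
purely illustrative; both parameters take effect on a reload):

ALTER SYSTEM SET autovacuum_vacuum_cost_delay = '2ms';
ALTER SYSTEM SET autovacuum_vacuum_cost_limit = 2000;
SELECT pg_reload_conf();

With autovacuum_vacuum_cost_limit = -1 the workers fall back to
vacuum_cost_limit (1000 here), so raising the limit and shrinking the delay
lets each worker do far more I/O per second.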


> We see blocked autovacs for many hours.

On the same table, or just generally being busy?

Greetings,

Andres Freund




Re: Could postgres12 support millions of sequences? (like 10 million)

2020-03-19 Thread Adrian Klaver

On 3/19/20 3:32 PM, pabloa98 wrote:



On Thu, Mar 19, 2020 at 3:17 PM Rob Sargent wrote:





On Mar 19, 2020, at 4:13 PM, pabloa98 wrote:



On Thu, Mar 19, 2020 at 2:50 PM Rob Sargent wrote:



> On Mar 19, 2020, at 3:36 PM, pabloa98 wrote:
>
> Hello,
>
> My schema requires a counter for each combination of 2
values. Something like:
>
> CREATE TABLE counter(
> group INT NOT NULL,
> element INT NOT NULL,
> seq_number INT NOT NULL default 0,
> CONSTRAINT PRIMARY KEY (group, element)
> );
>
> For each entry in counter, aka for each (group, element)
pair, the model requires a seq_number.
>
> If I use a table "counter", I could still have counter
collisions between 2 transactions. I need truly sequence
behavior. Is that possible by using a table like "counter"
table, where the counter could be increased out of the
transaction so it performs as a sequence without having race
conditions between concurrent transactions?
>
> The other option is to create sequences for each new pair of
(group, element) using triggers. There are millions of pairs.
So this approach will generate millions of sequences.
>
> How a PostgreSQL database would behave having millions of
sequences in a schema? Would it degrade its performance? Is
there any negative impact?
>
> Regards
>
> Pablo
>

To clarify, are you hoping for consecutive numbers as the each
row is added to the table, i.e. “serial”?

What is the intension of “seq_number”?
>
>


the idea is to have like a serial sequence, but for each pair of
(group, element).

so that when we insert rows in another table, we could have
something like:


group, element, event_id, ...
1, 1, 1
1, 1, 2
1, 1, 3
2, 1, 1
1, 1, 4
1, 3, 1
1, 1, 5
1, 3, 2
2, 1, 2
2, 1, 3

The 3rd column is the sequence number we get from the appropriate
sequence created by the trigger.
I want to implement a variation of
https://stackoverflow.com/a/30204854 and that will generate
millions of sequences.



Then I don’t thing group/element can be a PRIMARY KEY


Thank you for your answer. Let me explain:

Table "counter" will have triggers that will create a sequence with a 
new row is inserted called counter_1_1_seq, counter_2_1_seq, 
counter_1_3_seq, etc that will be used to insert values in **another** 
table. It will be used for nothing else. When we insert millions of 
pairs group/element, the trigger in that table will generate millions of 
sequences.


This has train wreck written all over it. If you are going to have a 
trigger for each combination of (group, element) I gotta believe the 
table will fall over before you run into sequence issues. Not sure why 
there can't be a serial column that supplies the sequence numbers and 
therefore only one sequence in play. From what I see all you care about 
is that:


group, element, event_id

has an increasing event_id for (group, element) pairs.




My question is how PostgreSQL will behave. Could it deal with millions 
of sequences? What about system operations as vacuum, etc?


Pablo



--
Adrian Klaver
adrian.kla...@aklaver.com




Re: Could postgres12 support millions of sequences? (like 10 million)

2020-03-19 Thread pabloa98
On Thu, Mar 19, 2020 at 5:13 PM Adrian Klaver 
wrote:

> On 3/19/20 3:32 PM, pabloa98 wrote:
> >
> >
> > On Thu, Mar 19, 2020 at 3:17 PM Rob Sargent wrote:
> >
> >
> >
> >> On Mar 19, 2020, at 4:13 PM, pabloa98 wrote:
> >>
> >>
> >>
> >> On Thu, Mar 19, 2020 at 2:50 PM Rob Sargent wrote:
> >>
> >>
> >>
> >> > On Mar 19, 2020, at 3:36 PM, pabloa98 wrote:
> >> >
> >> > Hello,
> >> >
> >> > My schema requires a counter for each combination of 2
> >> values. Something like:
> >> >
> >> > CREATE TABLE counter(
> >> > group INT NOT NULL,
> >> > element INT NOT NULL,
> >> > seq_number INT NOT NULL default 0,
> >> > CONSTRAINT PRIMARY KEY (group, element)
> >> > );
> >> >
> >> > For each entry in counter, aka for each (group, element)
> >> pair, the model requires a seq_number.
> >> >
> >> > If I use a table "counter", I could still have counter
> >> collisions between 2 transactions. I need truly sequence
> >> behavior. Is that possible by using a table like "counter"
> >> table, where the counter could be increased out of the
> >> transaction so it performs as a sequence without having race
> >> conditions between concurrent transactions?
> >> >
> >> > The other option is to create sequences for each new pair of
> >> (group, element) using triggers. There are millions of pairs.
> >> So this approach will generate millions of sequences.
> >> >
> >> > How a PostgreSQL database would behave having millions of
> >> sequences in a schema? Would it degrade its performance? Is
> >> there any negative impact?
> >> >
> >> > Regards
> >> >
> >> > Pablo
> >> >
> >>
> >> To clarify, are you hoping for consecutive numbers as the each
> >> row is added to the table, i.e. “serial”?
> >>
> >> What is the intension of “seq_number”?
> >> >
> >> >
> >>
> >>
> >> the idea is to have like a serial sequence, but for each pair of
> >> (group, element).
> >>
> >> so that when we insert rows in another table, we could have
> >> something like:
> >>
> >>
> >> group, element, event_id, ...
> >> 1, 1, 1
> >> 1, 1, 2
> >> 1, 1, 3
> >> 2, 1, 1
> >> 1, 1, 4
> >> 1, 3, 1
> >> 1, 1, 5
> >> 1, 3, 2
> >> 2, 1, 2
> >> 2, 1, 3
> >>
> >> The 3rd column is the sequence number we get from the appropriate
> >> sequence created by the trigger.
> >> I want to implement a variation of
> >> https://stackoverflow.com/a/30204854 and that will generate
> >> millions of sequences.
> >>
> >>
> > Then I don’t thing group/element can be a PRIMARY KEY
> >
> >
> > Thank you for your answer. Let me explain:
> >
> > Table "counter" will have triggers that will create a sequence with a
> > new row is inserted called counter_1_1_seq, counter_2_1_seq,
> > counter_1_3_seq, etc that will be used to insert values in **another**
> > table. It will be used for nothing else. When we insert millions of
> > pairs group/element, the trigger in that table will generate millions of
> > sequences.
>
> This has train wreck written all over it. If you are going to have a
> trigger for each combination of (group, element) I gotta believe the
> table will fall over before you run into sequence issues. Not sure why
> there can't be serial column that has supplies the sequence numbers and
> therefore only one sequence in play. From what I see all you care about
> is that:
>
> group, element, event_id
>
> has an increasing event_id for (group, element) pairs.
>
>

I need a different sequence because it is a business requirement.
Please forget about the first example. I was not clear. It will be
something like this.

I have a table called "pair":

CREATE TABLE pair(
group INT NOT NULL,
element INT NOT NULL,
CONSTRAINT PRIMARY KEY (group, element)
);

I must add entries to the table "event". This table event will have a code
that will be generated using a sequence of that is a function of
s(group,element).
CREATE TABLE event(
group INT NOT NULL,
element INT NOT NULL,
code INT NOT NULL,
CONSTRAINT PRIMARY KEY(code, element, group)
);

I could make column code SERIAL and be done with it. HOWEVER, the "code"
column starting from 1 for each pair (group, element) is a business
requirement. The business domain dictates that for each pair of (group,
element) there is a code that will start from 1 to . It is OK to
have gaps in the sequence.

So this table "event" will contain entries like:

 1, 1, 1
 1, 1, 2
 1, 1, 3
 2, 1, 1
 1, 1, 4
 1, 3, 1
 1, 1, 5
 1, 3, 2
 2, 1, 2
 2, 1, 3

These tri

Re: Could postgres12 support millions of sequences? (like 10 million)

2020-03-19 Thread Rob Sargent


> On Mar 19, 2020, at 6:45 PM, pabloa98  wrote:
> 
> 
> 
> On Thu, Mar 19, 2020 at 5:13 PM Adrian Klaver wrote:
> On 3/19/20 3:32 PM, pabloa98 wrote:
> > 
> > 
> > On Thu, Mar 19, 2020 at 3:17 PM Rob Sargent wrote:
> 
> I have a table called "pair":
> 
> CREATE TABLE pair(
> group INT NOT NULL,
> element INT NOT NULL,
> CONSTRAINT PRIMARY KEY (group, element)
> );
> 
> I must add entries to the table "event". This table event will have a code 
> that will be generated using a sequence of that is a function of 
> s(group,element). 
> CREATE TABLE event(
> group INT NOT NULL,
> element INT NOT NULL,
> code INT NOT NULL,
> CONSTRAINT PRIMARY KEY(code, element, group)
> );
> 
Unless the event table is searched by code more often than by group, you
probably want to maintain the key order from the pair table's primary key.

If gaps are OK, do you still need multiple sequences?




Re: Could postgres12 support millions of sequences? (like 10 million)

2020-03-19 Thread pabloa98
On Thu, Mar 19, 2020 at 6:16 PM Rob Sargent  wrote:

>
>
> On Mar 19, 2020, at 6:45 PM, pabloa98  wrote:
>
>
>
> On Thu, Mar 19, 2020 at 5:13 PM Adrian Klaver 
> wrote:
>
>> On 3/19/20 3:32 PM, pabloa98 wrote:
>> >
>> >
>> > On Thu, Mar 19, 2020 at 3:17 PM Rob Sargent wrote:
>
> I have a table called "pair":
>
> CREATE TABLE pair(
> group INT NOT NULL,
> element INT NOT NULL,
> CONSTRAINT PRIMARY KEY (group, element)
> );
>
> I must add entries to the table "event". This table event will have a code
> that will be generated using a sequence of that is a function of
> s(group,element).
> CREATE TABLE event(
> group INT NOT NULL,
> element INT NOT NULL,
> code INT NOT NULL,
> CONSTRAINT PRIMARY KEY(code, element, group)
> );
>
> Unless event table is searched by code more than group, you probably want
> to maintain the key order from the pair table's primary key.
>

OK. I will do that.

>
> If gaps are OK, do you still need multiple sequences?
>
I need to start each "code" value from 1 in each (group, element) pair.
This is because we only have  codes for each pair. We do not want
to waste numbers. If there is a gap that is OK, but not a gap of millions
of numbers.


Re: Could postgres12 support millions of sequences? (like 10 million)

2020-03-19 Thread Rob Sargent


> On Mar 19, 2020, at 7:35 PM, pabloa98  wrote:
> 
> 
> 
> On Thu, Mar 19, 2020 at 6:16 PM Rob Sargent wrote:
> 
> 
>> On Mar 19, 2020, at 6:45 PM, pabloa98 wrote:
>> 
>> 
>> 
>> On Thu, Mar 19, 2020 at 5:13 PM Adrian Klaver wrote:
>> On 3/19/20 3:32 PM, pabloa98 wrote:
>> > 
>> > 
>> > On Thu, Mar 19, 2020 at 3:17 PM Rob Sargent wrote:
>> 
>> I have a table called "pair":
>> 
>> CREATE TABLE pair(
>> group INT NOT NULL,
>> element INT NOT NULL,
>> CONSTRAINT PRIMARY KEY (group, element)
>> );
>> 
>> I must add entries to the table "event". This table event will have a code 
>> that will be generated using a sequence of that is a function of 
>> s(group,element). 
>> CREATE TABLE event(
>> group INT NOT NULL,
>> element INT NOT NULL,
>> code INT NOT NULL,
>> CONSTRAINT PRIMARY KEY(code, element, group)
>> );
>> 
> Unless event table is searched by code more than group, you probably want to 
> maintain the key order from the pair table's primary key.
> 
> OK. I will do that. 
> 
> If gaps are ok do you still near multiple sequences?
> 
> I need to start each "code" value from 1 in each (group, element) pair. This 
> is because we only have  codes for each pair. We do not want to waste 
> numbers. If there is a gap is OK but no a gap of millions of numbers.





Re: Could postgres12 support millions of sequences? (like 10 million)

2020-03-19 Thread Michael Lewis
On Thu, Mar 19, 2020, 5:48 PM David G. Johnston 
wrote:

> However, one other consideration with sequences: do you care that
> PostgreSQL will cache/pin (i.e., no release) every single sequence you
> touch for the lifetime of the session? (I do not think DISCARD matters here
> but I'm just guessing)
>


Would you expand on this point or is there someplace specific in the
documentation on this?

>


Re: Could postgres12 support millions of sequences? (like 10 million)

2020-03-19 Thread Tom Lane
Michael Lewis  writes:
> On Thu, Mar 19, 2020, 5:48 PM David G. Johnston 
> wrote:
>> However, one other consideration with sequences: do you care that
>> PostgreSQL will cache/pin (i.e., no release) every single sequence you
>> touch for the lifetime of the session? (I do not think DISCARD matters here
>> but I'm just guessing)

> Would you expand on this point or is there someplace specific in the
> documentation on this?

I think what David is worried about is that a sequence object is a
one-row table in PG's implementation.  Thus

(1) each sequence requires a dozen or two rows in assorted system
catalogs (not sure exactly how many offhand).

(2) each sequence uses up 8KB on disk for its table file.

(3) each sequence you actually access within a particular session
results in creation of relcache and catcache entries in that
session's local memory.  I'm not sure offhand how big those are
either, but a few KB per sequence would be a reasonable guess.

(4) each sequence competes for space in the shared-buffer arena,
since its 8K block has to be swapped into there whenever you try
to access/increment the sequence's value.

This is all mighty inefficient of course, and there's been talk
of trying to reduce the per-sequence overhead; but I don't know
of anyone actively working on that.  As things stand, I think
having millions of sequences would be quite painful performance-
wise, especially if your workload were actively hitting a lot
of them concurrently.  It would work, for some value of "work",
but it wouldn't perform very well.

Also, as multiple people mentioned already, this still wouldn't
guarantee gap-free sequences of ID values.

regards, tom lane
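
For completeness, the table-based alternative discussed upthread avoids
per-pair catalog objects entirely, at the price of serializing concurrent
inserters on the same (group, element) row until commit. A minimal sketch
(grp instead of group, which is a reserved word):

CREATE TABLE pair_counter(
    grp      INT NOT NULL,
    element  INT NOT NULL,
    last_val INT NOT NULL DEFAULT 0,
    PRIMARY KEY (grp, element)
);

-- hand out the next code for a pair; the row lock taken by the UPDATE
-- guarantees two transactions cannot get the same value
UPDATE pair_counter
SET last_val = last_val + 1
WHERE grp = 1 AND element = 1
RETURNING last_val;

Unlike nextval(), this is transactional: a rolled-back insert gives the number
back (no gap), but a long-running transaction that holds the row lock blocks
everyone else who needs a code for that same pair.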




Re: Could postgres12 support millions of sequences? (like 10 million)

2020-03-19 Thread Adrian Klaver

On 3/19/20 7:38 PM, Michael Lewis wrote:



On Thu, Mar 19, 2020, 5:48 PM David G. Johnston wrote:


However, one other consideration with sequences: do you care that
PostgreSQL will cache/pin (i.e., no release) every single sequence
you touch for the lifetime of the session? (I do not think DISCARD
matters here but I'm just guessing)



Would you expand on this point or is there someplace specific in the 
documentation on this?




See the section starting here:

https://www.postgresql.org/docs/12/sql-createsequence.html

Notes

"Unexpected results might be obtained if a cache setting greater than 
one is used for a sequence object that will be used concurrently by 
multiple sessions. Each session will allocate and cache successive 
sequence values during one access to the sequence object and increase 
the sequence object's last_value accordingly. Then, the next cache-1 
uses of nextval within that session simply return the preallocated 
values without touching the sequence object. So, any numbers allocated 
but not used within a session will be lost when that session ends, 
resulting in “holes” in the sequence.


...
"


--
Adrian Klaver
adrian.kla...@aklaver.com




Re: Could postgres12 support millions of sequences? (like 10 million)

2020-03-19 Thread pabloa98
On Thu, Mar 19, 2020 at 9:12 PM Adrian Klaver 
wrote:

> On 3/19/20 7:38 PM, Michael Lewis wrote:
> >
> >
> > On Thu, Mar 19, 2020, 5:48 PM David G. Johnston wrote:
> >
> > However, one other consideration with sequences: do you care that
> > PostgreSQL will cache/pin (i.e., no release) every single sequence
> > you touch for the lifetime of the session? (I do not think DISCARD
> > matters here but I'm just guessing)
> >
> >
> >
> > Would you expand on this point or is there someplace specific in the
> > documentation on this?
> >
>
> See the section starting here:
>
> https://www.postgresql.org/docs/12/sql-createsequence.html
>
> Notes
>
> "Unexpected results might be obtained if a cache setting greater than
> one is used for a sequence object that will be used concurrently by
> multiple sessions. Each session will allocate and cache successive
> sequence values during one access to the sequence object and increase
> the sequence object's last_value accordingly. Then, the next cache-1
> uses of nextval within that session simply return the preallocated
> values without touching the sequence object. So, any numbers allocated
> but not used within a session will be lost when that session ends,
> resulting in “holes” in the sequence.
>
> ...
> "
>
We will use CACHE 1. This is because when nextval('seq') is invoked, we
are hitting 3 or 4 more tables, so the sequence will not be a performance
blocker (compared with all the operations in the transaction).


> --
> Adrian Klaver
> adrian.kla...@aklaver.com
>


Re: Could postgres12 support millions of sequences? (like 10 million)

2020-03-19 Thread pabloa98
I see.

Any suggestions? It should behave like a sequence in the sense that
concurrent transactions will get different numbers from this alternative,
sequence-like solution.

In our case, we will need to call nextval('some_seq') (or similar)
from different processes no more than twice every minute.


It would be nice to have a sequence data type. Correct me if I am wrong,
but it seems to me that a sequence data type would cost the same or less
than the current sequence implementation.

The schema would be more clear too. We could have a table like:

CREATE TABLE pair(
group INT NOT NULL,
element INT NOT NULL,
seq SEQUENCE INCREMENT 1
  START 1
  CACHE 1
  MINVALUE 1
  MAXVALUE 
NOT NULL,
CONSTRAINT PRIMARY KEY (group, element)
);

And then:

INSERT INTO event(group, element, code)
VALUES (
 1,
 1,
 ( SELECT seq.nextval('code_1_1_seq') FROM pair p WHERE p.group=1
and p.code=1 )
);

Or perhaps storing all the sequences in the same table as rows will have
the same behavior.

Pablo


On Thu, Mar 19, 2020 at 7:56 PM Tom Lane  wrote:

> Michael Lewis  writes:
> > On Thu, Mar 19, 2020, 5:48 PM David G. Johnston <
> david.g.johns...@gmail.com>
> > wrote:
> >> However, one other consideration with sequences: do you care that
> >> PostgreSQL will cache/pin (i.e., no release) every single sequence you
> >> touch for the lifetime of the session? (I do not think DISCARD matters
> here
> >> but I'm just guessing)
>
> > Would you expand on this point or is there someplace specific in the
> > documentation on this?
>
> I think what David is worried about is that a sequence object is a
> one-row table in PG's implementation.  Thus
>
> (1) each sequence requires a dozen or two rows in assorted system
> catalogs (not sure exactly how many offhand).
>
> (2) each sequence uses up 8KB on disk for its table file.
>
> (3) each sequence you actually access within a particular session
> results in creation of relcache and catcache entries in that
> session's local memory.  I'm not sure offhand how big those are
> either, but a few KB per sequence would be a reasonable guess.
>
> (4) each sequence competes for space in the shared-buffer arena,
> since its 8K block has to be swapped into there whenever you try
> to access/increment the sequence's value.
>
> This is all mighty inefficient of course, and there's been talk
> of trying to reduce the per-sequence overhead; but I don't know
> of anyone actively working on that.  As things stand, I think
> having millions of sequences would be quite painful performance-
> wise, especially if your workload were actively hitting a lot
> of them concurrently.  It would work, for some value of "work",
> but it wouldn't perform very well.
>
> Also, as multiple people mentioned already, this still wouldn't
> guarantee gap-free sequences of ID values.
>
> regards, tom lane
>


Re: Could postgres12 support millions of sequences? (like 10 million)

2020-03-19 Thread David G. Johnston
On Thursday, March 19, 2020, Tom Lane  wrote:

> Michael Lewis  writes:
> > On Thu, Mar 19, 2020, 5:48 PM David G. Johnston <
> david.g.johns...@gmail.com>
> > wrote:
> >> However, one other consideration with sequences: do you care that
> >> PostgreSQL will cache/pin (i.e., no release) every single sequence you
> >> touch for the lifetime of the session? (I do not think DISCARD matters
> here
> >> but I'm just guessing)
>
> > Would you expand on this point or is there someplace specific in the
> > documentation on this?
>
> I think what David is worried about is that a sequence object is a
> one-row table in PG's implementation.  Thus
>
> (1) each sequence requires a dozen or two rows in assorted system
> catalogs (not sure exactly how many offhand).
>

Actually I seem to have missed that dynamic.  I was referring to
the SeqTable hash table specified here:

https://github.com/postgres/postgres/blob/85f6b49c2c53fb1e08d918ec9305faac13cf7ad6/src/backend/commands/sequence.c#L69


I wouldn’t think there would be much spreading of data throughout the
catalog if the sequences are unowned (by tables).

David J.


Re: Could postgres12 support millions of sequences? (like 10 million)

2020-03-19 Thread David G. Johnston
On Thursday, March 19, 2020, pabloa98  wrote:
>
>
>> We will use a CACHE 1. This is because when nextval('seq') is invoked, we
> are hitting 3 or 4 more tables so the sequence will not be a performance
> blocker (compared with all the operations in the transaction).
>

The other implementation detail to remember is:

“Because nextval and setval calls are never rolled back, sequence objects
cannot be used if “gapless” assignment of sequence numbers is needed.”

David J.