kernel.shmmax and kernel.shmall for Linux server

2021-02-12 Thread Matthias Apitz


Hello,

I've read the doc https://www.postgresql.org/docs/13/kernel-resources.html
but I'm still relatively clueless about what to set, for example for our
standard installations like:

- RAM: 32 GByte
- one PostgreSQL 13.1 server
- one database with ~400 tables, 4 GByte in size
- 100-200 connecting PostgreSQL-clients (C/C++ application servers,
  single threaded, for interactive usage)

The above document says the Linux default values are good enough, but
here is what they are:

# /usr/sbin/sysctl kernel | grep shm
kernel.shm_next_id = -1
kernel.shm_rmid_forced = 0
kernel.shmall = 1152921504606846720
kernel.shmmax = 18446744073709551615
kernel.shmmni = 4096

The values are coming from 

# cat /boot/sysctl.conf-5.3.18-24.46-default
...
kernel.shmmax = 0x
# SHMALL = SHMMAX/PAGE_SIZE*(SHMMNI/16)
kernel.shmall = 0x0f00
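As a sanity check, the SHMALL formula from that sysctl.conf comment reproduces the huge default shown by sysctl above (a quick sketch; PAGE_SIZE = 4096 is assumed, the usual x86_64 page size):

```python
# SHMALL = SHMMAX / PAGE_SIZE * (SHMMNI / 16), per the sysctl.conf comment.
# SHMMAX defaults to 2**64 - 1 (effectively "unlimited").
SHMMAX = 18446744073709551615
PAGE_SIZE = 4096  # assumed; check with `getconf PAGE_SIZE`
SHMMNI = 4096

shmall = SHMMAX // PAGE_SIZE * (SHMMNI // 16)
print(shmall)  # -> 1152921504606846720, matching kernel.shmall above
```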

Any hints for real values?

matthias

-- 
Matthias Apitz, ✉ g...@unixarea.de, http://www.unixarea.de/ +49-176-38902045
Public GnuPG key: http://www.unixarea.de/key.pub




pg_stat_progress_vacuum empty when running vacuum full

2021-02-12 Thread Luca Ferrari
Hi all,
I'm running 11.5 and monitoring pg_stat_progress_vacuum every 2
seconds while running per-table VACUUMs from another connection. Every
vacuum lasts 30 or more seconds, so I was expecting to see a record
within pg_stat_progress_vacuum, but nothing appears if the vacuum is
FULL. I suspect this is because VACUUM FULL performs a side-by-side
table rewrite rather than in-place actions against the original
table, but I'm not sure if that is the real reason or if I'm missing
something.

Thanks,
Luca




Re: pg_stat_progress_vacuum empty when running vacuum full

2021-02-12 Thread Ian Lawrence Barwick
On Fri, Feb 12, 2021 at 20:01 Luca Ferrari wrote:

> Hi all,
> I'm running 11.5 and I'm monitoring pg_stat_progress_vacuum every 2
> seconds, while doing from another connection per-table VACUUMs. Every
> vacuum last 30 or more seconds, so I was expecting to see a record
> within pg_stat_progress_vacuum, but nothing appears if the vacuum is
> full. I suspect this is due to vacuum full performing a side-by-side
> table rewriting, rather than in-place actions against the original
> table, but I'm not sure if this is real reason or if I'm missing
> something.
>
>
If you're executing VACUUM FULL, you should be looking at
pg_stat_progress_cluster:


https://www.postgresql.org/docs/current/progress-reporting.html#CLUSTER-PROGRESS-REPORTING

Though that doesn't appear until Pg 12.
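For what it's worth, on 12 and later the progress of a running VACUUM FULL can be watched with a query along these lines (a sketch using the documented view columns; run it repeatedly, e.g. with \watch 2):

```sql
-- VACUUM FULL reports under the CLUSTER progress machinery, not VACUUM's.
SELECT p.pid,
       p.relid::regclass AS table_name,
       p.phase,
       p.heap_tuples_scanned,
       p.heap_tuples_written
FROM pg_stat_progress_cluster p;
```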

Regards

Ian Barwick
-- 
EnterpriseDB: https://www.enterprisedb.com


Re: pg_stat_progress_vacuum empty when running vacuum full

2021-02-12 Thread luis . roberto
> From: "Luca Ferrari" 
> To: "pgsql-general" 
> Sent: Friday, February 12, 2021 8:00:46
> Subject: pg_stat_progress_vacuum empty when running vacuum full

> Hi all, 
> I'm running 11.5 and I'm monitoring pg_stat_progress_vacuum every 2 
> seconds, while doing from another connection per-table VACUUMs. Every 
> vacuum last 30 or more seconds, so I was expecting to see a record 
> within pg_stat_progress_vacuum, but nothing appears if the vacuum is 
> full. I suspect this is due to vacuum full performing a side-by-side 
> table rewriting, rather than in-place actions against the original 
> table, but I'm not sure if this is real reason or if I'm missing 
> something. 
> 
> Thanks, 
> Luca 


Hi!

I believe VACUUM FULL progress can be monitored via the
pg_stat_progress_cluster view:
https://www.postgresql.org/docs/current/progress-reporting.html#CLUSTER-PROGRESS-REPORTING




Re: PostgreSQL occasionally unable to rename WAL files (NTFS)

2021-02-12 Thread Guy Burgess

On 12/02/2021 12:31 am, Lionel Bouton wrote:

I haven't dealt with a Windows environment for quite some time, but from
what I remember an antivirus installs a driver intercepting file
accesses, and these drivers remain active even if you disable the
antivirus (I suppose they just call a no-op instead of the content
analysis code) and can still interfere with your system. For example,
some years ago I saw what looked like a race condition involving rename
for MySQL on Windows that could not be fixed by disabling the antivirus
but could be by uninstalling it completely.

You might want to uninstall the antivirus temporarily to check this.


Thanks Lionel for this suggestion. It's the built-in Windows Defender AV
here, which I believe can't be uninstalled, so I'm limited to 'disabling'
it (plus it already has exclusions for the PG directories & processes).
Using Procmon I can't see any AV file activity when it is disabled. Will
keep monitoring it though.


Kind regards

Guy





Re: PostgreSQL occasionally unable to rename WAL files (NTFS)

2021-02-12 Thread Guy Burgess

On 12/02/2021 4:33 am, Thorsten Schöning wrote:

If you see that somewhat frequently, use Process Monitor and Process
Explorer to see who accesses those files how. ProcExp easily allows
you to find all open handles per file. If it's not AV, it might be
something like Windows Search Indexer as well, if that is enabled by
default in Server 2019 at all.

Thanks Thorsten, that's a great idea - I'm using Procmon and Process
Explorer and am able to monitor activity on the WAL files, so hopefully
that sheds some light. The Indexer service is disabled, so I can rule
that out.


Kind regards

Guy





Preventing free space from being reused

2021-02-12 Thread Noah Bergbauer
Hello,

I am working on a system to store and analyze JSON-based log data. The idea
is to store tuples with a monotonically increasing timestamp (i.e.
CURRENT_TIMESTAMP) in a table and run some queries that focus on specific
time windows (~one hour). It seems to me that a BRIN index is designed
exactly for this type of application.

However, I find that a BRIN index with standard settings (pages_per_range =
128) and very simple filters (e.g. ts > (CURRENT_TIMESTAMP - INTERVAL '1
hour')) causes ~20x more pages to be scanned than I would expect. A
majority of these tuples are then discarded due to index rechecking.
Inspecting the BRIN pages using the pageinspect extension reveals the
following problem: it seems that if a page is left with some free space
that can not be filled right away (because the tuples being inserted are
too large), then this hole can be filled at some arbitrary later point in
time (sometimes hours later) once a small enough tuple comes along. This
substantially reduces the BRIN index's effectiveness.
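For anyone wanting to reproduce the inspection, a minimal sketch of the setup and the pageinspect query (table and index names are made up; in a BRIN index, page 0 is the metapage and page 1 the revmap, so page 2 is the first data page):

```sql
CREATE TABLE log (ts timestamptz DEFAULT CURRENT_TIMESTAMP, payload jsonb);
CREATE INDEX log_ts_brin ON log USING brin (ts) WITH (pages_per_range = 128);

CREATE EXTENSION IF NOT EXISTS pageinspect;
-- Each row is one summarized block range; overlapping {min .. max} intervals
-- across consecutive ranges are the symptom of backfilled holes described above.
SELECT itemoffset, blknum, value
FROM brin_page_items(get_raw_page('log_ts_brin', 2), 'log_ts_brin');
```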

I confirmed this theory by CLUSTERing the table using a temporary btree
index. Suddenly the query performance exactly matched my estimates.

The JSON data structure is often similar, so the table is stored on ZFS
with compression. Hence, filling these holes brings no benefit - they were
taking up no space thanks to the compression. On the other hand, rewriting
old pages in order to fill up holes also creates a lot of fragmentation
because of ZFS's copy-on-write semantics.

In summary, all of these problems would be solved if there were some way to
stop the database system from ever reusing free space.

Bonus question: what's the best TOAST configuration for this use case? Is
there any point in out-of-line storage when all tuples are going to be
quite large (i.e. > 1kB)? Is there any benefit in having PostgreSQL
compress the data when ZFS runs compression regardless?

Thank you,
Noah Bergbauer


Re: kernel.shmmax and kernel.shmall for Linux server

2021-02-12 Thread Tom Lane
Matthias Apitz  writes:
> I've read the doc https://www.postgresql.org/docs/13/kernel-resources.html
> but I'm still relatively clueless, what to set.

For modern PG versions on Linux, you basically shouldn't ever have to
touch those numbers.  I'd only start to worry if I saw server startup
failures.  These values do *not* constrain what you can set for
shared_buffers or the like; at most, they'd constrain how many PG
instances you can run on one machine.

regards, tom lane




Re: Preventing free space from being reused

2021-02-12 Thread Michael Lewis
What version are you using? What is your usage pattern for
insert/update/deletes? If sometimes the JSON data gets too big and the data
is moved from in-line storage to TOASTED, then that would be opening up
gaps. Or if you are doing deletes. Perhaps adjusting your fillfactor
and/or TOAST_TUPLE_TARGET would influence that behavior in the direction
you want to go. As best I understand though if you are doing deletes, you
won't be able to prevent those gaps from showing up and being re-used. I
don't believe there is any way to influence Postgres to append-only the
tuples and pages.


Re: Preventing free space from being reused

2021-02-12 Thread Noah Bergbauer
This is on version 12.5. The usage pattern is inserts only, no updates or
deletes at all. Hence, fillfactor is set to 100.

It just seems extremely unfortunate in this particular case that Postgres
goes to all the trouble of tetris-ing new tuples into existing pages, only
to cripple performance in two different ways (BRIN and disk fragmentation).
In other words, if there is no workaround for this problem, then I believe
that an option to essentially just disable free space maps on a
per-relation basis should be added.

On Fri, Feb 12, 2021 at 4:58 PM Michael Lewis  wrote:

> What version are you using? What is your usage pattern for
> insert/update/deletes? If sometimes the JSON data gets too big and the data
> is moved from in-line storage to TOASTED, then that would be opening up
> gaps. Or if you are doing deletes. Perhaps adjusting your fillfactor
> and/or TOAST_TUPLE_TARGET would influence that behavior in the direction
> you want to go. As best I understand though if you are doing deletes, you
> won't be able to prevent those gaps from showing up and being re-used. I
> don't believe there is any way to influence Postgres to append-only the
> tuples and pages.
>


Re: Preventing free space from being reused

2021-02-12 Thread Michael Lewis
If you have no updates or deletes, then I would wonder about setting
fillfactor LOWER, such that new rows are less likely to find a gap that is
acceptable. Depending on how and when you use the JSON, lowering
toast_tuple_target may be appropriate to store (nearly?) all of it out of
line, making the data stored in the main relation more uniform in size.
Are you seeing significant TOAST usage currently?
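Concretely, those knobs could be set with something like this (illustrative only; the table and column names are made up, and 128 is the minimum allowed toast_tuple_target):

```sql
-- Leave headroom in heap pages and push even small-ish values toward TOAST.
ALTER TABLE log SET (fillfactor = 70, toast_tuple_target = 128);
-- EXTERNAL = out-of-line storage without PostgreSQL's own compression,
-- which may make sense when ZFS compresses the underlying files anyway.
ALTER TABLE log ALTER COLUMN payload SET STORAGE EXTERNAL;
```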


Re: Preventing free space from being reused

2021-02-12 Thread Noah Bergbauer
I'm experimenting with that right now. Setting storage to MAIN appears to
be counterproductive, whereas EXTERNAL with toast_tuple_target = 128 is
definitely an improvement. In theory, this configuration might eliminate
the problem, but due to the toast_tuple_target bug (
https://www.postgresql.org/message-id/flat/20190403063759.GF3298%40paquier.xyz)
plenty of 1kB tuples are still being stored inline. As a result I'm
averaging around 11.5 tuples per page, when it should be >200 (one tuple is
35 bytes when stored out of line). A small test query shows ~7000 tuples
removed by index recheck, but based on my statistics only ~1500 would be
expected (in the ideal case where tuple disk order matches insertion order).

On the other hand, wouldn't the toast table still run into the disk
fragmentation issue? Also, the 4-byte oid may actually become an issue a
few months down the road.

I wonder how hard it would be to hack up a table access method that is just
a copy of heap where HEAP_INSERT_SKIP_FSM is always set...

On Fri, Feb 12, 2021 at 5:49 PM Michael Lewis  wrote:

> If you have no updates or deletes, then I would wonder about setting
> fillfactor LOWER such that new rows are less likely to find a gap that is
> acceptable. Depending how/when you use the json, lowering
> toast_tuple_target may be appropriate to store (nearly?) all out of line
> and making the data stored in the main relation be more uniform in size.
> Are you seeing significant toast usage currently?
>


MultiXactMemberControlLock contention on a replica

2021-02-12 Thread Christophe Pettus
On a whole fleet of load-balanced replicas, we saw an incident where one 
particular query started backing up on MultiXactMemberControlLock and 
multixact_member.  There was no sign of this backup on the primary.  Under what 
conditions would there be enough multixact members on a replica (where you 
can't do UPDATE / SELECT FOR UPDATE / FOR SHARE) to start spilling to disk?
--
-- Christophe Pettus
   x...@thebuild.com





Re: Preventing free space from being reused

2021-02-12 Thread Noah Bergbauer
>I wonder how hard it would be to hack up a table access method that is
just a copy of heap where HEAP_INSERT_SKIP_FSM is always set...

Update: I think this actually works. It's awful because I just copied the
entire builtin heap code into an extension and then renamed a few functions
so they don't collide during dynamic linking, but after changing the
table's access method to the one from the extension there is no more
overlap!

Before:
 {2021-02-12 14:21:24.93711+01 .. 2021-02-12 15:14:28.564695+01}
 {2021-02-12 15:10:22.832827+01 .. 2021-02-12 20:45:34.918054+01}
 {2021-02-12 15:50:50.768549+01 .. 2021-02-12 20:51:20.487791+01}
 {2021-02-12 16:25:01.224667+01 .. 2021-02-12 17:07:31.95343+01}
 {2021-02-12 16:51:30.007498+01 .. 2021-02-12 18:15:42.377372+01}
 {2021-02-12 17:30:17.943716+01 .. 2021-02-12 18:55:00.030094+01}
 {2021-02-12 18:08:39.488203+01 .. 2021-02-12 20:55:00.012211+01}
 {2021-02-12 19:05:35.495433+01 .. 2021-02-12 20:20:00.088014+01}

After:
 {2021-02-12 20:00:32.61514+01 .. 2021-02-12 20:45:23.617548+01}
 {2021-02-12 20:45:23.617548+01 .. 2021-02-12 20:51:05.098479+01}
 {2021-02-12 20:51:05.219331+01 .. 2021-02-12 20:57:56.93961+01}
 {2021-02-12 20:57:57.000953+01 .. 2021-02-12 21:02:10.245561+01}
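For context, the SQL-level wiring for such a copied table AM would look roughly like this on 12.x (all names are hypothetical; the C handler is the copied heap_tableam_handler with HEAP_INSERT_SKIP_FSM forced on in its insert paths, and since ALTER TABLE ... SET ACCESS METHOD doesn't exist yet in 12, the table has to be created with USING and the data copied over):

```sql
CREATE FUNCTION noheap_fsm_handler(internal)
    RETURNS table_am_handler
    AS 'noheap_fsm' LANGUAGE C;
CREATE ACCESS METHOD noheap_fsm TYPE TABLE HANDLER noheap_fsm_handler;

-- The AM is fixed at creation time on 12.x, so migrate by copying:
CREATE TABLE log_new (LIKE log INCLUDING ALL) USING noheap_fsm;
INSERT INTO log_new SELECT * FROM log;
```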




On Fri, Feb 12, 2021 at 6:38 PM Noah Bergbauer  wrote:

> I'm experimenting with that right now. Setting storage to MAIN appears to
> be counterproductive, whereas EXTERNAL with toast_tuple_target = 128 is
> definitely an improvement. In theory, this configuration might eliminate
> the problem, but due to the toast_tuple_target bug (
> https://www.postgresql.org/message-id/flat/20190403063759.GF3298%40paquier.xyz)
> plenty of 1kB tuples are still being stored inline. As a result I'm
> averaging around 11.5 tuples per page, when it should be >200 (one tuple is
> 35 bytes when stored out of line). A small test query shows ~7000 tuples
> removed by index recheck, but based on my statistics only ~1500 would be
> expected (in the ideal case where tuple disk order matches insertion order).
>
> On the other hand, wouldn't the toast table still run into the disk
> fragmentation issue? Also, the 4-byte oid may actually become an issue a
> few months down the road.
>
> I wonder how hard it would be to hack up a table access method that is
> just a copy of heap where HEAP_INSERT_SKIP_FSM is always set...
>
> On Fri, Feb 12, 2021 at 5:49 PM Michael Lewis  wrote:
>
>> If you have no updates or deletes, then I would wonder about setting
>> fillfactor LOWER such that new rows are less likely to find a gap that is
>> acceptable. Depending how/when you use the json, lowering
>> toast_tuple_target may be appropriate to store (nearly?) all out of line
>> and making the data stored in the main relation be more uniform in size.
>> Are you seeing significant toast usage currently?
>>
>


Re: AW: ldap connection parameter lookup

2021-02-12 Thread Thomas Guyot
On 2021-01-22 10:22, Zwettler Markus (OIZ) wrote:
>>
>> On Fri, 2021-01-15 at 14:09 +, Zwettler Markus (OIZ) wrote:
>>> I want to use ldap to lookup the connection parameters:
>>> https://www.postgresql.org/docs/12/libpq-ldap.html
>>>
>>> or is there also some kind of generic variant like this (meaning lookup 
>>> connection
>>> parameters for the database name I tell you somehow):
>>>
>>> [${PGDATABASE}]
>>> ldap://ldap.mycompany.com/dc=mycompany,dc=com?description?one?(cn=${PGD
>>> ATABASE})
>>
>> I proposed something like that a while ago:
>> https://postgr.es/m/D960CB61B694CF459DCFB4B0128514C2F3442B%40exadv11
>> .host.magwien.gv.at
>> but it was rejected.
>>
>> Perhaps you could come up with a better version.
>>
> I'm afraid not. My proposal is essentially the same as yours.
> 
> Anyway. PostgreSQL needs some kind of generic central name resolution service.
> 
> It is not feasible to do static entries per database in a large environment 
> with hundreds of clients.
> 
> It's also not feasible to have ordinary endusers have to handle this static 
> entry by themselves.

Hi,

I think the error here is trying to pass parameters defined in the block
key to the ldap parameter. That's rather hackish, requires a proper
understanding of all possible implications, and its use is rather
limited (what if you need multiple LDAP parameters?).


What if we have something like:

[ldap]
ldap://ldap.mycompany.com/dc=mycompany,dc=com?description?one?(cn=${PGDATABASE})

Basically, allow any environment variable in the LDAP queries, both those
coming from the user/application and some standard ones like PGUSER,
PGHOST and PGDATABASE set from parameters passed on the command line/API
call (whichever takes precedence).

You then connect with the ldap service, pass any required option as an
environment variable or parameter (here the database name), and let
PostgreSQL interpolate it into the query (of course we also need to
properly escape the string in the LDAP URI...). Unset environment
variables would be considered an error.
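The substitution step itself is simple; as a sketch of the proposed behavior (not anything libpq supports today), with the escaping concern handled by percent-encoding each value:

```python
import urllib.parse

def interpolate(url_template: str, env: dict) -> str:
    """Substitute ${NAME} placeholders in an LDAP URL template,
    percent-encoding each value so it cannot break the URL/filter syntax."""
    out = url_template
    for name, value in env.items():
        out = out.replace("${%s}" % name, urllib.parse.quote(value, safe=""))
    return out

url = ("ldap://ldap.mycompany.com/dc=mycompany,dc=com"
       "?description?one?(cn=${PGDATABASE})")
print(interpolate(url, {"PGDATABASE": "mydb"}))
# -> ldap://ldap.mycompany.com/dc=mycompany,dc=com?description?one?(cn=mydb)
```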

The only possible caveat I see is that we might have to allow escaping
the $, even though I don't think there's any realistic possibility of it
being used in an LDAP URL (and can it be URL-encoded? in which case
it would be already solved...)


Regards,

--
Thomas