Queries for monitor

2025-04-08 Thread yudhi s
Hello All,
I am trying to configure basic monitoring for our Postgres database using
data dictionary views, as below. I want to understand whether these queries
are accurate or whether I am making any mistakes by querying this way. Also,
we want to do this only for the application-specific sessions/queries and
exclude the system-related sessions/queries, so how can that be done in the
same query?

https://gist.github.com/databasetech0073/5d8113eaba13ac62352f97521ce68a43
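
For illustration, a rough sketch of the kind of session filter in question,
against pg_stat_activity (the role name below is just a placeholder to adapt):

-- keep only client (application) sessions: drop background workers via
-- backend_type, and filter out system/maintenance roles (names are examples)
SELECT pid, usename, application_name, state, query_start, query
FROM pg_stat_activity
WHERE backend_type = 'client backend'
  AND usename NOT IN ('postgres')      -- hypothetical system role
  AND pid <> pg_backend_pid();         -- exclude this monitoring session itself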

Regards
Yudhi


Re: Wal file query

2025-04-08 Thread Atul Kumar
I got the error:

WAL control functions cannot be executed during recovery.

I need a solution that works when streaming replication is configured.

My Postgres version is 13.20.
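
For reference, a minimal query that does work during recovery on 13.x (these
are recovery information functions, not WAL control functions):

select pg_is_in_recovery(), pg_last_wal_receive_lsn(), pg_last_wal_replay_lsn();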


Regards.

On Tue, 8 Apr 2025, 17:32 Kashif Zeeshan,  wrote:

> Hi Atul
>
> Start by looking at the current WAL LSN and insert LSN. The
> pg_current_wal_lsn is the location of the last write. The
> pg_current_wal_insert_lsn is the logical location and reflects data in
> the buffer that has not been written to disk. There is also a flush value
> that shows what has been written to durable storage.
>
> [postgres] # select pg_current_wal_lsn(), pg_current_wal_insert_lsn();
>  pg_current_wal_lsn | pg_current_wal_insert_lsn
> +---
>  76/7D00| 76/7D28
> (1 row)
>
> Although you can guess the name of the WAL file based on the above output, it 
> is best to use the pg_walfile_name function.
>
>
> [postgres] # select pg_walfile_name('76/7D28');
>  pg_walfile_name
> --
>  00010076007D
> (1 row)
>
>
> For details, visit the following link:
> https://www.crunchydata.com/blog/postgres-wal-files-and-sequuence-numbers
>
>
> Thanks
>
> Kashif Zeeshan
>
>
>
> On Tue, Apr 8, 2025 at 4:44 PM Atul Kumar  wrote:
>
>> In streaming replication, what is the way to check which WAL file is
>> currently in use on the primary and on the standby?
>>
>>
>> Regards.
>>
>


Re: Wal file query

2025-04-08 Thread Ron Johnson
You cannot connect to the Primary while connected to the Replica, except
via postgres_fdw.  Even then, it might not work, since the replica
replicates the postgres_fdw definitions (foreign servers and user mappings).

These exist on the primary:

pg_current_wal_lsn()
pg_replication_slots
pg_stat_replication

These exist on the replica:

pg_last_wal_receive_lsn()
pg_last_wal_replay_lsn()
pg_stat_wal_receiver
pg_get_wal_replay_pause_state()
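
Putting those together, a rough sketch (not a definitive recipe) of how to get
the current WAL file name on each side: pg_walfile_name() itself cannot be
executed during recovery, so the standby's LSN has to be converted on the
primary.

-- on the primary
SELECT pg_walfile_name(pg_current_wal_lsn());

-- on the standby (these do run during recovery)
SELECT pg_last_wal_receive_lsn(), pg_last_wal_replay_lsn();

-- back on the primary, convert the standby's LSN to a file name
-- (the LSN literal below is a hypothetical value from the previous step)
SELECT pg_walfile_name('76/7D000028');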


On Tue, Apr 8, 2025 at 9:09 AM Atul Kumar  wrote:

> I got the error:
>
> WAL control functions cannot be executed during recovery.
>
> I need a solution that works when streaming replication is configured.
>
> My Postgres version is 13.20.
>
>
> Regards.
>
> On Tue, 8 Apr 2025, 17:32 Kashif Zeeshan,  wrote:
>
>> Hi Atul
>>
>> Start by looking at the current WAL LSN and insert LSN. The
>> pg_current_wal_lsn is the location of the last write. The
>> pg_current_wal_insert_lsn is the logical location and reflects data in
>> the buffer that has not been written to disk. There is also a flush value
>> that shows what has been written to durable storage.
>>
>> [postgres] # select pg_current_wal_lsn(), pg_current_wal_insert_lsn();
>>  pg_current_wal_lsn | pg_current_wal_insert_lsn
>> +---
>>  76/7D00| 76/7D28
>> (1 row)
>>
>> Although you can guess the name of the WAL file based on the above output, 
>> it is best to use the pg_walfile_name function.
>>
>>
>> [postgres] # select pg_walfile_name('76/7D28');
>>  pg_walfile_name
>> --
>>  00010076007D
>> (1 row)
>>
>>
>> For details, visit the following link:
>> https://www.crunchydata.com/blog/postgres-wal-files-and-sequuence-numbers
>>
>>
>> Thanks
>>
>> Kashif Zeeshan
>>
>>
>>
>> On Tue, Apr 8, 2025 at 4:44 PM Atul Kumar  wrote:
>>
>>> In streaming replication, what is the way to check which WAL file is
>>> currently in use on the primary and on the standby?
>>>
>>>
>>> Regards.
>>>
>>



Re: timescaledb vs NULL vs pg_timeseries vs partman + pgcron + pg_ivm

2025-04-08 Thread Amitabh Kant
On Tue, Apr 8, 2025 at 9:40 PM Achilleas Mantzios - cloud <
a.mantz...@cloud.gatewaynet.com> wrote:

> Hi
> 
> timescaledb seemed mature, but also exotic, allow me the term. No way to
> use native logical replication, shortage of options to run on premise or
> self hosted, which leaves us with those options :
>
> 
>
>
I cannot comment on the applicability of timescaledb in your context, but
running it on premise/self-hosted has posed no problems, at least on
Debian.

If I understood your query incorrectly, please ignore.

Amitabh


Re: timescaledb vs NULL vs pg_timeseries vs partman + pgcron + pg_ivm

2025-04-08 Thread Achilleas Mantzios


On 8/4/25 20:37, Amitabh Kant wrote:
On Tue, Apr 8, 2025 at 9:40 PM Achilleas Mantzios - cloud 
 wrote:


Hi

timescaledb seemed mature, but also exotic, allow me the term. No way to
use native logical replication, shortage of options to run on premise or
self hosted, which leaves us with those options :



I cannot comment on the applicability of timescaledb in your context, 
but running it on premise/self-hosted has posed no problems, at least 
on Debian.


If I understood your query incorrectly, please ignore.
Thank you, I meant the paid/supported service, not the community version.
Which of the two do you use?


Amitabh

Wal file query

2025-04-08 Thread Atul Kumar
In streaming replication, what is the way to check which WAL file is
currently in use on the primary and on the standby?


Regards.


Re: Kubernetes, cgroups v2 and OOM killer - how to avoid?

2025-04-08 Thread Joe Conway

On 4/8/25 13:58, Ancoron Luciferis wrote:

On 2025-04-07 15:21, Joe Conway wrote:

On 4/5/25 07:53, Ancoron Luciferis wrote:

I've been investigating this topic every now and then but to this day
have not come to a setup that consistently leads to a PostgreSQL backend
process receiving an allocation error instead of being killed externally
by the OOM killer.

Why is this a problem for me? Because while applications are accessing
their DBs (multiple services having their own DBs, some high-frequency),
the whole server goes into recovery and kills all backends/connections.

While my applications are written to tolerate that, it also means that
at that time, esp. for the high-frequency apps, events are piling up,
which then leads to a burst as soon as connectivity is restored. This in
turn leads to peaks in resource usage in other places (event store,
in-memory buffers from apps, ...), which sometimes leads to a series of
OOM killer events being triggered, just because some analytics query
went overboard.

Ideally, I'd find a configuration that only terminates one backend but
leaves the others working.

I am wondering whether there is any way to receive a real ENOMEM inside
a cgroup as soon as I try to allocate beyond its memory.max, instead of
relying on the OOM killer.

I know the recommendation is to have vm.overcommit_memory set to 2, but
then that affects all workloads on the host, including critical infra
like the kubelet, CNI, CSI, monitoring, ...

I have already gone through and tested the obvious:

https://www.postgresql.org/docs/current/kernel-resources.html#LINUX- 
MEMORY-OVERCOMMIT


Importantly vm.overcommit_memory set to 2 only matters when memory is 
constrained at the host level.


As soon as you are running in a cgroup with a hard memory limit, 
vm.overcommit_memory is irrelevant.


You can have terabytes of free memory on the host, but if cgroup memory 
usage exceeds memory.limit (cgv1) or memory.max (cgv2) the OOM killer 
will pick the process in the cgroup with the highest oom_score and whack 
it.


Unfortunately there is no equivalent to vm.overcommit_memory within the 
cgroup.



And yes, I know that Linux cgroups v2 memory.max is not an actual hard
limit:

https://www.kernel.org/doc/html/latest/admin-guide/cgroup- 
v2.html#memory-interface-files


Read that again -- memory.max *is* a hard limit (same as memory.limit in 
cgv1).


   "memory.max

     A read-write single value file which exists on non-root cgroups. The
     default is “max”.

     Memory usage hard limit. This is the main mechanism to limit memory
     usage of a cgroup. If a cgroup’s memory usage reaches this limit and
     can’t be reduced, the OOM killer is invoked in the cgroup."


Yes, I know it says "hard limit", but then any app still can go beyond
(might just be on me here to assume any "hard limit" to imply an actual
error when trying to go beyond). The OOM killer then will kick in
eventually, but not in any way that any process inside the cgroup could
prevent. So there is no signal that the app could react to saying "hey,
you just went beyond what you're allowed, please adjust before I kill you".


No, that really is a hard limit and the OOM killer is *really* fast. 
Once that is hit there is no time to intervene. The soft limit 
(memory.high) is the one you want for that.


Or you can monitor PSI and try to anticipate problems, but that is 
difficult at best. If you want to see how that is done, check out 
senpai: https://github.com/facebookincubator/senpai/blob/main/README.md



If you want a soft limit use memory.high.

   "memory.high

     A read-write single value file which exists on non-root cgroups. The
     default is “max”.

     Memory usage throttle limit. If a cgroup’s usage goes over the high
     boundary, the processes of the cgroup are throttled and put under
     heavy reclaim pressure.

     Going over the high limit never invokes the OOM killer and under
     extreme conditions the limit may be breached. The high limit should
     be used in scenarios where an external process monitors the limited
     cgroup to alleviate heavy reclaim pressure.

You want to be using memory.high rather than memory.max.


Hm, so solely relying on reclaim? I think that'll just get the whole
cgroup into ultra-slow mode and would not actually prevent too much
memory allocation. While this may work out just fine for the PostgreSQL
instance, it'll for sure have effects on the other workloads on the same
node (which I apparently do have: more PG instances).

Apparently, I also don't see a way to even try this out in a Kubernetes
environment, since there doesn't seem to be a way to set this field
through any workload manifest field.


Yeah, that part I have no idea about. I quit looking at kubernetes 
related things about 3 years ago. Although, this link seems to indicate 
there is a way related to how it does QoS: 
https://kubernetes.io/blog/2023/05/05/qos-memory-resources/#:~:text=memory.high%20formula


Also, I d

Postgres_fdw- User Mapping with md5-hashed password

2025-04-08 Thread Dirschel, Steve
I know I can create user steve_test with password testpassword122 as md5 by 
doing:

select 'md5'||md5('testpassword122steve_test');  Returns --> 
md5eb7e220574bf85096ee99370ad67cbd3

CREATE USER steve_test WITH  PASSWORD 'md5eb7e220574bf85096ee99370ad67cbd3';

And then I can login as steve_test with password testpassword122.

I'm trying to use similar logic when creating a user mapping:

CREATE USER MAPPING FOR postgres SERVER steve_snap0 OPTIONS (user 'steve_test', 
password 'md5eb7e220574bf85096ee99370ad67cbd3');

When I try and import a foreign schema I get an error:

ERROR:  could not connect to server "steve_snap0"

If I create the user mapping with the password:

CREATE USER MAPPING FOR postgres SERVER steve_snap0 OPTIONS (user 'steve_test', 
password 'testpassword122');

It works fine.

Is it not possible to use the same logic for the user mapping password that can 
be used when creating a user?
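
For what it's worth, a quick way to double-check what the mapping actually
stored is the pg_user_mappings view (the options column is only shown in full
to sufficiently privileged roles):

SELECT srvname, usename, umoptions
FROM pg_user_mappings
WHERE srvname = 'steve_snap0';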

Thanks in advance.


Re: timescaledb vs NULL vs pg_timeseries vs partman + pgcron + pg_ivm

2025-04-08 Thread Achilleas Mantzios - cloud


On 4/9/25 04:50, Amitabh Kant wrote:


Thank you, I meant the paid/supported service not the community
version. Which of the two do you use?



I use the community version.



On 4/9/25 05:23, Brent Wood wrote:

I also use the free community edition on internal servers, but under 
Ubuntu. No issues and very good performance.


Brent Wood...



Thanks Amitabh and Brent, how do you plan to cope with future upgrades
based on logical replication? Do you run timescale on a
dedicated/separate system from the rest of your PostgreSQL cluster(s)?

Re: PgBackRest fails due to filesystem full

2025-04-08 Thread Greg Sabino Mullane
On Mon, Apr 7, 2025 at 5:32 AM KK CHN  wrote:

> ERROR: [082]: WAL segment 000101EB004B was not archived
> before the 6ms timeout
>

This is the part you need to focus on. Look at your Postgres logs and find
out why the archiver is failing. You can also test this without trying a
whole backup by using the "check" command:
https://pgbackrest.org/command.html#command-check
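
Alongside the logs, the archiver statistics view gives a quick read on whether
archive_command is failing and on which segment:

SELECT archived_count, last_archived_wal, last_archived_time,
       failed_count, last_failed_wal, last_failed_time
FROM pg_stat_archiver;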

Cheers,
Greg

--
Crunchy Data - https://www.crunchydata.com
Enterprise Postgres Software Products & Tech Support


timescaledb vs NULL vs pg_timeseries vs partman + pgcron + pg_ivm

2025-04-08 Thread Achilleas Mantzios - cloud

Hi
in continuation of "Ideas about presenting data coming from sensors"

https://www.postgresql.org/message-id/flat/8d2dd92a-da16-435b-a38e-fe72191fc9d1%40cloud.gatewaynet.com

we got the system working with plain single tables (3 kinds of them),
since no time-series solution seemed to fit 100% of the requirements at
the time, or simply because I didn't have the time to evaluate all the
existing options.


Fast forward to today: in a few months we have accumulated almost 63M rows,
and this will increase exponentially as new vessels are configured to send
their sensors' data.


After an initial look at timescaledb, I installed pg_timeseries today to
give it a try.


pg_timeseries does not seem actively maintained, and its "columnar"
requirement seems to be stuck because citus has not been updated to
PostgreSQL 17. That is a stopper.


timescaledb seemed mature, but also exotic, allow me the term: no way to
use native logical replication, and a shortage of options to run it
on-premise or self-hosted, which leaves us with these options:


a) stick with timescaledb in their cloud offering and try to bridge the 
two systems (ours and the new timescaledb instance)


b) convert to native partitioning and just try to manage via partman, 
forgetting for the moment incremental views and columnar store, or maybe 
try to introduce some functionality from pg_ivm + pgcron
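
For option (b), a minimal sketch of the plain native-partitioning part (table
and column names are hypothetical; in practice partman and/or pg_cron would
create future partitions ahead of time rather than by hand):

CREATE TABLE sensor_data (
    vessel_id  int          NOT NULL,
    sensor_id  int          NOT NULL,
    ts         timestamptz  NOT NULL,
    value      double precision
) PARTITION BY RANGE (ts);

-- one monthly partition; creation of the next partitions would be automated
CREATE TABLE sensor_data_2025_04
    PARTITION OF sensor_data
    FOR VALUES FROM ('2025-04-01') TO ('2025-05-01');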


So the question: are those our only options? Google says so, but is
this really the case?


thank you.






Re: Wal file query

2025-04-08 Thread Kashif Zeeshan
Hi Atul

Start by looking at the current WAL LSN and insert LSN. The
pg_current_wal_lsn is the location of the last write. The
pg_current_wal_insert_lsn is the logical location and reflects data in the
buffer that has not been written to disk. There is also a flush value that
shows what has been written to durable storage.

[postgres] # select pg_current_wal_lsn(), pg_current_wal_insert_lsn();
 pg_current_wal_lsn | pg_current_wal_insert_lsn
+---
 76/7D00| 76/7D28
(1 row)
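
The flush location mentioned above can be read with pg_current_wal_flush_lsn()
(available under this name since PostgreSQL 10), e.g.:

select pg_current_wal_flush_lsn();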

Although you can guess the name of the WAL file based on the above
output, it is best to use the pg_walfile_name function.


[postgres] # select pg_walfile_name('76/7D28');
 pg_walfile_name
--
 00010076007D
(1 row)


For details, visit the following link:
https://www.crunchydata.com/blog/postgres-wal-files-and-sequuence-numbers


Thanks

Kashif Zeeshan



On Tue, Apr 8, 2025 at 4:44 PM Atul Kumar  wrote:

> In streaming replication, what is the way to check which WAL file is
> currently in use on the primary and on the standby?
>
>
> Regards.
>


Re: Kubernetes, cgroups v2 and OOM killer - how to avoid?

2025-04-08 Thread Ancoron Luciferis

On 2025-04-07 15:21, Joe Conway wrote:

On 4/5/25 07:53, Ancoron Luciferis wrote:

I've been investigating this topic every now and then but to this day
have not come to a setup that consistently leads to a PostgreSQL backend
process receiving an allocation error instead of being killed externally
by the OOM killer.

Why is this a problem for me? Because while applications are accessing
their DBs (multiple services having their own DBs, some high-frequency),
the whole server goes into recovery and kills all backends/connections.

While my applications are written to tolerate that, it also means that
at that time, esp. for the high-frequency apps, events are piling up,
which then leads to a burst as soon as connectivity is restored. This in
turn leads to peaks in resource usage in other places (event store,
in-memory buffers from apps, ...), which sometimes leads to a series of
OOM killer events being triggered, just because some analytics query
went overboard.

Ideally, I'd find a configuration that only terminates one backend but
leaves the others working.

I am wondering whether there is any way to receive a real ENOMEM inside
a cgroup as soon as I try to allocate beyond its memory.max, instead of
relying on the OOM killer.

I know the recommendation is to have vm.overcommit_memory set to 2, but
then that affects all workloads on the host, including critical infra
like the kubelet, CNI, CSI, monitoring, ...

I have already gone through and tested the obvious:

https://www.postgresql.org/docs/current/kernel-resources.html#LINUX- 
MEMORY-OVERCOMMIT


Importantly vm.overcommit_memory set to 2 only matters when memory is 
constrained at the host level.


As soon as you are running in a cgroup with a hard memory limit, 
vm.overcommit_memory is irrelevant.


You can have terabytes of free memory on the host, but if cgroup memory 
usage exceeds memory.limit (cgv1) or memory.max (cgv2) the OOM killer 
will pick the process in the cgroup with the highest oom_score and whack 
it.


Unfortunately there is no equivalent to vm.overcommit_memory within the 
cgroup.



And yes, I know that Linux cgroups v2 memory.max is not an actual hard
limit:

https://www.kernel.org/doc/html/latest/admin-guide/cgroup- 
v2.html#memory-interface-files


Read that again -- memory.max *is* a hard limit (same as memory.limit in 
cgv1).


   "memory.max

     A read-write single value file which exists on non-root cgroups. The
     default is “max”.

     Memory usage hard limit. This is the main mechanism to limit memory
     usage of a cgroup. If a cgroup’s memory usage reaches this limit and
     can’t be reduced, the OOM killer is invoked in the cgroup."


Yes, I know it says "hard limit", but then any app still can go beyond 
(might just be on me here to assume any "hard limit" to imply an actual 
error when trying to go beyond). The OOM killer then will kick in 
eventually, but not in any way that any process inside the cgroup could 
prevent. So there is no signal that the app could react to saying "hey, 
you just went beyond what you're allowed, please adjust before I kill you".





If you want a soft limit use memory.high.

   "memory.high

     A read-write single value file which exists on non-root cgroups. The
     default is “max”.

     Memory usage throttle limit. If a cgroup’s usage goes over the high
     boundary, the processes of the cgroup are throttled and put under
     heavy reclaim pressure.

     Going over the high limit never invokes the OOM killer and under
     extreme conditions the limit may be breached. The high limit should
     be used in scenarios where an external process monitors the limited
     cgroup to alleviate heavy reclaim pressure.

You want to be using memory.high rather than memory.max.


Hm, so solely relying on reclaim? I think that'll just get the whole
cgroup into ultra-slow mode and would not actually prevent too much
memory allocation. While this may work out just fine for the PostgreSQL
instance, it'll for sure have effects on the other workloads on the same
node (which I apparently do have: more PG instances).


Apparently, I also don't see a way to even try this out in a Kubernetes
environment, since there doesn't seem to be a way to set this field
through any workload manifest field.




Also, I don't know what kubernetes recommends these days, but it used to 
require you to disable swap. In more recent versions of kubernetes you 
are able to run with swap enabled but I have no idea what the default is 
-- make sure you run with swap enabled.


Yes, this is what I wanna try out next.



The combination of some swap being available, and the throttling under 
heavy reclaim will likely mitigate your problems.




Thank you for your insights, I have something to think about.

Cheers,

Ancoron






Re: timescaledb vs NULL vs pg_timeseries vs partman + pgcron + pg_ivm

2025-04-08 Thread Amitabh Kant
On Tue, Apr 8, 2025 at 11:29 PM Achilleas Mantzios <
a.mantz...@cloud.gatewaynet.com> wrote:

>
> On 8/4/25 20:37, Amitabh Kant wrote:
>
> On Tue, Apr 8, 2025 at 9:40 PM Achilleas Mantzios - cloud <
> a.mantz...@cloud.gatewaynet.com> wrote:
>
>> Hi
>> 
>> timescaledb seemed mature, but also exotic, allow me the term. No way to
>> use native logical replication, shortage of options to run on premise or
>> self hosted, which leaves us with those options :
>>
>> 
>>
>>
> I cannot comment on the applicability of timescaledb in your context, but
> running it on premise/self-hosted has posed no problems, at least on
> Debian.
>
> If I understood your query incorrectly, please ignore.
>
> Thank you, I meant the paid/supported service not the community version.
> Which of the two do you use?
>
>
>
I use the community version.


Re: Kubernetes, cgroups v2 and OOM killer - how to avoid?

2025-04-08 Thread Joe Conway

On 4/5/25 07:53, Ancoron Luciferis wrote:

I've been investigating this topic every now and then but to this day
have not come to a setup that consistently leads to a PostgreSQL backend
process receiving an allocation error instead of being killed externally
by the OOM killer.

Why is this a problem for me? Because while applications are accessing
their DBs (multiple services having their own DBs, some high-frequency),
the whole server goes into recovery and kills all backends/connections.

While my applications are written to tolerate that, it also means that
at that time, esp. for the high-frequency apps, events are piling up,
which then leads to a burst as soon as connectivity is restored. This in
turn leads to peaks in resource usage in other places (event store,
in-memory buffers from apps, ...), which sometimes leads to a series of
OOM killer events being triggered, just because some analytics query
went overboard.

Ideally, I'd find a configuration that only terminates one backend but
leaves the others working.

I am wondering whether there is any way to receive a real ENOMEM inside
a cgroup as soon as I try to allocate beyond its memory.max, instead of
relying on the OOM killer.

I know the recommendation is to have vm.overcommit_memory set to 2, but
then that affects all workloads on the host, including critical infra
like the kubelet, CNI, CSI, monitoring, ...

I have already gone through and tested the obvious:

https://www.postgresql.org/docs/current/kernel-resources.html#LINUX-MEMORY-OVERCOMMIT


Importantly vm.overcommit_memory set to 2 only matters when memory is 
constrained at the host level.


As soon as you are running in a cgroup with a hard memory limit, 
vm.overcommit_memory is irrelevant.


You can have terabytes of free memory on the host, but if cgroup memory 
usage exceeds memory.limit (cgv1) or memory.max (cgv2) the OOM killer 
will pick the process in the cgroup with the highest oom_score and whack it.


Unfortunately there is no equivalent to vm.overcommit_memory within the 
cgroup.



And yes, I know that Linux cgroups v2 memory.max is not an actual hard
limit:

https://www.kernel.org/doc/html/latest/admin-guide/cgroup-v2.html#memory-interface-files


Read that again -- memory.max *is* a hard limit (same as memory.limit in 
cgv1).


  "memory.max

A read-write single value file which exists on non-root cgroups. The
default is “max”.

Memory usage hard limit. This is the main mechanism to limit memory
usage of a cgroup. If a cgroup’s memory usage reaches this limit and
can’t be reduced, the OOM killer is invoked in the cgroup."


If you want a soft limit use memory.high.

  "memory.high

A read-write single value file which exists on non-root cgroups. The
default is “max”.

Memory usage throttle limit. If a cgroup’s usage goes over the high
boundary, the processes of the cgroup are throttled and put under
heavy reclaim pressure.

Going over the high limit never invokes the OOM killer and under
extreme conditions the limit may be breached. The high limit should
be used in scenarios where an external process monitors the limited
cgroup to alleviate heavy reclaim pressure.

You want to be using memory.high rather than memory.max.

Also, I don't know what kubernetes recommends these days, but it used to 
require you to disable swap. In more recent versions of kubernetes you 
are able to run with swap enabled but I have no idea what the default is 
-- make sure you run with swap enabled.


The combination of some swap being available, and the throttling under 
heavy reclaim will likely mitigate your problems.


--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com