Re: AW: ldap connection parameter lookup

2021-02-12 Thread Thomas Guyot
On 2021-01-22 10:22, Zwettler Markus (OIZ) wrote:
>>
>> On Fri, 2021-01-15 at 14:09 +, Zwettler Markus (OIZ) wrote:
>>> I want to use ldap to lookup the connection parameters:
>>> https://www.postgresql.org/docs/12/libpq-ldap.html
>>>
>>> or is there also some kind of generic variant like this (meaning lookup 
>>> connection
>>> parameters for the database name I tell you somehow):
>>>
>>> [${PGDATABASE}]
>>> ldap://ldap.mycompany.com/dc=mycompany,dc=com?description?one?(cn=${PGDATABASE})
>>
>> I proposed something like that a while ago:
>> https://postgr.es/m/D960CB61B694CF459DCFB4B0128514C2F3442B%40exadv11.host.magwien.gv.at
>> but it was rejected.
>>
>> Perhaps you could come up with a better version.
>>
> I'm afraid not. My proposal would be much the same as yours.
> 
> Anyway. PostgreSQL needs some kind of generic central name resolution service.
> 
> It is not feasible to do static entries per database in a large environment 
> with hundreds of clients.
> 
> It's also not feasible to have ordinary endusers have to handle this static 
> entry by themselves.

Hi,

I think the error here is trying to pass parameters defined in the block
key through to the LDAP parameters. That's rather hackish, requires a
proper understanding of all possible implications, and its use is rather
limited (what if you need multiple LDAP parameters?)


What if we have something like:

[ldap]
ldap://ldap.mycompany.com/dc=mycompany,dc=com?description?one?(cn=${PGDATABASE})

Basically, allow any environment variable in the LDAP queries, including
both variables coming from the user/application and standard ones like
PGUSER, PGHOST and PGDATABASE set from parameters passed on the command
line/API call (whichever takes precedence).

You then connect with the LDAP service, pass any required option as an
environment variable or parameter (here the database name), and let
PostgreSQL interpolate it into the query (of course we also need to
properly escape the string for the LDAP URI...). Unset environment
variables would be considered an error.

The only possible caveat I see is that we might have to allow escaping
the $, even though I don't think there's any realistic need for a
literal $ in the LDAP URL (and can it be URL-encoded? In that case it
would already be solved...)
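The interpolation-plus-escaping step could look something like this (a hedged sketch; the exact escaping rules are an assumption, here percent-encoding the LDAP-filter special characters):

```shell
# Hypothetical sketch: interpolate PGDATABASE into an LDAP URL,
# percent-encoding the filter special characters \ ( ) * so a database
# name can't break out of the filter expression.
PGDATABASE='my(db)'
escaped=$(printf '%s' "$PGDATABASE" \
  | sed -e 's/\\/%5C/g' -e 's/(/%28/g' -e 's/)/%29/g' -e 's/\*/%2A/g')
url="ldap://ldap.mycompany.com/dc=mycompany,dc=com?description?one?(cn=${escaped})"
echo "$url"
```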


Regards,

--
Thomas




Re: How to post to this mailing list from a web based interface

2021-02-15 Thread Thomas Guyot
On 2021-02-14 22:44, Thomas Munro wrote:
> On Fri, Jan 29, 2021 at 4:27 AM Alvaro Herrera  
> wrote:
> 
> While catching up with some interesting new threads just now I was
> quite confused by the opening sentence of this message (which also
> arrived in my mailbox):
> 
> https://www.postgresql.org/message-id/1611355191319-0.post%40n3.nabble.com
> 
> ... until I got to the last line.  I wonder if the "Resend" facility
> on our own archives could be better advertised, via a "Want to join
> this thread?" link in the Quick Links section that explains how to use
> it and what problem it solves, or something...
> 

Hi,

I haven't read the whole thread, but here's my take on it as a new joiner:

Nabble didn't seem to work; I got a bounce, IIRC. The reply/quoting
format was quite uncommon too.

As part of the process I had to join the ML anyway, so I went to the
archives to find the thread I wanted to reply to.

The "download mbox" option doesn't work; I get asked for a user/password
every time (I would've downloaded archives for the last two months to
get the continuation of most threads).

The resend option does work, but you have to wait 30 seconds between
messages, and I like to have the entire thread as it makes it easier to
follow.

Regards,

--
Thomas





Re: How to post to this mailing list from a web based interface

2021-02-15 Thread Thomas Guyot
On 2021-02-15 12:30, Adrian Klaver wrote:
> On 2/15/21 9:24 AM, Thomas Guyot wrote:
> 
>> The "download mbox" option doesn't work, I get asked for a user/password
>> every time (I would've downloaded archives for the lats two months to
>> get continuation on most threads).
> 
> The user/password is in the message in the prompt.

Ha, thanks! Turning on developer mode, I could see it in the response
headers. It's a shame Chrome doesn't show it, but I guess there may be a
good reason, such as phishing sites trying to make the prompt look more
legit while impersonating an organization...

Possible suggestions, though: maybe allow cookie-based auth when logged
in, or show the user/password on the 401 error page.


I could get the archives this time, thanks!

--
Thomas




Re: PostgreSQL Replication

2021-02-17 Thread Thomas Guyot
On 2021-02-16 09:28, Raul Giucich wrote:
> This article will help you
> https://wiki.postgresql.org/wiki/Multimaster
> 
> On Tue, Feb 16, 2021, 10:56, Mutuku Ndeti wrote:
> 
> Hi,
> 
> Need some advice here. I have an application using PostgreSQL. I
> need to install it on 2 servers for redundancy purposes and have 2
> databases. I need the DBs to replicate to each other, in real-time.
> Writes can be done on both DBs. 
> 
> Please let me know if this is a feasible setup and the best way to
> proceed. 
> 

Hi,

While I have no experience with replication on pgsql, in general
multi-master database replication is much more complex and often requires
a pretty rigid setup. The graphs on that page seem to tell the same
story for pgsql.

Are you sure you really need multi-master replication as opposed to
having a single active master in a replicated set? If properly
configured, cluster software can automatically fail over the active
master, which provides very good redundancy and is much simpler from a
technological standpoint.

Regards,

--
Thomas




Re: PostgreSQL Replication

2021-02-17 Thread Thomas Guyot
On 2021-02-17 04:22, Mutuku Ndeti wrote:
> Thank you. 
> 
> I agree with you. Single master, with a standby replica, seems easier to
> manage. Is there a way to automatically promote the standby, when the
> active master fails? Is it feasible to have 2 instances of the
> application, writing onto the same DB, reason for two instances of the
> application is to allow for redundancy/load balancing. 
> 

Again, this is not PostgreSQL-specific: it is possible to have an
active/standby master node with shared storage (e.g. Fibre Channel,
iSCSI... the storage can be mounted on only one node at any given time,
and it's the cluster's responsibility to ensure the other node has
unmounted properly). This is ideal when dealing with fully redundant
storage arrays, and fairly simple to set up on the cluster side.
Attaching additional slaves to an active/standby cluster is also usually
simpler since there is just one server/replication log.

Another solution is to have a replication ring (e.g. A replicates to B,
B replicates to A). With such a setup it is very important that writes
only ever happen on a single node, and the easiest way to ensure that is
a floating IP that can be reached on only one node at any given time
(again, it's the cluster's responsibility to ensure the floating IP is
never active on both). This method may require a more complex
configuration to keep updates consistent and to keep additional slaves
synchronized to the cluster. Also, since both servers have a permanent
IP, it is important to ensure no updates are made directly to them. The
warm side can still be used for read-only access, though (with a very
small lag).
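As an illustration of the floating-IP idea, a hypothetical keepalived fragment (address, interface and priority are made up; any cluster manager with VIP support would do):

```
# Hypothetical keepalived.conf fragment: the virtual IP below exists on
# only one node at a time; all writes go to this address.
vrrp_instance PG_VIP {
    state BACKUP            # let priority/elections decide the active node
    interface eth0
    virtual_router_id 51
    priority 100            # give the preferred node a higher priority
    virtual_ipaddress {
        192.0.2.10/24       # the floating IP clients write to
    }
}
```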

In either case, there is likely some configuration needed. This page
seems to list a few projects that may make your cluster setup much easier:

https://wiki.postgresql.org/wiki/Replication,_Clustering,_and_Connection_Pooling

Regards,

--
Thomas




Re: Is Client connections via ca.crt only possible?

2022-08-03 Thread Thomas Guyot

On 2022-08-01 04:12, Rejo Oommen wrote:

Requirement is to use only ca.crt and connect to postgres

Server.crt, Server.key and ca.crt are configured at the postgres 
server for tls connection.


Connection successful while using
psql ‘host=172.29.21.222 dbname=test user=postgres sslmode=verify-ca 
sslcert=/tmp/server.crt sslkey=/tmp/server.key sslrootcert=/tmp/ca.crt 
port=5432’


For clients to connect, can they use only ca.crt and connect to the 
DB. Tried and got the below error


psql ‘host=172.29.21.222 dbname=test user=postgres sslmode=verify-ca 
sslrootcert=/tmp/ca.crt port=5432’
psql: error: connection to server at “172.29.21.222”, port 50001 
failed: FATAL:  connection requires a valid client certificate




Hi Rejo,

I don't think you fully understand how mutual TLS auth works. For the
client to authenticate using a certificate, it needs a valid certificate
and key too, where the certificate is signed by a CA your server trusts
(usually the same CA that signed your server cert) and has a proper
subject (one that bears the certificate owner's user name, the user you
will use to grant privileges in the database). You shouldn't even need
to pass a username; it will be in the certificate.


I'm talking purely from a generic viewpoint; I'm not familiar with the
specifics of PostgreSQL configuration, but TLS authentication requires a
secret, and a CA certificate isn't secret. Your server certificate
authenticates the server, but nothing authenticates the client.
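To make the first point concrete, a hedged sketch of issuing a client certificate whose CN carries the database user name (all names here are made up, and the CA is a throwaway self-signed one for illustration):

```shell
# Toy CA (in practice, use the CA the server already trusts):
openssl req -x509 -new -nodes -newkey rsa:2048 -days 1 \
  -keyout ca.key -out ca.crt -subj "/CN=toy-ca"
# Client key + CSR; the CN is the database user name ("appuser" is made up):
openssl req -new -nodes -newkey rsa:2048 \
  -keyout client.key -out client.csr -subj "/CN=appuser"
# Sign the client cert with the CA:
openssl x509 -req -in client.csr -CA ca.crt -CAkey ca.key \
  -CAcreateserial -out client.crt -days 1
openssl x509 -in client.crt -noout -subject
# The client then presents both cert and key, e.g.:
#   psql 'host=... dbname=test sslmode=verify-ca sslrootcert=ca.crt
#         sslcert=client.crt sslkey=client.key'
```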


Regards,

--
Thomas




Re: Is Client connections via ca.crt only possible?

2022-08-03 Thread Thomas Guyot

On 2022-08-03 21:37, Rejo Oommen wrote:
Thank you for the reply Thomas. I agree with you on the mutual TLS 
that you mentioned.


Here is what I was looking at.

The configurations at the server end will be with auth-method as md5 
and auth-option as clientcert=verify-ca.




There's your issue. If you tell the server to validate the client cert, 
then it will require the client to provide a valid cert to identify itself.
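For reference, a hedged sketch of what such a pg_hba.conf line looks like (the address and database/user columns are placeholders):

```
# hostssl entry combining password auth with a mandatory client cert;
# the client must present a cert signed by the server's ssl_ca_file CA
# *in addition to* the md5 password:
hostssl  all  all  0.0.0.0/0  md5  clientcert=verify-ca
```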


In this way, the user's password along with the valid ca should allow 
connections to pass.




The CA in your setup is only useful for the client to ensure the server
is the correct one and to prevent MITM attacks. This is a client-side
check, not a server-side one.


The only authentication security here is the password/md5, protected
from eavesdropping (passive and MITM) and connection hijacking by
encryption, with some of these protections only effective when the
client uses the verify-ca option. The server cannot ensure the client is
actually validating the CA, nor even that it's talking to the actual
client and not a MITM, simply because the client itself is not
authenticated by mutual TLS.


Regards

--
Thomas




Re: High CPU usage

2022-10-20 Thread Thomas Guyot

On 2022-10-20 15:59, ertan.kucuko...@1nar.com.tr wrote:

Hello,

I am using PostgreSQL v14.5 on Linux Debian 11.5. I recently observe very
high CPU usage on my Linux system as below

 PID USER  PR  NIVIRTRESSHR S  %CPU  %MEM TIME+
COMMAND
2357756 postgres  20   0 2441032   2,3g  4 S 298,7  67,9   2114:58
Tspjzj2Z

I could not find any file named Tspjzj2Z on the file system. I could not
find PID number using below SQL


Hi,

I'm not an expert in PostgreSQL, but that looks like a rogue app. If
you're lucky it's just a miner running as the postgres user, likely the
result of a successfully exploited postgres RCE... The more worrying
case would be a program exfiltrating and/or encrypting the database in a
ransomware attack.


The executable has most likely been removed to hide traces, or cleaned
up automatically from e.g. /tmp; however, if the process is still running
you should be able to copy the executable, and any other open files,
directly from /proc/<pid>/ (look for exe and fd/*).
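A sketch of that recovery step (the suspicious PID from the top output would go in place of $$; using our own shell here so the commands are runnable anywhere):

```shell
# /proc/<pid>/exe points at the running binary even if it was deleted
# from disk; fd/ lists every file the process still has open.
pid=$$   # substitute the suspicious PID, e.g. 2357756
ls -l "/proc/$pid/exe"                 # shows "(deleted)" if unlinked
cp "/proc/$pid/exe" /tmp/evidence.bin  # preserve the binary for analysis
ls -l "/proc/$pid/fd/"                 # other open files (data in flight)
```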


I strongly recommend you check any other postgres servers you have, make
a copy of any process file found (for later investigation), then isolate
or shut down these servers and proceed with a proper investigation from a
livecd or recovery OS.



There is no replication of any kind. This is a single instance server which
allows certificate login only.


Is it even reachable from the outside world? Otherwise, you should
audit any internal hosts that could have accessed your postgresql
server. If you have firewall logs, look for unusual connection attempts,
any evidence of scanning, etc.


Hackers will often spend quite some time once inside gathering as much
information as possible before doing any real damage, although if this
is indeed a miner, that kind of attack is less likely, as they would
probably not risk getting discovered for something that will at best
make them pennies...


If you see the process holding any open database files, it's possible
it's compressing them to exfiltrate the data, encrypting them, or both.


Hope this helps...

--
Thomas




Re: How do the Linux distributions create the Linux user/group "postgres"?

2023-01-09 Thread Thomas Guyot

On 2023-01-09 07:41, Matthias Apitz wrote:

Hello,

Please note: I'm talking about the user and group "postgres" in the
Linux OS and not in the PostgreSQL server.

We're compiling PostgreSQL from source (actually 14.1) and distribute it
to our customers. They're asked to set up the user and group "postgres"
before creating the cluster. As such things are nowadays often set up
centrally in bigger installations, e.g. via LDAP or AD, I'd like to know
how other installations on Linux deal with this.


Hi Matthias,

Users are generally created by the pre-install or post-install scripts
of the package. The specific way in which they are created may depend
on the packaging tool being used and on the packager that built the
distribution package.
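As a hedged illustration (flags vary between distros; this mimics a typical RPM %pre scriptlet, and the home directory and shell are assumptions), creating the account only if it doesn't exist yet:

```shell
# Create a "postgres" system group and user idempotently; home dir and
# shell here are typical Red Hat choices and may differ on your distro.
if [ "$(id -u)" -eq 0 ]; then
    getent group postgres >/dev/null || groupadd -r postgres
    getent passwd postgres >/dev/null || \
        useradd -r -g postgres -d /var/lib/pgsql -s /bin/bash \
                -c "PostgreSQL Server" postgres
else
    echo "need root to create users" >&2
fi
```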


Regards,

--
Thomas




Re: PostgreSQL vs MariaDB

2023-03-27 Thread Thomas Guyot

On 2023-03-24 07:07, Inzamam Shafiq wrote:

Hi Team,

Hope you are doing well.

Can someone please list pros and cons of MariaDB vs PostgreSQL that 
actually needs serious consideration while choosing the right database 
for large OLTP DBs (Terabytes)?





Hi Inzamam,

I'll give my take as well, but note I have much more experience with
MySQL/MariaDB, and mostly from 10 years ago (although I did use both in
the last decade too, mostly for hobby projects and a bit of PostgreSQL
at work, and I have both running on my workstation).



First of all unless you plan on licensing Oracle for MySQL support, you 
should probably go with MariaDB (which is what you seem to consider 
already). I've known and used MySQL before the MariaDB fork (and even 
before Sun's acquisition), and MariaDB is still heavily developed with 
open bug trackers and many 3rd party companies specializing in 
MySQL/MariaDB support.



Having a sysadmin background, I find MariaDB easier to understand
and administer as a server application. In the main engines, tables are
straight-up files on disk (for InnoDB, which is now the default engine,
a file-per-table option also makes this possible). There isn't really a
concept of tablespaces; OTOH you can just move some files and symlink
them (while the DB is down, of course) to get some tables onto bigger or
faster disks.
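The symlink trick above, demonstrated on dummy files (hedged: on a real server, stop MariaDB first and confirm your engine/version supports relocated table files; the paths and file names are made up):

```shell
# Simulate moving a per-table data file to a faster disk and symlinking
# it back into the data directory; reads still resolve through the link.
datadir=$(mktemp -d)
fastdisk=$(mktemp -d)
echo "table data" > "$datadir/big_table.ibd"
mv "$datadir/big_table.ibd" "$fastdisk/big_table.ibd"
ln -s "$fastdisk/big_table.ibd" "$datadir/big_table.ibd"
cat "$datadir/big_table.ibd"
```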


Recent versions of InnoDB (shortly after the MariaDB fork, at least) have
had a lot of scalability and instrumentation improvements (much of it
from Percona's XtraDB fork), and also let you further separate the
common data files, such as using separate files for the doublewrite
buffer and redo logs (write-only except during crash recovery; perfect
for spinning disks) from the other read/write data files (containing undo
logs and system tables amongst others, and table data when not using
file-per-table).


There are obviously the pluggable engines (it appears PostgreSQL is
implementing this too now); I'm less familiar with the latest
development of those and have mostly used InnoDB/XtraDB, but there are
quite a few very specialized engines too. One I find particularly
interesting is MyRocks, which is optimized for flash storage with
compression and can do high-performance bulk inserts from files.



OTOH, my experience with PostgreSQL is that it seems to have greater
support for some SQL features and concepts, or at least it used to. I'm
not sufficiently SQLiterate to give many specifics, but I remember
seeing a few examples in the past; one was sequences, which MariaDB
lacked and appears to have added only about 5 years ago (before that,
one could use auto_increment keys to get similar functionality).


From my perspective, PostgreSQL appears to be more similar to other
database engines when it comes to managing tablespaces, schemas, etc.;
that said, I have only limited experience with Oracle, Sybase, DB2
and MSSQL, and not really any with managing tablespaces/schemas.
Also, unlike MariaDB, PostgreSQL can roll back DDL too (in InnoDB, DDL
statements cause an implicit commit, and rollback is no longer possible
for the transaction executing them).


I feel there are also likely more edge cases you have to be aware of
for some specific operations with MariaDB (they're well documented too),
esp. with replication... but maybe that's just me knowing it better, and
it's mostly from 10-year-old experience (it tends to get better over
time, and I haven't worked on any replicated setup lately).



So, TL;DR if you're a real DBA with experience with other commercial DB 
engines, I think you will find yourself more at ease with PostgreSQL, 
and it will likely be easier to port statements from other engines.


Someone with a strong sysadmin background will likely be more
comfortable setting up and maintaining MariaDB, and some of its pluggable
engines may also be worth considering, but that really depends on the
type of load and hardware you will be using.


I know there's very good instrumentation for troubleshooting performance
issues with MariaDB/InnoDB, something I'm absolutely not familiar with
on the PostgreSQL side...


Regards,

Thomas




Re: Death postgres

2023-05-06 Thread Thomas Guyot

On 2023-05-05 21:14, Marc Millas wrote:

Hi,

postgres 14.2 on Linux redhat

temp_file_limit set around 210 GB.

a select request with 2 left joins has crashed the server (oom killer)
after the postgres disk occupation grew from 15 TB to 16 TB.


What are the cases where postgres may grow without respecting
temp_file_limit?


thanks,



Some OSes, like RHEL 9 IIRC, now default to tmpfs for /tmp. If your temp
files are written to a tmpfs, then they may very well trigger the OOM
killer because the temp files use up all the RAM.


Check the filesystem type of your temp files' location.
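A quick way to check (paths are examples; PostgreSQL's temp files normally live under the data directory in base/pgsql_tmp, or under temp_tablespaces if set):

```shell
# Print the filesystem type backing a directory; "tmpfs" means RAM-backed,
# so large temp files there compete directly with memory.
findmnt -no FSTYPE -T /tmp
df -hT /tmp
```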

--
Thomas