Re: Segmentation fault on startup

2021-02-13 Thread Helmut Bender

Hi,

a little follow-up to this case...

Since Redis wasn't working correctly either, I looked around for a solution
for that as well.


It seems that the Alpine 3.13 image for armv7 is broken at the moment...
see the answer here:

https://stackoverflow.com/questions/66091978/corrupt-date-with-redis6-alpine-on-raspi

And since I used the :10-alpine image for postgres, it apparently was 
also affected by this bug.


So it wasn't the reboot that crashed Postgres, but Alpine.

Am 02.02.21 um 19:22 schrieb Helmut Bender:

Today I had to restart the RasPi. Now the container fails to start with
a segmentation violation.


Not good --- sounds like you have data corruption.  After an OS crash
this is something that's quite possible if you haven't taken the time
to qualify the storage subsystem's honoring of fsync.


Well, it was a regular reboot... I don't know what happened.

Thank you for your tips, it's running again. :-D



--
Regards, Helmut




[LDAPS] Test connection user with ldaps server

2021-02-13 Thread João Gaspar
Hi all,

I have a PostgreSQL 13.1 server (RHEL 8.3) and I want to configure
pg_hba.conf to authenticate against a remote LDAPS server.

My steps:

I created a PostgreSQL user user1 with the superuser role to test the LDAPS
authentication method from a terminal client.

Modified pg_hba.conf to:

host    all    all    0.0.0.0/0    ldap
    ldapurl="ldaps://serverurl:636/DC=company,DC=example,DC=com?sAMAccountName?sub"
    ldapbinddn="user-to-do-autentication-ldap-connection"
    ldapbindpasswd="user-ldap-connection password-autentication"

Saved it and restarted the PostgreSQL service.

Tried to connect from the terminal client with psql -h
postgresqlremoteserverhost -U user1; after entering the password it gives the
following error:
psql: FATAL:  LDAP authentication failed for user "user1"

I validated the LDAP user1 with ldapsearch (on the RHEL host), and user1
appears correctly in the ldapsearch output using the same ldapurl, ldapbinddn
and ldapbindpasswd.

Checking the remote PostgreSQL logs, the connection to the remote LDAPS server
authenticates correctly but can't search by the attribute sAMAccountName.
Here is the PostgreSQL log:
could not search LDAP for filter "(sAMAccountName=user1)" on server
"serverurl": Operations error 2021-02-13 10:02:54.679 WET [1127801] DETAIL:
 LDAP diagnostics: 04DC: LdapErr: DSID-0C0907E9, comment: To perform
this operation a successful bind must be completed on the connection., data
0, v2580

Info: user1 was also created on the LDAPS server with sAMAccountName
user1.

It seems that the problem is in how the search is specified in pg_hba.conf.
Has anyone had a similar problem or found a resolution?
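
For comparison, here is the same search+bind entry written with the separate
LDAP options instead of ldapurl (only a sketch: the server, base DN and bind
account are placeholders, and with Active Directory the ldapbinddn usually has
to be a full DN or a user@domain UPN for the initial bind to be accepted):

    host    all    all    0.0.0.0/0    ldap
        ldapserver=serverurl ldapscheme=ldaps ldapport=636
        ldapbasedn="DC=company,DC=example,DC=com"
        ldapsearchattribute=sAMAccountName
        ldapbinddn="CN=Bind User,OU=Service Accounts,DC=company,DC=example,DC=com"
        ldapbindpasswd="bind-password"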

Thank you for your time.

Best regards,
João Gaspar


Insert into on conflict, data size upto 3 billion records

2021-02-13 Thread Karthik Kumar Kondamudi
Hi,

I'm looking for suggestions on how I can improve the performance of the merge
statement below. We have a batch process that loads data into _batch tables in
Postgres, and the task is to update the main target tables if a record already
exists, or insert it otherwise. These batch tables can sometimes grow to
5 billion records. Here is the current scenario:

target_table_main has 700,070,247 records and is hash partitioned into 50
chunks; it has an index on logical_ts. The batch table has 2,715,020,546
records, close to 3 billion, so I'm dealing with a huge set of data and am
looking to do this in the most efficient way.
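
For reference, a minimal sketch of the kind of upsert this describes (table and
column names here are hypothetical, and it assumes a unique constraint on the
conflict key that is compatible with the hash partitioning):

    INSERT INTO target_table_main AS t (id, logical_ts, payload)
    SELECT id, logical_ts, payload
    FROM target_table_batch
    ON CONFLICT (id) DO UPDATE
        SET logical_ts = EXCLUDED.logical_ts,
            payload    = EXCLUDED.payload;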

Thank you


Why is Postgres only using 8 cores for partitioned count?

2021-02-13 Thread Seamus Abshere
hi,

How can I convince Postgres to use more than 8 cores?

I've got an r6gd.16xlarge with 64 vCPUs and 512 GB RAM running Postgres 13.1 on
Ubuntu 20.04.

CREATE TABLE tbl (
  [...]
) PARTITION BY HASH (address_key);

It has 64 partitions - e.g. FOR VALUES WITH (modulus 64, remainder 0) etc.

We're running `SELECT COUNT(*) FROM tbl`.

I've watched top and I never see more than 8 cores going 100%.

Here is my (admittedly ridiculous) postgresql.conf:

checkpoint_completion_target = 0.9
data_directory='/tank/postgresql/13/main'
default_statistics_target = 100
effective_cache_size = 381696MB
effective_io_concurrency = 200
enable_partition_pruning=on
enable_partitionwise_aggregate=on
enable_partitionwise_join=on
listen_addresses='*'
maintenance_work_mem = 2GB
max_connections = 200
max_parallel_maintenance_workers = 4
max_parallel_workers = 512
max_parallel_workers_per_gather = 512
max_wal_size = 4GB
max_worker_processes = 512
min_wal_size = 1GB
random_page_cost = 1.1
shared_buffers = 127232MB
shared_preload_libraries = 'cstore_fdw'
synchronous_commit=off
wal_buffers = 16MB
work_mem = 1628560kB
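
For what it's worth, a quick way to compare what the planner asks for with what
actually runs (just a sketch, using the same tbl as above):

    EXPLAIN (ANALYZE, VERBOSE) SELECT COUNT(*) FROM tbl;
    -- check "Workers Planned" vs "Workers Launched" on the Gather node,
    -- and the per-worker rows under the Parallel Append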

Best,
Seamus


--
Seamus Abshere, SCEA
https://faraday.ai
https://github.com/seamusabshere
https://linkedin.com/in/seamusabshere




certs in connection string

2021-02-13 Thread Rob Sargent
I'm confused, as usual, about using a cert in a connection string.  I wish to
connect from a “middleware” piece to PG on behalf of various clients.
Does each client need a corresponding cert/key, or is the certificate intended
to say the sending machine is who it says it is (thereby needing only one cert)?
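
For reference, a minimal sketch of a libpq connection string that carries a
client certificate (host, database and file names are placeholders):

    postgresql://appuser@dbhost:5432/appdb?sslmode=verify-full&sslrootcert=root.crt&sslcert=client.crt&sslkey=client.key

In this form the certificate identifies the connecting libpq client, i.e. the
middleware process holding the private key, rather than each end user it acts
on behalf of.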



Re: Insert into on conflict, data size upto 3 billion records

2021-02-13 Thread Ron

On 2/12/21 12:46 PM, Karthik Kumar Kondamudi wrote:

Hi,

I'm looking for suggestions on how I can improve the performance of the merge
statement below. We have a batch process that loads data into _batch tables in
Postgres, and the task is to update the main target tables if a record already
exists, or insert it otherwise. These batch tables can sometimes grow to
5 billion records. Here is the current scenario:

target_table_main has 700,070,247 records and is hash partitioned into 50
chunks; it has an index on logical_ts. The batch table has 2,715,020,546
records, close to 3 billion, so I'm dealing with a huge set of data and am
looking to do this in the most efficient way.


Many times, I have drastically sped up batch processing by #1 partitioning 
on the same field as an index, and #2 pre-sorting the input data by that field.


That way, you get excellent "locality of data" (meaning lots of writes to 
the same hot bits of cache, which later get asynchronously flushed to 
disk).  Unfortunately for your situation, the purpose of hash partitioning 
is to /reduce/ locality of data.  (Sometimes that's useful, but *not* when 
processing batches.)
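
As a sketch of point #2 (hypothetical names, assuming logical_ts is the
indexed/merge key): one simple way to pre-sort is to rewrite the batch table in
key order before the merge pass, and then run the upsert against the sorted copy.

    CREATE TABLE target_table_batch_sorted AS
    SELECT * FROM target_table_batch
    ORDER BY logical_ts;   -- rows now arrive in roughly index order, improving cache locality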


--
Angular momentum makes the world go 'round.


ODBC message "server closed the connection unexpectedly" when accessing a PG/11.3 DB

2021-02-13 Thread David Gauthier
We are using MS-Access as a front-end to work with tables served by
a Postgres/11.3 DB on Linux as the MS-Access backend, through ODBC.  I have
my users install a PG ODBC driver on their Windows laptops from
https://www.postgresql.org/ftp/odbc/versions/msi, selecting
psqlodbc-12_02_.zip.  This setup seems to work fine for me. However...

I have a user who got a "server closed the connection unexpectedly" error after
trying the "Test Connection" feature in ODBC.  The "server closed the
connection". Hm. Is this saying that the user was connected and then
the connection was closed?

I'm not the DB Admin but I had them give me the contents of the instance
log for the time period in question.  There's nothing in there that would
indicate that the server closed a connection.

This user is on the other side of the world and there's a possibility that
a firewall is in play.  But I wouldn't expect to see a closed connection
error for something like that.  More like a message saying that it couldn't
make the connection in the first place.
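
If an idle-timeout firewall does turn out to be involved, one mitigation that
sometimes helps (a sketch; the values are arbitrary and these are server-side
settings in postgresql.conf) is to have the server send TCP keepalives more
often than the firewall's idle cutoff:

    tcp_keepalives_idle = 60        # seconds of idle time before the first keepalive probe
    tcp_keepalives_interval = 10    # seconds between probes
    tcp_keepalives_count = 5        # unanswered probes before the connection is considered dead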

I asked the user to recreate the error while running the Tracing facility
in ODBC, then send me the logs.  But in the meantime, has anyone heard of
problems like this using ODBC (on Windows if that matters) ?

Thanks in advance!


Re: Why is Postgres only using 8 cores for partitioned count? [Parallel Append]

2021-02-13 Thread Seamus Abshere
hi,

I've traced this back to the formula for Parallel Append workers - 
log2(partitions).

The comment from Robert says: (src/backend/optimizer/path/allpaths.c)

/*
 * If the use of parallel append is permitted, always request 
at least
 * log2(# of children) workers. 

In my case, every partition takes 1 second to scan, I have 64 cores, I have 64 
partitions, and the wall time is 8 seconds with 8 workers.

I assume that if it planned significantly more workers (16? 32? even 64?),
it would get significantly faster (even accounting for transaction cost). So
why doesn't it ask for more? Note that I've set max_parallel_workers=512, etc.
(postgresql.conf in my first message).
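
One knob that might be worth experimenting with (an assumption on my part, not
verified on this box): the per-table parallel_workers storage parameter on the
leaf partitions, since the Parallel Append's worker count is also driven by how
many workers each child path requests. A sketch, with hypothetical partition
names:

    ALTER TABLE tbl_p0 SET (parallel_workers = 16);
    ALTER TABLE tbl_p1 SET (parallel_workers = 16);
    -- ...repeat for each leaf partition, then re-run EXPLAIN ANALYZE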

Here are full plans 
https://gist.github.com/seamusabshere/1c1d869ee24145689f45e4124d8730d3

Best,
Seamus

On Fri, Feb 12, 2021, at 4:50 PM, Seamus Abshere wrote:
> hi,
> 
> How can I convince Postgres to use more than 8 cores?
> 
> I've got an r6gd.16xlarge with 64 vCPUs and 512 GB RAM running Postgres 
> 13.1 on Ubuntu 20.04.
> 
> CREATE TABLE tbl (
>   [...]
> ) PARTITION BY HASH (address_key);
> 
> It has 64 partitions - e.g. FOR VALUES WITH (modulus 64, remainder 0) etc.
> 
> We're running `SELECT COUNT(*) FROM tbl`.
> 
> I've watched top and I never see more than 8 cores going 100%.
> 
> Here is my (admittedly ridiculous) postgresql.conf:
> 
> checkpoint_completion_target = 0.9
> data_directory='/tank/postgresql/13/main'
> default_statistics_target = 100
> effective_cache_size = 381696MB
> effective_io_concurrency = 200
> enable_partition_pruning=on
> enable_partitionwise_aggregate=on
> enable_partitionwise_join=on
> listen_addresses='*'
> maintenance_work_mem = 2GB
> max_connections = 200
> max_parallel_maintenance_workers = 4
> max_parallel_workers = 512
> max_parallel_workers_per_gather = 512
> max_wal_size = 4GB
> max_worker_processes = 512
> min_wal_size = 1GB
> random_page_cost = 1.1
> shared_buffers = 127232MB
> shared_preload_libraries = 'cstore_fdw'
> synchronous_commit=off
> wal_buffers = 16MB
> work_mem = 1628560kB
> 
> Best,
> Seamus
> 
> 
> --
> Seamus Abshere, SCEA
> https://faraday.ai
> https://github.com/seamusabshere
> https://linkedin.com/in/seamusabshere