RE: psql: FATAL: database "postgres" does not exist or ERROR: 23505: duplicate key value violates unique constraint "pg_namespace_nspname_index"

2020-07-15 Thread TALLURI Nareshkumar

Hello Team,

Can someone help us with this please? We are currently blocked.

Regards,
Narresh
From: TALLURI Nareshkumar SgscGtsPasDre
Sent: Tuesday, July 14, 2020 5:20 PM
To: pgsql-general@lists.postgresql.org
Subject: psql: FATAL: database "postgres" does not exist or ERROR: 23505: 
duplicate key value violates unique constraint "pg_namespace_nspname_index"

Hello Postgres Support Team,

Today we had an outage; our DB went down because the filesystem was 100% full. We
added space and were able to bring the cluster back up.

DB version: psql (PostgreSQL) 10.12
OS version : Red Hat Enterprise Linux Server release 7.8 (Maipo)

[0]postgres@axmdevhkdb008$ [PHKGAXMD008] pg_ctl start -D 
/AXMDEVHKDB008/postgres/PHKGAXMD008_bck
waiting for server to start...
2020-07-14 08:50:42.273 CEST db:[] user:[] [] [0] LOG:  listening on IPv4 address "176.5.88.68", port 5433
2020-07-14 08:50:42.278 CEST db:[] user:[] [] [0] LOG:  listening on Unix 
socket "/var/run/postgresql/.s.PGSQL.5433"
2020-07-14 08:50:42.280 CEST db:[] user:[] [] [0] LOG:  listening on Unix 
socket "/tmp/.s.PGSQL.5433"
2020-07-14 08:50:42.546 CEST db:[] user:[] [] [0] LOG:  redirecting log 
output to logging collector process
2020-07-14 08:50:42.546 CEST db:[] user:[] [] [0] HINT:  Future log output 
will appear in directory "log".
done
server started
__  [RRF]postgres@axmdevhkdb008:/AXMDEVHKDB008/postgres/PHKGAXMD008_bck/base
 [0]postgres@axmdevhkdb008$ [PHKGAXMD008] psql
psql: FATAL:  database "postgres" does not exist
__  [RRF]postgres@axmdevhkdb008:/AXMDEVHKDB008/postgres/PHKGAXMD008_bck/base
 [2]postgres@axmdevhkdb008$ [PHKGAXMD008] ps -ef|grep postgres
root     16567 16513  0 14:12 pts/1    00:00:00 sudo su - postgres
root     16569 16567  0 14:12 pts/1    00:00:00 su - postgres
postgres 16571 16569  0 14:12 pts/1    00:00:00 -ksh
postgres 23888     1  0 14:50 pts/1    00:00:00 /opt/rh/rh-postgresql10/root/usr/bin/postgres -D /AXMDEVHKDB008/postgres/PHKGAXMD008_bck
postgres 23890 23888  0 14:50 ?        00:00:00 postgres: PHKGAXMD008: logger process
postgres 23892 23888  0 14:50 ?        00:00:00 postgres: PHKGAXMD008: checkpointer process
postgres 23893 23888  0 14:50 ?        00:00:00 postgres: PHKGAXMD008: writer process
postgres 23894 23888  0 14:50 ?        00:00:00 postgres: PHKGAXMD008: wal writer process
postgres 23895 23888  0 14:50 ?        00:00:00 postgres: PHKGAXMD008: autovacuum launcher process
postgres 23896 23888  0 14:50 ?        00:00:00 postgres: PHKGAXMD008: stats collector process
postgres 23897 23888  0 14:50 ?        00:00:00 postgres: PHKGAXMD008: bgworker: logical replication launcher
postgres 24689 16571  0 14:55 pts/1    00:00:00 ps -ef
postgres 24690 16571  0 14:55 pts/1    00:00:00 grep --color=auto postgres
__  [RRF]postgres@axmdevhkdb008:/AXMDEVHKDB008/postgres/PHKGAXMD008_bck/base

We have the directories at Data location.

 [2]postgres@axmdevhkdb008$ [PHKGAXMD008] ls -lrt
total 180
drwx------ 2 postgres postgres   4096 Jan  2  2020 1
drwx------ 2 postgres postgres   4096 Jan  2  2020 13211
drwx------ 2 postgres postgres   4096 Jul 13 09:51 pgsql_tmp
drwx------ 2 postgres postgres 139264 Jul 13 21:02 16389
drwx------ 2 postgres postgres  12288 Jul 13 21:02 13212
drwx------ 2 postgres postgres   4096 Jul 14 11:08 13213
drwx------ 2 postgres postgres  12288 Jul 14 11:08 13214
__  [RRF]postgres@axmdevhkdb008:/AXMDEVHKDB008/postgres/PHKGAXMD008_bck/base


How do we overcome this?

Note: we don't have any WAL files; it is OK for us if we can just bring it up and
recover whatever we can.

Second issue:

Since we are not able to connect to the DB, we did a new initialization and copied
the directories from the old base directory to the new base directory (in the new
data location), and then updated the pg_catalog.pg_database table with the DB name:

(postgres@[local]:5433)# [PHKGAXMP003] INSERT INTO
  pg_catalog.pg_database(
    datname, datdba, encoding, datcollate, datctype, datistemplate, datallowconn,
    datconnlimit, datlastsysoid, datfrozenxid, datminmxid, dattablespace, datacl)
  VALUES(
    -- Write Your collation
    'axiom', 10, 0, 'C', 'C',
    False, True, -1, 16389, '536', '1', 1663, Null);
INSERT 16384 1
Time: 70.239 ms
(postgres@[local]:5433)# [PHKGAXMP003]select oid from pg_database a where 
a.datname = 'axiom';
  oid
---
16384
(1 row)


After this we were able to see the DB axiom.

   Name    |  Owner   | Encoding  | Collate | Ctype | Access privileges |  Size   | Tablespace |                 Description
-----------+----------+-----------+---------+-------+-------------------+---------+------------+---------------------------------------------
 axiom     | postgres | SQL_ASCII | C       | C     |                   | 16 GB   | pg_default |
 postgres  | postgres | SQL_ASCII | C       | C     |                   | 7647 kB | pg_default | default administrative connection database
 template0 | postgres | SQL_ASCII | C       | C

Re: Same query taking less time in low configuration machine

2020-07-15 Thread Vishwa Kalyankar
Hi,

  I am pasting the output of both servers' CPU speed and memory speed. We
have the same OS (CentOS) on both machines, and I downloaded the
Postgres RPMs from https://www.postgresql.org/

 1) High end machine

 [root@localhost ~]# dmidecode -t processor | grep Speed
Max Speed: 3600 MHz
Current Speed: 2666 MHz
Max Speed: 3600 MHz
Current Speed: 2666 MHz

RAM: 4 slots x 16 GB = 64 GB

[root@localhost ~]# dmidecode --type 17
# dmidecode 3.1
Getting SMBIOS data from sysfs.
SMBIOS 2.6 present.

Handle 0x1100, DMI type 17, 28 bytes
Memory Device
Array Handle: 0x1000
Error Information Handle: Not Provided
Total Width: 72 bits
Data Width: 64 bits
Size: 16384 MB
Form Factor: DIMM
Set: 1
Locator: DIMM_A1
Bank Locator: Not Specified
Type: DDR3
Type Detail: Synchronous Registered (Buffered)
Speed: 1333 MT/s
Manufacturer: 00CE00B380CE
Serial Number: 35E9FC94
Asset Tag: 02131263
Part Number: M393B2G70BH0-YH9
Rank: 2

 cache size  : 12288 KB

2) Low End machine

[root@localhost ~]# dmidecode -t processor | grep Speed
Max Speed: 3800 MHz
Current Speed: 3200 MHz

RAM: 2 slots x 4 GB = 8 GB
[root@localhost ~]# dmidecode --type 17
# dmidecode 3.0
Getting SMBIOS data from sysfs.
SMBIOS 2.7 present.

Handle 0x0038, DMI type 17, 34 bytes
Memory Device
Array Handle: 0x0037
Error Information Handle: Not Provided
Total Width: 64 bits
Data Width: 64 bits
Size: 4096 MB
Form Factor: DIMM
Set: None
Locator: DIMM1
Bank Locator: Not Specified
Type: DDR3
Type Detail: Synchronous
Speed: 1600 MHz
Manufacturer: Hynix/Hyundai
Serial Number: 0BB0390C003C
Asset Tag: 9876543210
Part Number: HMT351U6EFR8C-PB
Rank: 2
Configured Clock Speed: 1600 MHz

  cache size  : 6144 KB

  Are any other details required? Kindly let me know how to obtain them and I
will share them.

Regards,
Vishwa S Kalyankar


On Tue, Jul 14, 2020 at 10:12 PM Kenneth Marshall  wrote:

> On Tue, Jul 14, 2020 at 09:27:56PM +0530, Vishwa Kalyankar wrote:
> > HI,
> >
> > OS cache is updated and I had run the query few times with almost the
> same
> > result each time.
> >
> > Regards,
> > Vishwa
>
> Hi Vishwa,
>
> What are the CPU speeds, memory bandwidth, I/O bandwidth? Often the
> lower core count CPUs have a faster clock speed. What is the CPU cache
> size for both? Are you running the same OS and PostgreSQL build binaries
> on both?
>
> Regards,
> Ken
>


Re: Issue executing query from container

2020-07-15 Thread Eudald Valcàrcel Lacasa
Hello Tom,
Thanks for your answer! I didn't know about this plugin and configured
PostgreSQL with it.

After running the query both manually and with the script, I have the
following logs:

MANUALLY:
2020-07-15 00:56:08.735 CEST [20457] cefron@kontriki LOG:  statement:
UPDATE import_temp_2 AS tmp SET status = 3 FROM blacklist_central bl
 WHERE tmp.status = 1 AND lower(tmp.email) =
lower(bl.value) AND bl.type = 1
2020-07-15 00:56:09.495 CEST [20457] cefron@kontriki LOG:  duration:
759.102 ms  plan:
Query Text: UPDATE import_temp_2 AS tmp SET status = 3 FROM
blacklist_central bl
 WHERE tmp.status = 1 AND lower(tmp.email) =
lower(bl.value) AND bl.type = 1
Update on import_temp_2 tmp  (cost=116.73..17352.10 rows=5557 width=293)
  ->  Hash Join  (cost=116.73..17352.10 rows=5557 width=293)
Hash Cond: (lower((tmp.email)::text) = lower((bl.value)::text))
->  Seq Scan on import_temp_2 tmp  (cost=0.00..14864.20
rows=370496 width=193)
  Filter: (status = 1)
->  Hash  (cost=116.70..116.70 rows=3 width=130)
  Buckets: 32768 (originally 1024)  Batches: 2
(originally 1)  Memory Usage: 3841kB
  ->  Foreign Scan on blacklist_central bl
(cost=100.00..116.70 rows=3 width=130)

AUTOMATED:
2020-07-15 01:01:27.336 CEST [22783] cefron@kontriki LOG:  duration: 0.049 ms
2020-07-15 01:01:27.337 CEST [22783] cefron@kontriki LOG:  statement:
UPDATE import_temp_2 AS tmp SET status = 3 FROM blacklist_central bl
 WHERE tmp.status = 1 AND lower(tmp.email) =
lower(bl.value) AND bl.type = 1
2020-07-15 03:22:01.398 CEST [22783] cefron@kontriki LOG:  duration:
8434060.530 ms  plan:
Query Text: UPDATE import_temp_2 AS tmp SET status = 3 FROM
blacklist_central bl
 WHERE tmp.status = 1 AND lower(tmp.email) =
lower(bl.value) AND bl.type = 1
Update on import_temp_2 tmp  (cost=100.00..13295.86 rows=15 width=500)
  ->  Nested Loop  (cost=100.00..13295.86 rows=15 width=500)
Join Filter: (lower((tmp.email)::text) = lower((bl.value)::text))
->  Seq Scan on import_temp_2 tmp  (cost=0.00..13118.74
rows=1007 width=400)
  Filter: (status = 1)
->  Materialize  (cost=100.00..116.71 rows=3 width=130)
  ->  Foreign Scan on blacklist_central bl
(cost=100.00..116.70 rows=3 width=130)

Honestly, I see some differences, but I don't see a real cause that
could make the execution take 2 hours instead of a few seconds.
Maybe with these results you or someone in the list is able to find something.

Thanks for your help,
Eudald

On Tue, Jul 14, 2020 at 11:51 PM, Tom Lane () wrote:
>
> Eudald Valcàrcel Lacasa  writes:
> > I'm running a docker container that executes a php script running a
> > sequence of queries.
> > One of the queries gets stuck (takes more than 2 hours in execution,
> > active in pg_stat_activity).
> > The query is executed with a JOIN between a FOREIGN TABLE and a local table.
>
> > Executing this query from the psql console takes less than 5 seconds to 
> > resolve.
> > If I execute the query from the container created, manually, it
> > resolves as well within 5 seconds.
>
> > Can anyone guide me in a way I can troubleshoot what is causing the
> > query to get stuck?
>
> It's a very good bet that it's something about the query being
> parameterized or not.  You could try installing auto_explain to
> compare the plans that are generated.
>
> regards, tom lane




Re: some random() clarification needed

2020-07-15 Thread Marc Millas
Thanks!
That makes it clearer :-)
It's not that obvious to guess the consequences of the "volatile" behaviour.
regards,


Marc MILLAS
Senior Architect
+33607850334
www.mokadb.com



On Wed, Jul 15, 2020 at 1:53 AM David Rowley  wrote:

> On Wed, 15 Jul 2020 at 04:01, Marc Millas  wrote:
> > your answer helps me understand my first problem.
> > so, I rewrote a simple loop so as to avoid the "volatile" behaviour.
>
> Not sure what you're trying to do with the plpgsql, but you can just
> escape the multiple evaluations by putting the volatile function in a
> sub-select with no FROM clause.
>
> SELECT ... FROM ... WHERE id = (SELECT ceiling(random()*2582));
>
> Or the more traditional way to get a random row is:
>
> SELECT ... FROM ... WHERE id BETWEEN 0 AND 2585 ORDER BY random() LIMIT 1;
>
> David
>
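For illustration, a minimal sketch of the difference (the id range comes from the quoted query; the table name is hypothetical):

  -- random() is volatile, so here it is re-evaluated for every row scanned:
  -- the comparison value keeps changing and the query can return
  -- zero, one or several rows
  SELECT * FROM mytable WHERE id = ceiling(random()*2582);

  -- the FROM-less sub-select is evaluated once, giving a single random id
  SELECT * FROM mytable WHERE id = (SELECT ceiling(random()*2582));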


How to enable TDE on Postgresql 12.3 deployed using Kubernetes

2020-07-15 Thread Samarendra Sahoo
If there are any references, or if you have done this already, could you please share the steps?

Thanks


Cross-site cookies warnings

2020-07-15 Thread Fabio
I'm using Python 3.7.4, Django 3.0.6, JavaScript, Postgres 12.3.1,
pgAdmin 4.21, Windows 7. When my page loads, the console shows these
warnings:


Cookie “PGADMIN_KEY” will be soon treated as cross-site cookie against “http://127.0.0.1:8000/lists/list-name/” because the scheme does not match. list-name
Cookie “PGADMIN_LANGUAGE” will be soon treated as cross-site cookie against “http://127.0.0.1:8000/lists/list-name/” because the scheme does not match. list-name
Cookie “PGADMIN_KEY” will be soon treated as cross-site cookie against “http://127.0.0.1:8000/lists/list-name/” because the scheme does not match. list-name
Cookie “PGADMIN_LANGUAGE” will be soon treated as cross-site cookie against “http://127.0.0.1:8000/lists/list-name/” because the scheme does not match. list-name
Cookie “PGADMIN_KEY” will be soon treated as cross-site cookie against “http://127.0.0.1:8000/static/lists/js/lists.js” because the scheme does not match. lists.js
Cookie “PGADMIN_LANGUAGE” will be soon treated as cross-site cookie against “http://127.0.0.1:8000/static/lists/js/lists.js” because the scheme does not match. lists.js
Cookie “PGADMIN_KEY” will be soon treated as cross-site cookie against “http://127.0.0.1:8000/jsi18n/” because the scheme does not match. jsi18n
Cookie “PGADMIN_LANGUAGE” will be soon treated as cross-site cookie against “http://127.0.0.1:8000/jsi18n/” because the scheme does not match. jsi18n
Cookie “PGADMIN_KEY” will be soon treated as cross-site cookie against “http://127.0.0.1:8000/static/js/common.js” because the scheme does not match. common.js
Cookie “PGADMIN_LANGUAGE” will be soon treated as cross-site cookie against “http://127.0.0.1:8000/static/js/common.js” because the scheme does not match. common.js
Cookie “PGADMIN_KEY” will be soon treated as cross-site cookie against “http://127.0.0.1:8000/static/icons/favicon.png” because the scheme does not match. favicon.png
Cookie “PGADMIN_LANGUAGE” will be soon treated as cross-site cookie against “http://127.0.0.1:8000/static/icons/favicon.png” because the scheme does not match. favicon.png

These warnings seem to come from the new policy about cookies. The problem is that
PGADMIN_KEY and PGADMIN_LANGUAGE are cookies of pgAdmin and I don't use
them explicitly (I mean the database itself evidently uses them, but
in my code they don't appear). Is this a bug of pgAdmin?


Then I upgraded to pgadmin 4.23 and now I have only these warnings:


Cookie “PGADMIN_KEY” will be soon treated as cross-site cookie against 
“http://127.0.0.1:8000/lists/list-name/” because the scheme does not 
match. 2 list-name 


Cookie “PGADMIN_KEY” will be soon treated as cross-site cookie against 
“http://127.0.0.1:8000/jsi18n/” because the scheme does not match. 
jsi18n 


Cookie “PGADMIN_KEY” will be soon treated as cross-site cookie against 
“http://127.0.0.1:8000/static/js/common.js” because the scheme does not 
match. common.js 


Cookie “PGADMIN_KEY” will be soon treated as cross-site cookie against 
“http://127.0.0.1:8000/static/lists/js/lists.js” because the scheme does 
not match. lists.js 


Cookie “PGADMIN_KEY” will be soon treated as cross-site cookie against 
“http://127.0.0.1:8000/static/icons/favicon.png” because the scheme does 
not match. favicon.png 


Also I see a new cookie (PGADMIN_INT_KEY), but it doesn't cause problems.
Is there some way to solve this problem, or do I just have to wait for the next update?

thank you




RE: psql: FATAL: database "postgres" does not exist or ERROR: 23505: duplicate key value violates unique constraint "pg_namespace_nspname_index"

2020-07-15 Thread Daniel Verite
TALLURI Nareshkumar wrote:

>  [0]postgres@axmdevhkdb008$ [PHKGAXMD008] psql
> psql: FATAL:  database "postgres" does not exist

That's not necessarily a problem. The "postgres" database is generally
empty and some installations don't even have one.
Use "psql -d template1" instead, or start postgres in single-user mode.

> Since we are not able to connect to DB , we did a new initialization and
> copy the directories from base directory to new base directory(in new data
> location)

This can't work. If you have a backup from just before that point,
you should restore it and work from that without writing anything into
the data directory.


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: https://www.manitou-mail.org
Twitter: @DanielVerite




Re: single table - fighting a seq scan

2020-07-15 Thread Radoslav Nedyalkov
Shame on me. It's a partial index - *where is not null*.
Put the *is not null* predicate in place and the planner always goes for the index
(tested with thousands of IN entries).
The CTE version always goes for the index too, even *without* *is not null*, which
led to a slight confusion.

Thanks Tom, Michael,
Best
Rado

On Wed, Jul 15, 2020 at 1:06 AM Tom Lane  wrote:

> Radoslav Nedyalkov  writes:
> > Ah, I could have messed up the examples I gave. Row numbers are
> different.
> > Once again the plans , sorry about that.
>
> Given that it works at 100 entries and not 101, I can't escape the
> suspicion that you're being burnt by predtest.c's MAX_SAOP_ARRAY_SIZE
> limit.  However, that only affects the planner's willingness to make
> constraint proofs involving the large IN clause, and nothing you've
> mentioned here explains why such a proof would be needed.  Is there
> something you're not telling us about this table's schema?  (I'm
> wondering if the index is partial, for instance, though one would
> think that the CTE form of the query wouldn't work either if so.)
>
> regards, tom lane
>


Re: Cross-site cookies warnings

2020-07-15 Thread Adrian Klaver

On 7/15/20 2:34 AM, Fabio wrote:
I'm using Python 3.7.4, Django 3.0.6, JavaScript, Postgres 12.3.1,
pgAdmin 4.21, Windows 7. When my page loads, the console shows these
warnings:


Cookie “PGADMIN_KEY” will be soon treated as cross-site cookie against “http://127.0.0.1:8000/lists/list-name/” because the scheme does not match. list-name
Cookie “PGADMIN_LANGUAGE” will be soon treated as cross-site cookie against “http://127.0.0.1:8000/lists/list-name/” because the scheme does not match. list-name
Cookie “PGADMIN_KEY” will be soon treated as cross-site cookie against “http://127.0.0.1:8000/lists/list-name/” because the scheme does not match. list-name
Cookie “PGADMIN_LANGUAGE” will be soon treated as cross-site cookie against “http://127.0.0.1:8000/lists/list-name/” because the scheme does not match. list-name
Cookie “PGADMIN_KEY” will be soon treated as cross-site cookie against “http://127.0.0.1:8000/static/lists/js/lists.js” because the scheme does not match. lists.js
Cookie “PGADMIN_LANGUAGE” will be soon treated as cross-site cookie against “http://127.0.0.1:8000/static/lists/js/lists.js” because the scheme does not match. lists.js
Cookie “PGADMIN_KEY” will be soon treated as cross-site cookie against “http://127.0.0.1:8000/jsi18n/” because the scheme does not match. jsi18n
Cookie “PGADMIN_LANGUAGE” will be soon treated as cross-site cookie against “http://127.0.0.1:8000/jsi18n/” because the scheme does not match. jsi18n
Cookie “PGADMIN_KEY” will be soon treated as cross-site cookie against “http://127.0.0.1:8000/static/js/common.js” because the scheme does not match. common.js
Cookie “PGADMIN_LANGUAGE” will be soon treated as cross-site cookie against “http://127.0.0.1:8000/static/js/common.js” because the scheme does not match. common.js
Cookie “PGADMIN_KEY” will be soon treated as cross-site cookie against “http://127.0.0.1:8000/static/icons/favicon.png” because the scheme does not match. favicon.png
Cookie “PGADMIN_LANGUAGE” will be soon treated as cross-site cookie against “http://127.0.0.1:8000/static/icons/favicon.png” because the scheme does not match. favicon.png

These warnings seem to come from the new policy about cookies. The problem is that
PGADMIN_KEY and PGADMIN_LANGUAGE are cookies of pgAdmin and I don't use
them explicitly (I mean the database itself evidently uses them, but
in my code they don't appear). Is this a bug of pgAdmin?


As I said in my SO comment, pgAdmin != Postgres and the database is not
setting these. These are cookies set when you use pgAdmin. If you want
this dealt with, file an issue here:
https://redmine.postgresql.org/projects/pgadmin4/issues.





Then I upgraded to pgadmin 4.23 and now I have only these warnings:


Cookie “PGADMIN_KEY” will be soon treated as cross-site cookie against 
“http://127.0.0.1:8000/lists/list-name/” because the scheme does not 
match. 2 list-name 


Cookie “PGADMIN_KEY” will be soon treated as cross-site cookie against 
“http://127.0.0.1:8000/jsi18n/” because the scheme does not match. 
jsi18n 


Cookie “PGADMIN_KEY” will be soon treated as cross-site cookie against 
“http://127.0.0.1:8000/static/js/common.js” because the scheme does not 
match. common.js 


Cookie “PGADMIN_KEY” will be soon treated as cross-site cookie against 
“http://127.0.0.1:8000/static/lists/js/lists.js” because the scheme does 
not match. lists.js 


Cookie “PGADMIN_KEY” will be soon treated as cross-site cookie against 
“http://127.0.0.1:8000/static/icons/favicon.png” because the scheme does 
not match. favicon.png 


Also I see a new cookie (PGADMIN_INT_KEY), but it doesn't cause problems.
Is there some way to solve this problem, or do I just have to wait for the next update?

thank you





--
Adrian Klaver
adrian.kla...@aklaver.com




Re: Issue executing query from container

2020-07-15 Thread Tom Lane
Eudald Valcàrcel Lacasa  writes:
> After running the query both manually and with the script, I've the
> following logs:

> MANUALLY:
> Update on import_temp_2 tmp  (cost=116.73..17352.10 rows=5557 width=293)
>   ->  Hash Join  (cost=116.73..17352.10 rows=5557 width=293)
> Hash Cond: (lower((tmp.email)::text) = lower((bl.value)::text))
> ->  Seq Scan on import_temp_2 tmp  (cost=0.00..14864.20
> rows=370496 width=193)
>   Filter: (status = 1)
> ->  Hash  (cost=116.70..116.70 rows=3 width=130)
>   Buckets: 32768 (originally 1024)  Batches: 2
> (originally 1)  Memory Usage: 3841kB
>   ->  Foreign Scan on blacklist_central bl
> (cost=100.00..116.70 rows=3 width=130)

> AUTOMATED:
> Update on import_temp_2 tmp  (cost=100.00..13295.86 rows=15 width=500)
>   ->  Nested Loop  (cost=100.00..13295.86 rows=15 width=500)
> Join Filter: (lower((tmp.email)::text) = lower((bl.value)::text))
> ->  Seq Scan on import_temp_2 tmp  (cost=0.00..13118.74
> rows=1007 width=400)
>   Filter: (status = 1)
> ->  Materialize  (cost=100.00..116.71 rows=3 width=130)
>   ->  Foreign Scan on blacklist_central bl
> (cost=100.00..116.70 rows=3 width=130)

So the question is why you are getting an estimate of 370496 import_temp_2
rows with status = 1 in the first case, and only 1007 rows in the second.

I suspect that the true number of rows is quite large, causing the
nested-loop plan to run slowly.  (Is the row estimate of 3 for the
foreign scan anywhere near reality, either?)

You may need to insert a manual ANALYZE in your automated process to
ensure that import_temp_2 has up-to-date stats before you try to do
this step.  It seems somewhat likely that autovacuum takes care of
that for you in the "manual" case, but its reaction time is too slow
to fill the gap for the automated process.

regards, tom lane
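A minimal sketch of that ordering in the automated job, reusing the table and
statement from the plans above:

  -- refresh planner statistics on the freshly populated table first
  ANALYZE import_temp_2;

  -- then run the update; with accurate row estimates the planner can
  -- prefer the hash join seen in the manual run over the nested loop
  UPDATE import_temp_2 AS tmp
     SET status = 3
    FROM blacklist_central bl
   WHERE tmp.status = 1
     AND lower(tmp.email) = lower(bl.value)
     AND bl.type = 1;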




Re: single table - fighting a seq scan

2020-07-15 Thread Tom Lane
Radoslav Nedyalkov  writes:
> Shame on me. It's a partial index - *where is not null.*
> Put the* is not null *predicate in place and planner always goes for index.
> (tested with thousands of IN entries)
> CTE version always goes for index too even *without **is not null , *which
> led to a slight confusion.

Ah.  That's actually something we fixed in v12 (see [1]).  In the CTE
version, the planner can prove "x is not null" from "x = cte_value" even
without knowing what the CTE output value is, just on the basis that "="
is strict.  In the IN form, it's likewise possible to prove "x is not
null" from "x IN (list)", but you need a special test to recognize that.
With a short IN list, the planner converts IN to "x = this OR x = that
OR x = the-other ..."  and can make the proof from that formulation.
But we prevent it from trying that on long IN lists, because it'd eat
lots of cycles and perhaps not be able to prove the desired partial index
qual anyway.

regards, tom lane

[1] https://git.postgresql.org/gitweb/?p=postgresql.git&a=commitdiff&h=65ce07e02
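To illustrate the workaround on releases before v12 (table, column and index names
here are hypothetical, not taken from the original report):

  -- partial index: only non-null values are indexed
  CREATE INDEX t_x_notnull_idx ON t (x) WHERE x IS NOT NULL;

  -- with a long IN list alone the planner may not prove the index
  -- predicate, and falls back to a seq scan:
  SELECT * FROM t WHERE x IN (1, 2, 3 /* ... hundreds more ... */);

  -- stating the predicate explicitly lets the planner match the partial index:
  SELECT * FROM t WHERE x IN (1, 2, 3 /* ... hundreds more ... */) AND x IS NOT NULL;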




2 million queries against a table

2020-07-15 Thread Adam Sanchez
Hi

I need to run 2 million queries against a three-column table t
(s,p,o) whose size is 10 billion rows. The data type of each column
is string.  The server has 512G RAM, 32 cores and 14T SSD (RAID 0)

Only two types of queries:

select s p o from t where s = param
select s p o from t where o = param

If I store the table in a PostgreSQL database, the run takes 6 hours using a
Java ThreadPoolExecutor.

Do you think PostgreSQL itself can speed up the query processing
even more? What would be the best strategy?

These are my ideas:

1. Use Spark to launch queries against the table (extracted from
Postgresql) loaded in a dataframe
2. Use PG-Strom, an extension module of PostgreSQL with GPU support
and launch the queries against the table.


Any suggestion will be appreciated

Best,
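For the two lookup patterns above, the table would at minimum need an index on each
probed column; a rough sketch, assuming a plain table t(s, p, o) with text columns
as described:

  -- one btree index per equality-probed column
  CREATE INDEX t_s_idx ON t (s);
  CREATE INDEX t_o_idx ON t (o);
  -- on a 10-billion-row table these builds are very large; covering indexes
  -- (e.g. CREATE INDEX t_s_cov_idx ON t (s) INCLUDE (p, o), on PostgreSQL 11+)
  -- or partitioning are alternatives worth measuring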




Re: 2 million queries against a table

2020-07-15 Thread Ron

On 7/15/20 10:10 AM, Adam Sanchez wrote:

Hi

I need to run 2 million queries against a three-column table t
(s,p,o) whose size is 10 billion rows. The data type of each column
is string.  The server has 512G RAM, 32 cores and 14T SSD (RAID 0)

Only two types of queries:

select s p o from t where s = param
select s p o from t where o = param


What's the index cardinality of "s" and "o" (about how many records per key)?

What kind of indexes do you have on them?  Is the table clustered on one of 
the keys?


What version of Postgresql?


If I store the table in a PostgreSQL database, the run takes 6 hours using a
Java ThreadPoolExecutor.


How many threads?

What values of:
shared_buffers
temp_buffers
work_mem


Do you think PostgreSQL itself can speed up the query processing
even more? What would be the best strategy?


2M queries in 6 hours is 93 queries/second.  Over 32 cores, that's only about
three per core per second.   Not very much.



These are my ideas:

1. Use Spark to launch queries against the table (extracted from
Postgresql) loaded in a dataframe
2. Use PG-Strom, an extension module of PostgreSQL with GPU support
and launch the queries against the table.


Any suggestion will be appreciated


IO is -- as usual -- the bottleneck, followed closely by cache efficiency.  
Are you issuing the queries in a random order, or sequentially by key (which 
would enhance cache efficiency)?


--
Angular momentum makes the world go 'round.




Re: 2 million queries against a table

2020-07-15 Thread Tom Lane
Adam Sanchez  writes:
> I need to run 2 million queries against a three-column table t
> (s,p,o) whose size is 10 billion rows. The data type of each column
> is string.  The server has 512G RAM, 32 cores and 14T SSD (RAID 0)

> Only two types of queries:

> select s p o from t where s = param
> select s p o from t where o = param

TBH, this is a pretty silly way to use a SQL database.  Put
the probe values into a temporary table (in batches, perhaps)
and do a join.  The per-row cost of that sort of approach
will be multiple orders of magnitude smaller than a query
per row.

regards, tom lane
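A minimal sketch of that batching approach (only t and its columns s, p, o come
from the original post; everything else is illustrative):

  -- load one batch of probe values
  CREATE TEMP TABLE probe(v text);
  -- bulk-load the parameters, e.g. COPY probe FROM '/path/to/batch.csv'
  -- (or psql's \copy, or a multi-row INSERT)

  -- one join replaces thousands of single-row queries
  SELECT t.s, t.p, t.o
    FROM t
    JOIN probe pr ON pr.v = t.s;

  -- run a second pass with ... JOIN probe pr ON pr.v = t.o for the other query type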




Re: Clustering solution ?

2020-07-15 Thread Laurent FAILLIE
Hello all,
Thanks all for your replies: I started to learn Patroni. Anyway, is there any
alternative, given that my customer also wants load balancing?
Thanks
Laurent
 

On Thursday, July 9, 2020 at 17:47:23 UTC+2, Laurent FAILLIE wrote:
Mainly for HA and DR, but it could also help in the future if load balancing
is possible. But it's not mandatory yet.

Thanks
On Thursday, July 9, 2020 at 17:30:35 UTC+2, Ron wrote:
  For high availability, for load sharing or for disaster recovery?
 
 On 7/9/20 8:53 AM, Laurent FAILLIE wrote:
  
 
  Hello, 
  I've been asked by one of my customers which clustering solutions are possible
for PostgreSQL?
  Active/passive? Active/active?
  If possible, free or not too expensive.
  Thanks 
  Laurent

 -- 
 Angular momentum makes the world go 'round. 

Re: 2 million queries against a table

2020-07-15 Thread Rob Sargent




On 7/15/20 9:38 AM, Tom Lane wrote:

Adam Sanchez  writes:

I need to run 2 million queries against a three-column table t
(s,p,o) whose size is 10 billion rows. The data type of each column
is string.  The server has 512G RAM, 32 cores and 14T SSD (RAID 0)



Only two types of queries:



select s p o from t where s = param
select s p o from t where o = param


TBH, this is a pretty silly way to use a SQL database.  Put
the probe values into a temporary table (in batches, perhaps)
and do a join.  The per-row cost of that sort of approach
will be multiple orders of magnitude smaller than a query
per row.

regards, tom lane


I had assumed the OP was planning a performance test for say 2M separate 
user/requests (perhaps non-unique) rather than asking for the specific 
2M records.






Re: psql: FATAL: database "postgres" does not exist or ERROR: 23505: duplicate key value violates unique constraint "pg_namespace_nspname_index"

2020-07-15 Thread Naresh Kumar
Thanks Daniel,

I already tried to connect with template0, template1, the user database name, and
postgres; nothing is working, and I get the same error message as mentioned in
the mail trail.

For the second one, yes, we don't have backups to restore; we don't have any
option apart from the data directory (no WAL files).

Regards,
Narresh


On Wed 15 Jul, 2020, 3:28 PM Daniel Verite  wrote:
> TALLURI Nareshkumar wrote:
>
> >  [0]postgres@axmdevhkdb008$ [PHKGAXMD008] psql
> > psql: FATAL:  database "postgres" does not exist
>
> That's not necessarily a problem. The "postgres" database is generally
> empty and some installations don't even have one.
> Use "psql -d template1" instead, or start postgres in single-user mode.
>
> > Since we are not able to connect to DB , we did a new initialization and
> > copy the directories from base directory to new base directory(in new
> data
> > location)
>
> This can't work. If you have a backup from just before that point,
> you should restore it and work from that without writing anything into
> the data directory.
>
>
> Best regards,
> --
> Daniel Vérité
> PostgreSQL-powered mailer: https://www.manitou-mail.org
> Twitter: @DanielVerite
>
>
>


Re: Issue executing query from container

2020-07-15 Thread Eudald Valcàrcel Lacasa
Hello Tom,
I've noticed something that may be a factor in this problem.

In the automated script, 5 or 6 seconds before this query happens, the
foreign table is created.
If I create the foreign table manually and skip its creation in the
automated script, and then I run the automated script,
the query takes 5-6 seconds to be executed instead of 2 hours.

Hope this info is helpful to dig further.
Thanks,
Eudald

On Wed, Jul 15, 2020 at 4:42 PM, Tom Lane () wrote:
>
> Eudald Valcàrcel Lacasa  writes:
> > After running the query both manually and with the script, I've the
> > following logs:
>
> > MANUALLY:
> > Update on import_temp_2 tmp  (cost=116.73..17352.10 rows=5557 width=293)
> >   ->  Hash Join  (cost=116.73..17352.10 rows=5557 width=293)
> > Hash Cond: (lower((tmp.email)::text) = lower((bl.value)::text))
> > ->  Seq Scan on import_temp_2 tmp  (cost=0.00..14864.20
> > rows=370496 width=193)
> >   Filter: (status = 1)
> > ->  Hash  (cost=116.70..116.70 rows=3 width=130)
> >   Buckets: 32768 (originally 1024)  Batches: 2
> > (originally 1)  Memory Usage: 3841kB
> >   ->  Foreign Scan on blacklist_central bl
> > (cost=100.00..116.70 rows=3 width=130)
>
> > AUTOMATED:
> > Update on import_temp_2 tmp  (cost=100.00..13295.86 rows=15 width=500)
> >   ->  Nested Loop  (cost=100.00..13295.86 rows=15 width=500)
> > Join Filter: (lower((tmp.email)::text) = 
> > lower((bl.value)::text))
> > ->  Seq Scan on import_temp_2 tmp  (cost=0.00..13118.74
> > rows=1007 width=400)
> >   Filter: (status = 1)
> > ->  Materialize  (cost=100.00..116.71 rows=3 width=130)
> >   ->  Foreign Scan on blacklist_central bl
> > (cost=100.00..116.70 rows=3 width=130)
>
> So the question is why you are getting an estimate of 370496 import_temp_2
> rows with status = 1 in the first case, and only 1007 rows in the second.
>
> I suspect that the true number of rows is quite large, causing the
> nested-loop plan to run slowly.  (Is the row estimate of 3 for the
> foreign scan anywhere near reality, either?)
>
> You may need to insert a manual ANALYZE in your automated process to
> ensure that import_temp_2 has up-to-date stats before you try to do
> this step.  It seems somewhat likely that autovacuum takes care of
> that for you in the "manual" case, but its reaction time is too slow
> to fill the gap for the automated process.
>
> regards, tom lane




Re: psql: FATAL: database "postgres" does not exist or ERROR: 23505: duplicate key value violates unique constraint "pg_namespace_nspname_index"

2020-07-15 Thread Karsten Hilbert
On Wed, Jul 15, 2020 at 10:03:06PM +0530, Naresh Kumar wrote:

> Already I tried to connect with template0, template1, user database name,
> postgres, nothing is working getting same error message as mentioned in
> trail mail.
>
> For the second one, yes we don't have backups to restore, we don't have any
> option apart from data directory(no wal files)

At this point you are very likely in need of (highly)
specialized professional help.

Karsten
--
GPG  40BE 5B0E C98E 1713 AFA6  5BC0 3BEA AC80 7D4F C89B




ownership of "/var/run/postgresql"

2020-07-15 Thread Dmitry O Litvintsev
Hi, 

We run a 9.6 PostgreSQL DB on a Linux box, as a different user than postgres.
To get it to work we had to chown /var/run/postgresql so it is owned by the
process user.

We upgraded to 11 and now we see that the directory

/var/run/postgresql changes ownership to postgres:postgres on reboot, even
though postgresql-11.service is disabled.

What is doing this, and is it possible to disable it?

Thanks, 
Dmitry





Re: ownership of "/var/run/postgresql"

2020-07-15 Thread Christoph Moench-Tegeder
## Dmitry O Litvintsev (litvi...@fnal.gov):

> Upgraded to 11 and now we see that file 
> 
> /var/run/postgresql changes ownership to postgres:postgres on reboot ,
> even though postgresql-11.service is disabled. 

That's /usr/lib/tmpfiles.d/postgresql.conf (or similar). Don't
edit that file, see "man 5 tmpfiles.d" on how to override that
file and general info on the format.

Regards,
Christoph

-- 
Spare Space
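A minimal sketch of such an override, assuming the service account is named
mydbuser (the exact fields are described in tmpfiles.d(5)):

  # /etc/tmpfiles.d/postgresql.conf -- shadows the file of the same name
  # shipped under /usr/lib/tmpfiles.d/
  d /var/run/postgresql 0755 mydbuser mydbuser -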




Re: Same query taking less time in low configuration machine

2020-07-15 Thread Thomas Munro
On Tue, Jul 14, 2020 at 9:27 PM Vishwa Kalyankar
 wrote:
>  Function Scan on kseb_geometry_trace_with_barrier_partition  
> (cost=0.25..10.25 rows=1000 width=169) (actual time=11626.548..11626.568 
> rows=254 loops=1)

>  Function Scan on kseb_geometry_trace_with_barrier_partition  
> (cost=0.25..10.25 rows=1000 width=169) (actual time=22304.425..22304.448 
> rows=254 loops=1)

I have no idea what that function does, but perhaps it runs more
queries, and you can't see the plans for those here.  If you set up
auto_explain[1], and turn on auto_explain.log_nested_statements, then
you'll be able to see the query plans for the internal stuff happening
in that function.

I'd look at EXPLAIN (BUFFERS) or auto_explain.log_buffers to get more
information on cache hits and misses.  I'd look for settings
differences with EXPLAIN (SETTINGS) to see if there's anything
accidentally set differently (maybe JIT or parallelism or something
like that).  I'd look at pg_stat_activity repeatedly while it runs to
see what the processes are doing, especially the wait_event column.
I'd also look at the CPU and I/O on the systems with operating system
tools like iostat, htop, perf to try to find the difference.

[1] https://www.postgresql.org/docs/current/auto-explain.html
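A rough sketch of the postgresql.conf settings involved (the threshold here is
only illustrative):

  shared_preload_libraries = 'auto_explain'
  auto_explain.log_min_duration = '1s'       # log plans for anything slower than this
  auto_explain.log_analyze = on              # include actual times and row counts
  auto_explain.log_buffers = on              # include buffer hit/miss numbers
  auto_explain.log_nested_statements = on    # include queries run inside functions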




Re: Same query taking less time in low configuration machine

2020-07-15 Thread David Rowley
On Thu, 16 Jul 2020 at 09:50, Thomas Munro  wrote:
> I'd look at EXPLAIN (BUFFERS) or auto_explain.log_buffers to get more
> information on cache hits and misses.  I'd look for settings
> differences with EXPLAIN (SETTINGS) to see if there's anything
> accidentally set differently (maybe JIT or paralelism or something
> like that).  I'd look at pg_stat_activity repeatedly while it runs to
> see what the processes are doing, especially the wait_event column.
> I'd also look at the CPU and I/O on the systems with operating system
> tools like iostat, htop, perf to try to find the difference.

It might also be good to look at size of the tables and indexes that
are looked at within the function. If the smaller end machine was
loaded with data via pg_restore from a pg_dump taken from the larger
machine then the indexes might be in much better shape and the heap
may have less bloat.  Of course, that's just speculation. We've not
seen what the function does yet.

Vishwa, it would be good if you could follow the guide here:
https://wiki.postgresql.org/wiki/Slow_Query_Questions , in particular:

"Post the definitions of all tables and indexes referenced in the
query. If the query touches views or custom functions, we'll need
those definitions as well. Run psql command "\d table" with the
tables/views/indices referenced in the problem query."

David
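For reference, a rough sketch of collecting those definitions in psql (the function
name is taken from the plans earlier in the thread; the table name is a placeholder):

  \sf kseb_geometry_trace_with_barrier_partition   -- show the function source
  \d  some_table_used_by_the_function              -- table definition and its indexes
  \di+                                             -- index names and sizes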