Re: Has there been any discussion of custom dictionaries being defined in the database?

2019-10-17 Thread Tom Lane
Morris de Oryx  writes:
> We're deployed on RDS, which does not give you any file system to access.
> I'd love to be able to create a custom thesaurus dictionary for our
> situation, which seems like it is impossible in a setup like ours.

> Has there been any discussion of making dictionary configuration files
> accessible via a dictionary table instead of a physical, structured disk
> file? Or, alternatively, something that could be accessed
> remotely/externally as a URL or FDW?

Nope.  TBH, I don't find this case terribly compelling.  You should be
beating up RDS for not letting you configure your DB the way you want.

regards, tom lane




Re: Can functions containing a CTE be PARALLEL SAFE?

2019-10-17 Thread Tom Lane
Erwin Brandstetter  writes:
>> The following operations are always parallel restricted.
>> - Scans of common table expressions (CTEs).

> Now I am unsure whether I can use `PARALLEL SAFE` for functions containing
> a CTE (while  fulfilling all other criteria)?

AFAIR, the reason for treating CTEs as parallel restricted is simply to
guarantee single evaluation of the CTE.  Within a function, that would
only matter per-function-execution, so I can't see why a function
containing such a query couldn't be pushed down to workers for execution.
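
For illustration, a minimal sketch of such a function (table and column
names are made up here):

CREATE FUNCTION order_total(p_order_id int) RETURNS numeric
    LANGUAGE sql STABLE PARALLEL SAFE
AS $$
    WITH items AS (
        SELECT price * qty AS line_total
        FROM order_items
        WHERE order_id = p_order_id
    )
    SELECT coalesce(sum(line_total), 0) FROM items;
$$;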

regards, tom lane




Re: Has there been any discussion of custom dictionaries being defined in the database?

2019-10-17 Thread Morris de Oryx
Fair.

Given that Amazon is bragging this week about turning off Oracle, it seems
like they could kick some resources towards contributing something to the
Postgres project. With that in mind, is the idea of defining dictionaries
within a table somehow meritless, or unexpectedly difficult?


Re: Has there been any discussion of custom dictionaries being defined in the database?

2019-10-17 Thread Tom Lane
Morris de Oryx  writes:
> Given that Amazon is bragging this week about turning off Oracle, it seems
> like they could kick some resources towards contributing something to the
> Postgres project. With that in mind, is the idea of defining dictionaries
> within a table somehow meritless, or unexpectedly difficult?

Well, it'd just be totally different.  I don't think anybody cares to
provide two separate definitions of common dictionaries (which'd have to
somehow be kept in sync).

As for why we did it with external text files in the first place ---
for at least some of the dictionary types, the point is that you can
drop in data files that are available from upstream sources, without any
modification.  Getting the same info into a table would require some
nonzero amount of data transformation.

Having said that ... in the end a dictionary is really just a set of
functions implementing the dictionary API; where they get their data
from is their business.  So in theory you could roll your own
dictionary that gets its data out of a table.  But the dictionary API
would be pretty hard to implement except in C, and I bet RDS doesn't
let you install your own C functions either :-(

regards, tom lane




pgbackrest with PAF(corosync and pacmaker)

2019-10-17 Thread Ajay Pratap
##Arch
OS: Centos 7
PostgreSQL 10
pgbackrest 2.18
PAF: two nodes(primary and secondary ) with corosync and pacemaker

##Query
I am using Pacemaker and Corosync to set up a two-node high-availability
cluster for PostgreSQL 10. In this scenario I want to set up scheduled backups
with pgbackrest.
Using the _backup from standby_ feature I am able to take backups from my
secondary Postgres. But if the secondary Postgres is down, it does not take
a backup from the primary.

What do I want?
I want to take backups from the secondary Postgres (--backup-standby); if the
secondary is unavailable, take the backup from the primary Postgres.
Or, if the primary Postgres goes down, backups should still continue from the
secondary (which would have been promoted to master by then).

All backups are with remote repository
pgbackrest.conf for repo:

[data]
pg1-host=pgprimary
pg1-path=/var/lib/pgsql/10/data
pg2-host=pgsec
pg2-path=/var/lib/pgsql/10/data

[global]
backup-standby=y
process-max=3
repo1-path=/var/lib/pgbackrest
repo1-retention-full=2
start-fast=y

pgbackrest.conf for primary and secondary postgres:

[data]
pg1-host=main
pg1-path=/var/lib/pgsql/10/data
pg2-host=sec
pg2-path=/var/lib/pgsql/10/data

[global]
backup-standby=n
process-max=3
repo1-path=/var/lib/pgbackrest
repo1-retention-full=2
start-fast=y


Is there any way to achieve this?
This sounds like a generic problem, or am I doing something wrong?

Regards,
ajay



Re: Pgbackrest backup is too slow

2019-10-17 Thread Ajay Pratap
Thanks for your previous replies.
It turns out my machine was having kernel issues, which made it slow.
I set up another cluster on a different machine, and now I am able to back up 42G
in 6 minutes.


On Fri, Oct 11, 2019 at 7:23 PM Brad Nicholson  wrote:

> Stephen Frost  wrote on 2019/10/11 08:50:53 AM:
>
> > From: Stephen Frost 
> > To: Ajay Pratap 
> > Cc: Postgres General 
> > Date: 2019/10/11 08:51 AM
> > Subject: [EXTERNAL] Re: Pgbackrest backup is too slow
> >
> > Greetings,
> >
> > * Ajay Pratap (ajaypra...@ameyo.com) wrote:
> > > I have a Centos 7 server which runs Postgresql 10.7. I am using
> pgbackrest
> > > to take db backup.
> > > Problem is backup is too slow.
> >
> > Have you tried running 'top' to see what's going on?
> >
> > > My data dir size is 9.6G and full backup runtime is 22 mins
> > > I also tried using process-max=3, full backup runtime = 21 mins
> >
> > Erm, those numbers don't make any sense to me- as an example, we
> > regularly back up a 12GB database, from Dallas to San Fran, in 5
> > minutes.
>
>
> How many database objects do you have?  For databases with a lot of tables
> and/or indexes (think tens of thousands), backup performance will slow
> down.  This has improved a lot in newer versions, but still impacts
> performance of the backups.
>
> Brad.
>



Re: Has there been any discussion of custom dictionaries being defined in the database?

2019-10-17 Thread Morris de Oryx
Nope, no custom C installs. RDS is super convenient in many ways, but also
limited. You can't, for example, run TimeScale, install RUM indexes (if
those still work), or any novel plugins. And you can't do anything at all
requiring a file reference. The backup features are outstanding. But, yeah,
sometimes frustrating.


Re: Changing PK on replicated database

2019-10-17 Thread PegoraroF10
Correct, those records are on the replica too. I'm just talking about the best
way to update those keys and all their dependent tables.
If I change them first on the master, they will not be replicated because it will
be an update and their PK will not be found on the replica, correct?
If so, do I need to update manually on the replica first and later on the master?
And on the replica server, will all FKs be updated if I change that PK? On the
replica triggers are not fired; do foreign key cascades run?
If I do it this way, when I change that PK on the master will I get a warning on the
replica server because that PK does not exist anymore?

The only question is: what are the correct steps when you need to change a
PK on a replicated database, just that.



--
Sent from: https://www.postgresql-archive.org/PostgreSQL-general-f1843780.html




Re: Has there been any discussion of custom dictionaries being defined in the database?

2019-10-17 Thread Karsten Hilbert
On Thu, Oct 17, 2019 at 11:52:39AM +0200, Tom Lane wrote:

> Morris de Oryx  writes:
> > Given that Amazon is bragging this week about turning off Oracle, it seems
> > like they could kick some resources towards contributing something to the
> > Postgres project. With that in mind, is the idea of defining dictionaries
> > within a table somehow meritless, or unexpectedly difficult?
>
> Well, it'd just be totally different.  I don't think anybody cares to
> provide two separate definitions of common dictionaries (which'd have to
> somehow be kept in sync).

Might crafty use of server side

COPY TO ... PROGRAM ...

enable OP to drop in dictionary data files as needed ?
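
For example, a sketch along those lines (the table, columns, and target path
are made up, and this needs superuser rights on the server):

COPY (SELECT word || ' : ' || substitute FROM my_thesaurus ORDER BY word)
    TO PROGRAM 'cat > /usr/share/postgresql/12/tsearch_data/custom_thesaurus.ths';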

Karsten
--
GPG  40BE 5B0E C98E 1713 AFA6  5BC0 3BEA AC80 7D4F C89B




Re: pgbackrest with PAF(corosync and pacmaker)

2019-10-17 Thread David Steele

On 10/17/19 9:48 AM, Ajay Pratap wrote:


I am using pacemaker and corosync to setup two nodes High availability 
cluster for postgreSQL 10. In the scenario I want to setup timely backup 
with pgbackrest.
Using the _backup from standby_ feature I could able to take backup from 
my secondary  postgres. But if secondary postgres is down, it does not 
take backup from primary.


This is entirely on purpose.  Backup from standby is designed to reduce 
load on the primary so we believe it is counter-intuitive to put load 
back on the primary when the standby is down.  Running backups *at all* 
when degraded is usually a bad idea.


> Is there any way to achieve this?

You can run more than one standby and pgBackRest will automatically 
select the first one it finds that is up.  If you must take a backup 
when degraded then you can specify --no-backup-standby on the command 
line to disable standby for that one backup.
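
For example, a sketch using the stanza name from your config (other options
left at their defaults):

pgbackrest --stanza=data --no-backup-standby backup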


Regards,
--
-David
da...@pgmasters.net




Re: PostgreSQL memory usage

2019-10-17 Thread Laurenz Albe
Alexander Pyhalov wrote:
> After looking at my DBMS server for some time I've understood that I don't 
> understand what was going on...
> 
> A server has 48 GB RAM. shared_buffers is set to 12GB, work_mem - to 32MB, 
> pgbouncer
> in transaction mode is used to connect pool (pool size 80) to PostgreSQL 10.5 
> server. 
> 
> I see that at some point several postgresql backends start consuming about 16 
>  GB RAM.
> If we account for shared_buffers, it meens 4 GB RAM for private backend 
> memory.
> How can we achieve such numbers? I don't see any long-running (or complex) 
> queries
> (however, there could be long-running transactions and queries to large 
> partitioned tables).
> But how could they consume 512* work_mem memory?

A wild guess: you are using large data items, or you have PostGIS installed,
which is known to use lots of memory for certain operations, which is not
limited by "work_mem".

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com





Can you please tell us how set this prefetch attribute in following lines.

2019-10-17 Thread M Tarkeshwar Rao
Hi all,



How do we fetch a certain number of tuples from a Postgres table?



I am doing the same in Oracle using the following lines by setting the prefetch attribute.



For oracle
// Prepare query
if( OCIStmtPrepare( myOciStatement, myOciError, (text *)aSqlStatement,
// Get statement type
 OCIAttrGet( (void *)myOciStatement, OCI_HTYPE_STMT, &statement_type, 0, 
OCI_ATTR_STMT_TYPE, myOciError );
// Set prefetch count
  OCIAttrSet( myOciStatement, OCI_HTYPE_STMT, &prefetch, 0, 
OCI_ATTR_PREFETCH_ROWS, myOciError );
// Execute query
status = OCIStmtExecute( myOciServerCtx, myOciStatement, myOciError, iters, 0, 
NULL, NULL, OCI_DEFAULT );


For Postgres



Can you please tell us how to set this prefetch attribute in the following lines. Does 
PQexec return all the rows from the table?


mySqlResultsPG = PQexec(connection, aSqlStatement);
if((PQresultStatus(mySqlResultsPG) == PGRES_FATAL_ERROR ) || 
(PQstatus(connection) != CONNECTION_OK)){}
if ((PQresultStatus(mySqlResultsPG) == PGRES_COMMAND_OK) || 
(PQresultStatus(mySqlResultsPG) == PGRES_TUPLES_OK))
{
myNumColumns = PQnfields(mySqlResultsPG);
myTotalNumberOfRowsInQueryResult = PQntuples(mySqlResultsPG);
myCurrentRowNum = 0 ;
}




Regards

Tarkeshwar



Re: A little confusion about JSON Path

2019-10-17 Thread Laurenz Albe
Thomas Kellerer wrote:
> I don't understand why the following two JSON Path expressions aren't doing 
> the same thing in Postgres 12:
> 
> with sample (data) as (
>   values
> ('{"k1": {"list":[1,2,3]}}'::jsonb)
> )
> select data, 
>jsonb_path_exists(data, '$.k1.list.type() ? (@ == "array")'), -- 
> returns true as expected
>jsonb_path_exists(data, '$.k1.list ? (@.type() == "array")') -- 
> returns false - not expected
> from sample;
> 
> 
> Apparently "@.type()" returns something different then "$.k1.list.type()"
> 
> But maybe I simply don't understand how the @ is supposed to work.

This seems to be a consequence of "lax" mode:

 "Besides, comparison operators automatically unwrap their operands in the lax 
mode,
  so you can compare SQL/JSON arrays out-of-the-box. An array of size 1 is 
considered
  equal to its sole element. Automatic unwrapping is not performed only when:

  - The path expression contains type() or size() methods that return the type 
and
the number of elements in the array, respectively.

(from https://www.postgresql.org/docs/12/functions-json.html)

with sample (data) as (
  values
('{"k1": {"list":[1,2,3]}}'::jsonb)
)
select data,
       jsonb_path_exists(data, '$.k1.list ? (@.type() == "number")'),    -- lax mode unwraps the array
       jsonb_path_exists(data, 'strict $.k1.list ? (@.type() == "array")')   -- strict mode doesn't
from sample;


            data             | jsonb_path_exists | jsonb_path_exists
-----------------------------+-------------------+-------------------
 {"k1": {"list": [1, 2, 3]}} | t                 | t
(1 row)

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com





RE: PostgreSQL memory usage

2019-10-17 Thread Alexander Pyhalov
No, there are no long text or bytea fields, just int, bigint, ctime, and limited 
varchar types.
However, one table has 2250 partitions and I've seen several selects against this 
table (based on the primary key of individual partitions) with a huge IN () 
list (about 500 keys). I don't expect this to be efficient, but I'm not sure that these 
queries caused such memory consumption.

Best regards,
Alexander Pyhalov,
programmer, telecommunications infrastructure department,
Directorate of Information and Communication Infrastructure, SFedU



From: Laurenz Albe 
Sent: 17 October 2019 14:09
To: Alexander Pyhalov; pgsql-general@lists.postgresql.org
Subject: Re: PostgreSQL memory usage

Alexander Pyhalov wrote:
> After looking at my DBMS server for some time I've understood that I don't 
> understand what was going on...
>
> A server has 48 GB RAM. shared_buffers is set to 12GB, work_mem - to 32MB, 
> pgbouncer
> in transaction mode is used to connect pool (pool size 80) to PostgreSQL 10.5 
> server.
>
> I see that at some point several postgresql backends start consuming about 16 
>  GB RAM.
> If we account for shared_buffers, it meens 4 GB RAM for private backend 
> memory.
> How can we achieve such numbers? I don't see any long-running (or complex) 
> queries
> (however, there could be long-running transactions and queries to large 
> partitioned tables).
> But how could they consume 512* work_mem memory?

A wild guess: you are using large data items, or you have PostGIS installed,
which is known to use lots of memory for certain operations, which is not
limited by "work_mem".

Yours,
Laurenz Albe
--
Cybertec | https://www.cybertec-postgresql.com





Re: PostgreSQL memory usage

2019-10-17 Thread Tom Lane
Alexander Pyhalov  writes:
> However, one table has 2250 partitions and I've seen several selects to this 
> table (based on the  primary key of individual partitions) with a huge IN ()  
> list (about 500 keys). Don't expect this to be efficient, but unsure that 
> these queries caused such memory consumption.

Yeah, it could be that it's just the number of partitions involved in
the query that's causing the memory bloat.  v10 is not bright about
pruning uninteresting partitions from a query (v12 is better), so
there will be 2250 tables involved so far as the planner and executor
are concerned.  And then if you've got a large/complex WHERE clause,
that's probably copied and modified for each of those tables :-(.

We don't generally recommend having more than a hundred or so partitions,
because of issues like this.  v12 is starting to make it feasible to have
more, but only in limited use-cases.

regards, tom lane




Re: Can functions containing a CTE be PARALLEL SAFE?

2019-10-17 Thread Erwin Brandstetter
Makes sense, thanks for the confirmation.
Maybe clarify in the manual?

Regards
Erwin

On Thu, Oct 17, 2019 at 11:20 AM Tom Lane  wrote:

> Erwin Brandstetter  writes:
> >> The following operations are always parallel restricted.
> >> - Scans of common table expressions (CTEs).
>
> > Now I am unsure whether I can use `PARALLEL SAFE` for functions
> containing
> > a CTE (while  fulfilling all other criteria)?
>
> AFAIR, the reason for treating CTEs as parallel restricted is simply to
> guarantee single evaluation of the CTE.  Within a function, that would
> only matter per-function-execution, so I can't see why a function
> containing such a query couldn't be pushed down to workers for execution.
>
> regards, tom lane
>


Conflict between autovacuum and backup restoration

2019-10-17 Thread Ekaterina Amez

Good afternoon,

We've finally made the migration+upgrade from the old server with v7.14 to the 
new server with v8.4 and, before testing and planning an upgrade to v9.6, I'm 
checking logs to find out if there's any problem with this upgrade. 
We've been fixing things and only one problem remains in the log that 
I don't understand and am not sure how to fix, or whether it has to be fixed. 
The original logs are not in English, so I'm posting my translation.


In the morning we're making a backup of -let's call it- the main 
database. To avoid intensive use of this main database, we have a second 
db that's a copy of the main one, used only to display data to some 
users that don't need it up to date and that shouldn't change anything. 
So at night we're restoring the morning backup from main database into 
this second one. The upgrade that I've mentioned has ended with both the 
main and the second database being on the same server. The "problem" 
(because I'm not sure if it's really a problem) is that while the backup 
is restoring into the second database, it seems like autovacuum is 
launched and conflicts with this db restore. The log is this:

[...] Several messages about checkpoints running too frequently [...]
2019-10-16 23:01:30.904 CEST - [162851] - user@[local]:secondDB - LOG:  
duration: 7446.139 ms  sentence: COPY one_table (some_columns) FROM stdin;
2019-10-16 23:01:37.457 CEST - [13750] LOG:  checkpoints are running too 
frequently ( 9 seconds)
2019-10-16 23:01:37.457 CEST - [13750] HINT:  Consider changing 
«checkpoint_segments» configuration parameter.
2019-10-16 23:01:58.663 CEST - [162851] - user@[local]:secondDB - LOG:  
duration: 6492.426 ms  sentence: CREATE INDEX another_table_index1 ON 
another_table USING btree (another_field1);
2019-10-16 23:02:04.042 CEST - [162851] - user@[local]:secondDB - LOG:  
duration: 5378.886 ms  sentence: CREATE INDEX another_table_index2 ON 
another_table USING btree (another_field2);
2019-10-16 23:02:11.742 CEST - [162851] - user@[local]:secondDB - LOG:  
duration: 7699.246 ms  sentence: CREATE INDEX another_table_index3 ON 
another_table USING btree (another_field3);
2019-10-16 23:02:12.742 CEST - [162851] - user@[local]:secondDB - LOG:  
sending cancel signal to blocking autovacuum with PID 162869
2019-10-16 23:02:12.742 CEST - [162851] - user@[local]:secondDB - 
DETAIL:  Process  162851 is waiting for  ShareLock on relation 
3880125365 for database 3880125112.
2019-10-16 23:02:12.743 CEST - [162851] - user@[local]:secondDB - 
SENTENCE:  CREATE INDEX another_table_index4 ON another_table USING 
btree (another_field4);

2019-10-16 23:02:12.743 CEST - [162869] ERROR:  cancelling autovacuum task
2019-10-16 23:02:12.743 CEST - [162869] CONTEXT:  automatic analyze of 
«secondDB.public.another_table»
2019-10-16 23:02:20.899 CEST - [162851] - user@[local]:secondDB - LOG:  
duration: 9157.371 ms  sentence: CREATE INDEX another_table_index4 ON 
another_table USING btree (another_field4);


After upgrade main db to the new server, I've tuned following parameters 
in postgresql.conf (using pgtune recommendations)


max_connections = 200
shared_buffers = 8GB
work_mem = 2621kB
maintenance_work_mem = 2GB
effective_io_concurrency = 2
wal_buffers = 8MB
checkpoint_segments = 32
checkpoint_completion_target = 0.7
effective_cache_size = 16GB
log_min_duration_statement = 5000
log_line_prefix = '%m - [%p] %q- %u@%h:%d - %a '
standard_conforming_strings = on


I've been looking for the problem with checkpoints and I've decided to 
let it be, because these messages only appear when we make the second db 
restore. The rest of the log is clean from checkpoint messages.


But I don't understand why I'm getting those messages about autovacuum 
blocking the db restore process. I guess that after one table is loaded 
with a COPY statement, since many rows have been inserted, the autoanalyze 
process runs to gather statistics for the execution planner. But why is 
this blocking happening? Is autoanalyze running before the table gets fully 
loaded? Is this really a problem? If so, how can I handle it? This task 
is running at night, when nobody is using the second database.



Thank you for reading,

Ekaterina





Sv: Conflict between autovacuum and backup restoration

2019-10-17 Thread Andreas Joseph Krogh

On Thursday 17 October 2019 at 15:48:09, Ekaterina Amez <
ekaterina.a...@zunibal.com> wrote: 
Good afternoon,

 We've finally made the migration+upgrade from old server with v7.14 to
 new server with v8.4 and, before test and plan an upgrade to v9.6, I'm
 checking logs to find out if there's any problem with this upgrade.
 We've been fixing things and in only remains one problem in the log that
 I don't understand and not sure how to fix, or if it has to be fixed.
 Original logs are not in english so I'm posting my translation.

 In the morning we're making a backup of -let's call it- the main
 database. To avoid intensive use of this main database, we have a second
 db that's a copy of the main one, used only to display data to some
 users that don't need it up to date and that shouldn't change anything.
 So at night we're restoring the morning backup from main database into
 this second one. The upgrade that I've mentioned has ended with both,
 main and second databases, being in the same server. The "problem"
 (because I'm not sure if it's really a problem) is that while the backup
 is restoring in the second database, it seems like autovacuum is
 launched and conflicts with this db restore. The log is this:

 [...] Several messages about checkpoints running too frequently [...]
 2019-10-16 23:01:30.904 CEST - [162851] - user@[local]:secondDB - LOG: 
 duration: 7446.139 ms sentence: COPY one_table (some_columns) FROM stdin;
 2019-10-16 23:01:37.457 CEST - [13750] LOG: checkpoints are running too
 frequently ( 9 seconds)
 2019-10-16 23:01:37.457 CEST - [13750] HINT: Consider changing
 «checkpoint_segments» configuration parameter.
 2019-10-16 23:01:58.663 CEST - [162851] - user@[local]:secondDB - LOG: 
 duration: 6492.426 ms sentence: CREATE INDEX another_table_index1 ON
 another_table USING btree (another_field1);
 2019-10-16 23:02:04.042 CEST - [162851] - user@[local]:secondDB - LOG: 
 duration: 5378.886 ms sentence: CREATE INDEX another_table_index2 ON
 another_table USING btree (another_field2);
 2019-10-16 23:02:11.742 CEST - [162851] - user@[local]:secondDB - LOG: 
 duration: 7699.246 ms sentence: CREATE INDEX another_table_index3 ON
 another_table USING btree (another_field3);
 2019-10-16 23:02:12.742 CEST - [162851] - user@[local]:secondDB - LOG: 
 sending cancel signal to blocking autovacuum with PID 162869
 2019-10-16 23:02:12.742 CEST - [162851] - user@[local]:secondDB -
 DETAIL: Process 162851 is waiting for ShareLock on relation
 3880125365 for database 3880125112.
 2019-10-16 23:02:12.743 CEST - [162851] - user@[local]:secondDB -
 SENTENCE: CREATE INDEX another_table_index4 ON another_table USING
 btree (another_field4);
 2019-10-16 23:02:12.743 CEST - [162869] ERROR: cancelling autovacuum task
 2019-10-16 23:02:12.743 CEST - [162869] CONTEXT: automatic analyze of
 «secondDB.public.another_table»
 2019-10-16 23:02:20.899 CEST - [162851] - user@[local]:secondDB - LOG: 
 duration: 9157.371 ms sentence: CREATE INDEX another_table_index4 ON
 another_table USING btree (another_field4);

 After upgrade main db to the new server, I've tuned following parameters
 in postgresql.conf (using pgtune recommendations)

 max_connections = 200
 shared_buffers = 8GB
 work_mem = 2621kB
 maintenance_work_mem = 2GB
 effective_io_concurrency = 2
 wal_buffers = 8MB
 checkpoint_segments = 32
 checkpoint_completion_target = 0.7
 effective_cache_size = 16GB
 log_min_duration_statement = 5000
 log_line_prefix = '%m - [%p] %q- %u@%h:%d - %a '
 standard_conforming_strings = on


 I've been looking for the problem with checkpoints and I've decided to
 let it be, because these messages only appear when we make the second db
 restore. The rest of the log is clean from checkpoint messages.

 But I don't understand why I'm getting those messages about autovacuum
 blocking db restore process. I guess that after one table is created
 with COPY sentence, as many rows have been inserted, autoanalyze process
 runs to gather statistics for the Execution Planner. But why is
 happening this block? Is autoanalyze running before the table gets fully
 loaded? Is this really a problem? If so, how can I handle it? This task
 is running at night, when nobody is using second database.


 Thank you for reading,

 Ekaterina 

It is normal to get these "canceling autovacuum"-messages when restoring a 
database, just ignore them. 
If it bothers you, just turn autovacuum off by setting this in postgresql.conf:

autovacuum = off 

and reload the config (SIGHUP) 
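
For example, one way to reload without restarting (a sketch; run it as a superuser):

SELECT pg_reload_conf();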



--
 Andreas Joseph Krogh

Re: CVE-2018-1058

2019-10-17 Thread Adrian Klaver

On 10/17/19 6:46 AM, Lizeth Solis Aramayo wrote:

Forgot a fourth option:

4) Use the 9.6.15 pg_restore to restore the 9.6.15 pg_dump to the 9.6.5 
database.


I don't know how to do that.  May you help me  please

Is it just copy the pg_restore from one server to another?


It would be easier to just run the 9.6.15 version on the 9.6.15 machine 
against the 9.6.5 database, so:


pg_restore -h <9.6.5 hostname or IP> -p 5432 ...

Of course, how fast this runs would depend on where the machines are 
relative to each other on the network.
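
For example, a sketch reusing the flags from your earlier commands (adjust the 
host placeholder as needed):

pg_restore -h <9.6.5 hostname or IP> -p 5432 -U postgres -d dbkerp param.dump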


You could try copying the program, just not sure how compatible RH 6.5 
and RH 7.6 are with each other.


Your best bet would be to upgrade the 9.6.5 --> 9.6.15. There have been 
a lot of bug fixes in between.








-Mensaje original-
De: Adrian Klaver [mailto:adrian.kla...@aklaver.com]
Enviado el: miércoles, 16 de octubre de 2019 18:32
Para: Lizeth Solis Aramayo; pgsql-gene...@postgresql.org
Asunto: Re: CVE-2018-1058

On 10/16/19 1:05 PM, Lizeth Solis Aramayo wrote:

What commands did you use to dump the 9.6.15 version and restore to
the
9.6.5 version?

Pg_dump -p 5433 -U postgres -Fc -d dbkerp -n param > param.dump And
Pg_restore -p 5432 -U postgres -d dbkerp param.dump


Server with pg_dump is Linux red hat 7.6

Server with pg_restore is linux red hat 6.5


In both servers I have postgresql 9.6, but in pg_dump is 9.6.15,   and in 
pg_restore is 9.6.5.


The pg_dump is correct,  everything goes ok.,  but when I do the
pg_restore I gota n error :  pg_restore: [archiver] unsupported
version (1.13) in file header


Forgot a fourth option:

4) Use the 9.6.15 pg_restore to restore the 9.6.15 pg_dump to the 9.6.5 
database.




I searched solutions,  and I found that I can apply a patch CVE-2018-1058,  but 
I don¡t know how.
How  to download,  and install,  I dont find documents about it.

he reason why you can't upgrade the 9.6.5 to 9.6.15?  I dont know how.



--
Adrian Klaver
adrian.kla...@aklaver.com




--
Adrian Klaver
adrian.kla...@aklaver.com


RE: CVE-2018-1058

2019-10-17 Thread Lizeth Solis Aramayo
Forgot a fourth option:

4) Use the 9.6.15 pg_restore to restore the 9.6.15 pg_dump to the 9.6.5 
database.


I don't know how to do that. Could you help me, please?

Is it just a matter of copying pg_restore from one server to another?





-Mensaje original-
De: Adrian Klaver [mailto:adrian.kla...@aklaver.com]
Enviado el: miércoles, 16 de octubre de 2019 18:32
Para: Lizeth Solis Aramayo; pgsql-gene...@postgresql.org
Asunto: Re: CVE-2018-1058

On 10/16/19 1:05 PM, Lizeth Solis Aramayo wrote:
> What commands did you use to dump the 9.6.15 version and restore to
> the
> 9.6.5 version?
>
> Pg_dump -p 5433 -U postgres -Fc -d dbkerp -n param > param.dump And
> Pg_restore -p 5432 -U postgres -d dbkerp param.dump
>
>
> Server with pg_dump is Linux red hat 7.6
>
> Server with pg_restore is linux red hat 6.5
>
>
> In both servers I have postgresql 9.6, but in pg_dump is 9.6.15,   and in 
> pg_restore is 9.6.5.
>
>
> The pg_dump is correct,  everything goes ok.,  but when I do the
> pg_restore I gota n error :  pg_restore: [archiver] unsupported
> version (1.13) in file header

Forgot a fourth option:

4) Use the 9.6.15 pg_restore to restore the 9.6.15 pg_dump to the 9.6.5 
database.

>
>
> I searched solutions,  and I found that I can apply a patch CVE-2018-1058,  
> but I don¡t know how.
> How  to download,  and install,  I dont find documents about it.
>
> he reason why you can't upgrade the 9.6.5 to 9.6.15?  I dont know how.
>

--
Adrian Klaver
adrian.kla...@aklaver.com


Re: Changing PK on replicated database

2019-10-17 Thread Adrian Klaver

On 10/17/19 3:13 AM, PegoraroF10 wrote:

Correct, those records are on replica too. I´m just talking about the best
way to update those keys and all their dependent tables.
If I change them first on master they will not be replicated because it will
be an update and their pk will not be found on replica, correct ?


https://www.postgresql.org/docs/11/sql-altertable.html#SQL-CREATETABLE-REPLICA-IDENTITY
"DEFAULT (the default for non-system tables) records the old values of 
the columns of the primary key, if any."


AFAIK there are no restrictions on what you want to do (changing a PK):

https://www.postgresql.org/docs/11/logical-replication-restrictions.html

To test and confirm this, set up a test parent/child table combination 
and change a PK. As for the FK, I'm pretty sure that is handled by the 
changes on the primary child tables being replicated to the standby 
child tables.
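
A minimal sketch of such a test (table names are made up; both tables would 
already have to be part of the publication/subscription you use):

CREATE TABLE parent (id int PRIMARY KEY);
CREATE TABLE child  (id int PRIMARY KEY,
                     parent_id int REFERENCES parent (id) ON UPDATE CASCADE);
INSERT INTO parent VALUES (1);
INSERT INTO child  VALUES (10, 1);
-- on the publisher:
UPDATE parent SET id = 100 WHERE id = 1;   -- the FK cascade also updates child.parent_id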



If so, do I need to update manually on replica and later on master ?
And on replica server, all FK will be updated if I change that PK ? On
replica triggers are not triggered, are foreign key cascade ?
If I do this way, when I change that PK on master I´ll get a warning on
replica server because that PK did not exist anymore ?

The only question is, what are correct steps to do when you need to change a
PK on replicated database, just that.



--
Sent from: https://www.postgresql-archive.org/PostgreSQL-general-f1843780.html






--
Adrian Klaver
adrian.kla...@aklaver.com




Vacuum very big table - how the full vacuum works in background/internally?

2019-10-17 Thread Durumdara
Hello!

We stored some binaries in large objects.
Because of that, the table size is now 80 GB.
We deleted 80% of the records (lo_unlink), and autovacuum reclaimed space for
new elements. So the table doesn't grow anymore, but we have to free up more
space on this server.

We can delete 99% of these records, but to really reclaim free space on the
HDD we need to run a full vacuum.

For this operation we need to know how the PGSQL vacuum works in the
background.

Some admins told us that:
a.) It copies the table fully (minimum 66 GB of space needed).
b.) Then it deletes the unneeded data.
In this case we need extra empty space for a temporary period, and more time
(copying 66 GB could be slow even on SSD).

DBISAM/ElevateDB, ZIP file deletion, and VirtualBox VDI compaction work like this:
a.) It locks the original file/table.
b.) It copies the remaining elements to a new (initially empty) file.
c.) Then it removes the old file and uses the new one.
In this case we need only very limited empty space (3-4 GB), and the
operation is much faster (because of fewer HDD operations).

Please help me: how does the PGSQL full vacuum work internally? (the first case,
the second case, or something else?)

How should we (and the clients) prepare for this operation?
We must know this to avoid out-of-disk problems and too much downtime.

Thank you for your help!

Best regards
   dd


Re: Vacuum very big table - how the full vacuum works in background/internally?

2019-10-17 Thread Luca Ferrari
On Thu, Oct 17, 2019 at 5:10 PM Durumdara  wrote:
> Please help me, how the PGSQL full vacuum works internally? (1., 2. case, or 
> something else)
>

As far as I know a VACUUM FULL will rewrite the whole table without
copying (and therefore having to delete later) the tuples that are no longer
visible. I would be quite surprised if it duplicated the table and removed the
tuples afterwards.
Surely it is a very invasive command that locks the table and requires I/O.

> How we (and the clients) prepare to this operation?
> We must know it to avoid disk out problems, and too much off-time.

The best guess is that you are going to need almost double the
table size. Since you said that autovacuum is preventing the table from
growing, it could mean all the reclaimed space has been consumed by other
tuples, so I'm not sure vacuum full can give you back much space.
Have you looked at pg_stat_user_tables to see the dead tuples fraction?
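
For example, a quick check (the table name is just a placeholder):

SELECT relname, n_live_tup, n_dead_tup,
       round(n_dead_tup * 100.0 / nullif(n_live_tup + n_dead_tup, 0), 1) AS dead_pct
FROM pg_stat_user_tables
WHERE relname = 'my_big_table';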

Luca




Re: Can you please tell us how set this prefetch attribute in following lines.

2019-10-17 Thread Reid Thompson
On Thu, 2019-10-17 at 11:16 +, M Tarkeshwar Rao wrote:
> [EXTERNAL SOURCE]
> 
>  
> 
> Hi all,
>  
> How to fetch certain number of tuples from a postgres table.
>  
> Same I am doing in oracle using following lines by setting prefetch attribute.
>  
> For oracle
> // Prepare query
> if( OCIStmtPrepare( myOciStatement, myOciError, (text *)aSqlStatement,
> // Get statement type
>  OCIAttrGet( (void *)myOciStatement, OCI_HTYPE_STMT, &statement_type, 0, 
> OCI_ATTR_STMT_TYPE, myOciError );
> // Set prefetch count   
>   OCIAttrSet( myOciStatement, OCI_HTYPE_STMT, &prefetch, 0, 
> OCI_ATTR_PREFETCH_ROWS, myOciError );   
> // Execute query
> status = OCIStmtExecute( myOciServerCtx, myOciStatement, myOciError, iters, 
> 0, NULL, NULL, OCI_DEFAULT );
>  
>  
> For Postgres
>  
> Can you please tell us how set this prefetch attribute in following lines. Is 
> PQexec returns all the rows from the table?
>  
> mySqlResultsPG = PQexec(connection, aSqlStatement);
> if((PQresultStatus(mySqlResultsPG) == PGRES_FATAL_ERROR ) || 
> (PQstatus(connection) != CONNECTION_OK)){}
> if ((PQresultStatus(mySqlResultsPG) == PGRES_COMMAND_OK) || 
> (PQresultStatus(mySqlResultsPG) == PGRES_TUPLES_OK))
> {
> myNumColumns = PQnfields(mySqlResultsPG);
> myTotalNumberOfRowsInQueryResult = PQntuples(mySqlResultsPG);
> myCurrentRowNum = 0 ;
> }
> 
>  
> Regards
> Tarkeshwar
>  

declare a cursor and fetch

https://books.google.com/books?id=Nc5ZT2X5mOcC&pg=PA405&lpg=PA405&dq=pqexec+fetch&source=bl&ots=8P8w5JemcL&sig=ACfU3U0POGGSP0tYTrs5oxykJdOeffaspA&hl=en&sa=X&ved=2ahUKEwjevbmA2KPlAhXukOAKHaBIBcoQ6AEwCnoECDEQAQ#v=onepage&q=pqexec%20fetch&f=false




Re: Can you please tell us how set this prefetch attribute in following lines.

2019-10-17 Thread Laurenz Albe
On Thu, 2019-10-17 at 11:16 +, M Tarkeshwar Rao wrote:
> How to fetch certain number of tuples from a postgres table.
> 
> Same I am doing in oracle using following lines by setting prefetch attribute.
> 
> For oracle
> // Prepare query
> if( OCIStmtPrepare( myOciStatement, myOciError, (text *)aSqlStatement,
> // Get statement type
>  OCIAttrGet( (void *)myOciStatement, OCI_HTYPE_STMT, &statement_type, 0, 
> OCI_ATTR_STMT_TYPE, myOciError );
> // Set prefetch count   
>   OCIAttrSet( myOciStatement, OCI_HTYPE_STMT, &prefetch, 0, 
> OCI_ATTR_PREFETCH_ROWS, myOciError );   
> // Execute query
> status = OCIStmtExecute( myOciServerCtx, myOciStatement, myOciError, iters, 
> 0, NULL, NULL, OCI_DEFAULT );
> 
> For Postgres
> 
> Can you please tell us how set this prefetch attribute in following lines. Is 
> PQexec returns all the rows from the table?
> 
> mySqlResultsPG = PQexec(connection, aSqlStatement);
> 
> if((PQresultStatus(mySqlResultsPG) == PGRES_FATAL_ERROR ) || 
> (PQstatus(connection) != CONNECTION_OK)){}
> if ((PQresultStatus(mySqlResultsPG) == PGRES_COMMAND_OK) || 
> (PQresultStatus(mySqlResultsPG) == PGRES_TUPLES_OK))
> {
> myNumColumns = PQnfields(mySqlResultsPG);
> myTotalNumberOfRowsInQueryResult = PQntuples(mySqlResultsPG);
> myCurrentRowNum = 0 ;
> }

The C API doesn't offer anything like Oracle prefetch to force prefetching of a 
certain
number of result rows.

In the PostgreSQL code you show above, the whole result set will be fetched in 
one go
and cached in client RAM, so in a way you have "prefetch all".

The alternative that the C API gives you is PQsetSingleRowMode(), which, when 
called,
will return the result rows one by one, as they arrive from the server.
That disables prefetching.

If you want to prefetch only a certain number of rows, you can use the DECLARE 
and
FETCH SQL statements to create a cursor in SQL and fetch it in batches.
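
A minimal SQL-level sketch (cursor and table names are made up):

BEGIN;
DECLARE my_cur CURSOR FOR SELECT * FROM my_table;
FETCH 1000 FROM my_cur;    -- repeat until it returns fewer than 1000 rows
CLOSE my_cur;
COMMIT;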

This workaround has the down side that the current query shown in 
"pg_stat_activity"
or "pg_stat_statements" is always something like "FETCH 32", and you are left 
to guess
which statement actually caused the problem.


If you are willing to bypass the C API and directly speak the network protocol 
with
the server, you can do better.  This is documented in
https://www.postgresql.org/docs/current/protocol.html

The "Execute" ('E') message allows you to send an integer with the maximum 
number of
rows to return (0 means everything), so that does exactly what you want.

The backend will send a "PortalSuspended" ('s') to indicate that there is more 
to come,
and you keep sending "Execute" until you get a "CommandComplete" ('C').

If you feel hacky you could write C API support for that...


If you use that or a cursor, PostgreSQL will know that you are executing a 
cursor
and will plan its queries differently: it will assume that only 
"cursor_tuple_fraction"
(default 0.1) of your result set is actually fetched and prefer fast startup 
plans.
If you don't want that, because you are fetching batches as fast as you can 
without
lengthy intermediate client processing, you might want to set the parameter to 
1.0.

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com





releasing space

2019-10-17 Thread Julie Nishimura
Hello everybody,
We are running a PostgreSQL 9.6.2 cluster, master -> standby (streaming 
replication), with 22 TB of space (constantly struggling with the space, pruning the 
old data, but not fast enough). The biggest db takes 16 TB. So, we've copied it 
to another server, and now we would like to delete it from our original source 
to free up the space. What would be the right approach for this? Just issue a 
DROP DATABASE command (16 TB)? How long might it take? Should we do it gradually 
(drop the biggest tables first)? Any suggestions? Caveats?

Thank you!

-Julie


Using PostgreSQL for Machine Learning Data Pipelines

2019-10-17 Thread Pankaj Jangid
Hi,

I am working on a machine-learning project. Because of the available
study material in the ML area, the team is inclined towards Apache
Kafka, Apache Spark for data-pipelines and analytics.

Our requirement is to store huge amounts of continuously increasing data
that cannot fit into a single machine. The algorithms require data in
batches so it is not necessary to keep full data ready for
consumption. Using Kafka, the data can be distributed and fetched in
varying batch sizes as and when required.

I am more comfortable with PostgreSQL. And wanted to know more about
case-studies where PostgreSQL is deployed for ML use. Any pointers
referring to study material will be helpful. Please share in this
thread.

--
Thanks & Regards,
Pankaj Jangid




Re: Vacuum very big table - how the full vacuum works in background/internally?

2019-10-17 Thread Michael Lewis
>
> We must know it to avoid disk out problems, and too much off-time.
>

You may be interested in this extension- https://github.com/reorg/pg_repack


Re: releasing space

2019-10-17 Thread Adrian Klaver

On 10/17/19 10:20 AM, Julie Nishimura wrote:

Hello everybody,
We are running PostgreSQL 9.6.2 cluster master -> standby (streaming 
replication). 22 tb of space (constantly struggling with the space, 
pruning the old data, but not fast enough). The biggest db takes 16 tb. 
So, we've copied it to another server, and now we would like to delete 
it from our original source, to free up the space. What would be the 
right approach for this?  Just issue drop database command (16tb). How 
long it might take? Should we do it gradually (drop biggest tables 
first)? Any suggestions? Caveats?


https://www.postgresql.org/docs/11/sql-dropdatabase.html
"DROP DATABASE drops a database. It removes the catalog entries for the 
database and deletes the directory containing the data. It can only be 
executed by the database owner. Also, it cannot be executed while you or 
anyone else are connected to the target database. (Connect to postgres 
or any other database to issue this command.)


...

DROP DATABASE cannot be undone. Use it with care!"
^

Means there is no transaction involved unlike DROP TABLE, so I would say 
it is quicker.


If you want to see all that is involved:

https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/backend/commands/dbcommands.c;h=f47a13d1844ca36d81ba9815f807646a44750de4;hb=86ca7f81f7dfc17f04698189dec8973d358bc711

Start at line 767




Thank you!

-Julie



--
Adrian Klaver
adrian.kla...@aklaver.com




Re: Changing PK on replicated database

2019-10-17 Thread PegoraroF10
Very good, the records were changed and everything is OK.
Well, sorry for this stupid question, but some time ago we did something
similar to this and our replica stopped, so I was just afraid of that
happening again.



--
Sent from: https://www.postgresql-archive.org/PostgreSQL-general-f1843780.html




drop database

2019-10-17 Thread Julie Nishimura
Guys, can anybody who has dropped a PostgreSQL database larger than 1 TB reply 
and say how long it took?

Thanks


Re: drop database

2019-10-17 Thread Ron

On 10/17/19 3:44 PM, Julie Nishimura wrote:
Guys, can anybody reply if they drop any postgresql database larger than 1 
tb and how long did it take?


A lot has to do with how quickly the underlying file system can delete files.

To be honest, though... *does it really matter* how long it takes?  (If I 
were worried about it -- which I might be -- then I'd put a DROP DATABASE 
script in crontab and run it from there.)


--
Angular momentum makes the world go 'round.


Sv: drop database

2019-10-17 Thread Andreas Joseph Krogh

On Thursday 17 October 2019 at 22:44:15, Julie Nishimura <
juliez...@hotmail.com> wrote: 
Guys, can anybody reply if they drop any postgresql database larger than 1 tb 
and how long did it take? 

Thanks 

About 280ms: 

andreak@[local]:5432 12.0 andreak=# select 
pg_size_pretty(pg_database_size('rsm'));
 ┌────────────────┐
 │ pg_size_pretty │
 ├────────────────┤
 │ 26 GB          │
 └────────────────┘
 (1 row)

andreak@[local]:5432 12.0 andreak=# \timing 
 Timing is on.
andreak@[local]:5432 12.0 andreak=# drop DATABASE rsm; 
 DROP DATABASE
 Time: 280,355 ms


-- 
Andreas Joseph Krogh 
CTO / Partner - Visena AS 
Mobile: +47 909 56 963 
andr...@visena.com 
www.visena.com 


Sv: drop database

2019-10-17 Thread Andreas Joseph Krogh

On Thursday 17 October 2019 at 22:44:15, Julie Nishimura <
juliez...@hotmail.com> wrote: 
Guys, can anybody reply if they drop any postgresql database larger than 1 tb 
and how long did it take? 

Sorry, I misread your question as 1 GB (not TB)... 


-- 
Andreas Joseph Krogh 
CTO / Partner - Visena AS 
Mobile: +47 909 56 963 
andr...@visena.com 
www.visena.com 


Re: Sv: drop database

2019-10-17 Thread Ron

On 10/17/19 3:58 PM, Andreas Joseph Krogh wrote:
On Thursday 17 October 2019 at 22:44:15, Julie Nishimura 
<juliez...@hotmail.com> wrote:


Guys, can anybody reply if they drop any postgresql database larger
than 1 tb and how long did it take?
Thanks

About 280ms:
andreak@[local]:5432 12.0 andreak=# select 
pg_size_pretty(pg_database_size('rsm'));

┌────────────────┐
│ pg_size_pretty │
├────────────────┤
│ 26 GB          │
└────────────────┘
(1 row)

andreak@[local]:5432 12.0 andreak=# \timing
Timing is on.
andreak@[local]:5432 12.0 andreak=# drop DATABASE rsm;
DROP DATABASE
Time: 280,355 ms


1TB is 38x larger than 26GB.


--
Angular momentum makes the world go 'round.


Re: drop database

2019-10-17 Thread Julie Nishimura
I've seen some performance degradation on some other RDBMS systems when "drop 
database" was in progress. We need to drop a database which is 16 TB with minimal 
impact on our end users. There are 32 other databases with hundreds of 
connections on the same cluster, and I just want to release the space with 
minimal impact. Trying to find the best solution. I could even script 'truncate 
table' or 'drop table' in a loop if it helps. I don't have the luxury of testing such a 
large db drop in action.

Thanks!




From: Ron 
Sent: Thursday, October 17, 2019 1:59 PM
To: pgsql-general@lists.postgresql.org 
Subject: Re: drop database

On 10/17/19 3:44 PM, Julie Nishimura wrote:
Guys, can anybody reply if they drop any postgresql database larger than 1 tb 
and how long did it take?

A lot has to do with how quickly the underlying file system can delete files.

To be honest, though... does it really matter how long it takes?  (If I were 
worried about it -- which I might be -- then I'd put a DROP DATABASE script in 
crontab and run it from there.)

--
Angular momentum makes the world go 'round.


Re: drop database

2019-10-17 Thread Julie Nishimura
I think you don't swallow an entire elephant at once; you can choke on it. It is 
safer to eat it piece by piece :)


From: Julie Nishimura 
Sent: Thursday, October 17, 2019 2:19 PM
To: Ron ; pgsql-general@lists.postgresql.org 

Subject: Re: drop database

Ive seen some performance degradation on some other RDBMS systems when "drop 
database" was in progress. We need to drop database which is 16 tb with minimal 
impact for our end users. There are 32 other databases with hundreds of 
connections on the same cluster, and I just want to release the space with 
minimal impact. Trying to find the best solution. I could even script 'truncate 
table' or 'drop table' in the loop if it helps. I dont have luxury to test such 
large db drop in action.

Thanks!




From: Ron 
Sent: Thursday, October 17, 2019 1:59 PM
To: pgsql-general@lists.postgresql.org 
Subject: Re: drop database

On 10/17/19 3:44 PM, Julie Nishimura wrote:
Guys, can anybody reply if they drop any postgresql database larger than 1 tb 
and how long did it take?

A lot has to do with how quickly the underlying file system can delete files.

To be honest, though... does it really matter how long it takes?  (If I were 
worried about it -- which I might be -- then I'd put a DROP DATABASE script in 
crontab and run it from there.)

--
Angular momentum makes the world go 'round.


Re: drop database

2019-10-17 Thread Michael Lewis
Your plan to loop over tables and truncate them seems great if you are
worried. It seems simple to verify that space is being freed as you go, and
also easy to change tactics if the need arises.
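
A hedged sketch of such a loop, run while connected to the big database (the
schema filter and CASCADE are assumptions you may want to adjust):

DO $$
DECLARE
    r record;
BEGIN
    FOR r IN SELECT schemaname, tablename
             FROM pg_tables
             WHERE schemaname NOT IN ('pg_catalog', 'information_schema')
    LOOP
        EXECUTE format('TRUNCATE TABLE %I.%I CASCADE', r.schemaname, r.tablename);
    END LOOP;
END
$$;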

>


Re: drop database

2019-10-17 Thread Andrew Kerber
If you are decommissioning the database, why not just rm -rf the whole
system?

On Thu, Oct 17, 2019 at 4:31 PM Michael Lewis  wrote:

> Your plan to loop over tables and truncate them seems great if you are
> worried. It seems simple to verify that space is being freed as you go, and
> also easy to change tactics if the need arises.
>
>>

-- 
Andrew W. Kerber

'If at first you dont succeed, dont take up skydiving.'


Re: drop database

2019-10-17 Thread pabloa98
Perhaps you want to TRUNCATE TABLEs. That will mitigate any I/O impact


On Thu, Oct 17, 2019 at 3:13 PM Andrew Kerber 
wrote:

> If you are decommissioning the database, why not just rm -rf the whole
> system?
>
> On Thu, Oct 17, 2019 at 4:31 PM Michael Lewis  wrote:
>
>> Your plan to loop over tables and truncate them seems great if you are
>> worried. It seems simple to verify that space is being freed as you go, and
>> also easy to change tactics if the need arises.
>>
>>>
>
> --
> Andrew W. Kerber
>
> 'If at first you dont succeed, dont take up skydiving.'
>


stable for each row before insert trigger

2019-10-17 Thread Олег Самойлов
Hi all.

According to the documentation the words "VOLATILE, STABLE, IMMUTABLE" are somehow 
useful with trigger functions; for instance it is mentioned that an AFTER INSERT 
trigger should be VOLATILE. The question is how these words affect a FOR EACH 
ROW BEFORE INSERT trigger? Can there be some optimisation here?



Re: stable for each row before insert trigger

2019-10-17 Thread Adrian Klaver

On 10/17/19 4:31 PM, Олег Самойлов wrote:

Hi all.

According to documentation the words "VOLATILE, STABLE, IMMUTABLE" is somehow 
useful with trigger functions, for instance mentioned that the AFTER INSERT trigger 
should be VOLATILE. The question is how this words affect a for each row before insert 
trigger? Can be some optimisation here?




https://www.postgresql.org/docs/11/xfunc-volatility.html

--
Adrian Klaver
adrian.kla...@aklaver.com




Re: drop database

2019-10-17 Thread Ron

There are other databases in the cluster.

On 10/17/19 5:12 PM, Andrew Kerber wrote:

If you are decommissioning the database, why not just rm -rf the whole system?

On Thu, Oct 17, 2019 at 4:31 PM Michael Lewis > wrote:


Your plan to loop over tables and truncate them seems great if you are
worried. It seems simple to verify that space is being freed as you
go, and also easy to change tactics if the need arises.




--
Angular momentum makes the world go 'round.


Re:Using PostgreSQL for Machine Learning Data Pipelines

2019-10-17 Thread chris
Hi there,


There is a project named Apache MADlib, may help you.


http://madlib.apache.org


Thanks,
Chris
yuanzefuwa...@126.com


On 10/18/2019 02:04,Pankaj Jangid wrote:
Hi,

I am working on a machine-learning project. Because of the available
study material in the ML area, the team is inclined towards Apache
Kafka, Apache Spark for data-pipelines and analytics.

Our requirement is to store huge amounts of continuously increasing data
that cannot fit into a single machine. The algorithms require data in
batches so it is not necessary to keep full data ready for
consumption. Using Kafka, the data can be distributed and fetched in
varying batch sizes as and when required.

I am more comfortable with PostgreSQL. And wanted to know more about
case-studies where PostgreSQL is deployed for ML use. Any pointers
referring to study material will be helpful. Please share in this
thread.

--
Thanks & Regards,
Pankaj Jangid




Postgres Point in time Recovery (PITR),

2019-10-17 Thread Daulat Ram
Hello All,
Can you please share some ideas and scenarios for how we can do PITR in case of 
a disaster?


Thanks,


Re: stable for each row before insert trigger

2019-10-17 Thread Олег Самойлов
Eh, stupid answer. Of course I read the documentation. But what about you? Do 
you know what a trigger function is? (for each row before insert)

A trigger function doesn't have parameters; instead it gets special variables, OLD 
and NEW for instance, or TG_ARGV[].

If a trigger function depends only on NEW, is it IMMUTABLE? (With the same NEW it 
must return the same changed NEW.) If a trigger function makes SELECTs and changes 
only NEW, can it be declared STABLE? And so on.

> On 18 Oct 2019, at 2:41, Adrian Klaver  wrote:
> 
> On 10/17/19 4:31 PM, Олег Самойлов wrote:
>> Hi all.
>> According to documentation the words "VOLATILE, STABLE, IMMUTABLE" is 
>> somehow useful with trigger functions, for instance mentioned that the AFTER 
>> INSERT trigger should be VOLATILE. The question is how this words affect a 
>> for each row before insert trigger? Can be some optimisation here?
> 
> https://www.postgresql.org/docs/11/xfunc-volatility.html
> 
> -- 
> Adrian Klaver
> adrian.kla...@aklaver.com