Re: How to shorten a chain of logically replicated servers

2020-01-07 Thread Laurenz Albe
On Tue, 2019-12-31 at 15:51 -0800, Mike Lissner wrote:
> Hi, I'm trying to figure out how to shorten a chain of logically
> replicating servers. Right now we have three servers replicating like
> so:
> 
> A --> B --> C
> 
> And I'd like to remove B from the chain of replication so that I only have:
> 
> A --> C
> 
> Of course, doing this without losing data is the goal. If the
> replication to C breaks temporarily, that's fine, so long as all the
> changes on A make it to C eventually.
> 
> I'm not sure how to proceed with this. My best theory is:
> 
> 1. In a transaction, DISABLE the replication from A to B and start a
> new PUBLICATION on A that C will subscribe to in step ③ below. The
> hope is that this will simultaneously stop sending changes to B while
> starting a log of new changes that can later be sent to C.
> 
> 2. Let any changes queued on B flush to C. (How to know when they're
> all flushed?)
> 
> 3. Subscribe C to the new PUBLICATION created in step ①. Create the
> subscription with copy_data=False. This should send all changes to C
> that hadn't been sent to B, without sending the complete tables.
> 
> 4. DROP all replication to/from B (this is just cleanup; the incoming
> changes to B were disabled in step ①, and outgoing changes from B were
> flushed in step ②).
> 
> Does this sound even close to the right approach? Logical replication
> can be a bit finicky, so I'd love to have some validation of the
> general approach before I go down this road.

I don't think that will work.

Any changes on A that take place between step 1 and step 3 wouldn't be
replicated to C.

You'd have to suspend all data modification on A in that interval.
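One way to approximate that is to make A default to read-only transactions
for the duration (a sketch; this is only a default, so sessions can still
override it, and it does not affect transactions that are already running):

    -- on A, before step 1:
    ALTER SYSTEM SET default_transaction_read_only = on;
    SELECT pg_reload_conf();

    -- ... perform the switchover (steps 1 to 3) ...

    -- on A, afterwards:
    ALTER SYSTEM SET default_transaction_read_only = off;
    SELECT pg_reload_conf();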

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com





Re: How to reset a server error '25P02 in_failed_sql_transaction'

2020-01-07 Thread Laurenz Albe
On Tue, 2019-12-31 at 06:55 +0100, Matthias Apitz wrote:
> Due to incorrect human input in the GUI of our application, our
> application server (which, from the point of view of the PostgreSQL
> server, is the PostgreSQL client) issues a broken ESQL/C command to the
> PostgreSQL server. Here, from our own debug logging, are the command and
> the error message from the server:
> 
> 
> posDebug: [27.12.2019 15:20:59:043] stmt: SELECT ctid, * from titel_datum 
> WHERE desk>='*2019' AND feldnr IN ( 2 )   ORDER BY desk ASC , feldnr ASC
> posDebug: [27.12.2019 15:20:59:043] ESQL: PREPARE sid_titel_datum  FROM 
> :select_anw;
>  sqlca 
> sqlcode: -400
> sqlerrm.sqlerrml: 61
> sqlerrm.sqlerrmc: invalid input syntax for type date: »*2019« on line 918
> ...
> 
> All subsequent correct (SELECT ...) statements get rejected with, for 
> example: 
> 
> 
> ...
> posDebug: [27.12.2019 15:20:59:044] stmt: SELECT ctid, * from titel_datum 
> WHERE desk>='31.12.1900' AND feldnr IN ( 2 )   ORDER BY desk ASC , feldnr ASC
> posDebug: [27.12.2019 15:20:59:044] ESQL: PREPARE sid_titel_datum  FROM 
> :select_anw;
>  sqlca 
> sqlcode: -400
> sqlerrm.sqlerrml: 105
> sqlerrm.sqlerrmc: current transaction is aborted, commands ignored until end 
> of transaction block on line 918
> sqlerrd: 0 0 0 0 0 0
> sqlwarn: 0 0 0 0 0 0 0 0
> sqlstate: 25P02
> posSqlError===
> ...
> 
> Note: we are not in any kind of TRANSACTION block, like 'EXEC SQL BEGIN 
> TRANSACTION;'
> 
> What is the correct way to abort the "transaction" as requested by the PG
> server to return to normal operations?

Then you must have "autocommit" set to "off", and you must catch the error
in your code and issue an explicit rollback.

Maybe you should have a look at your transaction management in general, because
it is very important that COMMIT is issued once any transaction is completed.
COMMIT on an aborted transaction will automatically cause a rollback.
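In plain SQL terms the behaviour looks like this (illustration only):

    BEGIN;
    SELECT 1/0;   -- ERROR: division by zero
    SELECT 1;     -- ERROR: current transaction is aborted ... (SQLSTATE 25P02)
    ROLLBACK;     -- ends the failed transaction (COMMIT would also roll it back)
    SELECT 1;     -- works again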

Did you make sure that your code is not vulnerable to SQL injection?
What is the statement if the user enters a single quote for the date?

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com





Writing Postgres Extensions in C on Windows

2020-01-07 Thread İlyas Derse
I want to register C code with PostgreSQL on Windows. So I think I have to
make an extension for PostgreSQL. But I did not find anything about writing
extensions on Windows. Do you have an idea?

Thanks ...


RE: UPDATE many records

2020-01-07 Thread Mark Zellers
You don’t tell us if other users will be concurrently changing any of the 
records involved.  If you could guarantee that the table won’t be changed, you 
might be better off doing a CREATE TABLE table_new as SELECT … FROM table_old, 
dropping table_old, and finally renaming table_new.   Given the way Postgres 
handles updates, I would think that might perform significantly better.  Even 
if you did the work in batches (create a new table, insert/select from the old 
table, drop, rename), that could well be better.  Especially if you re-create 
the indexes after all the data is moved.
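
A sketch of that approach (table, column, and function names follow the quoted 
message below and are illustrative; list the real columns in the SELECT):

    BEGIN;
    CREATE TABLE table_name_new AS
        SELECT col1, col2, new_value() AS changed_field
        FROM table_name;
    -- recreate indexes and constraints on table_name_new here
    DROP TABLE table_name;
    ALTER TABLE table_name_new RENAME TO table_name;
    COMMIT;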



From: Israel Brewster 
Sent: Monday, January 6, 2020 10:36 AM
To: pgsql-general@lists.postgresql.org
Subject: UPDATE many records

Thanks to a change in historical data, I have a need to update a large number 
of records (around 50 million). The update itself is straightforward, as I can 
just issue an "UPDATE table_name SET changed_field=new_value();" (yes, 
new_value is the result of a stored procedure, if that makes a difference) 
command via psql, and it should work. However, due to the large number of 
records this command will obviously take a while, and if anything goes wrong 
during the update (one bad value in row 45 million, lost connection, etc), all 
the work that has been done already will be lost due to the transactional 
nature of such commands (unless I am missing something).

Given that each row update is completely independent of any other row, I have 
the following questions:

1) Is there any way to set the command such that each row change is committed 
as it is calculated?
2) Is there some way to run this command in parallel in order to better utilize 
multiple processor cores, other than manually breaking the data into chunks and 
running a separate psql/update process for each chunk? Honestly, manual 
parallelizing wouldn’t be too bad (there are a number of logical segregations I 
can apply), I’m just wondering if there is a more automatic option.
---
Israel Brewster
Software Engineer
Alaska Volcano Observatory
Geophysical Institute - UAF
2156 Koyukuk Drive
Fairbanks AK 99775-7320
Work: 907-474-5172
cell:  907-328-9145



Re: UPDATE many records

2020-01-07 Thread Israel Brewster
Nice approach! Using that method, I can do the select to generate the new table 
in only 6 minutes! I’m sure it helps that, through a slightly creative use of 
joins, I can re-write my function to actually be part of the select, just using 
built-in trig functions, rather than having to call a stored function that I 
wrote for each row. And at only 6 minutes, I don’t need to worry about things 
like committing in batches or parallelizing. 

I haven’t tried selecting to a new table yet, presumably writing the data back 
could take some time, but the concept seems like it should work well. Thanks!
---
Israel Brewster
Software Engineer
Alaska Volcano Observatory 
Geophysical Institute - UAF 
2156 Koyukuk Drive 
Fairbanks AK 99775-7320
Work: 907-474-5172
cell:  907-328-9145

> On Jan 7, 2020, at 10:09 AM, Mark Zellers  wrote:
> 
> You don’t tell us if other users will be concurrently changing any of the 
> records involved.  If you could guarantee that the table won’t be changed, 
> you might be better off doing a CREATE TABLE table_new as SELECT … FROM 
> table_old, dropping table_old, and finally renaming table_new.   Given the 
> way Postgres handles updates, I would think that might perform significantly 
> better.  Even if you did the work in batches (create a new table, 
> insert/select from the old table, drop, rename), that could well be better.  
> Especially if you re-create the indexes after all the data is moved.
>  
>  
>  
> From: Israel Brewster  
> Sent: Monday, January 6, 2020 10:36 AM
> To: pgsql-general@lists.postgresql.org
> Subject: UPDATE many records
>  
> Thanks to a change in historical data, I have a need to update a large number 
> of records (around 50 million). The update itself is straight forward, as I 
> can just issue an "UPDATE table_name SET changed_field=new_value();" (yes, 
> new_value is the result of a stored procedure, if that makes a difference) 
> command via psql, and it should work. However, due to the large number of 
> records this command will obviously take a while, and if anything goes wrong 
> during the update (one bad value in row 45 million, lost connection, etc), 
> all the work that has been done already will be lost due to the transactional 
> nature of such commands (unless I am missing something).
>  
> Given that each row update is completely independent of any other row, I have 
> the following questions:
>  
> 1) Is there any way to set the command such that each row change is committed 
> as it is calculated?
> 2) Is there some way to run this command in parallel in order to better 
> utilize multiple processor cores, other than manually breaking the data into 
> chunks and running a separate psql/update process for each chunk? Honestly, 
> manual parallelizing wouldn’t be too bad (there are a number of logical 
> segregations I can apply), I’m just wondering if there is a more automatic 
> option.
> ---
> Israel Brewster
> Software Engineer
> Alaska Volcano Observatory 
> Geophysical Institute - UAF 
> 2156 Koyukuk Drive 
> Fairbanks AK 99775-7320
> Work: 907-474-5172
> cell:  907-328-9145



Re: jsonb_set() strictness considered harmful to data

2020-01-07 Thread Pavel Stehule
Hi

On Mon, Jan 6, 2020 at 22:34, Andrew Dunstan <andrew.duns...@2ndquadrant.com> wrote:

> On Thu, Nov 28, 2019 at 2:15 PM Andrew Dunstan
>  wrote:
> >
> >
> > On 11/27/19 9:35 PM, Michael Paquier wrote:
> > > On Fri, Nov 15, 2019 at 09:45:59PM +0100, Pavel Stehule wrote:
> > >> Maybe ERRCODE_NULL_VALUE_NOT_ALLOWED, and "NULL is not allowed",
> > >> errdetail - a exception due setting "null_value_treatment" =>
> > >> raise_exception
> > >> and maybe some errhint - "Maybe you would to use Jsonb NULL -
> "null"::jsonb"
> > >>
> > >> I don't know, but in this case, the exception should be verbose. This
> is
> > >> "rich" function with lot of functionality
> > > @Andrew: This patch is waiting on input from you for a couple of days
> > > now.
> > >
> >
> >
>
>
> Updated version including docco and better error message.
>
> cheers
>
> andrew
>

I think my objections are solved. I have one small objection:

+ errdetail("exception raised due to \"null_value_treatment :=
'raise_exception'\""),
+ errhint("to avoid, either change the null_value_treatment argument or
ensure that an SQL NULL is not used")));

"null_value_treatment := 'raise_exception'\""

it uses the proprietary PostgreSQL syntax for named parameters. Better to use
the ANSI SQL syntax:

"null_value_treatment => 'raise_exception'\""

It is fixed in the attached patch.

The source compiles without warnings,
the docs build without warnings,
check-world passes without any problems.

I'll mark this patch as ready for committer

Thank you for your work

Pavel


>
> --
> Andrew Dunstan  https://www.2ndQuadrant.com
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 4b42f12862..72072e7545 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -12231,6 +12231,9 @@ table2-mapping
   
jsonb_set
   
+  
+   jsonb_set_lax
+  
   
jsonb_insert
   
@@ -12545,6 +12548,26 @@ table2-mapping
  [{"f1": 1, "f2": null, "f3": [2, 3, 4]}, 2]
 

+  
+   jsonb_set_lax(target jsonb, path text[], new_value jsonb , create_missing boolean , null_value_treatment text)
+ 
+   jsonb
+   
+If new_value is not null,
+behaves identically to jsonb_set. Otherwise behaves
+according to the value of null_value_treatment
+which must be one of 'raise_exception',
+'use_json_null', 'delete_key', or
+'return_target'. The default is
+'use_json_null'.
+   
+   jsonb_set_lax('[{"f1":1,"f2":null},2,null,3]', '{0,f1}',null)
+ jsonb_set_lax('[{"f1":99,"f2":null},2]', '{0,f3}',null, true, 'return_target')
+ 
+   [{"f1":null,"f2":null},2,null,3]
+ [{"f1": 99, "f2": null}, 2]
+
+   
   


diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 2fc3e3ff90..1cb2af1bcd 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1237,6 +1237,15 @@ LANGUAGE INTERNAL
 STRICT IMMUTABLE PARALLEL SAFE
 AS 'jsonb_set';
 
+CREATE OR REPLACE FUNCTION
+  jsonb_set_lax(jsonb_in jsonb, path text[] , replacement jsonb,
+create_if_missing boolean DEFAULT true,
+null_value_treatment text DEFAULT 'use_json_null')
+RETURNS jsonb
+LANGUAGE INTERNAL
+CALLED ON NULL INPUT IMMUTABLE PARALLEL SAFE
+AS 'jsonb_set_lax';
+
 CREATE OR REPLACE FUNCTION
   parse_ident(str text, strict boolean DEFAULT true)
 RETURNS text[]
diff --git a/src/backend/utils/adt/jsonfuncs.c b/src/backend/utils/adt/jsonfuncs.c
index ab5a24a858..4b5a0214dc 100644
--- a/src/backend/utils/adt/jsonfuncs.c
+++ b/src/backend/utils/adt/jsonfuncs.c
@@ -4395,6 +4395,70 @@ jsonb_set(PG_FUNCTION_ARGS)
 }
 
 
+/*
+ * SQL function jsonb_set_lax(jsonb, text[], jsonb, boolean, text)
+ */
+Datum
+jsonb_set_lax(PG_FUNCTION_ARGS)
+{
+	/* Jsonb	   *in = PG_GETARG_JSONB_P(0); */
+	/* ArrayType  *path = PG_GETARG_ARRAYTYPE_P(1); */
+	/* Jsonb	  *newval = PG_GETARG_JSONB_P(2); */
+	/* bool		create = PG_GETARG_BOOL(3); */
+	text   *handle_null;
+	char   *handle_val;
+
+	if (PG_ARGISNULL(0) || PG_ARGISNULL(1) || PG_ARGISNULL(3))
+		PG_RETURN_NULL();
+
+	/* could happen if they pass in an explicit NULL */
+	if (PG_ARGISNULL(4))
+		ereport(ERROR,
+(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("need delete_key, return_target, use_json_null, or raise_exception")));
+
+	/* if the new value isn't an SQL NULL just call jsonb_set */
+	if (! PG_ARGISNULL(2))
+		return jsonb_set(fcinfo);
+
+	handle_null = PG_GETARG_TEXT_P(4);
+	handle_val = text_to_cstring(handle_null);
+
+	if (strcmp(handle_val,"raise_exception") == 0)
+	{
+		ereport(ERROR,
+(errcode(ERRCODE_NULL_VALUE_NOT_ALLOWED),
+ errmsg("NULL is not allowed"),
+ errdetail("exception raised due to \"null_value_treatment => 'raise_exception'\""),
+ errhint("to avoid, either change the null_value_treatment argument or ensu

Re: UPDATE many records

2020-01-07 Thread Israel Brewster
One potential issue I just thought of with this approach: disk space. Will I be 
doubling the amount of space used while both tables exist? If so, that would 
prevent this from working - I don’t have that much space available at the 
moment.
---
Israel Brewster
Software Engineer
Alaska Volcano Observatory 
Geophysical Institute - UAF 
2156 Koyukuk Drive 
Fairbanks AK 99775-7320
Work: 907-474-5172
cell:  907-328-9145

> On Jan 7, 2020, at 10:09 AM, Mark Zellers  wrote:
> 
> You don’t tell us if other users will be concurrently changing any of the 
> records involved.  If you could guarantee that the table won’t be changed, 
> you might be better off doing a CREATE TABLE table_new as SELECT … FROM 
> table_old, dropping table_old, and finally renaming table_new.   Given the 
> way Postgres handles updates, I would think that might perform significantly 
> better.  Even if you did the work in batches (create a new table, 
> insert/select from the old table, drop, rename), that could well be better.  
> Especially if you re-create the indexes after all the data is moved.
>  
>  
>  
> From: Israel Brewster  
> Sent: Monday, January 6, 2020 10:36 AM
> To: pgsql-general@lists.postgresql.org
> Subject: UPDATE many records
>  
> Thanks to a change in historical data, I have a need to update a large number 
> of records (around 50 million). The update itself is straight forward, as I 
> can just issue an "UPDATE table_name SET changed_field=new_value();" (yes, 
> new_value is the result of a stored procedure, if that makes a difference) 
> command via psql, and it should work. However, due to the large number of 
> records this command will obviously take a while, and if anything goes wrong 
> during the update (one bad value in row 45 million, lost connection, etc), 
> all the work that has been done already will be lost due to the transactional 
> nature of such commands (unless I am missing something).
>  
> Given that each row update is completely independent of any other row, I have 
> the following questions:
>  
> 1) Is there any way to set the command such that each row change is committed 
> as it is calculated?
> 2) Is there some way to run this command in parallel in order to better 
> utilize multiple processor cores, other than manually breaking the data into 
> chunks and running a separate psql/update process for each chunk? Honestly, 
> manual parallelizing wouldn’t be too bad (there are a number of logical 
> segregations I can apply), I’m just wondering if there is a more automatic 
> option.
> ---
> Israel Brewster
> Software Engineer
> Alaska Volcano Observatory 
> Geophysical Institute - UAF 
> 2156 Koyukuk Drive 
> Fairbanks AK 99775-7320
> Work: 907-474-5172
> cell:  907-328-9145



Re: Writing Postgres Extensions in C on Windows

2020-01-07 Thread Thomas Munro
On Wed, Jan 8, 2020 at 4:32 AM İlyas Derse  wrote:
> I want to register C code to PostgreSql on Windows. So I think, I have to 
> make a extension for PostgreSql. But I did not find to written extension on 
> windows. Do you have an idea ?

I don't do Windows myself but this blog from Craig Ringer looks like a
good starting point:

https://www.2ndquadrant.com/en/blog/compiling-postgresql-extensions-visual-studio-windows/




Re: UPDATE many records

2020-01-07 Thread Israel Brewster
> On Jan 7, 2020, at 11:56 AM, Alan Hodgson  wrote:
> 
> On Tue, 2020-01-07 at 11:47 -0900, Israel Brewster wrote:
>> One potential issue I just thought of with this approach: disk space.
>> Will I be doubling the amount of space used while both tables exist?
>> If so, that would prevent this from working - I don’t have that much
>> space available at the moment.
> 
> The original update you planned would do that, too.
> 
> You probably need to just do the update in batches and vacuum the table
> between batches.
> 
> 

Really? Why? With the update I am only changing data - I’m not adding any 
additional data, so the total size should stay the same, right? I’m obviously 
missing something… :-)

---
Israel Brewster
Software Engineer
Alaska Volcano Observatory 
Geophysical Institute - UAF 
2156 Koyukuk Drive 
Fairbanks AK 99775-7320
Work: 907-474-5172
cell:  907-328-9145

> 





Re: UPDATE many records

2020-01-07 Thread Alan Hodgson
On Tue, 2020-01-07 at 11:47 -0900, Israel Brewster wrote:
> One potential issue I just thought of with this approach: disk space.
> Will I be doubling the amount of space used while both tables exist?
> If so, that would prevent this from working - I don’t have that much
> space available at the moment.

The original update you planned would do that, too.

You probably need to just do the update in batches and vacuum the table
between batches.
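
A sketch of that batching, assuming an integer primary key id (names and the
batch size are illustrative):

    UPDATE table_name SET changed_field = new_value()
     WHERE id >= 0 AND id < 1000000;
    VACUUM table_name;

    UPDATE table_name SET changed_field = new_value()
     WHERE id >= 1000000 AND id < 2000000;
    VACUUM table_name;
    -- ... and so on, one range at a time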





Re: UPDATE many records

2020-01-07 Thread Adrian Klaver

On 1/7/20 12:47 PM, Israel Brewster wrote:
One potential issue I just thought of with this approach: disk space. 
Will I be doubling the amount of space used while both tables exist? If 
so, that would prevent this from working - I don’t have that much space 
available at the moment.


It will definitely increase the disk space by at least the data in the 
new table. How much relative to the old table is going to depend on how 
aggressive the AUTOVACUUM/VACUUM is.


A suggestion for an alternative approach:

1) Create a table:

create table change_table(id int, changed_fld some_type)

where id is the PK from the existing table.

2) Run your conversion function against the existing table, changed so that 
it puts the new field value in change_table keyed to the id/PK. Probably do 
this in batches.


3) Once all the values have been updated, do an UPDATE set changed_field 
= changed_fld from change_table where existing_table.pk = change_table.id;
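
Spelled out, the three steps might look like this (a sketch; the column type 
and the conversion function new_value() are placeholders):

    CREATE TABLE change_table (id int PRIMARY KEY, changed_fld double precision);

    INSERT INTO change_table (id, changed_fld)
    SELECT id, new_value()        -- the conversion; run in id ranges if needed
    FROM existing_table;

    UPDATE existing_table AS e
       SET changed_field = c.changed_fld
      FROM change_table AS c
     WHERE e.id = c.id;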



---
Israel Brewster
Software Engineer
Alaska Volcano Observatory
Geophysical Institute - UAF
2156 Koyukuk Drive
Fairbanks AK 99775-7320
Work: 907-474-5172
cell:  907-328-9145



--
Adrian Klaver
adrian.kla...@aklaver.com




Re: UPDATE many records

2020-01-07 Thread Adrian Klaver

On 1/7/20 12:58 PM, Israel Brewster wrote:

On Jan 7, 2020, at 11:56 AM, Alan Hodgson  wrote:

On Tue, 2020-01-07 at 11:47 -0900, Israel Brewster wrote:

One potential issue I just thought of with this approach: disk space.
Will I be doubling the amount of space used while both tables exist?
If so, that would prevent this from working - I don’t have that much
space available at the moment.


The original update you planned would do that, too.

You probably need to just do the update in batches and vacuum the table
between batches.




Really? Why? With the update I am only changing data - I’m not adding any 
additional data, so the total size should stay the same, right? I’m obviously 
missing something… :-)


https://www.postgresql.org/docs/12/sql-vacuum.html

"VACUUM reclaims storage occupied by dead tuples. In normal PostgreSQL 
operation, tuples that are deleted or obsoleted by an update are not 
physically removed from their table; they remain present until a VACUUM 
is done. Therefore it's necessary to do VACUUM periodically, especially 
on frequently-updated tables."




---
Israel Brewster
Software Engineer
Alaska Volcano Observatory
Geophysical Institute - UAF
2156 Koyukuk Drive
Fairbanks AK 99775-7320
Work: 907-474-5172
cell:  907-328-9145










--
Adrian Klaver
adrian.kla...@aklaver.com




Re: UPDATE many records

2020-01-07 Thread Israel Brewster


> On Jan 7, 2020, at 12:01 PM, Adrian Klaver  wrote:
> 
> On 1/7/20 12:47 PM, Israel Brewster wrote:
>> One potential issue I just thought of with this approach: disk space. Will I 
>> be doubling the amount of space used while both tables exist? If so, that 
>> would prevent this from working - I don’t have that much space available at 
>> the moment.
> 
> It will definitely increase the disk space by at least the data in the new 
> table. How much relative to the old table is going to depend on how 
> aggressive the AUTOVACUUM/VACUUM is.
> 
> A suggestion for an alternative approach:
> 
> 1) Create a table:
> 
> create table change_table(id int, changed_fld some_type)
> 
> where is is the PK from the existing table.
> 
> 2) Run your conversion function against existing table with change to have it 
> put new field value in change_table keyed to id/PK. Probably do this in 
> batches.
> 
> 3) Once all the values have been updated, do an UPDATE set changed_field = 
> changed_fld from change_table where existing_table.pk = change_table.id;

Makes sense. Use the fast SELECT to create/populate the other table, then the 
update can just be setting a value, not having to call any functions. From what 
you are saying about updates though, I may still need to batch the UPDATE 
section, with occasional VACUUMs to maintain disk space. Unless I am not 
understanding the concept of “tuples that are obsoleted by an update”, which is 
possible.

> 
>> ---
>> Israel Brewster
>> Software Engineer
>> Alaska Volcano Observatory
>> Geophysical Institute - UAF
>> 2156 Koyukuk Drive
>> Fairbanks AK 99775-7320
>> Work: 907-474-5172
>> cell:  907-328-9145
> 
> 
> -- 
> Adrian Klaver
> adrian.kla...@aklaver.com





Re: UPDATE many records

2020-01-07 Thread Alan Hodgson
On Tue, 2020-01-07 at 11:58 -0900, Israel Brewster wrote:
> > 
> Really? Why? With the update I am only changing data - I’m not adding
> any additional data, so the total size should stay the same, right?
> I’m obviously missing something… :-)
> 

PostgreSQL keeps the old row until it gets vacuumed, as it needs to be
visible to other transactions. Not only that, but every index record
gets updated to point to the location of the new data row too (excluding
HOT), and those old index blocks also need to get vacuumed. And none of
those rows can get removed until your update finishes.

I know this isn't universally true with HOT and fillfactor etc. but with
an update this big I think it's safe to say most of the space will get
doubled.

Plus you'll get a ton of write-ahead logs.





Re: UPDATE many records

2020-01-07 Thread Israel Brewster
> On Jan 7, 2020, at 12:15 PM, Alan Hodgson  wrote:
> 
> On Tue, 2020-01-07 at 11:58 -0900, Israel Brewster wrote:
>>> 
>> Really? Why? With the update I am only changing data - I’m not adding
>> any additional data, so the total size should stay the same, right?
>> I’m obviously missing something… :-)
>> 
> 
> PostgreSQL keeps the old row until it gets vacuumed, as it needs to be
> visible to other transactions. Not only that, but every index record
> gets updated to point to the location of the new data row too (excluding
> HOT), and those old index blocks also need to get vacuumed. And none of
> those rows can get removed until your update finishes.
> 
> I know this isn't universally true with HOT and fillfactor etc. but with
> an update this big I think it's safe to say most of the space will get
> doubled.
> 
> Plus you'll get a ton of write-ahead logs.

Gotcha. Batches with VACUUM it is! Thanks for the info.

---
Israel Brewster
Software Engineer
Alaska Volcano Observatory 
Geophysical Institute - UAF 
2156 Koyukuk Drive 
Fairbanks AK 99775-7320
Work: 907-474-5172
cell:  907-328-9145

> 
> 
> 





Re: UPDATE many records

2020-01-07 Thread Adrian Klaver

On 1/7/20 1:10 PM, Israel Brewster wrote:



On Jan 7, 2020, at 12:01 PM, Adrian Klaver  wrote:

On 1/7/20 12:47 PM, Israel Brewster wrote:

One potential issue I just thought of with this approach: disk space. Will I be 
doubling the amount of space used while both tables exist? If so, that would 
prevent this from working - I don’t have that much space available at the 
moment.


It will definitely increase the disk space by at least the data in the new 
table. How much relative to the old table is going to depend on how aggressive 
the AUTOVACUUM/VACUUM is.

A suggestion for an alternative approach:

1) Create a table:

create table change_table(id int, changed_fld some_type)

where is is the PK from the existing table.

2) Run your conversion function against existing table with change to have it 
put new field value in change_table keyed to id/PK. Probably do this in batches.

3) Once all the values have been updated, do an UPDATE set changed_field = 
changed_fld from change_table where existing_table.pk = change_table.id;


Makes sense. Use the fast SELECT to create/populate the other table, then the 
update can just be setting a value, not having to call any functions. From what 
you are saying about updates though, I may still need to batch the UPDATE 
section, with occasional VACUUMs to maintain disk space. Unless I am not 
understanding the concept of “tuples that are obsoleted by an update”, which is 
possible.


You are not. For a more thorough explanation see:

https://www.postgresql.org/docs/12/routine-vacuuming.html#VACUUM-BASICS

How much space do you have to work with?

To get an idea of the disk space currently used by a table, see:

https://www.postgresql.org/docs/12/functions-admin.html#FUNCTIONS-ADMIN-DBOBJECT
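
For example (with an illustrative table name):

    SELECT pg_size_pretty(pg_table_size('existing_table'))          AS table_size,
           pg_size_pretty(pg_indexes_size('existing_table'))        AS index_size,
           pg_size_pretty(pg_total_relation_size('existing_table')) AS total_size;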






---
Israel Brewster
Software Engineer
Alaska Volcano Observatory
Geophysical Institute - UAF
2156 Koyukuk Drive
Fairbanks AK 99775-7320
Work: 907-474-5172
cell:  907-328-9145



--
Adrian Klaver
adrian.kla...@aklaver.com





--
Adrian Klaver
adrian.kla...@aklaver.com




Re: UPDATE many records

2020-01-07 Thread Israel Brewster
> On Jan 7, 2020, at 12:21 PM, Adrian Klaver  wrote:
> 
> On 1/7/20 1:10 PM, Israel Brewster wrote:
>>> On Jan 7, 2020, at 12:01 PM, Adrian Klaver  
>>> wrote:
>>> 
>>> On 1/7/20 12:47 PM, Israel Brewster wrote:
 One potential issue I just thought of with this approach: disk space. Will 
 I be doubling the amount of space used while both tables exist? If so, 
 that would prevent this from working - I don’t have that much space 
 available at the moment.
>>> 
>>> It will definitely increase the disk space by at least the data in the new 
>>> table. How much relative to the old table is going to depend on how 
>>> aggressive the AUTOVACUUM/VACUUM is.
>>> 
>>> A suggestion for an alternative approach:
>>> 
>>> 1) Create a table:
>>> 
>>> create table change_table(id int, changed_fld some_type)
>>> 
>>> where is is the PK from the existing table.
>>> 
>>> 2) Run your conversion function against existing table with change to have 
>>> it put new field value in change_table keyed to id/PK. Probably do this in 
>>> batches.
>>> 
>>> 3) Once all the values have been updated, do an UPDATE set changed_field = 
>>> changed_fld from change_table where existing_table.pk = change_table.id;
>> Makes sense. Use the fast SELECT to create/populate the other table, then 
>> the update can just be setting a value, not having to call any functions. 
>> From what you are saying about updates though, I may still need to batch the 
>> UPDATE section, with occasional VACUUMs to maintain disk space. Unless I am 
>> not understanding the concept of “tuples that are obsoleted by an update”, 
>> which is possible.
> 
> You are not. For a more thorough explanation see:
> 
> https://www.postgresql.org/docs/12/routine-vacuuming.html#VACUUM-BASICS 
> 
> 
> How much space do you have to work with?
> 
> To get an idea of the disk space currently used by table see;
> 
> https://www.postgresql.org/docs/12/functions-admin.html#FUNCTIONS-ADMIN-DBOBJECT
>  
> 

Oh, ok, I guess I was being overly paranoid on this front. Those functions 
would indicate that the table is only 7.5 GB, with another 8.7GB of indexes, 
for a total of around 16GB. So not a problem after all - I have around 100GB 
available.

Of course, that now leaves me with the mystery of where my other 500GB of disk 
space is going, since it is apparently NOT going to my DB as I had assumed, but 
solving that can wait.

Thanks again for all the good information and suggestions!
---
Israel Brewster
Software Engineer
Alaska Volcano Observatory 
Geophysical Institute - UAF 
2156 Koyukuk Drive 
Fairbanks AK 99775-7320
Work: 907-474-5172
cell:  907-328-9145
> 
>>> 
 ---
 Israel Brewster
 Software Engineer
 Alaska Volcano Observatory
 Geophysical Institute - UAF
 2156 Koyukuk Drive
 Fairbanks AK 99775-7320
 Work: 907-474-5172
 cell:  907-328-9145
>>> 
>>> 
>>> -- 
>>> Adrian Klaver
>>> adrian.kla...@aklaver.com
> 
> 
> -- 
> Adrian Klaver
> adrian.kla...@aklaver.com 


Re: UPDATE many records

2020-01-07 Thread Adrian Klaver

On 1/7/20 1:43 PM, Israel Brewster wrote:
On Jan 7, 2020, at 12:21 PM, Adrian Klaver > wrote:


On 1/7/20 1:10 PM, Israel Brewster wrote:
On Jan 7, 2020, at 12:01 PM, Adrian Klaver 
mailto:adrian.kla...@aklaver.com>> wrote:


On 1/7/20 12:47 PM, Israel Brewster wrote:
One potential issue I just thought of with this approach: disk 
space. Will I be doubling the amount of space used while both 
tables exist? If so, that would prevent this from working - I don’t 
have that much space available at the moment.


It will definitely increase the disk space by at least the data in 
the new table. How much relative to the old table is going to depend 
on how aggressive the AUTOVACUUM/VACUUM is.


A suggestion for an alternative approach:

1) Create a table:

create table change_table(id int, changed_fld some_type)

where is is the PK from the existing table.

2) Run your conversion function against existing table with change 
to have it put new field value in change_table keyed to id/PK. 
Probably do this in batches.


3) Once all the values have been updated, do an UPDATE set 
changed_field = changed_fld from change_table where 
existing_table.pk = change_table.id ;
Makes sense. Use the fast SELECT to create/populate the other table, 
then the update can just be setting a value, not having to call any 
functions. From what you are saying about updates though, I may still 
need to batch the UPDATE section, with occasional VACUUMs to maintain 
disk space. Unless I am not understanding the concept of “tuples that 
are obsoleted by an update”, which is possible.


You are not. For a more thorough explanation see:

https://www.postgresql.org/docs/12/routine-vacuuming.html#VACUUM-BASICS

How much space do you have to work with?

To get an idea of the disk space currently used by table see;

https://www.postgresql.org/docs/12/functions-admin.html#FUNCTIONS-ADMIN-DBOBJECT


Oh, ok, I guess I was being overly paranoid on this front. Those 
functions would indicate that the table is only 7.5 GB, with another 
8.7GB of indexes, for a total of around 16GB. So not a problem after all 
- I have around 100GB available.


Of course, that now leaves me with the mystery of where my other 500GB 
of disk space is going, since it is apparently NOT going to my DB as I 
had assumed, but solving that can wait.


Assuming you are on some form of Linux:

sudo du -h -d 1 /

Then you can drill down into the output of the above.



Thanks again for all the good information and suggestions!
---
Israel Brewster
Software Engineer
Alaska Volcano Observatory
Geophysical Institute - UAF
2156 Koyukuk Drive
Fairbanks AK 99775-7320
Work: 907-474-5172
cell:  907-328-9145





---
Israel Brewster
Software Engineer
Alaska Volcano Observatory
Geophysical Institute - UAF
2156 Koyukuk Drive
Fairbanks AK 99775-7320
Work: 907-474-5172
cell:  907-328-9145



--
Adrian Klaver
adrian.kla...@aklaver.com 



--
Adrian Klaver
adrian.kla...@aklaver.com 





--
Adrian Klaver
adrian.kla...@aklaver.com




Re: UPDATE many records

2020-01-07 Thread Israel Brewster
> 
> On Jan 7, 2020, at 12:57 PM, Adrian Klaver  wrote:
> 
> On 1/7/20 1:43 PM, Israel Brewster wrote:
>>> On Jan 7, 2020, at 12:21 PM, Adrian Klaver >> > wrote:
>>> 
>>> On 1/7/20 1:10 PM, Israel Brewster wrote:
> On Jan 7, 2020, at 12:01 PM, Adrian Klaver  > wrote:
> 
> On 1/7/20 12:47 PM, Israel Brewster wrote:
>> One potential issue I just thought of with this approach: disk space. 
>> Will I be doubling the amount of space used while both tables exist? If 
>> so, that would prevent this from working - I don’t have that much space 
>> available at the moment.
> 
> It will definitely increase the disk space by at least the data in the 
> new table. How much relative to the old table is going to depend on how 
> aggressive the AUTOVACUUM/VACUUM is.
> 
> A suggestion for an alternative approach:
> 
> 1) Create a table:
> 
> create table change_table(id int, changed_fld some_type)
> 
> where is is the PK from the existing table.
> 
> 2) Run your conversion function against existing table with change to 
> have it put new field value in change_table keyed to id/PK. Probably do 
> this in batches.
> 
> 3) Once all the values have been updated, do an UPDATE set changed_field 
> = changed_fld from change_table where existing_table.pk = change_table.id 
> ;
 Makes sense. Use the fast SELECT to create/populate the other table, then 
 the update can just be setting a value, not having to call any functions. 
 From what you are saying about updates though, I may still need to batch 
 the UPDATE section, with occasional VACUUMs to maintain disk space. Unless 
 I am not understanding the concept of “tuples that are obsoleted by an 
 update”, which is possible.
>>> 
>>> You are not. For a more thorough explanation see:
>>> 
>>> https://www.postgresql.org/docs/12/routine-vacuuming.html#VACUUM-BASICS
>>> 
>>> How much space do you have to work with?
>>> 
>>> To get an idea of the disk space currently used by table see;
>>> 
>>> https://www.postgresql.org/docs/12/functions-admin.html#FUNCTIONS-ADMIN-DBOBJECT
>> Oh, ok, I guess I was being overly paranoid on this front. Those functions 
>> would indicate that the table is only 7.5 GB, with another 8.7GB of indexes, 
>> for a total of around 16GB. So not a problem after all - I have around 100GB 
>> available.
>> Of course, that now leaves me with the mystery of where my other 500GB of 
>> disk space is going, since it is apparently NOT going to my DB as I had 
>> assumed, but solving that can wait.
> 
> Assuming you are on some form of Linux:
> 
> sudo du -h -d 1 /
> 
> Then you can drill down into the output of above.

Yep. Done it many times to discover a runaway log file or the like. Just mildly 
amusing that solving one problem leads to another I need to take care of as 
well… But at least the select into a new table should work nicely. Thanks!

---
Israel Brewster
Software Engineer
Alaska Volcano Observatory 
Geophysical Institute - UAF 
2156 Koyukuk Drive 
Fairbanks AK 99775-7320
Work: 907-474-5172
cell:  907-328-9145
> 
>> Thanks again for all the good information and suggestions!
>> ---
>> Israel Brewster
>> Software Engineer
>> Alaska Volcano Observatory
>> Geophysical Institute - UAF
>> 2156 Koyukuk Drive
>> Fairbanks AK 99775-7320
>> Work: 907-474-5172
>> cell:  907-328-9145
>>> 
> 
>> ---
>> Israel Brewster
>> Software Engineer
>> Alaska Volcano Observatory
>> Geophysical Institute - UAF
>> 2156 Koyukuk Drive
>> Fairbanks AK 99775-7320
>> Work: 907-474-5172
>> cell:  907-328-9145
> 
> 
> --
> Adrian Klaver
> adrian.kla...@aklaver.com 
>>> 
>>> 
>>> --
>>> Adrian Klaver
>>> adrian.kla...@aklaver.com 
> 
> 
> -- 
> Adrian Klaver
> adrian.kla...@aklaver.com





Re: Upgrade PostgreSQL 9.6 to 10.6

2020-01-07 Thread github kran
On Tue, Dec 10, 2019 at 2:53 AM Andreas Kretschmer 
wrote:

>
>
> Am 09.12.19 um 23:37 schrieb github kran:
> > Great, thanks Andreas, So this seems to be a good feature using the
> > core concept of replication. Can I use this extension and do the major
> > upgrade without paying ?.
>
> yes, this extension is free.
>
>
> Regards, Andreas
>

  Hello Andreas,
   As a follow-up on this thread, I want to ask whether this extension works on
PostgreSQL version 10.6, and also whether this extension can be installed on the
Aurora version of PostgreSQL. If it doesn't, do you have any
   other alternative to make this work on the Aurora engine?

   Thanks !!



>
> --
> 2ndQuadrant - The PostgreSQL Support Company.
> www.2ndQuadrant.com
>
>


Postgres streaming replication

2020-01-07 Thread Daulat Ram
Hello team,

I have to implement streaming replication for our prod environment.
Can you please share the list of parameters needed to set up PostgreSQL 11 
streaming replication with continuous archiving, and give clarity on the below.


  *   Do we need to enable the archive_mode and archive_command parameters on 
the standby side when implementing PostgreSQL 11 streaming replication with 
continuous archiving?



  *   What are the benefits of implementing streaming replication with 
continuous archiving over plain PostgreSQL streaming replication?
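
For reference, the parameters typically involved on PostgreSQL 11 look roughly
like the sketch below (values and paths are examples only, not a recommendation;
pg_hba.conf also needs a replication entry for the standby's user):

    # primary: postgresql.conf
    wal_level = replica
    max_wal_senders = 10
    wal_keep_segments = 64
    archive_mode = on
    archive_command = 'cp %p /mnt/wal_archive/%f'

    # standby: recovery.conf (PostgreSQL 11 still uses recovery.conf)
    standby_mode = 'on'
    primary_conninfo = 'host=primary_host user=replicator'
    restore_command = 'cp /mnt/wal_archive/%f %p'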


Thanks,





Re: jsonb_set() strictness considered harmful to data

2020-01-07 Thread Andrew Dunstan
On Wed, Jan 8, 2020 at 7:08 AM Pavel Stehule  wrote:
>
> Hi
>
> On Mon, Jan 6, 2020 at 22:34, Andrew Dunstan  wrote:
>>
>>
>> Updated version including docco and better error message.
>>
>> cheers
>>
>> andrew
>
>
> I think so my objections are solved. I have small objection
>
> + errdetail("exception raised due to \"null_value_treatment := 
> 'raise_exception'\""),
> + errhint("to avoid, either change the null_value_treatment argument or 
> ensure that an SQL NULL is not used")));
>
> "null_value_treatment := 'raise_exception'\""
>
> it use proprietary PostgreSQL syntax for named parameters. Better to use 
> ANSI/SQL syntax
>
> "null_value_treatment => 'raise_exception'\""
>
> It is fixed in attached patch
>
> source compilation without warnings,
> compilation docs without warnings
> check-world passed without any problems
>
> I'll mark this patch as ready for commiter
>
> Thank you for your work
>


Thanks for the review. I propose to commit this shortly.

cheers

andrew

-- 
Andrew Dunstan  https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Setting up an environment of EDB Advance server

2020-01-07 Thread Daulat Ram
Hi team,

We would need your help in setting up an environment of EDB Advanced Server 11 
on a Dev CentOS 7 VM.


  1.  We need to create three databases with separate tablespaces:

Test1
Test2
Test3
 What would be a good strategy for the setup with regard to the tablespaces?
Or can we create a separate cluster for each database?
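
For reference, one common layout is a single cluster with one tablespace per
database, e.g. (names and paths are examples only):

    CREATE TABLESPACE ts_test1 LOCATION '/data/tablespaces/test1';
    CREATE TABLESPACE ts_test2 LOCATION '/data/tablespaces/test2';
    CREATE TABLESPACE ts_test3 LOCATION '/data/tablespaces/test3';

    CREATE DATABASE test1 TABLESPACE ts_test1;
    CREATE DATABASE test2 TABLESPACE ts_test2;
    CREATE DATABASE test3 TABLESPACE ts_test3;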

Thanks,



Re: How to shorten a chain of logically replicated servers

2020-01-07 Thread Mike Lissner
> You'd have to suspend all data modification on A in that interval.

I know how to stop the DB completely, but I can't think of any obvious
ways to make sure that it doesn't get any data modification for a
period of time. Is there a trick here? This is feeling a bit hopeless.

Thanks for the response, Laurenz.

On Tue, Jan 7, 2020 at 3:11 AM Laurenz Albe  wrote:
>
> On Tue, 2019-12-31 at 15:51 -0800, Mike Lissner wrote:
> > Hi, I'm trying to figure out how to shorten a chain of logically
> > replicating servers. Right now we have three servers replicating like
> > so:
> >
> > A --> B --> C
> >
> > And I'd like to remove B from the chain of replication so that I only have:
> >
> > A --> C
> >
> > Of course, doing this without losing data is the goal. If the
> > replication to C breaks temporarily, that's fine, so long as all the
> > changes on A make it to C eventually.
> >
> > I'm not sure how to proceed with this. My best theory is:
> >
> > 1. In a transaction, DISABLE the replication from A to B and start a
> > new PUBLICATION on A that C will subscribe to in step ③ below. The
> > hope is that this will simultaneously stop sending changes to B while
> > starting a log of new changes that can later be sent to C.
> >
> > 2. Let any changes queued on B flush to C. (How to know when they're
> > all flushed?)
> >
> > 3. Subscribe C to the new PUBLICATION created in step ①. Create the
> > subscription with copy_data=False. This should send all changes to C
> > that hadn't been sent to B, without sending the complete tables.
> >
> > 4. DROP all replication to/from B (this is just cleanup; the incoming
> > changes to B were disabled in step ①, and outgoing changes from B were
> > flushed in step ②).
> >
> > Does this sound even close to the right approach? Logical replication
> > can be a bit finicky, so I'd love to have some validation of the
> > general approach before I go down this road.
>
> I don't think that will work.
>
> Any changes on A that take place between step 1 and step 3 wouldn't be
> replicated to C.
>
> You'd have to suspend all data modification on A in that interval.
>
> Yours,
> Laurenz Albe
> --
> Cybertec | https://www.cybertec-postgresql.com
>