Re: adding more space to the existing server

2019-08-04 Thread Julie Nishimura
Alban, thank you for your reply. Your suggestion makes sense, and I will be 
talking to our engineers about it. Currently we need to understand:

a) How do we break A -> B replication such that both can become independent primaries?
b) How do we reassign C from B -> C replication to A -> C replication?
c) Thoughts on why this might not be a good plan

Current:
  A replicates to B
  all requests go to A

Soon:
  A replicates to B -> cascading to C and D

Transition:
  break A replication to B such that both can become primary
  stop B replication to C, then set up A to replicate to C

End state:
  A replicates to C
  B replicates to D


We remove some of the databases from A and B, then reassign the traffic based 
on database selection.

I hope it all makes sense...

Thank you



From: Alban Hertroys 
Sent: Saturday, August 3, 2019 3:15 AM
To: Julie Nishimura 
Cc: Adrian Klaver; 
pgsql-general@lists.postgresql.org 
Subject: Re: adding more space to the existing server


> On 2 Aug 2019, at 21:45, Julie Nishimura  wrote:

> 1) We use streaming replication, and due to hardware limitation, we cannot 
> add more drives to the existing host. That is why we thought by breaking the 
> existing streaming replication (from a->b), instead of currently identical 
> standby (b), we can introduce twice larger host, then start the replication 
> to the newly larger host, and when it is caught up, break it again. Then 
> break rep again, make modifications to the 'a' host, making it larger, then 
> replicate b->a. After it is caught up, break the rep again and switch 
> master->standby (if necessary).

Let’s be clear, I’m in no way an expert on replication. In fact, I’ve only done 
(streaming) replication once and I managed to mess that up in a minor way 
(disabled the wrong service during failover, so data still went to the database 
I was attempting to replace for, like, 15 minutes).

> 2) I am not sure about the time, but we understand it requires 2 full 
> replication cycles, and it might be up to 2 weeks with no standby.

No standby situation? Murphy is probably just waiting for that to strike…
I recall a fairly recent story on the FreeBSD ML about someone on Malta doing a 
migration of a couple dozen terabytes from her main server (because of some 
failing disks in her RAID set) using her backup server to move data around 
(with backups removed to make room), when, due to an accident outside the 
building, an aerial 10 kV power line hit another power line in the ground, 
causing a fire in one UPS and frying the other one. Losing power at that point 
meant that the file systems (ZFS) on both servers ended up in an unrecoverable 
state with no backups. It didn’t help that the UPSes were at the bottom of the 
rack, with the heat and smoke going up into the servers. What are the chances, 
right? (And then it turned out that it is really hard to recover data from a 
ZFS file system in such a state, which is what her actual inquiry was about.)

I would definitely prefer to add a 3rd machine into the mix, even if it were 
just a temporary machine - a rental perhaps?

From there, I’m certain Adrian knows more about replication than I do. I’d go 
with the approach he suggested.

> 4) by pg_basebackup and restore
>
> As of now, we are thinking about possibly other solutions, as of splitting 
> existing 37 databases on the cluster into 2 hosts with their own standbys. 
> This solution requires breaking up existing replication as well. Can you 
> please point me to some document which lists all steps describing breaking up 
> the existing replication properly? we are using 9.6 postgres

I’m going to assume that you will have data coming in while this split is 
taking place and that you therefore cannot offline the entire set of databases 
for as long as this takes. If not, that would probably allow for a simpler 
(faster) scenario.

I think the easiest for this scenario would be to add two more machines (c and 
d) and replicate them off the current setup. You want that to happen as 
parallel as possible, so perhaps replicate c off a and d off b.

If you aren’t already using “replication slots”, I found that to make things 
both easier to understand and more reliable. You can query their status, for 
one thing.
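If slots are unfamiliar, a minimal sketch for 9.6 (the slot name "standby_c" is 
made up for illustration):

```sql
-- On the primary: create a physical replication slot for a new standby.
-- The slot makes the primary retain WAL until the standby has consumed it.
SELECT pg_create_physical_replication_slot('standby_c');

-- The standby then points at the slot in its recovery.conf:
--   primary_slot_name = 'standby_c'

-- Back on the primary, the slot's status can be queried at any time;
-- restart_lsn shows how far back WAL is being kept for this consumer.
SELECT slot_name, active, restart_lsn
FROM pg_replication_slots;
```

One caveat: an abandoned slot keeps WAL around indefinitely, so drop it with 
pg_drop_replication_slot() once a standby is decommissioned.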

Those replicas will take extra time of course (about double) because you’re 
replicating twice what you need, but I don’t think you can replicate parts of a 
cluster with your setup unless you go for a different replication approach (I 
think per-database replication requires statement-level (logical) replication?).

After that, decouple both sets into:
a —> b (your current machine)
c —> d (the new ones)

(Although any order should be fine, really, as long as they have caught up.)
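Whether a standby has caught up can be checked from its primary; a sketch for 
9.6, where the columns still carry the *_location names (they became *_lsn in 
version 10):

```sql
-- On the primary: one row per connected standby. When sent_location and
-- replay_location are equal, that standby has fully caught up.
SELECT application_name, state, sent_location, replay_location
FROM pg_stat_replication;
```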

At that point I would probably (temporarily) pause replication in at least one 
set and create a backup of that.

This is the point to start removing superfluous databases from a and c (so that 
a+c together make up the complete set of databases, with each database living 
on exactly one pair).

How to check if a field exists in NEW in trigger

2019-08-04 Thread Igal @ Lucee.org

I have the following statement in a trigger:

    new.email = lower(new.email);

When I try to update a record without setting the email column however, 
I get an error:


SQL Error [42703]: ERROR: record "new" has no field "email"
  Where: SQL statement "SELECT lower(new.email)"
PL/pgSQL function on_record_modified() line 26 at assignment

I have seen some hacks suggesting TRY/CATCH or converting to JSON and 
checking whether the field exists, but I would think that there's a better 
way to check if the field is in the NEW record, no?


Any ideas?  Thanks!

Igal Sapir
Lucee Core Developer
Lucee.org 



Re: How to check if a field exists in NEW in trigger

2019-08-04 Thread Adrian Klaver

On 8/4/19 3:52 PM, Igal @ Lucee.org wrote:

I have the following statement in a trigger:

     new.email = lower(new.email);

When I try to update a record without setting the email column however, 


Do you mean:

1) There is no actual email column?

2) There is an email column but no value for it?




I get an error:

SQL Error [42703]: ERROR: record "new" has no field "email"
   Where: SQL statement "SELECT lower(new.email)"
PL/pgSQL function on_record_modified() line 26 at assignment


Is:

on_record_modified()

a generic function that will be applied to many tables with differing 
schema?




I have seen some hacks suggesting TRY/CATCH or converting to a JSON and 
checking if the field exists, but I would think that there's a better 
way to check if the field is in the NEW record, no?


Use

TG_RELID

and look up the columns in:

https://www.postgresql.org/docs/11/catalog-pg-attribute.html

Or use the columns view in the information_schema:

https://www.postgresql.org/docs/11/infoschema-columns.html
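As a hedged sketch of that idea, using the trigger variables 
TG_TABLE_SCHEMA/TG_TABLE_NAME instead of TG_RELID (the function body is 
invented around the fragment from the original post):

```sql
CREATE OR REPLACE FUNCTION on_record_modified() RETURNS trigger AS $$
BEGIN
    -- Only touch NEW.email when the firing table actually has that column.
    IF EXISTS (SELECT 1
               FROM information_schema.columns
               WHERE table_schema = TG_TABLE_SCHEMA
                 AND table_name   = TG_TABLE_NAME
                 AND column_name  = 'email') THEN
        -- PL/pgSQL resolves record fields at run time, so this line only
        -- errors if it is actually reached.
        NEW.email := lower(NEW.email);
    END IF;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;
```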



Any ideas?  Thanks!

Igal Sapir
Lucee Core Developer
Lucee.org 




--
Adrian Klaver
adrian.kla...@aklaver.com




Re: How to check if a field exists in NEW in trigger

2019-08-04 Thread Thomas Kellerer
Igal @ Lucee.org wrote on 05.08.2019 at 00:52:
> I have the following statement in a trigger:
> 
>     new.email = lower(new.email);
> 
> When I try to update a record without setting the email column however, I get 
> an error:
> 
> SQL Error [42703]: ERROR: record "new" has no field "email"
>   Where: SQL statement "SELECT lower(new.email)"
> PL/pgSQL function on_record_modified() line 26 at assignment
> 
> I have seen some hacks suggesting TRY/CATCH or converting to a JSON
> and checking if the field exists, but I would think that there's a
> better way to check if the field is in the NEW record, no?


I assume that using to_jsonb(new) and then checking for the key in the JSON 
value will be faster than querying e.g. information_schema.columns 
or pg_catalog.pg_attribute.
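A sketch of that approach, keeping the on_record_modified name from the thread 
(assumptions: a BEFORE trigger on a server with jsonb, i.e. 9.4+):

```sql
CREATE OR REPLACE FUNCTION on_record_modified() RETURNS trigger AS $$
BEGIN
    -- to_jsonb(NEW) exposes the row's columns as top-level JSON keys,
    -- so the ? operator tests for the column without touching the catalogs.
    IF to_jsonb(NEW) ? 'email' THEN
        NEW.email := lower(NEW.email);
    END IF;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;
```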




Re: How to check if a field exists in NEW in trigger

2019-08-04 Thread Pavel Stehule
On Mon, 5 Aug 2019 at 07:55, Thomas Kellerer wrote:

> Igal @ Lucee.org wrote on 05.08.2019 at 00:52:
> > I have the following statement in a trigger:
> >
> > new.email = lower(new.email);
> >
> > When I try to update a record without setting the email column however,
> I get an error:
> >
> > SQL Error [42703]: ERROR: record "new" has no field "email"
> >   Where: SQL statement "SELECT lower(new.email)"
> > PL/pgSQL function on_record_modified() line 26 at assignment
> >
> > I have seen some hacks suggesting TRY/CATCH or converting to a JSON
> > and checking if the field exists, but I would think that there's a
> > better way to check if the field is in the NEW record, no?
>
>
> I assume using to_jsonb(new) and then check for the key in the json value
> will be faster than checking e.g. information_schema.column
> or pg_catalog.pg_attribute
>

An alternative solution is to use a language other than PL/pgSQL, such as
PL/Python or PL/Perl (where this is a simple task). PL/pgSQL is not designed
for very dynamic code; its triggers are meant for a stable schema, where you
know whether the table has an email column or not.

Catching errors in PL/pgSQL is a relatively expensive solution, due to the
savepoint overhead it incurs in the background.
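A sketch of the PL/Python route (assuming the plpython3u extension is
installed): trigger functions receive the row as the TD["new"] dictionary, so
a plain key test suffices.

```sql
CREATE OR REPLACE FUNCTION on_record_modified() RETURNS trigger AS $$
    # TD["new"] maps column names to values for the row being written.
    if "email" in TD["new"] and TD["new"]["email"] is not None:
        TD["new"]["email"] = TD["new"]["email"].lower()
        return "MODIFY"   # tell PostgreSQL the row was changed
    return None           # leave the row untouched
$$ LANGUAGE plpython3u;
```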

Regards

Pavel


Compression In Postgresql 9.6

2019-08-04 Thread Shital A
Hello,

Need inputs on below:

We are working on setting up a new, highly transactional (100k TPS) OLTP
system for payments using blockchain, with PostgreSQL 9.6 as the DB on RHEL
7.6. The Postgres version is 9.6 rather than the latest because of the specs
of the blockchain component.

There is a requirement for data compression at the DB level. Please provide
your inputs on how this can best be achieved.

I checked the built-in TOAST mechanism; it compresses data only when a row
exceeds the ~2 kB threshold. If the data values are small, then even with a
billion records they won't be compressed; is that understanding correct?
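That understanding can be sanity-checked with a throwaway table (names made
up); pg_column_size() reports the stored size, which drops well below
octet_length() once TOAST's pglz compression kicks in:

```sql
CREATE TABLE toast_demo (id int, payload text);
INSERT INTO toast_demo VALUES
    (1, 'x'),                   -- tiny value: stored as-is
    (2, repeat('abc', 10000));  -- 30 kB of repetitive text: compressed
SELECT id,
       octet_length(payload)   AS raw_bytes,
       pg_column_size(payload) AS stored_bytes
FROM toast_demo;
```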

Are there any suggestions for transparently compressing older data,
irrespective of row size?

Thanks.