Re: Temporarily disable not null constraints
Generally, you shouldn't be disabling your constraints, especially if you have multiple parallel processes accessing your DB. Instead, you should create them DEFERRABLE and have them checked at the end of your transaction.

Regarding your question about NOT NULL: it is not possible to defer it (please check this page: https://www.postgresql.org/docs/13/sql-set-constraints.html). You could alter the column to drop the constraint and then add it back, but all rows will have to be re-checked when it is restored, which I doubt you would like to see on a large table.

Regards,
Milos

On Thu, Dec 3, 2020 at 9:00 PM Nagaraj Raj wrote:
> Hi,
>
> Can we disable NOT NULL constraints temporarily in a session-based
> transaction, like we disable FK constraints?
>
> SET session_replication_role = 'replica';
> alter table table_name disable trigger user;
>
> The above two options are working for unique constraint violation exceptions.
>
> Thanks,
> Rj
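A minimal sketch of the two approaches discussed above, against a hypothetical orders/customers pair of tables (constraint and column names are made up for illustration):

    -- Deferrable FK: violations are only checked at COMMIT, not per statement.
    ALTER TABLE orders
        ADD CONSTRAINT orders_customer_fk
        FOREIGN KEY (customer_id) REFERENCES customers (id)
        DEFERRABLE INITIALLY IMMEDIATE;

    BEGIN;
    SET CONSTRAINTS orders_customer_fk DEFERRED;
    -- ... load rows in any order; the FK is verified at COMMIT ...
    COMMIT;

    -- NOT NULL cannot be deferred; the only "temporary disable" is drop/re-add,
    -- and re-adding forces a full scan of the table to re-validate every row.
    ALTER TABLE orders ALTER COLUMN customer_id DROP NOT NULL;
    -- ... do the work ...
    ALTER TABLE orders ALTER COLUMN customer_id SET NOT NULL;  -- full table scan here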
Re: Postgres performance comparing GCP and AWS
Hi Maurici,

As a starting point: can you make sure your GCP instance is configured the same way the AWS one is? Once you have done that, repeat the tests and post the outcome.

Thanks,
Milos

On Tue, Feb 23, 2021 at 11:14 PM Maurici Meneghetti <maurici.meneghe...@bixtecnologia.com.br> wrote:
> Hi everyone,
>
> I have 2 Postgres instances created from the same dump (backup), one on a
> GCP VM and the other on AWS RDS. The first instance takes 18 minutes and
> the second one takes less than 20 s to run this simple query:
>
> SELECT "Id", "DateTime", "SignalRegisterId", "Raw"
> FROM "SignalRecordsBlobs"
> WHERE "SignalSettingId" = 103
>   AND "DateTime" BETWEEN '2019-11-28T14:00:12.54020' AND '2020-07-23T21:12:32.24900';
>
> I've run this query a few times to make sure both should be reading data
> from cache. I expect my Postgres on GCP to be at least comparable to the one
> managed by AWS RDS, so that I can work on improvements in parallel and compare.
>
> DETAILS:
>
> Query plan for Postgres on the GCP VM:
>
> Bitmap Heap Scan on SignalRecordsBlobs SignalRecordsBlobs (cost=18.80..2480.65 rows=799 width=70) (actual time=216.766..776.032 rows=5122 loops=1)
>   Filter: (("DateTime" >= '2019-11-28 14:00:12.5402'::timestamp without time zone) AND ("DateTime" <= '2020-07-23 21:12:32.249'::timestamp without time zone))
>   Heap Blocks: exact=5223
>   Buffers: shared hit=423 read=4821
>   ->  Bitmap Index Scan on IDX_SignalRecordsBlobs_SignalSettingId (cost=0.00..18.61 rows=824 width=0) (actual time=109.000..109.001 rows=5228 loops=1)
>         Index Cond: ("SignalSettingId" = 103)
>         Buffers: shared hit=3 read=18
> Planning time: 456.315 ms
> Execution time: 776.976 ms
>
> Query plan for Postgres on AWS RDS:
>
> Bitmap Heap Scan on SignalRecordsBlobs SignalRecordsBlobs (cost=190.02..13204.28 rows=6213 width=69) (actual time=2.215..14.505 rows=5122 loops=1)
>   Filter: (("DateTime" >= '2019-11-28 14:00:12.5402'::timestamp without time zone) AND ("DateTime" <= '2020-07-23 21:12:32.249'::timestamp without time zone))
>   Heap Blocks: exact=5209
>   Buffers: shared hit=3290 read=1948
>   ->  Bitmap Index Scan on IDX_SignalRecordsBlobs_SignalSettingId (cost=0.00..188.46 rows=6405 width=0) (actual time=1.159..1.159 rows=5228 loops=1)
>         Index Cond: ("SignalSettingId" = 103)
>         Buffers: shared hit=3 read=26
> Planning time: 0.407 ms
> Execution time: 14.87 ms
>
> PostgreSQL version number running:
> • VM on GCP: PostgreSQL 11.10 (Debian 11.10-0+deb10u1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 8.3.0-6) 8.3.0, 64-bit
> • Managed by RDS on AWS: PostgreSQL 11.10 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-11), 64-bit
>
> How PostgreSQL was installed:
> • VM on GCP: already installed when the VM running Debian was created from the Google Console.
> • Managed by RDS on AWS: RDS managed the installation.
>
> Changes made to the settings in the postgresql.conf file:
> Here are some Postgres parameters that might be useful.
>
> Instance on VM on GCP (2 vCPUs, 2 GB memory, 800 GB disk):
> • effective_cache_size: 1496MB
> • maintenance_work_mem: 255462kB (close to 249MB)
> • max_wal_size: 1GB
> • min_wal_size: 512MB
> • shared_buffers: 510920kB (close to 499MB)
> • max_locks_per_transaction: 1000
> • wal_buffers: 15320kB (close to 15MB)
> • work_mem: 2554kB
> • effective_io_concurrency: 200
> • dynamic_shared_memory_type: posix
> On this instance we installed a Postgres extension called timescaledb to
> gain performance on other tables. Some of these parameters were set using
> recommendations from that extension.
>
> Instance managed by RDS (2 vCPUs, 2 GiB RAM, 250 GB disk, 750 IOPS):
> • effective_cache_size: 1887792kB (close to 1844MB)
> • maintenance_work_mem: 64MB
> • max_wal_size: 2GB
> • min_wal_size: 192MB
> • shared_buffers: 943896kB (close to 922MB)
> • max_locks_per_transaction: 64
>
> Operating system and version from running "uname -a":
> • VM on GCP: Linux {{{my instance name}}} 4.19.0-14-cloud-amd64 #1 SMP Debian 4.19.171-2 (2021-01-30) x86_64 GNU/Linux
> • Managed by AWS RDS: apparently Red Hat, as shown by SELECT version();
>
> Program used to connect to PostgreSQL: Python psycopg2.connect() to create the connection and pandas read_sql_query() to query using that connection.
>
> Thanks in advance
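One concrete way to follow Milos's suggestion of aligning the two instances is to list every parameter that deviates from its default on each machine and diff the output, and to capture buffer behaviour with EXPLAIN (ANALYZE, BUFFERS). This is a hedged sketch, not something from the thread:

    -- List every parameter that does not carry its built-in default value.
    SELECT name, setting, unit, source
    FROM pg_settings
    WHERE source NOT IN ('default', 'override')
    ORDER BY name;

    -- Re-run the problem query with buffer accounting so cache hits vs. disk
    -- reads can be compared directly between the two instances.
    EXPLAIN (ANALYZE, BUFFERS)
    SELECT "Id", "DateTime", "SignalRegisterId", "Raw"
    FROM "SignalRecordsBlobs"
    WHERE "SignalSettingId" = 103
      AND "DateTime" BETWEEN '2019-11-28T14:00:12.54020' AND '2020-07-23T21:12:32.24900';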
Re: procedure using CURSOR to insert is extremely slow
Hi Zoltan,

Is there any particular reason why you don't do a bulk insert, as in:

    insert into target_table
    select ... from source_table(s) (with joins etc.)

Regards,
Milos

On Thu, Apr 8, 2021 at 1:24 PM Szalontai Zoltán <szalontai.zol...@t-online.hu> wrote:
> Hi,
>
> We have a Class db.t2.medium database on AWS.
> We use a procedure to transfer data records from the Source to the Target schema.
> Transfers are identified by the log_id field in the target table.
>
> The procedure is:
> 1. All records are deleted from the Target table with the actual log_id value.
> 2. A complicated SELECT (numerous tables are joined) is created on the Source system.
> 3. A cursor is defined based on this SELECT.
> 4. We go through the CURSOR and insert new records into the Target table with this log_id.
>
> (Actually we have about 100 tables in the Target schema and the size of the
> database backup file is about 1 GByte. But we do the same for all the Target tables.)
>
> Our procedure is extremely slow for the first run: 3 days for the 100 tables.
> For the second and all subsequent runs it is fast enough (15 minutes).
> The only difference between the first run and all the others is that in the
> first run there are no records in the Target schema with this log_id.
>
> It seems that in the first step the DELETE operation frees some "space", and
> the INSERT operation in step 4 can reuse this space. But if no records are
> deleted in the first step, the procedure is extremely slow.
>
> To speed up the first run we found the following workaround:
> we inserted dummy records into the Target tables with the proper log_id, and
> indeed the first run became very fast again.
>
> Is there any "normal" way to speed up this procedure?
> In the production environment there will be only "first runs"; the same
> log_id will never be used again.
>
> Thanks,
> Zoltán

--
Milos Babic
http://www.linkedin.com/in/milosbabic
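A minimal sketch of the set-based pattern Milos is describing; the table names, columns, and log_id value are hypothetical stand-ins for the real Source/Target objects:

    -- Replace the per-row cursor loop with one set-based statement:
    -- delete any previous load for this log_id, then insert the whole
    -- result of the complicated SELECT in a single pass.
    BEGIN;

    DELETE FROM target_table
    WHERE log_id = 42;                      -- hypothetical log_id value

    INSERT INTO target_table (log_id, customer_id, amount, created_at)
    SELECT 42,                              -- same log_id stamped on every row
           s.customer_id,
           s.amount,
           s.created_at
    FROM   source_orders    s
    JOIN   source_customers c ON c.id = s.customer_id
    WHERE  s.created_at >= DATE '2021-01-01';

    COMMIT;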
Re: procedure using CURSOR to insert is extremely slow
Hi Zoltan,

You should try to rethink the logic behind the query. Numerous if/then/else branches can be transformed into CASE WHEN expressions, or a bunch of UNIONs, which, I'm 100% certain, will do much better than row-by-row insertion.

However, that is a general note. It still doesn't explain why the insert is faster after the deletions (?!). Is there any chance the set you are inserting in the second run is smaller (e.g. only a fraction of the original one)?

If possible, send over a fragment of the code and we can look into it.

Regards,
Milos

On Thu, Apr 8, 2021 at 3:56 PM Szalontai Zoltán <szalontai.zol...@t-online.hu> wrote:
> Hi Milos,
>
> Inside the loops, if/else branches and value transformations are used frequently.
> We could not solve it without using a cursor.
>
> Regards,
> Zoltán
>
> *From:* Milos Babic
> *Sent:* Thursday, April 8, 2021 2:31 PM
> *To:* Szalontai Zoltán
> *Cc:* Pgsql Performance
> *Subject:* Re: procedure using CURSOR to insert is extremely slow
>
> Hi Zoltan,
>
> is there any particular reason why you don't do a bulk insert, as in:
>
>     insert into target_table
>     select ... from source_table(s) (with joins etc.)
>
> Regards,
> Milos
>
> On Thu, Apr 8, 2021 at 1:24 PM Szalontai Zoltán <szalontai.zol...@t-online.hu> wrote:
>
> Hi,
>
> We have a Class db.t2.medium database on AWS.
> We use a procedure to transfer data records from the Source to the Target schema.
> Transfers are identified by the log_id field in the target table.
>
> The procedure is:
> 1. All records are deleted from the Target table with the actual log_id value.
> 2. A complicated SELECT (numerous tables are joined) is created on the Source system.
> 3. A cursor is defined based on this SELECT.
> 4. We go through the CURSOR and insert new records into the Target table with this log_id.
>
> (Actually we have about 100 tables in the Target schema and the size of the
> database backup file is about 1 GByte. But we do the same for all the Target tables.)
>
> Our procedure is extremely slow for the first run: 3 days for the 100 tables.
> For the second and all subsequent runs it is fast enough (15 minutes).
> The only difference between the first run and all the others is that in the
> first run there are no records in the Target schema with this log_id.
>
> It seems that in the first step the DELETE operation frees some "space", and
> the INSERT operation in step 4 can reuse this space. But if no records are
> deleted in the first step, the procedure is extremely slow.
>
> To speed up the first run we found the following workaround:
> we inserted dummy records into the Target tables with the proper log_id, and
> indeed the first run became very fast again.
>
> Is there any "normal" way to speed up this procedure?
> In the production environment there will be only "first runs"; the same
> log_id will never be used again.
>
> Thanks,
> Zoltán
>
>
> --
> Milos Babic
> http://www.linkedin.com/in/milosbabic

--
Milos Babic
http://www.linkedin.com/in/milosbabic
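A hedged sketch of the transformation Milos suggests: the per-row if/else value mapping from the cursor loop moves into CASE expressions inside a single INSERT ... SELECT. All table and column names here are made up for illustration, not taken from the thread:

    -- Per-row branching such as
    --   IF rec.status = 'A' THEN v_state := 'active';
    --   ELSIF rec.status = 'S' THEN v_state := 'suspended';
    --   ELSE v_state := 'unknown'; END IF;
    -- becomes a CASE expression evaluated in one set-based statement.
    INSERT INTO target_table (log_id, account_id, state, net_amount)
    SELECT 42,                                          -- hypothetical log_id
           s.account_id,
           CASE s.status
                WHEN 'A' THEN 'active'
                WHEN 'S' THEN 'suspended'
                ELSE          'unknown'
           END                                          AS state,
           CASE WHEN s.currency = 'USD' THEN s.amount
                ELSE s.amount * fx.rate                 -- another branch folded in
           END                                          AS net_amount
    FROM   source_accounts s
    LEFT   JOIN fx_rates fx ON fx.currency = s.currency;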