Re: How batch processing works

2024-10-04 Thread Lok P
On Mon, Sep 23, 2024 at 12:53 AM Peter J. Holzer wrote: > > > Thank you so much. > > I was expecting method-3(batch insert) to be the fastest or atleast as > you said > > perform with similar speed as method-2 (row by row insert with batch > commit) > > Oops, sorry! I wrote that the wrong way aro

Re: Grants not working on partitions

2024-09-28 Thread Lok P
On Sat, Sep 28, 2024 at 8:46 PM Adrian Klaver wrote: > On 9/28/24 04:02, Lok P wrote: > > Hi, > > While we are creating any new tables, we used to give SELECT privilege > > on the newly created tables using the below command. But we are seeing > > now , in case of pa

Grants not working on partitions

2024-09-28 Thread Lok P
Hi, While we are creating any new tables, we used to give SELECT privilege on the newly created tables using the below command. But we are seeing now , in case of partitioned tables even if we had given the privileges in the same fashion, the user is not able to query specific partitions but only t

Re: How batch processing works

2024-09-21 Thread Lok P
On Sun, Sep 22, 2024 at 12:46 AM Adrian Klaver wrote: > On 9/21/24 07:36, Peter J. Holzer wrote: > > On 2024-09-21 16:44:08 +0530, Lok P wrote: > > > > --- > > #!/usr/bin/

Re: How batch processing works

2024-09-21 Thread Lok P
On Sat, Sep 21, 2024 at 8:07 PM Peter J. Holzer wrote: > On 2024-09-21 16:44:08 +0530, Lok P wrote: > > But wondering why we don't see any difference in performance between > method-2 > > and method-3 above. > > The code runs completely inside the database. So

Re: How batch processing works

2024-09-21 Thread Lok P
On Sat, Sep 21, 2024 at 9:51 AM Michał Kłeczek wrote: > Hi, > > > On 19 Sep 2024, at 07:30, Lok P wrote: > > > [snip] > > > > Method-4 > > > > INSERT INTO parent_table VALUES (1, 'a'), (2, 'a'); > > INSERT INTO child_tabl

Re: How batch processing works

2024-09-19 Thread Lok P
s understanding correct? *Method-1- 00:01:44.48* *Method-2- 00:00:02.67* *Method-3- 00:00:02.39* https://gist.github.com/databasetech0073/8e9106757d751358c0c0c65a2374dbc6 On Thu, Sep 19, 2024 at 6:42 PM Lok P wrote: > > > On Thu, Sep 19, 2024 at 5:40 PM Ron Johnson > wrote: >

Re: How batch processing works

2024-09-19 Thread Lok P
On Thu, Sep 19, 2024 at 5:40 PM Ron Johnson wrote: > On Thu, Sep 19, 2024 at 5:24 AM Lok P wrote: > >> >> >> [snip] > >> DO $$ >> DECLARE >> num_inserts INTEGER := 10; >> batch_size INTEGER := 50; >> start_time TIMESTAMP

Re: How batch processing works

2024-09-19 Thread Lok P
On Thu, Sep 19, 2024 at 11:31 AM Ron Johnson wrote: > > [snip] > >> >> Method-4 >> >> INSERT INTO parent_table VALUES (1, 'a'), (2, 'a'); >> INSERT INTO child_table VALUES (1,1, 'a'), (1,2, 'a'); >> commit; >> > > If I knew that I had to load a structured input data file (even if it had > pare

How batch processing works

2024-09-18 Thread Lok P
Hello, Saw multiple threads around the same , so I want some clarification. As we know row by row is slow by slow processing , so in heavy write systems(say the client app is in Java) , people asked to do DMLS in batches rather in a row by row fashion to minimize the chatting or context switches be

Re: Faster data load

2024-09-05 Thread Lok P
On Fri, 6 Sept, 2024, 9:20 am Muhammad Usman Khan, wrote: > Hi, > > You can use pg_partman. If your table is partitioned, you can manage > partitions in parallel by distributing the load across partitions > concurrently. Or you can use citus. It can be an excellent solution, > especially for hand

Faster data load

2024-09-05 Thread Lok P
Hi, We are having a requirement to create approx 50 billion rows in a partition table(~1 billion rows per partition, 200+gb size daily partitions) for a performance test. We are currently using ' insert into select.. From or ;' method . We have dropped all indexes and constraints First and then

Re: Column type modification in big tables

2024-08-22 Thread Lok P
On Thu, 15 Aug, 2024, 9:18 pm Alban Hertroys, wrote: > > > On 15 Aug 2024, at 14:15, Lok P wrote: > > (…) > > > Hello Greg, > > > > In terms of testing on sample data and extrapolating, as i picked the > avg partition sizeof the table (which is ~20GB)

Re: Column type modification in big tables

2024-08-15 Thread Lok P
On Fri, Aug 16, 2024 at 2:04 AM Lok P wrote: > > On Thu, Aug 15, 2024 at 9:18 PM Alban Hertroys wrote: > >> >> > On 15 Aug 2024, at 14:15, Lok P wrote: >> >> (…) >> Are all those partitions critical, or only a relative few? >> >> If th

Re: Column type modification in big tables

2024-08-15 Thread Lok P
On Thu, Aug 15, 2024 at 9:18 PM Alban Hertroys wrote: > > > On 15 Aug 2024, at 14:15, Lok P wrote: > > (…) > Are all those partitions critical, or only a relative few? > > If that’s the case, you could: > 1) detach the non-critical partitions >

Re: Column type modification in big tables

2024-08-15 Thread Lok P
On Tue, Aug 13, 2024 at 7:39 PM Greg Sabino Mullane wrote: > On Sat, Aug 10, 2024 at 5:06 PM Lok P wrote: > > >> Can someone through some light , in case we get 5-6hrs downtime for this >> change , then what method should we choose for this Alter operation? >> >

Re: Column type modification in big tables

2024-08-13 Thread Lok P
On Tue, Aug 13, 2024 at 7:39 PM Greg Sabino Mullane wrote: > On Sat, Aug 10, 2024 at 5:06 PM Lok P wrote: > > >> Can someone through some light , in case we get 5-6hrs downtime for this >> change , then what method should we choose for this Alter operation? >> >

Re: Column type modification in big tables

2024-08-10 Thread Lok P
On Sat, Aug 10, 2024 at 5:47 PM sud wrote: > > > My 2cents. > If you have enough time then from a simplicity point of view, your single > line alter command may look good, but how are you going to see the amount > of progress it has made so far and how much time it's going to take to > finish. A

Re: Column type modification in big tables

2024-08-09 Thread Lok P
On Fri, Aug 9, 2024 at 9:19 PM Greg Sabino Mullane wrote: > On Fri, Aug 9, 2024 at 6:39 AM Lok P wrote: > >> Thank you so much. Will definitely try to evaluate this approach. The >> Only concern I have is , as this data is moving downstream with exactly the >> same d

Re: Column type modification in big tables

2024-08-09 Thread Lok P
On Fri, Aug 9, 2024 at 2:06 AM Greg Sabino Mullane wrote: > On Thu, Aug 8, 2024 at 2:39 PM Lok P wrote: > >> Can anybody suggest any other possible way here. >> > > Sure - how about not changing the column type at all? > > > one of the columns from varchar(

Re: Column type modification in big tables

2024-08-08 Thread Lok P
On Thu, Aug 8, 2024 at 1:06 AM sud wrote: > > > On Wed, Aug 7, 2024 at 5:00 PM Lok P wrote: > >> >> >> On Wed, Aug 7, 2024 at 4:51 PM sud wrote: >> >>> >>> >>> Others may correct but i think, If you don't have the FK defined

Re: Column type modification in big tables

2024-08-07 Thread Lok P
On Wed, Aug 7, 2024 at 4:51 PM sud wrote: > > > Others may correct but i think, If you don't have the FK defined on these > columns you can do below. > > > --Alter table add column which will be very fast within seconds as it will > just add it to the data dictionary. > > ALTER TABLE tab1 ADD COL

Column type modification in big tables

2024-08-07 Thread Lok P
Hello all, We have a postgres table which is a range partitions on a timestamp column having total size ~3TB holding a total ~5billion rows spanning across ~150 daily partitions and there are ~140+columns in the table. Also this table is a child to another partition table. And we have partition cre

Re: Partition boundary messed up

2024-07-27 Thread Lok P
Thank you. I understand that is going to take a lot of time as we already have billions of rows in the main table spread across 40+ partitions. Also this table is child to another parent partition table and so, it will take a lot of time to validate the FK constraint back for the new table. Is the

Partition boundary messed up

2024-07-24 Thread Lok P
Hello All, We normally operate on UTC timezone so we normally create partitions in UTC timezone so that each day partition starts from today's midnight UTC to next day's midnight UTC. The script looks something like below. And also that way reference partition tables are also created in a similar w

Re: Design strategy for table with many attributes

2024-07-05 Thread Lok P
On Fri, 5 Jul, 2024, 1:44 pm David Rowley, wrote: > On Fri, 5 Jul 2024 at 19:53, Lok P wrote: > > As David suggested it breaks if a row exceeds the 8k limit I. E a single > page size , will that still holds true if we have a column with JSON in it? > > You wouldn't be at

Re: Design strategy for table with many attributes

2024-07-05 Thread Lok P
, wrote: > On Jul 4, 2024, at 23:28, Lok P wrote: > > > > *"Note that you might want to split up the “parent” table if that > naturally groups its columns together for certain uses. In that case, you > could have the same pk on all the 1:1 tables you then have. In that

Re: Design strategy for table with many attributes

2024-07-04 Thread Lok P
On Fri, Jul 5, 2024 at 10:45 AM Guyren Howe wrote: > On Jul 4, 2024, at 22:07, Lok P wrote: > > If you stick to the principle of grouping columns in a table when you use > those columns together, you should be good. > > Note that you might want to split up the “parent” table

Re: Design strategy for table with many attributes

2024-07-04 Thread Lok P
On Fri, Jul 5, 2024 at 1:26 AM David G. Johnston wrote: > On Thu, Jul 4, 2024 at 12:38 PM Lok P wrote: > >> >> Should we break the single transaction into multiple tables like one main >> table and other addenda tables with the same primary key to join and fetch

Design strategy for table with many attributes

2024-07-04 Thread Lok P
Hello, In one of the applications we are getting transactions in messages/events format and also in files and then they are getting parsed and stored into the relational database. The number of attributes/columns each transaction has is ~900+. Logically they are part of one single transaction and s

Re: Creating big indexes

2024-06-09 Thread Lok P
On Sun, Jun 9, 2024 at 10:39 AM Lok P wrote: > > > On Sun, Jun 9, 2024 at 10:36 AM sud wrote: > >> >> You can first create the index on the table using the "On ONLY"keyword, >> something as below. >> >> CREATE INDEX idx ON ONLY tab(c

Re: Creating big indexes

2024-06-08 Thread Lok P
On Sun, Jun 9, 2024 at 10:36 AM sud wrote: > > You can first create the index on the table using the "On ONLY"keyword, > something as below. > > CREATE INDEX idx ON ONLY tab(col1); > > Then create indexes on each partition in "concurrently" from multiple > sessions in chunks. > > CREATE INDEX CO

Re: How to create efficient index in this scenario?

2024-06-08 Thread Lok P
On Sat, Jun 8, 2024 at 7:03 PM veem v wrote: > Hi , > It's postgres version 15.4. A table is daily range partitioned on a column > transaction_timestamp. It has a unique identifier which is the ideal for > primary key (say transaction_id) , however as there is a limitation in > which we have to i

Creating big indexes

2024-06-08 Thread Lok P
Hello, We have a few tables having size ~5TB and are partitioned on a timestamp column. They have ~90 partitions in them and are storing 90 days of data. We want to create a couple of indexes on those tables. They are getting the incoming transactions(mainly inserts) 24/7 , which are mostly happeni

Re: Question on roles and privileges

2024-05-09 Thread Lok P
For the initial installation the extensions may need superuser privileges. On Fri, May 10, 2024 at 10:04 AM yudhi s wrote: > Hello All, > We want to make sure to keep minimal privileges for the users based on > their roles and responsibility. We have one user group who will be working > on analy

How you make efficient design for CDC and book marking

2024-04-25 Thread Lok P
Hello, My understanding is that the replication tools normally rely on the database transaction logs to find the CDC/delta changes(Insert/Update/Delete) for tables and then move those delta changes to the target system/databases. Whatever may be the source database (It might be open source postgres

Re: Logging statement having any threat?

2024-04-21 Thread Lok P
On Sat, Apr 20, 2024 at 10:02 PM Adrian Klaver wrote: > > Have you tried?: > > > https://www.postgresql.org/docs/current/runtime-config-logging.html#RUNTIME-CONFIG-LOGGING-WHAT > > " > log_statement (enum) > ><...> > > The default is none. Only superusers and users with the appropriate SET >

Logging statement having any threat?

2024-04-20 Thread Lok P
Hello All, Its postgres version 15.4 and its RDS, in which our dev team gets the infrastructure code from another third party team which provides us base infrastructure code to build a postgres database, in which we will be able to do change DB parameter values etc whatever is mentioned in the file

Re: Issue with date/timezone conversion function

2024-04-09 Thread Lok P
On Tue, Apr 9, 2024 at 10:33 PM Tom Lane wrote: > Lok P writes: > > These tables are INSERT only tables and the data in the create_timestamp > > column is populated using the now() function from the application, which > > means it will always be incremental, and the histo

Re: Issue with date/timezone conversion function

2024-04-09 Thread Lok P
On Tue, Apr 9, 2024 at 9:26 PM Adrian Klaver wrote: > On 4/9/24 08:43, Lok P wrote: > > Hi All, > > It's version 15.4 of postgresql database. Every "date/time" data type > > attribute gets stored in the database in UTC timezone only. One of the > > suppo

Issue with date/timezone conversion function

2024-04-09 Thread Lok P
Hi All, It's version 15.4 of postgresql database. Every "date/time" data type attribute gets stored in the database in UTC timezone only. One of the support persons local timezone is "asia/kolkata" and that support person needs to fetch the count of transactions from a table- transaction_tab and s

Cron not running

2024-03-28 Thread Lok P
Hello All, In RDS postgres(version 15.4), we have scheduled partition maintenance through pg_partman and it's scheduled through pg_cron as below. The same script has been executed in dev and test environments, and we are seeing the cron job is scheduled in both environments because we see one recor

Re: Not able to purge partition

2024-03-24 Thread Lok P
On Sun, Mar 24, 2024 at 12:38 AM veem v wrote: > On Sat, 23 Mar 2024 at 23:08, Laurenz Albe > wrote: > >> On Sat, 2024-03-23 at 22:41 +0530, veem v wrote: >> > 1)As we see having foreign key defined is making the detach partition >> run >> > for minutes(in our case 5-10minutes for 60 million r

Re: Is partition pruning impacted by data type

2024-03-05 Thread Lok P
On Tue, Mar 5, 2024 at 1:09 AM sud wrote: > > However the question we have is , > 1)If there is any downside of having the partition key with "timestamp > with timezone" type? Will it impact the partition pruning of the queries > anyway by appending any run time "time zone" conversion function du

Re: Creating table and indexes for new application

2024-02-25 Thread Lok P
On Sun, 25 Feb, 2024, 1:05 am yudhi s, wrote: > > On Fri, Feb 23, 2024 at 5:26 PM sud wrote: > >> >> >> On Fri, 23 Feb, 2024, 1:28 pm yudhi s, >> wrote: >> >>> >>> >>> On Fri, 23 Feb, 2024, 1:20 pm sud, wrote: >>> On Fri, 23 Feb, 2024, 12:41 pm Laurenz Albe, wrote: >>

Re: Creating table and indexes for new application

2024-02-22 Thread Lok P
My 2 cents... Foreign key indexes are required for avoiding locking when deleting and updating the pk in parent. But insert only table may not have any issue. And we used to do this in other relational databases like oracle but not sure how different is this in PG. However considering your high tra

Re: Question on Table creation

2024-02-20 Thread Lok P
*"1)In this situation , do we have to drop the "Schema_Owner" and recreate it with all small letters? And then create the schema with small letters again?"* As per above question goes, I believe OP is not required to drop and recreate but has to just Rename the user something as below and that w

Users and object privileges maintenance

2024-02-17 Thread Lok P
Hello All, We were having past experience in Oracle and are newly getting moved to postgres database. In there we have schema which alternately also called as Users and the super user was sys/system through which the dev team never login but only DBA does. And DBA used to create some functional us

Re: How to do faster DML

2024-02-07 Thread Lok P
> On Tue, Feb 6, 2024 at 8:34 AM Greg Sabino Mullane > wrote: > >> On Sun, Feb 4, 2024 at 3:52 PM Lok P wrote: >> >>> What I mean was, we will definitely need the data for querying purposes >>> by the users, but just to keep the storage space incontrol (and

Re: How to do faster DML

2024-02-05 Thread Lok P
Thank you Greg. *"and keeping all your active stuff in cache. Since you have 127 columns, only pull back the columns you absolutely need for each query."* Understood the point. As postgres is a "row store" database, so keeping the size of the row lesser by making the number of columns to minimum,

Re: How to do faster DML

2024-02-05 Thread Lok P
On Tue, Feb 6, 2024 at 8:34 AM Greg Sabino Mullane wrote: > On Sun, Feb 4, 2024 at 3:52 PM Lok P wrote: > >> This table will always be queried on the transaction_date column as one >> of the filters criteria. But the querying/search criteria can span from a >>

Re: How to do faster DML

2024-02-05 Thread Lok P
On Mon, 5 Feb, 2024, 2:21 am Lok P, wrote: On Sun, Feb 4, 2024 at 9:18 PM Greg Sabino Mullane wrote: Partitioning will depend a lot on how you access the data. Is it always using that transaction_date column in the where clause? Can you share the table definition via \d? We've talked

Re: How to do faster DML

2024-02-04 Thread Lok P
On Sun, Feb 4, 2024 at 9:25 PM Ron Johnson wrote: > 1. Load the children before attaching them to the parent. > 2. Create the child indices, PK and FKs before attaching to the parent. > 3. Do step 2 in multiple parallel jobs. (cron is your friend.) > 4. Attach the children to the "naked" (no PK,

Re: How to do faster DML

2024-02-04 Thread Lok P
On Sun, Feb 4, 2024 at 9:18 PM Greg Sabino Mullane wrote: > Partitioning will depend a lot on how you access the data. Is it always > using that transaction_date column in the where clause? Can you share the > table definition via \d? We've talked about this table quite a bit, but not > actually

Re: How to do faster DML

2024-02-04 Thread Lok P
On Sun, Feb 4, 2024 at 8:14 PM Dennis White wrote: > I'm surprised no one has mentioned perhaps it's a good idea to partition > this table while adding the pk. By your own statements the table is > difficult to work with as is. Without partitioning the table, row inserts > would need to walk the

Re: How to do faster DML

2024-02-03 Thread Lok P
On Sun, Feb 4, 2024 at 12:50 AM Francisco Olarte wrote: > On Sat, 3 Feb 2024 at 19:29, Greg Sabino Mullane > wrote: > ... > > Given the size of your table, you probably want to divide that up. > > As long as nothing is changing the original table, you could do: > > > > insert into mytable2 selec

Re: How to do faster DML

2024-02-03 Thread Lok P
Ron Johnson 7:37 PM (1 hour ago) to *pgsql-general* On Sat, Feb 3, 2024 at 7:37 PM Ron Johnson wrote: > On Sat, Feb 3, 2024 at 8:55 AM Lok P wrote: > >> Apology. One correction, the query is like below. I. E filter will be on >> on ctid which I believe is equivalent of rowi

Re: How to do faster DML

2024-02-03 Thread Lok P
;=1 ); On Sat, Feb 3, 2024 at 5:50 PM Lok P wrote: > Hello All, > A non partitioned table having ~4.8 billion rows in it and having data > size as ~1.4TB , row size as ~313 bytes having ~127 columns in it. This has > got approx ~1billion+ duplicate rows inserted in it and we want to get

How to do faster DML

2024-02-03 Thread Lok P
Hello All, A non partitioned table having ~4.8 billion rows in it and having data size as ~1.4TB , row size as ~313 bytes having ~127 columns in it. This has got approx ~1billion+ duplicate rows inserted in it and we want to get the duplicate data removed for this table and create a PK/unique const

Re: Read write performance check

2023-12-22 Thread Lok P
imilar setups are >> already available. Can someone please guide me to some docs or blogs or >> sample scripts, on same please. >> >> On Wed, 20 Dec, 2023, 10:34 am Lok P, wrote: >> >>> As Rob mentioned, the syntax you posted is not correct. You need to &

Re: Read write performance check

2023-12-19 Thread Lok P
As Rob mentioned, the syntax you posted is not correct. You need to process or read a certain batch of rows like 1000 or 10k etc. Not all 100M at one shot. But again your uses case seems common one considering you want to compare the read and write performance on multiple databases with similar ta