Re: Insert into on conflict, data size up to 3 billion records
Exactly. For now, since the table is already partitioned, I created 50 different connections and tried updating the target table by directly querying from the source partition tables. Are there any other techniques I can use to speed this up?

Also, when we use an ON CONFLICT statement for both insert and update, does Postgres use batching internally (committing every n records, etc.), or does it update all records at once? In that case, does it create a version for each record and swap them all at once? I'm wondering how atomicity is guaranteed. Also, if I have to do batching other than selecting from individual partitions, does doing it in batches of n records help?

On Sat, Feb 13, 2021 at 12:04 PM Ron wrote:
> On 2/12/21 12:46 PM, Karthik Kumar Kondamudi wrote:
> > Hi,
> > I'm looking for suggestions on how I can improve the performance of the merge statement below. We have a batch process that batch-loads data into the _batch tables using Postgres, and the task is to update the main target tables if the record exists, else insert into them. Sometimes these batch tables can grow to 5 billion records. Here is the current scenario:
> > target_table_main has 700,070,247 records and is hash partitioned into 50 chunks. It has an index on logical_ts, and the batch table has 2,715,020,546 records (close to 3 billion), so I'm dealing with a huge set of data and looking to do this in the most efficient way.
>
> Many times, I have drastically sped up batch processing by #1 partitioning on the same field as an index, and #2 pre-sorting the input data by that field.
>
> That way, you get excellent "locality of data" (meaning lots of writes to the same hot bits of cache, which later get asynchronously flushed to disk). Unfortunately for your situation, the purpose of hash partitioning is to *reduce* locality of data. (Sometimes that's useful, but *not* when processing batches.)
>
> --
> Angular momentum makes the world go 'round.

--
Regards,
Karthik K Kondamudi
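For reference on the atomicity question above: INSERT ... ON CONFLICT DO UPDATE is a single SQL statement, so Postgres does not commit it in internal sub-batches; all rows either take effect together at commit or not at all, and each updated row gets a new MVCC row version that becomes visible atomically at commit. A minimal sketch, with illustrative table and column names rather than the poster's actual schema:

-- illustrative schema; assumes at most one batch row per id
create table target_table_main (id bigint primary key, payload text, logical_ts timestamptz);
create table batch_table (id bigint, payload text, logical_ts timestamptz);

-- one atomic upsert over the whole batch
insert into target_table_main as t (id, payload, logical_ts)
select id, payload, logical_ts from batch_table
on conflict (id) do update
    set payload = excluded.payload,
        logical_ts = excluded.logical_ts;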
Re: Insert into on conflict, data size up to 3 billion records
Yes, I'm using \copy to load the batch table. With the new design we are doing, we expect fewer updates going forward and more inserts. One of the target columns I'm updating is indexed, so I will drop the index and try it out. Also, per your suggestion above, splitting the ON CONFLICT into separate insert and update passes is performant, but in order to split the records into (low, high) batches I first need to do a count of the primary key on the batch tables.

On Mon, Feb 15, 2021 at 11:06 AM Rob Sargent wrote:
>
> On 2/15/21 11:41 AM, Karthik K wrote:
> > Exactly. For now, since the table is already partitioned, I created 50 different connections and tried updating the target table by directly querying from the source partition tables. Are there any other techniques I can use to speed this up? Also, when we use an ON CONFLICT statement for both insert and update, does Postgres use batching internally (committing every n records, etc.), or does it update all records at once? In that case, does it create a version for each record and swap them all at once? I'm wondering how atomicity is guaranteed. Also, if I have to do batching other than selecting from individual partitions, does doing it in batches of n records help?
>
> What is your ratio of inserts versus updates? Can you separate the inserts and updates? Is the target table indexed other than on the primary key? If so, can those indexes be dropped?
>
> Assuming you use \copy to load the batch tables, I've found this strategy to be effective:
>
> index batch on id
>
> -- update first
> begin
> update target t set "all fields" from batch b where t.id = b.id and b.id between "low" and "high"
> commit
> increment low/high, avoid overlap; repeat
>
> -- insert
> begin
> insert into target select b.* from batch b where not exists (select 1 from target v where b.id = v.id) and b.id between "low" and "high"
> commit
> increment low/high, avoid overlap; repeat

--
Regards,
Karthik K Kondamudi
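A cleaned-up, runnable version of the strategy quoted above, using illustrative names (target and batch tables keyed on a bigint id, with payload standing in for "all fields") and a fixed id window per pass:

create index if not exists batch_id_idx on batch (id);

-- update pass: one id range per transaction; advance low/high each pass, avoid overlap
begin;
update target t
   set payload = b.payload
  from batch b
 where t.id = b.id
   and b.id between 1 and 1000000;
commit;

-- insert pass over the same id range
begin;
insert into target (id, payload)
select b.id, b.payload
  from batch b
 where not exists (select 1 from target v where v.id = b.id)
   and b.id between 1 and 1000000;
commit;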
Migrating from Oracle - Implicit Casting Issue
Hi Team,

We are migrating from Oracle 12c to Aurora Postgres 13 and running into implicit casting issues.

Oracle is able to implicitly cast the bind values of prepared statements executed from the application to the appropriate type (String -> Number, String -> Date, Number -> String, etc.) when there is a mismatch between the Java data type and the column datatype.

For example: if the datatype of a column is defined as Number and the application sends the bind value as a String (with single quotes in the query), Oracle is able to implicitly cast it to Number, execute the query, and return the results.

The same is not true with Postgres, and we get the exception below:

org.postgresql.util.PSQLException: ERROR: operator does not exist: bigint = character varying
Hint: No operator matches the given name and argument types. You might need to add explicit type casts.

We found a Postgres JDBC driver property, stringtype=unspecified, which appears to solve this problem, and we have the following questions.
https://jdbc.postgresql.org/documentation/83/connect.html

Could you please let us know the following?

Q1) Will configuring this stringtype property introduce overhead on Postgres, leading to performance issues?
Q2) Does setting this attribute have any other implications for the data in the DB?
Q3) Is there any plan to deprecate / stop supporting this attribute in future Aurora Postgres releases?

--
Karthik klv
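The error can be reproduced directly in psql, which shows what is going on: an untyped literal is resolved from context, while a value explicitly typed as varchar is not implicitly cast to bigint:

select 42::bigint = '42';            -- ok: the untyped literal is resolved as bigint
select 42::bigint = '42'::varchar;   -- ERROR: operator does not exist: bigint = character varying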
Re: Migrating from Oracle - Implicit Casting Issue
Hi David,

Thanks for the quick response. Making sure I got it right: you mean the Postgres DB server when you say server, right? IIUC, by configuring this property the driver will not declare the parameter type, leaving it to the Postgres server to resolve the appropriate type. Will this have any performance implications on the Postgres server that we need to worry about?

Thanks,
Karthik K L V

On Tue, Jul 19, 2022 at 12:12 PM David G. Johnston <david.g.johns...@gmail.com> wrote:
> On Monday, July 18, 2022, Karthik K L V wrote:
>
>> Hi Team,
>>
>> We are migrating from Oracle 12c to Aurora Postgres 13 and running into implicit casting issues.
>>
>> Oracle is able to implicitly cast the bind values of prepared statements executed from the application to the appropriate type (String -> Number, String -> Date, Number -> String, etc.) when there is a mismatch between the Java data type and the column datatype.
>>
>> For example: if the datatype of a column is defined as Number and the application sends the bind value as a String (with single quotes in the query), Oracle is able to implicitly cast it to Number, execute the query, and return the results.
>>
>> The same is not true with Postgres, and we get the exception below:
>>
>> org.postgresql.util.PSQLException: ERROR: operator does not exist: bigint = character varying
>> Hint: No operator matches the given name and argument types. You might need to add explicit type casts.
>>
>> We found a Postgres JDBC driver property, stringtype=unspecified, which appears to solve this problem, and we have the following questions.
>> https://jdbc.postgresql.org/documentation/83/connect.html
>>
>> Could you please let us know the following?
>>
>> Q1) Will configuring this stringtype property introduce overhead on Postgres, leading to performance issues?
>> Q2) Does setting this attribute have any other implications for the data in the DB?
>> Q3) Is there any plan to deprecate / stop supporting this attribute in future Aurora Postgres releases?
>
> That setting is not recognized by the server in any way; it is a driver concern only. IIUC, it makes the Java driver behave in a way consistent with the expectations of the server, since by leaving the supplied type info undeclared the server can use its own logic. If it works for you, I say use it; it will be less problematic than methodically fixing your queries up front. Though if some queries show up as bottlenecks, getting the type info correct may prove to make a difference in some situations.
>
> David J.

--
Karthik klv
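Seen from the server side, the setting changes whether the driver declares a parameter type at all. A psql sketch, assuming a hypothetical table t with a bigint id column:

create temp table t (id bigint);

prepare q_typed (varchar) as select * from t where id = $1;
-- ERROR: operator does not exist: bigint = character varying

prepare q_untyped as select * from t where id = $1;  -- type of $1 left unspecified
execute q_untyped('42');                             -- ok: $1 was inferred as bigint

Parameter types are resolved once, when the statement is prepared, so leaving them unspecified should not add per-row work on the server.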
operator does not exist: text = bytea
Hi Team,

I am getting the error below while executing a SELECT query using the Spring Data JPA and Hibernate frameworks against Aurora PostgreSQL:

Caused by: org.postgresql.util.PSQLException: ERROR: operator does not exist: text = bytea
Hint: No operator matches the given name and argument types. You might need to add explicit type casts.
Position: 1037

We have a query with a bind value which sometimes resolves to null (no data) depending on the application scenario. The datatype of the bind value and the corresponding column is String. The same query executes fine when the bind value is populated.

Could you please let me know how I can resolve this issue without making changes to the query? Is there any configuration available in the Postgres driver or on the Postgres DB server which can be set to resolve null bind values?

Appreciate your help. Thank you.

--
Karthik klv
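Two distinct problems tend to combine in this scenario. First, when Hibernate cannot determine the type of a null parameter in a native query, the driver can end up declaring it as bytea, which is what produces the operator error. Second, even with the correct type, col = NULL never matches any row in SQL. A minimal demonstration of the second point, against an illustrative table:

create temp table a (middle_name text);
insert into a values ('Lee'), (null);

select count(*) from a where middle_name = null;                    -- 0: '= null' is never true
select count(*) from a where middle_name is not distinct from null; -- 1: null-safe comparison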
Re: operator does not exist: text = bytea
Update: I followed this thread, PostgreSQL: Re: Null bind variable in where clause
<https://www.postgresql.org/message-id/4906DD3E-322A-4E26-8600-B967AFA8A58F%40fastcrypt.com>,
and set transform_null_equals to ON in the parameter group on the AWS Console. But no luck. We are using Aurora PostgreSQL v13.3.

On Wed, Jul 20, 2022 at 3:02 PM Karthik K L V wrote:
> Hi Team,
>
> I am getting the error below while executing a SELECT query using the Spring Data JPA and Hibernate frameworks against Aurora PostgreSQL:
>
> Caused by: org.postgresql.util.PSQLException: ERROR: operator does not exist: text = bytea
> Hint: No operator matches the given name and argument types. You might need to add explicit type casts.
> Position: 1037
>
> We have a query with a bind value which sometimes resolves to null (no data) depending on the application scenario. The datatype of the bind value and the corresponding column is String. The same query executes fine when the bind value is populated.
>
> Could you please let me know how I can resolve this issue without making changes to the query?
> Is there any configuration available in the Postgres driver or on the Postgres DB server which can be set to resolve null bind values?
>
> Appreciate your help. Thank you.
>
> --
> Karthik klv

--
Karthik klv
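A likely reason the setting has no effect here: transform_null_equals is a parse-time rewrite that only applies to a literal NULL written in the query text. A bound parameter that happens to be null is not rewritten, and the setting does nothing about the bytea type mismatch either. Reusing the illustrative table a from above:

set transform_null_equals = on;

select count(*) from a where middle_name = null;  -- rewritten to 'is null', matches the null row

prepare p (text) as select count(*) from a where middle_name = $1;
execute p (null);  -- the parameter is not rewritten; still matches nothing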
Re: operator does not exist: text = bytea
Hi depesz,

Thanks for your reply. But this issue is happening only when the bind value of the query resolves to null; I am not trying to compare text to bytes. The same query works fine when the bind value resolves to some String. So I am looking for an option which can tell the Postgres engine to read "= null" as "is null".

On Wed, Jul 20, 2022 at 5:29 PM hubert depesz lubaczewski wrote:
> On Wed, Jul 20, 2022 at 03:02:13PM +0530, Karthik K L V wrote:
> > Caused by: org.postgresql.util.PSQLException: ERROR: operator does not exist: text = bytea
> > Hint: No operator matches the given name and argument types. You might need to add explicit type casts. Position: 1037
> > Could you please let me know how I can resolve this issue without making changes to the query?
>
> I don't think it's possible.
>
> bytea is basically an array of bytes. text, on the other hand, is an array of characters.
>
> Do the bytes "\xbf\xf3\xb3\x77" equal the text "żółw"?
>
> They actually kinda do, if we assume the bytes are text in encoding Windows-1250 - in which case the bytes mean "żółw".
>
> But if we'd assume the encoding to be, for example, iso8859-1, then the same sequence of bytes means "¿ó³w".
>
> That's why you can't compare bytes to characters.
>
> You have to either convert bytes to text using the convert or convert_from functions, or change text into bytea using convert_to.
>
> In some cases you can simply cast text to bytea:
>
> $ select 'depesz'::text::bytea;
>      bytea
> ----------------
>  \x64657065737a
> (1 row)
>
> which will work using the current server encoding, afair.
>
> depesz

--
Karthik klv
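Since the comparison here is really text against a badly typed null bind (there is no actual bytea column involved), one workaround that keeps the query shape is to pin the parameter's type with an explicit cast, which survives a null value. A sketch against the illustrative table a from earlier:

prepare p as select * from a where middle_name = cast($1 as text);
execute p (null);  -- no operator error; returns no rows, since '= null' is never true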
PostgreSQL unable to handle Null values for Text datatype
Hi Team,

We are migrating from Oracle 12c to Aurora Postgres 13 and running into query failures when the bind value of a Text datatype resolves to null. The same query works fine in Oracle without any issues. We use the Spring Data JPA and Hibernate frameworks to connect and execute queries, and the application uses native queries.

Here is an example query:

Select * from A where middle_name=?1

The above query fails with the exception below when the value of ?1 resolves to null:

org.postgresql.util.PSQLException: ERROR: operator does not exist: character varying = bytea
Hint: No operator matches the given name and argument types. You might need to add explicit type casts.
Position: 64

We debugged through the Hibernate code, comparing Oracle vs Postgres for the same query, to understand whether the framework was doing anything different when switched to Postgres, and didn't notice any difference in behaviour. We have also set transform_null_equals to ON in Postgres, but this doesn't help.

Could you please let us know if there are any other configurations that need to be set in Postgres to make it work similar to Oracle? This issue is impacting multiple modules in our application and any help will be appreciated.

--
Karthik klv
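One possible rewrite of the example query, keeping the native-query parameter style shown above: the cast removes the bytea error by giving ?1 a definite type, and IS NOT DISTINCT FROM additionally makes a null ?1 match rows whose middle_name is null (plain '=' matches null in neither Oracle nor Postgres):

-- avoids the operator error; still returns no rows for a null bind
Select * from A where middle_name = cast(?1 as text)

-- additionally treats null as equal to null
Select * from A where middle_name is not distinct from cast(?1 as text)

Note that IS NOT DISTINCT FROM may not use an index the way '=' does, so it is worth testing on large tables.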