Re: [EXTERNAL]: Re: UPSERT in Postgres

Israel Brewster Sat, 08 Apr 2023 10:10:12 -0700

> On Apr 6, 2023, at 4:49 PM, Louis Tian <louis.t...@aquamonix.com.au> wrote:
> 
> Hi Peter,
> 
> Thanks for your reply. Appreciate the help and discussion.
> 
>>  In general UPSERT (or any definition of it that I can think of) does
>> not imply idempotency.
> 
> "Idempotence is the property of certain operations in mathematics and 
> computer science whereby they can be applied multiple times without changing 
> the result beyond the initial application." from Wikipedia.
> the concept of Idempotence when applies to HTTP is consistent with the above. 
> https://developer.mozilla.org/en-US/docs/Glossary/Idempotent. Or are you 
> going by a different defintion that I am not aware of?
> If you execute the same upsert multiple times, the state of the database 
> remains the same as if only execute once.


Not necessarily. Consider the following UPSERT statement:

INSERT INTO customer_order_counts (customer_id,order_count) VALUES (123456, 1) 
ON CONFLICT (customer_id) DO UPDATE SET 
order_count=customer_order_counts.order_count+1;

That is completely valid (I tested it), and actually makes sense as something 
you might want to do - keep track of how many orders a customer has placed, for 
example if you only keep the order records for 6 months, but still want to know 
the total number of orders the customer has placed. If it is a new customer, 
you insert a record for the customer with an order count of 1. Otherwise, you 
update the record to increment the order count. Clearly this is NOT an 
idempotent operation - every time you run it, it changes the order count, so 
the state of the database does NOT remain the same as if you only execute it 
once.

> If a row already exists, the first statement will update the row so does any 
> subsequent statements. executing the same update multiple time is the same as 
> executing it only once.
> If the row doesn't exist, the first statement will insert that row and any 
> subsequent will try to update, but the update has no real effect since it the 
> value is exactly the same as the insert. 
> So by defintion, upsert is idempotent.
> 
>> It could just be a unique index or a unique constraint. So you can
>> upsert on any individual unique constraint/index, or the primary key.
>> Of course there might be several on a given table, but you can only
>> use one as the "conflict arbiter" per statement.
> 
> Understand that I can use any unique constraint with on conflict. 
> But semantically the only correct one is the primary key, since that's what 
> identifies a row logically.

Not really. Generally when I am doing an UPSERT, I am NOT using the primary 
key, but rather some other UNIQUE-ly indexed column(s). My primary key is 
typically an ID column that is defined as a serial, and automatically generated 
by the database. The unique column I use for the upset, however, is generally 
something that would identify the row to a human - such as Station ID and 
timestamp  columns in a database I have of seismic readings. Each reading gets 
a unique ID (the primary key) that identifies it to the database and is used in 
joins. However, occasionally a reading gets updated, so I update the row, based 
not on the primary key (which I don’t know for incoming data), but on the 
station and timestamp. UPSERT, but not based on the primary key.

---
Israel Brewster
Software Engineer
Alaska Volcano Observatory 
Geophysical Institute - UAF 
2156 Koyukuk Drive 
Fairbanks AK 99775-7320
Work: 907-474-5172
cell:  907-328-9145

> In that sense, any unique column(s) is a potential candidate for primary key.
> It's more of a pedantic point rather than pragmatic one.
> It's less of a problem for PostgreSQL where the semantic importance of 
> primary key is not manifested at implementation level, since all index points 
> to the tuple directly
> Whereas it is more import for Databaes like MySQL where the secondary index 
> points to the primary key index.

Re: [EXTERNAL]: Re: UPSERT in Postgres

Reply via email to