> On Apr 6, 2023, at 4:49 PM, Louis Tian <louis.t...@aquamonix.com.au> wrote: > > Hi Peter, > > Thanks for your reply. Appreciate the help and discussion. > >> In general UPSERT (or any definition of it that I can think of) does >> not imply idempotency. > > "Idempotence is the property of certain operations in mathematics and > computer science whereby they can be applied multiple times without changing > the result beyond the initial application." from Wikipedia. > the concept of Idempotence when applies to HTTP is consistent with the above. > https://developer.mozilla.org/en-US/docs/Glossary/Idempotent. Or are you > going by a different defintion that I am not aware of? > If you execute the same upsert multiple times, the state of the database > remains the same as if only execute once.
Not necessarily. Consider the following UPSERT statement: INSERT INTO customer_order_counts (customer_id,order_count) VALUES (123456, 1) ON CONFLICT (customer_id) DO UPDATE SET order_count=customer_order_counts.order_count+1; That is completely valid (I tested it), and actually makes sense as something you might want to do - keep track of how many orders a customer has placed, for example if you only keep the order records for 6 months, but still want to know the total number of orders the customer has placed. If it is a new customer, you insert a record for the customer with an order count of 1. Otherwise, you update the record to increment the order count. Clearly this is NOT an idempotent operation - every time you run it, it changes the order count, so the state of the database does NOT remain the same as if you only execute it once. > If a row already exists, the first statement will update the row so does any > subsequent statements. executing the same update multiple time is the same as > executing it only once. > If the row doesn't exist, the first statement will insert that row and any > subsequent will try to update, but the update has no real effect since it the > value is exactly the same as the insert. > So by defintion, upsert is idempotent. > >> It could just be a unique index or a unique constraint. So you can >> upsert on any individual unique constraint/index, or the primary key. >> Of course there might be several on a given table, but you can only >> use one as the "conflict arbiter" per statement. > > Understand that I can use any unique constraint with on conflict. > But semantically the only correct one is the primary key, since that's what > identifies a row logically. Not really. Generally when I am doing an UPSERT, I am NOT using the primary key, but rather some other UNIQUE-ly indexed column(s). My primary key is typically an ID column that is defined as a serial, and automatically generated by the database. The unique column I use for the upset, however, is generally something that would identify the row to a human - such as Station ID and timestamp columns in a database I have of seismic readings. Each reading gets a unique ID (the primary key) that identifies it to the database and is used in joins. However, occasionally a reading gets updated, so I update the row, based not on the primary key (which I don’t know for incoming data), but on the station and timestamp. UPSERT, but not based on the primary key. --- Israel Brewster Software Engineer Alaska Volcano Observatory Geophysical Institute - UAF 2156 Koyukuk Drive Fairbanks AK 99775-7320 Work: 907-474-5172 cell: 907-328-9145 > In that sense, any unique column(s) is a potential candidate for primary key. > It's more of a pedantic point rather than pragmatic one. > It's less of a problem for PostgreSQL where the semantic importance of > primary key is not manifested at implementation level, since all index points > to the tuple directly > Whereas it is more import for Databaes like MySQL where the secondary index > points to the primary key index.