On Wed, Jul 23, 2014 at 7:35 PM, Peter Geoghegan <p...@heroku.com> wrote: >> It's certain arguable whether you should INSERT and then turn failures >> into an update or try to UPDATE and then turn failures into an INSERT; >> we might even want to have both options available, though that smells >> a little like airing too much of our dirty laundry. But I think I >> generally favor optimizing for the UPDATE case for more or less the >> same reasons Kevin articulated. > > I don't see the connection between this and Kevin's remarks. And FWIW, > I don't see a reason to favor inserts or updates. Fortunately, what I > have balances both cases very well, and doesn't cause bloat. The work > of descending the index to lock it isn't wasted if an update is > required. My implementation decides to either insert or update at > literally the latest possible moment.
AFAIUI, this is because your implementation uses lwlocks in a way that Andres and I both find unacceptable. My suspicion is that any version of this that ends up getting committed is going to involve a risk of bloat in cases involving retries, and I think it will be easier to minimize bloat in an update-driven implementation. But I suppose that's speculative. >> Here you seem to be suggested that I intended to propose your existing >> design rather than something else, which I didn't. In this design, >> you find the conflict (at most one) but scanning for the tuple to be >> updated. > > Yes, but what if you don't see a conflict because it isn't visible to > your snapshot, and then you insert, and only then (step 5), presumably > with a dirty snapshot, you find a conflict? How does the loop > terminate if that brings you back to step 1 with the same MVCC > snapshot feeding the update? Good point. Maybe the syntax should be something like: UPSERT table (keycol [, keycol] ...) { VALUES (val [, val] ...) [, ...] | select_query } That would address both the concern about being able to pipe multiple tuples through it and the point you just raised. We look for a row that matches each given tuple on the key columns; if one is found, we update it; if none is found, we insert. > I agree that you want to uniquely identify each tuple. What I meant > was, why should we not be able to upsert multiple rows in a single > command? What's wrong with that? Nothing. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers