To clarify, the work around (for SQLAlchemy) does not require the RETURNING values to be ordered; they will be reordered code side. Longer term matching the order up would be an added bonus!
On Thu, 13 Apr 2023 at 04:02, John Howroyd <jdhowr...@googlemail.com> wrote: > The ideal solution would be that the RETURNING values are in the same > order as the declared rows in the table literal "FROM (VALUES (10, 11, 1), > (12, 13, 2)) AS sel(p1, p2, num)". But before any of you jump down my > throat about about sets and a bar to parallel inserts, I think it's a small > matter of some internal smoke and mirrors: simply marshal the RETURNING > results through and in memory/temporary table maintaining the order from > the SELECT and return these to the client upon commit. At this moment, I > believe the folk at SQLAlchemy are looking for a work around that can work > for most use cases with either an identity column or a fake "sentinel" > column for matching back to the ORM objects. There may be an upper limit > of 1000 to the number rows in a single insert (this arises in MS SQL > Server, who absolutely do not guarantee the order of their OUTPUT clause -- > it's tied to the physical row insert rather than marshalled). > > My feeling is that PostgreSQL may already do this. See the commit: > > https://github.com/postgres/postgres/commit/9118d03a8cca3d97327c56bf89a72e328e454e63 > and the original message thread > > https://www.postgresql.org/message-id/flat/CADBwkHv1nUx5bqxqRLp26k8iUWSnY3uW-1_0EMYxKns0t2JnUA%40mail.gmail.com > and references to undesired behaviour prior to PostgreSQL 9.6 such as in > https://www.postgresql.org/docs/current/sql-select.html. > > The test cases in the above commit use: > SELECT nextval('sequence'), datavals FROM ... ORDER BY something; > type syntax. And I presume that's exactly what's happening internally > when there's a serial identity column (via query rewrite rules or another > mechanism). > > So really, that just needs confirming. What happens with triggers? > Presumably the same as long as query rewrite rules are used, but this is > presumably getting deeper into the code for actual inserts after the > initial select. The jump to the output being ordered, is just a guess but > there's some bits and pieces that seem to suggest that there may indeed be > a marshalling process going on in the background (whether or not that is > linked to the original order is another matter). > > I have set up a PostgreSQL server to test if I can break this hypothesis > and see what query explains can allude to. Does anyone have a test case > where the order is not preserved? > > Might I also point out that if the implementation of parallel inserts does > create a bar then doing so may end up with the programmatic interfaces > (such as SQLAlchemy) not being able to use that feature (possibly reverting > back to single inserts). Ur, so what would be the problem being solved > with parallel inserts? > > On Thu, 13 Apr 2023 at 02:27, Mike Bayer <mike...@zzzcomputing.com> wrote: > >> We do something different for uuids. These are usually created client >> side anyway or are very easy to create client side if not and then we also >> have a separate sentinel column option. >> >> Overall the whole thing is an optimization that can be turned off for >> uncommon cases. We just can't go back to having insert of 1000 rows be >> 3-5x slower for all tables w server generated primary keys which is what >> statement at a time gets us. Surrogate integer Pks w sequence or identity >> is the big case. >> >> On Wed, Apr 12, 2023, at 8:58 PM, Thorsten Glaser wrote: >> > On Wed, 12 Apr 2023, Mike Bayer wrote: >> > >> >>ascending values for "mytable_id_seq" will correspond to the ascending >> >>values for "num". >> > >> > But, again, what if it’s uuid or something instead of a sequence? >> > >> > bye, >> > //mirabilos >> > -- >> > 15:41⎜<Lo-lan-do:#fusionforge> Somebody write a testsuite for >> helloworld :-) >> >> >>