Mark Kirkwood <mark.kirkw...@catalyst.net.nz> writes:
> A construction of the form
> DECLARE cur CURSOR WITH HOLD FOR SELECT * FROM obj
> loop
>     FETCH 1000 FROM cur
>     process 'em
> COMMIT

> results in some of the same rows being emitted more than once, although
> the final rowcount is correct (i.e., some rows end up being never seen).

I poked into this a bit, and it looks sort of nasty.  Mark's immediate
complaint is a consequence of the synchronize_seqscans patch, but there
are other issues too.

The problem comes from the fact that a WITH HOLD cursor is initially
treated the same as a regular cursor, i.e., we just fetch on demand.  If
it's still open at transaction commit, we do this: ExecutorRewind();
fetch all the rows into a tuplestore; advance the tuplestore past the
number of rows previously fetched; and then later transactions can
fetch-on-demand from the tuplestore.

The reason for the bug is that with synchronize_seqscans on, a SeqScan
node that gets rewound does not necessarily restart from the same point
in the table that it initially started reading from.  So the initial
fetch grabs 1000 rows, but then when we rewind, the first 1000 rows
loaded into the tuplestore may come from a different range of the table.

This does not affect only cursors WITH HOLD: some paths in the cursor
MOVE logic also rely on ExecutorRewind behaving sanely.

We could probably fix this specific issue by refactoring things so that
the seqscan start point is frozen at the first read and re-used after
rewinds.  However, it strikes me that a cursor query containing volatile
functions poses a similar problem --- you can't just re-execute the
query for "the same" rows and expect to get stable results.  What should
we do about that?

The technically best solution is probably to do what Materialize nodes
do: read the query only once, and be careful to stash rows aside into a
tuplestore as they are read.  That would fix both issues with one patch.
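To make the syncscan failure mode concrete, here is a toy Python model of
the rewind behavior (the table size, the start offsets, and all names are
invented for illustration; this is not backend code):

```python
# Toy model of the WITH HOLD / synchronized-seqscan interaction.
# All numbers and names are illustrative, not taken from PostgreSQL.

N = 10_000      # rows in table "obj", identified by physical position 0..N-1
FETCH = 1_000   # rows the client fetches before COMMIT

def seqscan(start):
    """A synchronized seqscan: read every row exactly once, in physical
    order, but begin at 'start' and wrap around the end of the table."""
    for i in range(N):
        yield (start + i) % N

# Before COMMIT: the scan happens to start at row 0; client fetches 1000 rows.
scan = seqscan(0)
emitted = [next(scan) for _ in range(FETCH)]

# At COMMIT: ExecutorRewind() and re-read the whole query into a tuplestore,
# but the rewound scan piggybacks on some other scan and starts elsewhere.
tuplestore = list(seqscan(2_500))

# Advance the tuplestore past the rows already sent, then serve the rest.
emitted += tuplestore[FETCH:]

duplicated = len(emitted) - len(set(emitted))
never_seen = N - len(set(emitted))
print(len(emitted), duplicated, never_seen)   # → 10000 1000 1000
```

The total rowcount comes out right (10000), which is exactly Mark's
observation: 1000 rows are emitted twice, and an equal number (here rows
2500..3499, the gap between the two start points) are never seen at all.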
The problem with the materialize approach is that if the user never
actually fetches backwards, all the tuplestore maintenance work is
wasted.  Alternatively, we could just document that cursors containing
volatile functions don't behave stably if you try to read backwards; or
try to enforce that you can't do so.

The volatile-function issue has been there since the dawn of time, and
we've never had a complaint about it AFAIR.  So maybe trying to "fix" it
isn't a good idea, and we should just document the behavior.  But the
syncscan instability is new and probably ought to be dealt with.

Thoughts?

			regards, tom lane

-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs