On Wed, Jan 17, 2018 at 10:45 AM, Jacek Kołodziej <kolodzi...@gmail.com> wrote:
> > > 17.01.2018 3:30 PM "Melvin Davidson" <melvin6...@gmail.com> napisał(a): > > > > On Wed, Jan 17, 2018 at 8:08 AM, Jacek Kołodziej <kolodzi...@gmail.com> > wrote: > >> Hello everyone, >> >> this is my first post here and I'm starting with asking a question about >> data consistency between two consecutive SELECTs using PostgreSQL 9.6. >> >> I'm sorry if that's something that was already discussed - I couldn't >> find it either in archives, nor in _general internet_. If it is, I would >> appreciate pointing it out. >> >> I have an "append-only" events table - only INSERT and SELECT queries are >> issued to it. It has an integer (sequence) ID as a primary key. >> >> I'm issuing following two queries (within the same READ COMMITTED >> transaction) to that table: >> - first query (A): SELECT id FROM events ORDER BY events.id DESC LIMIT 1 >> - I'm saving the result as a "max ID" for the second query >> - second query (B): SELECT id, ... FROM events WHERE id > "min ID" AND id >> <= "max ID" AND ... >> - I won't dig into what's "min ID" but you can assume it's at most 100 >> less than "max ID" >> >> Concurrently, rows are being added to that table. >> >> Please note that there's enough data in the index for the first query (A) >> to perform an index-only scan. I'm not sure if that's relevant but "B" >> query does an index scan because of other conditions I haven't mentioned >> but still needs to fetch data from the table's heap. >> >> Here's what happening to me: the "A" query occasionally (in my case: on >> the order of tenths per day) returns an ID _higher_ than any ID present in >> second query's result (other conditions I haven't specified do _not_ filter >> any more rows than "id <= max ID") - as if some entries were visible for >> the first query, but not for the second one. This is an inconsistency that >> is very problematic for me. >> >> All I can think of is that it might be caused by the index-only-scan and >> READ COMMITTED transaction isolation level but from what I gather from >> documentation, it should not be possible due to the use of visibility >> map[0][1][2]. >> >> And yet it happens, likely for some other reason but I can't think of >> any. I've tried switching transaction isolation level to REPEATABLE READ >> (on the basis that it's an faulty phenomenon occurring during to some bug) >> but that didn't help. >> >> Where am I wrong? What am I missing? What information may I provide to >> help with investigating this? >> >> [0] https://www.postgresql.org/docs/9.6/static/indexes-index-onl >> y-scans.html >> [1] https://www.postgresql.org/docs/9.6/static/storage-vm.html >> [2] https://wiki.postgresql.org/wiki/Index-only_scans >> >> -- >> Kind regards, >> Jacek Kołodziej >> http://kolodziejj.info >> > > > *>- first query (A): SELECT id FROM events ORDER BY events.id > <http://events.id> DESC LIMIT 1 - I'm saving the result as a "max ID" for > the second query* > > *Just a suggestion. The first query is not really needed.* > *You can simply do:* > > *second query (B): * > > > > > > > > *SELECT id, ... FROM events WHERE id > MIN(ID) AND id <= MAX(ID) > AND ...* > > > *See https://www.postgresql.org/docs/9.6/static/functions-aggregate.html > <https://www.postgresql.org/docs/9.6/static/functions-aggregate.html> * > > *MAX and MIN functions* > > > Hi Melvin, thank you for the suggestion. Unfortunately it won't do in my > case. Sorry for not providing enough context in the first place. > > After making the first query (A), I'm iterating over that table (with > LIMIT 100 and increasing OFFSET) - using a query "B" - until another > condition is met; overall, code is supposed to gather a number of rows from > the table. I'm also using the "max ID" for another purpose. > > > > -- > *Melvin Davidson* > I reserve the right to fantasize. Whether or not you > wish to share my fantasy is entirely up to you. > > > *Oops, my bad. I didn't have my morning coffee yet.* *Try this instead:* *WITH minmax AS* * (SELECT MIN(id) as min_id,* * MAX(id AS max_id)* * FROM events * * ) * *SELECT id, ... FROM events e,* * minmax m* * WHERE e.id <http://e.id> > m.min_id AND e.id <http://e.id> <= m.max_id AND ...* -- *Melvin Davidson* I reserve the right to fantasize. Whether or not you wish to share my fantasy is entirely up to you.