Re: [HACKERS] Serializable snapshot isolation patch

2010-10-24 Thread Kevin Grittner
Jeff Davis wrote: > On Mon, 2010-10-18 at 13:26 -0500, Kevin Grittner wrote: >>> 3. Limited shared memory space to hold information about >>> committed transactions that are still "interesting". >>> It's a challenging problem, however, and the current solution is >>> less than ideal. >> >> I

Re: [HACKERS] Serializable snapshot isolation patch

2010-10-21 Thread Kevin Grittner
Jeff Davis wrote: > When using locks in an unconventional way, it would be helpful to > describe the invalid schedules that you're preventing. Perhaps an > example if you think it would be reasonably simple? Also some > indication of how another process is intended to modify the list > without w

Re: [HACKERS] Serializable snapshot isolation patch

2010-10-21 Thread Kevin Grittner
Jeff Davis wrote: > in this case we do clearly have a problem, because the result is > not equal to the serial execution of the transactions in either > order. Yeah, you're right. I misread that example -- newbie with the PERIOD type. > So the question is: at what point is the logic wrong?

Re: [HACKERS] Serializable snapshot isolation patch

2010-10-21 Thread Jeff Davis
On Thu, 2010-10-21 at 10:29 -0500, Kevin Grittner wrote: > Basically, when we already have a pivot, but no transaction has yet > committed, we wait to see if TN commits first. If so, we have a > problem; if not, we don't. There's probably some room for improving > performance by cancelling T0 or
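
[Illustration, not part of the original message.] The rule quoted above (with a pivot already present and nothing yet committed, wait and see whether TN commits first) can be sketched roughly as below. The function name, the `commit_order` list, and the three-valued result are assumptions made for this sketch; they are not taken from the patch.

```python
def pivot_is_dangerous(have_pivot, commit_order, tn="TN"):
    """Hypothetical helper: decide whether a pivot has become a real problem.

    commit_order lists the names of the transactions that have committed so
    far, in commit order. Returns None while the outcome is still undecided.
    """
    if not have_pivot:
        return False          # no pivot, nothing to worry about
    if not commit_order:
        return None           # nobody has committed yet: keep waiting
    # Per the quoted rule: a problem only if TN was the first to commit.
    return commit_order[0] == tn


print(pivot_is_dangerous(True, []))            # still undecided
print(pivot_is_dangerous(True, ["TN", "T0"]))  # TN committed first
print(pivot_is_dangerous(True, ["T1", "TN"]))  # some other transaction first
```

This models only the decision described in the quoted sentence; how the patch actually tracks commit order is not shown in this excerpt.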

Re: [HACKERS] Serializable snapshot isolation patch

2010-10-21 Thread Kevin Grittner
Jeff Davis wrote: > That looks like a reasonable state to me, but I'm not sure exactly > what the design calls for. I am guessing that the real problem is > in PreCommit_CheckForSerializationFailure(), where there are 6 > conditions that must be met for an error to be thrown. T2 falls > out righ

Re: [HACKERS] Serializable snapshot isolation patch

2010-10-21 Thread Kevin Grittner
Jeff Davis wrote: >> Also, it appears to be non-deterministic, to a degree at least, >> so you may not observe the problem in the exact way that I do. > The SELECTs only look at the root and the predicate doesn't match. > So each SELECT sets an SIReadLock on block 0 and exits the search. > Looks

Re: [HACKERS] Serializable snapshot isolation patch

2010-10-20 Thread Jeff Davis
On Sun, 2010-10-17 at 22:53 -0700, Jeff Davis wrote: > 2. I think there's a GiST bug (illustrating with PERIOD type): > > create table foo(p period); > create index foo_idx on foo using gist (p); > insert into foo select period( > '2009-01-01'::timestamptz + g * '1 microsecond'::interv

Re: [HACKERS] Serializable snapshot isolation patch

2010-10-20 Thread Kevin Grittner
Robert Haas wrote: > On Tue, Oct 19, 2010 at 6:28 PM, Kevin Grittner > wrote: >> One thing that would work, but I really don't think I like it, is >> that a request for a snapshot for such a transaction would not >> only block until it could get a "clean" snapshot (no overlapping >> serializable

Re: [HACKERS] Serializable snapshot isolation patch

2010-10-19 Thread Robert Haas
On Tue, Oct 19, 2010 at 6:28 PM, Kevin Grittner wrote: > One thing that would work, but I really don't think I like it, is > that a request for a snapshot for such a transaction would not only > block until it could get a "clean" snapshot (no overlapping > serializable non-read-only transactions w

Re: [HACKERS] Serializable snapshot isolation patch

2010-10-19 Thread Kevin Grittner
Jeff Davis wrote: > I briefly looked into this when I woke up this morning, and I > think I'm close. I can reproduce it every time, so I should be > able to fix this as soon as I can find some free time (tomorrow > night, probably). OK, I'll focus on other areas. > I might also be able to he

Re: [HACKERS] Serializable snapshot isolation patch

2010-10-19 Thread Jeff Davis
On Mon, 2010-10-18 at 22:12 -0500, Kevin Grittner wrote: > Hmmm... When Joe was looking at the patch he exposed an intermittent > problem with btree indexes which turned out to be related to improper > handling of the predicate locks during index page clean-up caused by a > vacuum. Easy to fix on

Re: [HACKERS] Serializable snapshot isolation patch

2010-10-18 Thread Kevin Grittner
Jeff Davis wrote: > On Mon, 2010-10-18 at 13:26 -0500, Kevin Grittner wrote: > I assume here that you mean that you _did_ see the failure > (serialization error) and therefore did not see the problem? Yeah. > Also, are you sure it was using the GiST index for the searches and > didn't just

Re: [HACKERS] Serializable snapshot isolation patch

2010-10-18 Thread Jeff Davis
On Mon, 2010-10-18 at 13:26 -0500, Kevin Grittner wrote: > > 2. I think there's a GiST bug (illustrating with PERIOD type): > > My assumptions for GiST were that: > > (1) A search for a matching value could bail out at any level in the > tree; there is no requirement for the search to proceed

Re: [HACKERS] Serializable snapshot isolation patch

2010-10-18 Thread Kevin Grittner
First off, thanks for the review! I know that it's a lot of work, and I do appreciate it. Jeff Davis wrote: > * Trivial stuff: > > I get a compiler warning: > > indexfsm.c: In function *RecordFreeIndexPage*: > indexfsm.c:55: warning: implicit declaration of function > *PageIsPredica

Re: [HACKERS] Serializable Snapshot Isolation

2010-09-25 Thread Kevin Grittner
Greg Stark wrote: > Just to be clear I wasn't saying it was or wasn't a problem, I was > just trying to see if I understand the problem and if I do maybe > help bring others up to speed. Thanks for that, and my apologies for misunderstanding you. It does sound like you have a firm grasp on my

Re: [HACKERS] Serializable Snapshot Isolation

2010-09-25 Thread Robert Haas
On Sat, Sep 25, 2010 at 10:45 AM, Tom Lane wrote: > Greg Stark writes: >> On Thu, Sep 23, 2010 at 4:08 PM, Kevin Grittner >> wrote: >>> One place I'm particularly interested in using such a feature is in >>> pg_dump. Without it we have the choice of using a SERIALIZABLE >>> transaction, which mi

Re: [HACKERS] Serializable Snapshot Isolation

2010-09-25 Thread Greg Stark
Just to be clear I wasn't saying it was or wasn't a problem, I was just trying to see if I understand the problem and if I do maybe help bring others up to speed. On 25 Sep 2010 23:28, "Kevin Grittner" wrote: > Greg Stark wrote: > >> So T1 must have happened before TN because it wrote something ba

Re: [HACKERS] Serializable Snapshot Isolation

2010-09-25 Thread Kevin Grittner
Greg Stark wrote: > So T1 must have happened before TN because it wrote something based > on data as it was before TN modified it. But T0 can see TN but not > T1 so there's no complete ordering between the three transactions > that makes them all make sense. Correct. > The thing is that the
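
[Illustration, not part of the original message.] The ordering argument in the quoted paragraph can be checked mechanically: treat each statement as a "must come before" constraint and look for a serial order that satisfies all of them. This is only a sketch of that reasoning using Python's standard `graphlib`; the `serial_order` helper and the way the three constraints are encoded are assumptions of this sketch, not anything from the patch.

```python
from graphlib import TopologicalSorter, CycleError


def serial_order(must_precede):
    """Return one serial order satisfying every (before, after) pair,
    or None if the constraints form a cycle."""
    sorter = TopologicalSorter()
    for before, after in must_precede:
        sorter.add(after, before)  # 'after' depends on 'before'
    try:
        return list(sorter.static_order())
    except CycleError:
        return None


constraints = [
    ("T1", "TN"),  # T1 wrote based on data as it was before TN modified it
    ("TN", "T0"),  # T0 can see TN's changes
    ("T0", "T1"),  # T0 cannot see T1's changes
]

print(serial_order(constraints[:2]))  # the first two constraints alone are orderable
print(serial_order(constraints))      # all three together admit no serial order
```

With only the first two constraints a serial order exists; adding the third closes the cycle, which is the "no complete ordering" observation made above.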

Re: [HACKERS] Serializable Snapshot Isolation

2010-09-25 Thread Greg Stark
On Sat, Sep 25, 2010 at 4:24 PM, Kevin Grittner wrote: > OK, to get back to the question -- pg_dump's transaction (T0) could > see an inconsistent version of the database if one transaction (TN) > writes to a table, another transaction (T1) overlaps TN and can't > read something written by TN beca

Re: [HACKERS] Serializable Snapshot Isolation

2010-09-25 Thread Kevin Grittner
Nicolas Barbier wrote: > IOW, one could say that the backup is consistent only if it were > never compared against the system as it continued running after the > dump took place. Precisely. I considered making that point in the email I just sent, but figured I had rambled enough. I suppose I

Re: [HACKERS] Serializable Snapshot Isolation

2010-09-25 Thread Kevin Grittner
Greg Stark wrote: > Kevin Grittner wrote: >> One place I'm particularly interested in using such a feature is >> in pg_dump. Without it we have the choice of using a SERIALIZABLE >> transaction, which might fail or cause failures (which doesn't >> seem good for a backup program) or using REPEAT

Re: [HACKERS] Serializable Snapshot Isolation

2010-09-25 Thread Tom Lane
Greg Stark writes: > On Thu, Sep 23, 2010 at 4:08 PM, Kevin Grittner > wrote: >> One place I'm particularly interested in using such a feature is in >> pg_dump. Without it we have the choice of using a SERIALIZABLE >> transaction, which might fail or cause failures (which doesn't seem >> good for

Re: [HACKERS] Serializable Snapshot Isolation

2010-09-25 Thread Nicolas Barbier
[ Forgot the list, resending. ] 2010/9/25 Greg Stark : > On Thu, Sep 23, 2010 at 4:08 PM, Kevin Grittner > wrote: > >> One place I'm particularly interested in using such a feature is in >> pg_dump. Without it we have the choice of using a SERIALIZABLE >> transaction, which might fail or cause f

Re: [HACKERS] Serializable Snapshot Isolation

2010-09-25 Thread Greg Stark
On Thu, Sep 23, 2010 at 4:08 PM, Kevin Grittner wrote: > One place I'm particularly interested in using such a feature is in > pg_dump. Without it we have the choice of using a SERIALIZABLE > transaction, which might fail or cause failures (which doesn't seem > good for a backup program) or using

Re: [HACKERS] Serializable Snapshot Isolation

2010-09-24 Thread Kevin Grittner
Robert Haas wrote: > I think the only changes we should make now are things that we're > sure are improvements. In that vein, anyone who is considering reviewing the patch should check the latest from the git repo or request an incremental patch. I've committed a few things since the last pat

Re: [HACKERS] Serializable Snapshot Isolation

2010-09-24 Thread Robert Haas
On Fri, Sep 24, 2010 at 1:35 PM, Kevin Grittner wrote: > Robert Haas wrote: >> On Fri, Sep 24, 2010 at 12:17 PM, Kevin Grittner >> wrote: >>> Thoughts? >> >> Premature optimization is the root of all evil.  I'm not convinced >> that we should tinker with any of this before committing it and >> g

Re: [HACKERS] Serializable Snapshot Isolation

2010-09-24 Thread Kevin Grittner
Robert Haas wrote: > On Fri, Sep 24, 2010 at 12:17 PM, Kevin Grittner > wrote: >> Thoughts? > > Premature optimization is the root of all evil. I'm not convinced > that we should tinker with any of this before committing it and > getting some real-world experience. It's not going to be perfect

Re: [HACKERS] Serializable Snapshot Isolation

2010-09-24 Thread Robert Haas
On Fri, Sep 24, 2010 at 12:17 PM, Kevin Grittner wrote: > Thoughts? Premature optimization is the root of all evil. I'm not convinced that we should tinker with any of this before committing it and getting some real-world experience. It's not going to be perfect in the first version, just like

Re: [HACKERS] Serializable Snapshot Isolation

2010-09-24 Thread Kevin Grittner
Heikki Linnakangas wrote: > My aim is still to put an upper bound on the amount of shared > memory required, regardless of the number of committed but still > interesting transactions. > That maps nicely to a SLRU table Well, that didn't take as long to get my head around as I feared. I th

Re: [HACKERS] Serializable Snapshot Isolation

2010-09-23 Thread Kevin Grittner
Heikki Linnakangas wrote: > On 23/09/10 18:08, Kevin Grittner wrote: >> Less important than any of the above, but still significant in my >> book, I fear that conflict recording and dangerous structure >> detection could become very convoluted and fragile if we >> eliminate this structure for comm

Re: [HACKERS] Serializable Snapshot Isolation

2010-09-23 Thread Heikki Linnakangas
On 23/09/10 18:08, Kevin Grittner wrote: Less important than any of the above, but still significant in my book, I fear that conflict recording and dangerous structure detection could become very convoluted and fragile if we eliminate this structure for committed transactions. Conflicts among sp

Re: [HACKERS] Serializable Snapshot Isolation

2010-09-23 Thread Kevin Grittner
Heikki Linnakangas wrote: > On 23/09/10 02:14, Kevin Grittner wrote: >> There is a rub on the other point, though. Without transaction >> information you have no way of telling whether TN committed >> before T0, so you would need to assume that it did. So on this >> count, there is bound to be

Re: [HACKERS] Serializable Snapshot Isolation

2010-09-22 Thread Heikki Linnakangas
On 23/09/10 02:14, Kevin Grittner wrote: There is a rub on the other point, though. Without transaction information you have no way of telling whether TN committed before T0, so you would need to assume that it did. So on this count, there is bound to be some increase in false positives leading

Re: [HACKERS] Serializable Snapshot Isolation

2010-09-22 Thread Kevin Grittner
Heikki Linnakangas wrote: > When a transaction commits, its predicate locks must be held, > but it's not important anymore *who* holds them, as long as > they're held for long enough. > > Let's move the finishedBefore field from SERIALIZABLEXACT to > PREDICATELOCK. When a transaction commits,

Re: [HACKERS] Serializable Snapshot Isolation

2010-09-22 Thread Heikki Linnakangas
On 19/09/10 21:57, I wrote: Putting that aside for now, we have one very serious problem with this algorithm: While they [SIREAD locks] are associated with a transaction, they must survive a successful COMMIT of that transaction, and remain until all overlapping > transactions complete. Long

Re: [HACKERS] Serializable snapshot isolation error logging

2010-09-21 Thread Robert Haas
On Tue, Sep 21, 2010 at 12:57 PM, Kevin Grittner wrote: >> What is the likelihood that there exists an update pattern that >> always gives the failure in the slow transaction? > > I don't know how to quantify that. I haven't seen it yet in > testing, but many of my tests so far have been rather c

Re: [HACKERS] Serializable snapshot isolation error logging

2010-09-21 Thread Kevin Grittner
Dan S wrote: > A starvation scenario is what worries me: > > Let's say we have a slow complex transaction with many tables > involved. Concurrently smaller transactions begin and commit. > > Wouldn't it be possible for a starvation scenario where the slower > transaction will never run to c

Re: [HACKERS] Serializable snapshot isolation error logging

2010-09-21 Thread Dan S
A starvation scenario is what worries me: Let's say we have a slow complex transaction with many tables involved. Concurrently smaller transactions begin and commit. Wouldn't it be possible for a starvation scenario where the slower transaction will never run to completion but give a serializat

Re: [HACKERS] Serializable snapshot isolation error logging

2010-09-20 Thread Kevin Grittner
Dan S wrote: > Well I guess one would like some way to find out which statements > in the involved transactions are the cause of the serialization > failure and what programs they reside in. Unless we get the conflict list optimization added after the base patch, you might get anywhere from on

Re: [HACKERS] Serializable snapshot isolation error logging

2010-09-20 Thread Dan S
Well I guess one would like some way to find out which statements in the involved transactions are the cause of the serialization failure and what programs they reside in. Also which relations were involved, the sql-statements may contain many relations but just one or a few might be involved in t

Re: [HACKERS] Serializable snapshot isolation error logging

2010-09-20 Thread Kevin Grittner
Dan S wrote: > I wonder if the SSI implementation will give some way of detecting > the cause of a serialization failure. > Something like the deadlock detection maybe where you get the > sql-statements involved. I've been wondering what detail to try to include. There will often be three tra

Re: [HACKERS] Serializable Snapshot Isolation

2010-09-20 Thread Kevin Grittner
I wrote: > Heikki Linnakangas wrote: > >> ISTM you never search the SerializableXactHash table using a hash >> key, except the one call in CheckForSerializableConflictOut, but >> there you already have a pointer to the SERIALIZABLEXACT struct. >> You only re-find it to make sure it hasn't gone aw

Re: [HACKERS] Serializable Snapshot Isolation

2010-09-19 Thread Heikki Linnakangas
On 19/09/10 16:48, Kevin Grittner wrote: After tossing it around in my head for a bit, the only thing that I see (so far) which might work is to maintain a *list* of SERIALIZABLEXACT objects in memory rather than using a hash table. The recheck after releasing the shared lock and acquiring an e

Re: [HACKERS] Serializable Snapshot Isolation

2010-09-19 Thread Kevin Grittner
Heikki Linnakangas wrote: > ISTM you never search the SerializableXactHash table using a hash > key, except the one call in CheckForSerializableConflictOut, but > there you already have a pointer to the SERIALIZABLEXACT struct. > You only re-find it to make sure it hasn't gone away while you > tr

Re: [HACKERS] Serializable Snapshot Isolation

2010-09-18 Thread Heikki Linnakangas
On 18/09/10 21:52, Kevin Grittner wrote: [Apologies for not reply-linking this; work email is down so I'm sending from gmail.] Based on feedback from Heikki and Tom I've reworked how I find the top-level transaction. This is in the git repo, and the changes can be viewed at: http://git.postgre

Re: [HACKERS] Serializable Snapshot Isolation

2010-09-18 Thread Kevin Grittner
[Apologies for not reply-linking this; work email is down so I'm sending from gmail.] Based on feedback from Heikki and Tom I've reworked how I find the top-level transaction. This is in the git repo, and the changes can be viewed at: http://git.postgresql.org/gitweb?p=users/kgrittn/postgres.git

Re: [HACKERS] Serializable Snapshot Isolation

2010-09-17 Thread Kevin Grittner
Tom Lane wrote: > That assumption is absolutely, totally not going to fly. Understood; I'm already working on it based on Heikki's input. >> This needs to work when the xid of a transaction is found in the >> MVCC data of a tuple for any overlapping serializable transaction >> -- even if tha

Re: [HACKERS] Serializable Snapshot Isolation

2010-09-17 Thread Tom Lane
"Kevin Grittner" writes: > Heikki Linnakangas wrote: >> That sounds like it can eat through your shared memory very quickly >> if you have a lot of subtransactions. > Hmmm I've never explicitly used subtransactions, so I don't tend > to think of them routinely going too deep. And the stru

Re: [HACKERS] Serializable Snapshot Isolation

2010-09-17 Thread Kevin Grittner
Heikki Linnakangas wrote: > On 17/09/10 14:56, Kevin Grittner wrote: >> Heikki Linnakangas wrote: >>> Why not use SubTransGetTopmostTransaction() ? >> >> This needs to work when the xid of a transaction is found in the >> MVCC data of a tuple for any overlapping serializable transaction >> -- even

Re: [HACKERS] Serializable Snapshot Isolation

2010-09-17 Thread Heikki Linnakangas
On 17/09/10 14:56, Kevin Grittner wrote: Heikki Linnakangas wrote: Why not use SubTransGetTopmostTransaction() ? This needs to work when the xid of a transaction is found in the MVCC data of a tuple for any overlapping serializable transaction -- even if that transaction has completed and its

Re: [HACKERS] Serializable Snapshot Isolation

2010-09-17 Thread Kevin Grittner
Heikki Linnakangas wrote: > So, the purpose of SerializableXidHash is to provide quick access > to the SERIALIZABLEXACT struct of a top-level transaction, when you > know its transaction id or any of its subtransaction ids. Right. > To implement the "or any of its subtransaction ids" part, y

Re: [HACKERS] Serializable Snapshot Isolation

2010-09-16 Thread Heikki Linnakangas
On 17/09/10 01:35, Kevin Grittner wrote: Heikki Linnakangas wrote: The functions are well commented, but an overview at the top of the file of all the hash tables and other data structures would be nice. What is stored in each, when are they updated, etc. I moved all the structures from pred

Re: [HACKERS] Serializable Snapshot Isolation

2010-09-16 Thread Kevin Grittner
Alvaro Herrera wrote: > Now that I look at your new patch, I noticed that I was actually > confusing relcache.h with rel.h. The latter includes a big chunk > of our headers, but relcache.h is pretty thin. Including > relcache.h in another header is not much of a problem. OK, thanks for the c

Re: [HACKERS] Serializable Snapshot Isolation

2010-09-16 Thread Alvaro Herrera
Excerpts from Kevin Grittner's message of Wed Sep 15 14:52:36 -0400 2010: > Alvaro Herrera wrote: > > > I think that would also solve a concern that I had, which is that > > we were starting to include relcache.h (and perhaps other headers > > as well, but that's the one that triggered it for me

Re: [HACKERS] Serializable Snapshot Isolation

2010-09-16 Thread Kevin Grittner
Heikki Linnakangas wrote: > The functions are well commented, but an overview at the top of > the file of all the hash tables and other data structures would be > nice. What is stored in each, when are they updated, etc. I moved all the structures from predicate.h and predicate.c to a new pred

Re: [HACKERS] Serializable Snapshot Isolation

2010-09-15 Thread Kevin Grittner
Alvaro Herrera wrote: > I think that would also solve a concern that I had, which is that > we were starting to include relcache.h (and perhaps other headers > as well, but that's the one that triggered it for me) a bit too > liberally, so +1 from me. Unfortunately, what I proposed doesn't sol

Re: [HACKERS] Serializable Snapshot Isolation

2010-09-15 Thread Alvaro Herrera
Excerpts from Kevin Grittner's message of Wed Sep 15 09:15:53 -0400 2010: > I'm inclined to move everything except the function prototypes out > of predicate.h to a new predicate_interal.h, and move the structures > defined in predicate.c there, too. I think that would also solve a concern that I

Re: [HACKERS] Serializable Snapshot Isolation

2010-09-15 Thread Kevin Grittner
Heikki Linnakangas wrote: > Now that I understand what the predicate locks are for, I'm now > trying to get my head around all the data structures in > predicate.c. The functions are well commented, but an overview at > the top of the file of all the hash tables and other data > structures woul

Re: [HACKERS] Serializable Snapshot Isolation

2010-09-15 Thread Heikki Linnakangas
On 15/09/10 00:49, Kevin Grittner wrote: Heikki Linnakangas wrote: A short description of how the predicate locks help to implement serializable mode would be nice too. I haven't read Cahill's papers, and I'm left wondering what the RW conflicts and dependencies are, when you're supposed to gr

Re: [HACKERS] Serializable Snapshot Isolation

2010-09-14 Thread Kevin Grittner
I've been thinking about these points, and reconsidered somewhat. Heikki Linnakangas wrote: > Should add a citation to Cahill's work this is based on. > Preferably with a hyperlink. I've been thinking that this should be mentioned in both the README and the source code. > A short descriptio

Re: [HACKERS] Serializable Snapshot Isolation

2010-09-14 Thread Kevin Grittner
Heikki Linnakangas wrote: > Great work! A year ago I thought it would be impossible to have a > true serializable mode in PostgreSQL because of the way we do > MVCC, and now we have a patch. > > At a quick read-through, the code looks very tidy and clear now. > Some comments: > > Should add a

Re: [HACKERS] Serializable Snapshot Isolation

2010-09-14 Thread Heikki Linnakangas
On 14/09/10 19:34, Kevin Grittner wrote: Attached is the latest Serializable Snapshot Isolation (SSI) patch. Great work! A year ago I thought it would be impossible to have a true serializable mode in PostgreSQL because of the way we do MVCC, and now we have a patch. At a quick read-through