> Yep, that's pretty much what it does, although xmax is actually
> defined as the XID *following* the last one that ended, and I think
> xmin needs to also be in xip, so in this case you'd actually end up
> with xmin = 15, xmax = 22, xip = { 15, 16, 17, 19 }. But you've got
> the basic idea of it.

Shouldn't xmax = 21 be okay, since the current check in TupleVisibility
returns "tuple is not visible" when the XID is greater than or equal to
xmax?

> In particular, if someone with proc->xmin = InvalidTransactionId is
> taking a snapshot while you're computing RecentGlobalXmin, and then
> stores a proc->xmin less than your newly-computed RecentGlobalXmin,
> you've got a problem.

I assume that by "taking a snapshot" you are referring to the case where
the snapshot has to be updated in shared memory, because otherwise there
is no need to refer to the proc in your new design.

Session-1
Updates RecentGlobalXmin during GetSnapshotData() using the shared-memory
copy of the snapshot and the completed transactions, since
RecentGlobalXmin can be updated once we have xmin.

Session-2
Takes a snapshot in order to update it in shared memory; here it needs to
go through the procarray.

While Session-2 is going through the procarray using proclock, the proc
of Session-1 may still have InvalidTransactionId, so we will ignore it
and go through the procs of the remaining sessions.

Normally Session-1's proc should not end up with a lower xmin than the
procs of the other sessions. But if it got its copy from the
shared-memory ring buffer before the other sessions did, its xmin can be
lower, and that can cause a problem.

>> It's not one extra read - you'd have to look at every PGPROC.

If the above explanation is right, is this the reason that updating
RecentGlobalXmin has to go through every PGPROC?

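The arithmetic in this exchange can be replayed with a minimal sketch.
It assumes that every XID below the new xmax has already been assigned
(so anything in that range that has not ended is running), and it
ignores subtransactions and XID wraparound. The names Snapshot, advance
and in_progress are hypothetical and are not PostgreSQL functions. The
in_progress check follows the rule described above: an XID greater than
or equal to xmax is always treated as still running.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    xmin: int    # lowest XID still considered running
    xmax: int    # XID following the last one that ended
    xip: tuple   # XIDs in [xmin, xmax) that are still running


def advance(prev, ended):
    """Derive a new snapshot from a previous one plus the XIDs that ended since."""
    ended = set(ended)
    xmax = max(max(ended) + 1, prev.xmax) if ended else prev.xmax
    # Candidates: everything the old snapshot listed as running, plus every
    # XID between the old and the new xmax (assumed to have been assigned).
    candidates = set(prev.xip) | set(range(prev.xmax, xmax))
    xip = tuple(sorted(candidates - ended))
    xmin = xip[0] if xip else xmax
    return Snapshot(xmin=xmin, xmax=xmax, xip=xip)


def in_progress(snap, xid):
    """True if the snapshot must treat xid as still running."""
    if xid < snap.xmin:
        return False   # ended before the snapshot was taken
    if xid >= snap.xmax:
        return True    # at or beyond xmax: always treated as running
    return xid in snap.xip


# Numbers from the example in this thread.
prev = Snapshot(xmin=5, xmax=15, xip=(8, 10, 12))
new = advance(prev, ended=[8, 10, 12, 18, 20, 21])

# The same snapshot with xmax = 21 instead of 22, for comparison.
with_xmax_21 = Snapshot(xmin=15, xmax=21, xip=(15, 16, 17, 19))
```

Under these assumptions, new comes out as xmin = 15, xmax = 22,
xip = (15, 16, 17, 19). With xmax = 22, in_progress(new, 21) is False,
so XID 21 is recognised as ended. With xmax = 21, the same check reports
XID 21 as still running even though it is in the completed list, which
is consistent with defining xmax as the XID following the last one that
ended.
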
-----Original Message-----
From: Robert Haas [mailto:robertmh...@gmail.com]
Sent: Monday, September 12, 2011 9:31 PM
To: Amit Kapila
Cc: pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] cheaper snapshots redux

On Mon, Sep 12, 2011 at 11:07 AM, Amit Kapila <amit.kap...@huawei.com> wrote:
>> If you know what transactions were running the last time a snapshot summary
>> was written and what transactions have ended since then, you can work out
>> the new xmin on the fly. I have working code for this and it's actually
>> quite simple.
>
> I believe one method to do same is as follows:
>
> Let us assume at some point of time the snapshot and completed XID list is
> somewhat as follows:
>
> Snapshot
>
> { Xmin 5, Xip[] 8 10 12, Xmax - 15 }
>
> Committed XIDS 8, 10 , 12, 18, 20, 21
>
> So it means 16,17,19 are running transactions. So it will behave as follows:
>
> { Xmin 16, Xmax 21, Xip[] 17,19 }

Yep, that's pretty much what it does, although xmax is actually
defined as the XID *following* the last one that ended, and I think
xmin needs to also be in xip, so in this case you'd actually end up
with xmin = 15, xmax = 22, xip = { 15, 16, 17, 19 }. But you've got
the basic idea of it.

> But if we do above way to calculate Xmin, we need to check in existing Xip
> array and committed Xid array to find Xmin. Wont this cause reasonable time
> even though it is outside lock time if Xip and Xid are large.

Yes, Tom raised this concern earlier. I can't answer it for sure
without benchmarking, but clearly xip[] can't be allowed to get too
big.

>> Because GetSnapshotData() computes a new value for RecentGlobalXmin by
>> scanning the ProcArray. This isn't costing a whole lot extra right now
>> because the xmin and xid fields are normally in the same cache line, so
>> once you've looked at one of them it doesn't cost that much extra to
>> look at the other.
>> If, on the other hand, you're not looking at (or even locking) the
>> ProcArray, then doing so just to recompute RecentGlobalXmin sucks.
>
> Yes, this is more time as compare to earlier, but if our approach to
> calculate Xmin is like above point, then one extra read outside lock should
> not matter. However if for above point approach is different then it will be
> costlier.

It's not one extra read - you'd have to look at every PGPROC. And it
is not outside a lock, either. You definitely need locking around
computing RecentGlobalXmin; see src/backend/access/transam/README. In
particular, if someone with proc->xmin = InvalidTransactionId is
taking a snapshot while you're computing RecentGlobalXmin, and then
stores a proc->xmin less than your newly-computed RecentGlobalXmin,
you've got a problem. That can't happen right now because no
transactions can commit while RecentGlobalXmin is being computed, but
the point here is precisely to allow those operations to (mostly) run
in parallel.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
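The interleaving described above can be replayed step by step. This is
only a deterministic illustration of one ordering, not concurrent code
and not how PostgreSQL stores this state: the dictionary proc_xmins,
the backend names, the specific XIDs, and the helper scan_global_xmin
are all hypothetical, and INVALID_XID merely stands in for
InvalidTransactionId.

```python
INVALID_XID = 0  # stand-in for InvalidTransactionId


def scan_global_xmin(proc_xmins):
    """Minimum of the published xmins, skipping backends that have none."""
    valid = [x for x in proc_xmins.values() if x != INVALID_XID]
    return min(valid) if valid else None


# Backend A has read a snapshot with xmin = 15 from shared memory but has
# not yet stored that value in its (hypothetical) proc entry.
proc_xmins = {"A": INVALID_XID, "B": 17, "C": 19}
pending_xmin_for_a = 15

# Another backend scans now and skips A, because A's xmin is still invalid.
global_xmin = scan_global_xmin(proc_xmins)

# A publishes its xmin only after the scan has finished.
proc_xmins["A"] = pending_xmin_for_a

# A now advertises an xmin below the value that was just computed.
violated = proc_xmins["A"] < global_xmin
```

In this ordering the scan yields 17 while backend A ends up advertising
15, so violated is True. Rescanning after A has published would yield
15, which is why the scan and the publication of proc->xmin cannot be
allowed to interleave freely without some form of locking.
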