Andres Freund wrote:
> On 2014-04-21 19:43:15 -0400, Andrew Dunstan wrote:
> > 
> > On 04/21/2014 02:54 PM, Andres Freund wrote:
> > >Hi,
> > >
> > >I spent the last two hours poking around in the environment Andrew
> > >provided and I was able to reproduce the issue, find an assert to
> > >reproduce it much faster and find a possible root cause.
> > 
> > 
> > What's the assert that makes it happen faster? That might help a lot in
> > constructing a self-contained test.
> 
> Assertion and *preliminary*, *hacky* fix attached.

Thanks for the analysis and patches.  I've been playing with this on my
own a bit, and one thing I just noticed is that, at least for
heap_update, I cannot reproduce a problem when the xmax is originally a
multixact, so AFAICT not as many places need patching.

Some testing later, I think the issue only occurs if we determine that
we don't need to wait for the xid/multi to complete, because otherwise
the wait itself saves us.  (It's easy to cause the problem by adding a
breakpoint in heapam.c:3325, i.e. just before re-acquiring the buffer
lock, and then having transaction A lock for key share, then transaction
B update the tuple which stops at the breakpoint, then transaction A
also update the tuple, and finally release transaction B).

For now I offer a cleaned-up version of your patch to add the assertion
that multis don't contain multiple updates.  I considered making this
#ifdef USE_ASSERT_CHECKING, because it has to walk the complete array of
members, and then having full elogs in MultiXactIdExpand and
MultiXactIdCreate, which are lighter because they can check more
easily.  But on second thought I refrained from doing that, because
surely the arrays are not that large anyway, are they?
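
To make the cost argument concrete, here is a minimal standalone sketch of
the check, outside the backend.  The enum values are assumed to mirror the
ordering in multixact.h, the types are simplified stand-ins, and
count_updating_members and members_are_sane are hypothetical helper names
used only for illustration; the actual patch raises elog(ERROR) inline
instead of returning a flag.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for MultiXactStatus; the ordering is assumed to match multixact.h. */
typedef enum
{
	MultiXactStatusForKeyShare = 0x00,
	MultiXactStatusForShare = 0x01,
	MultiXactStatusForNoKeyUpdate = 0x02,
	MultiXactStatusForUpdate = 0x03,
	/* statuses above ForUpdate are real tuple updates, not just locks */
	MultiXactStatusNoKeyUpdate = 0x04,
	MultiXactStatusUpdate = 0x05
} MultiXactStatus;

/* does a status value correspond to a tuple update? (same macro as the patch) */
#define ISUPDATE_from_mxstatus(status) \
			((status) > MultiXactStatusForUpdate)

/* Simplified stand-ins for the backend types. */
typedef unsigned int TransactionId;

typedef struct MultiXactMember
{
	TransactionId xid;
	MultiXactStatus status;
} MultiXactMember;

/* Hypothetical helper: one linear pass over the members, counting updaters. */
static int
count_updating_members(int nmembers, const MultiXactMember *members)
{
	int			i;
	int			nupdates = 0;

	assert(nmembers >= 0);
	for (i = 0; i < nmembers; i++)
	{
		if (ISUPDATE_from_mxstatus(members[i].status))
			nupdates++;
	}
	return nupdates;
}

/* Hypothetical helper: a sane multixact has at most one updating member. */
static bool
members_are_sane(int nmembers, const MultiXactMember *members)
{
	return count_updating_members(nmembers, members) <= 1;
}
```

The walk is O(nmembers) with a single comparison per member, which is why
leaving it enabled in non-assert builds seems tolerable as long as member
arrays stay small.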

I think I should push this patch first, so that Andrew and Josh can try
their respective test cases which should start throwing errors, then
push the actual fixes.  Does that sound okay?

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
*** a/src/backend/access/heap/heapam.c
--- b/src/backend/access/heap/heapam.c
***************
*** 190,198 **** static const int MultiXactStatusLock[MaxMultiXactStatus + 1] =
  /* Get the LockTupleMode for a given MultiXactStatus */
  #define TUPLOCK_from_mxstatus(status) \
  			(MultiXactStatusLock[(status)])
- /* Get the is_update bit for a given MultiXactStatus */
- #define ISUPDATE_from_mxstatus(status) \
- 			((status) > MultiXactStatusForUpdate)
  
  /* ----------------------------------------------------------------
   *						 heap support routines
--- 190,195 ----
*** a/src/backend/access/transam/multixact.c
--- b/src/backend/access/transam/multixact.c
***************
*** 457,463 **** MultiXactIdExpand(MultiXactId multi, TransactionId xid, MultiXactStatus status)
  	for (i = 0, j = 0; i < nmembers; i++)
  	{
  		if (TransactionIdIsInProgress(members[i].xid) ||
! 			((members[i].status > MultiXactStatusForUpdate) &&
  			 TransactionIdDidCommit(members[i].xid)))
  		{
  			newMembers[j].xid = members[i].xid;
--- 457,463 ----
  	for (i = 0, j = 0; i < nmembers; i++)
  	{
  		if (TransactionIdIsInProgress(members[i].xid) ||
! 			(ISUPDATE_from_mxstatus(members[i].status) &&
  			 TransactionIdDidCommit(members[i].xid)))
  		{
  			newMembers[j].xid = members[i].xid;
***************
*** 713,718 **** MultiXactIdCreateFromMembers(int nmembers, MultiXactMember *members)
--- 713,734 ----
  		return multi;
  	}
  
+ 	/* Verify that there is at most one update Xid among the given members. */
+ 	{
+ 		int			i;
+ 		bool		has_update = false;
+ 
+ 		for (i = 0; i < nmembers; i++)
+ 		{
+ 			if (ISUPDATE_from_mxstatus(members[i].status))
+ 			{
+ 				if (has_update)
+ 					elog(ERROR, "new multixact has more than one updating member");
+ 				has_update = true;
+ 			}
+ 		}
+ 	}
+ 
  	/*
  	 * Assign the MXID and offsets range to use, and make sure there is space
  	 * in the OFFSETs and MEMBERs files.  NB: this routine does
*** a/src/include/access/multixact.h
--- b/src/include/access/multixact.h
***************
*** 48,53 **** typedef enum
--- 48,57 ----
  
  #define MaxMultiXactStatus MultiXactStatusUpdate
  
+ /* does a status value correspond to a tuple update? */
+ #define ISUPDATE_from_mxstatus(status) \
+ 			((status) > MultiXactStatusForUpdate)
+ 
  
  typedef struct MultiXactMember
  {
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
