Rusty Conover wrote:
> It seems like this is a race condition caused by the system catalog cache not 
> being locked properly. I've included a perl script below that causes the 
> crash on my box consistently.
> 
> The script forks two different types of processes:
> 
> #1 - begin transaction, create a few temp tables and analyze them in a 
> transaction, commit (running in database foobar_1)
> #2 - begin transaction, truncate table, insert records into table from select 
> in a transaction, commit (running in database foobar_2)
> 
> I set up the process to have 10 instances of task #1 and 1 instance of task #2.
> 
> Running this script causes the crash of postgres within seconds on my box.

Thanks, that script crashes on my laptop too, with assertions enabled.

According to the comments above RelationClearRelation(), if it's called
with 'rebuild=true', the caller should hold a lock on the relation, i.e.
refcnt > 0. That's not the case in RelationFlushRelation() when it
rebuilds a new relcache entry.

The attached patch should fix that by incrementing the reference count
while the entry is rebuilt. It also adds an assertion in
RelationClearRelation() to check that the refcnt is indeed > 0.
Comments?
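
To make the invariant easier to discuss, here is a minimal standalone
sketch of the pin-while-rebuilding idea. It is not the relcache code:
CacheEntry, entry_clear() and entry_flush() are hypothetical stand-ins
for the relcache entry, RelationClearRelation() and
RelationFlushRelation(), and the "rebuild" is reduced to bumping a
counter. It only illustrates that the assertion holds on every path
once the flush routine takes a temporary reference.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a relcache entry (not the real RelationData). */
typedef struct CacheEntry
{
	int		refcnt;		/* number of active references (pins) */
	bool	is_new;		/* created or given new storage in this transaction */
	int		rebuilds;	/* how many times the entry was rebuilt in place */
	bool	dropped;	/* true once the entry was removed from the cache */
} CacheEntry;

/* Stand-in for RelationClearRelation(): rebuild in place or drop. */
void
entry_clear(CacheEntry *e, bool rebuild)
{
	/* Same condition as the assertion added by the patch. */
	assert((rebuild && e->refcnt > 0) ||
		   (!rebuild && e->refcnt == 0));

	if (rebuild)
		e->rebuilds++;		/* the real code reloads catalog data here */
	else
		e->dropped = true;	/* nobody references it, so discard it */
}

/* Stand-in for RelationFlushRelation() as changed by the patch. */
void
entry_flush(CacheEntry *e)
{
	if (e->is_new)
	{
		/* Must keep the entry; pin it so the rebuild sees refcnt > 0. */
		e->refcnt++;
		entry_clear(e, true);
		e->refcnt--;
	}
	else
	{
		/* Pre-existing entries are dropped unless someone has them open. */
		entry_clear(e, e->refcnt != 0);
	}
}
```

Without the temporary increment, flushing an unreferenced "new" entry
would reach entry_clear() with rebuild = true and refcnt == 0, which is
the case the assertion is meant to catch.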

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com
Index: src/backend/utils/cache/relcache.c
===================================================================
RCS file: /cvsroot/pgsql/src/backend/utils/cache/relcache.c,v
retrieving revision 1.287.2.3
diff -c -r1.287.2.3 relcache.c
*** src/backend/utils/cache/relcache.c	13 Jan 2010 23:07:15 -0000	1.287.2.3
--- src/backend/utils/cache/relcache.c	14 Apr 2010 11:09:23 -0000
***************
*** 1773,1778 ****
--- 1773,1781 ----
  {
  	Oid			old_reltype = relation->rd_rel->reltype;
  
+ 	Assert((rebuild && relation->rd_refcnt > 0) ||
+ 		   (!rebuild && relation->rd_refcnt == 0));
+ 
  	/*
  	 * Make sure smgr and lower levels close the relation's files, if they
  	 * weren't closed already.  If the relation is not getting deleted, the
***************
*** 1968,1975 ****
  static void
  RelationFlushRelation(Relation relation)
  {
- 	bool		rebuild;
- 
  	if (relation->rd_createSubid != InvalidSubTransactionId ||
  		relation->rd_newRelfilenodeSubid != InvalidSubTransactionId)
  	{
--- 1971,1976 ----
***************
*** 1978,1994 ****
  		 * forget the "new" status of the relation, which is a useful
  		 * optimization to have.  Ditto for the new-relfilenode status.
  		 */
! 		rebuild = true;
  	}
  	else
  	{
  		/*
  		 * Pre-existing rels can be dropped from the relcache if not open.
  		 */
! 		rebuild = !RelationHasReferenceCountZero(relation);
  	}
- 
- 	RelationClearRelation(relation, rebuild);
  }
  
  /*
--- 1979,1996 ----
  		 * forget the "new" status of the relation, which is a useful
  		 * optimization to have.  Ditto for the new-relfilenode status.
  		 */
! 		RelationIncrementReferenceCount(relation);
! 		RelationClearRelation(relation, true);
! 		RelationDecrementReferenceCount(relation);
  	}
  	else
  	{
  		/*
  		 * Pre-existing rels can be dropped from the relcache if not open.
  		 */
! 		bool rebuild = !RelationHasReferenceCountZero(relation);
! 		RelationClearRelation(relation, rebuild);
  	}
  }
  
  /*