On Sat, Jul 04, 2015 at 08:07:49PM -0400, Noah Misch wrote: > On Sun, Jul 05, 2015 at 12:54:43AM +0200, Andres Freund wrote: > > On 2015-07-04 18:40:41 -0400, Noah Misch wrote: > > > (1) "IBM XL C/C++ for AIX, V12.1 (5765-J02, 5725-C72)". Getting it > > > working > > > required the attached patch. > > > Will you apply? Having the ability to test change seems to put you in a > > much better spot then me. > > I will.
That commit (0d32d2e) permitted things to compile and usually pass tests, but I missed the synchronization bug. Since 2015-10-01, the buildfarm has seen sixteen duplicate-catalog-OID failures. The compiler has always been xlC, and the branch has been HEAD or 9.5. Examples: http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=hornet&dt=2015-12-15%2004%3A12%3A49 http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=hornet&dt=2015-12-08%2001%3A20%3A16 http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=mandrill&dt=2015-12-11%2005%3A25%3A54 These suggested OidGenLock wasn't doing its job. I've seen similar symptoms around WALInsertLocks with "IBM XL C/C++ for Linux, V13.1.2 (5725-C73, 5765-J08)" for ppc64le. The problem is generic-xlc.h pg_atomic_compare_exchange_u32_impl() issuing __isync() before __compare_and_swap(). __isync() shall follow __compare_and_swap(); see our own s_lock.h, its references, and other projects' usage: https://github.com/boostorg/smart_ptr/blob/boost-1.60.0/include/boost/smart_ptr/detail/sp_counted_base_vacpp_ppc.hpp#L55 http://redmine.gromacs.org/projects/gromacs/repository/entry/src/external/thread_mpi/include/thread_mpi/atomic/xlc_ppc.h?rev=v5.1.2#L112 https://www-01.ibm.com/support/knowledgecenter/SSGH2K_13.1.2/com.ibm.xlc131.aix.doc/language_ref/asm_example.html This patch's test case would have failed about half the time under today's generic-xlc.h. Fast machines run it in about 1s. A test that detects the bug 99% of the time would run far longer, hence this compromise. Archaeology enthusiasts may enjoy the bug's changing presentation over time. LWLockAcquire() exposed it after commit ab5194e redesigned lwlock.c in terms of atomics. Assert builds were unaffected for commits in [ab5194e, bbfd7ed); atomics.h asserts fattened code enough to mask the bug. However, removing the AssertPointerAlignment() from pg_atomic_fetch_or_u32() would have sufficed to surface it in assert builds. All builds demonstrated the bug once bbfd7ed introduced xlC pg_attribute_noreturn(), which elicits a simpler instruction sequence around the ExceptionalCondition() call embedded in each Assert. nm
diff --git a/src/bin/pgbench/.gitignore b/src/bin/pgbench/.gitignore index aae819e..983a3cd 100644 --- a/src/bin/pgbench/.gitignore +++ b/src/bin/pgbench/.gitignore @@ -1,3 +1,4 @@ /exprparse.c /exprscan.c /pgbench +/tmp_check/ diff --git a/src/bin/pgbench/Makefile b/src/bin/pgbench/Makefile index 18fdf58..e9b1b74 100644 --- a/src/bin/pgbench/Makefile +++ b/src/bin/pgbench/Makefile @@ -40,3 +40,9 @@ clean distclean: maintainer-clean: distclean rm -f exprparse.c exprscan.c + +check: + $(prove_check) + +installcheck: + $(prove_installcheck) diff --git a/src/bin/pgbench/t/001_pgbench.pl b/src/bin/pgbench/t/001_pgbench.pl new file mode 100644 index 0000000..88e83ab --- /dev/null +++ b/src/bin/pgbench/t/001_pgbench.pl @@ -0,0 +1,25 @@ +use strict; +use warnings; + +use PostgresNode; +use TestLib; +use Test::More tests => 3; + +# Test concurrent insertion into table with UNIQUE oid column. DDL expects +# GetNewOidWithIndex() to successfully avoid violating uniqueness for indexes +# like pg_class_oid_index and pg_proc_oid_index. This indirectly exercises +# LWLock and spinlock concurrency. This test makes a 5-MiB table. +my $node = get_new_node('main'); +$node->init; +$node->start; +$node->psql('postgres', + 'CREATE UNLOGGED TABLE oid_tbl () WITH OIDS; ' + . 'ALTER TABLE oid_tbl ADD UNIQUE (oid);'); +my $script = $node->basedir . '/pgbench_script'; +append_to_file($script, + 'INSERT INTO oid_tbl SELECT FROM generate_series(1,1000);'); +$node->command_like( + [ qw(pgbench --no-vacuum --client=5 --protocol=prepared + --transactions=25 --file), $script ], + qr{processed: 125/125}, + 'concurrent OID generation'); diff --git a/src/include/port/atomics/generic-xlc.h b/src/include/port/atomics/generic-xlc.h index 1e8a11e..f24e3af 100644 --- a/src/include/port/atomics/generic-xlc.h +++ b/src/include/port/atomics/generic-xlc.h @@ -41,18 +41,22 @@ pg_atomic_compare_exchange_u32_impl(volatile pg_atomic_uint32 *ptr, uint32 *expected, uint32 newval) { /* + * XXX: __compare_and_swap is defined to take signed parameters, but that + * shouldn't matter since we don't perform any arithmetic operations. + */ + bool ret = __compare_and_swap((volatile int*)&ptr->value, + (int *)expected, (int)newval); + + /* * xlc's documentation tells us: * "If __compare_and_swap is used as a locking primitive, insert a call to * the __isync built-in function at the start of any critical sections." + * + * The critical section begins immediately after __compare_and_swap(). */ __isync(); - /* - * XXX: __compare_and_swap is defined to take signed parameters, but that - * shouldn't matter since we don't perform any arithmetic operations. - */ - return __compare_and_swap((volatile int*)&ptr->value, - (int *)expected, (int)newval); + return ret; } #define PG_HAVE_ATOMIC_FETCH_ADD_U32 @@ -69,10 +73,12 @@ static inline bool pg_atomic_compare_exchange_u64_impl(volatile pg_atomic_uint64 *ptr, uint64 *expected, uint64 newval) { + bool ret = __compare_and_swaplp((volatile long*)&ptr->value, + (long *)expected, (long)newval); + __isync(); - return __compare_and_swaplp((volatile long*)&ptr->value, - (long *)expected, (long)newval);; + return ret; } #define PG_HAVE_ATOMIC_FETCH_ADD_U64 diff --git a/src/tools/msvc/clean.bat b/src/tools/msvc/clean.bat index feb0fe5..d21692f 100755 --- a/src/tools/msvc/clean.bat +++ b/src/tools/msvc/clean.bat @@ -94,6 +94,7 @@ if exist src\bin\pg_basebackup\tmp_check rd /s /q src\bin\pg_basebackup\tmp_chec if exist src\bin\pg_config\tmp_check rd /s /q src\bin\pg_config\tmp_check if exist src\bin\pg_ctl\tmp_check rd /s /q src\bin\pg_ctl\tmp_check if exist src\bin\pg_rewind\tmp_check rd /s /q src\bin\pg_rewind\tmp_check +if exist src\bin\pgbench\tmp_check rd /s /q src\bin\pgbench\tmp_check if exist src\bin\scripts\tmp_check rd /s /q src\bin\scripts\tmp_check REM Clean up datafiles built with contrib
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers