hi, all of the following answers are with the patch you provided in other mail applied.
> YAMAMOTO Takashi <y...@mwd.biglobe.ne.jp> wrote: > >> i have seen this actually happen. i've confirmed the creation of >> the loop with the attached patch. it's easily reproducable with >> my application. i can provide the full source code of my >> application if you want. (but it isn't easy to run unless you are >> familiar with the recent version of NetBSD) >> i haven't found a smaller reproducible test case yet. > > I've never used NetBSD, so maybe a few details will help point me in > the right direction faster than the source code. > > Has your application ever triggered any of the assertions in the > code? (In particular, it would be interesting if it ever hit the > one right above where you patched.) the assertion right above is sometimes triggered. sometimes not. > > How long was the loop? see below. > Did you notice whether the loop involved multiple tuples within a > single page? if i understand correctly, yes. the following is a snippet of my debug code (dump targets when triggerCheckTargetForConflictsIn loops >1000 times) and its output. the same locktag_field3 value means the same page, right? + for (t = target, i = 0; t != NULL; i++) { + elog(WARNING, "[%u] target %p tag %" PRIx32 ":%" PRIx32 ":%" PRIx32 + ":%" PRIx16 ":%" PRIx16 " prior %p next %p", i, t, + t->tag.locktag_field1, + t->tag.locktag_field2, + t->tag.locktag_field3, + t->tag.locktag_field4, + t->tag.locktag_field5, + t->priorVersionOfRow, + t->nextVersionOfRow); + t = t->priorVersionOfRow; + if (t == target) { + elog(WARNING, "found a loop"); + break; + } + } WARNING: [0] target 0xbb51f238 tag 4000:4017:53b:6c:0 prior 0xbb51f350 next 0xbb51f350 WARNING: [1] target 0xbb51f350 tag 4000:4017:53b:69:0 prior 0xbb51f238 next 0xbb51f238 WARNING: found a loop another sample: WARNING: [0] target 0xbb51f530 tag 4000:4017:565:ae:0 prior 0xbb51f1e8 next 0xbb51f300 WARNING: [1] target 0xbb51f1e8 tag 4000:4017:565:ad:0 prior 0xbb51f580 next 0xbb51f530 WARNING: [2] target 0xbb51f580 tag 4000:4017:565:ac:0 prior 0xbb51f300 next 0xbb51f1e8 WARNING: [3] target 0xbb51f300 tag 4000:4017:565:ab:0 prior 0xbb51f530 next 0xbb51f580 WARNING: found a loop the table seems mostly hot-updated, if it matters. hoge=# select * from pg_stat_user_tables where relid=16407; -[ RECORD 1 ]-----+-------------------- relid | 16407 schemaname | pgfs relname | file seq_scan | 0 seq_tup_read | 0 idx_scan | 53681 idx_tup_fetch | 52253 n_tup_ins | 569 n_tup_upd | 12054 n_tup_del | 476 n_tup_hot_upd | 12041 n_live_tup | 93 n_dead_tup | 559 last_vacuum | last_autovacuum | last_analyze | last_autoanalyze | vacuum_count | 0 autovacuum_count | 0 analyze_count | 4922528128875102208 autoanalyze_count | 7598807461784802080 (values in the last two columns seems bogus. i don't know if it's related or not.) > Did this coincide with an autovacuum of the table? no. (assuming that autovacuum=off in postgresql.conf is enough to exclude the possibility.) > > These last two are of interest because it seems likely that such a > cycle might be related to this new code not properly allowing for > some aspect of tuple cleanup. > > Thanks for finding this and reporting it, and thanks in advance for > any further detail you can provide. thanks for looking. YAMAMOTO Takashi > > -Kevin > > -- > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-hackers -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers