On 2/8/19 12:32 PM, Ideriha, Takeshi wrote: >> From: Tomas Vondra [mailto:tomas.von...@2ndquadrant.com] >> But it's a bit funnier, because there's also DropRelationFiles() which does >> smgrclose on >> a batch of relations too, and it says this >> >> /* >> * Call smgrclose() in reverse order as when smgropen() is called. >> * This trick enables remove_from_unowned_list() in smgrclose() >> * to search the SMgrRelation from the unowned list, >> * with O(1) performance. >> */ >> for (i = ndelrels - 1; i >= 0; i--) >> ... >> >> but it's called from two places in xact.c only. And if you trigger the issue >> with 100k x >> CREATE TABLE + ROLLBACK, and then force a recovery by killing postmaster, you >> actually get the very same behavior with always traversing the unowned list >> for some >> reason. I'm not quite sure why, but it seems the optimization is exactly the >> wrong thing >> to do here. > > So when DropRelationFiles() is called, order of calling smgr_close() and order of unowed list is always same? > Well, let's say you create 10k tables in a single transaction, and then kill the server with "kill -9" right after the commit. Then on restart, during recovery this happens 2019-02-08 14:16:21.781 CET [12817] LOG: remove_from_unowned_list 10015 2019-02-08 14:16:21.871 CET [12817] LOG: remove_from_unowned_list 10005 2019-02-08 14:16:21.967 CET [12817] LOG: remove_from_unowned_list 10004 2019-02-08 14:16:22.057 CET [12817] LOG: remove_from_unowned_list 10001 2019-02-08 14:16:22.147 CET [12817] LOG: remove_from_unowned_list 10000 2019-02-08 14:16:22.238 CET [12817] LOG: remove_from_unowned_list 9999 2019-02-08 14:16:22.327 CET [12817] LOG: remove_from_unowned_list 9998 2019-02-08 14:16:22.421 CET [12817] LOG: remove_from_unowned_list 9996 2019-02-08 14:16:22.513 CET [12817] LOG: remove_from_unowned_list 9995 2019-02-08 14:16:22.605 CET [12817] LOG: remove_from_unowned_list 9994 2019-02-08 14:16:22.696 CET [12817] LOG: remove_from_unowned_list 9993 ... 2019-02-08 14:19:13.921 CET [12817] LOG: remove_from_unowned_list 8396 2019-02-08 14:19:14.025 CET [12817] LOG: remove_from_unowned_list 8395 2019-02-08 14:19:14.174 CET [12817] LOG: remove_from_unowned_list 8394 2019-02-08 14:19:14.277 CET [12817] LOG: remove_from_unowned_list 8393 2019-02-08 14:19:14.387 CET [12817] LOG: remove_from_unowned_list 8392 2019-02-08 14:19:14.508 CET [12817] LOG: remove_from_unowned_list 8391 2019-02-08 14:19:14.631 CET [12817] LOG: remove_from_unowned_list 8390 2019-02-08 14:19:14.770 CET [12817] LOG: remove_from_unowned_list 8389 ... This is with the attached debug patch, which simply prints index of the unowned list index entry. And yes, it took ~3 minutes to get from 10k to 8k (at which point I interrupted the recovery and killed the cluster). I see similar issue after creating a lot of tables in the same xact and rolling it back, although in this case it's faster for some reason. > This one was inroduced at b4166911 and I'd like to hear author and reviewer's > opinion. > https://www.postgresql.org/message-id/CAHGQGwH0hwXwrCDnmUU2Twj5YgHcrmMCVD7O%3D1NrRTpHcbtCBw%40mail.gmail.com > That probably explains why we haven't seen the issue before. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
diff --git a/src/backend/storage/smgr/smgr.c b/src/backend/storage/smgr/smgr.c index 0c0bba4ab3..8eb724b80d 100644 --- a/src/backend/storage/smgr/smgr.c +++ b/src/backend/storage/smgr/smgr.c @@ -281,6 +281,7 @@ remove_from_unowned_list(SMgrRelation reln) { SMgrRelation *link; SMgrRelation cur; + int idx = 0; for (link = &first_unowned_reln, cur = *link; cur != NULL; @@ -292,7 +293,12 @@ remove_from_unowned_list(SMgrRelation reln) cur->next_unowned_reln = NULL; break; } + + idx++; } + + if (idx > 0) + elog(LOG, "remove_from_unowned_list %d", idx); } /*