Package: libdb5.3 Version: 5.3.28-12+deb9u1 I've been running into https://bugzilla.redhat.com/show_bug.cgi?id=1349779 for some time now and patched it locally more than a year ago and the issue went away. Only recently when we updated to a more recent version of Debian and began running into the issue again did I realize that I had never pushed the patch with Debian upstream.
Unfortunately, I don't have an easy reproducer (we run into it every month or two, but presumably increasing compaction frequency in 389ds would make it happen somewhat sooner). Note that this also used to be: https://pagure.io/389-ds-base/issue/49360 Which is what I originally found, but that issue seems to have disappeared. The patch I got from there in Jan 2018 and the one in Fedora appear to be the same (with the exception of some fuzz) though. Patch is little more than: $ cat debian/patches/checkpoint-opd-deadlock.patch Index: db5.3-5.3.28/src/db/db_cam.c =================================================================== --- db5.3-5.3.28.orig/src/db/db_cam.c +++ db5.3-5.3.28/src/db/db_cam.c @@ -868,6 +868,11 @@ __dbc_iget(dbc, key, data, flags) flags == DB_PREV || flags == DB_PREV_DUP)) { if (tmp_rmw && (ret = dbc->am_writelock(dbc)) != 0) goto err; + /* Latch the primary tree page here in order to not deadlock later. */ + if (cp->page == NULL && + (ret = __memp_fget(mpf, &cp->pgno, + dbc->thread_info, dbc->txn, 0, &cp->page)) != 0) + goto err; if (F_ISSET(dbc, DBC_TRANSIENT)) opd = cp->opd; else if ((ret = __dbc_idup(cp->opd, &opd, DB_POSITION)) != 0)