Bug#932758: OPD checkpoint deadlock in libdb5.3.28-12+deb9u1

Thomas Walker Mon, 22 Jul 2019 12:15:13 -0700

Package: libdb5.3
Version: 5.3.28-12+deb9u1

I've been running into https://bugzilla.redhat.com/show_bug.cgi?id=1349779
for some time now and patched it locally more than a year ago and the issue
went away.  Only recently when we updated to a more recent version of Debian
and began running into the issue again did I realize that I had never pushed
the patch with Debian upstream.


Unfortunately, I don't have an easy reproducer (we run into it every month or
two, but presumably increasing compaction frequency in 389ds would make it
happen somewhat sooner).

Note that this also used to be:
https://pagure.io/389-ds-base/issue/49360
Which is what I originally found, but that issue seems to have disappeared.
The patch I got from there in Jan 2018 and the one in Fedora appear to be the
same (with the exception of some fuzz) though.

Patch is little more than:

$ cat debian/patches/checkpoint-opd-deadlock.patch 
Index: db5.3-5.3.28/src/db/db_cam.c
===================================================================
--- db5.3-5.3.28.orig/src/db/db_cam.c
+++ db5.3-5.3.28/src/db/db_cam.c
@@ -868,6 +868,11 @@ __dbc_iget(dbc, key, data, flags)
            flags == DB_PREV || flags == DB_PREV_DUP)) {
                if (tmp_rmw && (ret = dbc->am_writelock(dbc)) != 0)
                        goto err;
+               /* Latch the primary tree page here in order to not deadlock 
later. */
+               if (cp->page == NULL &&
+                       (ret = __memp_fget(mpf, &cp->pgno,
+                            dbc->thread_info, dbc->txn, 0, &cp->page)) != 0)
+                           goto err;
                if (F_ISSET(dbc, DBC_TRANSIENT))
                        opd = cp->opd;
                else if ((ret = __dbc_idup(cp->opd, &opd, DB_POSITION)) != 0)

Bug#932758: OPD checkpoint deadlock in libdb5.3.28-12+deb9u1

Reply via email to