https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=246207

            Bug ID: 246207
           Summary: [geom] geli livelocks during panic
           Product: Base System
           Version: 12.1-STABLE
          Hardware: Any
                OS: Any
            Status: New
          Severity: Affects Some People
          Priority: ---
         Component: kern
          Assignee: b...@freebsd.org
          Reporter: asom...@freebsd.org

Some geli-using machines I administer occasionally panic.  When they do, they
sometimes dump core but often don't.  When they don't, they simply hang after
printing the stack trace, but before printing the uptime.

I've traced the problem to geli's shutdown_pre_sync event handler.  It tries to
destroy each geli device.  We can't simply skip that step if a panic is
underway; erasing the keys is necessary to prevent warm-boot attacks.  The
problem lies in the following lines:

g_eli_destroy:
        sc->sc_flags |= G_ELI_FLAG_DESTROY;
        wakeup(sc);
        /*
         * Wait for kernel threads self destruction.
         */
        while (!LIST_EMPTY(&sc->sc_workers)) {
                msleep(&sc->sc_workers, &sc->sc_queue_mtx, PRIBIO,
                    "geli:destroy", 0);
        }

_sleep:
        if (SCHEDULER_STOPPED_TD(td)) {
                if (lock != NULL && priority & PDROP)
                        class->lc_unlock(lock);
                return (0);
        }

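As you can see, if the scheduler is stopped for the current thread (which it
will be during a panic), then msleep returns immediately without sleeping.  The
worker threads never get a chance to run and remove themselves from
sc_workers, so the list never becomes empty, causing g_eli_destroy to loop
indefinitely.  The obvious solution, which I haven't yet tested, would be to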
skip that section in g_eli_destroy when the scheduler is stopped.  What I don't
understand is why g_eli_destroy _ever_ works during a panic.  Perhaps it has
something to do with the allocation of worker threads among cores?  Perhaps it
only succeeds when all worker threads happen to be on different cores?  I find
that unlikely though, because these servers have thousands of worker threads.
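
To make the reasoning above concrete, here is a small userland model of the
loop.  It is only a sketch, not kernel code and not a tested patch: every name
in it (model_softc, model_msleep, model_destroy, max_spins) is invented for
illustration, the worker list is reduced to a counter, and the spin limit
exists only so that the model terminates where the real loop would not.  In
the kernel, the guard would presumably test SCHEDULER_STOPPED() instead of a
boolean argument.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the parts of the geli softc used here. */
struct model_softc {
	int  workers;       /* models the length of sc->sc_workers */
	bool destroy_flag;  /* models G_ELI_FLAG_DESTROY in sc_flags */
};

/*
 * Models the quoted _sleep() behaviour: with the scheduler stopped it
 * returns at once and nothing else runs.  Otherwise we pretend that one
 * worker ran, noticed the destroy flag, and exited before we woke up.
 */
int
model_msleep(struct model_softc *sc, bool scheduler_stopped)
{
	if (scheduler_stopped)
		return (0);
	if (sc->workers > 0)
		sc->workers--;
	return (0);
}

/*
 * Models the wait loop in g_eli_destroy.  Returns the number of loop
 * iterations, or -1 if max_spins was reached without the worker list
 * draining (the livelock).  skip_wait_when_stopped enables the untested
 * idea of bypassing the wait when the scheduler is stopped.
 */
long
model_destroy(struct model_softc *sc, bool scheduler_stopped,
    bool skip_wait_when_stopped, long max_spins)
{
	long spins = 0;

	assert(max_spins > 0);
	sc->destroy_flag = true;

	if (skip_wait_when_stopped && scheduler_stopped)
		return (0);	/* workers can never run; do not wait */

	while (sc->workers > 0) {
		if (spins >= max_spins)
			return (-1);
		model_msleep(sc, scheduler_stopped);
		spins++;
	}
	return (spins);
}
```

With a running scheduler the model drains the workers and returns.  With the
scheduler stopped and no guard it hits the spin limit with every worker still
listed.  With the guard it returns immediately.  The model says nothing about
whether it is safe to go on and erase the keys while worker threads still
exist; that would have to be checked against the real code.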
