On Mon, May 23, 2022 at 2:10 PM David Marchand
<david.march...@redhat.com> wrote:
>
> On Mon, May 23, 2022 at 5:17 AM <zhichaox.z...@intel.com> wrote:
> >
> > From: Zhichao Zeng <zhichaox.z...@intel.com>
> >
> > The eal-intr-thread is not closed before memory cleanup in the
> > process of exiting. There is a small probability that when the
> > eal-intr-thread is about to use some pointers, the memory were
> > just cleaned, which cause the segment fault error caught by ASan.
> >
> > This patch close the eal-intr-thread before memory cleanup when
> > exiting to avoid segment fault.
>
> This breaks the debug_autotest unit test.
> It results in a segfault in a forked process executing
> rte_exit()/rte_eal_cleanup().
>
> That's probably because intr_thread thread does not exist in the forked 
> process.

Reading fork() manual:
       *  The  child  process is created with a single thread—the one
that called fork().  The entire virtual address space of the parent is
replicated in the child, including the states of mutexes, condi‐
          tion variables, and other pthreads objects; the use of
pthread_atfork(3) may be helpful for dealing with problems that this
can cause.


We may need a check like diff below.
But then, debug_autotest code seems dangerous, because it does exactly
what the added check wants to warn about.

Opinions?


diff --git a/lib/eal/linux/eal.c b/lib/eal/linux/eal.c
index 1ef263434a..1e6fd01d5d 100644
--- a/lib/eal/linux/eal.c
+++ b/lib/eal/linux/eal.c
@@ -857,12 +857,25 @@ is_iommu_enabled(void)
        return n > 2;
 }

+static uint32_t run_once;
+
+static void warn_parent(void)
+{
+       RTE_LOG(WARNING, EAL, "fork() was called, DPDK won't work in the child "
+               "process unless it calls rte_eal_init()\n");
+}
+
+static void scratch_child(void)
+{
+       /* Scratch run_once so that a call to rte_eal_cleanup won't crash... */
+       __atomic_store_n(&run_once, 0, __ATOMIC_RELAXED);
+}
+
 /* Launch threads, called at application init(). */
 int
 rte_eal_init(int argc, char **argv)
 {
        int i, fctret, ret;
-       static uint32_t run_once;
        uint32_t has_run = 0;
        const char *p;
        static char logid[PATH_MAX];
@@ -1228,6 +1241,8 @@ rte_eal_init(int argc, char **argv)

        eal_mcfg_complete();

+       pthread_atfork(NULL, warn_parent, scratch_child);
+
        return fctret;
 }

@@ -1257,6 +1272,9 @@ rte_eal_cleanup(void)
        struct internal_config *internal_conf =
                eal_get_internal_configuration();

+       if (__atomic_load_n(&run_once, __ATOMIC_RELAXED) == 0)
+               return 0;
+
        if (rte_eal_process_type() == RTE_PROC_PRIMARY &&
                        internal_conf->hugepage_file.unlink_existing)
                rte_memseg_walk(mark_freeable, NULL);


-- 
David Marchand

Reply via email to