Thanks. A few days ago I wrote a short-n-simple little program that cloned two threads which each did some things with containers. It was definitely racy.
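For anyone who wants to try the same kind of test, here is a minimal sketch. It is not the original program (which is not included in this mail). A couple of threads each run one operation in a loop, and the non-zero returns are counted. The names run_concurrently, op_fn and struct job are made up for illustration, and the operation is a placeholder that would be replaced by a single container call such as c->is_defined.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define NTHREADS 2      /* two threads, as in the test described above */
#define NITER    1000   /* arbitrary repeat count, an assumption */

/* Hypothetical stand-in for one container operation; returns 0 on success. */
typedef int (*op_fn)(void *arg);

struct job {
    op_fn op;
    void *arg;
    int failures;       /* written only by the owning thread */
};

static void *worker(void *p)
{
    struct job *j = p;

    for (int i = 0; i < NITER; i++)
        if (j->op(j->arg) != 0)
            j->failures++;
    return NULL;
}

/*
 * Run the same operation from NTHREADS threads at once.
 * Returns the total number of failed calls, or -1 if a thread
 * could not be created.
 */
static int run_concurrently(op_fn op, void *arg)
{
    pthread_t tid[NTHREADS];
    struct job jobs[NTHREADS];
    int started = 0, total = 0, err = 0;

    assert(op != NULL);

    for (int i = 0; i < NTHREADS; i++) {
        jobs[i] = (struct job){ .op = op, .arg = arg, .failures = 0 };
        if (pthread_create(&tid[i], NULL, worker, &jobs[i]) != 0) {
            err = 1;
            break;
        }
        started++;
    }

    /* Join whatever was started, even on a partial failure. */
    for (int i = 0; i < started; i++) {
        pthread_join(tid[i], NULL);
        total += jobs[i].failures;
    }
    return err ? -1 : total;
}

/* Placeholder operation so the sketch builds without liblxc. */
static int noop_op(void *arg)
{
    (void)arg;
    return 0;
}
```

Calling run_concurrently(noop_op, NULL) exercises the harness itself. A counter like this only catches calls that return an error. A hang shows up as run_concurrently never returning, so a real test would want some kind of timeout on top.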
Based on your input I'll take a closer look at the new monitoring code. I'm hoping to take a much closer look next week, i.e. load two containers and fork threads to do just c->is_defined; just c->wait; just c->daemonize; just c->start(); etc., and see which ones are racy after a few runs.

Quoting S.Çağlar Onur (cag...@10ur.org):
> Hi,
>
> I think staging (my head is @ 813a48...) started to stuck while creating
> containers concurrently after monitoring related changes.
>
> I observed that issue with the Go bindings first. Then I wrote a test case
> to remove Go from the picture and I also thought that having a test case
> would be helpful (see "[PATCH] tests: Introduce lxc-test-concurrent for
> testing basic actions concurrently").
>
> Normally one should see following
>
> [caglar@qgq:~/Projects/lxc(staging)] sudo lxc-test-concurrent
> Executing (create) for 5 containers...
> Executing (start) for 5 containers...
> Executing (stop) for 5 containers...
> Executing (destroy) for 5 containers...
>
> but occasionally create started to stuck on my test system (just try to run
> couple of times).
>
> Cheers,
>
> On Thu, Sep 12, 2013 at 10:41 AM, Serge Hallyn <serge.hal...@ubuntu.com> wrote:
>
> > Quoting Dwight Engen (dwight.en...@oracle.com):
> > > On Thu, 12 Sep 2013 00:27:04 -0400
> > > Stéphane Graber <stgra...@ubuntu.com> wrote:
> > >
> > > > Hello,
> > > >
> > > > It looks like Dwight's last change introduce a bit of a regression
> > > > when running lxc-start -d.
> > >
> > > Yikes, sorry I didn't catch that in my testing. My follow on patch
> > > for doing the monitor socket in the abstract space gets rid of this
> > > entirely, so this is an additional reason to consider it.
> > >
> > > > Tracing it down (added a ton of printf all over), it looks like it's
> > > > hanging on:
> > > > - lxcapi_start
> > > > - wait_on_daemonized_start
> > > > - lxcapi_wait
> > > > - lxc_wait
> > > > - lxc_monitor_open
> > > > - lxc_monitor_sock_name
> > > >
> > > > Specifically, it's hanging at the process_lock() call because
> > > > process_lock() was already called as part of lxcapi_start and only
> > > > gets unlocked right after wait_on_daemonized_start returns.
> > > >
> > > >
> > > > Looking at the code, I'm not even sure why we need process_lock there.
> > > > What it protects is another thread triggering the mkdir_p in parallel,
> > > > but that shouldn't really be a problem since running two mkdir_p at
> > > > the same time should still result in the hierarchy being created, or
> > > > did I miss something?
> > >
> > > That sounds logical to me, but hmm, does that mean we don't need it in
> > > lxclock_name() either (where I was modeling this on)? I wonder if
> > > there is a code flow that its possible for us to hang there.
> >
> > Well mkdir uses the umask right? (and *may* use the cwd). Both of
> > which are shared among threads. It won't set them, but something else
> > might change them underneath them.
> >
> > So I could be wrong and we might not need it, but it seemed like we
> > might.
> >
> > -serge
> >
> > ------------------------------------------------------------------------------
> > How ServiceNow helps IT people transform IT departments:
> > 1. Consolidate legacy IT systems to a single system of record for IT
> > 2. Standardize and globalize service processes across IT
> > 3. Implement zero-touch automation to replace manual, redundant tasks
> > http://pubads.g.doubleclick.net/gampad/clk?id=51271111&iu=/4140/ostg.clktrk
> > _______________________________________________
> > Lxc-devel mailing list
> > Lxc-devel@lists.sourceforge.net
> > https://lists.sourceforge.net/lists/listinfo/lxc-devel
> >
>
> --
> S.Çağlar Onur <cag...@10ur.org>
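The hang described in the quoted trace (process_lock() taken in lxcapi_start and taken again further down in lxc_monitor_sock_name) can be reproduced in miniature. The sketch below is not the lxc code. demo_lock, outer_call and inner_step are made-up names standing in for the lock and the two call sites, and it assumes the lock behaves like a plain non-recursive pthread mutex. It uses an error-checking mutex so that the second lock attempt returns EDEADLK instead of blocking forever, which makes the pattern visible without hanging the test.

```c
#define _XOPEN_SOURCE 700   /* for PTHREAD_MUTEX_ERRORCHECK under strict -std= modes */
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-in for the process-wide lock. */
static pthread_mutex_t demo_lock;

/* Set up demo_lock as an error-checking mutex; call once before use. */
static void demo_lock_init(void)
{
    pthread_mutexattr_t attr;
    int ret;

    ret = pthread_mutexattr_init(&attr);
    assert(ret == 0);
    ret = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    assert(ret == 0);
    ret = pthread_mutex_init(&demo_lock, &attr);
    assert(ret == 0);
    pthread_mutexattr_destroy(&attr);
}

/* Plays the role of the inner helper: takes the lock on its own. */
static int inner_step(void)
{
    int ret = pthread_mutex_lock(&demo_lock);

    if (ret != 0)
        return ret;     /* EDEADLK when this thread already holds the lock */

    /* ... work done under the lock would go here ... */

    pthread_mutex_unlock(&demo_lock);
    return 0;
}

/* Plays the role of the outer API call: holds the lock across the inner call. */
static int outer_call(void)
{
    int ret = pthread_mutex_lock(&demo_lock);

    if (ret != 0)
        return ret;

    ret = inner_step(); /* second lock attempt by the same thread */

    pthread_mutex_unlock(&demo_lock);
    return ret;
}
```

After demo_lock_init(), outer_call() returns EDEADLK while a direct inner_step() returns 0. With a default mutex the same nesting would typically just block, which matches the symptom in the trace. Either dropping the lock in the inner helper or releasing it before the nested call avoids the problem. Which one is right depends on what the lock is actually protecting.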