Hi Yuriy,

I just read the multiprocessing source code. Now I feel it may not solve this problem very easily. For example, let us assume that we will use the proxy object in the Manager's process to call libguestfs, along the lines you described.
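Just to make sure we are talking about the same thing, here is roughly the setup I have in mind based on your description. It is only a sketch: the GuestFSEndpoint class, its inspect() method, and the disk path are placeholders I made up, and the libguestfs calls are only illustrative.

    import guestfs
    from multiprocessing.managers import BaseManager

    class GuestFSEndpoint(object):
        # Placeholder wrapper; its methods run inside the Manager's child
        # process, not inside the Nova (eventlet) process.
        def inspect(self, image_path):
            g = guestfs.GuestFS()
            g.add_drive_opts(image_path, readonly=1)
            g.launch()
            roots = g.inspect_os()
            g.close()
            return roots

    class GuestFSManager(BaseManager):
        pass

    GuestFSManager.register('guestfs_endpoint', GuestFSEndpoint)

    manager = GuestFSManager()
    manager.start()                        # the Manager forks its child here
    endpoint = manager.guestfs_endpoint()  # proxy; calls execute in the child
    print(endpoint.inspect('/path/to/disk.img'))

If I read your mail correctly, the tpool.Proxy calls in VFSGuestFS.setup would then be replaced by calls on this proxy. My concern is about what happens inside that manager.start() fork: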
In manager.py, I see that the Manager needs to create a pipe before forking the child process, and the write end of this pipe is required by the child process:
http://sourcecodebrowser.com/python-multiprocessing/2.6.2.1/classmultiprocessing_1_1managers_1_1_base_manager.html#a57fe9abe7a3d281286556c4bf3fbf4d5

And in Process._bootstrap(), I think we will need to register a function to be called by _run_after_forkers(), in order to close the fds inherited from the Nova process:
http://sourcecodebrowser.com/python-multiprocessing/2.6.2.1/classmultiprocessing_1_1process_1_1_process.html#ae594800e7bdef288d9bfbf8b79019d2e

At the same time, we must not close the write-end fd created by the Manager when _run_after_forkers() runs. One feasible way may be to get that fd from the 5th element of the _args attribute of the Process object and skip closing it. I have not investigated whether the Manager needs to use any other fds besides this pipe.
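To make that concrete, the after-fork hook I am imagining would look something like this. It is completely untested; the fd range is arbitrary, and digging the pipe out of Process._args is exactly the fragile part I mention below.

    import errno
    import os

    from multiprocessing import current_process, util

    def _close_inherited_fds(_hook_obj):
        # _run_after_forkers() would call this in the freshly forked Manager
        # child; current_process() is already the child's Process object there.
        proc = current_process()
        try:
            # Assumption: the 5th element of _args is the write end of the
            # pipe that BaseManager.start() passed to the child.
            keep_fd = proc._args[4].fileno()
        except (AttributeError, IndexError):
            return  # layout changed; better to leak fds than break the Manager

        for fd in range(3, 1024):     # 1024 is an arbitrary upper bound
            if fd == keep_fd:
                continue
            try:
                os.close(fd)          # drop everything inherited from Nova
            except OSError as e:
                if e.errno != errno.EBADF:
                    raise

    class _Hook(object):
        pass

    # The registry only holds a weak reference to the key object, so keep it
    # alive at module level. Registered in the Nova process before
    # manager.start(); fork copies the registry, so the hook runs in the child.
    _hook = _Hook()
    util.register_after_fork(_hook, _close_inherited_fds)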
Personally, I feel such an implementation will be a little tricky and risky, because it tightly depends on the Manager code. If the Manager opens other files, or changes the argument order, our code will fail to run. Am I wrong? Is there any other, safer way?

On Thu, Jun 5, 2014 at 11:40 PM, Yuriy Taraday <yorik....@gmail.com> wrote:

> Please take a look at
> https://docs.python.org/2.7/library/multiprocessing.html#managers -
> everything is already implemented there.
> All you need is to start one manager that would serve all your requests
> to libguestfs. The implementation in the stdlib will provide you with all
> exceptions and return values with minimal code changes on the Nova side.
> Create a new Manager, register a libguestfs "endpoint" in it and call
> start(). It will spawn a separate process that will speak with the
> calling process over a very simple RPC.
> From the looks of it, all you need to do is replace the tpool.Proxy calls
> in the VFSGuestFS.setup method with calls to this new Manager.
>
> On Thu, Jun 5, 2014 at 7:21 PM, Qin Zhao <chaoc...@gmail.com> wrote:
>
>> Hi Yuriy,
>>
>> Thanks for reading my bug! You are right. Python 3.3 or 3.4 should not
>> have this issue, since they can secure the file descriptors. Before
>> OpenStack moves to Python 3, we may still need a solution. Calling
>> libguestfs in a separate process seems to be a way. This way, Nova code
>> can close those fds by itself, not depending upon CLOEXEC. However, that
>> will be an expensive solution, since it requires a lot of code changes.
>> At least we need to write code to pass the return values and exceptions
>> between these two processes. That will make this solution very complex.
>> Do you agree?
>>
>> On Thu, Jun 5, 2014 at 9:39 PM, Yuriy Taraday <yorik....@gmail.com> wrote:
>>
>>> This behavior of os.pipe() has changed in Python 3.x, so it won't be an
>>> issue on newer Python (if only it were accessible for us).
>>>
>>> From the looks of it, you can mitigate the problem by running libguestfs
>>> requests in a separate process (multiprocessing.managers comes to mind).
>>> This way the only descriptors the child process could theoretically
>>> inherit would be long-lived pipes to the main process, although they
>>> won't leak because they should be marked with CLOEXEC before any
>>> libguestfs request is run. The other benefit is that this separate
>>> process won't be busy opening and closing tons of fds, so the problem
>>> with inheriting will be avoided.
>>>
>>> On Thu, Jun 5, 2014 at 2:17 PM, laserjetyang <laserjety...@gmail.com> wrote:
>>>
>>>> Will this patch of Python fix your problem?
>>>> http://bugs.python.org/issue7213
>>>>
>>>> On Wed, Jun 4, 2014 at 10:41 PM, Qin Zhao <chaoc...@gmail.com> wrote:
>>>>
>>>>> Hi Zhu Zhu,
>>>>>
>>>>> Thank you for reading my diagram! I need to clarify that this problem
>>>>> does not occur during data injection. Before creating the ISO, the
>>>>> driver code will extend the disk. Libguestfs is invoked in that time
>>>>> frame.
>>>>>
>>>>> And now I think this problem may occur at any time, if the code uses
>>>>> tpool to invoke libguestfs and an external command is executed in
>>>>> another green thread simultaneously. Please correct me if I am wrong.
>>>>>
>>>>> I think one simple solution for this issue is to call the libguestfs
>>>>> routines in a green thread, rather than in another native thread. But
>>>>> that will hurt performance very much, so I do not think it is an
>>>>> acceptable solution.
>>>>>
>>>>> On Wed, Jun 4, 2014 at 12:00 PM, Zhu Zhu <bjzzu...@gmail.com> wrote:
>>>>>
>>>>>> Hi Qin Zhao,
>>>>>>
>>>>>> Thanks for raising this issue and for the analysis. According to the
>>>>>> issue description and the scenario in which it happens (
>>>>>> https://docs.google.com/drawings/d/1pItX9urLd6fmjws3BVovXQvRg_qMdTHS-0JhYfSkkVc/pub?w=960&h=720
>>>>>> ), if that's the case and multiple KVM instances are spawned
>>>>>> concurrently (with both config drive and data injection enabled),
>>>>>> the issue is very likely to happen.
>>>>>> In the libvirt/driver.py _create_image method, right after the ISO is
>>>>>> made by "cdb.make_drive", the driver will attempt "data injection",
>>>>>> which will call the libguestfs launch in another thread.
>>>>>>
>>>>>> It looks like there were also a couple of libguestfs hang issues on
>>>>>> Launchpad, listed below. I am not sure whether libguestfs itself
>>>>>> could have a mechanism to free/close the fds inherited from the
>>>>>> parent process, instead of requiring an explicit teardown call.
>>>>>> Maybe open a defect against libguestfs to see what they think?
>>>>>>
>>>>>> https://bugs.launchpad.net/nova/+bug/1286256
>>>>>> https://bugs.launchpad.net/nova/+bug/1270304
>>>>>>
>>>>>> ------------------------------
>>>>>> Zhu Zhu
>>>>>> Best Regards
>>>>>>
>>>>>> From: Qin Zhao <chaoc...@gmail.com>
>>>>>> Date: 2014-05-31 01:25
>>>>>> To: OpenStack Development Mailing List (not for usage questions)
>>>>>> <openstack-dev@lists.openstack.org>
>>>>>> Subject: [openstack-dev] [Nova] nova-compute deadlock
>>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> When I ran Icehouse code, I encountered a strange problem. The
>>>>>> nova-compute service becomes stuck when I boot instances. I reported
>>>>>> this bug in https://bugs.launchpad.net/nova/+bug/1313477.
>>>>>>
>>>>>> After thinking about it for several days, I feel I know its root
>>>>>> cause. This bug should be a deadlock caused by pipe fd leaking. I
>>>>>> drew a diagram to illustrate this problem:
>>>>>> https://docs.google.com/drawings/d/1pItX9urLd6fmjws3BVovXQvRg_qMdTHS-0JhYfSkkVc/pub?w=960&h=720
>>>>>>
>>>>>> However, I have not found a very good solution to prevent this
>>>>>> deadlock. This problem is related to the Python runtime, libguestfs,
>>>>>> and eventlet. The situation is a little complicated. Is there any
>>>>>> expert who can help me to look for a solution? I will appreciate
>>>>>> your help!
>>>>>> --
>>>>>> Qin Zhao
>>>>>
>>>>> --
>>>>> Qin Zhao
>>>
>>> --
>>> Kind regards, Yuriy.
>>
>> --
>> Qin Zhao
>
> --
> Kind regards, Yuriy.

--
Qin Zhao
_______________________________________________
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev