Hi Lakshmi,
On 2/13/24 10:35, Lakshmi Deverkonda wrote:
Yes. We are trying to join only the threads related to the application.
The timeout is happening while trying to join the threads started by the
application.
In that case, I suspect that the issue is not related to lttngust. I
can't help with your internal application code.
If you're able to produce a minimal example that reproduces the issue,
one that deadlocks when lttngust is imported but not when it's omitted,
I think that would be very interesting.
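As a starting point, a reproducer along these lines might be enough. This
is only a sketch: the worker function, the `--with-lttngust` flag, and the
5 second timeout are placeholders I made up, not anything from your
application. The idea is to run the same script with and without the
import and compare whether the application thread joins.

```python
import sys
import threading
import time

# Hypothetical flag so the same script can be run both ways.
USE_LTTNGUST = "--with-lttngust" in sys.argv

if USE_LTTNGUST:
    import lttngust  # noqa: F401  (the import under test)


def worker(stop_event):
    # Placeholder application thread: idle until asked to stop.
    while not stop_event.is_set():
        time.sleep(0.05)


def run(join_timeout=5.0):
    """Start one application thread, stop it, and report whether it joined."""
    stop_event = threading.Event()
    t = threading.Thread(target=worker, args=(stop_event,), name="app-worker")
    t.start()
    stop_event.set()
    # A bounded join, so a hang shows up as a result instead of a stuck process.
    t.join(join_timeout)
    return not t.is_alive()


if __name__ == "__main__":
    print("worker joined cleanly:", run())
```

If the worker joins in both runs, the hang most likely comes from something
specific to the application's own threads rather than from the import.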
I would also recommend reviewing the bug reporting guidelines at
https://lttng.org/community/ to ensure that all the necessary
information is present.
thanks,
kienan
Regards,
Lakshmi
------------------------------------------------------------------------
*From:* Kienan Stewart <kstew...@efficios.com>
*Sent:* 13 February 2024 20:50
*To:* Lakshmi Deverkonda <la...@nvidia.com>; lttng-dev@lists.lttng.org
<lttng-dev@lists.lttng.org>
*Subject:* Re: [lttng-dev] Crash in application due to watchdog timeout
with python3 lttng
Hi Lakshmi,
when the lttngust python agent starts, it attempts to connect to one or
more session daemons[1].
Each connection starts a thread that loops forever, retrying the
registration in case an exception occurs[2].
I don't think it's designed to have `join()` called on those
threads, which I assume is happening in some of the code you or your
team have written.
My initial thought is that you should `join()` only the threads that
are pertinent to your application, ignoring the lttngust agent threads,
and then exit the application as normal.
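One way to do that is to keep a list of the threads the application itself
starts and join only those. The sketch below is an illustration under
assumptions: `AppThreads`, `forever`, and `worker` are hypothetical names,
and the `forever` thread is just a stand-in for any background thread that
never returns. It is not the lttngust agent code.

```python
import threading
import time


class AppThreads:
    """Hypothetical registry of threads the application itself started."""

    def __init__(self):
        self._threads = []

    def start(self, target, *args, name=None):
        t = threading.Thread(target=target, args=args, name=name)
        self._threads.append(t)
        t.start()
        return t

    def join_all(self, timeout_per_thread=5.0):
        """Join only registered threads; return names of any still alive."""
        stuck = []
        for t in self._threads:
            t.join(timeout_per_thread)
            if t.is_alive():
                stuck.append(t.name)
        return stuck


def forever():
    # Stand-in for a background thread that loops and never returns.
    while True:
        time.sleep(0.05)


def worker(stop_event):
    # Placeholder application thread: runs until asked to stop.
    while not stop_event.is_set():
        time.sleep(0.01)


def demo():
    # Started outside the registry, so join_all() never waits on it.
    background = threading.Thread(target=forever, name="background", daemon=True)
    background.start()

    stop_event = threading.Event()
    app = AppThreads()
    app.start(worker, stop_event, name="app-worker")
    stop_event.set()
    stuck = app.join_all()
    return stuck, background.is_alive()
```

Here `demo()` returns an empty list of stuck threads even though the
background thread is still running, because that thread was never
registered. Joining with a timeout and checking `is_alive()` afterwards
also turns a silent hang into something that can be logged.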
[1]:
https://github.com/lttng/lttng-ust/blob/3287f48be61ef3491aff0a80b7185ac57b3d8a5d/src/python-lttngust/lttngust/agent.py#L334
[2]:
https://github.com/lttng/lttng-ust/blob/3287f48be61ef3491aff0a80b7185ac57b3d8a5d/src/python-lttngust/lttngust/agent.py#L83
thanks,
kienan
On 2/13/24 09:23, Lakshmi Deverkonda via lttng-dev wrote:
Hi,
We are able to integrate the python3 lttng module in our application
(python3 based). However, whenever the application terminates, a
watchdog timeout occurs because joining the threads times out. What
could be the reason for this? Does the lttng module hold any thread
event locks?
We are completely blocked on this issue. Could you please help?
Here is the snippet of the core dump
(gdb) py-bt
Traceback (most recent call first):
  File "/usr/lib/python3.7/threading.py", line 1048, in _wait_for_tstate_lock
    elif lock.acquire(block, timeout):
  File "/usr/lib/python3.7/threading.py", line 1032, in join
    self._wait_for_tstate_lock()
  File "/usr/lib/python3/dist-packages/h.py", line 231, in JoinThreads
    self.TT.join()
  File "/usr/sbin/c", line 1466, in do_exit
    H.JoinThreads()
  File "/usr/sbin/c", line 7201, in main
    do_exit(nlm, status)
  File "/usr/sbin/c", line 7233, in <module>
    main()
(gdb)
On a parallel note, thanks to Kienan who has been trying to provide
pointers on various issues reported so far.
Need help on this issue as well.
Thanks in advance,
Regards,
Lakshmi
_______________________________________________
lttng-dev mailing list
lttng-dev@lists.lttng.org
https://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev