Hi all,

I'm getting freezes at application startup with LTTng-UST which I assume are 
due to my lack of understanding of TRACEPOINT_PROBE_DYNAMIC_LINKAGE.

I'm running:
- lttng-tools-2.0.4
- lttng-ust-2.0.5 patched with 009745db "Cache the procname per-thread rather 
than per-process to take into account that prctl() can be used to set thread 
names." and e699eda "don't spawn per-user thread if HOME is not set".
- liburcu0-0.7.4-0

This is being run on CentOS 5.4.

The lttng-ust man page says:

"""
BUILDING/LINKING THE TRACEPOINT PROVIDER
[...]
         1.1) Compile the Tracepoint provider with the application, either
              directly or through a static library (.a):
[...]
         2) Compile the Tracepoint Provider separately from the application,
            using dynamic linking:
           - Into exactly one object of your application: define
             "TRACEPOINT_DEFINE" _and_ also define
             "TRACEPOINT_PROBE_DYNAMIC_LINKAGE", then include the tracepoint
             provider header.   
[...]
"""

I have a dynamic library that includes code with tracepoint() calls, plus the 
relevant tracepoint definitions and trace providers - it is sort of 
"self-contained".  My Perl script dlopen()s this dynamic library.  Am I correct 
in thinking that this falls into option 2) above, because whilst the trace 
provider is statically linked with the code that uses the tracepoints, the 
trace provider is being dynamically linked into the application, and that is 
the part that matters to LTTng? 
 
When I build this dynamic library without TRACEPOINT_PROBE_DYNAMIC_LINKAGE 
#defined in the tracepoint definitions, I get a freeze that looks like 
https://bugs.lttng.org/issues/142 from the strace output:

"""
open("/usr/lib64/liburcu-common.so.1", O_RDONLY) = 4
read(4, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0p\7`\0310\0\0\0"..., 
832) = 832
fstat(4, {st_mode=S_IFREG|0755, st_size=9464, ...}) = 0
mmap(0x3019600000, 2102552, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 4, 
0) = 0x3019600000
mprotect(0x3019602000, 2093056, PROT_NONE) = 0
mmap(0x3019801000, 4096, PROT_READ|PROT_WRITE, 
MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 4, 0x1000) = 0x3019801000
close(4)                                = 0
mprotect(0x3fd3607000, 4096, PROT_READ) = 0
munmap(0x2ae2a2766000, 49028)           = 0
open("/dev/urandom", O_RDONLY)          = 4
fcntl(4, F_GETFD)                       = 0
fcntl(4, F_SETFD, FD_CLOEXEC)           = 0
getuid()                                = 0
getppid()                               = 8809
read(4, "\266\310we\347u\31\336\33\266 \262C\3267\32", 16) = 16
gettid()                                = 8810
clock_gettime(CLOCK_REALTIME, {1361753274, 392351000}) = 0
getuid()                                = 0
geteuid()                               = 0
rt_sigprocmask(SIG_SETMASK, ~[RTMIN RT_1], [], 8) = 0
mmap(NULL, 10489856, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_32BIT, 
-1, 0) = 0x4023d000
mprotect(0x4023d000, 4096, PROT_NONE)   = 0
clone(child_stack=0x40c3d250, 
flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID,
 parent_tidptr=0x40c3d9d0, tls=0x40c3d940, child_tidptr=0x40c3d9d0) = 8811
mmap(NULL, 10489856, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_32BIT, 
-1, 0) = 0x41e43000
mprotect(0x41e43000, 4096, PROT_NONE)   = 0
clone(child_stack=0x42843250, 
flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID,
 parent_tidptr=0x428439d0, tls=0x42843940, child_tidptr=0x428439d0) = 8812
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
futex(0x2ae2a29a5a80, FUTEX_WAIT_PRIVATE, 0, {2, 999506000}) = -1 ETIMEDOUT 
(Connection timed out)
gettid()                                = 8810
write(2, "libust[8810/8810]: Error: Timed "..., 107libust[8810/8810]: Error: 
Timed out waiting for ltt-sessiond (in lttng_ust_init() at lttng-ust-comm.c:952)
) = 107
futex(0x2ae2a29a5b00, FUTEX_WAIT_PRIVATE, 2, NULL
"""
(freeze at this point)

I realise that bug 142 is fixed in the version of the code I'm running.

I also found that, even though the Perl script does not fork() (unless I'm very 
much mistaken), if I 'export LD_PRELOAD=/usr/lib64/liblttng-ust-fork.so.0' and 
then run a shell script that invokes 'perl ...' to start the script, the 
problem goes away.  Is this because, when the shell fork()s to start Perl, 
liblttng-ust-fork.so.0 will cause LTTng to be initialised in the new process 
much earlier than is the case with just the dlopen()? 

I played around with another script and dynamic library combination and found a 
similar problem occurred on CentOS 5.4, but when I used the same LTTng-UST and 
related libraries on Fedora 16, I didn't get the freeze.

Finally, I noticed that commit ID b834deadbfa8a78ae1d00440fd91c41dfd351eba 
changed README to remove a note suggesting that dlopen() not be used.  It looks 
like the same change should be made to doc/man/lttng-ust.3, which still 
suggests that you not use dlopen().

Thanks in advance,
David

----------------------------------------------------------------------
The information contained in this transmission may be confidential. Any 
disclosure, copying, or further distribution of confidential information is not 
permitted unless such privilege is explicitly granted in writing by Quantum. 
Quantum reserves the right to have electronic communications, including email 
and attachments, sent across its networks filtered through anti virus and spam 
software programs and retain such messages in order to comply with applicable 
data security and retention requirements. Quantum is not responsible for the 
proper and complete transmission of the substance of this communication or for 
any delay in its receipt.

_______________________________________________
lttng-dev mailing list
[email protected]
http://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev

Reply via email to