Re: [lldb-dev] [llvm-dev] lldb stops on every call to dlopen
[+lldb-dev] Hello Steve, thanks for the report. The fact that you see the rendezvous breakpoint being hit many times is not surprising. We get those every time the library is loaded (we need that to load relevant debug info and set potential breakpoints). However, they should generally not be surfaced to the user (unless you have the stop-on-sharedlibrary-events setting set, which you don't). The part that is suspicious to me is that __restore_rt shows up on the top of the backtrace. This is a trampoline used to return from signal handlers, and it would seem to indicate that you got some sort of a signal while loading the libraries. I don't know why this would happen, but it could be that this is confusing lldb's auto-resume logic. The interesting part to see here is what lldb thinks are the stop reasons for individual threads in the process (is the process multi-threaded?) for the last couple of stops. The "lldb step" and "gdb-remote packets" log categories are the most interesting to observe here. If you are able to send me the log traces, I can help you interpret them. regards, pavel On Tue, 17 Apr 2018 at 02:27, Steve Ravet via llvm-dev < llvm-...@lists.llvm.org> wrote: > Hello lldb developers, I am running into a problem with lldb on Linux. I am currently running llvm 6.0.0. > I have an executable that dynamically loads a large number of shared libraries at runtime. These are explicitly loaded via dlopen (they are specified in a configuration file), and after loading a few (typically a dozen or so, but the number varies) lldb will halt during dlopen. If I continue, it will load a few more then halt again, which makes debugging from startup impractical since there are so many libraries to be loaded (more than a hundred of them). > When I build and debug this same C++ on macOS, the debugger works fine. I have verified that target.process.stop-on-sharedlibrary-events is false. I turned on dyld logging and I see lots of log messages about RendezvousBreakpoint being hit, but I don’t see anything that sheds light on why some libraries load without stopping but others don’t. > I have tried to recreate this in a trivial program that calls dlopen in a loop, but haven’t been able to reproduce. > Can your offer any suggestions for further debugging this? More supporting evidence follows. > Here is the message when the debugger stops: > Process 120004 stopped > * thread #1, name = ‘', stop reason = trace > frame #0: 0x2cfca6a0 libc.so.6`__restore_rt > libc.so.6`__restore_rt: > -> 0x2cfca6a0 <+0>: movq $0xf, %rax > 0x2cfca6a7 <+7>: syscall > 0x2cfca6a9 <+9>: nopl (%rax) > libc.so.6`__libc_sigaction: > 0x2cfca6b0 <+0>: subq $0xd0, %rsp > I do not have the stop on shared library events setting enabled: > (lldb) settings show target.process.stop-on-sharedlibrary-events > target.process.stop-on-sharedlibrary-events (boolean) = false > The backtrace goes back to dlopen: > (lldb) bt > * thread #1, name = ‘x', stop reason = trace >* frame #0: 0x2cfca6a0 libc.so.6`__restore_rt > frame #1: 0x2aab9eb0 ld-linux-x86-64.so.2 > frame #2: 0x2aabdc53 ld-linux-x86-64.so.2`dl_open_worker + 499 > frame #3: 0x2aab9286 ld-linux-x86-64.so.2`_dl_catch_error + 102 > frame #4: 0x2aabd63a ld-linux-x86-64.so.2`_dl_open + 186 > frame #5: 0x2c39df66 libdl.so.2`dlopen_doit + 102 > frame #6: 0x2aab9286 ld-linux-x86-64.so.2`_dl_catch_error + 102 > frame #7: 0x2c39e29c libdl.so.2`_dlerror_run + 124 > frame #8: 0x2c39dee1 libdl.so.2`__dlopen_check + 49 > the dyld debug log has a lot of this: > 209 intern-state DynamicLoaderPOSIXDYLD::RendezvousBreakpointHit pid 153501 stop_when_images_change=false > 210 intern-state DynamicLoaderPOSIXDYLD::RendezvousBreakpointHit called for pid 153501 > 211 intern-state DYLDRendezvous::Resolve address size: 8, padding 4 > 212 intern-state DYLDRendezvous::Resolve cursor = 0x2accc160 > 213 intern-state DynamicLoaderPOSIXDYLD::RendezvousBreakpointHit pid 153501 stop_when_images_change=false > 214 intern-state DynamicLoaderPOSIXDYLD::RendezvousBreakpointHit called for pid 153501 > 215 intern-state DYLDRendezvous::Resolve address size: 8, padding 4 > 216 intern-state DYLDRendezvous::Resolve cursor = 0x2accc160 > thanks, > --steve > In the woods too, a man casts off his years, as the snake his slough, and at what period soever of life, is always a child. > ___ > LLVM Developers mailing list > llvm-...@lists.llvm.org > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev ___ lldb-dev mailing list lldb-dev@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev
Re: [lldb-dev] [llvm-dev] lldb stops on every call to dlopen
It is interesting that the stop reason on the thread that stopped is "trace". That's what you would expect returning from the single-step to step over the breakpoint. But it looks like we got a signal while single-stepping, but the stop reason was misreported by somebody. Jim > On Apr 17, 2018, at 6:00 AM, Pavel Labath via lldb-dev > wrote: > > [+lldb-dev] > > Hello Steve, > > thanks for the report. > > The fact that you see the rendezvous breakpoint being hit many times is not > surprising. We get those every time the library is loaded (we need that to > load relevant debug info and set potential breakpoints). However, they > should generally not be surfaced to the user (unless you have the > stop-on-sharedlibrary-events setting set, which you don't). > > The part that is suspicious to me is that __restore_rt shows up on the top > of the backtrace. This is a trampoline used to return from signal handlers, > and it would seem to indicate that you got some sort of a signal while > loading the libraries. I don't know why this would happen, but it could be > that this is confusing lldb's auto-resume logic. > > The interesting part to see here is what lldb thinks are the stop reasons > for individual threads in the process (is the process multi-threaded?) for > the last couple of stops. The "lldb step" and "gdb-remote packets" log > categories are the most interesting to observe here. If you are able to > send me the log traces, I can help you interpret them. > > regards, > pavel > > > > > > On Tue, 17 Apr 2018 at 02:27, Steve Ravet via llvm-dev < > llvm-...@lists.llvm.org> wrote: > >> Hello lldb developers, I am running into a problem with lldb on Linux. I > am currently running llvm 6.0.0. > >> I have an executable that dynamically loads a large number of shared > libraries at runtime. These are explicitly loaded via dlopen (they are > specified in a configuration file), and after loading a few (typically a > dozen or so, but the number varies) lldb will halt during dlopen. If I > continue, it will load a few more then halt again, which makes debugging > from startup impractical since there are so many libraries to be loaded > (more than a hundred of them). > >> When I build and debug this same C++ on macOS, the debugger works fine. > I have verified that target.process.stop-on-sharedlibrary-events is false. > I turned on dyld logging and I see lots of log messages about > RendezvousBreakpoint being hit, but I don’t see anything that sheds light > on why some libraries load without stopping but others don’t. > >> I have tried to recreate this in a trivial program that calls dlopen in a > loop, but haven’t been able to reproduce. > >> Can your offer any suggestions for further debugging this? More > supporting evidence follows. > >> Here is the message when the debugger stops: > >> Process 120004 stopped >> * thread #1, name = ‘', stop reason = trace >> frame #0: 0x2cfca6a0 libc.so.6`__restore_rt >> libc.so.6`__restore_rt: >> -> 0x2cfca6a0 <+0>: movq $0xf, %rax >> 0x2cfca6a7 <+7>: syscall >> 0x2cfca6a9 <+9>: nopl (%rax) > >> libc.so.6`__libc_sigaction: >> 0x2cfca6b0 <+0>: subq $0xd0, %rsp > >> I do not have the stop on shared library events setting enabled: > >> (lldb) settings show target.process.stop-on-sharedlibrary-events >> target.process.stop-on-sharedlibrary-events (boolean) = false > > > >> The backtrace goes back to dlopen: > >> (lldb) bt >> * thread #1, name = ‘x', stop reason = trace >> * frame #0: 0x2cfca6a0 libc.so.6`__restore_rt >> frame #1: 0x2aab9eb0 ld-linux-x86-64.so.2 >> frame #2: 0x2aabdc53 ld-linux-x86-64.so.2`dl_open_worker + 499 >> frame #3: 0x2aab9286 ld-linux-x86-64.so.2`_dl_catch_error + > 102 >> frame #4: 0x2aabd63a ld-linux-x86-64.so.2`_dl_open + 186 >> frame #5: 0x2c39df66 libdl.so.2`dlopen_doit + 102 >> frame #6: 0x2aab9286 ld-linux-x86-64.so.2`_dl_catch_error + > 102 >> frame #7: 0x2c39e29c libdl.so.2`_dlerror_run + 124 >> frame #8: 0x2c39dee1 libdl.so.2`__dlopen_check + 49 > >> the dyld debug log has a lot of this: >> 209 intern-state DynamicLoaderPOSIXDYLD::RendezvousBreakpointHit pid > 153501 stop_when_images_change=false >> 210 intern-state DynamicLoaderPOSIXDYLD::RendezvousBreakpointHit > called for pid 153501 >> 211 intern-state DYLDRendezvous::Resolve address size: 8, padding 4 >> 212 intern-state DYLDRendezvous::Resolve cursor = 0x2accc160 >> 213 intern-state DynamicLoaderPOSIXDYLD::RendezvousBreakpointHit pid > 153501 stop_when_images_change=false >> 214 intern-state DynamicLoaderPOSIXDYLD::RendezvousBreakpointHit > called for pid 153501 >> 215 intern-state DYLDRendezvous::Resolve address size: 8, padding 4 >> 216 intern-state DYLDRendezvous::Resolve cursor = 0x2accc160 > > > >> thanks, >> --steve > > >> In the woods too,
Re: [lldb-dev] [llvm-dev] lldb stops on every call to dlopen
Pavel asked for a dump of gdb-remote commands. I got that and ran it through the gdbremote decoder, and trimmed to include what looks like the last successful continue after breakpoint and then the halt on dlopen. Both cases stop on signal 5.After the stop message the debugger issues two binary reads and then apparently makes the decision that it should stop rather than continue. The stopping case is missing the equivalent of "Element 1: Single stepping past breakpoint site 2 at 0x2aab9eb0” which is in the continuing case. I’ve attached the file here: out Description: Binary data thanks,--steve From the ashes of disaster grow the roses of success. On Apr 17, 2018, at 11:27 AM, Jim Inghamwrote:It is interesting that the stop reason on the thread that stopped is "trace". That's what you would expect returning from the single-step to step over the breakpoint. But it looks like we got a signal while single-stepping, but the stop reason was misreported by somebody.JimOn Apr 17, 2018, at 6:00 AM, Pavel Labath via lldb-dev wrote:[+lldb-dev]Hello Steve,thanks for the report.The fact that you see the rendezvous breakpoint being hit many times is notsurprising. We get those every time the library is loaded (we need that toload relevant debug info and set potential breakpoints). However, theyshould generally not be surfaced to the user (unless you have thestop-on-sharedlibrary-events setting set, which you don't).The part that is suspicious to me is that __restore_rt shows up on the topof the backtrace. This is a trampoline used to return from signal handlers,and it would seem to indicate that you got some sort of a signal whileloading the libraries. I don't know why this would happen, but it could bethat this is confusing lldb's auto-resume logic.The interesting part to see here is what lldb thinks are the stop reasonsfor individual threads in the process (is the process multi-threaded?) forthe last couple of stops. The "lldb step" and "gdb-remote packets" logcategories are the most interesting to observe here. If you are able tosend me the log traces, I can help you interpret them.regards,pavelOn Tue, 17 Apr 2018 at 02:27, Steve Ravet via llvm-dev wrote:Hello lldb developers, I am running into a problem with lldb on Linux. Iam currently running llvm 6.0.0.I have an executable that dynamically loads a large number of sharedlibraries at runtime. These are explicitly loaded via dlopen (they arespecified in a configuration file), and after loading a few (typically adozen or so, but the number varies) lldb will halt during dlopen. If Icontinue, it will load a few more then halt again, which makes debuggingfrom startup impractical since there are so many libraries to be loaded(more than a hundred of them).When I build and debug this same C++ on macOS, the debugger works fine.I have verified that target.process.stop-on-sharedlibrary-events is false.I turned on dyld logging and I see lots of log messages aboutRendezvousBreakpoint being hit, but I don’t see anything that sheds lighton why some libraries load without stopping but others don’t.I have tried to recreate this in a trivial program that calls dlopen in aloop, but haven’t been able to reproduce.Can your offer any suggestions for further debugging this? Moresupporting evidence follows.Here is the message when the debugger stops:Process 120004 stopped* thread #1, name = ‘', stop reason = trace frame #0: 0x2cfca6a0 libc.so.6`__restore_rtlibc.so.6`__restore_rt:-> 0x2cfca6a0 <+0>: movq $0xf, %rax 0x2cfca6a7 <+7>: syscall 0x2cfca6a9 <+9>: nopl (%rax)libc.so.6`__libc_sigaction: 0x2cfca6b0 <+0>: subq $0xd0, %rspI do not have the stop on shared library events setting enabled:(lldb) settings show target.process.stop-on-sharedlibrary-eventstarget.process.stop-on-sharedlibrary-events (boolean) = falseThe backtrace goes back to dlopen:(lldb) bt* thread #1, name = ‘x', stop reason = trace * frame #0: 0x2cfca6a0 libc.so.6`__restore_rt frame #1: 0x2aab9eb0 ld-linux-x86-64.so.2 frame #2: 0x2aabdc53 ld-linux-x86-64.so.2`dl_open_worker + 499 frame #3: 0x2aab9286 ld-linux-x86-64.so.2`_dl_catch_error +102 frame #4: 0x2aabd63a ld-linux-x86-64.so.2`_dl_open + 186 frame #5: 0x2c39df66 libdl.so.2`dlopen_doit + 102 frame #6: 0x2aab9286 ld-linux-x86-64.so.2`_dl_catch_error +102 frame #7: 0x2c39e29c libdl.so.2`_dlerror_run + 124 frame #8: 0x2c39dee1 libdl.so.2`__dlopen_check + 49the dyld debug log has a lot of this:209 intern-state DynamicLoaderPOSIXDYLD::RendezvousBreakpointHit pid153501 stop_when_images_change=false210 intern-state DynamicLoaderPOSIXDYLD::RendezvousBreakpointHitcalled for pid 153501211 intern-state DYLDRendezvous::Resolve address size: 8, padding 4212 intern-state DYLDRendezvous::Resolve cursor = 0x2accc160213 inter
Re: [lldb-dev] [llvm-dev] lldb stops on every call to dlopen
It's a bit of a wild guess, but is it possible that you (or one of the libraries you use) are doing anything with signals (SIGALRM or such?). I think I remember looking at the code handling the server-side ignored signal handling and thinking that it could go wrong if you get a signal while doing a instruction-step. I am not sure it fully applies here as the last command that lldb client did was a "continue", but i think it has to have something to do with signals, as you end up stopped in a signal handler. Could you try the following sequence of commands? (lldb) process launch --stop-at-entry-point (lldb) process handle --notify true --stop true #Stop on all signals (lldb) continue and let us know if you see any extra stops due to signals. If that doesn't find anything then I think we'll have start pulling logs from the lldb-server side, as there doesn't seem to be anything wrong with the client. The easiest way to achieve that is to do a export LLDB_SERVER_LOG_CHANNELS="posix all:gdb-remote packets" export LLDB_DEBUGSERVER_LOG_FILE=/tmp/server.log before launching lldb. On Tue, 17 Apr 2018 at 18:28, Steve Ravet wrote: > Pavel asked for a dump of gdb-remote commands. I got that and ran it through the gdbremote decoder, and trimmed to include what looks like the last successful continue after breakpoint and then the halt on dlopen. Both cases stop on signal 5. > After the stop message the debugger issues two binary reads and then apparently makes the decision that it should stop rather than continue. The stopping case is missing the equivalent of "Element 1: Single stepping past breakpoint site 2 at 0x2aab9eb0” which is in the continuing case. I’ve attached the file here: > thanks, > --steve > From the ashes of disaster grow the roses of success. > On Apr 17, 2018, at 11:27 AM, Jim Ingham wrote: > It is interesting that the stop reason on the thread that stopped is "trace". That's what you would expect returning from the single-step to step over the breakpoint. But it looks like we got a signal while single-stepping, but the stop reason was misreported by somebody. > Jim > On Apr 17, 2018, at 6:00 AM, Pavel Labath via lldb-dev < lldb-dev@lists.llvm.org> wrote: > [+lldb-dev] > Hello Steve, > thanks for the report. > The fact that you see the rendezvous breakpoint being hit many times is not > surprising. We get those every time the library is loaded (we need that to > load relevant debug info and set potential breakpoints). However, they > should generally not be surfaced to the user (unless you have the > stop-on-sharedlibrary-events setting set, which you don't). > The part that is suspicious to me is that __restore_rt shows up on the top > of the backtrace. This is a trampoline used to return from signal handlers, > and it would seem to indicate that you got some sort of a signal while > loading the libraries. I don't know why this would happen, but it could be > that this is confusing lldb's auto-resume logic. > The interesting part to see here is what lldb thinks are the stop reasons > for individual threads in the process (is the process multi-threaded?) for > the last couple of stops. The "lldb step" and "gdb-remote packets" log > categories are the most interesting to observe here. If you are able to > send me the log traces, I can help you interpret them. > regards, > pavel > On Tue, 17 Apr 2018 at 02:27, Steve Ravet via llvm-dev < > llvm-...@lists.llvm.org> wrote: > Hello lldb developers, I am running into a problem with lldb on Linux. I > am currently running llvm 6.0.0. > I have an executable that dynamically loads a large number of shared > libraries at runtime. These are explicitly loaded via dlopen (they are > specified in a configuration file), and after loading a few (typically a > dozen or so, but the number varies) lldb will halt during dlopen. If I > continue, it will load a few more then halt again, which makes debugging > from startup impractical since there are so many libraries to be loaded > (more than a hundred of them). > When I build and debug this same C++ on macOS, the debugger works fine. > I have verified that target.process.stop-on-sharedlibrary-events is false. > I turned on dyld logging and I see lots of log messages about > RendezvousBreakpoint being hit, but I don’t see anything that sheds light > on why some libraries load without stopping but others don’t. > I have tried to recreate this in a trivial program that calls dlopen in a > loop, but haven’t been able to reproduce. > Can your offer any suggestions for further debugging this? More > supporting evidence follows. > Here is the message when the debugger stops: > Process 120004 stopped > * thread #1, name = ‘', stop reason = trace > frame #0: 0x2cfca6a0 libc.so.6`__restore_rt > libc.so.6`__restore_rt: > -> 0x2cfca6a0 <+0>: movq $0xf, %rax > 0x2cfca6a7 <+7>: syscall > 0x2cfca6a9 <+9>: nopl (%rax) > libc.so.6`__libc_
Re: [lldb-dev] [llvm-dev] lldb stops on every call to dlopen
Ding! That’s it. This program does use SIGALRM. I wouldn’t have thought it would be enabled at this point, but apparently it is because I got lots of sigalarm halts during the .so loading. Further, if I run in lldb with process handle -n false -p false -s false SIGALRM then the debugger seems to run fine without stopping during dlopen(). The SIGALRM itself isn’t important to the operation of the program. With this knowledge I can probably create a simple testcase. Would this be considered an lldb bug? If so should I submit a testcase in some way? thanks! --steve The lover of nature is he whose inward and outward senses are still truly adjusted to each other; who has retained the spirit of infancy even into the era of manhood. > On Apr 17, 2018, at 1:12 PM, Pavel Labath wrote: > > It's a bit of a wild guess, but is it possible that you (or one of the > libraries you use) are doing anything with signals (SIGALRM or such?). I > think I remember looking at the code handling the server-side ignored > signal handling and thinking that it could go wrong if you get a signal > while doing a instruction-step. I am not sure it fully applies here as the > last command that lldb client did was a "continue", but i think it has to > have something to do with signals, as you end up stopped in a signal > handler. > > Could you try the following sequence of commands? > (lldb) process launch --stop-at-entry-point > (lldb) process handle --notify true --stop true #Stop on all signals > (lldb) continue > > and let us know if you see any extra stops due to signals. If that doesn't > find anything then I think we'll have start pulling logs from the > lldb-server side, as there doesn't seem to be anything wrong with the > client. The easiest way to achieve that is to do a > export LLDB_SERVER_LOG_CHANNELS="posix all:gdb-remote packets" > export LLDB_DEBUGSERVER_LOG_FILE=/tmp/server.log > before launching lldb. > On Tue, 17 Apr 2018 at 18:28, Steve Ravet wrote: > >> Pavel asked for a dump of gdb-remote commands. I got that and ran it > through the gdbremote decoder, and trimmed to include what looks like the > last successful continue after breakpoint and then the halt on dlopen. > Both cases stop on signal 5. > >> After the stop message the debugger issues two binary reads and then > apparently makes the decision that it should stop rather than continue. > The stopping case is missing the equivalent of "Element 1: Single stepping > past breakpoint site 2 at 0x2aab9eb0” which is in the continuing case. > I’ve attached the file here: > > >> thanks, >> --steve > > >> From the ashes of disaster grow the roses of success. > > > >> On Apr 17, 2018, at 11:27 AM, Jim Ingham wrote: > >> It is interesting that the stop reason on the thread that stopped is > "trace". That's what you would expect returning from the single-step to > step over the breakpoint. But it looks like we got a signal while > single-stepping, but the stop reason was misreported by somebody. > >> Jim > > >> On Apr 17, 2018, at 6:00 AM, Pavel Labath via lldb-dev < > lldb-dev@lists.llvm.org> wrote: > >> [+lldb-dev] > >> Hello Steve, > >> thanks for the report. > >> The fact that you see the rendezvous breakpoint being hit many times is > not >> surprising. We get those every time the library is loaded (we need that to >> load relevant debug info and set potential breakpoints). However, they >> should generally not be surfaced to the user (unless you have the >> stop-on-sharedlibrary-events setting set, which you don't). > >> The part that is suspicious to me is that __restore_rt shows up on the top >> of the backtrace. This is a trampoline used to return from signal > handlers, >> and it would seem to indicate that you got some sort of a signal while >> loading the libraries. I don't know why this would happen, but it could be >> that this is confusing lldb's auto-resume logic. > >> The interesting part to see here is what lldb thinks are the stop reasons >> for individual threads in the process (is the process multi-threaded?) for >> the last couple of stops. The "lldb step" and "gdb-remote packets" log >> categories are the most interesting to observe here. If you are able to >> send me the log traces, I can help you interpret them. > >> regards, >> pavel > > > > > >> On Tue, 17 Apr 2018 at 02:27, Steve Ravet via llvm-dev < >> llvm-...@lists.llvm.org> wrote: > >> Hello lldb developers, I am running into a problem with lldb on Linux. I > >> am currently running llvm 6.0.0. > >> I have an executable that dynamically loads a large number of shared > >> libraries at runtime. These are explicitly loaded via dlopen (they are >> specified in a configuration file), and after loading a few (typically a >> dozen or so, but the number varies) lldb will halt during dlopen. If I >> continue, it will load a few more then halt again, which makes debugging >> from startup impractical since there are so many librarie