Hello all,

We are running into limitations of the current module download/caching system. A simple Android application can link to about 46 megabytes' worth of modules, and downloading that at our current transfer rates takes about 25 seconds. Much of the data we download this way is never actually accessed, and yet we download everything immediately upon starting the debug session, which makes the first session extremely laggy.
We could speed things up a lot by downloading only the portions of the module that we really need (in my case this turns out to be about 8 megabytes). Further speedups could be had by increasing the throughput of the gdb-remote protocol used for downloading these files, by pipelining requests. I made a proof-of-concept hack of these things, put it into lldb, and was able to get the time for the startup-attach-detach-exit cycle down to 5.4 seconds (for comparison, the current time for the cycle is about 3.6 seconds with a hot module cache, and 28(!) seconds with an empty cache).

Now I would like to implement these things in lldb properly, so this is a request for comments on my plan. What I would like to do is:

- Replace ModuleCache with a SectionCache (actually, more like a cache of arbitrary file chunks). When the cache gets a request for a file that is not in the cache already, it returns a special kind of Module whose fragments are downloaded only as we try to access them. These fragments will be cached on disk, so that subsequent requests for the file do not need to re-download them. We can also have the option to short-circuit this logic and download the whole file immediately (e.g., when the file is small, or when we have a super-fast way of obtaining the whole file via rsync, etc.).

- Add pipelining support to GDBRemoteCommunicationClient for communicating with the platform. This does not require any changes to the wire protocol; the only change is adding the ability to send an additional request to the server while waiting for the response to the previous one. Since the protocol is request-response based and we are communicating over a reliable transport stream, each response can be correctly matched to its request even though we have multiple packets in flight.
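To make the FIFO-matching idea concrete, here is a minimal sketch of what the client-side bookkeeping could look like. The class and method names below are hypothetical (this is not the actual GDBRemoteCommunicationClient interface), and a plain queue stands in for the socket:

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <string>

// Hypothetical sketch: a pipelined client keeps a FIFO of callbacks, one per
// in-flight request. Because the protocol is strictly request-response over a
// reliable stream, the k-th response always belongs to the k-th outstanding
// request, so no sequence numbers need to be added to the wire format.
class PipelinedClient {
public:
  using Callback = std::function<void(const std::string &)>;

  // Queue a request without waiting for earlier responses to arrive.
  void SendRequest(const std::string &packet, Callback on_response) {
    m_wire.push_back(packet); // stand-in for the real socket write
    m_pending.push_back(std::move(on_response));
  }

  // Deliver the next response; FIFO order matches it to its request.
  void HandleResponse(const std::string &response) {
    assert(!m_pending.empty() && "response with no outstanding request");
    Callback cb = std::move(m_pending.front());
    m_pending.pop_front();
    cb(response);
  }

  size_t NumInFlight() const { return m_pending.size(); }

private:
  std::deque<std::string> m_wire;  // fake outgoing packet stream
  std::deque<Callback> m_pending;  // callbacks awaiting responses
};
```

A deque of pending callbacks is all the state required; any packet sequence needing exclusive access (continuation packets) would simply drain this queue and lock the stream first.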
Any packets which need to maintain more complex state (like downloading a single entity using continuation packets) can still lock the stream to get exclusive access, but I am not sure we even have any such packets in the platform flavour of the protocol.

- Parallelize downloading of multiple files, utilizing the request pipelining. Currently we see the biggest delay when first attaching to a process (we download file headers and some basic informative sections) and when we try to set the first symbol-level breakpoint (we download symbol tables and string sections). Both of these actions operate on all modules in bulk, which makes them easy parallelization targets. This will provide a big speed boost, as we will be eliminating communication latency. Furthermore, when there are lots of files, we will be overlapping file download (io) with parsing (cpu), for an even bigger boost.

What do you think?

cheers,
pl
_______________________________________________
lldb-dev mailing list
lldb-dev@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev