Re: [lldb-dev] LLDB Evolution - Final Form

Greg Clayton via lldb-dev Wed, 21 Sep 2016 13:44:48 -0700

the variable used to be a "const char *" in the last example...
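
For readers following the conversion, here is a minimal sketch of the pitfall in the quoted message below, written against C++17 `std::string_view` as a stand-in for StringRef so that it compiles without LLVM. The helper names `toView` and `hasName` are hypothetical and are not LLDB or LLVM APIs. The assumption is that a pointer which may be null has to be tested before a view is built from it, because building the view measures the string.

```cpp
#include <cassert>
#include <string_view>

// Hypothetical helper (not an LLVM or LLDB API): turn a possibly-null C
// string into a non-owning view without ever measuring a null pointer.
inline std::string_view toView(const char *s) {
  return s ? std::string_view(s) : std::string_view();
}

// Old style:  bool hasName(const char *name) { return name && name[0]; }
// View style: the null test becomes an emptiness test on the view itself.
inline bool hasName(std::string_view name) { return !name.empty(); }
```

With this shape, a call such as `hasName(toView(ptr))` is safe whether `ptr` is null, empty, or a real name, and no comparison against NULL is left in the converted code.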

> On Sep 21, 2016, at 1:43 PM, Greg Clayton <[email protected]> wrote:
> 
> I am laughing as I convert code over to using StringRef and I get crashes:
> 
> if (name == NULL)
> 
> StringRef is nice enough to implicitly construct a StringRef from NULL or 
> nullptr so that it can crash for me...
> 
>> On Sep 21, 2016, at 11:09 AM, Zachary Turner via lldb-dev 
>> <[email protected]> wrote:
>> 
>> Adding another thing to my list (thanks to Mehdi and Eric Christopher for 
>> the idea).
>> 
>> Apply libfuzzer to LLDB.  Details are sparse on which parts of LLDB and how, but 
>> I think it would be easy to come up with candidates.
>> 
>> On Mon, Sep 19, 2016 at 1:18 PM Zachary Turner <[email protected]> wrote:
>> Following up with Kate's post from a few weeks ago, I think the dust has 
>> settled on the code reformat and it went over pretty smoothly for the most 
>> part.  So I thought it might be worth throwing out some ideas for where we 
>> go from here.  I have a large list of ideas (more ideas than time, sadly) 
>> that I've been collecting over the past few weeks, so I figured I would 
>> throw them out in the open for discussion.
>> 
>> I’ve grouped the areas for improvement into 3 high level categories.
>> 
>>      • De-inventing the wheel - We should use more code from LLVM, and 
>> delete code in LLDB where LLVM provides a solution.  In cases where there is 
>> an LLVM thing that is *similar* to what we need, we should extend the LLVM 
>> thing to support what we need, and then use it.  Following are some areas 
>> I've identified.  This list is by no means complete.  For each one, I've 
>> given a personal assessment of how likely it is to cause some (temporary) 
>> hiccups, how much it would help us in the long run, and how difficult it 
>> would be to do.  Without further ado:
>>              • Use llvm::Regex instead of lldb::Regex
>>                      • llvm::Regex doesn’t support enhanced mode.  Could we 
>> add support for this to llvm::Regex?
>>                      • Risk: 6
>>                      • Impact: 3
>>                      • Difficulty / Effort: 3  (5 if we have to add enhanced 
>> mode support)
>>              • Use llvm streams instead of lldb::StreamString
>>                      • Supports output re-targeting (stderr, stdout, 
>> std::string, etc), printf style formatting, and type-safe streaming 
>> operators.
>>                      • Interoperates nicely with many existing llvm utility 
>> classes
>>                      • Risk: 4
>>                      • Impact: 5
>>                      • Difficulty / Effort: 7
>>              • Use llvm::Error instead of lldb::Error
>>                      • llvm::Error is an error class that *requires* you to 
>> check whether it succeeded or it will assert.  In a way, it's similar to a 
>> C++ exception, except that it doesn't come with the performance hit 
>> associated with exceptions.  It's extensible, and can be easily extended to 
>> support the various ways LLDB needs to construct errors and error messages.
>>                      • Would need to first rename lldb::Error to LLDBError 
>> so that the conversion from LLDBError to llvm::Error could be done 
>> incrementally.
>>                      • Risk: 7
>>                      • Impact: 7
>>                      • Difficulty / Effort: 8
>>              • StringRef instead of const char *, len everywhere
>>                      • Can do most common string operations in a way that is 
>> guaranteed to be safe.
>>                      • Reduces string manipulation algorithm complexity by 
>> an order of magnitude.
>>                      • Can potentially eliminate tens of thousands of string 
>> copies across the codebase.
>>                      • Simplifies code.
>>                      • Risk: 3
>>                      • Impact: 8
>>                      • Difficulty / Effort: 7
>>              • ArrayRef instead of const void *, len everywhere
>>                      • Same analysis as StringRef
>>              • MutableArrayRef instead of void *, len everywhere
>>                      • Same analysis as StringRef
>>              • Delete ConstString, use a modified StringPool that is 
>> thread-safe.
>>                      • StringPool is a non thread-safe version of 
>> ConstString.
>>                      • Strings are internally refcounted so they can be 
>> cleaned up when they are no longer used.  ConstStrings are a large source of 
>> memory in LLDB, so ref-counting and removing stale strings has the potential 
>> to be a huge savings.
>>                      • Risk: 2
>>                      • Impact: 9
>>                      • Difficulty / Effort: 6
>>              • thread_local instead of lldb::ThreadLocal
>>                      • This fixes a number of bugs on Windows that cannot be 
>> fixed otherwise, as they require compiler support.
>>                      • Some other compilers may not support this yet?
>>                      • Risk: 2
>>                      • Impact: 3
>>                      • Difficulty: 3
>>              • Use llvm::cl for the command line arguments to the primary 
>> lldb executable.
>>                      • Risk: 2
>>                      • Impact: 3
>>                      • Difficulty / Effort: 4
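
As a rough illustration of the "StringRef instead of const char *, len" item above, the following sketch uses C++17 `std::string_view` as a stand-in so that it builds without LLVM. The function `splitOnce` is a hypothetical name chosen for this example; `llvm::StringRef` offers a `split` member of similar shape. The point is only that slicing a string produces views into the original buffer rather than copies.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>
#include <utility>

// Hypothetical helper: split "key=value" at the first separator without
// copying. Both halves are views into the caller's buffer, which must
// outlive them.
inline std::pair<std::string_view, std::string_view>
splitOnce(std::string_view text, char sep) {
  const std::size_t pos = text.find(sep);
  if (pos == std::string_view::npos)
    return {text, std::string_view()}; // no separator: everything is the key
  return {text.substr(0, pos), text.substr(pos + 1)};
}
```

A caller might write `auto kv = splitOnce("target.arch=x86_64", '=');` and then compare `kv.first` and `kv.second` directly, with no manual length bookkeeping and no temporary allocations.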
>>      • Testing - Our testing infrastructure is unstable, and our test 
>> coverage is lacking.  We should take steps to improve this.
>>              • Port as much as possible to lit
>>                      • Simple tests should be trivial to port to lit today.  
>> If nothing else this serves as a proof of concept while increasing the speed 
>> and stability of the test suite, since lit is a more stable harness.
>>              • Separate testing tools
>>                      • One question that remains open is how to represent 
>> the complicated needs of a debugger in lit tests.  Part a) above covers the 
>> trivial cases, but what about the difficult cases?  In 
>> https://reviews.llvm.org/D24591 a number of ideas were discussed.  We 
>> started getting to this idea towards the end, about a separate tool which 
>> has an interface independent of the command line interface and which can be 
>> used to test.  lldb-mi was mentioned.  While I have serious concerns about 
>> lldb-mi due to its poorly written and tested codebase, I do agree in 
>> principle with the methodology.  In fact, this is the entire philosophy 
>> behind lit as used with LLVM, clang, lld, etc.  
>> 
>> I don’t take full credit for this idea.  I had been toying with a similar 
>> idea for some time, but it was further cemented in an offline discussion 
>> with a co-worker.  
>> 
>> There are many small, targeted tools in LLVM (e.g. llc, lli, llvm-objdump, etc) 
>> whose purpose is to be chained together to do interesting things.  Instead 
>> of a command line api as we think of in LLDB where you type commands from an 
>> interactive prompt, they have a command line api as you would expect from 
>> any tool which is launched from a shell.
>> 
>> I can imagine many potential candidates for lldb tools of this nature.  Off 
>> the top of my head:
>>      • lldb-unwind - A tool for testing the unwinder.  Accepts byte code as 
>> input and passes it through to the unwinder, outputting a compressed summary 
>> of the steps taken while unwinding, which could be pattern matched in lit.  
>> The output format is entirely controlled by the tool, and not by the 
>> unwinder itself, so it would be stable in the face of changes to the 
>> underlying unwinder.  Could have various options to enable or disable 
>> features of the unwinder in order to force the unwinder into modes that can 
>> be tricky to encounter in the wild.
>>      • lldb-symbol - A tool for testing symbol resolution.  Could have 
>> options for testing things like:
>>              • Determining if a symbol matches an executable
>>              • looking up a symbol by name in the debug info, and mapping it 
>> to an address in the process.  
>>              • Displaying candidate symbols when doing name lookup in a 
>> particular scope (e.g. while stopped at a breakpoint).
>>      • lldb-breakpoint - A tool for testing breakpoints and stepping.  
>> Various options could include:
>>              • Set breakpoints and print out the addresses and/or symbol 
>> names they were resolved to.
>>              • Trigger commands, so that when a breakpoint is hit the tool 
>> could automatically continue and try to run to another breakpoint, etc.
>>              • options to inspect certain useful pieces of state about an 
>> inferior, to be matched in lit. 
>>      • lldb-interpreter - tests the jitter etc.  I don’t know much about 
>> this, but I don’t see why this couldn’t be tested in a manner similar to how 
>> lli is tested.
>>      • lldb-platform - tests lldb local and remote platform interfaces.
>>      • lldb-cli -- lldb interactive command line.
>>      • lldb-format - lldb data formatters etc.
>>      • Tests NOW, not later.
>>              • I know we’ve been over this a million times and it’s not 
>> worth going over the arguments again.  And I know it’s hard to write tests, 
>> often requiring the invention of new SB APIs.  Hopefully those issues will 
>> be addressed by a) and b) above, and writing tests will be easier.  
>> Vedant Kumar ran some analytics on the various codebases and found that LLDB 
>> has the lowest test / commit ratio of any LLVM project (He didn’t post 
>> numbers for lld, so I’m not sure what it is there).
>>                      • lldb: 287 of the past 1000 commits
>>                      • llvm: 511 of the past 1000 commits
>>                      • clang: 622 of the past 1000 commits
>>                      • compiler-rt: 543 of the past 1000 commits
>> This is an alarming statistic, and I would love to see this number closer to 
>> 50%.
>>      • Code style / development conventions - Aside from just the column 
>> limitations and bracing styles, there are other areas where LLDB differs 
>> from LLVM on code style.  We should continue to adopt more of LLVM's style 
>> where it makes sense.  I've identified a couple of areas (incomplete list) 
>> which I outline below.  
>>              • Clean up the mess of cyclical dependencies and properly layer 
>> the libraries.  This is especially important for things like lldb-server 
>> that need to link in as little as possible, but regardless it leads to a 
>> more robust architecture, faster build and link times, better testability, 
>> and is required if we ever want to do a modules build of LLDB
>>              • Use CMake instead of the Xcode project.  CMake supports Apple 
>> Frameworks, so the main roadblock to getting this working is just someone 
>> doing it.  Segmenting the build process 
>> by platform doesn't make sense for the upstream, especially when there is a 
>> perfectly workable solution.  I have no doubt that the resulting Xcode 
>> workspace generated automatically by CMake will not be as "nice" as one that 
>> is maintained by hand.  We face this problem with Visual Studio on Windows 
>> as well.  The solution that most people have adopted is to continue using 
>> the IDE for code editing and debugging, but for actually running the build, 
>> use CMake with Ninja.  A similar workflow should still be possible with an 
>> OSX CMake build, but as I do not work every day on a Mac, all I can say is 
>> that it's possible; I have no idea how impactful it would be on people's 
>> workflows.  
>>              • Variable naming conventions
>>                      • I don’t expect anyone is too fond of LLDB’s naming 
>> conventions, but if we’re committed to joining the LLVM ecosystem, then 
>> let’s go all the way.
>>              • Use more modern C++ and less C
>>                      • Old habits die hard, but this isn’t just a matter of 
>> style.  It leads to safer, more robust, and less fragile code as well.
>>              • Shorter functions and classes with more narrowly targeted 
>> responsibilities
>>                      • It’s not uncommon to find functions that are hundreds 
>> (and in a few cases even 1,000+) of lines long.  We really need to be better 
>> about breaking functions and classes down into smaller responsibilities.  
>> This helps not just for someone coming in to read the function, but also for 
>> testing.  Smaller functions are easier to unit test.
>>              • Convert T foo(X, Y, Error &error) functions to Expected<T> 
>> foo(X, Y) style (Depends on 1.c)
>>                      • llvm::Expected is based on the llvm::Error class 
>> described earlier.  It’s used when a function is supposed to return a value, 
>> but it could fail.  By packaging the error with the return value, it’s 
>> impossible to have a situation where you use the return value even in case 
>> of an error, and because llvm::Error has mandatory checking, it’s also 
>> impossible to have a situation where you don’t check the error.  So it’s 
>> very safe.  
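
The "value or error, and you must check it" idea described above can be sketched in plain C++17 as follows. This is not LLVM's implementation of `llvm::Expected`: the class `CheckedResult`, its members, and the `divide` function are all hypothetical names invented for illustration. One deliberate simplification is that an unchecked result only increments a counter here, so the behavior can be observed, whereas the text above describes llvm::Error as asserting.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <utility>

// Hypothetical stand-in for the idea behind llvm::Expected<T>. It holds
// either a value or an error message, and remembers whether the caller ever
// tested it for success.
template <typename T> class CheckedResult {
public:
  static CheckedResult success(T value) {
    CheckedResult r;
    r.value_ = std::move(value);
    return r;
  }
  static CheckedResult failure(std::string message) {
    CheckedResult r;
    r.error_ = std::move(message);
    return r;
  }

  CheckedResult(CheckedResult &&other) noexcept
      : value_(std::move(other.value_)), error_(std::move(other.error_)),
        checked_(other.checked_) {
    other.checked_ = true; // the moved-from shell no longer needs checking
  }
  CheckedResult(const CheckedResult &) = delete;
  CheckedResult &operator=(const CheckedResult &) = delete;

  // Sketch-only policy: count violations instead of aborting.
  ~CheckedResult() {
    if (!checked_)
      ++uncheckedCount();
  }

  // Testing for success is what marks the result as checked.
  explicit operator bool() {
    checked_ = true;
    return value_.has_value();
  }
  T &get() {
    assert(checked_ && value_ && "value read without a successful check");
    return *value_;
  }
  const std::string &error() const { return error_; }

  static int &uncheckedCount() {
    static int count = 0;
    return count;
  }

private:
  CheckedResult() = default;
  std::optional<T> value_;
  std::string error_;
  bool checked_ = false;
};

// Hypothetical function in the "Expected<T> foo(X, Y)" shape.
inline CheckedResult<int> divide(int x, int y) {
  if (y == 0)
    return CheckedResult<int>::failure("division by zero");
  return CheckedResult<int>::success(x / y);
}
```

A caller would write `if (auto r = divide(6, 3)) use(r.get()); else report(r.error());`. Because the value is only reachable through an object that has been tested, the unchecked-error and use-after-failure cases both become visible rather than silent.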
>> 
>> Whew.  That was a lot.  If you made it this far, thanks for reading!
>> 
>> Obviously if we were to embark on all of the above, it would take many 
>> months to complete everything.  So I'm not proposing anyone stop what 
>> they're doing to work on this.  This is just my own personal wishlist
>> _______________________________________________
>> lldb-dev mailing list
>> [email protected]
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev
>

