Re: [lldb-dev] LLDB Evolution - Final Form

Greg Clayton via lldb-dev Wed, 21 Sep 2016 13:45:02 -0700

I am laughing as I convert code over to using StringRef and I get crashes:

if (name == NULL)


StringRef is nice enough to implicitly construct a StringRef from NULL or 
nullptr so that it can crash for me...
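
A minimal sketch of one way to avoid that crash while converting. It uses
std::string_view as a stand-in for llvm::StringRef so that it builds without
LLVM headers (an assumption for illustration, not what LLDB does), and
ToView / HasName are hypothetical helper names. The idea is to map a
possibly-null const char * to an empty view at the boundary, then test for
emptiness instead of comparing against NULL:

```cpp
#include <cassert>
#include <string_view>

// Hypothetical helper: turn a possibly-null C string into a view without
// ever passing a null pointer to the string_view(const char *) constructor.
inline std::string_view ToView(const char *s) {
  return s ? std::string_view(s) : std::string_view();
}

// Hypothetical replacement for the old `if (name == NULL)` test.
inline bool HasName(std::string_view name) { return !name.empty(); }

// Small usage example: a null pointer and a real name.
inline void Example() {
  const char *missing = nullptr;
  assert(!HasName(ToView(missing)));
  assert(HasName(ToView("main")));
}
```

Note that this treats a null name and an empty name the same way, which may
or may not be acceptable at a given call site.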

> On Sep 21, 2016, at 11:09 AM, Zachary Turner via lldb-dev 
> <[email protected]> wrote:
> 
> Adding another thing to my list (thanks to Mehdi and Eric Christopher for the 
> idea).
> 
> Apply libFuzzer to LLDB.  Details are sparse on which parts of LLDB and how, 
> but I think it would be easy to come up with candidates.
> 
> On Mon, Sep 19, 2016 at 1:18 PM Zachary Turner <[email protected]> wrote:
> Following up with Kate's post from a few weeks ago, I think the dust has 
> settled on the code reformat and it went over pretty smoothly for the most 
> part.  So I thought it might be worth throwing out some ideas for where we go 
> from here.  I have a large list of ideas (more ideas than time, sadly) that 
> I've been collecting over the past few weeks, so I figured I would throw them 
> out in the open for discussion.
> 
> I’ve grouped the areas for improvement into 3 high level categories.
> 
>       • De-inventing the wheel - We should use more code from LLVM, and 
> delete code in LLDB where LLVM provides a solution.  In cases where there is 
> an LLVM thing that is *similar* to what we need, we should extend the LLVM 
> thing to support what we need, and then use it.  Following are some areas 
> I've identified.  This list is by no means complete.  For each one, I've 
> given a personal assessment of how likely it is to cause some (temporary) 
> hiccups, how much it would help us in the long run, and how difficult it 
> would be to do.  Without further ado:
>               • Use llvm::Regex instead of lldb::Regex
>                       • llvm::Regex doesn’t support enhanced mode.  Could we 
> add support for this to llvm::Regex?
>                       • Risk: 6
>                       • Impact: 3
>                       • Difficulty / Effort: 3  (5 if we have to add enhanced 
> mode support)
>               • Use llvm streams instead of lldb::StreamString
>                       • Supports output re-targeting (stderr, stdout, 
> std::string, etc), printf style formatting, and type-safe streaming operators.
>                       • Interoperates nicely with many existing llvm utility 
> classes
>                       • Risk: 4
>                       • Impact: 5
>                       • Difficulty / Effort: 7
>               • Use llvm::Error instead of lldb::Error
>                       • llvm::Error is an error class that *requires* you to 
> check whether it succeeded or it will assert.  In a way, it's similar to a 
> C++ exception, except that it doesn't come with the performance hit 
> associated with exceptions.  It's extensible, and can be easily extended to 
> support the various ways LLDB needs to construct errors and error messages.
>                       • Would need to first rename lldb::Error to LLDBError 
> so that the conversion from LLDBError to llvm::Error could be done 
> incrementally.
>                       • Risk: 7
>                       • Impact: 7
>                       • Difficulty / Effort: 8
>               • StringRef instead of const char *, len everywhere
>                       • Can do most common string operations in a way that is 
> guaranteed to be safe.
>                       • Reduces string manipulation algorithm complexity by 
> an order of magnitude.
>                       • Can potentially eliminate tens of thousands of string 
> copies across the codebase.
>                       • Simplifies code.
>                       • Risk: 3
>                       • Impact: 8
>                       • Difficulty / Effort: 7
>               • ArrayRef instead of const void *, len everywhere
>                       • Same analysis as StringRef
>               • MutableArrayRef instead of void *, len everywhere
>                       • Same analysis as StringRef
>               • Delete ConstString, use a modified StringPool that is 
> thread-safe.
>                       • StringPool is a non thread-safe version of 
> ConstString.
>                       • Strings are internally refcounted so they can be 
> cleaned up when they are no longer used.  ConstStrings are a large source of 
> memory in LLDB, so ref-counting and removing stale strings has the potential 
> to be a huge savings.
>                       • Risk: 2
>                       • Impact: 9
>                       • Difficulty / Effort: 6
>               • thread_local instead of lldb::ThreadLocal
>                       • This fixes a number of bugs on Windows that cannot be 
> fixed otherwise, as they require compiler support.
>                       • Some other compilers may not support this yet?
>                       • Risk: 2
>                       • Impact: 3
>                       • Difficulty: 3
>               • Use llvm::cl for the command line arguments to the primary 
> lldb executable.
>                       • Risk: 2
>                       • Impact: 3
>                       • Difficulty / Effort: 4
>       • Testing - Our testing infrastructure is unstable, and our test 
> coverage is lacking.  We should take steps to improve this.
>               • Port as much as possible to lit
>                       • Simple tests should be trivial to port to lit today.  
> If nothing else this serves as a proof of concept while increasing the speed 
> and stability of the test suite, since lit is a more stable harness.
>               • Separate testing tools
>                       • One question that remains open is how to represent 
> the complicated needs of a debugger in lit tests.  Part a) above (porting to 
> lit) covers the trivial cases, but what about the difficult cases?  In 
> https://reviews.llvm.org/D24591 a number of ideas were discussed.  We started 
> getting to this idea towards the end, about a separate tool which has an 
> interface independent of the command line interface and which can be used to 
> test.  lldb-mi was mentioned.  While I have serious concerns about lldb-mi 
> due to its poorly written and tested codebase, I do agree in principle with 
> the methodology.  In fact, this is the entire philosophy behind lit as used 
> with LLVM, clang, lld, etc.  
> 
> I don’t take full credit for this idea.  I had been toying with a similar 
> idea for some time, but it was further cemented in an offline discussion with 
> a co-worker.  
> 
> There are many small, targeted tools in LLVM (e.g. llc, lli, llvm-objdump, 
> etc) whose purpose is to be chained together to do interesting things.  Instead 
> of a command line api as we think of in LLDB where you type commands from an 
> interactive prompt, they have a command line api as you would expect from any 
> tool which is launched from a shell.
> 
> I can imagine many potential candidates for lldb tools of this nature.  Off 
> the top of my head:
>       • lldb-unwind - A tool for testing the unwinder.  Accepts byte code as 
> input and passes it through to the unwinder, outputting a compressed summary 
> of the steps taken while unwinding, which could be pattern matched in lit.  
> The output format is entirely controlled by the tool, and not by the unwinder 
> itself, so it would be stable in the face of changes to the underlying 
> unwinder.  Could have various options to enable or disable features of the 
> unwinder in order to force the unwinder into modes that can be tricky to 
> encounter in the wild.
>       • lldb-symbol - A tool for testing symbol resolution.  Could have 
> options for testing things like:
>               • Determining if a symbol matches an executable
>               • looking up a symbol by name in the debug info, and mapping it 
> to an address in the process.  
>               • Displaying candidate symbols when doing name lookup in a 
> particular scope (e.g. while stopped at a breakpoint).
>       • lldb-breakpoint - A tool for testing breakpoints and stepping.  
> Various options could include:
>               • Set breakpoints and print the addresses and/or symbol names 
> they were resolved to.
>               • Trigger commands, so that when a breakpoint is hit the tool 
> could automatically continue and try to run to another breakpoint, etc.
>               • options to inspect certain useful pieces of state about an 
> inferior, to be matched in lit. 
>       • lldb-interpreter - tests the jitter etc.  I don’t know much about 
> this, but I don’t see why this couldn’t be tested in a manner similar to how 
> lli is tested.
>       • lldb-platform - tests lldb local and remote platform interfaces.
>       • lldb-cli -- lldb interactive command line.
>       • lldb-format - lldb data formatters etc.
>       • Tests NOW, not later.
>               • I know we’ve been over this a million times and it’s not 
> worth going over the arguments again.  And I know it’s hard to write tests, 
> often requiring the invention of new SB APIs.  Hopefully those issues will be 
> addressed by a) and b) above and writing tests will be easier.  Vedant 
> Kumar ran some analytics on the various codebases and found that LLDB has the 
> lowest test / commit ratio of any LLVM project (he didn’t post numbers for 
> lld, so I’m not sure what it is there).
>                       • lldb: 287 of the past 1000 commits
>                       • llvm: 511 of the past 1000 commits
>                       • clang: 622 of the past 1000 commits
>                       • compiler-rt: 543 of the past 1000 commits
> This is an alarming statistic, and I would love to see this number closer to 
> 50%.
>       • Code style / development conventions - Aside from just the column 
> limitations and bracing styles, there are other areas where LLDB differs from 
> LLVM on code style.  We should continue to adopt more of LLVM's style where 
> it makes sense.  I've identified a couple of areas (incomplete list) which I 
> outline below.  
>               • Clean up the mess of cyclical dependencies and properly layer 
> the libraries.  This is especially important for things like lldb-server that 
> need to link in as little as possible, but regardless it leads to a more 
> robust architecture, faster build and link times, better testability, and is 
> required if we ever want to do a modules build of LLDB
>               • Use CMake instead of Xcode project (CMake supports 
> Frameworks).  CMake supports Apple Frameworks, so the main roadblock to 
> getting this working is just someone doing it.  Segmenting the build process 
> by platform doesn't make sense for the upstream, especially when there is a 
> perfectly workable solution.  I have no doubt that the resulting Xcode 
> workspace generated automatically by CMake will not be as "nice" as one that 
> is maintained by hand.  We face this problem with Visual Studio on Windows as 
> well.  The solution that most people have adopted is to continue using the 
> IDE for code editing and debugging, but for actually running the build, use 
> CMake with Ninja.  A similar workflow should still be possible with an OSX 
> CMake build, but as I do not work every day on a Mac, all I can say is that 
> it's possible, I have no idea how impactful it would be on peoples' 
> workflows.  
>               • Variable naming conventions
>                       • I don’t expect anyone is too fond of LLDB’s naming 
> conventions, but if we’re committed to joining the LLVM ecosystem, then let’s 
> go all the way.
>               • Use more modern C++ and less C
>                       • Old habits die hard, but this isn’t just a matter of 
> style.  It leads to safer, more robust, and less fragile code as well.
>               • Shorter functions and classes with more narrowly targeted 
> responsibilities
>                       • It’s not uncommon to find functions that are hundreds 
> (and in a few cases even 1,000+) of lines long.  We really need to be better 
> about breaking functions and classes down into smaller responsibilities.  
> This helps not just for someone coming in to read the function, but also for 
> testing.  Smaller functions are easier to unit test.
>               • Convert T foo(X, Y, Error &error) functions to Expected<T> 
> foo(X, Y) style (depends on 1.c, the llvm::Error item above)
>                       • llvm::Expected is based on the llvm::Error class 
> described earlier.  It’s used when a function is supposed to return a value, 
> but it could fail.  By packaging the error with the return value, it’s 
> impossible to have a situation where you use the return value even in case of 
> an error, and because llvm::Error has mandatory checking, it’s also 
> impossible to have a situation where you don’t check the error.  So it’s very 
> safe.  
> 
> Whew.  That was a lot.  If you made it this far, thanks for reading!
> 
> Obviously if we were to embark on all of the above, it would take many months 
> to complete everything.  So I'm not proposing anyone stop what they're doing 
> to work on this.  This is just my own personal wishlist.
> _______________________________________________
> lldb-dev mailing list
> [email protected]
> http://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev
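
To make the "T foo(X, Y, Error &error)" versus "Expected<T> foo(X, Y)"
contrast from the quoted list concrete, here is a rough sketch. It does not
use the real llvm::Expected; ExpectedSketch is a hypothetical stand-in built
on std::variant so the example compiles on its own, and ParseRegisterIndex is
an invented function used only for illustration. Unlike llvm::Expected, this
stand-in does not enforce that the error is checked.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <variant>

// Hypothetical stand-in for llvm::Expected<T>: holds either a value or an
// error message, never both. It does NOT enforce mandatory error checking.
template <typename T> class ExpectedSketch {
  struct ErrorTag {
    std::string message;
  };

public:
  ExpectedSketch(T value) : storage_(std::move(value)) {}

  static ExpectedSketch Failure(std::string message) {
    return ExpectedSketch(ErrorTag{std::move(message)});
  }

  // True when a value is present, false when an error is stored.
  explicit operator bool() const {
    return std::holds_alternative<T>(storage_);
  }
  const T &operator*() const { return std::get<T>(storage_); }
  const std::string &error() const {
    return std::get<ErrorTag>(storage_).message;
  }

private:
  explicit ExpectedSketch(ErrorTag e) : storage_(std::move(e)) {}
  std::variant<T, ErrorTag> storage_;
};

// Old style, for comparison:
//   int ParseRegisterIndex(const std::string &text, Error &error);
// New style: the result and the failure travel together in the return value.
inline ExpectedSketch<int> ParseRegisterIndex(const std::string &text) {
  if (text.empty())
    return ExpectedSketch<int>::Failure("empty register index");
  if (text.size() > 4) // keep the sketch free of integer overflow
    return ExpectedSketch<int>::Failure("index too large: " + text);
  int value = 0;
  for (char c : text) {
    if (c < '0' || c > '9')
      return ExpectedSketch<int>::Failure("not a number: " + text);
    value = value * 10 + (c - '0');
  }
  return value;
}
```

A caller has to test the result before dereferencing it, e.g.
"if (auto idx = ParseRegisterIndex(s)) use(*idx); else report(idx.error());",
where use and report are placeholders for whatever the call site does.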

