I am laughing as I convert code over to using StringRef and I get crashes:

if (name == NULL)

StringRef is nice enough to implicitly construct a StringRef from NULL or nullptr so that it can crash for me...

> On Sep 21, 2016, at 11:09 AM, Zachary Turner via lldb-dev <lldb-dev@lists.llvm.org> wrote:
>
> Adding another thing to my list (thanks to Mehdi and Eric Christopher for the idea).
>
> Apply libfuzzer to LLDB. Details sparse on what parts of LLDB and how, but I think it would be easy to come up with candidates.
>
> On Mon, Sep 19, 2016 at 1:18 PM Zachary Turner <ztur...@google.com> wrote:
> Following up with Kate's post from a few weeks ago, I think the dust has settled on the code reformat and it went over pretty smoothly for the most part. So I thought it might be worth throwing out some ideas for where we go from here. I have a large list of ideas (more ideas than time, sadly) that I've been collecting over the past few weeks, so I figured I would throw them out in the open for discussion.
>
> I’ve grouped the areas for improvement into 3 high level categories.
>
> • De-inventing the wheel - We should use more code from LLVM, and delete code in LLDB where LLVM provides a solution. In cases where there is an LLVM thing that is *similar* to what we need, we should extend the LLVM thing to support what we need, and then use it. Following are some areas I've identified. This list is by no means complete. For each one, I've given a personal assessment of how likely it is to cause some (temporary) hiccups, how much it would help us in the long run, and how difficult it would be to do. Without further ado:
>   • Use llvm::Regex instead of lldb::Regex
>     • llvm::Regex doesn’t support enhanced mode. Could we add support for this to llvm::Regex?
>     • Risk: 6
>     • Impact: 3
>     • Difficulty / Effort: 3 (5 if we have to add enhanced mode support)
>   • Use llvm streams instead of lldb::StreamString
>     • Supports output re-targeting (stderr, stdout, std::string, etc), printf style formatting, and type-safe streaming operators.
>     • Interoperates nicely with many existing llvm utility classes
>     • Risk: 4
>     • Impact: 5
>     • Difficulty / Effort: 7
>   • Use llvm::Error instead of lldb::Error
>     • llvm::Error is an error class that *requires* you to check whether it succeeded or it will assert. In a way, it's similar to a C++ exception, except that it doesn't come with the performance hit associated with exceptions. It's extensible, and can be easily extended to support the various ways LLDB needs to construct errors and error messages.
>     • Would need to first rename lldb::Error to LLDBError so that the conversion from LLDBError to llvm::Error could be done incrementally.
>     • Risk: 7
>     • Impact: 7
>     • Difficulty / Effort: 8
>   • StringRef instead of const char *, len everywhere
>     • Can do most common string operations in a way that is guaranteed to be safe.
>     • Reduces string manipulation algorithm complexity by an order of magnitude.
>     • Can potentially eliminate tens of thousands of string copies across the codebase.
>     • Simplifies code.
>     • Risk: 3
>     • Impact: 8
>     • Difficulty / Effort: 7
>   • ArrayRef instead of const void *, len everywhere
>     • Same analysis as StringRef
>   • MutableArrayRef instead of void *, len everywhere
>     • Same analysis as StringRef
>   • Delete ConstString, use a modified StringPool that is thread-safe.
>     • StringPool is a non thread-safe version of ConstString.
>     • Strings are internally refcounted so they can be cleaned up when they are no longer used. ConstStrings are a large source of memory in LLDB, so ref-counting and removing stale strings has the potential to be a huge savings.
>     • Risk: 2
>     • Impact: 9
>     • Difficulty / Effort: 6
>   • thread_local instead of lldb::ThreadLocal
>     • This fixes a number of bugs on Windows that cannot be fixed otherwise, as they require compiler support.
>     • Some other compilers may not support this yet?
>     • Risk: 2
>     • Impact: 3
>     • Difficulty: 3
>   • Use llvm::cl for the command line arguments to the primary lldb executable.
>     • Risk: 2
>     • Impact: 3
>     • Difficulty / Effort: 4
> • Testing - Our testing infrastructure is unstable, and our test coverage is lacking. We should take steps to improve this.
>   • Port as much as possible to lit
>     • Simple tests should be trivial to port to lit today. If nothing else this serves as a proof of concept while increasing the speed and stability of the test suite, since lit is a more stable harness.
>   • Separate testing tools
>     • One question that remains open is how to represent the complicated needs of a debugger in lit tests. Part a) above covers the trivial cases, but what about the difficult cases? In https://reviews.llvm.org/D24591 a number of ideas were discussed. We started getting to this idea towards the end, about a separate tool which has an interface independent of the command line interface and which can be used to test. lldb-mi was mentioned. While I have serious concerns about lldb-mi due to its poorly written and tested codebase, I do agree in principle with the methodology. In fact, this is the entire philosophy behind lit as used with LLVM, clang, lld, etc.
>
>       I don’t take full credit for this idea. I had been toying with a similar idea for some time, but it was further cemented in an offline discussion with a co-worker.
>
>       There are many small, targeted tools in LLVM (e.g. llc, lli, llvm-objdump, etc) whose purpose is to be chained together to do interesting things. Instead of a command line api as we think of in LLDB where you type commands from an interactive prompt, they have a command line api as you would expect from any tool which is launched from a shell.
>
>       I can imagine many potential candidates for lldb tools of this nature. Off the top of my head:
>     • lldb-unwind - A tool for testing the unwinder.
>       Accepts byte code as input and passes it through to the unwinder, outputting a compressed summary of the steps taken while unwinding, which could be pattern matched in lit. The output format is entirely controlled by the tool, and not by the unwinder itself, so it would be stable in the face of changes to the underlying unwinder. Could have various options to enable or disable features of the unwinder in order to force the unwinder into modes that can be tricky to encounter in the wild.
>     • lldb-symbol - A tool for testing symbol resolution. Could have options for testing things like:
>       • Determining if a symbol matches an executable
>       • Looking up a symbol by name in the debug info, and mapping it to an address in the process.
>       • Displaying candidate symbols when doing name lookup in a particular scope (e.g. while stopped at a breakpoint).
>     • lldb-breakpoint - A tool for testing breakpoints and stepping. Various options could include:
>       • Set breakpoints and output the addresses and/or symbol names where they were resolved to.
>       • Trigger commands, so that when a breakpoint is hit the tool could automatically continue and try to run to another breakpoint, etc.
>       • Options to inspect certain useful pieces of state about an inferior, to be matched in lit.
>     • lldb-interpreter - tests the jitter etc. I don’t know much about this, but I don’t see why this couldn’t be tested in a manner similar to how lli is tested.
>     • lldb-platform - tests lldb local and remote platform interfaces.
>     • lldb-cli -- lldb interactive command line.
>     • lldb-format - lldb data formatters etc.
>   • Tests NOW, not later.
>     • I know we’ve been over this a million times and it’s not worth going over the arguments again. And I know it’s hard to write tests, often requiring the invention of new SB APIs. Hopefully those issues will be addressed by a) and b) above and writing tests will be easier.
>       Vedant Kumar ran some analytics on the various codebases and found that LLDB has the lowest test / commit ratio of any LLVM project (He didn’t post numbers for lld, so I’m not sure what it is there).
>       • lldb: 287 of the past 1000 commits
>       • llvm: 511 of the past 1000 commits
>       • clang: 622 of the past 1000 commits
>       • compiler-rt: 543 of the past 1000 commits
>       This is an alarming statistic, and I would love to see this number closer to 50%.
> • Code style / development conventions - Aside from just the column limitations and bracing styles, there are other areas where LLDB differs from LLVM on code style. We should continue to adopt more of LLVM's style where it makes sense. I've identified a couple of areas (incomplete list) which I outline below.
>   • Clean up the mess of cyclical dependencies and properly layer the libraries. This is especially important for things like lldb-server that need to link in as little as possible, but regardless it leads to a more robust architecture, faster build and link times, better testability, and is required if we ever want to do a modules build of LLDB
>   • Use CMake instead of Xcode project (CMake supports Frameworks). CMake supports Apple Frameworks, so the main roadblock to getting this working is just someone doing it. Segmenting the build process by platform doesn't make sense for the upstream, especially when there is a perfectly workable solution. I have no doubt that the resulting Xcode workspace generated automatically by CMake will not be as "nice" as one that is maintained by hand. We face this problem with Visual Studio on Windows as well. The solution that most people have adopted is to continue using the IDE for code editing and debugging, but for actually running the build, use CMake with Ninja.
>     A similar workflow should still be possible with an OSX CMake build, but as I do not work every day on a Mac, all I can say is that it's possible; I have no idea how impactful it would be on people's workflows.
>   • Variable naming conventions
>     • I don’t expect anyone is too fond of LLDB’s naming conventions, but if we’re committed to joining the LLVM ecosystem, then let’s go all the way.
>   • Use more modern C++ and less C
>     • Old habits die hard, but this isn’t just a matter of style. It leads to safer, more robust, and less fragile code as well.
>   • Shorter functions and classes with more narrowly targeted responsibilities
>     • It’s not uncommon to find functions that are hundreds (and in a few cases even 1,000+) of lines long. We really need to be better about breaking functions and classes down into smaller responsibilities. This helps not just for someone coming in to read the function, but also for testing. Smaller functions are easier to unit test.
>   • Convert T foo(X, Y, Error &error) functions to Expected<T> foo(X, Y) style (Depends on 1.c)
>     • llvm::Expected is based on the llvm::Error class described earlier. It’s used when a function is supposed to return a value, but it could fail. By packaging the error with the return value, it’s impossible to have a situation where you use the return value even in case of an error, and because llvm::Error has mandatory checking, it’s also impossible to have a situation where you don’t check the error. So it’s very safe.
>
> Whew. That was a lot. If you made it this far, thanks for reading!
>
> Obviously if we were to embark on all of the above, it would take many months to complete everything. So I'm not proposing anyone stop what they're doing to work on this.
> This is just my own personal wishlist
> _______________________________________________
> lldb-dev mailing list
> lldb-dev@lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev
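
For anyone hitting the same `if (name == NULL)` crash mentioned at the top of this message, here is a minimal sketch of the pitfall and one possible way around it. It deliberately does not use LLVM: `MiniRef`, `fromNullable` and `hasName` are hypothetical names made up for illustration, and `MiniRef` only models the one property that matters here, namely an implicit constructor from `const char *` that cannot accept a null pointer. Whether a null pointer and an empty string should really be treated the same is an assumption that has to be checked per call site.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for a pointer+length string view.
// This is NOT llvm::StringRef; it only models the implicit
// construction from `const char *` discussed above.
class MiniRef {
  const char *Data = nullptr;
  std::size_t Length = 0;

  // Computes the length of a C string. A null pointer trips the
  // assert in debug builds and is undefined behavior in strlen
  // when asserts are compiled out.
  static std::size_t checkedLength(const char *Str) {
    assert(Str && "MiniRef cannot be built from a null pointer");
    return std::strlen(Str);
  }

public:
  MiniRef() = default;

  // Implicit on purpose: this is what lets `name == NULL` compile
  // after `name` has been converted from `const char *` to a view.
  MiniRef(const char *Str) : Data(Str), Length(checkedLength(Str)) {}

  bool empty() const { return Length == 0; }
  std::size_t size() const { return Length; }
  const char *data() const { return Data; }
};

// Hypothetical helper for API boundaries that may still hand us a
// null C string: map null to an empty view instead of crashing.
inline MiniRef fromNullable(const char *Str) {
  return Str ? MiniRef(Str) : MiniRef();
}

// Old style:  bool hasName(const char *name) { return name != NULL; }
// New style:  ask the view whether it is empty instead of comparing
// it against NULL (which would construct a view from a null pointer).
inline bool hasName(MiniRef Name) { return !Name.empty(); }
```

With this shape, call sites that used to pass a possibly-null pointer go through `fromNullable(ptr)` once at the boundary, and every check further down becomes `Name.empty()` rather than a comparison against NULL.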