On Mon, Jul 04, 2022 at 08:30:08PM +0100, Alberto Faria wrote:
> On Mon, Jul 4, 2022 at 5:28 PM Daniel P. Berrangé <berra...@redhat.com> wrote:
> > Have you done any measurement to see how much of the overhead is from
> > the checks you implemented, vs how much is inherently forced on us
> > by libclang? i.e. what does it look like if you just load the libclang
> > framework and run it across all source files, without doing any
> > checks in python.
>
> Running the script with all checks disabled, i.e., doing nothing after
> TranslationUnit.from_source():
>
> $ time ./static-analyzer.py build
> [...]
> Analyzed 8775 translation units in 274.0 seconds.
>
> real 4m34.537s
> user 49m32.555s
> sys 1m18.731s
>
> $ time ./static-analyzer.py build block util
> Analyzed 162 translation units in 4.2 seconds.
>
> real 0m4.804s
> user 0m40.389s
> sys 0m1.690s
>
> This is still with 12 threads on a 12-hardware thread laptop, but
> scalability is near perfect. (The time reported by the script doesn't
> include loading and inspection of the compilation database.)
>
> So, not great. What's more, TranslationUnit.from_source() delegates
> all work to clang_parseTranslationUnit(), so I suspect C libclang
> wouldn't do much better.
>
> And with all checks enabled:
>
> $ time ./static-analyzer.py build block util
> [...]
> Analyzed 162 translation units in 86.4 seconds.
>
> real 1m26.999s
> user 14m51.163s
> sys 0m2.205s
>
> Yikes. Also not great at all, although the current implementation does
> many inefficient things, like redundant AST traversals. Cutting
> through some of clang.cindex's abstractions should also help, e.g.,
> using libclang's visitor API properly instead of calling
> clang_visitChildren() for every get_children().
>
> Perhaps we should set a target for how slow we can tolerate this thing
> to be, as a percentage of total build time, and determine if the
> libclang approach is viable. I'm thinking maybe 10%?
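
As a rough sanity check on the scalability claim, the quoted `time`
figures can be turned into an effective busy-thread count,
(user + sys) / real, and the two block+util runs can be compared to see
how much the checks add on top of parsing. This is only a sketch: the
helper names are made up for illustration, and the only real data are
the timings copied from the output above.

```python
# Hypothetical helpers; only the timing strings come from the quoted output.

def to_seconds(stamp: str) -> float:
    """Convert a `time`-style stamp such as '4m34.537s' to seconds."""
    minutes, _, seconds = stamp.rstrip("s").partition("m")
    return int(minutes) * 60 + float(seconds)


def effective_parallelism(real: str, user: str, sys_time: str) -> float:
    """CPU time divided by wall-clock time: average number of busy threads."""
    return (to_seconds(user) + to_seconds(sys_time)) / to_seconds(real)


# (real, user, sys) for each run quoted above.
RUNS = {
    "parse only, full tree": ("4m34.537s", "49m32.555s", "1m18.731s"),
    "parse only, block+util": ("0m4.804s", "0m40.389s", "0m1.690s"),
    "all checks, block+util": ("1m26.999s", "14m51.163s", "0m2.205s"),
}

for label, (real, user, sys_time) in RUNS.items():
    busy = effective_parallelism(real, user, sys_time)
    print(f"{label}: about {busy:.1f} of 12 threads busy")

# Wall-clock cost of the checks relative to parsing alone, using the
# script's own "Analyzed ... in N seconds" figures for block+util.
checks_slowdown = 86.4 / 4.2
print(f"checks vs. parse-only on block+util: about {checks_slowdown:.0f}x")
```

The short block+util parse-only run keeps fewer threads busy than the
full-tree run, which is what one would expect if process startup and
compilation database loading weigh more on a run of a few seconds.
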
>
> > If I run 'clang-tidy' across the entire source tree, it takes 3 minutes
> > on my machine, but there's overhead of repeatedly starting the process
> > in there.
>
> Is that parallelized in some way? It seems strange that clang-tidy
> would be so much faster than libclang.

No, that was me doing a dumb

  for i in `git ls-tree --name-only -r HEAD:`
  do
     clang-tidy $i 1>/dev/null 2>&1
  done

so in fact it was parsing all source files, not just .c files (and
likely whining about non-C files). This was on my laptop with
6 cores / 2 SMT.

With regards,
Daniel
-- 
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|