mzolotukhin added a comment. > What kinds of <some part> would be useful to you? (How do you want the > runtime of Clang broken down?) Are vertical slices (through all Clang's > various layers and potentially parts of LLVM) -- as this patch will produce > -- useful, or would you really want horizontal slices (as in, what part of > Clang is actually spending the time)? Or just anything that's basically > expected to be consistent from run to run, so you can identify that something > has slowed down, even if you don't have enough information to really know > what?
For me "something has slowed down" would be enough. I.e. even if "parse templates" would be erroneously attributed to 90% time spent in front-end, I would be able to see a jump from 90% to 95%. While these numbers are not reflecting the actual ratio, they still will indicate changes. I would find horizontal slicing more interesting, as we can get vertical slices from profilers. I.e. if in LLVM profile I see that time spent in APInt::add has grown a lot, I'd like to know which pass started to use it more extensively - it's rarely the case that APInt::add slowed down itself. > I think we need to fix the overlap issue as a prerequisite to adding timers > with large amounts of overlap, especially self-overlap. Otherwise the numbers > will likely do more harm than good, as they will significantly misattribute > runtime. Fortunately, I think that should only require some relatively small > changes to the LLVM timer infrastructure. I definitely would support that :) > Well, that depends on the profiler. With some profilers, you can just attach > a profiler to your entire compilation and then ask it what the hottest > functions were. But sure, if you have existing infrastructure built around > -ftime-report, I can see that it makes sense to reuse that infrastructure. Yeah, that was exactly my point. > While LLVM may have some overlap between regions, the vertical timing slices > are still pretty well aligned with the horizontal functionality slices. I > expect the problems in Clang to be a lot worse. Suppose you enter N levels of > parsing templates, and then trigger a whole bunch of template instantiation, > AST deserialization, and code generation. Let's say that takes 1s in total. > With this patch, I think we'd end up attributing Ns of compile time to > "parsing templates", which is clearly very far from the truth, but will > likely be listed as (by far) the slowest thing in the compilation. This does > not even rise to the level of "not-perfect", it's going to be actively > misleading in a lot of cases, and won't even necessarily point anywhere near > the problematic spot. > I think with this patch and no fix to the overlap problem, we will find > ourselves frequently fielding bugs where people say "X is slow" when it > actually isn't. Plus I think we have the opportunity to deliver something > that's hugely useful and not much harder to achieve (providing timing > information that relates back to the source code), and I'd prefer we spend > our complexity budget for this feature there. I see your point, and I agree it would be really misleading if e.g the sum of timers won't match the total time due to self-overlaps. I think adding these timers still might be worthwhile as my guess is that no one usually uses them anyway unless they know what to expect from the reported timers, but I might be mistaken here. And anyway, if you think it's possible to fix the issue first, it's totally fine with me. Thanks, Michael https://reviews.llvm.org/D43578 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits