On Wed, Nov 21, 2018 at 09:51:19AM -0800, Eric Saint-Etienne wrote:
> Perf can take minutes to parse an image when -ffunction-sections is used.
> This is especially true with the kernel image when it is compiled this way,
> which is the arm64 default since the patchset "Enable deadcode elimination
> at link time".
>
> Perf organizes maps using an rbtree. Whenever perf finds a new symbol, it
> first searches this rbtree for the map it belongs to, by strcmp()'ing
> section names. When it finds the map with the right name, it uses it to
> add the symbol. With a usual image there aren't so many maps, but when using
> -ffunction-sections there's basically one map per function.
> With the kernel image that's north of 40,000 maps. For most symbols perf
> has to parse the entire rbtree to eventually create a new map and add it.
> Consequently perf spends most of the time browsing an rbtree that keeps
> getting larger.
>
> This performance fix introduces a secondary rbtree that indexes maps based
> on the section name.
>
> Signed-off-by: Eric Saint-Etienne <eric.saint.etie...@oracle.com>
> Reviewed-by: Dave Kleikamp <dave.kleik...@oracle.com>
> Reviewed-by: David Aldridge <david.aldri...@oracle.com>
> Reviewed-by: Rob Gardner <rob.gard...@oracle.com>
Looks sane, thanks to the multiple reviewers, really appreciated. Applied.

- Arnaldo
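
For readers following the thread, here is a minimal sketch of the idea in the quoted commit message: keep the maps in a tree keyed by section name, so that each symbol finds its map with a logarithmic lookup. This is not the patch itself. It assumes POSIX tsearch()/tfind() as a stand-in for perf's kernel-style rbtree, and struct section_map, map_find_or_create() and add_symbol() are hypothetical names made up for illustration.

```c
#include <assert.h>
#include <search.h>   /* POSIX tsearch()/tfind(): a binary search tree API */
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a per-section map; not perf's real struct map. */
struct section_map {
	char *name;       /* section name, e.g. ".text.some_function" */
	int nr_symbols;   /* number of symbols added to this map so far */
};

/* Order maps by section name so the tree can be searched by name. */
static int cmp_by_name(const void *a, const void *b)
{
	const struct section_map *ma = a, *mb = b;

	return strcmp(ma->name, mb->name);
}

/*
 * Return the map for `name`, creating and indexing it on first use.
 * Returns NULL if memory allocation fails.
 */
static struct section_map *map_find_or_create(void **root, const char *name)
{
	struct section_map key = { .name = (char *)name };
	struct section_map *map;
	size_t len;
	void *node;

	assert(root != NULL && name != NULL);

	/* Name-keyed lookup instead of comparing against every existing map. */
	node = tfind(&key, root, cmp_by_name);
	if (node)
		return *(struct section_map **)node;

	map = calloc(1, sizeof(*map));
	if (!map)
		return NULL;

	len = strlen(name) + 1;
	map->name = malloc(len);
	if (!map->name) {
		free(map);
		return NULL;
	}
	memcpy(map->name, name, len);

	/* Insert the new map into the name index. */
	if (!tsearch(map, root, cmp_by_name)) {
		free(map->name);
		free(map);
		return NULL;
	}
	return map;
}

/* Record one symbol against the map of the section it lives in. */
static int add_symbol(void **root, const char *section)
{
	struct section_map *map = map_find_or_create(root, section);

	if (!map)
		return -1;
	map->nr_symbols++;
	return 0;
}
```

A caller would start with `void *root = NULL;` and call `add_symbol(&root, section_name)` once per symbol. With one section per function, almost every call creates a new map, so the cost of the "not found" lookup is what dominates, and that is the path a name-keyed index shortens.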