Also, there are a number of TreeMaps that ought to be HashMaps: https://issues.apache.org/jira/browse/LUCENE-8041
On Wed, Nov 15, 2017 at 9:20 PM Erick Erickson <[email protected]> wrote: > The first thing I'd do is profile a running Solr instance and see if it > spends enough time building TreeSets to bother with before doing wholesale > surgery. My guess is that the efficiencies are hard to measure if > measurable at all. > > The proof is in the profiling of course. > > Best, > Erick > > On Wed, Nov 15, 2017 at 5:46 PM, David McManamon <[email protected]> > wrote: > >> There were interesting papers regarding balanced binary trees in 2015 & >> 2016 >> >> http://sidsen.azurewebsites.net// >> >> And after reading about "Rank-balanced Trees" I started comparing AVL & >> Red-black tree performance and was surprised to see that red-black trees >> are the default in Java given the performance differences noted in Ben >> Pfaff's 2004 work. >> So I wrote several new TreeMap implementations and put them on GitHub >> with two short blog posts- >> >> https://refactoringlightly.wordpress.com/ >> >> Seems that TreeSet is used a lot in Lucene and if the input has sequences >> then the constants in the Olog(n) operations make those places a very >> strong candidate to switch to an AVLTreeSet. >> >> With so many uses of TreeSet I wasn't sure where/how to start >> benchmarking? >> --David >> > > -- Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker LinkedIn: http://linkedin.com/in/davidwsmiley | Book: http://www.solrenterprisesearchserver.com
