Thanks Adrien, I'll run the tests with a 5.3 snapshot and post the results here. In case this helps, this is the hprof samples output (-Xrunhprof:cpu=samples,depth=3,file=/home/ec2-user/hprof_output.txt) for 4.10.4 and 5.2.1 in my test:
Solr 4.10.4:

CPU SAMPLES BEGIN (total = 243525) Fri Jul 31 22:29:06 2015
rank   self  accum   count  trace method
 1 75.07% 75.07% 182812 300523 java.net.PlainSocketImpl.socketAccept
 2  4.27% 79.34%  10408 301576 org.apache.lucene.codecs.blocktree.SegmentTermsEnumFrame.decodeMetaData
 3  4.15% 83.49%  10108 301585 org.apache.lucene.index.FilteredTermsEnum.docs
 4  3.46% 86.95%   8419 301582 org.apache.lucene.index.FilteredTermsEnum.next
 5  2.49% 89.44%   6070 301573 java.net.SocketInputStream.socketRead0
 6  1.99% 91.43%   4848 301599 org.apache.lucene.search.MultiTermQueryWrapperFilter.getDocIdSet
 7  1.98% 93.42%   4824 301583 org.apache.lucene.search.MultiTermQueryWrapperFilter.getDocIdSet
 8  1.57% 94.99%   3824 301589 org.apache.lucene.search.Weight$DefaultBulkScorer.scoreAll
 9  1.44% 96.42%   3504 301594 org.apache.lucene.codecs.lucene41.Lucene41PostingsReader$BlockDocsEnum.refillDocs
10  1.09% 97.51%   2655 301581 java.nio.Bits.copyToArray
11  0.98% 98.50%   2388 301598 org.apache.lucene.codecs.lucene41.Lucene41PostingsReader$BlockDocsEnum.nextDoc
12  0.62% 99.12%   1522 301600 org.apache.lucene.store.DataInput.readVInt
13  0.21% 99.33%    500 301612 org.apache.lucene.codecs.blocktree.SegmentTermsEnum.docs
14  0.07% 99.39%    167 301601 org.apache.lucene.codecs.blocktree.SegmentTermsEnumFrame.next
15  0.06% 99.45%    139 301619 java.lang.System.identityHashCode
16  0.05% 99.50%    114 301632 org.apache.lucene.codecs.lucene41.ForUtil.readBlock
17  0.04% 99.54%     92 300708 java.util.zip.Inflater.inflateBytes
18  0.03% 99.57%     76 301624 org.apache.lucene.codecs.blocktree.SegmentTermsEnumFrame.loadNextFloorBlock
19  0.03% 99.59%     68 300613 java.lang.ClassLoader.defineClass1
20  0.03% 99.62%     68 301615 org.apache.lucene.codecs.blocktree.SegmentTermsEnum.next
21  0.03% 99.65%     62 301635 org.apache.solr.search.SolrIndexSearcher.getDocSetNC
22  0.02% 99.66%     41 301664 org.apache.lucene.codecs.blocktree.SegmentTermsEnum.next
23  0.01% 99.68%     31 301629 org.apache.lucene.util.FixedBitSet.<init>
CPU SAMPLES END

Solr 5.2.1:

CPU SAMPLES BEGIN (total = 235415) Fri Jul 31 22:42:06 2015
rank   self  accum   count  trace method
 1 51.38% 51.38% 120954 301291 sun.nio.ch.EPollArrayWrapper.epollWait
 2 25.69% 77.07%  60477 301292 sun.nio.ch.ServerSocketChannelImpl.accept0
 3 10.59% 87.66%  24923 301369 org.apache.lucene.index.ExitableDirectoryReader$ExitableTermsEnum.next
 4  2.20% 89.86%   5182 301414 org.apache.lucene.codecs.blocktree.SegmentTermsEnum.postings
 5  2.01% 91.87%   4742 301384 org.apache.lucene.index.FilterLeafReader$FilterTermsEnum.postings
 6  1.25% 93.12%   2944 301434 java.lang.ThreadLocal$ThreadLocalMap.getEntryAfterMiss
 7  1.11% 94.23%   2612 301367 org.apache.lucene.search.MultiTermQueryConstantScoreWrapper$1.rewrite
 8  0.94% 95.17%   2204 301390 org.apache.lucene.util.BitSet.or
 9  0.93% 96.10%   2190 301383 java.nio.Bits.copyToArray
10  0.78% 96.87%   1825 301449 org.apache.lucene.codecs.lucene50.Lucene50PostingsReader$BlockDocsEnum.refillDocs
11  0.73% 97.60%   1717 301378 org.apache.lucene.search.Weight$DefaultBulkScorer.scoreAll
12  0.73% 98.33%   1715 301374 org.apache.lucene.util.BitSet.or
13  0.33% 98.66%    787 301387 org.apache.lucene.codecs.lucene50.Lucene50PostingsReader$BlockDocsEnum.nextDoc
14  0.16% 98.82%    374 301426 org.apache.lucene.codecs.lucene50.Lucene50PostingsReader$BlockDocsEnum.nextDoc
15  0.10% 98.93%    245 301382 org.apache.lucene.util.BitSet.or
16  0.09% 99.02%    219 301381 org.apache.lucene.codecs.blocktree.SegmentTermsEnumFrame.next
17  0.09% 99.11%    207 301370 org.apache.lucene.util.BitSet.or
18  0.06% 99.17%    153 301416 org.apache.lucene.util.BitSet.or
19  0.06% 99.24%    151 301427 org.apache.lucene.util.BitSet.or
20  0.06% 99.30%    151 301441 org.apache.lucene.store.DataInput.readVInt
21  0.06% 99.36%    147 301389 java.lang.System.identityHashCode
22  0.06% 99.42%    140 301375 org.apache.lucene.codecs.blocktree.SegmentTermsEnum.next
23  0.04% 99.47%    104 301425 org.apache.lucene.util.BitSet.or
24  0.03% 99.50%     76 301423 org.apache.lucene.codecs.lucene50.Lucene50PostingsReader$BlockDocsEnum.nextDoc
25  0.03% 99.53%     74 301454 org.apache.lucene.search.MultiTermQueryConstantScoreWrapper$1.rewrite
26  0.03% 99.56%     65 301432 org.apache.lucene.util.BitDocIdSet$Builder.or
27  0.02% 99.58%     53 301456 org.apache.lucene.util.FixedBitSet.or
28  0.02% 99.60%     52 300077 java.lang.ClassLoader.defineClass1
29  0.02% 99.63%     50 301464 org.apache.lucene.codecs.lucene50.ForUtil.readBlock
30  0.02% 99.64%     39 301438 org.apache.lucene.codecs.blocktree.SegmentTermsEnum.next
31  0.02% 99.66%     37 301465 org.apache.lucene.codecs.blocktree.SegmentTermsEnumFrame.loadNextFloorBlock
32  0.02% 99.67%     36 301419 org.apache.lucene.codecs.lucene50.Lucene50PostingsReader$BlockDocsEnum.nextDoc
CPU SAMPLES END

On Fri, Jul 31, 2015 at 4:23 PM, Adrien Grand <[email protected]> wrote:
> Hi Tomás,
>
> I suspect this might be related to LUCENE-5938. We changed the default
> rewrite method for multi-term queries to load documents into a sparse
> bit set first, and only upgrade to a dense bit set when we know many
> documents match. When there are lots of terms to intersect, we end up
> spending significant CPU time building the sparse bit set only to
> eventually upgrade to a dense bit set, as before. This might be what
> you are seeing.
>
> You might see the issue less with the population field because it has
> fewer unique values, so postings lists are longer and the DocIdSet
> building logic can upgrade to a dense bit set more quickly.
>
> Mike noticed this slowness when working on BKD trees, and we changed
> this first phase to use a plain int[] array that we sort and
> deduplicate instead of a fancier sparse bit set (LUCENE-6645), which
> seemed to make things faster. Would it be possible for you to also
> check a 5.3 snapshot?
>
> On Fri, Jul 31, 2015 at 10:51 PM, Tomás Fernández Löbbe
> <[email protected]> wrote:
> > Hi, I'm seeing some query performance degradation between 4.10.4 and
> > 5.2.1. It doesn't happen with all the queries, but for queries like
> > range queries on fields with many different values, the average time
> > in 5.2.1 is worse than in 4.10.4. Is anyone seeing something similar?
> >
> > Test Details:
> > * Single thread running queries continuously. I ran the test twice for
> >   each Solr version.
> > * JMeter running on my laptop, Solr running on EC2, on an m3.xlarge
> >   instance with all the defaults but with a 5G heap. Index on local
> >   disk (SSD).
> > * Plain Solr releases, nothing custom. Single Solr core, not in
> >   SolrCloud mode, no distributed search.
> > * "allCountries" geonames dataset (~8M small docs). No updates during
> >   the test. Index size is around 1.1GB for Solr 5.2.1 and 1.3GB for
> >   Solr 4.10.4 (fits entirely in RAM).
> > * jdk1.8.0_45
> >
> > Queries: 3k boolean queries (generated with terms from the dataset)
> > with range queries as filters on the "tlongitude" and "tlatitude"
> > fields, with randomly generated bounds, e.g.
> > q=name:foo OR name:bar&fq=tlongitude:[W TO X]&fq=tlatitude:[Y TO Z]
> >
> > Fields are:
> > <field name="tlatitude" type="tdouble"/>
> > <field name="tlongitude" type="tdouble"/>
> > Field type:
> > <fieldType name="tdouble" class="solr.TrieDoubleField" precisionStep="8"
> >   positionIncrementGap="0"/>
> >
> > In this case, Solr 4.10.4 was between 20% and 30% faster than 5.2.1 on
> > average.
> >
> > http://snag.gy/2yPPM.jpg
> >
> > Doing only the boolean queries shows no performance difference between
> > 4.10 and 5.2; same thing if I use filters on a string field instead of
> > the range queries.
> >
> > When using the "double" field type (precisionStep="0"), the difference
> > was bigger:
> >
> > longitude/latitude fields:
> > <field name="longitude" type="double" docValues="true"/>
> > <field name="latitude" type="double" docValues="true"/>
> > <fieldType name="double" class="solr.TrieDoubleField" precisionStep="0"
> >   positionIncrementGap="0"/>
> >
> > http://snag.gy/Vi5uk.jpg
> >
> > I understand this is not the best field type definition for range
> > queries; I'm just trying to understand the difference between the two
> > versions and why.
> >
> > Performance was OK when doing range queries on the "population" field
> > (long), but that field doesn't have many different values: only 300k
> > out of the 8.3M docs have a population value different from 0. On the
> > other hand, doing range queries on the _version_ field did show a
> > graph similar to the previous one:
> >
> > <field name="_version_" type="long" indexed="true" stored="true"/>
> > <fieldType name="long" class="solr.TrieLongField" precisionStep="0"
> >   positionIncrementGap="0"/>
> >
> > http://snag.gy/4tc7e.jpg
> >
> > Any idea what could be causing this? Is this expected after some known
> > change?
> >
> > With Solr 4.10, a single CPU core remains high during the test (close
> > to 100%), but with Solr 5.2, different cores go up and down in
> > utilization continuously. That's probably because of the different
> > Jetty version, I suppose. GC patterns look similar in both. For both
> > Solr versions I'm using the settings that ship with Solr (in
> > solr.in.sh) except for Xmx and Xms.
>
> --
> Adrien
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [email protected]
> For additional commands, e-mail: [email protected]
>
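To make the rewrite change Adrien describes above (LUCENE-5938) more concrete, here is a rough sketch of the "collect into a sparse structure, upgrade to a dense bit set once many documents match" pattern. This is illustrative only, not the actual Lucene code: the class name, the upgrade threshold, and the use of TreeSet/BitSet as stand-ins for Lucene's sparse and fixed bit sets are all assumptions.

  import java.util.BitSet;
  import java.util.TreeSet;

  // Illustrative only -- NOT the actual LUCENE-5938 code. TreeSet and BitSet
  // stand in for Lucene's sparse and dense (FixedBitSet) structures, and the
  // upgrade threshold below is an arbitrary assumption.
  class SparseThenDenseCollector {
      private final int maxDoc;
      private final int upgradeThreshold;
      private TreeSet<Integer> sparse = new TreeSet<>();
      private BitSet dense;

      SparseThenDenseCollector(int maxDoc) {
          this.maxDoc = maxDoc;
          this.upgradeThreshold = Math.max(1, maxDoc >>> 7); // assumed heuristic: ~1/128 of maxDoc
      }

      void collect(int docId) {
          if (dense != null) {
              dense.set(docId);          // cheap once the dense set exists
              return;
          }
          sparse.add(docId);             // per-document overhead while still sparse
          if (sparse.size() > upgradeThreshold) {
              // Upgrade: replay everything collected so far into a dense bit set.
              dense = new BitSet(maxDoc);
              for (int doc : sparse) {
                  dense.set(doc);
              }
              sparse = null;
          }
      }
  }

With a high-cardinality field like the precisionStep="0" doubles, the multi-term rewrite visits many short postings lists, so most of the per-document work would happen in the sparse phase before the upgrade; that would be consistent with the BitSet.or and MultiTermQueryConstantScoreWrapper$1.rewrite samples showing up in the 5.2.1 profile above, while the population field upgrades sooner because its postings lists are longer.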

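And a similarly simplified sketch of the direction Adrien describes for LUCENE-6645: buffer matching doc IDs in a plain int[] during the first phase, then sort and deduplicate once at the end. Again illustrative only; the growth policy and method names are assumptions, not the actual patch.

  import java.util.Arrays;

  // Illustrative only -- NOT the actual LUCENE-6645 patch. Doc IDs are
  // appended to a plain int[] and sorted + deduplicated once at the end.
  class BufferThenSortCollector {
      private int[] buffer = new int[64];
      private int size = 0;

      void collect(int docId) {
          if (size == buffer.length) {
              buffer = Arrays.copyOf(buffer, size << 1); // grow geometrically
          }
          buffer[size++] = docId;                        // append: a single array write
      }

      // Produce the final sorted, duplicate-free doc ID list.
      int[] finish() {
          Arrays.sort(buffer, 0, size);
          int unique = 0;
          for (int i = 0; i < size; i++) {
              if (i == 0 || buffer[i] != buffer[i - 1]) {
                  buffer[unique++] = buffer[i];
              }
          }
          return Arrays.copyOf(buffer, unique);
      }
  }

Since the per-document cost here is a single array write rather than maintaining a sparse set, this is the change that makes re-running the test against a 5.3 snapshot worthwhile.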