Thanks Adrien, I'll run the tests with a 5.3 snapshot and post the results here. In case this helps, this is the hprof samples output (-Xrunhprof:cpu=samples,depth=3,file=/home/ec2-user/hprof_output.txt) for 4.10.4 and 5.2.1 in my test:
Solr 4.10.4:

CPU SAMPLES BEGIN (total = 243525) Fri Jul 31 22:29:06 2015
rank   self  accum   count  trace method
 1 75.07% 75.07% 182812 300523 java.net.PlainSocketImpl.socketAccept
 2  4.27% 79.34%  10408 301576 org.apache.lucene.codecs.blocktree.SegmentTermsEnumFrame.decodeMetaData
 3  4.15% 83.49%  10108 301585 org.apache.lucene.index.FilteredTermsEnum.docs
 4  3.46% 86.95%   8419 301582 org.apache.lucene.index.FilteredTermsEnum.next
 5  2.49% 89.44%   6070 301573 java.net.SocketInputStream.socketRead0
 6  1.99% 91.43%   4848 301599 org.apache.lucene.search.MultiTermQueryWrapperFilter.getDocIdSet
 7  1.98% 93.42%   4824 301583 org.apache.lucene.search.MultiTermQueryWrapperFilter.getDocIdSet
 8  1.57% 94.99%   3824 301589 org.apache.lucene.search.Weight$DefaultBulkScorer.scoreAll
 9  1.44% 96.42%   3504 301594 org.apache.lucene.codecs.lucene41.Lucene41PostingsReader$BlockDocsEnum.refillDocs
10  1.09% 97.51%   2655 301581 java.nio.Bits.copyToArray
11  0.98% 98.50%   2388 301598 org.apache.lucene.codecs.lucene41.Lucene41PostingsReader$BlockDocsEnum.nextDoc
12  0.62% 99.12%   1522 301600 org.apache.lucene.store.DataInput.readVInt
13  0.21% 99.33%    500 301612 org.apache.lucene.codecs.blocktree.SegmentTermsEnum.docs
14  0.07% 99.39%    167 301601 org.apache.lucene.codecs.blocktree.SegmentTermsEnumFrame.next
15  0.06% 99.45%    139 301619 java.lang.System.identityHashCode
16  0.05% 99.50%    114 301632 org.apache.lucene.codecs.lucene41.ForUtil.readBlock
17  0.04% 99.54%     92 300708 java.util.zip.Inflater.inflateBytes
18  0.03% 99.57%     76 301624 org.apache.lucene.codecs.blocktree.SegmentTermsEnumFrame.loadNextFloorBlock
19  0.03% 99.59%     68 300613 java.lang.ClassLoader.defineClass1
20  0.03% 99.62%     68 301615 org.apache.lucene.codecs.blocktree.SegmentTermsEnum.next
21  0.03% 99.65%     62 301635 org.apache.solr.search.SolrIndexSearcher.getDocSetNC
22  0.02% 99.66%     41 301664 org.apache.lucene.codecs.blocktree.SegmentTermsEnum.next
23  0.01% 99.68%     31 301629 org.apache.lucene.util.FixedBitSet.<init>
CPU SAMPLES END

Solr 5.2.1:

CPU SAMPLES BEGIN (total = 235415) Fri Jul 31 22:42:06 2015
rank   self  accum   count  trace method
 1 51.38% 51.38% 120954 301291 sun.nio.ch.EPollArrayWrapper.epollWait
 2 25.69% 77.07%  60477 301292 sun.nio.ch.ServerSocketChannelImpl.accept0
 3 10.59% 87.66%  24923 301369 org.apache.lucene.index.ExitableDirectoryReader$ExitableTermsEnum.next
 4  2.20% 89.86%   5182 301414 org.apache.lucene.codecs.blocktree.SegmentTermsEnum.postings
 5  2.01% 91.87%   4742 301384 org.apache.lucene.index.FilterLeafReader$FilterTermsEnum.postings
 6  1.25% 93.12%   2944 301434 java.lang.ThreadLocal$ThreadLocalMap.getEntryAfterMiss
 7  1.11% 94.23%   2612 301367 org.apache.lucene.search.MultiTermQueryConstantScoreWrapper$1.rewrite
 8  0.94% 95.17%   2204 301390 org.apache.lucene.util.BitSet.or
 9  0.93% 96.10%   2190 301383 java.nio.Bits.copyToArray
10  0.78% 96.87%   1825 301449 org.apache.lucene.codecs.lucene50.Lucene50PostingsReader$BlockDocsEnum.refillDocs
11  0.73% 97.60%   1717 301378 org.apache.lucene.search.Weight$DefaultBulkScorer.scoreAll
12  0.73% 98.33%   1715 301374 org.apache.lucene.util.BitSet.or
13  0.33% 98.66%    787 301387 org.apache.lucene.codecs.lucene50.Lucene50PostingsReader$BlockDocsEnum.nextDoc
14  0.16% 98.82%    374 301426 org.apache.lucene.codecs.lucene50.Lucene50PostingsReader$BlockDocsEnum.nextDoc
15  0.10% 98.93%    245 301382 org.apache.lucene.util.BitSet.or
16  0.09% 99.02%    219 301381 org.apache.lucene.codecs.blocktree.SegmentTermsEnumFrame.next
17  0.09% 99.11%    207 301370 org.apache.lucene.util.BitSet.or
18  0.06% 99.17%    153 301416 org.apache.lucene.util.BitSet.or
19  0.06% 99.24%    151 301427 org.apache.lucene.util.BitSet.or
20  0.06% 99.30%    151 301441 org.apache.lucene.store.DataInput.readVInt
21  0.06% 99.36%    147 301389 java.lang.System.identityHashCode
22  0.06% 99.42%    140 301375 org.apache.lucene.codecs.blocktree.SegmentTermsEnum.next
23  0.04% 99.47%    104 301425 org.apache.lucene.util.BitSet.or
24  0.03% 99.50%     76 301423 org.apache.lucene.codecs.lucene50.Lucene50PostingsReader$BlockDocsEnum.nextDoc
25  0.03% 99.53%     74 301454 org.apache.lucene.search.MultiTermQueryConstantScoreWrapper$1.rewrite
26  0.03% 99.56%     65 301432 org.apache.lucene.util.BitDocIdSet$Builder.or
27  0.02% 99.58%     53 301456 org.apache.lucene.util.FixedBitSet.or
28  0.02% 99.60%     52 300077 java.lang.ClassLoader.defineClass1
29  0.02% 99.63%     50 301464 org.apache.lucene.codecs.lucene50.ForUtil.readBlock
30  0.02% 99.64%     39 301438 org.apache.lucene.codecs.blocktree.SegmentTermsEnum.next
31  0.02% 99.66%     37 301465 org.apache.lucene.codecs.blocktree.SegmentTermsEnumFrame.loadNextFloorBlock
32  0.02% 99.67%     36 301419 org.apache.lucene.codecs.lucene50.Lucene50PostingsReader$BlockDocsEnum.nextDoc
CPU SAMPLES END

On Fri, Jul 31, 2015 at 4:23 PM, Adrien Grand <[email protected]> wrote:
> Hi Tomás,
>
> I suspect this might be related to LUCENE-5938. We changed the default
> rewrite method for multi-term queries to load documents into a sparse
> bit set first, and only upgrade to a dense bit set when we know many
> documents match. When there are lots of terms to intersect, we end up
> spending significant CPU time building the sparse bit set only to
> eventually upgrade to a dense bit set, as before. This might be what
> you are seeing.
>
> You might see the issue less with the population field because it has
> fewer unique values, so postings lists are longer and the DocIdSet
> building logic can upgrade to a dense bit set more quickly.
>
> Mike noticed this slowness when working on BKD trees, and we changed
> this first phase to use a plain int[] array that we sort and
> deduplicate instead of a fancier sparse bit set (LUCENE-6645), which
> seemed to make things faster. Would it be possible for you to also
> check a 5.3 snapshot?
>
> On Fri, Jul 31, 2015 at 10:51 PM, Tomás Fernández Löbbe
> <[email protected]> wrote:
> > Hi, I'm seeing some query performance degradation between 4.10.4 and
> > 5.2.1. It doesn't happen with all the queries, but for queries like
> > range queries on fields with many different values, the average time
> > in 5.2.1 is worse than in 4.10.4. Is anyone seeing something similar?
> >
> > Test Details:
> > * Single thread running queries continuously. I ran the test twice for
> >   each Solr version.
> > * JMeter running on my laptop, Solr running on EC2, on an m3.xlarge
> >   instance with all the defaults but with a 5G heap. Index on local
> >   disk (SSD).
> > * Plain Solr releases, nothing custom. Single Solr core, not in
> >   SolrCloud mode, no distributed search.
> > * "allCountries" geonames dataset (~8M small docs). No updates during
> >   the test. Index size is around 1.1GB for Solr 5.2.1 and 1.3GB for
> >   Solr 4.10.4 (fits entirely in RAM).
> > * jdk1.8.0_45
> >
> > Queries: 3k boolean queries (generated with terms from the dataset)
> > with range queries as filters on the "tlongitude" and "tlatitude"
> > fields, with randomly generated bounds, e.g.
> > q=name:foo OR name:bar&fq=tlongitude:[W TO X]&fq=tlatitude:[Y TO Z]
> >
> > Fields are:
> > <field name="tlatitude" type="tdouble"/>
> > <field name="tlongitude" type="tdouble"/>
> > Field type:
> > <fieldType name="tdouble" class="solr.TrieDoubleField" precisionStep="8"
> >   positionIncrementGap="0"/>
> >
> > In this case, Solr 4.10.4 was between 20% and 30% faster than 5.2.1 on
> > average.
> >
> > http://snag.gy/2yPPM.jpg
> >
> > Doing only the boolean queries shows no performance difference between
> > 4.10 and 5.2; same thing if I use filters on a string field instead of
> > the range queries.
> >
> > When using the "double" field type (precisionStep="0"), the difference
> > was bigger:
> >
> > longitude/latitude fields:
> > <field name="longitude" type="double" docValues="true"/>
> > <field name="latitude" type="double" docValues="true"/>
> > <fieldType name="double" class="solr.TrieDoubleField" precisionStep="0"
> >   positionIncrementGap="0"/>
> >
> > http://snag.gy/Vi5uk.jpg
> >
> > I understand this is not the best field type definition for range
> > queries; I'm just trying to understand the difference between the two
> > versions and why.
> >
> > Performance was OK when doing range queries on the "population" field
> > (long), but that field doesn't have many different values: only 300k
> > out of the 8.3M docs have a population value different from 0. On the
> > other hand, doing range queries on the _version_ field did show a
> > graph similar to the previous one:
> >
> > <field name="_version_" type="long" indexed="true" stored="true"/>
> > <fieldType name="long" class="solr.TrieLongField" precisionStep="0"
> >   positionIncrementGap="0"/>
> >
> > http://snag.gy/4tc7e.jpg
> >
> > Any idea what could be causing this? Is this expected after some known
> > change?
> >
> > With Solr 4.10, a single CPU core remains high during the test (close
> > to 100%), but with Solr 5.2, different cores go up and down in
> > utilization continuously. That's probably because of the different
> > Jetty version, I suppose. GC patterns look similar in both. For both
> > Solr versions I'm using the settings that ship with Solr (in
> > solr.in.sh) except for Xmx and Xms.
>
> --
> Adrien
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [email protected]
> For additional commands, e-mail: [email protected]
>
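To make the rewrite change Adrien describes above (LUCENE-5938) more concrete, here is a rough sketch of the "collect into a sparse structure, upgrade to a dense bit set once many documents match" pattern. This is illustrative only, not the actual Lucene code: the class name, the upgrade threshold, and the use of TreeSet/BitSet as stand-ins for Lucene's sparse and fixed bit sets are all assumptions.

  import java.util.BitSet;
  import java.util.TreeSet;

  // Illustrative only -- NOT the actual LUCENE-5938 code. TreeSet and BitSet
  // stand in for Lucene's sparse and dense (FixedBitSet) structures, and the
  // upgrade threshold below is an arbitrary assumption.
  class SparseThenDenseCollector {
      private final int maxDoc;
      private final int upgradeThreshold;
      private TreeSet<Integer> sparse = new TreeSet<>();
      private BitSet dense;

      SparseThenDenseCollector(int maxDoc) {
          this.maxDoc = maxDoc;
          this.upgradeThreshold = Math.max(1, maxDoc >>> 7); // assumed heuristic: ~1/128 of maxDoc
      }

      void collect(int docId) {
          if (dense != null) {
              dense.set(docId);          // cheap once the dense set exists
              return;
          }
          sparse.add(docId);             // per-document overhead while still sparse
          if (sparse.size() > upgradeThreshold) {
              // Upgrade: replay everything collected so far into a dense bit set.
              dense = new BitSet(maxDoc);
              for (int doc : sparse) {
                  dense.set(doc);
              }
              sparse = null;
          }
      }
  }

With a high-cardinality field like the precisionStep="0" doubles, the multi-term rewrite visits many short postings lists, so most of the per-document work would happen in the sparse phase before the upgrade; that would be consistent with the BitSet.or and MultiTermQueryConstantScoreWrapper$1.rewrite samples showing up in the 5.2.1 profile above, while the population field upgrades sooner because its postings lists are longer.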

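And a similarly simplified sketch of the direction Adrien describes for LUCENE-6645: buffer matching doc IDs in a plain int[] during the first phase, then sort and deduplicate once at the end. Again illustrative only; the growth policy and method names are assumptions, not the actual patch.

  import java.util.Arrays;

  // Illustrative only -- NOT the actual LUCENE-6645 patch. Doc IDs are
  // appended to a plain int[] and sorted + deduplicated once at the end.
  class BufferThenSortCollector {
      private int[] buffer = new int[64];
      private int size = 0;

      void collect(int docId) {
          if (size == buffer.length) {
              buffer = Arrays.copyOf(buffer, size << 1); // grow geometrically
          }
          buffer[size++] = docId;                        // append: a single array write
      }

      // Produce the final sorted, duplicate-free doc ID list.
      int[] finish() {
          Arrays.sort(buffer, 0, size);
          int unique = 0;
          for (int i = 0; i < size; i++) {
              if (i == 0 || buffer[i] != buffer[i - 1]) {
                  buffer[unique++] = buffer[i];
              }
          }
          return Arrays.copyOf(buffer, unique);
      }
  }

Since the per-document cost here is a single array write rather than maintaining a sparse set, this is the change that makes re-running the test against a 5.3 snapshot worthwhile.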