Hi all,
I'm trying to build an (Elastic) completion suggester that uses contexts in
completion queries to implement authorization for these suggestions.
Basically, I only want suggestions from the contexts where the user has
rights.
(Not sure if this is the best way; suggestions (no pun intended) welcome.)
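The usual shape for this is a category context on the completion field, with the user's groups supplied per request so only authorized contexts are searched. A sketch against the Elasticsearch context suggester API; the index, field, and context names here are made up:

```json
PUT authz-suggest
{
  "mappings": {
    "properties": {
      "suggest": {
        "type": "completion",
        "contexts": [
          { "name": "group", "type": "category" }
        ]
      }
    }
  }
}

POST authz-suggest/_search
{
  "suggest": {
    "titles": {
      "prefix": "luc",
      "completion": {
        "field": "suggest",
        "contexts": {
          "group": ["groupA", "groupB"]
        }
      }
    }
  }
}
```

Context filtering is an OR over the supplied values, so passing exactly the user's group list restricts suggestions to documents indexed under at least one of those groups.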
can try this to see if that helps. But I doubt that in this case.
On opening the issue, I am working through some reproducible benchmarks
HNSW graph searching is crazy slow...
Mike McCandless
http://blog.mikemccandless.com
On Sun, Sep 29, 2024 at 4:06 AM Navneet Verma <vermanavneet...@gmail.com> wrote:
Hi Lucene Experts,
I wanted to understand the performance difference between opening and
reading the whole file using an IndexInput
to ensure that checksumming is always done with IOContext.READ_ONCE
(which uses READ behind the scenes).
Uwe
On 29.09.2024 at 17:09, Michael McCandless wrote
use MADV_RANDOM (which is stupid), that is indeed expected to perform worse
since there is no readahead pre-caching. 50% worse (what you are seeing)
is indeed quite an impact ...
May
*is to use RANDOM for normal reading of the
index and use the other IOContexts only for merging. If this requires files
to be opened multiple times it is a better compromise.*
Yeah, I was thinking of doing something similar. But I am not 100% sure
what would be the performance degradation of opening files
all bytes/pages in .vec/.veq files -- this asks the OS to cache
all of those bytes into page cache (if there is enough free RAM). We do
this at Amazon (product search) for our production searching processes.
Otherwise paging in all .vec/.veq pages via random access provoked through
HNSW graph searching is crazy slow...
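The preload trick Mike describes (Lucene's MMapDirectory exposes a similar preload option) has a plain java.nio analogue: map the file and call MappedByteBuffer.load(), which asks the OS to fault every mapped page into the page cache up front instead of paying a page fault per random access later. A minimal sketch, not Lucene's actual code:

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class PreloadDemo {
    public static void main(String[] args) throws IOException {
        // Write a small stand-in for a .vec file (real files are much larger).
        Path vec = Files.createTempFile("demo", ".vec");
        byte[] data = new byte[1 << 20]; // 1 MiB of "vector" bytes
        for (int i = 0; i < data.length; i++) data[i] = (byte) i;
        Files.write(vec, data);

        try (FileChannel ch = FileChannel.open(vec, StandardOpenOption.READ)) {
            MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
            // Ask the OS to fault all mapped pages into the page cache now,
            // so later random accesses (e.g. HNSW hops) hit warm pages.
            buf.load();
            System.out.println(buf.get(123456) == (byte) 123456);
        } finally {
            Files.deleteIfExists(vec);
        }
    }
}
```

load() is advisory (the OS may still evict pages under memory pressure), which matches the "if there is enough free RAM" caveat above.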
Hi Lucene Experts,
I wanted to understand the performance difference between opening and
reading the whole file using an IndexInput with IOContext as RANDOM vs READ.
I can see that .vec files (storing the flat vectors) are opened with RANDOM,
whereas .dvd files are opened as READ. As per my testing
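One thing worth keeping in mind for this question: the IOContext hint does not change which bytes a read returns, only how the OS is advised to read ahead and cache. A plain java.nio sketch (not Lucene's IndexInput API) of the two access styles, showing that positional "RANDOM"-style reads and one sequential "READ"-style pass see the same data, so the performance difference comes purely from the readahead/caching advice:

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class AccessPatternDemo {
    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("demo", ".bin");
        byte[] data = new byte[8192];
        for (int i = 0; i < data.length; i++) data[i] = (byte) (i * 31);
        Files.write(file, data);

        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            // Sequential "READ"-style access: one big forward read.
            ByteBuffer all = ByteBuffer.allocate(data.length);
            while (all.hasRemaining() && ch.read(all) != -1) { }

            // "RANDOM"-style access: positional reads at scattered offsets.
            ByteBuffer one = ByteBuffer.allocate(1);
            boolean same = true;
            for (int pos : new int[] {8191, 5, 4096, 1234}) {
                one.clear();
                ch.read(one, pos); // does not move the channel position
                same &= one.get(0) == all.get(pos);
            }
            System.out.println("random reads match sequential: " + same);
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```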
Hi Lucene Experts,
I have a question about the following change:
Lucene 9.11 changed the postings list format
(Lucene GitHub #12696 <https://github.com/apache/lucene/pull/12696>: Change
postings back to using FOR in Lucene99PostingsFormat; freqs, positions and
offsets keep using PFOR).
However, in our (MongoDB Atlas Search) internal performance testing, we saw
an increase of query latency of up to 32% on match-all and match-many inverted
index based queries, e.g. query.phrase-slop-0 and query.date-facet-match-all.
I wonder if the community sees similar performance regressions on some
queri
s.
Mike McCandless
http://blog.mikemccandless.com
On Tue, Dec 12, 2023 at 4:36 PM Marc Davenport wrote:
Hello,
We have a search application built around Lucene 8. Motivated by the list
of performance enhancements and optimizations in the change notes, we
upgraded from 8.1 to 8.11.2. We track the performance of different
activities within our application and can clearly see an improvement in our
(https://github.com/apache/lucene/issues/10297) I understand that this is
essentially causing disk access for every single byte during readByte().
Does this warrant a JIRA for regression?
As mentioned, I am noticing a 10x slowdown in SegmentTermsEnum.seekExact()
affecting atomic update performance. For setups like mine that can't use
mmap due
Yes, this changed in 8.x:
- 8.0 moved the terms index off-heap for non-PK fields with
MMapDirectory. https://github.com/apache/lucene/issues/9681
- Then in 8.6 the FST was moved off-heap all the time.
https://github.com/apache/lucene/issues/10297
More generally, there's a few files that are no l
Thanks Adrien. Is this behavior of FST something that has changed in Lucene
8.x (from 7.x)?
Also, is the terms index not loaded into memory anymore in 8.x?
To your point on MMapDirectoryFactory, it is much faster as you
anticipated, but the indexes commonly being >1 TB makes the Windows machine
fr
+Alan Woodward helped me better understand what is going on here.
BufferedIndexInput (used by NIOFSDirectory and SimpleFSDirectory)
doesn't play well with the fact that the FST reads bytes backwards:
every call to readByte() triggers a refill of 1kB because it wants to
read the byte that is just before the current buffer.
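A toy model of this (plain Java, not Lucene's actual BufferedIndexInput, just the same forward-filling buffering policy) makes the pathology concrete: reading N bytes forward costs roughly N/1024 refills, while reading the same N bytes backwards costs one refill per byte, because the wanted byte always sits just before the buffered window:

```java
public class BackwardReadDemo {
    static final int BUFFER_SIZE = 1024;

    // A toy forward-filling buffer over an in-memory "file".
    static class BufferedInput {
        final byte[] file;
        final byte[] buffer = new byte[BUFFER_SIZE];
        long bufferStart = -BUFFER_SIZE; // nothing buffered yet
        int buffered = 0;
        int refills = 0;

        BufferedInput(byte[] file) { this.file = file; }

        byte readByte(long pos) {
            if (pos < bufferStart || pos >= bufferStart + buffered) {
                // Refill FORWARD starting at pos, like a read-ahead buffer does.
                bufferStart = pos;
                buffered = (int) Math.min(BUFFER_SIZE, file.length - pos);
                System.arraycopy(file, (int) pos, buffer, 0, buffered);
                refills++;
            }
            return buffer[(int) (pos - bufferStart)];
        }
    }

    public static void main(String[] args) {
        byte[] file = new byte[4096];

        BufferedInput fwd = new BufferedInput(file);
        for (int i = 0; i < file.length; i++) fwd.readByte(i);

        BufferedInput bwd = new BufferedInput(file);
        for (int i = file.length - 1; i >= 0; i--) bwd.readByte(i);

        // Forward: one refill per 1kB window. Backward: one refill per byte.
        System.out.println("forward refills: " + fwd.refills);
        System.out.println("backward refills: " + bwd.refills);
    }
}
```

This is why an off-heap FST read through a buffered, non-mmap directory gets hit so hard, while MMapDirectory (no refill, just page-cache access) does not.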
My best guess based on your description of the issue is that
SimpleFSDirectory doesn't like the fact that the terms index now reads
data directly from the directory instead of loading the terms index in
heap. Would you be able to run the same benchmark with MMapDirectory
to check if it addresses th
Hello,
We started experiencing slowness with atomic updates in Solr after
upgrading from 7.7.2 to 8.11.1. Running several tests revealed the
slowness to be in RealTimeGet's SolrIndexSearcher.getFirstMatch() call,
which eventually calls Lucene's SegmentTermsEnum.seekExact().
In the benchmarks I ran
: Performance Comparison of Benchmarks by using Lucene 9.1.0 vs 8.5.1
https://home.apache.org/~mikemccand/lucenebench/ shows how various
benchmarks have evolved over time *on the main branch*. There is no
direct comparison of every version against every other version that I
have seen though.
On Tue, Jul 26, 2022 at 2:12 PM Baris Kazar wrote:
Dear Folks,
Similar question to my previous post: this time I wonder if there is a Lucene
web site where benchmarks are run against these two versions of Lucene.
I see many (44+16) API changes and (48+9) improvements and (16+15) bug fixes,
which sounds great.
Best regards
May 2021 13:55
To: Michael McCandless ; Lucene Users
Subject: RE: Performance decrease with NRT use-case in 8.8.x (coming from 8.3.0)
Hi,
thanks for replying that fast!
Your hint that there were changes to NRTCachingDirectory was the right point:
I copied the 8.3 NRTCachingDirectory impl
slow-down.
> I’ll report here.
>
> Bye,
>
> Markus
>
>
> From: Michael McCandless
> Sent: Wednesday, 19 May 2021 13:39
> To: Lucene Users ; Gietzen, Markus <
> markus.giet...@softwareag.com>
> Subject: Re: Performance decrease with NRT use-case in 8.8.x (coming from
&
: Performance decrease with NRT use-case in 8.8.x (coming from 8.3.0)
> The update showed no issues (e.g. compiled without changes) but I noticed
> that our test-suites take a lot longer to finish.
Hmm, that sounds bad. We need our tests to stay fast but also do a good job
testing things ;)
Doe
without changes) but I noticed
that our test-suites take a lot longer to finish.
So I took a closer look at one test-case which showed a severe slowdown
(it's doing small update, flush, search cycles in order to stress NRT;
the purpose is to see performance-changes at an early stage 😉):
Lucene 8.3: ~2.3s
Lucene 8.8.x: 25s
This is a huge difference. Therefore I used YourKit to profile 8.3 and 8.8 and
do a comparison.
The gap is
Hi!
After upgrading our ES cluster from 6.2 to 7.9, we found that the force-merge
operation takes a long time, about double the previous latency.
Based on our investigation, we found that the following is the main cause of
the force-merge performance decrease:
* From Lucene 8.0, NormsProducer is added as
exists query like this, which is
fully in line with your investigation: if a field has doc values it uses
DocValuesFieldExistsQuery; if it is a tokenized field it uses
NormsFieldExistsQuery. The negative one is a must-not clause, which is
perfectly fine performance-wise.
An alternative way to search is indexing all field names that have a value into
a separate StringField. But this needs preprocessing.
https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-exists-query.html
https://issues.apache.org/jira/browse/SOLR
That's great Rob! Thanks for bringing closure.
Mike McCandless
http://blog.mikemccandless.com
On Fri, Nov 13, 2020 at 9:13 AM Rob Audenaerde wrote:
To follow up, based on a quick JMH test with 2M docs with some random data
I see a speedup of 70% :)
That is a nice Friday-afternoon gift, thanks!
For ppl that are interested, I added a BinaryDocValues field like this:
doc.add(new BinaryDocValuesField("GROUPS_ALLOWED_EMPTY", new BytesRef(new byte[] {0x01})));
Maybe NormsFieldExistsQuery as a MUST_NOT clause? Though, you must enable
norms on your field to use that.
TermRangeQuery is indeed a horribly costly way to execute this, but if you
cache the result on each refresh, perhaps it is OK?
You could also index a dedicated doc values field indicating t
Hi all,
We have implemented some security on our index by adding a field
'groups_allowed' to documents, and wrap a boolean must query around the
original query, that checks if one of the given user-groups matches at
least one groups_allowed.
We chose to leave the groups_allowed field empty when t
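The intended matching semantics boil down to simple set logic. A sketch in plain Java, independent of the Lucene query classes, assuming (as the truncated sentence above suggests) that an empty groups_allowed means the document is visible to everyone:

```java
import java.util.Set;

public class GroupsAllowedDemo {
    // A document is visible if it declares no groups (public) or
    // shares at least one group with the user.
    static boolean visible(Set<String> groupsAllowed, Set<String> userGroups) {
        if (groupsAllowed.isEmpty()) {
            return true; // empty groups_allowed = public document (assumed)
        }
        for (String g : userGroups) {
            if (groupsAllowed.contains(g)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Set<String> user = Set.of("sales", "hr");
        System.out.println(visible(Set.of(), user));           // public doc
        System.out.println(visible(Set.of("hr", "it"), user)); // shared group
        System.out.println(visible(Set.of("it"), user));       // no overlap
    }
}
```

In Lucene terms this is the wrapped boolean filter from the thread: a disjunction over the user's groups on the groups_allowed field, OR-ed with the "is public" marker field.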
On Tue, Oct 13, 2020 at 11:48 AM Adrien Grand wrote:
Can you give us a few more details:
- What version of Lucene are you testing?
- Are you benchmarking "restrictionQuery" on its own, or its conjunction
with another query?
You mentioned that you combine your "restrictionQuery" and the user query
with Occur.MUST; Occur.FILTER feels more appropriate for "restrictionQuery"
since it should not contribute to scoring.
TermInSetQuery automatically executes like a BooleanQuery when the number
of clauses is less than 16, so I would not expect major performance
differences between a TermInSetQuery over less than 16 te
I'm having some performance issues when counting the index (>60M docs), so
I thought about tweaking this restriction implementation.
I set up a benchmark like this:
I generate 2M documents. Each document has a multi-value "roles" field. The
"roles" field in each docu
On Mon, 27 Jul 2020 at 19:24, Adrien Grand wrote:
It's interesting you're not seeing the same slowdown on the other field.
How hard would it be for you to test what the performance is if you
lowercase the name of the digest algorithms, i.e. "md5;[md5 value in hex]",
etc.? The reason I'm asking is because the compression logic i
multiple runs?
On Mon, Jul 27, 2020 at 5:57 AM Alex K wrote:
Hi,
Also have a look here:
https://issues.apache.org/jira/plugins/servlet/mobile#issue/LUCENE-9378
Seems it might be related.
- Alex
On Sun, Jul 26, 2020, 23:31 Trejkaz wrote:
Hi all.
I've been tracking down slow seeking performance in TermsEnum after
updating to Lucene 8.5.1.
On 8.5.1:
SegmentTermsEnum.seekExact: 33,829 ms (70.2%) (remaining time in our code)
SegmentTermsEnumFrame.loadBlock: 29,104 ms (60.4%)
CompressionAlgorithm$2
that at least 2 should match.
In terms of semantics, what I understand so far is that
(A B C)~2 is equivalent to ((+A +B) (+A +C) (+B +C)).
In other words, a single BooleanQuery with a minimum-should-match parameter
could be rewritten as a pure disjunctive BooleanQuery comprised of 3
sub-queries.
In terms of performance it seems that the two queries present different
behavior, so the minMatch
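The claimed rewrite is easy to sanity-check by enumerating all truth assignments of the three clauses (plain Java; this verifies matching semantics only and says nothing about scoring or performance):

```java
public class MinShouldMatchDemo {
    public static void main(String[] args) {
        boolean equivalent = true;
        // Enumerate all 8 truth assignments for clauses A, B, C.
        for (int mask = 0; mask < 8; mask++) {
            boolean a = (mask & 1) != 0;
            boolean b = (mask & 2) != 0;
            boolean c = (mask & 4) != 0;
            // (A B C)~2 : at least two SHOULD clauses match.
            boolean msm = (a ? 1 : 0) + (b ? 1 : 0) + (c ? 1 : 0) >= 2;
            // ((+A +B) (+A +C) (+B +C)) : disjunction of pairwise conjunctions.
            boolean rewritten = (a && b) || (a && c) || (b && c);
            equivalent &= (msm == rewritten);
        }
        System.out.println("equivalent: " + equivalent);
    }
}
```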
I saw this issue with this class such that if you search for "term1*"
it is good (i.e., ~4 millisecs when term1 has >= 5 chars and ~250
millisecs when it is 2 chars),
but when you search for "term1 term2*" where term2 is a single
char, the performance degrades too much.
On Feb 4, 2020, at 4:14 AM, Mikhail Khludnev wrote:
It's slow per se, since it loads term positions. Usual advices are
shingling or edge ngrams. Note, these can be smarter and faster in certain
cases, although they are backed by the same slow positions.
On Tue, Feb 4, 2020 at 7:25 AM wrote:
How can this slowdown be resolved?
Is this another limitation of this class?
Thanks
On Feb 3, 2020, at 4:14 PM, baris.ka...@oracle.com wrote:
Please ignore the first comparison there. I was comparing there {term1
when it is 2 chars), but when you search for "term1 term2*" where term2 is
a single char, the performance degrades too much.
The query "term1 term2*" slows down 50 times (~200 millisecs) compared
to the "term1*" case when term1 has >5 chars and term2 is still 1 char.
The query "term1 term2*" slows down 400 times (~1500 millisecs) compared
https://www.elastic.co/blog/faster-retrieval-of-top-hits-in-elasticsearch-with-block-max-wand
Uwe
On April 14, 2019 2:22:59 PM UTC, Khurram Shehzad wrote:
Hi All,
I have recently updated from lucene-7.5.0 to lucene-8.0.0. But I noticed
considerable performance degradation. Queries that used to execute in 18 to 24
milliseconds are now taking 74 to 110 milliseconds.
Any suggestion please?
Regards,
Khurram
circumstances
where fetching from docValues actually has poorer overall performance
than using stored=true.
That said, the ability to use docValues fields in place of stored
(subject to certain restrictions that you should take the time to
understand) does indeed blur the distinction.
It's rea
scenario (I think
it is more common nowadays): the search phase should return as many results as
possible so that the rank phase can re-sort the results with a machine-learning
algorithm (on other clusters). Fetching performance is also important.
On Tue, 28 Aug 2018 00:11:40 +0800 Erick Erickson wrote
Alex,-
how big are those docs?
Best regards
On 8/27/18 10:09 AM, alex stark wrote:
Hello experts, I am wondering if there is any way to improve document-fetching
performance; it appears to me that visiting stored fields is quite slow. I
simply tested using IndexSearcher.doc() to get 2000 documents, which takes 50 ms.
Is there any idea to improve that?
I am seeing serious performance differences with three slightly varied
queries:
https://gist.github.com/darkfrog26/de19959db854aaf30957d64d1730d07f
Can anyone explain why this might be happening and any tips to optimize
it? Most queries are lightning fast, but ones like "Smith Mark D
LUCENE-8396 looks pretty good for LBS use cases; do we have performance results
for this approach? It appears to me it would greatly reduce the terms needed to
index a polygon. And how about search performance? Does it also perform well for
complex polygons which have hundreds or more coordinates?
Hi
How are indexed and stored fields treated by Lucene w.r.t space and
performance?
Is there any performance hit with stored fields which are indexed?
Lucene Version: 5.3.1
Assumption:
Stored fields are just simple strings (not huge documents)
Example:
Data: [101, Gold]; [102
Thanks for the feedback!
-Original Message-
From: Adrien Grand [mailto:jpou...@gmail.com]
Sent: Friday, February 02, 2018 1:42 PM
To: java-user@lucene.apache.org
Subject: Re: Increase search performance
If needsScores returns false on the collector, then scores won't be computed.
this.docBase = context.docBase;
}

public ScoreDoc[] getHits() {
    return matches;
}
}

Best Regards,
Atul Bisaria

-Original Message-
From: Adrien Grand [mailto:jpou...@gmail.com]
Se
-Original Message-
From: Adrien Grand [mailto:jpou...@gmail.com]
Sent: Thursday, February 01, 2018 6:11 PM
To: java-user@lucene.apache.org
Subject: Re: Increase search performance
Yes, this collector won't perform well if you have many matches, since memory
usage is linear with the number of
shuffle(matches);
maxHitsRequired = Math.min(matches.size(), maxHitsRequired);
return matches.subList(0, maxHitsRequired);
}
}

Best Regards,
Atul Bisaria

-Original Message-
From: Adrien Grand [ma
);
}
}
Best Regards,
Atul Bisaria
-Original Message-
From: Adrien Grand [mailto:jpou...@gmail.com]
Sent: Wednesday, January 31, 2018 6:33 PM
To: java-user@lucene.apache.org
Subject: Re: Increase search performance
Hi Atul,
On Tue, 30 Jan 2018 at 16:24, Atul Bisaria wrote:
a transaction log in parallel to indexing,
so they commit very seldom. If the system crashes, the changes are replayed
from the tranlog since the last commit.
Uwe
don't sort by score, then wrapping with a ConstantScoreQuery won't
help, as Lucene will figure out scores are not needed anyway.
2. Using query cache
My understanding is that query cache would cache query results and hence
lead to a significant increase in pe
In the search use case in my application, I don't need to score query results
since all results are equal. Also, query patterns are more or less fixed.
Given these conditions, I am trying to increase search performance by
1. Using ConstantScoreQuery so that scoring overhe
Uwe Schindler
Achterdiek 19, D-28357 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de
-Original Message-
From: Rob Audenaerde [mailto:rob.audenae...@gmail.com]
Sent: Monday, January 29, 2018 11:29 AM
To: java-user@lucene.apache.org
Subject: Re: indexing performance 6.6 vs 7.1
create pivot tables on search results really fast.
These tables have some overlapping columns, but also disjoint ones.
We anticipated a decrease in index size because of the sparse doc values. We
see this happening, with decreases to ~50%-80% of the original index size.
1 - 100 of 1582 matches