IIRC, it's the number of documents marked with a "deleted" bit. They are
obliterated during merges as segments written during the merge operation no
longer include deleted contents. So e.g. if you call forceMerge(1), no
previous segment is preserved and the deleted count will drop to 0 as a
result.
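A rough sketch of that behavior (assuming Lucene 8.x on the classpath; the field name "id" is made up for illustration):

```java
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.StringField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.Term;
import org.apache.lucene.store.ByteBuffersDirectory;
import org.apache.lucene.store.Directory;

public class ForceMergeDeletesDemo {
  public static void main(String[] args) throws Exception {
    Directory dir = new ByteBuffersDirectory();
    try (IndexWriter writer = new IndexWriter(dir,
        new IndexWriterConfig(new StandardAnalyzer()))) {
      for (int i = 0; i < 10; i++) {
        Document doc = new Document();
        doc.add(new StringField("id", Integer.toString(i), Field.Store.NO));
        writer.addDocument(doc);
      }
      writer.commit(); // flush a segment containing all ten documents

      writer.deleteDocuments(new Term("id", "3"));
      writer.commit(); // doc 3 is now only *marked* deleted
      try (DirectoryReader reader = DirectoryReader.open(dir)) {
        System.out.println("deleted: " + reader.numDeletedDocs()); // 1
      }

      writer.forceMerge(1); // rewrite everything into a single segment
      writer.commit();
      try (DirectoryReader reader = DirectoryReader.open(dir)) {
        System.out.println("deleted: " + reader.numDeletedDocs()); // 0
      }
    }
  }
}
```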
It looks like Sascha runs IndexUpgrader for all major versions, i.e. 6.6.6,
7.7.3 and 8.11.1. File "segments_91" is written by the 7.7.3 run
immediately before the error.
On Wed, Jan 12, 2022 at 3:44 PM Adrien Grand wrote:
> The log says what the problem is: version 8.11.1 cannot read indices
> c
Hi Baris,
Explanation's output is hierarchical, and the leading "0.0" values you
are seeing are the individual contributions of each boolean clause or
any other nested query.
Going from bottom to top:
Term query on countryDFLT = 'states', but no term matched this value
--> score is 0.0 for the t
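One way to look at that tree is to print the full Explanation for a hit; a sketch (the helper name is illustrative):

```java
import java.io.IOException;
import org.apache.lucene.search.Explanation;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.TopDocs;

// Dump the explanation tree for the best hit; each nested line is one
// clause's contribution, so per-clause "0.0" entries just mean that
// clause did not match this document.
public class ExplainDemo {
  static void explainTopHit(IndexSearcher searcher, Query query) throws IOException {
    TopDocs top = searcher.search(query, 1);
    if (top.scoreDocs.length == 0) {
      return; // nothing matched at all
    }
    Explanation explanation = searcher.explain(query, top.scoreDocs[0].doc);
    System.out.println(explanation); // toString() indents one line per nested clause
  }
}
```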
> > But it can be workable, if I manage to apply context condition
> separately.
> >
> >
> > More probably using custom filtering through Collector interface
> https://lucene.apache.org/core/7_3_1/core/org/apache/lucene/
> search/Collector.html.
> >
> >
> > Any idea please.
> >
> >
> > Regards,
> > Khurram
>
>
>
> --
> Tomoko Uchida
>
> -
> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> For additional commands, e-mail: java-user-h...@lucene.apache.org
>
>
--
András Péteri
An n-gram tokenizer/filter might also work for you:
http://lucene.apache.org/core/7_3_1/analyzers-common/org/apache/lucene/analysis/ngram/NGramTokenizer.html
Regards,
András
On Wed, Jun 20, 2018 at 11:53 AM, Markus Jelsma
wrote:
> Hi Egorlex,
>
> Set the tokenSeparator to "" and ShingleFilter w
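A sketch of both suggestions against the 7.x analyzers-common API (input strings and gram sizes are made up):

```java
import java.io.StringReader;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.core.WhitespaceTokenizer;
import org.apache.lucene.analysis.ngram.NGramTokenizer;
import org.apache.lucene.analysis.shingle.ShingleFilter;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;

// Character n-grams via NGramTokenizer, and word shingles glued
// together with an empty token separator.
public class GramDemo {
  static void printTokens(TokenStream stream) throws Exception {
    CharTermAttribute term = stream.addAttribute(CharTermAttribute.class);
    stream.reset();
    while (stream.incrementToken()) {
      System.out.println(term.toString());
    }
    stream.end();
    stream.close();
  }

  public static void main(String[] args) throws Exception {
    NGramTokenizer ngrams = new NGramTokenizer(2, 3); // 2- and 3-char grams
    ngrams.setReader(new StringReader("lucene"));
    printTokens(ngrams); // lu, luc, uc, uce, ...

    WhitespaceTokenizer words = new WhitespaceTokenizer();
    words.setReader(new StringReader("foo bar baz"));
    ShingleFilter shingles = new ShingleFilter(words, 2, 2);
    shingles.setTokenSeparator(""); // "foo bar" becomes "foobar"
    printTokens(shingles); // unigrams plus concatenated shingles
  }
}
```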
Hi Avarinth,
There is an open issue to encrypt index files using AES, don't know if
that would fit your requirements:
https://issues.apache.org/jira/browse/LUCENE-2228
Regards,
András
On Tue, Feb 6, 2018 at 8:32 AM, Michael Wilkowski wrote:
> Hi,
> sorry to say that, but your encryption is not
Hi,
Note that if you are using Lucene directly, 5.x introduced LUCENE-6064 [1]
[2], which adds checks to ensure that the sort field has a corresponding
DocValue of the expected type. Indexed fields can only be used for sorting
via an UninvertingReader, at a cost of increased heap usage [3]. Solr
h
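A sketch of the UninvertingReader route (5.x/6.x misc-module API; the field name and type mapping are made up for illustration):

```java
import java.io.IOException;
import java.util.Collections;
import java.util.Map;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.uninverting.UninvertingReader;

// Wrap the reader so an indexed field without doc values can still back
// a sort, at the price of building the uninverted view on the heap.
public class UninvertDemo {
  static IndexSearcher searcherWithSortableField(DirectoryReader reader) throws IOException {
    Map<String, UninvertingReader.Type> mapping =
        Collections.singletonMap("price", UninvertingReader.Type.LONG);
    return new IndexSearcher(UninvertingReader.wrap(reader, mapping));
  }
}
```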
ess the solution
> should be explicitly use getCommitData for each sub-index, then set it into
> new consolidated search database, right?
>
> Best,
>
> --Xiaolong
>
>
> On Tue, Nov 22, 2016 at 12:10 PM, András Péteri wrote:
>
>> Hi Xiaolong,
>>
>> A Map o
>> > I am wondering whether IndexWriter can also merge this non-index file
>> > while it is merging multiple search indexes?
>> >
>> > And if I am stepping back a little bit, what's the best way t
ter all?
> >
> > TX
> >
> >
> >
>
--
András Péteri
rch. It’s caused all sorts of head-scratching
> till we discovered what’s going on.
>
> Craziness.
>
> ~ David
> --
> Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
> LinkedIn: http://linkedin.com/in/davidwsmiley | Book:
> http://www.solrenterprisesearchserver.com
>
--
András Péteri
l.com]
> > >>> Sent: Wednesday, October 21, 2015 7:03 PM
> > >>> To: java-user@lucene.apache.org
> > >>> Subject: ConjunctionScorer access
> > >>>
> > >>> It's a bummer Lucene makes the constructor of ConjunctionScorer
> > >>> non-public. I wanted to extend from this class in order to tweak its
> > >>> behavior for my use case. Is it possible to change it to protected in
> > >>> future releases?
> > >>
> > >>
> > >>
> > >>
> >
> >
>
--
András Péteri
Hi Napoli,
You could also create an instance of SearcherManager [1], and let it take
care of tracking IndexSearchers; it can also be used to reopen the
underlying readers, and close them when they are no longer in use. Calling
maybeRefresh() or maybeRefreshBlocking() on the manager ensures that a
r
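A sketch of the acquire/release pattern (8.x API; the class and field names are illustrative):

```java
import java.io.IOException;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.SearcherFactory;
import org.apache.lucene.search.SearcherManager;

// One SearcherManager per index: acquire/release around every search,
// and maybeRefresh()/maybeRefreshBlocking() after updates.
public class SearchService {
  private final SearcherManager manager;

  public SearchService(IndexWriter writer) throws IOException {
    this.manager = new SearcherManager(writer, new SearcherFactory());
  }

  public int countDocs() throws IOException {
    IndexSearcher searcher = manager.acquire();
    try {
      return searcher.getIndexReader().numDocs();
    } finally {
      manager.release(searcher); // old readers close once unreferenced
    }
  }

  public void refreshAfterUpdate() throws IOException {
    manager.maybeRefreshBlocking(); // make recent changes visible to new acquires
  }
}
```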
If I understand it correctly, the Zoie library [1][2] implements the
"sledgehammer" approach by collecting docValues for all documents when a
segment reader is opened. If you have some RAM to throw at the problem,
this could indeed bring you an acceptable level of performance.
[1] http://senseidb.
Collector's javadoc in Lucene 4.x includes a bare minimum example which
only registers matching documents in a bitset:
https://github.com/apache/lucene-solr/blob/lucene_solr_4_10_4/lucene/core/src/java/org/apache/lucene/search/Collector.java#L85
You'll have to adapt this if you want to use it in L
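On a recent Lucene (8.x), where that 4.x example maps onto SimpleCollector, the adaptation might look roughly like this:

```java
import org.apache.lucene.index.LeafReaderContext;
import org.apache.lucene.search.ScoreMode;
import org.apache.lucene.search.SimpleCollector;
import org.apache.lucene.util.FixedBitSet;

// Record matching documents in a bitset, using the segment's docBase to
// map per-segment doc ids to index-wide doc ids.
public class BitSetCollector extends SimpleCollector {
  private final FixedBitSet bits;
  private int docBase;

  public BitSetCollector(int maxDoc) {
    this.bits = new FixedBitSet(maxDoc);
  }

  @Override
  protected void doSetNextReader(LeafReaderContext context) {
    this.docBase = context.docBase; // new segment, new base offset
  }

  @Override
  public void collect(int doc) {
    bits.set(docBase + doc);
  }

  @Override
  public ScoreMode scoreMode() {
    return ScoreMode.COMPLETE_NO_SCORES; // scores are never read
  }

  public FixedBitSet getBits() {
    return bits;
  }
}
```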
Hi,
IndexSearcher.search(Query, Collector) will iterate through all segments of
the index, call getLeafCollector, and use the returned LeafCollector to
collect result documents from that segment [1].
As LeafCollector's javadoc describes [2], there are cases when you want to
take into account prec
As Olivier wrote, multiple BytesRef instances can share the underlying byte
array when representing slices of existing data, for performance reasons.
BytesRef#clone()'s javadoc comment says that the result will be a shallow
clone, sharing the backing array with the original instance, and points
to
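The difference can be seen directly (assuming a recent Lucene on the classpath):

```java
import org.apache.lucene.util.BytesRef;

// clone() is shallow (shares the backing array); deepCopyOf() makes a
// private copy of just the referenced slice.
public class BytesRefCopyDemo {
  public static void main(String[] args) {
    BytesRef original = new BytesRef("term");
    BytesRef shallow = original.clone();           // same backing byte[]
    BytesRef deep = BytesRef.deepCopyOf(original); // independent byte[]
    System.out.println(shallow.bytes == original.bytes); // true
    System.out.println(deep.bytes == original.bytes);    // false
  }
}
```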
Sorry, I also got it wrong in the previous message. :) It goes 0.89f
-> 123 -> 0.875f.
On Thu, Mar 5, 2015 at 10:08 AM, András Péteri
wrote:
> Hi Andrew,
>
> If you are using Lucene 3.6.1, you can take a look at the method which
> creates a single byte value out of the receiv
Hi Andrew,
If you are using Lucene 3.6.1, you can take a look at the method which
creates a single byte value out of the received float using bit
manipulation at [1]. There is also a 256-element decoder table in
Similarity, where each byte corresponds to a decoded float value
computed by [2].
The
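The 0.89f -> 123 -> 0.875f round trip from the correction above can be reproduced in plain Java; this is an illustrative re-implementation of the 3-mantissa-bit, 5-exponent-bit encoding (zero exponent 15) that the 3.6.x Similarity uses, not Lucene's own class:

```java
// Re-implementation (for illustration) of the norm encoding behind
// Lucene 3.x Similarity: 3 mantissa bits, 5 exponent bits, zeroExp 15.
public class NormEncodingDemo {
  private static final int FZERO = (63 - 15) << 3; // smallfloat value of ~1e-5

  static byte floatToByte315(float f) {
    int bits = Float.floatToRawIntBits(f);
    int smallfloat = bits >> 21;        // keep sign, exponent, top 3 mantissa bits
    if (smallfloat <= FZERO) {
      return (bits <= 0) ? (byte) 0 : (byte) 1; // underflow: 0 or smallest positive
    }
    if (smallfloat >= FZERO + 0x100) {
      return -1;                        // overflow: largest representable byte
    }
    return (byte) (smallfloat - FZERO);
  }

  static float byte315ToFloat(byte b) {
    if (b == 0) {
      return 0.0f;
    }
    int bits = (b & 0xff) << 21;
    bits += (63 - 15) << 24;            // restore the exponent offset
    return Float.intBitsToFloat(bits);
  }

  public static void main(String[] args) {
    byte encoded = floatToByte315(0.89f);
    float decoded = byte315ToFloat(encoded);
    System.out.println(encoded + " " + decoded); // 123 0.875
  }
}
```

Dropping all but three mantissa bits is exactly why 0.89f decodes back to 0.875f: the table only holds 256 distinct values, so every norm snaps to the nearest representable one below it.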
Hi,
According to IndexSearcher's code [1], if a Collector implementation is not
interested in collecting document hits from a particular leaf reader, it
can also throw CollectionTerminatedException from
Collector.getLeafCollector(LeafReaderContext). This option is however not
described in Collecto
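A sketch of that option against the 8.x API (the skip condition here is a made-up example):

```java
import java.io.IOException;
import org.apache.lucene.index.LeafReaderContext;
import org.apache.lucene.search.CollectionTerminatedException;
import org.apache.lucene.search.Collector;
import org.apache.lucene.search.LeafCollector;
import org.apache.lucene.search.Scorable;
import org.apache.lucene.search.ScoreMode;

// A Collector that refuses whole segments by throwing
// CollectionTerminatedException from getLeafCollector(); IndexSearcher
// catches it and simply continues with the next leaf.
public class SegmentSkippingCollector implements Collector {
  @Override
  public LeafCollector getLeafCollector(LeafReaderContext context) throws IOException {
    if (context.reader().numDocs() == 0) { // example condition: empty segment
      throw new CollectionTerminatedException();
    }
    return new LeafCollector() {
      @Override public void setScorer(Scorable scorer) {}
      @Override public void collect(int doc) { /* handle a hit in this segment */ }
    };
  }

  @Override
  public ScoreMode scoreMode() {
    return ScoreMode.COMPLETE_NO_SCORES;
  }
}
```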
Hi Clemens,
I think this part of the release notes [1] applies to your case:
* FieldCache is gone (moved to a dedicated UninvertingReader in the misc
module). This means when you intend to sort on a field, you should index
that field using doc values, which is much faster and less heap consuming
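In practice that means indexing the sort key twice; a sketch (field name and values are illustrative):

```java
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.SortedDocValuesField;
import org.apache.lucene.document.StringField;
import org.apache.lucene.search.Sort;
import org.apache.lucene.search.SortField;
import org.apache.lucene.util.BytesRef;

// Add the sort key both as an indexed field (for matching) and as a doc
// value (for sorting); sorting then needs no field cache or uninverting.
public class DocValuesSortDemo {
  static Document makeDoc(String title) {
    Document doc = new Document();
    doc.add(new StringField("title", title, Field.Store.YES));
    doc.add(new SortedDocValuesField("title", new BytesRef(title)));
    return doc;
  }

  static Sort byTitle() {
    return new Sort(new SortField("title", SortField.Type.STRING));
    // usage: searcher.search(query, 10, byTitle());
  }
}
```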
Hello Aurélien,
I believe the approach you described is what Elasticsearch is taking with
nested documents, in addition to indexing parent and child documents in a
single block. See the "sidebar" at the bottom of [1] and the sections
labeled "nested" of [2] for more details.
Michael's blog post o
Hello,
Our application uses Lucene to index documents received from a
back-end that supports storage of temporal data with branches, similar
to revision control systems like SVN: when looking at a single object,
one can choose to either retrieve the current state, go back to a
previous point in ti