Hi!
I cannot open indexes created by Lucene 8.5 with Lucene master. I get this
error:
Exception in thread "main" org.apache.lucene.index.CorruptIndexException:
codec mismatch: actual codec=Lucene84PostingsWriterDoc vs expected
codec=Lucene90PostingsWriterDoc
(resource=MMapIndexInput(path="C:\data\luc
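Lucene generally guarantees read-compatibility only with the previous major version, and older segments may need to be rewritten before a newer version can open them. A minimal sketch of running the stock IndexUpgrader tool over such an index (the path here is hypothetical; replace it with the real index directory):

```java
import java.nio.file.Paths;
import org.apache.lucene.index.IndexUpgrader;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

public class UpgradeIndex {
    public static void main(String[] args) throws Exception {
        // Hypothetical path; point this at the actual index directory.
        try (Directory dir = FSDirectory.open(Paths.get("C:\\data\\index"))) {
            // Rewrites all segments with the current codec so that the
            // next major version can still open them.
            new IndexUpgrader(dir).upgrade();
        }
    }
}
```

Note this only helps when the running Lucene version can still read the old segments in the first place (i.e. one major version back, with the backward-codecs module on the classpath).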
> > > ...back up and ask what the use-case
> > > is. Returning 6.5M docs to a user is useless, so are you doing
> > > some kind of analytics maybe? In which case, and again
> > > assuming you’re using Solr, Streaming Aggregation might
> > > be a better option.
> > >
> This really sounds like an XY problem. You’re trying to solve problem X
> and asking how to accomplish it with Y. What I’m questioning
> is whether Y (grouping) is a good approach or not. Perhaps if
> you explained X there’d be a better suggestion.
>
> Best,
> Erick
>
> > On Oct 9, 2020, at 8:19 AM, Dmitry Emets wrote:
> >
>
I have 12_000_000 documents, 6_500_000 groups
With sort: It takes around 1 sec without grouping, 2 sec with grouping and
12 sec with setAllGroups(true)
Without sort: It takes around 0.2 sec without grouping, 0.6 sec with
grouping and 10 sec with setAllGroups(true)
Thank you, Erick, I will look into it.
Yes, it is
On Fri, Oct 9, 2020 at 14:25, Diego Ceccarelli (BLOOMBERG/ LONDON) <
dceccarel...@bloomberg.net>:
> Is the field that you are using to dedupe stored as a docvalue?
>
> From: java-user@lucene.apache.org At: 10/09/20 12:18:04To:
> java-user@lucene.apache.org
> Subject: Deduplication of search results
Hi,
I need to deduplicate search results by a specific field and I have no idea
how to implement this properly.
I have tried grouping with setGroupDocsLimit(1) and it gives me the expected
results, but the performance is not very good.
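For reference, a minimal sketch of that grouping approach with Lucene's GroupingSearch, assuming a doc-values group field named "dedupeField" (the field name and method are hypothetical placeholders):

```java
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.Sort;
import org.apache.lucene.search.grouping.GroupingSearch;
import org.apache.lucene.search.grouping.TopGroups;

public class DedupeByGrouping {
    // Returns the top n groups, each contributing its single best document.
    static TopGroups<?> dedupe(IndexSearcher searcher, Query query, int n)
            throws Exception {
        GroupingSearch grouping = new GroupingSearch("dedupeField"); // hypothetical field
        grouping.setGroupSort(Sort.RELEVANCE);
        grouping.setGroupDocsLimit(1);   // keep only one doc per group == dedupe
        // grouping.setAllGroups(true);  // exact total group count, but much slower
        return grouping.search(searcher, query, 0, n);
    }
}
```

As the timings above show, setAllGroups(true) is where most of the cost comes from, so leave it off unless the exact group count is needed.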
I think that I need something like DiversifiedTopDocsCollector, but
suit
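DiversifiedTopDocsCollector is abstract, so using it means subclassing and supplying a per-document key. A sketch keyed on a hypothetical numeric doc-values field "dedupeKey" (class and field names are illustrative, not from the thread):

```java
import java.io.IOException;
import org.apache.lucene.index.DocValues;
import org.apache.lucene.index.LeafReaderContext;
import org.apache.lucene.index.NumericDocValues;
import org.apache.lucene.search.DiversifiedTopDocsCollector;

public class FieldDiversifiedCollector extends DiversifiedTopDocsCollector {
    public FieldDiversifiedCollector(int numHits) {
        // At most 1 hit per key, which amounts to deduplication.
        super(numHits, 1);
    }

    @Override
    protected NumericDocValues getKeys(LeafReaderContext context) {
        try {
            // "dedupeKey" is a hypothetical numeric doc-values field.
            return DocValues.getNumeric(context.reader(), "dedupeKey");
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }
}
```

One caveat: this collector requires a numeric key, so a string dedupe field would first need to be mapped to a stable numeric value (e.g. an ordinal), which may or may not fit the use case here.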