Probably not very helpful for the original question, but for the sake of
completeness: you can use the Lucene documentID with the Luke Request
Handler.
https://solr.apache.org/guide/solr/latest/indexing-guide/luke-request-handler.html
You cannot use it as a reliable identifier for your Solr docu
You can find id terms that repeat in an index via the Terms Component
https://solr.apache.org/guide/solr/latest/query-guide/terms-component.html
with terms.mincount=2,
or do the same via facets:
q=*:*&facet=true&facet.field=id&facet.limit=-1&facet.mincount=2 (just off
the top of my head)
Then you can query duplicated ids one by
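
For what it's worth, the facet approach above could be scripted roughly like this. It is only a sketch: the core URL is a placeholder, the function names are made up for illustration, rows=0 is my own addition to keep the response small, and it assumes the default JSON layout where facet_fields is a flat list alternating value and count.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def duplicate_id_url(core_url, field="id"):
    """Build the facet query that lists every value of `field` seen 2+ times."""
    params = {
        "q": "*:*",
        "rows": 0,              # assumption: we only want the facet counts
        "facet": "true",
        "facet.field": field,
        "facet.limit": -1,      # no cap on the number of facet values
        "facet.mincount": 2,    # only values that occur at least twice
        "wt": "json",
    }
    return core_url.rstrip("/") + "/select?" + urlencode(params)


def parse_duplicates(response, field="id"):
    """Turn the flat [value, count, value, count, ...] list into a dict."""
    flat = response["facet_counts"]["facet_fields"][field]
    return dict(zip(flat[0::2], flat[1::2]))


def find_duplicates(core_url, field="id"):
    """Run the query against a live core (core_url is a placeholder)."""
    with urlopen(duplicate_id_url(core_url, field)) as resp:
        return parse_duplicates(json.load(resp), field)
```

Something like find_duplicates("http://localhost:8983/solr/mycore") would then return a mapping of each repeated id to its count, which you can walk through one id at a time. On a large index facet.limit=-1 can be expensive, so treat this as a starting point.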
On 10/22/23 12:25, Gus Heck wrote:
Echoing what Thomas says, this problem indicates your indexing system
probably has a significant design flaw. For most systems, you should have a
notion of document identity that is external to Solr, and that should be
used as (or to deterministically generate) the id in Solr. If you don't do
this
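
To illustrate the "deterministically generate" part, here is one possible sketch. The function name, the idea of combining a source system name with a natural key, and the choice of SHA-256 are all assumptions made for the example, not something prescribed in this thread.

```python
import hashlib


def solr_id(source_system, natural_key):
    """Derive a stable Solr id from an identity that lives outside Solr.

    The same (source_system, natural_key) pair always yields the same id,
    so re-indexing a record overwrites the existing document instead of
    adding a second copy. Both parameter names are hypothetical.
    """
    # A separator that is unlikely to occur in either part keeps
    # ("ab", "c") and ("a", "bc") from producing the same input string.
    raw = f"{source_system}\x1f{natural_key}".encode("utf-8")
    return hashlib.sha256(raw).hexdigest()
```

If the external key is already unique and well formed, using it directly as the id is simpler; hashing is mainly useful when the identity is spread over several fields.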
e, or all fields etc?
> >
> > Sent from Mail for Windows
> >
> > From: Vince McMahon
> > Sent: Sunday, October 22, 2023 3:22 PM
> > To: users@solr.apache.org
> > Subject: what is SOLR syntax to remove duplicated documents
> >
> > I have a SO
When do you consider two documents are duplicates? When 1 field has the same
value, when multiple fields have the same value, or all fields etc?
Sent from Mail for Windows
From: Vince McMahon
Sent: Sunday, October 22, 2023 3:22 PM
To: users@solr.apache.org
Subject: what is SOLR syntax to remove duplicated documents
I have SOLR 8.X. I suspect one of the cores has duplicates and want to
remove the duplicated documents. Signature, as in the SOLR guide, is not
implemented. https://solr.apache.org/guide/6_6/de-duplication.html
In SQL, a query without the use of a hash column would look like:
;WITH CTE AS
(