Thanks for your quick response. Let me elaborate a bit more.
I have an existing index in production where I reindex documents
all the time using the unique field DocID.
When a new request comes in via /update/extract, the document gets
reindexed and replaced with the new data. No problem with that.

Once I added the _root_ and type_level fields to the schema, the old existing
document in Solr stays the same and a new document gets created.
If I reindex the same document again, the new one gets reindexed
and overwritten, but the old one is still there.

I have the feeling there is an issue with /update/extract the first time you
add _root_ and type_level to the schema. It doesn't understand that the old
document and the new one are the same.
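
To confirm that both copies really carry the same DocID and differ only in the new fields, a query along these lines may help. This is only a minimal sketch: the host, core name (document) and DocID value are taken from the request quoted below, and the helper name build_duplicate_check_url is hypothetical.

```python
from urllib.parse import urlencode

def build_duplicate_check_url(base_url, doc_id):
    """Build a /select URL that lists every indexed copy of one DocID."""
    params = {
        "q": f"DocID:{doc_id}",           # match on the uniqueKey field
        "fl": "DocID,_root_,type_level",  # only the fields relevant to the comparison
        "rows": 10,
        "wt": "json",
    }
    return f"{base_url}/select?{urlencode(params)}"

# Assumed core URL, copied from the quoted /update/extract request.
url = build_duplicate_check_url("http://server:8983/solr/document", "6584239")
print(url)
```

If the response contains two documents, one with _root_ and type_level set and one without, that would match the behaviour described above.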

This forces me to delete the index and reindex from scratch, or to reindex
all documents and then, at the end, delete the old ones that don't have
type_level:parent.
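
The second workaround (reindex everything, then remove the leftovers) could be sketched as a delete-by-query. This is an assumption-laden sketch, not a tested procedure: the helper name build_cleanup_delete is hypothetical, and it presumes that every document that should survive has already been reindexed with type_level:parent.

```python
import json

def build_cleanup_delete(field="type_level", keep_value="parent"):
    """Build a JSON delete-by-query body matching documents WITHOUT field:keep_value."""
    # "*:* -field:value" selects all documents except those carrying the value.
    query = f"*:* -{field}:{keep_value}"
    return json.dumps({"delete": {"query": query}})

body = build_cleanup_delete()
print(body)
# The body would be POSTed to the core's /update handler with
# Content-Type: application/json, followed by a commit.
```

It is probably worth running the same query against /select first to check the number of matches before deleting anything.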

Any ideas on this?

On Mon 19 May 2025 at 13:40, Ehrenleitner Robert Harald <
robert.ehrenleit...@plus.ac.at> wrote:

> Hi,
>
> what exactly do you mean by "my document appears twice"? A document can
> appear a hundred times if all the entries differ only by the ID. Make sure
> your indexer takes care of this. Also, your unique ID field is "DocID", and
> according to your sample, its value seems to match ID. Make sure, it always
> matches, otherwise it is handled like a compound primary key in a SQL
> database (actually, Solr's DB is a no-SQL database, but this only concerns
> the way the data is queried).
>
> Also, make sure your query does not confuse parent ID and ID in some way.
> This could happen due to a bug in the querying application.
>
> Mag.phil. Robert Ehrenleitner, BEng.
> --
>
> Web-Developer
>
> IT-Services | Application & Digitalization Services
>
> Hellbrunner Straße 34 | 5020 Salzburg | Austria
>
> Tel.: +43/(0)662/8044 - 6778
>
> *www.plus.ac.at <http://www.plus.ac.at>*
>
>
> ------------------------------
> *From:* Sergio García Maroto <marot...@gmail.com>
> *Sent:* Monday, 19 May 2025 13:00
> *To:* solr-user <solr-u...@lucene.apache.org>
> *Subject:* Using ExtractRequest handler to index documents using
> type_leve=parent
>
> Hi,
>
> I have been indexing documents for a long time using /update/extract.
> Everything has been working well until I got a new requirement to add
> nested documents.
>
> I added to schema.xml
> <field name="type_level" type="string" indexed="true" stored="true" docValues="true" />
> <field name="_root_" type="string" indexed="true" stored="true" multiValued="false" required="false" />
>
> My unique field
> <field name="DocID" type="string" indexed="true" stored="true" />
> <uniqueKey>DocID</uniqueKey>
>
> After doing this, my request to /update/extract to reindex the same document
> duplicates the document in Solr.
> Here is my request. The only change is the new parameter type_level:parent.
>
> http://server:8983/solr/document/update/extract?
> literal.id=6584239&
> resource.name=&
> wt=xml&
> literal.DocID=6584239&
> literal.CoreID=6584239&
> literal.DocIsAttachToPNB=False&
> literal.DocAuthorID=1455&
> literal.DocIsAttachToPerson=True&
> literal.DocIsAttachToAssign=False&
> literal.DocIsAttachToCompany=False&
> literal.DocVersionID=4504527&
> literal.InsertDateSD=2011-01-03T07%3a51%3a00.0Z&
> literal.DocNameS=Squires+David+RES.doc&
> literal.DocCateNameS=Resume%2fCV&
> literal.DocAreaCateNameS=Person+Module&
> literal.type_level=parent&
>
> stream.url=http%3a%2f%2flocalhost%3a8081%2f4%2f50%2f45%2fSquires%2520David%2520RES15EAC416-AF05-4D38-A4F9-7B489962C167.docx&
> overwrite=true&
> commit=true
>
> After this request the document appears duplicated. The only difference
> between the old and the new one is type_level:parent.
>
> Does anyone have any idea why this is happening?
>
> Regards,
> Sergio Maroto
>
