Hi,

What exactly do you mean by "my document appears twice"? A document can appear
a hundred times if the entries differ only by their ID. Make sure your
indexer takes care of this. Also, your unique ID field is "DocID", and
according to your sample, its value seems to match ID. Make sure it always
matches; otherwise the pair is handled like a compound primary key in an SQL
database (actually, Solr's DB is a NoSQL database, but this only concerns the
way the data is queried).
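
As a rough illustration of the "always matches" check, here is a minimal Python sketch that inspects the query string of an /update/extract request before it is sent. The function name ids_match is hypothetical, and the parameter names literal.id and literal.DocID are simply taken from your sample request; this is a client-side sanity check, not something Solr does for you.

```python
from urllib.parse import parse_qs


def ids_match(query_string):
    """Return True if literal.DocID and literal.id are each given once and agree."""
    # keep_blank_values=True so empty parameters such as "resource.name=" survive parsing
    params = parse_qs(query_string, keep_blank_values=True)
    doc_ids = params.get("literal.DocID", [])
    ids = params.get("literal.id", [])
    return len(doc_ids) == 1 and doc_ids == ids


if __name__ == "__main__":
    sample = "literal.id=6584239&resource.name=&literal.DocID=6584239&literal.type_level=parent"
    print(ids_match(sample))
```

Running such a check in the indexer before each request would at least rule out a mismatch between the two values.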

Also, make sure your query does not confuse parent ID and ID in some way. This 
could happen due to a bug in the querying application.
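
To see on the Solr side whether several index entries really share one DocID, one option is a facet query on the unique key field. The sketch below is only an assumption about how you might do this: the helper names are hypothetical, the base URL is copied from your sample, and the parsing assumes the flat JSON facet layout (value, count, value, count, ...).

```python
from urllib.parse import urlencode


def duplicate_check_url(base_url, key_field="DocID"):
    """Build a select URL that facets on key_field and keeps only values seen 2+ times."""
    params = {
        "q": "*:*",
        "rows": 0,              # only the facet counts are needed, not the documents
        "facet": "true",
        "facet.field": key_field,
        "facet.mincount": 2,    # values that occur in at least two index entries
        "facet.limit": -1,      # no cap on the number of facet values returned
        "wt": "json",
        "json.nl": "flat",      # facet list comes back as [value, count, value, count, ...]
    }
    return f"{base_url}/select?{urlencode(params)}"


def duplicated_keys(response, key_field="DocID"):
    """Turn the flat facet list of a parsed JSON response into {value: count}."""
    flat = response["facet_counts"]["facet_fields"][key_field]
    return dict(zip(flat[0::2], flat[1::2]))


if __name__ == "__main__":
    print(duplicate_check_url("http://server:8983/solr/document"))
```

If this returns any DocID values, the duplicates are in the index itself and not just an artifact of the querying application.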

Mag.phil. Robert Ehrenleitner, BEng.
--

Web-Developer

IT-Services | Application & Digitalization Services

Hellbrunner Straße 34 | 5020 Salzburg | Austria

Tel.: +43/(0)662/8044 - 6778

www.plus.ac.at


________________________________
From: Sergio García Maroto <marot...@gmail.com>
Sent: Monday, 19 May 2025 13:00
To: solr-user <solr-u...@lucene.apache.org>
Subject: Using ExtractRequest handler to index documents using type_leve=parent

Hi,

I have been indexing documents for a long time using /update/extract.
Everything had been working well until I got a new requirement to add
nested documents.

I added to schema.xml:
<field name="type_level" type="string" indexed="true" stored="true" docValues="true" />
<field name="_root_" type="string" indexed="true" stored="true" multiValued="false" required="false" />

My unique field:
<field name="DocID" type="string" indexed="true" stored="true" />
<uniqueKey>DocID</uniqueKey>

After doing this, my request to /update/extract to reindex the same document
duplicates the document in Solr.
Here is my request. The only change is the new parameter type_level:parent.

http://server:8983/solr/document/update/extract?
literal.id=6584239&
resource.name=&
wt=xml&
literal.DocID=6584239&
literal.CoreID=6584239&
literal.DocIsAttachToPNB=False&
literal.DocAuthorID=1455&
literal.DocIsAttachToPerson=True&
literal.DocIsAttachToAssign=False&
literal.DocIsAttachToCompany=False&
literal.DocVersionID=4504527&
literal.InsertDateSD=2011-01-03T07%3a51%3a00.0Z&
literal.DocNameS=Squires+David+RES.doc&
literal.DocCateNameS=Resume%2fCV&
literal.DocAreaCateNameS=Person+Module&
literal.type_level=parent&
stream.url=http%3a%2f%2flocalhost%3a8081%2f4%2f50%2f45%2fSquires%2520David%2520RES15EAC416-AF05-4D38-A4F9-7B489962C167.docx&
overwrite=true&
commit=true

After this request, the document appears duplicated. The only difference
between the old and the new one is type_level:parent.

Does anyone have any idea why this is happening?

Regards,
Sergio Maroto
