[ 
https://issues.apache.org/jira/browse/SOLR-16288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17566000#comment-17566000
 ] 

Gus Heck commented on SOLR-16288:
---------------------------------

This is not the appropriate forum for obtaining help. This is a bug tracker, 
and is meant to be used when you are certain of something (a bug, a feature 
request, or a patch that you want to contribute). When you are {*}uncertain{*} 
and need help, please use the mailing list 
[us...@solr.apache.org|mailto:us...@solr.apache.org] (instructions for joining 
are available at [https://solr.apache.org/community.html]).

Briefly, I will comment that SolrCell is really only appropriate for small scale 
and for testing. Large indexes will want to do their text extraction via Tika 
(which is what SolrCell uses) outside of Solr to avoid excessive load on the 
search machine while it is serving queries. (You may be aware of that, given 
that it is stated in the docs at 
[https://solr.apache.org/guide/8_11/uploading-data-with-solr-cell-using-apache-tika.html#solr-cell-performance-implications].)
As to your specific problem, I'm not sure, but I suspect your non-standard 
_uniqueid field (evidently defined in your schema, based on the error message) 
needs to be specified as a literal in the request.
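
To illustrate the suggestion, here is a minimal sketch of how the request URL might be built so that the literal is named after the schema's uniqueKey field. The helper name build_extract_url is hypothetical, and whether literal._uniqueid actually resolves the error is an assumption that has not been verified against the reporter's schema. Note also that query parameters are joined with "&"; a second "?" would become part of the preceding parameter's value.

```python
from urllib.parse import urlencode


def build_extract_url(base_url, collection, unique_key_field, doc_id, commit=True):
    """Build a Solr Cell /update/extract URL (hypothetical helper, for illustration).

    The unique key is passed as literal.<field>, where <field> is assumed to be
    the uniqueKey name from the schema (here taken from the error message).
    """
    params = {f"literal.{unique_key_field}": doc_id}
    if commit:
        params["commit"] = "true"
    # urlencode joins the parameters with "&" and escapes unsafe characters.
    return f"{base_url}/{collection}/update/extract?{urlencode(params)}"


# Host, port, and collection name are copied from the reported curl command.
url = build_extract_url(
    "https://localhost:8984/solr", "XP0_Slavik_web_index", "_uniqueid", "doc1"
)
print(url)
```

The resulting URL could then be used in place of the one in the reported curl command, keeping the -F "myfile=@example.pdf" part unchanged.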

> Error indexing files(html, pdf) using SOLR Cell Tika
> ----------------------------------------------------
>
>                 Key: SOLR-16288
>                 URL: https://issues.apache.org/jira/browse/SOLR-16288
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>            Reporter: Nicolas
>            Priority: Major
>
> Hi - I am trying to index files such as HTML and PDF. I got the following 
> error related to the unique id, which is defined in the curl command. The 
> unique id is set with the literal.id parameter.
> Can you please help? I read all the documentation for Solr Cell and Tika, and 
> I am following the steps as described.
> Here is what I enter in cmd:
> C:\>{*}curl 
> "https://localhost:8984/solr/XP0_Slavik_web_index/update/extract?literal.id=doc1?commit=true"
>  -F "myfile=@example.pdf"{*}
> {
>   "responseHeader":{
>     "status":400,
>     "QTime":55},
>   "error":{
>     "metadata":[
>       "error-class","org.apache.solr.common.SolrException",
>       "root-error-class","org.apache.solr.common.SolrException"],
>     "msg":"{*}Document is missing mandatory uniqueKey field: _uniqueid{*}",
>     "code":400}}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)