[
https://issues.apache.org/jira/browse/SOLR-9493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Yury Kartsev updated SOLR-9493:
-------------------------------
Attachment: Screen Shot 2016-09-11 at 16.29.50 .png
[~arafalov] I spent some time trying version 6.2.
Version 6.2 gives the same error, but now in both cases when using SolrJ.
That is, both CloudSolrClient and HttpSolrClient now send the payload as
"application/javabin" (both still end up at the same place in HttpSolrClient,
i.e. {code}final HttpResponse response = httpClient.execute(method);{code}).
In version 5.1, HttpSolrClient (when not in cloud mode) sent the payload as
"application/xml; charset=UTF-8", and that worked (the uniqueKey was
generated); see above.
The case where the payload is sent as JSON (or XML) works fine and generates
the uniqueKey without any issues. I ran it from the SOLR web interface
(Collection -> Documents -> /update).
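For reference, the working JSON case can be approximated outside the web UI with a plain HTTP client. The following is only a minimal sketch, assuming Java 11+ and the JDK's java.net.http API rather than SolrJ. The class name, the localhost:8983 base URL and the "title" field are hypothetical placeholders, and the request is only built and printed here, not sent.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

public class JsonUpdateSketch {

    // Wraps a document (already serialized as JSON, without an "id" field)
    // in the same "add" envelope that the web UI payload uses.
    static String buildAddBody(String docJson) {
        return "{\"add\":{\"doc\":" + docJson + ",\"commitWithin\":1000}}";
    }

    // Builds a POST to <baseUrl>/update?wt=json with an explicit JSON
    // content type, i.e. the content type that worked in the report.
    static HttpRequest buildJsonUpdate(String baseUrl, String docJson) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/update?wt=json"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(
                        buildAddBody(docJson), StandardCharsets.UTF_8))
                .build();
    }

    public static void main(String[] args) {
        // Hypothetical host and collection name; adjust to the local setup.
        HttpRequest req = buildJsonUpdate(
                "http://localhost:8983/solr/collection_name",
                "{\"title\":\"document without id\"}");
        System.out.println(req.method() + " " + req.uri());
        System.out.println("Content-Type: "
                + req.headers().firstValue("Content-Type").orElse("(none)"));
    }
}
```

Sending the built request with java.net.http.HttpClient against a test instance should make it possible to compare the server's behaviour for this content type with the "application/javabin" requests captured by the proxy.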
Please see the screenshot from the local proxy. The first request was sent by
SolrJ in Cloud Mode (Solr started with ZK and the -c switch, and
CloudSolrClient is used). The second request was sent in Standalone Mode (Solr
started without the -c switch, collection created locally, HttpSolrClient is
used). The third request was made by the SOLR web UI while posting a document
without an ID as JSON (the ID was auto-generated successfully).
So there is definitely an issue with the uniqueKey not being generated when
content is posted as "application/javabin".
> uniqueKey generation fails if content POSTed as "application/javabin".
> ----------------------------------------------------------------------
>
> Key: SOLR-9493
> URL: https://issues.apache.org/jira/browse/SOLR-9493
> Project: Solr
> Issue Type: Bug
> Security Level: Public(Default Security Level. Issues are Public)
> Reporter: Yury Kartsev
> Attachments: 200.png, 400.png, Screen Shot 2016-09-11 at 16.29.50 .png
>
>
> I have faced a weird issue where the same application code (using SolrJ)
> fails to index a document without a unique key (it should be auto-generated
> by SOLR) in SolrCloud but succeeds in indexing it in a standalone SOLR
> instance (or even in cloud mode, but from the web interface of one of the
> replicas). The only difference is obviously between the clients
> (CloudSolrClient vs HttpSolrClient) and the SOLR URLs (ZooKeeper
> hostname+port vs standalone SOLR instance hostname and port).
> The failure is seen as "org.apache.solr.client.solrj.SolrServerException:
> org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:
> Document is missing mandatory uniqueKey field: id".
> I am using SOLR 5.1. In cloud mode I have 1 shard and 3 replicas.
> After a lot of debugging and investigation (see below as well as my
> [StackOverflow
> post|http://stackoverflow.com/questions/39401792/uniquekey-generation-does-not-work-in-solrcloud-but-works-if-standalone])
> I came to the conclusion that the difference between the failing and
> succeeding calls is simply the content type of the POST requests. A local
> proxy clearly shows that the request fails if the content is sent as
> "application/javabin" (see attached screenshot with sensitive data removed)
> and succeeds if the content is sent as "application/xml; charset=UTF-8" (see
> attached screenshot with sensitive data removed).
> Would you be able to please assist?
> Thank you very much in advance!
> ------------------------
> Copying whole description and investigation here as well:
> ------------------------
> [Documentation|https://cwiki.apache.org/confluence/display/solr/Other+Schema+Elements]
> states:{quote}Schema defaults and copyFields cannot be used to populate the
> uniqueKey field. You can use UUIDUpdateProcessorFactory to have uniqueKey
> values generated automatically.{quote}
> Therefore I have added my uniqueKey field to the schema:
> {code}
> <fieldType name="uuid" class="solr.UUIDField" indexed="true" />
> ...
> <field name="id" type="uuid" indexed="true" stored="true" required="true" />
> ...
> <uniqueKey>id</uniqueKey>
> {code}
> Then I have added updateRequestProcessorChain to my solrconfig:
> {code}
> <updateRequestProcessorChain name="uuid">
>   <processor class="solr.UUIDUpdateProcessorFactory">
>     <str name="fieldName">id</str>
>   </processor>
>   <processor class="solr.RunUpdateProcessorFactory" />
> </updateRequestProcessorChain>
> {code}
> And made it the default for the UpdateRequestHandler:
> {code}
> <initParams path="/update/**">
>   <lst name="defaults">
>     <str name="update.chain">uuid</str>
>   </lst>
> </initParams>
> {code}
> Adding new documents with a null/absent id works fine both from the web
> interface of one of the replicas and when using SOLR in standalone mode
> (non-cloud) from my application. However, when I'm using SolrCloud and add a
> document from my application (using CloudSolrClient from SolrJ), it fails with
> "org.apache.solr.client.solrj.SolrServerException:
> org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:
> Document is missing mandatory uniqueKey field: id"
> All other operations, like ping or searching for documents, work fine in
> either mode (standalone or cloud).
> INVESTIGATION (i.e. more details):
> In standalone mode the update request is obviously:
> {code}
> POST standalone_host:port/solr/collection_name/update?wt=json
> {code}
> In SOLR cloud mode, when adding a document from one replica's web interface,
> the update request is (found by inspecting the call made by the web
> interface):
> {code}
> POST replica_host:port/solr/collection_name_shard1_replica_1/update?wt=json
> {code}
> In both these cases the payload is something like:
> {code}
> {
>   "add": {
>     "doc": {
>       .....
>     },
>     "boost": 1.0,
>     "overwrite": true,
>     "commitWithin": 1000
>   }
> }
> {code}
> When CloudSolrClient is used, the following happens (found through
> debugging):
> Using ZK and some logic, a URL list of replicas is constructed that looks
> like this:
> {code}
> [http://replica_1_host:port/solr/collection_name/,
>  http://replica_2_host:port/solr/collection_name/,
>  http://replica_3_host:port/solr/collection_name/]
> {code}
> This code is called:
> {code}
> LBHttpSolrClient.Req req = new LBHttpSolrClient.Req(request, theUrlList);
> LBHttpSolrClient.Rsp rsp = lbClient.request(req);
> return rsp.getResponse();
> {code}
> The second line fails with the exception.
> Debugging the second line further, it ends up calling HttpClient.execute
> (from HttpSolrClient.executeMethod) for:
> {code}
> POST http://replica_1_host:port/solr/collection_name/update?wt=javabin&version=2 HTTP/1.1
> POST http://replica_2_host:port/solr/collection_name/update?wt=javabin&version=2 HTTP/1.1
> POST http://replica_3_host:port/solr/collection_name/update?wt=javabin&version=2 HTTP/1.1
> {code}
> And the very first request returns 400 Bad Request with replica 1 logging
> "Document is missing mandatory uniqueKey field: id" in the logs.
> The funny thing is that when I execute the same request using POSTMAN (but
> with JSON instead of the binary payload), it works! Am I doing something
> wrong here? I assume it's something in the way the request is made...
> UPDATE:
> I have used a local proxy to see what differs between the 2 requests sent by
> my application. It looks like the only difference is the content type. In
> cloud mode the payload for POSTing the document is sent as
> "application/javabin", while in standalone mode it's sent as
> "application/xml; charset=UTF-8". Everything else is the same. The first
> request results in 400 while the second is 200.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)