Hello,

I'm following the Solr Tutorial and need some pointers on where I'm going wrong.

In exercise 3 I cannot import any documents into my localDocs collection, either with the example\exampledocs\post.jar helper or with the File Upload in the Admin UI. This happens with PDF, XML, and plain text alike.

The /update endpoint doesn't seem to be reachable. If so, why?

This is on Windows 10. All command line input and output is a transcript, so no copy&paste errors.

First, the Java version:

----------------------------------------------------------------------
PS D:\Solr\solr-8.8.2> java -version
openjdk version "16.0.1" 2021-04-20
OpenJDK Runtime Environment AdoptOpenJDK-16.0.1+9 (build 16.0.1+9)
OpenJDK 64-Bit Server VM AdoptOpenJDK-16.0.1+9 (build 16.0.1+9, mixed mode, sharing)
----------------------------------------------------------------------

I'm not comfortable taking shortcuts, so I'm also running through exercises 1 and 2 first: starting Solr, creating the collections, and so on.

----------------------------------------------------------------------
PS D:\Solr\solr-8.8.2> .\bin\solr start -e cloud
"java version info is 16.0.1"
"Extracted major version is 16"

Welcome to the SolrCloud example!

This interactive session will help you launch a SolrCloud cluster on your local workstation. To begin, how many Solr nodes would you like to run in your local cluster? (specify 1-4 nodes) [2]:

Ok, let's start up 2 Solr nodes for your example SolrCloud cluster.
Please enter the port for node1 [8983]:

Please enter the port for node2 [7574]:

Creating Solr home directory D:\Solr\solr-8.8.2\example\cloud\node1\solr
Cloning D:\Solr\solr-8.8.2\example\cloud\node1 into
   D:\Solr\solr-8.8.2\example\cloud\node2

Starting up Solr on port 8983 using command:
"D:\Solr\solr-8.8.2\bin\solr.cmd" start -cloud -p 8983 -s "D:\Solr\solr-8.8.2\example\cloud\node1\solr"

"java version info is 16.0.1"
"Extracted major version is 16"
OpenJDK 64-Bit Server VM warning: JVM cannot use large page memory because it does not have enough privilege to lock pages in memory.
Waiting up to 30 to see Solr running on port 8983
Started Solr server on port 8983. Happy searching!

Starting up Solr on port 7574 using command:
"D:\Solr\solr-8.8.2\bin\solr.cmd" start -cloud -p 7574 -s "D:\Solr\solr-8.8.2\example\cloud\node2\solr" -z localhost:9983

"java version info is 16.0.1"
"Extracted major version is 16"
OpenJDK 64-Bit Server VM warning: JVM cannot use large page memory because it does not have enough privilege to lock pages in memory.
Waiting up to 30 to see Solr running on port 7574
Started Solr server on port 7574. Happy searching!
INFO - 2021-04-29 15:22:53.249; org.apache.solr.common.cloud.ConnectionManager; Waiting for client to connect to ZooKeeper
INFO - 2021-04-29 15:22:53.275; org.apache.solr.common.cloud.ConnectionManager; zkClient has connected
INFO - 2021-04-29 15:22:53.275; org.apache.solr.common.cloud.ConnectionManager; Client is connected to ZooKeeper
INFO - 2021-04-29 15:22:53.290; org.apache.solr.common.cloud.ZkStateReader; Updated live nodes from ZooKeeper... (0) -> (2)
INFO - 2021-04-29 15:22:53.290; org.apache.solr.client.solrj.impl.ZkClientClusterStateProvider; Cluster at localhost:9983 ready

Now let's create a new collection for indexing documents in your 2-node cluster.
Please provide a name for your new collection: [gettingstarted]
techproducts
How many shards would you like to split techproducts into? [2]

How many replicas per shard would you like to create? [2]

Please choose a configuration for the techproducts collection, available options are:
_default or sample_techproducts_configs [_default]
sample_techproducts_configs
Created collection 'techproducts' with 2 shard(s), 2 replica(s) with config-set 'techproducts'

Enabling auto soft-commits with maxTime 3 secs using the Config API

POSTing request to Config API: http://localhost:8983/solr/techproducts/config
{"set-property":{"updateHandler.autoSoftCommit.maxTime":"3000"}}
Successfully set-property updateHandler.autoSoftCommit.maxTime to 3000


SolrCloud example running, please visit: http://localhost:8983/solr
----------------------------------------------------------------------

Now I'm deleting it again, as documented in the wrap-up:

----------------------------------------------------------------------
PS D:\Solr\solr-8.8.2> .\bin\solr delete -c techproducts
"java version info is 16.0.1"
"Extracted major version is 16"
{
  "responseHeader":{
    "status":0,
    "QTime":237},
  "success":{
    "192.168.178.59:8983_solr":{"responseHeader":{
        "status":0,
        "QTime":45}},
    "192.168.178.59:7574_solr":{"responseHeader":{
        "status":0,
        "QTime":60}}}}


Deleted collection 'techproducts' using command:
http://192.168.178.59:7574/solr/admin/collections?action=DELETE&name=techproducts
----------------------------------------------------------------------

Enriching the schema:

----------------------------------------------------------------------
PS D:\Solr\solr-8.8.2> c:\bin\curl -X POST -H 'Content-type:application/json' --data-binary '{"add-field": {"name":"name", "type":"text_general", "multiValued":false, "stored":true}}' http://localhost:8983/solr/films/schema
{
  "responseHeader":{
    "status":0,
    "QTime":655}}
----------------------------------------------------------------------

BTW, PowerShell defines 'curl' as an alias for its built-in Invoke-WebRequest cmdlet. That took me a minute to figure out… maybe add a note in the Tutorial?

For some reason I cannot use curl to add the copy field, so I did that in the Admin UI, as documented.

Maybe a problem with escaping? I don't know.

----------------------------------------------------------------------
PS D:\Solr\solr-8.8.2> c:\bin\curl -X POST -H 'Content-type:application/json' --data-binary '{"add-copy-field" : {"source":"*","dest":"_text_"}}' http://localhost:8983/solr/films/schema
{
  "responseHeader":{
    "status":500,
    "QTime":0},
  "error":{
    "msg":"JSON Parse Error: char=*,position=26 AFTER='{add-copy-field : {source:*' BEFORE=',dest:_text_}}'",
    "trace":"org.noggit.JSONParser$ParseException: JSON Parse Error: char=*,position=26 AFTER='{add-copy-field : {source:*' BEFORE=',dest:_text_}}'\r\n\tat org.noggit.JSONParser.err(JSONParser.java:452)\r\n\tat org.noggit.JSONParser.handleNonDoubleQuoteString(JSONParser.java:819)\r\n\tat org.noggit.JSONParser.next(JSONParser.java:1026)\r\n\tat org.noggit.JSONParser.nextEvent(JSONParser.java:1108)\r\n\tat org.noggit.ObjectBuilder.getObject(ObjectBuilder.java:180)\r\n\tat org.noggit.ObjectBuilder.getVal(ObjectBuilder.java:104)\r\n\tat org.apache.solr.common.util.CommandOperation.parse(CommandOperation.java:293)\r\n\tat org.apache.solr.common.util.CommandOperation.readCommands(CommandOperation.java:362)\r\n\tat org.apache.solr.common.util.CommandOperation.readCommands(CommandOperation.java:336)\r\n\tat org.apache.solr.api.ApiBag.getCommandOperations(ApiBag.java:318)\r\n\tat org.apache.solr.servlet.HttpSolrCall.getCommands(HttpSolrCall.java:1190)\r\n\tat org.apache.solr.servlet.SolrRequestParsers$1.getCommands(SolrRequestParsers.java:252)\r\n\tat org.apache.solr.schema.SchemaManager.performOperations(SchemaManager.java:87)\r\n\tat org.apache.solr.handler.SchemaHandler.handleRequestBody(SchemaHandler.java:95)\r\n\tat org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:216)\r\n\tat org.apache.solr.core.SolrCore.execute(SolrCore.java:2646)\r\n\tat org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:794)\r\n\tat org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:567)\r\n\tat org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:427)\r\n\tat org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:357)\r\n\tat org.eclipse.jetty.servlet.FilterHolder.doFilter(FilterHolder.java:201)\r\n\tat org.eclipse.jetty.servlet.ServletHandler$Chain.doFilter(ServletHandler.java:1601)\r\n\tat org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:548)\r\n\tat org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)\r\n\tat org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:602)\r\n\tat org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)\r\n\tat org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:235)\r\n\tat org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1612)\r\n\tat org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:233)\r\n\tat org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1434)\r\n\tat org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:188)\r\n\tat org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:501)\r\n\tat org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1582)\r\n\tat org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:186)\r\n\tat org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1349)\r\n\tat org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)\r\n\tat org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:191)\r\n\tat org.eclipse.jetty.server.handler.InetAccessHandler.handle(InetAccessHandler.java:177)\r\n\tat org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:146)\r\n\tat org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)\r\n\tat org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:322)\r\n\tat org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)\r\n\tat org.eclipse.jetty.server.Server.handle(Server.java:516)\r\n\tat org.eclipse.jetty.server.HttpChannel.lambda$handle$1(HttpChannel.java:383)\r\n\tat org.eclipse.jetty.server.HttpChannel.dispatch(HttpChannel.java:556)\r\n\tat org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:375)\r\n\tat org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:273)\r\n\tat org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:311)\r\n\tat org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:105)\r\n\tat org.eclipse.jetty.io.ChannelEndPoint$1.run(ChannelEndPoint.java:104)\r\n\tat org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:336)\r\n\tat org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:313)\r\n\tat org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:171)\r\n\tat org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:129)\r\n\tat org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:375)\r\n\tat org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:773)\r\n\tat org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:905)\r\n\tat java.base/java.lang.Thread.run(Thread.java:831)\r\n",
    "code":500}}
----------------------------------------------------------------------
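
One way to test the escaping theory would be to send the same command from a small script, so that no shell quoting is involved at all. The following is only a minimal Python sketch under assumptions: Solr is reachable at localhost:8983 and the collection is called films, as in the transcript above, and build_schema_request is a made-up helper name, not something Solr ships. It builds the request and prints the body; the actual send is left commented out.

```python
import json
import urllib.request

# Assumption: same host, port and collection as in the curl call above.
SCHEMA_URL = "http://localhost:8983/solr/films/schema"


def build_schema_request(url, command):
    """Build (but do not send) a POST request carrying a Schema API command."""
    body = json.dumps(command).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# The same command that failed when passed through the PowerShell command line.
copy_field = {"add-copy-field": {"source": "*", "dest": "_text_"}}
request = build_schema_request(SCHEMA_URL, copy_field)

# Show the body that would go over the wire, double quotes intact.
print(request.data.decode("utf-8"))

# Uncomment to send it to a running Solr instance:
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```

If Solr accepts the body sent this way, the error message above (which shows the JSON without any double quotes) would point to the quotes being lost on the command line rather than to the Schema API.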

I can successfully POST here in exercise 2:

----------------------------------------------------------------------
PS D:\Solr\solr-8.8.2> java -jar -Dc=films -Dauto example\exampledocs\post.jar example\films\*.xml
SimplePostTool version 5.0.0
Posting files to [base] url http://localhost:8983/solr/films/update...
Entering auto mode. File endings considered are xml,json,jsonl,csv,pdf,doc,docx,ppt,pptx,xls,xlsx,odt,odp,ods,ott,otp,ots,rtf,htm,html,txt,log
POSTing file films.xml (application/xml) to [base]
1 files indexed.
COMMITting Solr index changes to http://localhost:8983/solr/films/update...
Time spent: 0:00:02.395
----------------------------------------------------------------------

So far, so good. Let's delete the collection:

----------------------------------------------------------------------
PS D:\Solr\solr-8.8.2> .\bin\solr delete -c films
"java version info is 16.0.1"
"Extracted major version is 16"
{
  "responseHeader":{
    "status":0,
    "QTime":226},
  "success":{
    "192.168.178.59:8983_solr":{"responseHeader":{
        "status":0,
        "QTime":43}},
    "192.168.178.59:7574_solr":{"responseHeader":{
        "status":0,
        "QTime":55}}}}


Deleted collection 'films' using command:
http://192.168.178.59:7574/solr/admin/collections?action=DELETE&name=films
----------------------------------------------------------------------

Finally, exercise 3. Create a new collection:

----------------------------------------------------------------------
PS D:\Solr\solr-8.8.2> .\bin\solr create -c localDocs -s 2 -rf 2
"java version info is 16.0.1"
"Extracted major version is 16"
WARNING: Using _default configset with data driven schema functionality. NOT RECOMMENDED for production use. To turn off: bin\solr config -c localDocs -p 8983 -action set-user-property -property update.autoCreateFields -value false
Created collection 'localDocs' with 2 shard(s), 2 replica(s) with config-set 'localDocs'
----------------------------------------------------------------------

And POST:

----------------------------------------------------------------------
PS D:\Solr\solr-8.8.2> java -jar -Dc=localDocs -Dauto example\exampledocs\post.jar .\example\exampledocs\solr-word.pdf
SimplePostTool version 5.0.0
Posting files to [base] url http://localhost:8983/solr/localDocs/update...
Entering auto mode. File endings considered are xml,json,jsonl,csv,pdf,doc,docx,ppt,pptx,xls,xlsx,odt,odp,ods,ott,otp,ots,rtf,htm,html,txt,log
POSTing file solr-word.pdf (application/pdf) to [base]/extract
SimplePostTool: WARNING: Solr returned an error #404 (Not Found) for url: http://localhost:8983/solr/localDocs/update/extract?resource.name=D%3A%5CSolr%5Csolr-8.8.2%5C.%5Cexample%5Cexampledocs%5Csolr-word.pdf&literal.id=D%3A%5CSolr%5Csolr-8.8.2%5C.%5Cexample%5Cexampledocs%5Csolr-word.pdf
SimplePostTool: WARNING: Response: <html>
<head>
<meta http-equiv="Content-Type" content="text/html;charset=utf-8"/>
<title>Error 404 Not Found</title>
</head>
<body><h2>HTTP ERROR 404 Not Found</h2>
<table>
<tr><th>URI:</th><td>/solr/localDocs/update/extract</td></tr>
<tr><th>STATUS:</th><td>404</td></tr>
<tr><th>MESSAGE:</th><td>Not Found</td></tr>
<tr><th>SERVLET:</th><td>default</td></tr>
</table>

</body>
</html>
SimplePostTool: WARNING: IOException while reading response: java.io.FileNotFoundException: http://localhost:8983/solr/localDocs/update/extract?resource.name=D%3A%5CSolr%5Csolr-8.8.2%5C.%5Cexample%5Cexampledocs%5Csolr-word.pdf&literal.id=D%3A%5CSolr%5Csolr-8.8.2%5C.%5Cexample%5Cexampledocs%5Csolr-word.pdf
1 files indexed.
COMMITting Solr index changes to http://localhost:8983/solr/localDocs/update...
Time spent: 0:00:00.059
----------------------------------------------------------------------

No luck.
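
A 404 on /update/extract, while the commit to /update reports no error, suggests checking which request handlers the collection actually has registered. The following is a minimal Python sketch, assuming the Config API answers at /solr/localDocs/config/requestHandler and lists handlers under config.requestHandler. The helper names and the sample response are made up for illustration; the sample is hand-written, not output captured from this installation.

```python
import json
import urllib.request


def has_request_handler(config_response, path):
    """Return True if `path` is among the request handlers in a Config API response."""
    handlers = config_response.get("config", {}).get("requestHandler", {})
    return path in handlers


def fetch_request_handlers(base_url, collection):
    """Fetch the requestHandler section of a collection's config (needs a running Solr)."""
    url = f"{base_url}/{collection}/config/requestHandler"
    with urllib.request.urlopen(url) as response:
        return json.loads(response.read().decode("utf-8"))


# Hypothetical response, used only to exercise the check offline.
sample = {"config": {"requestHandler": {"/select": {}, "/update": {}}}}
print(has_request_handler(sample, "/update"))          # True
print(has_request_handler(sample, "/update/extract"))  # False

# Against a live instance (assumed base URL):
# live = fetch_request_handlers("http://localhost:8983/solr", "localDocs")
# print(has_request_handler(live, "/update/extract"))
```

If the live check also comes back False, the collection's configset simply has no handler at that path, which would match the 404 in the transcript.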

I then tried to upload example\exampledocs\solr-word.pdf in the Admin UI (Document, File Upload), but got the error message:

"Unsupported ContentType: application/pdf Not in: [application/xml, […cut here…]]".

Then I tried uploading films.xml from the example earlier, but got the error message:

"Async exception during distributed update: Error from server at http://192.168.178.59:8983/solr/localDocs_shard2_replica_n4/: null request: http://192.168.178.59:8983/solr/localDocs_shard2_replica_n4/ Remote error message: ERROR [doc=/en/quien_es_el_senor_lopez] Error adding field 'name'='¿Quién es el señor López?' msg=For input string: "¿Quién es el señor López?""

Can you help me out?
--
Beautifully illustrated children's books:
https://www.schoene-kinderbuecher.de
Weblog:
https://www.thomas-huehn.de
