[
https://issues.apache.org/jira/browse/SOLR-6016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14060934#comment-14060934
]
Hoss Man commented on SOLR-6016:
--------------------------------
bq. We could add a SchemaGeneratorHandler which would generate the "best"
schema.
You wouldn't need/want a handler for this -- you'd just need an
UpdateProcessorFactory to use in place of RunUpdateProcessorFactory that would
look at the datatypes of the fields in each document w/o doing any indexing and
pick the least common denominator.
So then you'd have a chain with all of your normal update processors including
the TypeMapping processors configured with the precedence orders and locales
and format strings you want -- and at the end you'd have your
BestFitSchemaGeneratorUpdateProcessorFactory that would look at all those docs,
study their values, and throw them away -- until a {{commit}} comes along, at
which point it does all the under the hood schema field addition calls.
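To make the "least common denominator" part concrete, here is a rough sketch of the bookkeeping such a processor might do. It is plain Java with no Solr dependencies, and everything in it is hypothetical: the class name, the {{observe}}/{{snapshot}} methods, and the simplified LONG < DOUBLE < STRING ordering are assumptions for illustration, not existing Solr APIs. A real processor would also have to deal with dates, booleans, multi-valued fields, and whatever the upstream TypeMapping processors already decided.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/**
 * Hypothetical sketch (not a Solr API): remembers, per field, the widest
 * type needed to hold every value seen so far, without indexing anything.
 */
final class BestFitTypeTracker {

    /** Candidate types, ordered from narrowest to widest (simplified assumption). */
    public enum FieldType { LONG, DOUBLE, STRING }

    private final Map<String, FieldType> bestFit = new LinkedHashMap<>();

    /** Narrowest type that can represent a single raw value. */
    public static FieldType classify(String raw) {
        try {
            Long.parseLong(raw);
            return FieldType.LONG;
        } catch (NumberFormatException notALong) {
            // fall through and try the next wider type
        }
        try {
            // Note: parseDouble is lenient and also accepts e.g. "1e3" or "NaN".
            Double.parseDouble(raw);
            return FieldType.DOUBLE;
        } catch (NumberFormatException notADouble) {
            return FieldType.STRING;
        }
    }

    /** Record one value; the field's type can only widen, never narrow. */
    public void observe(String field, String raw) {
        bestFit.merge(field, classify(raw), (a, b) -> a.compareTo(b) >= 0 ? a : b);
    }

    /** What a commit would turn into schema field additions. */
    public Map<String, FieldType> snapshot() {
        return new LinkedHashMap<>(bestFit);
    }

    public static void main(String[] args) {
        BestFitTypeTracker tracker = new BestFitTypeTracker();
        tracker.observe("price", "92");      // looks like a long at first
        tracker.observe("price", "479.95");  // widens the field to DOUBLE
        tracker.observe("name", "some product name");
        System.out.println(tracker.snapshot()); // {price=DOUBLE, name=STRING}
    }
}
```

The point of deferring the decision is visible in the {{price}} field: guessing from the first value alone would give a long, which is exactly the guess that later fails on "479.95".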
So to learn, you'd send docs using whatever handler/format you want (json, xml,
extraction, etc...) with an
{{update.chain=my.datatype.learning.processor.chain}} request param ... and
once you've sent a bunch and given it a lot of variety to see, you send a
commit so it creates the schema, and then you re-index your docs for real w/o
that special chain.
Varun: want to open a new issue for this idea? ... it's related to but
independent of the current issue, which might have other tweaks/improvements of
its own.
> Failure indexing exampledocs with example-schemaless mode
> ---------------------------------------------------------
>
> Key: SOLR-6016
> URL: https://issues.apache.org/jira/browse/SOLR-6016
> Project: Solr
> Issue Type: Bug
> Components: documentation, Schema and Analysis
> Affects Versions: 4.7.2, 4.8
> Reporter: Shalin Shekhar Mangar
> Attachments: SOLR-6016.patch, solr.log
>
>
> Steps to reproduce:
> # cd example; java -Dsolr.solr.home=example-schemaless/solr -jar start.jar
> # cd exampledocs; java -jar post.jar *.xml
> Output from post.jar
> {code}
> Posting files to base url http://localhost:8983/solr/update using
> content-type application/xml..
> POSTing file gb18030-example.xml
> POSTing file hd.xml
> POSTing file ipod_other.xml
> SimplePostTool: WARNING: Solr returned an error #400 Bad Request
> SimplePostTool: WARNING: IOException while reading response:
> java.io.IOException: Server returned HTTP response code: 400 for URL:
> http://localhost:8983/solr/update
> POSTing file ipod_video.xml
> SimplePostTool: WARNING: Solr returned an error #400 Bad Request
> SimplePostTool: WARNING: IOException while reading response:
> java.io.IOException: Server returned HTTP response code: 400 for URL:
> http://localhost:8983/solr/update
> POSTing file manufacturers.xml
> POSTing file mem.xml
> SimplePostTool: WARNING: Solr returned an error #400 Bad Request
> SimplePostTool: WARNING: IOException while reading response:
> java.io.IOException: Server returned HTTP response code: 400 for URL:
> http://localhost:8983/solr/update
> POSTing file money.xml
> POSTing file monitor2.xml
> SimplePostTool: WARNING: Solr returned an error #400 Bad Request
> SimplePostTool: WARNING: IOException while reading response:
> java.io.IOException: Server returned HTTP response code: 400 for URL:
> http://localhost:8983/solr/update
> POSTing file monitor.xml
> SimplePostTool: WARNING: Solr returned an error #400 Bad Request
> SimplePostTool: WARNING: IOException while reading response:
> java.io.IOException: Server returned HTTP response code: 400 for URL:
> http://localhost:8983/solr/update
> POSTing file mp500.xml
> SimplePostTool: WARNING: Solr returned an error #400 Bad Request
> SimplePostTool: WARNING: IOException while reading response:
> java.io.IOException: Server returned HTTP response code: 400 for URL:
> http://localhost:8983/solr/update
> POSTing file sd500.xml
> SimplePostTool: WARNING: Solr returned an error #400 Bad Request
> SimplePostTool: WARNING: IOException while reading response:
> java.io.IOException: Server returned HTTP response code: 400 for URL:
> http://localhost:8983/solr/update
> POSTing file solr.xml
> POSTing file utf8-example.xml
> POSTing file vidcard.xml
> SimplePostTool: WARNING: Solr returned an error #400 Bad Request
> SimplePostTool: WARNING: IOException while reading response:
> java.io.IOException: Server returned HTTP response code: 400 for URL:
> http://localhost:8983/solr/update
> 14 files indexed.
> COMMITting Solr index changes to http://localhost:8983/solr/update..
> Time spent: 0:00:00.401
> {code}
> Exceptions in Solr (I am pasting just one of them):
> {code}
> 5105 [qtp697879466-14] ERROR org.apache.solr.core.SolrCore –
> org.apache.solr.common.SolrException: ERROR: [doc=EN7800GTX/2DHTV/256M] Error
> adding field 'price'='479.95' msg=For input string: "479.95"
> at
> org.apache.solr.update.DocumentBuilder.toDocument(DocumentBuilder.java:167)
> at
> org.apache.solr.update.AddUpdateCommand.getLuceneDocument(AddUpdateCommand.java:77)
> at
> org.apache.solr.update.DirectUpdateHandler2.addDoc0(DirectUpdateHandler2.java:234)
> at
> org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:160)
> at
> org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:69)
> at
> org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:51)
> ......
> Caused by: java.lang.NumberFormatException: For input string: "479.95"
> at
> java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
> at java.lang.Long.parseLong(Long.java:441)
> at java.lang.Long.parseLong(Long.java:483)
> at org.apache.solr.schema.TrieField.createField(TrieField.java:609)
> at org.apache.solr.schema.TrieField.createFields(TrieField.java:660)
> {code}
> The full solr.log is attached.
> I understand why these errors occur but since we ship example data with Solr
> to demonstrate our core features, I expect that indexing exampledocs should
> work without errors.