[
https://issues.apache.org/jira/browse/SOLR-5698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13892429#comment-13892429
]
Hoss Man commented on SOLR-5698:
--------------------------------
Easy steps to reproduce using the example configs...
{noformat}
hossman@frisbee:~$ perl -le 'print "a,aaa"; print "z," . ("Z" x 32767);' | curl \
  'http://localhost:8983/solr/update?header=false&fieldnames=name,long_s&rowid=id&commit=true' \
  -H 'Content-Type: application/csv' --data-binary @-
<?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader"><int name="status">0</int><int name="QTime">572</int></lst>
</response>
hossman@frisbee:~$ curl 'http://localhost:8983/solr/select?q=*:*&fl=id,name&wt=json&indent=true'
{
"responseHeader":{
"status":0,
"QTime":12,
"params":{
"fl":"id,name",
"indent":"true",
"q":"*:*",
"wt":"json"}},
"response":{"numFound":2,"start":0,"docs":[
{
"name":"a",
"id":"0"},
{
"name":"z",
"id":"1"}]
}}
hossman@frisbee:~$ curl 'http://localhost:8983/solr/select?q=long_s:*&wt=json&indent=true'
{
"responseHeader":{
"status":0,
"QTime":1,
"params":{
"indent":"true",
"q":"long_s:*",
"wt":"json"}},
"response":{"numFound":1,"start":0,"docs":[
{
"name":"a",
"long_s":"aaa",
"id":"0",
"_version_":1459225819107819520}]
}}
{noformat}
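For anyone without perl to hand, the payload built by the pipeline above can be sketched in Python. This is only an illustration: the function names `build_csv_payload` and `post_csv` are hypothetical, the URL and Content-Type header are copied from the curl command, and `post_csv` is defined but not called because it needs a running Solr example instance.

```python
import urllib.request

# URL copied from the curl command in the reproduction steps.
UPDATE_URL = (
    "http://localhost:8983/solr/update"
    "?header=false&fieldnames=name,long_s&rowid=id&commit=true"
)


def build_csv_payload(long_len: int = 32767) -> bytes:
    """Build the same two CSV rows the perl one-liner prints.

    Row 1 has a short long_s value; row 2 has a value of `long_len` 'Z' characters.
    """
    rows = ["a,aaa", "z," + "Z" * long_len]
    # perl -l appends a newline after every print, so the payload ends with one.
    return ("\n".join(rows) + "\n").encode("utf-8")


def post_csv(payload: bytes, url: str = UPDATE_URL) -> bytes:
    """POST the payload as application/csv and return the raw response body."""
    request = urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/csv"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.read()
```

Changing `long_len` makes it easy to probe where the cutoff actually sits on a given build.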
> exceptionally long terms are silently ignored during indexing
> -------------------------------------------------------------
>
> Key: SOLR-5698
> URL: https://issues.apache.org/jira/browse/SOLR-5698
> Project: Solr
> Issue Type: Bug
> Reporter: Hoss Man
>
> As reported on the user list, when a term is greater than 2^15 bytes it is
> silently ignored at indexing time -- no error is given at all.
> We should investigate:
> * if there is a way to get the lower level Lucene code to propagate up an
> error we can return to the user instead of silently ignoring these terms
> * if there is no way to generate a low level error:
> ** is there at least a way to make this limit configurable so it's more obvious
> to users that this limit exists?
> ** should we make things like StrField do explicit size checking on the terms
> they produce and explicitly throw their own error?
> ** should we make things like StrField do explicit size checking on the terms
> they produce and explicitly throw their own error?
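The explicit size check floated in the last bullet can be approximated on the client side until something lands in Solr itself. The following is a minimal sketch, not Solr or Lucene code: `check_term_lengths`, `TermTooLongError` and `ASSUMED_MAX_TERM_BYTES` are hypothetical names, and the default limit of 32766 bytes is an assumption about the underlying Lucene limit that should be verified against the version in use. The check measures UTF-8 bytes rather than characters, since the limit is described in bytes.

```python
# Assumed limit in bytes; verify against the Lucene version actually deployed.
ASSUMED_MAX_TERM_BYTES = 32766


class TermTooLongError(ValueError):
    """Raised when a field value would produce an over-long single term."""


def check_term_lengths(doc: dict, max_bytes: int = ASSUMED_MAX_TERM_BYTES) -> None:
    """Raise instead of letting an over-long value be dropped silently.

    Only meaningful for fields indexed as a single term (string-like fields);
    tokenized text fields produce many shorter terms and need a different check.
    """
    for field, value in doc.items():
        size = len(str(value).encode("utf-8"))  # bytes, not characters
        if size > max_bytes:
            raise TermTooLongError(
                f"field {field!r}: {size} bytes exceeds limit of {max_bytes}"
            )
```

Run against the two documents from the reproduction steps, the first (`long_s` of `aaa`) passes and the second (`long_s` of 32767 `Z` characters) raises, which is the kind of visible failure the issue asks for.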