[jira] [Updated] (SOLR-445) Update Handlers abort with bad documents

JIRA Mon, 31 Mar 2014 17:28:27 -0700

     [ 
https://issues.apache.org/jira/browse/SOLR-445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Tomás Fernández Löbbe updated SOLR-445:
---------------------------------------

    Attachment: SOLR-445-alternative.patch

This is a different approach to this issue. Errors are handled by an 
UpdateRequestProcessor that must be added before the other processors in the 
chain. It accepts maxErrors either as a default in its configuration or as a 
request parameter. When the processor is used, maxErrors defaults to 
Integer.MAX_VALUE. To get the current behavior, set it to 0 (although adding 
the processor to the chain would make no sense in that case, unless the value 
is meant to be supplied as a request parameter).
This handles only bad documents, not the other failures mentioned in previous 
comments (such as Tika parsing exceptions).
The response will look something like:

{code:xml}
<?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader">
  <int name="numErrors">10</int>
  <lst name="errors">
    <lst name="1">
      <str name="message">ERROR: [doc=1] Error adding field 'weight'='b' msg=For input string: "b"</str>
    </lst>
    <lst name="3">
      <str name="message">ERROR: [doc=3] Error adding field 'weight'='b' msg=For input string: "b"</str>
    </lst>
...
  <int name="status">0</int>
  <int name="QTime">17</int>
</lst>
</response>
{code}
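
The error-tolerant behavior described above can be illustrated with a small, self-contained sketch. This is not the code in the attached patch and it does not use any Solr API: add_tolerantly, add_doc, BadDocument and BatchResult are hypothetical names invented for illustration, and the exact threshold comparison is an assumption chosen so that maxErrors=0 reproduces the current abort-on-first-error behavior.

```python
from dataclasses import dataclass, field

# Stands in for Java's Integer.MAX_VALUE, the default described above.
MAX_INT = 2**31 - 1


class BadDocument(Exception):
    """Raised by the (hypothetical) add_doc callback for an invalid document."""


@dataclass
class BatchResult:
    added: list = field(default_factory=list)   # ids that were indexed
    errors: dict = field(default_factory=dict)  # doc id -> error message


def add_tolerantly(docs, add_doc, max_errors=MAX_INT):
    """Add each document, recording failures instead of aborting.

    Assumption: the batch is aborted once the number of recorded errors
    exceeds max_errors, so max_errors=0 aborts on the first bad document.
    """
    result = BatchResult()
    for doc in docs:
        try:
            add_doc(doc)
            result.added.append(doc["id"])
        except BadDocument as exc:
            result.errors[doc["id"]] = str(exc)
            if len(result.errors) > max_errors:
                raise  # too many errors: give up, like the current behavior
    return result


def add_doc(doc):
    """Toy stand-in for indexing: rejects a non-numeric 'weight' field."""
    try:
        float(doc.get("weight", "0"))
    except ValueError as exc:
        raise BadDocument(
            f"[doc={doc['id']}] Error adding field 'weight'='{doc['weight']}'"
        ) from exc
```

With the default threshold, a batch containing bad documents 1 and 3 would still add the good documents and report both failures keyed by document id, much like the errors list in the response above. With max_errors=0, the first bad document raises and the rest of the batch is not processed.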

> Update Handlers abort with bad documents
> ----------------------------------------
>
>                 Key: SOLR-445
>                 URL: https://issues.apache.org/jira/browse/SOLR-445
>             Project: Solr
>          Issue Type: Improvement
>          Components: update
>    Affects Versions: 1.3
>            Reporter: Will Johnson
>             Fix For: 4.8
>
>         Attachments: SOLR-445-3_x.patch, SOLR-445-alternative.patch, 
> SOLR-445.patch, SOLR-445.patch, SOLR-445.patch, SOLR-445.patch, 
> SOLR-445_3x.patch, solr-445.xml
>
>
> Has anyone run into the problem of handling bad documents / failures 
> mid-batch? For example:
> <add>
>   <doc>
>     <field name="id">1</field>
>   </doc>
>   <doc>
>     <field name="id">2</field>
>     <field name="myDateField">I_AM_A_BAD_DATE</field>
>   </doc>
>   <doc>
>     <field name="id">3</field>
>   </doc>
> </add>
> Right now Solr adds the first doc and then aborts. It would seem that it 
> should either fail the entire batch (option 1) or log a message/return a 
> code and then continue on to add doc 3 (option 2). Option 1 would seem to 
> be much harder to accomplish and might require more memory, while option 2 
> would require more information to come back from the API. I'm about to dig 
> into this, but I thought I'd ask whether anyone has suggestions, thoughts, 
> or comments.
>  



--
This message was sent by Atlassian JIRA
(v6.2#6252)
