: We recently updated our Solr and Solr indexing from DIH using Solr 1.4 to our
: own Hadoop import using SolrJ and Solr 3.4.
...
: Any document that has a string field value with a carriage return "\r" is
: having that carriage return stripped before being added to the index.
: Line breaks "\n" are not being stripped.
...
: This did not occur with the DIH.
:
: Thoughts? Is there a way to not have SolrJ strip all carriage returns?
What makes you think this is SolrJ? If it is, you should be able to
create a ~10 line test of SolrJ demonstrating this with hard coded data.
I suspect your data is getting cleaned somewhere else in your data flow
that didn't exist when DIH was fetching it directly.
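
One way to narrow it down is to look at the same value at each stage of
the import with the control characters made visible. A rough sketch,
using only the JDK: CarriageReturnCheck and its two methods are made-up
names for illustration, not anything that ships with SolrJ or Hadoop.

```java
// Hypothetical helper (not part of SolrJ): makes control characters visible
// so the same field value can be compared at each stage of the import.
final class CarriageReturnCheck {

    // Counts the '\r' characters in a field value.
    static int countCarriageReturns(String value) {
        int count = 0;
        for (int i = 0; i < value.length(); i++) {
            if (value.charAt(i) == '\r') {
                count++;
            }
        }
        return count;
    }

    // Renders each '\r' and '\n' as a literal backslash escape for logging.
    static String visible(String value) {
        return value.replace("\r", "\\r").replace("\n", "\\n");
    }

    public static void main(String[] args) {
        // Hard coded data, as suggested above, containing two carriage returns.
        String hardCoded = "line one\r\nline two\rline three";
        System.out.println(countCarriageReturns(hardCoded) + " CR: " + visible(hardCoded));
    }
}
```

Log the count where your Hadoop job reads the record, again on the value
you hand to SolrInputDocument.addField, and again on the value that comes
back from a query. Whichever stage first shows the count dropping is where
the "\r" is being lost.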
-Hoss