[ 
https://issues.apache.org/jira/browse/SOLR-7255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14365111#comment-14365111
 ] 

Mark Miller commented on SOLR-7255:
-----------------------------------

bq. 4.10.3

If you are indeed on 4.10.3, I'd inspect the config and make sure the write 
side of the block cache is not enabled (IMO it gives little or no benefit 
anyway). The exception pattern above is a very common result when it is enabled.
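
For anyone wanting to check this quickly, here is a minimal sketch of reading the setting out of a solrconfig.xml. It assumes the write side of the block cache is controlled by the HdfsDirectoryFactory parameter solr.hdfs.blockcache.write.enabled; verify that name against the documentation for your Solr version. The function name and the sample config are hypothetical and are not taken from the reporter's setup.

```python
import xml.etree.ElementTree as ET

# Assumed parameter name for the write side of the HDFS block cache.
WRITE_KEY = "solr.hdfs.blockcache.write.enabled"


def blockcache_write_setting(solrconfig_xml):
    """Return the raw text of the write-cache setting found inside any
    <directoryFactory> element, or None if it is not set explicitly."""
    root = ET.fromstring(solrconfig_xml)
    for factory in root.iter("directoryFactory"):
        for child in factory:
            if child.get("name") == WRITE_KEY:
                return (child.text or "").strip()
    return None


# Hypothetical config fragment, for illustration only.
SAMPLE = """<config>
  <directoryFactory name="DirectoryFactory" class="solr.HdfsDirectoryFactory">
    <bool name="solr.hdfs.blockcache.enabled">true</bool>
    <bool name="solr.hdfs.blockcache.write.enabled">true</bool>
  </directoryFactory>
</config>"""

if __name__ == "__main__":
    print(blockcache_write_setting(SAMPLE))
```

The function returns the raw text of the element, so a value written as a property substitution such as ${solr.hdfs.blockcache.write.enabled:false} comes back unresolved. In that case, also check the system properties the Solr nodes are started with.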

> Index Corruption on HDFS whenever online bulk indexing (from Hive)
> ------------------------------------------------------------------
>
>                 Key: SOLR-7255
>                 URL: https://issues.apache.org/jira/browse/SOLR-7255
>             Project: Solr
>          Issue Type: Bug
>    Affects Versions: 4.10.3
>         Environment: HDP 2.2 / HDP Search + LucidWorks hadoop-lws-job.jar
>            Reporter: Hari Sekhon
>            Priority: Blocker
>
> When running SolrCloud on HDFS and using the LucidWorks hadoop-lws-job.jar to 
> index a Hive table (620M rows) to Solr it runs for about 1500 secs and then 
> gets this exception:
> {code}
> Exception in thread "Lucene Merge Thread #2191" org.apache.lucene.index.MergePolicy$MergeException: org.apache.lucene.index.CorruptIndexException: codec header mismatch: actual header=1494817490 vs expected header=1071082519 (resource: BufferedChecksumIndexInput(_r3.nvm))
>         at org.apache.lucene.index.ConcurrentMergeScheduler.handleMergeException(ConcurrentMergeScheduler.java:549)
>         at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:522)
> Caused by: org.apache.lucene.index.CorruptIndexException: codec header mismatch: actual header=1494817490 vs expected header=1071082519 (resource: BufferedChecksumIndexInput(_r3.nvm))
>         at org.apache.lucene.codecs.CodecUtil.checkHeader(CodecUtil.java:136)
>         at org.apache.lucene.codecs.lucene49.Lucene49NormsProducer.<init>(Lucene49NormsProducer.java:75)
>         at org.apache.lucene.codecs.lucene49.Lucene49NormsFormat.normsProducer(Lucene49NormsFormat.java:112)
>         at org.apache.lucene.index.SegmentCoreReaders.<init>(SegmentCoreReaders.java:127)
>         at org.apache.lucene.index.SegmentReader.<init>(SegmentReader.java:108)
>         at org.apache.lucene.index.ReadersAndUpdates.getReader(ReadersAndUpdates.java:145)
>         at org.apache.lucene.index.BufferedUpdatesStream.applyDeletesAndUpdates(BufferedUpdatesStream.java:282)
>         at org.apache.lucene.index.IndexWriter._mergeInit(IndexWriter.java:3951)
>         at org.apache.lucene.index.IndexWriter.mergeInit(IndexWriter.java:3913)
>         at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3766)
>         at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:409)
>         at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:486)
> {code}
> So I deleted the whole index, re-created it and re-ran the job to send the Hive 
> table contents to Solr again, and it hit exactly the same exception as the 
> first time, after a large number of updates had been sent to Solr.
> I moved off HDFS to a normal dataDir backend and then re-indexed the full 
> table in 2 hours successfully, without any index corruption.
> This implies that there is some sort of stability issue in the HDFS 
> DirectoryFactory implementation.
> Regards,
> Hari Sekhon
> http://www.linkedin.com/in/harisekhon



