Hi Bruno,

As an aside, in general you'd want your staging (pre-prod) Solr instance to match your production Solr instance in every way possible, including the Solr version.
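One way to check that two instances report the same version is to query each one's system info endpoint (`/admin/info/system`) and compare the `solr-spec-version` field. The following is only a sketch: the helper names are made up for illustration, and the base URLs in the comment are placeholders, not real hosts from this thread.

```python
import json
from urllib.request import urlopen


def extract_version(info: dict) -> str:
    """Pull the Solr version string out of a parsed system-info response."""
    return info["lucene"]["solr-spec-version"]


def fetch_system_info(base_url: str) -> dict:
    """Fetch and parse the system info of one Solr instance.

    base_url is assumed to look like "http://host:8983/solr" (placeholder).
    """
    with urlopen(f"{base_url}/admin/info/system?wt=json", timeout=10) as resp:
        return json.load(resp)


def versions_match(staging_info: dict, prod_info: dict) -> bool:
    """True when both instances report the same Solr version."""
    return extract_version(staging_info) == extract_version(prod_info)


# Hypothetical usage (placeholder hosts, not executed here):
#   staging = fetch_system_info("http://staging-host:8983/solr")
#   prod = fetch_system_info("http://prod-host:8983/solr")
#   print(versions_match(staging, prod))
```

With the versions mentioned in this thread (5.5.1 on pre-production, 8.11.3 on production), such a check would report a mismatch.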
Another thought is to have several indexing machines, each pointing at a portion of those 200M text files, to speed up indexing the entire corpus.

Cheers
Robi

On Sat, Apr 5, 2025 at 4:08 AM Bruno Mannina <bmann...@matheo-software.com> wrote:

> Hi Colvin,
>
> Thanks for your answer and your link; I will see if I can solve my problem.
>
> I use an old Solr, I know :'(.
> This old version has been in use for several years and I have a huge data
> set (around 200M text files to index).
> Re-indexing my data set would take too much time for me (several weeks).
>
> It's a pre-production Solr (I use Solr 8.11.3 in production).
> This pre-production instance is used to check data before loading it into
> production.
>
> Best Regards
> Bruno Mannina
> www.matheo-software.com
> www.patent-pulse.com
> Mob. +33 0 634 421 817
>
> -----Original Message-----
> From: Colvin Cowie [mailto:colvin.cowie....@gmail.com]
> Sent: Friday, 4 April 2025 11:57
> To: users@solr.apache.org
> Subject: Re: Solr error...
>
> Hello,
>
> I think we might need some more context here, that is to say, why are you
> using Solr 5.5.1? That was released in 2016 and is very much out of date
> and unsupported (and will contain a number of critical CVEs).
> So rather than trying to make it work, can you instead move to the latest
> release (9.8.1)? A lot of things have changed in the last 9 years, so maybe
> consider it as a fresh start?
>
> By the sounds of the error, the *file* is corrupt now; that doesn't mean
> the disk is corrupt. The reason why that happened is probably not going
> to be apparent, though if you go back through your logs you might identify
> the cause.
> A little googling of org.apache.lucene.index.CorruptIndexException
> suggests that you may be able to "fix" the corrupt index (and lose the
> corrupted documents in the process): https://stackoverflow.com/a/14934177
>
> But I would seriously recommend that you move to a supported version and
> reindex your data from source instead either way.
>
> On Thu, 3 Apr 2025 at 23:58, Bruno Mannina <bmann...@matheo-software.com>
> wrote:
>
> > Hi All,
> >
> > On my new computer, running Solr 5.5.1, I have a collection with an error.
> > My new computer is 1.5 years old (4 × 4 TB NVMe).
> >
> > I checked my disks and found no errors?!
> >
> > Do you know if I can do something to solve it?
> >
> > Many thanks for your help!
> >
> > The error message is:
> >
> > java.lang.IllegalStateException: this writer hit an unrecoverable error; cannot complete commit
> >     at org.apache.lucene.index.IndexWriter.finishCommit(IndexWriter.java:2985)
> >     at org.apache.lucene.index.IndexWriter.commitInternal(IndexWriter.java:2970)
> >     at org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:2930)
> >     at org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:619)
> >     at org.apache.solr.update.UpdateLog$LogReplayer.doReplay(UpdateLog.java:1464)
> >     at org.apache.solr.update.UpdateLog$LogReplayer.run(UpdateLog.java:1264)
> >     at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
> >     at java.util.concurrent.FutureTask.run(Unknown Source)
> >     at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
> >     at java.util.concurrent.FutureTask.run(Unknown Source)
> >     at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor$1.run(ExecutorUtil.java:231)
> >     at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
> >     at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
> >     at java.lang.Thread.run(Unknown Source)
> > Caused by: org.apache.lucene.index.CorruptIndexException: checksum failed (hardware problem?) : expected=d0a2833f actual=64e63211 (resource=BufferedChecksumIndexInput(MMapIndexInput(path="C:\Users\Utilisateur\INDEX\FTCLAIMS\index\_8znd.cfs") [slice=_8znd.fdt]))
> >     at org.apache.lucene.codecs.CodecUtil.checkFooter(CodecUtil.java:334)
> >     at org.apache.lucene.codecs.CodecUtil.checksumEntireFile(CodecUtil.java:451)
> >     at org.apache.lucene.codecs.compressing.CompressingStoredFieldsReader.checkIntegrity(CompressingStoredFieldsReader.java:669)
> >     at org.apache.lucene.codecs.compressing.CompressingStoredFieldsWriter.merge(CompressingStoredFieldsWriter.java:595)
> >     at org.apache.lucene.index.SegmentMerger.mergeFields(SegmentMerger.java:177)
> >     at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:83)
> >     at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4075)
> >     at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3655)
> >     at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:588)
> >     at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:626)
> >
> > Best Regards
> > Bruno Mannina
> > www.matheo-software.com
> > www.patent-pulse.com
> > Mob. +33 0 634 421 817
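
A note on the suggestion at the top of this thread to split the 200M text files across several indexing machines: one simple way to do that is to assign each file to an indexer by hashing its path, so every machine can work out its own share without coordination. This is only a sketch under assumptions; `shard_for` and `partition` are hypothetical helper names, the number of indexers is arbitrary, and the step that actually sends documents to Solr is left out.

```python
import hashlib
from collections import defaultdict


def shard_for(path: str, num_indexers: int) -> int:
    """Map a file path to an indexer number in [0, num_indexers).

    A stable hash (MD5 here, used only for distribution, not security)
    gives the same assignment on every machine and on every run.
    """
    digest = hashlib.md5(path.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_indexers


def partition(paths, num_indexers: int):
    """Group paths into one list per indexer, preserving input order."""
    shards = defaultdict(list)
    for path in paths:
        shards[shard_for(path, num_indexers)].append(path)
    return [shards[i] for i in range(num_indexers)]
```

Each indexing machine would then process only the files whose `shard_for` value equals its own number, so no file is indexed twice and none is skipped.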