Advice not specific to Solr: keep the staging system and the production
system on the same version. Running them on different versions, especially
massively different ones, does not make sense.

The Solr API has changed over the past years (there is now a V2 API),
configuration directives have been introduced and deprecated, and defaults
have changed. Testing against an ancient version is basically of no value.

Mag.phil. Robert Ehrenleitner, BEng.
IT-Services
Paris Lodron University of Salzburg, Austria




________________________________
From: Thomas Corthals
Sent: Monday, 07 April 2025 10:31
To: users@solr.apache.org
Subject: Re: Solr error...

Another speed-up is sending the updates in batches if you don't do this
already. Instead of making 200M requests with 1 document, try doing it in
1M requests that send 200 documents each.

If you're sending the updates as XML, look into switching over to JSON.
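
A minimal sketch of the batching and JSON suggestions above, assuming the
documents are already available as Python dicts. The host, port, core name
(`mycore`) and all function names are placeholders for illustration and are
not taken from this thread. It posts each batch as a JSON array to the update
handler; error handling and retries are left out.

```python
import json
import urllib.request
from itertools import islice

# Assumption: host, port and core name are placeholders, not from this thread.
SOLR_UPDATE_URL = "http://localhost:8983/solr/mycore/update"


def batched(docs, size):
    """Yield lists of at most `size` documents from any iterable."""
    it = iter(docs)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def encode_batch(batch):
    """Serialize a batch as a JSON array of documents (UTF-8 bytes)."""
    return json.dumps(batch).encode("utf-8")


def send_batch(batch, url=SOLR_UPDATE_URL):
    """POST one batch to the update handler and return the HTTP status code."""
    request = urllib.request.Request(
        url,
        data=encode_batch(batch),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.status


def index_all(docs, batch_size=200, send=send_batch):
    """Send every document in batches; returns the number of requests made."""
    requests_made = 0
    for batch in batched(docs, batch_size):
        send(batch)
        requests_made += 1
    return requests_made
```

Because `batched` works on any iterable, the documents can come from a
generator that reads the text files lazily, so the whole corpus never has to
sit in memory. Commits are deliberately not issued per batch in this sketch;
they would be sent separately or left to autoCommit, if that is configured.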

Thomas


On Mon, 7 Apr 2025 at 02:28, Walter Underwood <wun...@wunderwood.org> wrote:

> Multi-threaded indexing can speed things up. Use two threads per CPU
> to get maximum throughput. I wrote a simple Python program to do that.
>
> wunder
> Walter Underwood
> wun...@wunderwood.org
> http://observer.wunderwood.org/ (my blog)
>
> > On Apr 6, 2025, at 5:11 PM, Robi Petersen <robip...@gmail.com> wrote:
> >
> > Hi Bruno,
> >
> > As an aside, in general you'd want your staging (pre-prod) Solr instance
> > to exactly match your production Solr instance in every way possible
> > (such as the Solr version).
> >
> > Another thought is to have several indexing machines, each pointing at a
> > portion of those 200M textfiles, to speed up indexing the entire corpus.
> >
> > Cheers
> > Robi
> >
> > On Sat, Apr 5, 2025 at 4:08 AM Bruno Mannina <bmann...@matheo-software.com> wrote:
> >
> >> Hi Colvin,
> >>
> >> Thanks for your answer and your link; I will see if I can solve my
> >> problem.
> >>
> >> I use an old Solr, I know :'(.
> >> This old version has been in use for several years and I have a huge
> >> data set (around 200M text files to index).
> >> Re-indexing my data set would take too much time for me (several weeks).
> >>
> >> It's a pre-production Solr (I use Solr 8.11.3 in production).
> >> This pre-production instance is used to check data before loading it
> >> into production.
> >>
> >>
> >> Best Regards
> >> Bruno Mannina
> >> http://www.matheo-software.com/
> >> http://www.patent-pulse.com/
> >> Mob. +33 0 634 421 817
> >>
> >>
> >> -----Original Message-----
> >> From: Colvin Cowie [mailto:colvin.cowie....@gmail.com]
> >> Sent: Friday, 4 April 2025 11:57
> >> To: users@solr.apache.org
> >> Subject: Re: Solr error...
> >>
> >> Hello,
> >>
> >> I think we might need some more context here, that is to say, why are
> >> you using Solr 5.5.1? That was released in 2016 and is very much out of
> >> date and unsupported (and will contain a number of critical CVEs).
> >> So rather than trying to make it work, can you instead move to the
> >> latest release (9.8.1)? A lot of things have changed in the last 9
> >> years, so maybe consider it as a fresh start?
> >>
> >> By the sounds of the error, the *file* is corrupt now; that doesn't mean
> >> the disk is corrupt. The reason why that happened is probably not going
> >> to be apparent, though if you go back through your logs you might
> >> identify the cause.
> >> A little googling of org.apache.lucene.index.CorruptIndexException
> >> suggests that you may be able to "fix" the corrupt index (and lose the
> >> corrupted documents in the process):
> >> https://stackoverflow.com/a/14934177
> >>
> >> But I would seriously recommend that you move to a supported version and
> >> reindex your data from source instead either way.
> >>
> >>
> >>
> >> On Thu, 3 Apr 2025 at 23:58, Bruno Mannina <bmann...@matheo-software.com> wrote:
> >>
> >>> Hi All,
> >>>
> >>> On my new computer, a collection in Solr (5.5.1) has an error.
> >>> The computer is 1.5 years old (4 x 4 TB NVMe).
> >>>
> >>> I checked my disks and found no errors.
> >>>
> >>> Do you know if I can do something to solve it?
> >>>
> >>> Many thanks for your help!
> >>>
> >>> The error message is:
> >>>
> >>> java.lang.IllegalStateException: this writer hit an unrecoverable error; cannot complete commit
> >>>         at org.apache.lucene.index.IndexWriter.finishCommit(IndexWriter.java:2985)
> >>>         at org.apache.lucene.index.IndexWriter.commitInternal(IndexWriter.java:2970)
> >>>         at org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:2930)
> >>>         at org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:619)
> >>>         at org.apache.solr.update.UpdateLog$LogReplayer.doReplay(UpdateLog.java:1464)
> >>>         at org.apache.solr.update.UpdateLog$LogReplayer.run(UpdateLog.java:1264)
> >>>         at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
> >>>         at java.util.concurrent.FutureTask.run(Unknown Source)
> >>>         at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
> >>>         at java.util.concurrent.FutureTask.run(Unknown Source)
> >>>         at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor$1.run(ExecutorUtil.java:231)
> >>>         at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
> >>>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
> >>>         at java.lang.Thread.run(Unknown Source)
> >>> Caused by: org.apache.lucene.index.CorruptIndexException: checksum failed (hardware problem?) : expected=d0a2833f actual=64e63211 (resource=BufferedChecksumIndexInput(MMapIndexInput(path="C:\Users\Utilisateur\INDEX\FTCLAIMS\index\_8znd.cfs") [slice=_8znd.fdt]))
> >>>         at org.apache.lucene.codecs.CodecUtil.checkFooter(CodecUtil.java:334)
> >>>         at org.apache.lucene.codecs.CodecUtil.checksumEntireFile(CodecUtil.java:451)
> >>>         at org.apache.lucene.codecs.compressing.CompressingStoredFieldsReader.checkIntegrity(CompressingStoredFieldsReader.java:669)
> >>>         at org.apache.lucene.codecs.compressing.CompressingStoredFieldsWriter.merge(CompressingStoredFieldsWriter.java:595)
> >>>         at org.apache.lucene.index.SegmentMerger.mergeFields(SegmentMerger.java:177)
> >>>         at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:83)
> >>>         at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4075)
> >>>         at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3655)
> >>>         at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:588)
> >>>         at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:626)
> >>>
> >>> Best Regards
> >>> Bruno Mannina
> >>> http://www.matheo-software.com/
> >>> http://www.patent-pulse.com/
> >>> Mob. +33 0 634 421 817
>
>
