No problem, I've been trying to get my head around how it all works myself!
As per https://solr.apache.org/guide/8_9/working-with-external-files-and-processes.html our schema defines a field type:

    <fieldType name="fileboost" keyField="id" defVal="1" stored="false" indexed="false" class="solr.ExternalFileField"/>

which is then used to define a field:

    <field name="boostvalue" type="fileboost"/>

which pulls data from a file, external_boostvalue, living in $SOLR_HOME/data.

This is used to set a boost value that increases the visibility of some search results. Setting this file to be empty completely removes the performance hit, which otherwise takes several minutes to resolve after each replication. But we do still need the functionality, and I'm unclear on why this is an issue for 8.9 when it wasn't for 8.3.

Hope this clarifies the problem!

Dominic

On Mon, 25 Oct 2021 at 19:03, Charlie Hull <ch...@opensourceconnections.com> wrote:

> Hi Dominic,
>
> Could you clarify what you mean by boost files in this context? Just curious....
>
> Charlie
>
> On 25/10/2021 17:11, Dominic Humphries wrote:
> > Performance with the replica pulling from 8.3.1 was actually worse. And looking at the data in the databases and the boost file contents, I'm dubious it's a problem of incompatible boost files. I think the performance of importing/applying the boosts really is what's responsible for the issue we see. Not sure what else to test to verify or disprove this..
> >
> > On Mon, 25 Oct 2021 at 14:56, Dominic Humphries <domi...@adzuna.com> wrote:
> >
> >> I think I found it!
> >>
> >> I didn't realise, but we have boost files for the core I'm testing and the boost is applied after replication! Setting the contents of the files to empty completely removes the post-replication performance problem we were seeing.
> >>
> >> So now my question becomes "Why is boosting taking so much longer for the upgrade?"
> >>
> >> Since the upgrade has its own independent set of data, I'm wondering if it's as simple as the IDs it's trying to boost don't exist and it takes longer to find out an item is missing than it does to find one that does? I believe I can point an 8.9.0 follower at an 8.3.1 leader, that seems like the next logical step - if there's no performance hit when it has the same data as the 8.3.1 replica, then that's almost certainly the problem.
> >>
> >> Fingers crossed!
> >>
> >> On Sun, 24 Oct 2021 at 10:26, Deepak Goel <deic...@gmail.com> wrote:
> >>
> >>> There could be some testing and cooling happening post-replication. will have to dig a bit more into the code.
> >>>
> >>> Deepak
> >>>
> >>> On Thu, Oct 21, 2021 at 9:57 PM Dominic Humphries <domi...@adzuna.com.invalid> wrote:
> >>>
> >>>> One more tidbit: I just tried leaving replication off for a few hours and then triggering a "big" replication run so I could see the distinct stages.
> >>>>
> >>>> - Beginning replication didn't cause any performance degradation.
> >>>> - Several minutes of downloading the replication files saw no degradation
> >>>> - Only after downloading had completed did we start to see performance issues in our tests
> >>>> - But we saw the "number of docs/timestamp of latest file" both jump almost immediately after downloading completed and never move again
> >>>> - But the performance degradation continued for about seven more minutes even though replication was clearly finished at this point
> >>>>
> >>>> Is there some kind of re-indexing optimization thing that solr can run post-replication? At this point it's about my only remaining suspect..
>
> --
> Charlie Hull - Managing Consultant at OpenSource Connections Limited <www.o19s.com>
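
P.S. For anyone wanting to test the "boosted IDs don't exist in the index" theory from the thread, here is a minimal sketch for sanity-checking an external_boostvalue file. It assumes the file is a plain list of key=value lines, one per document ID. The function names and the known_ids set (which would have to be exported from the core separately) are hypothetical and not part of Solr. The sortedness check uses a plain string sort, which is only an approximation of the index's term order.

```python
from typing import Dict, Iterable, List, Set, Tuple


def parse_boost_lines(lines: Iterable[str]) -> List[Tuple[str, float]]:
    """Parse 'key=value' lines, skipping blank and malformed entries."""
    entries: List[Tuple[str, float]] = []
    for raw in lines:
        line = raw.strip()
        if not line or "=" not in line:
            continue
        # Split on the last '=' so that keys containing '=' stay intact.
        key, _, value = line.rpartition("=")
        try:
            entries.append((key, float(value)))
        except ValueError:
            # Value is not a number: ignore the line.
            continue
    return entries


def summarise(entries: List[Tuple[str, float]], known_ids: Set[str]) -> Dict[str, object]:
    """Report entry count, whether keys are in sorted order, and how many
    keys are absent from known_ids (hypothetical: the IDs in the index)."""
    keys = [key for key, _ in entries]
    return {
        "entries": len(keys),
        "sorted_by_key": keys == sorted(keys),
        "missing_from_index": sum(1 for key in keys if key not in known_ids),
    }
```

To use it, read the file with open(path) and pass the file object to parse_boost_lines, then call summarise with the set of IDs known to be in the core. A high missing_from_index count on the 8.9 data but not on the 8.3 data would support the theory. A low count on both would point elsewhere.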