About a year ago, we were in a similar position, running Solr v7 and using 
built-in DIH.

We upgraded to Solr v9.4 and updated DIH using the contrib package from the 
SearchScale GitHub repo.  It was a very successful upgrade.

(Solr v9.5 was available at the time, but in early 2024, the latest SearchScale 
DIH release was for v9.4.)

We did need to update our DIH definitions to remove the JavaScript transformers 
we had been using, and rework them into the SQL and built-in Solr transformers, 
since the contrib package does not (or did not) support JavaScript transformers.

I'm not familiar with the custom code issue you describe, but you can 
successfully use DIH with Solr v9.  I see that SearchScale now has releases for 
v9.6 and v9.7.  Based on our v9.4 upgrade, I would assume that DIH can be used 
just fine with Solr v9.6 or v9.7.

HOWEVER, we've just recently taken on a use case where we need near-real-time 
updates to the Solr index.  The way we use DIH (essentially, a nightly 
re-indexing of all of our Solr cores -- which is okay for our situation because 
each core takes only a few minutes worst-case) is not easily amenable to 
on-the-fly index updates.  We found that replacing DIH with an importer (a 
basic ETL, as others have alluded to in other replies in this thread) was much 
simpler than re-working our use of DIH, and has the benefit of moving us away 
from the DIH add-on.
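
For illustration only (this is a minimal sketch, not the importer described 
above): a basic importer can be little more than "read rows, reshape them, 
POST them to the core's JSON update endpoint."  The Solr URL, the core name 
"mycore", the field names, and the 5-second commitWithin value are all 
assumptions made up for the example; the row-reshaping function stands in for 
whatever a DIH transformer used to do.

```python
import json
import urllib.request
from urllib.parse import urlencode

SOLR_URL = "http://localhost:8983/solr"  # assumption: local Solr instance
CORE = "mycore"                          # hypothetical core name


def rows_to_docs(rows):
    """Reshape source rows into Solr documents.

    This is where logic that used to live in a DIH transformer would go.
    The field names (id, title, tags) are hypothetical.
    """
    docs = []
    for row in rows:
        docs.append({
            "id": str(row["id"]),
            "title": row["title"].strip(),
            # Split a comma-separated column into a multi-valued field.
            "tags": [t.strip() for t in row["tags"].split(",") if t.strip()],
        })
    return docs


def build_update_request(docs, commit_within_ms=5000):
    """Build a POST to the core's /update handler with a JSON array of docs."""
    query = urlencode({"commitWithin": commit_within_ms})
    url = f"{SOLR_URL}/{CORE}/update?{query}"
    body = json.dumps(docs).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def push(docs):
    """Send the documents to Solr and return the HTTP status code."""
    with urllib.request.urlopen(build_update_request(docs)) as resp:
        return resp.status
```

Calling push(rows_to_docs(rows)) whenever a source record changes is one way 
to get on-the-fly updates; the same two functions can also be run over a full 
table for a nightly re-index.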

(I want to emphasize that we don't have any problems at all with the 
SearchScale DIH add-on.  The only motivations to move away from it are (a) it's 
not always at the latest Solr version, and (b) the general motivation to reduce 
the number of moving parts.)

In the end, even though the use of the SearchScale DIH has been quite 
successful, if a year ago we could have seen a year into the future, we 
probably would have leapfrogged the DIH update, and gone to the ETL / importer.  
Thus, based on our experience, I recommend (as others have) dropping DIH and 
implementing your own importer.  I am certain that it will be less effort 
overall.

Andrew Witt
Senior Software Engineer II
Learning A-Z, a Cambium Learning® Group Company
andrew.w...@learninga-z.com

-----Original Message-----
From: Sarah Weissman <sweiss...@stsci.edu> 
Sent: Thursday, May 29, 2025 12:43 PM
To: users@solr.apache.org
Subject: Advice on ways forward with or without Data Import Handler

Hi all,

We’ve been using Solr with DIH for about 8 years or so but now we’re hitting an 
impasse with DIH being deprecated in Solr 9. Additionally, I’m looking to move 
our Solr deploy to Kubernetes and I’ve been struggling to figure out what to do 
with the DIH component in a cloud setting. I was hoping to get something that 
replicates our current setup up and running pretty quickly, but our DIH 
implementation has some custom code and I’m unable to get the jar dependency to 
load as a runtime library from the blob store with Solr 8. Maybe this isn’t 
possible with DIH? I’ve never used the runtimelib feature before and I have 
been unable to get the examples from the docs to work because the jars are too 
old. The next thing I would try is building my own custom image of Solr that 
includes the jar I need, but I’m also hesitant to spend a bunch more time on 
making deprecated features in Solr 8 work.

Unfortunately, I’ve also been unable to get the new DIH 3rd party plugin to 
work with Solr 9 and I’ve found the plugin commands with the solr script to be 
pretty finicky and the syntax changes between 8 and 9 frustrating as I switch 
between versions trying to get something to work as documented. I’m really not 
in a position where writing my own plugin is feasible at this point.

I’ve been banging my head against this all week and I’m trying to figure out 
the best way forward at this point. Is DIH still a viable option or should I be 
moving off of that to something else? Any advice or perspectives on this would be 
appreciated.

Thanks
Sarah
