Just wanted to put in a +1 for this idea!  Years ago Jan did a spike for this 
in Solr (https://issues.apache.org/jira/browse/SOLR-10665), and I was really 
excited about it.  We ended up going with a home-grown approach, and the 
results today are, in my opinion, about what you’d expect from a home-grown 
solution.  We have a cool but slightly orphaned package manager, and we’ve 
iterated on how to store plugins at least twice.  

If I had to do it all over again, I’d go back to using PF4J.  I love how few 
dependencies it brings and how strongly focused it is!

I will watch this effort with interest, and if it succeeds, it might rekindle 
my enthusiasm for pushing this in Solr.

Eric


> On Aug 24, 2024, at 1:09 PM, Nicholas DiPiazza <nicholas.dipia...@gmail.com> 
> wrote:
> 
> Dear Tika Devs:
> 
> Tika Pipes in production had a blocking problem for my people: the
> extensible Fetcher objects we load into the Tika Server and Tika gRPC
> Server had classpath loading issues with other Fetchers. They need to be
> fully classpath-independent of each other.
> 
> In order to fix this, I am attempting to introduce pf4j in this pull:
> 
> https://github.com/apache/tika/pull/1906
> 
> In this pull, the shade plugin goes completely bye-bye in favor of Maven
> dependency plugin and assembly plugin.
> 
> All Fetchers are now loaded via the plugin manager, and each one's classpath
> is pulled in dynamically with a classloader separate from those of other
> Fetchers.
> 
> Great.
> 
> Some changes come as a result:
> 
> So now, instead of having <fetcher> in the Tika configuration, it's actually
> <fetcherConfig>, because we don't need a full copy of the Fetcher anymore.
> 
> So now the fetcherConfig is the only thing stored in the Tika Config and
> the pf4j plugin manager handles loading the correct Fetcher, and then you
> send it the configuration that it requires.
> 
> So now I'm going into the Tika XML serialization stuff, where I need to
> place the FetcherConfig to replace the Fetcher objects previously stored
> there.
> 
> I figured this is a good time to take a step back and share with everyone.
> I would like to do a quick zoom with Tim and others to review the PR and
> discuss how to gracefully make that change to the Tika serialization stuff
> so that I don't step on anyone else's intentions.
> 
> After this is merged, I'd like to build another RC so I can see if the
> issues reported by users are fixed.
> 
> -Nicholas

_______________________
Eric Pugh | Founder | OpenSource Connections, LLC | 434.466.1467 | 
http://www.opensourceconnections.com <http://www.opensourceconnections.com/> | 
My Free/Busy <http://tinyurl.com/eric-cal>  
Co-Author: Apache Solr Enterprise Search Server, 3rd Ed 
<https://www.packtpub.com/big-data-and-business-intelligence/apache-solr-enterprise-search-server-third-edition-raw>
    
