Tilman found a good example of why we should try to move to pf4j for
the tika-pipes modules (at least):
https://github.com/apache/tika/blob/main/tika-parent/pom.xml#L345
https://github.com/apache/tika/blob/main/tika-parent/pom.xml#L383

We can't upgrade to cxf 4.1.0/jetty 12 because of the Solr
dependencies in the solr-pipes modules.
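
For anyone skimming the thread, here is a rough sketch of the config-only
design Nicholas describes below: only a small fetcher config is stored, and
a plugin registry resolves the matching Fetcher and hands it that config.
This is an illustration under assumptions, not the code in the PR. It does
not use pf4j; a plain map stands in for the plugin manager, per-plugin
classloaders are not modelled, and every name here (SketchFetcher,
SketchFetcherConfig, SketchPluginRegistry, "fs-fetcher", "basePath") is
hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical stand-in for a Fetcher extension point; not Tika's real interface.
interface SketchFetcher {
    String fetch(String fetchKey, Map<String, String> config);
}

// Hypothetical: the only thing kept in the Tika config is this small settings object.
final class SketchFetcherConfig {
    final String pluginId;
    final Map<String, String> settings;

    SketchFetcherConfig(String pluginId, Map<String, String> settings) {
        this.pluginId = pluginId;
        this.settings = settings;
    }
}

// Plain-map stand-in for the plugin manager. With pf4j, each plugin would also
// get its own classloader; that part is not modelled here.
final class SketchPluginRegistry {
    private final Map<String, Supplier<SketchFetcher>> factories = new HashMap<>();

    void register(String pluginId, Supplier<SketchFetcher> factory) {
        factories.put(pluginId, factory);
    }

    // Resolve the fetcher named by the config, then hand it that config.
    String fetch(SketchFetcherConfig config, String fetchKey) {
        Supplier<SketchFetcher> factory = factories.get(config.pluginId);
        if (factory == null) {
            throw new IllegalArgumentException("No fetcher plugin: " + config.pluginId);
        }
        return factory.get().fetch(fetchKey, config.settings);
    }
}

final class FetcherConfigSketch {
    static String run() {
        SketchPluginRegistry registry = new SketchPluginRegistry();
        // A toy "file system" fetcher that just builds a path from its config.
        registry.register("fs-fetcher", () -> (key, cfg) -> cfg.get("basePath") + "/" + key);

        Map<String, String> settings = new HashMap<>();
        settings.put("basePath", "/data");
        return registry.fetch(new SketchFetcherConfig("fs-fetcher", settings), "a.pdf");
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints "/data/a.pdf"
    }
}
```

The point of the split is that the config object carries no Fetcher code, so
whatever loads the Fetcher is free to do so in isolation.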

On Mon, Aug 26, 2024 at 9:14 AM Eric Pugh
<ep...@opensourceconnections.com> wrote:
>
> Just wanted to put in a +1 for this idea!  Years ago Jan did a spike for
> this in Solr, https://issues.apache.org/jira/browse/SOLR-10665, and I was
> really excited about it.  We ended up going with a homegrown approach,
> and the results today are, well, in my opinion, kind of what you'd expect from
> a homegrown solution.  We have a cool, but slightly orphaned package
> manager, and we’ve iterated on how to store plugins at least twice.
>
> If I had to do it all over again, I’d go back to using PF4J.  I love how few
> dependencies it brings, and how it has a strong focus!
>
> I will watch this effort with interest and if it succeeds, it might reignite 
> my interest in pushing this for Solr.
>
> Eric
>
>
> > On Aug 24, 2024, at 1:09 PM, Nicholas DiPiazza 
> > <nicholas.dipia...@gmail.com> wrote:
> >
> > Dear Tika Devs:
> >
> > Tika pipes in production had a blocker for my people: the extensible
> > Fetcher objects we load into the Tika Server and Tika gRPC Server would
> > have classpath loading issues with other Fetchers. They need to be
> > fully classpath-independent of each other.
> >
> > In order to fix this, I am attempting to introduce pf4j in this pull request:
> >
> > https://github.com/apache/tika/pull/1906
> >
> > In this pull request, the shade plugin goes away completely in favor of the
> > Maven dependency plugin and assembly plugin.
> >
> > All Fetchers are now loaded via the plugin manager, and each one's classpath
> > is pulled in dynamically with a classloader separate from those of other
> > Fetchers.
> >
> > Great.
> >
> > Some changes come as a result:
> >
> > Instead of having <fetcher> in the Tika configuration, it's now
> > <fetcherConfig>, because we don't need a full copy of the Fetcher anymore.
> >
> > The fetcherConfig is the only thing stored in the Tika Config; the pf4j
> > plugin manager handles loading the correct Fetcher, and then you send it
> > the configuration that it requires.
> >
> > So now I'm going into the Tika XML serialization code, where I need to put
> > the FetcherConfig in place of the Fetcher objects previously stored there.
> >
> > I figured this is a good time to take a step back and share with everyone.
> > I would like to do a quick zoom with Tim and others to review the PR and
> > discuss how to gracefully make that change to the Tika serialization code
> > so that I don't step on anyone else's intentions for it.
> >
> > After this is merged, I'd like to build another RC so I can see if the
> > issues reported by users are fixed.
> >
> > -Nicholas
>
> _______________________
> Eric Pugh | Founder | OpenSource Connections, LLC | 434.466.1467 | 
> http://www.opensourceconnections.com <http://www.opensourceconnections.com/> 
> | My Free/Busy <http://tinyurl.com/eric-cal>
> Co-Author: Apache Solr Enterprise Search Server, 3rd Ed 
> <https://www.packtpub.com/big-data-and-business-intelligence/apache-solr-enterprise-search-server-third-edition-raw>
>
