What about introducing isolated class loaders, similar to the approach used by web servers? Perhaps OSGi bundles or something similar?
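To make the idea concrete, here is a minimal sketch of roughly the child-first lookup that servlet containers such as Tomcat apply per web application. It is only an illustration, not Spark code: the class name `ChildFirstClassLoader` and the `sharedPrefixes` parameter are hypothetical, and a real design would still have to decide which packages (Spark's public API, Scala, logging) stay shared with the parent.

```java
import java.net.URL;
import java.net.URLClassLoader;

/**
 * Hypothetical sketch: a class loader that searches its own URLs first and
 * only falls back to the parent, except for explicitly shared prefixes.
 */
final class ChildFirstClassLoader extends URLClassLoader {
    // Package prefixes that must always come from the parent (an assumption
    // of this sketch; "java." is always treated as shared below).
    private final String[] sharedPrefixes;

    ChildFirstClassLoader(URL[] urls, ClassLoader parent, String... sharedPrefixes) {
        super(urls, parent);
        this.sharedPrefixes = sharedPrefixes.clone();
    }

    private boolean isShared(String name) {
        if (name.startsWith("java.")) {
            return true;
        }
        for (String prefix : sharedPrefixes) {
            if (name.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        synchronized (getClassLoadingLock(name)) {
            Class<?> c = findLoadedClass(name);
            if (c == null) {
                if (isShared(name)) {
                    // Shared classes use the normal parent-first delegation.
                    c = super.loadClass(name, false);
                } else {
                    try {
                        // Isolated classes: look in this loader's own URLs first.
                        c = findClass(name);
                    } catch (ClassNotFoundException e) {
                        // Not in the child JARs, so fall back to the parent.
                        c = super.loadClass(name, false);
                    }
                }
            }
            if (resolve) {
                resolveClass(c);
            }
            return c;
        }
    }
}
```

With something like this, user JARs would go into the `urls` of the child loader, so a user's own copy of a library would win over the one on the parent class path, while anything under the shared prefixes would still resolve to the single copy in the parent.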

On Sat, 18 Jan 2025, 22:43, Holden Karau <holden.ka...@gmail.com> wrote:

> I would say the short answer is "mostly not" and the longer answer is that
> the connect APIs are explicitly not covering many, what we would call,
> "paved paths." Because we're more likely to have JAR conflicts with
> advanced users who are more likely to use some of the non-supported APIs.
> For example, some of our biggest JAR conflicts come from other platform
> teams which build platforms on top of Spark (thinking custom machine
> learning tools or special streaming stuff).
>
> It's sort of that classic problem of building something for the 90% but
> the 10% are the ones with the actual issue you're trying to avoid.
>
> On Sat, Jan 18, 2025 at 1:26 PM Denny Lee <denny.g....@gmail.com> wrote:
>
>> BTW, one of many reasons Spark Connect was developed was to potentially
>> simplify this process around shading (i.e. not need to do it). I’m
>> wondering if utilizing Spark Connect could be a potential solution here?
>>
>> On Fri, Jan 17, 2025 at 12:27 Holden Karau <holden.ka...@gmail.com>
>> wrote:
>>
>>> +1 I think this is great. If you’ve got any shading you’d be open to
>>> upstreaming I’d be happy to review it.
>>>
>>> Twitter: https://twitter.com/holdenkarau
>>> Fight Health Insurance: https://www.fighthealthinsurance.com/
>>> <https://www.fighthealthinsurance.com/?q=hk_email>
>>> Books (Learning Spark, High Performance Spark, etc.):
>>> https://amzn.to/2MaRAG9 <https://amzn.to/2MaRAG9>
>>> YouTube Live Streams: https://www.youtube.com/user/holdenkarau
>>> Pronouns: she/her
>>>
>>> On Fri, Jan 17, 2025 at 12:25 PM John Zhuge <jzh...@apache.org> wrote:
>>>
>>>> Thanks for sharing the insightful context!
>>>>
>>>> On Fri, Jan 17, 2025 at 11:47 AM Regina Lee <re...@linkedin.com.invalid>
>>>> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I’d like to share insights from our Spark team at LinkedIn. We
>>>>> recently moved to a mostly shaded Spark 3 client internally.
>>>>> Our goal was to minimize dependency conflicts that could hinder Spark
>>>>> upgrades, especially given our previous efforts to migrate our users
>>>>> from Spark 2 to Spark 3, and LinkedIn’s heavy Scala / Java use cases
>>>>> with complicated dependency trees. We shaded rather aggressively (100+
>>>>> relocations) given our specific ecosystem needs – Hadoop 2.10 with no
>>>>> current/planned support for Spark streaming / connect modules.
>>>>>
>>>>> At a high level, some notable shaded prefixes included org.json,
>>>>> com.google.common / protobuf, org.apache.commons, and org.antlr. Key
>>>>> dependencies *not* shaded were avro, jackson, datanucleus, logging /
>>>>> JRE / scala dependencies (in general, any dependencies exposed in
>>>>> Spark’s / other dependencies’ public APIs).
>>>>>
>>>>> There is an expected one-time cost in onboarding our Spark users to
>>>>> the shaded client. Most issues require importing missing dependencies
>>>>> originally provided by Spark/Hadoop. We are generally in favor of shading
>>>>> more of Spark’s dependencies because it has helped reduce developer toil
>>>>> and troubleshooting efforts.
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Regina
>>>>>
>>>>> On 2024/12/07 15:30:20 Mich Talebzadeh wrote:
>>>>> > General comment without specifics. I think shading should be used
>>>>> > *on a case by case basis* when the benefits outweigh the drawbacks.
>>>>> > How about exploring alternatives such as modularization, dependency
>>>>> > management, or careful dependency selection, before resorting to
>>>>> > shading? My point is that shading will introduce more debugging and
>>>>> > testing as packages will be renamed impacting flexibility. Case in
>>>>> > point, things like unit and integration tests may need adjustments to
>>>>> > account for the renamed packages.
>>>>> >
>>>>> > HTH
>>>>> >
>>>>> > Mich Talebzadeh,
>>>>> >
>>>>> > Architect | Data Science | Financial Crime | GDPR & Compliance Specialist
>>>>> > PhD <https://en.wikipedia.org/wiki/Doctor_of_Philosophy> Imperial College
>>>>> > London <https://en.wikipedia.org/wiki/Imperial_College_London>
>>>>> > London, United Kingdom
>>>>> >
>>>>> > view my Linkedin profile
>>>>> > <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>
>>>>> >
>>>>> > https://en.everybodywiki.com/Mich_Talebzadeh
>>>>> >
>>>>> > *Disclaimer:* The information provided is correct to the best of my
>>>>> > knowledge but of course cannot be guaranteed. It is essential to note
>>>>> > that, as with any advice, quote "one test result is worth one-thousand
>>>>> > expert opinions (Wernher von Braun
>>>>> > <https://en.wikipedia.org/wiki/Wernher_von_Braun>)".
>>>>> >
>>>>> > On Sat, 7 Dec 2024 at 06:21, Holden Karau <ho...@gmail.com> wrote:
>>>>> >
>>>>> > > Hi Y'all,
>>>>> > >
>>>>> > > As we're getting closer to 4.0 I was thinking now is a good time for us to
>>>>> > > try and reduce the class path we expose for JVM users. Are there any common
>>>>> > > classes/packages folks would like to see shaded?
>>>>> > >
>>>>> > > Cheers,
>>>>> > >
>>>>> > > Holden :)
>>>>> > >
>>>>
>>>> --
>>>> John Zhuge