What about introducing isolated class loaders, similar to the approach used
by web servers? Perhaps OSGi bundles or something similar?
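
To make the idea concrete, here is a minimal sketch of a child-first class loader in Java, roughly the delegation order servlet containers use for web applications. The class name `ChildFirstClassLoader`, the `isIsolated` helper, and the prefix list are hypothetical names chosen for illustration; none of them are Spark APIs, and this is plain `URLClassLoader` rather than OSGi.

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical child-first loader (illustration only, not a Spark API).
 * Classes under the configured prefixes are looked up in this loader's
 * own URLs before the parent is consulted.
 */
public class ChildFirstClassLoader extends URLClassLoader {
    private final List<String> isolatedPrefixes;

    public ChildFirstClassLoader(URL[] urls, ClassLoader parent, List<String> isolatedPrefixes) {
        super(urls, parent);
        this.isolatedPrefixes = new ArrayList<>(isolatedPrefixes);
    }

    /** True when the class name falls under one of the isolated prefixes. */
    public boolean isIsolated(String className) {
        for (String prefix : isolatedPrefixes) {
            if (className.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        if (!isIsolated(name)) {
            // Everything else keeps the normal parent-first delegation.
            return super.loadClass(name, resolve);
        }
        synchronized (getClassLoadingLock(name)) {
            Class<?> loaded = findLoadedClass(name);
            if (loaded == null) {
                try {
                    // Child first: search this loader's own JARs.
                    loaded = findClass(name);
                } catch (ClassNotFoundException e) {
                    // Not in our JARs: fall back to the parent.
                    return super.loadClass(name, resolve);
                }
            }
            if (resolve) {
                resolveClass(loaded);
            }
            return loaded;
        }
    }
}
```

A caller would pass the JARs that should win over the parent's copies, for example a private Guava, together with a prefix such as `com.google.common.`. Classes loaded this way are distinct types from the parent's copies, so instances cannot be passed across the boundary, which is the usual cost of this approach.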

On Sat, Jan 18, 2025, 22:43, Holden Karau <holden.ka...@gmail.com> wrote:

> I would say the short answer is "mostly not." The longer answer is that
> the Connect APIs explicitly do not cover many of what we would call
> "paved paths." That matters because we're more likely to have JAR
> conflicts with advanced users, who are more likely to use some of the
> non-supported APIs.
> For example, some of our biggest JAR conflicts come from other platform
> teams which build platforms on top of Spark (thinking custom machine
> learning tools or special streaming stuff).
>
> It's sort of that classic problem of building something for the 90% when
> the 10% are the ones with the actual issue you're trying to avoid.
>
> On Sat, Jan 18, 2025 at 1:26 PM Denny Lee <denny.g....@gmail.com> wrote:
>
>> BTW, one of the many reasons Spark Connect was developed was to potentially
>> simplify this process around shading (i.e. not needing to do it). I’m
>> wondering if utilizing Spark Connect could be a potential solution here?
>>
>>
>> On Fri, Jan 17, 2025 at 12:27 Holden Karau <holden.ka...@gmail.com>
>> wrote:
>>
>>> +1 I think this is great. If you’ve got any shading you’d be open to
>>> upstreaming I’d be happy to review it.
>>>
>>> Twitter: https://twitter.com/holdenkarau
>>> Fight Health Insurance: https://www.fighthealthinsurance.com/
>>> <https://www.fighthealthinsurance.com/?q=hk_email>
>>> Books (Learning Spark, High Performance Spark, etc.):
>>> https://amzn.to/2MaRAG9  <https://amzn.to/2MaRAG9>
>>> YouTube Live Streams: https://www.youtube.com/user/holdenkarau
>>> Pronouns: she/her
>>>
>>>
>>> On Fri, Jan 17, 2025 at 12:25 PM John Zhuge <jzh...@apache.org> wrote:
>>>
>>>> Thanks for sharing the insightful context!
>>>>
>>>> On Fri, Jan 17, 2025 at 11:47 AM Regina Lee <re...@linkedin.com.invalid>
>>>> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I’d like to share insights from our Spark team at LinkedIn. We
>>>>> recently moved to a mostly shaded Spark 3 client internally. Our goal was
>>>>> to minimize dependency conflicts that could hinder Spark upgrades,
>>>>> especially given our previous efforts to migrate our users from Spark 2 to
>>>>> Spark 3, and LinkedIn’s heavy Scala / Java use cases with complicated
>>>>> dependency trees. We shaded rather aggressively (100+ relocations) given
>>>>> our specific ecosystem needs – Hadoop 2.10 with no current/planned support
>>>>> for Spark streaming / connect modules.
>>>>>
>>>>> At a high level, some notable shaded prefixes included org.json,
>>>>> com.google.common / protobuf, org.apache.commons, and org.antlr. Key
>>>>> dependencies *not* shaded were avro, jackson, datanucleus, logging /
>>>>> JRE / scala dependencies (in general, any dependencies exposed in Spark’s /
>>>>> other dependencies’ public APIs).
>>>>>
>>>>> There is an expected one-time cost in onboarding our Spark users to
>>>>> the shaded client. Most issues require importing missing dependencies
>>>>> originally provided by Spark/Hadoop. We are generally in favor of shading
>>>>> more of Spark’s dependencies because it has helped reduce developer toil
>>>>> and troubleshooting efforts.
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Regina
>>>>>
>>>>> On 2024/12/07 15:30:20 Mich Talebzadeh wrote:
>>>>> > General comment without specifics. I think shading should be used *on a
>>>>> > case by case basis* when the benefits outweigh the drawbacks. How about
>>>>> > exploring alternatives such as modularization, dependency management, or
>>>>> > careful dependency selection, before resorting to shading? My point is that
>>>>> > shading will introduce more debugging and testing as packages will be
>>>>> > renamed impacting flexibility. Case in point, things like unit and
>>>>> > integration tests may need adjustments to account for the renamed packages.
>>>>> >
>>>>> > HTH
>>>>> >
>>>>> > Mich Talebzadeh,
>>>>> >
>>>>> > Architect | Data Science | Financial Crime | GDPR & Compliance Specialist
>>>>> > PhD <https://en.wikipedia.org/wiki/Doctor_of_Philosophy> Imperial College
>>>>> > London <https://en.wikipedia.org/wiki/Imperial_College_London>
>>>>> > London, United Kingdom
>>>>> >
>>>>> >
>>>>> >    view my Linkedin profile
>>>>> > <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>
>>>>> >
>>>>> >
>>>>> >  https://en.everybodywiki.com/Mich_Talebzadeh
>>>>> >
>>>>> >
>>>>> >
>>>>> > *Disclaimer:* The information provided is correct to the best of my
>>>>> > knowledge but of course cannot be guaranteed. It is essential to note
>>>>> > that, as with any advice, quote "one test result is worth one-thousand
>>>>> > expert opinions (Wernher von Braun
>>>>> > <https://en.wikipedia.org/wiki/Wernher_von_Braun>)".
>>>>> >
>>>>> >
>>>>> > On Sat, 7 Dec 2024 at 06:21, Holden Karau <ho...@gmail.com> wrote:
>>>>> >
>>>>> > > Hi Y'all,
>>>>> > >
>>>>> > > As we're getting closer to 4.0 I was thinking now is a good time for us to
>>>>> > > try and reduce the class path we expose for JVM users. Are there any common
>>>>> > > classes/packages folks would like to see shaded?
>>>>> > >
>>>>> > > Cheers,
>>>>> > >
>>>>> > > Holden :)
>>>>> > >
>>>>> >
>>>>>
>>>>
>>>>
>>>> --
>>>> John Zhuge
>>>>
>>>
>
