+1, I think this is great. If you’ve got any shading you’d be open to
upstreaming, I’d be happy to review it.

Twitter: https://twitter.com/holdenkarau
Fight Health Insurance: https://www.fighthealthinsurance.com/
<https://www.fighthealthinsurance.com/?q=hk_email>
Books (Learning Spark, High Performance Spark, etc.):
https://amzn.to/2MaRAG9  <https://amzn.to/2MaRAG9>
YouTube Live Streams: https://www.youtube.com/user/holdenkarau
Pronouns: she/her


On Fri, Jan 17, 2025 at 12:25 PM John Zhuge <jzh...@apache.org> wrote:

> Thanks for sharing the insightful context!
>
> On Fri, Jan 17, 2025 at 11:47 AM Regina Lee <re...@linkedin.com.invalid>
> wrote:
>
>> Hi,
>>
>> I’d like to share insights from our Spark team at LinkedIn. We recently
>> moved to a mostly shaded Spark 3 client internally. Our goal was to
>> minimize dependency conflicts that could hinder Spark upgrades, especially
>> given our previous efforts to migrate our users from Spark 2 to Spark 3,
>> and LinkedIn’s heavy Scala / Java use cases with complicated dependency
>> trees. We shaded rather aggressively (100+ relocations) given our specific
>> ecosystem needs – Hadoop 2.10 with no current/planned support for Spark
>> streaming / connect modules.
>>
>> At a high level, some notable shaded prefixes included org.json,
>> com.google.common / protobuf, org.apache.commons, and org.antlr. Key
>> dependencies *not* shaded were avro, jackson, datanucleus, logging / JRE
>> / scala dependencies (in general, any dependencies exposed in Spark’s /
>> other dependencies’ public APIs).
>>
>> There is an expected one-time cost in onboarding our Spark users to the
>> shaded client. Most issues require importing missing dependencies
>> originally provided by Spark/Hadoop. We are generally in favor of shading
>> more of Spark’s dependencies because it has helped reduce developer toil
>> and troubleshooting efforts.
>>
>> Thanks,
>>
>> Regina
>>
>> On 2024/12/07 15:30:20 Mich Talebzadeh wrote:
>> > General comment without specifics. I think shading should be used *on a
>> > case-by-case basis* when the benefits outweigh the drawbacks. How about
>> > exploring alternatives such as modularization, dependency management, or
>> > careful dependency selection before resorting to shading? My point is
>> > that shading will introduce more debugging and testing, as packages will
>> > be renamed, impacting flexibility. Case in point, things like unit and
>> > integration tests may need adjustments to account for the renamed
>> > packages.
>> >
>> > HTH
>> >
>> > Mich Talebzadeh,
>> >
>> > Architect | Data Science | Financial Crime | GDPR & Compliance Specialist
>> > PhD <https://en.wikipedia.org/wiki/Doctor_of_Philosophy> Imperial College
>> > London <https://en.wikipedia.org/wiki/Imperial_College_London>
>> > London, United Kingdom
>> >
>> >
>> >    view my Linkedin profile
>> > <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>
>> >
>> >
>> >  https://en.everybodywiki.com/Mich_Talebzadeh
>> >
>> >
>> >
>> > *Disclaimer:* The information provided is correct to the best of my
>> > knowledge but of course cannot be guaranteed. It is essential to note
>> > that, as with any advice, quote "one test result is worth one-thousand
>> > expert opinions (Wernher von Braun
>> > <https://en.wikipedia.org/wiki/Wernher_von_Braun>)".
>> >
>> >
>> > On Sat, 7 Dec 2024 at 06:21, Holden Karau <ho...@gmail.com> wrote:
>> >
>> > > Hi Y'all,
>> > >
>> > > As we're getting closer to 4.0 I was thinking now is a good time for
>> > > us to try and reduce the class path we expose for JVM users. Are there
>> > > any common classes/packages folks would like to see shaded?
>> > >
>> > > Cheers,
>> > >
>> > > Holden :)
>> > >
>> > >
>> >
>>
>
>
> --
> John Zhuge
>
