+1, I think this is great. If you've got any shading changes you'd be open to upstreaming, I'd be happy to review them.
Twitter: https://twitter.com/holdenkarau
Fight Health Insurance: https://www.fighthealthinsurance.com/ <https://www.fighthealthinsurance.com/?q=hk_email>
Books (Learning Spark, High Performance Spark, etc.): https://amzn.to/2MaRAG9 <https://amzn.to/2MaRAG9>
YouTube Live Streams: https://www.youtube.com/user/holdenkarau
Pronouns: she/her

On Fri, Jan 17, 2025 at 12:25 PM John Zhuge <jzh...@apache.org> wrote:

> Thanks for sharing the insightful context!
>
> On Fri, Jan 17, 2025 at 11:47 AM Regina Lee <re...@linkedin.com.invalid> wrote:
>
>> Hi,
>>
>> I'd like to share insights from our Spark team at LinkedIn. We recently
>> moved to a mostly shaded Spark 3 client internally. Our goal was to
>> minimize dependency conflicts that could hinder Spark upgrades, especially
>> given our previous efforts to migrate our users from Spark 2 to Spark 3,
>> and LinkedIn's heavy Scala / Java use cases with complicated dependency
>> trees. We shaded rather aggressively (100+ relocations) given our specific
>> ecosystem needs: Hadoop 2.10 with no current/planned support for Spark
>> streaming / connect modules.
>>
>> At a high level, some notable shaded prefixes included org.json,
>> com.google.common / protobuf, org.apache.commons, and org.antlr. Key
>> dependencies *not* shaded were avro, jackson, datanucleus, and logging /
>> JRE / scala dependencies (in general, any dependencies exposed in Spark's /
>> other dependencies' public APIs).
>>
>> There is an expected one-time cost in onboarding our Spark users to the
>> shaded client. Most issues require importing missing dependencies
>> originally provided by Spark/Hadoop. We are generally in favor of shading
>> more of Spark's dependencies because it has helped reduce developer toil
>> and troubleshooting efforts.
>>
>> Thanks,
>>
>> Regina
>>
>> On 2024/12/07 15:30:20 Mich Talebzadeh wrote:
>> > General comment without specifics. I think shading should be used *on a
>> > case-by-case basis* when the benefits outweigh the drawbacks. How about
>> > exploring alternatives such as modularization, dependency management, or
>> > careful dependency selection, before resorting to shading? My point is
>> > that shading will introduce more debugging and testing as packages will
>> > be renamed, impacting flexibility. Case in point, things like unit and
>> > integration tests may need adjustments to account for the renamed
>> > packages.
>> >
>> > HTH
>> >
>> > Mich Talebzadeh,
>> > Architect | Data Science | Financial Crime | GDPR & Compliance Specialist
>> > PhD <https://en.wikipedia.org/wiki/Doctor_of_Philosophy> Imperial College
>> > London <https://en.wikipedia.org/wiki/Imperial_College_London>
>> > London, United Kingdom
>> >
>> > view my Linkedin profile
>> > <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>
>> >
>> > https://en.everybodywiki.com/Mich_Talebzadeh
>> >
>> > *Disclaimer:* The information provided is correct to the best of my
>> > knowledge but of course cannot be guaranteed. It is essential to note
>> > that, as with any advice, quote "one test result is worth one-thousand
>> > expert opinions (Wernher von Braun
>> > <https://en.wikipedia.org/wiki/Wernher_von_Braun>)".
>> >
>> > On Sat, 7 Dec 2024 at 06:21, Holden Karau <ho...@gmail.com> wrote:
>> >
>> > > Hi Y'all,
>> > >
>> > > As we're getting closer to 4.0 I was thinking now is a good time for
>> > > us to try and reduce the class path we expose for JVM users. Are there
>> > > any common classes/packages folks would like to see shaded?
>> > >
>> > > Cheers,
>> > >
>> > > Holden :)
>
> --
> John Zhuge