That's an interesting idea, but it requires us to keep the Jackson version
consistent in two places instead of one. The original Jackson version has
to match the one shaded in the Iceberg Spark runtime. Every time we update
one, we have to remember to update the other. I'm not sure it improves the
situation.

Yufei


On Fri, Jun 20, 2025 at 11:43 AM Dmitri Bourlatchkov <di...@apache.org>
wrote:

> Hi Yun and Yufei,
>
> > Specifically, why does CreateGenericTableRESTRequest use the shaded
> Jackson?
>
> As discussed off list, request / response payload classes have to work with
> the version of Jackson included with the Iceberg Spark jars (because they
> own the RESTClient).
>
> That in itself is fine.
>
> I'd like to propose a different approach to implementing that in Polaris,
> though.
>
> Instead of compiling against relocated classes, why don't we compile
> against the original Jackson jar, and later relocate the Spark Client to
> "org.apache.iceberg.shaded.com.fasterxml.jackson.*" ?
>
> I believe Jackson is the only relocation concern.
>
> After relocation we can publish both the "thin" client for use with
> --packages in Spark, and the "fat" jar for use with --jars. Both artifacts
> will depend on the relocated Iceberg artifacts.
>
> WDYT?
>
> Cheers,
> Dmitri.
>
> On Fri, Jun 20, 2025 at 1:05 PM Dmitri Bourlatchkov <di...@apache.org>
> wrote:
>
> > Thanks for the quick response, Yun!
> >
> > > org.apache.polaris#polaris-core
> > > org.apache.iceberg#iceberg-spark-runtime-3.5_2.12
> >
> > IIRC, polaris-core uses Jackson. iceberg-spark-runtime also uses Jackson,
> > but it shades it.
> >
> > I believe I saw issues with using both shaded and non-shaded Jackson in
> > the same Spark env. with Iceberg.
> >
> > This may or may not be a concern for our Spark Client. What I mean is that
> > it may need some more consideration to be sure.
> >
> > Specifically, why does CreateGenericTableRESTRequest use the shaded
> > Jackson?
> >
> > WDYT?
> >
> > Thanks,
> > Dmitri.
> >
> > On Fri, Jun 20, 2025 at 12:47 PM yun zou <yunzou.colost...@gmail.com>
> > wrote:
> >
> >> *-- What is the Maven artifact that Spark can automatically pull
> >> (via --packages)*
> >>
> >> Our Spark client pulls the following:
> >>
> >> org.apache.polaris#polaris-spark-3.5_2.12
> >>
> >> org.apache.polaris#polaris-core
> >>
> >> org.apache.polaris#polaris-api-management-model
> >>
> >> org.apache.iceberg#iceberg-spark-runtime-3.5_2.12
> >>
> >>
> >> Prior to the change, it also pulled iceberg-core and avro 1.20.0.
> >>
> >>
> >> *-- Does that artifact use shaded dependencies?*
> >>
> >> Any usage of classes from iceberg-spark-runtime uses the shaded libraries
> >> shipped along with the artifacts.
> >>
> >>
> >>
> >> *-- Does that artifact depend on the Iceberg Spark bundle?*
> >>
> >> If you are referring to our Spark client, it depends on
> >> iceberg-spark-runtime, not on other bundles.
> >>
> >>
> >>
> >> *-- Is the _code_ running in Spark the same when the Polaris Spark Client
> >> is pulled via --packages and via --jars?*
> >>
> >>
> >> Yes, the jar and the package use the same code. The jar simply packs
> >> everything for the user, so there is no need to download any other
> >> dependency.
> >>
> >>
> >> Best Regards,
> >>
> >> Yun
> >>
> >>
> >>
> >> On Fri, Jun 20, 2025 at 9:18 AM Dmitri Bourlatchkov <di...@apache.org>
> >> wrote:
> >>
> >> > Some questions for clarification:
> >> >
> >> > * What is the maven artifact that Spark can automatically pull (via
> >> > --packages)?
> >> > * Does that artifact use shaded dependencies?
> >> > * Does that artifact depend on the Iceberg Spark bundle?
> >> > * Is the _code_ running in Spark the same when the Polaris Spark Client
> >> > is pulled via --packages and via --jars?
> >> >
> >> > I know I could have figured that out from code, but I'm asking here
> >> > because I think we may need to review our approach to publishing these
> >> > artifacts.
> >> >
> >> > I believe that regardless of the method of including the Client into
> >> > the Spark runtime, the code has to be exactly the same... and I doubt
> >> > it is the same now. WDYT?
> >> >
> >> > Thanks,
> >> > Dmitri.
> >> >
> >> >
> >> > On Fri, Jun 20, 2025 at 10:15 AM Dmitri Bourlatchkov <di...@apache.org>
> >> > wrote:
> >> >
> >> > > Hi All,
> >> > >
> >> > > Re: PR [1908], let's use this thread to clarify the problems we're
> >> > > trying to solve and options for solutions.
> >> > >
> >> > > As for me, it looks like some refactoring in the way the Spark Client
> >> > > is built and published may be needed.
> >> > >
> >> > > I think it makes sense to clarify this before 1.0 to avoid changes to
> >> > > Maven coordinates right after 1.0.
> >> > >
> >> > > [1908] https://github.com/apache/polaris/pull/1908
> >> > >
> >> > > Thanks,
> >> > > Dmitri.
> >> > >
> >> > >
> >> >
> >>
> >
>
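
For readers new to the relocation discussed in this thread, here is a minimal
sketch of the idea: code is compiled against the original
"com.fasterxml.jackson.*" names, and a later build step rewrites those
references to "org.apache.iceberg.shaded.com.fasterxml.jackson.*". The
function name `relocate` and the sample class names are assumptions chosen
for illustration. This only models the name mapping; it is not the actual
Polaris or Iceberg build logic, which would normally be done by a build
plugin that rewrites bytecode.

```python
# Illustrative model of prefix-based package relocation (hypothetical helper,
# not part of Polaris, Iceberg, or any build tool).
ORIGINAL_PREFIX = "com.fasterxml.jackson."
SHADED_PREFIX = "org.apache.iceberg.shaded.com.fasterxml.jackson."


def relocate(class_name: str) -> str:
    """Map an original Jackson class name to its relocated name.

    Names outside the original Jackson package, including names that are
    already relocated, are returned unchanged.
    """
    if class_name.startswith(ORIGINAL_PREFIX):
        return SHADED_PREFIX + class_name[len(ORIGINAL_PREFIX):]
    return class_name


if __name__ == "__main__":
    for name in (
        "com.fasterxml.jackson.databind.ObjectMapper",
        "java.util.List",
    ):
        print(name, "->", relocate(name))
```

Because the mapping leaves already-relocated names alone, applying it twice
gives the same result as applying it once. The mapping says nothing about
versions, so the concern raised at the top of the thread still applies: the
Jackson version compiled against has to match the one shaded into the Iceberg
Spark runtime.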
