Re: Understanding Iceberg's dependency configuration

Ryan Blue Thu, 27 Jun 2019 14:01:24 -0700

I think that guava and slf4j-api could be compile dependencies. Guava
should probably be relocated to isolate the Iceberg version from cluster
dependencies.
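
A minimal sketch of that change in a Gradle build script (Groovy DSL with the Shadow plugin) might look like the following. The plugin and library versions and the relocated package prefix are assumptions made for illustration and are not taken from Iceberg's build.

```groovy
// build.gradle for a hypothetical module (a sketch, not Iceberg's actual build)
plugins {
  id 'java'
  id 'com.github.johnrengelman.shadow' version '5.0.0'  // version is an assumption
}

repositories {
  mavenCentral()
}

dependencies {
  // 'compile' instead of 'compileOnly', so both end up on the runtime classpath.
  // On newer Gradle versions the equivalents are 'api' / 'implementation'.
  compile 'com.google.guava:guava:28.0-jre'  // version is an assumption
  compile 'org.slf4j:slf4j-api:1.7.25'       // version is an assumption
}

shadowJar {
  // Rewrite Guava's packages inside the shaded jar so they cannot clash with
  // the Guava version on the cluster classpath. The target prefix is hypothetical.
  relocate 'com.google.common', 'org.apache.iceberg.shaded.com.google.common'
}
```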


I wasn't aware of the shadow configuration used by the plugin. We've been
using that name for the dependencies to include for a long time. We can
update to use the conventions that are more standard for that plugin.

> Should thin jars with transitive dependencies be used or an Iceberg
> runtime with shaded dependencies [most common dependencies which could
> conflict, e.g. guava, avro] be used?

Iceberg should provide shaded jars to make it easy to get started with
Spark. We also want to shade Parquet, Avro, and others to ensure that
Iceberg's dependencies can be updated without conflicting with what Spark
uses. Libraries like slf4j-api should be fine to exclude because they
change rarely, though.
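
As a hedged illustration of that approach, a shaded runtime jar built with the Shadow plugin could relocate Avro and Parquet while leaving slf4j-api out of the jar. The versions and the relocated package prefixes below are assumptions for the sake of the example, not Iceberg's actual settings.

```groovy
// build.gradle for a hypothetical runtime module (a sketch, not Iceberg's actual build)
plugins {
  id 'java'
  id 'com.github.johnrengelman.shadow' version '5.0.0'  // version is an assumption
}

repositories {
  mavenCentral()
}

dependencies {
  compile 'org.apache.avro:avro:1.8.2'               // version is an assumption
  compile 'org.apache.parquet:parquet-avro:1.10.1'   // version is an assumption
}

shadowJar {
  // Keep slf4j-api out of the shaded jar; it is expected to come from the cluster.
  dependencies {
    exclude(dependency('org.slf4j:slf4j-api:.*'))
  }

  // Relocate libraries that Spark also ships. The target prefixes are hypothetical.
  relocate 'org.apache.avro', 'org.apache.iceberg.shaded.org.apache.avro'
  relocate 'org.apache.parquet', 'org.apache.iceberg.shaded.org.apache.parquet'
}
```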

On Sat, Jun 22, 2019 at 10:23 PM RD <rdsr...@gmail.com> wrote:

> Hi Iceberg devs,
>
> I see that guava and slf4j-api are compileOnly dependencies. This implies
> that they are not required at runtime and will not be resolved when
> resolving Iceberg artifacts. So for iceberg-spark, for example, the guava
> version used at runtime could come from Spark itself, which could very
> well be different from what we intended.
>
> I think these should be changed to compile, as they are required
> dependencies. Thoughts?
>
> Today, the iceberg-runtime and iceberg-presto-runtime artifacts do not
> include these dependencies because they are declared as compileOnly and we
> have configured the shadow tasks to pick dependencies from the "shadow"
> configuration.
>
> I think slf4j and guava should be part of these Iceberg runtime
> artifacts, no?
>
> Also, iceberg-[presto]-runtime reconfigures/recreates the "shadow"
> configuration:
> https://imperceptiblethoughts.com/shadow/configuration/#configuring-the-runtime-classpath
> This configuration is reserved by the Shadow task to add transitive
> dependencies which are not to be bundled in the fat jar.
>
> I think that we should not recreate the "shadow" configuration and should
> use the standard runtime/compile configuration for the shadow task.
>
> My last question is: what is the expected/recommended way to use Iceberg
> artifacts in a runtime such as Spark? Should thin jars with transitive
> dependencies be used or an Iceberg runtime with shaded dependencies [most
> common dependencies which could conflict, e.g. guava, avro] be used?
>
> -R
>
>

-- 
Ryan Blue
Software Engineer
Netflix
