I’m not a huge fan of special cases for configuration values like this. Is
there something that we can do to pass a set of values to all sources (and
catalogs for #21306)?

I would prefer adding a special prefix for options that are passed to all
sources, like this:

spark.sql.catalog.shared.shared-property = value0
spark.sql.catalog.jdbc-prod.prop = value1
spark.datasource.source-name.prop = value2

All of the properties in the shared namespace would be passed to all
catalogs and sources (a rough sketch of the idea follows). What do you think?
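
A minimal sketch of how an implementation might merge the shared namespace
with a catalog's own options (buildCatalogOptions and the exact prefixes are
illustrative, not an existing API):

  import org.apache.spark.sql.RuntimeConfig

  // Collect options under each prefix, then overlay the catalog-specific
  // entries so they win over shared ones on conflict.
  def buildCatalogOptions(conf: RuntimeConfig, name: String): Map[String, String] = {
    def withPrefix(prefix: String): Map[String, String] =
      conf.getAll.collect {
        case (k, v) if k.startsWith(prefix) => k.stripPrefix(prefix) -> v
      }
    withPrefix("spark.sql.catalog.shared.") ++ withPrefix(s"spark.sql.catalog.$name.")
  }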

On Sun, Sep 16, 2018 at 6:51 PM Wenchen Fan <cloud0...@gmail.com> wrote:

> I'm +1 for this proposal: "Extend SessionConfigSupport to support passing
> specific white-listed configuration values"
>
> One goal of the data source v2 API is to not depend on any high-level APIs
> like SparkSession, SQLConf, etc. If users do want to access these
> high-level APIs, there is a workaround: calling
> `SparkSession.getActiveSession` or `SQLConf.get`.
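>
> As a minimal sketch of that workaround (note that SQLConf lives in an
> internal package, and the key shown here is made up):
>
>   import org.apache.spark.sql.internal.SQLConf
>
>   // Read a session config value from the active session's SQLConf,
>   // falling back to a default when the key is unset.
>   val token = SQLConf.get.getConfString("spark.datasource.my-source.token", "")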
>
> In the meantime, I think your use case makes sense. `SessionConfigSupport`
> was created for this use case, but it's not powerful enough yet. I think it
> should support multiple key prefixes and a white-list.
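>
> One possible shape for that extension (a sketch only; the method names are
> illustrative, not an agreed API):
>
>   // Hypothetical multi-prefix, white-listed variant of SessionConfigSupport.
>   trait SessionConfigSupport {
>     def keyPrefixes(): Array[String]           // forward spark.datasource.<prefix>.*
>     def allowedSessionConfigs(): Array[String] // full keys to pass through verbatim
>   }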
>
> Feel free to submit a patch, and thanks for looking into it!
>
> On Sun, Sep 16, 2018 at 2:40 PM tigerquoll <tigerqu...@outlook.com> wrote:
>
>> The current v2 data source API supports querying a portion of the Spark
>> configuration namespace (spark.datasource.*) via the SessionConfigSupport
>> interface. This was designed on the assumption that the configuration for
>> each v2 data source is independent of every other source.
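>>
>> For reference, the current interface exposes only a single prefix
>> (paraphrased here in Scala from the Java definition):
>>
>>   // Spark forwards session configs under spark.datasource.<keyPrefix>.*
>>   // to any v2 source that implements this.
>>   trait SessionConfigSupport {
>>     def keyPrefix(): String
>>   }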
>>
>> Unfortunately, there are some cross-cutting concerns, such as
>> authentication, that touch multiple data sources. This means that common
>> configuration items need to be shared among multiple data sources.
>> In particular, Kerberos setup can use the following configuration items:
>>
>> * userPrincipal
>> * userKeytabPath
>> * krb5ConfPath
>> * Kerberos debugging flags
>> * spark.security.credentials.${service}.enabled
>> * JAAS config
>> * ZKServerPrincipal (possibly)
>>
>> The potential solutions I can think of for passing this information to
>> data sources are:
>>
>> * Pass the entire SparkContext object to data sources (not likely)
>> * Pass the entire SparkConf Map object to data sources
>> * Pass all required configuration via environment variables
>> * Extend SessionConfigSupport to support passing specific white-listed
>> configuration values
>> * Add a specific data source v2 API, "SupportsKerberos", so that a data
>> source can indicate that it supports Kerberos and also provide the means to
>> pass the needed configuration info (sketched below)
>> * Expand out all Kerberos configuration items into the config namespace of
>> each data source that needs them
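>>
>> As a sketch, the "SupportsKerberos" option might look something like this
>> (a hypothetical interface; the names are illustrative only):
>>
>>   // A source implementing this would receive the Kerberos settings it
>>   // needs directly, rather than re-reading them from its own namespace.
>>   trait SupportsKerberos {
>>     def applyKerberosConfig(
>>         userPrincipal: String,
>>         userKeytabPath: String,
>>         krb5ConfPath: String): Unit
>>   }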
>>
>> If the data source requires TLS support, then we also need to support
>> passing all the configuration values under "spark.ssl.*".
>>
>> What do people think? A placeholder issue has been added as SPARK-25329.
>>

-- 
Ryan Blue
Software Engineer
Netflix
