Would someone be willing to create a document that states the current proposal?
It is becoming somewhat difficult to follow this thread. I also worry that without a complete statement of the current shape, people may incorrectly think they are in alignment.

--
Jacques Nadeau
CTO and Co-Founder, Dremio

On Tue, Dec 1, 2020 at 5:32 AM Zoltán Borók-Nagy <borokna...@cloudera.com> wrote:

> Thanks, Ryan. I answered inline.
>
> On Mon, Nov 30, 2020 at 8:26 PM Ryan Blue <rb...@netflix.com> wrote:
>
>> This sounds like a good plan overall, but I have a couple of notes:
>>
>> 1. We need to keep in mind that users plug in their own catalogs, so iceberg.catalog could be a Glue or Nessie catalog, not just Hive or Hadoop. I don’t think it makes much sense to use separate hadoop.catalog and hive.catalog values. Those should just be names for catalogs configured in Configuration, i.e., via hive-site.xml. We then only need a special value for loading Hadoop tables from paths.
>
> About extensibility, I think the usual Hive way is to use Java class names. That way the value for 'iceberg.catalog' could be e.g. 'org.apache.iceberg.hive.HiveCatalog'. Then each catalog implementation would need to have a factory method that constructs the catalog object from a properties object (Map<String, String>). E.g. 'org.apache.iceberg.hadoop.HadoopCatalog' would require 'iceberg.catalog_location' to be present in the properties.
>
>> 2. I don’t think that catalog configuration should be kept in table properties. A catalog should not be loaded for each table. So I don’t think we need iceberg.catalog_location. Instead, we should have a way to define catalogs in the Configuration for tables in the metastore to reference.
>
> I think that makes sense; on the other hand, it would make adding new catalogs more heavyweight, i.e. you'd now need to edit configuration files and restart/reinitialize services. That could be cumbersome in some environments.
>
>> 3. I’d rather use a prefix to exclude properties from being passed to Iceberg than to include them. Otherwise, users don’t know what to do to pass table properties from Hive or Impala. If we exclude a prefix or specific properties, then everything but the properties reserved for locating the table is passed as the user would expect.
>
> I don't have a strong opinion about this, but yeah, maybe this behavior would cause the fewest surprises.
>
>> On Mon, Nov 30, 2020 at 7:51 AM Zoltán Borók-Nagy <borokna...@apache.org> wrote:
>>
>>> Thanks, Peter. I answered inline.
>>>
>>> On Mon, Nov 30, 2020 at 3:13 PM Peter Vary <pv...@cloudera.com.invalid> wrote:
>>>
>>>> Hi Zoltan,
>>>>
>>>> Answers below:
>>>>
>>>> On Nov 30, 2020, at 14:19, Zoltán Borók-Nagy <borokna...@cloudera.com.INVALID> wrote:
>>>>
>>>> Hi,
>>>>
>>>> Thanks for the replies. My take on the above questions is as follows:
>>>>
>>>> - Should 'iceberg.catalog' be a required property?
>>>>   - Yeah, I think it would be nice if this were required, to avoid any implicit behavior.
>>>>
>>>> Currently we have a Catalogs class to get/initialize/use the different Catalogs. At that time the decision was to use HadoopTables as the default catalog. It might be worthwhile to use the same class in Impala as well, so the behavior is consistent.
>>>
>>> Yeah, I think it'd be beneficial for us to use the Iceberg classes whenever possible. The Catalogs class is very similar to what we have currently in Impala.
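>>> To illustrate the class-name idea from above, a rough sketch (the static create(Map<String, String>) factory is an assumed convention for this proposal, not an existing Iceberg API):
>>>
>>> import java.util.Map;
>>> import org.apache.iceberg.catalog.Catalog;
>>>
>>> // Load a Catalog from the implementation class name stored in
>>> // 'iceberg.catalog', assuming each implementation provides a
>>> // static create(Map<String, String>) factory method.
>>> static Catalog loadCatalog(Map<String, String> props) throws Exception {
>>>   String impl = props.get("iceberg.catalog"); // e.g. org.apache.iceberg.hive.HiveCatalog
>>>   return (Catalog) Class.forName(impl)
>>>       .getMethod("create", Map.class)
>>>       .invoke(null, props);
>>> }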
>>>> - 'hadoop.catalog' LOCATION and catalog_location
>>>>   - In Impala we don't allow setting LOCATION for tables stored in 'hadoop.catalog'. But Impala internally sets LOCATION to the Iceberg table's actual location. We were also thinking about using only the table LOCATION and setting it to the catalog location, but we found that confusing as well.
>>>>
>>>> It could definitely work, but it is somewhat strange that we have an external table location set to an arbitrary path, and we have a different location generated by other configs. It would be nice to have the real location set in the external table location as well.
>>>
>>> Impala sets the real Iceberg table location for external tables. E.g. if the user issues
>>>
>>> CREATE EXTERNAL TABLE my_hive_db.iceberg_table_hadoop_catalog
>>> STORED AS ICEBERG
>>> TBLPROPERTIES('iceberg.catalog'='hadoop.catalog',
>>>               'iceberg.catalog_location'='/path/to/hadoop/catalog',
>>>               'iceberg.table_identifier'='namespace1.namespace2.ice_t');
>>>
>>> If the end user had specified LOCATION, then Impala would have raised an error. But the above DDL statement is correct, so Impala loads the Iceberg table via the Iceberg API, then creates the HMS table and sets LOCATION to the Iceberg table location (something like /path/to/hadoop/catalog/namespace1/namespace2/ice_t).
>>>
>>>> I like the flexibility of setting the table_identifier at the table level, which could help avoid naming conflicts. We might want to have this in the Iceberg Catalog implementation.
>>>>
>>>> - 'iceberg.table_identifier' for HiveCatalog
>>>>   - Yeah, it doesn't add much if we only allow using the current HMS. I think it can only be useful if we allow external HMSes.
>>>> - Moving properties to SERDEPROPERTIES
>>>>   - I see that these properties are used by the SerDe classes in Hive, but I feel that these properties are just not about serialization and deserialization. As I see it, the current SERDEPROPERTIES are things like 'field.delim', 'separatorChar', 'quoteChar', etc., so properties about table loading more naturally belong in TBLPROPERTIES in my opinion.
>>>>
>>>> I have seen it used both ways for HBaseSerDe (even the wiki page uses both :) ). Since Impala prefers TBLPROPERTIES, and if we start using a prefix for separating real Iceberg table properties from other properties, then we can keep them in TBLPROPERTIES.
>>>
>>> In the google doc I also had a comment about prefixing Iceberg table properties. We could use a prefix like 'iceberg.tblproperties.', and pass every property with this prefix to the Iceberg table. Currently Impala passes every table property to the Iceberg table.
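>>> As a sketch, the include-prefix variant could look roughly like this (names are illustrative; Ryan's exclude variant would instead copy everything except the reserved table-loading keys):
>>>
>>> import java.util.HashMap;
>>> import java.util.Map;
>>>
>>> // Copy HMS table properties carrying the 'iceberg.tblproperties.'
>>> // prefix into Iceberg table properties, stripping the prefix.
>>> static Map<String, String> icebergProperties(Map<String, String> hmsProps) {
>>>   String prefix = "iceberg.tblproperties.";
>>>   Map<String, String> result = new HashMap<>();
>>>   for (Map.Entry<String, String> e : hmsProps.entrySet()) {
>>>     if (e.getKey().startsWith(prefix)) {
>>>       result.put(e.getKey().substring(prefix.length()), e.getValue());
>>>     }
>>>   }
>>>   return result;
>>> }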
>>>>
>>>> Thanks,
>>>> Zoltan
>>>>
>>>> On Mon, Nov 30, 2020 at 1:33 PM Peter Vary <pv...@cloudera.com.invalid> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> Based on the discussion below I understand we have the following kinds of properties:
>>>>>
>>>>> 1. Iceberg table properties - Engine-independent, storage-related parameters.
>>>>> 2. "How to get to" properties - I think these are mostly Hive-specific table properties, since for Spark, the Spark catalog configuration serves the same purpose. I think the best place for storing these would be the Hive SERDEPROPERTIES, as this describes the access information for the SerDe. Sidenote: I think we should decide whether we allow HiveCatalogs pointing to a different HMS; 'iceberg.table_identifier' would make sense only if we allow having multiple catalogs.
>>>>> 3. Query-specific properties - These are engine-specific and might be mapped to / even override the Iceberg table properties on the engine-specific code paths, but currently these properties have independent names and are mapped on a case-by-case basis.
>>>>>
>>>>> Based on this:
>>>>>
>>>>> - Shall we move the "how to get to" properties to SERDEPROPERTIES?
>>>>> - Shall we define a prefix for setting Iceberg table properties from Hive queries and omitting other engine-specific properties?
>>>>>
>>>>> Thanks,
>>>>> Peter
>>>>>
>>>>> On Nov 27, 2020, at 17:45, Mass Dosage <massdos...@gmail.com> wrote:
>>>>>
>>>>> I like these suggestions, comments inline below on the last round...
>>>>>
>>>>> On Thu, 26 Nov 2020 at 09:45, Zoltán Borók-Nagy <borokna...@apache.org> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> The above aligns with what we did in Impala, i.e. we store information about table loading in HMS table properties. We are just a bit more explicit about which catalog to use.
>>>>>> We have the table property 'iceberg.catalog' to determine the catalog type; right now the supported values are 'hadoop.tables', 'hadoop.catalog', and 'hive.catalog'. Additional table properties can be set based on the catalog type.
>>>>>>
>>>>>> So, if the value of 'iceberg.catalog' is
>>>>>
>>>>> I'm all for renaming this, having "mr" in the property name is confusing.
>>>>>
>>>>>> - hadoop.tables
>>>>>>   - the table location is used to load the table
>>>>>
>>>>> The only question I have is: should we have this as the default? I.e. if you don't set a catalog it will assume it's HadoopTables and use the location? Or should we require this property to be present, to be consistent and avoid any "magic"?
>>>>>
>>>>>> - hadoop.catalog
>>>>>>   - Required table property 'iceberg.catalog_location' specifies the location of the Hadoop catalog in the file system.
>>>>>>   - Optional table property 'iceberg.table_identifier' specifies the table id. If it's not set, then <database_name>.<table_name> is used as the table identifier.
>>>>>
>>>>> I like this as it would allow you to use a different database and table name in Hive as opposed to the Hadoop catalog - at the moment they have to match. The only thing here is that I think Hive requires a table LOCATION to be set, and it's then confusing as there are now two locations on the table. I'm not sure whether in the Hive storage handler or SerDe etc. we can get Hive to not require that and maybe even disallow it from being set. That would probably be best in conjunction with this. Another solution would be to not have the 'iceberg.catalog_location' property but instead use the table LOCATION for this, but that's a bit confusing from a Hive point of view.
>>>>>
>>>>>> - hive.catalog
>>>>>>   - Optional table property 'iceberg.table_identifier' specifies the table id. If it's not set, then <database_name>.<table_name> is used as the table identifier.
>>>>>>   - We have the assumption that the current Hive metastore stores the table, i.e. we don't support external Hive metastores currently.
>>>>>
>>>>> These sound fine for Hive catalog tables that are created outside of the automatic Hive table creation (see https://iceberg.apache.org/hive/ -> Using Hive Catalog); we'd just need to document how you can create these yourself and that one could use a different Hive database and table etc.
>>>>>
>>>>>> Independent of the catalog implementations, we also have the table property 'iceberg.file_format' to specify the file format for the data files.
>>>>>
>>>>> OK, I don't think we need that for Hive?
>>>>>
>>>>>> We haven't released it yet, so we are open to changes, but I think these properties are reasonable and it would be great if we could standardize the properties across engines that use HMS as the primary metastore of tables.
>>>>>
>>>>> If others agree I think we should create an issue where we document the above changes so it's very clear what we're doing, and we can then go and implement them and update the docs etc.
>>>>>
>>>>>> Cheers,
>>>>>> Zoltan
>>>>>>
>>>>>> On Thu, Nov 26, 2020 at 2:20 AM Ryan Blue <rb...@netflix.com.invalid> wrote:
>>>>>>
>>>>>>> Yes, I think that is a good summary of the principles.
>>>>>>>
>>>>>>> #4 is correct because we provide some information that is informational (Hive schema) or tracked only by the metastore (best-effort current user). I also agree that it would be good to have a table identifier in HMS table metadata when loading from an external table. That gives us a way to handle name conflicts.
>>>>>>>
>>>>>>> On Wed, Nov 25, 2020 at 5:14 PM Jacques Nadeau <jacq...@dremio.com> wrote:
>>>>>>>
>>>>>>>> Minor error, my last example should have been:
>>>>>>>>
>>>>>>>> db1.table1_etl_branch => nessie.folder1.folder2.folder3.table1@etl_branch
>>>>>>>>
>>>>>>>> --
>>>>>>>> Jacques Nadeau
>>>>>>>> CTO and Co-Founder, Dremio
>>>>>>>>
>>>>>>>> On Wed, Nov 25, 2020 at 4:56 PM Jacques Nadeau <jacq...@dremio.com> wrote:
>>>>>>>>
>>>>>>>>> I agree with Ryan on the core principles here. As I understand them:
>>>>>>>>>
>>>>>>>>> 1. Iceberg metadata describes all properties of a table
>>>>>>>>> 2. Hive table properties describe "how to get to" Iceberg metadata (which catalog + possibly ptr, path, token, etc)
>>>>>>>>> 3. There could be default "how to get to" information set at a global level
>>>>>>>>> 4. Best-effort schema should be stored in the table properties in HMS. This should be done for information schema retrieval purposes within Hive but should be ignored during Hive/other tool execution.
>>>>>>>>>
>>>>>>>>> Is that a fair summary of your statements, Ryan (except 4, which I just added)?
>>>>>>>>>
>>>>>>>>> One comment I have on #2 is that for different catalogs and use cases, it can be somewhat more complex: it would be desirable for a table that initially existed without Hive and was later exposed in Hive to support a ptr/path/token for how the table is named externally. For example, in a Nessie context we support arbitrary paths for an Iceberg table (such as folder1.folder2.folder3.table1).
>>>>>>>>> If you then want to expose that table to Hive, you might have this mapping for #2:
>>>>>>>>>
>>>>>>>>> db1.table1 => nessie:folder1.folder2.folder3.table1
>>>>>>>>>
>>>>>>>>> Similarly, you might want to expose a particular branch version of a table. So it might say:
>>>>>>>>>
>>>>>>>>> db1.table1_etl_branch => nessie.folder1@etl_branch
>>>>>>>>>
>>>>>>>>> Just saying that the address to the table in the catalog could itself have several properties. The key being that no matter what those are, we should follow #1 and only store properties that are about the ptr, not the content/metadata.
>>>>>>>>>
>>>>>>>>> Lastly, I believe #4 is the case but haven't tested it. Can someone confirm that it is true? And that it is possible/not problematic?
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Jacques Nadeau
>>>>>>>>> CTO and Co-Founder, Dremio
>>>>>>>>>
>>>>>>>>> On Wed, Nov 25, 2020 at 4:28 PM Ryan Blue <rb...@netflix.com.invalid> wrote:
>>>>>>>>>
>>>>>>>>>> Thanks for working on this, Laszlo. I’ve been thinking about these problems as well, so this is a good time to have a discussion about Hive config.
>>>>>>>>>>
>>>>>>>>>> I think that Hive configuration should work mostly like other engines, where different configurations are used for different purposes. Different purposes mean that there is not a global configuration priority. Hopefully, I can explain how we use the different config sources elsewhere to clarify.
>>>>>>>>>>
>>>>>>>>>> Let’s take Spark as an example. Spark uses Hadoop, so it has a Hadoop Configuration, but it also has its own global configuration. There are also Iceberg table properties, and all of the various Hive properties if you’re tracking tables with a Hive MetaStore.
>>>>>>>>>>
>>>>>>>>>> The first step is to simplify where we can, so we effectively eliminate 2 sources of config:
>>>>>>>>>>
>>>>>>>>>> - The Hadoop Configuration is only used to instantiate Hadoop classes, like FileSystem. Iceberg should not use it for any other config.
>>>>>>>>>> - Config in the Hive MetaStore is only used to identify that a table is Iceberg and point to its metadata location. All other config in HMS is informational. For example, the input format is FileInputFormat so that non-Iceberg readers cannot actually instantiate the format (it’s abstract), but it is available so they also don’t fail trying to load the class. Table-specific config should not be stored in table or serde properties.
>>>>>>>>>>
>>>>>>>>>> That leaves Spark configuration and Iceberg table configuration.
>>>>>>>>>>
>>>>>>>>>> Iceberg differs from other tables because it is opinionated: data configuration should be maintained at the table level. This is cleaner for users because config is standardized across engines and in one place. And it also enables services that analyze a table and update its configuration to tune options that users almost never do, like row group or stripe size in the columnar formats. Iceberg table configuration is used to configure table-specific concerns and behavior.
>>>>>>>>>>
>>>>>>>>>> Spark configuration is used for engine-specific concerns and runtime overrides. A good example of an engine-specific concern is the catalogs that are available to load Iceberg tables. Spark has a way to load and configure catalog implementations, and Iceberg uses that for all catalog-level config. Runtime overrides are things like target split size. Iceberg has a table-level default split size in table properties, but this can be overridden by a Spark option for each table, as well as an option passed to the individual read. Note that these necessarily have different config names for how they are used: Iceberg uses read.split.target-size and the read-specific option is target-size.
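>>>>>>>>>> To make that precedence concrete, a small sketch (the table and readOptions variables and the 128 MB fallback are illustrative):
>>>>>>>>>>
>>>>>>>>>> // A per-read option overrides the table property, which supplies the default.
>>>>>>>>>> long tableDefault = org.apache.iceberg.util.PropertyUtil.propertyAsLong(
>>>>>>>>>>     table.properties(), "read.split.target-size", 128L * 1024 * 1024);
>>>>>>>>>> long targetSplitSize = readOptions.containsKey("target-size")
>>>>>>>>>>     ? Long.parseLong(readOptions.get("target-size"))
>>>>>>>>>>     : tableDefault;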
>>>>>>>>>>
>>>>>>>>>> Applying this to Hive is a little strange for a couple of reasons. First, Hive’s engine configuration *is* a Hadoop Configuration. As a result, I think the right place to store engine-specific config is there, including Iceberg catalogs, using a strategy similar to what Spark does: what external Iceberg catalogs are available and their configuration should come from the HiveConf.
>>>>>>>>>>
>>>>>>>>>> The second way Hive is strange is that Hive needs to use its own MetaStore to track Hive table concerns. The MetaStore may have tables created by an Iceberg HiveCatalog, and Hive also needs to be able to load tables from other Iceberg catalogs by creating table entries for them.
>>>>>>>>>>
>>>>>>>>>> Here’s how I think Hive should work:
>>>>>>>>>>
>>>>>>>>>> - There should be a default HiveCatalog that uses the current MetaStore URI, to be used for HiveCatalog tables tracked in the MetaStore
>>>>>>>>>> - Other catalogs should be defined in HiveConf
>>>>>>>>>> - HMS table properties should be used to determine how to load a table: using a Hadoop location, using the default metastore catalog, or using an external Iceberg catalog
>>>>>>>>>>   - If there is a metadata_location, then use the HiveCatalog for this metastore (where it is tracked)
>>>>>>>>>>   - If there is a catalog property, then load that catalog and use it to load the table identifier, or maybe an identifier from HMS table properties
>>>>>>>>>>   - If there is no catalog or metadata_location, then use HadoopTables to load the table location as an Iceberg table
>>>>>>>>>>
>>>>>>>>>> This would make it possible to access all types of Iceberg tables in the same query, and would match how Spark and Flink configure catalogs. Other than the configuration above, I don’t think that config in HMS should be used at all, like how the other engines work. Iceberg is the source of truth for table metadata, HMS stores how to load the Iceberg table, and HiveConf defines the catalogs (or runtime overrides). A rough sketch of this loading logic follows.
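>>>>>>>>>> The sketch below is illustrative only: the property names and the defaultHiveCatalog/catalogFromConf helpers are assumptions, not an existing API.
>>>>>>>>>>
>>>>>>>>>> import java.util.Map;
>>>>>>>>>> import org.apache.hadoop.conf.Configuration;
>>>>>>>>>> import org.apache.iceberg.Table;
>>>>>>>>>> import org.apache.iceberg.catalog.Catalog;
>>>>>>>>>> import org.apache.iceberg.catalog.TableIdentifier;
>>>>>>>>>> import org.apache.iceberg.hadoop.HadoopTables;
>>>>>>>>>>
>>>>>>>>>> // Decide how to load an Iceberg table from its HMS entry.
>>>>>>>>>> static Table loadTable(Configuration conf, org.apache.hadoop.hive.metastore.api.Table hmsTable) {
>>>>>>>>>>   Map<String, String> props = hmsTable.getParameters();
>>>>>>>>>>   if (props.containsKey("metadata_location")) {
>>>>>>>>>>     // Tracked by this metastore: use the default HiveCatalog.
>>>>>>>>>>     return defaultHiveCatalog(conf).loadTable(
>>>>>>>>>>         TableIdentifier.of(hmsTable.getDbName(), hmsTable.getTableName()));
>>>>>>>>>>   } else if (props.containsKey("iceberg.catalog")) {
>>>>>>>>>>     // An external catalog defined in HiveConf.
>>>>>>>>>>     Catalog catalog = catalogFromConf(conf, props.get("iceberg.catalog"));
>>>>>>>>>>     String id = props.getOrDefault("iceberg.table_identifier",
>>>>>>>>>>         hmsTable.getDbName() + "." + hmsTable.getTableName());
>>>>>>>>>>     return catalog.loadTable(TableIdentifier.parse(id));
>>>>>>>>>>   } else {
>>>>>>>>>>     // No catalog and no metadata_location: treat the location as a Hadoop table.
>>>>>>>>>>     return new HadoopTables(conf).load(hmsTable.getSd().getLocation());
>>>>>>>>>>   }
>>>>>>>>>> }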
>>>>>>>>>> This isn’t quite how configuration works right now. Currently, the catalog is controlled by a HiveConf property, iceberg.mr.catalog. If that isn’t set, HadoopTables will be used to load table locations. If it is set, then that catalog will be used to load all tables by name. This makes it impossible to load tables from different catalogs at the same time. That’s why I think the Iceberg catalog for a table should be stored in HMS table properties.
>>>>>>>>>>
>>>>>>>>>> I should also explain the iceberg.hive.engine.enabled flag, but I think this is long enough for now.
>>>>>>>>>>
>>>>>>>>>> rb
>>>>>>>>>>
>>>>>>>>>> On Wed, Nov 25, 2020 at 1:41 AM Laszlo Pinter <lpin...@cloudera.com.invalid> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi All,
>>>>>>>>>>>
>>>>>>>>>>> I would like to start a discussion about how we should handle properties from various sources like Iceberg, Hive, or global configuration. I've put together a short document <https://docs.google.com/document/d/1tyD7mGp_hh0dx9N_Ax9kj5INkg7Wzpj9XQ5t2-7AwNs/edit?usp=sharing>, please have a look and let me know what you think.
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Laszlo
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> Ryan Blue
>>>>>>>>>> Software Engineer
>>>>>>>>>> Netflix
>>>>>>>
>>>>>>> --
>>>>>>> Ryan Blue
>>>>>>> Software Engineer
>>>>>>> Netflix
>>
>> --
>> Ryan Blue
>> Software Engineer
>> Netflix