Re: Welcome Szehon Ho as a committer!

2022-03-11 Thread Huadong Liu
Congrats Szehon! On Fri, Mar 11, 2022 at 3:32 PM Anton Okolnychyi wrote: > Hey everyone, > > I would like to welcome Szehon Ho as a new committer to the project! > > Thanks for all your work, Szehon! > > - Anton >

Re: shaded/unshaded packages in runtime/non-runtime jars

2021-08-05 Thread Huadong Liu
Never mind, I missed *iceberg-data*. On Thu, Aug 5, 2021 at 12:07 PM Huadong Liu wrote: > Thank you Russell. I wasn't able to make positive progress in either > direction. Is it possible to use the unshaded iceberg-spark3 jar and do > *spark.read().format("iceberg").load(tab
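
For readers following along: the read path in question is the standard Spark DataFrame API. A minimal sketch (the table identifier "db.tbl" and the SparkSession setup are placeholders, not from the original thread):

    import org.apache.spark.sql.Dataset;
    import org.apache.spark.sql.Row;
    import org.apache.spark.sql.SparkSession;

    public class ReadIcebergTable {
      public static void main(String[] args) {
        SparkSession spark = SparkSession.builder().appName("iceberg-read").getOrCreate();

        // Load an Iceberg table by name (or by path for Hadoop tables); "db.tbl" is a placeholder.
        Dataset<Row> df = spark.read().format("iceberg").load("db.tbl");
        df.show();
      }
    }

Whether this works without the runtime jar depends on having iceberg-spark3, iceberg-data, and matching unshaded Parquet libraries on the Spark classpath, as discussed in this thread.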

Re: shaded/unshaded packages in runtime/non-runtime jars

2021-08-05 Thread Huadong Liu
and shaded parquet > libs > > On Mon, Aug 2, 2021 at 2:28 PM Huadong Liu wrote: > >> Hi, >> >> I have a Java app that writes Iceberg files with the core API. As a >> result, it uses the unshaded parquet package. I am now extending the app to >> read the table wi

shaded/unshaded packages in runtime/non-runtime jars

2021-08-02 Thread Huadong Liu
Hi, I have a Java app that writes Iceberg files with the core API. As a result, it uses the unshaded parquet package. I am now extending the app to read the table with Spark. Unfortunately the iceberg-spark3-runtime uses the shaded parquet package and I am getting: *java.lang.ClassCastException:

Re: migrating Hadoop tables to tables with hive catalog

2021-07-01 Thread Huadong Liu
Russell Spitzer > wrote: > >> I think you could probably also do this by just creating a Hive table and >> then changing the location to point to the most recent hadoop metadata.json >> file. >> >> On Jul 1, 2021, at 1:42 AM, Huadong Liu wrote: >> >
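
A hedged sketch of one programmatic way to realize the suggestion above (creating a Hive-catalog entry that points at the latest metadata.json of the Hadoop table), assuming the Iceberg release in use exposes registerTable on HiveCatalog. All names, URIs, and paths are placeholders; the HiveCatalog constructor follows the 0.11-era pattern used elsewhere in these threads:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.iceberg.Table;
    import org.apache.iceberg.catalog.TableIdentifier;
    import org.apache.iceberg.hive.HiveCatalog;

    public class RegisterHadoopTable {
      public static void main(String[] args) {
        Configuration conf = new Configuration();
        conf.set("hive.metastore.uris", "thrift://metastore-host:9083"); // placeholder URI

        HiveCatalog catalog = new HiveCatalog(conf);

        // Point the new Hive-catalog table at the most recent metadata file of the Hadoop table.
        String metadataLocation = "hdfs://namenode/warehouse/db/tbl/metadata/v12.metadata.json"; // placeholder
        Table table = catalog.registerTable(TableIdentifier.of("db", "tbl"), metadataLocation);
        System.out.println(table.location());
      }
    }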

Re: migrating Hadoop tables to tables with hive catalog

2021-06-30 Thread Huadong Liu
FYI, I was able to do the migration by casting ManifestFile to GenericManifestFile, resetting the sequence number and snapshot id, and adding them to AppendFiles. On Mon, Jun 28, 2021 at 3:49 PM Huadong Liu wrote: > Hi, > > I am trying to migrate an Iceberg Hadoop table to a table using
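
The manifest-level copy described above (GenericManifestFile + AppendFiles) is what the author actually used. For completeness, a simpler but coarser variant is to re-append the existing data files to the new table one by one; a rough sketch, assuming both tables share the same schema and partition spec, and all paths, URIs, and identifiers are placeholders:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.iceberg.AppendFiles;
    import org.apache.iceberg.FileScanTask;
    import org.apache.iceberg.Table;
    import org.apache.iceberg.catalog.TableIdentifier;
    import org.apache.iceberg.hadoop.HadoopTables;
    import org.apache.iceberg.hive.HiveCatalog;
    import org.apache.iceberg.io.CloseableIterable;

    public class CopyHadoopTableToHiveCatalog {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("hive.metastore.uris", "thrift://metastore-host:9083"); // placeholder

        Table source = new HadoopTables(conf).load("hdfs://namenode/warehouse/db/tbl"); // placeholder path
        HiveCatalog catalog = new HiveCatalog(conf);
        Table target = catalog.createTable(TableIdentifier.of("db", "tbl"), source.schema(), source.spec());

        // Append every existing data file of the source table to the new Hive-catalog table.
        AppendFiles append = target.newAppend();
        try (CloseableIterable<FileScanTask> tasks = source.newScan().planFiles()) {
          for (FileScanTask task : tasks) {
            append.appendFile(task.file());
          }
        }
        append.commit();
      }
    }

Note this re-plans every data file and drops the original snapshot history, which is why the manifest-copy route above can be preferable for large append-only tables.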

Re: iceberg-hive-metastore dependencies

2021-06-30 Thread Huadong Liu
On Wed, Jun 30, 2021 at 11:19 AM Jack Ye wrote: > Hadoop dependencies are also compileOnly in Iceberg, so you also need to > add Hadoop packages such as hadoop-common and hadoop-client in your project > dependency. > -Jack Ye > > On Wed, Jun 30, 2021 at 10:35 AM Huadong Liu w

iceberg-hive-metastore dependencies

2021-06-30 Thread Huadong Liu
Hi, I made a shadowJar with *org.apache.hive:hive-metastore:3.1.2 and got errors:* *Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/mapred/JobConf at org.apache.hadoop.hive.conf.HiveConf.initialize(HiveConf.java:5141) at org.apache.hadoop.hive.conf.HiveConf.(HiveCon

Re: Welcoming OpenInx as a new PMC member!

2021-06-29 Thread Huadong Liu
Congrats Zheng! On Tue, Jun 29, 2021 at 1:52 PM Ryan Blue wrote: > Hi everyone, > > I'd like to welcome OpenInx (Zheng Hu) as a new Iceberg PMC member. > > Thanks for all your contributions and commitment to the project, OpenInx! > > > Ryan > > -- > Ryan Blue >

migrating Hadoop tables to tables with hive catalog

2021-06-28 Thread Huadong Liu
Hi, I am trying to migrate an Iceberg Hadoop table to a table using the hive catalog. Luckily the table is appended only, so there are no delete files. It is not clear which APIs were used in a previous post

Updating column type from timestamp to timestamptz

2021-06-08 Thread Huadong Liu
Hi, I have a Hadoop table, created with the Iceberg Java API, that uses the timestamp type for a column. Spark cannot work with the table because of that: *java.lang.UnsupportedOperationException: Spark does not support timestamp without time zone fields* I tried *sql("ALTER TABLE table ALTER COLU

Re: Spark DELETE FROM parallelism

2021-05-25 Thread Huadong Liu
your data is properly clustered. > > - Anton > > On 25 May 2021, at 09:52, Huadong Liu wrote: > > Hi iceberg-dev, > > I have a table that is partitioned by id (custom partitioning at the > moment, not iceberg hidden partitioning) and event time. Individual DELETE > fini

Spark DELETE FROM parallelism

2021-05-25 Thread Huadong Liu
Hi iceberg-dev, I have a table that is partitioned by id (custom partitioning at the moment, not iceberg hidden partitioning) and event time. Individual DELETE finishes reasonably fast, for example: *sql("DELETE FROM table where id_shard=111 and id=111456")* *sql("DELETE FROM table where id_shard

Re: Stableness of V2 Spec/API

2021-05-17 Thread Huadong Liu
production after we resolve this > issue at least. > > On Sat, May 15, 2021 at 8:01 AM Huadong Liu wrote: > >> Hi iceberg-dev, >> >> I tried v2 row-level deletion by committing equality delete files after >> *upgradeToFormatVersion(2)*. It worked well. I know that S

Stableness of V2 Spec/API

2021-05-14 Thread Huadong Liu
Hi iceberg-dev, I tried v2 row-level deletion by committing equality delete files after *upgradeToFormatVersion(2)*. It worked well. I know that Spark actions to compact delete files, data files, etc. are in progress. I currently use the Java API t
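
For context, a minimal sketch of the flow being described: upgrading a table to format v2 through TableOperations, then committing an already-written equality delete file as a row delta. How the DeleteFile itself is produced is omitted, and the cast to BaseTable assumes the table came from the core Java API as in the thread.

    import org.apache.iceberg.BaseTable;
    import org.apache.iceberg.DeleteFile;
    import org.apache.iceberg.Table;
    import org.apache.iceberg.TableMetadata;
    import org.apache.iceberg.TableOperations;

    public class RowLevelDeleteSketch {
      // Upgrade the table metadata to format version 2 so delete files are allowed.
      static void upgradeToV2(Table table) {
        TableOperations ops = ((BaseTable) table).operations();
        TableMetadata base = ops.current();
        ops.commit(base, base.upgradeToFormatVersion(2));
      }

      // Commit a previously written equality delete file as a row delta.
      static void commitEqualityDeletes(Table table, DeleteFile equalityDeletes) {
        table.newRowDelta()
            .addDeletes(equalityDeletes)
            .commit();
      }
    }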

Re: When is the next release of Iceberg ?

2021-05-05 Thread Huadong Liu
Hi openinx, With https://github.com/apache/iceberg/pull/2303 and a potential sequence-number-based fix for https://github.com/apache/iceberg/issues/2308, I don't see a hard blocker to testing out row-level deletions. Please correct me if anything else in https://github.com/apache/iceberg/milestone/

Re: Iceberg tables not using hive catalog's hive.metastore.warehouse.dir

2021-04-28 Thread Huadong Liu
Yes, feel free to open a ticket. > > Thanks, > Peter > > On Apr 28, 2021, at 02:52, Huadong Liu wrote: > > Hi Iceberg Dev, > > Iceberg tables with hive catalog are created under > hive.metastore.warehouse.dir/ by default. Different table locations >

Iceberg tables not using hive catalog's hive.metastore.warehouse.dir

2021-04-27 Thread Huadong Liu
Hi Iceberg Dev, Iceberg tables with hive catalog are created under hive.metastore.warehouse.dir/ by default. Different table locations
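
One workaround, until the default location handling changes, is to pass an explicit location when creating the table. A sketch assuming the 0.11-era HiveCatalog constructor; the schema, identifier, URI, and location below are placeholders:

    import java.util.Collections;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.iceberg.PartitionSpec;
    import org.apache.iceberg.Schema;
    import org.apache.iceberg.Table;
    import org.apache.iceberg.catalog.TableIdentifier;
    import org.apache.iceberg.hive.HiveCatalog;
    import org.apache.iceberg.types.Types;

    public class CreateTableWithExplicitLocation {
      public static void main(String[] args) {
        Configuration conf = new Configuration();
        conf.set("hive.metastore.uris", "thrift://metastore-host:9083"); // placeholder

        HiveCatalog catalog = new HiveCatalog(conf);

        Schema schema = new Schema(
            Types.NestedField.required(1, "id", Types.LongType.get()));

        // The explicit location overrides the hive.metastore.warehouse.dir default.
        Table table = catalog.createTable(
            TableIdentifier.of("db", "tbl"),
            schema,
            PartitionSpec.unpartitioned(),
            "hdfs://namenode/custom/location/db/tbl", // placeholder location
            Collections.emptyMap());
        System.out.println(table.location());
      }
    }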

Re: Spark configuration on hive catalog

2021-04-22 Thread Huadong Liu
to see you again :). The syntax in spark-sql is 'insert > into .. …", here you defined your db as a catalog? > > You just need to define one catalog and use it when referring to your > table. > > > > On 22 Apr 2021, at 07:34, Huadong Liu wrote: > > Hello Iceberg De

Spark configuration on hive catalog

2021-04-22 Thread Huadong Liu
Hello Iceberg Dev, I am not sure I follow the discussion on Spark configurations on hive catalogs. I created an Iceberg table with the hive catalog. Configuration conf = new Configuration(); conf.set("hive.metastore.uris", args[0]); conf.
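
For comparison with the Java snippet above, a hedged sketch of the single-named-catalog Spark configuration the reply refers to; the catalog name "iceberg_hive", the metastore URI, and the table name are placeholders:

    import org.apache.spark.sql.SparkSession;

    public class SparkHiveCatalogConfig {
      public static void main(String[] args) {
        // Define one Iceberg catalog ("iceberg_hive") backed by the Hive metastore.
        SparkSession spark = SparkSession.builder()
            .appName("iceberg-hive-catalog")
            .config("spark.sql.catalog.iceberg_hive", "org.apache.iceberg.spark.SparkCatalog")
            .config("spark.sql.catalog.iceberg_hive.type", "hive")
            .config("spark.sql.catalog.iceberg_hive.uri", "thrift://metastore-host:9083") // placeholder
            .getOrCreate();

        // Tables are then referenced through that one catalog, e.g.:
        spark.sql("INSERT INTO iceberg_hive.db.tbl VALUES (1)");
      }
    }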