Yevgeniy,

The project recently moved to Apache. I’m adding the new mailing list and will update the old README with some pointers.

The ASF site has newer Javadoc:
https://iceberg.apache.org/javadoc/0.6.0/index.html?com/netflix/iceberg/package-summary.html

Right now, the easiest way to test is with a path-based table. That’s what Spark supports, since we haven’t updated it to use the Hive metastore. You can add Iceberg by downloading the iceberg-runtime Jar and adding it to your Spark classpath using --jars.

Spark 2.3.x only supports interacting with Iceberg or other v2 sources through the DataFrame API and doesn’t support creating tables with DDL. We’re working on getting those features into Spark, but for now you have to create the table first and then write to it from Spark. Here’s an example:

    val schema = new Schema(...)
    val spec = PartitionSpec.builderFor(schema).build() // add configuration for your partitioning
    val tables = new HadoopTables(spark.sparkContext.hadoopConfiguration)
    val table = tables.create(schema, spec, "hdfs://nn:8020/path/to/table")

Once your table is created, you can write to it using the DataFrame API. Be sure to sort the DataFrame so that the data for each partition is grouped together.

    df.write.format("iceberg").save("hdfs://nn:8020/path/to/table")

Once data is written, you can read from the table like this:

    val df = spark.read.format("iceberg").load("hdfs://nn:8020/path/to/table")

We will be adding Hive support so you can refer to your table by name and use the Hive metastore to track its metadata, but Hadoop tables should get you started with your evaluation.

Thanks for reaching out!

rb

On Tue, Dec 4, 2018 at 7:51 AM Yevgeniy Viller <zhenya.vi...@gmail.com> wrote:

>
> Hey Ryan,
>
> We are doing POC of Iceberg against our internal datawarehouse platform.
> I read Spec docs and example in Git. However, it is still little hard to
> get started without proper examples. Also current version in Git is 0.5.1,
> but API docs
> https://docs.google.com/document/d/1Q-zL5lSCle6NEEdyfiYsXYzX_Q8Qf0ctMyGBKslOswA/edit#heading=h.vga9bjlv1x2e
> is for 0.3.0. Do you have link current version of APIs specs?
>
> Thanks,
> Yevgeniy
>
> On Thursday, January 4, 2018 at 2:19:20 PM UTC-5, Ryan Blue wrote:
>>
>> The Iceberg repository is now public on github, here:
>> https://github.com/Netflix/iceberg
>>
>> The project is built with gradle and requires a Spark 2.3.0-SNAPSHOT (for
>> Datasource V2) and Parquet 1.9.1-SNAPSHOT (for API additions and bug fixes).
>>
>> An early version of the spec is available for comments here:
>> https://docs.google.com/document/d/1Q-zL5lSCle6NEEdyfiYsXYzX_Q8Qf0ctMyGBKslOswA/edit?usp=sharing
>>
>> Feedback is welcome!
>>
>> rb
>>
>> --
>> Ryan Blue
>> Software Engineer
>> Netflix
>>

--
Ryan Blue
Software Engineer
Netflix