Re: Update existing partition spec

2021-06-09 Thread 126
Hi Devs, I had the same problem and have submitted a PR to try to fix it: https://github.com/apache/iceberg/pull/2691

Re: AWS DynamoDB Catalog Support

2021-06-09 Thread Forward Xu
Thanks, Jack and Ryan! I think it is necessary to extend the metastore and provide a REST API service, so that we do not have to rely on a metastore such as the Hive metastore in production. Big +1. Forward Ryan Blue wrote on Thu, Jun 10, 2021 at 7:52 AM: > Thanks, Jack! I agree that it would be great to get an idea of what >

Re: AWS DynamoDB Catalog Support

2021-06-09 Thread Ryan Blue
Thanks, Jack! I agree that it would be great to get an idea of what catalogs are out there so we can add them to Iceberg. Always good to share those implementations. And not just for DynamoDB, too, if there are any others out there that aren't too specific to a company's internal metastore. One th

Re: AssertJ for Assertions

2021-06-09 Thread Ryan Blue
Thanks for the additional examples, Eduard. I do like the assertions here and I'm fine with the idea of gradually moving over to them. Looks like some of the CharSequence and collection comparators would definitely be easier to use than `assertEquals(ImmutableSet.of(...), actualSet)`. As long as w

Re: Consistency problems with Iceberg + EMRFS

2021-06-09 Thread Ryan Blue
Thanks for the additional detail. If you're not writing concurrently, then that eliminates the explanations that I had. I also don't think that Iceberg retries would be a problem because Iceberg will only retry if the commit fails. But there is no reason for a commit to fail and retry because nothi

Re: Update existing partition spec

2021-06-09 Thread Jun H.
Hi Laszlo, As Jack mentioned, this behavior occurs with the v1 format. If you want to stay with the v1 format, a short-term workaround is to add this field back under a different name, e.g. table.updateSpec().addField("ts_month_new", month("ts")).commit(); or rename the field before deleting it table.u

Re: Consistency problems with Iceberg + EMRFS

2021-06-09 Thread Scott Kruger
Here’s a little more detail on our use case that might be helpful. We’re running a batch process to apply CDC to several hundred tables every few hours; we use Iceberg (via HadoopTables) on top of a traditional Hive external table model (EMRFS + Parquet + Glue metastore) to track the commits (t

Re: Update existing partition spec

2021-06-09 Thread Jack Ye
Hi Laszlo, I think this is the expected behavior of format v1, where a dropped partition column is converted to a null transform instead of actually being dropped. This is because the partition spec is not versioned in v1. See https://github.com/apache/iceberg/blob/efaad975fc9bbeaa27392b3332

Update existing partition spec

2021-06-09 Thread Laszlo Pinter
Hi Iceberg Devs, I'm using the Iceberg API to drop/add partition transforms, but recently I've run into an issue. When I try to add back a partition that previously existed but has since been dropped, I get a *Cannot use partition name more than once* error message. Here is what I'm doing: table.up

Re: AssertJ for Assertions

2021-06-09 Thread Eduard Tudenhoefner
Thanks guys for your feedback. The idea was not to replace existing code when introducing AssertJ, as that would probably cause merge conflicts for a lot of people. Rather, the idea was to give people a (better) alternative when testing certain things, such as collections, exceptions, paths,