Re: Iceberg build and Nessie

2021-06-14 Thread Ryan Murray
Thanks Jack/Ryan for the feedback! I am happy to go with option 3 if that is what people agree on. It's the simplest change. To be clear: we will just skip tests in the Nessie module on Java 8. This will still create the jars for releases and still run tests for Java 11. I will wait a day or two to

Re: Iceberg build and Nessie

2021-06-14 Thread Ryan Blue
I would also vote for option 3. That seems like the easiest way to go if the Nessie server can't run in JDK8. Assuming that people have multiple JDKs available seems brittle to me, too. And using a container seems like a bit more development work in the Iceberg build. How long can we continue runni

Re: Keeping infinite snapshots

2021-06-14 Thread Ryan Blue
Keeping snapshots will add some metadata, but it isn't a ton and you can probably drop some summary metadata to make it smaller (the Spark app ID, for example). Since compaction creates new snapshots, it wouldn't really help. What would help is keeping track of "versions" as branches. Then you can

Re: Iceberg build and Nessie

2021-06-14 Thread Jack Ye
My thinking is that currently we already have CI tasks for JDK 8 and 11 separately, so it would be sufficient to only run build on JDK8 and run build and test on JDK11, which leads to option 3. For option 1, I think people do mostly have JDK 11, but might not want to have 8 and 11 at the same time

Re: Keeping infinite snapshots

2021-06-14 Thread Suraj Chandran
Thanks. So our use case is to keep all the snapshots back to the beginning of time. How is that going to impact performance, since the metadata files will grow quite a bit? Also, would it reduce opportunities for data compaction? One idea I had around this was to create a solution in Iceberg to be able t

Re: Keeping infinite snapshots

2021-06-14 Thread Ryan Blue
Hi Suraj, I just answered on Slack, but I'll copy the replies here for everyone who's subscribed to the dev list: 1) Yes, there are use cases around this. To assist, we're planning on adding named snapshots so you don't keep complete history. Instead, you should keep a selection of snapshots. 2)

Iceberg build and Nessie

2021-06-14 Thread Ryan Murray
Hi All, Currently the Iceberg CI/development build relies on a custom Gradle plugin to start a Nessie server for the Nessie module tests. The underlying tech that powers a Nessie server is Quarkus. Unfortunately Quarkus is dropping support for building and running from JDK 8. This means that the Ja