Re: Approaching Vectorized Reading in Iceberg ..

2019-07-26 Thread Gautam
Got it. Commented out that module and it works. I was just curious why it doesn't work on the master branch either. On Fri, Jul 26, 2019 at 3:49 PM Daniel Weeks wrote: > Actually, it looks like the issue is right there in the error . . . the > ErrorProne module is being excluded from the compile stage

Re: Approaching Vectorized Reading in Iceberg ..

2019-07-26 Thread Daniel Weeks
Actually, it looks like the issue is right there in the error . . . the ErrorProne module is being excluded from the compile stages of the sub-projects here: https://github.com/apache/incubator-iceberg/blob/master/build.gradle#L152 However, it is still being applied to the jmh tasks. I'm not fami

Re: Approaching Vectorized Reading in Iceberg ..

2019-07-26 Thread Daniel Weeks
Gautam, you need to have the jmh-core libraries available to run. I validated that PR, so I'm guessing I had it configured in my environment. I assume there's a way to make that available within gradle, so I'll take a look. On Fri, Jul 26, 2019 at 2:52 PM Gautam wrote: > This fails on master t

Re: Approaching Vectorized Reading in Iceberg ..

2019-07-26 Thread Gautam
This fails on master too, btw. Just wondering if I'm doing something wrong trying to run this. On Fri, Jul 26, 2019 at 2:24 PM Gautam wrote: > I've been trying to run the jmh benchmarks bundled within the project. I've > been running into issues with that .. have others hit this? Am I running > thes

Re: Approaching Vectorized Reading in Iceberg ..

2019-07-26 Thread Gautam
I've been trying to run the jmh benchmarks bundled within the project. I've been running into issues with that .. have others hit this? Am I running these incorrectly? bash-3.2$ ./gradlew :iceberg-spark:jmh -PjmhIncludeRegex=IcebergSourceFlatParquetDataFilterBenchmark -PjmhOutputPath=benchmark/icebe

Re: Getting delta of data changes between 2 Snapshots

2019-07-26 Thread Ryan Blue
Thanks for working on this! I think being able to plan an incremental scan is a good idea overall. But we should avoid calling the incremental data a “snapshot”. A snapshot is the table state at some point in time, and I think it would be confusing if we started adding new meanings.