[GitHub] rdblue opened a new pull request #50: Use manifest lists by default and fix tests.

2018-12-12 Thread GitBox
rdblue opened a new pull request #50: Use manifest lists by default and fix tests. URL: https://github.com/apache/incubator-iceberg/pull/50 This also fixes tests that were picking up manifest list files because they validated manifests in the metadata directory by looking for all Avro file

[GitHub] rdblue commented on issue #46: Do not scan manifests with no deletes when expiring snapshots.

2018-12-12 Thread GitBox
rdblue commented on issue #46: Do not scan manifests with no deletes when expiring snapshots. URL: https://github.com/apache/incubator-iceberg/pull/46#issuecomment-446781096 Looks like we had not requested it yet, we had asked a question about it. I've submitted the request: https://issues

[GitHub] rdblue commented on issue #46: Do not scan manifests with no deletes when expiring snapshots.

2018-12-12 Thread GitBox
rdblue commented on issue #46: Do not scan manifests with no deletes when expiring snapshots. URL: https://github.com/apache/incubator-iceberg/pull/46#issuecomment-446775717 @groodt, I think we've requested for gitbox notifications to stop going to the dev list, but I'll check on that. Uns

[GitHub] groodt commented on issue #46: Do not scan manifests with no deletes when expiring snapshots.

2018-12-12 Thread GitBox
groodt commented on issue #46: Do not scan manifests with no deletes when expiring snapshots. URL: https://github.com/apache/incubator-iceberg/pull/46#issuecomment-446770708 I'm sorry. I really don't wish to be annoying, but I'm getting spammed by something called GitBox for all activity o

[GitHub] rdblue opened a new pull request #49: Fix type handling in Spark and Pig.

2018-12-12 Thread GitBox
rdblue opened a new pull request #49: Fix type handling in Spark and Pig. URL: https://github.com/apache/incubator-iceberg/pull/49 This copies Pig type handling from Spark and fixes a minor bug in Spark with integer logical types that have been promoted to longs. --

[GitHub] rdblue commented on issue #23: DataFile External Identifier Field

2018-12-12 Thread GitBox
rdblue commented on issue #23: DataFile External Identifier Field URL: https://github.com/apache/incubator-iceberg/issues/23#issuecomment-446766094 @vinooganesh, I don't really understand the use case. How would you use the identifier? -

[GitHub] vinooganesh commented on issue #23: DataFile External Identifier Field

2018-12-12 Thread GitBox
vinooganesh commented on issue #23: DataFile External Identifier Field URL: https://github.com/apache/incubator-iceberg/issues/23#issuecomment-446758457 Hey @rdblue - quickly jumping in here. I think the mentality is that a file path as the sole identifier of a file may not suffice for ev

[GitHub] rdblue closed pull request #46: Do not scan manifests with no deletes when expiring snapshots.

2018-12-12 Thread GitBox
rdblue closed pull request #46: Do not scan manifests with no deletes when expiring snapshots. URL: https://github.com/apache/incubator-iceberg/pull/46 This is a PR merged from a forked repository. As GitHub hides the original diff on merge, it is displayed below for the sake of provenanc

Reattempted message: Iceberg Encryption Proposal

2018-12-12 Thread Matt Cheah
Hi everyone, Firstly, if this is a duplicate e-mail, I apologize for that – it’s unclear if my previous e-mail went through so I’m trying again. Encrypting data written to Iceberg tables is crucial for using this technology securely in industry settings. Towards that end, I’ve proposed an

Iceberg Encryption Proposal

2018-12-12 Thread Matt Cheah
Hi everyone, Encrypting data written to Iceberg tables is crucial for using this technology securely in industry settings. Towards that end, I’ve proposed an API for supporting encryption, including how users can implement their own custom encryption key providers and the metadata we’ll need

[GitHub] rdblue opened a new pull request #48: Fix commit retry with manifest lists.

2018-12-12 Thread GitBox
rdblue opened a new pull request #48: Fix commit retry with manifest lists. URL: https://github.com/apache/incubator-iceberg/pull/48 A manifest list is created for every commit attempt. Before this update, the same file was used, which caused retries to fail trying to create the same l

[GitHub] rdblue commented on issue #20: Encryption in Data Files

2018-12-12 Thread GitBox
rdblue commented on issue #20: Encryption in Data Files URL: https://github.com/apache/incubator-iceberg/issues/20#issuecomment-446690252 @mccheah, can you also start a thread on the dev list to point out this spec? I think other people will probably be interested that aren't necessarily

[GitHub] rdsr commented on a change in pull request #45: Lazily submit tasks in ParallelIterable and add cancellation.

2018-12-12 Thread GitBox
rdsr commented on a change in pull request #45: Lazily submit tasks in ParallelIterable and add cancellation. URL: https://github.com/apache/incubator-iceberg/pull/45#discussion_r241119180 ## File path: core/src/main/java/com/netflix/iceberg/util/ParallelIterable.java ## @

[GitHub] mccheah commented on a change in pull request #45: Lazily submit tasks in ParallelIterable and add cancellation.

2018-12-12 Thread GitBox
mccheah commented on a change in pull request #45: Lazily submit tasks in ParallelIterable and add cancellation. URL: https://github.com/apache/incubator-iceberg/pull/45#discussion_r241103604 ## File path: core/src/main/java/com/netflix/iceberg/util/ParallelIterable.java ##