[GitHub] [iceberg-docs] kbendick commented on pull request #1: First version of hugo doc site

2021-12-06 Thread GitBox
kbendick commented on pull request #1: URL: https://github.com/apache/iceberg-docs/pull/1#issuecomment-987616428 > Thinking about this now, we could probably get rid of the main branch here and set next as the default branch for the repo, or alternatively we can have some "if main then nex

[GitHub] [iceberg-docs] samredai commented on pull request #1: First version of hugo doc site

2021-12-06 Thread GitBox
samredai commented on pull request #1: URL: https://github.com/apache/iceberg-docs/pull/1#issuecomment-987600519 > It looks like the Apache 2.0 license header is missing for some README and script files, please check and add, thanks! > > For Python script, it seems like indentation i

[GitHub] [iceberg-docs] samredai commented on a change in pull request #1: First version of hugo doc site

2021-12-06 Thread GitBox
samredai commented on a change in pull request #1: URL: https://github.com/apache/iceberg-docs/pull/1#discussion_r763658651 ## File path: content/about/about.md ## @@ -0,0 +1,9 @@ +--- +Title: What is Iceberg? +Draft: false +--- + +Iceberg adds tables to compute engines includi

[GitHub] [iceberg-docs] samredai commented on a change in pull request #1: First version of hugo doc site

2021-12-06 Thread GitBox
samredai commented on a change in pull request #1: URL: https://github.com/apache/iceberg-docs/pull/1#discussion_r763658510 ## File path: config.toml ## @@ -0,0 +1,24 @@ +baseURL = "" # This is populated by the github deploy workflow and is equal to "/" +languageCode = "en-us

[GitHub] [iceberg-docs] samredai commented on a change in pull request #1: First version of hugo doc site

2021-12-06 Thread GitBox
samredai commented on a change in pull request #1: URL: https://github.com/apache/iceberg-docs/pull/1#discussion_r763656608 ## File path: config.toml ## @@ -0,0 +1,24 @@ +baseURL = "" # This is populated by the github deploy workflow and is equal to "/" +languageCode = "en-us

[GitHub] [iceberg-docs] samredai commented on a change in pull request #1: First version of hugo doc site

2021-12-06 Thread GitBox
samredai commented on a change in pull request #1: URL: https://github.com/apache/iceberg-docs/pull/1#discussion_r763652888 ## File path: README.md ## @@ -1,3 +1,19 @@ -## Iceberg Docs +# Apache Iceberg Documentation Site -This repository contains the markdown documentation h

[GitHub] [iceberg-docs] samredai commented on a change in pull request #1: First version of hugo doc site

2021-12-06 Thread GitBox
samredai commented on a change in pull request #1: URL: https://github.com/apache/iceberg-docs/pull/1#discussion_r763652648 ## File path: .github/workflows/deploy.yml ## @@ -0,0 +1,52 @@ +name: github pages + +on: [push, pull_request] + +jobs: + deploy: +runs-on: ubuntu-20

Re: Some questions related to compaction support.

2021-12-06 Thread Russell Spitzer
> On Dec 6, 2021, at 9:02 PM, Puneet Zaroo wrote: > > Hi, > I had a few questions related to compaction support, in particular compaction for CDC destination iceberg tables. Perhaps this information is available somewhere else, but I could not find it readily, so responses appreciated. >

Re: Some questions related to compaction support.

2021-12-06 Thread Jack Ye
For clarification, Ajantha is correct about 4, I just mean we can remove delete files more eagerly using an additional procedure, but normal snapshot expiration still works. -Jack On Mon, Dec 6, 2021 at 9:22 PM Jack Ye wrote: > 1. Yes, you are correct. > 2. We just added the SQL procedure call,

Re: Some questions related to compaction support.

2021-12-06 Thread Jack Ye
1. Yes, you are correct. 2. We just added the SQL procedure call, if you don't want to directly invoke the action via Spark: https://github.com/apache/iceberg/blob/master/site/docs/spark-procedures.md?plain=1#L243 3. The filter is a data filter, it does not need to be at partition boundary, you can

Re: Some questions related to compaction support.

2021-12-06 Thread Ajantha Bhat
> 1. I believe compaction for the CDC use case will require iceberg version >= 0.13 (to pick up the change that maintains the same sequence numbers after compaction) and Spark version >= 3.0 (for the actual compaction action support). But please correct me if I'm wrong. > > *y

Some questions related to compaction support.

2021-12-06 Thread Puneet Zaroo
Hi, I had a few questions related to compaction support, in particular compaction for CDC destination iceberg tables. Perhaps this information is available somewhere else, but I could not find it readily, so responses appreciated. 1. I believe compaction for the CDC use case will require iceber

[GitHub] [iceberg-docs] rdblue commented on a change in pull request #1: First version of hugo doc site

2021-12-06 Thread GitBox
rdblue commented on a change in pull request #1: URL: https://github.com/apache/iceberg-docs/pull/1#discussion_r763196285 ## File path: content/about/about.md ## @@ -0,0 +1,9 @@ +--- +Title: What is Iceberg? +Draft: false +--- + +Iceberg adds tables to compute engines including

[GitHub] [iceberg-docs] rdblue commented on a change in pull request #1: First version of hugo doc site

2021-12-06 Thread GitBox
rdblue commented on a change in pull request #1: URL: https://github.com/apache/iceberg-docs/pull/1#discussion_r763195862 ## File path: config.toml ## @@ -0,0 +1,24 @@ +baseURL = "" # This is populated by the github deploy workflow and is equal to "/" +languageCode = "en-us"

[GitHub] [iceberg-docs] rdblue commented on a change in pull request #1: First version of hugo doc site

2021-12-06 Thread GitBox
rdblue commented on a change in pull request #1: URL: https://github.com/apache/iceberg-docs/pull/1#discussion_r763195072 ## File path: asciinema/schema_evolution.py ## @@ -0,0 +1,53 @@ +from generate_asciinema_cast import Cast + +sequence = [ +( +"ALTER TABLE taxis

[GitHub] [iceberg-docs] rdblue commented on a change in pull request #1: First version of hugo doc site

2021-12-06 Thread GitBox
rdblue commented on a change in pull request #1: URL: https://github.com/apache/iceberg-docs/pull/1#discussion_r763192839 ## File path: README.md ## @@ -1,3 +1,19 @@ -## Iceberg Docs +# Apache Iceberg Documentation Site -This repository contains the markdown documentation hos

Re: High memory usage with highly concurrent committers

2021-12-06 Thread Piotr Findeisen
Hi Igor, does fs.gs.outputstream.upload.chunk.size affect the file size I can upload? Can I upload e.g. a 1GB Parquet file, while also setting fs.gs.outputstream.upload.chunk.size=8388608 (8MB / MiB)? Best PF On Fri, Dec 3, 2021 at 5:33 PM Igor Dvorzhak wrote: > No, right now this is a global