Re: Sorting requirements for partition keys

2021-02-12 Thread kkishore iiith
Apologies, https://iceberg.apache.org/spark-writes/#writing-to-partitioned-tables has answered my question.

On Fri, Feb 12, 2021 at 2:09 PM kkishore iiith wrote:
> Hello Community,
>
> https://developer.ibm.com/technologies/artificial-intelligence/articles/the-why-and-how-of-par

Sorting requirements for partition keys

2021-02-12 Thread kkishore iiith
Hello Community,

https://developer.ibm.com/technologies/artificial-intelligence/articles/the-why-and-how-of-partitioning-in-apache-iceberg/ talks about sorting partition data. Is that a requirement, or is it only needed to improve performance?

Thanks,
Kishor.
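For readers landing on this question: as I understand the spark-writes page linked in the reply, at the time of this thread each Spark task had to see its rows grouped by partition value when writing to a partitioned table, so the sort was a requirement on the write path and not only a tuning step. The sketch below is a toy model of that behaviour in plain Python, not Iceberg's actual writer. `ClusteredWriter`, `write_task`, and the error text are hypothetical names chosen for illustration.

```python
from typing import Any, Callable, Dict, Iterable, List

_NO_PARTITION = object()  # sentinel: no partition file is open yet


class ClusteredWriter:
    """Hypothetical writer that keeps only one partition file open at a time."""

    def __init__(self) -> None:
        self.current: Any = _NO_PARTITION
        self.closed: set = set()
        self.files: Dict[Any, List[Any]] = {}

    def write(self, key: Any, row: Any) -> None:
        if key != self.current:
            # A key that was already closed means the input was not clustered.
            if key in self.closed:
                raise ValueError(f"already closed file for partition {key!r}")
            if self.current is not _NO_PARTITION:
                self.closed.add(self.current)
            self.current = key
            self.files[key] = []
        self.files[key].append(row)


def write_task(rows: Iterable[Any], key_fn: Callable[[Any], Any]) -> Dict[Any, List[Any]]:
    """Write one task's rows, returning the rows that landed in each partition file."""
    writer = ClusteredWriter()
    for row in rows:
        writer.write(key_fn(row), row)
    return writer.files


def category(row):
    """Partition key for the toy rows below: the first tuple field."""
    return row[0]


rows = [("a", 1), ("b", 2), ("a", 3)]

# Sorting by the partition key groups each partition's rows together,
# so the writer never has to reopen a partition it already closed.
files = write_task(sorted(rows, key=category), category)
```

Passing `rows` unsorted makes partition `"a"` reappear after it was closed, so `write_task` raises. In Spark the corresponding fix is to order the data by the partition columns before the write, which is what the linked page describes.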

Re: Followup from iceberg newbie questions

2021-02-09 Thread kkishore iiith
.load("db.table")
>
> I hope that helps!
>
> rb
>
> On Tue, Feb 9, 2021 at 5:57 PM Ryan Blue wrote:
>
>> Replies inline.
>>
>> On Tue, Feb 9, 2021 at 5:36 PM kkishore iiith wrote:
>>
>>> *If a file system does not support at

Followup from iceberg newbie questions

2021-02-09 Thread kkishore iiith
Hello,

This is a follow-up to https://lists.apache.org/thread.html/rd15bf1db711b1a31f39d4b98776f29753b544fa3a496111d3460e11e%40%3Cdev.iceberg.apache.org%3E

*If a file system does not support atomic renames, then you should use a metastore to track tables. You can use Hive, Nessie, or Glue. We also

Newbie iceberg questions

2021-01-28 Thread kkishore iiith
Hello Community,

I am working on handling late-arriving data in one of our systems. Currently, we wait 8 hours for late data to arrive before we start processing the current hour's data. We have three stages in our pipeline, A -> B -> C, where B waits for 8 hours for A's hourly dat