Hi folks,

We are now consistently hitting this problem:

1) We want to write a data set on the order of 1-5TB into a partitioned 
Iceberg table.
2) On a single write, Iceberg will fail a partitioned write if it detects that 
the incoming data is not sorted by partition. This has been discussed on this 
list before, and as per Ryan, the behavior exists to avoid the small-files 
problem.
3) Thus, you are forced to sort. Ryan recommends to ‘sort by all partitioning 
columns, plus a high cardinality column like an ID’ (see the Iceberg dev list 
thread “Timeout waiting for connection from pool (S3)“).
4) For small writes this is no problem, but I have had many job failures where 
the sort runs out of temp space during the shuffle. The worker nodes are well 
provisioned, with 800GB of temp space available each.
5) When the sort does succeed, it consistently makes writes take at least 
twice as long.
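To make (2) and (3) concrete, here is a toy simulation of why ordering matters 
to a partitioned writer. This is plain Python with hypothetical names, a sketch 
of the idea rather than Iceberg's actual code: assuming a writer must hold a 
file open from the first to the last row of each partition it sees, unsorted 
input can force one open file per partition at once, while input sorted by 
(partition column, high-cardinality ID) never needs more than one.

```python
def open_writers_needed(partition_values):
    """Peak number of files a partitioned writer must hold open at once,
    under the simplifying assumption that a partition's file stays open
    from its first row to its last row (a sketch, not Iceberg's code)."""
    first, last = {}, {}
    for i, p in enumerate(partition_values):
        first.setdefault(p, i)
        last[p] = i
    # Sweep over open (+1) and close (-1) events to find the peak overlap;
    # a close at the same index sorts before an open, so back-to-back
    # partitions in sorted input do not overlap.
    events = sorted([(i, 1) for i in first.values()] +
                    [(i + 1, -1) for i in last.values()])
    open_now = peak = 0
    for _, delta in events:
        open_now += delta
        peak = max(peak, open_now)
    return peak

# Hypothetical rows: (partition_day, id)
rows = [("2024-01-02", 7), ("2024-01-01", 3),
        ("2024-01-02", 1), ("2024-01-01", 9)]

unsorted_peak = open_writers_needed([day for day, _ in rows])       # 2
sorted_rows = sorted(rows)  # sort by (partition column, high-cardinality id)
sorted_peak = open_writers_needed([day for day, _ in sorted_rows])  # 1
```

The sort in (3) exists to turn the first case into the second; the question 
below is whether that trade is worth it at the TB scale.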

To summarize:

Given the cost of sorting TBs of data, any tips on what to do in Iceberg to 
avoid it?
Should we allow unsorted writes to go through with a flag?

Thanks,
Xabriel J Collazo Mojica  |  Senior Software Engineer  |  Adobe  |  
xcoll...@adobe.com
