Am I misunderstanding what you're saying? FWIW this is Spark 2.4.x with
Iceberg 0.10.0 using the DataFrame API.
From: Ryan Blue
Reply-To: "rb...@netflix.com"
Date: Tuesday, November 24, 2020 at 11:47 AM
To: "Kruger, Scott"
Cc: "dev@iceberg.apache.org"
Subject: Re: Bucket partitioning in addition to regular partitioning
That's the part I'm having trouble with (I can get things to work just
fine if I follow the docs and only partition by the bucketed ID).
From: Ryan Blue
Reply-To: "dev@iceberg.apache.org" ,
"rb...@netflix.com"
Date: Friday, November 20, 2020 at 8:11 PM
To: Iceberg Dev List
Subject: Re: Bucket partitioning in addition to regular partitioning
Hi Scott,
There are some docs to help with this situation:
https://iceberg.apache.org/spark/#writing-against-partitioned-table
We added a helper function, IcebergSpark.registerBucketUDF, to register the
UDF that you need for the bucket column. That's probably the source of the
problem.
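[For anyone finding this thread later: the UDF that registerBucketUDF installs applies Iceberg's bucket transform, which per the Iceberg spec is a 32-bit Murmur3 (x86 variant, seed 0) hash of the value, reduced mod the bucket count; longs are hashed from their 8-byte little-endian form. A standalone Python sketch of that transform, with function names of my own choosing rather than anything from the Iceberg API:]

```python
def murmur3_x86_32(data: bytes, seed: int = 0) -> int:
    """32-bit Murmur3 hash (x86 variant), as used by Iceberg's bucket transform."""
    c1, c2 = 0xcc9e2d51, 0x1b873593
    h = seed
    length = len(data)
    rounded = length & ~3
    # Body: process 4-byte little-endian blocks.
    for i in range(0, rounded, 4):
        k = int.from_bytes(data[i:i + 4], "little")
        k = (k * c1) & 0xffffffff
        k = ((k << 15) | (k >> 17)) & 0xffffffff  # rotl32(k, 15)
        k = (k * c2) & 0xffffffff
        h ^= k
        h = ((h << 13) | (h >> 19)) & 0xffffffff  # rotl32(h, 13)
        h = (h * 5 + 0xe6546b64) & 0xffffffff
    # Tail: 1-3 leftover bytes.
    k = 0
    tail = length & 3
    if tail == 3:
        k ^= data[rounded + 2] << 16
    if tail >= 2:
        k ^= data[rounded + 1] << 8
    if tail >= 1:
        k ^= data[rounded]
        k = (k * c1) & 0xffffffff
        k = ((k << 15) | (k >> 17)) & 0xffffffff
        k = (k * c2) & 0xffffffff
        h ^= k
    # Finalization mix.
    h ^= length
    h ^= h >> 16
    h = (h * 0x85ebca6b) & 0xffffffff
    h ^= h >> 13
    h = (h * 0xc2b2ae35) & 0xffffffff
    h ^= h >> 16
    return h


def iceberg_bucket_long(value: int, num_buckets: int) -> int:
    # Longs are hashed from their 8-byte little-endian representation,
    # then reduced with (hash & Integer.MAX_VALUE) % N, per the Iceberg spec.
    h = murmur3_x86_32(value.to_bytes(8, "little", signed=True))
    return (h & 0x7fffffff) % num_buckets


# The spec's own test vector: long 34 hashes to 2017239379,
# so with 16 buckets it lands in bucket 3.
print(iceberg_bucket_long(34, 16))  # → 3
```

[Handy for sanity-checking that the values the registered UDF writes into the bucket column match what Iceberg's own partition spec would compute.]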
I want to have a table that’s partitioned by the following, in order:
* Low-cardinality identity
* Day
* Bucketed long ID, 16 buckets
Is this possible? If so, how should I do the dataframe write? This is what I’ve
tried so far:
1. df.orderBy("identity", "day").sortWithinPartit