Hello.

Maybe somebody has faced the same issue. I am trying to write data to a table
using the DataFrame API v2. The table is partitioned by date, x, and a bucket
transform on y:

    df.writeTo("some_table")
      .partitionedBy(col("date"), col("x"), bucket(10, col("y")))
      .using("iceberg")
      .createOrReplace()
Can I somehow prepare the df in terms of partitions before writing to the
destination, so that I don't end up writing too many files? The raw data is
not grouped by the partition keys. What I would like is something like:

    df.repartition(col("x"), bucket(10, col("y")))
      .writeTo("some_table")
      .partitionedBy(col("date"), col("x"), bucket(10, col("y")))
      .using("iceberg")
      .createOrReplace()
The bucket function can't be used that way, because it fails with:

    [INTERNAL_ERROR] Cannot generate code for expression: bucket(10, input[0, bigint, true])
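
The closest workaround I can think of (just a sketch, assuming the Iceberg
runtime's IcebergSpark.registerBucketUDF helper is on the classpath, y is a
bigint, spark is the active SparkSession, and the UDF name iceberg_bucket10 is
arbitrary) is to register Iceberg's bucket transform as a regular UDF and
repartition by that expression instead of the unresolvable bucket():

    import org.apache.iceberg.spark.IcebergSpark
    import org.apache.spark.sql.functions.{bucket, col, expr}
    import org.apache.spark.sql.types.DataTypes

    // Register Iceberg's bucket transform for bigint input as a SQL UDF;
    // 10 must match the number of buckets in the table's partition spec.
    IcebergSpark.registerBucketUDF(spark, "iceberg_bucket10", DataTypes.LongType, 10)

    // Cluster incoming rows the same way the table is partitioned, so each
    // task writes to as few (date, x, bucket) combinations as possible.
    df.repartition(col("date"), col("x"), expr("iceberg_bucket10(y)"))
      .writeTo("some_table")
      .partitionedBy(col("date"), col("x"), bucket(10, col("y")))
      .using("iceberg")
      .createOrReplace()

Would that produce the same bucket values as the table's bucket(10, y)
partition transform, or is there a better way?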

Thanks
