Kavin88 opened a new issue #4675:
URL: https://github.com/apache/hudi/issues/4675
Hi Team,
We are trying to create partition folders in the format below using Hudi 0.5.0:
s3://bucketname/folder1/**partition_col_nm1=**2015-01-01/
s3://bucketname/folder1/**partition_col_nm1=**2016-01-01/
However, it only supports the following format, without the partition column names:
s3://bucketname/folder1/2015-01-01/
s3://bucketname/folder1/2016-01-01/
Is there any workaround to achieve this format
s3://bucketname/folder1/partition_col_nm1=2015-01-01/ in Hudi 0.5.0? This would
be of great help: even when we upgrade to the latest Hudi version, there should
then be no need to rebuild/reload these partitioned tables. A sketch of one
workaround we are considering is included after the code snippet below.
**Environment Description**
* Hudi version : 0.5.0
* Spark version : 2.4.4
* Hive version : 3.1.2
* Hadoop version : 3.2.1
* Storage (HDFS/S3/GCS..) : S3
* Running on Docker? (yes/no) : no
# `spark` here is the active SparkSession (e.g. from pyspark or an EMR notebook).
inputDF = spark.createDataFrame(
    [
        ("100", "2015-01-01", "2015-01-01T13:51:39.340396Z"),
        ("101", "2015-01-01", "2015-01-01T12:14:58.597216Z"),
        ("102", "2015-01-01", "2015-01-01T13:51:40.417052Z"),
        ("103", "2015-01-01", "2015-01-01T13:51:40.519832Z"),
        ("104", "2015-01-02", "2015-01-01T12:15:00.512679Z"),
        ("105", "2015-01-02", "2015-01-01T13:51:42.248818Z"),
    ],
    ["id", "creation_date", "last_update_time"],
)

# Specify common DataSourceWriteOptions in the single hudiOptions variable
hudiOptions = {
    "hoodie.table.name": "my_hudi_table",
    "hoodie.datasource.write.recordkey.field": "id",
    "hoodie.datasource.write.partitionpath.field": "creation_date",
    "hoodie.datasource.write.precombine.field": "last_update_time",
    # Setting this does not produce hive-style partition paths on 0.5.0,
    # which is the behaviour described above.
    "hoodie.datasource.write.hive_style_partitioning": "true",
}
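
For reference, this is roughly the workaround we are considering; it is only a
minimal sketch. The idea is to derive a partition path column that already
carries the "creation_date=" prefix and point
hoodie.datasource.write.partitionpath.field at it, assuming the default key
generator writes that value into the folder name unchanged. The format name and
target path below are placeholders for illustration, not our actual job.

from pyspark.sql.functions import col, concat, lit

# Derive a partition path column that already carries the hive-style prefix,
# e.g. "creation_date=2015-01-01", and partition on that derived column.
prefixedDF = inputDF.withColumn(
    "partition_path", concat(lit("creation_date="), col("creation_date"))
)

workaroundOptions = dict(hudiOptions)
workaroundOptions["hoodie.datasource.write.partitionpath.field"] = "partition_path"
# Drop the hive-style option so the prefix is not applied twice on versions
# that do honour it.
workaroundOptions.pop("hoodie.datasource.write.hive_style_partitioning", None)

# Write as usual; the format name and S3 path are illustrative placeholders.
(
    prefixedDF.write.format("org.apache.hudi")
    .options(**workaroundOptions)
    .mode("overwrite")
    .save("s3://bucketname/folder1/")
)

If there is a cleaner way to get partition_col_nm1=value folders on 0.5.0, or a
recommended approach that stays compatible after an upgrade, please let us know.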