Let's say I have a Spark DataFrame with 3 columns: id, name, age. When I save it to HDFS/S3 using partitionBy("id", "name"), it is written as:

<root-dir>/id=1/name=Alex/<filename-1>.parquet
<root-dir>/id=2/name=Bob/<filename-2>.parquet

What should I do if I do not want "id=" and "name=" in the directory structure? In other words, I want the final output to be:

<root-dir>/1/Alex/<filename-1>.parquet
<root-dir>/2/Bob/<filename-2>.parquet

Thanks,
M. Parsian
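For context: as far as I know, Spark's partitionBy always writes Hive-style key=value directories and has no option to drop the prefix. One workaround I have seen is to collect the distinct partition values and write each subset to a hand-built path. Below is a minimal sketch of that idea; the helper names (plain_partition_path, write_without_key_prefixes) are my own, and df is assumed to be a PySpark DataFrame with the id, name, age columns described above.

```python
def plain_partition_path(root, id_value, name_value):
    # Build <root>/<id>/<name> instead of Hive-style <root>/id=.../name=...
    return f"{root}/{id_value}/{name_value}"

def write_without_key_prefixes(df, root_dir):
    # Spark's partitionBy always emits key=value directories, so we
    # emulate plain directories by writing each partition separately.
    # df is assumed to be a PySpark DataFrame with columns id, name, age.
    for row in df.select("id", "name").distinct().collect():
        subset = df.filter((df.id == row["id"]) & (df.name == row["name"]))
        # Drop the partition columns so they are not duplicated inside the
        # files, mirroring what partitionBy would have done.
        subset.drop("id", "name").write.mode("overwrite").parquet(
            plain_partition_path(root_dir, row["id"], row["name"])
        )
```

Note the trade-off: with plain directories, Spark's partition discovery will no longer reconstitute id and name as columns when you read the data back, so readers must parse them out of the paths themselves.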