There will be in 1.4:

    df.write.partitionBy("year", "month", "day").parquet("/path/to/output")
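A minimal end-to-end sketch of that API, assuming Spark 1.4; the SparkContext/SQLContext setup and the events DataFrame with its year/month/day columns are made up for illustration, and only the write call itself comes from the snippet above:

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.sql.SQLContext

    val sc = new SparkContext(new SparkConf().setAppName("partitioned-write"))
    val sqlContext = new SQLContext(sc)
    import sqlContext.implicits._

    // Hypothetical rows; year/month/day become partition columns on write.
    val events = Seq(
      (2015, 6, 1, "a"),
      (2015, 6, 2, "b")
    ).toDF("year", "month", "day", "value")

    // Lays files out as /path/to/output/year=2015/month=6/day=1/...
    events.write.partitionBy("year", "month", "day").parquet("/path/to/output")

    // Partition discovery recovers year/month/day from the directory names.
    val readBack = sqlContext.read.parquet("/path/to/output")
    readBack.printSchema() // schema includes year, month, day

Reading the output directory back then picks the partition columns up automatically, which is the partition-discovery behavior described in the programming guide.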
On Mon, Jun 1, 2015 at 10:21 PM, Matt Cheah <mch...@palantir.com> wrote:
> Hi there,
>
> I noticed in the latest Spark SQL programming guide
> <https://spark.apache.org/docs/latest/sql-programming-guide.html>, there
> is support for optimized reading of partitioned Parquet files that have a
> particular directory structure (year=1/month=10/day=3, for example).
> However, I see no analogous way to write DataFrames as Parquet files with
> similar directory structures based on user-provided partitioning.
>
> Generally, is it possible to write DataFrames as partitioned Parquet files
> that downstream partition discovery can take advantage of later? I
> considered extending the Parquet output format, but it looks like
> ParquetTableOperations.scala has fixed the output format to
> AppendingParquetOutputFormat.
>
> Also, I was wondering if it would be valuable to contribute writing
> Parquet in partition directories as a PR.
>
> Thanks,
>
> -Matt Cheah