Thanks

On Wed, May 8, 2019 at 10:36 AM Felix Cheung <felixcheun...@hotmail.com> wrote:
> You could
>
> df.filter(col("c") === "c1").write.partitionBy("c").save
>
> It could hit some data skew problems, but it might work for you.
>
> ------------------------------
> From: Burak Yavuz <brk...@gmail.com>
> Sent: Tuesday, May 7, 2019 9:35:10 AM
> To: Shubham Chaurasia
> Cc: dev; u...@spark.apache.org
> Subject: Re: Static partitioning in partitionBy()
>
> It depends on the data source. Delta Lake (https://delta.io) allows you
> to do it with the .option("replaceWhere", "c = c1"). With other file
> formats, you can write directly into the partition directory
> (tablePath/c=c1), but you lose atomicity.
>
> On Tue, May 7, 2019, 6:36 AM Shubham Chaurasia <shubh.chaura...@gmail.com>
> wrote:
>
>> Hi All,
>>
>> Is there a way I can provide static partitions in partitionBy()?
>>
>> Like:
>> df.write.mode("overwrite").format("MyDataSource").partitionBy("c=c1").save
>>
>> The above code gives the following error, since Spark tries to find a
>> column literally named `c=c1` in df:
>>
>> org.apache.spark.sql.AnalysisException: Partition column `c=c1` not found
>> in schema struct<a:string,b:string,c:string>;
>>
>> Thanks,
>> Shubham
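For anyone finding this thread later, here is a minimal sketch of the filter-then-partitionBy workaround Felix describes. The DataFrame contents and the output path /tmp/out are hypothetical stand-ins, not from the thread:

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions.col

    val spark = SparkSession.builder().appName("static-partition-sketch").getOrCreate()
    import spark.implicits._

    // Hypothetical stand-in for the real input table.
    val df = Seq(("a1", "b1", "c1"), ("a2", "b2", "c2")).toDF("a", "b", "c")

    // Since Spark 2.3, dynamic partition overwrite leaves other partitions
    // intact; without it, mode("overwrite") replaces the whole output path.
    spark.conf.set("spark.sql.sources.partitionOverwriteMode", "dynamic")

    // Keep only the rows for the static partition value, then let
    // partitionBy lay them out under /tmp/out/c=c1/.
    df.filter(col("c") === "c1")
      .write
      .mode("overwrite")
      .partitionBy("c")
      .parquet("/tmp/out")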
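Likewise, a sketch of the Delta Lake replaceWhere approach Burak mentions, assuming delta-core is on the classpath and an existing Delta table at the hypothetical path /tmp/delta/tbl that is partitioned by c. Note that replaceWhere requires the written rows to satisfy the predicate, hence the matching filter, and string values in the predicate need SQL-style quoting:

    // Atomically replace only the c = 'c1' partition of the Delta table.
    df.filter(col("c") === "c1")
      .write
      .format("delta")
      .mode("overwrite")
      .option("replaceWhere", "c = 'c1'")
      .save("/tmp/delta/tbl")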