Re: Append more files to existing partitioned data

2018-03-18 Thread Serega Sheypak
Thanks a lot!
2018-03-18 9:30 GMT+01:00 Denis Bolshakov:
> Please check out
> org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand
> and
> org.apache.spark.sql.execution.datasources.WriteRelation
> I guess it's managed by
> job.getConfiguration.set(DATASOURC

Re: Append more files to existing partitioned data

2018-03-18 Thread Denis Bolshakov
Please check out
org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand
and
org.apache.spark.sql.execution.datasources.WriteRelation
I guess it's managed by
job.getConfiguration.set(DATASOURCE_WRITEJOBUUID, uniqueWriteJobId.toString)
On 17 March 2018 at 20:46, Serega

Re: Append more files to existing partitioned data

2018-03-17 Thread Serega Sheypak
Hi Denis, great to see you here :)
It works, thanks! Do you know how Spark generates data file names? The names look like part- with a UUID appended, for example:
part-0-124a8c43-83b9-44e1-a9c4-dcc8676cdb99.c000.snappy.parquet
2018-03-17 14:15 GMT+01:00 Denis Bolshakov:
> Hello Serega,
> https:
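
For anyone who wants to inspect such names, here is a minimal sketch that splits one into its parts. The layout (part-, a numeric index, a UUID, a .cNNN counter, then the codec and format suffix) is only inferred from the single sample name quoted above; it is an assumption, not a documented Spark contract, and the helper name parse_part_file is hypothetical.

```python
import re
import uuid

# Assumed layout, inferred only from the sample name in this thread:
#   part-<index>-<uuid>.c<counter>.<codec>.<format>
PART_FILE = re.compile(
    r"^part-(?P<index>\d+)-"
    r"(?P<job_uuid>[0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12})"
    r"\.c(?P<counter>\d+)"
    r"\.(?P<suffix>.+)$"
)


def parse_part_file(name):
    """Return the components of a part-file name, or None if it does not match."""
    match = PART_FILE.match(name)
    if match is None:
        return None
    return {
        "index": int(match.group("index")),
        "job_uuid": uuid.UUID(match.group("job_uuid")),
        "counter": int(match.group("counter")),
        "suffix": match.group("suffix"),
    }


info = parse_part_file(
    "part-0-124a8c43-83b9-44e1-a9c4-dcc8676cdb99.c000.snappy.parquet"
)
print(info)
```

If the UUID is indeed generated per write job, as guessed elsewhere in this thread, grouping files by job_uuid would show which files came from the same write, and it would explain why appended files do not collide with existing ones.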

Re: Append more files to existing partitioned data

2018-03-17 Thread Denis Bolshakov
Hello Serega,
https://spark.apache.org/docs/latest/sql-programming-guide.html
Please try the SaveMode.Append option. Does it work for you?
Sat, 17 Mar 2018, 15:19 Serega Sheypak:
> Hi, I'm using spark-sql to process my data and store the result as Parquet
> partitioned by several columns
> ds.wr