Yes, file sinks do not support complete output mode yet. We are working on
that, and it should be available in Spark 2.1.
In the meantime, you can use aggregation with the memory sink (i.e.
format("memory")) to store the results in an in-memory table, which can then
be periodically written out to a parquet table explicitly.
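
A minimal sketch of that workaround (assuming a streaming aggregation named
streamingCountsDF and the spark-shell's SparkSession spark; the table name
"counts" and the output path are placeholders):

// Run the aggregation with the memory sink; queryName registers the
// in-memory table the results land in.
val query = streamingCountsDF.writeStream
  .format("memory")
  .queryName("counts")
  .outputMode("complete")
  .start()

// Periodically (e.g. from a scheduled job), snapshot the in-memory table
// out to parquet.
spark.table("counts").write.mode("overwrite").parquet("counts.parquet")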
Hi Arun,
Regarding parquet and complete output mode:
A relevant piece of the code to think about:
if (outputMode != OutputMode.Append) {
  throw new IllegalArgumentException(
    s"Data source $className does not support $outputMode output mode")
}
https://github
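
In other words, only append mode gets past that check for a file sink. A
minimal sketch of a call that passes it (assuming a non-aggregated streaming
DataFrame streamingDF; the paths are placeholders):

// Append mode is the only output mode the file sink accepts here.
val query = streamingDF.writeStream
  .format("parquet")
  .option("path", "output/parq")
  .option("checkpointLocation", "output/chkpnt")
  .outputMode("append")
  .start()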
Thanks for the response. However, I am not able to use any output mode. In
the case of the parquet sink, should there not be any aggregations?
scala> val query =
streamingCountsDF.writeStream.format("parquet").option("path","parq").option("checkpointLocation","chkpnt").outputMode("complete").start()
Correction, the two options are:
- writeStream.format("parquet").option("path", "...").start()
- writeStream.parquet("...").start()
There is no start with a path parameter.
On Jul 30, 2016 11:22 AM, "Jacek Laskowski" wrote:
Hi Arun,
> As per documentation, parquet is the only available file sink.
The following sinks are currently available in Spark:
* ConsoleSink for console format.
* FileStreamSink for parquet format.
* ForeachSink used in foreach operator.
* MemorySink for memory format.
You can create your own sink, too, by implementing the StreamSinkProvider
trait.
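
A minimal sketch of such a custom sink (the class name and its
print-each-batch behavior are hypothetical; StreamSinkProvider lives in
org.apache.spark.sql.sources and Sink in the internal
org.apache.spark.sql.execution.streaming package):

import org.apache.spark.sql.{DataFrame, SQLContext}
import org.apache.spark.sql.execution.streaming.Sink
import org.apache.spark.sql.sources.StreamSinkProvider
import org.apache.spark.sql.streaming.OutputMode

// Toy sink that prints every batch; register it by passing the
// fully-qualified class name to writeStream.format(...).
class PrintlnSinkProvider extends StreamSinkProvider {
  override def createSink(
      sqlContext: SQLContext,
      parameters: Map[String, String],
      partitionColumns: Seq[String],
      outputMode: OutputMode): Sink = new Sink {
    override def addBatch(batchId: Long, data: DataFrame): Unit = {
      // collect() is fine for a toy sink; a real one would write the
      // batch out somewhere durable instead.
      data.collect().foreach(row => println(s"[$batchId] $row"))
    }
  }
}

It would then be used like any built-in format, e.g.
streamingDF.writeStream.format("com.example.PrintlnSinkProvider").start().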