Also check SPARK-19478 <https://issues.apache.org/jira/browse/SPARK-19478> - JDBC sink (seems to be waiting for a review)
Ofir Manor
Co-Founder & CTO | Equalum
Mobile: +972-54-7801286 | Email: ofir.ma...@equalum.io

On Mon, Apr 10, 2017 at 10:10 AM, Hemanth Gudela <hemanth.gud...@qvantel.com> wrote:

> Many thanks Silvio for the link. That's exactly what I'm looking for. ☺
>
> However, there is no mention of checkpoint support for a custom
> "ForeachWriter" in structured streaming. I'm going to test that now.
>
> Good question Gary, this is what the link
> <https://databricks.com/blog/2017/04/04/real-time-end-to-end-integration-with-apache-kafka-in-apache-sparks-structured-streaming.html>
> says:
>
> "Often times we want to be able to write output of streams to external
> databases such as MySQL. At the time of writing, the Structured Streaming
> API does not support external databases as sinks; however, when it does,
> the API option will be as simple as .format("jdbc").start("jdbc:mysql/..").
> In the meantime, we can use the foreach sink to accomplish this. Let's
> create a custom JDBC Sink that extends *ForeachWriter* and implements its
> methods."
>
> I'm not sure though whether the JDBC sink feature will be available in the
> upcoming Spark version (2.2.0?) or not.
> It would be good to know if someone has information about it.
>
> Thanks,
> Hemanth
>
> *From: *"lucas.g...@gmail.com" <lucas.g...@gmail.com>
> *Date: *Monday, 10 April 2017 at 8.24
> *To: *"user@spark.apache.org" <user@spark.apache.org>
> *Subject: *Re: Does spark 2.1.0 structured streaming support jdbc sink?
>
> Interesting, does anyone know if we'll be seeing the JDBC sinks in
> upcoming releases?
>
> Thanks!
>
> Gary Lucas
>
> On 9 April 2017 at 13:52, Silvio Fiorito <silvio.fior...@granturing.com> wrote:
>
> The JDBC sink is not in 2.1. You can see here for an example implementation
> using the ForeachWriter sink instead:
> https://databricks.com/blog/2017/04/04/real-time-end-to-end-integration-with-apache-kafka-in-apache-sparks-structured-streaming.html
>
> *From: *Hemanth Gudela <hemanth.gud...@qvantel.com>
> *Date: *Sunday, April 9, 2017 at 4:30 PM
> *To: *"user@spark.apache.org" <user@spark.apache.org>
> *Subject: *Does spark 2.1.0 structured streaming support jdbc sink?
>
> Hello Everyone,
>
> I am new to Spark, especially Spark streaming.
>
> I am trying to read an input stream from Kafka, perform windowed
> aggregations in Spark using structured streaming, and finally write the
> aggregates to a sink.
>
> - MySQL as an output sink doesn't seem to be an option, because this
>   line of code throws an error:
>
>   streamingDF.writeStream.format("jdbc").start("jdbc:mysql…")
>
>   java.lang.UnsupportedOperationException: Data source jdbc does not
>   support streamed writing
>
>   This is strange, because this
>   <http://rxin.github.io/talks/2016-02-18_spark_summit_streaming.pdf>
>   document shows that JDBC is supported as an output sink!
>
> - Parquet doesn't seem to be an option, because it doesn't support
>   "complete" output mode, only "append". As I'm performing windowed
>   aggregations in Spark streaming, the output mode has to be "complete"
>   and cannot be "append".
>
> - Memory and console sinks are good for debugging, but are not suitable
>   for production jobs.
>
> So, please correct me if I'm missing something in my code to enable the
> JDBC output sink.
> If the JDBC output sink is not an option, please suggest an alternative
> output sink that suits my needs better.
> Or, since structured streaming is still 'alpha', should I resort to Spark
> DStreams to achieve my use case described above?
> Please suggest.
>
> Thanks in advance,
> Hemanth
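
To make the ForeachWriter approach discussed above concrete, here is a minimal sketch along the lines of the Databricks post. The class name JDBCSink, the target table "aggregates", and the (String, Long) row layout are illustrative assumptions, not something prescribed by Spark or by the thread:

import java.sql.{Connection, DriverManager, Statement}
import org.apache.spark.sql.{ForeachWriter, Row}

// Minimal sketch of a JDBC sink as a ForeachWriter (Spark 2.1).
// One connection per partition per trigger: opened in open(), closed in close().
class JDBCSink(url: String, user: String, pwd: String) extends ForeachWriter[Row] {
  var connection: Connection = _
  var statement: Statement = _

  // Called once per partition per trigger. The (partitionId, version) pair
  // identifies the batch, so it can be used to skip data already written
  // after a recovery; this sketch just accepts everything.
  override def open(partitionId: Long, version: Long): Boolean = {
    connection = DriverManager.getConnection(url, user, pwd)
    statement = connection.createStatement()
    true
  }

  // Called once per row. Assumes rows of shape (key: String, count: Long).
  // String interpolation is used only for brevity; a real sink should use
  // a PreparedStatement.
  override def process(row: Row): Unit = {
    statement.executeUpdate(
      s"INSERT INTO aggregates VALUES ('${row.getString(0)}', ${row.getLong(1)})")
  }

  override def close(errorOrNull: Throwable): Unit = {
    if (statement != null) statement.close()
    if (connection != null) connection.close()
  }
}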
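
Wiring it up would look roughly like the following; note that the foreach sink, unlike the file sinks, accepts "complete" output mode, which is what the windowed aggregation above needs. The connection URL, credentials, and checkpoint path are placeholders:

val writer = new JDBCSink("jdbc:mysql://localhost:3306/test", "user", "pwd")

val query = streamingDF.writeStream
  .foreach(writer)
  .outputMode("complete")                              // required for windowed aggregations
  .option("checkpointLocation", "/tmp/checkpoints/jdbc-sink")
  .start()

Checkpointing itself works with the foreach sink, but exactly-once delivery is on the writer: after a failure a batch may be replayed, which is why open()'s (partitionId, version) arguments exist.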