Using plain JDBC against Redshift will be slow for any reasonable volume, but if you really need to do that, you can open a connection from a RichFunction's open() method.
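A minimal sketch of that pattern, assuming a Flink RichSinkFunction in Scala (the Event case class, target table and connection details are placeholders, not anything from your pipeline):

import java.sql.{Connection, DriverManager, PreparedStatement}

import org.apache.flink.configuration.Configuration
import org.apache.flink.streaming.api.functions.sink.RichSinkFunction

// Placeholder record type for illustration.
case class Event(id: String, payload: String)

class RedshiftJdbcSink(jdbcUrl: String, user: String, password: String)
    extends RichSinkFunction[Event] {

  @transient private var connection: Connection = _
  @transient private var statement: PreparedStatement = _

  // open() runs once per parallel task, so the connection is reused across records.
  override def open(parameters: Configuration): Unit = {
    connection = DriverManager.getConnection(jdbcUrl, user, password)
    statement = connection.prepareStatement(
      "INSERT INTO events (id, payload) VALUES (?, ?)")
  }

  override def invoke(value: Event): Unit = {
    statement.setString(1, value.id)
    statement.setString(2, value.payload)
    statement.executeUpdate() // one round trip per record -- this is why plain JDBC is slow at volume
  }

  override def close(): Unit = {
    if (statement != null) statement.close()
    if (connection != null) connection.close()
  }
}

You would attach it with stream.addSink(new RedshiftJdbcSink(...)), and the Redshift JDBC driver has to be on the job's classpath.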
I wrote a blog article a while back on how the Spark-Redshift package works: https://databricks.com/blog/2015/10/19/introducing-redshift-data-source-for-spark.html. This image captures the internal read path in Spark-Redshift: https://databricks.com/wp-content/uploads/2015/10/image01.gif, and this one the write path: https://databricks.com/wp-content/uploads/2015/10/image00.gif.

In your case you can read from the Kafka sources, partition the data appropriately (based on Redshift), write the partitions to an S3 bucket, and then invoke the COPY command in Redshift to load the data from that bucket. This is essentially the same process the blog article describes, just carried out explicitly yourself (see the sketch of the COPY step below the quoted thread).

Sameer

On Sat, Sep 24, 2016 at 12:32 PM, ram kumar <ramkumar09...@gmail.com> wrote:
> Many thanks, Felix.
>
> *Flink use case:*
>
> Extract data from the source (*Kafka*) and load it into the targets (*AWS S3 and Redshift*).
>
> We use SCD2 in Redshift, since data changes need to be captured in the Redshift target.
>
> To connect to Redshift (for the staging and production databases) I need to set up a JDBC connection in Flink Scala.
>
> *Kafka (Source) ------------> Flink (JDBC) -----------> AWS (S3 and Redshift) Target.*
>
> Could you please suggest the best approach for this use case?
>
> Regards
> Ram.
>
> On 24 September 2016 at 16:14, Felix Dreissig <f...@f30.me> wrote:
>
>> Hi Ram,
>>
>> On 24 Sep 2016, at 16:08, ram kumar <ramkumar09...@gmail.com> wrote:
>> > I am wondering whether it is possible to add a JDBC connection or URL as a source or target in Flink using Scala.
>> > Could someone kindly help me with this? If you have any sample code, please share it here.
>>
>> What's your intended use case? Getting changes from a database or REST API into a data stream for processing in Flink?
>>
>> If so, you could use a change data capture tool to write your changes to Kafka and then let Flink receive them from there. There are e.g. Bottled Water [1] for Postgres and Maxwell [2] and Debezium [3] for MySQL.
>> For REST, I suppose you'd have to periodically query the API and determine the changes yourself. I don't know if there are any tools to help you with that.
>>
>> Regards,
>> Felix
>>
>> [1] https://github.com/confluentinc/bottledwater-pg
>> [2] http://maxwells-daemon.io/
>> [3] http://debezium.io/
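To make the S3 + COPY route concrete, here is a rough sketch of the load step, issued over plain JDBC once Flink has written the partition files to S3 (for example via its S3 filesystem sink). The table name, bucket path, IAM role and credentials are all placeholders; the COPY options themselves are ordinary Redshift syntax:

import java.sql.DriverManager

// Placeholder connection details and COPY statement.
val jdbcUrl = "jdbc:redshift://example-cluster.abc123.us-east-1.redshift.amazonaws.com:5439/dev"
val copySql =
  """COPY events
    |FROM 's3://my-bucket/staging/events/'
    |IAM_ROLE 'arn:aws:iam::123456789012:role/RedshiftCopyRole'
    |FORMAT AS CSV GZIP""".stripMargin

val conn = DriverManager.getConnection(jdbcUrl, "user", "password")
try {
  val stmt = conn.createStatement()
  stmt.execute(copySql) // Redshift pulls all files under the S3 prefix in parallel across its slices
  stmt.close()
} finally {
  conn.close()
}

That single COPY replaces the per-row INSERTs of the plain JDBC approach, which is where the speedup comes from.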