Arkaprova,

For something like this, you most likely need to use Sqoop for the bulk transfer.
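As a rough sketch of what that looks like (the JDBC URL, credentials, table name and Hive table here are placeholders, not anything from your environment), a bulk import into Hive with Sqoop is roughly:

    sqoop import \
      --connect jdbc:mysql://dbhost:3306/sales \
      --username etl_user -P \
      --table orders \
      --hive-import \
      --hive-table staging.orders \
      -m 4

The --hive-import flag creates the Hive table and loads the data after Sqoop's parallel, map-only copy into HDFS; -m controls how many mappers split the source table.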
Take a look at the link below to get some ideas:
https://sqoop.apache.org/docs/1.4.2/SqoopUserGuide.html#_importing_data_into_hive

Kafka, in my opinion, is better suited for continuously streaming data. If you are looking to monitor and stream CDC-style changes from the RDBMS into Kafka and on to downstream storage such as HDFS, Data Lake Store or Blob storage, then take a look at the JDBC Source Connector. It monitors inserts and updates in the source tables and produces streams that you can eventually push to HDFS, Blob Store or Data Lake Store (a rough connector config sketch follows the quoted message below).
http://docs.confluent.io/3.2.0/connect/connect-jdbc/docs/index.html

I hope this helps.

On Thu, Apr 20, 2017 at 9:46 AM, <arkaprova.s...@cognizant.com> wrote:
> Hi,
>
> I would like to ingest data from an RDBMS into a cloud platform like Azure
> HDInsight Blob storage using Kafka. What would be the best practice from an
> architectural perspective? Please suggest.
>
> Thanks,
> Arkaprova
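As mentioned above, a minimal JDBC Source Connector configuration might look like the following. This is only a sketch: the connection URL, table, column names and topic prefix are placeholders for your own environment.

    name=rdbms-source
    connector.class=io.confluent.connect.jdbc.JdbcSourceConnector
    tasks.max=1
    connection.url=jdbc:mysql://dbhost:3306/sales?user=etl_user&password=secret
    table.whitelist=orders
    mode=timestamp+incrementing
    timestamp.column.name=updated_at
    incrementing.column.name=id
    topic.prefix=sales-

The timestamp+incrementing mode uses an update-timestamp column plus an auto-incrementing id column to pick up both new and modified rows. For the Kafka-to-HDFS leg, Confluent also ships an HDFS sink connector that can drain those topics into HDFS on a schedule you control.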