Geovani,

You can use HiveContext to do inserts into a Hive table from a Streaming app just as you would from a batch app. A DStream is really just a sequence of RDDs, so you can run the insert from within foreachRDD. You just have to be careful that you're not creating large numbers of small files, so you may want to either increase the duration of your Streaming batches or repartition right before you insert. You'll need to do some testing based on your ingest volume. You may also want to consider streaming into another data store instead. Something along the lines of the sketch below should work.
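Here's a rough sketch against the Spark 1.1-era API (SchemaRDD via HiveContext). The `Event` case class, the `events` Hive table, the socket source, the 60-second batch duration, and the repartition factor are all placeholders you'd replace with your own schema, source, and settings:

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.sql.hive.HiveContext

// Placeholder case class; must match the schema of the target Hive table.
case class Event(id: Int, value: String)

object StreamingHiveInsert {
  def main(args: Array[String]): Unit = {
    val sparkConf = new SparkConf().setAppName("StreamingHiveInsert")
    // A longer batch duration means fewer, larger files per insert.
    val ssc = new StreamingContext(sparkConf, Seconds(60))

    // Create the HiveContext once on the driver and reuse it for every batch.
    val hiveContext = new HiveContext(ssc.sparkContext)
    import hiveContext.createSchemaRDD // implicit RDD[Event] -> SchemaRDD conversion

    // Placeholder source; substitute your real input DStream (Kafka, Flume, etc.).
    val lines = ssc.socketTextStream("localhost", 9999)

    lines.foreachRDD { rdd =>
      if (rdd.take(1).nonEmpty) {
        // Repartition down before inserting to limit the number of output files.
        val events = rdd.map(_.split(","))
                        .map(a => Event(a(0).toInt, a(1)))
                        .repartition(4)
        events.registerTempTable("events_batch")
        hiveContext.sql("INSERT INTO TABLE events SELECT * FROM events_batch")
      }
    }

    ssc.start()
    ssc.awaitTermination()
  }
}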
Thanks,
Silvio

From: Luiz Geovani Vier <lgv...@gmail.com>
Date: Thursday, November 6, 2014 at 7:46 PM
To: "user@spark.apache.org" <user@spark.apache.org>
Subject: Store DStreams into Hive using Hive Streaming

Hello,

Is there a built-in way or connector to store DStream results into an existing Hive ORC table using the Hive/HCatalog Streaming API? Otherwise, do you have any suggestions regarding the implementation of such a component?

Thank you,
-Geovani