Hi Aaron,

Thanks for the information. Do I need to update my Storm version? I am currently using version 0.10.0. Could you please guide me on which parameters need to be set to use tick tuples?
Regards,
Harshit Raikar

On 9 October 2015 at 14:49, Aaron.Dossett <aaron.doss...@target.com> wrote:

> STORM-938 adds a periodic flush to the HiveBolt using tick tuples that
> would address this situation.
>
> From: Harshit Raikar <harshit.rai...@gmail.com>
> Reply-To: "user@hive.apache.org" <user@hive.apache.org>
> Date: Friday, October 9, 2015 at 4:05 AM
> To: "user@hive.apache.org" <user@hive.apache.org>
> Subject: Storm HiveBolt missing records due to batching of Hive transactions
>
> To store the processed records I am using HiveBolt in my Storm topology,
> configured with the following arguments:
>
>   - id: "MyHiveOptions"
>     className: "org.apache.storm.hive.common.HiveOptions"
>     constructorArgs:
>       - "${metastore.uri}"    # metaStoreURI
>       - "${hive.database}"    # databaseName
>       - "${hive.table}"       # tableName
>     configMethods:
>       - name: "withTxnsPerBatch"
>         args:
>           - 2
>       - name: "withBatchSize"
>         args:
>           - 100
>       - name: "withIdleTimeout"
>         args:
>           - 2        # default value 0
>       - name: "withMaxOpenConnections"
>         args:
>           - 200      # default value 500
>       - name: "withCallTimeout"
>         args:
>           - 30000    # default value 10000
>       - name: "withHeartBeatInterval"
>         args:
>           - 240      # default value 240
>
> Transactions are missing in Hive because a batch is never completed, so its
> records are not flushed. (For example: 1330 records are processed but only
> 1200 records are in Hive; 130 records are missing.)
>
> How can I overcome this situation? How can I fill the batch so that the
> transaction is triggered and the records are stored in Hive?
>
> Topology: Kafka-Spout --> DataProcessingBolt
>           DataProcessingBolt --> HiveBolt (sink)
>           DataProcessingBolt --> JdbcBolt (sink)
>
> --
> Thanks and Regards,
> Harshit Raikar
> Phone No. +4917655471932

--
Thanks and Regards,
Harshit Raikar
Phone No. +4917655471932
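As background on the tick-tuple mechanism Aaron refers to: a Storm bolt can ask the framework to deliver it a system "tick" tuple at a fixed interval and use that as a signal to flush a partially filled batch. The sketch below is a minimal illustration of that pattern against the Storm 0.10.x (backtype.storm) API; it is not the actual STORM-938 HiveBolt code, and the class name MyBufferingBolt and the 30-second interval are assumptions made for the example.

import java.util.HashMap;
import java.util.Map;

import backtype.storm.Config;
import backtype.storm.Constants;
import backtype.storm.task.OutputCollector;
import backtype.storm.task.TopologyContext;
import backtype.storm.topology.OutputFieldsDeclarer;
import backtype.storm.topology.base.BaseRichBolt;
import backtype.storm.tuple.Tuple;

// Illustrative bolt (hypothetical, not part of storm-hive): buffers tuples
// and flushes the buffer whenever a tick tuple arrives.
public class MyBufferingBolt extends BaseRichBolt {
    private OutputCollector collector;

    @Override
    public void prepare(Map stormConf, TopologyContext context, OutputCollector collector) {
        this.collector = collector;
    }

    @Override
    public Map<String, Object> getComponentConfiguration() {
        // Ask Storm to send this bolt a tick tuple every 30 seconds
        // (the interval here is an arbitrary example value).
        Map<String, Object> conf = new HashMap<String, Object>();
        conf.put(Config.TOPOLOGY_TICK_TUPLE_FREQ_SECS, 30);
        return conf;
    }

    // Tick tuples arrive from the system component on the system tick stream.
    private boolean isTickTuple(Tuple tuple) {
        return Constants.SYSTEM_COMPONENT_ID.equals(tuple.getSourceComponent())
                && Constants.SYSTEM_TICK_STREAM_ID.equals(tuple.getSourceStreamId());
    }

    @Override
    public void execute(Tuple tuple) {
        if (isTickTuple(tuple)) {
            // Time-based trigger: flush any partially filled batch here,
            // even if it has not reached the configured batch size.
        } else {
            // Normal data tuple: add it to the current batch.
        }
        collector.ack(tuple);
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        // This sketch emits nothing downstream.
    }
}

Whether the HiveBolt itself reacts to tick tuples depends on running a storm-hive build that includes the STORM-938 change; the sketch above only illustrates the general mechanism that change relies on.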