Re: Hive Pulsar Integration

Jörn Franke Sat, 13 Apr 2019 06:43:40 -0700

I think you need to develop a custom Hive SerDe, a custom Hadoop InputFormat, 
and a custom Hive OutputFormat.
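
As a rough illustration of the partition questions quoted below, here is a minimal sketch of one way a custom input/output format might map a Hive-style partition spec (the "key=value/key=value" form Hive uses in partition paths) to a Pulsar topic name. Everything in it is an assumption made for illustration: the class PartitionTopicMapper, its methods, and the "<prefix>-key_value" naming scheme are hypothetical and are not part of Hive or Pulsar. It deliberately uses no Hive or Pulsar classes, so it only shows the mapping logic, not the actual integration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

/** Hypothetical helper (not a Hive or Pulsar API): partition spec <-> topic name. */
final class PartitionTopicMapper {
    // Assumed base topic for the table, e.g. "persistent://public/default/orders".
    private final String topicPrefix;

    PartitionTopicMapper(String topicPrefix) {
        this.topicPrefix = topicPrefix;
    }

    /** Parses a spec such as "dt=2019-04-12/country=CN" into an ordered map. */
    static Map<String, String> parseSpec(String spec) {
        Map<String, String> partition = new LinkedHashMap<>();
        if (spec.isEmpty()) {
            return partition; // unpartitioned table
        }
        for (String component : spec.split("/")) {
            int eq = component.indexOf('=');
            if (eq <= 0) {
                throw new IllegalArgumentException("Bad partition component: " + component);
            }
            partition.put(component.substring(0, eq), component.substring(eq + 1));
        }
        return partition;
    }

    /** Write path: picks the topic a row with these partition values would go to. */
    String topicFor(Map<String, String> partition) {
        StringJoiner name = new StringJoiner("-");
        name.add(topicPrefix);
        for (Map.Entry<String, String> e : partition.entrySet()) {
            name.add(e.getKey() + "_" + e.getValue());
        }
        return name.toString();
    }

    /** Read path: true if the partition satisfies every equality filter given. */
    static boolean matches(Map<String, String> partition, Map<String, String> filter) {
        for (Map.Entry<String, String> f : filter.entrySet()) {
            if (!f.getValue().equals(partition.get(f.getKey()))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        PartitionTopicMapper mapper =
                new PartitionTopicMapper("persistent://public/default/orders");
        Map<String, String> partition = parseSpec("dt=2019-04-12/country=CN");
        System.out.println(mapper.topicFor(partition));
    }
}
```

With a mapping like this, an insert would resolve the row's partition values to a topic before writing, and a select with an equality filter on a partition column would only need to read the topics whose partition passes matches(...). How the partition values and filters are actually obtained from Hive is outside this sketch.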


> On 12.04.2019 at 17:35, 李鹏辉gmail <codelipeng...@gmail.com> wrote:
> 
> Hi guys,
> 
> I've recently been working on integrating Hive with Pulsar, but I have 
> run into some problems and hope to get help here.
> 
> First, let me briefly describe the motivation.
> 
> Pulsar can serve as an infinite stream that keeps both historical and 
> streaming data, so we want to use Pulsar as a storage extension for Hive.
> This way, Hive can read data from Pulsar naturally and can also write 
> data to Pulsar.
> We would benefit from having the same data serve both interactive queries 
> and streaming workloads.
> 
> As an improvement, supporting data partitioning can make queries more 
> efficient (e.g. partitioning by date or any other field).
> 
> But:
> 
> - How do I get the Hive table's partition definition?
> - When a user inserts data into a Hive table, how do I determine which 
> partition the data should be stored in?
> - When a user selects data from a Hive table, how do I determine which 
> partition the data is in?
> 
> If Hive already exposes a mechanism to support this, please show me how 
> to use it.
> 
> Best regards
> 
> Penghui
> Beijing, China
> 
> 
>
