On 4/4/19 10:22 AM, PengHui Li wrote:
Hi guys,
I am integrating Hive and Pulsar (http://pulsar.apache.org) using HiveStorageHandler and HiveMetaHook. I want to add a feature that divides the data
into several parts (Pulsar topics) when Hive's `PARTITIONED BY` is used, but I don't know how to implement it based on HiveStorageHandler and HiveMetaHook.
I think you should be able to access the table's properties from the
StorageHandler (and read the Pulsar server address etc. from there).
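A minimal sketch of that idea, written without Hive on the classpath so it runs on its own. In a real handler the `Properties` object would come from `TableDesc.getProperties()` inside `configureInputJobProperties(TableDesc, Map<String, String>)`; here a plain `java.util.Properties` stands in for it. The `pulsar.` prefix and the `pulsar.service.url` key are assumptions made up for illustration and are not defined by Hive or Pulsar.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

class PulsarTablePropsSketch {
    // Hypothetical prefix for handler-specific table properties (an assumption).
    static final String PREFIX = "pulsar.";

    /** Copies every table property whose name starts with PREFIX into jobProps. */
    static void copyPulsarProps(Properties tableProps, Map<String, String> jobProps) {
        for (String name : tableProps.stringPropertyNames()) {
            if (name.startsWith(PREFIX)) {
                jobProps.put(name, tableProps.getProperty(name));
            }
        }
    }

    public static void main(String[] args) {
        // Stand-in for the properties Hive would pass to the storage handler.
        Properties tableProps = new Properties();
        tableProps.setProperty("pulsar.service.url", "pulsar://localhost:6650");
        tableProps.setProperty("columns", "topic,payload");

        Map<String, String> jobProps = new HashMap<>();
        copyPulsarProps(tableProps, jobProps);
        System.out.println(jobProps);
    }
}
```

The table would then carry the address in its TBLPROPERTIES, and the input format could read it back from the job configuration.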
About supporting topics: instead of adding features to support
"partitioned by", I think the storage handler could use predicate pushdown
by exposing the topic as a column, so that a filter on that column decides
which topics are read.
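To make the "topic as a column" idea concrete, here is a rough, self-contained sketch of the decomposition step. It does not use Hive's classes: in Hive this logic would live behind `HiveStoragePredicateHandler.decomposePredicate`, which splits a filter into a pushed part and a residual part. The `InCondition` and `Decomposed` types, and the column name `topic`, are hypothetical stand-ins for illustration only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class TopicPushdownSketch {
    // Assumed name of the virtual column that carries the Pulsar topic.
    static final String TOPIC_COLUMN = "topic";

    /** One AND-ed conjunct: column IN (values). Equality is a one-element set. */
    static final class InCondition {
        final String column;
        final Set<String> values;

        InCondition(String column, String... values) {
            this.column = column;
            this.values = new LinkedHashSet<>(Arrays.asList(values));
        }
    }

    /** Result of splitting a filter into a pushed part and a residual part. */
    static final class Decomposed {
        final Set<String> topicsToRead;    // null means no topic filter was given
        final List<InCondition> residual;  // conditions Hive must still evaluate

        Decomposed(Set<String> topicsToRead, List<InCondition> residual) {
            this.topicsToRead = topicsToRead;
            this.residual = residual;
        }
    }

    /** Pushes conditions on the topic column down; leaves the rest as residual. */
    static Decomposed decompose(List<InCondition> conjuncts) {
        Set<String> topics = null;
        List<InCondition> residual = new ArrayList<>();
        for (InCondition c : conjuncts) {
            if (TOPIC_COLUMN.equalsIgnoreCase(c.column)) {
                if (topics == null) {
                    topics = new LinkedHashSet<>(c.values);
                } else {
                    topics.retainAll(c.values); // two topic filters AND-ed together
                }
            } else {
                residual.add(c);
            }
        }
        return new Decomposed(topics, residual);
    }
}
```

With this shape, a query such as `WHERE topic IN ('a', 'b') AND user = 'x'` would let the handler subscribe only to topics `a` and `b`, while the `user` condition stays with Hive.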
To get some ideas on how to do that, I would first take a look at the JDBC
storage handler (or the HBase one).
Note: I think this topic might be a better fit for the developer list.
cheers,
Zoltan