Hello Team,

This is Anshu here, nice to meet you!
I work at Crealytics in Berlin as a Big Data Consultant / Solution
Architect.

We are designing a data-flow pipeline built mainly on *Apache NiFi:
consume data from Kafka --> transform, convert, and merge --> write to
HDFS*.
We have several *Hive external tables* defined to point to the HDFS
locations where we dump the data.
All of these tables are partitioned (mostly by date); through NiFi we are
able to create and write to a new partition.

However, we currently need to run a metadata refresh (MSCK REPAIR ..)
before Hive picks up a new partition.
Is there any way to automate this process?
We could run a small script periodically to do this, but we wouldn't know
the right run frequency for it (in case of a partition key other than date).
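To make the question concrete, the small script we have in mind would be a minimal sketch along these lines, run from cron; the JDBC URL and table names below are placeholders, not our actual cluster settings:

```shell
#!/usr/bin/env bash
# Hypothetical sketch of a cron-driven partition refresh.
# JDBC_URL and the table list are placeholders only.
set -euo pipefail

JDBC_URL="jdbc:hive2://hiveserver:10000/default"   # placeholder

# Build the repair statement for one table.
# MSCK REPAIR TABLE scans the table's HDFS location and registers any
# partition directories the metastore does not yet know about.
refresh_stmt() {
  printf 'MSCK REPAIR TABLE %s;' "$1"
}

for t in events clicks; do                         # placeholder table names
  echo "$(refresh_stmt "$t")"
  # In the real job this would go to Beeline instead of echo:
  #   beeline -u "$JDBC_URL" -e "$(refresh_stmt "$t")"
done
```

The open question remains the schedule: for date partitions a daily cron entry is obvious, but for other partition keys we don't know how often new directories will appear.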


Thanking you in advance!

______________________

*Kind Regards,*
*Anshuman Ghosh*
*Contact - +49 179 9090964*
