Hi,

Is there a way to get the HDFS sink to signal that a file was just closed,
and then use that signal to add a partition to Hive if one does not already
exist?

Right now, what I do is:

- move files to S3
- run recover partitions <--- this step takes forever.

Given how much historical data I have, it's not feasible to run recover
partitions every single day.

I would much rather add a partition whenever I see a file in that
partition for the first time.
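
To make the idea concrete, here is a rough sketch of the bookkeeping I have in mind. It assumes the files land in Hive-style key=value directories (e.g. .../dt=2013-01-01/hour=05/), which may not match everyone's layout. PartitionTracker and the table name are made up for illustration; nothing here is an existing Flume or Hive class. It only builds the ADD IF NOT EXISTS PARTITION statement the first time a directory is seen; actually running it against Hive is left out.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper, not part of Flume or Hive.
class PartitionTracker {
    private final String table;
    // Directories for which a statement has already been produced.
    private final Set<String> seen = ConcurrentHashMap.newKeySet();

    PartitionTracker(String table) {
        this.table = table;
    }

    // Collects key=value segments from the directory part of the path,
    // in the order they appear. Assumes a Hive-style layout.
    static Map<String, String> parsePartition(String filePath) {
        Map<String, String> spec = new LinkedHashMap<>();
        int lastSlash = filePath.lastIndexOf('/');
        String dir = lastSlash < 0 ? "" : filePath.substring(0, lastSlash);
        for (String segment : dir.split("/")) {
            int eq = segment.indexOf('=');
            if (eq > 0 && eq < segment.length() - 1) {
                spec.put(segment.substring(0, eq), segment.substring(eq + 1));
            }
        }
        return spec;
    }

    // Returns the DDL the first time a partition directory is seen,
    // or null if it was seen before or the path has no partition segments.
    String ddlIfNew(String filePath) {
        Map<String, String> spec = parsePartition(filePath);
        if (spec.isEmpty()) {
            return null;
        }
        String dir = filePath.substring(0, filePath.lastIndexOf('/'));
        if (!seen.add(dir)) {
            return null;
        }
        StringBuilder sb = new StringBuilder("ALTER TABLE ")
                .append(table)
                .append(" ADD IF NOT EXISTS PARTITION (");
        boolean first = true;
        for (Map.Entry<String, String> e : spec.entrySet()) {
            if (!first) {
                sb.append(", ");
            }
            // Values are not escaped here; real code should handle quotes.
            sb.append(e.getKey()).append("='").append(e.getValue()).append("'");
            first = false;
        }
        sb.append(") LOCATION '").append(dir).append("'");
        return sb.toString();
    }
}
```

The in-memory set only saves repeated round trips within one process; IF NOT EXISTS is what keeps the statement safe to rerun after a restart.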

I looked around the code base, and it seems Flume-OG had something like
this, but I don't see the capability in Flume-NG.

I can see a way to add this: add another callback parameter to the
HDFSEventSink and create a custom wrapper around it.
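
Roughly, the shape of the callback I am picturing is below. This is only a sketch: FileCloseCallback and NotifyingWriter are names I made up, and NotifyingWriter is a stand-in for whatever the sink uses to hold an open file, not the actual Flume-NG class. The real change would have to hook into the sink's own close path.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical callback interface; not an existing Flume-NG API.
interface FileCloseCallback {
    void onFileClosed(String path);
}

// Stand-in for the per-file writer held by the sink. Only the close
// path is modeled; writing events is omitted.
class NotifyingWriter implements AutoCloseable {
    private final String path;
    private final FileCloseCallback callback;
    private boolean closed = false;

    NotifyingWriter(String path, FileCloseCallback callback) {
        this.path = path;
        this.callback = callback;
    }

    @Override
    public void close() {
        if (closed) {
            return; // notify at most once per file
        }
        closed = true;
        // Real code would flush and close the underlying stream here,
        // and only notify once the file is fully written.
        callback.onFileClosed(path);
    }
}

class CallbackDemo {
    public static void main(String[] args) {
        List<String> closedFiles = new ArrayList<>();
        try (NotifyingWriter w =
                 new NotifyingWriter("/flume/dt=2013-01-01/events.log", closedFiles::add)) {
            // events would be appended here
        }
        System.out.println(closedFiles);
    }
}
```

The callback receives only the path, so the Hive-specific logic (deciding whether the partition is new and adding it) can live entirely outside the sink.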

Any other suggestions?

Thanks,
Viral
