Any suggestions on this? I'm still trying to figure out how to get a notification when the HDFS sink creates a new partition, so that I can add it with an ALTER TABLE statement on a separate thread.
Is adding a callback the right way to handle this?

Thanks,
Viral

On Mon, Jul 28, 2014 at 2:40 PM, Viral Bajaria <viral.baja...@gmail.com> wrote:

> Hi,
>
> Is there a way to get the HDFS sink to signal that a file was just closed,
> and then use that signal to add a partition to Hive if one does not
> already exist?
>
> Right now, what I do is:
>
> - move files to S3
> - run recover partitions <--- this step takes forever
>
> Given how much historical data I have, it's not feasible to run
> recover partitions every single day.
>
> I would much rather add a partition whenever I see a file in that
> partition for the first time.
>
> I looked around the code base, and it seems Flume-OG had something like
> this, but I don't see the capability in Flume-NG.
>
> I can see a way of adding this by adding another Callback parameter to the
> HdfsEventSink and creating a custom wrapper around it.
>
> Any other suggestions?
>
> Thanks,
> Viral
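
For illustration, here is a minimal sketch of the "add a partition the first time a file shows up in it" idea from the quoted message. It is not Flume code: `PartitionTracker`, `on_file_closed`, and `partition_spec` are hypothetical names, and the sketch assumes that files land in Hive-style `key=value` directories and that something (such as the proposed callback) passes in the path of each closed file. It only builds the ALTER TABLE statement; running it against Hive is left to the caller.

```python
def partition_spec(path):
    """Extract (key, value) pairs from Hive-style key=value directories.

    Assumption: the last path component is the file name and is ignored.
    """
    directories = path.strip("/").split("/")[:-1]
    return tuple(tuple(d.split("=", 1)) for d in directories if "=" in d)


class PartitionTracker:
    """Remembers which partitions have already been handed to Hive."""

    def __init__(self, table):
        self.table = table
        self.seen = set()

    def on_file_closed(self, path):
        """Hypothetical callback: return an ALTER TABLE statement the first
        time a partition is seen, and None for every later file in it."""
        spec = partition_spec(path)
        if not spec or spec in self.seen:
            return None
        self.seen.add(spec)
        clause = ", ".join(f"{key}='{value}'" for key, value in spec)
        # IF NOT EXISTS keeps the statement safe to re-run after a restart,
        # when the in-memory set of seen partitions has been lost.
        return f"ALTER TABLE {self.table} ADD IF NOT EXISTS PARTITION ({clause})"


if __name__ == "__main__":
    tracker = PartitionTracker("events")
    for closed in (
        "/flume/events/dt=2014-07-28/hour=14/a.log",
        "/flume/events/dt=2014-07-28/hour=14/b.log",
    ):
        statement = tracker.on_file_closed(closed)
        if statement is not None:
            print(statement)
```

If the files are moved to S3 before the partition is registered, the statement would presumably also need a LOCATION clause pointing at the final path; that is left out of the sketch.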