Hey Cam,

You should use Oozie's Coordinator: https://github.com/yahoo/oozie/wiki/Oozie-Coord-Use-Cases.

Regards,
Jeff

On Tue, Feb 8, 2011 at 4:29 PM, Cam Bazz <camb...@gmail.com> wrote:
> Hello,
>
> What kind of strategy must I follow in order to periodically run
> certain things?
>
> For example, each hour, I want to look up log files from a certain dir,
> and for new files, I need to run:
>
> load data local inpath '/home/cam/logs/log.2011310120' into table
> item_view_raw partition (date_hour=2011310120);
>
> FROM item_view_raw ivr INSERT OVERWRITE TABLE item_view partition
> (date_hour=2011310120) SELECT ivr.view_time, ivr.ip_number,
> ivr.session_id, ivr.session_cookie, ivr.eser_sid, ivr.sale_status,
> ivr.maker_name, ivr.title WHERE ivr.log_tag = 'PROD' and
> ivr.date_hour='2011310120';
>
> Obviously, I need to deduce which files are new, iterate over them,
> and extract the time key, which will be used as the partition name; in
> this case it is 2011310120.
>
> It seems like I can write a Java program to deal with the
> synchronization of all these tasks, but I was wondering, what would you
> guys suggest?
>
> Any ideas/recommendations/help greatly appreciated.
>
> Best Regards,
> C.B.
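
For the "deduce which files are new and extract the time key" step in the quoted message, here is a rough sketch in Python rather than Java. It is only an illustration under some assumptions: log files are named log.<10 digits> as in the example path, and the caller keeps track of which time keys were already loaded (the `done` set). The names `new_partitions` and `hive_statements` are made up for this sketch and are not part of Hive or Oozie. It only builds the statement text; how the statements get run (for example from an hourly coordinator action) is left open.

```python
import re
from pathlib import Path

# Assumed file naming, taken from the example path: log.2011310120
LOG_NAME = re.compile(r"^log\.(\d{10})$")


def new_partitions(log_dir, done):
    """Return sorted (time_key, path) pairs for log files whose key is not in `done`."""
    found = []
    for path in Path(log_dir).iterdir():
        match = LOG_NAME.match(path.name)
        if path.is_file() and match and match.group(1) not in done:
            found.append((match.group(1), path))
    return sorted(found)


def hive_statements(time_key, path):
    """Build the two statements from the original message for one time key."""
    load = (
        f"load data local inpath '{path}' into table item_view_raw "
        f"partition (date_hour={time_key});"
    )
    insert = (
        "FROM item_view_raw ivr INSERT OVERWRITE TABLE item_view "
        f"partition (date_hour={time_key}) "
        "SELECT ivr.view_time, ivr.ip_number, ivr.session_id, "
        "ivr.session_cookie, ivr.eser_sid, ivr.sale_status, "
        "ivr.maker_name, ivr.title "
        f"WHERE ivr.log_tag = 'PROD' and ivr.date_hour='{time_key}';"
    )
    return [load, insert]


if __name__ == "__main__":
    done = set()  # hypothetical: load previously processed keys from wherever you store them
    for key, path in new_partitions("/home/cam/logs", done):
        for statement in hive_statements(key, path):
            print(statement)
```

Keeping the "what is new" check separate from the statement building means the same two functions can be driven by whatever scheduler ends up running the job each hour.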