Hello,

What strategy should I follow in order to run certain jobs
periodically?

For example, each hour I want to check a certain directory for log
files, and for each new file I need to run:

load data local inpath '/home/cam/logs/log.2011310120' into table
item_view_raw partition (date_hour=2011310120);

FROM item_view_raw ivr INSERT OVERWRITE TABLE item_view partition
(date_hour=2011310120) SELECT ivr.view_time, ivr.ip_number,
ivr.session_id, ivr.session_cookie, ivr.eser_sid, ivr.sale_status,
ivr.maker_name, ivr.title WHERE ivr.log_tag = 'PROD' and
ivr.date_hour='2011310120';

Obviously, I need to work out which files are new, iterate over them,
and extract the time key, which is used as the partition name; in
this case it is 2011310120.
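
To make those steps concrete, here is a rough Python sketch of the
loop I have in mind. It is only an illustration under some
assumptions: that the log files are always named log.<10 digits>, and
that the set of already-loaded keys is kept somewhere by the caller.
The function names (extract_time_key, find_new_logs,
build_statements) are made up for this example and are not part of
Hive.

```python
import os
import re

# Assumption: log files are named log.<10 digits>, e.g. log.2011310120.
LOG_NAME = re.compile(r"^log\.(\d{10})$")


def extract_time_key(filename):
    """Return the 10-digit time key from a log file name, or None."""
    match = LOG_NAME.match(filename)
    return match.group(1) if match else None


def find_new_logs(log_dir, processed_keys):
    """Return sorted (key, path) pairs for logs not yet processed.

    processed_keys is a set of time keys already loaded; how that set
    is persisted between runs is left to the caller (hypothetical).
    """
    new_logs = []
    for name in os.listdir(log_dir):
        key = extract_time_key(name)
        if key is not None and key not in processed_keys:
            new_logs.append((key, os.path.join(log_dir, name)))
    return sorted(new_logs)


def build_statements(key, path):
    """Build the two Hive statements from the example for one log file."""
    load = (
        f"load data local inpath '{path}' into table item_view_raw "
        f"partition (date_hour={key});"
    )
    insert = (
        "FROM item_view_raw ivr INSERT OVERWRITE TABLE item_view "
        f"partition (date_hour={key}) SELECT ivr.view_time, ivr.ip_number, "
        "ivr.session_id, ivr.session_cookie, ivr.eser_sid, ivr.sale_status, "
        "ivr.maker_name, ivr.title WHERE ivr.log_tag = 'PROD' and "
        f"ivr.date_hour='{key}';"
    )
    return [load, insert]
```

The generated statements could then be passed to the Hive command
line (for example with hive -e) from an hourly cron job, but I am not
sure whether that is the recommended approach.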

It seems I could write a Java program to handle the synchronization
of all these tasks, but I was wondering what you guys would suggest.

Any ideas, recommendations, or help would be greatly appreciated.

Best Regards,
C.B.
