On Wed, Feb 16, 2011 at 8:11 PM, Alejandro Abdelnur <t...@cloudera.com> wrote:
> An update on this.
> I've finished making changes to the Oozie Hive action to work with Hive 0.7.
> As mentioned before, the problem is that not all needed Hive & dependent JARs
> are available in public Maven repos.
> Early next week the Cloudera Maven repositories should have beta versions of
> these JARs (currently I'm building against SNAPSHOTs).
> As soon as the beta JARs are available I'll post a patch using those JAR
> versions.
> Thanks.
> Alejandro
>
> On Thu, Feb 10, 2011 at 4:51 PM, Alejandro Abdelnur <t...@cloudera.com>
> wrote:
>>
>> Hi Balaji,
>> The latest patch of the Hive action does not bundle hive-default.xml (got
>> the same feedback from Carl); you'll be responsible for bundling it in the WF
>> directory until the Hive JARs bundle it.
>> I'll upload the new patch early next week and then ask Oozie to integrate
>> it.
>> The problem I still have is that, AFAIK, not all Hadoop and Hive JARs are
>> available in the public Maven repositories currently used by the Oozie build.
>> I'll submit, as part of the PR, a separate commit that configures the Oozie
>> build to pull from Cloudera's Maven repositories, where all JARs are available.
>> Thanks.
>> Alejandro
>>
>> On Thu, Feb 10, 2011 at 4:34 PM, Balaji Rajagopalan
>> <balaj...@yahoo-inc.com> wrote:
>>>
>>> Alejandro,
>>>
>>> I have used your hive action patch from tucu’s forked branch in yahoo
>>> github and it works fine. When will your patch be available in the master
>>> branch of yahoo github? Also, I have a small suggestion if I may:
>>> hive-default.xml is bundled with the oozie-core.jar. Instead, can we have
>>> hive-default.xml in the same folder as workflow.xml in HDFS, so that when I
>>> change hive-default.xml I don’t have to bundle the jar again?
>>>
>>> Regards,
>>>
>>> Balaji
>>>
>>> From: Alejandro Abdelnur [mailto:t...@cloudera.com]
>>> Sent: Thursday, February 10, 2011 3:12 AM
>>> To: user@hive.apache.org
>>> Subject: Re: periodic execution
>>>
>>> Hi Cam,
>>>
>>> A bit of information that may be useful for you: Cloudera's Oozie has a
>>> Hive action that you can use from workflow jobs.
>>>
>>> Cheers
>>>
>>> Alejandro
>>>
>>> On Wed, Feb 9, 2011 at 11:44 AM, Cam Bazz <camb...@gmail.com> wrote:
>>>
>>> Hello,
>>>
>>> I am looking over oozie's coordinator. But meanwhile, I managed to
>>> write a simple java program to connect to hive using jdbc.
>>>
>>> I can import data and execute queries.
>>>
>>> I was wondering: for doing workflows, one needs to keep
>>> metadata, i.e. which was the last file, partition processed etc.
>>>
>>> I could do this usually using a database like db4o, and keeping a static
>>> file.
>>>
>>> Is the derby database that comes with hive for this purpose? How do
>>> people usually store state when using a hive application?
>>>
>>> best regards,
>>> -C.B.
>>>
>>> On Wed, Feb 9, 2011 at 5:23 AM, Jeff Hammerbacher <ham...@cloudera.com>
>>> wrote:
>>> > Hey Cam,
>>> > You should use Oozie's
>>> > Coordinator: https://github.com/yahoo/oozie/wiki/Oozie-Coord-Use-Cases.
>>> > Regards,
>>> > Jeff
>>> >
>>> > On Tue, Feb 8, 2011 at 4:29 PM, Cam Bazz <camb...@gmail.com> wrote:
>>> >>
>>> >> Hello,
>>> >>
>>> >> What kind of strategy must i follow in order to periodically run
>>> >> certain things?
>>> >>
>>> >> For example, each hour, i want to look up log files from a certain dir,
>>> >> and for new files, i need to run:
>>> >>
>>> >> load data local inpath '/home/cam/logs/log.2011310120' into table
>>> >> item_view_raw partition (date_hour=2011310120);
>>> >>
>>> >> FROM item_view_raw ivr INSERT OVERWRITE TABLE item_view partition
>>> >> (date_hour=2011310120) SELECT ivr.view_time, ivr.ip_number,
>>> >> ivr.session_id, ivr.session_cookie, ivr.eser_sid, ivr.sale_status,
>>> >> ivr.maker_name, ivr.title WHERE ivr.log_tag = 'PROD' and
>>> >> ivr.date_hour='2011310120';
>>> >>
>>> >> obviously, i need to deduce which files are new, iterate over them,
>>> >> and extract the time key, which will be used as a partition name; in
>>> >> this case it is: 2011310120
>>> >>
>>> >> It seems like i can write a java program to deal with the
>>> >> synchronization of all these tasks, but i was wondering, what would you
>>> >> guys suggest?
>>> >>
>>> >> Any ideas/recommendations/help greatly appreciated
>>> >>
>>> >> Best Regards,
>>> >> C.B.
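
For reference, the hourly steps in the original question (find the new files, extract the time key, build the two statements) could be sketched roughly as below. This is not the Oozie coordinator approach suggested in the thread, just a plain script under assumptions: log files are named log.<date_hour> as in the example, processed file names are kept in a JSON state file (a hypothetical choice), and the table and column names are copied from the quoted statements. It only builds the HiveQL text; how it gets submitted to Hive is left open.

```python
import json
import re
from pathlib import Path

# Assumption: files are named "log.<date_hour>", as in the quoted example.
LOG_NAME = re.compile(r"^log\.(\d+)$")


def load_processed(state_file):
    """Return the set of file names already loaded (empty if no state file yet)."""
    path = Path(state_file)
    if not path.exists():
        return set()
    return set(json.loads(path.read_text()))


def save_processed(state_file, processed):
    """Persist the processed file names as a sorted JSON list."""
    Path(state_file).write_text(json.dumps(sorted(processed)))


def new_log_files(log_dir, processed):
    """Yield (path, date_hour) for matching files not yet processed, in name order."""
    for path in sorted(Path(log_dir).iterdir()):
        match = LOG_NAME.match(path.name)
        if match and path.name not in processed:
            yield path, match.group(1)


def statements_for(path, date_hour):
    """Build the two HiveQL statements from the thread for one log file."""
    load = (
        f"LOAD DATA LOCAL INPATH '{Path(path).as_posix()}' "
        f"INTO TABLE item_view_raw PARTITION (date_hour={date_hour});"
    )
    insert = (
        "FROM item_view_raw ivr "
        f"INSERT OVERWRITE TABLE item_view PARTITION (date_hour={date_hour}) "
        "SELECT ivr.view_time, ivr.ip_number, ivr.session_id, "
        "ivr.session_cookie, ivr.eser_sid, ivr.sale_status, "
        "ivr.maker_name, ivr.title "
        f"WHERE ivr.log_tag = 'PROD' AND ivr.date_hour='{date_hour}';"
    )
    return [load, insert]


def plan(log_dir, state_file):
    """Return all statements for unprocessed files and the updated processed set.

    The caller should save the returned set only after the statements
    have actually run successfully.
    """
    processed = load_processed(state_file)
    statements = []
    for path, date_hour in new_log_files(log_dir, processed):
        statements.extend(statements_for(path, date_hour))
        processed.add(path.name)
    return statements, processed
```

An hourly cron job could call plan('/home/cam/logs', <state file>), run the returned statements, and then call save_processed. Saving state only after a successful run means a failed hour is retried on the next pass.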
Did support for hive variables (0.7.0) make it into this version of the oozie-action?