If you want to operate over all partitions in a table, you don't need to specify
the partitions at all. Run your query and enjoy!
If you want to specify the partition mapping of the output dataset from a
query, I think you can derive that value on a per-row basis like so:
Partition=substr(dat
Hello,
How can I run some process for each partition of another table?
For example, let's say table A has partitions 1, 2, 3.
I want to be able to say:
for each partition in A do {
select * from A where partition is ? into some othertable where partition is ?
}
Best Regards,
C.B.
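One way to approximate the loop described above is to generate one HiveQL statement per partition from a small driver script. This is only a sketch: the partition column name `part`, the column names `col1` and `col2`, and the helper name are hypothetical, and it assumes string-valued partitions. The columns are listed explicitly because `select *` on a partitioned table also returns the partition column.

```python
def build_partition_copy_statements(source, target, partition_col, partitions, columns):
    """Return one HiveQL INSERT statement per partition value.

    All table and column names are supplied by the caller; nothing here
    talks to Hive, it only builds the statement text.
    """
    col_list = ", ".join(columns)
    statements = []
    for value in partitions:
        statements.append(
            f"INSERT OVERWRITE TABLE {target} PARTITION ({partition_col}='{value}') "
            f"SELECT {col_list} FROM {source} WHERE {partition_col}='{value}'"
        )
    return statements


# Hypothetical example: table A with partitions 1, 2, 3.
stmts = build_partition_copy_statements(
    "A", "othertable", "part", ["1", "2", "3"], ["col1", "col2"]
)
for stmt in stmts:
    print(stmt)
```

Each generated statement can then be submitted through whatever client is already in use, for example a JDBC connection or the Hive command line.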
Hello,
Is there a practical way to filter out the log entries left by crawlers like Google?
They usually have user-agent strings like
Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)
Is there a database for the
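As a rough starting point, crawler entries can be dropped by matching a few common substrings in the user-agent string. The sketch below is a heuristic only: the keyword list and the function name `is_crawler` are assumptions, not an authoritative crawler database, and it will miss bots that do not identify themselves.

```python
import re

# Assumed keyword list; extend it as new crawlers show up in the logs.
BOT_PATTERN = re.compile(r"bot|crawler|spider|slurp", re.IGNORECASE)


def is_crawler(user_agent):
    """Return True if the user-agent string looks like a crawler."""
    return bool(BOT_PATTERN.search(user_agent or ""))


samples = [
    "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
    "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)",
    "Mozilla/5.0 (Windows NT 6.1; rv:2.0) Gecko/20100101 Firefox/4.0",
]

# Keep only the entries that do not look like crawlers.
human_agents = [ua for ua in samples if not is_crawler(ua)]
print(human_agents)
```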
Hello,
I am looking over Oozie's coordinator. In the meantime, I managed to
write a simple Java program that connects to Hive using JDBC.
I can import data and execute queries.
I was wondering: to run workflows, one needs to keep some
metadata, i.e. which file or partition was processed last
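A very small version of that bookkeeping could be a state file listing the names already processed. The sketch below assumes a JSON file on local disk; the function names and the file layout are hypothetical, and it does not handle concurrent runs.

```python
import json
import os


def load_processed(state_path):
    """Return the set of file names recorded as processed (empty if no state yet)."""
    if not os.path.exists(state_path):
        return set()
    with open(state_path) as f:
        return set(json.load(f))


def find_new_files(log_dir, processed):
    """Return the names in log_dir that are not yet recorded, in sorted order."""
    return sorted(name for name in os.listdir(log_dir) if name not in processed)


def mark_processed(state_path, processed, name):
    """Record one more processed file and rewrite the state file."""
    processed.add(name)
    with open(state_path, "w") as f:
        json.dump(sorted(processed), f)
```

A periodic job would call `load_processed`, run its load statement for each name returned by `find_new_files`, and call `mark_processed` only after the load succeeds, so a failed run is retried next time.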
Hey Cam,
You should use Oozie's Coordinator:
https://github.com/yahoo/oozie/wiki/Oozie-Coord-Use-Cases.
Regards,
Jeff
On Tue, Feb 8, 2011 at 4:29 PM, Cam Bazz wrote:
> Hello,
>
> What kind of strategy should I follow in order to run certain
> things periodically?
>
> For example, each hour, I w
Hello,
What kind of strategy should I follow in order to run certain
things periodically?
For example, each hour, I want to look through the log files in a certain
directory, and for new files, I need to run:
load data local inpath '/home/cam/logs/log.2011310120' into table
item_view_raw partition (date_hour=20