You get that error because "location" is a keyword in Hive. Try enclosing
the column name in backticks (`location`) and run the statement again.
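A minimal sketch of the backtick quoting, assuming a tab-delimited text table. The table name fb_checkins and the columns other than location are hypothetical, since the original schema is not shown in this thread:

```sql
-- Hypothetical table for illustration only.
-- The backticks make Hive parse location as a column name
-- instead of the LOCATION keyword used in CREATE TABLE ... LOCATION.
CREATE TABLE fb_checkins (
  id         STRING,
  name       STRING,
  `location` STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';

-- The same quoting is needed wherever the column is referenced.
SELECT id, `location`
FROM fb_checkins
LIMIT 10;
```

If renaming the column is an option, a non-reserved name avoids the quoting altogether.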
On Mon, Apr 2, 2012 at 7:07 AM, Anurag Gulati wrote:
> I’ve been trying to figure this out for a couple days now and I haven’t
> gotten very far.
>
> Looking for your guidance on the matter.
I've been trying to figure this out for a couple days now and I haven't gotten
very far.
Looking for your guidance on the matter.
As a test, I'm trying to import Facebook Open Graph API data into Hive but am
having a problem with the syntax.
Here is a line of sample data I'm trying to import (m
Anand
You can optimize pretty much all Hive queries, but which optimizations apply
depends on the query. For example, GROUP BY has its own specific optimizations,
DISTRIBUTE BY sometimes comes in handy for certain queries, and skew joins
are good for balancing the load across reducers.
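As a rough sketch of the three techniques mentioned above, the settings below are existing Hive configuration properties, while the events table and its columns are hypothetical. Whether any of them helps depends on your data and Hive version, so treat this as a starting point rather than a recipe:

```sql
-- GROUP BY: do partial aggregation in the mappers, and split the
-- aggregation into two jobs when the grouping keys are skewed.
SET hive.map.aggr=true;
SET hive.groupby.skewindata=true;

-- Skew join: handle heavily repeated join keys in a follow-up job
-- instead of sending them all to a single reducer.
SET hive.optimize.skewjoin=true;

-- DISTRIBUTE BY: route all rows of one user to the same reducer and
-- sort them within that reducer (hypothetical table and columns).
SELECT user_id, event_time, action
FROM events
DISTRIBUTE BY user_id
SORT BY user_id, event_time;
```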
Hi
At first glance, it looks like a plain map join is happening in your case
rather than a bucketed map join. The following conditions need to hold for a
bucketed map join to work:
1) Both tables are bucketed on the join columns
2) The number of buckets in one table is a multiple of the number of buckets in the other
3
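A sketch of a setup that satisfies the first two conditions, with hypothetical table and column names. The third condition is cut off in this message and is not reflected here:

```sql
-- Make inserts honor the declared bucketing (needed on Hive releases
-- of this era; later releases always enforce it).
SET hive.enforce.bucketing=true;

-- Both tables are bucketed on the join column user_id, and
-- 32 is a multiple of 8.
CREATE TABLE orders_b (order_id BIGINT, user_id BIGINT)
CLUSTERED BY (user_id) INTO 32 BUCKETS;

CREATE TABLE users_b (user_id BIGINT, name STRING)
CLUSTERED BY (user_id) INTO 8 BUCKETS;

-- Ask Hive to use the bucketed variant of the map join.
SET hive.optimize.bucketmapjoin=true;

SELECT /*+ MAPJOIN(u) */ o.order_id, u.name
FROM orders_b o
JOIN users_b u ON o.user_id = u.user_id;
```

The tables must be populated through Hive (for example with INSERT ... SELECT) so that the files on disk actually match the declared buckets.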
Anand,
The best place to understand join queries in Hive is the presentation
by Namit Jain from Facebook.
Here is the PDF:
https://cwiki.apache.org/Hive/presentations.data/Hive%20Summit%202011-join.pdf
You can search for the video on YouTube. It's very well described.
On Sun, Apr 1, 2012 at 11:59
I am trying to understand what are some of the options/settings available to
tune the performance of Hive Queries. I have seen the benefits of Map side
joins and Partitioning/Clustering. However I have yet to realize the impact map
side aggregation has on query performance. I tried running this