Re: map side join

2015-04-30 Thread Abe Weinograd
Great info, thanks. Makes sense on the partition since those files can be shipped by themselves. These are "reference" tables, but one happens to be pretty long. Thanks, Abe On Thu, Apr 30, 2015 at 12:54 PM, Gopal Vijayaraghavan wrote: > Hi, > > > Using CDH 5.3 - Hive 0.13. Does a view help

Re: map side join

2015-04-30 Thread Gopal Vijayaraghavan
Hi, > Using CDH 5.3 - Hive 0.13. Does a view help here? Does how i format >the table help in reducing size? No, a view does not help - they are not materialized and you need hive-1.0 to have temporary table support. The only way out is if you only have 1 filter column in the system. I assume

Re: map side join

2015-04-30 Thread Abe Weinograd
Using CDH 5.3 - Hive 0.13. Does a view help here? Does how i format the table help in reducing size? Abe On Thu, Apr 30, 2015 at 11:07 AM, Gopal Vijayaraghavan wrote: > Hi, > > > its submitting the whole table to the job. if I use a view with the > >filter > > baked in, will that help? I do

Re: map side join

2015-04-30 Thread Gopal Vijayaraghavan
Hi, > its submitting the whole table to the job. if I use a view with the >filter > baked in, will that help? I don't want to have to jack up the JVM for >the > client/HiveServer2 to accommodate the full table. Which hive version are you using? If you're on a recent version like hive-1.0, this

map side join

2015-04-30 Thread Abe Weinograd
Hi, I am doing a few map side joins in one query to load a user-facing ORC table in order to denormalize. Two of the tables I am joining to are pretty large. I am setting hive.auto.convert.join.noconditionaltask.size pretty high. However, the join itself filters on those two tables, but it se

map-side join fails when a serialized table contains arrays

2015-03-02 Thread Makoto Yui
Hi, I got the attached error on a map-side join where a serialized table contains an array column. When setting map-side join off via setting hive.mapjoin.optimized.hashtable=false, exceptions do not occur. It seems that a wrong ObjectInspector was set at CommonJoinOperator#initializeOp. I am

Map side join failed when setting hive.optimize.cp to false

2014-08-05 Thread Shangzhong zhu
Hive version 0.12.0 To enable Map side join, we enable: set hive.auto.convert.join=true; set hive.auto.convert.join.noconditionaltask = true; set hive.auto.convert.join.noconditionaltask.size = 12800; However, when we also set: hive.optimize.cp=false, Map side join will fail with

Re: Issue while inserting data in the hive table using map side join

2014-04-23 Thread Db-Blog
Hi Anirudh, Below are some links depicting the problem MIGHT BE related to data nodes. Please go thru the same and let us know if it was useful. 1. http://hansmire.tumblr.com 2. http://wiki.apache.org/hadoop/CouldOnlyBeReplicatedTo Hive Experts- Kindly share your suggestions/findings on the sa

Issue while inserting data in the hive table using map side join

2014-04-23 Thread anirudh kala
up in the map side join. This is a 20 node cluster with 8 GB ram on each machine. -- thanks Anirudh Visit me at : www.anirudhkala.in

Re: Map-side join memory limit is too low

2014-02-03 Thread Lefty Leverenz
Searching the JIRA for HADOOP_HEAPSIZE turned up this new ticket (and related ones mentioned in the comments): HADOOP-10245 : The Hadoop command line scripts (hadoop.sh or hadoop.cmd) will call java > with "-Xmx" options twice. The impact is that

Re: Map-side join memory limit is too low

2014-02-02 Thread Navis류승우
try "set hive.mapred.local.mem=7000" or add it to hive-site.xml instead of modifying hive-env.sh HADOOP_HEAPSIZE is not in use. Should fix documentation of it. Thanks, Navis 2014-01-31 Avrilia Floratou : > Hi, > I'm running hive 0.12 on yarn and I'm trying to convert a common join into > a map

Map-side join memory limit is too low

2014-01-31 Thread Avrilia Floratou
Hi, I'm running hive 0.12 on yarn and I'm trying to convert a common join into a map join. My map join fails and from the logs I can see that the memory limit is very low: Starting to launch local task to process map join; maximum memory = 514523136 How can I increase the maximum memory? I'

ClassCastException during reduce-side join, but not map-side join

2013-01-17 Thread Anthony Urso
1 STRING, ... c5 STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' LOCATION 's3n://t1/'; CREATE EXTERNAL TABLE t2 ( c1 STRING, ... c18 STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' LOCATION 's3n://t2/'; Finally, changing Q1 above to use a map-sid

Re: Map side join

2012-12-27 Thread Souvik Banerjee
Hi, To conclude this thread I am summarizing my experiences. Correct me if you think or have observed otherwise. 1) For Map side join you need to set the flag hive.auto.convert.join=true; Map side join works well with multiple tables and multiple join conditions. 2) You can change the size of the small

Re: map side join with group by

2012-12-13 Thread Chen Song
n >> >> Yeah. My original question is that is there a way to force Hive (or >> rather to say, is it possible) to execute map side join at mapper phase and >> group by in reduce phase. So instead of launching a map only job (join) and >> map reduce job (group by), doing it

Re: map side join with group by

2012-12-13 Thread Nitin Pawar
provide more ideas On Fri, Dec 14, 2012 at 12:42 AM, Chen Song wrote: > Nitin > > Yeah. My original question is that is there a way to force Hive (or rather > to say, is it possible) to execute map side join at mapper phase and group > by in reduce phase. So instead of launching a map

Re: map side join with group by

2012-12-13 Thread Chen Song
Nitin Yeah. My original question is that is there a way to force Hive (or rather to say, is it possible) to execute map side join at mapper phase and group by in reduce phase. So instead of launching a map only job (join) and map reduce job (group by), doing it altogether in a single MR job. This

Re: map side join with group by

2012-12-13 Thread Nitin Pawar
keys are different). But for map side join, the joins would be > complete by the end of the map phase, and outputs should be ready to be > distributed to reducers based on group by keys. > > Chen > > > On Thu, Dec 13, 2012 at 11:04 AM, Nitin Pawar wrote: > >> Thats beca

Re: Map side join

2012-12-13 Thread Souvik Banerjee
KS > > Sent from remote device, Please excuse typos > -- > *From: * Souvik Banerjee > *Date: *Thu, 13 Dec 2012 12:00:16 -0600 > *To: *; > *Subject: *Re: Map side join > > Hi Bejoy, > > The input files are non-compressed text file. > There

Re: map side join with group by

2012-12-13 Thread Chen Song
Understood the fact that it is impossible in the same MR job if both join and group by are going to happen in the reduce phase (because the join keys and group by keys are different). But for map side join, the joins would be complete by the end of the map phase, and outputs should be ready to be

Re: Map side join

2012-12-13 Thread bejoy_ks
from remote device, Please excuse typos -Original Message- From: Souvik Banerjee Date: Thu, 13 Dec 2012 12:00:16 To: ; Subject: Re: Map side join Hi Bejoy, The input files are non-compressed text file. There are enough free slots in the cluster. Can you please let me know can I increase

Re: Map side join

2012-12-13 Thread Souvik Banerjee
e > *Date: *Wed, 12 Dec 2012 14:27:27 -0600 > *To: *; > *ReplyTo: * user@hive.apache.org > *Subject: *Re: Map side join > > Hi Bejoy, > > Yes I ran the pi example. It was fine. > Regarding the HIVE Job what I found is that it took 4 hrs for the first > map job to get

Re: Map side join

2012-12-13 Thread bejoy_ks
Banerjee Date: Wed, 12 Dec 2012 14:27:27 To: ; Reply-To: user@hive.apache.org Subject: Re: Map side join Hi Bejoy, Yes I ran the pi example. It was fine. Regarding the HIVE Job what I found is that it took 4 hrs for the first map job to get completed. Those map tasks were doing their job and

Re: map side join with group by

2012-12-13 Thread Nitin Pawar
n plans and the Semantic Analyzer code. >>> >>> And for completeness, there is a conditional task (starting Hive 0.7) >>> that will convert your joins automatically to map joins where >>> applicable. This can be enabled by enabling hive.auto.convert.join >>> pro

Re: map side join with group by

2012-12-13 Thread Chen Song
ert your joins automatically to map joins where >> applicable. This can be enabled by enabling hive.auto.convert.join >> property. >> >> Mark >> >> On Wed, Dec 12, 2012 at 3:32 PM, Chen Song >> wrote: >> > I have a silly question on how Hive inter

Re: map side join with group by

2012-12-12 Thread Nitin Pawar
> > And for completeness, there is a conditional task (starting Hive 0.7) > that will convert your joins automatically to map joins where > applicable. This can be enabled by enabling hive.auto.convert.join > property. > > Mark > > On Wed, Dec 12, 2012 at 3:32 PM, Chen Son

Re: map side join with group by

2012-12-12 Thread Mark Grover
ling hive.auto.convert.join property. Mark On Wed, Dec 12, 2012 at 3:32 PM, Chen Song wrote: > I have a silly question on how Hive interprets a simple query with both map > side join and group by. > > The query below will translate into two jobs, with the 1st one as a map only job >

map side join with group by

2012-12-12 Thread Chen Song
I have a silly question on how Hive interprets a simple query with both map side join and group by. The query below will translate into two jobs, with the 1st one as a map only job doing the join and storing the output in an intermediary location, and the 2nd one as a map-reduce job taking the output

Re: Map side join

2012-12-12 Thread Souvik Banerjee
g skeptical in > task, Tasktracker or jobtracker logs? > > > Regards > Bejoy KS > > Sent from remote device, Please excuse typos > -- > *From: * Souvik Banerjee > *Date: *Tue, 11 Dec 2012 17:12:20 -0600 > *To: *; > *ReplyTo: * user

Re: Map side join

2012-12-12 Thread bejoy_ks
-Original Message- From: Souvik Banerjee Date: Tue, 11 Dec 2012 17:12:20 To: ; Reply-To: user@hive.apache.org Subject: Re: Map side join Hello Everybody, I need help with a HIVE join. As we were talking about the Map side join I tried that. I set the flag set hive.auto.convert.join=true; I

Re: Map side join

2012-12-11 Thread Souvik Banerjee
Hello Everybody, I need help with a HIVE join. As we were talking about the Map side join I tried that. I set the flag set hive.auto.convert.join=true; I saw Hive converts the same to map join while launching the job. But the problem is that none of the map jobs progress in my case. I made the

Re: Map side join

2012-12-07 Thread Souvik Banerjee
Hi Bejoy, That's wonderful. Thanks for your reply. What I was wondering is whether HIVE can do map side join with more than one condition in the JOIN clause. I'll simply try it out and post the result. Thanks once again. Regards, Souvik. On Fri, Dec 7, 2012 at 2:10 PM, wrote: > ** >

Re: Map side join

2012-12-07 Thread bejoy_ks
Subject: Map side join Hello everybody, I have got a question. I didn't come across any post which says something about this. I have got two tables. Let's say A and B. I want to join A & B in HIVE. I am currently using HIVE 0.9 version. The join would be on a few columns, like on (A.id1 = B

Map side join

2012-12-07 Thread Souvik Banerjee
.id2) AND (A.id3 = B.id3) Can I ask HIVE to use map side join in this scenario? Should I give a hint to HIVE by saying /*+mapjoin(B)*/ Get back to me if you want any more information in this regard. Thanks and regards, Souvik.
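For readers skimming this thread, the question above could be sketched in HiveQL roughly as follows. This is only an illustration, not the poster's actual query: the tables A and B are the placeholders from the message, the column names id1/id2/id3 are an assumption (the archive snippet elides part of the ON clause), and B is assumed to be the small table that fits in memory.

```sql
-- Hypothetical tables: A is the large table, B the small one.
-- Column names id1/id2/id3 are assumed; the archived snippet truncates the ON clause.

-- Option 1: explicit hint naming the table to load into memory.
-- Note: on Hive 0.11 and later the hint is ignored by default;
-- uncomment the next line there (not needed on Hive 0.9, the version in the question).
-- SET hive.ignore.mapjoin.hint=false;
SELECT /*+ MAPJOIN(B) */
       A.id1, A.id2, A.id3
FROM A
JOIN B
  ON (A.id1 = B.id1)
 AND (A.id2 = B.id2)
 AND (A.id3 = B.id3);

-- Option 2: no hint; let Hive convert the join automatically,
-- using the flag discussed later in this thread.
SET hive.auto.convert.join=true;
SELECT A.id1, A.id2, A.id3
FROM A
JOIN B
  ON (A.id1 = B.id1)
 AND (A.id2 = B.id2)
 AND (A.id3 = B.id3);
```

Either form leaves it to Hive to decide at run time whether B actually fits in memory, so checking the plan with EXPLAIN is a reasonable way to confirm that a map join operator was chosen.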

Re: Map side join and Serde jar in distributed cache missing

2012-09-24 Thread Aniket Mokashi
s >> table >> > needs a custom serde, which is added every time using add Jars(size of >> jar >> > is 25KB) in hive. >> > The problem is when hive performs map side join, classes in that serde >> jar >> > is not loaded and class not found excep

Re: Map side join and Serde jar in distributed cache missing

2012-09-24 Thread Abhishek Pratap Singh
ek Pratap Singh > wrote: > > Hi all, > > > > I have enabled automatic Map join for any table less than 50MB. This > table > > needs a custom serde, which is added every time using add Jars(size of > jar > > is 25KB) in hive. > > The problem is when hive pe

Re: Map side join and Serde jar in distributed cache missing

2012-09-24 Thread Edward Capriolo
0MB. This table > needs a custom serde, which is added every time using add Jars(size of jar > is 25KB) in hive. > The problem is when hive performs map side join, classes in that serde jar > is not loaded and class not found exception is thrown. But if I disable map > side join, it wo

Map side join and Serde jar in distributed cache missing

2012-09-24 Thread Abhishek Pratap Singh
Hi all, I have enabled automatic Map join for any table less than 50MB. This table needs a custom serde, which is added every time using add Jars(size of jar is 25KB) in hive. The problem is when hive performs map side join, classes in that serde jar is not loaded and class not found exception is

Re: Map side join

2012-06-18 Thread Aniket Mokashi
task, > as will the order of the join condition. In this example, large.key was > always on the left side of the join conditions. > > > Matt Tucker > > -Original Message- > From: Abhishek [mailto:abhishek.dod...@gmail.com] > Sent: Wednesday, June 13, 2012 11:13

RE: Map side join

2012-06-13 Thread Tucker, Matt
join conditions. Matt Tucker -Original Message- From: Abhishek [mailto:abhishek.dod...@gmail.com] Sent: Wednesday, June 13, 2012 11:13 AM To: user@hive.apache.org Subject: Map side join Hi all, How can map side join in Hive be used to join multiple tables (suppose 5 tables)? Regards

Map side join

2012-06-13 Thread Abhishek
Hi all, How can map side join in Hive be used to join multiple tables (suppose 5 tables)? Regards Abhishek Sent from my iPhone

Re: Hive map side join with Distributed cache

2012-06-11 Thread Harsh J
'd common-user@ and CC'd you. On Mon, Jun 11, 2012 at 9:46 PM, abhishek dodda wrote: > hi all, > > Map side join with distributed cache how to do this? can any one help > me on this. > > Regards > Abhishek. -- Harsh J