Re: how to let hive support lzo

2013-07-22 Thread bejoy_ks
Hi, Along with the mapred.compress* properties, try setting hive.exec.compress.output to true. Regards Bejoy KS
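
A minimal sketch of the session settings, assuming the hadoop-lzo libraries are already installed on the cluster (the codec class name and table names below are assumptions, not from the thread):

    -- compress the final output written by Hive queries
    SET hive.exec.compress.output=true;
    -- pre-YARN mapred.* property names, as used in Hive of this vintage
    SET mapred.output.compress=true;
    SET mapred.output.compression.codec=com.hadoop.compression.lzo.LzopCodec;

    INSERT OVERWRITE TABLE target_table
    SELECT * FROM source_table;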

Re: Hive CLI

2013-07-08 Thread bejoy_ks
Hi Rahul, The same shortcuts ctrl+A and ctrl+E work in the hive shell for me (hive 0.9). Regards Bejoy KS

Re: Need help in Hive

2013-07-08 Thread bejoy_ks
Hi Maheedhar As I understand it, you have a column with data of the form MM:SS in your input data set. AFAIK this is not the standard java.sql.Timestamp format, and it doesn't even have a date part, so you may not be able to use the Timestamp data type here. You can define it as a STRING instead.
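
For illustration, a minimal sketch with hypothetical table and column names:

    CREATE TABLE events (
      id INT,
      -- holds MM:SS values as plain text; split() or a UDF can parse it when needed
      duration STRING
    );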

Re: integration issure about hive and hbase

2013-07-08 Thread bejoy_ks
Hi Can you try including the ZooKeeper quorum and port in your hive configuration as shown below? hive --auxpath .../hbase-handler.jar, .../hbase.jar, .../zookeeper.jar, .../guava.jar -hiveconf hbase.zookeeper.quorum= -hiveconf hbase.zookeeper.property.clientPort= Substitute the above command wi

Re: Strange error in hive

2013-07-08 Thread bejoy_ks
Hi Jerome Can you send the error log of the MapReduce task that failed? It should have some pointers to help you troubleshoot the issue. Regards Bejoy KS

Re: When to use bucketed tables with/instead of partitioned tables

2013-06-17 Thread bejoy_ks
Hi Stephen In addition to join optimization, bucketing helps a lot with sampling as well: it lets you choose the sample space (i.e. n buckets out of m). Regards Bejoy KS
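
A hedged example of bucket-based sampling, assuming a hypothetical table clustered into 32 buckets on user_id:

    -- reads only bucket 3 of 32, i.e. roughly 1/32 of the data
    SELECT * FROM user_actions TABLESAMPLE(BUCKET 3 OUT OF 32 ON user_id);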

Re: How to delete Specific date data using hive QL?

2013-06-04 Thread bejoy_ks
Adding my two cents: if you have unpartitioned data/a table and would like to partition it on some specific columns of the source table, use a dynamic partition insert. That gets the source data into separate partitions of a partitioned target table. http://kickstarthadoop.blogspot.com/2011/
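
A minimal sketch of a dynamic partition insert, with hypothetical table and column names:

    SET hive.exec.dynamic.partition=true;
    SET hive.exec.dynamic.partition.mode=nonstrict;

    -- the partition column (dt) must come last in the SELECT list
    INSERT OVERWRITE TABLE target PARTITION (dt)
    SELECT col1, col2, dt FROM source;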

Re: how does hive find where is MR job tracker

2013-05-28 Thread bejoy_ks
Hive gets the JobTracker from the mapred-site.xml specified within your $HADOOP_HOME/conf. Does your $HADOOP_HOME/conf/mapred-site.xml on the node that runs hive have the correct value for the JobTracker? If not, changing it to the right one might resolve your issue. Regards Bejoy KS
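
To check what Hive actually sees, you can echo the effective value from the Hive CLI (this only prints the configuration; it changes nothing):

    set mapred.job.tracker;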

Re: Sqoop Oracle Import to Hive Table - Error in metadata: InvalidObjectException

2013-05-25 Thread bejoy_ks
Hi Can you try doing the import again after assigning 'DS12' as the default schema for the user doing the import? Your DB admin should be able to do this in Oracle. Regards Bejoy KS

Re: io.compression.codecs not found

2013-05-23 Thread bejoy_ks
These are the defaults; add the Snappy codec to io.compression.codecs alongside them: org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.BZip2Codec Regards Bejoy KS

Re: Snappy with HIve

2013-05-23 Thread bejoy_ks
Hi Please find responses below. Do I have to give some INPUTFORMAT directive to make the Hive table read Snappy codec files? For example, for LZO it's STORED AS INPUTFORMAT "com.hadoop.mapred.DeprecatedLzoTextInputFormat". Bejoy: No custom input format required. Add the snappy codec in io.compression.codecs
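
A hedged sketch of the Snappy case, assuming the native Snappy libraries are installed. io.compression.codecs normally lives in core-site.xml; it is shown here as session-level settings purely for illustration:

    -- make the Snappy codec known; no special STORED AS clause is needed
    SET io.compression.codecs=org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.BZip2Codec,org.apache.hadoop.io.compress.SnappyCodec;

    -- to write Snappy-compressed output from a query
    SET hive.exec.compress.output=true;
    SET mapred.output.compression.codec=org.apache.hadoop.io.compress.SnappyCodec;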

Re: io.compression.codecs not found

2013-05-23 Thread bejoy_ks
Go to $HADOOP_HOME/conf, open core-site.xml, and add a new property 'io.compression.codecs' with the required compression codecs as its value. Regards Bejoy KS

Re: Hive on Oracle

2013-05-17 Thread bejoy_ks
Hi Raj Which jar you need depends on the version of Oracle you are using. The jar version corresponding to each Oracle release is listed in the Oracle documentation online. The JDBC jars are available from the Oracle website as a free download. Regards Bejoy KS

Re: Hive on Oracle

2013-05-17 Thread bejoy_ks
Hi The procedure is the same as setting up a MySQL metastore. You need to use the JDBC driver/jar corresponding to the Oracle version/release you intend to use. Regards Bejoy KS

Re: Getting Slow Query Performance!

2013-03-12 Thread bejoy_ks
Hi Since you are in a pseudo-distributed/single-node environment, hadoop mapreduce parallelism is limited. You might have just a few map slots, and map tasks may be queued waiting for others to complete. On a larger cluster your job should be faster. As a side note, certain SQL que

Re: hive issue with sub-directories

2013-03-10 Thread bejoy_ks
Hi Suresh AFAIK, as of now a partition cannot contain sub-directories; it can contain only files. You may have to move the sub-dirs out of the parent dir 'a' and create separate partitions for them. Regards Bejoy KS

Re: java.lang.NoClassDefFoundError: com/jayway/jsonpath/PathUtil

2013-03-10 Thread bejoy_ks
Hi Sai Local mode is just for trials; for any pre-prod/production environment you need MR jobs. Under the hood Hive stores its data in HDFS (mostly), and hadoop/hive is definitely used for larger data volumes, so MR should be there to process them. Regards Bejoy KS

Re: Accessing sub column in hive

2013-03-08 Thread bejoy_ks
Hi Sai You can do it as: SELECT address.country FROM employees; Regards Bejoy KS
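
For context, a hedged sketch of a table on which that query works (hypothetical schema):

    CREATE TABLE employees (
      name STRING,
      -- a nested struct column; sub-fields are addressed with dot notation
      address STRUCT<street:STRING, city:STRING, country:STRING>
    );

    SELECT address.country FROM employees;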

Re: Finding maximum across a row

2013-03-01 Thread bejoy_ks
Hi Sachin You can get the detailed steps from the hive wiki itself: https://cwiki.apache.org/Hive/hiveplugins.html Regards Bejoy KS

Re: Finding maximum across a row

2013-03-01 Thread bejoy_ks
Hi Sachin AFAIK there isn't one at the moment, but you can easily achieve this with a custom UDF. Regards Bejoy KS
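
As an alternative to writing a UDF, a plain HiveQL workaround for a row-wise maximum over a few columns (hypothetical names; assumes non-NULL values, and much later Hive releases added a built-in greatest()):

    SELECT CASE
             WHEN c1 >= c2 AND c1 >= c3 THEN c1
             WHEN c2 >= c3 THEN c2
             ELSE c3
           END AS row_max
    FROM scores;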

Re: Hive queries

2013-02-25 Thread bejoy_ks
Hi Cyril I believe you are using the Derby metastore, in which case it should be an issue with the hive configs. Derby creates a metastore in the current dir from which you start hive. The tables exported by sqoop would be inside HIVE_HOME, and hence you are not able to see the t

Re: Security for Hive

2013-02-23 Thread bejoy_ks
Hi Austin AFAIK, at the moment you can control permissions gracefully only at the data level, not at the metadata level, i.e. you can play with the hdfs permissions. Regards Bejoy KS

Re: Security for Hive

2013-02-22 Thread bejoy_ks
Hi Sachin Currently there is no such admin-user concept in hive. Regards Bejoy KS

Re: Adding comment to a table for columns

2013-02-21 Thread bejoy_ks
Hi Gupta Try out DESCRIBE EXTENDED/FORMATTED; I vaguely recall an operation like this. Please check the hive wiki for the exact syntax. Regards Bejoy KS

Re: Adding comment to a table for columns

2013-02-21 Thread bejoy_ks
Hi Gupta You can get the describe output in a formatted way using DESCRIBE FORMATTED <table_name>; Regards Bejoy KS
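
A hedged example of attaching a comment to a column and then viewing it (hypothetical table and column names):

    -- CHANGE COLUMN repeats the old name, new name, and type, then adds the comment
    ALTER TABLE sales CHANGE COLUMN amt amt DOUBLE COMMENT 'order amount in USD';
    DESCRIBE FORMATTED sales;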

Re: Running Hive on multi node

2013-02-21 Thread bejoy_ks
Hi Hamad Fully distributed is a proper cluster where the daemons are not all on the same machine. You can have hadoop installed in three modes: Stand Alone, Pseudo Distributed (all daemons on the same machine), and Fully Distributed. Regards Bejoy KS

Re: Running Hive on multi node

2013-02-21 Thread bejoy_ks
Hi Hive uses the hadoop installation specified in HADOOP_HOME. If your hadoop home is configured for fully distributed operation, it'll utilize the cluster itself. Regards Bejoy KS

Re: bucketing on a column with millions of unique IDs

2013-02-20 Thread bejoy_ks
Hi Li The major consideration you should give is the size of a bucket. One bucket corresponds to one file in hdfs, and you should ensure that every bucket is at least a block in size, or in the worst case that at least the majority of buckets are. So based on the data size you should decide on
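
A minimal sketch of a bucketed table along those lines (hypothetical names; pick the bucket count so each bucket approaches an HDFS block in size):

    CREATE TABLE user_events (
      user_id BIGINT,
      event STRING
    )
    CLUSTERED BY (user_id) INTO 64 BUCKETS;

    -- ensures inserts really produce one file per bucket
    SET hive.enforce.bucketing=true;
    INSERT OVERWRITE TABLE user_events
    SELECT user_id, event FROM staging;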

Re: CREATE EXTERNAL TABLE Fails on Some Directories

2013-02-15 Thread bejoy_ks
Hi Joseph There are differences in the following ls commands.

    [cloudera@localhost data]$ hdfs dfs -ls /715

This would list out all the contents of /715 in hdfs, if it is a dir.

    Found 1 items
    -rw-r--r--   1 cloudera supergroup    7853975 2013-02-14 17:03 /715

The output clearly shows that /715 is a file

Re: Map join optimization issue

2013-02-15 Thread bejoy_ks
Hi In later versions of hive you actually don't need a map join hint in your query; just the following suffices: SET hive.auto.convert.join=true; Regards Bejoy KS

Re: LOAD HDFS into Hive

2013-01-25 Thread bejoy_ks
Hi Venkataraman You can just create an external table and give its location as the hdfs dir where the data resides. No need to perform an explicit LOAD operation here. Regards Bejoy KS
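
A minimal sketch, with a hypothetical path and schema:

    CREATE EXTERNAL TABLE web_logs (
      ip STRING,
      ts STRING,
      url STRING
    )
    ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
    -- existing hdfs dir; the data there becomes queryable immediately
    LOCATION '/user/venkat/logs';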

Re: An explanation of LEFT OUTER JOIN and NULL values

2013-01-24 Thread bejoy_ks
Hi David, The default partitioner used in map reduce is the hash partitioner, so rows are sent to a particular reducer based on your keys. Maybe in your current data set the keys that have no values in the table all fall into the same hash bucket and hence are processed by the same reducer

Re: An explanation of LEFT OUTER JOIN and NULL values

2013-01-24 Thread bejoy_ks
Hi David An EXPLAIN EXTENDED would give you the exact pointer. From my understanding, this is how it could work: you have two tables, so two different sets of mappers would be processing them. Based on the join keys, a combination of the corresponding columns would be chosen as the key from mapper1 a
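
To see the plan Hive actually chooses, prefix the query with EXPLAIN EXTENDED (hypothetical tables):

    EXPLAIN EXTENDED
    SELECT a.id, b.val
    FROM a LEFT OUTER JOIN b ON a.id = b.id;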

Re: Mapping HBase table in Hive

2013-01-13 Thread bejoy_ks
Hi Ibrahim. SQOOP is used to import data from an rdbms to hbase in your case. Please get the schema of your corresponding table from hbase and post it here; we can then point out what your mapping could look like. Regards Bejoy KS

Re: View with map join fails

2013-01-08 Thread bejoy_ks
Looks like there is a bug with mapjoin + view. Please check the hive jira to see if there is an issue open against this; else file a new jira. From my understanding, when you enable map join, the hive parser creates backup jobs. These backup jobs are executed only if the map join fails. In normal cases

Re: Mapping HBase table in Hive

2013-01-08 Thread bejoy_ks
Hi Ibrahim The hive-hbase integration depends entirely on the hbase table schema, not the schema of the source table in mysql. You need to provide the column family:qualifier mapping there. Get the hbase table's schema from the hbase shell. Suppose you have the schema as Id, CF1.qualifier1, CF1
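
A hedged sketch of such a mapping (hypothetical table, column family, and qualifier names):

    CREATE EXTERNAL TABLE hbase_orders (
      id STRING,
      amount STRING,
      status STRING
    )
    STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
    -- :key maps to the HBase row key; the rest are columnfamily:qualifier pairs
    WITH SERDEPROPERTIES ('hbase.columns.mapping' = ':key,CF1:qualifier1,CF1:qualifier2')
    TBLPROPERTIES ('hbase.table.name' = 'orders');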

Re: Map Reduce Local Task

2013-01-08 Thread bejoy_ks
Hi Santhosh As long as the smaller table's size is in the range of a few MBs, it is a good candidate for a map join. If the smaller table is still bigger than that, you can take a look at bucketed map joins. Regards Bejoy KS

Re: External table with partitions

2013-01-06 Thread bejoy_ks
Sorry, I didn't understand your query on first look-through. Like Jagat said, you may need to go with a temp table for this. Do a hadoop fs -cp ../../a.* Create an external table with the 'destn dir' as its location: CREATE EXTERNAL TABLE <new_table> LIKE <existing_table> LOCATION '<destn dir>'; NB: I just gave the syntax from memory, please

Re: External table with partitions

2013-01-06 Thread bejoy_ks
Hi Oded If you have created the directories manually, they become visible to the hive table only once the partitions/sub-dirs are added to the metadata using 'ALTER TABLE ... ADD PARTITION'. Partitions are not picked up implicitly into a hive table even if you have a proper sub-dir structure.
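
A minimal example of registering such a directory (hypothetical table, partition column, and path):

    ALTER TABLE logs ADD PARTITION (dt='2013-01-06')
    LOCATION '/data/logs/dt=2013-01-06';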

Re: Map side join

2012-12-13 Thread bejoy_ks
Hi Souvik To have the new hdfs block size take effect on already existing files, you need to re-copy them into hdfs. To play with the number of mappers you can set a smaller value like 64MB for the min and max split sizes, mapred.min.split.size and mapred.max.split.size. Regards Bejoy KS
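
For example (64MB expressed in bytes):

    SET mapred.min.split.size=67108864;
    SET mapred.max.split.size=67108864;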

Re: Map side join

2012-12-13 Thread bejoy_ks
Hi Souvik Are your input files compressed with some non-splittable compression codec? Do you have enough free slots while this job is running? Make sure that the job is not running locally. Regards Bejoy KS

Re: Map side join

2012-12-12 Thread bejoy_ks
Hi Souvik Apart from hive jobs, are normal mapreduce jobs like wordcount running fine on your cluster? If they are, are you seeing anything suspicious in the task, TaskTracker, or JobTracker logs for the hive jobs? Regards Bejoy KS

Re: Map side join

2012-12-07 Thread bejoy_ks
Hi Souvik In earlier versions of hive you had to give the map join hint, but in later versions just set hive.auto.convert.join = true; and hive automatically selects the smaller table. It is better to give the smaller table as the first one in the join. You can use a map join if you are joining a smal
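
A small sketch of both styles (hypothetical tables; small_t is the one that fits in memory):

    -- newer versions: let hive pick the small table automatically
    SET hive.auto.convert.join=true;
    SELECT b.id, s.name
    FROM small_t s JOIN big_t b ON s.id = b.id;

    -- older versions: hint explicitly which table to replicate to the mappers
    SELECT /*+ MAPJOIN(s) */ b.id, s.name
    FROM small_t s JOIN big_t b ON s.id = b.id;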

Re: Hive | HBase Integration

2012-02-28 Thread bejoy_ks
Hi Rinku Were you able to create a normal table within your hive without any issues? By normal table I mean one that has its data dir in hdfs, not in HBase. Regards Bejoy K S

Re: parallel inserts ?

2012-02-15 Thread bejoy_ks
Hi John Yes, insert is parallel by default in hive. HiveQL gets transformed into mapreduce jobs, and hence it is definitely parallel. The only case where it is not parallel is when you have just 1 reducer. It is just reading and processing the input files in parallel using map reduce jobs fr

Re: Doubt in INSERT query in Hive?

2012-02-15 Thread bejoy_ks
Bhavesh In this case, if you are not using INSERT INTO, you may need some tmp table: write the query output to that, then load the data from there into your target table's data dir. You are not writing to any file during the LOAD DATA operation; rather you are just moving the files (in
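
A hedged sketch of that workaround, assuming an unpartitioned target table in the default warehouse location (all names hypothetical):

    CREATE TABLE tmp_results LIKE target;
    INSERT OVERWRITE TABLE tmp_results
    SELECT * FROM source;
    -- LOAD DATA INPATH moves (not copies) the files under target's directory
    LOAD DATA INPATH '/user/hive/warehouse/tmp_results' INTO TABLE target;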

Re: Doubt in INSERT query in Hive?

2012-02-15 Thread bejoy_ks
Hi Bhavesh INSERT INTO is supported in hive 0.8; an upgrade would get things rolling. LOAD DATA inefficient? What performance overhead were you facing here? Regards Bejoy K S

Re: external partitioned table

2012-02-08 Thread bejoy_ks
Hi Koert As you are creating dirs/sub-dirs using mapreduce jobs outside hive, hive is unaware of these sub-dirs. In such cases there is no way other than an ADD PARTITION DDL to register each dir as a hive partition. If you are using oozie or a shell to trigger your jobs, you can accom

Re: Error when Creating an UDF

2012-02-06 Thread bejoy_ks
Hi One of your jars is not available, and maybe that one has the required UDF or related methods. Hive was not able to locate your first jar: '/scripts/hiveMd5.jar does not exist'. Just fix this with the correct location and everything should work fine. Regards Bejoy K S
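
For reference, the usual registration sequence (the jar path and class name below are hypothetical):

    ADD JAR /path/to/hiveMd5.jar;
    -- the fully qualified class name of the UDF inside the jar
    CREATE TEMPORARY FUNCTION md5 AS 'com.example.hive.udf.Md5';
    SELECT md5(name) FROM users;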

Re: Important Question

2012-01-25 Thread bejoy_ks
Real time? Definitely not hive. Go for HBase, but don't expect HBase to be as flexible as an RDBMS. You need to choose your row key and column families wisely as per your requirements. For data mining and analytics you can mount a Hive table over the corresponding HBase table and play with SQL-li

Re: Question on bucketed map join

2012-01-19 Thread bejoy_ks
Corrected a few typos in previous mail. Hi Avrila, AFAIK the bucketed map join is not the default in hive; it happens only when the configuration parameter hive.optimize.bucketmapjoin is set to true. You may be getting the same execution plan because hive.optimize.bucketmapjoin
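
The setting in question, for reference (both join tables must be bucketed on the join key for it to kick in):

    SET hive.optimize.bucketmapjoin=true;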

Re: Insert based on whether string contains

2012-01-04 Thread bejoy_ks
I agree with Matt on that aspect. The solution I proposed was purely based on the sample data provided, where there were 3-digit comma-separated values. If there is a chance of 4-digit values in event_list as well, you may need to revisit the solution. Regards Bejoy K S

Re: Schemas/Databases in Hive

2011-12-22 Thread bejoy_ks
Also, multiple databases have proved helpful for me in organizing tables into corresponding databases when there are quite a large number of tables to manage. I also believe it'd be helpful in providing access restrictions. Regards Bejoy K S

Re: Schemas/Databases in Hive

2011-12-22 Thread bejoy_ks
Ranjith Hive does support multiple databases. If you are on one of the latest versions of hive, try: CREATE DATABASE testdb; USE testdb; It should give you what you are looking for. Regards Bejoy K S

Re: Loading data into hive tables

2011-12-08 Thread bejoy_ks
Adithya The answer is yes. SQOOP is the tool you are looking for. It has an import option to load data from any jdbc-compliant database into hive. It even creates the hive table for you by referring to the source db table. Hope it helps! Regards Bejoy K S

Re: Hive query failing on group by

2011-10-19 Thread bejoy_ks
Looks like a data problem. Were you using the GROUP BY query on the same data set? But if count(*) also throws an error, then it is back to square one: an installation/configuration problem with hive or map reduce. Regards Bejoy K S

Re: Hive query failing on group by

2011-10-19 Thread bejoy_ks
Mark To ensure your hive installation is fine, run two queries: SELECT * FROM trans LIMIT 10; SELECT * FROM trans WHERE ***; You can try this on a couple of different tables. If these queries return results and work as desired, then your hive should be working fine. If it works as the s

Re: Hive query failing on group by

2011-10-19 Thread bejoy_ks
Hi Mark What do your map reduce job logs say? Try figuring out the error from there; from the hive CLI you can hardly find the root cause of your errors. From the JobTracker web UI <http://hostname:50030/jobtracker.jsp> you can easily browse to the failed tasks and get the actual exception fr

Re: upgrading hadoop package

2011-09-01 Thread bejoy_ks
Hi Li AFAIK 0.21 is not really a stable version of hadoop, so if this upgrade is on a production cluster it'd be better to go with 0.20.203. Regards Bejoy K S

Re: Re:Re: Re: RE: Why a sql only use one map task?

2011-08-25 Thread bejoy_ks
Hi Daniel In the hadoop ecosystem the number of map tasks is actually decided by the job, basically based on the number of input splits. Setting mapred.map.tasks doesn't guarantee that only that many map tasks are triggered. What worked out for you here is that you were specifying that a

Re: Hive crashing after an upgrade - issue with existing larger tables

2011-08-18 Thread bejoy_ks
A small correction to my previous post: the CDH version is CDH u1, not u0. Sorry for the confusion. Regards Bejoy K S

Re: how to load data to partitioned table

2011-08-14 Thread bejoy_ks
Yes, I very much agree with you on those lines. Using the basic approach would run into memory issues with large datasets. I had some of those resolved by using the DISTRIBUTE BY clause and the like. In short, a little reworking of your hive queries can help you out in some cases. Regards Bejoy K S

Re: how to load data to partitioned table

2011-08-12 Thread bejoy_ks
Hi Daniel Just having a look at your requirement: to load data into a partitioned hive table from any input file, the most hassle-free approach would be: 1. Load the data into a non-partitioned table that shares a similar structure with the target table. 2. Populate the target table with t

Re: why need to copy when run a sql with a single map

2011-08-10 Thread bejoy_ks
Hi Hive queries are parsed into hadoop map reduce jobs. In map reduce jobs there are two phases between the map and reduce tasks, the copy phase and the sort phase, together known as the shuffle-and-sort phase. So the copy task indicated in the hive job here should be the copy phase of map reduce. It does the co

Re: Hive or pig for sequential iterations like those using foreach

2011-08-08 Thread bejoy_ks
Thanks Amareshwari, the article gave me many valuable hints for deciding. But out of curiosity, does hive support stage-by-stage iterative processing? If so, how? Thank you. Regards Bejoy K S

Hive or pig for sequential iterations like those using foreach

2011-08-08 Thread bejoy_ks
Hi I've used hive successfully for the past few projects. Now, for a particular use case, I'm a bit confused about what to choose, Hive or Pig. My project involves a step-by-step sequential workflow: in every step I retrieve some values based on some query and use these values as input to new queries

Re: NPE with hive.cli.print.header=true;

2011-08-01 Thread bejoy_ks
Hi Ayon AFAIK hive is supposed to behave that way. If you set hive.cli.print.header=true to enable column headers, then some commands like 'desc' are not expected to work. Not sure whether a patch has come out for this recently. Regards Bejoy K S

Re: Partition by existing field?

2011-07-08 Thread bejoy_ks
Hi Travis From my understanding of your requirement, dynamic partitions in hive are the most suitable solution. I have written a blog post on such requirements; please refer to http://kickstarthadoop.blogspot.com/2011/06/how-to-speed-up-your-hive-queries-in.html for an understanding of the

Re: Hive create table

2011-05-25 Thread bejoy_ks
Hi Jinhang I don't think hive supports multi-character delimiters. The hassle-free option here would be to preprocess the data with mapreduce, replacing the multi-character delimiter with another, permissible one that suits your data. Regards Bejoy K S
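
As another possible workaround (my addition, not from the original thread), the contrib RegexSerDe can parse multi-character delimiters via a regex; a hypothetical example for a '::' delimiter, where all columns must be STRING:

    CREATE TABLE ratings (user_id STRING, movie_id STRING, rating STRING)
    ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.RegexSerDe'
    WITH SERDEPROPERTIES ('input.regex' = '(.*)::(.*)::(.*)');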

Re: Hive map join - process a little larger tables withmoderatenumber of rows

2011-04-01 Thread bejoy_ks
Thanks for your reply Viral. However, in later versions of hive you don't have to tell hive anything (i.e. which is the smaller table). At runtime hive itself identifies the smaller table and does the local map task on it, irrespective of whether it comes on the left or right side of the join. Th

Re: Hive map join - process a little larger tables withmoderatenumber of rows

2011-04-01 Thread bejoy_ks
Thanks Yongqiang. It worked for me and I was able to evaluate the performance. It proved to be expensive :) Regards Bejoy K S

Re: Hive map join - process a little larger tables with moderatenumber of rows

2011-03-31 Thread bejoy_ks
Thanks Yongqiang for your reply. I'm running a hive script with nearly 10 joins in it. Of those, all the map joins involving smaller tables (9 of them involve one small table) run fine. Just 1 join is on two larger tables, and this map join fails; however, since the backup task (c

Re: Hadoop error 2 while joining two large tables

2011-03-17 Thread bejoy_ks
Try out CDH3b4; it has hive 0.7 and the latest of the other hadoop tools. When you work with open source it is definitely good practice to upgrade to the latest versions: with newer versions bugs are fewer, performance is better, and you get more functionality. Your query looks f