Re: need help in Mapreduce(urgent)

2012-01-09 Thread hadoop hive
Hey, thanks for your reply. I want to ask whether *setOutputFormatClass* works in the new Java API; please tell me if you have any idea. Regards, hadoophive On Mon, Jan 9, 2012 at 8:47 PM, Martin Kuhn wrote: > one more i wanna ask like how i can write output in different directories according to key va

Re: hive and hdfs location

2012-01-09 Thread Aniket Mokashi
Created https://issues.apache.org/jira/browse/HIVE-2702 for the same. On Mon, Jan 9, 2012 at 10:33 PM, Aniket Mokashi wrote: > Programmatically, listPartitionsByFilter on HiveMetaStoreClient returns list of partitions for a given filter criteria (which supports only string based partitions

Re: An Issue in org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe (All versions affected)

2012-01-09 Thread Aniket Mokashi
It means that if the value associated with the key FIELD_DELIM is absent, the value associated with SERIALIZATION_FORMAT is used instead. Thanks, Aniket On Mon, Jan 9, 2012 at 10:52 PM, Lu, Wei wrote: > Hi there, > Codes highlighted below may have some problem. SERIALIZATION_FORMAT should be *FIELD
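The fallback described in this reply can be sketched in plain Java with a nested default lookup on a Properties object. This is only an illustration, not the Hive source: the class name, the method name, and the literal key strings "field.delim" and "serialization.format" are assumptions made for the example.

```java
import java.util.Properties;

final class DelimFallbackSketch {
    // Assumed literal values of the FIELD_DELIM and SERIALIZATION_FORMAT keys.
    static final String FIELD_DELIM = "field.delim";
    static final String SERIALIZATION_FORMAT = "serialization.format";

    // Returns the FIELD_DELIM value if present, otherwise the
    // SERIALIZATION_FORMAT value, otherwise the supplied default.
    static String fieldDelimiter(Properties tbl, String defaultDelim) {
        String value = tbl.getProperty(FIELD_DELIM,
                tbl.getProperty(SERIALIZATION_FORMAT));
        return value != null ? value : defaultDelim;
    }

    public static void main(String[] args) {
        Properties tbl = new Properties();
        tbl.setProperty(SERIALIZATION_FORMAT, "9");
        // FIELD_DELIM is absent here, so the SERIALIZATION_FORMAT value is used.
        System.out.println(fieldDelimiter(tbl, "1"));
    }
}
```

Read this way, the inner lookup is only the default for the outer one, so a table that sets FIELD_DELIM is never affected by SERIALIZATION_FORMAT.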

Re: How to calculate count of quarter wise record in Hive?

2012-01-09 Thread Bhavesh Shah
Thanks, Aniket, for the reply. But when I try this I get the error: hive> select quarter, count(*) from subset group by quarter; FAILED: Error in semantic analysis: Line 1:46 Invalid table alias or column reference quarter Is there any mistake in the query? On Tue, Jan 10, 2012 at 12:04 PM, Aniket M

An Issue in org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe (All versions affected)

2012-01-09 Thread Lu, Wei
Hi there, The code highlighted below may have a problem. SERIALIZATION_FORMAT should be FIELD_DELIM. In file: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe public static SerDeParameters initSerdeParams(Configuration job, Properties tbl, String serdeName) throws SerDeException {

Re: RCFile in java MapReduce

2012-01-09 Thread Aniket Mokashi
A better way would be to mount a table on top of RCFiles and use http://incubator.apache.org/hcatalog/docs/r0.2.0/inputoutput.html#HCatInputFormat But, you will have to install and run hcatalog server for it. (Note: By default, hcatalog assumes underlying storage is RCFile, so you do not need to p

Re: How to calculate count of quarter wise record in Hive?

2012-01-09 Thread Aniket Mokashi
select quarter, COUNT(*) from table group by quarter? On Mon, Jan 9, 2012 at 10:06 PM, Bhavesh Shah wrote: > Hello, > I want to calculate count of quarter wise record in Hive. > (e.g.: In 1st Quarter - 72 (counts) likewise for other quarter) > > How can we calculate it through query or UDF in Hiv
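The query above presumes the table already has a quarter column. If the data only carries a month, the quarter has to be derived before grouping. A minimal plain-Java sketch of that arithmetic follows, the kind of logic a UDF could wrap; the class and method names are hypothetical and nothing here uses the Hive API.

```java
import java.util.Map;
import java.util.TreeMap;

final class QuarterCountSketch {
    // Maps a calendar month (1-12) to its quarter (1-4).
    static int quarterOf(int month) {
        if (month < 1 || month > 12) {
            throw new IllegalArgumentException("month must be 1-12: " + month);
        }
        return (month - 1) / 3 + 1;
    }

    // Counts records per quarter, mirroring GROUP BY quarter with COUNT(*).
    static Map<Integer, Integer> countByQuarter(int[] months) {
        Map<Integer, Integer> counts = new TreeMap<>();
        for (int month : months) {
            counts.merge(quarterOf(month), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Hypothetical sample: one month value per record.
        int[] months = {1, 2, 3, 4, 12};
        System.out.println(countByQuarter(months));
    }
}
```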

Re: hive and hdfs location

2012-01-09 Thread Aniket Mokashi
Programmatically, listPartitionsByFilter on HiveMetaStoreClient returns the list of partitions matching a given filter (which supports only string-based partitions; we need to open a JIRA for that). For each of these partitions, you can call ptn.getSd().getLocation() to get its location. Or you can use

How to calculate count of quarter wise record in Hive?

2012-01-09 Thread Bhavesh Shah
Hello, I want to calculate the count of records per quarter in Hive (e.g.: in the 1st quarter, 72 records; likewise for the other quarters). How can we calculate it through a query or a UDF in Hive? -- Thanks and Regards, Bhavesh Shah

hive and hdfs location

2012-01-09 Thread Chris Kudelka
Is there an easy/elegant way to query Hive for a table's (and a partition's) location in HDFS? I'm aware that you can get the location using "describe extended table_name" but I'd like to be able to query on just the location key. If not, is there a way to do so using the MySQL metastore db? Idea

Re: RCFile in java MapReduce

2012-01-09 Thread Yin Huai
I have some experience using RCFile with the new MapReduce API from the HCatalog project ( http://incubator.apache.org/hcatalog/ ). For the output part, in your main, you need ... > job.setOutputFormatClass(RCFileMapReduceOutputFormat.class); > RCFileMapReduceOutputFormat.setColumnNumber(job.getCo

Re: Is there a way to track Hive jobs

2012-01-09 Thread Edward Capriolo
The Hive service interface has a method getClusterStatus() which gets information from the job tracker. I have a smaller project that mines JobTracker information and persists it to Cassandra for later examination. It has no interface but it does collect data. https://github.com/edwardcap

Is there a way to track Hive jobs

2012-01-09 Thread Mark Schramm (tetrascend)
Anyone, I would like to be able to submit Hive queries/jobs and track their progress (from a Java app). For example, I would like to submit a Hive SQL command string as a job, returning a job ID, and to be able to query the Hive server for the status of jobs (e.g. queued, completed, executing hadoop j

Re: inconsistent results when doing a select over a join

2012-01-09 Thread Edward Capriolo
Create table, query, and some small data set to reproduce. On Monday, January 9, 2012, Guy Doulberg wrote: > Thanks, I am trying to reproduce it again, > But what should I send the ML? > On Mon 09 Jan 2012 07:54:24 PM IST, Edward Capriolo wrote: >> Can you reproduce the issue? possib

Re: inconsistent results when doing a select over a join

2012-01-09 Thread Guy Doulberg
Thanks, I am trying to reproduce it again, But what should I send the ML? On Mon 09 Jan 2012 07:54:24 PM IST, Edward Capriolo wrote: Can you reproduce the issue? possibly with the smaller tables and send that to the ML? Edward On Mon, Jan 9, 2012 at 12:46 PM, Guy Doulberg mailto:guy.dou

Re: inconsistent results when doing a select over a join

2012-01-09 Thread Edward Capriolo
Can you reproduce the issue? possibly with the smaller tables and send that to the ML? Edward On Mon, Jan 9, 2012 at 12:46 PM, Guy Doulberg wrote: > Hey Dave, > I didn't understand your question, > > The Inconsistant is slightly different, about 2% of differences, > > Thanks > > Guy > > On 01/0

Re: inconsistent results when doing a select over a join

2012-01-09 Thread Bejoy Ks
Hi Guy, The easiest way to nail down the root cause is divide and conquer. You can try the following: - ensure the results are consistent on individual tables without joins - try to narrow down the input to your join with a few ON conditions You can get whether it is an issue with code

Re: inconsistent results when doing a select over a join

2012-01-09 Thread Guy Doulberg
Hey Dave, I didn't understand your question. The inconsistency is slight, about 2% of differences. Thanks, Guy On 01/09/2012 07:05 PM, David Houston wrote: Hi Guy, Inconsistant by way of the results are total off or the order is different? Thanks Dave On Jan 9, 2012 5:03 PM,

Re: inconsistent results when doing a select over a join

2012-01-09 Thread David Houston
Hi Guy, Inconsistent in the sense that the results are totally off, or that the order is different? Thanks, Dave On Jan 9, 2012 5:03 PM, "Guy Doulberg" wrote: > Hi guys, > We are using hive for a while now, and recently we have encountered an issue we just can't understand, > We are selecting(the select

inconsistent results when doing a select over a join

2012-01-09 Thread Guy Doulberg
Hi guys, We have been using Hive for a while now, and recently we have encountered an issue we just can't understand. We are selecting (the select includes count(*)) over a join of two big tables. We ran the same query twice consecutively over the same two tables, and each time the results were sl

Re: need help in Mapreduce(urgent)

2012-01-09 Thread Martin Kuhn
> one more i wanna ask like how i can write output in different directories > according to key values. It would be good to know your use case, but maybe you can partition your results according to the keys http://developer.yahoo.com/hadoop/tutorial/module5.html#partitioning and use a cust
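As a rough illustration of partitioning by key, the plain-Java sketch below reproduces the hash-and-modulo arithmetic commonly used to assign a key to one of N partitions. The class and method names are hypothetical and no Hadoop classes are involved; a custom partitioner in a real job would put its own rule in place of the hash.

```java
final class KeyPartitionSketch {
    // Assigns a key to a partition in [0, numPartitions).
    // Masking with Integer.MAX_VALUE clears the sign bit so that
    // keys with a negative hashCode still get a valid index.
    static int partitionFor(String key, int numPartitions) {
        return (key.hashCode() & Integer.MAX_VALUE) % numPartitions;
    }

    public static void main(String[] args) {
        int numPartitions = 4; // hypothetical number of reducers
        for (String key : new String[] {"hive", "hadoop", "hive"}) {
            System.out.println(key + " -> partition " + partitionFor(key, numPartitions));
        }
    }
}
```

Equal keys always land in the same partition, which is what lets each reducer write all records for its keys to a single location.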

Re: need help in Mapreduce(urgent)

2012-01-09 Thread hadoop hive
Thanks, Martin :) One more thing I want to ask: how can I write output to different directories according to key values? On Mon, Jan 9, 2012 at 6:39 PM, Martin Kuhn wrote: > Hi Vikas, > > 1:- How to format output from reduce( like default is tab separator can we make it "," separator) > If you

Re: need help in Mapreduce(urgent)

2012-01-09 Thread Martin Kuhn
Hi Vikas, > 1:- How to format output from reduce( like default is tab separator can we make it "," separator) If you want this behaviour for all your Hadoop jobs, you have to put this into your mapred-site.xml: <property> <name>mapred.textoutputformat.separator</name> <value>,</value> </property> (see https://issue
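To show what the separator setting changes, here is a plain-Java model of a text output line: the separator is looked up under the property name from the message above, with a tab as the fallback. The class and method names are hypothetical, and java.util.Properties stands in for the job configuration; this is a sketch of the behaviour, not Hadoop code.

```java
import java.util.Properties;

final class SeparatorSketch {
    // Property name taken from the message above.
    static final String SEPARATOR_KEY = "mapred.textoutputformat.separator";

    // Builds one output line: key, separator, value.
    // Falls back to a tab when the property is not set.
    static String formatRecord(Properties conf, String key, int value) {
        String separator = conf.getProperty(SEPARATOR_KEY, "\t");
        return key + separator + value;
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        System.out.println(formatRecord(conf, "hive", 1));   // tab-separated
        conf.setProperty(SEPARATOR_KEY, ",");
        System.out.println(formatRecord(conf, "hadoop", 2)); // comma-separated
    }
}
```

Setting the value once in configuration keeps the mapper and reducer code unchanged, unlike building the delimited string by hand inside the map or reduce function.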

Re: need help in Mapreduce(urgent)

2012-01-09 Thread vikas Srivastava
Do you have any sample code for that? It'll be very helpful to me. On Mon, Jan 9, 2012 at 6:08 PM, Bhavesh Shah wrote: > Hello, > when you use the context.write(key,value), it appears as (seperated by tab): > (key)(value) > (Text)(IntWritable) > hive 1 > hadoop 2 > > I am also

Re: need help in Mapreduce(urgent)

2012-01-09 Thread Bhavesh Shah
Hello, when you use context.write(key,value), the output appears as (separated by a tab): (key)(value) (Text)(IntWritable) hive 1 hadoop 2 I am also new to Hadoop but faced this problem previously. What I did was combine the values in the mapper with "," and then pass them to contex

Re: need help in Mapreduce(urgent)

2012-01-09 Thread vikas Srivastava
Hi Bhavesh Shah, thanks for your reply, but the thing is, my mapper is sending *context.write(new Text(catagoryAll), new IntWritable(val));* and my reducer does *context.write(key,new IntWritable(sum));* and these produce results like hive 1 hadoop 2 and what I want is output l

Re: need help in Mapreduce(urgent)

2012-01-09 Thread Bhavesh Shah
Hello, *1:- How to format output from reduce( like default is tab separator can we make it "," separator)* Instead of formatting the output in the reducer, you set it in the map phase when you set the value for the mapper. There you can set it according to your format, e.g. word.set(fname+" "+lname); and gi

need help in Mapreduce(urgent)

2012-01-09 Thread vikas Srivastava
Hi folks, I have a few questions: 1:- How to format output from reduce (the default is a tab separator; can we make it a "," separator?) 2:- And how to write output to different directories according to reducer values. Thanks in advance r

Re: Doubt in Hive

2012-01-09 Thread Ankit Jain
No buddy, we can't do that. On Mon, Jan 9, 2012 at 3:57 PM, Bhavesh Shah wrote: > Hello all, > > Can we write the Hive Jdbc code in UDF? > > -- > Regards, > Bhavesh Shah > >

Doubt in Hive

2012-01-09 Thread Bhavesh Shah
Hello all, Can we write the Hive Jdbc code in UDF? -- Regards, Bhavesh Shah

RE: Problem with Hive->Hadoop

2012-01-09 Thread Ian.Meyers
I will attempt to downgrade to 0.20.2 of Hadoop libraries for hive to use - can anyone please advise if the below syntax is the correct method to change the Hadoop jar path? As an alternative I will change the Hadoop home for the session... Ian From: Meyers, Ian: IT (LDN) Sent: Monday, January

Re: Problem with Hive->Hadoop

2012-01-09 Thread Bhavesh Shah
Hello Ian, Refer to this link: https://groups.google.com/a/cloudera.org/group/cdh-user/browse_thread/thread/e804ebbe13a2f6fc -- Regards, Bhavesh Shah On Mon, Jan 9, 2012 at 3:35 PM, Ankit Jain wrote: > It will use hadoop lib by default.. requirement is hadoop must be in class path > On Mo

Re: Problem with Hive->Hadoop

2012-01-09 Thread Ankit Jain
Hi, It will use the Hadoop lib by default. The requirement is that HADOOP_HOME must be set. Thanks, Ankit On Mon, Jan 9, 2012 at 3:30 PM, wrote: > I did see that, and to try and ensure the right Hadoop libs are being used, I’ve created the following entry in hive-site.xml:

Re: Problem with Hive->Hadoop

2012-01-09 Thread Ankit Jain
It will use the Hadoop lib by default; the requirement is that Hadoop must be on the classpath. On Mon, Jan 9, 2012 at 3:30 PM, wrote: > I did see that, and to try and ensure the right Hadoop libs are being used, I’ve created the following entry in hive-site.xml:

RE: Problem with Hive->Hadoop

2012-01-09 Thread Ian.Meyers
I did see that, and to try and ensure the right Hadoop libs are being used, I've created the following entry in hive-site.xml: hive.aux.jars.path /apps/hadoop/hadoop-0.20.203.0/hadoop-core-0.20.203.0.jar,/apps/hadoop/hadoop-0.20.203.0/hadoop-tools-0.20.20

RE: Error in running hive query

2012-01-09 Thread Ian.Meyers
I don't appear to be getting any! It doesn't appear to be able to speak to the task tracker to create one in the first place. The full trace I get is listed again below. Being new to Hive, I don't know how to configure a hive log directory. Ian hive> select count(9) from currency_dim; Total Ma