RE: Defining collection items terminated by for a nested data type

2012-09-27 Thread Manish . Bhoge
Hi Sadu, See my answer below. Also, this will help you understand collections, MAPs, and ARRAYs in detail: http://datumengineering.wordpress.com/2012/09/27/agility-in-hive-map-array-score-for-hive/ From: Sadananda Hegde [mailto:saduhe...@gmail.com] Sent: Friday, September 28, 2012 10:31 AM
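The delimiters the subject asks about can be sketched in DDL. A minimal example with a hypothetical table (names and delimiter characters are illustrative, not from the thread):

```sql
CREATE TABLE user_events (
  user_id    STRING,
  tags       ARRAY<STRING>,
  attributes MAP<STRING, STRING>
)
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY ','
  COLLECTION ITEMS TERMINATED BY '|'
  MAP KEYS TERMINATED BY ':'
  LINES TERMINATED BY '\n';

-- Nested elements are then addressed by index and key:
SELECT user_id, tags[0], attributes['country'] FROM user_events;
```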

RE: Problem loading a CSV file

2012-09-27 Thread Savant, Keshav
Hi Sarath, Considering your two-step approach... The load command by default searches for the file in HDFS, which is what the following command does: hive> load data inpath '/user/hduser/dumps/table_dump.csv' overwrite into table table1; Instead, you can use 'local' to tell hive that the CSV
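A sketch of the two LOAD variants being contrasted (the local path is hypothetical):

```sql
-- File already in HDFS: the file is MOVED into the table's directory.
LOAD DATA INPATH '/user/hduser/dumps/table_dump.csv'
OVERWRITE INTO TABLE table1;

-- File on the local filesystem of the machine running the Hive CLI:
-- the file is COPIED up into HDFS first.
LOAD DATA LOCAL INPATH '/home/hduser/dumps/table_dump.csv'
OVERWRITE INTO TABLE table1;
```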

Problem loading a CSV file

2012-09-27 Thread Sarath
Hi, I have created a new table referencing a file on HDFS: create external table table1 (field1 STRING, field2 STRING, field3 STRING, field3 STRING, field4 STRING, field5 FLOAT, field6 FLOAT, field7 FLOAT, field8 STRING, field9 STRING) row format delimited fields terminated by ',' loc
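As a side note, the DDL above declares field3 twice; if that is not a transcription artifact, Hive will reject it with a duplicate-column error. A cleaned-up sketch (the LOCATION path is hypothetical, since the original statement is truncated):

```sql
CREATE EXTERNAL TABLE table1 (
  field1 STRING, field2 STRING, field3 STRING, field4 STRING,
  field5 FLOAT,  field6 FLOAT,  field7 FLOAT,
  field8 STRING, field9 STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION '/user/hduser/dumps/';
```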

Re: Can we write output directly to HDFS from Mapper

2012-09-27 Thread Hemanth Yamijala
You can certainly do that. Indeed, if you set the number of reducers to 0, the map output will be written directly to HDFS by the framework itself. You may also want to look at http://hadoop.apache.org/docs/stable/mapred_tutorial.html#Task+Side-Effect+Files to see some things that need to be taken care

Re: Can we write output directly to HDFS from Mapper

2012-09-27 Thread Harsh J
Anand, You may read this in the FAQ: http://wiki.apache.org/hadoop/FAQ#Can_I_write_create.2BAC8-write-to_hdfs_files_directly_from_map.2BAC8-reduce_tasks.3F On Fri, Sep 28, 2012 at 9:45 AM, Balaraman, Anand wrote: > Hi > > > > In Map-Reduce, is it appropriate to write the output directly to HDFS

Can we write output directly to HDFS from Mapper

2012-09-27 Thread Balaraman, Anand
Hi In Map-Reduce, is it appropriate to write the output directly to HDFS from the Mapper (without using a reducer)? Are there any adverse effects in doing so, or any best practices to be followed in this respect? Comments are much appreciated. Thanks and Regards A

Re: how to load TAB_SEPRATED file in hive table

2012-09-27 Thread MiaoMiao
It depends very much on your source file. If you choose the wrong separators, hive will not parse your file correctly. There are four kinds of separators, delimiting lines, fields, collection items, and key-values. You can refer to the following URI on this topic. https://cwiki.apache
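The four separators map onto the ROW FORMAT clause; for a TAB-separated file only the field delimiter usually needs changing. A minimal sketch with made-up table and column names:

```sql
CREATE TABLE tab_data (col1 STRING, col2 STRING, col3 INT)
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY '\t'          -- fields split on TAB
  COLLECTION ITEMS TERMINATED BY '|' -- only needed for ARRAY/STRUCT columns
  MAP KEYS TERMINATED BY ':'         -- only needed for MAP columns
  LINES TERMINATED BY '\n';          -- record separator
```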

Re: Performance tuning in hive

2012-09-27 Thread Abhishek
Hi Bejoy, Thanks for the reply. Can I know whether a combination of 1) indexing and bucketing, or 2) bucketing with RC file, or 3) sequence file with bucketing and indexing, or 4) map join with indexes, or any other combination, mentioned or not, would fetch a bette

Re: ERROR: Hive subquery showing

2012-09-27 Thread Chen Song
Sorry that I misunderstood the question. I think Phil's query will do the trick. On Thu, Sep 27, 2012 at 4:46 PM, Philip Tromans wrote: > How about: > select name from ABC order by grp desc limit 1? > > Phil. > On Sep 27, 2012 9:02 PM, "yogesh dhari" wrote: > >> Hi Bejoy, >> >> I tried this one

RE: ERROR: Hive subquery showing

2012-09-27 Thread Philip Tromans
How about: select name from ABC order by grp desc limit 1? Phil. On Sep 27, 2012 9:02 PM, "yogesh dhari" wrote: > Hi Bejoy, > > I tried this one also but here it throws horrible error: > > i.e: > > hive: select name from ABD where grp=MAX(grp); > > FAILED: Hive Internal Error: java.lang.NullPoi
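Phil's ORDER BY ... LIMIT 1 returns exactly one row; if several names can share the maximum grp, an equality join against the aggregate keeps them all. A sketch (Hive of this era did not support subqueries in the WHERE clause):

```sql
-- Returns every name whose grp equals the maximum, handling ties:
SELECT a.name, a.grp
FROM ABC a
JOIN (SELECT MAX(grp) AS max_grp FROM ABC) m
  ON a.grp = m.max_grp;
```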

Re: Reduce phase initialization

2012-09-27 Thread Bertrand Dechoux
Strictly speaking, reduce can only start once all maps are done. However, the copy is part of the reported reduce metrics, so that may be why you are asking the question. A simple answer is that because you don't have much data there is only one mapper, which means the copy will have to start after all (only
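Related to the timing question: when reduce tasks launch (and so when the copy phase can begin) is governed by a slowstart threshold. A sketch using the Hadoop 1.x property name (verify against your version's defaults):

```sql
-- Fraction of map tasks that must finish before reducers are scheduled;
-- raising it delays reducer launch, lowering it starts the copy earlier.
SET mapred.reduce.slowstart.completed.maps=0.8;
```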

RE: ERROR: Hive subquery showing

2012-09-27 Thread yogesh dhari
Hi Bejoy, I tried this one also but here it throws horrible error: i.e: hive: select name from ABD where grp=MAX(grp); FAILED: Hive Internal Error: java.lang.NullPointerException(null) java.lang.NullPointerException at org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc.newInstance(Exp

RE: ERROR: Hive subquery showing

2012-09-27 Thread yogesh dhari
Thanks Chen, I want output like (the name and grp having the highest grp) D 8 for the table. name grp A 1 B 2 C 4 D 8 Query: select name from ( select MAX(grp) as name from ABC ) gy ; showing outpu

Re: ERROR: Hive subquery showing

2012-09-27 Thread Bejoy KS
Hi yogesh What about a query like this select name from ABC WHERE grp=MAX(grp); Regards Bejoy KS Sent from handheld, please excuse typos. -Original Message- From: Chen Song Date: Thu, 27 Sep 2012 15:33:11 To: Reply-To: user@hive.apache.org Subject: Re: ERROR: Hive subquery showing

Re: ERROR: Hive subquery showing

2012-09-27 Thread Chen Song
Can you try this? select name from ( select MAX(grp) as name from ABC ) gy ; On Thu, Sep 27, 2012 at 3:29 PM, yogesh dhari wrote: > Hi all, > > I have a table called ABC, like > > name grp > A 1 > B 2 > C 4 > D 8 > > I want the output lik

ERROR: Hive subquery showing

2012-09-27 Thread yogesh dhari
Hi all, I have a table called ABC, like name grp A 1 B 2 C 4 D 8 I want the output like the name having the greatest grp, i.e. D; I wrote a query: select name from ( select MAX(grp) from ABC ) gy ; but it gives an error FAILED: Error in semantic ana
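Two things go wrong in the query above: aggregates cannot be filtered on outside SELECT/HAVING, and the inner query returns the maximum grp value rather than a name. The fix suggested elsewhere in the thread:

```sql
-- Sort descending on grp and keep the top row:
SELECT name FROM ABC ORDER BY grp DESC LIMIT 1;
```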

Re: Performance tuning in hive

2012-09-27 Thread Bejoy KS
Hi Abhishek You can have a look at join optimizations as well as group-by optimizations. Join optimization - based on your data sets you can go with a map-side join or a bucketed map join. To enable map join -> set hive.auto.convert.join = true; to enable bucketed map join -> set hive.optimize.
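The truncated setting above is presumably the bucketed-map-join flag; a sketch of both (defaults vary by Hive version, so verify before relying on them):

```sql
-- Let Hive convert eligible joins to map-side joins automatically
-- (the smaller table is cached in memory on each mapper):
SET hive.auto.convert.join=true;

-- Use bucketed map join; both tables must be bucketed on the join key:
SET hive.optimize.bucketmapjoin=true;
```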

Error on hive web interface

2012-09-27 Thread Germain Tanguy
Hi, I am a new user of Hive, on version 0.9.0. I am trying to use the hive web interface and I get this error: 12/09/27 11:05:02 INFO hwi.HWIServer: HWI is starting up 12/09/27 11:05:02 INFO mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog 12

Reduce phase initialization

2012-09-27 Thread Abhishek
Hi all, In some of my hive scripts, the reduce phase does not start until the map finishes 100%. What would be the reason for this? Regards Abhi Sent from my iPhone

Performance tuning in hive

2012-09-27 Thread Abhishek
Hi all, I am trying to increase the performance of some queries in hive; the queries mostly contain left outer joins, group by, conditional checks, and union all. I have overridden some properties in the hive shell: Set io.sort.mb=512 Set io.sort.factor=100 Set mapred.child.jvm.opts=-Xmx2048mb Set
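Two of the quoted settings look mistyped; a hedged sketch of the usual Hadoop 1.x spellings (check against your cluster's documentation):

```sql
SET io.sort.mb=512;
SET io.sort.factor=100;
-- The property is mapred.child.java.opts (not .jvm.opts), and the
-- JVM heap flag takes 'm', not 'mb':
SET mapred.child.java.opts=-Xmx2048m;
```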

Re: Re: size of RCFile in hive

2012-09-27 Thread Chen Song
You can force a reduce phase by adding a distribute by or order by clause after your select query. On Thu, Sep 27, 2012 at 2:03 PM, 王锋 wrote: > but it's a map-only job > > > At 2012-09-27 05:39:39,"Chen Song" wrote: > As far as I know, the number of files emitted would be determined by the > number
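A sketch of the suggestion (table and column names are made up):

```sql
-- DISTRIBUTE BY introduces a reduce stage, so the number of output
-- files follows the reducer count instead of the mapper count:
SET mapred.reduce.tasks=4;

INSERT OVERWRITE TABLE rc_target
SELECT *
FROM source_table
DISTRIBUTE BY some_key;
```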

RE: zip file or tar file consumption

2012-09-27 Thread Savant, Keshav
True Manish. Keshav C Savant From: Manish.Bhoge [mailto:manish.bh...@target.com] Sent: Thursday, September 27, 2012 4:26 PM To: user@hive.apache.org; manishbh...@rocketmail.com Subject: RE: zip file or tar file consumption Thanks Savant. I believe this will hold good for .zip files also. Thank Yo

RE: zip file or tar file consumption

2012-09-27 Thread Manish . Bhoge
Thanks Savant. I believe this will hold good for .zip files also. Thank You, Manish. From: Savant, Keshav [mailto:keshav.c.sav...@fisglobal.com] Sent: Thursday, September 27, 2012 10:19 AM To: user@hive.apache.org; manishbh...@rocketmail.com Subject: RE: zip file or tar file consumption Manish the

RE: hive server security/authentication

2012-09-27 Thread Manish . Bhoge
Hi, We have been using data from Hive in Tableau. You need a JDBC connection. I don't remember the exact menu in Tableau, but having your metadata in MySQL and a JDBC connection on top of MySQL will allow you to access the data from Hive. If you have CDH3 then make sure you don't

Re: issue hive with external derby

2012-09-27 Thread AnilKumar B
Yes, after adding these jars it's working. Thanks Bertrand & Bejoy. On Thu, Sep 27, 2012 at 1:05 PM, Bejoy KS wrote: > Hi Anil Kumar > > The error is basically due to missing connector jars in the > classpath. Please ensure you have derbyclient.jar and derbytools.jar in > /

Re: Hive 0.9.0 with hadoop 0.20.2 (fair scheduler mode)

2012-09-27 Thread Amit Sangroya
On Thu, Sep 27, 2012 at 10:56 AM, Amit Sangroya wrote: > Hello everyone, > > I am finding that Hive v-0.9.0 works with hadoop 0.20.0 only in the > default scheduling mode. But when I try to use the "Fair" scheduler with > this configuration, I see that map reduce does not progress and the hive log > s

Re: issue hive with external derby

2012-09-27 Thread Bejoy KS
Hi Anil Kumar The error is basically due to missing connector jars in the classpath. Please ensure you have derbyclient.jar and derbytools.jar in /hive/lib. In the worst case you need to add these jars to /hadoop/lib as well. A snippet from the hive wiki: Copy Derby Jar Fi