Re: Unit testing Hive script

2011-02-18 Thread Andrew Wilson
Hi Radek, I've been using the MiniMRCluster and MiniDFSCluster to run unit tests locally. That has been giving me decent cycle time. I have fixture tables in my test/resources which I can load into the MiniHiveCluster as part of test setup. I loosely based my code on QTestUtil in the Hive trunk

Re: date-time functions in hive

2011-02-18 Thread Edward Capriolo
On Fri, Feb 18, 2011 at 3:47 PM, Viral Bajaria wrote: > Hi, > I have a question regarding the existing date functions in Hive > (http://wiki.apache.org/hadoop/Hive/LanguageManual/UDF#Date_Functions) > The unix_timestamp() functions return a bigint while the from_unixtime() > accepts an int. I don'

date-time functions in hive

2011-02-18 Thread Viral Bajaria
Hi, I have a question regarding the existing date functions in Hive (http://wiki.apache.org/hadoop/Hive/LanguageManual/UDF#Date_Functions). The unix_timestamp() functions return a bigint while from_unixtime() accepts an int. I don't think that's the right thing to do. The from_unixtime shoul
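To illustrate the mismatch raised in this message, here is a minimal Python sketch (not Hive code). It assumes that Hive's int is a 32-bit signed integer and bigint a 64-bit one; the helper names are hypothetical and exist only for this example.

```python
from datetime import datetime, timezone

# Largest value a 32-bit signed integer can hold (assumed width of Hive's int).
INT32_MAX = 2**31 - 1


def fits_in_int32(epoch_seconds: int) -> bool:
    """Return True if the epoch value survives a round trip through a 32-bit signed int."""
    return -2**31 <= epoch_seconds <= INT32_MAX


def last_int32_instant() -> datetime:
    """UTC instant of the largest epoch second a 32-bit signed int can represent."""
    return datetime.fromtimestamp(INT32_MAX, tz=timezone.utc)


if __name__ == "__main__":
    print(last_int32_instant().isoformat())  # 2038-01-19T03:14:07+00:00
    print(fits_in_int32(INT32_MAX + 1))      # False
```

Under that assumption, any bigint timestamp later than January 2038 could not be handed to a function that accepts only an int without narrowing it first.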

Re: Unit testing Hive script

2011-02-18 Thread Kirk True
Hi Radek, I'm actually in the process of running the map-join unit tests against EMR as we speak. It's possible but dog slow :) Thanks, Kirk On 2/18/11 11:09 AM, Edward Capriolo wrote: On Fri, Feb 18, 2011 at 6:58 AM, Radek Maciaszek wrote: Hello, I was wondering if anyone managed to unit

Re: left outer join and nulls

2011-02-18 Thread Ashish Thusoo
You could use the following construct: if(T.c is null, 0, T.c). Check out the conditional functions at http://wiki.apache.org/hadoop/Hive/LanguageManual/UDF#Conditional_Functions Ashish On Feb 18, 2011, at 12:01 AM, Ca
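The effect of that construct can be sketched outside Hive. The Python example below uses an in-memory SQLite database as a stand-in, so it is an approximation: the table names (items, views) and the sample rows are made up, and the conditional is spelled as a CASE expression because SQLite is not guaranteed to offer Hive's if() function.

```python
import sqlite3


def counts_with_zero_default():
    """Left outer join where a missing right-hand row yields 0 instead of NULL."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical fixture data: item 2 has no row in the right-hand table.
    conn.executescript("""
        CREATE TABLE items (id INTEGER);
        CREATE TABLE views (id INTEGER, c INTEGER);
        INSERT INTO items VALUES (1), (2), (3);
        INSERT INTO views VALUES (1, 5), (3, 2);
    """)
    rows = conn.execute("""
        SELECT i.id,
               CASE WHEN T.c IS NULL THEN 0 ELSE T.c END AS c
        FROM items i
        LEFT OUTER JOIN views T ON i.id = T.id
        ORDER BY i.id
    """).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    print(counts_with_zero_default())  # [(1, 5), (2, 0), (3, 2)]
```

Item 2 has no match on the right side, so the join produces NULL for T.c and the conditional turns it into 0.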

Re: Unit testing Hive script

2011-02-18 Thread Edward Capriolo
On Fri, Feb 18, 2011 at 6:58 AM, Radek Maciaszek wrote: > Hello, > I was wondering if anyone managed to unit test Hive scripts and share > his/her experience? My first thought was to prepare sample data, run hive > scripts in order to generate output and then compare the generated output > with th

RE: Hive Not Returning Column Names, even what not using 'When'??

2011-02-18 Thread Sirota, Peter
Hi Mark, Try this link for the Hive 0.5 JDBC driver: http://aws.amazon.com/developertools/Elastic-MapReduce/0196055244487017 We are actually in the Seattle office. Best Regards, Peter- From: Sunderlin, Mark [mailto:mark.sunder...@teamaol.com] Sent: Friday, February 18, 2011 6:26 AM To: 'user@hive.apach

Re: problem while performing union on twotables

2011-02-18 Thread sangeetha s
Hi, Thanks, Jov and Ajo. Changing from UNION to UNION ALL solved the issue. But do we need to specify all the fields in the subquery? Actually, I had used the following query: INSERT OVERWRITE TABLE tab3 SELECT t3.col1, t3.col2 FROM (SELECT id AS col1, name AS col2 FROM tab1 UNION ALL SELECT id AS c

RE: Hive Not Returning Column Names, even what not using 'When'??

2011-02-18 Thread Sunderlin, Mark
Hey Peter, this looks like it ought to work for me, but the link to the Hive 0.5 drivers ... seems broken? http://buyitnw.appspot.com/aws.amazon.com/developertools/Elastic-MapReduce/0196055244487017 seems to be the link from the site you mention below, but it returns a blank page.

Re: problem while performing union on twotables

2011-02-18 Thread Ajo Fod
Here is the relevant documentation: http://wiki.apache.org/hadoop/Hive/LanguageManual ... see the Union section. Cheers, Ajo. On Thu, Feb 17, 2011 at 11:12 PM, sangeetha s wrote: > Hi, > > I am trying to perform union of two tables which are having identical > schemas and distinct data. There ar

Unit testing Hive script

2011-02-18 Thread Radek Maciaszek
Hello, I was wondering if anyone managed to unit test Hive scripts and share his/her experience? My first thought was to prepare sample data, run hive scripts in order to generate output and then compare the generated output with the expected output. Sounds fairly simple but it may be a bit compli
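The comparison step described in this message could look roughly like the Python sketch below. It is only an illustration: it assumes the script output is available as tab-delimited text and that row order does not matter unless the query sorts its result. The function names are hypothetical, and running the Hive script itself is outside the sketch.

```python
def parse_rows(text: str, delimiter: str = "\t") -> list:
    """Split script output into rows of fields, skipping blank lines."""
    return [tuple(line.split(delimiter)) for line in text.splitlines() if line.strip()]


def outputs_match(actual: str, expected: str) -> bool:
    """Compare two outputs as multisets of rows, ignoring row order."""
    return sorted(parse_rows(actual)) == sorted(parse_rows(expected))


if __name__ == "__main__":
    generated = "1\talice\n2\tbob\n"
    expected = "2\tbob\n1\talice\n"
    print(outputs_match(generated, expected))  # True
```

Sorting both sides before comparing keeps the check stable when the same rows come back in a different order between runs.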

OutOfMemory errors on joining 2 large tables.

2011-02-18 Thread Bennie Schut
When we try to join two large tables some of the reducers stop with an OutOfMemory exception. Error: java.lang.OutOfMemoryError: Java heap space at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.shuffleInMemory(ReduceTask.java:1508) at org.apache.hadoop.mapred.ReduceTask$Re

Re: problem while performing union on twotables

2011-02-18 Thread Jov
Hive 0.4.1 does not support UNION; it only supports UNION ALL. On 2011-2-18 at 3:12 PM, "sangeetha s" wrote: > > Hi, > > I am trying to perform union of two tables which are having identical > schemas and distinct data. There are two tables 'oldtable' and 'newtable'. > The old table contains the information of old us
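For readers unsure of the difference, the Python sketch below shows the general SQL semantics using an in-memory SQLite database as a stand-in for Hive. UNION ALL keeps duplicate rows, and wrapping it in SELECT DISTINCT gives the de-duplicated result that a plain UNION would give. The columns (id, name) and the sample rows are made up, and whether this exact form runs on Hive 0.4.1 is an assumption that has not been checked here.

```python
import sqlite3


def union_all_then_distinct():
    """Return (rows from UNION ALL, rows after de-duplicating them)."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical data: row (2, 'b') appears in both tables.
    conn.executescript("""
        CREATE TABLE oldtable (id INTEGER, name TEXT);
        CREATE TABLE newtable (id INTEGER, name TEXT);
        INSERT INTO oldtable VALUES (1, 'a'), (2, 'b');
        INSERT INTO newtable VALUES (2, 'b'), (3, 'c');
    """)
    all_rows = conn.execute("""
        SELECT id, name FROM oldtable
        UNION ALL
        SELECT id, name FROM newtable
        ORDER BY id
    """).fetchall()
    distinct_rows = conn.execute("""
        SELECT DISTINCT u.id, u.name
        FROM (SELECT id, name FROM oldtable
              UNION ALL
              SELECT id, name FROM newtable) u
        ORDER BY u.id
    """).fetchall()
    conn.close()
    return all_rows, distinct_rows


if __name__ == "__main__":
    kept, deduplicated = union_all_then_distinct()
    print(kept)          # [(1, 'a'), (2, 'b'), (2, 'b'), (3, 'c')]
    print(deduplicated)  # [(1, 'a'), (2, 'b'), (3, 'c')]
```

When the two tables hold distinct data, as described in the quoted question, both forms return the same rows.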

left outer join and nulls

2011-02-18 Thread Cam Bazz
Hello, When we do a left outer join and the right table does not have a matching row, it returns NULLs for those values. Is there any way to turn those NULLs into 0s? Since it is a counting operation, a missing row in the right table means 0, not NULL. Best regards, -c.b.