Re: Passing parameters to initialize UDF

2012-06-26 Thread Jasper Knulst
Hi Jamie, I am also working on a UDF that has to take 2 arguments. Hive seems to detect automatically the number of arguments you declare in your evaluate method, so you can work with multiple arguments. So far so good, but after that I have some problems that the 2nd argument (String in my case)

Re: date datatype in hive

2012-06-26 Thread Soham Sardar
Hey Bejoy, that's the problem: I am not able to run the group by query in Hive. I don't know whether I'm making a mistake or something; see my previous reply to this same thread, where I put up the same issue ... On Wed, Jun 27, 2012 at 12:02 PM, Bejoy KS wrote: > Hi Soham > > Rewrite your query with the colum

Re: date datatype in hive

2012-06-26 Thread Bejoy KS
Hi Soham Rewrite your query with the columns in Group By included in Select as well. Something like select country,name from users_info group by country; Regards Bejoy KS Sent from handheld, please excuse typos. -Original Message- From: Soham Sardar Date: Wed, 27 Jun 2012 11:57:23

Re: date datatype in hive

2012-06-26 Thread Soham Sardar
And btw, does group by work in Hive? The same query I am running in MySQL is working fine, but it's failing in Hive: select name from users_info group by country; In MySQL it works, but when I try to run it in Hive it says hive> select name from users_info group by country; FAILE
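A minimal Java sketch of why Hive rejects the query above: after grouping by country, each group can hold several name values, so a bare name in the select list has no single value to return. Hive requires every selected column to be either a grouping key or wrapped in an aggregate, whereas MySQL in its permissive mode returns one arbitrary value per group. The sample rows and the class name are hypothetical; the grouping mimics what an aggregate such as collect_set(name) would produce per country.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

class GroupByCountrySketch {
    // Hypothetical rows standing in for users_info: {name, country}.
    static final String[][] USERS = {
        {"asha", "IN"}, {"ravi", "IN"}, {"tom", "US"}
    };

    // Group rows by country and collect the distinct names in each group,
    // roughly what "select country, collect_set(name) ... group by country" returns.
    static Map<String, Set<String>> namesByCountry(String[][] rows) {
        Map<String, Set<String>> groups = new TreeMap<>();
        for (String[] row : rows) {
            groups.computeIfAbsent(row[1], k -> new TreeSet<>()).add(row[0]);
        }
        return groups;
    }

    public static void main(String[] args) {
        // The IN group holds two names, so "select name ... group by country"
        // would have to pick one of them arbitrarily.
        System.out.println(namesByCountry(USERS)); // prints {IN=[asha, ravi], US=[tom]}
    }
}
```

In HiveQL terms, either add the column to the GROUP BY clause or aggregate it explicitly.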

Re: Making UDFs "permanent"

2012-06-26 Thread Jasper Knulst
Hi Denny, I asked the same question a few days ago and got this reference to another question: "If you want to make your temporary function permanent, you have to patch hive source code. Please refer to this discussion http://mail-archives.apache.org/mod_mbox/hive-user/201101.mbox/%3caanlktimbx1

Re: date datatype in hive

2012-06-26 Thread Soham Sardar
See, Bejoy and everyone, I have two tables, one users_info and one users_audit, in Hive. hive> desc users_audit; OK id int userid int logtime string hive> desc users_info; OK id int name string age int country string gender string bday string n

Re: hive - snappy and sequence file vs RC file

2012-06-26 Thread yongqiang he
Can you share the reason for choosing snappy as your compression codec? Like @omalley mentioned, RCFile will compress the data more densely, and will avoid reading data not required in your hive query. And I think Facebook uses it to store tens of PB (if not hundreds of PB) of data. Thanks Yongqiang On

Making UDFs "permanent"

2012-06-26 Thread Denny Lee
We have a scenario where we want to make a UDF permanent so that a query through the HiveODBC driver will be able to access the UDF. I seem to recall that after creating the UDF, you can make it "permanent" by adding it into the Function Registry. But it seems that I also need to rebuild the

Re: Dates in Hive

2012-06-26 Thread sonia gehlot
Thanks everyone, this worked for me: from_unixtime(unix_timestamp(dt, 'yyyyMMdd'), 'yyyy-MM-dd'). -Sonia On Mon, Jun 25, 2012 at 11:18 PM, VanHuy Pham wrote: > More specifically, you need to use three functions in a row: > 1) Use unix_timestamp(string date, string pattern) to convert the date > va
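The conversion quoted above can be sketched in plain Java, since Hive's date pattern strings follow the conventions of java.text.SimpleDateFormat. This is only an illustration, assuming the input is a yyyyMMdd string and the target is yyyy-MM-dd; the class and method names are hypothetical, and it is not how Hive itself evaluates the two functions.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;

class DateReformatSketch {
    // Parse with one pattern, then format with another: the same two steps
    // that unix_timestamp(dt, pattern) and from_unixtime(ts, pattern) perform.
    static String reformat(String dt) {
        SimpleDateFormat in = new SimpleDateFormat("yyyyMMdd");
        SimpleDateFormat out = new SimpleDateFormat("yyyy-MM-dd");
        in.setLenient(false); // reject impossible dates such as 20121340
        try {
            Date parsed = in.parse(dt);
            return out.format(parsed);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Not a yyyyMMdd date: " + dt, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(reformat("20120626")); // prints 2012-06-26
    }
}
```

Both formatters use the JVM default time zone, so the round trip does not shift the date.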

Re: hive - snappy and sequence file vs RC file

2012-06-26 Thread Owen O'Malley
SequenceFile compared to RCFile: * More widely deployed. * Available from MapReduce and Pig * Doesn't compress as small (in RCFile all of each column's values are put together) * Uncompresses and deserializes all of the columns, even if you are only reading a few In either case, for long te

Re: Is the USE database command hive server-wide or session specific

2012-06-26 Thread Vinod Singh
This is a session-specific command. Thanks, Vinod http://blog.vinodsingh.com/ On Tue, Jun 26, 2012 at 9:27 PM, Ladda, Anand wrote: > I am connecting to Hive through a client tool via Hive Server. The > client tool tries to set a database context by running the USE [database] > command when it ma

Is the USE database command hive server-wide or session specific

2012-06-26 Thread Ladda, Anand
I am connecting to Hive through a client tool via Hive Server. The client tool tries to set a database context by running the USE [database] command when it makes a connection. However, when I create another session from the client and do not specify any database context (i.e., implicitly connecting

RE: hive - snappy and sequence file vs RC file

2012-06-26 Thread Chalcy Raja
Thanks! Bejoy. I'll let you know which way we are going. Thanks, Chalcy From: Bejoy Ks [mailto:bejoy...@yahoo.com] Sent: Tuesday, June 26, 2012 9:22 AM To: user@hive.apache.org Subject: Re: hive - snappy and sequence file vs RC file Hi Chalcy AFAIK, RC File format is good when your queries deal

Re: join string in hive udf

2012-06-26 Thread Jan Dolinár
Hi, Check the hadoop logs of the failed task. My best guess is that there is an uncaught exception thrown somewhere in your code. The logs will tell where and what caused the problem. Best regards, Jan On Tue, Jun 26, 2012 at 4:20 PM, Yue Guan wrote: > Hi, hive users > > I have the following u

join string in hive udf

2012-06-26 Thread Yue Guan
Hi, hive users I have the following udf: package com.name.hadoop.hive.udf; import java.util.Set; import org.apache.commons.lang.StringUtils; import org.apache.hadoop.hive.ql.exec.UDF; import org.apache.hadoop.io.Text; public class MyUDF extends UDF { private Map> aMapping; private f

Re: date datatype in hive

2012-06-26 Thread Bejoy Ks
Hi Soham Hive supports pretty much all the primitive data types including INT. For a detailed list please refer to https://cwiki.apache.org/confluence/display/Hive/Tutorial#Tutorial-PrimitiveTypes The only drawback, as is common, is that when you have the data type as String you cannot use it directly

Re: hive - snappy and sequence file vs RC file

2012-06-26 Thread Bejoy Ks
Hi Chalcy AFAIK, RC File format is good when your queries deal with some specific columns and not on the whole data in a row. For a general purpose, Sequence File is a better choice. Also it is widely adopted, so more tools will have support for Sequence Files. Regards Bejoy KS ___

hive - snappy and sequence file vs RC file

2012-06-26 Thread Chalcy Raja
Hi Hive users, We are going to use snappy for compression. What is the best file format, sequence file or RC file? Both are splittable and therefore will work well for us. RC file performance seems to be better than Sequence file. Sqoop, looks like, may support --as-sequencefile tag somet

date datatype in hive

2012-06-26 Thread Soham Sardar
I have a native data type in MySQL and I just imported it into Hive, and the data type of the column has now become string. Now I would like to know if there is any native data type in Hive, and what are the pros and cons of using string type in Hive rather than (int) (that's what I expect) type An

Re: hi all

2012-06-26 Thread Bejoy KS
Hi Shaik On a first look, since you are using Dynamic Partition Insert, the partition column should be the last column in the select query used in Insert Overwrite. Modify your Insert as INSERT OVERWRITE TABLE vender_part PARTITION (order_date) SELECT vender,supplier,quantity,order_date FROM ve

hi all

2012-06-26 Thread shaik ahamed
Hi Users, I created a Hive table with the below syntax: CREATE EXTERNAL TABLE vender_part(vender string, supplier string, quantity int) PARTITIONED BY (order_date string) row format delimited fields terminated by ',' stored as textfile; And inserted 100GB of data with the be