Hi Jamie,
I am also working on a UDF that has to take 2 arguments. Hive seems to detect
the number of arguments you declare in your evaluate method automatically, so
you can work with multiple arguments.
So far so good, but after that I have some problems with the 2nd argument
(String in my case)
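For anyone following along, a minimal sketch of a two-argument UDF (package and
class names are hypothetical; Hive resolves the evaluate() signature by
reflection, so declaring two parameters is all it takes):

package com.example.hive.udf;  // hypothetical package

import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

public class TwoArgUDF extends UDF {
    // Concatenates two string arguments; Hive passes null for SQL NULLs.
    public Text evaluate(Text value, Text suffix) {
        if (value == null || suffix == null) {
            return null;
        }
        return new Text(value.toString() + suffix.toString());
    }
}

It would then be registered and called per session with something like:
ADD JAR /path/to/udf.jar;
CREATE TEMPORARY FUNCTION two_arg AS 'com.example.hive.udf.TwoArgUDF';
SELECT two_arg(name, country) FROM users_info;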
Hey Bejoy, that's the problem: I am not able to run the GROUP BY query in
Hive, and I don't know whether I'm making a mistake or something.
See my previous reply to this same thread; I put up the same issue ...
On Wed, Jun 27, 2012 at 12:02 PM, Bejoy KS wrote:
> Hi Soham
>
> Rewrite your query with the columns in Group By included in Select as well.
Hi Soham
Rewrite your query with the columns in Group By included in Select as well.
Something like
select country, name from users_info group by country, name;
Regards
Bejoy KS
Sent from handheld, please excuse typos.
-Original Message-
From: Soham Sardar
Date: Wed, 27 Jun 2012 11:57:23
And by the way, does GROUP BY work in Hive? Because the same query I am
running in MySQL works fine, but it's failing in Hive:
select name from users_info group by country;
In MySQL it's working, but when I try to run it in Hive it tells me:
hive> select name from users_info group by country;
FAILE
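For reference, the query passes Hive's GROUP BY check once every selected
column is either grouped or aggregated; two hedged variants:

select country, name from users_info group by country, name;
-- or, to list the names per country using a built-in aggregate:
select country, collect_set(name) from users_info group by country;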
Hi Denny,
I asked the same question a few days ago and got this reference to another
question:
"If you want to make your temporary function permanent , you have to patch
hive source code. Please refer to this discussion
http://mail-archives.apache.org/mod_mbox/hive-user/201101.mbox/%3caanlktimbx1
Hi Bejoy and everyone,
I have two tables in Hive, users_info and users_audit:
hive> desc users_audit;
OK
id       int
userid   int
logtime  string
hive> desc users_info;
OK
id       int
name     string
age      int
country  string
gender   string
bday     string
n
Can you share the reason for choosing Snappy as your compression codec?
As @omalley mentioned, RCFile will compress the data more densely and will
avoid reading data not required by your Hive query. And I think Facebook uses
it to store tens of PB (if not hundreds of PB) of data.
Thanks
Yongqiang
On
We have a scenario where we want to make a UDF permanent so that a query
through the HiveODBC driver will be able to access the UDF. I seem to recall
that after creating the UDF, you can make it "permanent" by adding it into the
Function Registry.
But it seems that I also need to rebuild the
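For illustration only, the kind of one-line patch people mean (the function
name and class here are hypothetical, and the exact registerUDF signature
varies across Hive versions); it goes into FunctionRegistry's static
initializer, after which Hive has to be rebuilt:

// in ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java
static {
    // ... existing registerUDF calls ...
    registerUDF("my_permanent_udf", MyPermanentUDF.class, false);  // hypothetical
}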
Thanks everyone,
This worked for me: from_unixtime(unix_timestamp(dt, 'yyyyMMdd'),
'yyyy-MM-dd').
-Sonia
On Mon, Jun 25, 2012 at 11:18 PM, VanHuy Pham wrote:
> More specific, you need to use three functions in a row:
> 1) Use unix_timestamp(string date, string pattern) to convert the date
> va
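A quick worked example of the same chain, assuming input strings like
'20120627':

select from_unixtime(unix_timestamp('20120627', 'yyyyMMdd'), 'yyyy-MM-dd');
-- returns '2012-06-27'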
SequenceFile compared to RCFile:
* More widely deployed.
* Available from MapReduce and Pig
* Doesn't compress as small (in RCFile, all of each column's values are put
together)
* Uncompresses and deserializes all of the columns, even if you are only
reading a few
In either case, for long te
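For anyone trying both, a sketch of creating Snappy-compressed tables in each
format (table and column names are made up; the settings are the usual Hadoop
compression knobs):

SET hive.exec.compress.output=true;
SET mapred.output.compression.codec=org.apache.hadoop.io.compress.SnappyCodec;
SET mapred.output.compression.type=BLOCK;  -- block compression for sequence files
CREATE TABLE t_seq (id INT, name STRING) STORED AS SEQUENCEFILE;
CREATE TABLE t_rc  (id INT, name STRING) STORED AS RCFILE;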
This is a session-specific command.
Thanks,
Vinod
http://blog.vinodsingh.com/
On Tue, Jun 26, 2012 at 9:27 PM, Ladda, Anand wrote:
> I am connecting to Hive through a client tool via Hive Server. The
> client tool tries to set a database context by running the USE [database]
> command when it ma
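A session-independent workaround is to qualify table names with the database,
so the query works no matter which database the session is in (database name
hypothetical):

SELECT * FROM mydb.users_info;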
I am connecting to Hive through a client tool via Hive Server. The client tool
tries to set a database context by running the USE [database] command when it
makes a connection. However, when I create another session from the client
without specifying any database context (i.e., implicitly connecting
Thanks! Bejoy. I'll let you know which way we are going.
Thanks,
Chalcy
From: Bejoy Ks [mailto:bejoy...@yahoo.com]
Sent: Tuesday, June 26, 2012 9:22 AM
To: user@hive.apache.org
Subject: Re: hive - snappy and sequence file vs RC file
Hi Chalcy
AFAIK, RC File format is good when your queries deal
Hi,
Check the Hadoop logs of the failed task. My best guess is that there is an
uncaught exception thrown somewhere in your code. The logs will tell where
and what caused the problem.
Best regards,
Jan
On Tue, Jun 26, 2012 at 4:20 PM, Yue Guan wrote:
> Hi, hive users
>
> I have the following u
Hi Hive users,
I have the following UDF:
package com.name.hadoop.hive.udf;

import java.util.Map;
import java.util.Set;

import org.apache.commons.lang.StringUtils;
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

public class MyUDF extends UDF {
    // Generics restored here on the assumption that the archive stripped the
    // <...> as HTML; the java.util.Map import is added to match.
    private Map<String, Set<String>> aMapping;
    private f
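The message is cut off here by the archive. Purely as a hypothetical
continuation consistent with the fields above, a null-safe evaluate() would
avoid the kind of uncaught exception Jan mentions:

public Text evaluate(Text input) {
    // Guard against SQL NULLs and an uninitialized map (both would throw NPEs).
    if (input == null || aMapping == null) {
        return null;
    }
    Set<String> values = aMapping.get(input.toString());
    if (values == null || values.isEmpty()) {
        return null;
    }
    return new Text(StringUtils.join(values, ','));
}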
Hi Soham
Hive supports pretty much all the primitive data types, including INT. For a
detailed list please refer to
https://cwiki.apache.org/confluence/display/Hive/Tutorial#Tutorial-PrimitiveTypes
The only real drawback is that when you have the data type as String you
cannot use it directly
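Presumably the drawback is that string columns need an explicit cast before
numeric use; a small sketch, assuming a column like age was imported as a
string:

SELECT * FROM users_info WHERE CAST(age AS INT) > 30;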
Hi Chalcy
AFAIK, the RC File format is good when your queries deal with some specific
columns rather than the whole data in a row. For general-purpose use, Sequence
File is a better choice. It is also widely adopted, so more tools will have
support for Sequence Files.
Regards
Bejoy KS
___
Hi Hive users,
We are going to use Snappy for compression.
Which is the better file format, Sequence File or RC File? Both are splittable
and therefore will work well for us. RC File performance seems to be better
than Sequence File. Sqoop, it looks like, may support the --as-sequencefile tag
somet
I had a native data type in MySQL, and when I imported it into Hive the data
type of the column became string.
Now I would like to know if there is any native data type in Hive, and what
the pros and cons are of using the string type in Hive rather than the int
type (that's what I expect).
An
Hi Shaik
At first look, since you are using a dynamic partition insert, the partition
column should be the last column of the SELECT used in the INSERT OVERWRITE.
Modify your Insert as
INSERT OVERWRITE TABLE vender_part PARTITION (order_date) SELECT
vender,supplier,quantity,order_date FROM ve
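Putting it together, a hedged sketch (the source table name is a guess, since
the original query is truncated); dynamic partition inserts also need these
session settings:

SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;
INSERT OVERWRITE TABLE vender_part PARTITION (order_date)
SELECT vender, supplier, quantity, order_date FROM vender_text;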
Hi Users,
I created a Hive table with the syntax below:
CREATE EXTERNAL TABLE vender_part(vender string, supplier string,quantity
int ) PARTITIONED BY (order_date string) row format delimited fields
terminated by ',' stored as textfile;
And inserted 100GB of data with the be