Re: How to execute query with timestamp type (Hbase/Hive integeration)

2012-05-30 Thread Peyton Peng
Hi Mark, thanks for your response. I tried other data types, and the issue seems to occur only when querying the timestamp field; I am not sure how the timestamp mapping works... Reading the data from HBase, the value of the timestamp (event_time) is: Wed May 30 16:15:06 CST 2012, should I should

Re: Help with TimeSeries serde

2012-05-30 Thread Russell Jurney
Thanks, I've got it working. I am making a chart UDF, to convert BigInts from count(6) to ** I am trying to find a way to select TimeSeries(count(*)) and group by day, but this results in an error: select to_date(dt) as total, TimeSeries(CAST(count(*) AS INT)) as stars, count(*

Re: Help with TimeSeries serde

2012-05-30 Thread Aniket Mokashi
If this is a UDF, you will need hive-exec.jar to compile it. I am not sure what the use of this UDF is. SerDe has the following interface: public interface SerDe extends Deserializer, Serializer ~Aniket On Wed, May 30, 2012 at 9:51 PM, Russell Jurney wrote: > I tried to make a simple Serde that con

Re: How to execute query with timestamp type (Hbase/Hive integeration)

2012-05-30 Thread Mark Grover
Hi Peyton, It seems like something to do with timestamp mapping. What happens if you change your Hive table definition to have the event_time as int or string? Mark - Original Message - From: "Peyton Peng" To: user@hive.apache.org Sent: Wednesday, May 30, 2012 5:54:20 AM Subject: Re: Ho

Hive job tuning

2012-05-30 Thread Ranjith
I have been looking at a job that was performing badly. Noticed there were several splits occurring due to the buffer record limit being reached. I get that the io.sort.mb provides the data and record buffer for the mapper task. Given that the mapper jvm starts up with 500mb and the buffer is 30

Re: HIVE and S3 via EMR?

2012-05-30 Thread Russell Jurney
Thanks, I uploaded Hive 0.9.0. Russell Jurney twitter.com/rjurney russell.jur...@gmail.com datasyndrome.com On May 30, 2012, at 1:22 PM, Mark Grover wrote: > Good catch, Pedro! > > Russell: Not sure how you can be using Hive 0.9 on EMR since EMR only > supports up to Hive 0.7.1. > > Check this

Hive inconsistently interpreting 'where' and 'group by'

2012-05-30 Thread Adam Laiacano
Hi all, I have an activity log stored in an external Hive table, LZO-compressed, and partitioned by 'dt' which is the date that the data was recorded. Because of time zones and when we dump the data into HDFS, there are about 22 hours of one day and 2 of the following in each partition. In the

Re: Hive 'rest' column

2012-05-30 Thread shrikanth shankar
I believe the default LazySerDe takes a parameter called 'serialization.last.column.takes.rest'. Setting this to true might solve your issue (restoMsg would become a string then and you might have to parse it in the query into an array) thanks, Shrikanth On May 30, 2012, at 9:27 AM, wrote:

Re: HIVE and S3 via EMR?

2012-05-30 Thread Mark Grover
Good catch, Pedro! Russell: Not sure how you can be using Hive 0.9 on EMR since EMR only supports up to Hive 0.7.1. Check this for details: http://aws.amazon.com/elasticmapreduce/faqs/#hive-9 Mark - Original Message - From: "Russell Jurney" To: user@hive.apache.org Sent: Wednesday, May

Re: HIVE and S3 via EMR?

2012-05-30 Thread Russell Jurney
You = Excellent Russell Jurney twitter.com/rjurney russell.jur...@gmail.com datasyndrome.com On May 29, 2012, at 11:06 PM, Pedro Figueiredo wrote: On 30 May 2012, at 02:17, Russell Jurney wrote: I've made the bucket - which is derived from the enron emails - available at s3:///rjurney_public_

Re: FW: Hive 'rest' column

2012-05-30 Thread Gireesh Subramanya
Ramon, if all the data is in one line, then you would need to preprocess the data, but from your explanation below it sounds like the lines are terminated by a newline character after the | ? Thanks, Gireesh vivarasystems.com On Wed, May 30, 2012 at 9:27 AM, wrote: > Hi, > > > >I’m trying to d

FW: Hive 'rest' column

2012-05-30 Thread ramon.pin
Hi, I'm trying to define a table over an external file. My file has 12 fixed columns followed by a varying amount of columns that depends on some of the fixed ones. I tried to define the table as: CREATE EXTERNAL TABLE IF NOT EXISTS log_array ( dt string, txOperOpciResto string,

Re: Hive UDF error : numberformat exception (String to Integer) conversion

2012-05-30 Thread Edward Capriolo
Again, I suggest trapping exceptions with try/catch and returning null. If you initialize a logger with log4j or commons logging, you can log the event and find the failure information by clicking through the job tracker web interface to drill down to the error. On Wed, May 30, 2012 at 12:14 PM, pravee

Re: Hive UDF error : numberformat exception (String to Integer) conversion

2012-05-30 Thread praveenesh kumar
I have done both things. There is no null issue here; I checked for nulls as well. Sorry, that was not mentioned in the code. I also wrote a main function and called my evaluate function. If I pass a string, it works fine. The problem is the NumberFormatException. Integer.parseInt is throwing this..

Re: Hive UDF error : numberformat exception (String to Integer) conversion

2012-05-30 Thread Nitin Pawar
I won't tell you the error, but I would recommend writing a main function in your UDF and trying it with the sample inputs you expect in your query. You will then see what mistake you are making. On Wed, May 30, 2012 at 8:14 PM, Edward Capriolo wrote: > You should to try catch and return NULL o

Re: Hive UDF error : numberformat exception (String to Integer) conversion

2012-05-30 Thread Edward Capriolo
You should use try/catch and return NULL on bad data. The issue is that if you have a single bad row, the UDF will throw an exception up the chain. It will try again, it will fail again, and ultimately the job will fail. On Wed, May 30, 2012 at 10:40 AM, praveenesh kumar wrote: > Hello Hive Users, > > There
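The advice above (trap the exception and return NULL for a bad row) can be sketched roughly as follows. This is only an illustration under stated assumptions: the original UDF code is not shown in full here, so the class name `IpToLongSketch` and the validation rules are hypothetical. It is written as plain Java so that it compiles without hive-exec.jar; in an actual Hive UDF the `evaluate` method would presumably sit in a class extending `org.apache.hadoop.hive.ql.exec.UDF`.

```java
// Hypothetical sketch, not the poster's code: converts a dotted IPv4 string
// to a long and returns null instead of throwing on malformed input.
public class IpToLongSketch {

    /** Returns the IPv4 address as a long, or null if the input is malformed. */
    public static Long evaluate(String ip) {
        if (ip == null) {
            return null;
        }
        String[] parts = ip.trim().split("\\.");
        if (parts.length != 4) {
            return null; // not four dot-separated fields
        }
        try {
            long result = 0;
            for (String part : parts) {
                int octet = Integer.parseInt(part); // may throw NumberFormatException
                if (octet < 0 || octet > 255) {
                    return null; // out of range for an IPv4 octet
                }
                result = (result << 8) | octet;
            }
            return result;
        } catch (NumberFormatException e) {
            // A single bad row yields NULL rather than failing the whole job.
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(evaluate("192.168.1.10"));
        System.out.println(evaluate("not-an-ip"));
    }
}
```

Returning null keeps one malformed value from being retried and eventually failing the job; logging inside the catch block, as suggested elsewhere in this thread, would make the bad rows easier to find.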

Hive UDF error : numberformat exception (String to Integer) conversion

2012-05-30 Thread praveenesh kumar
Hello Hive Users, I am facing a strange situation. I have a string column in my Hive table (it's an IP address). I am creating a UDF that takes this string column and converts it into a Long value. It's a simple UDF. Following is my code: package com.practice.hive.udf; public class

Re: How to execute query with timestamp type (Hbase/Hive integeration)

2012-05-30 Thread Peyton Peng
Actually, I can execute the first SQL statement and it works well; all the libs you specified are under the Hive lib folder. I suspect the issue is caused by the timestamp mapping between HBase and Hive.. Regards, Peyton From: shashwat shriparv Sent: Wednesday, May 30, 2012 5:26 PM To: user@hive.apach

Re: How to execute query with timestamp type (Hbase/Hive integeration)

2012-05-30 Thread shashwat shriparv
Add these files to the hive lib folder >>> hadoop-0.20-core.jar >>> hive/lib/hive-exec-0.7.1.jar >>> hive/lib/hive-jdbc-0.7.1.jar >>> hive/lib/hive-metastore-0.7.1.jar >>> hive/lib/hive-service-0.7.1.jar >>> hive/lib/libfb303.jar >>> lib/commons-logging-1.0.4.jar >>> slf4j-api-1.6.1.jar >>> slf4j-log4j

Hadoop BI Usergroup Stuttgart (Germany)

2012-05-30 Thread alo alt
For our German-speaking folks: we want to start a Hadoop-BI Usergroup in Stuttgart (Germany). If you are interested, please visit our LinkedIn Group (http://www.linkedin.com/groups/Hadoop-Germany-4325443) and our Doodle poll (http://www.doodle.com/aqwsg4snbwimrsfc). If we figure out a real int

How to execute query with timestamp type (Hbase/Hive integeration)

2012-05-30 Thread Peyton Peng
Hi, I built a Hive table mapped to an HBase table: CREATE TABLE http_access(key string, client_ip string, client_port int, request_method string, event_time timestamp) STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' WITH SERDEPROPERTIES ( "hbase.columns.mapping" = ":key,client:ip