Re: how to manage the result set?

2012-04-18 Thread Bhavesh Shah
Hello Andes, I don't know about HBase, but as for your ResultSet: you can traverse it in the usual way: ResultSet rs = stmt.executeQuery("select split(value,',') from table1"); while(rs.next()) { String field1 = rs.getString(1) or rs.getInt(1); and do some operation you w
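
A minimal end-to-end sketch of that loop, assuming the HiveServer1-era JDBC driver (org.apache.hadoop.hive.jdbc.HiveDriver) and a local server; the connection URL is a placeholder to adjust for your setup:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    public class HiveSelectExample {
        public static void main(String[] args) throws Exception {
            Class.forName("org.apache.hadoop.hive.jdbc.HiveDriver");
            Connection con = DriverManager.getConnection("jdbc:hive://localhost:10000/default", "", "");
            Statement stmt = con.createStatement();
            // split() produces an array<string>; the driver typically returns it as one string column
            ResultSet rs = stmt.executeQuery("select split(value, ',') from table1");
            while (rs.next()) {
                String fields = rs.getString(1); // note: getString takes a column index (or name)
                System.out.println(fields);
            }
            rs.close();
            stmt.close();
            con.close();
        }
    }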

Row Group Size of RCFile

2012-04-18 Thread Ladda, Anand
How do I set the Row Group Size of RCFile in Hive? CREATE TABLE OrderFactPartClustRcFile( order_id INT, emp_id INT, order_amt FLOAT, order_cost FLOAT, qty_sold FLOAT, freight FLOAT, gross_dollar_sales FLOAT, ship_date STRING, rush_order STRING, customer_id INT, pymt_type INT,
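
For reference, the RCFile row group size is not part of the table DDL; it is controlled by a configuration setting at write time. A sketch, assuming the hive.io.rcfile.record.buffer.size property (in bytes, 4 MB by default) is what your Hive version uses:

    -- assumption: this property controls the RCFile row group (record buffer) size in bytes
    SET hive.io.rcfile.record.buffer.size=8388608;   -- 8 MB instead of the 4 MB default

    CREATE TABLE OrderFactRcFile (
      order_id INT,
      order_amt FLOAT
    )
    STORED AS RCFILE;

    -- the setting applies to data written after it is set, e.g. via INSERT ... SELECT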

how to manage the result set?

2012-04-18 Thread ylyy-1985
I have managed to query the HBase table through Hive using: ResultSet rs = stmt.executeQuery("select split(value,',') from table1"); but what does the result set look like? And if I want to put this result set back into a Hive table, how do I manage that with Java code? Thanks in advance! 2012-04-19
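
One way to put such a result back into a Hive table without pulling every row through the client is to let Hive materialize it server-side. A sketch, reusing the Statement from the snippet above and a hypothetical target table name (CTAS requires Hive 0.6 or later):

    // Materialize the split result into a new Hive table via the same JDBC Statement.
    stmt.execute("CREATE TABLE table1_split AS "
               + "SELECT split(value, ',') AS fields FROM table1");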

Re: Median in Hive

2012-04-18 Thread sonia gehlot
Great, Thanks Roberto. On Wed, Apr 18, 2012 at 3:42 PM, Roberto Sanabria wrote: > Sure, use this function: > > percentile(BIGINT col, p) > > and set p to be 0.5 > > Cheers, > R > > On Wed, Apr 18, 2012 at 3:32 PM, sonia gehlot wrote: > >> Hi Guys, >> >> Any idea, how to calculate median in Hive?

Lifecycle and Configuration of a hive UDF

2012-04-18 Thread Ranjan Bagchi
Hi, what's the lifecycle of a Hive UDF? If I call select MyUDF(field1,field2) from table; then is MyUDF instantiated once per mapper, and within each mapper is execute(field1, field2) called for each row? I hope this is the case, but I can't find anything about this in the documentation
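
For the classic UDF API, a reasonable mental model is: one UDF instance per map (or reduce) task that evaluates the expression, with the method called once per row. A minimal sketch, assuming the org.apache.hadoop.hive.ql.exec.UDF base class:

    import org.apache.hadoop.hive.ql.exec.UDF;
    import org.apache.hadoop.io.Text;

    // One instance per task; evaluate() is invoked for every input row.
    public final class MyUDF extends UDF {
        public Text evaluate(Text field1, Text field2) {
            if (field1 == null || field2 == null) {
                return null;
            }
            return new Text(field1.toString() + ":" + field2.toString());
        }
    }

It would be registered with ADD JAR plus CREATE TEMPORARY FUNCTION and then called as in the query above; note that the method Hive invokes is evaluate(), not execute().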

Re: Median in Hive

2012-04-18 Thread Roberto Sanabria
Sure, use this function: percentile(BIGINT col, p) and set p to be 0.5 Cheers, R On Wed, Apr 18, 2012 at 3:32 PM, sonia gehlot wrote: > Hi Guys, > > Any idea, how to calculate median in Hive? Do we have any inbuilt function > for it? > > Thanks for any help. > > Sonia >
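
A short example of this suggestion, using a hypothetical table sales with an integer column amount:

    -- median of an integral column via the percentile UDAF
    SELECT percentile(amount, 0.5) AS median_amount
    FROM sales;

    -- for DOUBLE (non-integral) columns, percentile_approx(amount, 0.5) can be used instead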

Re: Can we define external table Fields enclosed in "

2012-04-18 Thread Edward Capriolo
The best approach would be to create an InputFormat and/or SerDe. On Wed, Apr 18, 2012 at 4:37 PM, shashwat shriparv wrote: > Check out this thread too : > > http://mail-archives.apache.org/mod_mbox/hbase-user/201204.mbox/%3ccaaxmexxpho7fr4939aljyse1j2unvqxom3h+zfvagroaqhg...@mail.gmail.com%3E
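
One SerDe-based sketch, assuming the contrib RegexSerDe is available and the four-field, tilde-delimited, quote-enclosed layout shown later in this thread (the jar path and regex are assumptions to adjust):

    ADD JAR /usr/lib/hive/lib/hive-contrib.jar;

    CREATE EXTERNAL TABLE table_stg (ip STRING, id1 STRING, ts STRING, id2 STRING)
    ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.RegexSerDe'
    WITH SERDEPROPERTIES (
      "input.regex" = "\"([^\"]*)\"~\"([^\"]*)\"~\"([^\"]*)\"~\"?([^\"]*)\"?"
    )
    LOCATION 'my_hdfs_location';

    -- the capturing groups become the columns, so the surrounding quotes never reach the table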

Re: Can we define external table Fields enclosed in "

2012-04-18 Thread shashwat shriparv
Check out this thread too : http://mail-archives.apache.org/mod_mbox/hbase-user/201204.mbox/%3ccaaxmexxpho7fr4939aljyse1j2unvqxom3h+zfvagroaqhg...@mail.gmail.com%3E or http://comments.gmane.org/gmane.comp.java.hadoop.hbase.user/25507 On Thu, Apr 19, 2012 at 1:31 AM, Mark Grover wrote: > Gopi,

Re: Can we define external table Fields enclosed in "

2012-04-18 Thread Mark Grover
Gopi, I was thinking something very similar to Tim's suggestion: CREATE EXTERNAL TABLE table_stg(ip STRING, id1 STRING, ts STRING, id2 STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY '~' LOCATION 'my_hdfs_location'; CREATE VIEW my_view(ip, id1, ts, id2) AS SELECT substr(
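
One way the truncated view definition might be completed (illustrative only, since the preview is cut off), assuming every field is wrapped in exactly one pair of double quotes:

    CREATE VIEW my_view (ip, id1, ts, id2) AS
    SELECT
      substr(ip,  2, length(ip)  - 2),
      substr(id1, 2, length(id1) - 2),
      substr(ts,  2, length(ts)  - 2),
      substr(id2, 2, length(id2) - 2)
    FROM table_stg;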

Re: Hive can't find some tables

2012-04-18 Thread Tim Robertson
It sounds like you have run Sqoop without specifying a durable metastore for Hive. E.g. you haven't told Hive to use MySQL, Postgres etc. to store its metadata. It probably used Derby DB, which either put it all in memory or put it all on the /tmp directory, which was destroyed on restart. I wou
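
For reference, a minimal hive-site.xml sketch pointing the metastore at MySQL; host, database, and credentials are placeholders, and the MySQL JDBC driver jar must be on Hive's classpath:

    <property>
      <name>javax.jdo.option.ConnectionURL</name>
      <value>jdbc:mysql://metastore-host:3306/hive_metastore?createDatabaseIfNotExist=true</value>
    </property>
    <property>
      <name>javax.jdo.option.ConnectionDriverName</name>
      <value>com.mysql.jdbc.Driver</value>
    </property>
    <property>
      <name>javax.jdo.option.ConnectionUserName</name>
      <value>hiveuser</value>
    </property>
    <property>
      <name>javax.jdo.option.ConnectionPassword</name>
      <value>hivepass</value>
    </property>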

Re: Can we define external table Fields enclosed in "

2012-04-18 Thread Tim Robertson
Hi again, How about defining a table (t1) with ~ as the delimiter and then creating a view over that table which uses the regexp_replace UDF? CREATE VIEW v_user_access AS SELECT regexp_replace(ip, "\"", "") as ip, ... FROM t1; Not sure of the implications for joins, but basic queries should work ok

Re: parallel inserts ?

2012-04-18 Thread Edward Capriolo
Wow. This thread has some shelf life. Inserts in Hive are not like inserts in most relational databases. "INSERTING" into a table typically involves moving files into a directory. So in this case the bulk of the work is in the SELECT half of the query. Hive knows the source and the destination met

Re: Can we define external table Fields enclosed in "

2012-04-18 Thread Gopi Kodumur
Thanks Tim, sorry for not explaining the problem clearly... I have data in this format, and I want to store it in an external Hive table without the double quotes: "127.0.0.17"~"444c1c9a-8820-11e1-aaa8-00219b8a879e"~"2012-04-17T00:00:01Z"~"476825ea-8820-11e1-a105-0200ac1d1c3d "127.0.0.12"~"544c

Re: Can we define external table Fields enclosed in "

2012-04-18 Thread Tim Robertson
I believe so. From the tutorial [1] : CREATE EXTERNAL TABLE page_view_stg(viewTime INT, userid BIGINT, page_url STRING, referrer_url STRING, ip STRING COMMENT 'IP Address of the User', country STRING COMMENT 'country of origination')

Re: parallel inserts ?

2012-04-18 Thread John B
Thanks. I would like to use the Cloudera demo (VMware) VM to test the actual performance of this. https://ccp.cloudera.com/display/SUPPORT/Cloudera%27s+Hadoop+Demo+VM It only has 2 vcores, it seems. What setup would get the best performance on such a Hive query with possibly a more complicated sel

Re: Re: how to split the value of hbase into hive

2012-04-18 Thread Nitin Pawar
You can check out Hive trunk and do a build; it has an implementation for accessing HBase tables directly through Hive with HBaseStorageHandler. That will solve plenty of other problems which you might face when getting data from HBase into Hive tables. You can search the Hive mailing lists. Vishal had s
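
The usual shape of such a table definition, assuming the HBase table from this thread (table_wyl) with a single column family cf and qualifier val (the family/qualifier names are placeholders):

    CREATE EXTERNAL TABLE hbase_table_wyl (key STRING, value STRING)
    STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
    WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf:val")
    TBLPROPERTIES ("hbase.table.name" = "table_wyl");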

Re: Re: how to split the value of hbase into hive

2012-04-18 Thread ylyy-1985
Hi Nitin and other friends: I see the UDF is suited to my case, but I don't really understand how to use it. Right now, I have put a record into HBase and scanned it: hbase(main):015:0> scan 'table_wyl' ROW COLUMN+CELL r1

Re: Creating Hbase table with pre-splitted regions using Hive QL

2012-04-18 Thread Viral Bajaria
I doubt you can do that directly through the Hive interface right now (at least from what I know). Why don't you create a wrapper around the Hive create table command, i.e. write a simple Java utility which will take your region split arguments, create an HBase table in code with the pre-split criter
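
A minimal sketch of such a utility, assuming the HBase client API of that era (HBaseAdmin.createTable with explicit split keys); the table name, column family, and split points are placeholders that a real tool would take as arguments:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.HColumnDescriptor;
    import org.apache.hadoop.hbase.HTableDescriptor;
    import org.apache.hadoop.hbase.client.HBaseAdmin;
    import org.apache.hadoop.hbase.util.Bytes;

    public class PreSplitTableCreator {
        public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create();
            HBaseAdmin admin = new HBaseAdmin(conf);

            HTableDescriptor desc = new HTableDescriptor("my_presplit_table");
            desc.addFamily(new HColumnDescriptor("cf"));

            // Region boundaries; a real utility would read these from the command line.
            byte[][] splitKeys = new byte[][] {
                Bytes.toBytes("1000"),
                Bytes.toBytes("2000"),
                Bytes.toBytes("3000")
            };
            admin.createTable(desc, splitKeys);
        }
    }

The Hive table can then be layered on top of the pre-split HBase table with the HBaseStorageHandler, as in the other thread above.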

Re: Re: how to split the value of hbase into hive

2012-04-18 Thread ylyy-1985
Thanks Nitin. :) You are so nice. I will try this first; if I meet with problems later, please be kind enough to help. 2012-04-18 Best Regards Andes From: Nitin Pawar Sent: 2012-04-18 15:43 Subject: Re: how to split the value of hbase into hive To: "user" Cc: you can use the inbuilt udf split(string str, st

Re: how to split the value of hbase into hive

2012-04-18 Thread Nitin Pawar
You can use the built-in UDF split(string str, string pat) to split the string on whatever separator you want. It returns an array, and you can access the array elements and insert them into the Hive table. On Wed, Apr 18, 2012 at 1:06 PM, ylyy-1985 wrote: > ** > ** > hi,all; > > I have a problem. I h
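
A short illustration of the idea, assuming the single comma-separated value column from the original question and a hypothetical target table:

    -- split() returns array<string>; index it to pull out individual fields
    SELECT split(value, ',')[0] AS field1,
           split(value, ',')[1] AS field2
    FROM table1;

    -- and to load those fields into another Hive table:
    INSERT OVERWRITE TABLE table1_parsed
    SELECT split(value, ',')[0], split(value, ',')[1]
    FROM table1;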