Re: Is there a metastore schema script for postgresql for Hive version 0.9.0

2012-05-11 Thread Jagat
Maybe you can look at RazorSQL to convert schemas. --- Sent from mobile, short and crisp. On 12-May-2012 11:58 AM, "Xiaobo Gu" wrote: > I can't find it in the release package. > > -- > Xiaobo Gu >

How to create index on HBase?

2012-05-11 Thread ransom.hezhiqiang
Hi all, I failed to create an index on HBase. This is my SQL. How do I create an index on HBase? create index i_hhive on hhive(c1,c2) as "compact" with deferred rebuild STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf1:val,cf1:val2,cf1:va

Re: changing field delimiter for an existing table?

2012-05-11 Thread Igor Tatarinov
That's exactly what I was looking for! Thanks! On Fri, May 11, 2012 at 5:34 PM, David Kulp wrote: > You're right. I assumed there was a corresponding ALTER TABLE foo SET ROW > FORMAT ... > But I found the answer in the archives. Modify the SERDE properties, e.g. > SET SERDEPROPERTIES ('field.d

RE: how to select without Mapreduce after index build?

2012-05-11 Thread ransom.hezhiqiang
Thanks Shrikanth. In my test, I have 120MB+ of text data with 4 columns. I built a compact index on 2 columns. The index size is 340MB+. In the first step of the query, it also scans all the index data. So I think I should choose the right columns to index, and the index will be much smaller, is that right

Re: how to select without Mapreduce after index build?

2012-05-11 Thread shrikanth shankar
My understanding is that the scan of the index is used to remove splits that are known not to contain matching data. If you remove enough splits, the second MR task will run much faster. The index should also be much smaller than the base table, and that MR task should be much cheaper. Shrikanth O

RE: how to select without Mapreduce after index build?

2012-05-11 Thread ransom.hezhiqiang
Thanks Ashish. The query will be split into three steps after the index is built: 1. Query the index table and get the offsets. 2. Move the result. 3. Get the select result by offset. So I think the query will be slower than without an index, because it has more steps and runs two MapReduce tasks.

Re: changing field delimiter for an existing table?

2012-05-11 Thread David Kulp
You're right. I assumed there was a corresponding ALTER TABLE foo SET ROW FORMAT ... But I found the answer in the archives. Modify the SERDE properties, e.g. SET SERDEPROPERTIES ('field.delim' = '|'); http://osdir.com/ml/hive-user-hadoop-apache/2009-12/msg00109.html On May 11, 2012,
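Written out in full, the fix mentioned above might look like the sketch below. `foo` is the placeholder table name from the message and `'|'` is the example delimiter. The `DESCRIBE EXTENDED` line is only a suggested way to check the result and is not from the thread.

```sql
-- Sketch only: 'foo' is a placeholder table name, '|' an example delimiter.
-- This changes table metadata, i.e. how Hive parses the existing files;
-- the data files themselves are not rewritten.
ALTER TABLE foo SET SERDEPROPERTIES ('field.delim' = '|');

-- Suggested check (not from the thread): inspect the table's SerDe parameters.
DESCRIBE EXTENDED foo;
```

Because only metadata changes, the files under the table location must already use the new delimiter for queries to return sensible rows.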

Re: changing field delimiter for an existing table?

2012-05-11 Thread Igor Tatarinov
Thanks but that requires fixing the table schema. Actually, I haven't found a way to change the delimiters of an existing table (created with a LIKE statement). I did find a workaround. While I don't know the schema of the data, I do know the number of columns, so I am going to create a table with

Re: changing field delimiter for an existing table?

2012-05-11 Thread David Kulp
Here is the default textfile. Substitute delimiters as necessary. CREATE TABLE ... ROW FORMAT DELIMITED FIELDS TERMINATED BY '\001' COLLECTION ITEMS TERMINATED BY '\002' MAP KEYS TERMINATED BY '\003' LINES TERMINATED BY '\n' STORED AS TEXTFILE; On May 11, 2012, at 5:58 PM, Igor Tatarinov wrot

changing field delimiter for an existing table?

2012-05-11 Thread Igor Tatarinov
Is that possible? What I am trying to do is create an S3 table using CTAS. Since CTAS doesn't allow specifying a location, I have to create a managed table first: CREATE TABLE T AS SELECT ...; (I don't want to fix T's schema because the list of selected expressions is dynamically generated and c

Re: Hive equivalent of group_concat() ?

2012-05-11 Thread Edward Capriolo
The main issue with group_concat is that aggregates have to keep each column in memory, and that is a big problem. If the user knows the list will be small, you could write a UDAF like collect_set or collect, which puts each value into a list, and then you can lateral view that list. Edward On Fri, May
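A rough sketch of the approach described above, using the built-in `collect_set` rather than a custom UDAF. The table `t` and its columns `grp` and `val` are hypothetical, and the query assumes a Hive version whose `concat_ws` accepts an array of strings (believed to be the case from 0.9.0).

```sql
-- Hypothetical table: t(grp STRING, val STRING).
-- collect_set gathers the distinct values of val per group into an array
-- (duplicates are dropped, and element order is not guaranteed).
-- concat_ws then joins the array elements with the given separator.
SELECT grp,
       concat_ws(',', collect_set(val)) AS vals
FROM t
GROUP BY grp;
```

The memory caveat from the message still applies: each group's values are held in memory, so this only suits groups that are known to be small.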

Hive equivalent of group_concat() ?

2012-05-11 Thread Saurabh S
As far as I understand, there is no equivalent of MySQL group_concat() in Hive. This stackoverflow question is from Sept 2010: http://stackoverflow.com/questions/3703740/combine-multiple-rows-into-one-space-separated-string Does anyone know any other method to create a delimited list from

Re: how to select without Mapreduce after index build?

2012-05-11 Thread Ashish Thusoo
Indexing in Hive works through map/reduce. There are no active components in Hive (such as the region servers in HBase), so basically the way the index is used is by running the map/reduce job on the table that holds the index data to get all the relevant offsets into the main table and then using
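For orientation, the flow described above could be set up roughly as follows. This is a sketch under assumptions: the table `logs`, the column `user_id`, and the index name are hypothetical, and the `hive.optimize.index.filter` setting is assumed to be available in the Hive version in use.

```sql
-- Hypothetical table 'logs' with a column 'user_id'.
CREATE INDEX idx_logs_user ON TABLE logs (user_id)
AS 'COMPACT' WITH DEFERRED REBUILD;

-- Populating the index table is itself a map/reduce job.
ALTER INDEX idx_logs_user ON logs REBUILD;

-- Assumption: this setting asks the optimizer to use the index
-- to filter the input of matching queries automatically.
SET hive.optimize.index.filter=true;

-- The query first scans the index table for matching offsets,
-- then reads only the relevant parts of the base table.
SELECT * FROM logs WHERE user_id = 'u42';
```

Both phases still run as map/reduce, so the index pays off mainly when it prunes a large share of the base table's input.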

Hex and UnHex UDFs

2012-05-11 Thread Tucker, Matt
I'm trying to convert a base-16 string to a base-10 bigint, but I'm getting strange results using unhex(). When I convert 2819892256088064694 with hex(), I get "27224511050102B6". Converting it back using unhex(), I get strange output ('"E). Casting the output to bigint does not resolve the i
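A possible explanation, offered as a sketch rather than a confirmed diagnosis: `unhex()` is the inverse of `hex()` for strings, so it decodes each pair of hex digits into a character instead of parsing a number, which would produce output like the one quoted. For base conversion, the built-in `conv()` may be closer to what is wanted. The table name `t` below is hypothetical and is there only because older Hive versions require a FROM clause.

```sql
-- conv(num, from_base, to_base) converts a number given as a string
-- between bases and returns the result as a string.
-- 't' is a hypothetical table with at least one row.
SELECT cast(conv('27224511050102B6', 16, 10) AS BIGINT)
FROM t
LIMIT 1;
```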

Re: The Confused question of hive udf

2012-05-11 Thread Mark Grover
Hi, There isn't much information here for us to say much. Could you share your code so we can take a look at what could be wrong? Possibly, issue a query like: select createtime, minf(createtime) as minf from table limit 500; so you can see if there is anything out of the ordinary with the parame

Is it possible to execute Hive queries parallelly by writing mapper and reducer

2012-05-11 Thread Bhavesh Shah
Hello all, I am asking about increasing the performance of Hive. I tried with mappers and reducers but I didn't see a difference in execution. I don't know why; maybe I did it in some way that is not correct, or it is due to some other reason. I am wondering: is it possible to execute Hiv
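The message is cut off, but if the goal is to let independent stages of a single query run concurrently, Hive has session settings for that, and custom mappers or reducers are not needed. The sketch below is an assumption about what the poster needs, and the thread count is only illustrative.

```sql
-- Allow independent stages of a single query to run at the same time.
SET hive.exec.parallel=true;

-- Upper bound on how many stages may run concurrently (illustrative value).
SET hive.exec.parallel.thread.number=8;
```

These settings help only when a query plan contains stages that do not depend on each other. Running separate queries in parallel is instead a matter of submitting them from separate sessions or connections.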

Re: hive failed execution error return code 2 from org.apache.hadoop.hive.ql.exec.mapredtask

2012-05-11 Thread Vinod Singh
At times Hive error messages can be misleading. I face a similar error message while running a query by embedding Hive in my application, though the actual error in my case, which is not propagated properly, is: FAILED: Error in semantic analysis: Line 3:2 Invalid function abc See if you are doin

Re: hive failed execution error return code 2 from org.apache.hadoop.hive.ql.exec.mapredtask

2012-05-11 Thread Bhavesh Shah
One more problem I am facing: my program executes well, and all queries execute one after another and give results. But there is one query at which the program always gives me this error: java.sql.SQLException: Query returned non-zero code: 9, cause: FAILED: Execution Error, return

The Confused question of hive udf

2012-05-11 Thread 王锋
Hi, we have a UDF called minf which maps the current time to a time point. For example: 20120510:00:00:00 -> minf 2012051, 20120510:00:06:00 -> minf 20120510001, 20120510:10:51:38 -> minf 20120510130. We test the minf function