Re: Issue with creating HIVE metadata for a HBASE table with 2000 + columns

2013-06-10 Thread Nitin Pawar
Can you share your Hive version? On Sat, Jun 8, 2013 at 12:03 AM, Stephen Sprague wrote: > I would venture to say that if you haven't got a reply, nobody particularly has > anything useful to add. > > You have a metadata error there. The error message shows you the table > name. You're going to hav

Re: Issue with creating HIVE metadata for a HBASE table with 2000 + columns

2013-06-10 Thread Nitin Pawar
I just took a look at the error stack and the Hive schema definition. From the error, it looks like you are hitting the char length limit on the Postgres schema table SERDE_PARAMS. If your values for that column in Postgres are more than 4000 chars, then I would recommend you alter your Postgres metadata tab

RE: Issue with creating HIVE metadata for a HBASE table with 2000 + columns

2013-06-10 Thread shouvanik.haldar
Hi, "if your values for the column on postgres are more than 4000 chars then I would recommend you alter your postgres meta data table to have a bigger limit." How to do the above? Actually, while looking at the logs, I found out that /usr/lib/hive/scripts/metastore/upgrade/postgres location,

Sequence file compression in Hive

2013-06-10 Thread Sachin Sudarshana
Hi, I have a table stored as SEQUENCEFILE in Hive 0.10, facts520_normal_seq. Now, I wish to create another table, also stored as a SEQUENCEFILE, but compressed using the Gzip codec. So, I set the compression codec and the type as BLOCK and then executed the following query: SET hive.exec.compres

Re: Issue with creating HIVE metadata for a HBASE table with 2000 + columns

2013-06-10 Thread Rob Roland
You should consult the PostgreSQL documentation: http://www.postgresql.org/docs/9.1/static/sql-altertable.html http://www.postgresql.org/docs/9.1/static/datatype-character.html Essentially, connect to your PostgreSQL instance as a super-user and issue: ALTER TABLE serde_params ALTER COLUMN para
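
A minimal sketch of the statement being suggested, assuming the default Postgres metastore schema where SERDE_PARAMS.PARAM_VALUE is a VARCHAR(4000) column (exact table/column name casing depends on how your schema was created):

  -- Run as a PostgreSQL superuser, connected to the Hive metastore database.
  -- Widening the column to TEXT removes the 4000-character limit that a
  -- 2000+ column HBase mapping overflows.
  ALTER TABLE "SERDE_PARAMS" ALTER COLUMN "PARAM_VALUE" TYPE TEXT;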

Use of virtual columns in joins

2013-06-10 Thread Peter Marron
Hi, I'm using Hive 0.10.0 over Hadoop 1.0.4. I have created a couple of test tables and found that various join queries that refer to virtual columns fail. For example, the query: SELECT * FROM a JOIN b ON b.rownumber = a.number; works, but the following three queries all fail. SELECT *,a.BLOCK
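
For context, a sketch of the kind of join being described, assuming the failing queries add one of Hive's virtual columns (here BLOCK__OFFSET__INSIDE__FILE; a and b are the test tables from the message):

  SELECT *, a.BLOCK__OFFSET__INSIDE__FILE
  FROM a JOIN b ON b.rownumber = a.number;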

RE: Compression in Hive

2013-06-10 Thread Ravi Mummulla (BIG DATA)
Documentation is here: https://cwiki.apache.org/confluence/display/Hive/CompressedStorage. The performance overhead is trivial for larger amounts of data but may be magnified as the data size gets smaller. Typically the gains come from data transfers between nodes and from disk reads/writes. Again, the larger
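
A small sketch of the usual settings involved (Hive 0.10-era property names; treat the exact values as assumptions to check against the wiki page above):

  SET hive.exec.compress.output=true;          -- compress final job output
  SET hive.exec.compress.intermediate=true;    -- compress data moved between stages
  SET mapred.output.compression.codec=org.apache.hadoop.io.compress.GzipCodec;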

Create table like with partitions

2013-06-10 Thread Peter Marron
Hi, Using Hive 0.10.0 over Hadoop 1.0.4, I have a (non-partitioned) table with loads of columns. I would like to create a partitioned table with the same set of columns. So the approach that I have been taking is to use "CREATE TABLE copy LIKE original;" and then use ALTER TABLE to change the l

Re: Create table like with partitions

2013-06-10 Thread Nitin Pawar
If a table is not partitioned and you want to partition it on data that has already been written but is not laid out in partition format, that is not doable directly. The best approach would be: create a new table definition with the partition columns you want, then turn on the dynamic partitioning system before you

Re: Create table like with partitions

2013-06-10 Thread Richa Sharma
Hi, Can you please point me to documentation on dynamic partitioning? I don't fully understand the meaning of the values for these parameters. Regards, Richa On Mon, Jun 10, 2013 at 7:08 PM, Nitin Pawar wrote: > If a table is not partitioned and then you want to partition the table on > the data already

Re: Create table like with partitions

2013-06-10 Thread Owen O'Malley
You need to create the partitioned table and then copy the rows into it. create table foo_staging (int x, int y); create table foo(int x) partitioned by (int y) clustered by (x) into 16 buckets; set hive.exec.dynamic.partition=true; set hive.exec.dynamic.partition.mode=nonstrict; set hive.enforc
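
Filled out into a runnable sketch of the same recipe (note that Hive DDL puts the column name before the type, e.g. x INT; the truncated "hive.enforc" setting is presumably hive.enforce.bucketing, since the target table is bucketed):

  CREATE TABLE foo_staging (x INT, y INT);
  CREATE TABLE foo (x INT)
    PARTITIONED BY (y INT)
    CLUSTERED BY (x) INTO 16 BUCKETS;

  SET hive.exec.dynamic.partition=true;
  SET hive.exec.dynamic.partition.mode=nonstrict;
  SET hive.enforce.bucketing=true;

  -- The dynamic partition column goes last in the SELECT list.
  INSERT OVERWRITE TABLE foo PARTITION (y)
  SELECT x, y FROM foo_staging;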

GROUP BY Issue

2013-06-10 Thread Gourav Sengupta
Hi, On running the following query I am getting multiple records with the same value of F1: SELECT F1, COUNT(*) FROM ( SELECT F1, F2, COUNT(*) FROM TABLE1 GROUP BY F1, F2 ) a GROUP BY F1; As far as I understand, the number of duplicate records corresponds to the number of reducers. Replicating the test

Re: Use of virtual columns in joins

2013-06-10 Thread Ashutosh Chauhan
You might be hitting https://issues.apache.org/jira/browse/HIVE-4033, in which case it's recommended that you upgrade to 0.11, where this bug is fixed. On Mon, Jun 10, 2013 at 1:57 AM, Peter Marron < peter.mar...@trilliumsoftware.com> wrote: > Hi, > > I’m using hive 0.10.0 ove

Re: Sequence file compression in Hive

2013-06-10 Thread Stephen Sprague
On Mon, Jun 10, 2013 at 12:48 AM, Sachin Sudarshana wrote: What does the header of the first sequence file look like? $ dfs -cat /user/hive/warehouse/facts_520.db/test3facts520_gzip_seq/00_0 | head

Re: Sequence file compression in Hive

2013-06-10 Thread Alexander Pivovarov
Sachin, it works SET hive.exec.compress.output=true; SET mapred.output.compression.codec=org.apache.hadoop.io.compress.GzipCodec; SET mapred.output.compression.type=BLOCK; create table data1_seq STORED AS SEQUENCEFILE as select * from date1; hadoop fs -cat /user/hive/warehouse/data1_seq/00_0
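
Put together, a sketch of the full session from the thread (table names as in the message; the exact output file name under the warehouse directory will vary):

  SET hive.exec.compress.output=true;
  SET mapred.output.compression.codec=org.apache.hadoop.io.compress.GzipCodec;
  SET mapred.output.compression.type=BLOCK;

  CREATE TABLE data1_seq STORED AS SEQUENCEFILE AS SELECT * FROM date1;

  -- Inspect the SequenceFile header; it should name the Gzip codec:
  -- hadoop fs -cat /user/hive/warehouse/data1_seq/<output-file> | head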

Re: Update statement on Hive

2013-06-10 Thread Renata Ghisloti Duarte de Souza
I got it. Makes total sense! Thank you for your explanation. On Fri, 2013-05-31 at 20:04, Sanjay Subramanian wrote: > Hi > Hive reads and writes to HDFS, and by definition HDFS is write-once and > immutable after that. > So unlike an RDBMS, there is no concept of updating rows. > However if u
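
Since rows cannot be updated in place, the common workaround (a sketch of the general pattern, not necessarily what Sanjay went on to describe; table and column names are hypothetical) is to rewrite the table or partition with INSERT OVERWRITE:

  -- "Update" one row by rewriting the whole table with the changed value.
  INSERT OVERWRITE TABLE users
  SELECT id,
         CASE WHEN id = 42 THEN 'new_name' ELSE name END AS name
  FROM users;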

parse_url returning NULL

2013-06-10 Thread Mohammad Tariq
Hello list, I have a file stored in my HDFS which contains some URLs. The file looks like this: abc.in xyz.net http://tariq.com http://tariq.in/sompath I'm trying to get the hostnames from these URLs using parse_url. It works fine except for the URLs which do not contain any scheme. S

Re: parse_url returning NULL

2013-06-10 Thread Edward Capriolo
It is not a valid URL if it does not have a scheme, and it cannot be parsed. SELECT if (column like 'http%', column, concat( 'http://', column) ) as column might do what you need. On Mon, Jun 10, 2013 at 5:59 PM, Mohammad Tariq wrote: > Hello list, > > I have a file stored in my HDFS whi
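
Combined with parse_url, the suggestion would look roughly like this (the urls table and url column are placeholders for whatever the file is mapped to):

  SELECT parse_url(IF(url LIKE 'http%', url, CONCAT('http://', url)), 'HOST') AS hostname
  FROM urls;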

Re: parse_url returning NULL

2013-06-10 Thread Mohammad Tariq
Hello Edward, Thank you so much for the quick response. I'll try it out. But I would like to know, is it something Hive-specific? Links do work without a scheme, like hive.apache.org. Thanks again. Warm Regards, Tariq cloudfront.blogspot.com On Tue, Jun 11, 2013 at 3:40 AM, Edward Capri