Re: Bucketing external tables

2013-04-16 Thread Bejoy KS
considered these, then can you please share your CLI logs here so that we can help you better.   Regards, Bejoy KS From: Sadananda Hegde To: user@hive.apache.org Sent: Thursday, April 11, 2013 11:16 PM Subject: Re: Bucketing external tables I was able to load data

Re: pblm in joining two tables in hive.....

2012-11-23 Thread Bejoy KS
Hi Do a Left outer join A to B and do a null check on B's columns. SELECT A.* FROM A LEFT OUTER JOIN B ON (...) WHERE B.clmn IS NULL. Regards Bejoy KS Sent from handheld, please excuse typos. -Original Message- From: Manjinder Singh01 Date: Fri, 23 Nov 2012 09:03:06 To:
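The anti-join pattern in this reply can be written out in full. A minimal sketch, assuming hypothetical tables A and B that share an id column:

```sql
-- Rows of A that have NO matching row in B (anti-join via left outer join).
-- A, B, and the id column are placeholders for the poster's actual tables.
SELECT A.*
FROM A
LEFT OUTER JOIN B ON (A.id = B.id)
WHERE B.id IS NULL;  -- keeps only the rows where the join found no match
```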

Re: Multiuser setup on Hive

2012-11-21 Thread Bejoy KS
Hi Austin In hive currently you can have permissions only on the hdfs layer, not on the metastore. The current hive metastore doesn't have multiuser permission support. Any user will be able to drop the metadata information now. Regards Bejoy KS Sent from handheld, please excuse

Re: populating xml data in hive

2012-11-20 Thread Bejoy KS
You can use your custom mapreduce code. Just check the record type and if xml then preprocess to avoid new lines. Regards Bejoy KS Sent from handheld, please excuse typos. -Original Message- From: iwannaplay games Date: Tue, 20 Nov 2012 14:29:18 To: Reply-To: user@hive.apache.org

Re: How does hive decide to launch how many map tasks?

2012-11-19 Thread Bejoy KS
; You can use mapred.max.split.size to control the split size while using CombineFileInputFormat. Regards Bejoy KS Sent from handheld, please excuse typos. -Original Message- From: Cheng Su Date: Mon, 19 Nov 2012 15:01:03 To: ; Reply-To: user@hive.apache.org Subject: Re: How does hive
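The split-size knobs mentioned here are typically set per session before running the query. A hedged sketch; the byte values are illustrative, not recommendations:

```sql
-- Larger max split size => fewer, bigger splits => fewer map tasks.
SET mapred.max.split.size=268435456;   -- 256 MB, illustrative value
SET mapred.min.split.size=134217728;   -- 128 MB, illustrative value
-- CombineHiveInputFormat packs many small files into one split:
SET hive.input.format=org.apache.hadoop.hive.ql.io.CombineHiveInputFormat;
```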

Re: Why hadoop job pending for so long?

2012-11-16 Thread Bejoy KS
Hi Chen Hope you ensured that you have enough free slots in your queue/pool if you are using fair/capacity scheduler in your cluster. Some times the job initialization would take some time if there are larger number of partitions and lots of small input files in them. Regards Bejoy KS Sent

Re: How does hive decide to launch how many map tasks?

2012-11-16 Thread Bejoy KS
the split sizes. Mapred.min.split.size Mapred.max.split.size Regards Bejoy KS Sent from handheld, please excuse typos. -Original Message- From: Cheng Su Date: Fri, 16 Nov 2012 14:39:57 To: Reply-To: user@hive.apache.org Subject: How does hive decide to launch how many map tasks? Hi

Re: Can I merge files after I loaded them into hive?

2012-11-15 Thread Bejoy KS
Hi Chen You can do it in hive as well. Enable hive merge and Insert OverWrite the Partition once agin with Select *. Hive.merge.mapfiles=true. Regards Bejoy KS Sent from handheld, please excuse typos. -Original Message- From: "Bejoy KS" Date: Thu, 15 Nov 2012 08:10:12 To:
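A sketch of the merge-and-rewrite step described above, assuming a hypothetical partitioned table `sales` with a `dt` partition column:

```sql
SET hive.merge.mapfiles=true;     -- merge small output files of map-only jobs
SET hive.merge.mapredfiles=true;  -- same for full map-reduce jobs (optional)
-- Rewrite one partition onto itself so the small files get merged:
INSERT OVERWRITE TABLE sales PARTITION (dt='2012-11-15')
SELECT id, amount FROM sales WHERE dt='2012-11-15';
```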

Re: Can I merge files after I loaded them into hive?

2012-11-15 Thread Bejoy KS
Identity mapper with split size set to the required large file size. Regards Bejoy KS Sent from handheld, please excuse typos. -Original Message- From: Cheng Su Date: Thu, 15 Nov 2012 16:03:44 To: Reply-To: user@hive.apache.org Subject: Can I merge files after I loaded them into hive? Hi

Re: Hive Join with distinct rows

2012-11-09 Thread Bejoy KS
Hi Praveen Have you tried applying DISTINCT without the brackets around T1.ID select distinct T1.ID, T1.url, T1.timestamp from T1 LEFT OUTER JOIN T2 on T1.ID = T2.ID WHERE T2.ID IS NULL Regards Bejoy KS Sent from handheld, please excuse typos. -Original Message- From: Praveen Kumar K J V

Re: Hive compression with external table

2012-11-06 Thread Bejoy KS
Hi Krishna Sequence Files + Snappy compressed would be my recommendation as well. It can be processed by managed as well as external tables. There is no difference in storage formats for managed and external tables. Also this can be consumed by mapred or pig directly. Regards Bejoy KS

Re: FAILED: Hive Internal Error

2012-10-27 Thread Bejoy KS
any other process running on the same. Or are you able to start your Name node if you change your port to say 9000 in core-site.xml? Regards Bejoy KS Sent from handheld, please excuse typos. -Original Message- From: sagar nikam Date: Sat, 27 Oct 2012 18:04:25 To: Reply-To: user

Re: Executing queries after setting hive.exec.parallel in hive-site.xml

2012-10-25 Thread Bejoy KS
jobs will be executed sequentially and this parameter won't have any effect. Regards Bejoy KS Sent from handheld, please excuse typos. -Original Message- From: Chunky Gupta Date: Thu, 25 Oct 2012 16:44:43 To: Reply-To: user@hive.apache.org Subject: Executing queries aft

Re: Query

2012-10-22 Thread Bejoy KS
Hi Venugopal If you like to have your column names along with result set you need to set 'hive.cli.print.header' to true. SET hive.cli.print.header=true; It works well over CLI, I have not tried it over jdbc. Regards Bejoy KS Sent from handheld, please excuse typos. -Origin

Re: How to run multiple Hive queries in parallel

2012-10-22 Thread Bejoy KS
. It is primarily used to control the number of slots used by each user/pool in a cluster. You can read more @ http://hadoop.apache.org/docs/mapreduce/r0.20.2/fair_scheduler.html Regards Bejoy KS Sent from handheld, please excuse typos. -Original Message- From: Chunky Gupta Date: Mon

Re: How to run multiple Hive queries in parallel

2012-10-22 Thread Bejoy KS
user gets his fair share of task slots. Regards Bejoy KS Sent from handheld, please excuse typos. -Original Message- From: Chunky Gupta Date: Mon, 22 Oct 2012 17:27:45 To: Reply-To: user@hive.apache.org Subject: How to run multiple Hive queries in parallel Hi, I have one name node

Re: Implementing a star schema (facts & dimension model)

2012-10-22 Thread Bejoy KS
denormalization much I guess. Joins work well in hive. Regards Bejoy KS Sent from handheld, please excuse typos. -Original Message- From: Austin Chungath Date: Mon, 22 Oct 2012 16:47:04 To: Reply-To: user@hive.apache.org Subject: Implementing a star schema (facts & dimension model) Hi

Re: need help to write a query that does same as BETWEEN operator

2012-10-12 Thread Bejoy KS
Hi Praveen If Between is not supported in your hive version, you can replace Between using < and > . Like SELECT * FROM account a, timezone g WHERE a.create_date >= g.start_date AND a.create_date <= g.end_date ; Regards Bejoy KS Sent from handheld, please excuse typos. -

Re: Need Help in Hive storage format

2012-10-11 Thread Bejoy KS
Hi Yogesh. It should be a simple delimited file with ^A character as the field delimiter. Regards Bejoy KS Sent from handheld, please excuse typos. -Original Message- From: yogesh dhari Date: Thu, 11 Oct 2012 23:18:35 To: hive request Reply-To: user@hive.apache.org Subject: Need

Re: Compression of Intermediate Data

2012-10-06 Thread Bejoy KS
codec=org.apache.hadoop.io.compress.SnappyCodec; The property 'hive.exec.compress.intermediate' is used to enable compression of data in between multiple mapreduce jobs generated by a hive query. Regards Bejoy KS Sent from handheld, please excuse typos. -Original Message- From: Hadi Moshayedi D

Re: hive query fail

2012-10-03 Thread Bejoy KS
Hi Ajit Oracle as a metastore has popped up a few issues for us in the past. The most widely used metastore db is MySql which is pretty good. Regards Bejoy KS Sent from handheld, please excuse typos. -Original Message- From: Ajit Kumar Shreevastava Date: Thu, 4 Oct 2012 11:39:50

Re: org.apache.hadoop.ipc.RemoteException(java.io.IOException: java.io.IOException

2012-10-03 Thread Bejoy KS
nto effect.   Regards, Bejoy KS From: Raihan Jamal To: user@hive.apache.org Sent: Thursday, October 4, 2012 5:24 AM Subject: Re: org.apache.hadoop.ipc.RemoteException(java.io.IOException: java.io.IOException Just to add here SojTimestampToDate will return da

Re: File Path and Partition names

2012-10-02 Thread Bejoy KS
r as a new partition on to the required table using 'Alter Table Add Partition ...' Regards Bejoy KS Sent from handheld, please excuse typos. -Original Message- From: Date: Tue, 2 Oct 2012 10:55:19 To: Reply-To: user@hive.apache.org Subject: File Path and Partition names Quick
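Registering an existing hdfs directory as a partition, as suggested here, looks like the following; the table name, partition value, and path are hypothetical:

```sql
ALTER TABLE events ADD PARTITION (dt='2012-10-02')
LOCATION '/data/logs/2012-10-02';  -- directory already present in hdfs
```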

Re: hive permissions issue on a database

2012-10-01 Thread Bejoy KS
future versions of hive. Regards Bejoy KS Sent from handheld, please excuse typos. -Original Message- From: Rahul Sarma Date: Mon, 1 Oct 2012 11:50:19 To: Reply-To: user@hive.apache.org Subject: hive permissions issue on a database I have a Hadoop cluster running CDH4 version. I am

Fw: Cartesian Product in HIVE

2012-09-30 Thread Bejoy KS
s goes well try doing map side join. hive> set hive.auto.convert.join=true; hive> SELECT t2.col1, t3.col1 FROM table2 t2 JOIN table3 t3; --Original Message-- From: Abhishek To: user@hive.apache.org Cc: user@hive.apache.org Cc: Bejoy Ks Subject: Re: Cartesian Product in HIVE Sent: Oct 1, 2012 0

RE: zip file or tar file cosumption

2012-09-30 Thread Bejoy KS
Definitely Raja, but looks like the one for zip is blocked for some time now https://issues.apache.org/jira/browse/MAPREDUCE-210 Regards Bejoy KS > Date: Sun, 30 Sep 2012 12:41:29 -0700 > Subject: Re: zip file or tar file cosumption > From: thiruvath...@gmail.com > To: user@hiv

RE: zip file or tar file cosumption

2012-09-30 Thread Bejoy KS
Yes Manish, Zip is not supported in hadoop. You may have to use gzip instead. Regards Bejoy KS Subject: RE: zip file or tar file cosumption From: manishbh...@rocketmail.com To: user@hive.apache.org CC: chuck.conn...@nuance.com Date: Sun, 30 Sep 2012 20:35:35 +0530 Thanks Bejoy. I have

RE: Cartesian Product in HIVE

2012-09-30 Thread Bejoy KS
r two tables. Regards Bejoy KS > From: abhishek.dod...@gmail.com > Date: Sat, 29 Sep 2012 07:44:06 -0700 > Subject: Re: Cartesian Product in HIVE > To: user@hive.apache.org; bejoy...@yahoo.com > > Thanks for the reply Bejoy. > > I tried to map join, by setting the prope

RE: how to perform GROUP BY:: in pig for this

2012-09-30 Thread Bejoy KS
Hi Yogesh If you are looking for the solution in hive, then the following query will get you the required result Select month(Date), max(rate) from date_sample Group BY month(Date); Regards Bejoy KS > From: yogesh.kuma...@wipro.com > To: user@hive.apache.org > CC: yogeshdh...

RE: zip file or tar file cosumption

2012-09-30 Thread Bejoy KS
Hi Manish Gzip works well if you have the compression codec available in 'io.compression.codecs'. The Gzip codec is present by default. I don't think untarring would be done by map reduce jobs. So tar files may not work with hive; you need to untar the files outside of hadoop/hive as a prerequisite. Regar

Re: Book 'Programming Hive' from O'Reilly now available!

2012-09-30 Thread Bejoy KS
guys. Regards Bejoy KS Sent from handheld, please excuse typos. -Original Message- From: Edward Capriolo Date: Sat, 29 Sep 2012 19:51:47 To: ; Reply-To: user@hive.apache.org Subject: Book 'Programming Hive' from O'Reilly now available! Hello all, I wanted to le

Re: Cartesian Product in HIVE

2012-09-28 Thread Bejoy KS
Hi Abhishek What is the data size of the 20k rows? If it is small enough then you can go in for a map join, which will give you a performance boost. set hive.auto.convert.join = true; Regards Bejoy KS Sent from handheld, please excuse typos. -Original Message- From: Abhishek Date: Fri, 28
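A map-join sketch under the assumption that the 20k-row table is small enough to fit in memory; the table names and the size threshold are illustrative:

```sql
SET hive.auto.convert.join=true;                 -- let hive pick the map join
SET hive.mapjoin.smalltable.filesize=25000000;   -- small-table threshold, bytes
SELECT a.*, s.label
FROM big_table a
JOIN small_table s ON (a.id = s.id);  -- small_table is broadcast to each mapper
```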

Re: about jvm reuse

2012-09-28 Thread Bejoy KS
slots are not being used. Regards Bejoy KS Sent from handheld, please excuse typos. -Original Message- From: researcher qiao Date: Fri, 28 Sep 2012 14:48:51 To: Reply-To: user@hive.apache.org Subject: about jvm reuse deal all, i was running hive on hadoop. we noticed that there were

Re: Performance tuning in hive

2012-09-28 Thread Bejoy KS
Hi Abhishek I don't think Partition By and Clustered By are supported in CTAS. You need to create the bucketed Table separately, then enable hive.enforce.bucketing , after that use a Select statement from the parent table to load data into the bucketed one. Regards Bejoy KS Sent from han
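The two-step load into a bucketed table can be sketched with hypothetical names (since CTAS cannot declare CLUSTERED BY, the table is created first):

```sql
CREATE TABLE users_bucketed (id INT, name STRING)
CLUSTERED BY (id) INTO 32 BUCKETS;        -- bucket count is illustrative
SET hive.enforce.bucketing=true;          -- make inserts respect the buckets
INSERT OVERWRITE TABLE users_bucketed
SELECT id, name FROM users;               -- users is the hypothetical parent table
```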

Re: Performance tuning in hive

2012-09-28 Thread Bejoy KS
if those suits your data set.   Regards, Bejoy KS From: Abhishek To: "user@hive.apache.org" Cc: "user@hive.apache.org" Sent: Friday, September 28, 2012 5:16 AM Subject: Re: Performance tuning in hive Hi Bejoy, Thanks for the repl

Re: ERROR: Hive subquery showing

2012-09-27 Thread Bejoy KS
Hi yogesh What about a query like this select name from ABC WHERE grp=MAX(grp); Regards Bejoy KS Sent from handheld, please excuse typos. -Original Message- From: Chen Song Date: Thu, 27 Sep 2012 15:33:11 To: Reply-To: user@hive.apache.org Subject: Re: ERROR: Hive subquery showing

Re: Performance tuning in hive

2012-09-27 Thread Bejoy KS
ou are querying only a few columns then RC files gives you a performance edge but if the queries are spanned across pretty much all columns then use the more generalized Sequence Files.   Regards, Bejoy KS From: Abhishek To: Hive Sent: Thursday, September 27,

Re: issue hive with external derby

2012-09-27 Thread Bejoy KS
0.4.1.3-bin/lib/derbytools.jar /opt/hadoop/hadoop-0.17.2.1/lib   Regards, Bejoy KS From: Bertrand Dechoux To: user@hive.apache.org Sent: Thursday, September 27, 2012 10:57 AM Subject: Re: issue hive with external derby Hi, For 1), did you follow the

Re: how to load TAB_SEPRATED file in hive table

2012-09-26 Thread Bejoy KS
Hi Yogesh Whichever character is the column separator in your input file, you need to provide that in the FIELDS TERMINATED BY clause. Also the common storage formats supported by hive include - Text File - Sequence File - RC File etc Regards Bejoy KS Sent from handheld, please excuse typos
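A minimal DDL sketch for the tab-separated case in this thread; the table, columns, and path are hypothetical:

```sql
CREATE TABLE tab_data (id INT, msg STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'  -- tab-separated input
STORED AS TEXTFILE;
LOAD DATA LOCAL INPATH '/tmp/data.tsv' INTO TABLE tab_data;  -- path illustrative
```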

Re: How to optimize a group by query

2012-09-26 Thread Bejoy KS
your table is bucketed or sorted bucketed. This optimization applies when the Group By columns are the same as the bucketed columns or the group by columns are a subset of the sorted bucketed columns. This optimization is enabled using 'hive.optimize.groupby' which is true by default   Regards

Re: How to optimize a group by query

2012-09-26 Thread Bejoy KS
property SET hive.groupby.skewindata=true;   Regards, Bejoy KS From: Abhishek To: Hive Sent: Wednesday, September 26, 2012 10:31 PM Subject: How to optimize a group by query Hi all, I have written a query with group by clause, it is consuming lot of time is

Re: Hive configuration property

2012-09-26 Thread Bejoy KS
Hi Abhishek Based on my experience you can always provide the number of reduce tasks (mapred.reduce.tasks) based on the data volume your query handles. It can yield you better performance numbers.    Regards, Bejoy KS From: Abhishek To: "user@hive.apach

Re: Map issue in Hive.

2012-09-21 Thread Bejoy KS
you can have Map as the column data type and DDL should be like . COLLECTION ITEMS TERMINATED BY ';' MAP KEYS TERMINATED BY '=' Hope it is clear now :) Regards, Bejoy KS From: Manish.Bhoge To: "user@hive.apache.org"
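Putting the delimiters together, a hedged DDL sketch for a MAP column; the table and column names are placeholders:

```sql
CREATE TABLE props (id INT, attrs MAP<STRING, STRING>)
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY ','
  COLLECTION ITEMS TERMINATED BY ';'   -- separates map entries
  MAP KEYS TERMINATED BY '=';          -- separates key from value
-- A matching input line would look like:  1,color=red;size=XL
```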

Re: How to run big queries in optimized way ?

2012-09-20 Thread Bejoy KS
small files issue generated by queries. Then optimization is totally based on what you use in your queries, you can go in with join optimizations, group by optimizations etc based on your queries.   Regards, Bejoy KS - Original Message - From: MiaoMiao To: user@hive.apache.org Cc: Sen

Re: Map issue in Hive.

2012-09-20 Thread Bejoy KS
column if it is just a collection of elements rather than a key value pair, you can use an Array data type instead. Here just specify the delimiter for each values using 'COLLECTION ITEMS TERMINATED BY' Regards, Bejoy KS From: Manish To: user Cc:

Re: Hive ignoring buckets when using dynamic where

2012-09-20 Thread Bejoy KS
substitute the same in your hive query.   Some links for your reference http://hive.apache.org/docs/r0.9.0/language_manual/var_substitution.html  http://kickstarthadoop.blogspot.in/2011/10/include-values-during-execution-time-in.html   Regards, Bejoy KS From

Re: Hive ignoring buckets when using dynamic where

2012-09-20 Thread Bejoy KS
ide 'bdate='2012-09-01'' the hive parser knows up front which partitions should be taken into account. So this query runs only on the required partitions and not on the whole data. To add on, it is not the buckets considered here in the where clause but the par

Re: Performance: hive+hbase integration query against the row_key

2012-09-12 Thread Bejoy KS
Hi Ashok, AFAIK, there is no property that will get you this functionality on the fly.   Regards, Bejoy KS From: "ashok.sa...@wipro.com" To: user@hive.apache.org; bejoy...@yahoo.com Sent: Thursday, September 13, 2012 2:42 AM Subject: RE: Perform

Re: Performance: hive+hbase integration query against the row_key

2012-09-12 Thread Bejoy KS
uired hdfs location.    Regards, Bejoy KS From: "ashok.sa...@wipro.com" To: user@hive.apache.org Sent: Wednesday, September 12, 2012 8:55 AM Subject: RE: Performance: hive+hbase integration query against the row_key after loading the data into hi

Re: most efficient way to concatenate 3 tables into one?

2012-09-12 Thread Bejoy KS
the above commands with those used in your tables. Regards Bejoy KS Sent from handheld, please excuse typos. -Original Message- From: zuohua zhang Date: Wed, 12 Sep 2012 13:14:20 To: ; Reply-To: user@hive.apache.org Subject: Re: most efficient way to concatenate 3 tables into one

Re: most efficient way to concatenate 3 tables into one?

2012-09-12 Thread Bejoy KS
Hi If all the 3 tables have the same schema, create an external table and move the data from all the 3 tables to this new table's location. Just a hdfs copy or move is not that expensive. Regards Bejoy KS Sent from handheld, please excuse typos. -Original Message- From: zuohua

Re: How to update and delete a row in hive

2012-09-11 Thread Bejoy KS
ly so that only a small volume of data is overwritten every time. Also to ensure performance while partitioning you need to ensure that all or most of the sub partitions contain a data volume at least equal to your block size.    Regards, Bejoy KS From: "Connell,

Re: Where does hive store join query results

2012-09-09 Thread Bejoy KS
f a directory is not specified then the results of the query is just printed on the CLI.   Regards, Bejoy KS From: Hongjie Guo To: user@hive.apache.org Sent: Monday, September 10, 2012 10:10 AM Subject: Re: Where does hive store join query results maybe there i

Re: hive table missing

2012-09-09 Thread Bejoy KS
't exists. Regards Bejoy KS Sent from handheld, please excuse typos. -Original Message- From: Sam Darwin Date: Sun, 9 Sep 2012 12:49:14 To: Reply-To: user@hive.apache.org Subject: hive table missing Hi, We are seeing a hive table is gone unexpectedly. I suspect that this must have been

Re: FAILED: Error in metadata

2012-09-09 Thread Bejoy KS
Hi Yogesh It looks like hive is still on the derby db . Can you restart your hive instances after updating the hive-site.xml. Also please make sure that you are modifying the right copy of the file.   Regards, Bejoy KS From: yogesh dhari To: hive request

Re: How to load csv data into HIVE

2012-09-08 Thread Bejoy KS
can be loaded into hive table. Regards Bejoy KS Sent from handheld, please excuse typos. -Original Message- From: "Connell, Chuck" Date: Sat, 8 Sep 2012 12:18:33 To: user@hive.apache.org Reply-To: user@hive.apache.org Subject: RE: How to load csv data into HIVE I would li

Re: Changing Hive default representation of nulls from \N to something else

2012-09-07 Thread Bejoy KS
Hi Charles You may need to replace the NULLs with the 'NULL' string .  INSERT OVERWRITE TABLE staging_table SELECT ... CASE WHEN clmn_1 IS NULL THEN "NULL" else clmn_1 ... Thank You  Regards, Bejoy KS From: Charles Menguy To: user@hive.apac
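The CASE expression spelled out as a full statement; `staging_table`, `source_table`, and the column names are hedged placeholders based on the reply:

```sql
-- Materialize NULLs as the literal string 'NULL' while staging the data:
INSERT OVERWRITE TABLE staging_table
SELECT CASE WHEN clmn_1 IS NULL THEN 'NULL' ELSE clmn_1 END,
       CASE WHEN clmn_2 IS NULL THEN 'NULL' ELSE clmn_2 END
FROM source_table;  -- source_table is hypothetical
```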

Re: How to get percentage of each group?

2012-09-06 Thread Bejoy KS
Hi CROSS JOIN is the same as giving the JOIN keyword. CROSS JOIN is just a new notation in later releases of hive. JOIN without ON is the same as CROSS JOIN   Regards, Bejoy KS From: MiaoMiao To: user@hive.apache.org Sent: Friday, September 7, 2012 11:46 AM Subject: Re

Re: Unexpected end of input stream

2012-08-28 Thread Bejoy KS
Hi Kiwon You can get this information from the jobdetails web page itself. Browse to your failed task and there you can see the details on which file/block it had processed and failed with the error. Regards Bejoy KS Sent from handheld, please excuse typos. -Original Message- From

Re: alter external table location with new namenode address

2012-08-24 Thread Bejoy KS
Yes you need to update the metastore db directly for this to be in effect. Regards Bejoy KS Sent from handheld, please excuse typos. -Original Message- From: Alok Kumar Date: Fri, 24 Aug 2012 13:30:36 To: ; Reply-To: user@hive.apache.org Subject: alter external table location with

Re: move Hive from one distribution to another

2012-08-21 Thread Bejoy KS
details please refer the following jira https://issues.apache.org/jira/browse/HIVE-1918    Regards, Bejoy KS From: Anson Abraham To: user@hive.apache.org Sent: Tuesday, August 21, 2012 9:58 PM Subject: move Hive from one distribution to another I have a h

Re: how to handling complex log file(compressed, 200G)

2012-08-17 Thread Bejoy KS
' statement. If the data is not partitioned in hdfs but would like to be partitioned in hive then you can take a look at Dynamic Partition Insert. Regards Bejoy KS Sent from handheld, please excuse typos. -Original Message- From: Kiwon Lee Date: Sat, 18 Aug 2012 00:29:20 To: Reply-To: user

Re: UNION ALL - what is the simplest form

2012-08-16 Thread Bejoy KS
le to another would be to use a hdfs copy/move. LOAD DATA INPATH 'location/of/stage_table' INTO TABLE 'main_table'; Regards Bejoy KS Sent from handheld, please excuse typos. -Original Message- From: "Balaraman, Anand" Date: Thu, 16 Aug 2012 11:36:49

Re: how to do random sampling in hive?

2012-08-15 Thread Bejoy KS
https://cwiki.apache.org/Hive/languagemanual-sampling.html  Regards Bejoy KS Sent from handheld, please excuse typos. -Original Message- From: Roberto Sanabria Date: Tue, 14 Aug 2012 15:31:14 To: Reply-To: user@hive.apache.org Subject: Re: how to do random sampling in hive? Try this

Re: OPTIMIZING A HIVE QUERY

2012-08-14 Thread Bejoy Ks
Hi Sudeep You can also look at join optimizations like map join, bucketed map join,sort merge join etc and choose the right one that fits your requirement. https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Joins   Regards, Bejoy KS From

Re: Need Unstructured data sample file

2012-08-14 Thread Bejoy Ks
gData    Regards, Bejoy KS From: shaik ahamed To: user@hive.apache.org Sent: Tuesday, August 14, 2012 1:31 PM Subject: Need Unstructured data sample file Hi Users,   Need the clarifications on the below tasks.   1. What are the unstrutured type files in

Re: Loading data only into one node

2012-08-14 Thread Bejoy Ks
ut of track, please share more details for a better understanding on your requirement so that we can help you better.   Regards, Bejoy KS From: shaik ahamed To: user@hive.apache.org Sent: Tuesday, August 14, 2012 1:40 PM Subject: Loading data only into one

Re: loading data in HDFS similar to raid concept(i.e i have 100GB data file load as 30GB in one node, 40 GB in other node and 30GB in other node

2012-08-13 Thread Bejoy KS
Hi Shaik AFAIK it is not possible in hadoop. The hdfs storage concept is different from RAID. In hdfs your file is broken down into blocks and each of these blocks is stored in one or more Data Nodes in your cluster based on the replication factor. Regards Bejoy KS Sent from handheld, please

Re: Exploding Array of String in Hive

2012-08-11 Thread Bejoy KS
f you are looking at exploding the row itself to multiple rows you can do so with SELECT * FROM test_table LATERAL VIEW explode(arr_clmn) exp_arr AS arr_elmnt; HTH Regards Bejoy KS Sent from handheld, please excuse typos. -Original Message- From: Techy Teck Date: Fri, 10 Aug 2012 20:22:26

Re: [ANNOUNCE] New Hive Committer - Navis Ryu

2012-08-10 Thread Bejoy KS
Congrats Navis.. :) Regards Bejoy KS Sent from handheld, please excuse typos. -Original Message- From: alo alt Date: Fri, 10 Aug 2012 17:08:07 To: Reply-To: user@hive.apache.org Cc: ; Subject: Re: [ANNOUNCE] New Hive Committer - Navis Ryu Congratulations! Well done :) cheers

Re: Hive append support

2012-08-09 Thread Bejoy Ks
a least over head while implementing updates using overwrite. I have scribbled something long back, it can give you some idea on it http://kickstarthadoop.blogspot.in/2011/06/implementing-basic-sql-update-statement.html   Regards, Bejoy KS From: Sandeep Reddy

Re: Hive append support

2012-08-09 Thread Bejoy Ks
-InsertingdataintoHiveTablesfromqueries Updates are not supported by hive directly.   Regards, Bejoy KS From: Sandeep Reddy P To: u...@hadoop.apache.org Cc: user@hive.apache.org; cdh-u...@cloudera.org Sent: Thursday, August 9, 2012 7:56 PM Subject: Hive append support Hi, Is there any version

Re: Hive and joins

2012-08-08 Thread Bejoy Ks
Hi Ranjith BETWEEN a and b you can implement as >= a, <= b. Since those are not equality conditions you cannot use them in the ON clause; you need to move them to the WHERE condition in your query.   Regards, Bejoy KS From: "Raghunath, Ranjith" To: "
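A sketch of the rewrite suggested here, with equality kept in ON and the range moved to WHERE; the tables and columns are hypothetical:

```sql
SELECT a.*, g.tz_offset
FROM account a JOIN timezone g ON (a.tz_id = g.tz_id)  -- equality stays in ON
WHERE a.create_date >= g.start_date                    -- BETWEEN rewritten as
  AND a.create_date <= g.end_date;                     -- two range predicates
```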

Re: Custom UserDefinedFunction in Hive

2012-08-08 Thread Bejoy Ks
.   Regards, Bejoy KS From: Raihan Jamal To: user@hive.apache.org Cc: d...@hive.apache.org Sent: Tuesday, August 7, 2012 10:50 PM Subject: Re: Custom UserDefinedFunction in Hive Hi Jan,  I figured that out, it is working fine for me now. The only question I have

Re: Caused by: java.io.EOFException

2012-08-06 Thread Bejoy KS
It could be like the file corresponding to the partition dt='20120731' got corrupted. This file as pointed in the error logs should be the culprit. hdfs://ares-nn/apps/hdmi-technology/b_apdpds/real-time_new/20120731/PDS_HADOOP_REALTIME_EXPORT-part-3-2 Regards Bejoy KS Sent fro

Re: Passing date as command line arguments

2012-08-04 Thread Bejoy KS
Yes that is the right issue. Variable substitution is not happening. I can't say much here as I haven't tried out this on 0.6 . The code on my blog post is based on 0.7 or higher version I guess. Regards Bejoy KS Sent from handheld, please excuse typos. -Original Message

Re: Passing date as command line arguments

2012-08-04 Thread Bejoy KS
You can try it out on a local installation and test it against the later versions. As I pointed out, I just tested on hive 0.9 and it was working well for me. I guess you should recommend an upgrade of hive in your cluster as well. Hive has moved far ahead since 0.6. :) Regards Bejoy KS

Re: Passing date as command line arguments

2012-08-04 Thread Bejoy KS
Try it on a higher version of hive and let me know if that doesn't work still. 0.9 should be good. Regards Bejoy KS Sent from handheld, please excuse typos. -Original Message- From: Techy Teck Date: Sat, 4 Aug 2012 00:26:44 To: ; Reply-To: user@hive.apache.org Subject: Re: Pa

Re: Passing date as command line arguments

2012-08-04 Thread Bejoy KS
Hi Techy Which version of hive are you on? I'm on hive 0.9 and I'm sure I have executed similar scripts in hive 0.7 as well. Regards Bejoy KS Sent from handheld, please excuse typos. -Original Message----- From: "Bejoy KS" Date: Sat, 4 Aug 2012 07:24:37

Re: Passing date as command line arguments

2012-08-04 Thread Bejoy KS
I tried the same query on my end, and it is working fine for me without any issues. By the way, the data type for 'dt' is String itself, right? Regards Bejoy KS Sent from handheld, please excuse typos. -Original Message- From: Techy Teck Date: Sat, 4 Aug 2012 00:21:37 To: ; Repl

Re: Passing date as command line arguments

2012-08-04 Thread Bejoy KS
Yes. From the logs the query being executed is select * from lip_data_quality where dt=20120709 But here the dt is not in quotes. It should be like select * from lip_data_quality where dt='20120709'; Regards Bejoy KS Sent from handheld, please excuse typos. -Original Message
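With variable substitution, the quoting fix looks like the following; the `--hiveconf` invocation shown in the comment is an assumed way of passing the value:

```sql
-- Invoked, for example, as:  hive --hiveconf dt=20120709 -f query.hql
SELECT * FROM lip_data_quality
WHERE dt='${hiveconf:dt}';  -- quotes make the substituted value a string literal
```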

Re: Passing date as command line arguments

2012-08-03 Thread Bejoy Ks
g executed?   Regards, Bejoy KS From: Techy Teck To: user@hive.apache.org Cc: Vijay Dirisala Sent: Saturday, August 4, 2012 12:11 PM Subject: Re: Passing date as command line arguments Thanks Vijay for the suggestion. I also tried that and it still didn

Re: decompress the file that has been compressed in LzoCodec format

2012-08-03 Thread Bejoy Ks
Hi Techy Try using hadoop fs -text . That should give the output in some readable format.   Regards, Bejoy KS From: Techy Teck To: user@hive.apache.org Sent: Friday, August 3, 2012 7:25 AM Subject: decompress the file that has been compressed in LzoCodec

Re: schema of hive database

2012-08-02 Thread Bejoy KS
Hi Techy To use 'describe formatted' you need at least hive 0.7. But 'describe extended' should be available in hive 0.6 itself. It is always better to use the latest version. I recommend hive 0.9 at the moment. Regards Bejoy KS Sent from handheld, please excuse t

Re: Error while inserting 100GB data in to hive external table

2012-08-02 Thread Bejoy Ks
Hi Shaik What does the failed MR task logs say? Can you share the failed Task error log here? If you are using hadoop 2.0/0.23 then there is a similar issue reported https://issues.apache.org/jira/browse/HIVE-2804   Regards, Bejoy KS From: shaik ahamed To

Re: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask

2012-08-01 Thread Bejoy Ks
n you need to ensure that the user has sufficient permissions on the source directory in hdfs as well. Since it is a move operation happening under the hood, the user needs to have sufficient permissions on the source dir in hdfs as well as on the table's storage location in hdfs.   Regards

Re: Efficiently Store data in Hive

2012-08-01 Thread Bejoy Ks
huffled is large it saves on network transfers there by increasing mapreduce performance to some extent.   Regards, Bejoy KS From: Techy Teck To: user@hive.apache.org Sent: Thursday, August 2, 2012 12:18 AM Subject: Efficiently Store data in Hive

Re: Unable to merge 3 tables in hive

2012-08-01 Thread Bejoy KS
have the same structure and you need to just club the data together into single table in hive. 1) Import data into hdfs separately for 3 tables. 2) Create hive table. 3) Use 3 load data statements without Overwrite option to load the imported data into same hive table. Regards Bejoy KS Sent from
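The three load steps above can be sketched as follows; the table name and staging paths are hypothetical, and no OVERWRITE is used so each load appends:

```sql
CREATE TABLE merged (id INT, val STRING);
LOAD DATA INPATH '/staging/t1' INTO TABLE merged;  -- appends
LOAD DATA INPATH '/staging/t2' INTO TABLE merged;
LOAD DATA INPATH '/staging/t3' INTO TABLE merged;
```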

Re: ERROR

2012-07-31 Thread Bejoy KS
Hi Abhishek To get the cause of this error, you need to look at the failed mapreduce task logs. Regards Bejoy KS Sent from handheld, please excuse typos. -Original Message- From: abhiTowson cal Date: Tue, 31 Jul 2012 12:08:33 To: Reply-To: user@hive.apache.org Subject: ERROR hi all

Re: Data Loaded but Select returns nothing!

2012-07-30 Thread Bejoy KS
speed-up-your-hive-queries-in.html To ensure that both partition and buckets work seamlessly in your case load the source data into a non partitioned normal table from there enable the required properties and load into the final partitioned bucketed table. Regards Bejoy KS Sent from handheld

Re: Data Loaded but Select returns nothing!

2012-07-30 Thread Bejoy KS
'Insert Overwrite'. Then another quick nit Your table is partitioned so you need to load your data into some partition but you have not specified a partition in Load. Regards Bejoy KS Sent from handheld, please excuse typos. -Original Message- From: Kuldeep Chitrakar Date: Mon, 3
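Specifying the partition in LOAD, as the reply asks for; the table name, path, and partition value are placeholders:

```sql
LOAD DATA INPATH '/staging/day1'
INTO TABLE events PARTITION (dt='2012-07-30');  -- no OVERWRITE: appends to the partition
```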

Re: Create external table like.

2012-07-27 Thread Bejoy Ks
Hi Vidhya This bug was reported and fixed in a later version of hive , Hive 0.8. An upgrade would set things in place. https://issues.apache.org/jira/browse/HIVE-2888   Regards, Bejoy KS From: Vidhya Venkataraman To: user@hive.apache.org Sent: Friday, July

Re: Performance Issues in Hive with S3 and Partitions

2012-07-27 Thread Bejoy Ks
Hi Richin I agree with Edward on this. You have to design your partition in such a way that each partition holds data that is atleast an hdfs block size.   Regards, Bejoy KS From: Edward Capriolo To: user@hive.apache.org Sent: Saturday, July 28, 2012 12:32

Re: Fwd: High availability with hive

2012-07-27 Thread Bejoy Ks
but if you can employ HA in the metastore db then you can claim full HA in hive.     Regards, Bejoy KS From: Abhishek To: Hive Sent: Friday, July 27, 2012 10:10 PM Subject: Fwd: High availability with hive > > Hi all, > > I am trying to instal

Re: Rename an output file in hive {was: Re: Possibility of defining the Output directory programmatically}

2012-07-27 Thread Bejoy KS
Hi Manisha In mapreduce if you want to change the name of output file you may need to write your own OutputFormat. Renaming files in hdfs is straight forward hadoop fs -mv oldFileName newFileName Regards Bejoy KS Sent from handheld, please excuse typos. -Original Message- From

Re: STREAM (TABLE) IN HIVE

2012-07-26 Thread Bejoy Ks
't other than the right table to be streamed you go for this hint. If you are joining more tables on different keys, then for every join set just specify the larger table on the right of the ON condition. No need of a stream table hint here. Regards Bejoy KS From:

Re: HBASE and HIVE Integration

2012-07-26 Thread Bejoy Ks
Hi Vijay Your current error looks like some issue with the Select query. Is the select query working as desired? hive> SELECT * FROM pokes where foo=98; Regards Bejoy KS From: vijay shinde To: user@hive.apache.org; Bejoy Ks Sent: Friday, July 27, 2012

Re: unable to see the file

2012-07-26 Thread Bejoy KS
you used Load data with overwrite all the table's dirs also got deleted. You can recover the data only if trash is enabled in hdfs. Regards Bejoy KS Sent from handheld, please excuse typos. -Original Message- From: shaik ahamed Date: Thu, 26 Jul 2012 19:39:01 To: Reply-To: user@

Re: Group By with rollup in HiveQL?

2012-07-26 Thread Bejoy Ks
Hi At the moment Hive QL doesn't support rollup clause, however the development is in progress for this feature. https://issues.apache.org/jira/browse/HIVE-2397  Regards Bejoy KS From: Techy Teck To: user@hive.apache.org Sent: Thursday, July 26, 2

Re: HBASE and HIVE Integration

2012-07-26 Thread Bejoy Ks
Hi Vijay Is your hbase working independently without any issues. I mean, are you able to insert data into hbase tables without using hive integration? Was the same error message thrown when you directly provided hbase.master instead of zookeeper quorum? Regards Bejoy KS
