HDFS directory in /user/hive/warehouse getting "hive" as Owner ?

2013-03-25 Thread Sanjay Subramanian
Steps to recreate the use case: - Log in as sasubramanian to Linux Box - Execute hive -e "CREATE TABLE name (id INT, name STRING);" - Go to HDFS /user/hive/warehouse/ Name Type Size Replication Block Size Modification Time Permission Owner Group name dir

Re: HDFS directory in /user/hive/warehouse getting "hive" as Owner ?

2013-03-25 Thread Sanjay Subramanian
enable kerberos based security. On Tue, Mar 26, 2013 at 7:41 AM, Nitin Pawar <nitinpawar...@gmail.com> wrote: Sanjay, can you try adding a 'LOCATION' clause to your create statement? By default the hive warehouse directory is writable by all users. To create it by the individual users y

Re: HDFS directory in /user/hive/warehouse getting "hive" as Owner ?

2013-03-25 Thread Sanjay Subramanian
r create statement. By default the hive warehouse directory is writable by all users. To have it created by the individual users you need to provide the path via the LOCATION clause. On Tue, Mar 26, 2013 at 7:31 AM, Sanjay Subramanian <sanjay.subraman...@wizecommerce.com> wrote: Steps to rec

Re: HDFS directory in /user/hive/warehouse getting "hive" as Owner ?

2013-03-25 Thread Sanjay Subramanian
ans users can create partitions .. you can refer the entire table at https://cwiki.apache.org/Hive/languagemanual-auth.html On Tue, Mar 26, 2013 at 8:24 AM, Sanjay Subramanian mailto:sanjay.subraman...@wizecommerce.com>> wrote: I am using Hive Version: 0.9.0+155-1.cdh4.1.2.p0.21~precise-cdh4

Re: S3/EMR Hive: Load contents of a single file

2013-03-26 Thread Sanjay Subramanian
Hi Tony Can u create the table without any location? After that you could do an ALTER TABLE to add the location and partition: ALTER TABLE myData ADD PARTITION (partitionColumn1='$value1', partitionColumn2='$value2') LOCATION '/path/to/your/directory/in/hdfs'; An example Without Partitions -

Re: S3/EMR Hive: Load contents of a single file

2013-03-26 Thread Sanjay Subramanian
n when the LOCATION is a directory. Cool! Tony From: Sanjay Subramanian [mailto:sanjay.subraman...@wizecommerce.com] Sent: 26 March 2013 17:22 To: user@hive.apache.org<mailto:user@hive.apache.org> Subject: Re: S3/EMR Hive: Load contents of a single file Hi Tony Can u create the table

Re: HDFS directory in /user/hive/warehouse getting "hive" as Owner ?

2013-03-26 Thread Sanjay Subramanian
itions. from language manual * CREATE - Allows users to create objects. For a database, this means users can create tables, and for a table, this means users can create partitions .. you can refer the entire table at https://cwiki.apache.org/Hive/languagemanual-auth.html On Tue, Mar 26

Hive CLI works fine for "ALTER TABLE" but get HiveServerException using ThriftHive.Client

2013-03-26 Thread Sanjay Subramanian
Hive-site.xml setting - hive.security.authorization.enabled = true Script -- ALTER TABLE myTable ADD PARTITION (partition1='some_value1' , partition2='some_value2') LOCATION '/path/to/directory/on/hdfs/containing/data' I can execute this script using Hive CLI but ThriftHi

Re: HDFS directory in /user/hive/warehouse getting "hive" as Owner ?

2013-03-26 Thread Sanjay Subramanian
orted user and group permissions. Note that this property must be set on both the client and server sides. Further note that it's best effort. If the client sets it to true and the server sets it to false, the client setting will be ignored. From: Sanjay Subramanian mailto:sanjay.subraman...@wizecommerc

MySQL instance on hadoop name node server in production ?

2013-03-27 Thread Sanjay Subramanian
Hi all I am planning to install mysql server (as hive metastore) on the same box as my name node. My name node has 16GB RAM and hopefully I can get 2TB. Any problems with mysql on the same node as name node ? Thanks sanjay CONFIDENTIALITY NOTICE == This email message and any

Re: MySQL instance on hadoop name node server in production ?

2013-03-27 Thread Sanjay Subramanian
ally dedicated to that purpose only. Depending on how frequently you are going to run queries and how much data the hdfs is going to hold is key factor in deciding this. On Wed, Mar 27, 2013 at 11:32 PM, Sanjay Subramanian mailto:sanjay.subraman...@wizecommerce.com>> wrote: Hi all I am pla

Re: MySQL instance on hadoop name node server in production ?

2013-03-27 Thread Sanjay Subramanian
you setup a couple of VMs with mysql replication enabled. so the reads can be distributed with load balancers etc On Thu, Mar 28, 2013 at 12:00 AM, Sanjay Subramanian mailto:sanjay.subraman...@wizecommerce.com>> wrote: Thanks Nitin. The mysql instance is for hive metastore only so f

Re: hive hbase storage handler fail

2013-03-27 Thread Sanjay Subramanian
If you can run your hive insert data script with debug option u may get some clues /usr/lib/hive/bin/hive -hiveconf hive.root.logger=INFO,console -e "insert into dest select * from some_table_same_structure_as_dest limit 10;" I created a small demo usecase and this is failing for me as well The e

hive.limit.optimize.fetch.max

2013-03-27 Thread Sanjay Subramanian
Hi I have the following settings in the hive-site.xml: hive.limit.row.max.size = 10, hive.limit.optimize.enable = true, hive.limit.optimize.fetch.max = 11. When I do a select query with a WHERE clause it does not limit the results to 10. How do u limit the SELECT query results to 10 rows ? M

Re: External table for hourly log files

2013-03-28 Thread Sanjay Subramanian
Hi You may want to look at Dynamic partitions https://cwiki.apache.org/Hive/dynamicpartitions.html Thanks sanjay From: Ian mailto:liu...@yahoo.com>> Reply-To: "user@hive.apache.org" mailto:user@hive.apache.org>>, Ian mailto:liu...@yahoo.com>> Date: Thursday, March 28

Re: Noob question on creating tables

2013-03-29 Thread Sanjay Subramanian
Hi CREATE EXTERNAL TABLE IF NOT EXISTS log_data(col1 datatype1, col2 datatype2, . . . colN datatypeN) PARTITIONED BY (YEAR INT, MONTH INT, DAY INT) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'; ALTER table log_data ADD PARTITION (YEAR=2013 , MONTH=2, DAY=27) LOCATION '/path/to/YEAR/MONTH/DAY/d

Re: Noob question on creating tables

2013-03-29 Thread Sanjay Subramanian
lto:static.void@gmail.com>> wrote: Thanks Does this mean I need to create a partition for each day manually? There is no way to have infer that from my directory structure? On Mar 29, 2013, at 10:40 AM, Sanjay Subramanian mailto:sanjay.subraman...@wizecommerce.com>> wrote: >

Re: Noob question on creating tables

2013-03-29 Thread Sanjay Subramanian
e myself, or use an existing one like we have, I will need to use the external keyword. Does that sound about right? On Mar 29, 2013, at 12:45 PM, Sanjay Subramanian mailto:sanjay.subraman...@wizecommerce.com>> wrote: I agree. BASH is super easy for things like this I have a daily alter p
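
A daily "alter partition" script of the kind mentioned in this thread could look roughly like the sketch below. It assumes the log_data table from the earlier reply and a hypothetical base directory laid out as base/YYYY/MM/DD; the function name add_partition_sql is made up for illustration and is not from the original posts.

```shell
#!/usr/bin/env bash
# Sketch only: builds the ALTER TABLE statement for one day's partition.
# Assumes a directory layout of <base>/YYYY/MM/DD (hypothetical).
add_partition_sql() {
  local day="$1" base="$2" y m d
  # Split YYYY-MM-DD on the dashes
  IFS=- read -r y m d <<< "$day"
  # 10# forces base 10 so that "08" and "09" are not parsed as octal
  printf "ALTER TABLE log_data ADD IF NOT EXISTS PARTITION (YEAR=%d, MONTH=%d, DAY=%d) LOCATION '%s/%s/%s/%s';\n" \
    "$((10#$y))" "$((10#$m))" "$((10#$d))" "$base" "$y" "$m" "$d"
}

add_partition_sql 2013-02-27 /path/to/logs
```

From cron this could be wired up as hive -e "$(add_partition_sql "$(date +%F)" /path/to/logs)", so nobody has to add each day's partition by hand.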

Re: Problem when trying to connect to hive server using jdbc

2013-04-01 Thread Sanjay Subramanian
Hi First off, if u r planning to run YARN on 4.2.0 then stay with 4.1.2. I installed 4.2.0 but had to roll back :-( Hit upon this error https://issues.cloudera.org/browse/DISTRO-461. If u r not using yarn then it will not affect u. When u install Cloudera Manager, it installs Hive. But Hive-serv

Re: External Table to Sequence File on HDFS

2013-04-03 Thread Sanjay Subramanian
Check this out http://stackoverflow.com/questions/13203770/reading-hadoop-sequencefiles-with-hive From: Ranjitha Chandrashekar mailto:ranjitha...@hcl.com>> Reply-To: "user@hive.apache.org" mailto:user@hive.apache.org>> Date: Wednesday, April 3, 2013 10:43 PM To: "user

Re: Partition performance

2013-04-04 Thread Sanjay Subramanian
The slow down is most possibly due to large number of partitions. I believe the Hive book authors tell us to be cautious with large number of partitions :-) and I abide by that. Users Please add your points of view and experiences Thanks sanjay From: Ian mailto:liu...@yahoo.com>> Reply-To: "us

Correct syntax for EXPLAIN DEPENDENCY

2013-04-04 Thread Sanjay Subramanian
Hi Whats the correct syntax for EXPLAIN DEPENDENCY ? Query == /usr/lib/hive/bin/hive -e "explain dependency select * from channel_market_lang where channelid > 29000" org.apache.hadoop.hive.ql.parse.ParseException: line 1:8 cannot recognize input near 'plan' 'dependency' 'select' in stateme

Re: Correct syntax for EXPLAIN DEPENDENCY

2013-04-04 Thread Sanjay Subramanian
Ah its available only in 0.10.0 :-( And I am still using 0.9.x from the CDH4.1.2 distribution From: Sanjay Subramanian mailto:sanjay.subraman...@wizecommerce.com>> Reply-To: "user@hive.apache.org<mailto:user@hive.apache.org>" mailto:user@hive.apache.org>> Date: Thu

Re: Correct syntax for EXPLAIN DEPENDENCY

2013-04-04 Thread Sanjay Subramanian
05, 2013 at 01:48:39AM +, Sanjay Subramanian wrote: >> Ah its available only in 0.10.0 :-( >> And I am still using 0.9.x from the CDH4.1.2 distribution >> >> >> From: Sanjay Subramanian >>mailto:sanjay.subramanian@wizecommer >>ce.com>> &g

Re: how to limit mappers for a hive job

2013-04-24 Thread Sanjay Subramanian
I use the following To specify the Mapper Input Split Size (134217728 is in bytes) == SET mapreduce.input.fileinputformat.split.maxsize=134217728; From: Frank Luo mailto:j...@merkleinc.com>> Reply-To: "user@hive.apache.org
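
For anyone wondering where 134217728 comes from: it is 128 MB expressed in bytes. A small sketch, assuming you prefer to think in MB and generate the SET line from a script (the helper name split_size_set is hypothetical):

```shell
#!/usr/bin/env bash
# Convert a split size given in MB to bytes and print the matching SET statement.
split_size_set() {
  local mb="$1"
  echo "SET mapreduce.input.fileinputformat.split.maxsize=$(( mb * 1024 * 1024 ));"
}

split_size_set 128   # prints the SET line with 134217728
```

A larger maximum split size generally means fewer, bigger input splits and therefore fewer mappers; a smaller value does the opposite.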

Re: map tasks are taking ever when running job on 24 TB

2013-04-25 Thread Sanjay Subramanian
That’s a lot of partitions for one Hive Job ! Not sure if that itself is the root of the issues….There have been quite a few discussions on max 1000-ish number of partitions as good… Is your use case conducive to using Combiners (though they cannot be guaranteed to be called)? Thanks sanjay Fro

Re: Very poor read performance with composite keys in hbase

2013-04-30 Thread Sanjay Subramanian
My experience with hive + hbase has been about 8x slower on average. So I went ahead with the hive-only option. Sent from my iPhone On Apr 30, 2013, at 11:19 PM, "Rupinder Singh" <rsi...@care.com> wrote: Hi, I have an hbase cluster where I have a table with a composite key. I map this

Re: Variable resolution Fails

2013-04-30 Thread Sanjay Subramanian
+1 agreed Also as a general script programming practice I check if the variables I am going to use are NON empty before using them…nothing related to Hive scripts if [ -z "${freq}" ]; then echo "variable freq is empty…exiting"; exit 1; fi From: Anthony Urso <antho...@cs.ucla.edu>

Re: Describe extended shows number of rows as 0

2013-05-02 Thread Sanjay Subramanian
Not that it could be related but if possible setup a Mysql or similar serious datastore….that Hive can connect to… Its possibly not prudent spending time to analyze problems caused by derby metastore and with Mysql u can start doing some heavy duty stinging with Hive :-) Regards sanjay From:

Re: Getting Started

2013-05-02 Thread Sanjay Subramanian
Can u share your hive-site.xml ? What meta store r u using ? Also try this to get additional debug messages that u can use to analyze the problem From your linux command prompt run the following and tell us what u see. Also hive-site.xml please /path/to/hive -hiveconf hive.root.logger=INFO,c

Re: external table or gz compressed file

2013-05-02 Thread Sanjay Subramanian
Hi INPUT = Hive can handle gz files out of the box with NO additional configurations OUTPUT == If you want Hive to output to compressed files (say gz) then add the following as part of the hive SQL at the beginning SET hive.exec.compress.output=true; SET mapred.reduce.tasks=16;// this

Re: Hive 0.10.0 Postgres Schema script?

2013-05-02 Thread Sanjay Subramanian
https://github.com/apache/hive/tree/trunk/metastore/scripts/upgrade/postgres From: Leena Gupta mailto:gupta.le...@gmail.com>> Reply-To: "user@hive.apache.org" mailto:user@hive.apache.org>> Date: Thursday, May 2, 2013 12:22 PM To: "user@hive.apache.org

Re: Parallely Load Data into Two partitions of a Hive Table

2013-05-03 Thread Sanjay Subramanian
Why are u using LOAD DATA syntax ? Are these Hive managed tables ? LOAD DATA will actually copy files into HDFS I would recommend using EXTERNAL table and use ALTER TABLE ADD PARTITION(logdate='2013-04-01') LOCATION '/logs/processed/2013-04-01' This just makes entries in MYSQL and is lot faste

Re: hive cli escaping TAB and NEW LINE Characters.

2013-05-03 Thread Sanjay Subramanian
+1 to Stephens suggestion… From: Stephen Sprague mailto:sprag...@gmail.com>> Reply-To: "user@hive.apache.org" mailto:user@hive.apache.org>> Date: Friday, May 3, 2013 11:29 AM To: "user@hive.apache.org" mailto:user@hive.apache.org>> Subjec

Re: Connecting to Hive from R through JDBC

2013-05-07 Thread Sanjay Subramanian
Hi Saurabh The usual suspect looks like hive-server service is not running on server where hive is installed….The hive-server service needs to be installed and started….It listens on port 10000 by default. Also on a side note I hope your Hive is connecting to MySQL or some non-derby RDBMS :-)

Re: Connecting to Hive from R through JDBC

2013-05-08 Thread Sanjay Subramanian
Saurabh Can u please try to install squirrel and point to hive jdbc jar and see if u can connect to Hive thru Squirrel client….It will just check if u can connect using another client Thanks sanjay From: Saurabh S mailto:saurab...@live.com>> Reply-To: "user@hive.apache.org

Re: warehouse directory of HIVE

2013-05-08 Thread Sanjay Subramanian
https://issues.apache.org/jira/secure/attachment/12471108/HiveMetaStore.pdf From: Stephen Sprague mailto:sprag...@gmail.com>> Reply-To: "user@hive.apache.org" mailto:user@hive.apache.org>> Date: Wednesday, May 8, 2013 12:51 PM To: "user@hive.apache.org

Re: Hive Professional Services

2013-05-09 Thread Sanjay Subramanian
Hi Liz Don't mind my saying so but as much I understand your eagerness to procure a good candidate (albeit for Cassandra , a totally different technology from Hive) you should not be sending mails of this sort to this group. This group is a serious Hive users group that uses this forum for discu

Re: How to Disable Hive CLI interactive mode

2013-05-15 Thread Sanjay Subramanian
no args) [interactive mode] yet still permit "hive -e" [batch mode]? if that's the case my proposal would be to have a wrapper around the hive executable and check if stdin is a tty. then again if i've completely misunderstood the question could you elaborate? On Wed, May 15,

Re: How to Disable Hive CLI interactive mode

2013-05-15 Thread Sanjay Subramanian
r edge cases you may want to consider. "hive -service cli" would pass and you wouldn't want that but you get the idea i presume. On Wed, May 15, 2013 at 5:32 PM, Sanjay Subramanian mailto:sanjay.subraman...@wizecommerce.com>> wrote: Hi To clarify what I need is as follow
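
The wrapper idea discussed in this thread might be sketched as below. This is only an illustration under assumptions: the install path in REAL_HIVE, the function names, and the choice to treat only -e and -f as batch mode are all made up here, not taken from the original replies.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper sketch: allow batch use (hive -e / hive -f),
# refuse to start an interactive session from a terminal.
REAL_HIVE="${REAL_HIVE:-/usr/lib/hive/bin/hive}"   # assumed install path

# Return 0 if any argument is -e or -f (batch mode), 1 otherwise.
is_batch_invocation() {
  local arg
  for arg in "$@"; do
    case "$arg" in
      -e|-f) return 0 ;;
    esac
  done
  return 1
}

hive_wrapper() {
  # [ -t 0 ] is true when stdin is attached to a terminal
  if ! is_batch_invocation "$@" && [ -t 0 ]; then
    echo "interactive mode is disabled; use hive -e or hive -f" >&2
    return 1
  fi
  "$REAL_HIVE" "$@"
}
```

Because the check looks for -e or -f rather than just counting arguments, an invocation that only passes a service option and would still open the CLI is rejected when run from a terminal. Piped input (stdin not a tty) is still let through, which may or may not be what you want.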

Re: Hive Web Interface

2013-05-16 Thread Sanjay Subramanian
+1 agreed…beeswax is way better From: "kulkarni.swar...@gmail.com" mailto:kulkarni.swar...@gmail.com>> Reply-To: "user@hive.apache.org" mailto:user@hive.apache.org>> Date: Thursday, May 16, 2013 9:21 AM To: "user@hive.apache.org

Re: Hive Web Interface

2013-05-16 Thread Sanjay Subramanian
1. U will need to set this in the hive-site.xml hive.hwi.war.file /path/to/lib/hive-hwi-0.9.0.war This sets the path to the HWI war file, relative to ${HIVE_HOME}. 2. Get this http://archive.apache.org/dist/hive/hive-0.9.0/hive-0.9.0.tar.gz 3. Tar xvf hive-0.9.0.tar.gz 4. hive-0.9.0/l

Re: Hive Web Interface

2013-05-16 Thread Sanjay Subramanian
; mailto:user@hive.apache.org>> Cc: "hive-u...@hadoop.apache.org<mailto:hive-u...@hadoop.apache.org>" mailto:hive-u...@hadoop.apache.org>> Subject: Re: Hive Web Interface Again, you need to set value to "lib/hive-hwi-0.9.0.war". value = '/path/to/li

Re: Hive Authorization and Views

2013-05-16 Thread Sanjay Subramanian
Also we have all external tables to ensure that accidental dropping of tables does not delete data…Plus the good part of HDFS architecture is data is immutable….which means u cannot update rows….u can move partitions or delete/insert data from hdfs which IMHO is very cool….but may not solve all

need help with an error - script used to work and now it does not :-(

2013-05-16 Thread Sanjay Subramanian
2013-05-16 18:57:21,094 FATAL [IPC Server handler 19 on 40222] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task: attempt_1368666339740_0135_m_000104_1 - exited : java.lang.RuntimeException: Error in configuring object at org.apache.hadoop.util.ReflectionUtils.setJobConf(Reflection

Re: need help with an error - script used to work and now it does not :-(

2013-05-16 Thread Sanjay Subramanian
got a misleading error like this today. What happened was I upgraded to hive 0.10. One of my programs was linked to guava 15 but hive provides guava 09 on the classpath, confusing things. I also had a similar issue with mismatched slf4j and commons-logger. On Thu, May 16, 2013 at 10:34 PM, Sa

Re: need help with an error - script used to work and now it does not :-(

2013-05-16 Thread Sanjay Subramanian
:-( Still facing problems in large datasets Were u able to solve this Edward ? Thanks sanjay From: Sanjay Subramanian mailto:sanjay.subraman...@wizecommerce.com>> Reply-To: "user@hive.apache.org<mailto:user@hive.apache.org>" mailto:user@hive.apache.org>> Date: Thurs

Re: need help with an error - script used to work and now it does not :-(

2013-05-17 Thread Sanjay Subramanian
I am using Hive 0.9.0+155 that is bundled in Cloudera Manager version 4.1.2 Still getting the errors listed below :-( Any clues will be cool !!! Thanks sanjay From: Sanjay Subramanian <sanjay.subraman...@wizecommerce.com> Date: Thursday, May 16, 2013 9:42 PM To:

Re: collect_set question

2013-05-17 Thread Sanjay Subramanian
Hi The Hive Programming Book explains how to implement a "collect" function that returns a list (instead of set) of objects…perhaps u can tweak this to return the N number of items in List Thanks sanjay From: Robert Li mailto:robert...@kontagent.com>> Reply-To: "user@hive.apache.org

Re: Hive on Oracle

2013-05-18 Thread Sanjay Subramanian
Try installing cloudera manager 4.1.2. It has bundled Hadoop, hive and a few other components. I have this version in production. Cloudera has pretty good documentation. This way u don't have to spend time installing versions that work successfully with each other. Sent from my iPhone On May 17, 2

Re: Did any one used Hive on Oracle Metastore

2013-05-18 Thread Sanjay Subramanian
Raj It should be pretty much similar to setting it up in MySQL. Except any syntax differences. Read the cloudera hive installation notes. They have a separate Section for using mysql and oracle. Also one of my favorite $0.02 about the open source software is just dare and try it out...get error

Re: Hive Error

2013-05-20 Thread Sanjay Subramanian
Hi Varun Can u attach the error logs…I don't seem to have the attachment Thanks sanjay From: Kasa V Varun <kasa.va...@mu-sigma.com> Reply-To: "user@hive.apache.org" <user@hive.apache.org> Date: Sunday, May 19, 2013 11:16 PM To: "user@hive.apache.org

Re: Unable to stop Thrift Server

2013-05-20 Thread Sanjay Subramanian
Raj Which version r u using ? I think from 0.9+ onwards it's best to use service to stop and start and NOT hive: sudo service hive-metastore stop; sudo service hive-server stop; sudo service hive-metastore start; sudo service hive-server start. Couple of general things that might help 1. Use linux s

Re: Unable to stop Thrift Server

2013-05-20 Thread Sanjay Subramanian
Not that I know of…..sorry sanjay From: Raj Hadoop mailto:hadoop...@yahoo.com>> Reply-To: Raj Hadoop mailto:hadoop...@yahoo.com>> Date: Monday, May 20, 2013 2:17 PM To: Sanjay Subramanian mailto:sanjay.subraman...@wizecommerce.com>>, "user@hive.apache.org<m

LZO compression implementation in Hive

2013-05-20 Thread Sanjay Subramanian
Hi Programming Hive Book authors Maybe a lot of u have already successfully implemented this but only these last two weeks , we implemented our aggregations using LZO compression in Hive - MR jobs creating LZO files as Input for Hive ---> Thereafter Hive aggregations creating more LZO files as o

Re: hive.metastore.warehouse.dir - Should it point to a physical directory

2013-05-21 Thread Sanjay Subramanian
Notes below From: Raj Hadoop mailto:hadoop...@yahoo.com>> Reply-To: "user@hive.apache.org" mailto:user@hive.apache.org>>, Raj Hadoop mailto:hadoop...@yahoo.com>> Date: Tuesday, May 21, 2013 10:49 AM To: Dean Wampler mailto:deanwamp...@gmail.com>>, "user@hive.apache.

Re: hive.metastore.warehouse.dir - Should it point to a physical directory

2013-05-21 Thread Sanjay Subramanian
013 11:12 AM To: "user@hive.apache.org<mailto:user@hive.apache.org>" mailto:user@hive.apache.org>>, Raj Hadoop mailto:hadoop...@yahoo.com>> Cc: Sanjay Subramanian mailto:sanjay.subraman...@wizecommerce.com>>, User mailto:u...@hadoop.apache.org>> Subject

Re: Where to get Oracle scripts for Hive Metastore

2013-05-21 Thread Sanjay Subramanian
Raj The correct location of the script is where u deflated the hive tar For example /usr/lib/hive/scripts/metastore/upgrade/oracle You will find a file in this directory called hive-schema-0.9.0.oracle.sql Use this sanjay From: Raj Hadoop mailto:hadoop...@yahoo.com>> Reply-To: "u...@hadoop.ap

Re: Where to get Oracle scripts for Hive Metastore

2013-05-21 Thread Sanjay Subramanian
I think it should be this link because this refers to the /branches/branch-0.9 http://svn.apache.org/viewvc/hive/branches/branch-0.9/metastore/scripts/upgrade/oracle/ Can one of the Hive committers please verify…thanks sanjay From: Raj Hadoop mailto:hadoop...@yahoo.com>> Reply-To: "user@hive

Re: ORA-01950: no privileges on tablespace

2013-05-21 Thread Sanjay Subramanian
See the CDH notes here…scroll down to where the Oracle section is http://www.cloudera.com/content/cloudera-content/cloudera-docs/CDH4/latest/CDH4-Installation-Guide/cdh4ig_topic_18_4.html From: Raj Hadoop mailto:hadoop...@yahoo.com>> Reply-To: "user@hive.apache.org"

Re: Hive tmp logs

2013-05-22 Thread Sanjay Subramanian
hive.querylog.location /path/to/hivetmp/dir/on/local/linux/disk hive.exec.scratchdir /data01/workspace/hive scratch/dir/on/local/linux/disk From: Anurag Tangri mailto:tangri.anu...@gmail.com>> Reply-To: "user@hive.apache.org" mailto:user@hive.apache.org>

Re: io.compression.codecs not found

2013-05-23 Thread Sanjay Subramanian
This property needs to be set in core-site.xml. If u r using clouderamanager then ping me I will tell u how to set it there. Out of the box hive works beautifully with gzip and snappy. And if u r using lzo then needs some plumbing. Depends on what ur usecase is I can provide guidance. Regards

Re: Snappy with HIve

2013-05-23 Thread Sanjay Subramanian
ss.output=true; SET mapred.output.compression.codec=org.apache.hadoop.io.compress.SnappyCodec; SET mapred.output.compression.type=BLOCK; Bejoy: It should be fine. If it shows any issues add mapred.output.compress=true as well Regards Bejoy KS Sent from remote device, Please excuse typos F

Re: io.compression.codecs not found

2013-05-23 Thread Sanjay Subramanian
; ReplyTo: user@hive.apache.org<mailto:user@hive.apache.org> Subject: Re: io.compression.codecs not found Hi, I'm not using CM. I have installed CDH 4.2.1 using Linux packages. Thank you, Sachin On Thu, May 23, 2013 at 7:13 PM, Sanjay Subramanian mailto:sanjay.subraman...@wizecommerc

Re: Hive tmp logs

2013-05-23 Thread Sanjay Subramanian
Clarification This property defines a file on HDFS hive.exec.scratchdir /data01/workspace/hive scratch/dir/on/local/linux/disk From: Sanjay Subramanian mailto:sanjay.subraman...@wizecommerce.com>> Date: Wednesday, May 22, 2013 12:23 PM To: "user@hive.apache.org&

hive.log

2013-05-23 Thread Sanjay Subramanian
How do I set the property in hive-site.xml that defines the local linux directory for hive.log ? Thanks sanjay

Re: hive.log

2013-05-23 Thread Sanjay Subramanian
Ok figured it out - vi /etc/hive/conf/hive-log4j.properties - Modify this line #hive.log.dir=/tmp/${user.name} hive.log.dir=/data01/workspace/hive/log/${user.name} From: Sanjay Subramanian mailto:sanjay.subraman...@wizecommerce.com>> Reply-To: "user@hive.apache.org&

Re: how does hive find where is MR job tracker

2013-05-28 Thread Sanjay Subramanian
In Cloudera Manager , there is a Safety Valve feature (its a multiline text widget) that u can use to input the XML properties that u would use for mapred-site.xml Possibly since u changed the JobTracker machine , u have to mod the mapred-site.xml to specify the machine name and port Regards

Re: Update statment on Hive

2013-05-31 Thread Sanjay Subramanian
Hi Hive reads and writes to HDFS…and by definition HDFS is write once and immutable after that. So unlike an RDBMS there is no concept of updating rows. However if u want to delete some records based on a criteria, yesterday there was a smart post about it, basically selecting the inverse and doing

Re: .sql vs. .hql

2013-05-31 Thread Sanjay Subramanian
For Hive u need the other bible by Dean Wampler , Edward Capriolo et al Also if u tell us what use cases u have we could provide help… sanjay On 5/31/13 1:17 PM, "Keith Wiley" wrote: >I'm looking for documentation on how to use .sql and .hql files in Hive >and what the differences are between t

Re: .sql vs. .hql

2013-05-31 Thread Sanjay Subramanian
put files and how .sql and .hql may >differ. > >Cheers! > >On May 31, 2013, at 13:20 , Sanjay Subramanian wrote: > >> For Hive u need the other bible by Dean Wampler , Edward Capriolo et al >> Also if u tell us what use cases u have we could provide help… >> >

Re: .sql vs. .hql

2013-05-31 Thread Sanjay Subramanian
Ok cool On 5/31/13 2:00 PM, "Keith Wiley" wrote: >On May 31, 2013, at 13:52 , Sanjay Subramanian wrote: > >> First u need to setup some files in HDFS and define some Hive tables >>ands >> point to the location and then start your journey ! >> >>

Running hadoop commands through Hive -e and -f option - this is not a question, just some answers :-)

2013-06-07 Thread Sanjay Subramanian
Hi As a part of my oozie flows I need to create "touchz" files to notify status of each stage. I could never get Oozie shell action to work so I wanted to see if I could use Hive to run a hdfs/hadoop command I had run hadoop commands inside hive but wanted to run them using -e and -f option Here

Renaming partition columnname only (locations remain unchanged)

2013-06-11 Thread Sanjay Subramanian
Hi I have external tables where I want to change the name of the partition column, locations remaining constant Is there a way to do this…Else I will drop and create the table with new partition column names and run my scripts to ADD PARTITION, with LOCATION specified Thanks sanjay

Re: Create table like with partitions

2013-06-11 Thread Sanjay Subramanian
>From Russia with Love... Domain ID:D39022749-LRMS Domain Name:IT-EBOOKS.INFO Created On:27-Jul-2011 12:24:45 UTC Last Updated On:29-Apr-2013 11:00:03 UTC Expiration Date:27-Jul-2015 12:24:45 UTC Sponsoring Registrar:GoDaddy.com LLC (R171-LRMS) Status:CLIENT DELETE PROHIBITED Status:CLIENT RENEW

Re: Compression in Hive

2013-06-11 Thread Sanjay Subramanian
1. We use LZO compression in our MR jobs that create LZO files (these are NOT sequence files) that are the feeder files for Hive 2. Then we use Hive data (LZO files) and run aggregation reports Hope this helps Good luck sanjay From: "Ravi Mummulla (BIG DATA)" <rav...@microsoft.com>

Re: Create table like with partitions

2013-06-11 Thread Sanjay Subramanian
or Amazon. (Not only will you have a legal copy, but you'll encourage authors to write more books for the benefit of all.) I've got it in hardcover and digital format. – Lefty On Tue, Jun 11, 2013 at 6:46 PM, Sanjay Subramanian mailto:sanjay.subraman...@wizecommerce.com>>

Re: Renaming partition columnname only (locations remain unchanged)

2013-06-12 Thread Sanjay Subramanian
our new column name in the partition directories on hdfs. It looks much simpler to just create new table and dump the old one via hive though On Wed, Jun 12, 2013 at 3:46 AM, Sanjay Subramanian mailto:sanjay.subraman...@wizecommerce.com>> wrote: Hi I have external tables where I want

Re: Enhancing Query Join to speed up Query

2013-06-12 Thread Sanjay Subramanian
Hi I would actually do it like this…so that the set on the left of JOIN becomes smaller SELECT a.item_id, a.create_dt FROM ( SELECT item_id, create_dt FROM A WHERE item_id = 'I001' AND category_n

Re: hive to hbase mapping

2013-06-14 Thread Sanjay Subramanian
6 months back I was tasked with building a Data platform for logs and I benchmarked Hbase + Hive (queries were 8X slower) Hive only So I decided for Hive option and am deploying that solution to production. Couple of things u can think while u design if u really want to go HBase+Hive (also look

Re: LZO compression implementation in Hive

2013-06-17 Thread Sanjay Subramanian
you quickly give your insights on this topic, if possible? Regards, Ramki. On Mon, May 20, 2013 at 2:51 PM, Sanjay Subramanian <sanjay.subraman...@wizecommerce.com> wrote: Hi Programming Hive Book authors Maybe a lot of u have already successfully implemented this but only these l

Re: hive to hbase mapping

2013-06-17 Thread Sanjay Subramanian
ecause is perfect for aggregating data through the counters, and write performance is great. Now the problem is...Which is the best way for loading periodically (every hour for example) Hbase data in Hive table? Mario 2013/6/14 Sanjay Subramanian mailto:sanjay.subraman...@wizecommerce.com>>

Re: LZO compression implementation in Hive

2013-06-17 Thread Sanjay Subramanian
:-) Not sure how to add a page…may be the Admin needs to grant me permission From: Sanjay Subramanian mailto:sanjay.subraman...@wizecommerce.com>> Reply-To: "user@hive.apache.org<mailto:user@hive.apache.org>" mailto:user@hive.apache.org>> Date: Monday, June

Errors in one Hive script using LZO compression

2013-06-18 Thread Sanjay Subramanian
Hi I am using LZO compression in our scripts but one script is still creating errors Diagnostic Messages for this Task: Error: java.io.IOException: java.io.EOFException: Premature EOF from inputStream at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderCreationExc

Re: Errors in one Hive script using LZO compression

2013-06-18 Thread Sanjay Subramanian
Yes I am going to start debugging from the inner query working my way outwards….starting tomorrow AM… :-) From: Sanjay Subramanian mailto:sanjay.subraman...@wizecommerce.com>> Date: Monday, June 17, 2013 11:59 PM To: "user@hive.apache.org<mailto:user@hive.apache.org&

Re: LZO compression implementation in Hive

2013-06-18 Thread Sanjay Subramanian
ving links. * Many people don't pay attention to the page structure, they just google the topic they're looking for. – Lefty On Tue, Jun 18, 2013 at 2:56 AM, Sanjay Subramanian mailto:sanjay.subraman...@wizecommerce.com>> wrote: :-) Not sure how to add a page…may be the Admin

Re: Errors in one Hive script using LZO compression

2013-06-18 Thread Sanjay Subramanian
From: Sanjay Subramanian mailto:sanjay.subraman...@wizecommerce.com>> Reply-To: "user@hive.apache.org<mailto:user@hive.apache.org>" mailto:user@hive.apache.org>> Date: Monday, June 17, 2013 11:59 PM To: "user@hive.apache.org<mailto:user@hive.apache.org>&quo

Re: "show table" throwing strange error

2013-06-20 Thread Sanjay Subramanian
Can u try from your ubuntu command prompt $> hive -e "show tables" From: Mohammad Tariq mailto:donta...@gmail.com>> Reply-To: "user@hive.apache.org" mailto:user@hive.apache.org>> Date: Thursday, June 20, 2013 4:28 AM To: user mailto:user@hive.apache.org>> Subject: Re:

Re: "show table" throwing strange error

2013-06-21 Thread Sanjay Subramanian
totally messed up. Looks like logs are getting written in some binary encoding. I have attached a snapshot of the same. Any idea? Warm Regards, Tariq cloudfront.blogspot.com<http://cloudfront.blogspot.com> On Fri, Jun 21, 2013 at 1:03 AM, Sanjay Subramanian mailto:sanjay.subraman..

Request perm to edit wiki

2013-07-01 Thread Sanjay Subramanian
e page structure, they just google the topic they're looking for. – Lefty On Tue, Jun 18, 2013 at 2:56 AM, Sanjay Subramanian mailto:sanjay.subraman...@wizecommerce.com>> wrote: :-) Not sure how to add a page…may be the Admin needs to grant me permission From: Sanjay Subramanian

One query works the other does not…any clues ?

2013-07-03 Thread Sanjay Subramanian
THIS FAILS = INSERT OVERWRITE DIRECTORY '/user/beeswax/warehouse/impressions_hive_stats/outpdir_impressions_header/2013-07-01/record_counts' select 'outpdir_impressions_header', '2013-07-01', 'record_counts', 'all_servers', count(*) from outpdir_impressions_header where header_date_part

Re: One query works the other does not…any clues ?

2013-07-03 Thread Sanjay Subramanian
ount(*) FROM outpdir_impressions_header WHERE header_date_partition='2013-07-01' ; From: Sanjay Subramanian mailto:sanjay.subraman...@wizecommerce.com>> Reply-To: "user@hive.apache.org<mailto:user@hive.apache.org>" mailto:user@hive.apache.org>> Date: Wednesday, J

Re: Loading a flat file + one additional field to a Hive table

2013-07-05 Thread Sanjay Subramanian
How about this ? Assume you have a log file called oompaloompa.log TIMESTAMP=$(date +%Y_%m_%d_T%H_%M_%S);mv oompaloompa.log oompaloompa.log.${TIMESTAMP};cat oompaloompa.log.${TIMESTAMP}| hdfs dfs -put - /user/sasubramanian/oompaloompa.log.${TIMESTAMP} This will directly put the file on HDFS and u
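A minimal shell sketch of the rename-and-upload step in the message above, assuming a POSIX shell. The function name stamp_log is hypothetical (it is not from the thread), and the hdfs dfs -put call is left as a comment so the sketch can be tried without a cluster.

```shell
# Rename a log file with a timestamp suffix and print the new name.
# stamp_log is a hypothetical helper name used only for this illustration.
stamp_log() {
    log="$1"
    ts=$(date +%Y_%m_%d_T%H_%M_%S)
    mv "$log" "$log.$ts" && printf '%s\n' "$log.$ts"
}

# Usage on a machine with an HDFS client (destination directory taken from
# the message above; run it from the directory that holds the log file):
#   stamped=$(stamp_log oompaloompa.log)
#   hdfs dfs -put "$stamped" "/user/sasubramanian/$stamped"
```

Passing the renamed file to hdfs dfs -put directly avoids the extra cat process; piping through cat into "-put -" as in the message should behave the same way.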

Re: Loading a flat file + one additional field to a Hive table

2013-07-06 Thread Sanjay Subramanian
m <> Versus wc -l <> I see a few hundred records greater in <>. How should I debug it? Any tips please. From: Sanjay Subramanian mailto:sanjay.subraman...@wizecommerce.com>> To: "user@hive.apache.org<mailto:user@hive.apache.org>

Re: Special characters in web log file causing issues

2013-07-08 Thread Sanjay Subramanian
U may have to remove non-printable chars first, save an intermediate file and then load into Hive tr -cd '[:print:]\r\n\t' Or if u have strings function that will only output printable chars From: Raj Hadoop mailto:hadoop...@yahoo.com>> Reply-To: "user@hive.apache.org
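A small sketch of the clean-up step suggested above, assuming the log is plain ASCII. The function name strip_nonprintable and the file names in the usage comment are hypothetical. Forcing the C locale makes tr work byte by byte, which also means any multi-byte UTF-8 characters would be dropped along with the control characters.

```shell
# Keep printable ASCII plus carriage return, newline and tab; drop every other byte.
# strip_nonprintable is a hypothetical helper name used only for this illustration.
strip_nonprintable() {
    LC_ALL=C tr -cd '[:print:]\r\n\t'
}

# Usage: write an intermediate file, then load that file into Hive.
#   strip_nonprintable < raw_weblog.log > clean_weblog.log
```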

Re: Hive CLI

2013-07-09 Thread Sanjay Subramanian
Hi Rahul Is there a reason why u use Hive CLI ? I have aliases defined that I use, so I never had to use Hive CLI again alias hivescript='hive -e ' alias hivescriptd='hive -hiveconf hive.root.logger=INFO,console -e ' So when I want to run hive commands from Linux I just type hivescript "sele

Re: integration issue about hive and hbase

2013-07-09 Thread Sanjay Subramanian
I am attaching portions from a document I had written last year while investigating HBase and Hive. You may have already crossed that bridge….nevertheless… Please forgive me :-) if some steps seem hacky and not very well explained….I was on a solo mission to build a Hive Data platform from sc

Re: export csv, use ',' as split

2013-07-10 Thread Sanjay Subramanian
Hive does not have an output delimiter specifier yet (not sure if 0.11.x may have it). But for now please try the following: hive -e myquery | sed 's/\t/,/g' >> result.csv Good luck Sanjay From: kentkong_work mailto:kentkong_w...@163.com>> Reply-To: "user@hive.apache.org
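A sketch of the same tab-to-comma conversion using tr instead of sed, since \t inside a sed pattern is not understood by every sed implementation. The function name tsv_to_csv is hypothetical. This is only a delimiter swap: values that themselves contain commas are not quoted, so the result is a valid CSV only when the data has none.

```shell
# Turn tab-separated query output into comma-separated lines.
# tsv_to_csv is a hypothetical helper name used only for this illustration.
tsv_to_csv() {
    tr '\t' ','
}

# Usage (myquery stands for the query string, as in the message above):
#   hive -e "$myquery" | tsv_to_csv >> result.csv
```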

Re: how to let hive support lzo

2013-07-22 Thread Sanjay Subramanian
This works for us SET hive.exec.compress.intermediate=true SET hive.exec.compress.output=true SET mapreduce.output.fileoutputformat.compress.codec=com.hadoop.compression.lzo.LzopCodec SET mapreduce.map.output.compress=true SET mapreduce.map.output.compress.codec=org.apache.hadoop.io.compress.Sna

Calling same UDF multiple times in a SELECT query

2013-07-23 Thread Sanjay Subramanian
Hi We are using version hive-exec-0.9.0-cdh4.1.2 in production. I need to check and use the output from a UDF in a query to assign values to 2 columns in a SELECT query. Example: SELECT a, IF(fooUdf(a) < -1 , -1, fooUdf(a)) as b, IF(fooUdf(a) < -1 , fooUdf(a), 0) as c FROM my_h
