Re: Using SPLIT with DOT(.) delimiter demonstrate funny behavior within a VIEW

2015-08-24 Thread Sanjay Subramanian
ed square brackets) split(reverse(split(reverse(floc),'/')[0]),'[.]')[0]   (need those square brackets)    From: Vivek Veeramani To: user@hive.apache.org; Sanjay Subramanian Sent: Monday, August 24, 2015 1:57 PM Subject: Re: Using SPLIT with DOT(.) delimiter
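
A note on why the square brackets matter: Hive's split() takes a regular expression as its second argument, so a bare '.' matches any character, while '[.]' matches only a literal dot. The sketch below is a plain-shell analogue of what the expression above computes (last path component, then everything before the first dot). The helper name `file_stem` and the sample path are assumptions made for illustration; they are not from the thread.

```shell
# Illustrative only: shell analogue of
#   split(reverse(split(reverse(floc),'/')[0]),'[.]')[0]
# file_stem is a hypothetical helper name, not a Hive function.
file_stem() {
  path="$1"
  name="${path##*/}"            # keep everything after the last '/'
  printf '%s\n' "${name%%.*}"   # keep everything before the first '.'
}

# In a regex, a bare '.' matches any character; '[.]' matches a literal dot.
any_char=$(printf 'a.b\n' | sed 's/./X/g')
literal_dot=$(printf 'a.b\n' | sed 's/[.]/X/g')

file_stem '/share/resumes/jane_doe.docx'      # hypothetical path
printf '%s %s\n' "$any_char" "$literal_dot"
```

With the unbracketed pattern every character acts as a delimiter, which is likely the source of the "funny behavior" in the subject line.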

Using SPLIT with DOT(.) delimiter demonstrate funny behavior within a VIEW

2015-08-24 Thread Sanjay Subramanian
Hi guys, I am using Hive version 0.13.1-cdh5.3.3. HIVE TABLE = qnap_resume_file_location: DROP TABLE IF EXISTS qnap_resume_file_location; CREATE EXTERNAL TABLE qnap_resume_file_location (floc STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'

Pointing SparkSQL to existing Hive Metadata with data file locations in HDFS

2015-05-27 Thread Sanjay Subramanian
hey guys On the Hive/Hadoop ecosystem we have, using Cloudera distribution CDH 5.2.x, there are about 300+ hive tables. The data is stored as text (moving slowly to Parquet) on HDFS. I want to use SparkSQL and point to the Hive metadata and be able to define JOINS etc using a programming structure

Using Hive as a file comparison and grep-ping tool

2015-04-20 Thread Sanjay Subramanian
hey guys As data wranglers and programmers we often need quick tools. One such tool I need almost everyday is one that greps a file based on contents of another file. One can write this in perl, python but since I am already using hadoop ecosystem extensively, I said why not do this in Hive ?  P

Re: [ANN] Hivemall v0.3 is now available

2015-02-08 Thread Sanjay Subramanian
awesome thank u. really value your ML contributions. regards sanjay From: Makoto Yui To: user@hive.apache.org Sent: Friday, February 6, 2015 3:31 AM Subject: [ANN] Hivemall v0.3 is now available Hello all, We are excited to announce that a new stable version of Hivemall (v0.3.0) is n

Re: Hive JSON Serde question

2015-01-25 Thread Sanjay Subramanian
sure will try get_json_object thank u regards sanjay From: 丁桂涛(桂花) To: user@hive.apache.org; Sanjay Subramanian Sent: Sunday, January 25, 2015 4:45 PM Subject: Re: Hive JSON Serde question Try get_json_object UDF. No iterations need. :) On Mon, Jan 26, 2015 at 12:25 AM, Sanjay

Re: Hive JSON Serde question

2015-01-25 Thread Sanjay Subramanian
Thanks Ed. Let me try a few more iterations. Somehow I am not doing this correctly :-)  regards sanjay From: Edward Capriolo To: "user@hive.apache.org" ; Sanjay Subramanian Sent: Sunday, January 25, 2015 8:11 AM Subject: Re: Hive JSON Serde question Nested lists requ

Hive JSON Serde question

2015-01-25 Thread Sanjay Subramanian
hey guys  This is the Hive table definition I have created based on the JSON I am using this version of hive json serde https://github.com/rcongiu/Hive-JSON-Serde ADD JAR /home/sanjay/mycode/jar/jsonserde/json-serde-1.3.1-SNAPSHOT-jar-with-dependencies.jar;DROP TABLE IF EXISTS  datafeed_json;C

Re: Writing Hive Query Output to local system

2015-01-16 Thread Sanjay Subramanian
To: user@hive.apache.org; Sanjay Subramanian Sent: Friday, January 16, 2015 11:27 AM Subject: Re: Writing Hive Query Output to local system In your hive-site.xml remove the block corresponding to the parameter hive.metastore.local That's it. Is a deprecated parameter and is not necessa

Writing Hive Query Output to local system

2015-01-16 Thread Sanjay Subramanian
hey guys I recall this did not happen in the days of 0.9.x version But I use 0.13.x now and when I run a hive query  hive -e "select * from tablename" > ./myfile.txt The first line in myfile.txt is as follows 2015-01-16 10:48:13,091 WARN  [main] conf.HiveConf (HiveConf.java:initialize(1491)) - DEP

Using IF in the JOIN clause

2015-01-08 Thread Sanjay Subramanian
hey guys This is a portion of a long query we wrote. Can u advise if the bold portion will work ? and if(f.fr_name is not null, f.fr_name, e.fr_name)=d.fr_name and if(f.pos_bin is not null, f.pos_bin, e.pos_bin)=d.pos_bin thanks regards sanjay PART OF A LARGER QUERY==

Re: How to convert RDBMS DDL to Hive DDL ?

2015-01-08 Thread Sanjay Subramanian
@Krish are u looking for an automated tool that takes RDBMS DDL as input and outputs Hive DDL ? I exported the DDLS of all tables with col sequence numbers. I wrote code that converted all DB2 tables we have to Hive. Not sure if there is a standard tool. regards sanjay From: Lefty Leverenz To

Re: Optimize hive external tables with serde

2014-10-22 Thread Sanjay Subramanian
WHERE attribute_X1='1' AND attribute_X2='1' ) at ON jt.customerId = at.customerId From: ptrst To: user@hive.apache.org; Sanjay Subramanian Sent: Wednesday, October 22, 2014 1:02 AM Subject: Re: Optimize hive external tables with serde ad

Re: It's extremely slow when hive reads compression files

2014-10-22 Thread Sanjay Subramanian
It could be the serde that is slow and not the compression ? If your input XML is in multiline records then u may wanna write a bit of RecordReader code to process the multiline XML yourself, just to see if it makes any changes to the processing speed ? https://github.com/sanjaysubramanian/big_da

Re: Migration of metastore tables from mysql to oracle.

2014-10-21 Thread Sanjay Subramanian
First question, Why are u migrating to Oracle ? Since u never store data on Hive Metastore MYSQL is a great choice.  I have done a MYSQL to MYSQL transfer From the source DB mysql dump, it should be possible  to mod any Oracle required syntax right ? From: hadoop hive To: user@hive.apa

Re: select * from table and select column from table in hive

2014-10-21 Thread Sanjay Subramanian
One way to debug is to put bash in action. Say you have a data file in hdfs (/data/rockers/rockers.csv) that looks like: cust_num,cust_name,instrument 1,paul,bass 2,john,rhythm 3,ringo,drums 4,george,lead. To get the column=cust_num of data (in this case its column 1) hdfs dfs -cat /data/rockers/rockers.
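
The command in this message is cut off in the archive, so the filter the author actually piped into is unknown. As a minimal sketch of the same debugging idea, assuming `cut` as the column extractor and using inline sample rows in place of `hdfs dfs -cat /data/rockers/rockers.csv`, the helper names `sample_rockers` and `csv_column` are hypothetical:

```shell
# Stand-in for: hdfs dfs -cat /data/rockers/rockers.csv
sample_rockers() {
  printf '%s\n' \
    'cust_num,cust_name,instrument' \
    '1,paul,bass' \
    '2,john,rhythm' \
    '3,ringo,drums' \
    '4,george,lead'
}

# Print field N of comma-separated stdin, skipping the header line.
# (Assumed approach; the original message may have used a different tool.)
csv_column() {
  tail -n +2 | cut -d',' -f"$1"
}

sample_rockers | csv_column 1
```

Comparing this raw column against what `select cust_num from table` returns is one way to tell whether a discrepancy comes from the data file or from the table definition.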

Re: Optimize hive external tables with serde

2014-10-21 Thread Sanjay Subramanian
1. The gzip files are not splittable, so gzip itself will make the queries slower. 2. As a reference for JSON serdes , here is a example from my blog http://bigdatalatte.wordpress.com/2014/08/21/denormalizing-json-arrays-in-hive/ 3. Need to see your query first to try and optimize it 4. Even if y

Re: Weird Error on Inserting in Table [ORC, MESOS, HIVE]

2014-10-07 Thread Sanjay Subramanian
hi  I faced a similar situation in my dev cluster CDH distribution 5.1.3 See the thread details with log files   https://groups.google.com/a/cloudera.org/forum/#!mydiscussions/scm-users/MpcpHj5mWT8 thanks sanjay From: John Omernik To: user@hive.apache.org Sent: Tuesday, September 9, 2014

Re: Storing result of a query in a variable

2014-06-02 Thread Sanjay Subramanian
Add  -e option ./hive -e --hiveconf MY_VAR =`cat /tmp/result/00_0`; But u could use the following hdfs command as well MY_VAR=$(hdfs dfs -cat /tmp/result/00_0) thanks sanjay From: Chhaya Vishwakarma To: "user@hive.apache.org" Sent: Monday, June 2
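
A minimal sketch of the command-substitution approach mentioned above, using a local temp file in place of the HDFS result file so it runs without a cluster. The value `42`, the table name `my_table`, and the column `id` are assumptions for illustration only:

```shell
# Stand-in for the query result file; on a cluster this would be read with
#   hdfs dfs -cat /tmp/result/00_0
result_file=$(mktemp)
printf '42\n' > "$result_file"

# Shell assignment allows no spaces around '='; $(...) drops the trailing newline.
MY_VAR=$(cat "$result_file")
rm -f "$result_file"

printf 'MY_VAR=%s\n' "$MY_VAR"

# Assumed follow-up (not executed here): hand the value to Hive and read it
# back through hiveconf variable substitution, for example
#   hive --hiveconf MY_VAR="$MY_VAR" -e 'SELECT * FROM my_table WHERE id = ${hiveconf:MY_VAR}'
# my_table and id are hypothetical names.
```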

Hadoop summit San Jose 5/3/14 - 5/5/14

2014-06-02 Thread Sanjay Subramanian
hi guys  I am going to attend the 3 day hadoop summit in San Jose tomorrow. Looking fwd to the Hive sessions. Hope to see many of u there. regards sanjay

Re: problem with delimiters (control A)

2014-05-29 Thread Sanjay Subramanian
Hi Jack  Since u already have your data with columns separated by CtrlA then u need to define the HIVE table as follows  (by default Hive will assume CtrlA as column delimiter) create table if not exists my_test(     userid  BIGINT,     movieId BIGINT,     comment STRING

Re: Finding Max of a column without using any Aggregation functions

2014-04-23 Thread Sanjay Subramanian
Thanks For the sake of this question I wanted to avoid all order by and limit syntax 😄. It's more of a challenge question Regards Sanjay Sent from my iPhone > On Apr 23, 2014, at 2:51 AM, Furcy Pin wrote: > > Hi, > > note that if your table contains the max value several time, all the > o

Re: create table question

2014-04-22 Thread Sanjay Subramanian
For example if ur name node was hadoop_name_nodeIP:8020 (verify this thru your browser http://hadoop_name_nodeIP:50070) Modified Create Table == CREATE EXTERNAL TABLE states(abbreviation string, full_name string) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' LOCATION 'hdfs://hp83

Re: Query hangs at 99.97 % for one reducer in Hive

2014-03-02 Thread Sanjay Subramanian
Even 500 reducers sounds a high number but I don't know the details of your cluster. Can u provide some details How many nodes in cluster Hive version Which distribution (Hortonworks, Apache, CDH, Amazon) Node specs Partitions in the table Number of records. Thanks Sanjay Sent from my iPhone

Re: Amazon EMR error

2014-03-01 Thread Sanjay Subramanian
ok so  I spun up another cluster with a previous version and it worked successfully  This Amazon Hive version WORKS SUCCESSFULLY  = AMI version:2.4.1 Hadoop distribution:Amazon 1.0.3 Applications:Hive 0.11.0.1 From: Sanjay

Amazon EMR error

2014-03-01 Thread Sanjay Subramanian
Sorry guys , not sure if I should request help with this error here because   its an error on Amazon EMR Hive But you guys have been my Hive fraternity for about 2 years now  and I thought it best to turn to u for help first  Amazon Hive version  === AMI version:2.4.2 Hadoop distribut

Re: How to prevent user drop table in Hive metadata?

2013-11-30 Thread Sanjay Subramanian
Cloudera Sentry is awesome and I have implemented this in Cloudera manager 4.7.2 CDH 4.4.0. Thanks again to shreepadma for all answers to my questions on the CDH users group. I can provide guidance on Sentry configs if needed. Sent from my iPhone > On Nov 22, 2013, at 4:25 PM, Shreepadma Venug

Re: In Beeline what is the syntax for ALTER TABLE ?

2013-10-21 Thread Sanjay Subramanian
e as hive cli On Mon, Oct 21, 2013 at 11:41 AM, Sanjay Subramanian mailto:sanjay.subraman...@wizecommerce.com>> wrote: Hi guys Using Hive0.10.0+198 CDH4 Getting this error for ALTER table command jdbc:hive2://dev-thdp5.corp.nextag.com:100<http://dev-thdp5.corp.nextag.com

In Beeline what is the syntax for ALTER TABLE ?

2013-10-21 Thread Sanjay Subramanian
Hi guys Using Hive0.10.0+198 CDH4 Getting this error for ALTER table command jdbc:hive2://dev-thdp5.corp.nextag.com:100> ALTER TABLE outpdir_seller_hidden ADD IF NOT EXISTS PARTITION (header_date_partition='2013-10-17', header_servername_partition='lu3') LOCATION '/data/output/impres

Re: Execution failed with exit status: 3

2013-10-08 Thread Sanjay Subramanian
ct: RE: Execution failed with exit status: 3 Hi Sanjay, thanks for the suggestion. There are no partitions on either table. From: Sanjay Subramanian [mailto:sanjay.subraman...@wizecommerce.com] Sent: Monday, October 07, 2013 8:19 PM To: user@hive.apache.org<mailto:user@hive.apache.org> Subjec

Re: JSON format files versus AVRO

2013-10-08 Thread Sanjay Subramanian
ormat you are suggesting directly, but if you made the unique I'd part of the json object, so that each line was a json record, it would. It's made to be used in conjunction with text tables. Also, even if it proves to not be what you want directly, it already provides a serializ

Re: Execution failed with exit status: 3

2013-10-07 Thread Sanjay Subramanian
Hi Nick How many partitions are there in table t1 and table t2 If there are many partitions in either t1 or t2 or both can u mod your query as follows and see if the error comes up SELECT T1.somecolumn, T2.someothercolumn FROM (SELECT * FROM t1 WHERE partition_column1='') T1 JOIN

JSON format files versus AVRO

2013-10-07 Thread Sanjay Subramanian
Sorry if the subject sounds really stupid ! Basically I am re-architecting our web log record format Currently we have "Multiple lines = 1 Record " format (I have Hadoop jobs that parse the files and create columnar output for Hive tables) [begin_unique_id] Pipe delimited Blah..

Re: Is there any API that tells me what files comprise a hive table?

2013-10-07 Thread Sanjay Subramanian
Perhaps a good thing to have in your Hive cheat sheet :-) ' I use the following mySQL query to find out the locations of the Hive table echo "select t.TBL_NAME, p.PART_NAME, s.LOCATION from PARTITIONS p, SDS s, TBLS t where t.TBL_ID=p.TBL_ID and p.SD_ID=s.SD_ID "| mysql -u -p -A | grep "" Th

Hiveserver2 Authentication (openLDAP) and Authorization (using Sentry)

2013-09-17 Thread Sanjay Subramanian
Hi guys DISCLAIMER == I have no affiliations to Cloudera and I am writing this mail of my own free will, with the hope to help fellow Hive users who will be

Re: Issue while quering Hive

2013-09-16 Thread Sanjay Subramanian
With regards to splitting and compression there are 2 options really as of now If u r using Sequence Files , then Snappy If u r using TXT files then LZO is great (u have to cross a few minor hoops to get LZO to work and I can provide guidance on that) Please don't use GZ (not splittable) / or wor

Re: Inner Map key and value separators

2013-09-13 Thread Sanjay Subramanian
e.org<mailto:user@hive.apache.org>" mailto:user@hive.apache.org>> Cc: "user@hive.apache.org<mailto:user@hive.apache.org>" mailto:user@hive.apache.org>> Subject: Re: Inner Map key and value separators Unfortunately, I believe there's no way to do this.

Re: question about partition table in hive

2013-09-13 Thread Sanjay Subramanian
A couple of days back, Erik Sammer at the Hadoop Hands On Lab at the Cloudera Sessions demonstrated how to achieve dynamic partitioning using Flume and created those partitioned directories on HDFS which are then readily usable by Hive Understanding what I can from the two lines of your mail be

Inner Map key and value separators

2013-09-13 Thread Sanjay Subramanian
Hi guys I have to load data into the following data type in hive map > Is there a way to define custom SEPARATORS (while creating the table) for - Inner map collection item - Inner map key delimiters for 2nd-level maps are \004 and \005 per this http://mail-archives.apache.org/mod_mbox/hadoop-

Re: Interesting claims that seem untrue

2013-09-12 Thread Sanjay Subramanian
I have not read the full blogs but in the year 2013 , IMHO , LOC is a very old metric that does not define good software any more... From: Edward Capriolo <edlinuxg...@gmail.com> Reply-To: "user@hive.apache.org" <user@hive.apache.org> Date: Thursday, September

Re: Sentry Meetup

2013-09-04 Thread Sanjay Subramanian
;user@hive.apache.org<mailto:user@hive.apache.org>" mailto:user@hive.apache.org>> Subject: Re: Sentry Meetup Hi Sanjay, The problems you are encountering with Sentry are due to misconfiguration. These are not bugs. We have responded to you questions on cdh-user@. Shreepadma

Re: how to config the job id in oozie

2013-09-04 Thread Sanjay Subramanian
ser@hive.apache.org>> Subject: Re: how to config the job id in oozie You can also use shell action to generate timestamp in format you want and pass to the next action as parameter. I do agree it should be easier. Artem Ervits Data Analyst New York Presbyterian Hospital From: Sanjay S

Re: Sentry Meetup

2013-09-04 Thread Sanjay Subramanian
Ahh…..why not SFO :-) I am struggling to implement Sentry and would love some inputs Shreepadma, thanks for all your clarifications but I am only partially done with my implementation I have a workaround whereby I can blank out tables from "default" db of hive to all hiveserver2 JDBC users…

Re: how to config the job id in oozie

2013-09-03 Thread Sanjay Subramanian
Hi See here http://oozie.apache.org/docs/3.2.0-incubating/WorkflowFunctionalSpec.html#a4.2.1_Basic_EL_Constants String timestamp() It returns the UTC current date and time in W3C format down to the second (YYYY-MM-DDThh:mm:ss.sZ). I.e.: 1997-07-16T19:20:30.45Z I don’t like the fact that that

Re: Hive Query - Issue

2013-09-03 Thread Sanjay Subramanian
Hi When you do a SELECT * , the partition columns are returned as last N columns (if u have N partitions) In this case the 63rd column in SELECT * is the partition column Instead of SELECT * Do a SELECT col1, col2, col3, ….. Not to show the candle to t

Re: Hive Statistics information

2013-09-03 Thread Sanjay Subramanian
_tempstatsstore<http://v-so1.nextagqa.com/hive_vso1_tempstatsstore?&user=hive_user_vso1&password=hive_user_vso1> exists in your MySQL? Regards Ravi Magham On Sat, Aug 31, 2013 at 6:15 AM, Sanjay Subramanian mailto:sanjay.subraman...@wizecommerce.com>> wrote: Hi guys I have configu

Hive Statistics information

2013-08-30 Thread Sanjay Subramanian
Hi guys I have configured Hive to use MySQL for all statistics hive.stats.atomic=false hive.stats.autogather=true hive.stats.collect.rawdatasize=true hive.stats.dbclass=jdbc:mysql hive.stats.dbconnectionstring=jdbc:mysql://v-so1.nextagqa.com/hive_vso1_tempstatsstore?&user=hive_user_vso1&password=

SAS-->Hive integration

2013-08-28 Thread Sanjay Subramanian
Hi guys Anyone tried SAS-->Hive integration successfully ? I tried a simple query in SAS (select col1 from table1 limit 10) and it opened 3 connections to hive-server and killed it !!! :-( I will setup a dev environment for SAS and Hive to test all this But I was wondering if you guys had any

Re: hiveserver2 with OpenLDAP ?

2013-08-24 Thread Sanjay Subramanian
lass (and make sure it's on classpath, of course). I prefer second way. Hope it should help. Let me know it it worked for you. *General question to folks* - am I missing something or there's really a bug in LDAP authenticator, which doesn't allow precise configuration of bind

Re: hiveserver2 with OpenLDAP ?

2013-08-24 Thread Sanjay Subramanian
there's really a bug in LDAP authenticator, which doesn't allow precise configuration of binding string? Mikhail 2013/8/23 Sanjay Subramanian mailto:sanjay.subraman...@wizecommerce.com>> Thanks a lot Mikhail for getting back. That means I cannot use this using beeline unless

Re: hiveserver2 with OpenLDAP ?

2013-08-23 Thread Sanjay Subramanian
tring as uid=user1,dc=wizetest,dc=com. But most likely, your open ldap expects it to be rather cn=user1,dc=wizetest,dc=com, uid attribute isn't being used. I think the way to go is to provide you own LDAP authenticator, which has more control on how to generate LDAP bind string. Mikhail

hiveserver2 with OpenLDAP ?

2013-08-23 Thread Sanjay Subramanian
Hi guys I tested hiveserver2 with Active directory - It works With Open LDAP it does not Is there any specific syntax for specifying the LDAP url or baseDN ? hive.server2.authentication.ldap.url ldap://myserver.corp.nextag.com:389 hive.server2.authentication.ldap.baseDN dc=wizetest,dc

Re: How to perform arithmetic operations in hive

2013-08-22 Thread Sanjay Subramanian
Yes this will work Also arithmetic operations will work in a WHERE clause Example select channel_id from keyword_impressions_log where header_date_partition='2013-08-21' and channel_id*10=290640 limit 10 From: Justin Workman mailto:justinjwork...@gmail.com>> Reply-To: "user@hive.apache.org

Re: Alter or Query a table with field name 'date' always get error

2013-08-22 Thread Sanjay Subramanian
Yes "date" is a reserved word. My recommendations If your table is Hive managed (I.e u created the table without using EXTERNAL ) === - Then copy the data for this hive table that is on HDFS to another location - Drop the table - CREATE a EXTERNAL TABLE with filename - replace

Re: single output file per partition?

2013-08-21 Thread Sanjay Subramanian
Hi I tried file crusher with LZO but it does not work….I have LZO correctly configured in production and my jobs are running daily using LZO compression. I like Crusher so I will see why its not working…Thanks to Edward the code is there to tweak :-) and test locally sanjay From: Stephen S

Re: only one mapper

2013-08-21 Thread Sanjay Subramanian
Hi Try this setting in your hive query SET mapreduce.input.fileinputformat.split.maxsize=; If u set this value "low" then the MR job will use this size to split the input LZO files and u will get multiple mappers (and make sure the input LZO files are indexed I.e. .LZO.INDEX files are created)

Re: using hive with multiple schemas

2013-08-21 Thread Sanjay Subramanian
Some ideas to get u started CREATE EXTERNAL TABLE IF NOT EXISTS names(fullname STRING,address STRING,phone STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' CREATE EXTERNAL TABLE IF NOT EXISTS names_detail(id BIGINT, fullname STRING,address STRING,gender STRING, phone STRING) ROW FORMAT DE

Re: Last time request for cwiki update privileges

2013-08-20 Thread Sanjay Subramanian
ee to make changes to make Hive even better! Thanks, Ashutosh On Tue, Aug 20, 2013 at 2:39 PM, Sanjay Subramanian mailto:sanjay.subraman...@wizecommerce.com>> wrote: Hey guys I can only think of two reasons for my request is not yet accepted 1. The admins don't want to give me acces

Last time request for cwiki update privileges

2013-08-20 Thread Sanjay Subramanian
es me. Meanwhile to show my thankfulness to the Hive community I shall continue to answer questions .There will be no change in that behavior Regards sanjay From: Sanjay Subramanian mailto:sanjay.subraman...@wizecommerce.com>> Date: Wednesday, August 14, 2013 3:52 PM To: "user

Hive Authorization (ROLES AND PRIVILEGES) does not work with hiveserver2 ?

2013-08-19 Thread Sanjay Subramanian
rincipalType ROLE privilege select grantTime Mon Aug 19 12:24:08 PDT 2013 grantor hive Time taken: 1.76 seconds From: Sanjay Subramanian mailto:sanjay.subraman...@wizecommerce.com>> Reply-To: "user@hive.apache.org<mailto:user@hive.apache.org>" mailto:user@hive.apache.org>

Re: question about hive SQL

2013-08-19 Thread Sanjay Subramanian
Here is my stab at it. I have not tested it but this should get you started Following points are important 1. I added a WHERE clause in the sub query to limit the data set by any partition u may have 2. You have to write a collect UDF to use it. Wampler/Capriolo's book in Chapter 13.Functions - r

Hive Authorization clarification

2013-08-16 Thread Sanjay Subramanian
Hi guys I am not getting the expected result from my authorization settings I am evaluating Hive0.10.0+121 mysql> select * from hive.ROLES; +-+-++---+ | ROLE_ID | CREATE_TIME | OWNER_NAME | ROLE_NAME | +-+-++---+ |

Re: SHOW ALL ROLES

2013-08-16 Thread Sanjay Subramanian
Ok never mind , this will work just fine for me mysql> select ROLE_NAME from ROLES; From: Sanjay Subramanian mailto:sanjay.subraman...@wizecommerce.com>> Reply-To: "user@hive.apache.org<mailto:user@hive.apache.org>" mailto:user@hive.apache.org>> Date: Friday, Au

SHOW ALL ROLES

2013-08-16 Thread Sanjay Subramanian
Hi How do I display all ROLES defined in hive thru CLI ? Thanks sanjay CONFIDENTIALITY NOTICE == This email message and any attachments are for the exclusive use of the intended recipient(s) and may contain confidential and privileged information. Any unauthorized review, use

Re: FAILED: Error in metadata: MetaException

2013-08-16 Thread Sanjay Subramanian
Hi Ankit Do u have a directory on HDFS /user/hive/warehouse And its permission should be 1777 sanjay From: Ankit Bhatnagar mailto:ank...@yahoo-inc.com>> Reply-To: "user@hive.apache.org" mailto:user@hive.apache.org>> Date: Friday, August 16, 2013 12:02 PM To: "user@h

Re: Hive cli Vs beeline cli

2013-08-16 Thread Sanjay Subramanian
Some notes from my experience * Beeline u have the benefit of being able to use LDAP/Kerberos authentication * I am not sure how to use -e and -f option with Beeline which is very strong with hive CLI * Beeline at the present version may not fully integrate with Oozie, so if u are u

Re: Review Request (wikidoc): LZO Compression in Hive

2013-08-14 Thread Sanjay Subramanian
Once again, I am down on my knees humbly calling upon the Hive Jedi Masters to please provide this padawan with cwiki update privileges May the Force be with u Thanks sanjay From: Sanjay Subramanian <sanjay.subraman...@wizecommerce.com> Reply-To: "user@hive.apache.org&

Re: Hive and Lzo Compression

2013-08-14 Thread Sanjay Subramanian
a table def without mentioning a stored as clause then you load data into table from a compressed a file then do a select query and it still works but how did it figured out which compression codec to use? Am I stating it correctly ? On Wed, Aug 14, 2013 at 11:11 PM, Sanjay Subramanian mailto:

Re: Strange error in Hive - Insert INTO

2013-08-14 Thread Sanjay Subramanian
Another reason I can think of is possibly some STRING column in your table has a "DELIMITER" character…Like once in production I had tab spaces in the string and my table was also defined using TAB as delimiter From: Stephen Sprague mailto:sprag...@gmail.com>> Reply-To: "user@hive.apache.org

Re: Hive and Lzo Compression

2013-08-14 Thread Sanjay Subramanian
rience is that the SELECT query still works - even when I do not specify the STORED AS clause... that puzzles me a bit. Von: Sanjay Subramanian mailto:sanjay.subraman...@wizecommerce.com>> An: "user@hive.apache.org<mailto:user@hive.apache.o

Hiveserver2 Beeline command clarification

2013-08-13 Thread Sanjay Subramanian
Hi guys I just hooked up hiveserver2 to ldap. In beeline I realized you can login like the following (don't need to define "org.apache.hive.jdbc.HiveDriver") beeline> !connect jdbc:hive2://dev-thdp5:1 sanjay.subraman...@wizecommerce.com scan complete in 2ms Connecting to jdbc:hive2://dev-th

Re: LZO output compression

2013-08-13 Thread Sanjay Subramanian
Check this class where these are defined http://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.1/src/mapred/org/apache/hadoop/mapreduce/lib/output/FileOutputFormat.java From: w00t w00t mailto:w00...@yahoo.de>> Reply-To: "user@hive.apache.org" mailto:user@hiv

Re: Hive and Lzo Compression

2013-08-13 Thread Sanjay Subramanian
guage manual): https://cwiki.apache.org/confluence/display/Hive/LanguageManual+LZO. On Thu, Aug 8, 2013 at 3:30 PM, Sanjay Subramanian mailto:sanjay.subraman...@wizecommerce.com>> wrote: Please refer this documentation here Let me know if u need more clarifications so that we can make t

Does hiveserver2 support -e and -f options

2013-08-12 Thread Sanjay Subramanian

"hive -h " option bypasses ROLES and access permissions ?

2013-08-12 Thread Sanjay Subramanian
Hi Hive version 0.9.0 (hive-common-0.9.0-cdh4.1.2.jar) hive.security.authorization.enabled true enable or disable the hive client authorization Linux User = hiveuser1 (no hive permissions) CASE 1 hive -e "select * from outpdir_ptitle_explanation_parsed limit 10" Authorization failed:

Re: Hive and Lzo Compression

2013-08-10 Thread Sanjay Subramanian
che.org/confluence/display/Hive/LanguageManual+LZO. On Thu, Aug 8, 2013 at 3:30 PM, Sanjay Subramanian mailto:sanjay.subraman...@wizecommerce.com>> wrote: Please refer this documentation here Let me know if u need more clarifications so that we can make this document better and complete Thank

Hive query hive_server_host A and write results to hadoop cluster B

2013-08-08 Thread Sanjay Subramanian
Hi guys Perhaps u know this already but very useful. This directly creates a file based on the output of this query to name_node_host_2 HDFS cluster Regards Sanjay hive -h hive_server_host1 -e ""| hdfs dfs -put - hdfs://:/path/to/ur/dir/your_file_name

Re: Hive and Lzo Compression

2013-08-08 Thread Sanjay Subramanian
Please refer this documentation here Let me know if u need more clarifications so that we can make this document better and complete Thanks sanjay From: w00t w00t mailto:w00...@yahoo.de>> Reply-To: "user@hive.apache.org" mailto:user@hive.apache.org>>, w00t w00t ma

Re: Hive UDAF extending UDAF class: iterate or evaluate method

2013-08-07 Thread Sanjay Subramanian
m>> wrote: Sounds like the wikidoc needs some work. I'm open to suggestions. If Sanjay's simple UDF helps, I could put it in the wiki along with any advice you think would help. Does anyone else have use cases to contribute? -- Lefty On Mon, Aug 5, 2013 at 2:45 PM, Sanjay Su

Re: Hive Query Issue

2013-08-07 Thread Sanjay Subramanian
Hi Some quick checks Please don't mind if my questions sound trivial 1. Is your hdfs cluster or pseudo-distributed node up and running Can u see the HDFS at http://host:50070 ? 2. Is your mrV1 or Yarn (mrV2) up and running 2a)mrV1 http://host:50030 2b) YARN http://host:8088 3. Is

Re: Hive Thrift Service - Not Running Continously

2013-08-06 Thread Sanjay Subramanian
At least that would ensure asynch running and would not die if ur session dies Another way I would propose also is to have a screen session dedicated to hive_server and start it in synch mode Create a screen session == screen -S my_awesome_hive_server ## gets u into the screen se

Re: Hive UDAF extending UDAF class: iterate or evaluate method

2013-08-05 Thread Sanjay Subramanian
Hi Ritesh To help u get started , I am writing a simple HelloWorld-ish UDF that might help…If it doesn't please ask for more clarifications... Good Luck Thanks sanjay ToUpperCase.java package com.sanjaysubramani

Re: Hive Thrift Service - Not Running Continously

2013-08-05 Thread Sanjay Subramanian
Can u see the logs why the service is dying ? Will have some clues there…sometimes the hive log directory can get full and that might kill it. I have seen that happening in our install here From: Raj Hadoop mailto:hadoop...@yahoo.com>> Reply-To: "user@hive.apache.org

Is there a way to disable -h option ?

2013-08-02 Thread Sanjay Subramanian
Thanks Sanjay From: Sanjay Subramanian mailto:sanjay.subraman...@wizecommerce.com>> Reply-To: "user@hive.apache.org<mailto:user@hive.apache.org>" mailto:user@hive.apache.org>> Date: Thursday, August 1, 2013 6:37 PM To: "user@hive.apache.org<mailto:user@hive.apach

Hive Authorization is bypassed with -h option

2013-08-01 Thread Sanjay Subramanian
Hi Hive version 0.9.0 (hive-common-0.9.0-cdh4.1.2.jar) hive.security.authorization.enabled true enable or disable the hive client authorization Linux User = hiveuser1 (no hive permissions) CASE 1 hive -e "select * from outpdir_ptitle_explanation_parsed limit 10" Authorization failed:

Nice hive notes and cheat sheets from Minwoo Kim

2013-08-01 Thread Sanjay Subramanian
http://julingks.tistory.com/category/Hive Thanks sanjay

Re: Review Request (wikidoc): LZO Compression in Hive

2013-07-31 Thread Sanjay Subramanian
Hi guys Any chance I could get cwiki update privileges today ? Thanks sanjay From: Sanjay Subramanian mailto:sanjay.subraman...@wizecommerce.com>> Date: Tuesday, July 30, 2013 4:26 PM To: "user@hive.apache.org<mailto:user@hive.apache.org>" mailto:user@hi

Review Request (wikidoc): LZO Compression in Hive

2013-07-30 Thread Sanjay Subramanian
Hi Met with Lefty this afternoon and she was kind to spend time to add my documentation to the site - since I still don't have editing privileges :-) Please review the new wikidoc about LZO compression in the Hive language manual. If anything is unclear or needs more information, you can email

Possible release date for Hive 0.12.0 ?

2013-07-29 Thread Sanjay Subramanian
Hi guys When is stable Hive 0.12.0 expected I have a use case that needs this fixed and looks like its fixed in 0.12.0 https://issues.apache.org/jira/browse/HIVE-3603 Sanjay

Re: Merging different HDFS file for HIVE

2013-07-26 Thread Sanjay Subramanian
Hi I am using Oozie Coordinators to schedule and run daily Oozie Workflows that contain 35-40 actions each (I use shell, java , hive and map reduce oozie actions) So if anyone needs help and has questions please fire away… sanjay From: Sanjay Subramanian mailto:sanjay.subraman

Re: Merging different HDFS file for HIVE

2013-07-26 Thread Sanjay Subramanian
We have a similar situation like this in production…for your case I would propose the following steps 1. Design a map reduce job (Job Output format - Text, Lzo, Snappy, your choice) Inputs to Mapper -- records from these three feeds Outputs from Mapper -- Key =Value =

Re: Need help in joining 2 tables

2013-07-26 Thread Sanjay Subramanian
Hi Rams Please don't think I am sermonizing or preaching and please don't mind what I am saying :-) This community is there to help u and there is no doubt about that. However I am assuming you tried out a few options by yourself before you reached out to the community with your question. Sin

Re: Help in debugging Hive Query

2013-07-25 Thread Sanjay Subramanian
The query is correct but since u r creating a managed table , that is possibly creating some issue and the records are not all getting created This is what I would propose CHECKPOINT 1 : Is this query running at all ? === Use this option in BOLD and run the QUERY

Re: Calling same UDF multiple times in a SELECT query

2013-07-23 Thread Sanjay Subramanian
wrote: function return values are not stored for repeat use of same (as per my understanding) I know you may have already thought about other approach as select a , if (call < -1, -1, call) as b from (select a, fooudf(a) as call from table On Wed, Jul 24, 2013 at 12:42 AM, Sanjay Subramani

Re: Calling same UDF multiple times in a SELECT query

2013-07-23 Thread Sanjay Subramanian
multiple times in a SELECT query function return values are not stored for repeat use of same (as per my understanding) I know you may have already thought about other approach as select a , if (call < -1, -1, call) as b from (select a, fooudf(a) as call from table On Wed, Jul 24, 2013 at

Calling same UDF multiple times in a SELECT query

2013-07-23 Thread Sanjay Subramanian
Hi V r using version hive-exec-0.9.0-cdh4.1.2 in production I need to check and use the output from a UDF in a query to assign values to 2 columns in a SELECT query Example SELECT a, IF(fooUdf(a) < -1 , -1, fooUdf(a)) as b, IF(fooUdf(a) < -1 , fooUdf(a), 0) as c FROM my_h

Re: how to let hive support lzo

2013-07-22 Thread Sanjay Subramanian
This works for us SET hive.exec.compress.intermediate=true SET hive.exec.compress.output=true SET mapreduce.output.fileoutputformat.compress.codec=com.hadoop.compression.lzo.LzopCodec SET mapreduce.map.output.compress=true SET mapreduce.map.output.compress.codec=org.apache.hadoop.io.compress.Sna

Re: export csv, use ',' as split

2013-07-10 Thread Sanjay Subramanian
Hive does not have a output delimiter specifier yet (not sure if 0.11.x may have it) But for now please try the following hive -e myquery | sed 's/\t/,/g' >> result.csv Good luck Sanjay From: kentkong_work mailto:kentkong_w...@163.com>> Reply-To: "user@hive.apache.org
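
A minimal sketch of the tab-to-comma conversion suggested above, with `printf` standing in for `hive -e myquery` so it runs without Hive. It uses `tr` rather than `sed 's/\t/,/g'` because the `\t` escape in sed is not supported by every sed implementation; that swap and the helper name `tsv_to_csv` are assumptions, not the author's wording:

```shell
# Convert tab-separated stdin to comma-separated stdout.
# Naive by design: fields that themselves contain commas are not quoted.
tsv_to_csv() {
  tr '\t' ','
}

# Stand-in for: hive -e myquery | tsv_to_csv >> result.csv
printf 'id\tname\n1\tpaul\n' | tsv_to_csv
```

If any column can contain commas or embedded tabs, a proper CSV writer would be needed instead of a character swap.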

Re: integration issure about hive and hbase

2013-07-09 Thread Sanjay Subramanian
I am attaching portions from a document I had written last year while investigating Hbase and Hive. You may have already crossed that bridge….nevertheless… Please forgive me :-) if some steps seem hacky and not very well explained….I was on a solo mission to build a Hive Data platform from sc

Re: Hive CLI

2013-07-09 Thread Sanjay Subramanian
Hi Rahul Is there a reason why u use Hive CLI ? I have aliases defined that I use, so I never had to use Hive CLI again alias hivescript='hive -e ' alias hivescriptd='hive -hiveconf hive.root.logger=INFO,console -e ' So when I want to run hive commands from Linux I just type hivescript "sele
