hide few columns of a table

2016-01-21 Thread Shushant Arora
Hi, is it possible to restrict access to a few columns of a table by putting a view on top of the table and exposing only the allowed columns in the view, while keeping the table itself invisible in select queries and show tables queries and leaving the view visible? Thanks

hive locking doubt

2015-11-16 Thread Shushant Arora
Hi, I have a doubt about the Hive locking mechanism. I have 0.13 deployed on my cluster. When I create an explicit lock using lock table tablename partition(partitionname) exclusive, it acquires the lock as expected. I have a requirement to release the lock if the Hive connection with the process that created the lock d

writing in parquet hive table using custom MR

2015-11-14 Thread Shushant Arora
Hi, I have a requirement to dump Parquet files into a Hive table using custom MR. Parquet has many data models: avro-parquet, proto-parquet, hive-parquet. Which one is recommended over the others for in-memory plain Java objects? Hive internally uses MapredParquetOutputFormat. Is it better than AvroParq

hive udf from oozie not working

2015-05-16 Thread Shushant Arora
I have a Hive script in which I call a UDF. The script works fine when called from a local shell script, but when called from within an Oozie workflow it throws an exception saying the jar was not found. add jar hdfs://hdfspath of jar; create temporary function funcname as 'pkg.className'; then on calling func

default number of reducers

2015-04-28 Thread Shushant Arora
In a normal MR job, can I configure a cluster-wide default number of reducers, to be used when I don't specify any reducers in my job?

mapred.reduce.tasks

2015-04-21 Thread Shushant Arora
In a MapReduce job, how is the number of reduce tasks decided? I haven't overridden the mapred.reduce.tasks property and it's creating ~700 reduce tasks. Thanks
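If the job in question is generated by Hive (an assumption; the post does not say) and mapred.reduce.tasks is left unset, Hive estimates the reducer count from the input size using hive.exec.reducers.bytes.per.reducer and hive.exec.reducers.max. The Python sketch below only illustrates that arithmetic. The function name and the sample values are hypothetical, chosen to reproduce a figure near 700, and are not taken from the post.

```python
def estimate_reducers(total_input_bytes, bytes_per_reducer, max_reducers):
    """Illustrative size-based reducer estimate, capped at max_reducers."""
    if bytes_per_reducer <= 0 or max_reducers <= 0:
        raise ValueError("bytes_per_reducer and max_reducers must be positive")
    # Integer ceiling division: one reducer per bytes_per_reducer of input.
    wanted = -(-total_input_bytes // bytes_per_reducer)
    # Never fewer than one reducer, never more than the configured cap.
    return max(1, min(max_reducers, wanted))


# Hypothetical values: 700 GB of input, 1 GB per reducer, cap of 999.
GB = 10**9
reducers = estimate_reducers(700 * GB, 1 * GB, 999)
```

Under these assumed settings the estimate scales with input size, so a large input alone can yield several hundred reducers without any explicit setting.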

rename a database

2015-03-24 Thread Shushant Arora
Hi, is there any way in Hive 0.10 to rename a database? Thanks

hive across volume rename

2014-12-09 Thread Shushant Arora
Hi, I want to understand the Hive across-volume rename issue. I am getting an error when loading an HDFS file into a Hive table. If the dir already exists in the table, loading fails but renames the HDFS file. On the second try, loading the renamed file succeeds since the file is not present in the table. Why does this issue occur and what's t

create table exception

2014-10-20 Thread Shushant Arora
What could be the reason for the following? create table test_table(a int); FAILED: Error in metadata: MetaException(message:javax.jdo.JDOException: Couldnt obtain a new sequence (unique id) : Binary logging not possible. Message: Transaction level 'READ-COMMITTED' in InnoDB is not safe for binlog mode 'STATEM

problems while writing in partitioned table using oozie

2014-09-29 Thread Shushant Arora
Hi, while writing to a partitioned Hive table using Oozie, I am getting a copy file exception at the end of the job. Oozie is creating the job as the mapred user, not as the user who submitted the job, and the table was created by another user. Does providing 777 access on the table directory solve the problem? Or is ther

Re: bug in hive

2014-09-23 Thread Shushant Arora
s > 3. Writer unlocks the table explicitly via UNLOCK TABLE > > If you're using ZK for your locking I think the client dying (as opposed > to ending the session) should cause the lock to expire. If not, you may > have to assure the unlock happens in your application. Hope that hel

Re: bug in hive

2014-09-20 Thread Shushant Arora
's much more "stable." > > Gotta luv it. Good luck. > > On Sat, Sep 20, 2014 at 8:00 AM, Shushant Arora > wrote: > >> Hi Alan >> >> I have 0.10 version of hive deployed in my org's cluster, I cannot update >> that because of org'

Re: bug in hive

2014-09-20 Thread Shushant Arora
d ignores both LOCK and UNLOCK commands. > Note that it is off by default, you have to configure Hive to use the new > DbTxnManager to get turn on this locking. In 0.13 it still has the bug you > describe as far as acquiring the wrong lock for dynamic partitioning, but I > believe I'

bug in hive

2014-09-20 Thread Shushant Arora
Hive version 0.9 and later has a bug. While inserting into a Hive table, Hive takes an exclusive lock. But if the table is partitioned and the insert uses dynamic partitions, it takes a shared lock on the table, whereas if all partitions are static, Hive takes an exclusive lock on the partitions in which data is b

collect_set does not remove duplicate

2014-09-07 Thread Shushant Arora
When I use collect_set on another column in a group by query, the documentation says it returns an array of that column's values with duplicates removed, but it is not deduplicating. Is this expected?
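As a reference point for the behaviour the documentation describes, here is a minimal Python sketch of "group by key, keep each distinct value once". It is not Hive's implementation, and the function name and sample rows are hypothetical. It also shows one possible (unconfirmed) reason for apparent duplicates: values that look alike but are not equal, such as strings differing only in trailing whitespace, remain distinct.

```python
from collections import defaultdict


def collect_set_by_key(rows):
    """Group (key, value) pairs by key, keeping each distinct value once."""
    grouped = defaultdict(list)
    for key, value in rows:
        # Dedup is by equality, so "a" and "a " are two different values.
        if value not in grouped[key]:
            grouped[key].append(value)
    return dict(grouped)


# Hypothetical sample data: u1 has a true duplicate, u2 has a near-duplicate.
rows = [("u1", "a"), ("u1", "a"), ("u1", "b"), ("u2", "a "), ("u2", "a")]
result = collect_set_by_key(rows)
```

Here the repeated "a" for u1 collapses to a single entry, while u2 keeps both "a " and "a" because they are different strings.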

overriding slave for a particular job

2014-06-21 Thread Shushant Arora
Hi, can I override the slave nodes for just one of my jobs? Let's say I want the current job to be executed on node1 and node2 only. If both are busy, let the job wait. Thanks, Shushant

Re: large small files vs one big file in hive table

2014-05-05 Thread Shushant Arora
etail. Say for ex. > -How are you planning to consume the data stored in this partition table? > - Are you looking for storage and performance optimizations? Etc. > > Thanks > Saurabh > > Sent from my iPhone, please avoid typos. > > > On 05-May-2014, at 3:33 pm, Shushant

large small files vs one big file in hive table

2014-05-05 Thread Shushant Arora
I have a Hive table in which data is populated from an RDBMS on a daily basis. After MapReduce, each mapper writes its data into the Hive table, which is partitioned at month level. The issue is that the daily job fetches the previous day's data and each mapper writes its output in a separate file. Shall I merge those files in

Re: data transfer from rdbms to hive

2014-05-02 Thread Shushant Arora
; have to enable dynamic partition i.e dynamic partition = true, in hive. > > > On Fri, May 2, 2014 at 12:57 PM, unmesha sreeveni > wrote: > >> >> On Fri, May 2, 2014 at 9:41 AM, Shushant Arora > > wrote: >> >>> Sqoop >> >> >> Hi S

Re: data transfer from rdbms to hive

2014-05-01 Thread Shushant Arora
hey imports data from RDBMS. > > > On Thu, May 1, 2014 at 7:13 PM, Shushant Arora > wrote: > >> Hi >> >> I have a requirement to transfer data from RDBMS mysql to partitioned >> hive table >> Partitioned on Year and month. >> Each record in mysql data

data transfer from rdbms to hive

2014-05-01 Thread Shushant Arora
Hi, I have a requirement to transfer data from an RDBMS (MySQL) to a partitioned Hive table, partitioned on year and month. Each record in the MySQL data contains the timestamp of a user activity. What is the best tool for that? 1. Shall I go with Sqoop? 2. How do I compute the dynamic partition from the RDBMS data? Shall
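Whatever tool performs the transfer, the partition values asked about in point 2 can be derived from each record's timestamp. The Python sketch below illustrates that step only. The timestamp format, the function names, and the zero-padded month in the directory name are assumptions, not details from the post.

```python
from datetime import datetime


def partition_for(ts_text, fmt="%Y-%m-%d %H:%M:%S"):
    """Derive year and month partition values from a timestamp string."""
    ts = datetime.strptime(ts_text, fmt)
    return {"year": ts.year, "month": ts.month}


def partition_path(ts_text):
    """Render the values as a key=value partition directory suffix."""
    p = partition_for(ts_text)
    # Zero-padding the month is a stylistic assumption, not a requirement.
    return f"year={p['year']}/month={p['month']:02d}"


# Hypothetical record timestamp in the assumed format.
sample = partition_path("2014-05-01 19:13:00")
```

Every record whose timestamp falls in the same calendar month maps to the same partition.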

Re: when to use hive vs hbase

2014-04-30 Thread Shushant Arora
> even map your existing HBase tables to Hive and operate on them. > > > On Wed, Apr 30, 2014 at 2:04 PM, Shushant Arora > wrote: > >> I have a requirement of processing huge weblogs on daily basis. >> >> 1. data will come incremental to datastore on daily basis a

when to use hive vs hbase

2014-04-30 Thread Shushant Arora
I have a requirement to process huge weblogs on a daily basis. 1. Data will arrive incrementally in the datastore on a daily basis, and I need cumulative and daily distinct user counts from the logs; after that, the aggregated data will be loaded into an RDBMS like MySQL. 2. Data will be loaded in hdfs datawarehouse o

hive hbase integration

2014-04-17 Thread Shushant Arora
What are Hive storage handlers? What are the best practices for Hive-HBase integration?