Re: RegexSerDe with Filters

2016-06-21 Thread Arun Patel
s could be created in similar ways. > Dudu > bash > ---- > hdfs

Re: Re: [ANNOUNCE] Apache Hive 2.1.0 Released

2016-06-21 Thread tanxinz
Thank you. On 2016-06-22 01:37, Thejas Nair wrote: Thanks for your hard work and patience in driving the release, Jesus! :) On Tue, Jun 21, 2016 at 10:18 AM, Jesus Camachorodriguez wrote: > The Apache Hive team is proud to announce the release of Apache Hive > version 2.1.0. > > The Apache Hive

Re: loading in ORC from big compressed file

2016-06-21 Thread Marcin Tustin
This is because a GZ file is not splittable at all. Basically, try creating this from an uncompressed file, or even better split up the file and put the files in a directory in hdfs/s3/whatever. On Tue, Jun 21, 2016 at 7:45 PM, @Sanjiv Singh wrote: > Hi , > > I have big compressed data file *my_
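The "split up the file" suggestion above could be sketched as follows. This is only an illustration, not the poster's method: the function name `split_gzip`, the part naming scheme and the `lines_per_part` default are assumptions. It streams one large gzip file into several smaller gzip files on line boundaries. Each part is still unsplittable on its own, but a directory of many parts can be read in parallel.

```python
import gzip
from pathlib import Path


def split_gzip(src, out_dir, lines_per_part=1_000_000):
    """Stream src (a gzip-compressed text file) into smaller .gz parts.

    Hypothetical helper: returns the list of part paths, in order.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    parts = []
    out = None
    with gzip.open(src, "rb") as reader:
        for i, line in enumerate(reader):
            # Start a new part every lines_per_part lines.
            if i % lines_per_part == 0:
                if out is not None:
                    out.close()
                part = out_dir / f"part-{len(parts):05d}.gz"
                parts.append(part)
                out = gzip.open(part, "wb")
            out.write(line)
    if out is not None:
        out.close()
    return parts
```

The resulting directory could then be copied to HDFS (for example with `hdfs dfs -put`) and used as the location of the staging table.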

RE: RegexSerDe with Filters

2016-06-21 Thread Markovitz, Dudu
dfs -mkdir -p /tmp/log/20160621 hdfs dfs -put logfile.txt /tmp/log/20160621 hive /* External table log Defines all common columns + optional column 'tid' which appears in most l

loading in ORC from big compressed file

2016-06-21 Thread @Sanjiv Singh
Hi, I have a big compressed data file *my_table.dat.gz* (approx. size 100 GB) # load staging table *STAGE_**my_table* from file *my_table.dat.gz* HIVE>> LOAD DATA INPATH '/var/lib/txt/*my_table.dat.gz*' OVERWRITE INTO TABLE STAGE_my_table ; *# insert into ORC table "my_table"* HIVE>> INSERT IN

RE: if else condition in hive

2016-06-21 Thread Markovitz, Dudu
I understand that you’re looking for the functionality of the MERGE statement. 1) MERGE is currently an open issue. https://issues.apache.org/jira/browse/HIVE-10924 2) UPDATE and DELETE (and MERGE in the future) work under a bunch of limitations, e.g. – Currently only ORC tables are supported ht

Re: [ANNOUNCE] Apache Hive 2.1.0 Released

2016-06-21 Thread Thejas Nair
Thanks for your hard work and patience in driving the release Jesus! :) On Tue, Jun 21, 2016 at 10:18 AM, Jesus Camachorodriguez wrote: > The Apache Hive team is proud to announce the release of Apache Hive > version 2.1.0. > > The Apache Hive (TM) data warehouse software facilitates querying an

[ANNOUNCE] Apache Hive 2.1.0 Released

2016-06-21 Thread Jesus Camachorodriguez
The Apache Hive team is proud to announce the release of Apache Hive version 2.1.0. The Apache Hive (TM) data warehouse software facilitates querying and managing large datasets residing in distributed storage. Built on top of Apache Hadoop (TM), it provides, among others: * Tools to enable easy

Re: if else condition in hive

2016-06-21 Thread Jörn Franke
I recommend that you rethink it as part of a bulk transfer, potentially even using separate partitions. It will be much faster. > On 21 Jun 2016, at 13:22, raj hive wrote: > > Hi friends, > > INSERT,UPDATE,DELETE commands are working fine in my Hive environment after > changing the configuration an

Re: if else condition in hive

2016-06-21 Thread Dmitry Tolpeko
Hi Raj, The Hive HPL/SQL component can be used to achieve such functionality (see www.hplsql.org for docs). For now you can run it as a separate tool, and I hope some day it will be available from Hive or Beeline CLI. Thanks, Dmitry On Tue, Jun 21, 2016 at 2:22 PM, raj hive wrote: > Hi friends, > > IN

RE: RegexSerDe with Filters

2016-06-21 Thread Markovitz, Dudu
Hi, I would suggest creating a single external table with daily partitions and multiple views, each with the appropriate filtering. If you send me a log sample (~100 rows), I’ll send you an example. Dudu From: Arun Patel [mailto:arunp.bigd...@gmail.com] Sent: Tuesday, June 21, 2016 1:51 AM To: us

if else condition in hive

2016-06-21 Thread raj hive
Hi friends, INSERT, UPDATE and DELETE commands are working fine in my Hive environment after changing the configuration and all. Now I have to execute a query like the SQL below in Hive. If exists(select * from tablename where columnname=something) update table set column1=something where columnname=s

Re: Network throughput from HiveServer2 to JDBC client too low

2016-06-21 Thread Jörn Franke
Well, see also the comment that it is NOT advisable to use JDBC for these data transfers but to consider the alternatives mentioned below. The alternatives are more reliable and you will save yourself a lot of trouble. I also doubt that beeline is suitable for these volumes in general. So yes it could

Re: Network throughput from HiveServer2 to JDBC client too low

2016-06-21 Thread Deepak Goel
Could you increase this number (Mebbe three times the current value) and see if it has any impact on throughput: --hiveconf mapreduce.input.fileinputformat.split.maxsize=33554432 \ Hey Namaskara~Nalama~Guten Tag~Bonjour -- Keigu Deepak 73500 12833 www.simtree.net, dee...@simtree.net deic..
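As a quick worked check of the suggestion above (a sketch only; the variable names are illustrative, and tripling is just the factor proposed in the message), the quoted value is 32 MiB, so three times it is 96 MiB:

```python
MIB = 1024 * 1024

current = 33554432       # split.maxsize quoted in the message, in bytes
tripled = 3 * current    # "three times the current value"

print(current // MIB, "MiB ->", tripled // MIB, "MiB")
print(f"--hiveconf mapreduce.input.fileinputformat.split.maxsize={tripled}")
```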

Re: Network throughput from HiveServer2 to JDBC client too low

2016-06-21 Thread Mich Talebzadeh
This is a classic issue. Are there other users using the same network to connect to Hive? Can your Unix admin use a network sniffer to determine the issue in your case? In normal operations with a modest amount of data, do you see the same issue, or is this purely due to your load (the number of ro

Re: Network throughput from HiveServer2 to JDBC client too low

2016-06-21 Thread David Nies
> On 21.06.2016 at 08:59, Mich Talebzadeh wrote: > > is the underlying table partitioned i.e. > > 'SELECT FROM `db`.`table` WHERE (year=2016 AND month=6 AND > day=1 AND hour=10)' Yes, it is; year, month, day and hour are partition columns. > > and also what is the RS size it is expected. I

Re: Network throughput from HiveServer2 to JDBC client too low

2016-06-21 Thread David Nies
> On 20.06.2016 at 20:20, Gopal Vijayaraghavan wrote: > > >> is hosting the HiveServer2 is merely sending data with around 3 MB/sec. >> Our network is capable of much more. Playing around with `fetchSize` did >> not increase throughput. > ... >> --hiveconf >> mapred.output.compression.codec=

Re: Show Redudant database name in Beeline -Hive 2.0

2016-06-21 Thread Mich Talebzadeh
Hm, that is very strange. What other databases besides default do you expect to be there? Mine is as below: Beeline version 2.0.0 by Apache Hive 0: jdbc:hive2://rhes564:10010/default> show databases; ++--+ | database_name | ++--+ | accounts | | asehadoop | | d

Re: Network throughput from HiveServer2 to JDBC client too low

2016-06-21 Thread Mich Talebzadeh
Is the underlying table partitioned, i.e. 'SELECT FROM `db`.`table` WHERE (year=2016 AND month=6 AND day=1 AND hour=10)'? Also, what is the expected RS size? JDBC on its own should work. Is this an ORC table? What version of Hive are you using? HTH Dr Mich Talebzadeh LinkedIn *