Hi all,
I am using Hive 2.0. Once I have connected via beeline and run the "show
databases;" command, it shows the same database name more than once.
Is this a known issue?
Best,
Karthik
In my test case below, I’m using `beeline` as the Java application receiving
the JDBC stream. As I understand, this is the reference command line interface
to Hive. Are you saying that the reference command line interface is not
efficiently implemented? :)
-David Nies
> On 20.06.2016 at 17:46
Hello Hive Experts,
I use Flume to ingest application-specific logs from Syslog to HDFS.
Currently, I grep the HDFS directory for specific patterns (for multiple
types of requests) and then create reports. However, generating weekly
and monthly reports this way is not scalable.
I would like to create
> is hosting the HiveServer2 is merely sending data with around 3 MB/sec.
> Our network is capable of much more. Playing around with `fetchSize` did
> not increase throughput.
...
> --hiveconf
>mapred.output.compression.codec=org.apache.hadoop.io.compress.SnappyCodec
>\
The current implementation
Aside from this, the low network performance could also stem from the Java
application receiving the JDBC stream (not threaded, not efficiently
implemented, etc.). That being said, do not use JDBC for this.
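
To put the reported rate into perspective, here is a rough back-of-the-envelope sketch. Only the 3 MB/sec figure comes from this thread; the 100 GB data volume, the 100 MB/sec comparison rate, and the function name `transfer_hours` are hypothetical and chosen purely for illustration.

```python
def transfer_hours(size_gb: float, rate_mb_per_s: float) -> float:
    """Hours needed to move size_gb gigabytes at rate_mb_per_s megabytes per second."""
    if rate_mb_per_s <= 0:
        raise ValueError("rate must be positive")
    size_mb = size_gb * 1024  # 1 GB is taken as 1024 MB here
    return size_mb / rate_mb_per_s / 3600


if __name__ == "__main__":
    # 3 MB/sec is the rate reported in the thread.
    # 100 GB and 100 MB/sec are assumed values, not measurements.
    for rate in (3.0, 100.0):
        print(f"{rate:6.1f} MB/sec -> {transfer_hours(100, rate):.2f} h")
```

At the reported rate, every additional gigabyte costs several minutes, which is why the export path matters more than client-side tuning for volumes of this size.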
> On 20 Jun 2016, at 17:28, Jörn Franke wrote:
>
> Hello,
>
> For no databases (
Hi David,
What are you actually trying to do with the data?
Hive and MapReduce are notoriously slow for this type of operation. Hive
is good for storage; that is what I vouch for.
There are other alternatives.
HTH
Dr Mich Talebzadeh
Hello,
Fetching this amount of data through JDBC is not advisable for any database
(including traditional ones). JDBC is not designed for this (neither for import
nor for export of large data volumes). It is a highly questionable approach
from a reliability point of view.
Export it as a file to HDFS and
Dear Hive mailing list,
In my setup, network throughput from the HiveServer2 to the client seems to be
the bottleneck, and I’m seeking a way to increase throughput. Let me elaborate
on my use case:
I’m using Hive version 1.1.0, which is bundled with Cloudera 5.5.1.
I want to fetch a huge amount of