Re: Hive dynamic partitions generate multiple files

2014-01-28 Thread Cosmin Cătălin Sanda
Hi Andre, The reason is that I want those partitions to go into other queries. If the individual files are only a few MB, then the performance will be sub-optimal. As far as I understood, the individual files need to be at least around 140 MB for the maps to work properly. …
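For context, Hive has settings that merge small output files at the end of a job, which is one common way to address the small-files concern raised above. A minimal sketch; the exact property names and sensible thresholds should be checked against your Hive version:

```sql
-- Merge small files produced by map-only and map-reduce jobs
SET hive.merge.mapfiles=true;
SET hive.merge.mapredfiles=true;
-- Trigger a merge pass when the average output file is below ~128 MB
SET hive.merge.smallfiles.avgsize=134217728;
-- Target size of each merged file (~256 MB here)
SET hive.merge.size.per.task=268435456;
```

These are session-level settings; they can also be placed in hive-site.xml to apply to all jobs.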

Re: Using Hive generated SequenceFiles and RC files with Java MapReduce and PIG

2014-01-28 Thread Thilina Gunarathne
Thanks for the information Edward. "When you use the default Serde (lazySerde) and sequence files hive writes a SequenceFile (create table x stored as sequence file), the key is null and hive serializes all the columns into a Text Writable that is easy for other tools to read." Does this…

Is it possible to run Hive 0.12 in local mode without Hadoop binary?

2014-01-28 Thread moon soo Lee
Hi, cool guys. I'm working on an open-source Hive GUI project called Zeppelin: http://zeppelin-project.org In this project, I execute Hive in local mode when the GUI application wants to run in local mode. Everything works really well. However, Hive local mode still needs the location of HADOOP_HOME and looks for…
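For reference, Hive's automatic local mode is driven by configuration properties rather than a separate binary. A hedged sketch, assuming a 0.x-era Hive; property names and thresholds may differ between versions:

```sql
-- Let Hive run small queries in-process instead of submitting to the cluster
SET hive.exec.mode.local.auto=true;
-- Input-size threshold below which a query qualifies for local execution (~128 MB)
SET hive.exec.mode.local.auto.inputbytes.max=134217728;
```

Even with these set, classic Hive local mode still resolves Hadoop classes via HADOOP_HOME, which is the limitation the message above runs into.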

Re: Using Hive generated SequenceFiles and RC files with Java MapReduce and PIG

2014-01-28 Thread Edward Capriolo
When you use the default SerDe (LazySerDe) and sequence files, Hive writes a SequenceFile (create table x stored as sequencefile); the key is null and Hive serializes all the columns into a Text Writable that is easy for other tools to read. Hive does not dictate the input format or the output…
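The statement Edward paraphrases can be sketched as plain DDL (table name and columns are illustrative):

```sql
-- Columns are serialized by the default (lazy) SerDe into a single Text value;
-- the SequenceFile record key is left null.
CREATE TABLE x (
  id   INT,
  name STRING
)
STORED AS SEQUENCEFILE;
```

Because the value is ordinary delimited Text, downstream MapReduce or Pig jobs can read the files with a standard SequenceFile reader and split the value on the field delimiter.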

Re: Hive dynamic partitions generate multiple files

2014-01-28 Thread Andre Araujo
Why do you need exactly one file? This is transparent to Hive and it should be handled seamlessly. Unless you have external requirements (reading the files from somewhere else) it shouldn't matter. HDFS support for file append is not a solid standard AFAIK, and will depend on the distribution and version…

Using Hive generated SequenceFiles and RC files with Java MapReduce and PIG

2014-01-28 Thread Thilina Gunarathne
Hi, We have a requirement to store a large data set (more than 5 TB) mapped to a Hive table. This Hive table would be populated (and periodically appended to) using a Hive query from another Hive table. In addition to the Hive queries, we need to be able to run Java MapReduce and preferably Pig jobs as…
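A hedged sketch of the kind of DDL such a setup might use; the table and column names are hypothetical, and whether RCFile or SequenceFile fits better depends on which tools must read the data:

```sql
-- Columnar RCFile storage; partitioned so periodic appends land in new partitions
CREATE TABLE big_dataset (
  user_id BIGINT,
  payload STRING
)
PARTITIONED BY (dt STRING)
STORED AS RCFILE;

-- Periodic append from another Hive table
INSERT INTO TABLE big_dataset PARTITION (dt = '2014-01-28')
SELECT user_id, payload
FROM staging_table;
```

Reading RCFile from raw MapReduce or Pig requires the RCFile input format and column deserialization, which is the interoperability question this thread explores.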

Re: Hive dynamic partitions generate multiple files

2014-01-28 Thread Cosmin Cătălin Sanda
Hi Andre, So the thing is like this: the first time the query runs, it generates one file per dynamic partition. The next time the query runs and needs to write to the same partition, it generates another file instead of merging with the existing one. E.g.: 1. The partitioned S3 path looks like…

Re: Hive dynamic partitions generate multiple files

2014-01-28 Thread Andre Araujo
Hi Cosmin, Have you tried using DISTRIBUTE BY to distribute the query's data by the partitioning columns? That way all the data for each partition should be sent to the same reducer and should be written to a single file in each partition, I think. If your data is being distributed by a different…
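Andre's suggestion can be sketched as follows (table and column names are hypothetical):

```sql
-- Route all rows with the same partition value to one reducer,
-- so each dynamic partition is written by a single task as a single file.
INSERT OVERWRITE TABLE target PARTITION (dt)
SELECT col1, col2, dt
FROM source
DISTRIBUTE BY dt;
```

Note this controls how many files a single query writes per partition; separate jobs writing later to the same partition would still add their own files.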

Hive dynamic partitions generate multiple files

2014-01-28 Thread Cosmin Cătălin Sanda
Hi, I have a number of Hive jobs that run during a day. Each individual job outputs data to Amazon S3. The Hive jobs use dynamic partitioning. The problem is that when different jobs need to write to the same dynamic partition, they each generate one file. What I would like is for the…
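For reference, a typical dynamic-partition insert of the kind described looks roughly like this (table and column names are hypothetical):

```sql
-- Enable dynamic partitioning for this session
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;

-- Each distinct value of dt becomes a partition directory;
-- every job that writes to the same partition adds its own file there.
INSERT INTO TABLE events PARTITION (dt)
SELECT id, payload, dt
FROM staging;
```

Hive does not append to or merge with files already present in the partition, which is the behavior this thread is about.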

Re: Building Hive

2014-01-28 Thread Stephen Sprague
you => useful. The rest of us schmucks => enlightened. Seems like a fair trade-off. :) On Tue, Jan 28, 2014 at 2:26 PM, Lefty Leverenz wrote: > > ... probably should have known that already. > Oh sure, assuming you spend your spare time reading release notes or browsing the Hive wiki. Instead…

Re: Building Hive

2014-01-28 Thread Lefty Leverenz
> ... probably should have known that already. Oh sure, assuming you spend your spare time reading release notes or browsing the Hive wiki. Instead you've given me a chance to publicize the Hive Schema Tool, which makes me feel useful, so thanks. Now all you need is answers to your other questions...

Re: Issue with Hive and table with lots of column

2014-01-28 Thread Stephen Sprague
There's always a use case out there that stretches the imagination, isn't there? Gotta love it. First things first: can you share the error message? The Hive version? And the number of nodes in your cluster? Then a couple of things come to mind. Might you consider pivoting the data such that…

RE: Building Hive

2014-01-28 Thread Peter Marron
Ah, thank you. I think that I probably should have known that already. Z From: Lefty Leverenz [mailto:leftylever...@gmail.com] Sent: 28 January 2014 11:05 To: user@hive.apache.org Subject: Re: Building Hive I can only answer your last question about rebuilding the metastore: a new Hive schema tool…

Re: Building Hive

2014-01-28 Thread Lefty Leverenz
I can only answer your last question about rebuilding the metastore: a new Hive schema tool can do that for you, as described in the wiki here. This tool can be used to initialize the metastore schema…

Building Hive

2014-01-28 Thread Peter Marron
Hi, So I can see from http://hive.apache.org/downloads.html that I can download versions 0.11 and 0.12 and that they will work with Hadoop 1.0.4, which I am currently using. So if I want to start stepping through the source, to look into my problem with indexes, should I try to build version 0.11 or 0.12 with…