The primary difference between Hive and Pig is the language. There are
implementation differences that will result in performance
differences, but it will be hard to figure out which aspect of the
implementation is responsible for which improvement.
I think a more interesting project would be to compare th
HCatInputFormat does not run any initial MapReduce jobs. It seems to
me that the MapReduce job actually ran.
You might want to do a jstack on your Java program on the client side to see
what it is waiting on.
On Fri, May 2, 2014 at 7:28 AM, Fabian Reinartz
wrote:
> I implemented a MapReduce job with H
Set these values before running the INSERT OVERWRITE command (execute in the Hive
console):
set hive.exec.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=nonstrict;
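For a fuller picture, here is a minimal sketch of the whole statement (the table
names events_staging and events_partitioned are made up for illustration, not
from your setup):

  set hive.exec.dynamic.partition=true;
  set hive.exec.dynamic.partition.mode=nonstrict;
  -- the dynamic partition column (event_date) must come last in the SELECT
  INSERT OVERWRITE TABLE events_partitioned PARTITION (event_date)
  SELECT col1, col2, event_date
  FROM events_staging;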
On Fri, May 2, 2014 at 6:18 PM, Kishore kumar wrote:
> Hi Experts,
>
> How to change the non partitioned table into partit
I implemented a MapReduce job with HCatalog as input and output. It's
pretty much the same as the example on the website.
If I start my job with `hadoop jar`, an initial MapReduce job is performed
(which, I guess, is the query for the HCatalog data, as the setup method in
my mapper is not executed). Afte
I am new to Hive, but here is my idea:
1. Use mysqldump to dump your data to a CSV file.
2. Load the CSV into a Hive temp table.
3. Create the partitioned table.
4. Use a dynamic-partition insert, selecting from the temp table into the
partitioned table. You can use a UDF to get the date from the timestamp
(sketched below this list).
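A rough sketch of steps 2-4 in HiveQL (the table and column names are made up
for illustration, and I'm assuming the timestamp is a Unix epoch in seconds):

  -- step 2: load the CSV into a temporary, non-partitioned table
  CREATE TABLE events_tmp (id BIGINT, payload STRING, event_ts BIGINT)
    ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
    STORED AS TEXTFILE;
  LOAD DATA LOCAL INPATH '/path/to/dump.csv' OVERWRITE INTO TABLE events_tmp;

  -- step 3: create the partitioned target table
  CREATE TABLE events (id BIGINT, payload STRING, event_ts BIGINT)
    PARTITIONED BY (event_date STRING)
    STORED AS TEXTFILE;

  -- step 4: dynamic-partition insert, deriving the date from the timestamp
  -- (to_date/from_unixtime are Hive built-ins; a custom UDF works the same way)
  SET hive.exec.dynamic.partition=true;
  SET hive.exec.dynamic.partition.mode=nonstrict;
  INSERT OVERWRITE TABLE events PARTITION (event_date)
  SELECT id, payload, event_ts, to_date(from_unixtime(event_ts)) AS event_date
  FROM events_tmp;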
Regards,
Craig
2014-5-
Hi Experts,
How do I change a non-partitioned table into a partitioned table in Hive?
I created a table with
create table table_name1(col1 type, col2 type...)
row format delimited
fields terminated by '|'
stored as textfile
and loaded data from local with
load data local inpath "/to/path" (overwrite) into
Do I need to load the files into the non-partitioned table first and then
insert from there into the partitioned table?
Use an INSERT ... SELECT from the unpartitioned table into the partitioned one.
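Roughly like this, with the dynamic-partition settings mentioned earlier in the
thread enabled (the column names and types below are placeholders, since your
real schema isn't shown):

  -- partitioned copy of the original table
  CREATE TABLE table_name1_part (col1 STRING, col2 INT)
    PARTITIONED BY (part_col STRING)
    ROW FORMAT DELIMITED FIELDS TERMINATED BY '|'
    STORED AS TEXTFILE;

  INSERT OVERWRITE TABLE table_name1_part PARTITION (part_col)
  SELECT col1, col2, col_to_partition_on
  FROM table_name1;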
On Fri, May 2, 2014 at 4:04 PM, Hamza Asad wrote:
> Sqoop also supports dynamic partitioning. I have done that. For that you
> have
It sounds like you might need to export via Sqoop using a query or view,
as the date granularity in your MySQL table is different from that of the
desired Hive table. The overall performance may be lower, as MySQL must do
more than just read rows from disk, but you may still find ways to get the
data in pa
Hello,
There is an old recommendation for optimizing Hive joins: use the
largest table last in the join.
http://archive.cloudera.com/cdh/3/hive/language_manual/joins.html
The same recommendation appears in the Programming Hive book.
Is this recommendation still valid, or do newer versions of Hive take
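For reference, a small sketch of what that recommendation means (the table
names are hypothetical): in classic Hive the last table in the join is streamed
through the reducers while the earlier ones are buffered in memory, and the
STREAMTABLE hint can pick the streamed table explicitly.

  -- the rightmost table is streamed, so place the largest one last
  SELECT s.key, b.val
  FROM small_table s
  JOIN big_table b ON s.key = b.key;

  -- or choose the streamed table explicitly, regardless of its position
  SELECT /*+ STREAMTABLE(b) */ s.key, b.val
  FROM big_table b
  JOIN small_table s ON s.key = b.key;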
Sqoop also supports dynamic partitioning. I have done that. For that you
have to enable dynamic partitioning, i.e. hive.exec.dynamic.partition = true, in Hive.
On Fri, May 2, 2014 at 12:57 PM, unmesha sreeveni wrote:
>
> On Fri, May 2, 2014 at 9:41 AM, Shushant Arora
> wrote:
>
>> Sqoop
>
>
> Hi Shushant
>
On Fri, May 2, 2014 at 9:41 AM, Shushant Arora wrote:
> Sqoop
Hi Shushant
I don't think other ecosystem projects can help you. The only way to import
data from a relational DB is Sqoop.
http://my.safaribooksonline.com/book/databases/9781449364618/6dot-hadoop-ecosystem-integration/integration_hiv