Hi,
When I use the Hive dynamic partition feature, I find it very easy to hit the
"exceeded max created files count" exception (I have set
hive.exec.max.created.files to 100K but it still fails).
I have generated an unpartitioned table 'bsl12.email_edge_lyh_mth1' which
con
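For readers hitting the same limit, a hedged sketch of the settings usually involved (the values and the table/column names are illustrative, not a confirmed fix for this case):

```sql
-- Raise the per-job cap on created files (100K is the value already tried above).
SET hive.exec.max.created.files=200000;

-- The deeper fix is usually to create fewer files: routing each partition's
-- rows to a single reducer yields one file per partition instead of one per
-- mapper per partition.
INSERT OVERWRITE TABLE target PARTITION (dt)
SELECT col1, col2, dt FROM source
DISTRIBUTE BY dt;  -- table/column names are placeholders
```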
Can you please provide more details:
number of rows in each table and per partition, the table structure, Hive
version, table format, and whether the table is sorted or partitioned on dt?
Why don’t you use a join, potentially with a mapjoin hint?
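The mapjoin suggestion might look like the following sketch (the table and join-column names are assumed from the thread, not confirmed):

```sql
-- Broadcast the five-partition small table to every mapper instead of
-- shuffling the 50K-partition large table.
SELECT /*+ MAPJOIN(s) */ l.*
FROM large_table l
JOIN small_table2 s ON l.dt = s.dt;
```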
> Am 19.12.2018 um 09:02 schrieb Prabhakar Reddy :
Hello,
I have a table large_table with more than 50K partitions, and when I run the
query below it runs forever. The other table small_table2 has only
five partitions, yet whenever I run the query it seems to scan all
partitions rather than only the five partitions which are there i
Thanks, I think that is the proper explanation. Since the result of the second
query is null, no partition name is generated in the dynamic partition
step, so the system doesn't know which partition to overwrite.
Thanks very much!
Regards,
孙志禹
From: Tanvi Thacker
D
query, the dynamic partition step needs to deduce the
partition name from the query result, but as your query does not produce
any rows, there is no info about which partition to act on.
Regards,
Tanvi Thacker
On Tue, Oct 23, 2018 at 9:38 PM anci_...@yahoo.com
wrote:
Dears,
I found an interesting thing.
When inserting a NULL result into a partition which already contained some
records, there was a difference between the results of a static partition
INSERT and a dynamic partition INSERT.
See the example below:
Partition
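The example itself is truncated in the archive; a hedged reconstruction of the behavior being described (table and column names are made up):

```sql
-- Static partition spec: the target partition is overwritten even when the
-- SELECT returns zero rows, so its existing records are wiped.
INSERT OVERWRITE TABLE t PARTITION (dt='2018-10-23')
SELECT col FROM src WHERE 1=0;

-- Dynamic partition spec: with zero result rows no partition name can be
-- deduced, so nothing is overwritten and the old records survive.
INSERT OVERWRITE TABLE t PARTITION (dt)
SELECT col, dt FROM src WHERE 1=0;
```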
Hi all,
I am trying to ingest data into Hive using the Hive streaming processor via
Apache NiFi. It works fine for an unpartitioned table and also for an existing
partition. When I try to ingest into a partitioned table, it throws the
following error:
2017-10-26 18:13:15,312 ERROR [Timer
rt.dynamic.partition set to false.
>
> Thanks
> Prasanth
>
> _
> From: Harshit Sharan
> Sent: Thursday, February 11, 2016 5:07 AM
> Subject: HIVE insert to dynamic partition table runs forever / hangs
> To:
>
>
>
Let us say we have 2 Hive tables, tableA & tableB. I am exploding tableA,
JOINing it with a few other tables, and then inserting into tableB.
The insert works fine when tableB has no partitions, or insertions are done
using a static partition.
However, when there is a dynamic partition, the map re
I actually decided to remove one of my 2 partition columns and make it a
bucketing column instead... same query completed fully in under 10 minutes
with 92 partitions added. This will suffice for me for now.
On Thu, Jun 11, 2015 at 2:25 PM, Pradeep Gollakota
wrote:
Hmm... did your performance increase with the patch you supplied? I do need
the partitions in Hive, but I have a separate tool that has the ability to
add partitions to the metastore and is definitely much faster than this. I
just checked my job again, the actual Hive job completed 24 hours ago and
This is something that a few of us have run into. I think the bottleneck is
in partition creation calls to the metastore. My work around was HIVE-10385
which optionally removed partition creation in the metastore but this isn't
a solution for everyone. If you don't require actual partitions in the
Hi All,
I have a table which is partitioned on two columns (customer, date). I'm
loading some data into the table using a Hive query. The MapReduce job
completed within a few minutes and needs to "commit" the data to the
appropriate partitions. There were about 32000 partitions generated. The
comm
I see. The last column in u_data is unixtime while I wanted to partition on
rating. I just assumed Hive would use the same-named column as the one
mentioned in the partition spec. Thanks for clarifying this, I missed that
bit in the documentation.
On Tue, Jul 22, 2014 at 10:13 PM, Prasanth Jayach
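In other words, Hive maps dynamic partition values by position (the trailing columns of the SELECT), not by name. A sketch using the movielens u_data schema from the Hive tutorial (userid, movieid, rating, unixtime):

```sql
-- Put the partition column last instead of relying on SELECT *.
FROM u_data
INSERT OVERWRITE TABLE u_data_p PARTITION (rating)
SELECT userid, movieid, unixtime, rating;
```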
From the error msg it looks like there are too many distinct values in the
partition column. Try increasing
hive.exec.max.dynamic.partitions.pernode to a number >100. If you already know
the number of distinct values in the partition column, try a value greater than
or equal to that numb
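A sketch of that advice (the values are illustrative):

```sql
-- Default is 100 per node; raise it above the number of distinct
-- partition-column values expected per task.
SET hive.exec.max.dynamic.partitions.pernode=1000;
-- The job-wide counterpart may need raising too:
SET hive.exec.max.dynamic.partitions=5000;
```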
While playing with the movielens data set to learn about dynamic partitions
I ran
from u_data insert overwrite table u_data_p partition (rating) select *
This failed with
[Error 20004]: Fatal error occurred when node tried to create too many
dynamic partitions. The maximum number of dynamic pa
Hi Prasanth,
Thanks a lot for your quick response.
From: Gajendran, Vishnu
Sent: Tuesday, July 22, 2014 11:47 AM
To: user@hive.apache.org
Cc: d...@hive.apache.org
Subject: RE: hive 13: dynamic partition inserts
Hi Prasanth,
Thanks a lot for your quick response.
From: Prasanth Jayachandran [pjayachand...@hortonworks.com]
Sent: Tuesday, July 22, 2014 11:28 AM
To: user@hive.apache.org
Cc: d...@hive.apache.org
Subject: Re: hive 13: dynamic partition inserts
Hi Vishnu
Yes
14 10:42 AM
> To: d...@hive.apache.org
> Subject: hive 13: dynamic partition inserts
>
> Hello,
>
> I am seeing a difference between hive 11 and hive 13 when inserting to a
> table with dynamic partitions.
>
> In Hive 11, when I set hive.merge.mapfiles=false before doi
adding user@hive.apache.org for wider audience
From: Gajendran, Vishnu
Sent: Tuesday, July 22, 2014 10:42 AM
To: d...@hive.apache.org
Subject: hive 13: dynamic partition inserts
Hello,
I am seeing a difference between hive 11 and hive 13 when inserting to a table
with dynamic partition.
In Hive 11, when I set hive.merge.mapfiles=false before doing a dynamic
partition insert, I see the number of files (generated by each mapper) in the
specified hdfs location as expected
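The setting being discussed, as a sketch:

```sql
-- With merging disabled, each mapper's output file is kept as-is in the
-- partition directory; with it enabled, Hive runs an extra merge step that
-- combines small files.
SET hive.merge.mapfiles=false;
```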
…"sum_packets":61,"sum_flows":35,"rank":1},
{"protocol":"tcp","sum_bytes":20469,"sum_packets":229,"sum_flows":10,"rank":2},
{"protocol":"icmp","sum_bytes":…
g?. What's the reason an absolute
path is not accepted in the stream reduce below?
From: Bogala, Chandra Reddy [Tech]
Sent: Thursday, February 13, 2014 10:42 PM
To: 'user@hive.apache.org'
Subject: RE: Hadoop streaming with insert dynamic partition generate many small
files
Thanks Wang.
Wang [mailto:chen.apache.s...@gmail.com]
Sent: Tuesday, February 04, 2014 3:00 AM
To: user@hive.apache.org
Subject: Re: Hadoop streaming with insert dynamic partition generate many small
files
Chandra,
You don't necessarily need Java to implement the mapper/reducer. Check out the
answer in
2014 12:26 PM
> *To:* user@hive.apache.org
> *Subject:* Re: Hadoop streaming with insert dynamic partition generate
> many small files
>
>
>
> it seems that hive.exec.reducers.bytes.per.reducer is still not big
> enough: I added another 0, and now i only gets one file under ea
From: Chen Wang [mailto:chen.apache.s...@gmail.com]
Sent: Monday, February 03, 2014 12:26 PM
To: user@hive.apache.org
Subject: Re: Hadoop streaming with insert dynamic partition generate many small
files
it seems that hive.exec.reducers.bytes.per.reducer is still not big
enough: I added another 0, and now I only get one file under each
partition.
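A hedged sketch of the knob used here (the value is illustrative):

```sql
-- Larger bytes-per-reducer means fewer reducers, hence fewer output files;
-- adding a zero to the value roughly divides the reducer count by ten.
SET hive.exec.reducers.bytes.per.reducer=10000000000;
```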
On Sun, Feb 2, 2014 at 10:14 PM, Chen Wang wrote:
Hi,
I am using a Java reducer reading from one table and then writing to another one:
FROM (
FROM (
SELECT column1,...
FROM table1
WHERE ( partition > 6 and partition < 12 )
) A
MAP A.co
In any search engine you like, search for "could only be replicated to 0
nodes, instead of 1"
On Mon, Jun 17, 2013 at 11:44 AM, Hamza Asad wrote:
I'm trying to create a partitioned table (dynamically) from an old
non-partitioned table. The query is as follows:
*INSERT OVERWRITE TABLE new_events_details Partition (event_date) SELECT
id, event_id, user_id, intval_1, intval_2, intval_3, intval_4, intval_5,
intval_6, intval_7, intval_8, intval_9, intval_
your partitioned column is normally at the end of the table, so when you are
inserting data into this partitioned table, I would recommend using the
column names

set hive.exec.dynamic.partition=true;

insert overwrite table new_table partition(event_date) select col1, col2 ...
coln, event_date from old_table;

On Thu, Jun 13, 2013 at 5:24 PM, Hamza Asad wrote:

> when i browse it in browser, all the data is in
> event_date=__HIVE_DEFAULT_PARTITION__<http://10.0.0.14:50075/browseDirectory.jsp?dir=%2Fvar%2Flog%2Fpring%2Fhive%2Fwarehouse%2Fnydus.db%2Fnew_rc_partition_cluster_table%2Fevent_date%3D__HIVE_DEFAULT_PARTITION__&namenodeInfoPort=50070>,
> rest of the files do not contain data

On Thu, Ju
what do you mean when you say "it wont split correctly" ?
On Thu, Jun 13, 2013 at 5:19 PM, Hamza Asad wrote:
> what if i have data of more then 500 days then how can i create partition
> on date column by specifying each and every date? (i knw that does not
> happens in dy
what if I have data for more than 500 days? Then how can I create partitions
on the date column by specifying each and every date? (I know that does not
happen with dynamic partitioning, but with dynamic partitioning it won't
split correctly.)
On Thu, Jun 13, 2013 at 4:12 PM, Nitin Pawar wrote:
> you
You did not create a partitioned table; you just created a bucketed table.
Refer to a partitioned table created with
something like
partitioned by (event_date string)
On Wed, Jun 12, 2013 at 7:17 PM, Hamza Asad wrote:
I have created a table after enabling dynamic partitioning. I partitioned it on
date but it is not splitting the data datewise. Below is the query for the
table creation and the data insert:
CREATE TABLE rc_partition_cluster_table(
id int,
event_id int,
user_id BIGINT,
event_date string,
intval_1 int
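The fix being pointed out, sketched with the column names visible in the truncated DDL above (the remaining columns and the source table name are assumptions): the partition column moves out of the column list into PARTITIONED BY.

```sql
CREATE TABLE rc_partition_cluster_table (
  id INT,
  event_id INT,
  user_id BIGINT,
  intval_1 INT
)
PARTITIONED BY (event_date STRING);  -- partition column is NOT in the column list

SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;

-- The dynamic partition column goes last in the SELECT.
INSERT OVERWRITE TABLE rc_partition_cluster_table PARTITION (event_date)
SELECT id, event_id, user_id, intval_1, event_date FROM old_events;
```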
My Hive version is 0.9, installed along with Cloudera 4.1.
I use "insert into" + "dynamic partition" in my use case, but I found the
partition is still overwritten after multiple inserts. I also found some
explanation in the Hive manual:
- INSERT OVERWRITE will overwrite any
Hi users,
As I'm inserting 1GB of data into a Hive partitioned table, the job
ended with the below error.
Below is my vender1gb data:
vender string
supplier string
order_date string
quantity int
Vendor_1Supplier_1212012-03-06 2763
Vendor_1
src: /172.23.108.105:57388,
> dest: /172.23.106.80:50010, bytes: 6733, op: HDFS_WRITE, cliID:
> DFSClient_attempt_201205261626_0011_r_01_0, offset: 0, srvID:
> DS-1416163861-172.23.106.80-50010-1335859555961, blockid:
> blk_4133062118632896877_497881, duration: 17580129*
>
> Reg
-Original Message-
From: Philip Tromans [mailto:philip.j.trom...@gmail.com]
Sent: Tuesday, May 29, 2012 3:16 PM
To: user@hive.apache.org
Subject: Re: dynamic partition import
Is there anything interesting in the datanode logs?
Phil.
On 29 May 2012 10:37, Nitin Pawar
mailto:nitinpawar
On 29 May 2012 10:37, Nitin Pawar wrote:
> can you check atleast one datanode is running and is not part of blacklisted
> nodes
>
>
> On Tue, May 29, 2012 at 3:01 PM, Nimra Choudhary
> wrote:
>>
>>
>>
>> We are using Dynamic partitioning
All my data nodes are up and running with none blacklisted.
Regards,
Nimra
From: Nitin Pawar [mailto:nitinpawar...@gmail.com]
Sent: Tuesday, May 29, 2012 3:07 PM
To: user@hive.apache.org
Subject: Re: dynamic partition import
can you check that at least one datanode is running and is not part of the
blacklisted nodes
On Tue, May 29, 2012 at 3:01 PM, Nimra Choudhary wrote:
We are using Dynamic partitioning and facing the similar problem. Below is the
jobtracker error log. We have a hadoop cluster of 6 nodes, 1.16 TB capacity
with over 700GB still free.
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException:
org.apache.hadoop.ipc.RemoteException: java.io.IOE
These are my DDLs and the select transform query, which calls a Python
script that returns a JSON string and a date.
I tried it for a single day and it inserts data properly. I also
tried using local tables instead of external tables, and it works fine.
CREATE EXTERNAL TABLE IF NOT EXISTS t
Can you please add the DDLs of both tables and the insert/CTAS statement?
I have been using dynamic partitions in S3 for a long time and
haven't faced any issues.
-
Regards,
Thulasi Ram P
On Fri, Oct 28, 2011 at 6:01 PM, Deepak Subhramanian
wrote:
> I used hive ver 0.71 in Amazon elastic ma
So we got it, I hope!
We did take care of the ulimit max-open-files thing (e.g. 1.3.1.6.1. ulimit on
Ubuntu: http://hbase.apache.org/book/notsoquick.html). But after the switch
from "native" Hadoop to the Cloudera distribution cdh3u0 we didn't apply this
for the users "hdfs", "hbase" AND "mapr
The latest error has been not having a large enough partitions-pernode setting
in Hive (set to 1000 currently). When increasing this setting it gives the
same error; however, I've noticed that by loading fewer logs I'm avoiding
dynamic partition errors (and thus the job failing).
I have to keep reminding my
It seems that the more dynamic partitions are imported, the fewer I am able to
import, or rather, the smaller the files have to be.
Any clues?
Original Message
> Date: Wed, 13 Jul 2011 09:45:27 +0200
> From: "labtrax"
> To: user@hive.apache.org
>
Hi,
I always get
java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException:
Hive Runtime Error while processing row (tag=0)
{"key":{},"value":{"_col0":"1129","_col1":"Campaign","_col2":"34811433","_col3":"group","_col4":"1271859453","_col5":"Soundso","_col6":"93709590","_col
Hello,
I can't import files with dynamic partitioning. The query looks like this:
FROM cost c INSERT OVERWRITE TABLE costp PARTITION (accountId,day) SELECT
c.clientId,c.campaign,c.accountId,c.day DISTRIBUTE BY c.accountId,c.day
The strange thing is: sometimes it works, sometimes mapred fails with something
So we're seeing the following error during some of our hive loads:
2011-07-05 12:26:52,927 Stage-2 map = 100%, reduce = 100%
Ended Job = job_201106302113_3864
Loading data to table default.merged_weblogs partition (day=null)
Failed with exception Number of dynamic partitions created is 1013,
wh
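The limit in this error is governed by two settings; a sketch with illustrative values (the failed load above created 1013 partitions):

```sql
SET hive.exec.max.dynamic.partitions=2000;          -- cap per job
SET hive.exec.max.dynamic.partitions.pernode=2000;  -- cap per mapper/reducer
```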
ion.
Can I insert without the overwrite keyword?
2011/4/15 Ning Zhang mailto:nzh...@fb.com>>
The LOAD DATA command only copies the files to the destination directory. It
doesn't read the records of the input file, so it cannot do partitioning based
on record values.
On Apr 14, 2011, at
ge_view PARTITION(dt='2008-06-08', country)
SELECT pvs.viewTime, pvs.userid, pvs.page_url, pvs.referrer_url,
null, null, pvs.ip, pvs.country
The dynamic partition we have is on country, and the other partition is dt.
In this implementation, what if I want to import the data in
Oh, I see.
just as the example we have:
FROM page_view_stg pvs
INSERT OVERWRITE TABLE page_view PARTITION(dt='2008-06-08', country)
SELECT pvs.viewTime, pvs.userid, pvs.page_url,
pvs.referrer_url, null, null, pvs.ip, pvs.country
The dynamic partition we have is on co
The LOAD DATA command only copies the files to the destination directory. It
doesn't read the records of the input file, so it cannot do partitioning based
on record values.
On Apr 14, 2011, at 10:52 PM, Erix Yao wrote:
hi,all
The dynamic partition function is amazing ,but only wor
hi,all
The dynamic partition function is amazing ,but only works in insert
clause. Can I use it while loading data into table?
For example: load data LOAD DATA LOCAL INPATH
`/tmp/pv_2008-06-08_us.txt` INTO TABLE page_view
PARTITION(date='2008-06-08', country='US'
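As the reply explains, LOAD DATA cannot derive partitions from record values; the usual workaround is to load into an unpartitioned staging table and then run a dynamic-partition INSERT. A sketch reusing the page_view example from this thread (the staging table name is assumed):

```sql
-- 1) Load the raw file into an unpartitioned staging table.
LOAD DATA LOCAL INPATH '/tmp/pv_2008-06-08_us.txt' INTO TABLE page_view_stg;

-- 2) Let Hive derive dt and country from the record values.
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;
FROM page_view_stg pvs
INSERT OVERWRITE TABLE page_view PARTITION (dt, country)
SELECT pvs.viewTime, pvs.userid, pvs.page_url, pvs.referrer_url,
       null, null, pvs.ip, pvs.dt, pvs.country;
```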
Thanks, the query works as expected. I guess the query on the wiki is out of
date.
- Original Message
From: Thiruvel Thirumoolan
To: "user@hive.apache.org"
Sent: Tue, March 1, 2011 3:26:13 AM
Subject: Re: Dynamic partition - support for distribute by
Not sure about that, b
Not sure about that, but this is supported:
FROM (SELECT *, dt from table_a DISTRIBUTE BY dt) T
INSERT OVERWRITE TABLE table_b PARTITION(dt)
SELECT * ;
On Mar 1, 2011, at 5:28 AM, Wil - wrote:
> Hi,
>
> Reading the wiki on dynamic partition, there is best
Hi,
Reading the wiki on dynamic partition, there is best practice example to solve
the issue of creating too many dynamic partitions on a specific node. However,
the query does not work.
(http://wiki.apache.org/hadoop/Hive/Tutorial#Dynamic-partition_Insert)
Is this form of query support
Thanks Ryan... that does seem to be my issue.
I found the first thread after I sent this email, but not the second
thread saying it will be fixed next week.
thanks
Khaled
> You are likely encountering a bug w/ Amazon's S3 code:
> https://forums.aws.amazon.com/thread.jspa?threadID=56358&tstart=25
You are likely encountering a bug w/ Amazon's S3 code:
https://forums.aws.amazon.com/thread.jspa?threadID=56358&tstart=25
Try inserting into a non-S3 backed table to see if this is indeed your
problem.
Based on the Amazon forums they are expecting a fix this week:
https://forums.aws.amazon.com/
Khaled, which version of Hive are you running? I tried a similar query in trunk
(0.7.0-SNAPSHOT) and it worked.
The error doesn't mean the data is wrong (ds=null); it means the compiled query
plan doesn't indicate it is a dynamic partition (which is very unlikely for
this simple que
Hi Pat, I have hive.exec.dynamic.partition.mode=nonstrict and
hive.exec.dynamic.partition=true. That lets Hive accept the dynamic
partitions, but it still fails.
Khaled
> Check that you have hive.exec.dynamic.partition.mode set to false. That
> or have a static partition column first in your par
Check that you have hive.exec.dynamic.partition.mode set to false. That or
have a static partition column first in your partitioning clause.
Pat
-- Sent from my Palm Pre
On Feb 12, 2011 11:09 PM, khassou...@mediumware.net
wrote:
Hello,
I have the followin
Hello,
I have the following table definition (simplified to help in debugging):
create external table pvs (
time INT,
server STRING,
thread_id STRING
)
partitioned by (
dt string
)
row format delimited fields terminated by '\t'
stored as textfile
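Given the table above, a dynamic-partition insert into it would need nonstrict mode, since dt is the only (hence dynamic) partition column. A sketch with an assumed staging table as the source:

```sql
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;  -- no static partition column present

INSERT OVERWRITE TABLE pvs PARTITION (dt)
SELECT time, server, thread_id, dt
FROM pvs_staging;  -- pvs_staging is a placeholder source table
```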
Hello,
I am calculating certain statistics on an hourly basis. At the end of
the day, I would like to calculate a daily log.
Each hour, the log file is downloaded from the server and processed
using an external Java program. Here is some sample data from
item_hourly.
hive> select * from item_ho
Hi,
I am trying to test dynamic-partition insert, but it is not working as
expected. Kindly help me solve this problem.
Created source table
--
CREATE EXTERNAL TABLE testmove (
a string,
b string
)
PARTITIONED BY (cust string, dt string);
Data has been kept in
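For a table with two partition columns like testmove, a dynamic-partition insert would look roughly like this (the source table name and its columns are assumptions):

```sql
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;

-- Both partition columns are dynamic; they must be the last two SELECT
-- columns, in PARTITIONED BY order (cust, then dt).
INSERT OVERWRITE TABLE testmove PARTITION (cust, dt)
SELECT a, b, cust, dt FROM testmove_src;
```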