Thanks, Holden!
On Thu, Aug 3, 2017 at 4:02 AM, Holden Karau wrote:
> The memory overhead is based less on the total amount of data and more on
> what you end up doing with the data (e.g. if you're doing a lot of off-heap
> processing or using Python you need to increase it). Honestly most people
The memory overhead is based less on the total amount of data and more on
what you end up doing with the data (e.g. if you're doing a lot of off-heap
processing or using Python you need to increase it). Honestly most people
find this number for their job "experimentally" (e.g. they try a few
different values).
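
As a rough illustration of where that setting goes, here is a minimal Scala sketch, assuming a YARN deployment and the Spark 2.x-era config name; the app name, executor memory, and 3072 MB overhead are illustrative placeholders, not recommendations.

  import org.apache.spark.sql.SparkSession

  // Minimal sketch: spark.yarn.executor.memoryOverhead (in MB) is a
  // launch-time setting, so it must be in place before the SparkSession is
  // created (or be passed via --conf on spark-submit). All values below are
  // illustrative placeholders only.
  val spark = SparkSession.builder()
    .appName("overhead-tuning-sketch")                    // hypothetical app name
    .config("spark.executor.memory", "8g")                // executor heap
    .config("spark.yarn.executor.memoryOverhead", "3072") // off-heap headroom, MB
    .enableHiveSupport()
    .getOrCreate()

In practice people tend to raise the overhead in steps (say 512 MB or 1 GB at a time) after YARN kills containers for exceeding their memory limit, rather than deriving it from the input size.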
Ryan,
Thank you for your reply.
For 2 TB of data, what should the value of
spark.yarn.executor.memoryOverhead be?
With regard to this, I see an issue in Spark's tracker,
https://issues.apache.org/jira/browse/SPARK-18787 , but I am not sure whether
it works or not on Spark 2.0.1!
Can you elaborate more on spark.memor
Chetan,
When you're writing to a partitioned table, you want to use a shuffle to
avoid the situation where each task has to write to every partition. You
can do that either by adding a repartition by your table's partition keys,
or by adding an order by with the partition keys and then columns you
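
As a concrete illustration of the two options Ryan mentions, a minimal Scala sketch follows; the table and column names (hbase_staging, events, event_date, rowkey) are hypothetical, and it assumes an existing Hive table partitioned by event_date with dynamic-partition inserts enabled.

  import org.apache.spark.sql.SparkSession

  val spark = SparkSession.builder().enableHiveSupport().getOrCreate()
  import spark.implicits._

  val df = spark.table("hbase_staging")   // hypothetical source DataFrame

  // Option 1: repartition by the table's partition key, so each write task
  // holds rows for only a few dates and therefore opens files in only a few
  // Hive partitions.
  df.repartition($"event_date")
    .write
    .mode("append")
    .insertInto("events")                 // Hive table partitioned by event_date

  // Option 2: a global sort by the partition key (plus any columns you want
  // ordered within each file) clusters the data the same way.
  df.orderBy($"event_date", $"rowkey")
    .write
    .mode("append")
    .insertInto("events")

Either way, the shuffle is what prevents every task from writing a small file into every partition.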
Can anyone please guide me with the above issue.
On Wed, Aug 2, 2017 at 6:28 PM, Chetan Khatri wrote:
> Hello Spark Users,
>
> I am reading from an HBase table and writing to a Hive managed table, where
> I applied partitioning by a date column. That worked fine, but it has
> generated a large number of files in a
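
For context, the write pattern being described probably looks something like the sketch below (all names hypothetical); without a repartition or sort first, every task writes its own file into each date partition it touches, which is what multiplies the file count.

  import org.apache.spark.sql.SparkSession

  val spark = SparkSession.builder().enableHiveSupport().getOrCreate()

  // DataFrame previously loaded from HBase (source table name is hypothetical).
  val hbaseDf = spark.table("hbase_staging")

  // Each task holding rows for a given date writes its own file into that
  // date's partition, so the file count grows roughly as
  // (number of write tasks) x (number of dates seen per task).
  hbaseDf.write
    .mode("overwrite")
    .partitionBy("event_date")
    .saveAsTable("hive_managed_events")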