Out of curiosity, I searched for 'capacity scheduler deadlock', which yielded the following:
[YARN-3265] CapacityScheduler deadlock when computing absolute max avail capacity (fix for trunk/branch-2)
[YARN-3251] Fix CapacityScheduler deadlock when computing absolute max avail capacity (short term fix for 2.6.1)
[YARN-2456] Possible livelock in CapacityScheduler when RM is recovering apps

Looks like CapacityScheduler should become more stable in the upcoming Hadoop 2.7.0 release.

Cheers

On Sat, Mar 14, 2015 at 4:25 AM, Simon Elliston Ball <si...@simonellistonball.com> wrote:

> You won't be able to use YARN labels on 2.2.0. However, you only need the
> labels if you want to map containers to specific hardware. In your
> scenario, the capacity scheduler in YARN might be the best bet. You can
> set up separate queues for the streaming and other jobs to protect a
> percentage of cluster resources. You can then spread all jobs across the
> cluster while protecting the streaming jobs' capacity (if your resource
> container sizes are granular enough).
>
> Simon
>
>
> On Mar 14, 2015, at 9:57 AM, James <alcaid1...@gmail.com> wrote:
>
> My hadoop version is 2.2.0, and my spark version is 1.2.0
>
> 2015-03-14 17:22 GMT+08:00 Ted Yu <yuzhih...@gmail.com>:
>
>> Which release of hadoop are you using?
>>
>> Can you utilize the node labels feature?
>> See YARN-2492 and YARN-796.
>>
>> Cheers
>>
>> On Sat, Mar 14, 2015 at 1:49 AM, James <alcaid1...@gmail.com> wrote:
>>
>>> Hello,
>>>
>>> I have a cluster running Spark on YARN. Currently some of its nodes are
>>> running a Spark Streaming program, so their local space is not enough to
>>> support other applications. I wonder whether it is possible to use a
>>> blacklist to avoid these nodes when running a new Spark program?
>>>
>>> Alcaid
>>
>>
>
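
For reference, a minimal sketch of the queue setup Simon describes, assuming the standard capacity-scheduler.xml format; the queue names ("streaming", "default") and the capacity split are hypothetical placeholders, not a recommendation:

    <!-- capacity-scheduler.xml (sketch): two hypothetical queues under root -->
    <property>
      <name>yarn.scheduler.capacity.root.queues</name>
      <value>default,streaming</value>
    </property>
    <property>
      <!-- guarantee 40% of cluster resources to the streaming queue -->
      <name>yarn.scheduler.capacity.root.streaming.capacity</name>
      <value>40</value>
    </property>
    <property>
      <name>yarn.scheduler.capacity.root.default.capacity</name>
      <value>60</value>
    </property>
    <property>
      <!-- optionally cap how far the streaming queue can grow into idle capacity -->
      <name>yarn.scheduler.capacity.root.streaming.maximum-capacity</name>
      <value>60</value>
    </property>

The streaming job can then be pinned to its queue at submit time, e.g. spark-submit --master yarn --queue streaming ... (or by setting spark.yarn.queue=streaming), while other jobs go to the default queue.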