-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/44820/
-----------------------------------------------------------

(Updated March 15, 2016, 3:19 a.m.)


Review request for samza, Boris Shkolnik, Jake Maes, Yi Pan (Data 
Infrastructure), Navina Ramesh, and Xinyu Liu.


Repository: samza


Description (updated)
-------

The ContainerRequestState class is currently not thread-safe. The class's 
methods and state variables are called from both the ContainerAllocator and the 
AMRMCallback handler.
Here's an analysis I summarized by looking at the current implementation. From 
the below table, getContainersOnAHost() returns the entire list of containers 
to the AbstractContainerAllocator which reads it. However, the Callback thread 
invokes methods that write to the allocatedContainerQueue. (causing a potential 
race condition)

+-----------------------------+--------------+-----------------+-------------------------+---------------------+
| Method                      | requestQueue | hostsToCountMap | 
allocatedContainerQueue | ThreadsAccesedFrom  |
+-----------------------------+--------------+-----------------+-------------------------+---------------------+
| updateRequestState          | write        | write/read      | write          
         | Callback, Allocator |
+-----------------------------+--------------+-----------------+-------------------------+---------------------+
| addContainer                |              | read            | write          
         | Callback            |
+-----------------------------+--------------+-----------------+-------------------------+---------------------+
| updateStateAfterAssignment  | write        | write           | read           
         | Allocator           |
+-----------------------------+--------------+-----------------+-------------------------+---------------------+
| releaseExtraContainers      | write        | write           | write          
         | Allocator           |
+-----------------------------+--------------+-----------------+-------------------------+---------------------+
| releaseUnstartableContainer |              |                 |                
         | Allocator           |
+-----------------------------+--------------+-----------------+-------------------------+---------------------+
| getContainersOnAHost        |              |                 | read           
         | Allocator           |
+-----------------------------+--------------+-----------------+-------------------------+---------------------+
| getRequestsQueue            | read         |                 |                
         | Allocator           |
+-----------------------------+--------------+-----------------+-------------------------+---------------------+

I'm proposing to make the ContainerRequestState class thread-safe by:
i) Expose more granular synchronized APIs for things like:
1. Getting num containers on a host.
2. Getting the number of pending requests.
3. Checking if the queue is empty etc. / peeking a queue etc.

ii) Getting rid of the non-synchronized methods.

iii) Ensuring that methods that return a map/a list, don't return the original 
map/list, but a defensive copy instead.

The other methods are already synchronized.

This improvement is beneficial because:
1. This allows us to reason about concurrency more effectively when using the 
request-state in multi-threaded contexts.
2. It's cleaner for the ContainerRequestState class to expose higher level 
methods and not expose the guts of the class in its entirety.

--- Diff 2---
Modified the getContainersOnHost method to make a defensive copy.


Diffs (updated)
-----

  
samza-yarn/src/main/java/org/apache/samza/job/yarn/AbstractContainerAllocator.java
 b4789e62beb1120f11a8101664b10c34ae930e58 
  samza-yarn/src/main/java/org/apache/samza/job/yarn/ContainerRequestState.java 
3e3f48ce2b5c0802e8c7c4f09b632df2d2265c12 
  
samza-yarn/src/test/java/org/apache/samza/job/yarn/TestContainerAllocator.java 
2b1bdab3c8de3184e930c244a8cae55813c33565 
  
samza-yarn/src/test/java/org/apache/samza/job/yarn/TestContainerRequestState.java
 402fe784120ac40ed542d9fa60d6a6d7df9c8cda 
  
samza-yarn/src/test/java/org/apache/samza/job/yarn/TestHostAwareContainerAllocator.java
 0c7a09f3e4c4c2ce6788be729d0bf4a294243c68 
  samza-yarn/src/test/java/org/apache/samza/job/yarn/TestSamzaTaskManager.java 
9da1edf6ff165ef0306de8730853ad30551a9831 

Diff: https://reviews.apache.org/r/44820/diff/


Testing
-------

All unit tests pass.


Thanks,

Jagadish Venkatraman

Reply via email to