-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/45144/
-----------------------------------------------------------

Review request for samza, Navina Ramesh, Jagadish Venkatraman, and Yi Pan (Data 
Infrastructure).


Repository: samza


Description
-------

Persist the task-to-container mapping in the coordinator stream and use it to 
minimize the reassignment when the container count changes.

A new BalancingTaskNameGrouper interface exposes the balance() method, which 
can be implemented pluggably. 
GroupByContainerCount has been rewritten in java and the balance() 
functionality added. This is because the balance logic is specific to a grouper.

Detailed changes:
import-control.xml - Update imports that weren't needed in scala and add some 
for tests.
LocalityManager.java - Add TaskAssignment manager. This mostly just keeps the 
JobCoordinator code cleaner, but also associates the 2 managers for Host 
Affinity info.
GroupByContainerCount - THE BIG ONE. Rewritten in Java and it now implements 
BalancingTaskNameGrouper
GroupByContainerCountFactory - Rewritten in Java
TestStorageRecovery - Old test depended on the order of the partitions in a 
container. Now it doesn't.
TestGroupByContainerCount - Rewritten in Java and lots of tests added for 
balance()
BalancingTaskNameGrouper - New interface for the balance method. Exposes the 
new functionality without breaking backward compatibility
TaskAssignmentManager - Coordinator stream manager for task-to-container mapping
TaskNameGroupBalancer - Bridges the task mapping (balance) capability with 
taskname groupers, old and new
SetTaskContainerMapping - Coordinator strem message for the task-to-container 
mapping


Diffs
-----

  checkstyle/import-control.xml 53cb8b447240fea08d98ccfb12ed24bec6cbf67c 
  samza-core/src/main/java/org/apache/samza/container/LocalityManager.java 
acf93525ea5c97df187bbe7977e2ae9fea65b32b 
  
samza-core/src/main/java/org/apache/samza/container/grouper/task/BalancingTaskNameGrouper.java
 PRE-CREATION 
  
samza-core/src/main/java/org/apache/samza/container/grouper/task/GroupByContainerCount.java
 PRE-CREATION 
  
samza-core/src/main/java/org/apache/samza/container/grouper/task/TaskAssignmentManager.java
 PRE-CREATION 
  
samza-core/src/main/java/org/apache/samza/container/grouper/task/TaskNameGroupBalancer.java
 PRE-CREATION 
  
samza-core/src/main/java/org/apache/samza/coordinator/stream/AbstractCoordinatorStreamManager.java
 211b64215f26db49cd4411ff3fb41231145307d7 
  
samza-core/src/main/java/org/apache/samza/coordinator/stream/messages/SetContainerHostMapping.java
 4d093b500b7f3b582446634ced3e9d0b28371616 
  
samza-core/src/main/java/org/apache/samza/coordinator/stream/messages/SetTaskContainerMapping.java
 PRE-CREATION 
  
samza-core/src/main/scala/org/apache/samza/container/grouper/task/GroupByContainerCount.scala
 cb0a3bde15174c53f8eb3c0dbbb4f59dbf2589b1 
  
samza-core/src/main/scala/org/apache/samza/container/grouper/task/GroupByContainerCountFactory.scala
 8bbfd639cd9ea1d758d0daa45ce41093c1cb66f6 
  samza-core/src/main/scala/org/apache/samza/coordinator/JobCoordinator.scala 
06a96ad6ed786c22924017f894413bfa1ea34c06 
  
samza-core/src/test/java/org/apache/samza/container/grouper/task/TestGroupByContainerCount.java
 PRE-CREATION 
  
samza-core/src/test/java/org/apache/samza/container/grouper/task/TestTaskAssignmentManager.java
 PRE-CREATION 
  
samza-core/src/test/java/org/apache/samza/container/grouper/task/TestTaskNameGroupBalancer.java
 PRE-CREATION 
  samza-core/src/test/java/org/apache/samza/container/mock/ContainerMocks.java 
PRE-CREATION 
  
samza-core/src/test/java/org/apache/samza/coordinator/stream/MockCoordinatorStreamSystemFactory.java
 e0d4aa1016d61ce328d7ff74b58f7b8f7682f386 
  
samza-core/src/test/java/org/apache/samza/coordinator/stream/MockCoordinatorStreamWrappedConsumer.java
 429573b480112c7491303dc410d78f37a308c4a7 
  samza-core/src/test/java/org/apache/samza/storage/TestStorageRecovery.java 
53207ad7e87fe491c6ae95ae6c590c6d5668d3dc 
  
samza-core/src/test/scala/org/apache/samza/container/grouper/task/TestGroupByContainerCount.scala
 6e9c6fa579a5901000bea0601c771783d8334f0e 

Diff: https://reviews.apache.org/r/45144/diff/


Testing
-------

A bunch of new unit tests have been added. 

Also tested with a test job. The task mapping (initially missing) is added the 
first time the job is run. It is then used as expected to reduce task 
reassignment as the container count was adjusted from 4->3->5 on subsequent 
runs.


Thanks,

Jake Maes

Reply via email to