[ 
https://issues.apache.org/jira/browse/IGNITE-19519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vadim Pakhnushev updated IGNITE-19519:
--------------------------------------
    Summary: Deployment unit removal (was: Add remove verification after restart)

> Deployment unit removal
> -----------------------
>
>                 Key: IGNITE-19519
>                 URL: https://issues.apache.org/jira/browse/IGNITE-19519
>             Project: Ignite
>          Issue Type: New Feature
>            Reporter: Mikhail Pochatkin
>            Assignee: Vadim Pakhnushev
>            Priority: Major
>              Labels: iep-103, ignite-3
>
> h3. Deployment unit removal process
> The deployment unit must be removed from all cluster nodes. In order to 
> achieve this the following must be implemented:
>  # Change {{clusterDURecord.status}} to the {{OBSOLETE}} value. This operation 
> can fail if another process has already changed the status to {{OBSOLETE}} 
> or {{REMOVING}}. An undeployment process also cannot be started while the 
> deployment process is still in progress.
> After this step the deployment unit is no longer available for new code 
> execution. Code executions already in progress can still use it.
>  # A meta storage event must be fired to all target nodes in response to the 
> change of {{clusterDURecord.status}}.
>  # When a target node receives this event, the system must change 
> {{nodeDURecord.status}} to the {{OBSOLETE}} value.
>  # The node waits for all in-progress code executions that depend on this 
> deployment unit to finish. As soon as they have all finished, 
> {{nodeDURecord.status}} must be changed to the {{REMOVING}} value.
> From this point the deployment unit cannot be used for code execution, 
> either by new tasks or by old tasks (the latter is guaranteed by the 
> invariant that all old tasks have finished).
>  # On each change of {{nodeDURecord.status}} to the {{REMOVING}} value the 
> system can receive an event from meta storage and check whether all nodes 
> have {{nodeDURecord.status == REMOVING}}. If the condition is met, 
> {{clusterDURecord.status}} must be changed to {{REMOVING}} as well.
>  # Now the deployment unit can be removed from each target node, after which 
> the corresponding status records must be removed.
>  # On each removal of a {{nodeDURecord}} record from meta storage the system 
> can receive an event from meta storage and check whether any 
> {{nodeDURecord}} records remain for the given deployment unit. Once none 
> remain, the system must remove the {{clusterDURecord}} record for the 
> deployment unit.
>  
> Note that once the deployment unit has been removed, no class loaders 
> remain associated with it. Eventually the class loader should be collected 
> by GC and all of its classes must be unloaded from the JVM. This is a 
> critical requirement for avoiding memory leaks caused by repeated class 
> loading/unloading.
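
The status transitions in the removal process above can be sketched as a set of pure checks. This is only an illustration, not the Ignite implementation: the class and method names are hypothetical, and the {{UPLOADING}} and {{DEPLOYED}} statuses are assumed here to model "deployment in progress" and "deployment finished", since the description names only {{OBSOLETE}} and {{REMOVING}}.

```java
import java.util.Collection;

/** Hypothetical sketch; names are illustrative and not taken from the Ignite code base. */
final class UnitRemovalSketch {
    /** OBSOLETE and REMOVING come from the description; UPLOADING and DEPLOYED are assumed. */
    enum Status { UPLOADING, DEPLOYED, OBSOLETE, REMOVING }

    /**
     * Step 1: undeployment may start only from a fully deployed unit. It is rejected while
     * deployment is in progress or when another process already moved the unit on.
     */
    static boolean canStartUndeploy(Status clusterStatus) {
        return clusterStatus == Status.DEPLOYED;
    }

    /** Step 4: a node moves OBSOLETE -> REMOVING once no running task depends on the unit. */
    static Status nextNodeStatus(Status nodeStatus, int tasksInProgress) {
        if (nodeStatus == Status.OBSOLETE && tasksInProgress == 0) {
            return Status.REMOVING;
        }
        return nodeStatus;
    }

    /**
     * Step 5: the cluster record may move to REMOVING only when every node record is REMOVING.
     * Treating an empty collection as "not ready" is an assumption of this sketch.
     */
    static boolean allNodesRemoving(Collection<Status> nodeStatuses) {
        return !nodeStatuses.isEmpty()
                && nodeStatuses.stream().allMatch(s -> s == Status.REMOVING);
    }

    /** Step 7: the cluster record is removed once no node records remain for the unit. */
    static boolean canRemoveClusterRecord(Collection<Status> remainingNodeRecords) {
        return remainingNodeRecords.isEmpty();
    }
}
```

In a real system each check would run inside a conditional meta storage update so that concurrent processes cannot both perform the same transition; the sketch leaves that out and shows only the ordering of the states.
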
> h3. Node restart during unit removal process
> If a target node is restarted during the deployment unit removal process, 
> the node must find all deployment units with 
> {{clusterDURecord.status == OBSOLETE}} or 
> {{clusterDURecord.status == REMOVING}} for the restarted node and finish the 
> deployment unit removal process as described in the previous section.
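
The restart rule above amounts to a filter over the cluster-level status records. A minimal sketch follows, assuming the records visible to the restarted node are available as a map from a unit identifier to its cluster status; the map shape, the identifier format, and all names are hypothetical and not taken from Ignite.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch of the restart scan; not the Ignite implementation. */
final class RestartRecoverySketch {
    /** OBSOLETE and REMOVING come from the description; UPLOADING and DEPLOYED are assumed. */
    enum Status { UPLOADING, DEPLOYED, OBSOLETE, REMOVING }

    /**
     * Returns the identifiers of units whose removal was interrupted by the restart,
     * that is, units whose cluster status is OBSOLETE or REMOVING.
     * The result is sorted by identifier so that the scan order is deterministic.
     */
    static List<String> unitsToFinishRemoving(Map<String, Status> clusterStatuses) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Status> entry : new TreeMap<>(clusterStatuses).entrySet()) {
            Status status = entry.getValue();
            if (status == Status.OBSOLETE || status == Status.REMOVING) {
                result.add(entry.getKey());
            }
        }
        return result;
    }
}
```

For each returned unit the node would then resume the removal process at the step that matches its own node-level status, as described in the previous section.
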



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
