The current ACM state machine works, but it is inconsistent in the way it handles error states and failed transitions. A composition and its elements can get "stuck" in transition states.
We need to:
- Specify what the current state machine is for both compositions and elements and describe what the state machine for both should be
- Specify what the behaviour of the runtime and participants should be in each state
- Specify what the behaviour of the runtime and participants should be during transitions
Specifically we need to clarify:
- State of the composition elements
- State of the overall composition is derived from the composition element states
- Admin state/Running state
- When all the elements are fully up and configured, they go to state Passive; when all elements are in Passive, the full composition goes to Passive
- Error states: Are they parallel states or part of the same state?
- There should be “it didn’t work” states like “Passive-Error” or “Run_Error” (names to be decided later)
- Describe what the “Running” state means and what the participant should do in Passive->Running and Running->Passive transitions.
- Say a K8S service crashes: how do we feed that back? Possibly through a Running_Error state. The state of the pod is only checked during startup; it is not checked periodically. There should be supervision.
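The supervision idea in the last point can be sketched as a single periodic check. This is only an illustration under stated assumptions: `ElementState`, `supervise_once` and the `is_healthy` probe are hypothetical names, not part of the existing ACM code, and `RUNNING_ERROR` stands in for an error state whose name has not been decided.

```python
from enum import Enum
from typing import Callable


class ElementState(Enum):
    RUNNING = "RUNNING"
    # Placeholder name: the final name of the error state is still open.
    RUNNING_ERROR = "RUNNING_ERROR"


def supervise_once(state: ElementState, is_healthy: Callable[[], bool]) -> ElementState:
    """Run one supervision check and return the element's next state.

    `is_healthy` is a hypothetical probe supplied by the participant,
    for example a check that the K8S pod is still up.
    """
    if state is ElementState.RUNNING and not is_healthy():
        return ElementState.RUNNING_ERROR
    # Elements that are not RUNNING, or that pass the probe, keep their state.
    return state
```

A participant could call `supervise_once` on a timer and report any state change back to the runtime, so that a crash after startup no longer goes unnoticed.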
State Machine for Automation Compositions
Current State Machine
- ACM in UNINITIALIZED state: all elements of an ACM are in the UNINITIALIZED state; no applications are deployed, and policy types are neither deployed nor present in Api.
- User triggers the move of the ACM from UNINITIALIZED to PASSIVE: runtime-acm moves the elements from the UNINITIALIZED state to UNINITIALIZED_TO_PASSIVE.
- Element in UNINITIALIZED_TO_PASSIVE:
- participant-k8s: deploys applications
- participant-policy: creates policy types in Api and deploys them with Pap.
- participant-http: configures applications.
- Element in PASSIVE state:
- participant-k8s: applications are deployed.
- participant-policy: policy types are created in Api and deployed with Pap.
- participant-http: applications are configured.
- ACM in PASSIVE state: all elements are moved to PASSIVE, all applications are deployed and configured.
- User triggers the move of the ACM from PASSIVE to UNINITIALIZED: runtime-acm moves the elements from the PASSIVE state to PASSIVE_TO_UNINITIALIZED.
- Element in PASSIVE_TO_UNINITIALIZED:
- participant-k8s: undeploys applications
- participant-policy: undeploys policy types with Pap and deletes them in Api.
- Element in UNINITIALIZED state:
- participant-k8s: applications are undeployed.
- participant-policy: policy types are not deployed and not present in Api.
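The current behaviour described above can be summarised as a small transition table. The following is a minimal sketch, not the actual runtime-acm implementation: the state names come from the list above, while `State`, `start_transition` and `complete_transition` are hypothetical helper names.

```python
from enum import Enum


class State(Enum):
    UNINITIALIZED = "UNINITIALIZED"
    UNINITIALIZED_TO_PASSIVE = "UNINITIALIZED_TO_PASSIVE"
    PASSIVE = "PASSIVE"
    PASSIVE_TO_UNINITIALIZED = "PASSIVE_TO_UNINITIALIZED"


# (current stable state, requested target) -> transitional state
TRANSITIONS = {
    (State.UNINITIALIZED, State.PASSIVE): State.UNINITIALIZED_TO_PASSIVE,
    (State.PASSIVE, State.UNINITIALIZED): State.PASSIVE_TO_UNINITIALIZED,
}

# transitional state -> stable state reached when the participant finishes
COMPLETES_TO = {
    State.UNINITIALIZED_TO_PASSIVE: State.PASSIVE,
    State.PASSIVE_TO_UNINITIALIZED: State.UNINITIALIZED,
}


def start_transition(current: State, target: State) -> State:
    """Return the transitional state an element enters for a user request."""
    try:
        return TRANSITIONS[(current, target)]
    except KeyError:
        raise ValueError(f"no transition from {current.name} to {target.name}") from None


def complete_transition(transitional: State) -> State:
    """Return the stable state reached once the participant's work succeeds."""
    try:
        return COMPLETES_TO[transitional]
    except KeyError:
        raise ValueError(f"{transitional.name} is not a transitional state") from None
```

Note that this table has no failure path: if a participant's work fails, nothing moves the element out of the transitional state, which is how an element can get stuck.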
Proposed State Machine
State Machine for Automation Composition Elements
Current State Machine
TBC
Proposed State Machine
- ACM in UNINITIALIZED state: all elements of an ACM are in the UNINITIALIZED state; no applications are deployed, and policy types are neither deployed nor present in Api.
- User triggers the move of the ACM from UNINITIALIZED to RUNNING: runtime-acm moves all elements from the UNINITIALIZED state to UNINITIALIZED_TO_PASSIVE.
- Element in UNINITIALIZED_TO_PASSIVE:
- participant-k8s: deploys applications
- participant-policy: creates policy types in Api and deploys them with Pap.
- participant-http: does nothing.
- Element in UNINITIALIZED_TO_PASSIVE_ERROR state: an error occurred during deployment.
- ACM in UNINITIALIZED_TO_PASSIVE_ERROR state: at least one element is in UNINITIALIZED_TO_PASSIVE_ERROR state.
- Element in PASSIVE state:
- participant-k8s: applications are deployed.
- participant-policy: policy types are created in Api and deployed with Pap.
- participant-http: applications are not configured yet.
- ACM in PASSIVE state: all elements have moved to PASSIVE and all applications are deployed. In this state, runtime-acm moves the elements from the PASSIVE state to PASSIVE_TO_RUNNING.
- Element in PASSIVE_TO_RUNNING:
- participant-k8s: does nothing (or possibly checks whether the applications are running).
- participant-policy: does nothing (or possibly checks whether the policy types are running).
- participant-http: configures applications.
- Element in PASSIVE_TO_RUNNING_ERROR state: an error occurred during configuration.
- ACM in PASSIVE_TO_RUNNING_ERROR state: at least one element is in PASSIVE_TO_RUNNING_ERROR state.
- Element in RUNNING state:
- participant-k8s: applications are deployed.
- participant-policy: policy types are created in Api and deployed with Pap.
- participant-http: applications are configured.
- ACM in RUNNING state: all elements of an ACM are in the RUNNING state; all applications are running.
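The proposal above adds a failure outcome to each transition and derives the composition state from the element states. A minimal sketch of both rules follows, assuming hypothetical helper names (`State`, `finish_transition`, `composition_state`). The proposal does not say what the composition should report while elements are in a mix of non-error states, so the sketch returns `None` in that case rather than guessing.

```python
from enum import Enum
from typing import Iterable, Optional


class State(Enum):
    UNINITIALIZED = "UNINITIALIZED"
    UNINITIALIZED_TO_PASSIVE = "UNINITIALIZED_TO_PASSIVE"
    UNINITIALIZED_TO_PASSIVE_ERROR = "UNINITIALIZED_TO_PASSIVE_ERROR"
    PASSIVE = "PASSIVE"
    PASSIVE_TO_RUNNING = "PASSIVE_TO_RUNNING"
    PASSIVE_TO_RUNNING_ERROR = "PASSIVE_TO_RUNNING_ERROR"
    RUNNING = "RUNNING"


# transitional state -> state reached when the participant's work succeeds
ON_SUCCESS = {
    State.UNINITIALIZED_TO_PASSIVE: State.PASSIVE,
    State.PASSIVE_TO_RUNNING: State.RUNNING,
}

# transitional state -> error state reached when the participant's work fails
ON_FAILURE = {
    State.UNINITIALIZED_TO_PASSIVE: State.UNINITIALIZED_TO_PASSIVE_ERROR,
    State.PASSIVE_TO_RUNNING: State.PASSIVE_TO_RUNNING_ERROR,
}


def finish_transition(transitional: State, succeeded: bool) -> State:
    """Resolve a transitional element state to a stable or an error state."""
    table = ON_SUCCESS if succeeded else ON_FAILURE
    try:
        return table[transitional]
    except KeyError:
        raise ValueError(f"{transitional.name} is not a transitional state") from None


def composition_state(element_states: Iterable[State]) -> Optional[State]:
    """Derive the composition state from the states of its elements."""
    states = list(element_states)
    if not states:
        raise ValueError("a composition needs at least one element")
    # One failed element is enough to put the whole composition in error.
    for error_state in ON_FAILURE.values():
        if error_state in states:
            return error_state
    # All elements agree: the composition takes that state.
    if all(state is states[0] for state in states):
        return states[0]
    # Mixed non-error states: not specified by the proposal.
    return None
```

Because every transitional state now resolves either to a stable state or to an explicit error state, a failed deployment or configuration becomes visible at the composition level instead of leaving elements in transition.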