EMCO Resources with Status

...

The Instantiated state may be re-invoked to start a new set of rsync resources and cluster resources.

...


PlantUML Macro
@startuml
hide empty description
title Deployment Intent Group state transitions\nwith AppContext state transitions illustrated
state Created #LightGreen
state Approved #LightGreen
state Instantiated #LightGreen
state Terminated #LightGreen

[*] -down-> Created : Create Deployment\nIntent Group
Created --> Created : //Modify intents//
Created -down-> Approved : POST approve
note left of Created
  //Modify intents// encompasses
  adding/deleting/updating the
  intents assigned to a Deployment
  Intent Group as well as updating
  the Deployment Intent Group itself
end note

  

Approved -down-> Instantiated : POST instantiate
Approved -> Created : //Modify intents//

Instantiated -down-> Terminated: POST terminate
Terminated -up-> Instantiated: POST instantiate


state Instantiated #LightGreen {
    state "Instantiated" as acInstantiated #LightBlue
    state Instantiating #LightBlue
    state InstantiateFailed #LightBlue
 
    Instantiating -> InstantiateFailed
    Instantiating -> acInstantiated
 
}

state Terminated #LightGreen {
 
    state "Terminated" as acTerminated #LightBlue
    state "Terminating" #LightBlue
    state "TerminateFailed" as termFailed #LightBlue
    Terminating -> termFailed
    Terminating -> acTerminated
 }

Terminated --> Created: //Modify intents//
Created --> [*]: DELETE Deployment Intent Group
@enduml


AppContext State

The rsync process will maintain a top level state for the AppContext.  The states are:

  • Instantiating:  Once rsync is invoked to instantiate an AppContext, the state will be set to Instantiating.
  • Instantiated:  rsync will set the AppContext state to Instantiated after all Resources in the AppContext have been Applied.
  • InstantiateFailed:  This indicates that one or more Resources Failed to be applied.
  • PreTerminate:  If rsync is invoked to terminate an AppContext which is still in the Instantiating state, rsync needs to shut down any threads that are still trying to instantiate resources before beginning the terminate sequence.
  • Terminating:  When terminate is invoked, this state will be entered directly if the AppContext is in the Instantiated or InstantiateFailed state.  Otherwise, it will enter this state from PreTerminate after the instantiate sequence has been shut down.
  • Terminated:  This indicates that rsync has successfully Deleted all resources.
  • TerminateFailed:  This indicates that rsync has received a failure response from one or more Resources when attempting to delete them.  
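
The state list above can be read as a small transition table.  The following is a minimal sketch in Python, not the rsync implementation; the event names (grpc_instantiate, all_applied, and so on) are hypothetical labels chosen for illustration, and only the transitions described on this page are modeled.

```python
# Allowed AppContext transitions, keyed by (current state, event).
# Event names are hypothetical; None stands for "no AppContext state yet".
TRANSITIONS = {
    (None, "grpc_instantiate"): "Instantiating",
    ("Instantiating", "all_applied"): "Instantiated",
    ("Instantiating", "any_apply_failed"): "InstantiateFailed",
    ("Instantiating", "grpc_terminate"): "PreTerminate",
    ("Instantiated", "grpc_terminate"): "Terminating",
    ("InstantiateFailed", "grpc_terminate"): "Terminating",
    ("PreTerminate", "instantiate_cleaned_up"): "Terminating",
    ("Terminating", "all_deleted"): "Terminated",
    ("Terminating", "any_delete_failed"): "TerminateFailed",
}


def next_state(current, event):
    """Return the next AppContext state, or raise ValueError if not allowed."""
    try:
        return TRANSITIONS[(current, event)]
    except KeyError:
        raise ValueError(f"no transition from {current!r} on {event!r}")


# A terminate request that arrives while instantiation is still running
# passes through PreTerminate before reaching Terminating.
state = next_state(None, "grpc_instantiate")
state = next_state(state, "grpc_terminate")
state = next_state(state, "instantiate_cleaned_up")
print(state)
```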


PlantUML Macro
@startuml
hide empty description
title AppContext State Transition
state Instantiated #LightBlue
state Instantiating #LightBlue
state InstantiateFailed #LightBlue
state PreTerminate #LightBlue
state Terminating #LightBlue
state Terminated #LightBlue
state TerminateFailed #LightBlue
 
[*] -> Instantiating : **gRPC instantiate**
Instantiating -> InstantiateFailed : //any resource//\n//Apply Failed//
Instantiating -> Instantiated : //all resources//\n//Applied//

Instantiating --> PreTerminate : **gRPC terminate**
Instantiated --> Terminating : **gRPC terminate**
InstantiateFailed --> Terminating : **gRPC terminate**
PreTerminate -> Terminating : //old instantiate//\n//go routines//\n//cleaned up//

note left of PreTerminate
  Transition to PreTerminate/Terminating
  is allowed to occur when the controlling
  EMCO resource (e.g. DeploymentIntentGroup
  or Cluster) is in the Instantiated state
end note

Terminating -> Terminated : //all resources//\n//Deleted//
Terminating --> TerminateFailed : //some resources//\n//Delete Failed//
@enduml


Rsync resource state values

The state of the rsync resource is maintained in the EMCO AppContext by rsync.  The state values are defined as follows.

  • Pending:  Upon initial creation by ncm (for cluster network intents) or orchestrator (for DIGs), the rsync resources will be initialized to a Pending status to indicate they have not been handled by rsync yet.
  • Applied:  This indicates that rsync has successfully applied the rsync resource to a destination cluster.  This does not indicate anything about the actual status of the corresponding cluster resource(s) - other than it is expected that the cluster resource does exist.
  • Failed:  This indicates that rsync has received a failure response when either attempting to apply or delete the rsync resource from the destination cluster.  rsync is taking no further action with this resource.
  • Retrying:  This indicates that rsync is continuing to attempt to apply or delete the rsync resource from the destination cluster.  This may occur because connectivity to the destination cluster is currently unavailable, but may resume at a later time.  This will continue until a different lifecycle state is invoked on the controlling EMCO resource.
  • Terminated:  This indicates that rsync has successfully deleted the rsync resource from the destination cluster.  This does not indicate anything about the actual status of the corresponding cluster resource(s) - other than it is expected that the cluster resource no longer exists or is in the process of terminating.
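
As an illustration of how these per-resource values relate to the AppContext state during instantiation, the sketch below rolls a list of resource states up into a single AppContext state.  The function name is hypothetical, and the sketch assumes the roll-up is evaluated over all resources of the AppContext at once; the actual rsync code may structure this differently.

```python
def appcontext_state_for_instantiate(resource_states):
    """Roll rsync resource states up into an AppContext state (sketch only)."""
    states = list(resource_states)
    # One or more resources Failed to be applied -> InstantiateFailed.
    if any(s == "Failed" for s in states):
        return "InstantiateFailed"
    # All resources Applied -> Instantiated.
    if states and all(s == "Applied" for s in states):
        return "Instantiated"
    # Otherwise some resources are still Pending or Retrying.
    return "Instantiating"


print(appcontext_state_for_instantiate(["Applied", "Applied"]))
print(appcontext_state_for_instantiate(["Applied", "Retrying"]))
print(appcontext_state_for_instantiate(["Applied", "Failed"]))
```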


PlantUML Macro
@startuml
hide empty description
title Rsync Resource State Transition (with corresponding AppContext states)
 
[*] -left-> Instantiating #LightBlue : **gRPC**\n**instantiate**
[*] -right-> Pending : **AppContext**\n**changes to**\n**Instantiating**
Pending -> Applied : Successful\napply
Pending --> Failed : Failed apply\nor delete
Pending --> Retrying

Failed --> Deleted : **AppContext**\n**changes to**\n**Terminating**
Pending --> Deleted : **AppContext**\n**changes to**\n**Terminating**
Retrying --> Deleted : **AppContext**\n**changes to**\n**Terminating**
Applied --> Pending : **AppContext**\n**changes to**\n**Terminating**
Pending --> Deleted : successful\ndelete
Retrying -> Applied
Retrying --> Retrying 
Retrying --> Deleted : successful\ndelete
Retrying --> Failed : Exceed\nretry time

Instantiating -left-> PreTerminate #LightBlue : **//gRPC//**\n**//terminate//**
note left of PreTerminate
  On gRPC terminate, if the
  AppContext is Instantiating
  then it moves to PreTerminate
  and any resource states that
  have not reached a final state
  (Applied or Failed) have to be
  set to Deleted
end note
PreTerminate -down-> Terminating #LightBlue : //all resource states//\n//etc. cleaned up//
X #LightBlue -> Terminating : **//gRPC//**\n**//terminate//**\n//from InstantiateFailed//\n//or Instantiated//
Failed --> InstantiateFailed #LightBlue : //any resource//
Failed --> TerminateFailed #LightBlue : //any resource//
Applied --> Instantiated #LightBlue : //all resources//
Deleted --> Terminated #LightBlue : //all resources//
@enduml


Cluster resource status

The status of resources deployed by rsync to clusters is detected as follows.

  1. When rsync instantiates rsync resources, it will also instantiate a ResourceBundleState CR to the cluster.  For a given Composite Application, a ResourceBundleState CR will be deployed for each App (in the Composite App) per cluster.  A label will be applied by rsync to all cluster resources of a given App and will be matched by label to the corresponding ResourceBundleState CR in the cluster.  The label format is:  "emco/deployment-id: <AppContext identifier>-<app-name>"
  2. A 'monitor' pod is present in each cluster and monitors all resources with the "emco/deployment-id" label.  When it detects changes to those resources, it will update the matching ResourceBundleState CR with the details of the resource.  In the example ResourceBundleState CR below, all pod resources that are labeled with emco/deployment-id: 171887448792644816-sink will be included in the 'podStatuses' array.
  3. A Watcher thread is started per cluster by rsync to watch for changes to ResourceBundleState CRs in the cluster.  When an updated CR is detected, the Watcher retrieves it and saves it into the corresponding AppContext per App/Cluster so it is available to provide information for cluster resource queries.

Code Block
apiVersion: k8splugin.io/v1alpha1
kind: ResourceBundleState
metadata:
  labels:
    emco/deployment-id: 171887448792644816-sink
  name: sink-171887448792644816
  namespace: default
spec:
  selector:
    matchLabels:
      emco/deployment-id: 171887448792644816-sink
status:
  ready: false
  resourceCount: 0
  configMapStatuses: []
  daemonSetStatuses: []
  deploymentStatuses: []
  ingressStatuses: []
  jobStatuses: []
  podStatuses: []
  secretStatuses: []
  serviceStatuses: []
  statefulSetStatuses: []
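
To make the label matching concrete, the following Python sketch builds the "emco/deployment-id" value from an AppContext identifier and app name and selects the pods that would be reported for the example CR above.  It only illustrates the matching rule; the helper names, the pod names, and the second app name ("firewall") are hypothetical, and the real monitor relies on Kubernetes label selectors rather than this code.

```python
LABEL_KEY = "emco/deployment-id"


def deployment_id(appcontext_id, app_name):
    """Build the label value: <AppContext identifier>-<app-name>."""
    return f"{appcontext_id}-{app_name}"


def matches(resource, match_labels):
    """True if the resource carries every label in match_labels."""
    labels = resource.get("metadata", {}).get("labels", {})
    return all(labels.get(k) == v for k, v in match_labels.items())


# Hypothetical pods in one cluster; only the first belongs to the 'sink' app.
pods = [
    {"metadata": {"name": "sink-pod", "labels": {LABEL_KEY: "171887448792644816-sink"}}},
    {"metadata": {"name": "firewall-pod", "labels": {LABEL_KEY: "171887448792644816-firewall"}}},
    {"metadata": {"name": "unlabeled-pod"}},
]

match_labels = {LABEL_KEY: deployment_id("171887448792644816", "sink")}
pod_statuses = [p["metadata"]["name"] for p in pods if matches(p, match_labels)]
print(pod_statuses)
```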

The cluster resource status is provided in two forms.

  1. The actual status{} portion of the cluster resource (if present) is made available in the information returned via the ResourceBundleState CR.
  2. Summarized in a value as follows.
    • Unknown:  The unknown status represents the case where a ResourceBundleState CR has not been received yet, or that the ResourceBundleState CR does not support that resource type.
    • NotPresent:  For resource types (Kinds) that are supported by the ResourceBundleState CR, if an rsync resource does not have a corresponding cluster resource in the CR, then the cluster status of the resource is NotPresent.
    • Present:  For rsync resources with a corresponding cluster resource in the ResourceBundleState CR, the cluster status is Present.
    • TBD: Further work can be done to summarize the Status{} portion of cluster resources to identify the status more precisely than Present - such as: Active, Ready, Not Ready, etc.
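
The summarized value can be sketched as a small lookup.  This is an illustrative Python sketch only: the set of supported Kinds is an assumption inferred from the status arrays in the example ResourceBundleState CR, and the shape of bundle_state (a mapping of Kind to the resource names present in the CR) is hypothetical.

```python
# Assumption: Kinds inferred from the *Statuses arrays in the example CR.
SUPPORTED_KINDS = {
    "ConfigMap", "DaemonSet", "Deployment", "Ingress", "Job",
    "Pod", "Secret", "Service", "StatefulSet",
}


def cluster_status(kind, name, bundle_state):
    """Summarize the cluster status of one rsync resource.

    bundle_state is None when no ResourceBundleState CR has been received,
    otherwise a dict mapping Kind -> set of resource names found in the CR.
    """
    if bundle_state is None or kind not in SUPPORTED_KINDS:
        return "Unknown"
    if name in bundle_state.get(kind, set()):
        return "Present"
    return "NotPresent"


received = {"Deployment": {"sink"}, "Service": set()}
print(cluster_status("Deployment", "sink", received))
print(cluster_status("Service", "sink-svc", received))
print(cluster_status("Deployment", "sink", None))
```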


Instantiate Sequence

This illustrates the Deployment Intent Group instantiate sequence.


PlantUML Macro
@startuml
title Deployment Intent Group Instantiation Sequence
actor Admin

box "Orchestrator"
participant scheduler
participant generic_placement_controller
end box

box "EMCO DB"
database mongo
end box

box "AppContext"
participant etcd
end box

box "resource synchronizer"
participant rsync
participant appcontext_watcher
collections worker_thread
collections cluster_watcher
end box

box "Edge Cluster"
participant API_Server
participant cluster_etcd
participant monitor
end box


Admin -> scheduler : POST instantiate\nDeployment Intent Group
activate scheduler
activate scheduler #Red

scheduler -> mongo : retrieve Deployment Intent Group
scheduler -> Admin : ERROR if wrong state
deactivate scheduler #Red

scheduler -> generic_placement_controller : create AppContext
activate generic_placement_controller

generic_placement_controller -> mongo : get generic placement intent
generic_placement_controller -> etcd : create AppContext and metadata
generic_placement_controller ->scheduler : done
deactivate generic_placement_controller


scheduler -> scheduler : invoke all other\nplacement and action\ncontrollers (not shown)

scheduler -> rsync : gRPC call to instantiate AppContext
activate rsync
activate rsync #Red

rsync -> etcd : Set Appcontext state=Instantiating\nSet Resources state=Pending
rsync -> scheduler : Return Error if one occurs
deactivate rsync #Red
scheduler --> Admin : Return ERROR

create appcontext_watcher
rsync -> appcontext_watcher: invoke thread to instantiate AppContext
activate appcontext_watcher
rsync -> scheduler : return OK
deactivate rsync
scheduler -> Admin : Return OK
deactivate scheduler

create worker_thread
loop for each app
appcontext_watcher -> worker_thread : start thread to handle app
loop for each cluster
worker_thread -> worker_thread : start thread to handle cluster
create cluster_watcher
worker_thread -> cluster_watcher : invoke Watcher for\nResourceBundleState CRs\n(may already exist)
loop until all resources are Applied or a Failure occurs
worker_thread -> API_Server : Send Resource Apply to Cluster
worker_thread -> etcd : If success Resource State = Applied\nElse If Connectivity Error Resource State = Retrying\nElse Resource State = Failed
end
worker_thread -> API_Server : Apply ResourceBundleState CR

end
end
worker_thread -> appcontext_watcher : all worker threads complete
appcontext_watcher -> etcd : if any Resource = Failed, AppContext State = InstantiateFailed\nElse All Resources=Applied, AppContext State = Instantiated

deactivate appcontext_watcher
@enduml
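
The per-resource outcome handling in the worker loop of the diagram above can be sketched as follows.  This is a hedged Python illustration, not the rsync code: apply_fn and the stub functions are hypothetical, and mapping Python's ConnectionError to the "connectivity error" case is an assumption made for the example.

```python
def state_after_apply(apply_fn, resource):
    """Return the rsync resource state that results from one apply attempt."""
    try:
        apply_fn(resource)
    except ConnectionError:
        # Cluster unreachable: keep trying later.
        return "Retrying"
    except Exception:
        # Explicit failure response from the cluster: no further action.
        return "Failed"
    return "Applied"


# Hypothetical stand-ins for the call that sends a resource to a cluster.
def apply_ok(resource):
    return None


def apply_unreachable(resource):
    raise ConnectionError("cluster unreachable")


def apply_rejected(resource):
    raise ValueError("resource rejected by API server")


resource = {"kind": "Deployment", "name": "sink"}
results = [
    state_after_apply(fn, resource)
    for fn in (apply_ok, apply_unreachable, apply_rejected)
]
print(results)
```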


Terminate Sequence

This illustrates the Deployment Intent Group terminate sequence.


PlantUML Macro
@startuml
title Deployment Intent Group Termination Sequence

actor Admin

box "Orchestrator"
participant scheduler
participant generic_placement_controller
end box

box "EMCO DB"
database mongo
end box

box "AppContext"
participant etcd
end box

box "resource synchronizer"
participant rsync
participant old_appcontext_watcher
participant appcontext_watcher
collections old_worker_thread
collections worker_thread
collections cluster_watcher
end box

box "Edge Cluster"
participant API_Server
participant cluster_etcd
participant monitor
end box

activate old_appcontext_watcher
note right of old_appcontext_watcher : may be a thread\nstill trying to instantiate\nthe appcontext
activate old_worker_thread
note left of old_worker_thread : may be thread(s) still trying\nto instantiate resources

old_appcontext_watcher <- etcd : watching AppContext state

Admin -> scheduler : POST terminate\nDeployment Intent Group
activate scheduler
activate scheduler #Red

scheduler -> mongo : retrieve Deployment Intent Group
scheduler -> Admin : ERROR if wrong state
deactivate scheduler #Red


scheduler -> rsync : gRPC call to terminate AppContext
activate rsync
activate rsync #Red


rsync -> etcd : If AppContext==Instantiating\nAppContext=PreTerminate\nElse AppContext=Terminating
create appcontext_watcher
rsync -> appcontext_watcher ++ : invoke thread to terminate AppContext
rsync -> scheduler : Return Error if one occurs at this stage
deactivate rsync #Red
scheduler --> Admin : Return ERROR
rsync -> scheduler : return OK
deactivate rsync
scheduler -> Admin : Return OK
deactivate scheduler


etcd -> appcontext_watcher : start watching for AppContext=Terminating
etcd -> old_appcontext_watcher ++ : detects AppContext=PreTerminate
old_appcontext_watcher -> old_worker_thread : terminate any running Instantiate threads
deactivate old_worker_thread
etcd <- old_appcontext_watcher -- : Set AppContext=Terminating
deactivate old_appcontext_watcher

appcontext_watcher <- etcd : Detects AppContext=Terminating
loop initialize all resources states for terminate handling
appcontext_watcher -> etcd : If Resource = Applied, Resource= Pending\nelseif Resource = (Pending, Failed, Retrying), Resource=Deleted
end

activate appcontext_watcher

loop for each app
create worker_thread
appcontext_watcher -> worker_thread : start thread to handle app
loop for each cluster
worker_thread -> worker_thread : start thread to handle cluster
loop until all resources are Deleted or Failed
worker_thread -> API_Server : Send Resource Delete to Cluster
worker_thread -> etcd : If success Resource State = Deleted\nElse If Connectivity Error Resource State = Retrying\nElse Resource State = Failed
end

loop waiting until deleted resources disappear from ResourceBundleState CR
monitor -> cluster_etcd : Update ResourceBundleState CR\nas resources get deleted
cluster_watcher <- cluster_etcd : Notifications of changes in\nResourceBundleState CR
cluster_watcher -> etcd : Update Status at app-cluster\nin AppContext
worker_thread -> etcd : wait until ResourceBundleState CR is empty
end
worker_thread -> API_Server : Delete ResourceBundleState CR

end
end
worker_thread -> appcontext_watcher : all worker threads complete
appcontext_watcher -> etcd : if any Resource = Failed, AppContext State = TerminateFailed\nElse All Resources=Deleted, AppContext State = Terminated

deactivate appcontext_watcher
@enduml
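
The "initialize all resources states for terminate handling" loop in the diagram above amounts to a simple mapping.  A minimal Python sketch follows; the function name is hypothetical, and leaving any other state unchanged is an assumption, since the diagram only covers the four states listed.

```python
def init_for_terminate(resource_state):
    """Reset one rsync resource state at the start of terminate handling."""
    if resource_state == "Applied":
        # Applied resources still need to be deleted from the cluster.
        return "Pending"
    if resource_state in ("Pending", "Failed", "Retrying"):
        # Nothing was successfully applied, so treat the resource as Deleted.
        return "Deleted"
    # Assumption: any other state is left as it is.
    return resource_state


before = ["Applied", "Pending", "Failed", "Retrying"]
after = [init_for_terminate(s) for s in before]
print(after)
```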



Status Query Sequence

This illustrates a status query sequence.


PlantUML Macro
@startuml
title Status / State Update Sequence

actor Admin

box "Orchestrator"
participant status_api
end box

box "EMCO DB"
database mongo
end box

box "AppContext"
participant etcd
end box

box "resource synchronizer"
participant rsync
participant appcontext_watcher
collections worker_thread
collections cluster_watcher
end box

box "Edge Cluster"
participant API_Server
participant cluster_etcd
participant monitor
end box


worker_thread -> API_Server : App Resources Applied
worker_thread -> API_Server : Create ResourceBundleState CR

loop monitoring for lifetime of ResourceBundleState CR
monitor <- cluster_etcd : watches for changes on labeled resources
monitor -> cluster_etcd : updates associated ResourceBundleState CR
cluster_watcher <- cluster_etcd : watches for changes on ResourceBundleState CR
cluster_watcher -> etcd : Update app/cluster Status object in AppContext
end

Admin -> status_api : GET Deployment Intent\nGroup Status
status_api -> mongo : get Deployment Intent Group
status_api -> etcd : access AppContext State\n(from Deployment Intent Group state)
loop for all requested apps/clusters/resources
status_api -> etcd : access AppContext Cluster Status
status_api -> etcd : access AppContext Resource State
end
status_api -> status_api : prepare response
status_api -> Admin : return Status query results
@enduml

...




Status Query

The status query, and variations with query parameters, on an EMCO resource will present the information described previously to the caller.  The basic status query for the two EMCO resources discussed above will look like the following:

...

The following table shows the essential structure of the status query response with a description of which elements are present based on the 'output' parameter.


| Format of the status query response | Description | 'summary' query | 'all' query | 'detail' query | 'rsync' query |
|---|---|---|---|---|---|
| { | | | | | |
|   "name": "<name>", | The name of the Deployment Intent Group or Cluster | X | X | X | X |
|   "project": "<project name>", | Present for the Deployment Intent Group | X | X | X | X |
|   "composite-app-name": "<composite-app-name>", | Present for the Deployment Intent Group | X | X | X | X |
|   "composite-app-version": "<composite-app-version>", | Present for the Deployment Intent Group | X | X | X | X |
|   "composite-profile-name": "<composite-profile-name>", | Present for the Deployment Intent Group | X | X | X | X |
|   "state": "[ Created, Approved, Instantiated, Terminated ]", | 'state' is the action made by the user | X | X | X | X |
|   "rsync-status": { | 'rsync-status' is the aggregated rsync-status of the resources - subject to query parameter filters | X | X | X | X |
|       "Pending": 0, | elements with zero can be dropped | X | X | X | X |
|       "Applied": 5, | | X | X | X | X |
|       "Failed": 2, | | X | X | X | X |
|       "Retrying": 3, | | X | X | X | X |
|       "Terminated": 0 | | X | X | X | X |
|   }, | | X | X | X | X |
|   "cluster-status" : { | 'cluster-status' is the aggregated cluster-status of the resources - subject to query parameter filters | X | X | X | |
|       "NotPresent": 0, | | X | X | X | |
|       "Present": 5, | | X | X | X | |
|       "Unknown": 5 | | X | X | X | |
|   }, | | X | X | X | |
|   "resources": [ | array of resources organized by app in the composite-app | | X | X | X |
|     { | | | X | X | X |
|       "app-name": "collectd", | | | X | X | X |
|       "clusters": [ | array of clusters in the app | | X | X | X |
|         { | | | X | X | X |
|           "name": "cluster1", | | | X | X | X |
|           "resources": [ | array of resources in the cluster | | X | X | X |
|             { | | | X | X | X |
|               "GVK": { | The GVK will come from the AppContext except for cases where the resource is only a cluster resource - e.g. an rsync resource that is a Deployment can result in cluster resources of both Deployment and Pod(s). | | X | X | X |
|                 "Group": "<group>", | | | X | X | X |
|                 "Version": "<version>", | | | X | X | X |
|                 "Kind": "<kind>" | | | X | X | X |
|               }, | | | X | X | X |
|               "Name": "<resource name>", | The name of the resource from the AppContext (or cluster as described for GVK) | | X | X | X |
|               "metadata": { }, | The 'metadata' element of the cluster resource as received in the ResourceBundleState CR.  In the case of an 'output=rsync' query, this will be the 'metadata' element from the AppContext. | | | X | X |
|               "spec": { }, | The 'spec' element of the cluster resource as received in the ResourceBundleState CR.  In the case of an 'output=rsync' query, this will be the 'spec' element from the AppContext. | | | X | X |
|               "status": { }, | The 'status' element of the k8s resource as received in the ResourceBundleState CR (if the resource has one).  In the case of an 'output=rsync' query, this will not be present. | | | X | |
|               "rsync-status": "[ Pending, Instantiated, Failed, Retrying, Terminated ]", | "Pending" - is set by orchestrator before issuing an Instantiate command to rsync.  "Instantiated" - means rsync has successfully invoked deployment of the resource to the cluster.  "Failed" - means rsync got an explicit failure when invoking to the cluster.  "Retrying" - means connection to cluster is temporarily unavailable, rsync will continue to retry - applies to both instantiate and terminate sequences (initial thought is to detect this condition at a cluster level - but mark each resource).  "Terminated" - means rsync has successfully invoked termination of the resource to the cluster. | | X | X | X |
|               "cluster-status": "[ Unknown, NotPresent, Present, (tbd) ]" | Summary status for the resource from info obtained from the cluster (e.g. via the ResourceBundleState CR).  "Unknown" - means a ResourceBundleState CR has not yet been received or the resource type is not supported in the ResourceBundleState CR.  "NotPresent" - the resource is not in the ResourceBundleState CR and it is supported.  "Present" - the resource is present in the ResourceBundleState CR.  "tbd" - further status values can be derived as analysis of how to represent the full status {} object from the resource can be interpreted - e.g. 'Ready', 'Failed', 'Pending', etc. | | X | X | |
|             } | | | X | X | X |
|           ] | | | X | X | X |
|         } | | | X | X | X |
|       ] | | | X | X | X |
|     } | | | X | X | X |
|   ] | | | X | X | X |
| } | | | X | X | X |

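
The aggregated 'rsync-status' and 'cluster-status' elements of the response can be pictured as simple tallies over the per-resource values, with zero counts omitted.  The Python sketch below is illustrative only: the aggregate helper and the sample resource list are hypothetical, and the sample values are chosen to reproduce the counts shown in the response format (5 Applied, 2 Failed, 3 Retrying).

```python
from collections import Counter


def aggregate(resources, key):
    """Count resources by the value stored under key; zero counts never appear."""
    return dict(Counter(r[key] for r in resources if key in r))


# Hypothetical per-resource values.
resources = (
    [{"rsync-status": "Applied", "cluster-status": "Present"}] * 5
    + [{"rsync-status": "Failed", "cluster-status": "Unknown"}] * 2
    + [{"rsync-status": "Retrying", "cluster-status": "Unknown"}] * 3
)

summary = {
    "rsync-status": aggregate(resources, "rsync-status"),
    "cluster-status": aggregate(resources, "cluster-status"),
}
print(summary)
```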

Examples of queries and outputs:

...