Overview
This analysis investigates whether the CPS-NCMP component can be scaled horizontally to respond well to increased load.
Jira: CPS-786 (ONAP Jira)
Scenario
Load scenario: (model-)sync 10,000 CM handles with the same schema set. Imagine this is done by firing 1,000 registration requests, each containing a batch of 10 CM handles, at a pool of cps-ncmp instances.
Issues and Resolutions

| Issue | Description | Resolution |
|---|---|---|
| Pod instance readiness check | The current CPS deployment contains a 'livenessProbe' and a 'readinessProbe' to determine whether containers are running and ready to accept requests. | |
| HTTP keep-alive | | |
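For reference, such probes are declared in the container spec of the deployment. A minimal sketch, assuming an HTTP health endpoint (the path and port below are illustrative assumptions, not taken from the actual CPS chart):

```yaml
# Hypothetical probe configuration for a cps-and-ncmp container.
# The endpoint path and port are assumptions; the real chart may differ.
livenessProbe:
  httpGet:
    path: /manage/health   # assumed health endpoint
    port: 8081
  initialDelaySeconds: 60
  periodSeconds: 30
readinessProbe:
  httpGet:
    path: /manage/health
    port: 8081
  initialDelaySeconds: 30
  periodSeconds: 10
```

If the readiness probe fails, the pod is removed from the service endpoints and receives no traffic; if the liveness probe fails, the container is restarted.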
CPS Deployment in ONAP
Currently, the ONAP deployment of CPS uses the configuration described below.
** Only the resources relevant to this study are listed.
**cps-and-ncmp deployment**
- creates instances of pods (sets of containers) that run the CPS application
- the default pod replica count is 1 (i.e. one pod is created)
- (Gliffy diagram: Diagram 4)
- the replica count can be changed to the desired number of pods through the values.yaml file of the cps-core component
- the value 'replicaCount' is used by the Deployment object configured in the manifest deployment.yaml:

```yaml
# deployment.yaml (excerpt)
apiVersion: apps/v1
kind: Deployment
metadata: {{- include "common.resourceMetadata" . | nindent 2 }}
spec:
  replicas: {{ .Values.replicaCount }}
  # . . .
```

- if the replica count is 2, the Deployment object creates the pods and the Kubernetes controller ensures that the number of running pods is always 2; the load is also distributed among the replicas
- this method has been tested by creating 3 instances of cps-ncmp, specifying replicaCount in values.yaml:

```yaml
# values.yaml (excerpt)
# Resource Limit flavor - By Default using small
flavor: small
# default number of instances
replicaCount: 3
# Segregation for Different environment (Small and Large)
resources:
  # . . .
```
**postgres deployment**
- creates instances of pods that run Postgres
- the default pod replica count is 1 (i.e. one pod is created)
- utilizes persistent volumes
**cps-core service**
- enables network access to the pods running the CPS application
- uses the default service type 'ClusterIP', which exposes CPS's multiple ports for internal access to the pods only
- the service is also exposed by an Ingress, which exposes HTTP and HTTPS routes from outside the cluster
  - operates with an Ingress resource in which the traffic rules are stated
  - allows the service to be reachable via a URL
  - allows multiple services to be mapped to a single port
- (Gliffy diagram: Diagram 5)
- other service types are NodePort, LoadBalancer, and ExternalName
- the ClusterIP service type uses round-robin/random selection to load balance
**postgres service**
- uses the default service type ClusterIP
- Ingress is not enabled for this service, so it is only accessible within the Kubernetes cluster
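As an illustration of the ClusterIP services described above, a minimal manifest could look as follows (the name, ports, and selector labels are assumptions, not the actual cps-core manifest):

```yaml
# Hypothetical ClusterIP Service for the CPS application; the name,
# port numbers, and selector are illustrative assumptions.
apiVersion: v1
kind: Service
metadata:
  name: cps-core
spec:
  type: ClusterIP          # default type; pods are reachable only inside the cluster
  selector:
    app: cps-and-ncmp      # matches the labels on the application pods
  ports:
    - name: http
      port: 8080           # port exposed by the service
      targetPort: 8080     # port on the container
```

Traffic sent to the service's cluster IP is distributed by kube-proxy across all pods matching the selector.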
Load Balancing Options

**Increasing the pod/replica count**
- Description:
  - the replica count is changed from the default to the desired number of pods through the values.yaml file
  - it is defined in the manifest deployment.yaml, which means the Deployment object created will own and manage the ReplicaSet
  - this is the recommended way to use ReplicaSets
- Issues:
  - burst propagation
  - eviction of one replica results in its load being redirected to the remaining replicas
  - the CPU in the node can be/will be fully used up
- Possible solution:
  - add a Horizontal Pod Autoscaler (HPA)
    - allows the automatic deployment of pods to satisfy the stated configuration, where the number of pods depends on the load, i.e. the workload resource is scaled back down if the load decreases and the number of pods is above the configured minimum
    - an HPA can be defined with multiple and custom metrics
    - an HPA scales up and distributes the workload efficiently over the replicas for a linearly increasing workload, but takes time to scale up or down with bursty workloads
  - (Gliffy diagram: Diagram 3)
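A minimal sketch of such an HPA, assuming the Deployment is named cps-and-ncmp and scaling on average CPU utilization (the name, replica bounds, and threshold are illustrative assumptions, not taken from the ONAP charts):

```yaml
# Hypothetical HorizontalPodAutoscaler for the cps-and-ncmp deployment;
# all names and thresholds are assumptions for illustration.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: cps-and-ncmp
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: cps-and-ncmp
  minReplicas: 1           # scale back down to this when load decreases
  maxReplicas: 5
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # target average CPU across replicas
```

Custom and multiple metrics can be added under `metrics:`; the HPA picks the metric that yields the highest desired replica count.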
**Database bottleneck**
- Possible solutions:
  - vertical scaling: add more resources, such as CPU and memory
  - multiple database instances: if all users are in the same deployment, multiple instances of the database could be run, but the synchronization between them would be complex
  - different databases for different dataspaces**: there is only one dataspace for cps-ncmp, so this option does not apply
**Rolling out**
- Description:
  - by default, Kubernetes removes one pod at a time and adds a new one afterwards
  - this means we lose (100 / number of pods) % of the capacity to serve end-user requests during the rollout
- Possible solution:
  - setting 'maxUnavailable' to '0' in the specification of the Deployment object ensures that Kubernetes first deploys a new pod and confirms it is running before removing an old one
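In the deployment spec this would look roughly as follows (a sketch of the standard Kubernetes rolling-update strategy fields, not the actual CPS manifest):

```yaml
# Hypothetical rolling-update strategy for the deployment spec; with
# maxUnavailable: 0, a new pod must be Ready before an old one is removed.
spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 0   # never take a pod down before its replacement is Ready
      maxSurge: 1         # allow one extra pod during the rollout
```

Note that `maxSurge` must then be at least 1, since Kubernetes needs headroom to create the replacement pod first.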
**Readiness**
- Description:
  - establish a connection with the database before marking the pod ready to accept requests
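Since CPS is a Spring Boot application, one way to sketch this (an assumption about the CPS configuration, not taken from its actual code) is to include the database health indicator in the readiness health group that the readinessProbe queries:

```yaml
# Hypothetical application.yml fragment (Spring Boot 2.3+). The property
# names are standard Spring Boot Actuator conventions; their use in CPS
# is an assumption for illustration.
management:
  endpoint:
    health:
      probes:
        enabled: true              # expose /actuator/health/readiness
      group:
        readiness:
          include: readinessState,db   # pod is Ready only if the DB is reachable
```

The readinessProbe would then point at `/actuator/health/readiness`, so a pod that cannot reach Postgres is withheld from the service endpoints.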
**HTTP protocol**
- Description:
  - if the application uses 'HTTP keep-alive', an established connection stays open and is reused; since ClusterIP load balancing happens per connection, long-lived connections can pin a client to a single pod and skew the load distribution
(Gliffy diagram: Diagram 1)
References:
https://www.gsd.inesc-id.pt/~mpc/pubs/smr-kubernetes.pdf
https://www.heydari.be/papers/WoC-stef.pdf
http://bawkawajwanw.com/books/masteringkubernetes.pdf
https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale-walkthrough/
https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/#:~:text=In%20Kubernetes%2C%20a%20HorizontalPodAutoscaler%20automatically,is%20to%20deploy%20more%20Pods.
https://www.weave.works/blog/how-to-correctly-handle-db-schemas-during-kubernetes-rollouts
https://www.diva-portal.org/smash/get/diva2:1369598/FULLTEXT01.pdf