...

Load balancing option | Description | Issues | Possible solution
Increasing pod/replica count via deployment object

Replica count is changed from the default value to the desired number of pods via the values.yaml file

    • Defined in the deployment.yaml manifest, which means the created Deployment object will own and manage the ReplicaSet
      • this is the recommended way to use ReplicaSets
      • a minimal sketch of how the replica count flows from values.yaml into the Deployment follows below
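A minimal sketch of that flow, assuming a Helm chart layout; the default value of 3 is an assumption, while the template lines mirror the deployment.yaml snippet further down this page:

    values.yaml:

        # default number of replicas; can be overridden per environment (assumed default)
        replicaCount: 3

    deployment.yaml:

        apiVersion: apps/v1
        kind: Deployment
        metadata: {{- include "common.resourceMetadata" . | nindent 2 }}
        spec:
          # rendered from values.yaml at deploy time
          replicas: {{ .Values.replicaCount }}
          ...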

[Gliffy diagram: Diagram 6]
  • Burst propagation
    • eviction of one replica results in its load being redirected to the remaining replicas
  • CPU in the node can be, and eventually will be, fully used up
  • Possible solution: adding a Horizontal Pod Autoscaler (HPA); a minimal CPU-based example is sketched below
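A minimal sketch of such an HPA based on CPU utilization (the target Deployment name cps-and-ncmp is taken from the manifests later on this page; the replica bounds and the 70% threshold are assumptions):

    apiVersion: autoscaling/v2
    kind: HorizontalPodAutoscaler
    metadata:
      name: cps-and-ncmp
    spec:
      scaleTargetRef:
        apiVersion: apps/v1
        kind: Deployment
        name: cps-and-ncmp
      minReplicas: 2
      maxReplicas: 10
      metrics:
        - type: Resource
          resource:
            name: cpu
            target:
              type: Utilization
              # scale out when the average CPU usage across pods exceeds 70% of the requested CPU (assumed threshold)
              averageUtilization: 70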

Database Bottleneck

(How to scale Postgres horizontally can be a separate study.)

  • Vertical scaling
    • more resources such as CPU and memory are added (a sketch follows below)
  • Postgres horizontal scaling (sharding)
    • slices tables into multiple smaller tables (shards), each of which runs on a different database instance
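As an illustration of the vertical-scaling option only, the Postgres container's resource requests and limits would simply be raised; the container name, image and numbers below are assumptions, not taken from the actual CPS/Postgres chart:

    # fragment of the Postgres pod spec - only the resources block changes
    containers:
      - name: postgres          # assumed container name
        image: postgres:14      # assumed image/version
        resources:
          requests:
            cpu: "2"            # raised from e.g. "1"
            memory: 4Gi         # raised from e.g. 2Gi
          limits:
            cpu: "4"
            memory: 8Gi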

Rolling out

  • By default, Kubernetes removes one pod at a time and only adds a new one afterwards
    • which means that during a rollout we lose (100 / number of pods)% of our capacity to serve end-user requests
  • Setting 'maxUnavailable' to '0' in the Deployment specification ensures that Kubernetes first deploys a new pod and confirms it is running before removing an old one (see the sketch below)
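A minimal sketch of that setting in the Deployment spec (the maxSurge value is an assumption; it controls how many extra pods may exist above the desired count during the rollout):

    spec:
      strategy:
        type: RollingUpdate
        rollingUpdate:
          # never take an old pod down before its replacement is available
          maxUnavailable: 0
          # allow one extra pod above the desired replica count while rolling out (assumed value)
          maxSurge: 1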
Using Horizontal Pod Autoscaling
  • automatically adjusts the number of deployed pods to satisfy the stated configuration, so the pod count follows the load
    • i.e. the workload resource will be scaled back down if the load decreases and the number of pods is above the configured minimum
  • HPA can be defined with multiple and custom metrics
    • metrics can be customized so that HPA scales only on HTTP requests
    • e.g. only on Ingress requests per second, as in the sample manifest further below

[Gliffy diagram: Diagram 3]

  • HPA scales up and distributes a linearly increasing workload efficiently over the replicas, but takes time to scale up or down with bursty workloads
  • Configure the sync period / scale-down stabilization
    • the default scale-down stabilization window is 300 seconds (a tuning sketch follows at the end of this section)
  • Would require changes/updates to the current CPS deployment resources
  • Remove the 'replicas' field from the cps-and-ncmp deployment manifest, so that the HorizontalPodAutoscaler alone controls the replica count
    • deployment.yaml
      • apiVersion: apps/v1
        kind: Deployment
        metadata: {{- include "common.resourceMetadata" . | nindent 2 }}
        spec:
          replicas: {{ .Values.replicaCount }}
    • . . .
  • Add a manifest to create a HorizontalPodAutoscaler object 
    • sample HPA manifest
      • apiVersion: autoscaling/v2
        kind: HorizontalPodAutoscaler
        metadata:
          name: cps-and-ncmp
        spec:
          scaleTargetRef:
            apiVersion: apps/v1
            kind: Deployment
            name: cps-and-ncmp
          minReplicas: 1
          maxReplicas: 10
          metrics:
            - type: Object
              object:
                metric:
                  name: requests-per-second
                describedObject:
                  apiVersion: networking.k8s.io/v1beta1
                  kind: Ingress
                  name: cps-core-ingress
                target:
                  type: Value
                  value: 100

          ....

  • Might need to add a separate service (metrics adapter) to expose the monitoring metrics to the HPA
    • the service needs to implement the custom.metrics.k8s.io or external.metrics.k8s.io API
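Relating to the bursty-workload and scale-down points above, autoscaling/v2 also lets the scaling behaviour be tuned per HPA object; the numbers below are assumptions chosen only to illustrate the fields:

    spec:
      behavior:
        scaleDown:
          # wait this long (default 300s) before acting on a lower recommendation,
          # to avoid flapping when the load is bursty (assumed value)
          stabilizationWindowSeconds: 120
        scaleUp:
          stabilizationWindowSeconds: 0
          policies:
            # allow adding at most 4 pods every 15 seconds while scaling up (assumed values)
            - type: Pods
              value: 4
              periodSeconds: 15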

...