...

Load balancing option | Description | Issues | Possible solution
Increasing pod/replica count via deployment object

Replica count is changed from the default value to the desired number of pods via the values.yaml file

    • Defined in the deployment.yaml manifest, which means the created Deployment object will own and manage the ReplicaSet
      • this is the recommended way to use ReplicaSets
      • a minimal sketch of how the replica count flows from values.yaml into the Deployment follows below
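A minimal sketch of that flow, assuming a Helm chart layout; the default value of 3 is an assumption, while the template lines mirror the deployment.yaml snippet further down this page:

    values.yaml:

        # default number of replicas; can be overridden per environment (assumed default)
        replicaCount: 3

    deployment.yaml:

        apiVersion: apps/v1
        kind: Deployment
        metadata: {{- include "common.resourceMetadata" . | nindent 2 }}
        spec:
          # rendered from values.yaml at deploy time
          replicas: {{ .Values.replicaCount }}
          ...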

[Gliffy diagram: Diagram 6]
  • Burst propagation
    • eviction of one replica results in its load being redirected to the remaining replicas
  • CPU in the node can be, and eventually will be, fully used up
  • Possible solution: adding a Horizontal Pod Autoscaler (HPA); a minimal CPU-based example is sketched below
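A minimal sketch of such an HPA based on CPU utilization (the target Deployment name cps-and-ncmp is taken from the manifests later on this page; the replica bounds and the 70% threshold are assumptions):

    apiVersion: autoscaling/v2
    kind: HorizontalPodAutoscaler
    metadata:
      name: cps-and-ncmp
    spec:
      scaleTargetRef:
        apiVersion: apps/v1
        kind: Deployment
        name: cps-and-ncmp
      minReplicas: 2
      maxReplicas: 10
      metrics:
        - type: Resource
          resource:
            name: cpu
            target:
              type: Utilization
              # scale out when the average CPU usage across pods exceeds 70% of the requested CPU (assumed threshold)
              averageUtilization: 70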

Database Bottleneck

(How to scale Postgres horizontally can be a separate study.)

  • Vertical scaling
    • more resources such as CPU and memory are added (a sketch follows below)
  • Postgres horizontal scaling (sharding)
    • slices tables into multiple smaller tables (shards), each of which runs on a different database instance
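As an illustration of the vertical-scaling option only, the Postgres container's resource requests and limits would simply be raised; the container name, image and numbers below are assumptions, not taken from the actual CPS/Postgres chart:

    # fragment of the Postgres pod spec - only the resources block changes
    containers:
      - name: postgres          # assumed container name
        image: postgres:14      # assumed image/version
        resources:
          requests:
            cpu: "2"            # raised from e.g. "1"
            memory: 4Gi         # raised from e.g. 2Gi
          limits:
            cpu: "4"
            memory: 8Gi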

Rolling out

  • By default, Kubernetes removes one pod at a time and only adds a new one afterwards
    • which means that during a rollout we lose (100 / number of pods)% of our capacity to serve end-user requests
  • Setting 'maxUnavailable' to '0' in the Deployment specification ensures that Kubernetes first deploys a new pod and confirms it is running before removing an old one (see the sketch below)
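A minimal sketch of that setting in the Deployment spec (the maxSurge value is an assumption; it controls how many extra pods may exist above the desired count during the rollout):

    spec:
      strategy:
        type: RollingUpdate
        rollingUpdate:
          # never take an old pod down before its replacement is available
          maxUnavailable: 0
          # allow one extra pod above the desired replica count while rolling out (assumed value)
          maxSurge: 1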
Using Horizontal Pod Autoscaling
  • automatically adjusts the number of deployed pods to satisfy the stated configuration, so the pod count follows the load
    • i.e. the workload resource will be scaled back down if the load decreases and the number of pods is above the configured minimum
  • HPA can be defined with multiple and custom metrics
    • metrics can be customized so that HPA scales only on HTTP requests
    • e.g. only on Ingress requests per second, as in the sample manifest further below

[Gliffy diagram: Diagram 3]

  • HPA scales up and distributes a linearly increasing workload efficiently over the replicas, but takes time to scale up or down with bursty workloads
  • Configure the sync period / scale-down stabilization
    • the default scale-down stabilization window is 300 seconds (a tuning sketch follows at the end of this section)
  • Would require changes/updates to the current CPS deployment resources
  • Remove the 'replicas' field from the cps-and-ncmp deployment manifest, so that the HorizontalPodAutoscaler alone controls the replica count
    • deployment.yaml
      • apiVersion: apps/v1
        kind: Deployment
        metadata: {{- include "common.resourceMetadata" . | nindent 2 }}
        spec:
          replicas: {{ .Values.replicaCount }}
    • . . .
  • Add a manifest to create a HorizontalPodAutoscaler object 
    • sample HPA manifest
      • apiVersion: autoscaling/v2
        kind: HorizontalPodAutoscaler
        metadata:
          name: cps-and-ncmp
        spec:
          scaleTargetRef:
            apiVersion: apps/v1
            kind: Deployment
            name: cps-and-ncmp
          minReplicas: 1
          maxReplicas: 10
          metrics:
            - type: Object
              object:
                metric:
                  name: requests-per-second
                describedObject:
                  apiVersion: networking.k8s.io/v1beta1
                  kind: Ingress
                  name: cps-core-ingress
                target:
                  type: Value
                  value: 100

          ....

  • Might need to add a separate service (metrics adapter) to expose the monitoring metrics to the HPA
    • the service needs to implement the custom.metrics.k8s.io or external.metrics.k8s.io API
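Relating to the bursty-workload and scale-down points above, autoscaling/v2 also lets the scaling behaviour be tuned per HPA object; the numbers below are assumptions chosen only to illustrate the fields:

    spec:
      behavior:
        scaleDown:
          # wait this long (default 300s) before acting on a lower recommendation,
          # to avoid flapping when the load is bursty (assumed value)
          stabilizationWindowSeconds: 120
        scaleUp:
          stabilizationWindowSeconds: 0
          policies:
            # allow adding at most 4 pods every 15 seconds while scaling up (assumed values)
            - type: Pods
              value: 4
              periodSeconds: 15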

...