Info
titleCasablanca

This functionality was introduced in the Casablanca release. (In Beijing, the Kubernetes dashboard was suggested for monitoring the general health of a site.)

Overview

To decide whether one site should be made active over another, the ability of each site to process messaging needs to be ascertained.

Manually checking site health

To manually check the health of a site, the operator can run the sdnc.monitor script from the Kubernetes master in the site they are concerned with. The release name is a required argument; the namespace defaults to onap if not specified.

Code Block
themeRDark
titlesdnc.monitor
ubuntu@k8s-s2-master:~/oom/kubernetes/sdnc/resources/geo/bin$ ./sdnc.monitor dev
healthy
ubuntu@k8s-s2-master:~/oom/kubernetes/sdnc/resources/geo/bin$

This version of the script is a wrapper that uses kubectl to access the PROM pod remotely and run the sdnc.monitor script there, which performs the actual health checks on components in the site.
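As an illustration of the wrapper idea only, the following Python sketch builds the kind of kubectl command such a wrapper might issue. It is not the actual script: the pod-name prefix `<release>-prom-` is inferred from the pod name shown on this page, and the script path `/app/bin/sdnc.monitor`, the helper names `find_prom_pod` and `build_monitor_command`, and the sample pod `dev-sdnc-0` are assumptions made for the example.

```python
from typing import List


def find_prom_pod(release: str, pod_names: List[str]) -> str:
    """Pick the first pod whose name starts with '<release>-prom-'."""
    prefix = f"{release}-prom-"
    for name in pod_names:
        # `kubectl get pods -o name` prints names as 'pod/<name>'.
        bare = name.split("/", 1)[-1]
        if bare.startswith(prefix):
            return bare
    raise LookupError(f"no PROM pod found for release {release!r}")


def build_monitor_command(pod: str, namespace: str = "onap",
                          script: str = "/app/bin/sdnc.monitor") -> List[str]:
    """Build a kubectl command that runs the monitor script inside the pod.

    The script path is an assumption; the namespace default mirrors the
    wrapper's documented default of onap.
    """
    return ["kubectl", "-n", namespace, "exec", pod, "--", script]


if __name__ == "__main__":
    # Hypothetical pod listing for release "dev".
    pods = ["pod/dev-sdnc-0", "pod/dev-prom-6485f566fb-hdhzs"]
    cmd = build_monitor_command(find_prom_pod("dev", pods))
    print(" ".join(cmd))
```

The resulting list could be passed to `subprocess.run` on a host where kubectl is configured for the site's cluster.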

Alternatively, the sdnc.monitor script available in the PROM pod can be run directly:


Code Block
themeRDark
titlesdnc.monitor
root@dev-prom-6485f566fb-hdhzs:/app/bin# ./sdnc.monitor
healthy
root@dev-prom-6485f566fb-hdhzs:/app/bin#

Advanced health reporting

To help troubleshoot an unhealthy site, include the --debug argument. It shows which health checks are passing and which are failing and, for failing checks, the health check output to help identify the root cause.

The use of consul in component health checks

The consul health checks that are selected for site health are specified in the prom pod's values.yaml file, e.g. ~/oom/kubernetes/sdnc/prom/values.yaml.

Code Block
themeRDark
config:
  ...
  healthChecks:
  # All top-level checks must pass
  - "Health Check: SDNC - SDN Host"
  - "Health Check: SDNC"
  - "Health Check: SDNC ODL Cluster"
  - "Health Check: SDNC Portal"
  # Within nested lists, only one must pass
  - - "Health Check: SDNC-SDN-CTL-DB-01"
    - "Health Check: SDNC-SDN-CTL-DB-02"

In the above example, the first four health checks (three for OpenDaylight and one for the admin portal) must all pass, as well as at least one MySQL port check. Short-circuit evaluation is used to determine site health in as few consul queries as possible.
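The evaluation rule described above (every top-level entry must pass, a nested list passes if any one of its checks passes, and evaluation stops as soon as the result is known) can be sketched in Python as follows. This is a minimal illustration, not PROM's implementation: `site_is_healthy`, `make_fake_lookup`, and the `check_passes` callable are hypothetical names, and the fake lookup stands in for a real consul query.

```python
from typing import Callable, List, Union

# One entry of healthChecks: a check name, or a nested list of alternatives.
CheckEntry = Union[str, List[str]]


def site_is_healthy(checks: List[CheckEntry],
                    check_passes: Callable[[str], bool]) -> bool:
    """Return True if every top-level entry passes.

    A string entry passes if its check passes; a nested list passes if at
    least one of its checks passes. all() and any() stop at the first
    deciding result, so no further lookups happen once the answer is known.
    """
    return all(
        any(check_passes(name) for name in entry)
        if isinstance(entry, list)
        else check_passes(entry)
        for entry in checks
    )


# The healthChecks list from the values.yaml example.
HEALTH_CHECKS: List[CheckEntry] = [
    "Health Check: SDNC - SDN Host",
    "Health Check: SDNC",
    "Health Check: SDNC ODL Cluster",
    "Health Check: SDNC Portal",
    ["Health Check: SDNC-SDN-CTL-DB-01", "Health Check: SDNC-SDN-CTL-DB-02"],
]


def make_fake_lookup(failing, queried):
    """Hypothetical stand-in for a consul query; records each name asked."""
    def check_passes(name: str) -> bool:
        queried.append(name)
        return name not in failing
    return check_passes


if __name__ == "__main__":
    queried: List[str] = []
    lookup = make_fake_lookup({"Health Check: SDNC-SDN-CTL-DB-01"}, queried)
    print(site_is_healthy(HEALTH_CHECKS, lookup), len(queried))
```

With only the first database check failing, the site still counts as healthy because the second database check passes. If one of the top-level checks fails instead, the remaining checks are never queried.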
