...

DONE


We will use the vFW use case as the baseline for this test:

...

Time (EDT) | Categories | Sub-Categories (In Error Mode Component) | Time to Detect Failure and Repair | Pass? | Notes

VNF Onboarding and Distribution | SDC | < 5 minutes | Pass

Timing (to be confirmed): 30 minutes. A script kills those components at random while VNF onboarding continues.

ete-k8s.sh onap healthdist

After kicking off the command, we waited 1 minute and then killed SDC.

The first distribution failed; we then redistributed, and it succeeded.


SO | < 5 minutes | Pass

After kicking off the command, we waited 1 minute and then killed SO.

The first distribution failed; we then redistributed, and it succeeded.


A&AI | < 5 minutes | Pass
  1. Killed aai-modelloader; the task finished in 3:04 minutes.
  2. Killed two aai-cassandra pods; the task finished in ~1 minute.

SDNC | < 8 minutes | Pass
  1. Ran preload using scripts.

Deleted the SDNC pod. It took a very long time to come back, possibly because of network issues. The system was also left in an odd state, with SDC giving the following error:

< 5 minutes | Pass
  1. Deleted one of the SDNC containers, e.g. sdnc-0.
  2. Ran health and preload.



VNF Instantiation | SDC | < 2 seconds | Pass | Tested by manually killing the Docker container.

VID | < 1 minute | Pass
  1. kubectl delete pod dev-vid-6d66f9b8c-9vdlt -n onap   # back in 1 minute
  2. kubectl delete pod dev-vid-mariadb-fc95657d9-wqn9s -n onap   # back in 1 minute

SO | 5 minutes | Pass | The SO pod restarted as part of hard rebooting 2 of the 9 k8s VMs.

A&AI | 20 minutes | Pass

aai-model-loader, aai-hbase, and aai-sparky-be restarted because 2 more k8s VMs were hard rebooted.

Recovery probably took extra time because many other pods were restarting at the same time and needed time to converge.


SDNC | 5 minutes | Pass | The SDNC pods restarted as part of hard rebooting 2 of the 9 k8s VMs.

MultiVIM | < 5 minutes | Pass | Deleted the multicloud pods and verified that the new pods that come up can orchestrate VNFs as usual.

Closed Loop (Pre-installed manually) | DCAE | < 5 minutes | Pass

Deleted dep-dcae-ves-collector-767d745fd4-wk4ht. No discernible interruption to closed loop. Pod restarted in 1 minute.

Deleted dep-dcae-tca-analytics-d7fb6cffb-6ccpm. No discernible interruption to closed loop. Pod restarted in 2 minutes.

Deleted dev-dcae-db-0. Closed loop failed after about 1 minute. Pod restarted in 2 minutes. Closed loop then suffered from intermittent packet gaps and only recovered after the packet generator was rebooted. The most likely cause is an intermittent network problem or an issue within the packet generator.

Deleted dev-dcae-redis-0. No discernible interruption to closed loop. Pod restarted in 2 minutes.


DMaaP | 10 seconds | Pass | Deleted dev-dmaap-bus-controller-657845b569-q7fr2. No discernible interruption to closed loop. Pod restarted in 10 seconds.

Policy (Policy documentation: Policy on OOM) | 15 minutes | Pass

Deleted dev-pdp-0. No discernible interruption to closed loop. Pod restarted in 2 minutes.

Deleted dev-drools-0. Closed loop failed immediately. Pod restarted in 2 minutes. Closed loop recovered in 15 minutes.

Deleted dev-pap-5c7995667f-wvrgr. No discernible interruption to closed loop. Pod restarted in 2 minutes.

Deleted dev-policydb-5cddbc96cf-hr4jr. No discernible interruption to closed loop. Pod restarted in 2 minutes.

Deleted dev-nexus-7cb59bcfb7-prb5v. No discernible interruption to closed loop. Pod restarted in 2 minutes.


A&AI | Never | Fail

Deleted aai-modelloader. Closed loop failed immediately. Pod restarted in < 5 minutes. Closed loop never recovered.

--- the rest done on a different instance ---

Deleted dev-aai-55b4c4f4d6-c6hcj. No discernible interruption to closed loop. Pod restarted in 2 minutes.

Deleted dev-aai-babel-6f54f4957d-h2ngd. No discernible interruption to closed loop. Pod restarted in < 5 minutes.

Deleted dev-aai-cassandra-0. No discernible interruption to closed loop. Pod restarted in 2 minutes.

Deleted dev-aai-data-router-69b8d8ff64-7qvjl. After two minutes all packets stopped; traffic recovered in 5 minutes (possibly an intermittent network or packet generator issue). Pod restarted in 2 minutes.

Deleted dev-aai-hbase-5d9f9b4595-m72pf. No discernible interruption to closed loop. Pod restarted in 2 minutes.

Deleted dev-aai-resources-5f658d4b64-66p7b. Closed loop failed immediately. Pod restarted in 2 minutes. Closed loop never recovered.


APPC (3-node cluster) | 20 minutes | Pass

Deleted dev-appc-0. Closed loop failed immediately. dev-appc-0 pod restarted in 15 minutes. Closed loop recovered in 20 minutes.

Deleted dev-appc-cdt-57548cf886-8z468. No discernible interruption to closed loop. Pod restarted in 2 minutes.

Deleted dev-appc-db-0. No discernible interruption to closed loop. Pod restarted in 3 minutes.
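
The pod deletions and restart times above were recorded by hand. As a rough illustration only (this is not the script used for these results), a delete-and-time step could be sketched as follows. The function names `pick_random`, `format_duration`, and `kill_and_time` are hypothetical, as is any label selector passed to them; the `onap` namespace is taken from the `kubectl` commands above. The `KUBECTL` variable is an assumption added so the sketch can be dry-run against a stub.

```shell
#!/bin/sh
# Sketch only: not the script used for the results above.
# Function names and label selectors are hypothetical.

KUBECTL="${KUBECTL:-kubectl}"     # override with a stub for a dry run
NAMESPACE="${NAMESPACE:-onap}"    # namespace used by the commands above

# Print one of the arguments, chosen at random.
# awk's srand() seeds from the clock, so calls made within the same
# second return the same choice; good enough for a sketch.
pick_random() (
  n=$(awk -v n="$#" 'BEGIN { srand(); print int(rand() * n) + 1 }')
  shift "$(( n - 1 ))"
  printf '%s\n' "$1"
)

# Format a number of seconds as M:SS, e.g. 184 becomes 3:04.
format_duration() {
  printf '%d:%02d\n' "$(( $1 / 60 ))" "$(( $1 % 60 ))"
}

# Delete the pods matching a label selector, poll until the replacement
# pods report Ready, and print the elapsed time as M:SS.
# $1 = label selector, $2 = give-up limit in seconds (default 1800).
kill_and_time() (
  selector="$1"
  limit="${2:-1800}"
  start=$(date +%s)
  "$KUBECTL" delete pod -n "$NAMESPACE" -l "$selector" >/dev/null || exit 1
  # kubectl wait fails while no replacement pod exists yet, so retry.
  until "$KUBECTL" wait pod -n "$NAMESPACE" -l "$selector" \
        --for=condition=Ready --timeout=10s >/dev/null 2>&1; do
    [ "$(( $(date +%s) - start ))" -lt "$limit" ] || exit 1
    sleep 2
  done
  format_duration "$(( $(date +%s) - start ))"
)
```

With hypothetical labels, `kill_and_time "$(pick_random app=vid app=so app=sdnc)"` would delete one randomly chosen component and print how long it took to become Ready again. The elapsed time covers pod readiness only; whether the closed loop itself recovered still has to be observed separately, as in the notes above.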


Requirement

Area | Priority | Min. Level | Stretch Goal | Level Descriptions (Abbreviated)

Resiliency | High | Level 2 – run-time projects; Level 1 – remaining projects | Level 3 – run-time projects; Level 2 – remaining projects

  • 1 – manual failure and recovery (< 30 minutes)
  • 2 – automated detection and recovery (single site) (< 30 minutes)
  • 3 – automated detection and recovery (geo redundancy)
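
The "< 30 minutes" limit in levels 1 and 2 can be compared mechanically with the recorded recovery times. The sketch below assumes times are given in whole minutes, with the literal word `never` for a component that did not recover; the function name `check_recovery` is hypothetical. It only checks the time limit and does not distinguish manual from automated recovery.

```shell
#!/bin/sh
# Sketch only: compares a recovery time with the level 1/2 time limit.
# $1 = recovery time in whole minutes, or the word "never".
# $2 = limit in minutes (default 30, from the level descriptions).
# Prints "within" if recovery beat the limit, "exceeded" otherwise.
check_recovery() (
  limit="${2:-30}"
  case "$1" in
    never) echo exceeded ;;
    *)
      if [ "$1" -lt "$limit" ]; then
        echo within
      else
        echo exceeded
      fi
      ;;
  esac
)
```

For example, `check_recovery 20` corresponds to a 20-minute recovery and `check_recovery never` to a closed loop that never recovered.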