
In progress

We will use the vFW use case as the baseline for this testing:

Prerequisite: a vFW instantiated with the closed loop running.

  • Error detection is very fast: less than 1 second.
  • Recovery:
    • Killing a docker container: the system normally returns to a healthy state in less than 1 minute (SDNC and APPC can take up to 5 minutes).
    • Deleting the pod: recovery normally takes much longer, especially for SDNC and APPC (up to 15 minutes). Both failure-injection methods are sketched below.
  • Note: helm upgrade sometimes corrupted the whole system, leaving it in an unusable state. However, we think this may not be a normal use case for a production environment.
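Both failure modes can be injected with standard commands. A minimal sketch, assuming an OOM deployment in the onap namespace; the pod and container names are illustrative, not taken from this test run:

  # Kill only the docker container, on the k8s node that hosts it:
  # kubelet restarts the container in place, so recovery is usually < 1 minute.
  docker ps | grep sdnc            # find the container id on that node
  docker kill <container-id>       # placeholder: substitute the real id

  # Delete the whole pod instead, forcing a full reschedule (slower recovery):
  kubectl -n onap delete pod dev-sdnc-0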
Test results. Each row lists: Category | In Error Mode Component | Time to Detect Failure and Repair | Pass? | Notes.

VNF Onboarding and Distribution | SDC | < 5 minutes | Pass

Notes: total timing ~30 minutes. A script killed these components randomly while VNF onboarding continued (a sketch of such a script follows below). Distribution health was checked with:

ete-k8s.sh onap healthdist

After kicking off the command, we waited 1 minute and then killed SDC. The first distribution failed; after redistributing, it succeeded.
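The random-kill script itself is not included on this page; a minimal sketch of what it likely amounts to, assuming kubectl access to the onap namespace (the component name patterns are assumptions):

  #!/bin/bash
  # Repeatedly pick one component at random and delete one of its pods
  # while VNF onboarding continues in parallel.
  NS=onap
  COMPONENTS=(sdc so aai sdnc)
  while true; do
    target=${COMPONENTS[$RANDOM % ${#COMPONENTS[@]}]}
    pod=$(kubectl -n "$NS" get pods --no-headers | grep "$target" | shuf -n 1 | awk '{print $1}')
    [ -n "$pod" ] && kubectl -n "$NS" delete pod "$pod"
    sleep 60   # give the platform time to detect the failure and recover
  done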


VNF Onboarding and Distribution | SO | < 5 minutes | Pass

Notes: after kicking off the command, we waited 1 minute and then killed SO. The first distribution failed; after redistributing, it succeeded.


VNF Onboarding and Distribution | A&AI | < 5 minutes | Pass
  1. Killed aai-modelloader; it finished the task in 3:04 minutes.
  2. Killed two aai-cassandra pods; it finished the task in ~1 minute.
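The exact kill commands were not recorded; they likely amounted to something like the following (pod names are hypothetical placeholders in typical OOM naming):

  # Delete the model-loader pod, then two of the Cassandra replicas.
  kubectl -n onap delete pod dev-aai-modelloader-<hash>   # placeholder name
  kubectl -n onap delete pod dev-aai-cassandra-0 dev-aai-cassandra-1
  # Watch the replacements come back while the distribution completes.
  kubectl -n onap get pods -w | grep aai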

VNF Onboarding and Distribution | SDNC | < 8 minutes | Pass
  1. Ran preload using scripts.

Deleting the SDNC pod took a very long time to come back, possibly because of network issues, and left us with a very "weird" system; SDC gave us the following error:

VNF Onboarding and Distribution | SDNC | < 5 minutes | Pass
  1. Deleted one of the SDNC containers, e.g. sdnc-0.
  2. Ran health and preload (see the sketch below).
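The health and preload steps were presumably run with the OOM robot helper scripts; a sketch, assuming the scripts in oom/kubernetes/robot (exact arguments vary by release, and <vnf_name>/<module_name> are placeholders):

  ./ete-k8s.sh onap health                               # overall platform health check
  ./demo-k8s.sh onap preload <vnf_name> <module_name>    # SDNC VNF preload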



VNF Instantiation | SDC | < 2 seconds | Pass
Notes: tested by manually killing the docker container.

VNF Instantiation | VID | < 1 minute | Pass
  1. kubectl delete pod dev-vid-6d66f9b8c-9vdlt -n onap  // back in 1 minute
  2. kubectl delete pod dev-vid-mariadb-fc95657d9-wqn9s -n onap  // back in 1 minute
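The "back in 1 minute" figures were observed by watching the pod list; the same measurement can be scripted by waiting for the replacement pod to become Ready. A sketch; the app=vid label selector is an assumption, check the chart's actual labels:

  kubectl -n onap delete pod dev-vid-6d66f9b8c-9vdlt
  time kubectl -n onap wait --for=condition=Ready pod -l app=vid --timeout=10m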

VNF Instantiation | SO | 5 minutes | Pass
Notes: the SO pod restarted as part of hard-rebooting 2 of the 9 k8s VMs.

VNF Instantiation | A&AI | 20 minutes | Pass

Notes: aai-model-loader, aai-hbase, and aai-sparky-be restarted after hard-rebooting 2 more k8s VMs. Recovery probably took extra time because many other pods were restarting at the same time and needed time to converge.

VNF Instantiation | SDNC | 5 minutes | Pass
Notes: the SDNC pods restarted as part of hard-rebooting 2 of the 9 k8s VMs.

VNF Instantiation | MultiVIM | < 5 minutes | Pass
Notes: deleted the multicloud pods and verified that the new pods that came up could orchestrate VNFs as usual (a sketch follows below).
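A minimal sketch of that MultiVIM check, assuming the multicloud pods carry an app=multicloud label (an assumption; adjust to the deployed charts):

  kubectl -n onap delete pod -l app=multicloud   # label selector is an assumption
  kubectl -n onap get pods | grep multicloud     # confirm the replacements are Running
  # Then re-run a VNF instantiation (e.g. the vFW demo) to confirm orchestration still works.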

Closed Loop (in progress):
  • DCAE: the closed loop is pre-defined manually.
  • DMaaP
  • Policy: Policy documentation: Policy on OOM
  • A&AI
  • APPC
Requirement

Area: Resiliency
Priority: High
Min. Level: Level 2 – run-time projects; Level 1 – remaining projects
Stretch Goal: Level 3 – run-time projects; Level 2 – remaining projects
Level Descriptions (Abbreviated):
  • 1 – manual failure and recovery (< 30 minutes)
  • 2 – automated detection and recovery (single site) (< 30 minutes)
  • 3 – automated detection and recovery (geo redundancy)
