
The scalability, resiliency, and manageability capabilities of the new Beijing release are described here. These capabilities apply to the OOM/Kubernetes installation.

Installation

Follow the OOM installation instructions at 

http://onap.readthedocs.io/en/latest/submodules/oom.git/docs/index.html

TLab specific installation

Below are the notes specific to the TLab environment.

    1. Find the Rancher IP in the R_Control-Plane tenant. We use the "onap_dev" key and the "ubuntu" user to SSH. For example: "ssh -i onap_dev ubuntu@192.168.31.245".
    2. Log in as ubuntu, then run "sudo -i" to become root. The "oom" git repo is in the Rancher VM's root directory, under "/root/oom".
    3. Edit the portal files under /root/oom/kubernetes, then rebuild the charts:

      "make portal"

      "make onap"

      Run the command below from the "/root" folder to do the helm upgrade:

      "helm upgrade -i dev local/onap -f ../../integration-override.yaml"

    4. The Rancher GUI is at 192.168.31.245:8080
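
Put together, the edit-and-upgrade cycle on the Rancher VM (steps 2 and 3) looks roughly like this; the integration-override.yaml path is taken from the command above and may differ in your environment:

# as root on the Rancher VM
cd /root/oom/kubernetes
make portal     # rebuild the portal chart after editing its files
make onap       # rebuild the parent onap chart
cd /root
helm upgrade -i dev local/onap -f ../../integration-override.yaml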

There is also an interactive CLI in the Rancher GUI where you can run kubectl commands. The command below lists the portal services.

"kubectl get services --namespace=onap | grep portal”

Accessing the ONAP Portal using OOM and a Kubernetes Cluster

If everything is successful, follow the steps at http://onap.readthedocs.io/en/latest/submodules/oom.git/docs/oom_user_guide.html#accessing-the-onap-portal-using-oom-and-a-kubernetes-cluster to access the Portal.
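
As a rough sketch of what those steps amount to in this deployment: the portal-app service publishes port 8989 on NodePort 30215 (see the service listing below), so accessing the Portal from a browser means resolving the portal hostname to a cluster node IP:

# list the cluster nodes and pick an IP that is reachable from your workstation
> kubectl get nodes -o wide

# add an /etc/hosts entry on your workstation (replace <node-ip> accordingly)
<node-ip>   portal.api.simpledemo.onap.org

# then browse to the NodePort published by portal-app
http://portal.api.simpledemo.onap.org:30215/ONAPPORTAL/login.htm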

Overview of the running system

> kubectl get services --namespace=onap | grep portal
portal-app               LoadBalancer   10.43.141.57    10.0.0.8      8989:30215/TCP,8006:30213/TCP,8010:30214/TCP   5d
portal-cassandra         ClusterIP      10.43.158.145   <none>        9160/TCP,7000/TCP,7001/TCP,7199/TCP,9042/TCP   5d
portal-db                ClusterIP      10.43.192.65    <none>        3306/TCP                                       5d
portal-sdk               ClusterIP      10.43.24.82     <none>        8990/TCP                                       5d
portal-widget            ClusterIP      10.43.101.233   <none>        8082/TCP                                       5d
portal-zookeeper         ClusterIP      10.43.0.82      <none>        2181/TCP                                       5d
> kubectl get pods --all-namespaces | grep portal
onap          dev-portal-app-b8c6668d8-56bjb                2/2       Running       0          2m
onap          dev-portal-app-b8c6668d8-g6whb                2/2       Running       0          2m
onap          dev-portal-app-b8c6668d8-xshwg                2/2       Running       0          2m
onap          dev-portal-cassandra-5ddbc59ffd-qc6rp         1/1       Running       0          2m
onap          dev-portal-db-6d7fc58648-sp9sf                0/1       Running       0          2m
onap          dev-portal-sdk-868f696cd7-mnjxk               0/2       Init:0/1      0          2m
onap          dev-portal-widget-694c45b75f-nqdtt            0/1       Init:0/1      0          2m
onap          dev-portal-zookeeper-db466fc-kggsw            1/1       Running       0          2m


Healthchecks

Verify that the Portal health check run by the Robot framework passes:

https://jenkins.onap.org/view/External%20Labs/job/lab-tlab-beijing-oom-deploy/325/robot/OpenECOMP%20ETE/Robot/Testsuites/Health-Check/
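
To run the health check manually, OOM ships a Robot wrapper script under oom/kubernetes/robot; the exact arguments vary by release, so treat the invocation below as a sketch and check the script itself first:

# assumption: run from the Rancher VM where the oom repo is checked out at /root/oom
cd /root/oom/kubernetes/robot
./ete-k8s.sh onap health        # arguments may differ in your OOM version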

portal-app Scaling

To scale out portal-app, set the replica count appropriately.

In the tests below, we work with the OOM portal component in isolation. In this exercise, we scale portal-app to 3 replicas (2 additional instances).

The following needs to be added to the integration-override.yaml file:

portal:
  portal-app:
    replicaCount: 3

Then perform the helm upgrade.
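
A sketch of the upgrade and verification, reusing the "dev" release name from the installation steps above (adjust the integration-override.yaml path to your environment):

# re-apply the chart so the new replica count takes effect
> helm upgrade -i dev local/onap -f integration-override.yaml

# confirm that three portal-app pods come up
> kubectl get pods --namespace=onap | grep portal-app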

portal-app Resiliency

A portal-app container failure can be simulated by stopping the portal-app container. The Kubernetes liveness probe will detect that the ports are down, infer that there is a problem with the service, and in turn restart the container.
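
A minimal sketch of this test, using one of the pod names from the listing above (pod names will differ in your deployment):

# delete one portal-app pod to simulate a container failure
> kubectl delete pod dev-portal-app-b8c6668d8-56bjb --namespace=onap

# watch the deployment schedule a replacement while the old pod terminates
> kubectl get pods --namespace=onap -w | grep portal-app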

After deleting one of the running portal-app instances, a replacement instance starts while the deleted one is still terminating. Once the deleted instance has finished terminating, all 3 instances are running normally again.

During this time there were no issues with the Portal website.

TODO: In the following releases, portal-db and portal-widget need to be tested with similar resiliency and scaling tests. portal-cassandra and portal-zookeeper require coordination with the MUSIC team to test resiliency, as there are plans to provide MUSIC as a service.

Sanity Tests 

Check Portal UI and perform sanity tests.

  1. After 3 instances of Portal are up, edit the IP in the /etc/hosts file, and log on as the demo user at http://portal.api.simpledemo.onap.org:30215/ONAPPORTAL/login.htm
  2. Then kill 1 instance; you should be able to continue using the Portal page seamlessly.
  3. As a failover timing test, kill all 3 instances; the new Portal processes come up within 30 seconds (a simple availability check is sketched below).
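
A hypothetical availability check to run while killing instances, assuming the /etc/hosts entry from step 1 is in place:

# poll the Portal login page once per second and print the HTTP status code
> while true; do curl -s -o /dev/null -w "%{http_code}\n" http://portal.api.simpledemo.onap.org:30215/ONAPPORTAL/login.htm; sleep 1; done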

Troubleshooting

In case of failures, the commands below may come in handy for accessing the pods/logs and troubleshooting the issue.

To find all portal pods:

> kubectl get pods --all-namespaces | grep portal
onap          dev-portal-app-b8c6668d8-56bjb                2/2       Running       0          2m
onap          dev-portal-app-b8c6668d8-g6whb                2/2       Running       0          2m
onap          dev-portal-app-b8c6668d8-xshwg                2/2       Running       0          2m
onap          dev-portal-cassandra-5ddbc59ffd-qc6rp         1/1       Running       0          2m
onap          dev-portal-db-6d7fc58648-sp9sf                0/1       Running       0          2m
onap          dev-portal-sdk-868f696cd7-mnjxk               0/2       Init:0/1      0          2m
onap          dev-portal-widget-694c45b75f-nqdtt            0/1       Init:0/1      0          2m
onap          dev-portal-zookeeper-db466fc-kggsw            1/1       Running       0          2m

From the list above, to check the DB logs:

> kubectl logs --namespace=onap dev-portal-db-6d7fc58648-sp9sf -c portal-db
2018-05-30 19:49:16 139875765802880 [Note] mysqld (mysqld 10.2.15-MariaDB-10.2.15+maria~jessie) starting as process 1 ...
2018-05-30 19:49:16 139875765802880 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
2018-05-30 19:49:16 139875765802880 [Note] InnoDB: Uses event mutexes
2018-05-30 19:49:16 139875765802880 [Note] InnoDB: Compressed tables use zlib 1.2.8
2018-05-30 19:49:16 139875765802880 [Note] InnoDB: Using Linux native AIO
2018-05-30 19:49:16 139875765802880 [Note] InnoDB: Number of pools: 1
2018-05-30 19:49:16 139875765802880 [Note] InnoDB: Using SSE2 crc32 instructions

To check portal-app logs:

> kubectl logs --namespace=onap dev-portal-app-b8c6668d8-56bjb -c portal-app
/start-apache-tomcat.sh: option -i value is
/start-apache-tomcat.sh: option -n value is
/start-apache-tomcat.sh: values for IP (-i) and/or name (-n) are empty or short
/start-apache-tomcat.sh: Starting server from /opt/apache-tomcat-8.0.37
30-May-2018 19:48:29.480 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Server version: Apache Tomcat/8.0.37
30-May-2018 19:48:29.482 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Server built: Sep 1 2016 10:01:52 UTC
30-May-2018 19:48:29.482 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Server number: 8.0.37.0

To get inside the portal-app container and access the application logs:

> kubectl exec -it dev-portal-app-b8c6668d8-56bjb -n onap /bin/sh
Defaulting container name to portal-app.
Use 'kubectl describe pod/dev-portal-app-b8c6668d8-56bjb' to see all of the containers in this pod.
/ # cd /opt/apache-tomcat-8.0.37/logs/onapportal
/opt/apache-tomcat-8.0.37/logs/onapportal # ls
application.log  debug.log        metrics.log
audit.log        error.log
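
To follow a specific log in real time from the same shell, a plain tail works:

/opt/apache-tomcat-8.0.37/logs/onapportal # tail -f error.log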

Portal’s Rocket Chat channel - http://onap-integration.eastus.cloudapp.azure.com:3000/channel/portal

