...
The goal of this page is to provide an E2E infrastructure for testing an hourly or triggered master/tagged build, so that the build can be declared ready in terms of health check and use case functionality. CD functionality includes real-time and historical analytics of build health, based on logs from the deployment jobs that are stored and indexed in our ELK stack, which sits outside of ONAP.
Amazon AWS currently hosts our RI for ONAP Continuous Deployment on my private account. I have requested a grant specific to the jenkins, kibana and cd instances.
see Cloud Native Deployment#AmazonAWS
ONAP Live AWS CD Servers
Server | URL | Notes |
---|---|---|
Live Casablanca/master server | http://master.onap.info:8880 | Log in to Rancher/Kubernetes only in the last 45 min of the hour. Use the system only in the last 10 min of the hour. Currently off until the account resets at the next bill on 2 Jan. |
Jenkins server | http://jenkins.onap.info/job/oom-cd/ | View deployment status and deployment (pod up) status. Paused until 2 Jan 2018. |
Kibana server | http://kibana.onap.info:5601 | query "message" logs or view the dashboard |
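The access windows in the notes above can be checked from a script before it touches the live server. The following is a minimal sketch only: `cd_window` is a hypothetical helper name, and the "wait" state for the first 15 minutes is an assumption (presumably the hourly redeploy is still running then).

```shell
# Sketch: classify a minute-of-hour (00-59) against the access windows
# listed for the live master server.
# ASSUMPTION: cd_window is an illustrative name, not part of the CD scripts.
cd_window() {
  # strip one leading zero so "08" is treated as 8; "0"/"00" fall back to 0
  minute=${1#0}
  minute=${minute:-0}
  if [ "$minute" -ge 50 ]; then
    echo "use"     # last 10 min of the hour: system can be used
  elif [ "$minute" -ge 15 ]; then
    echo "login"   # last 45 min of the hour: Rancher/Kubernetes login only
  else
    echo "wait"    # assumed: redeploy still in progress
  fi
}

# Example: cd_window "$(date +%M)"
```

A wrapper script could call `cd_window "$(date +%M)"` and proceed only when the result is `use`.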
CD Architecture
CD Demo Videos
20171210 showing a full CD job on the jenkins server
Kibana Dashboard of CD system diagnosing health check issues in an Hourly ONAP OOM Deploy
In the combined ELK and Kibana CD system below we can see that SDC fails healthcheck about 35% of the time on average. This may be due to a gap in healthcheck using a 200 HTTP return, the SDC REST call timing out while Spring is still coming up on the servlet container, or a dependency check in SDC itself on another component where a particular startup order or timing of calls exposes an issue. In any case, the ELK system that consumes logs from the hourly build can identify issues like this one, or like the transient 1 hour healthcheck failure in MSB shown below it, which affected 14 components.
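A failure rate such as the SDC figure can also be approximated directly from raw health check output, without the dashboard. The sketch below is illustrative only. It assumes each result line has the robot-style form `Basic SDC Health Check | FAIL |`, and `healthcheck_failure_rate` is a hypothetical helper name; the fields actually indexed in the ELK stack may look different.

```shell
# Sketch: per-component health check failure rate from robot-style result lines.
# ASSUMPTION: result lines look like "Basic SDC Health Check   | FAIL |";
# lines without a PASS/FAIL status column are ignored.
healthcheck_failure_rate() {
  awk -F'|' '
    NF >= 2 {
      name = $1; status = $2
      gsub(/^[ \t]+|[ \t]+$/, "", name)      # trim the test name
      gsub(/^[ \t]+|[ \t]+$/, "", status)    # trim the status column
      if (status != "PASS" && status != "FAIL") next
      total[name]++
      if (status == "FAIL") failed[name]++
    }
    END {
      # one line per component: name, failed/total, failure percentage
      for (name in total)
        printf "%s\t%d/%d\t%.0f%%\n", name, failed[name], total[name], 100 * failed[name] / total[name]
    }
  ' "$@" | sort
}
```

The function reads the files given as arguments, or standard input when none are given, so the console logs of several hourly runs can be concatenated into it.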
Shane Daniel has created a dashboard on our AWS POC that can be used to diagnose the health of the current hourly build, based on logs generated by the health check running in robot against an hourly deploy of ONAP OOM (CI triggers are pending).
For example, a hard-coded token in kube2msb was causing some healthcheck failures. Notice the drop in failures 3 hours ago, within an hour of the fix being submitted to the OOM framework (the effect is immediate because the config is not currently part of the daily-only docker builds).
https://gerrit.onap.org/r/#/c/27943/ for https://jira.onap.org/browse/OOM-570
Build status and history
Automated POC ONAP CD Infrastructure
...
DI 5: 20171112: Strategy for Manual Config of Rancher 1.6 for Auto Create/Delete of CD VM
see ONAP on Kubernetes on Amazon EC2#AWSCLIEC2CreationandDeployment
Code Block |
---|
# 20171029 POC working on EC2 Spot using AMI preconfigured with Rancher 1.6 server/client
aws ec2 request-spot-instances --spot-price "0.25" --instance-count 1 --type "one-time" --launch-specification file://aws_ec2_spot_cli.json
aws ec2 associate-address --instance-id i-048637ed92da66bf6 --allocation-id eipalloc-375c1d02
# DNS record set type A changes take 20 sec to propagate across the internet before a dig command can see them
aws ec2 reboot-instances --instance-ids i-048637ed92da66bf6
root@ip-172-31-68-153:~# kubectl cluster-info
Kubernetes master is running at https://url.onap.info:8880/r/projects/1a7/kubernetes:6 |
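Because the A record change takes roughly 20 sec to become visible, an automated job may want to poll for it before calling the cluster by name. The following is a minimal sketch under stated assumptions: `wait_for_dns` and the `RESOLVE_CMD` override are hypothetical names, and the default of 12 attempts at 5 sec intervals is an arbitrary choice, not a value from the CD scripts.

```shell
# Sketch: poll until a hostname's A record resolves to the expected address.
# ASSUMPTION: wait_for_dns and RESOLVE_CMD are illustrative names only.
# usage: wait_for_dns <host> <expected-ip> [attempts] [delay-seconds]
wait_for_dns() {
  host=$1
  expected_ip=$2
  attempts=${3:-12}
  delay=${4:-5}
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # "dig +short" prints one resolved address per line;
    # grep -x requires the whole line to equal the expected address
    if ${RESOLVE_CMD:-dig +short} "$host" | grep -qxF "$expected_ip"; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}
```

A deployment job could run `wait_for_dns` with the server name and the elastic IP just associated, and abort if it returns non-zero.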
...
Cannot get creation access to https://jenkins.onap.org/sandbox/ via Jenkins -> Configuring Jenkins Jobs
...
current ssh config
/var/jenkins_home/workspace/shared_aws_201801.pem
/var/jenkins_home/workspace/shared_aws_201801.pem obr..._aws_20141115.pem
Automated ONAP CD Infrastructure
We need sufficient resources to run two (amsterdam and beijing/master) deployments either hourly or on commit-trigger demand.
We also need devops infrastructure to provision the servers (an ARM DMZ jumpbox) and to run the jenkins container and ELK containers (a single Kubernetes cluster).
Resources
ONAP Deployment Specification for Finance and Operations#AmazonAWS
name | provider | server | IP/DNS | port | resource group | type | vpc/vn | sg | acl | cert/pass | subnet | hosting | template | purpose |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
ons-dmz | Azure | ons-dmz | ons-dmz.onap.cloud | ons-dmz | vm | bastion | Microsoft | dmz-jenkins | bastion/jumpbox | |||||
jenkins | Azure | ons-dmz-jenkins | jenkins.onap.cloud | 80 | ons-dmz-jenkins | dc | Microsoft | dmz-jenkins | jenkins | |||||
kibana | Azure | ons-dmz-kibana | kibana.onap.cloud | 5601 | dc | Microsoft | dmz-jenkins | kibana | ||||||
amsterdam-hourly | Azure | onap-amsterdam | amsterdam.onap.cloud | k8s | k8s | Microsoft | s | |||||||
beijing-hourly | Azure | onap-beijing | beijing.onap.cloud | k8s | k8s | Microsoft | ||||||||
chaos monkey b* | Azure | chaos.onap.cloud | k8s | Microsoft | hammer the system up/down | |||||||||
AWS | ons-dmz | bastion | bastion/jumpbox | |||||||||||
jenkins | AWS | ons-dmz-jenkins | jenkins.onap.info | 80 | ons-dmz | dc | admin m*n* | private | ||||||
kibana | AWS | ons-dmz-kibana | kibana.onap.info | 5601 | ons-dmz | dc | private | |||||||
amsterdam | amsterdam.onap.info | k8s | k8s | Amazon | ||||||||||
beijing | ons-brookhaven | beijing.onap.info | k8s | k8s | Amazon |
Performance
Static Server 4 hour Deploy Frequency
Resource Deployment Scripts
Azure
Code Block |
---|
# for recreation
ubuntu@ons-dmz:~$ sudo ./oom_deployment.sh -b amsterdam -s amsterdam.onap.cloud -e onap -r a_ONAP_CD_amsterdam_nodelete -t _arm_deploy_onap_cd.json -p _arm_deploy_onap_cd_a_parameters.json |
Links
...