1. Upgrade of ELK
...
and Data Enrichment
ELK Upgrade
Bath team (in charge of search-data-service, @Colin Burns) plans to upgrade Elasticsearch to 6.1.2 (based on AT&T-approved versions) by the end of June.
- Current ELK versions: Elasticsearch 2.4, Kibana 4.6 (no Logstash is being used)
- To create the dashboards with enhanced Kibana features, upgrading the entire ELK stack to version 5.6 or above is desired. (Note: the ONAP Logging project is using 5.5.)
- Upgrade from 2.x to 5.x or above requires "Full Cluster-restart Upgrade".
- search-data-service should reflect this upgrade:
- deploy/configure the right versions
- potentially update the relevant API methods for Elasticsearch data management
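To illustrate the kind of API-level change the 2.x-to-6.x jump implies for search-data-service, the 2.x `string` mapping type was split into `text`/`keyword` in 5.x and above. A minimal sketch (the index field name below is hypothetical, not taken from search-data-service):

```python
# Sketch: the same field mapped for ES 2.x vs. ES 6.x.
# ES 2.x used the "string" type, with "index": "not_analyzed" for exact matching.
mapping_2x = {
    "properties": {
        "validationId": {"type": "string", "index": "not_analyzed"}
    }
}

# ES 5.x+ splits "string" into "text" (full-text, analyzed) and
# "keyword" (exact match). Fields used in Kibana aggregations and
# filters should be mapped as "keyword".
mapping_6x = {
    "properties": {
        "validationId": {"type": "keyword"}
    }
}

print(mapping_6x["properties"]["validationId"]["type"])
```

Mappings like this cannot be changed in place, which is one reason the 2.x-to-5.x+ path requires a full-cluster-restart upgrade and reindexing rather than a rolling upgrade.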
Specifically for POMBA use, Groundhog could provide:
- Automatic deployment of a separate Kibana (version 6.1.2) for POMBA through OOM (currently it is installed manually), plus configuration/installation of all POMBA dashboards
- If necessary for parsing any audit results, automatic deployment of Logstash (version 6.1.2) through OOM
Feature Enhancements (Questions)
...
- with all required configuration (kibana.yml, index pattern creation) and pre-installation of the POMBA dashboards
- (Q) Would it be a good idea to use the Kibana provided in the onap-log pod?
- Pros: no redundant install of Kibana; one integrated place for all views
- Cons: dependency on onap-log (e.g., its version); it gets complex with all the different types of dashboards
- To-do would be reduced to configuration and import of the POMBA dashboards
Data Enrichment (with Questions)
The following discusses enrichment opportunities for the audit validation/violation data pushed to Elasticsearch. Most of this work could be done in the data-router micro-service code instead of using Logstash.
Note that the two boxes below are the sample validation and violation events currently stored in ES; they will be the data source for the Kibana dashboards.
- The violation event needs additional fields taken from the "violations" field available in the validation info, including: modelName, errorMessage, violationDetails
- (Q) The field violationDetails (which tells the exact discrepancies; see the sample event below inside "violations") needs to be parsed and stored in separate fields, either by data-router or by Logstash: such nested data cannot be used directly in Kibana visualizations.
- We could further parse out (from violationDetails) the ONAP components involved in the violations, to see violation stats broken down by component
- "Elapsed time after orchestration" would be useful? if Could the audit result could change over time for the same audit requests at different times since orchestration?
- "Audit duration" stats would be useful? time taken for the auditing itself (from trigger to result).
- Any other meta-data that would be useful? e.g., who invoked the validation (user, dept)
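As a sketch of the flattening that data-router (or Logstash) could perform, the snippet below copies the nested fields named above up to the top level of the event so Kibana can aggregate on them. The field names mirror those discussed above; the sample values and the `violation_N_` prefix convention are hypothetical:

```python
import json

def flatten_violation(event: dict) -> dict:
    """Copy selected nested fields from 'violations' up to the top level
    so Kibana visualizations can filter/aggregate on them directly."""
    flat = dict(event)
    for i, violation in enumerate(event.get("violations", [])):
        prefix = f"violation_{i}_"
        for field in ("modelName", "errorMessage", "violationDetails"):
            if field in violation:
                value = violation[field]
                # Nested dicts are stringified here; a fuller implementation
                # would promote each sub-field to its own top-level field.
                flat[prefix + field] = (
                    json.dumps(value) if isinstance(value, dict) else value
                )
    return flat

# Hypothetical sample event, shaped like the validation events described above.
sample = {
    "validationId": "abc-123",
    "violations": [{
        "modelName": "vFW",
        "errorMessage": "attribute mismatch",
        "violationDetails": {"expected": "active", "actual": "inactive"},
    }],
}

print(flatten_violation(sample)["violation_0_modelName"])  # → vFW
```

The same transformation could live in data-router's event-handling code or in a Logstash filter; doing it in data-router avoids adding Logstash to the stack, per the note above.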
...
2. Dashboard Ideas
...
(Note) One dashboard type could require multiple dashboard pages, depending on the number of visualizations.
Dashboard Type | Description (What We Want to See) | Required Information to Show (Visualizations)
---|---|---
1 | Overall Audit Monitor |
2 | Overall Audit Analysis |
3 | Individual Audit Analysis |
4 | Violation Analysis for Network Discovery |
5 | Violation Summary Report |
6 | Audit History |
Features to support as necessary
- Where necessary, provide links to switch back and forth between dashboards: e.g., from a violation page to the page displaying its validation info
- Color coding for the critical violations
...
For development purposes, we need a certain amount of audit-result data consisting of various types of validation and violation cases. The data should reflect production reality as closely as possible, to help create more useful dashboards.
Approach 1: Execute the audits in the IST lab (or production) and copy the audit results from IST to the dev lab
- Script A collects the values that will be used as input parameters for the audit requests: serviceInstanceId, modelInvariantId, modelVersionId, customerId, serviceType
- Script B sends audit requests based on the data collected above; the requests need to be distributed properly over time to make the data more realistic
- Manually collect the Elasticsearch dump (which will contain all the audit validation/violation events) and import it into the Elasticsearch instance in the dev lab
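The Script B step above could be sketched as follows. The audit endpoint URL and the request body shape are assumptions for illustration; the real ones must come from the POMBA audit API. A `dry_run` flag keeps the sketch runnable without a lab connection:

```python
import json
import random
import time
import urllib.request

# Hypothetical POMBA audit endpoint -- replace with the real URL/path.
AUDIT_URL = "http://pomba-context-aggregator:9529/audit"

def send_audits(requests_data, window_seconds=3600, dry_run=True):
    """Script B sketch: spread audit requests over a time window to
    mimic production-like arrival times (rather than a burst)."""
    for req in requests_data:
        # Random delay within the window makes the timeline more realistic.
        delay = random.uniform(0, window_seconds / max(len(requests_data), 1))
        if not dry_run:
            time.sleep(delay)
            body = json.dumps(req).encode()
            urllib.request.urlopen(urllib.request.Request(
                AUDIT_URL, data=body,
                headers={"Content-Type": "application/json"}))
        yield req["serviceInstanceId"], round(delay, 1)

# Parameters as collected by Script A (placeholder values).
reqs = [{"serviceInstanceId": "si-1", "modelInvariantId": "mi-1",
         "modelVersionId": "mv-1", "customerId": "c-1",
         "serviceType": "vFW"}]

for sid, _delay in send_audits(reqs):
    print(sid)  # → si-1
```

The per-request delay distribution (uniform here) could be swapped for whatever arrival pattern best matches production traffic.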
Approach 2: Collect the component info from IST and copy it, component by component, to the dev lab
- Script X sends GET requests to retrieve all necessary info from each component of interest in IST (or production)
- Script Y sends PUT requests to load that info into the corresponding components in the dev lab
- Run Script A
- Run Script B
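The Script X / Script Y pair above could be sketched like this. The component names and endpoint URLs are hypothetical; the real paths depend on each component's API. A `dry_run` flag again keeps the sketch runnable offline:

```python
import json
import urllib.request

# Hypothetical endpoints for a component of interest in IST and in the
# dev lab -- the real paths come from each component's API documentation.
IST_URL = "https://aai.ist.example:8443/data"
DEV_URL = "https://aai.dev.example:8443/data"

def get_component_data(url, dry_run=True):
    """Script X sketch: GET the needed objects from a component in IST."""
    if dry_run:
        return {"note": "placeholder payload"}
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)

def put_component_data(url, payload, dry_run=True):
    """Script Y sketch: PUT the collected objects into the dev-lab component."""
    if dry_run:
        return 200
    req = urllib.request.Request(
        url, data=json.dumps(payload).encode(), method="PUT",
        headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return resp.status

payload = get_component_data(IST_URL)
status = put_component_data(DEV_URL, payload)
print(status)  # → 200
```

Once the component data is in place in the dev lab, Scripts A and B run exactly as in Approach 1 to generate the audit events locally.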
...