...

  • Given a RequestId (or TransactionId), retrieve all related logs across all components for investigation
  • Aggregate and visualize the logs by log level, component, and error message, highlighting the meaningful logs with color coding
  • List all significant logs and their details
  • Provide a drilldown link for further investigation at the subcomponent level

...

  • Timestamp (date formatted)
  • RequestId (UUID)
  • componentName (string)
  • subComponent (string)
  • loglevel (one of INFO, DEBUG, WARN, ERROR)
  • detailmessage (which is equivalent to the field message but aggregatable)
  • source (log file path)
  • Logger (logging object)
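
As a rough illustration of how these fields support the features listed above, the following Python sketch filters a few made-up log records by RequestId and counts them by component and log level. The sample records, the subComponent values, and the function name `summarize_request` are hypothetical; in the actual setup this aggregation is performed by Kibana over the "logstash-*" index.

```python
from collections import Counter

# Hypothetical sample records using the field names listed above.
REQ_A = "11111111-1111-1111-1111-111111111111"
REQ_B = "22222222-2222-2222-2222-222222222222"
SAMPLE_LOGS = [
    {"Timestamp": "2018-03-06T10:00:01.000Z", "RequestId": REQ_A,
     "componentName": "SO", "subComponent": "example-sub-a",
     "loglevel": "INFO", "detailmessage": "request received"},
    {"Timestamp": "2018-03-06T10:00:02.500Z", "RequestId": REQ_A,
     "componentName": "SDNC", "subComponent": "example-sub-b",
     "loglevel": "ERROR", "detailmessage": "connection refused"},
    {"Timestamp": "2018-03-06T10:00:03.000Z", "RequestId": REQ_B,
     "componentName": "SO", "subComponent": "example-sub-a",
     "loglevel": "INFO", "detailmessage": "request received"},
]


def summarize_request(logs, request_id):
    """Count the logs of one request by (componentName, loglevel)."""
    related = [log for log in logs if log["RequestId"] == request_id]
    return Counter((log["componentName"], log["loglevel"]) for log in related)


summary = summarize_request(SAMPLE_LOGS, REQ_A)
for (component, level), count in sorted(summary.items()):
    print(component, level, count)
```

Keying the counter on the (component, log level) pair mirrors the grouping used in the component-level view, where each cell can then be color coded by level.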

Create Kibana Scripted Fields

The following two fields need to be created manually by the user against the index pattern "logstash-*" on the Kibana scripted fields page (Management → Index Patterns → scripted fields):

...

  • Drill Down: URL link to the subcomponent level dashboard (see below for the details) 

       Script 

    • The code below needs to be put into the 'Script' part of the above page.

    • The drill-down time range is preset to span from 60 seconds before to 5 seconds after the log timestamp; the user can change these values as necessary.

    • The IP address in the URL should be changed to that of the user's ONAP Kibana server.
    • The dashboard ID (105a83d0- ... in the code) should match that of the imported dashboard "Culprit Locator (Subcomponent Level)".
Code Block
// Build the drill-down link only for ERROR-level logs.
if (doc['loglevel.keyword'].value == "ERROR") {
  // Use "yyyy" (calendar year); "YYYY" is the week-based year and can be wrong around New Year.
  DateTimeFormatter ft = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss.SSS");
  // Time window: 60 seconds before to 5 seconds after the log timestamp (treated as epoch milliseconds).
  Instant from_instant = Instant.ofEpochMilli(doc['Timestamp'].value - 60000);
  Instant to_instant = Instant.ofEpochMilli(doc['Timestamp'].value + 5000);
  ZonedDateTime from_z = from_instant.atZone(ZoneId.of("UTC"));
  ZonedDateTime to_z = to_instant.atZone(ZoneId.of("UTC"));
  String from_ts = from_z.format(ft);
  String to_ts = to_z.format(ft);

  String ReqId = doc['RequestId.keyword'].value;
  // Replace the IP address and dashboard ID with those of your own Kibana server.
  String URL = "http://10.147.58.138:5601/app/kibana#/dashboard/105a83d0-215e-11e8-a744-95cec6f8f1a4?_g=(refreshInterval:(display:Off,pause:!f,value:0),time:(from:'" + from_ts + "',mode:absolute,to:'" + to_ts + "'))&_a=(description:'',filters:!(),options:(darkTheme:!f),query:(query_string:(analyze_wildcard:!t,query:'RequestId:%20%22" + ReqId + "%22')),timeRestore:!f,viewMode:view)";
  return URL;
}
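
To sanity-check the time window and the URL format outside Kibana, the logic of the script above can be mirrored in plain Python. This is only a sketch under stated assumptions: `KIBANA_BASE`, `DASHBOARD_ID`, `format_ts`, and `build_drilldown_url` are hypothetical names, the host is a placeholder, and the timestamp is assumed to be in epoch milliseconds.

```python
from datetime import datetime, timedelta, timezone

KIBANA_BASE = "http://kibana.example.org:5601"  # placeholder host, not a real server
DASHBOARD_ID = "<dashboard-id>"                  # placeholder; use the imported dashboard's ID
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def format_ts(moment):
    """Format a datetime as 'yyyy-MM-dd HH:mm:ss.SSS' (millisecond precision)."""
    return moment.strftime("%Y-%m-%d %H:%M:%S.") + f"{moment.microsecond // 1000:03d}"


def build_drilldown_url(timestamp_ms, request_id, before_s=60, after_s=5):
    """Build a drill-down URL covering [timestamp - before_s, timestamp + after_s] in UTC."""
    event = EPOCH + timedelta(milliseconds=timestamp_ms)
    from_ts = format_ts(event - timedelta(seconds=before_s))
    to_ts = format_ts(event + timedelta(seconds=after_s))
    return (
        f"{KIBANA_BASE}/app/kibana#/dashboard/{DASHBOARD_ID}"
        f"?_g=(refreshInterval:(display:Off,pause:!f,value:0),"
        f"time:(from:'{from_ts}',mode:absolute,to:'{to_ts}'))"
        f"&_a=(description:'',filters:!(),options:(darkTheme:!f),"
        f"query:(query_string:(analyze_wildcard:!t,"
        f"query:'RequestId:%20%22{request_id}%22')),timeRestore:!f,viewMode:view)"
    )


# Example: a log stamped 2018-03-06 10:00:00.250 UTC.
print(build_drilldown_url(1520330400250, "11111111-1111-1111-1111-111111111111"))
```

Changing `before_s` and `after_s` corresponds to editing the 60000 and 5000 millisecond offsets in the scripted field.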

Import Kibana Objects

Two approaches are provided for importing the Kibana objects (dashboards/visualizations/searches).

  • Download the Kibana object JSON file to the user's computer and import it in Kibana: Management → Saved Objects → Import.

        or

  • Download and uncompress the tar file ... into the home directory of ... the onap-log kibana server. Run the script, "onap_import ... _kibana_objects.py"

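
Because the dashboard ID in the Drill Down script must match the imported "Culprit Locator (Subcomponent Level)" dashboard, it can help to look up the ID in the object file before importing it. The sketch below assumes the file is a JSON array of objects with `_id`, `_type`, and `_source` keys, as exported by older Kibana versions; this layout is an assumption and may differ for your Kibana version. The function names and the sample data are hypothetical.

```python
import json


def list_saved_objects(objects):
    """Return (type, id, title) tuples for each exported object."""
    rows = []
    for obj in objects:
        title = obj.get("_source", {}).get("title", "")
        rows.append((obj.get("_type", ""), obj.get("_id", ""), title))
    return rows


def load_export(path):
    """Load a Kibana object file; the path is supplied by the user."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


# Hypothetical, hand-written sample in the assumed export layout.
sample = json.loads("""
[
  {"_id": "example-dashboard-id", "_type": "dashboard",
   "_source": {"title": "Culprit Locator (Subcomponent Level)"}},
  {"_id": "example-search-id", "_type": "search",
   "_source": {"title": "example search"}}
]
""")
for obj_type, obj_id, title in list_saved_objects(sample):
    print(obj_type, obj_id, title)
```

With a real file, `list_saved_objects(load_export(path))` would list the dashboard entries whose IDs can then be compared with the one used in the scripted field.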

How To Use

This section explains the basic features of the current version of Culprit Locator. Users are encouraged to take full advantage of Kibana's flexibility to modify and enhance the dashboards to suit their purposes and preferences.

1. Open the dashboard "Culprit Locator (Component Level)" and adjust the time picker to the period you are interested in, or choose a preset such as "Last 7 days". The dashboard features:

...

2. The drill-down (more focused) dashboard automatically opens in a separate tab, with the time period set from 60 seconds before the error to 5 seconds after it. The user can continue to zoom in or click the boxes. The dashboard features:

  • Log tracking by subcomponent over time, using the same color scheme as the previous dashboard
  • Table of error message patterns categorizing all different error messages (to be improved)
  • Table of all relevant logs at all log levels, scrollable for investigating the logs in the specified time period


Below is a sample zoomed-in view of the red box in the previous view:

[Image: sample zoomed-in drill-down view]
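
The error message pattern table listed above groups different messages into categories. One possible way to derive such patterns, shown here only as a sketch and not as how the dashboard currently does it, is to mask the variable parts of each message (UUIDs and numbers) and count the resulting templates. The regular expressions, the function name `to_pattern`, and the sample messages are all hypothetical.

```python
import re
from collections import Counter

UUID_RE = re.compile(
    r"\b[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}\b"
)
NUMBER_RE = re.compile(r"\b\d+\b")


def to_pattern(message):
    """Mask UUIDs first, then standalone numbers, to obtain a message template."""
    masked = UUID_RE.sub("<uuid>", message)
    return NUMBER_RE.sub("<num>", masked)


# Hypothetical error messages.
messages = [
    "Request 4f9c2d1e-0b7a-4c3d-9e8f-1a2b3c4d5e6f timed out after 30 seconds",
    "Request 0a1b2c3d-1111-2222-3333-444455556666 timed out after 45 seconds",
    "Connection refused on port 8080",
]

pattern_counts = Counter(to_pattern(message) for message in messages)
for pattern, count in pattern_counts.most_common():
    print(count, pattern)
```

Masking UUIDs before numbers matters, since a UUID segment made only of digits would otherwise be replaced by the number mask and no longer match the UUID expression.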

Feature Improvements

The next step for Culprit Locator is to automate the identification of root causes with more intelligent features:

  • Implicit error detection, not only based on actual ERROR logs
  • Pattern recognition of issues: e.g., a series of transactions across the components over time
  • Identification of precursors to failures
  • Identification of issues based on metrics, e.g., elapsed time for specific API calls between the components
  • Customized features based on specific use cases, e.g., VoLTE, vFW, vCPE