...

DataLake's goals are:

  1. Provide a systematic way to ingest DMaaP data in real time into Couchbase, a distributed document-oriented database, and Druid, a data store designed for low-latency OLAP analytics.
  2. Serve as a common, easily accessible document store for other ONAP components.
  3. Provide data-access APIs and other ways for ONAP components and external systems (e.g. OSS/BSS) to consume the data.
  4. Provide sophisticated, ready-to-use interactive analytics GUI tools built on the data. Custom analytics applications are also built on the data, and their results are exposed via REST API.

Architecture

The data storage and associated tools are infrastructure external to ONAP. They are installed only once, at the start, or existing infrastructure is reused. Since custom settings and applications will be deployed to them, they are effectively integral parts of DataLake.

Scope

Data Sources

...

  • Provide an admin REST API for configurations and topic management. Each topic can be configured with the data stores it is exported to (Couchbase and Druid are supported initially) and its TTL (Time To Live) in those stores. We will support more distributed databases in the future if needed.

  • Provide an Admin GUI to manage the dispatcher, making use of the above admin REST API. The GUI also manages the analytics tools and applications.
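The per-topic settings described above (target data stores and TTL) can be pictured as a small record. The following is only a minimal sketch: the class name `TopicConfig`, the field names, the use of days as the TTL unit, and the convention that a TTL of 0 means "keep forever" are all assumptions made for illustration, not part of the DataLake admin API.

```python
from dataclasses import dataclass, field

# Stores named in the text as supported initially.
SUPPORTED_STORES = {"couchbase", "druid"}


@dataclass
class TopicConfig:
    """Hypothetical per-topic settings; all field names are illustrative."""
    name: str
    stores: set = field(default_factory=set)  # data stores the topic is exported to
    ttl_days: int = 0  # assumption: 0 means the data never expires

    def validate(self):
        """Reject unknown stores and negative TTLs; return self when valid."""
        unknown = self.stores - SUPPORTED_STORES
        if unknown:
            raise ValueError(f"unsupported data stores: {sorted(unknown)}")
        if self.ttl_days < 0:
            raise ValueError("ttl_days must be non-negative")
        return self


def stores_for(topic, configs):
    """Return the stores a topic is exported to, or an empty set if unconfigured."""
    cfg = configs.get(topic)
    return set(cfg.stores) if cfg else set()
```

A dispatcher could consult such a mapping for each incoming message to decide where to write it, and an admin REST API could create or update these records.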

Document Store

  • Monitor selected topics, pull the data in real time, and insert it into Couchbase, one table for each topic, with the table named after the topic.

  • Data in JSON, XML, and YAML formats is automatically converted into the store's native schema. We may support additional formats. Data not in these formats is stored as a single string.

  • Provide a REST API for data queries; applications can also access the data through the native API.

  • Couchbase supports running Spark directly on it, which allows complex analytics tools to be built. We will develop Spark analytics applications if needed.

  • Other ONAP components can take advantage of this to store their operational data. If we need to run heavy analytics jobs on historical data, we should separate the operational data from the historical data. Otherwise, Couchbase's scalability gives us the option of letting both coexist.
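The format handling described above (structured payloads become documents, anything else is kept as a single string) could look roughly like the sketch below. It is an illustration under stated assumptions, not the DataLake implementation: the function names and the `data` key used for unrecognized payloads are hypothetical, XML attributes are ignored, and YAML is left out because parsing it needs a third-party library.

```python
import json
import xml.etree.ElementTree as ET


def xml_to_dict(elem):
    """Convert an XML element to a dict; leaf elements become their text."""
    children = list(elem)
    if not children:
        return elem.text
    result = {}
    for child in children:
        value = xml_to_dict(child)
        if child.tag in result:
            # Repeated tags are collected into a list.
            if not isinstance(result[child.tag], list):
                result[child.tag] = [result[child.tag]]
            result[child.tag].append(value)
        else:
            result[child.tag] = value
    return result


def to_document(raw):
    """Turn a raw message string into a document for the store."""
    try:
        parsed = json.loads(raw)
        if isinstance(parsed, (dict, list)):
            return parsed
    except ValueError:
        pass  # not JSON, try XML next
    try:
        root = ET.fromstring(raw)
        return {root.tag: xml_to_dict(root)}
    except ET.ParseError:
        pass  # not XML either
    # Fallback: keep the payload as a single string (key name is an assumption).
    return {"data": raw}
```

Trying the stricter parsers first and falling back to a plain string means no message is dropped just because its format is unknown.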

OLAP Store

  • Monitor selected topics, pull the data in real time, and insert it into Druid, one datasource for each topic, with the datasource named after the topic.

  • Extract the dimensions and metrics from JSON files and pre-configure Druid settings for each datasource. These settings are customizable through a web interface.

  • Integrate Apache Superset for data exploration and visualization, and provide pre-built interactive dashboards.

  • Integrate Grafana for time series analytics.
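As an illustration of extracting dimensions and metrics from JSON, the sketch below applies one simple heuristic: top-level numeric fields are treated as metrics and all other fields as dimensions. This rule and the function name `infer_schema` are assumptions made for the example. The text does not specify how DataLake classifies fields, and handling of nested fields and the timestamp column is omitted.

```python
def infer_schema(records):
    """Classify top-level fields of parsed JSON records.

    Returns (dimensions, metrics) as sorted lists of field names.
    """
    dimensions, metrics = set(), set()
    for record in records:
        for key, value in record.items():
            # bool is a subclass of int in Python, so exclude it explicitly.
            if isinstance(value, (int, float)) and not isinstance(value, bool):
                metrics.add(key)
            else:
                dimensions.add(key)
    # A field seen with a non-numeric value in any record is a dimension.
    metrics -= dimensions
    return sorted(dimensions), sorted(metrics)
```

A result like this could seed the pre-configured settings for a datasource, which an operator would then adjust through the web interface.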

...

Use the above information to create a key project facts section on your project page

...