Integrating PNDA

Created by Donald Hunter, last modified by Krzysztof Kepka on Nov 16, 2018

The goal of this DCAE project is to provide the PNDA platform as a deployment option that delivers a big-data analytics platform as part of DCAE.

Meetings: PNDA-DCAE integration is discussed as part of the weekly DCAE call. For the current schedule, check DCAE Weekly Meetings ~~(Thu UTC 13:00 / China 21:00 / Eastern 9:00 / Pacific 06:00 zoom.us/j/824147956)~~.

Overview

Overview presentation: DCAE-PNDA-Overview.pdf.

High level summary of tasks:

Installation of PNDA within DCAE:

Create blueprint for PNDA infrastructure deployment with Cloudify DCAEGEN2-366
Build and package PNDA mirror DCAEGEN2-367
Deploy PNDA onto infra using Saltstack DCAEGEN2-368

Health-Check (PNDA to integrate with DCAE health check):

Implement PNDA monitoring in Consul DCAEGEN2-369

Enable Application Deployment on PNDA via DCAE:

Develop Cloudify plugin for PNDA package deployment DCAEGEN2-370

Data Integration (enable PNDA to receive data from DCAE collectors such as VES):

Integrate PNDA with DMaaP DCAEGEN2-371

Applications for PNDA:

Create PNDA applications for test/demo purposes (e.g. similar to current TCA) DCAEGEN2-632 (partially worked on as part of DCAEGEN2-371)

Release related information

Casablanca M3 Milestone for PNDA integration into DCAE:

Support for HDFS API
- VES data available in HDFS
Support for Spark Streaming API
Support for Spark Batch API
Jupyter Notebook
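
As a rough illustration of what "VES data available in HDFS" could look like from a client's side, the sketch below builds WebHDFS REST URLs for listing and reading a dataset directory. The host name, the dataset path and the use of the Hadoop 2.x default NameNode HTTP port (50070) are assumptions, not values taken from the PNDA integration. A real deployment may expose HDFS differently, for example through the Knox gateway.

```python
from urllib.parse import quote, urlencode

# Hypothetical values for illustration only; substitute those of your cluster.
NAMENODE_HOST = "namenode.example"
VES_DATASET_PATH = "/data/ves"  # assumed location, not defined by this page


def webhdfs_url(host, path, op, port=50070, **params):
    """Build a WebHDFS REST URL for an absolute HDFS path.

    `op` is a WebHDFS operation name such as LISTSTATUS or OPEN.
    Nothing is sent over the network; only the URL is constructed.
    """
    if not path.startswith("/"):
        raise ValueError("HDFS path must be absolute")
    query = urlencode({"op": op, **params})
    return f"http://{host}:{port}/webhdfs/v1{quote(path)}?{query}"


if __name__ == "__main__":
    # List the (assumed) VES dataset directory, then read a file from offset 0.
    print(webhdfs_url(NAMENODE_HOST, VES_DATASET_PATH, "LISTSTATUS"))
    print(webhdfs_url(NAMENODE_HOST, VES_DATASET_PATH + "/part-0", "OPEN", offset=0))
```

The resulting URLs can be passed to any HTTP client. The same data would normally be consumed through the Spark Streaming or Spark Batch APIs listed above rather than fetched file by file.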

PNDA 5.0 Components versions

The source of truth regarding versions is available in the PNDA 5.0 release note.

Component	Version
Kafka	1.1.0
Kafka Manager	1.3.3.15
PNDA Deployment Manager	XXX
PNDA Package Repository	XXX
PNDA Console	XXX
Gobblin	0.11.0
Flink	1.4.2
Knox	1.1.0
HortonWorks	2.6.5
	Hadoop	2.7.3
	HBase	1.1.2
	Hive	2.1.0
	Spark	1.6.3
	Spark	2.3.0
	Oozie	4.2.0
Grafana	5.1.3
OpenTSDB	2.3.0
Consul	1.0.3
Jupyter	4.2.1

PNDA APIs

As part of the ongoing DCAE integration with the PNDA data platform, here are some pointers to the APIs that PNDA provides:

Platform Data Management: https://github.com/pndaproject/platform-data-mgmnt/blob/develop/data-service/README.md
Platform Deployment Manager: https://github.com/pndaproject/platform-deployment-manager#api-documentation
Platform Package Repository: https://github.com/pndaproject/platform-package-repository#repository-api
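
The following is a minimal sketch of the package and application lifecycle that a client of the Deployment Manager might drive: deploy a package, create an application from it, then start the application. The endpoint paths (`/packages/<package>`, `/applications/<application>`, `/applications/<application>/start`), the base URL and the package and application names are assumptions for illustration and should be checked against the Deployment Manager API documentation linked above. The code only builds the requests and does not send them.

```python
from urllib.parse import quote, urljoin

# Hypothetical base URL; the real host and port depend on the PNDA deployment.
DM_BASE_URL = "http://pnda-deployment-manager.example:5000/"


def dm_request(method, *path_parts, base_url=DM_BASE_URL):
    """Return an (HTTP method, URL) pair describing a Deployment Manager call."""
    path = "/".join(quote(part, safe="") for part in path_parts)
    return method, urljoin(base_url, path)


def deploy_package(package):
    # Assumed endpoint: PUT /packages/<package>
    return dm_request("PUT", "packages", package)


def create_application(application):
    # Assumed endpoint: PUT /applications/<application>
    # (a real call would also carry a body naming the package to use)
    return dm_request("PUT", "applications", application)


def start_application(application):
    # Assumed endpoint: POST /applications/<application>/start
    return dm_request("POST", "applications", application, "start")


def lifecycle(package, application):
    """Ordered list of calls for the deploy, create and start steps."""
    return [
        deploy_package(package),
        create_application(application),
        start_application(application),
    ]


if __name__ == "__main__":
    for method, url in lifecycle("example-app-1.0.0", "example-app"):
        print(method, url)
```

Keeping request construction separate from request execution makes the sequence easy to inspect or to wrap in a Cloudify plugin, which is the direction task DCAEGEN2-370 describes.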

List of JIRA tickets associated with PNDA for DCAE - Casablanca

Key	Summary	T	Created	Updated	Due	Assignee	Reporter	P	Status	Resolution


List of JIRA tickets associated with PNDA for DCAE - Backlog

Key	Summary	T	Created	Updated	Due	Assignee	Reporter	P	Status	Resolution


Content presented at ONS.EU 2018

PNDA Integration into ONAP DCAE - Slideset

PNDA Analytics Application within DCAEgen2 - Video

ONAP DCAE vFW Data in PNDA - Recording.m4v


18 Comments

Donald Hunter
Guidance from OOM team:
Hi Donald,
Thanks for reaching out to the OOM team.
There is some guidance I can provide to expedite the code review/merge process for your new helm charts. A lot of time has been spent towards standardizing helm charts. Although not perfect, and still evolving, we strive for a level of consistency in the templates.
Creating Helm Charts
There is a “starter” helm chart that can be found here: onap-chart. This is not one-size-fits-all, but it provides a basis for most charts. The important part is the values.yaml, which attempts to standardize configuration parameter names (based on Helm best practices) and allows for centralized hierarchical configuration. I would try to pour your specific configuration into this example. I would have recommended to clone-and-own the dcae-bootstrap helm chart instead, but I see it has eliminated some of the standardized config.
Enable/disable PNDA Deployment
There isn’t really a requirement to have every chart disabled by default and then opt-in. In fact, by default, all ONAP components are deployed out-of-the-box. It can be viewed as a “demo” deployment. Customized deployments can use an override file to disable components as necessary. For PNDA, however, it would make sense to either disable the PNDA bootstrap sub chart or have the sub chart deploy but not spin up PNDA VMs unless a configuration flag is enabled and/or openstack configuration is provided. The configuration flag can be added to the DCAE values.yaml.
Today, each ONAP project (e.g. DCAE, SO) can be enabled/disabled via the values.yaml (+ requirements.yaml) inside the onap parent Helm chart. Unfortunately, this does not provide control over individual subcharts like the PNDA bootstrap sub chart you’re adding to DCAE.
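
To make the override mechanism concrete, here is a small sketch of how default chart values and an override file could combine, and how a sub chart might decide whether to spin up PNDA VMs. It imitates the recursive map merge that Helm applies to values and is not Helm code. All key names (`dcaegen2`, `pnda`, `enabled`, `openstack`, `keystoneUrl`) and the Keystone URL are hypothetical and not taken from the actual ONAP charts.

```python
import copy


def merge_values(defaults, overrides):
    """Recursively merge override values over defaults, override wins."""
    merged = copy.deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_values(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


def pnda_should_deploy(values):
    """Deploy PNDA only if the flag is set and OpenStack config is provided."""
    pnda = values.get("dcaegen2", {}).get("pnda", {})
    return bool(pnda.get("enabled")) and bool(pnda.get("openstack"))


# Hypothetical chart defaults: DCAE is on, the PNDA sub chart is off.
DEFAULTS = {
    "dcaegen2": {
        "enabled": True,
        "pnda": {"enabled": False, "openstack": {}},
    }
}

# Hypothetical override file content supplied for a customized deployment.
OVERRIDES = {
    "dcaegen2": {
        "pnda": {
            "enabled": True,
            "openstack": {"keystoneUrl": "http://keystone.example:5000/v3"},
        }
    }
}

if __name__ == "__main__":
    print(pnda_should_deploy(DEFAULTS))
    print(pnda_should_deploy(merge_values(DEFAULTS, OVERRIDES)))
```

Requiring both the flag and the OpenStack parameters mirrors the "and/or" guard suggested above in its stricter form. A chart could equally gate on either condition alone.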
OpenStack Configuration
There are a few projects that need OpenStack configuration. Take a look inside ONAP values.yaml for APPC, NBI and SO configuration. I would use these as examples on how to propagate the configuration you need. There is an effort to consolidate this configuration into a single shared configuration but that is a future deliverable.
Feel free to put up draft patches as soon as possible to get early feedback from the OOM team.
Please let me know if you have any questions, comments or concerns.
Thanks,
Mike.
--
Mike Elliott
ONAP OOM PTL
Senior Architect - Amdocs
- Permalink
- Aug 20, 2018
Roger Maitland
Hi Donald Hunter, when are you planning to introduce PNDA into DCAE? By the state of the stories it looks like this is Dublin content - correct?
Thanks,
Roger
- Permalink
- Sep 04, 2018
1. Donald Hunter
  Hi Roger Maitland,
  Actually we are planning to integrate several of these stories into Casablanca. #DCAEGEN2-367 is merged but the dcaegen2/deployments job is failing because it needs a larger VM flavour and we have an open helpdesk ticket for that. The stories which touch OOM are queued, waiting for the container artifacts from dcaegen2/deployments.
  Cheers,
  Donald.
  Permalink
  
  Sep 05, 2018
Srinivasa Addepalli
Hi,
As I understand, there is a deployment manager in PNDA that is used to upload packages (analytics applications), create applications from packages and start applications. A few questions on integration with the rest of ONAP:
- Which ONAP components will be calling these APIs? Is there a sequence diagram that helps in understanding the full LCM?
- I see from the source code that it supports an Oozie plugin as well as a YARN plugin via spark-streaming. Are there any plans to support a Kubernetes plugin?
- Is Oozie the only one that helps in defining a workflow that consists of multiple applications?
- I understand that Oozie supports Kubernetes in the Oozie 5.0 release. I hope Oozie version 5.0 is supported by PNDA.
- In the case of Oozie, the workflow is defined using XML files. In ONAP, how do these XML files get created? Do you expect them to be created outside of ONAP and uploaded to ONAP? If so, would they need to be uploaded to PNDA?
- How do we associate the various workflow XML files with the appropriate VNFs, Kafka topics and collectors? Is there any super XML that connects them all?
Srini
Vijay Venkatesh Kumar, Frank Brockners and Donald Hunter
- Permalink
- Oct 01, 2018
1. Donald Hunter
  Hi Srini,
  Your understanding is correct, there is a deployment manager in PNDA that has REST APIs for package upload, application creation and control.
  There are open issues relating to deployment integration for ONAP that we still have to plan, and we will see if we can scope them for Dublin.
  Cheers,
  Donald.
  Permalink
  
  Oct 05, 2018
  1. Srinivasa Addepalli
    Thanks Donald. I guess I assumed that the integration aspects are worked out. As part of edge-automation, we are trying to see how we can bring up networking analytics apps in remote spark clusters (deployed at the edge or regional sites). Hence, this integration aspect is very important for us.
    
    Permalink
    
    Oct 05, 2018
Srinivasa Addepalli
Vijay Venkatesh Kumar, Frank Brockners, Donald Hunter, ramki krishnan, Raghu Ranganathan
Hi DCAE team,
We at the edge automation group are studying how analytics applications can be run at the edges or near the edges. There are three aspects:
- Instantiating the analytics platform
- Uploading analytics applications
- Submitting the jobs
All of the above need to happen from ONAP-Central.
When an edge site is onboarded in ONAP, ONAP optionally brings up the analytics platform (Spark platform) in edge sites or delegated sites. We also don't want to rule out bringing up the Spark platform by other means in edge or delegated locations.
When a new analytics application is onboarded in ONAP, based on the edge configuration, the application images would need to be sent to the edges. We also don't want to rule out uploading analytics applications using other mechanisms.
When ONAP (on what basis is TBD) decides to submit jobs (streaming and batch), ONAP needs to communicate with the edge platform to submit Spark jobs.
There is the Apache Livy project (https://livy.incubator.apache.org/), which provides server software and clients in different languages. It seems that, by integrating a Livy client, ONAP can talk to the various edge analytics platforms over a RESTful API to submit and query jobs. The thought process is to leverage the server portion in the analytics platform.
That said, we want to leverage as much as possible of the work that was done in DCAE and PNDA. Hence we are looking for suggestions.
We are also looking for suggestions on workflow orchestration of the Spark pipeline (use Oozie or Apache Airflow).
Srini
- Permalink
- Oct 04, 2018
Srinivasa Addepalli
Frank Brockners and Donald Hunter,
In the 'Creating PNDA' section of the PNDA guide, it talks about bringing up the standard version of PNDA on various cloud technologies such as Openstack, AWS and bare-metal servers. In the case of Openstack, I see a set of HOTs, one for each component or dependency component: https://github.com/pndaproject/pnda-cli/tree/develop/heat-templates/standard.
Since ONAP is using K8S, I guess there would be Helm charts for each component. Also, as part of building, we would assume that there would be a Dockerfile for each of the components.
I tried to search in OOM for PNDA related helm charts. I only found two of them, and even those are not related to actual components: https://gerrit.onap.org/r/gitweb?p=oom.git;a=tree;f=kubernetes/pnda/charts;h=adc873ef88314d9addc7581cfc951eea1f80715c;hb=HEAD
Can you point me to the right place where the Dockerfiles and Helm charts for the standard PNDA components are present in github/ONAP-gerrit?
- Permalink
- Oct 08, 2018
1. Srinivasa Addepalli
  Hi,
  On Deployment Manager and its integration with K8S, we are hoping that spark-k8s-operator can be used.
  https://github.com/GoogleCloudPlatform/spark-on-k8s-operator
  Let us know whether this or something similar is in your plans for Kubernetes. If it is not on your radar, we may be able to help. Please let us know.
  Permalink
  
  Oct 29, 2018
Brian Freeman
I see charts in Casablanca OOM for PNDA but they are not enabled. Is PNDA a Dublin feature then, given where we are in the release cycle?
- Permalink
- Oct 29, 2018
1. Donald Hunter
  Hi Brian,
  The PNDA charts are not enabled by default because they are only supported when the Kubernetes cluster is on Openstack infra. You need to provide Openstack API parameters to the helm install so that the PNDA bootstrap container can provision Openstack VMs for the PNDA cluster.
  Cheers,
  Donald.
  Permalink
  
  Oct 29, 2018
Brian Freeman
So PNDA is not Docker container based?
- Permalink
- Oct 29, 2018
1. Donald Hunter
  No, it is not. The Hadoop ecosystem has traditionally been bare-metal based. Many Hadoop components can be containerised but the distros are not quite there yet.
  Permalink
  
  Oct 29, 2018
  1. Srinivasa Addepalli
    I see the following helm charts in the Helm repository:
    HDFS: https://github.com/helm/charts/tree/master/stable/hadoop
    Spark: https://github.com/helm/charts/tree/master/stable/spark
    Kafka: https://github.com/helm/charts/tree/master/incubator/kafka
    As part of the analytics-as-a-service initiative (in R4), the thought is to make everything a Helm based deployment: reuse charts for the packages that PNDA takes from open source, and develop charts for the others that are not yet found in open source (such as OpenTSDB and the PNDA deployment manager). Our main intention is to bring up the analytics framework not only in ONAP (using OOM), but also anywhere else (edge, regional sites etc., using site specific K8S) for doing network analytics.
    Of course, we need to work with you to ensure that there is no duplicate effort. Please do let us know what you are planning for R4.
    Srini
    
    Permalink
    
    Oct 29, 2018
    1. Donald Hunter
      
      Note that the hadoop chart you linked says this:
      "This chart is primarily intended to be used for YARN and MapReduce job execution where HDFS is just used as a means to transport small artifacts within the framework and not for a distributed filesystem. Data should be read from cloud based datastores such as Google Cloud Storage, S3 or Swift."
      
      Permalink
      
      Oct 29, 2018
      1. Srinivasa Addepalli
        
        Sorry. That was meant for Hadoop.
        In the case of K8S Spark, we only require HDFS. Charts are given here: https://github.com/apache-spark-on-k8s/kubernetes-HDFS
        These are the ones that we think can be used as a base.
        Let us know whether they satisfy PNDA's needs.
        
        Permalink
        
        Oct 29, 2018
        
        Donald Hunter
        
        Thanks for the link.
        I will take a look at this to see if we can put PNDA services on top.
        
        Permalink
        
        Oct 29, 2018
    2. Donald Hunter
      
      Hi Srinivasa,
      We are just working through what to plan in R4 and would definitely like to collaborate with you. I would like to eventually reach a fully containerised analytics-as-a-service solution as you describe, but I don't know if that is achievable in R4 timeframe.
      We should be able to leverage existing dockerfiles and helm charts for some components like kafka, OpenTSDB, etc. The HDFS/Spark deployment and the data storage management is the harder part.
      
      Permalink
      
      Oct 29, 2018
