The goal of this DCAE project is to offer PNDA as a deployment option that provides a big-data analytics platform within DCAE.

Meetings: PNDA-DCAE integration is discussed as part of the weekly DCAE call (Thu, UTC 13:00 / China 21:00 / Eastern 09:00 / Pacific 06:00, zoom.us/j/824147956); check the DCAE Weekly Meetings page.


Overview

Overview presentation: DCAE-PNDA-Overview.pdf.

High level summary of tasks:

Installation of PNDA within DCAE:

Health-Check (PNDA to integrate with DCAE health check):

Enable Application Deployment on PNDA via DCAE:

  • Develop Cloudify plugin for PNDA package deployment (DCAEGEN2-370)

Data Integration (Enable PNDA to receive data from DCAE collectors like VES, etc.)

Applications for PNDA

  • Create PNDA applications for test/demo purposes (e.g. similar to current TCA)
    (DCAEGEN2-632, partially worked on as part of DCAEGEN2-371)

Release related information

Casablanca M3 Milestone for PNDA integration into DCAE:

  • Support for HDFS API
    • VES data available in HDFS
  • Support for Spark Streaming API
  • Support for Spark Batch API
  • Jupyter Notebook

PNDA 5.0 Components versions

The source of truth regarding versions is available in the PNDA 5.0 release note.

Component                  Version
Kafka                      1.1.0
Kafka Manager              1.3.3.15
PNDA Deployment Manager    XXX
PNDA Package Repository    XXX
PNDA Console               XXX
Gobblin                    0.11.0
Flink                      1.4.2
Knox                       1.1.0
HortonWorks                2.6.5
Hadoop                     2.7.3
HBase                      1.1.2
Hive                       2.1.0
Spark                      1.6.3
Spark                      2.3.0
Oozie                      4.2.0
Grafana                    5.1.3
OpenTSDB                   2.3.0
Consul                     1.0.3
Jupyter                    4.2.1

PNDA APIs

As part of the ongoing DCAE integration with the PNDA data platform, here are some pointers to the APIs that PNDA provides:

List of JIRA tickets associated with PNDA for DCAE - Casablanca


List of JIRA tickets associated with PNDA for DCAE - Backlog



Content presented at ONS.EU 2018

PNDA Integration into ONAP DCAE - Slideset


PNDA Analytics Application within DCAEgen2 - Video

ONAP DCAE vFW Data in PNDA - Recording.m4v




18 Comments

  1. Guidance from OOM team:

    Hi Donald,

    Thanks for reaching out to the OOM team.

    There is some guidance I can provide to expedite the code review/merge process for your new helm charts. A lot of time has been spent on standardizing helm charts. Although not perfect, and still evolving, we strive for a level of consistency in the templates.

    Creating Helm Charts

    There is a “starter” helm chart that can be found here: onap-chart. It is not one-size-fits-all, but it provides a basis for most charts. The important part is the values.yaml: it is an attempt at standardizing configuration parameter names (based on Helm best practices) that allows for centralized, hierarchical configuration. I would try to pour your specific configuration into this example. I would have recommended cloning and owning the dcae-bootstrap helm chart instead, but I see it has eliminated some of the standardized config.

    Enable/disable PNDA Deployment

    There isn’t really a requirement to have every chart disabled by default and then opt-in. In fact, by default, all ONAP components are deployed out-of-the-box. It can be viewed as a “demo” deployment. Customized deployments can use an override file to disable components as necessary. For PNDA, however, it would make sense to either disable the PNDA bootstrap sub chart or have the sub chart deploy but not spin up PNDA VMs unless a configuration flag is enabled and/or openstack configuration is provided. The configuration flag can be added to the DCAE values.yaml.

    Today, each ONAP project (e.g. DCAE, SO) can be enabled/disabled via the values.yaml (+ requirements.yaml) inside the onap parent Helm chart. Unfortunately, this does not provide control over individual subcharts like the PNDA bootstrap sub chart you’re adding to DCAE.
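
The layering described above, chart defaults plus an override file, can be sketched as a recursive dictionary merge. This is only an illustration of how Helm-style hierarchical values combine; the key names (`dcaegen2.pnda.enabled`, the `openstack` block, `RegionOne`) are hypothetical and are not taken from the actual OOM charts.

```python
from copy import deepcopy


def merge_values(base, override):
    """Recursively merge override into a copy of base.

    Nested maps are merged key by key; any other value in the override
    replaces the default, which mirrors how Helm layers values files.
    """
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_values(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


# Hypothetical defaults: every project is deployed, PNDA provisioning is off.
defaults = {
    "dcaegen2": {"enabled": True, "pnda": {"enabled": False, "openstack": {}}},
    "so": {"enabled": True},
}

# Hypothetical override file for a customised deployment.
override = {
    "so": {"enabled": False},
    "dcaegen2": {"pnda": {"enabled": True, "openstack": {"region": "RegionOne"}}},
}

effective = merge_values(defaults, override)
print(effective)
```

With a flag like this, the sub chart can always be deployed while the bootstrap step only provisions PNDA VMs when the flag is set and OpenStack settings are present.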

    OpenStack Configuration

    There are a few projects that need OpenStack configuration. Take a look inside ONAP values.yaml for APPC, NBI and SO configuration. I would use these as examples on how to propagate the configuration you need. There is an effort to consolidate this configuration into a single shared configuration but that is a future deliverable.  

    Feel free to put up draft patches as soon as possible to get early feedback from the OOM team.

    Please let me know if you have any questions, comments or concerns.

    Thanks,

    Mike.

    --
    Mike Elliott
    ONAP OOM PTL
    Senior Architect - Amdocs

  2. Hi Donald Hunter, when are you planning to introduce PNDA into DCAE? By the state of the stories it looks like this is Dublin content - correct?

    Thanks,

    Roger

    1. Hi Roger Maitland,

      Actually we are planning to integrate several of these stories into Casablanca. #DCAEGEN2-367 is merged but the dcaegen2/deployments job is failing because it needs a larger VM flavour and we have an open helpdesk ticket for that. The stories which touch OOM are queued, waiting for the container artifacts from dcaegen2/deployments.

      Cheers,
      Donald.

  3. Hi, 

    As I understand it, there is a deployment manager in PNDA that is used to upload packages (analytics applications), create applications from packages and start applications. A few questions on integration with the rest of ONAP:

    • Which ONAP components will be calling these APIs? Is there a sequence diagram that helps in understanding the full LCM?
    • I see from the source code that it supports an Oozie plugin as well as a YARN plugin via spark-streaming. Are there any plans to support a Kubernetes plugin?
    • Is Oozie the only one that helps in defining a workflow that consists of multiple applications?
    • I understand that Oozie supports Kubernetes in the Oozie 5.0 release. I hope Oozie version 5.0 is supported by PNDA.
    • In the case of Oozie, the workflow is defined using XML files. In ONAP, how do these XML files get created? Do you expect them to be created outside of ONAP and uploaded to ONAP? If so, would they need to be uploaded to PNDA?
    • How do we associate the various workflow XML files with the appropriate VNFs, Kafka topics and collectors? Is there a super XML that connects them all?

    Srini

    Adding notifications...

    Vijay Venkatesh Kumar, Frank Brockners and Donald Hunter

    1. Hi Srini,

      Your understanding is correct, there is a deployment manager in PNDA that has REST APIs for package upload, application creation and control.

      There are open issues relating to deployment integration for ONAP that we still have to plan, and we will see if we can scope them for Dublin.

      Cheers,
      Donald.
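
The package and application lifecycle mentioned in this reply (deploy a package, create an application from it, start it) could be driven by a client along the following lines. This is a minimal sketch: the endpoint paths and request body are assumptions modelled on the PNDA Deployment Manager documentation and should be checked against the deployed version, and the host, port, package and application names are placeholders. The code only builds the requests; nothing is sent.

```python
import json
from urllib.request import Request

# Placeholder address of the PNDA Deployment Manager (assumption).
BASE_URL = "http://deployment-manager.example:5000"


def deploy_package_request(package):
    """Build the request that deploys a package from the package repository."""
    return Request(f"{BASE_URL}/packages/{package}", method="PUT")


def create_application_request(application, package, properties=None):
    """Build the request that creates an application from a deployed package.

    The body fields are assumptions: a package name plus optional overrides.
    """
    body = {"package": package}
    body.update(properties or {})
    return Request(
        f"{BASE_URL}/applications/{application}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


def start_application_request(application):
    """Build the request that starts a previously created application."""
    return Request(f"{BASE_URL}/applications/{application}/start", method="POST")


# Walk through the three steps for a hypothetical package and application.
steps = [
    deploy_package_request("demo-package-1.0.0"),
    create_application_request("demo-app", "demo-package-1.0.0"),
    start_application_request("demo-app"),
]
for request in steps:
    print(request.get_method(), request.full_url)
```

Each request can be passed to `urllib.request.urlopen` once the address points at a real Deployment Manager. An ONAP-side component such as a Cloudify plugin would need to issue this sequence and poll for completion.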

      1. Thanks Donald. I had assumed that the integration aspects were already worked out. As part of edge-automation, we are trying to see how we can bring up networking analytics apps in remote Spark clusters (deployed at the edge or regional sites). Hence, this integration aspect is very important for us.

  4. Vijay Venkatesh Kumar, Frank Brockners, Donald Hunter, ramki krishnan, Raghu Ranganathan

    Hi DCAE team,

    We in the edge automation group are studying how analytics applications can be run at the edge or near the edge. There are three aspects:

    • Instantiating the analytics platform
    • Uploading analytics applications
    • Submitting the jobs

    All of the above need to happen from ONAP-Central.

    When an edge site is onboarded in ONAP, ONAP optionally brings up the analytics platform (Spark platform) in edge sites or delegated sites. We also don't want to rule out bringing up the Spark platform by other means in edge or delegated locations.

    When a new analytics application is onboarded in ONAP, the application images would need to be sent to the edges, based on edge configuration. We also don't want to rule out uploading analytics applications using other mechanisms.

    When ONAP (on what basis is TBD) decides to submit jobs (streaming and batch), ONAP needs to communicate with the edge platform to submit Spark jobs.

    There is the Apache Livy project (https://livy.incubator.apache.org/), which provides server software and clients in different languages. It seems that ONAP, by integrating the client, can use it to talk to various edge analytics platforms over a RESTful API to submit and query jobs. The thought is to leverage the server portion in the analytics platform.

    That said, we want to leverage as much as possible of the work already done in DCAE and PNDA. Hence we are looking for suggestions.

    We are also looking for suggestions on workflow orchestration of the Spark pipeline (use Oozie or Apache Airflow).

    Srini
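
The Livy-based job submission proposed in this comment could look roughly like the sketch below. It uses Livy's REST endpoints for batch jobs (`POST /batches` and `GET /batches/{id}/state`); the server address, the application path, the class name and the arguments are placeholders chosen for illustration. The code only builds the requests and does not contact a server.

```python
import json
from urllib.request import Request

# Placeholder address of a Livy server at an edge site (8998 is Livy's default port).
LIVY_URL = "http://livy.edge-site.example:8998"


def submit_batch_request(app_file, class_name=None, args=None):
    """Build the request that submits a Spark batch job through Livy."""
    payload = {"file": app_file}
    if class_name:
        payload["className"] = class_name
    if args:
        payload["args"] = list(args)
    return Request(
        f"{LIVY_URL}/batches",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def batch_state_request(batch_id):
    """Build the request that queries the state of a submitted batch."""
    return Request(f"{LIVY_URL}/batches/{batch_id}/state", method="GET")


# Hypothetical analytics job stored in HDFS at the edge site.
submit = submit_batch_request(
    "hdfs:///apps/analytics-app.jar",
    class_name="org.example.AnalyticsJob",
    args=["--topic", "ves-events"],
)
print(submit.get_method(), submit.full_url)
print(submit.data.decode("utf-8"))
```

ONAP-Central would send the request with `urllib.request.urlopen`, read the batch `id` from the JSON response, and then poll `batch_state_request(id)` until the job finishes.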


  5. Frank Brockners and Donald Hunter,

    The 'Creating PNDA' section of the PNDA guide talks about bringing up the standard version of PNDA on various cloud technologies such as OpenStack, AWS and bare-metal servers. In the case of OpenStack, I see a set of HOTs, one for each component or dependency component: https://github.com/pndaproject/pnda-cli/tree/develop/heat-templates/standard

    Since ONAP is using K8S, I guess there would be Helm charts for each component. Also, as part of the build, we would assume that there would be a Dockerfile for each of the components.

    I tried to search in OOM for PNDA-related helm charts. I only found two of them, and even those are not related to actual components: https://gerrit.onap.org/r/gitweb?p=oom.git;a=tree;f=kubernetes/pnda/charts;h=adc873ef88314d9addc7581cfc951eea1f80715c;hb=HEAD

    Can you point me to the place in GitHub or ONAP Gerrit where the Dockerfiles and Helm charts for the standard PNDA components can be found?



    1. Hi,

      On Deployment Manager and its integration with K8S, we are hoping that spark-k8s-operator can be used.  

      https://github.com/GoogleCloudPlatform/spark-on-k8s-operator

      Let us know whether this or something similar is in your plans for Kubernetes. If it is not on your radar, we may be able to help. Please let us know.

  6. I see charts in Casablanca OOM for PNDA but they are not enabled. Is PNDA a Dublin feature then, given where we are in the release cycle?

    1. Hi Brian,

      The PNDA charts are not enabled by default because they are only supported when the Kubernetes cluster is on OpenStack infra. You need to provide OpenStack API parameters to the helm install so that the PNDA bootstrap container can provision OpenStack VMs for the PNDA cluster.

      Cheers,
      Donald.
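
One way to pass such parameters to a helm install is through repeated `--set key=value` flags. The sketch below turns a nested settings dictionary into those flags; the parameter names (`dcaegen2.pnda.enabled`, `authUrl`, `tenant`) and the Keystone address are hypothetical, since the real names are defined by the PNDA charts in OOM.

```python
def flatten(values, prefix=""):
    """Yield (dotted.key, value) pairs from a nested dictionary."""
    for key, value in values.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            yield from flatten(value, path)
        else:
            yield path, value


def helm_set_args(values):
    """Convert nested settings into a list of helm --set arguments.

    Booleans are lower-cased because Helm expects true/false. Values that
    contain commas would need escaping, which this sketch does not handle.
    """
    args = []
    for path, value in flatten(values):
        if isinstance(value, bool):
            value = str(value).lower()
        args += ["--set", f"{path}={value}"]
    return args


# Hypothetical OpenStack parameters for the PNDA bootstrap container.
settings = {
    "dcaegen2": {
        "pnda": {
            "enabled": True,
            "openstack": {
                "authUrl": "http://keystone.example:5000/v3",
                "tenant": "demo",
            },
        }
    }
}

print(" ".join(helm_set_args(settings)))
```

For more than a handful of parameters, an override file passed with `-f` is usually easier to review than a long list of `--set` flags.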

  7. So PNDA is not Docker container based?

    1. No, it is not. The Hadoop ecosystem has traditionally been bare-metal based. Many Hadoop components can be containerised but the distros are not quite there yet.

      1. I see the following helm charts in the Helm repository:

        HDFS: https://github.com/helm/charts/tree/master/stable/hadoop

        Spark: https://github.com/helm/charts/tree/master/stable/spark

        Kafka: https://github.com/helm/charts/tree/master/incubator/kafka

        As part of the analytics-as-a-service initiative (in R4), the thought is to make everything a Helm-based deployment: reuse existing charts for the open-source packages that PNDA uses, and develop charts for those not yet available in open source (such as OpenTSDB and the PNDA deployment manager). Our main intention is to bring up the analytics framework not only in ONAP (using OOM), but anywhere (edge, regional sites, etc., using site-specific K8S) for doing network analytics.

        Of course, we need to work with you to ensure that there is no duplicate effort.  Please do let us know what you are planning for R4.

        Srini

        1. Note that the hadoop chart you linked says this:

          "This chart is primarily intended to be used for YARN and MapReduce job execution where HDFS is just used as a means to transport small artifacts within the framework and not for a distributed filesystem. Data should be read from cloud based datastores such as Google Cloud Storage, S3 or Swift."

          1. Sorry. That was meant for Hadoop. 

            In case of K8S Spark, we only require HDFS. Charts are given here: https://github.com/apache-spark-on-k8s/kubernetes-HDFS

            These are the ones we think can be used as a base.

            Let us know whether they satisfy PNDA's needs.

            1. Thanks for the link.

              I will take a look at this to see if we can put PNDA services on top.

        2. Hi Srinivasa,

          We are just working through what to plan in R4 and would definitely like to collaborate with you. I would like to eventually reach a fully containerised analytics-as-a-service solution as you describe, but I don't know if that is achievable in the R4 timeframe.

          We should be able to leverage existing Dockerfiles and helm charts for some components like Kafka, OpenTSDB, etc. The HDFS/Spark deployment and the data storage management is the harder part.