DMaaP Edge Deployment

NOTE: The 5G Edge Use Case was originally planned for Dublin, but it was deferred because an ONAP-wide multi-site (routing) solution was not implemented. This page therefore records work-in-progress notes as the DMaaP team experiments with various approaches.

The 5G Use Case has dependencies on DMaaP services. The 5G components (Data File Collector and 3GPP PM Mapper) will be deployed at the Edge, so DMaaP services should be made available to them to avoid a data flow path through the Central Kubernetes cluster. This 5G Use Case relies on both Data Router and Message Router.

This Use Case will help flesh out the requirements and techniques for DMaaP Edge deployments.

Definitions

Dublin introduces the notion of a multi-cloud deployment consisting of a single "central" kubernetes High Availability (HA) installation, and 0 or more "edge" kubernetes installations.
Geo-redundancy applies to a multi-site central k8s deployment. This shouldn't be confused with a multi-site ONAP deployment consisting of central and edge sites.

Assumptions

1. DMaaP will maintain a single set of Helm Charts in the oom/kubernetes repo. Said a different way, we will strive to not maintain separate DMaaP Central charts and DMaaP Edge charts.
  1. The DMaaP Helm charts will continue to be maintained as a single oom kubernetes directory, with sub-directories for each component.
2. The "central" site will always be deployed before any edge sites.
  1. The Edge deployment (and operation) will rely on central ONAP services (e.g. AAF)
  2. This will allow a human (at least) to capture any values representing central deployment details (such as a K8S gateway IP address)
3. All DMaaP components will continue to be deployed in the "central" k8s. The details of what components will be deployed at any Edge, and how they will be deployed, are the subject of this page.
4. An "edge" site can be deployed any time after the "central" site.
5. Not all edge sites need be deployed at the same time.
6. As a Platform Service, DMaaP will be deployed before any application/microservice.
7. SSL Server Certificates will be created in advance of deployment, and not generated at deployment time. (This is a feature for El Alto)
8. By convention, the kubernetes cluster name will be used as the name of the site.

Requirements

A Central-deployed DMaaP component must be able to route to an Edge-deployed component, and distinguish between the same component deployed at different Edge sites. Examples include:
1. dr-prov periodically sends provisioning info to each dr-node
2. A centrally-deployed dr-node may transfer a file to an Edge-deployed dr-node for delivery to a subscriber in that Edge, based on an egress rule
3. A central mirrormaker subscribes to an Edge-deployed message-router kafka
An Edge-deployed DMaaP component must be able to route to a central-deployed service. Examples include:
1. dr-node periodically syncs with dr-prov
2. dr-node authenticates publish requests using aaf
3. message-router authenticates client requests using aaf
4. dbc-client makes request to dmaap-bc API during post-install provisioning
5. Edge mirrormaker subscribes to central message-router kafka
Localized DR Routing between a Data File Collector (DFC) and a PM Mapper deployed in the same Edge X.
1. Localized DR Routing means DR Node is deployed in the same Edge site so data doesn't need to leave the site.
2. DFC will be a publisher to a feed provisioned at deployment time.
3. PM Mapper will be a subscriber provisioned at deployment time.
4. The feed should be unique per site so that when there are multiple sites, PM Mapper only receives its locally produced data.
Localized messaging from PM Mapper to DFC. This will signal DFC that a file was processed.
1. Localized messaging implies a Message Router instance in the same edge location.
2. PM Mapper will be a publisher provisioned at deployment time.
3. DFC will be a subscriber provisioned at deployment time.
4. Communication will utilize an authenticated topic in the MR deployed in the same edge site.
  1. PM Mapper and DFC will use AAF credentials to authenticate.
  2. PM Mapper identity will be authorized to publish on the topic
  3. DFC identity will be authorized to subscribe on the topic
Inter-site messaging from PM Mapper to VES perf3gpp
1. Inter-site messaging means sending a message from an edge location publisher to a central location subscriber.
2. PM Mapper, deployed at Edge, will be a publisher using AAF credentials
3. VES perf3gpp, deployed in Central, will be a subscriber using AAF credentials
4. Communication will utilize an authenticated topic on the MR deployed in the same edge site.
  1. PM Mapper and VES perf3gpp will use AAF credentials to authenticate.
  2. PM Mapper identity will be authorized to publish on the topic
  3. VES perf3gpp identity will be authorized to subscribe on the topic
5. Furthermore, messages on this topic will be replicated to the central MR instance.
6. Are there any other subscribers? (In particular, are there any others at the edge?)

Solution Options for Dublin

NOTE: planning for Dublin assumed that the AAI component would provide an API that served as a registry of each ONAP site. This did not happen.

This section is based on a discussion with Jack Lucas about possible approaches that we might consider within the Dublin feature set.

Ways to route to a k8s service in another k8s cluster:

Extend the configuration of Jack's proxy to include DMaaP services. Note: the current capability routes from edge to central. (See Jack's demo from ~ 0:29:40)
1. Include central deployed DMaaP services with existing node ports in proxy config: dr-prov, message-router, dmaap-bc (Completed: see https://gerrit.onap.org/r/#/c/87710/)
2. Expose central deployed DMaaP service on node port and add to proxy configuration: dr-node (Completed: see https://gerrit.onap.org/r/#/c/87710/)
3. NOTE: proxy can subsequently route by FQDN (for HTTP only).
K8S External Service. Deploy services at Edge which map to central services.
1. REF: https://cloud.google.com/blog/products/gcp/kubernetes-best-practices-mapping-external-services
Add entries for central services into /etc/hosts on Edge pods so they can route properly
Provision some external DNS service that is able to resolve to required IP addresses in other k8s cluster
1. Will require establishing a convention for FQDN, e.g. <Release>-<service>.<namespace>
2. Convention should leverage assumptions of using same value for Release and k8s cluster name.
Determine how clients can specify FQDN (service name) but designate IP address to use.
1. See --resolve option in curl for example of how this might work.
Apply k8s thinking to DMaaP component design:
1. Abandon the DR publish redirect protocol and simply use dr-node service instead.
  1. if dr-node is local to the cluster, then client will route to local dr-node pod for publishing (which is desired)
  2. if dr-node isn't local to cluster, then client will route to central dr-node via proxy (fallback)
2. Change dr-prov algorithm for distributing prov data to dr-node so dr-prov doesn't need to know how to address every pod
  1. consider simple periodic polling by dr-node
  2. consider using an MR topic to trigger dr-node to poll for prov data
3. Migrate to ELK design for logging, which removes the need for dr-prov to gather logs from each dr-node. (already in progress)
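
As an illustration of the "K8S External Service" option above, the following is a minimal sketch of a selector-less Service paired with a manually managed Endpoints object, deployed at the Edge and pointing at a central service. The service name, namespace, IP address, and port numbers are all placeholders chosen for illustration; none of them are taken from the DMaaP charts.

```yaml
# Sketch only: names, namespace, IP, and ports below are placeholders.
apiVersion: v1
kind: Service
metadata:
  name: dmaap-dr-prov        # name that edge pods resolve locally
  namespace: onap            # assumed namespace
spec:
  # No selector: Kubernetes does not manage endpoints for this Service,
  # so the Endpoints object below supplies them by hand.
  ports:
    - name: https
      protocol: TCP
      port: 8443             # port edge clients connect to
      targetPort: 30269      # hypothetical NodePort of dr-prov on the central cluster
---
apiVersion: v1
kind: Endpoints
metadata:
  name: dmaap-dr-prov        # must match the Service name
  namespace: onap
subsets:
  - addresses:
      - ip: 192.0.2.10       # placeholder for a central node or gateway IP
    ports:
      - name: https          # must match the Service port name
        port: 30269
        protocol: TCP
```

With something like this in place, an edge client could keep using the in-cluster service name while traffic is sent to the central cluster. The IP address would have to be captured from the central deployment beforehand, in line with the assumption that central is always deployed first.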

Upon review of this list, some concern was expressed about entertaining options that involve code changes given where we are in Dublin. Also, there is a desire for being directionally consistent with future ONAP OOM plans.

Subsequently, Fiachra Corcoran inquired at an OOM meeting about approaches consistent with future directions, and learned:

intent is to utilize Ingress Controllers
RKE deployment has Ingress Controller support (although selection of which Ingress Controller technology is not finalized)
Some useful notes:
- From Michael O'Brien(Amdocs, LOG) to Everyone: 10:09 AM
  default rke ingres https://git.onap.org/oom/tree/kubernetes/contrib/tools/rke/rke_setup.sh#n177 ingress: rancher/nginx-ingress-controller:0.21.0-rancher3 ingress_backend: rancher/nginx-ingress-controller-defaultbackend:1.4-rancher1
- From Michael O'Brien(Amdocs, LOG) to Everyone: 10:20 AM
  OOM-1598: Document a Highly-Available K8s Cluster Deployment (RKE 0.2.1 / K8S 1.13.5 / Helm (2.12.3 - not 2.13.1) / Docker 18.09.5)
Much of this is now under discussion in the Edge Automation Working Group. (meets Wednesdays at 11am EST)
Also, Fiachra and Mike Elliott agreed to continue discussion on how a DMaaP POC might proceed. Possible meeting next week.

Open Issues

REF	Status	Discussion
1	Open	DNS update for inter-site routing. We have several examples of an edge component which needs to communicate with a central service. Mike suggested that edge DNS might be updated such that edge clients could resolve central services. This might satisfy a common need across several components, e.g. access to central AAF. 05/02: Another alternative was demoed by DCAE, where an nginx container deployed at the edge site proxies service traffic to the relevant NodePort on the central k8s cluster. This may be suitable for some DMaaP components (as a POC) but is not a preferred solution. Work is ongoing in OOM to provide this (with input from the community); see OOM-1572.
2	Open	Location discovery. Bus Controller manages dcaeLocations as the names of different sites. What mechanism can be used to: a) register dcaeLocations when each k8s cluster is deployed; b) serve as an attribute when MR and DR clients are provisioned? Current expectation is that there is some k8s info in the A&AI API that might be useful. 05/02: Agreement from DCAE on the requirement to involve all ONAP components (AAI, OOF, etc.) to find a suitable solution here. The use case is defined in OOM-1579.
3	Closed	Relying on Helm chart enabled flag. 2/12: "Mike, Last week we discussed using a helm configuration override file to control which components get deployed at edge. The idea being we would set enabled: false for a component that shouldn’t be deployed. But dmaap chart actually consists of several sub-charts, each of these sub-charts correspond to a specific dmaap component which we may want to deploy at edge or not. So, curious if you know the syntax for this – I haven’t been able to find a reference for how enabled is actually used, and I don’t see that value referenced in our charts so not clear what is reading it. Wondering if our edge config override would be something like: dmaap: dmaap-message-router: enabled: true dmaap-bus-controller: enabled: false dmaap-dr-prov: enabled: false dmaap-dr-node: enabled: true or, do charts for our individual components need to be top level directories under oom/kubernetes in order to use the enabled flag?" 2/13: From Mike Elliott: "I’ve been trying to allow for the conditional control over the dr-prov and dr-node as well, with no success. Still investigating options for this. Hope to have a solution on this by EOD." 05/02: Current chart structure allows deployment of individual components (BC, MR, DR). One caveat is a dependency on AAF being reachable by BC & MR (DR soon to follow). See the DMaaP Deployment Guide - Dublin for more details.
4	Open	05/02: Helm chart edge deploy. POC procedure demoed using multiple "kube-config --contexts" to target the edge site/cluster during helm deploy. (Inter-cluster security may come into play here also.) "Edge charts" may require several override params to cater for the following: dcaeLocation (see issue 2); pod specs (size, resources, etc.); readiness configuration?; potential service endpoint changes/proxies?
5	Open	05/02: Need to identify whether all of the required services (logstash, AAF, dr-node, mr-kafka, etc.) have exposed NodePorts available for bi-directional traffic between sites.

Development

Helm configuration overrides will be collected in a single file (e.g. dmaap-edge.yaml) and delivered to oom/kubernetes/onap/charts/resource/environments. Examples of what kinds of overrides will be present in this file include:
1. Setting the standard enabled indicator to true for dmaap, but false for other components.
```
dmaap:
  enabled: true
```
2. Setting an edge indicator to drive any edge-specific logic. TBD if this is really useful - hopefully other overrides in this file are edge specific.
3. Setting the values for a central service which may be needed at the edge. Known examples include:
  1. Message Router must be configured to access the central AAF instance. (DR Node may have this requirement in the near future)
  2. Data Router Node must be configured to access the central DR Prov
  3. Both MR and DR Node must register with central Bus Controller
4. Setting scaling values appropriate to the edge. e.g. perhaps a single kafka broker is appropriate at the edge
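
To make the shape of such an override file more concrete, here is a hedged sketch of what dmaap-edge.yaml might contain. Only `dmaap.enabled` is shown on this page as an actual override. The sub-chart keys are taken from the proposal quoted under Open Issue 3 and may not match the real chart names, and disabling `aaf` at the edge is an assumption based on the edge relying on the central AAF.

```yaml
# dmaap-edge.yaml (sketch). Only dmaap.enabled appears on this page as an
# agreed override; every other key below is an assumption for illustration.
aaf:
  enabled: false             # assumption: edge relies on the central AAF
dmaap:
  enabled: true
  # Sub-chart toggles, using the names proposed in Open Issue 3.
  # They only take effect if the dmaap chart declares a matching
  # condition for each sub-chart dependency.
  dmaap-message-router:
    enabled: true
  dmaap-bus-controller:
    enabled: false
  dmaap-dr-prov:
    enabled: false
  dmaap-dr-node:
    enabled: true
```

A file like this would be passed to Helm as a values override (for example with the `-f` option) when deploying to the edge cluster. Values for central services and edge scaling would be added alongside these toggles once their key names are settled.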
DMaaP Chart changes
1. Reorder charts:
  1. Bus Controller must be up and running if other components are going to register with it. Jira to remove any dependencies on MR.
  2. MR
  3. Mirror Maker
  4. DR Prov
  5. DR Node (DR Prov must be up for Node to retrieve provisioning info)
2. Post-install hooks:
  1. Bus controller:
    1. POST <central dmaap-bc>/webapi/dmaap
    2. POST <central dmaap-bc>/webapi/dcaeLocation (for central)
  2. MR:
    1. POST <central dmaap-bc>/webapi/mr_clusters (DMAAP-534: Jira to add kafka brokers to endpoint)
  3. DR Node
    1. POST <central dmaap-bc>/webapi/dr_node (DMAAP-534)
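
One possible way to express a post-install hook of this kind is a Kubernetes Job annotated as a Helm hook. The sketch below registers a DR Node by POSTing to the Bus Controller. The image, host name, port, ConfigMap name, and payload file are all hypothetical placeholders, and authentication and TLS trust settings are omitted, so this shows the mechanism and is not a working chart template.

```yaml
# Sketch of a Helm post-install hook; image, host, port, and ConfigMap
# name are placeholders, not values taken from the DMaaP charts.
apiVersion: batch/v1
kind: Job
metadata:
  name: dmaap-dr-node-register
  annotations:
    "helm.sh/hook": post-install                  # run after the release is installed
    "helm.sh/hook-delete-policy": hook-succeeded  # clean up the Job once it succeeds
spec:
  backoffLimit: 3            # retry a few times if central is not yet reachable
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: register
          image: curlimages/curl:latest           # placeholder image providing curl
          command:
            - curl
            - --fail                              # non-zero exit on HTTP errors
            - -X
            - POST
            - -H
            - "Content-Type: application/json"
            - -d
            - "@/opt/payload/dr_node.json"        # request body read from the mounted file
            - "https://dmaap-bc.central.example:8443/webapi/dr_node"
          volumeMounts:
            - name: payload
              mountPath: /opt/payload
      volumes:
        - name: payload
          configMap:
            name: dmaap-dr-node-payload           # hypothetical ConfigMap holding dr_node.json
```

Keeping the request body in a ConfigMap lets the payload be driven from override values such as the dcaeLocation without changing the Job itself.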

Step-by-step guide

This outlines the approach for solving this Edge deployment, and will undoubtedly be refined over time.

Central K8S Deployment
Central DMaaP Deployment
1. Use k8s cluster name as the Release. e.g. "central"
2. Deploy aaf
3. ~~Deploy aai~~
4. Deploy dmaap
5. Deploy dcae
6. Deploy VES perf3gpp via dcae
Edge K8S Deployment
1. ~~Register Edge K8S deployment in AAI (how?)~~
2. Add dcaeLocation (for new Edge K8S) to DMaaP Bus Controller
Edge DMaaP Deployment
1. Update dmaap-edge.yaml configuration override file with values from central
2. Use k8s cluster name as the Release. e.g. "edge1"
3. Deploy dmaap
4. Deploy PM Mapper via dcae

Presentation

The following deck can be used to discuss the concepts on this page.
