Summary: Edge Scoping
Distributed Edge Cloud Infrastructure Object Hierarchy (Stretch Goal)
Value:
- Fine grained resource management & analytics for Distributed Edge Clouds
References:
- Infrastructure Modelling: ONAP R3+ Cloud Infrastructure Modeling; Cloud Infrastructure Aggregate Representation Classes
- MULTICLOUD-153
ONAP Component | Life Cycle Phase | Enhancements |
---|---|---|
Multi-Cloud | Deploy | Support Distributed Cloud Infrastructure Capability Discovery (Note 1, Note 2) |
A&AI | Deploy | Support Standardized Distributed Cloud Infrastructure Object Hierarchy & Capability Database (Ref. 1) |
OOF | Deploy | Execute Distributed Cloud Infrastructure Placement Policies for Optimized Service/VNF Placement across Cloud Regions (Note 3, Note 4) |
SO | Deploy | Extend SO ↔ OOF API to support data opaque to SO (Note 5); Extend SO ↔ MC API to support data opaque to SO (Note 6) |
Assumption for Policy, SO, OOF:
- This uses the current Generic VNF workflow in SO
Note 1:
- Configured Capacity and Utilized (or Currently Used) Capacity are managed by the specific cloud.
Note 2:
- Cloud SW Capability example
- Cloud region "x" with SR-IOV, GPU, Min-guarantee support
- Cloud region "y" with SR-IOV support
- Cloud HW Capability example
- Resource cluster "xa" in Cloud region "x" with SR-IOV and GPU support
- Resource cluster "xb" in Cloud region "x" with GPU support
- Resource cluster "ya" in Cloud region "y" with SR-IOV support
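The Note 2 example can be sketched as a small two-level structure: SW capabilities per cloud region and HW capabilities per resource cluster. This is only an illustration; the dictionary layout and the `matching_clusters` helper are assumptions and do not reflect the A&AI schema or any Multi-Cloud API.

```python
# Hypothetical in-memory view of the Note 2 example (not an A&AI schema).
REGIONS = {
    "x": {
        "sw_capabilities": {"SR-IOV", "GPU", "Min-guarantee"},
        "clusters": {
            "xa": {"SR-IOV", "GPU"},
            "xb": {"GPU"},
        },
    },
    "y": {
        "sw_capabilities": {"SR-IOV"},
        "clusters": {
            "ya": {"SR-IOV"},
        },
    },
}


def matching_clusters(regions, required_sw, required_hw):
    """Return (region, cluster) pairs where the region offers every required
    SW capability and the cluster offers every required HW capability."""
    required_sw = set(required_sw)
    required_hw = set(required_hw)
    matches = []
    for region_name, region in sorted(regions.items()):
        if not required_sw <= region["sw_capabilities"]:
            continue  # region-level (SW) capability missing
        for cluster_name, hw_caps in sorted(region["clusters"].items()):
            if required_hw <= hw_caps:
                matches.append((region_name, cluster_name))
    return matches
```

For instance, a workload that needs SR-IOV in both SW and HW matches clusters "xa" and "ya", while one that needs Min-guarantee support in SW and a GPU in HW matches only clusters in region "x".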
Note 3:
- 5G Service/VNF placement example
- Constraints used by Optimization Framework (OOF)
5G CU-UP VNF location to be fixed to a specific physical DC based on 5G DU, bounded by a max distance from 5G DU
- Optimization Policy used by OOF
Choose optimized cloud region (or instance) for the placement of 5G CU UP for subscriber group based on the above constraints
Note 4:
- For the 5G Service/VNF placement example in Note 3
- 5G CU-UP VNF preferably maps to a specific Cloud region & Physical DC End Point
Note 5:
- For the 5G Service/VNF placement example in Note 3
- OOF will pass the Physical DC End Point to SO as opaque data
Note 6:
- For the 5G Service/VNF placement example in Note 3
- SO passes the Physical DC End Point to Multi-Cloud as opaque data, in addition to the Cloud Region
Cloud-agnostic Placement/Networking & Homing Policies (Phase 1 - Casablanca MVP, Phase 2 - Stretch Goal)
End-to-end use case Applicability:
All (especially the data plane VNFs with fine-grained VNF placement and high performance networking requirements)
Value:
Improve "workload deployability" by avoiding exposure of "cloud specific" capabilities to several ONAP components and addressing "separation of concerns"
Support capacity check (besides capability check) for HPA resources
Applicable to all workloads - VM-based or Container-based
- MULTICLOUD-272
Phase 1 Summary:
- Multi-Cloud Policy Framework
- Assist OOF in target cloud region selection for VNF placement (aka homing) by summarizing cloud-specific capability, capacity & cost metrics (e.g. VMs could have different cost in different clouds, Infra HA for VMs in a VNF could have different cost in different clouds)
Cloud Agnostic Intent (Policy) Execution Workflow - Steps 1- 6
- Dynamically modify the cloud specific VNF deployment template based on cloud-specific realization of the specified intent (e.g. Infra HA for VMs within a VNF could have different realizations across different clouds)
Cloud Agnostic Intent (Policy) Execution Workflow - Step 7
Intent Support
Single realization option per Cloud Region for the specified Intent
- Major Impact Projects:
- Multi-Cloud (Highest), OOF
- Minor Impact Projects:
- A&AI, SO
- End-to-end use case demonstration:
- vCPE, vDNS
Phase 2 Summary (Build on Phase 1 Work):
- Multi-Cloud Policy Framework
- Dynamically modify the cloud specific VNF deployment template based on cloud-specific realization of the specified intent – Impact to VNF configuration
- E.g. High performance Intra-DC data plane networking with several realization choices
- Intent Support
- Multiple realization options per Cloud Region for the specified Intent
- Major Impact Projects:
- Multi-Cloud
- Minor Impact Projects:
- OOF, GNF Controller
References:
The sequence diagram below expands "Multi-Cloud/VNFM Deploy Apps" in Edge Scoping Sequence Diagram
Cloud Agnostic Intent (Policy) Execution Workflow Summary:
Follow ups:
- Policy DB – are there any restrictions on storing JSON objects?
- Matti to follow up with Ankit
- Intent – "Infrastructure Resource Isolation for VNF" – { "qosProperty": { {"Burstable QoS": "TRUE", "Burstable QoS Oversubscription Percentage": "25"} } }
- Only certain pre-defined over-subscription values are allowed to simplify implementation
Private Cloud Setup - OpenStack-based
- Pre-defined (including custom) flavors map to Instance types in Public Clouds
- Pre-defined flavors are created by the Cloud Admin before the Cloud is used by ONAP for workload deployment
- VMware VIO Configuration for Min Guarantee feature
VNFC to Instance Type Mapping
- One or more VNFCs (e.g. vCPE VGW) could map to an Instance Type
Operator Configuration – Multi-VIM/Cloud Plugin
The operator/service provider who uses ONAP will choose which VIMs to use and include the appropriate MultiVIM plugins in their ONAP deployment. For example, let’s assume they pick private OpenStack, private VMware, and public Azure as the platforms to run their services on.
For each MultiVIM plugin, operator configures the following information:
- For each Instance Type, the cost of that VM to the operator. Note that this cost includes the (potentially discounted) list price for the VM, support cost, and operations cost. The last one is definitely operator-specific.
- The operator also specifies the cost of each feature (HA, etc.).
- Note that the operator is free to choose the time duration over which the cost metric is specified (e.g., cost per hour, cost per month), as long as the same duration is used consistently for each of the VIMs.
Workflow Details
1. SO → OOF - Get Target <Cloud Owner, Cloud Region> for the Service Instances
2. OOF → Policy - Fetch Enhanced Capacity Check & Cloud Selection Policy for Homing
2a) OOF Processing - the fetched Policy is stored in a local data structure and is available for further use (need OOF code changes).
3. OOF → A&AI - Fetch Cloud-Agnostic (Standardized) Capabilities for the Service Instance
3a) OOF Processing - Perform Cloud Agnostic Capability check for each <cloud owner, cloud region>
4. OOF → MC - Push Cloud Agnostic Policy for the Service Instance - perform Cloud Specific Check (capability/capacity/cost metrics) for each registered Cloud Region in Multi-Cloud
4a) OOF Processing
The enhanced OOF ↔ MC capacity check API, described below, is filled based on the enhanced Capacity Check & Cloud Selection Policy for Homing retrieved in step 2) – need OOF code changes.
5a) MC Processing (need MC code changes)
For each cloud owner
- Instance Type Handling
- Instance Type is passed in the capacity check API from OOF (Discuss) //Note, SO → MC passes OpenStack flavor name in the Heat Template/Env file
- Convert to appropriate instance type based on intent //e.g. "Infrastructure Resource Isolation for VNF" may result in a different instance type if the cloud owner supports "Burstable QoS"
- Parse OOF → MC Policy API
- For each cloud region // Public cloud could have different costs in different geographic locations
- net_value_cost = net_value_cost + cost_instance_type // cost per instance type is based on policy (for R3, it is picked up from Multi Cloud configuration file)
- net_value_cost = net_value_cost + cost_intent //e.g. "Infrastructure High Availability (HA) for VNF" may have additional cost
- Capacity Check
- Private Clouds (OpenStack based)
- Perform capacity check per specified Tenant (OpenStack Project)
- If Capacity check fails, drop the cloud region out of the candidate list
- Public Clouds or Other Clouds
- Capacity check always succeeds //assumption: public cloud has infinite capacity
5. MC → OOF – Return a net value cost for each <cloud owner, cloud region> if the capacity check succeeds
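The MC processing in step 5a and the result returned in step 5 could be sketched roughly as follows. The `CLOUD_REGIONS` table stands in for the Multi-Cloud configuration file mentioned for R3; its layout, the region names, the "medium" instance type, the cost figures, and the `tenant_has_capacity` callback are all assumptions made for illustration, not the actual OOF ↔ MC API.

```python
HA_INTENT = "Infrastructure High Availability (HA) for VNF"

# Hypothetical stand-in for the per-region cost data in the Multi-Cloud
# configuration file; all names and numbers are illustrative.
CLOUD_REGIONS = {
    ("private-openstack", "region-a"): {
        "private": True,
        "instance_cost": {"medium": 10.0},
        "intent_cost": {HA_INTENT: 2.0},
    },
    ("private-openstack", "region-b"): {
        "private": True,
        "instance_cost": {"medium": 8.0},
        "intent_cost": {HA_INTENT: 2.0},
    },
    ("public-azure", "region-c"): {
        "private": False,
        "instance_cost": {"medium": 15.0},
        "intent_cost": {HA_INTENT: 1.0},
    },
}


def net_value_costs(regions, instance_type, intents, tenant_has_capacity):
    """Return {(cloud owner, cloud region): net value cost} for every region
    that passes the capacity check."""
    costs = {}
    for key, region in regions.items():
        # Public/other clouds: the capacity check always succeeds
        # (assumption from the text: public cloud has infinite capacity).
        if region["private"] and not tenant_has_capacity(key):
            continue  # drop the cloud region from the candidate list
        net_value_cost = region["instance_cost"][instance_type]
        for intent in intents:
            net_value_cost += region["intent_cost"].get(intent, 0.0)
        costs[key] = net_value_cost
    return costs
```

In a real plugin, `tenant_has_capacity` would query the tenant (OpenStack project) quota and usage; here it is just a callable so the sketch stays self-contained.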
6a) OOF Processing - cloud_net_value input in Multi-objective Optimization (need OOF code changes)
Each service specifies a service-specific objective function that is stored as part of the service-specific policy and is used by OOF to evaluate the candidate <cloud owner, cloud region>. For simplicity, consider a service that consists of only one VNF instance. The objective function has two components:
- distance from customer location to the VNF - the service designer assigns a weight for the distance: wd
- the cost of deploying the VNF in a location - the service designer assigns a weight for the cost: wc
OOF optimization function: min (wd*distance + wc*cloud_net_value)
If the service does not care about the cost at all, it would set wc = 0. If the service designer wants to minimize cost only, they could set wd = 0. Note that candidates that are too far away can be eliminated by a distance constraint even before the optimization. For example, if the service has a distance constraint of at most 100 kilometers, then only those <cloud owner, cloud region> within 100 kilometers of the customer location would be considered in the objective function evaluation.
If the service designer wants to trade off between distance and cost, they might set, for example, wd = 1, wc = 2. This would mean that a $1 increase in price weighs as much as a 2-kilometer increase in distance.
<cloud owner, cloud region> Candidate 1: $100, 100 kilometers => value: 300
<cloud owner, cloud region> Candidate 2: $150, 80 kilometers => value: 380
<cloud owner, cloud region> Candidate 3: $50, 190 kilometers => value: 290 <- pick this one
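The three candidate values above can be reproduced with a few lines of Python. This is a minimal sketch of the weighted-sum objective only; the function names and the optional `max_distance_km` filter are illustrative assumptions, not OOF code.

```python
def objective(distance_km, cost, wd, wc):
    """Weighted sum minimized by the optimization: wd*distance + wc*cost."""
    return wd * distance_km + wc * cost


# (name, cost in $, distance in km), taken from the example above.
CANDIDATES = [
    ("Candidate 1", 100, 100),
    ("Candidate 2", 150, 80),
    ("Candidate 3", 50, 190),
]


def pick_candidate(candidates, wd, wc, max_distance_km=None):
    """Apply the optional distance constraint first, then return the
    candidate with the smallest objective value."""
    eligible = [
        c for c in candidates
        if max_distance_km is None or c[2] <= max_distance_km
    ]
    return min(eligible, key=lambda c: objective(c[2], c[1], wd, wc))


scores = {name: objective(dist, cost, wd=1, wc=2) for name, cost, dist in CANDIDATES}
best = pick_candidate(CANDIDATES, wd=1, wc=2)
```

With wd = 1 and wc = 2 and no distance constraint, `scores` holds 300, 380 and 290, and `best` is Candidate 3. If the 100-kilometer distance constraint mentioned earlier were also applied (`max_distance_km=100`), Candidate 3 would be filtered out before the optimization and Candidate 1 would be selected instead.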
6. OOF → SO - Return the target <cloud owner, cloud region> for the Service Instance
7. SO → MC - Deploy VNF template in the target <cloud owner, cloud region> for the Service Instance
7a) MC Processing (need MC code changes)
- Parse Template (e.g. OpenStack Heat Template)
- For each VNFC, instance type in the template
- Fetch Cloud-Agnostic Workload Deployment Policy (Intent) based on <Service (e.g. vCPE), VNFC (e.g. vGW)>
- Value/Content: <Policy JSON>
- Parse Policy JSON
- Modify template according to Intent - intent examples below
- "Infrastructure High Availability (HA) for VNF"
- "Infrastructure Resource Isolation for VNF"
- "Burstable QoS"
Policy (Intent) Realization
- "Infrastructure High Availability (HA) for VNF"
- OpenStack-based Cloud realization
- For R3, Host-based anti-affinity using server groups //Beyond R3, Support other anti-affinity models at availability zone level etc.
- Notes on implementation:
- Instance "count" in heat template specifies VNFC scale out factor
- While dynamic injection of a server group into the Heat template is ideal, a simple starting point could be to switch to an alternate Heat template that is identical to the deployment template but additionally has a server group
- Azure realization
- Availability Set?
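For the OpenStack-based realization, the "dynamic injection" option could look roughly like the sketch below, which operates on a Heat template that has already been parsed into a Python dictionary. It adds an `OS::Nova::ServerGroup` resource with the `anti-affinity` policy and points each server at it through `scheduler_hints`. The function name and the group name are assumptions, and the sketch only handles top-level `OS::Nova::Server` resources; servers nested inside a resource group that uses "count" would need extra handling.

```python
import copy


def add_anti_affinity(template, group_name="vnfc_server_group"):
    """Return a copy of a parsed Heat template in which every top-level
    OS::Nova::Server is placed in a host-based anti-affinity server group."""
    result = copy.deepcopy(template)
    resources = result.setdefault("resources", {})
    servers = [
        name for name, res in resources.items()
        if res.get("type") == "OS::Nova::Server"
    ]
    if not servers:
        return result  # nothing to modify

    resources[group_name] = {
        "type": "OS::Nova::ServerGroup",
        "properties": {"policies": ["anti-affinity"]},
    }
    for name in servers:
        props = resources[name].setdefault("properties", {})
        hints = props.setdefault("scheduler_hints", {})
        hints["group"] = {"get_resource": group_name}
    return result
```

The input template is left untouched, which makes it easy to fall back to the original deployment template if the intent does not apply to the selected cloud.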
"Infrastructure Resource Isolation for VNF" – { "qosProperty": { {"Burstable QoS": "TRUE", "Burstable QoS Oversubscription Percentage": "25"} } }
OpenStack-based VMware VIO Cloud realization
- This can be achieved through min guarantee -- Max or limit (upper bound) & Min or Reservation (guarantee) are part of OpenStack flavor metadata
- Example
- VNFC with "Guaranteed QoS"
- "flavor-xyz-no-oversubscription"
- vCPU (Min/Max) - 16, Mem (Min/Max) - 32GB
- Same VNFC with "Burstable QoS", 25% over-subscription
- "flavor-xyz-25-percent-oversubscription"
- vCPU (Min) - 16, Mem (Min) - 32GB
- vCPU (Max) - 20, Mem (Max) - 40GB
- Only certain pre-defined over-subscription values are allowed to simplify implementation
- Notes on implementation:
- While dynamic injection of limit/reservation into the flavor is ideal, a simple starting point would be to switch to a pre-defined flavor in the environment file
- For aforementioned example
- Original flavor - "flavor-xyz-no-oversubscription"
- Modified flavor based on Policy - "flavor-xyz-25-percent-oversubscription"
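The flavor switch described above could be sketched as a simple lookup keyed by the original flavor and the requested over-subscription percentage. The `FLAVOR_MAP` table, the `vgw_flavor_name` parameter, and the function names are hypothetical; only the two flavor names and the policy keys come from the example. The environment file is represented as an already-parsed dictionary with a `parameters` section.

```python
# Hypothetical mapping of (original flavor, over-subscription %) to the
# pre-defined flavor created by the Cloud Admin.
FLAVOR_MAP = {
    ("flavor-xyz-no-oversubscription", "25"): "flavor-xyz-25-percent-oversubscription",
}


def select_flavor(original_flavor, qos_property):
    """Return the flavor to deploy for the given QoS intent properties."""
    if qos_property.get("Burstable QoS") != "TRUE":
        return original_flavor  # guaranteed QoS: keep the original flavor
    percentage = qos_property.get("Burstable QoS Oversubscription Percentage")
    try:
        return FLAVOR_MAP[(original_flavor, percentage)]
    except KeyError:
        # Only certain pre-defined over-subscription values are allowed.
        raise ValueError(
            f"no pre-defined flavor for {original_flavor!r} at {percentage!r}%"
        )


def apply_to_environment(env, flavor_param, qos_property):
    """Return a copy of a parsed environment file with the flavor swapped."""
    params = dict(env.get("parameters", {}))
    params[flavor_param] = select_flavor(params[flavor_param], qos_property)
    return {**env, "parameters": params}
```

For example, applying `{"Burstable QoS": "TRUE", "Burstable QoS Oversubscription Percentage": "25"}` to an environment whose `vgw_flavor_name` parameter is "flavor-xyz-no-oversubscription" yields "flavor-xyz-25-percent-oversubscription", while an unsupported percentage raises an error instead of silently deploying the wrong flavor.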
7b) Policy (Intent) Database
- For R3, the Cloud-Agnostic Workload Deployment Policy (Intent) can be stored in the form of configuration file(s) in the OOM K8S Persistent Volumes to simplify implementation.
Cloud Resource Partitioning for Differentiated QoS (Combined with Previous)
Value:
- Applicable to all use cases
- Casablanca Targets:
- vCPE (Enable Tiered service offering); 5G Network Slicing (Stretch Goal)
References:
Edge Automation Requirement:
Support three types of slices in the Cloud Infrastructure (Definition Reference: https://kubernetes.io/docs/tasks/configure-pod-container/quality-service-pod/)
- Guaranteed Resource Slice (hard isolation) for various infra Resources (CPU/Memory/Network)
- Max (limit), Min (request) are the same; resource guarantee is "Max"
- Maps to 5G Applications such as Connected Car which fall in the category of ultra-reliable machine-type communications (ref. 1)
- Burstable Resource Slice (soft isolation) for various infra Resources
- Min (request) <= Max (limit); resource guarantee is "Min"
- Maps to Burstable Network Slices such as > 1 Gbps broadband, which fall in the category of extreme mobile broadband (ref. 1)
- Best Effort Resource Slice (no isolation) for various infra Resources
- No Min (request) ; resource guarantee is "None"
- Maps to 5G Applications such as IoT which fall in the category of massive machine-type communications (ref. 1)
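The three slice types follow directly from the Min (request) and Max (limit) values, mirroring the Kubernetes QoS classes referenced above. A minimal sketch of that rule, with a hypothetical function name and `None` standing for "no Min (request) specified", might be:

```python
def slice_type(request, limit):
    """Classify a resource slice from its Min (request) and Max (limit).

    request is None when no Min is specified; otherwise both values are
    numbers in the same unit (e.g. vCPUs or GB of memory).
    """
    if request is None:
        return "Best Effort"   # no Min: resource guarantee is "None"
    if request > limit:
        raise ValueError("Min (request) must not exceed Max (limit)")
    if request == limit:
        return "Guaranteed"    # Min == Max: resource guarantee is "Max"
    return "Burstable"         # Min < Max: resource guarantee is "Min"
```

Using the flavor example from the previous section, 16 vCPUs (Min) with 16 vCPUs (Max) is a Guaranteed slice, and 16 vCPUs (Min) with 20 vCPUs (Max) is a Burstable slice.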
Implementation:
- Leverage current HPA framework with appropriate extensions
References:
- https://metis-ii.5g-ppp.eu/wp-content/uploads/white_papers/5G-RAN-Architecture-and-Functional-Design.pdf
- Driving Superior Isolation for Tiered Services using Resource Reservation -- Optimization Policies for Residential vCPE: https://jira.onap.org/browse/OPTFRA-240
Note:
- Any VMs/Containers which are part of a resource slice will adhere to the specs of the resource slice
ONAP Component | Life Cycle Phase | Enhancements |
---|---|---|
Policy | Design | Configuration Policies for Guaranteed, Burstable & Best Effort Cloud Infrastructure Resource Slices (this will apply to VMs/Containers also); Placement Policies for Resource Slices |
Multi-Cloud | Deploy | Resource Slice Capability Discovery |
A&AI | Deploy | Resource Slice Capability per Cloud Region; Resource Slice Type |
OOF | Deploy | Execute Resource Slice Placement Policies for Optimized Service/VNF Placement across Cloud Regions |
Aggregated Infrastructure Telemetry Streams (Aligns with HPA requirements, Combining efforts with HPA)
Value
Edge Infrastructure Analytics complementing 5G VNF Analytics
- MULTICLOUD-254
As in R2, ONAP collects statistics/alarms/events from workloads (VMs) and takes closed-loop control actions such as healing a process, scaling out, or restarting. In R3, infrastructure-related statistics/alarms/events will also be collected to generate actionable insights and take life cycle actions on the workloads. Infrastructure statistics normally include performance counters, NIC counters, and IPMI information on a per-physical-server-node basis. To reduce the load on ONAP, edge clouds need to send aggregated (summarized) information to ONAP.
As part of this activity, the intention is to create an aggregation micro-service that collects data from physical nodes (over collectd and other mechanisms), aggregates the information (time-based aggregation, threshold-based aggregation, silencing, etc.) based on configurable rules, and exports the aggregated data to DCAE. This micro-service can be instantiated by ONAP itself using OOM (one or more instances for edge clouds at ONAP-central), instantiated at the edge cloud using its own deployment tools, or deployed by edge service providers at the regional site level.
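To make the aggregation idea concrete, the sketch below combines time-based and threshold-based aggregation: per-node samples are grouped into fixed time windows, and only windows whose average reaches a configured threshold are kept for export. The sample format, the rule parameters, and the record layout are assumptions made for illustration; the actual micro-service design and the DCAE export format are not specified here.

```python
from collections import defaultdict


def aggregate(samples, window_s, threshold):
    """Summarize raw samples per (time window, node).

    samples: iterable of (timestamp in seconds, node name, value).
    Returns one record per window/node whose average is >= threshold,
    ordered by window start and node name.
    """
    buckets = defaultdict(list)
    for timestamp, node, value in samples:
        window_start = timestamp // window_s * window_s
        buckets[(window_start, node)].append(value)

    records = []
    for (window_start, node), values in sorted(buckets.items()):
        average = sum(values) / len(values)
        if average < threshold:
            continue  # below threshold: not worth sending to ONAP
        records.append({
            "window_start": window_start,
            "node": node,
            "avg": average,
            "max": max(values),
            "count": len(values),
        })
    return records
```

With 60-second windows and a threshold of 20, a node reporting 10 and 30 within the first minute produces a single record with an average of 20, while a node that stays at 5 produces nothing, so ONAP receives far fewer messages than the edge cloud collects.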
Impacted projects (development activities)
ONAP Component | Enhancements |
---|---|
Overall | |
Multi-Cloud | |
AAI & ESR | |
PORTAL | ESR portal related changes to take information about the edge-cloud (CA Cert and UN/PWD information) |
DCAE & DMaaP | None expected?? |
Life Cycle stages related functions
ONAP Component | Life cycle phase | Activities |
---|---|---|
AAI and ESR | Deploy & Run time | |
AAI and ESR | Run time | |
Multi-Cloud | Run time | |
ONAP Edge Analytics with DCAE/DMaaP independent of closed loop (Beyond Casablanca)
Value
- 5G Analytics
ONAP Component | Life cycle phase | Enhancements |
---|---|---|
OOM - ONAP Central | Deploy | |
Multi-Cloud Deployment in Edge Cloud (Stretch Goal)
- MULTICLOUD-262
Value:
- Multi-Cloud service to assist in central A&AI scaling by caching A&AI data locally and syncing up with A&AI periodically