This will include, for example, credential usage, performance, resiliency, and testing requirements.

    • Support ONAP platform upgrade

      • How can we update a running ONAP instance in the field? Should this be part of the OOM scope, with help from the architecture subcommittee?
    • Support ONAP reliability

      • We have no well-defined approach to fault tolerance at either the component or the system level. Should this be part of the OOM scope, with help from the architecture subcommittee?
      • An instance of the complete ONAP platform shall be configurable to be part of an "N+1" group of instances that operate in a specific, coordinated, controlled fail-over manner. This coordinated fail-over enables the operator to select the idle "+1" instance in the group to functionally replace any of the operating, healthy "N" instances of the group on demand, within a specified amount of time. This approach shall be used as the common method for surviving disaster and also as the common approach to software upgrade distribution.

    NOTE: Failover does not always need to be operator-controlled. For example, if a component fails in the middle of a workflow, ONAP should be able to switch to another available instance of that component so that operations terminate correctly, without a human in the loop.
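
The operator-driven N+1 replacement described above can be illustrated with a small model. This is only a sketch under stated assumptions: the `Instance` and `NPlusOneGroup` classes, the role names ("active", "standby", "retired"), and the instance names are hypothetical and are not part of any ONAP or OOM API.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    # Hypothetical model of one complete ONAP platform instance.
    name: str
    role: str = "active"   # "active", "standby" (the idle +1), or "retired"
    healthy: bool = True


@dataclass
class NPlusOneGroup:
    instances: list

    def standby(self):
        """Return the idle, healthy +1 instance, or None if there is none."""
        for inst in self.instances:
            if inst.role == "standby" and inst.healthy:
                return inst
        return None

    def fail_over(self, name):
        """Replace the named active instance with the idle +1 instance."""
        target = next((i for i in self.instances if i.name == name), None)
        if target is None or target.role != "active":
            raise ValueError(f"{name!r} is not an active instance of this group")
        spare = self.standby()
        if spare is None:
            raise RuntimeError("no healthy standby instance available")
        target.role = "retired"   # taken out of service (disaster or upgrade)
        spare.role = "active"     # the +1 now carries the workload
        return spare


group = NPlusOneGroup([
    Instance("onap-a"),
    Instance("onap-b"),
    Instance("onap-spare", role="standby"),
])
promoted = group.fail_over("onap-a")
```

The same call could serve both cases named in the requirement: retiring an instance hit by a disaster, or retiring an instance so it can be upgraded and later rejoin the group as the new idle member. The time bound ("within a specified amount of time") is not modeled here.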
    • Support ONAP scalability

      • How should ONAP be scaled? Should this be part of the OOM scope, with help from the architecture subcommittee?

    • Support ONAP monitoring

      • Common logging formats and approaches need to be supported, and automated cross-component monitoring tools should be developed or provided

      • The ability to monitor and detect issues with the delivery of feeds and topics in DMaaP should be provided
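
As one illustration of what a common logging format could look like, the sketch below emits every log record as a single JSON object using only the Python standard library. The field names (`timestamp`, `component`, `level`, `logger`, `message`) and the component name are assumptions chosen for the example, not an agreed ONAP format.

```python
import json
import logging
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object with a fixed set of fields."""

    def __init__(self, component):
        super().__init__()
        self.component = component  # hypothetical component identifier

    def format(self, record):
        entry = {
            "timestamp": datetime.fromtimestamp(
                record.created, tz=timezone.utc
            ).isoformat(),
            "component": self.component,
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        return json.dumps(entry)


def make_logger(component):
    """Return a logger that writes JSON lines to stderr."""
    logger = logging.getLogger(component)
    handler = logging.StreamHandler()
    handler.setFormatter(JsonFormatter(component))
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    return logger
```

If every component used the same formatter, a cross-component monitoring tool could parse all logs with one schema instead of one parser per component.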
    • Support a Common ONAP security framework for authorization and authentication

    • Secure all secrets and keys (identity or otherwise) while they are in persistent memory or in use
      • Secrets such as passwords and keys are stored in the clear in current ONAP infrastructure components. Security breaches are possible if these secrets are not well protected. Many modern platforms support trusted execution environments. A security architecture for secrets needs to be defined and applied across all ONAP components, and possibly even across ONAP, VIM, site-specific controllers, and NFVI.
    • Support for ensuring that all ONAP infrastructure VMs/containers are brought up with the intended software
      • The ONAP infrastructure is itself a set of multiple services; at last count, there were more than 30. In addition, some of these services can also run at multiple sites for scalability and availability. For example, some DCAE components may run at various sites. If the underlying operating system, system software, or ONAP software is tampered with, the result could be misbehavior and, in the worst case, attackers taking control of the network. It is good practice to ensure that the ONAP services (containers or VMs) are brought up on servers with the intended firmware, OS, utilities, etc. TPM-based software attestation is one popular method and can be considered here.
    • Secure communication among the ONAP Services
      • Mutual-TLS (Certificate based authentication)
      • Automatic certificate provisioning for each ONAP microservice when it comes up.
      • Local CA support to provide certificates to ONAP services.
      • Storage of the microservice private key in a TEE, if available in the hardware.
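
To make the mutual-TLS item concrete, here is a minimal sketch of a server-side TLS context that refuses clients without a certificate, using Python's standard `ssl` module. The function name and its parameters are hypothetical; the file paths would point at a certificate and key issued by the local CA, and how those files are provisioned (or whether the key lives in a TEE) is outside the sketch.

```python
import ssl


def build_mtls_server_context(cert_file=None, key_file=None, ca_file=None):
    """Build a server-side context that requires a client certificate.

    All three arguments are hypothetical file paths: the service's own
    certificate and private key, and the local CA bundle used to verify
    peers. They are optional only so the policy can be inspected without
    real files; a working service needs all of them.
    """
    ctx = ssl.create_default_context(ssl.Purpose.CLIENT_AUTH)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    # CERT_REQUIRED on a server context is what makes the TLS "mutual":
    # the handshake fails unless the client presents a trusted certificate.
    ctx.verify_mode = ssl.CERT_REQUIRED
    if ca_file:
        ctx.load_verify_locations(cafile=ca_file)
    if cert_file:
        ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)
    return ctx


ctx = build_mtls_server_context()
```

A service would pass the returned context to its listener, for example via `ctx.wrap_socket(sock, server_side=True)`.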
    • CA service for VNF certificate enrollment
      • Many VNFs need to communicate with other VNFs securely over TLS. Since mutual TLS requires certificates, a CA service is needed from which VNFs can enroll for certificates.
      • CA service to provide X.509v3 certificates as one of the ONAP services.
      • CA service to provide an OCSP service.
      • Ability for VNFs (as part of the VNF SDK?) to request a certificate from the CA service by providing a PKCS #10 CSR.
      • Ability for VNFs (as part of the VNF SDK?) to check certificate validity.
    • API Versioning

      • What are the detailed rules for API deprecation? We did commit at the NJ F2F that APIs can’t just change between releases, but we never agreed on the actual process or timeline surrounding API changes.

      • APIs for the current release and, in addition, the two most recent prior releases shall be supported. Where an API has changed relative to those two prior releases, the deprecated, unchanged functions of those releases shall still be supported.
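
The "current release plus two prior releases" rule above can be expressed as a simple support window. This is a sketch only; the function names and the release labels "R1" to "R4" are hypothetical placeholders.

```python
def supported_releases(releases, prior_count=2):
    """Return the releases whose APIs must still be supported.

    `releases` is ordered oldest to newest; the last entry is the current
    release. The window is the current release plus `prior_count` priors.
    """
    return releases[-(prior_count + 1):]


def is_supported(api_release, releases):
    """True if an API introduced or last changed in `api_release` must still work."""
    return api_release in supported_releases(releases)


history = ["R1", "R2", "R3", "R4"]   # hypothetical release labels
window = supported_releases(history)
```

With four releases in the history, the window covers the three newest, so an API frozen at "R1" could be dropped while one from "R2" must keep working, possibly marked as deprecated.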

    • Support ONAP Portal Platform enhancements

      • What new applications need to be onboarded onto the Portal Platform?
      • What Portal SDK enhancements are needed for use-case developers?
      • Adoption by all applications of the centralized role management that is now provided by the Portal Platform.
    • Software Quality Maturity

      • The ONAP platform is composed of many components, each exhibiting a varying degree of "software maturity." A proposed requirement here is that, before software updates for a particular component are accepted as part of the complete ONAP solution, it must be shown that the software for that component meets or exceeds specific quantitative measures of software reliability. That is, "Software Releasability Engineering" (SRE) analysis data should be computed and disclosed for each component. This data is computed by collecting time-series data from the testing process (testing the published use cases) and fitting test failure counts to a family of curves known to track defect density over the software lifetime. Information on this type of quantitative analysis is here: https://drive.google.com/open?id=0By_UqQM0rEuBei10TVdTOU5CalU A particular tool that may be used to compute this SRE data is open source and available entirely in a Docker container here: https://cloud.docker.com/swarm/ehwest/repository/docker/ehwest/sre/general Numerous papers in the literature explain the use of Markov Chain Monte Carlo methods to fit test data to the target curve.
    • Documentation and a toolkit for using and adapting ONAP

    • Support for a common resiliency platform 

      • Making ONAP resilient is challenging for three broad reasons: (1) ONAP services need to be replicated both within and across sites, with efficient failover mechanisms. This is a challenging problem because of WAN latencies and frequent network partitions. (2) The clients of ONAP are spread across the world and may have locality and high-throughput needs. Hence, the load needs to be distributed and/or federated across the different replicas of the ONAP services to satisfy such needs. (3) ONAP services have a diverse range of replication and resiliency requirements. While some components need to carefully manage state across replicas, others may be stateless. Similarly, some ONAP services have strict requirements on how the load should be shared across replicas.

        To address these complex resiliency challenges, different ONAP teams are building their own carefully reasoned, handcrafted solutions, which wastes considerable resources. A proposed requirement here is to systematically understand and model the common resiliency concerns across ONAP services and to provide a resiliency platform with the necessary building blocks to address these concerns. Each individual ONAP service can then use this common resiliency platform to create design patterns for resiliency as required.
