Table of Contents

References

Survey of Options

Options Excluded

  • Conventional choices, but considered difficult to configure and maintain properly, and slow in performance
    • IPSec
    • OpenVPN
  • Others

Quick CVE Comparison

Expectations

  • Ability to secure the intra-ONAP communications, i.e. between ONAP projects, such as SO-to-AAI, UUI-to-MSB, OOF-to-VID, etc.
  • Ability to secure the ONAP-to-external-system communications, i.e. ONAP-to-database-cluster, ONAP-to-NetworkFunctions, ONAP-to-other-ONAP, etc.
  • Ability to scale with the defined ONAP projects (static per ONAP release)
  • Ability to scale with the number of deployed instances of ONAP VMs (dynamic)
  • Ability to scale with the number of deployed instances of ONAP pods (dynamic)
  • Ability to scale with the number of deployed instances of ONAP containers (dynamic)
  • Ability to scale with the number of deployed instances of ONAP processes (dynamic)
  • Ability to scale with the number of external-system connections (configurable)
  • Ability to work with HEAT-based deployment
  • Ability to work with OOM-based deployment
  • Ability to work with other (non-HEAT, non-OOM) deployment
  • Ability to operate with other layers of security
  • Ability to securely upgrade ONAP in-the-field
  • Ability for resilient and fault-tolerant ONAP communications in-the-field
  • Minimal efforts to implement across all ONAP projects
  • Minimal impact on resource usage and performance across ONAP

Threat Models

  1. External attacker analyzes the captured traffic among services to steal secrets such as passwords and certificates
  2. Internal attacker analyzes the captured traffic among services to steal secrets such as passwords and certificates
  3. External attacker bombards the container services with new connections, leading to a large number of forked processes and threads, which causes resource issues for other workloads (containers) in the system
  4. Internal attacker bombards the container services with new connections, leading to a large number of forked processes and threads, which causes resource issues for other workloads (containers) in the system
  5. External attacker exploits downloads of containers from repositories to tamper with them and inject malicious code
  6. Internal attacker exploits downloads of containers from repositories to tamper with them and inject malicious code
  7. External attacker introduces malicious VM into ONAP environment to steal data and subvert operations
  8. Internal attacker introduces malicious VM into ONAP environment to steal data and subvert operations
  9. External attacker introduces malicious pod into ONAP environment to steal data and subvert operations
  10. Internal attacker introduces malicious pod into ONAP environment to steal data and subvert operations
  11. External attacker introduces malicious container into ONAP environment to steal data and subvert operations
  12. Internal attacker introduces malicious container into ONAP environment to steal data and subvert operations
  13. External attacker introduces malicious process into ONAP environment to steal data and subvert operations
  14. Internal attacker introduces malicious process into ONAP environment to steal data and subvert operations
  15. External attacker introduces malicious external-system into ONAP environment to steal data and subvert operations
  16. Internal attacker introduces malicious external-system into ONAP environment to steal data and subvert operations

ONAP Operating Environment

Example from Cloud Native Deployment:

  • ubuntu@a-cd-one:~$ kubectl get pods --all-namespaces
    (shows 210 pods in onap namespace)


Type                                  VMs   Containers
Full Cluster (14 + 1) - recommended   15    248 total



Example from Open Wireless Laboratory (OWL) at Wireless Information Network Laboratory (WINLAB):

  • There are currently three Ubuntu 18.04 servers: node1-1, node2-1 and node2-2, which are managed by OpenStack.
    Node1-1 is the controller node, and node2-1 and node2-2 are compute nodes.
    We have installed ONAP using the OOM Rancher/Kubernetes instructions into five VMs.

Development

  • There is a transition from http ports to https ports, so that communications are protected by TLS encryption.
  • However, the transition is piecemeal and spread over multiple ONAP releases, so individual projects still have vulnerabilities due to intra-ONAP dependencies, e.g. OJSI-97.
  • A node-to-node VPN (working at the level of the VM or physical servers that host the Kubernetes pods/docker containers of ONAP) would provide blanket coverage of all communications with encryption.
  • A node-to-node VPN is both
    • an immediate stopgap solution in the short-term to cover the exposed plain text HTTP ports
    • an extra layer of security in the long-term to thwart unforeseen gaps in the use of HTTPS ports

Discussion

  • There has already been discussion and recommendation for using Istio https://istio.io/
    • Istio Envoy is deployed within each pod using sidecar-injection, then stays in the configuration when the pods are restarted
    • Istio Envoy probably appears within each pod as a network bridge, such as Kubernetes cluster networking bridge cbr0, thereby controlling all network traffic within the pod
    • Istio Envoy provides full mesh routing but can also provide control of routing with traffic management and policies
    • Istio Envoy also provides telemetry in addition to the security of mutual TLS authentication
    • Istio Citadel is run in the environment as the certificate authority / PKI supporting the mutual TLS authentication
    • Istio appears to have only a single overall security domain (i.e. the environment that includes Mixer, Pilot, Citadel and Galley), though it does contain many options to distinguish different services, users, roles and authorities


The sections below are for gathering thoughts on alternative solutions

Discussion of Tinc VPN

  • VPN appears to the IP level network code as a normal network device
  • Automatic full mesh routing. Regardless of how you set up the tinc daemons to connect to each other, VPN traffic is always (if possible) sent directly to the destination, without going through intermediate hops.
  • Easily expand your VPN. When you want to add nodes to your VPN, all you have to do is add an extra configuration file, there is no need to start new daemons or create and configure new devices or network interfaces
  • Ability to bridge ethernet segments. You can link multiple ethernet segments together to work like a single segment, allowing you to run applications and games that normally only work on a LAN over the Internet.
  • Runs on many operating systems and supports IPv6. Currently Linux, FreeBSD, OpenBSD, NetBSD, OS X, Solaris, Windows 2000, XP, Vista and Windows 7 and 8 platforms are supported. tinc has also full support for IPv6.
  • First, create the initial configuration files and public/private keypairs using the following command:

    tinc -n NETNAME init NAME

    Second, use ‘tinc -n NETNAME add ...’ to further configure tinc. Finally, export your host configuration file using ‘tinc -n NETNAME export’ and send it to those people or computers you want tinc to connect to. They should send you their host configuration file back, which you can import using ‘tinc -n NETNAME import’

  • Connections specified using ‘ConnectTo’ are so-called meta-connections. Tinc daemons exchange information about all other daemons they know about via these meta-connections. After learning about all the daemons in the VPN, tinc will create other connections as necessary in order to communicate with them.
  • Invitations are an easy way to add new nodes to an existing VPN. Invitations can be created on an existing node using the tinc invite command, which generates a relatively short URL which can be given to someone else, who uses the tinc join command to automatically set up tinc so it can connect to the inviting node
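
The way meta-connections spread knowledge of the VPN can be illustrated with a small Python sketch. It is only a model of the idea, not tinc code: it assumes each ‘ConnectTo’ entry behaves as a two-way link for exchanging information, and the node names and topology are hypothetical.

```python
from collections import deque


def known_daemons(connect_to, start):
    """Return the daemons that `start` can learn about over meta-connections.

    connect_to maps a node name to the list of nodes named in its
    ConnectTo entries (hypothetical data, not read from tinc.conf).
    """
    # Assumption of this sketch: information flows both ways over a
    # meta-connection, so each ConnectTo entry is an undirected edge.
    neighbours = {}
    for node, targets in connect_to.items():
        neighbours.setdefault(node, set())
        for target in targets:
            neighbours[node].add(target)
            neighbours.setdefault(target, set()).add(node)

    # Breadth-first search over the meta-connection graph.
    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for peer in neighbours.get(current, ()):
            if peer not in seen:
                seen.add(peer)
                queue.append(peer)
    return seen - {start}


# Hypothetical chain: node2_2 connects to node2_1, which connects to node1_1.
connect_to = {"node1_1": [], "node2_1": ["node1_1"], "node2_2": ["node2_1"]}
print(sorted(known_daemons(connect_to, "node2_2")))  # ['node1_1', 'node2_1']
```

In this model node2_2 learns about node1_1 even though it has no ‘ConnectTo’ entry for it, which is the precondition for tinc attempting a direct connection between the two.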


From https://www.tinc-vpn.org/pipermail/tinc/2017-May/004864.html:

In general however, I would advise against trusting other nodes, even with
StrictSubnets=yes. tinc is not currently designed to provide strong
protection against insider attacks - for the most part it assumes that
every node inside the metaconnection graph can be trusted. In my opinion
tinc will do poorly in a scenario where a "compromised node" is part of
your threat model.

Discussion of ZeroTier

  • ZeroTier One is a service that can run on laptops, desktops, servers, virtual machines, and containers to provide virtual network connectivity through a virtual network port much like a VPN client. It can also act as a network controller and as a federated root server.

  • After the service is installed and started, networks can be joined using their 16-digit network IDs. Each network appears as a virtual "tap" network port on your system that behaves just like an ordinary Ethernet port.

  • ZeroTier protocol is original, though aspects of it are similar to VXLAN and IPSec. It has two conceptually separate but closely coupled layers in the OSI model sense: VL1 and VL2. VL1 is the underlying peer to peer transport layer, the "virtual wire," while VL2 is an emulated Ethernet layer that provides operating systems and apps with a familiar communication medium.
  • VL1 is designed to be zero-configuration. A user can start a new ZeroTier node without having to write configuration files or provide the IP addresses of other nodes. It's also designed to be fast. Any two devices in the world should be able to locate each other and communicate almost instantly.
  • VL1 is organized like DNS. At the base of the network is a collection of always-present root servers whose role is similar to that of DNS root name servers. Roots run the same software as regular endpoints but reside at fast stable locations on the network and are designated as such by a world definition. World definitions come in two forms: the planet and one or more moons
  • There is only one planet. Earth's root servers are operated by ZeroTier, Inc. as a free service

  • A node can "orbit" any number of moons. A moon is just a convenient way to add user-defined root servers to the pool. Users can create moons to reduce dependency on ZeroTier, Inc. infrastructure or to locate root servers closer for better performance. For on-premise SDN use a cluster of root servers can be located inside a building or data center so that ZeroTier can continue to operate normally if Internet connectivity is lost

  • Since roots forward packets, A and B can reach each other instantly. A and B then begin attempting to make a direct peer to peer connection. If this succeeds it results in a faster lower latency link. We call this transport triggered link provisioning since it's the forwarding of the packet itself that triggers the peer to peer network to attempt direct connection
  • VL2 is a VXLAN-like network virtualization protocol with SDN management features. It implements secure VLAN boundaries, multicast, rules, capability based security, and certificate based access control. VL2 is built atop and carried by VL1
  • ZeroTier is available as a linkable or loadable library called libzt. What makes this different from the more familiar ZeroTier One service is that it comes bundled with its own network stack: lwIP, and it doesn't require special permissions on the system. You can now link ZeroTier into your application and access it over your virtual network as if it were a device all of its own. For simplicity, we've modeled its API after Berkeley Sockets.

  • ZeroTier offers paid service for network controllers or self-hosted network controllers
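
As a small illustration of the 16-digit network ID, the Python sketch below splits an ID into the controller's 10-digit VL1 ZeroTier Address and the 6-digit network number on that controller. The function name is hypothetical and the ID is used only as an illustrative value.

```python
def split_network_id(network_id):
    """Split a 16-hex-digit network ID into (controller address, network number)."""
    nwid = network_id.strip().lower()
    if len(nwid) != 16 or any(c not in "0123456789abcdef" for c in nwid):
        raise ValueError("expected 16 hexadecimal digits")
    # First 10 hex digits (40 bits): VL1 address of the network controller.
    # Last 6 hex digits (24 bits): network number on that controller.
    return nwid[:10], nwid[10:]


controller, number = split_network_id("8056c2e21c000001")
print(controller, number)  # 8056c2e21c 000001
```

This shows why a node that knows a network ID also knows which controller to ask for membership.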

Discussion of WireGuard

  • WireGuard aims to be as easy to configure and deploy as SSH. A VPN connection is made simply by exchanging very simple public keys – exactly like exchanging SSH keys – and all the rest is transparently handled by WireGuard. It is even capable of roaming between IP addresses, just like Mosh. There is no need to manage connections, be concerned about state, manage daemons, or worry about what's under the hood
  • WireGuard securely encapsulates IP packets over UDP. You add a WireGuard interface, configure it with your private key and your peers' public keys, and then you send packets across it. All issues of key distribution and pushed configurations are out of scope of WireGuard. In contrast, it more mimics the model of SSH and Mosh; both parties have each other's public keys, and then they're simply able to begin exchanging packets through the interface
  • WireGuard works by adding a network interface (or multiple), like eth0 or wlan0, called wg0 (or wg1, wg2, wg3, etc). This network interface can then be configured normally using ifconfig(8) or ip-address(8), with routes for it added and removed using route(8) or ip-route(8), and so on with all the ordinary networking utilities. The specific WireGuard aspects of the interface are configured using the wg(8) tool. This interface acts as a tunnel interface.
  • At the heart of WireGuard is a concept called Cryptokey Routing, which works by associating public keys with a list of tunnel IP addresses that are allowed inside the tunnel. Each network interface has a private key and a list of peers. Each peer has a public key. Public keys are short and simple, and are used by peers to authenticate each other. They can be passed around for use in configuration files by any out-of-band method, similar to how one might send their SSH public key to a friend for access to a shell server
  • The client configuration contains an initial endpoint of its single peer (the server), so that it knows where to send encrypted data before it has received encrypted data. The server configuration doesn't have any initial endpoints of its peers (the clients). This is because the server discovers the endpoint of its peers by examining from where correctly authenticated data originates. If the server itself changes its own endpoint, and sends data to the clients, the clients will discover the new server endpoint and update the configuration just the same. Both client and server send encrypted data to the most recent IP endpoint for which they authentically decrypted data. Thus, there is full IP roaming on both ends
  • WireGuard sends and receives encrypted packets using the network namespace in which the WireGuard interface was originally created. This means that you can create the WireGuard interface in your main network namespace, which has access to the Internet, and then move it into a network namespace belonging to a Docker container as that container's only interface. This ensures that the only possible way that container is able to access the network is through a secure encrypted WireGuard tunnel
  • The most obvious usage of this is to give containers (like Docker containers, for example) a WireGuard interface as its sole interface.

  • A less obvious usage, but extremely powerful nonetheless, is to use this characteristic of WireGuard for redirecting all of your ordinary Internet traffic over WireGuard.

  • It turns out that we can route all Internet traffic via WireGuard using network namespaces, rather than the classic routing table hacks.
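
Cryptokey Routing and endpoint roaming can be sketched in Python as a table from public keys to allowed tunnel IP ranges. This is a simplified model, not WireGuard's implementation: the class and method names are hypothetical, the keys are placeholder strings, and decryption and authentication are assumed to have already succeeded before `accept_incoming` is called.

```python
import ipaddress


class CryptokeyRouter:
    """Toy model: public key -> allowed tunnel IPs and last known endpoint."""

    def __init__(self):
        self.peers = {}

    def add_peer(self, public_key, allowed_ips, endpoint=None):
        self.peers[public_key] = {
            "allowed": [ipaddress.ip_network(n) for n in allowed_ips],
            "endpoint": endpoint,
        }

    def peer_for_destination(self, dst_ip):
        """Outgoing: pick the peer with the most specific matching allowed IP."""
        dst = ipaddress.ip_address(dst_ip)
        best_key, best_len = None, -1
        for key, peer in self.peers.items():
            for net in peer["allowed"]:
                if dst.version == net.version and dst in net and net.prefixlen > best_len:
                    best_key, best_len = key, net.prefixlen
        return best_key

    def accept_incoming(self, public_key, inner_src_ip, outer_endpoint):
        """Incoming: remember the sender's endpoint, then check the inner source IP."""
        peer = self.peers.get(public_key)
        if peer is None:
            return False
        # Roaming: the packet authenticated for this key, so reply to
        # wherever it came from.
        peer["endpoint"] = outer_endpoint
        src = ipaddress.ip_address(inner_src_ip)
        return any(src.version == net.version and src in net for net in peer["allowed"])


router = CryptokeyRouter()
router.add_peer("PUBKEY_A", ["10.0.0.2/32"])
router.add_peer("PUBKEY_B", ["10.0.0.0/24"], endpoint=("192.0.2.10", 51820))
print(router.peer_for_destination("10.0.0.2"))  # PUBKEY_A
print(router.peer_for_destination("10.0.0.7"))  # PUBKEY_B
```

The same table therefore acts as a routing table when sending and as an access control list when receiving.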

WireGuard Todo


Mesh Networking Tools. It is possible to build a mesh network out of WireGuard using WireGuard as the building block. Write a tool that builds meshes and has peers discover each other, taking into account important authentication and trust properties. Build on top of wg-dynamic.
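
Until such a tool exists, a static full mesh can be approximated by generating one configuration per host that lists every other host as a peer. The Python sketch below is one possible way to do that, not the wg-dynamic approach and not a discovery tool. The host names, tunnel addresses, endpoints and key strings are hypothetical placeholders, and the `Address` line assumes the file is consumed by wg-quick.

```python
def mesh_configs(hosts, port=51820):
    """Return {host name: config text} with every other host listed as a peer.

    hosts maps a name to {"public_key", "tunnel_ip", "endpoint"}.
    """
    configs = {}
    for name, host in hosts.items():
        lines = [
            "[Interface]",
            f"# PrivateKey for {name} is added locally and never shared",
            f"Address = {host['tunnel_ip']}/24",
            f"ListenPort = {port}",
        ]
        for peer_name, peer in hosts.items():
            if peer_name == name:
                continue
            lines += [
                "",
                "[Peer]",
                f"# {peer_name}",
                f"PublicKey = {peer['public_key']}",
                f"AllowedIPs = {peer['tunnel_ip']}/32",
                f"Endpoint = {peer['endpoint']}:{port}",
            ]
        configs[name] = "\n".join(lines) + "\n"
    return configs


def tunnel_count(n):
    """Number of distinct host pairs in a full mesh of n hosts."""
    return n * (n - 1) // 2


hosts = {
    "vm-1": {"public_key": "PUBKEY_1", "tunnel_ip": "10.10.0.1", "endpoint": "192.0.2.1"},
    "vm-2": {"public_key": "PUBKEY_2", "tunnel_ip": "10.10.0.2", "endpoint": "192.0.2.2"},
    "vm-3": {"public_key": "PUBKEY_3", "tunnel_ip": "10.10.0.3", "endpoint": "192.0.2.3"},
}
print(mesh_configs(hosts)["vm-1"])
print(tunnel_count(15))  # 105
```

For the 15-VM cluster in the Cloud Native Deployment example this means 14 peer entries per host and 105 host pairs in total, and every existing config has to be regenerated when a host is added.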



Comparisons

  • Appearance:
    • Tinc VPN appears as IP level network device
    • ZeroTier appears as Ethernet level network port
    • WireGuard appears as IP level network device
  • Connectivity provided:
    • Tinc VPN automatically gives full mesh routing
    • ZeroTier automatically gives full mesh routing
    • WireGuard gives point-to-point connection like SSH (mesh routing is a todo)
  • Node/Host Configuration:
    • Tinc VPN host is configured with public/private key pair, in a config file
    • ZeroTier node is configured with public/private key pair, then generates a VL1 ZeroTier Address
    • WireGuard host is configured with public/private key pair and ACL, in a config file
  • Network Configuration:
    • Tinc VPN network is configured by hosts exchanging (out-of-band) exported config files for a specified "network name"
      • rest of network is exchanged in-band
    • ZeroTier network is configured with knowledge of "roots" and with VL2 ZeroTier Network ID (VL1 ZeroTier Address of the controller and network number)
      • rest of network is exchanged in-band
    • WireGuard network is configured by hosts sharing public keys (out-of-band), connect via IP Address corresponding to keys
      • IP roaming is exchanged in-band
  • Number of network connections:
    • Tinc VPN hosts can connect to many "network names" concurrently
    • ZeroTier nodes can connect to multiple VL2 ZeroTier Network IDs concurrently
    • WireGuard hosts can connect to many other hosts concurrently
  • Deployment:
    • Tinc VPN is deployed on the VM hosting the pods/containers/processes
      • could be in the container base image
      • no explicit interoperability with kubernetes to manipulate pod/container network namespaces
    • ZeroTier is deployed on the VM hosting the pods/containers/processes
      • could be in the container base image
      • no explicit interoperability with kubernetes to manipulate pod/container network namespaces
    • WireGuard is deployed on the VM hosting the pods/containers/processes
      • could be in the container base image
      • no explicit interoperability with kubernetes to manipulate pod/container network namespaces
  • Single-Points-of-Failure:
    • Tinc VPN runs daemon processes on each host (one per network name), topology is peer-to-peer
    • ZeroTier runs a global "planet" root server called "Earth", apparently as a testing network and for casual communications
      • Unclear about how users can deploy their own "planet" root servers
      • Users can deploy their own "moon" root servers
    • WireGuard runs daemon processes on each host, topology is peer-to-peer
  • Scaling:
    • Tinc VPN can add new hosts to existing network names without altering configurations of existing hosts
      • invitations dynamically create a configuration on the server
    • ZeroTier can add new nodes to existing network IDs without altering configurations of existing nodes (Network ID is obscure but public information)
      • Unclear whether adding new root servers requires a restart
    • WireGuard can add new hosts but requires both ends of the connection to be updated and present in the ACL of host config file
  • Access Control:
    • Tinc VPN has control by the exchange of exported host config files
      • an invitation refers to the configuration on the server
    • ZeroTier nodes need to be authorised after attempting to connect to the network ID, but it can be turned off to allow "public" networks
    • WireGuard has control by the exchange of host public keys and ACL in host config file
  • Based on example from Cloud Native Deployment:
    • Tinc VPN would be deployed on 15 VMs, compared to 210 pods
    • ZeroTier would be deployed on 15 VMs, compared to 210 pods
    • WireGuard would be deployed on 15 VMs, compared to 210 pods
  • Based on example from Open Wireless Laboratory:
    • Tinc VPN would be deployed on 3 servers or 5 VMs, compared to 210 pods
    • ZeroTier would be deployed on 3 servers or 5 VMs, compared to 210 pods
    • WireGuard would be deployed on 3 servers or 5 VMs, compared to 210 pods
  • tbc


Comparison to Istio

  • Istio Envoy is deployed as a sidecar-per-pod
    • so on a single VM, there could be many such sidecars and resource usage may be higher
    • requires admin access to kubernetes
  • Istio Envoy performs mutual TLS authentication for pod-to-pod network communication
    • appears to work only inside one kubernetes system
  • Istio Envoy provides control of routing with traffic management and policies
    • this might not be needed if full mesh routing is intended everywhere
  • Istio Mixer, Pilot, Citadel and Galley servers may represent Single-Points-of-Failure in the environment, as well as additional setup required
  • Istio provides functionality over and above the VPN encryption of network traffic
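
To put the sidecar-per-pod point in numbers, the short Python calculation below uses the figures from the Cloud Native Deployment example (15 VMs, 210 pods). The average is only indicative, since the real spread of pods across VMs depends on the scheduler.

```python
def agent_counts(vms, pods):
    """Compare how many agents each approach runs in one environment."""
    return {
        "node_vpn_agents": vms,            # one VPN endpoint per VM
        "sidecars": pods,                  # one Envoy sidecar per pod
        "avg_sidecars_per_vm": pods / vms,
    }


counts = agent_counts(vms=15, pods=210)
print(counts)
```

With these figures there are 210 sidecars against 15 node-level VPN endpoints, or 14 sidecars per VM on average.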


Questions

  1. Is it necessary to encrypt pod-to-pod communications if both ends are on the same VM? The traffic would not appear on the network.
  2. What is the actual resource overhead for each VPN/sidecar (e.g. in terms of RAM, CPU, disk, I/O, etc)?
  3. tbc
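
Question 1 can be made concrete with a small Python sketch that splits pod-to-pod flows into those that stay on one VM and those that cross between VMs. Only the cross-VM flows would pass through a node-to-node VPN. The pod placement below is hypothetical.

```python
def classify_flows(placement, flows):
    """Split (src, dst) pod pairs into same-node and cross-node lists."""
    same_node, cross_node = [], []
    for src, dst in flows:
        if placement[src] == placement[dst]:
            same_node.append((src, dst))
        else:
            cross_node.append((src, dst))
    return same_node, cross_node


# Hypothetical placement of pods onto VMs.
placement = {"so": "vm-1", "aai": "vm-2", "uui": "vm-1", "msb": "vm-1"}
flows = [("so", "aai"), ("uui", "msb")]

same_node, cross_node = classify_flows(placement, flows)
print("crosses VMs, covered by node-to-node VPN:", cross_node)
print("stays on one VM:", same_node)
```

In this placement SO-to-AAI traffic crosses VMs while UUI-to-MSB traffic stays on one VM, so the answer to Question 1 decides whether the second flow needs any protection of its own.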



8 Comments

  1. ISTIO seems to be one of the preferred options.

    Few reasons

    • Mutual TLS without any development from applications (Productivity improvement is huge)
    • ISTIO has RBAC functionality.
    • ISTIO has inbuilt CA functionality.
    • Active community
    • Used by many in Industry 
    • Supported by all major public providers supporting Kubernetes-as-a-service.
    • Has ability to integrate with third party CA and third party RBAC.



    1. Hi Srinivasa Addepalli,

      I received an off-list reply to my message https://lists.onap.org/g/onap-discuss/message/13141 from Amy Zwarico, so I'm writing up my thoughts for an alternative. I think it may also have applications in securing communications from ONAP to external systems that cannot use the Istio sidecar method, e.g. because they are not under ONAP-developer control.

      Keong

  2. Hi Keong Lim, Srinivasa Addepalli

    Can you please suggest if you have any view on secure inter-ONAP communications, i.e. communications between two instances of ONAP? The context is the wiki page we are drafting for the interlude scope in ONAP Ext-API. Here the communication is between two administrative systems that have ONAP at either end and communicate over the MEF Interlude reference point using TMF APIs.

    Manoj 

    1. hi Manoj Nair,

      Communication between two ONAP systems is the scenario for CCVPN as well.

      I think the VPN ideas on this page can be used to handle it and I will add items to address this scenario.

      Keong


    2. ISTIO provides ingress and egress gateways to talk to external parties. Same can be used for ONAP to ONAP communication.

      1. Hi Srinivasa Addepalli,

        Does that mean the ONAP-to-ONAP communications will be covered by mutual TLS authentication as well?

        Can that be extended to other variations of ONAP-to-external-system communications too?


        1. Yes. That also could be Mutual-TLS.

          1. Hi Srinivasa Addepalli, in an interlude scenario (similar to CCVPN, but with the partner using a non-ONAP orchestrator), will this mandate that the partner orchestrator use ISTIO, or can it be a non-service-mesh implementation? Also, will the mutual TLS you mentioned be controlled through AAF or by ISTIO internally?