Background
The L7 Proxy Service Mesh Controller provides connectivity, traffic shaping, policy enforcement, RBAC, and mutual TLS for applications/microservices running across clusters (with a service mesh), within a cluster, and with external applications. The available functionality depends on the underlying service mesh technology.
Design Overview
Traffic Controller Design Internals
NOTE - The current implementation supports the Istio service mesh and an SD-WAN load balancer. The controller's plugin architecture makes it extensible to any service mesh technology and any external load balancer. It is also designed to configure and communicate with external DNS servers.
Elements of Traffic Controller with Istio as the service mesh
- Gateway - the inbound/outbound access point for the service mesh; implemented as an Envoy service
- VirtualService - exposes a service outside the service mesh
- DestinationRule - applies rules to the traffic flow
- AuthorizationPolicy - authorization for service access
- ServiceEntry - adds an external service into the mesh
- ServiceRole and ServiceRoleBinding - Role Based Access Control (RBAC)
These Kubernetes resources are generated per cluster; multiple instances of each may be created, depending on the intent.
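As an illustration, the kinds of resources the controller might generate for a service exposed through the mesh ingress could look like the following. This is a hedged sketch: the resource names, host, and port are assumptions for the example, not taken from the actual implementation.

```yaml
# Hypothetical Gateway + VirtualService pair exposing "httpbin01"
# under an illustrative external host name.
apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: httpbin01-gateway
spec:
  selector:
    istio: ingressgateway        # use Istio's default ingress gateway
  servers:
  - port:
      number: 80
      name: http
      protocol: HTTP
    hosts:
    - "httpservice01.example.com"
---
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: httpbin01-vs
spec:
  hosts:
  - "httpservice01.example.com"
  gateways:
  - httpbin01-gateway
  http:
  - route:
    - destination:
        host: httpbin01          # the in-mesh Kubernetes service
        port:
          number: 80
```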
API
RESTful northbound API (with examples)
| Types | Intent APIs | Functionality |
|---|---|---|
| 1. intercluster communication | /v2/project/{project-name}/rb/{rb-name}/{version}/intent/{intent-name}/connectivity/intercluster/ | communication between microservices deployed across two clusters |
| 2. external outbound service communication | /v2/project/{project-name}/rb/{rb-name}/{version}/intent/{intent-name}/connectivity/external/outbound/ | communication from a microservice to an external service |
| 3. intracluster communication | /v2/project/{project-name}/rb/{rb-name}/{version}/intent/{intent-name}/connectivity/intracluster/ | communication between microservices in the same cluster |
| 4. external inbound service communication | /v2/project/{project-name}/rb/{rb-name}/{version}/intent/{intent-name}/connectivity/external/inbound/ | API for an external service to access the microservices inside the mesh |
```
URL: /v2/project/{project-name}/rb/{rb-name}/{rb-version}/traffic-intent-groups/
POST BODY:
{
    "name": "john",
    "description": "Traffic intent groups"
}
```
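The path parameters in the URL above can be assembled programmatically. A minimal sketch follows; the helper name is ours, not part of the controller, and a real client would POST the body shown above to the returned path.

```python
def traffic_intent_group_url(project, rb_name, rb_version):
    """Build the traffic-intent-groups collection URL for a resource bundle."""
    return ("/v2/project/{p}/rb/{rb}/{v}/traffic-intent-groups/"
            .format(p=project, rb=rb_name, v=rb_version))

# Example: the collection URL for project "proj1", bundle "bundle1", version "v1".
print(traffic_intent_group_url("proj1", "bundle1", "v1"))
# → /v2/project/proj1/rb/bundle1/v1/traffic-intent-groups/
```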
1. Inter-microservice communication intents - Edit the intent so that it lists inbound (client) services for a target service rather than outbound services. TODO: check API-level access; implement this for all APIs.
POST
```
URL: /v2/project/{project-name}/rb/{rb-name}/{rb-version}/traffic-intent-sets/{trafficset-name}/interclusterservice/
// Alternative URL: /v2/project/{project-name}/rb/{rb-name}/{rb-version}/traffic-intent-groups/{trafficgroup-name}/interclusterservice/{intercluster-record}/clientservice/
// {intercluster-record} is the microservice name only; this URL returns the list of clients
POST BODY:
{
    "name": "johndoe",                 // unique name for each intent
    "description": "connectivity intent for microservice replication across multiple locations and clusters",
    "inboundservicename": "sleep01",   // actual name of the inbound (client) service, from the bookinfo app
    "protocol": "HTTP",
    "externalPrefix": "httpservice01", // optional, default = "". Prefix used to expose this service outside the cluster; not mandatory for the "intercluster" API
    "headless": "false",
    "mutualTLS": "true",               // setting this to true creates a dedicated egress gateway for the service "httpbin01" on whichever cluster it is running on
    "port": "80",                      // port on which the service is exposed
    "serviceMesh": "istio",            // taken from the cluster record
    "loadbalancing": "true",           // optional
    "serviceaccounts": [],
    "clientServices": [{
        "clientServiceName": "httpbin02",  // if empty, allow all external applications to connect; check for service-account-level access
        "headless": "false",
        "egressgateway": "true"            // optional, default = false. All outbound traffic from this service flows through a dedicated egress gateway
    },{
        "clientServiceName": "httpbin03",  // a replica of the "httpbin" service which may be running in another cluster; its details are known only after scheduling
        "headless": "false",
        "egressgateway": "true"            // optional, default = false. All outbound traffic from this service flows through a dedicated egress gateway
    }]
}
RETURN STATUS: 201
RETURN BODY:
{
    "Message": "Intercluster Connectivity intent success",
    "description": "connectivity intent for microservice replication across multiple locations and clusters"
}
```
The above intent generates the following configuration and instructs the deployer to deploy it in the clusters where the microservices are running.
| Name of the Cluster | Microservices | Istio objects | Description/comments |
|---|---|---|---|
| 1. | microservice01 | | |
| 2. Cluster01 | httpbin02 | | |
| 3. Cluster01 | httpbin03 | | |
2. microservice connectivity to an external service intent API - Outbound access
NOTE - These are services whose nature is unknown. They are assumed to have an FQDN as their point of connectivity.
```
URL: /v2/project/{project-name}/rb/{rb-name}/{rb-version}/traffic-intent-groups/{trafficgroup-name}/externaloutboundaccess/
POST BODY:
{
    "description": "connectivity intent for microservices to connect to external services",
    "connectivityResource": [
    {
        "microservice01":              // unique name of the microservice, not the actual name of the microservice
        "name": "httpbin01",           // actual name of the service
        "application": "bookinfo",     // name of the application to which this microservice belongs
        "egressgateway": "true",       // all outbound traffic from this service flows through a dedicated egress gateway
        "externalPrefix": "",          // prefix used to expose this service outside the cluster; not required for the externaloutbound API
        "mutualTLS": "true",
        "port": "80",                  // port on which the service is exposed
        "serviceMesh": "istio",        // type of service mesh used
        // "loadbalancing": "true",    // optional
        "clientServices": [{
            "clientServiceName": "mongo.k8s.com",      // FQDN of the external web service
            "protocol": "HTTP",
            "type": "external"
        },{
            "clientServiceName": "newsfeed.news.com",  // FQDN of the external web service
            "protocol": "HTTP",
            "type": "external"
        }]
    },
    {
        "microservice02":
        "name": "httpbin01",           // actual name of the service
        "application": "bookinfo",     // name of the application to which this microservice belongs
        "egressgateway": "true",       // all outbound traffic from this service flows through a dedicated egress gateway
        "externalPrefix": "",          // prefix used to expose this service outside the cluster; not required for the externaloutbound API
        "mutualTLS": "true",
        "port": "80",                  // port on which the service is exposed
        "serviceMesh": "istio",        // type of service mesh used
        "clientServices": [{
            "clientServiceName": "google.com",         // FQDN of the external web service; if TCP, the port number of the external service must be known
            "protocol": "HTTP",
            "type": "external"
        },{
            "clientServiceName": "review.gerrit.com",  // FQDN of the external web service
            "protocol": "HTTP",
            "type": "external"
        }]
    }
    ]
}
RETURN STATUS: 201
RETURN BODY:
{
    "Message": "Connectivity intent $id success",
    "description": "connectivity intent for microservices to connect to external service"
}
```
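As a sketch of what this intent could translate to, each external FQDN in clientServices might become an Istio ServiceEntry on the cluster hosting the microservice. The resource name below is an illustrative assumption; only the host comes from the intent.

```yaml
# Hypothetical ServiceEntry admitting the external host into the mesh.
apiVersion: networking.istio.io/v1alpha3
kind: ServiceEntry
metadata:
  name: external-mongo           # illustrative name
spec:
  hosts:
  - mongo.k8s.com                # FQDN from the intent's clientServices
  location: MESH_EXTERNAL        # the service lives outside the mesh
  ports:
  - number: 80
    name: http
    protocol: HTTP
  resolution: DNS                # resolve the FQDN via DNS
```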
3. Intent API for Intracluster communication
NOTE - Call this API only if the services are running in the same cluster. The default authorization policy must be "deny-all" (empty spec), because all communication between microservices is disabled during Istio installation. TODO: implement this API.
```
URL: /v2/project/{project-name}/rb/{rb-name}/{rb-version}/traffic-intent-groups/{trafficgroup-name}/intraclusterservice/
POST BODY:
{
    "description": "connectivity intent for intracluster microservice communication",
    "connectivityResource": [
    {
        "microservice01":                  // unique name of the microservice, not the actual name as seen by Kubernetes
        "name": "httpbin01",               // actual name of the service
        "application": "bookinfo",         // name of the application to which this microservice belongs
        "egressgateway": "true",           // all outbound traffic from this service flows through a dedicated egress gateway
        "externalPrefix": "httpservice01", // prefix used to expose this service outside the cluster; not required for the "intracluster" API
        "mutualTLS": "true",
        "port": "80",                      // port on which the service is exposed
        "serviceMesh": "istio",            // type of service mesh used
        // "loadbalancing": "true",        // optional
        "clientServices": [{
            "clientServiceName": "httpbin02",
            "protocol": "HTTP",
            "headless": "false",
            "type": "intracluster"
        },{
            "clientServiceName": "httpbin03",
            "protocol": "HTTP",
            "headless": "false",
            "type": "intracluster"
        }]
    }
    ]
}
RETURN STATUS: 201
RETURN BODY:
{
    "Message": "Connectivity intent $id created successfully",
    "description": "connectivity intent for Intracluster microservice communication"
}
```
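The "deny-all" default mentioned in the note above, and the per-intent allow rule it implies, could look like the following Istio AuthorizationPolicy resources. This is a sketch: the namespace and service-account principal are illustrative assumptions.

```yaml
# Deny all traffic in the namespace: an AuthorizationPolicy with an
# empty spec matches every workload and allows nothing.
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: deny-all
  namespace: default
spec: {}
---
# Allow the client service's identity to reach httpbin01.
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: allow-httpbin02-to-httpbin01
  namespace: default
spec:
  selector:
    matchLabels:
      app: httpbin01
  action: ALLOW
  rules:
  - from:
    - source:
        principals: ["cluster.local/ns/default/sa/httpbin02"]
```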
4. Considering RBAC/ABAC - TBD
5. API for external services to access a microservice - Inbound access
| Keywords | Supported fields | Description |
|---|---|---|
| {connectivity-type} | intercluster/intracluster | types in the API for {connectivity-type} |
| {connectivity-sub-type} | intermicroservice/internalapplication/externalmicroservice | sub-types in the API for {connectivity-sub-type} |
| name | | name of the microservice/application, depending on the context |
External DNS Update Overview
This section covers the design for how external DNS servers are updated.
The following sequence diagram illustrates the approach:
Elements of the DNS update design
external-dns-agent
This is deployed as part of the edge cluster. Essentially, it is an instance of external-dns configured with istio-gateway as the source and a DNS CRD as the provider (i.e. detected DNS records are simply saved as custom resources).
DNS CR
The DNS CRD can be based on the examples here: https://github.com/kubernetes-sigs/external-dns/tree/master/docs/contributing/crd-source
The above can already be used as a DNS source for external-dns. The new work here is to write an external-dns provider that saves these records as CRs.
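For reference, the DNSEndpoint custom resource from the external-dns CRD source docs linked above looks like this (values are the example values from those docs):

```yaml
apiVersion: externaldns.k8s.io/v1alpha1
kind: DNSEndpoint
metadata:
  name: examplednsrecord
spec:
  endpoints:
  - dnsName: foo.bar.com     # record detected from the istio-gateway source
    recordTTL: 180
    recordType: A
    targets:
    - 192.168.99.216
```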
DNS Provider Intent
The intents for the distributed application provide DNS Provider information.
- Cluster ID
- Project ID
- DNS Provider Type (e.g. coredns, aws route 53, ...)
- DNS Provider location (e.g. IP, hostname)
- DNS Provider credentials (e.g. username, pw, TLS credentials)
- DNS Provider other parameters (maybe provider specific)
Question: Is the DNS Provider information really an intent associated with a resource bundle, or is it information associated with the edge cluster / project that is common across resource bundles for that project?
external-dns-controller
This component operates at the centralized controller. The function of the external-dns-controller is to periodically start external-dns jobs which will connect to specific edge clusters to query the DNS CRD on the edge cluster and then update the actual DNS provider configured (per the intent) for that edge cluster.
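The fan-out decision described above can be sketched as a pure function: given per-cluster records, decide which external-dns jobs to launch. All field and function names here are assumptions for illustration; the real controller is not specified at this level of detail.

```python
def plan_dns_jobs(clusters):
    """Return one external-dns job descriptor per cluster that hosts a
    user-facing service; clusters without one are skipped."""
    jobs = []
    for cluster in clusters:
        if not cluster.get("userFacingService"):
            continue  # nothing to publish for this cluster
        jobs.append({
            "cluster": cluster["name"],
            "source": "crd",                      # read the DNS CRD on the edge cluster
            "kubeconfig": cluster["kubeconfig"],  # how external-dns reaches the edge API server
            "provider": cluster["dnsProvider"],   # e.g. "coredns", "aws"
        })
    return jobs

jobs = plan_dns_jobs([
    {"name": "edge1", "userFacingService": True,
     "kubeconfig": "/etc/kcfg/edge1", "dnsProvider": "aws"},
    {"name": "edge2", "userFacingService": False,
     "kubeconfig": "/etc/kcfg/edge2", "dnsProvider": "coredns"},
])
print(jobs)  # only edge1 gets a job
```

The external-dns-controller would run this periodically and start one external-dns process per returned descriptor.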
DNS Intent API
DNS provider intent
result:
- an instance of external-dns runs in the centralized controller for each cluster of a project that hosts a user-facing service
- the source for the external-dns instance is the DNS CRD on the specific cluster for that project; the external-dns instance is supplied the cluster's K8s API server address and kubeconfig
- the provider for the external-dns instance is the DNS Provider given in the intent
- ? what if a given project has multiple services (from different resource bundles)? In principle, a single external-dns instance should be able to handle all services on that cluster as long as the same provider is used.
- it is kicked off by the external-dns-controller plugin when intents for a deployed app are processed.
assumptions:
- could be multiple dns-providers for a given service
- different clusters may update different set of dns-providers
need to know:
- need to know which clusters have the service deployed
- need to identify which dns-providers are associated with the cluster
```
URL: /v2/project/{project-name}/rb/{rb-name}/{version}/intent/{intent-name}/connectivity/dnsproviders
POST BODY:
{
    "description": "dns provider intent for updating user-facing microservice FQDNs to external DNS providers",
    "dnsProvider": [
    {
        "id": "microservice01",                     // unique name of the microservice - provides association with other connectivity intents
        "externalName": "service1.example.com",     // redundant? - this name should be deployed to the Istio gateway on microservice deployment (this may not be the right spot)
        "cluster-selector": "label1, label2, ...",  // labels that select the clusters for which this DNS provider is used
        "externalDnsParameters": {
            // this list is supplied to external-dns as parameters, for example:
            "aws-zone-type": "",                     // when using the AWS provider, filter for zones of this type (optional; options: public, private)
            "aws-zone-tags": "",                     // when using the AWS provider, filter for zones with these tags
            "aws-assume-role": "",                   // when using the AWS provider, assume this IAM role; useful for hosted zones in another AWS account. Specify the full ARN, e.g. `arn:aws:iam::123455567:role/external-dns` (optional)
            "aws-batch-change-size": "1000",         // when using the AWS provider, the maximum number of changes applied in each batch
            "aws-batch-change-interval": "1s",       // when using the AWS provider, the interval between batch changes
            "aws-evaluate-target-health": "enabled", // when using the AWS provider, whether to evaluate the health of a DNS target (default: enabled; disable with --no-aws-evaluate-target-health)
            "aws-api-retries": "3",                  // when using the AWS provider, the maximum number of retries for API calls before giving up
            "aws-prefer-cname": "disabled"           // when using the AWS provider, prefer CNAME over ALIAS (default: disabled)
        },
        "providerCredentials": { ... }
    }
    ]
}
```
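Since the externalDnsParameters map is "supplied to external-dns as parameters", the controller presumably turns each key/value pair into a command-line flag. A minimal sketch, assuming empty values mean "not set" (the function name is ours):

```python
def external_dns_args(params):
    """Turn an externalDnsParameters map into external-dns command-line
    flags, sorted for determinism; keys with empty values are skipped."""
    return ["--{}={}".format(k, v) for k, v in sorted(params.items()) if v != ""]

args = external_dns_args({
    "aws-zone-type": "public",
    "aws-batch-change-size": "1000",
    "aws-zone-tags": "",          # empty values are skipped
})
print(args)
# → ['--aws-batch-change-size=1000', '--aws-zone-type=public']
```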
Example DNS Update diagram
External application communication intents
Considering DNS resolution, No DNS resolution (IP addresses), Egress proxies of the Service Mesh, Third-party egress proxy
User facing communication intents
Considering Multiple DNS Servers
Considering multiple user-facing entities
Considering RBAC/ABAC
Internal Design details
Guidelines that need to be kept in mind
- Support for metrics that can be retrieved by Prometheus
- Support for Jaeger distributed tracing by including opentracing libraries around HTTP calls.
- Support for logging that is understood by fluentd
- Mutual exclusion of database operations (multiple internal modules, as well as replicated instances of the scheduler microservice, may access database records simultaneously).
- Resilience - ensure that information returned by controllers is not lost: synchronization of resources to remote edge clouds can take hours or even days when the edge is not up and running, and the scheduler microservice may restart in the meantime.
- Concurrency - Support multiple operations at a time and even synchronizing resources in various edge clouds in parallel.
- Performance - Avoiding file system operations as much as possible.
Modules (Description, internal structures etc..)
....