Background
L7 Proxy Service Mesh Controller intends to provide connectivity, traffic shaping, policy enforcement, RBAC, and
mutual TLS for applications/microservices running across clusters (with a service mesh), within a cluster,
and with external applications. The available functionality is subject to the capabilities of the underlying service mesh technology.
Design Overview
Traffic Controller Design Internals
NOTE - The current implementation supports the Istio service mesh and an SD-WAN load balancer. The plugin architecture of the controller makes it extensible to work with any service mesh technology and any external load balancer. It is also designed to configure and communicate with external DNS servers.
Elements of Traffic Controller with ISTIO as the service mesh
- Gateway - the inbound/outbound access point for the service mesh; implemented as an Envoy service
- VirtualService - exposes a service outside the service mesh
- DestinationRule - applies rules to the traffic flow
- AuthorizationPolicy - authorization for service access
- ServiceEntry - adds an external service into the mesh
- ServiceRole and ServiceRoleBinding - Role-Based Access Control (RBAC)
These Kubernetes resources are generated per cluster; multiple instances of each resource may be created, depending on the intent.
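As an illustration of the per-cluster resources listed above, the sketch below builds minimal skeletons of three of the Istio objects for one inbound service. All names, namespaces, and field values are hypothetical; the controller's actual generator is not shown in this document.

```python
# Sketch: minimal skeletons of Istio objects generated per cluster for one
# service. Illustrative only; not the controller's actual output.

def istio_resources_for(service, namespace="default", host_suffix="svc.cluster.local"):
    """Return minimal skeletons of a few of the Istio objects listed above."""
    fqdn = f"{service}.{namespace}.{host_suffix}"
    return {
        "VirtualService": {
            "apiVersion": "networking.istio.io/v1beta1",
            "kind": "VirtualService",
            "metadata": {"name": service, "namespace": namespace},
            "spec": {"hosts": [fqdn]},
        },
        "DestinationRule": {
            "apiVersion": "networking.istio.io/v1beta1",
            "kind": "DestinationRule",
            "metadata": {"name": service, "namespace": namespace},
            "spec": {"host": fqdn, "trafficPolicy": {"tls": {"mode": "ISTIO_MUTUAL"}}},
        },
        "AuthorizationPolicy": {
            "apiVersion": "security.istio.io/v1beta1",
            "kind": "AuthorizationPolicy",
            "metadata": {"name": f"{service}-authz", "namespace": namespace},
            "spec": {"rules": []},  # rules would be filled in from the intent
        },
    }

resources = istio_resources_for("httpbin01")
print(sorted(resources))  # ['AuthorizationPolicy', 'DestinationRule', 'VirtualService']
```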
API
RESTful North API (with examples)
Types | Intent APIs | Functionality |
---|---|---|
1. inter-cluster service communication | /v2/project/{project-name}/rb/{rb-name}/{version}/intent/{intent-name}/connectivity/intercluster/ | communication between microservices deployed across two clusters |
2. external outbound service communication | /v2/project/{project-name}/rb/{rb-name}/{version}/intent/{intent-name}/connectivity/external/outbound/ | communication from a microservice to an external service |
3. intra-cluster communication | /v2/project/{project-name}/rb/{rb-name}/{version}/intent/{intent-name}/connectivity/intracluster/ | communication between microservices in the same cluster |
4. external inbound service communication | /v2/project/{project-name}/rb/{rb-name}/{version}/intent/{intent-name}/connectivity/external/inbound/ | API for an external service to access the microservices inside the mesh |
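The four intent APIs above share one URL template. As a small illustration (parameter values are hypothetical), the paths can be composed like this:

```python
# Sketch: composing the intent API paths from their shared template.
# Path segments come from the table above; parameter values are illustrative.

BASE = "/v2/project/{project}/rb/{rb}/{version}/intent/{intent}/connectivity"

ENDPOINTS = {
    "intercluster": BASE + "/intercluster/",
    "external-outbound": BASE + "/external/outbound/",
    "intracluster": BASE + "/intracluster/",
    "external-inbound": BASE + "/external/inbound/",
}

def url_for(kind, **params):
    """Fill in the path template for one connectivity type."""
    return ENDPOINTS[kind].format(**params)

u = url_for("intercluster", project="p1", rb="app1", version="v1", intent="i1")
print(u)  # /v2/project/p1/rb/app1/v1/intent/i1/connectivity/intercluster/
```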
URL: /v2/project/{project-name}/rb/{rb-name}/{rb-version}/traffic-intent-sets/ POST BODY: { "name": "john", "description": "Traffic intent groups" "set":[ { "interclusterservice":"abc" }, { "externaloutboundaccess":"abc" }, { "intraclusterservice":"abc" }, { "externalinboundaccess":"abc" }, { "dnsproviders":"abc" } ] }
1. Inter-microservice communication intents - Edit the intent to list the inbound services of a target service rather than the outbound services - check the API-level access - implement for all APIs.
POST
URL: /v2/project/{project-name}/rb/{rb-name}/{rb-version}/traffic-intent-sets/{trafficset-name}/interclusterservice/client01intent

// Alternate URL: /v2/project/{project-name}/rb/{rb-name}/{rb-version}/traffic-intent-groups/{trafficgroup-name}/interclusterservice/{intercluster-record // only the microservice name}/clientservice // list of clients

POST BODY:

```
{
  "name": "johndoe",                 // unique name for each intent
  "description": "connectivity intent for microservice replication across multiple locations and clusters",
  "inboundservicename": "httpbin01", // actual name of the inbound service
  "description": "bookinfo app",
  "protocol": "HTTP",
  "externalName": "httpservice01.service.com", // optional, default = "". The prefix used to expose this service outside the cluster; not mandatory for the "intercluster" API, but mandatory for external inbound access
  "headless": "false",    // default is false. "true" ensures all instances of the headless service have access to the client service
  "mutualTLS": "true",    // "true" creates a dedicated egress gateway for the service "httpbin01" on whichever cluster it is running on
  "port": "80",           // port on which the service is exposed through the service mesh, not the port it actually runs on
  "serviceMesh": "istio", // taken from the cluster record
  "loadbalancing": "true",// optional
  "user": ["mary", "kim", "roger"], // list of external users who can access this application
  "clientService": {
    "clientServiceName": "sleep01", // if empty, allow all external applications to connect; check for service-account-level access
    "headless": "true",      // default is false. "true" generates the required configs for all instances of a headless service
    "egressgateway": "true", // optional, default = false. All outbound traffic from this service flows through a dedicated egress gateway
    "serviceaccounts": []    // the service accounts for this client on the cluster where the inbound service is running
  }
}
```

RETURN STATUS: 201

RETURN BODY:

```
{
  "Message": "Intercluster Connectivity intent success",
  "description": "connectivity intent for microservice replication across multiple locations and clusters"
}
```
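A helper that builds an inter-cluster intent body with the documented defaults (headless = "false", egressgateway = "false", externalName = "") can be sketched as follows. The field names follow the POST body above; the helper itself is illustrative, not part of the controller:

```python
# Sketch: building an inter-cluster connectivity intent body with the
# documented defaults. Field names follow the POST body above; the helper
# function is hypothetical.

def intercluster_intent(name, inbound, client, **overrides):
    body = {
        "name": name,
        "description": "",
        "inboundservicename": inbound,
        "protocol": "HTTP",
        "externalName": "",   # optional; mandatory only for external inbound access
        "headless": "false",  # documented default
        "mutualTLS": "true",
        "port": "80",
        "serviceMesh": "istio",
        "clientService": {
            "clientServiceName": client,
            "headless": "false",       # documented default
            "egressgateway": "false",  # documented default
            "serviceaccounts": [],
        },
    }
    body.update(overrides)  # caller may override any top-level field
    return body

b = intercluster_intent("client01intent", "httpbin01", "sleep01")
print(b["clientService"]["egressgateway"])  # false
```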
The above intent will generate the following configuration, provided the service mesh is Istio.
Name of the Cluster | Microservices | Istio objects | Description/comments |
---|---|---|---|
1. Cluster01 | httpbin01 | | |
2. Cluster02 | httpbin02 | | |
2. Intent API for Intracluster communication
NOTE - Call this API only if the services are running in the same cluster. The default authorization policy must be "deny-all", since all communication between microservices needs to be disabled during Istio installation.
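The "deny-all" default referred to here corresponds, in Istio, to an AuthorizationPolicy with an empty spec: no rules match any request, so all traffic to the covered workloads is denied. A sketch of such a manifest as a Python structure (the name and namespace are illustrative; applying it in the root namespace, typically istio-system, makes it mesh-wide):

```python
# Sketch: the "deny-all" default AuthorizationPolicy referred to above.
# In Istio, a policy with an empty spec matches no requests and therefore
# denies all traffic for the workloads it covers. Name/namespace are
# illustrative.
deny_all = {
    "apiVersion": "security.istio.io/v1beta1",
    "kind": "AuthorizationPolicy",
    "metadata": {"name": "deny-all", "namespace": "istio-system"},
    "spec": {},  # empty spec => no request matches => all denied
}
print(deny_all["kind"])  # AuthorizationPolicy
```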
URL: /v2/project/{project-name}/rb/{rb-name}/{rb-version}/traffic-intent-sets/{trafficset-name}/intraclusterservice/client01intent

POST BODY:

```
{
  "name": "johndoe",                 // unique name for each intent
  "description": "connectivity intent for microservice replication across multiple locations and clusters",
  "inboundservicename": "httpbin01", // actual name of the inbound service
  "description": "bookinfo app",
  "protocol": "HTTP",
  "externalName": "",     // optional, default = "". Not required for "intraclusterservice"
  "headless": "false",    // default is false. "true" ensures all instances of the headless service have access to the client service
  "mutualTLS": "true",    // "true" creates a dedicated egress gateway for the service "httpbin01" on whichever cluster it is running on
  "port": "80",           // port on which the service is exposed through the service mesh, not the port it actually runs on
  "serviceMesh": "istio", // taken from the cluster record
  "loadbalancing": "true",// optional
  "user": [""],           // not required for intraclusterservice, since the communication is between services within the cluster
  "clientService": {
    "clientServiceName": "sleep01", // if empty, allow all external applications to connect; check for service-account-level access
    "headless": "true",      // default is false. "true" generates the required configs for all instances of a headless service
    "egressgateway": "true", // optional, default = false. All outbound traffic from this service flows through a dedicated egress gateway
    "serviceaccounts": []    // the service accounts for this client on the cluster where the inbound service is running
  }
}
```

RETURN STATUS: 201

RETURN BODY:

```
{
  "Message": "Intracluster Connectivity intent success",
  "description": "connectivity intent for microservice replication across multiple locations and clusters"
}
```
3. Microservice connectivity to an external service intent API - Outbound access
NOTE - These are services whose nature is not known; they are assumed to expose an FQDN as their point of connectivity.
URL: /v2/project/{project-name}/rb/{rb-name}/{rb-version}/traffic-intent-sets/{trafficset-name}/outboundservice/client01intent

POST BODY:

```
{
  "name": "johndoe",                 // unique name for each intent
  "description": "connectivity intent for microservice replication across multiple locations and clusters",
  "inboundservicename": "httpbin01", // actual name of the inbound service
  "description": "bookinfo app",
  "protocol": "HTTP",
  "externalName": "",     // optional, default = "". Not required for outbound access, since the communication is initiated from the inbound service
  "headless": "false",    // default is false. "true" ensures all instances of the headless service have access to the client service
  "mutualTLS": "true",    // "true" creates a dedicated egress gateway for the service "httpbin01" on whichever cluster it is running on
  "port": "80",           // port on which the service is exposed through the service mesh, not the port it actually runs on
  "serviceMesh": "istio", // taken from the cluster record
  "loadbalancing": "true",// optional
  "user": [""],           // not required for outbound access
  "clientService": {
    "clientServiceName": "sleep01.service.com" // only the FQDN of the service is required
  }
}
```

RETURN STATUS: 201

RETURN BODY:

```
{
  "Message": "outbound connectivity intent creation success",
  "description": "connectivity intent for microservice replication across multiple locations and clusters"
}
```
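In Istio, an external FQDN is typically admitted into the mesh with a ServiceEntry (one of the elements listed earlier). A sketch of the kind of ServiceEntry such an outbound intent could generate; the helper and naming scheme are illustrative, not the controller's actual output:

```python
# Sketch: a ServiceEntry of the kind the outbound intent above could
# generate for an external FQDN. Helper and naming are illustrative.

def service_entry_for(fqdn, port=80, protocol="HTTP"):
    return {
        "apiVersion": "networking.istio.io/v1beta1",
        "kind": "ServiceEntry",
        "metadata": {"name": fqdn.replace(".", "-")},  # k8s names cannot contain dots
        "spec": {
            "hosts": [fqdn],
            "location": "MESH_EXTERNAL",  # the service lives outside the mesh
            "resolution": "DNS",          # resolve endpoints via the FQDN
            "ports": [{"number": port, "name": protocol.lower(), "protocol": protocol}],
        },
    }

se = service_entry_for("sleep01.service.com")
print(se["metadata"]["name"])  # sleep01-service-com
```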
4. API for external services to access a microservice - Inbound access
URL: /v2/project/{project-name}/rb/{rb-name}/{rb-version}/traffic-intent-sets/{trafficset-name}/inboundservice/client01intent

POST BODY:

```
{
  "name": "johndoe",                 // unique name for each intent
  "description": "connectivity intent for microservice replication across multiple locations and clusters",
  "inboundservicename": "httpbin01", // actual name of the inbound service
  "description": "bookinfo app",
  "protocol": "HTTP",
  "externalName": "",     // optional, default = "". Must be defined for inbound access
  "headless": "false",    // default is false. "true" ensures all instances of the headless service have access to the client service
  "mutualTLS": "true",    // "true" creates a dedicated egress gateway for the service "httpbin01" on whichever cluster it is running on
  "port": "80",           // port on which the service is exposed through the service mesh, not the port it actually runs on
  "serviceMesh": "istio", // taken from the cluster record
  "loadbalancing": "true",// optional
  "user": [""],           // optional; restricts which users may access these services
  "clientService": {
    "clientServiceName": "sleep01.service.com" // only the FQDN of the service is required
  }
}
```

RETURN STATUS: 201

RETURN BODY:

```
{
  "Message": "Inbound connectivity intent creation success",
  "description": "connectivity intent for microservice replication across multiple locations and clusters"
}
```
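Exposing a service under the intent's externalName would, in Istio, typically mean a Gateway plus a VirtualService bound to it. A sketch of such a pair (names, selector, and routing are illustrative, not the controller's generated output):

```python
# Sketch: an Istio Gateway + VirtualService pair that could expose a service
# under the intent's externalName. All names and values are illustrative.

def inbound_expose(service, external_name, port=80):
    gateway = {
        "apiVersion": "networking.istio.io/v1beta1",
        "kind": "Gateway",
        "metadata": {"name": f"{service}-gw"},
        "spec": {
            "selector": {"istio": "ingressgateway"},  # default ingress gateway
            "servers": [{
                "hosts": [external_name],
                "port": {"number": port, "name": "http", "protocol": "HTTP"},
            }],
        },
    }
    virtual_service = {
        "apiVersion": "networking.istio.io/v1beta1",
        "kind": "VirtualService",
        "metadata": {"name": service},
        "spec": {
            "hosts": [external_name],
            "gateways": [f"{service}-gw"],
            "http": [{"route": [{"destination": {"host": service, "port": {"number": port}}}]}],
        },
    }
    return gateway, virtual_service

gw, vs = inbound_expose("httpbin01", "httpservice01.service.com")
print(vs["spec"]["gateways"])  # ['httpbin01-gw']
```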
Keywords | Supported fields | Description |
---|---|---|
{connectivity-type} | intercluster/intracluster | types in the API for {connectivity-type} |
{connectivity-sub-type} | intermicroservice/internalapplication/externalmicroservice | sub-types in the API for {connectivity-sub-type} |
name | | name of the microservice/application, depending on the context |
External DNS Update Overview
See alternate design here: Alternate External DNS provider update approach
This section covers the design for how external DNS providers are updated.
The following sequence diagram illustrates the approach:
Elements of the DNS update design
external-dns-agent
This is deployed as part of the edge cluster. Essentially, it is an instance of external-dns configured with istio-gateway as the source, using a DNS CRD as the provider (i.e. detected DNS records are simply saved as CRs).
DNS CR
The DNS CRD can be based on the examples here: https://github.com/kubernetes-sigs/external-dns/tree/master/docs/contributing/crd-source
The above can already be used as a DNS source for external-dns. The new work here is to create an external-dns provider that saves these CRs.
DNS Provider Intent
The intents for the distributed application provide DNS Provider information.
- Cluster ID
- Project ID
- DNS Provider Type (e.g. coredns, aws route 53, ...)
- DNS Provider location (e.g. IP, hostname)
- DNS Provider credentials (e.g. username, password, TLS credentials)
- DNS Provider other parameters (maybe provider specific)
Question: Is the DNS Provider information really an intent associated with a resource bundle, or is it a set of information associated with the edge cluster / project that is common across resource bundles for that project?
external-dns-controller
This component operates at the centralized controller. The external-dns-controller periodically starts external-dns jobs that connect to specific edge clusters, query the DNS CRD on each cluster, and then update the actual DNS provider configured (per the intent) for that cluster.
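The controller's periodic loop can be sketched as below. The job launcher is stubbed out and all cluster/provider names are illustrative; in the real design, each job would read the DNS CR on the edge cluster and push records to the configured provider.

```python
# Sketch of the external-dns-controller's periodic loop described above.
# The job launcher is a stub; cluster and provider names are illustrative.

def run_external_dns_job(cluster, dns_provider):
    """Stand-in for launching an external-dns job: it would read the DNS
    CR on the edge cluster and push records to the configured provider."""
    return f"synced {cluster} -> {dns_provider}"

def controller_tick(clusters_with_intents):
    """One pass over all (cluster, provider) pairs derived from the intents."""
    results = []
    for cluster, provider in clusters_with_intents:
        results.append(run_external_dns_job(cluster, provider))
    return results

out = controller_tick([("edge01", "aws-route53"), ("edge02", "coredns")])
print(out)  # ['synced edge01 -> aws-route53', 'synced edge02 -> coredns']
```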
DNS Intent API
DNS provider intent
result:
- an instance of external-dns runs in the centralized controller for each cluster of a project that is hosting a user-facing service
- the source for the external-dns instance is the DNS CRD on the specific cluster for that project; the external-dns instance will be supplied with the cluster's K8s API server address and kubeconfig
- the provider for the external-dns instance will be the DNS provider specified in the intent
- ? what if a given project has multiple services (from different resource bundles) - in principle, a single external-dns instance should be able to handle all services on that cluster as long as the same provider is used
- it is kicked off by the external-dns-controller plugin when intents for a deployed app are processed
assumptions:
- there could be multiple dns-providers for a given service
- different clusters may update different sets of dns-providers
need to know:
- need to know which clusters have the service deployed
- need to identify which dns-providers are associated with the cluster
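Identifying which dns-providers apply to a cluster can be sketched as label matching against the intent's cluster-selector. All data below is illustrative (the intent names and labels are made up):

```python
# Sketch: resolving which dns-providers apply to a cluster via the intent's
# cluster-selector labels. Intent names and labels are illustrative.

def providers_for_cluster(cluster_labels, provider_intents):
    """A provider applies if any of its selector labels match the cluster's."""
    return [
        p["name"]
        for p in provider_intents
        if set(p["cluster-selector"]) & set(cluster_labels)
    ]

intents = [
    {"name": "route53-intent", "cluster-selector": ["edge", "us-west"]},
    {"name": "coredns-intent", "cluster-selector": ["lab"]},
]
print(providers_for_cluster(["edge"], intents))  # ['route53-intent']
```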
URL: /v2/project/{project-name}/rb/{rb-name}/{rb-version}/traffic-intent-sets/{traffic-intent-set-name}/dnsproviders

POST BODY:

```
{
  "name": "dns-provider-intent-name1",
  "description": "dns provider intent for updating user facing microservice FQDNs to external DNS providers",
  "dnsProvider": [
    {
      "id": "microservice01",                 // unique name of the microservice - provides association to other connectivity intents
      "externalName": "service1.example.com", // redundant? - this name should be deployed to the Istio gateway on microservice deployment (this may not be the right spot)
      "cluster-selector": "label1, label2, ...", // labels selecting the clusters where this DNS provider is to be used
      "externalDnsParameters": {
        // this list will be supplied to external-dns as parameters, for example:
        "aws-zone-type": "",               // when using the AWS provider, filter for zones of this type (optional; options: public, private)
        "aws-zone-tags": "",               // when using the AWS provider, filter for zones with these tags
        "aws-assume-role": "",             // when using the AWS provider, assume this IAM role. Useful for hosted zones in another AWS account. Specify the full ARN, e.g. `arn:aws:iam::123455567:role/external-dns` (optional)
        "aws-batch-change-size": "1000",   // when using the AWS provider, the maximum number of changes applied in each batch
        "aws-batch-change-interval": "1s", // when using the AWS provider, the interval between batch changes
        "aws-evaluate-target-health": "enabled", // when using the AWS provider, whether to evaluate the health of a DNS target (default: enabled; disable with --no-aws-evaluate-target-health)
        "aws-api-retries": "3",            // when using the AWS provider, the maximum number of retries for API calls before giving up
        "aws-prefer-cname": "disabled"     // when using the AWS provider, prefer CNAME over ALIAS (default: disabled)
      },
      "providerCredentials": { ... }
    }
  ]
}
```
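Since the externalDnsParameters map is supplied to external-dns as command-line parameters, the translation can be sketched as a simple key/value-to-flag mapping (the helper is illustrative; the flag names themselves come from the intent body above):

```python
# Sketch: turning the externalDnsParameters map above into external-dns
# command-line flags. The helper is illustrative; flag names come from the
# intent body.

def to_flags(params):
    """Emit one --key=value flag per non-empty parameter, sorted for stability."""
    return [f"--{key}={value}" for key, value in sorted(params.items()) if value]

flags = to_flags({
    "aws-zone-type": "public",
    "aws-batch-change-size": "1000",
    "aws-batch-change-interval": "1s",
    "aws-zone-tags": "",  # empty values are skipped
})
print(flags)
# ['--aws-batch-change-interval=1s', '--aws-batch-change-size=1000', '--aws-zone-type=public']
```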
Example DNS Update diagram
External application communication intents
Considering DNS resolution, No DNS resolution (IP addresses), Egress proxies of the Service Mesh, Third-party egress proxy
User facing communication intents
Considering Multiple DNS Servers
Considering multiple user-facing entities
Considering RBAC/ABAC
Internal Design details
Guidelines that need to be kept in mind
- Support for metrics that can be retrieved by Prometheus
- Support for Jaeger distributed tracing by including opentracing libraries around HTTP calls.
- Support for logging that is understood by fluentd
- Mutual exclusion of database operations (preventing internal modules from accessing database records simultaneously, including across replicated instances of the scheduler microservice).
- Resilience - ensure that the information returned by controllers is not lost; synchronization of resources to remote edge clouds can take hours or even days when the edge is not up and running, and the scheduler microservice may restart in the meantime.
- Concurrency - Support multiple operations at a time, including synchronizing resources in various edge clouds in parallel.
- Performance - Avoiding file system operations as much as possible.
Modules (Description, internal structures etc..)
....