This document relates to investigative work being carried out on the Jira ticket POLICY-3809. The general requirements of the investigation are below:
- How to create a Kubernetes environment that can be spun up and made available on demand on suitable K8S infrastructure.
- How suitable test suites could be developed to verify the functional requirements below.
- How such test suites could be done using "Contract Testing".
Functional Requirements Detail
Note that in Postgres, many of the features below are available. In the verification environment, we want to verify that the Policy Framework continues to work in the following scenarios:
- Synchronization and Load Balancing
- Backup and Restore
In addition the environment should:
- Support measurement of Performance Lag
- Use secure communication towards the Database
- Verify that auditing of database operations is working
At the Policy Framework on 2022-02-09, it was decided that:
"Providing a High Availability infrastructure is beyond the scope of the Policy Framework, it is a task for ONAP overall (OOM) and/or individual organizations or companies taking the Policy Framework and integrating it into their infrastructure. The focus from now on in High Availability will be on making the Policy Framework High Availability Ready. Future tasks may be required to improve the HA readiness of the PF."
Database servers can work together to allow a second server to take over quickly if the primary server fails (high availability), or to allow several computers to serve the same data (load balancing). Ideally, database servers could work together seamlessly. Web servers serving static web pages can be combined quite easily by merely load-balancing web requests to multiple machines - this is very common in Kubernetes environments. In fact, read-only database servers can be combined relatively easily too. Unfortunately, most database servers have a read/write mix of requests, and read/write servers are much harder to combine. This is because though read-only data needs to be placed on each server only once, a write to any server has to be propagated to all servers so that future read requests to those servers return consistent results.
This synchronization problem is the fundamental difficulty for servers working together. Because there is no single solution that eliminates the impact of the sync problem for all use cases, there are multiple solutions. Each solution addresses this problem in a different way, and minimizes its impact for a specific workload.
Some solutions deal with synchronization by allowing only one server to modify the data. Servers that can modify data are called read/write, master or primary servers. Servers that track changes in the master are called standby or slave servers. A standby server that cannot be connected to until it is promoted to a master server is called a warm standby server, and one that can accept connections and serves read-only queries is called a hot standby server.
Some solutions are synchronous, meaning that a data-modifying transaction is not considered committed until all servers have committed the transaction. This guarantees that a failover will not lose any data and that all load-balanced servers will return consistent results no matter which server is queried. In contrast, asynchronous solutions allow some delay between the time of a commit and its propagation to the other servers, opening the possibility that some transactions might be lost in the switch to a backup server, and that load balanced servers might return slightly stale results. Asynchronous communication is used when synchronous would be too slow.
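The difference between the two modes can be illustrated with a toy model. This is only a sketch, not how Postgres implements replication: the `Primary` and `Replica` classes and the `ship_pending` method are hypothetical names invented for this illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """A standby that applies writes shipped from the primary."""
    data: dict = field(default_factory=dict)

    def apply(self, key, value):
        self.data[key] = value


@dataclass
class Primary:
    """A primary that replicates either synchronously or asynchronously."""
    replicas: list
    synchronous: bool
    data: dict = field(default_factory=dict)
    pending: list = field(default_factory=list)  # writes not yet shipped

    def commit(self, key, value):
        self.data[key] = value
        if self.synchronous:
            # Synchronous: commit returns only after every replica has the write.
            for replica in self.replicas:
                replica.apply(key, value)
        else:
            # Asynchronous: commit returns at once; the write is shipped later.
            self.pending.append((key, value))

    def ship_pending(self):
        """Propagate queued writes (the delay in asynchronous replication)."""
        for key, value in self.pending:
            for replica in self.replicas:
                replica.apply(key, value)
        self.pending.clear()


sync_replica = Replica()
sync_primary = Primary(replicas=[sync_replica], synchronous=True)
sync_primary.commit("policy-1", "v2")

async_replica = Replica()
async_primary = Primary(replicas=[async_replica], synchronous=False)
async_primary.commit("policy-1", "v2")

print(sync_replica.data)   # the replica already holds the write
print(async_replica.data)  # the replica is stale until ship_pending() runs
```

If the asynchronous primary failed before `ship_pending()` ran, the queued write would be lost on failover, which is the risk described above.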
Performance must be considered in any choice. There is usually a trade-off between functionality and performance. For example, a fully synchronous solution over a slow network might cut performance by more than half, while an asynchronous one might have a minimal performance impact.
While it is easy in Kubernetes/Helm to create a deployment for database servers that has several replicas and can autoscale, Kubernetes persists the data of each database in volumes, using PersistentVolumes and PersistentVolumeClaims. These resources ensure that, even with the ephemeral nature of the pods running in the cluster, the data is retained if the database server pods fail. In OOM deployments, this is done using the hostPath volume type: data is stored on the actual VM where the pods are running. However, this strategy does not take into consideration the functional requirements set out in this investigation. The remaining sub-sections of this section outline existing solutions for the different database requirements, i.e. Load Balancing, Synchronization, Failover, and Backup and Restore.
As outlined above, there are 2 general methods used to approach the sync problem in databases where:
- One replica is responsible for write operations - master. Others can only read - slave.
- Sync transactions where data is not made available until it is committed to all replicas. This can also be done asynchronously, but that increases the risk of data loss and stale data.
Master-Slave Example 1: MariaDB/Mysql
There is an example of the first method here: https://kubernetes.io/docs/tasks/run-application/run-replicated-stateful-application/#deploy-mysql. This method uses a MySQL database but the procedure for Postgres should be much the same. A ConfigMap is written and created containing 2 different configurations: one for the primary database server that will do the writing and one for the other (slave) servers.
Two services are defined
- One responsible for reading. It will load balance connections across all replicas.
- One responsible for writes. This is a "headless" service. It allows the other pods to specify which db replica they wish to connect to i.e. the primary one...
Finally, a stateful set is created that contains
- initContainers - the main purpose of the init containers is to populate the main container with the correct configuration depending on whether it is a primary or slave.
- Also a main container is specified with 3 replicas.
- An xtrabackup container is used to create a database backup. → This free tool can be used to create backups of mariadb also.
Full details on the use case are available here.
Master-Slave Example 2: Postgres - Bitnami
As a ready-made solution, Bitnami offers a PostgreSQL Helm chart that comes pre-configured for security, scalability and data replication. This helm chart comes with many configurable parameters that can be specified directly in the command line or in the values.yaml file of the chart. Some example parameters are shown below:
- Example Global Parameters
|Global Docker image registry|
|Global Docker registry secret names as an array|
|Global StorageClass for Persistent Volume(s)|
|PostgreSQL database (overrides |
- Example Postgres Parameters
|Replication user password|
|Number of read replicas|
|PostgreSQL user (has superuser privileges if username is |
|PostgreSQL user password|
|Name of existing secret to use for PostgreSQL passwords|
|Mount PostgreSQL secret as a file instead of passing environment variable|
|PostgreSQL data dir folder|
|Enable TLS traffic support|
|Generate automatically self-signed TLS certificates|
|Log client hostnames|
|Add client log-in operations to the log file|
In terms of addressing the functional requirements in POLICY-3809, the Bitnami instance goes a long way.
- Secure communication towards the database can be configured by using the tls parameters in the above table.
- The Bitnami Postgres helm chart uses the same master/slave architecture as the previously discussed MySQL configuration. However, master has been renamed to "primary" and slave has been renamed to "readReplica". This addresses the synchronisation requirement. As the data is replicated across the primary and the readReplicas, it also addresses persistence/failover.
- The chart supports the export of prometheus metrics, which will potentially allow analysis of any performance lag.
- The chart supports configuration of audit logs.
- Full backup and restore is supported using the open source tool velero. Incidentally, velero can also be used for backup and restore of MariaDB databases. A full tutorial is provided here: https://docs.bitnami.com/tutorials/migrate-data-bitnami-velero/ and it also outlines migration of the data from one cluster to another. Velero works by backing up the Kubernetes Persistent Volume that the database is saving to. These volumes can be saved in cloud storage, including Openstack Cinder. Scheduled backup is also configurable.
- Bitnami also provides a MariaDB chart that implements the same approach, and velero can be used for backup there too.
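As one possible way of quantifying lag between a primary and a read replica, the write-ahead log positions (LSNs) of the two servers can be compared. The sketch below assumes the two LSN strings have already been obtained, for example from `pg_current_wal_lsn()` on the primary and `pg_last_wal_replay_lsn()` on a replica; how they are collected (the chart's metrics or a direct query) is outside the sketch, and the function names are ours.

```python
def lsn_to_int(lsn: str) -> int:
    """Convert a PostgreSQL LSN such as '16/B374D848' to a byte position.

    An LSN is printed as two hexadecimal numbers separated by a slash:
    the high and low 32 bits of a 64-bit position in the WAL.
    """
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def replication_lag_bytes(primary_lsn: str, replica_replay_lsn: str) -> int:
    """Bytes of WAL the replica still has to replay (never negative)."""
    return max(0, lsn_to_int(primary_lsn) - lsn_to_int(replica_replay_lsn))


# Hypothetical sample values, not taken from a real cluster.
lag = replication_lag_bytes("0/3000060", "0/3000000")
print(lag)
```

A byte count like this says how far behind a replica is, not how long it will take to catch up, so it would probably need to be tracked over time to be useful.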
Bitnami have released a Helm chart for PostgreSQL to support high availability. This is similar to the older Bitnami chart but has some specific changes for High Availability:
- A new deployment and service have been added to deploy Pgpool-II to act as a proxy for the PostgreSQL backend. It helps to reduce connection overhead, acts as a load balancer for PostgreSQL, and ensures database node failover.
- The bitnami/postgresql-repmgr image is used, which includes and configures repmgr. Repmgr ensures standby nodes assume the primary role when the primary node is unhealthy.
- In addition, the HA version of Postgres allows for synchronous replication of data with the postgresql.syncReplication parameter in the values.yaml file.
Synchronous Example 1: MariaDb/Sql Galera Cluster
This is the method currently used by the OOM mariadb deployment. It is also based on a Bitnami image. The bitnami mariadb-galera helm chart supports many different values, which can be set in the values.yaml file. Some examples are shown below.
|MariaDB Galera image registry|
|MariaDB Galera image repository|
|MariaDB Galera image tag (immutable tags are recommended)|
|MariaDB Galera image pull policy|
|Specify docker-registry secret names as an array|
|Specify if debug logs should be enabled|
|StatefulSet controller supports relaxing its ordering guarantees while preserving its uniqueness and identity guarantees. There are two valid pod management policies: OrderedReady and Parallel|
This database cluster type copies data to all instances of the database. It does not restrict writes to one instance: one can write to or read from any instance. For this reason, it could be regarded as the best solution. Once again, the option of using velero is available here to back up the data in the persistent volumes to the underlying cloud storage.
Synchronous/Asynchronous Postgres Example - Patroni
Patroni creates a HA Postgres cluster that supports either Synchronous or Asynchronous transactions. It has failover built-in - if the master dies, then a worker node will be automatically selected to take over. In addition it provides a CLI tool that is similar to kubectl, where various commands can be carried out on the database nodes. For example, we can do simple things like list the nodes and display their status. More complex operations include taking down a node for maintenance. If the master is taken down for maintenance, another node can be selected via the command line to take over. An interesting video of functionality is found here.
Previously Investigated Backup and Restore
A wiki page outlining how to perform this backup/restore is here. The intention is that a Kubernetes CronJob would be configured to run every 24 hours. The CronJob would execute a db_backup script, which removes any old backups and saves a new one. Example helm chart configs as well as a PersistentVolume, PVC and ConfigMap are provided in the wiki. Note that these scripts are not currently stored in any Gerrit repo. The example scripts are currently configured to back up all the databases locally but this could potentially be altered to store the backups in a remote location.
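The rotation behaviour described above (keep only the newest backup) can be sketched as follows. This is not the db_backup script from the wiki: the function name, the file naming scheme and the `keep` parameter are assumptions, and the database dump is represented by plain bytes rather than the output of a real dump tool. Unlike the description above, the sketch writes the new backup before deleting old ones, so that a failed dump does not leave the directory empty.

```python
import tempfile
import time
from pathlib import Path


def rotate_and_backup(backup_dir: Path, dump: bytes, keep: int = 1) -> Path:
    """Write a new timestamped backup, then delete all but the newest `keep`."""
    if keep < 1:
        raise ValueError("keep must be at least 1")
    backup_dir.mkdir(parents=True, exist_ok=True)
    new_file = backup_dir / f"backup-{time.time_ns()}.sql"
    new_file.write_bytes(dump)
    # Order backups by the numeric timestamp embedded in the file name.
    backups = sorted(
        backup_dir.glob("backup-*.sql"),
        key=lambda path: int(path.stem.split("-")[1]),
    )
    for old in backups[:-keep]:
        old.unlink()
    return new_file


# Demo in a temporary directory standing in for the mounted PersistentVolume.
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp)
    for dump in (b"-- dump 1", b"-- dump 2", b"-- dump 3"):
        latest = rotate_and_backup(target, dump)
    remaining = sorted(path.name for path in target.iterdir())
    latest_content = latest.read_bytes()

print(remaining)
```

In a CronJob, a function like this would be called once per run with the output of the actual dump command.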
Notes on Load Balancing Database Connections
In Kubernetes, there are 2 types of load balancing - internal and external. External load balancing refers to clients outside the cluster accessing services inside the cluster. These services are exposed externally via a load balancer. The external load balancer needs to be configured inside the cluster but also on the cloud provider. As the service we are referring to in this case is a database, we are unlikely to be exposing it externally. Therefore, we are more interested in internal load balancing. If there are multiple database replicas running in a cluster and there are multiple clients that need to access them, we need to think about how Kubernetes decides which replica to send a query to. This is the job of internal load balancing.
Connections between services and pods/replicas in Kubernetes are managed by another pod: kube-proxy. The kube-proxy pod runs on each node of the Kubernetes cluster as part of a daemonset. It can be found in the kube-system namespace in the majority of clusters and is brought up when the cluster is created, so it should be visible in fresh Kubernetes installations. When kube-proxy is run, it can be supplied with several parameters, one of which is --proxy-mode. Detail on all the parameters that can be supplied is here.
By default, Kubernetes runs with the proxy mode set to "IPTABLES". Although IPTABLES is considered to be tried and tested for routing traffic, in terms of load balancing its behaviour is to randomly select a replica to route a request to. There is another proxy-mode that is capable of more sophisticated algorithms for load balancing, namely IPVS.
Lots of detail comparing IPTABLES and IPVS is provided here. In summary, IPVS implements transport-layer load balancing, usually called Layer 4 LAN switching, as part of the Linux kernel. IPVS runs on a host and acts as a load balancer in front of a cluster of real servers. IPVS can direct requests for TCP and UDP-based services to the real servers, and make services of real servers appear as virtual services on a single IP address.
- IPVS provides better scalability and performance for large clusters.
- IPVS supports more sophisticated load balancing algorithms than IPTABLES (least load, least connections, locality, weighted, etc.).
- IPVS supports server health checking and connection retries, etc.
Put simply, if the cluster is running IPVS proxy mode, we can choose which load balancing algorithm we want to use. This choice can be made with the --ipvs-scheduler parameter - to which we can supply the below arguments.
- rr: round-robin - default
- lc: least connection
- dh: destination hashing
- sh: source hashing
- sed: shortest expected delay
- nq: never queue
This gives us more sophisticated options for load balancing in the cluster as a whole. Changing this option will also enforce the use of this algorithm on connections between the client and the database.
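To give a feel for how two of these schedulers differ, the sketch below models round-robin (rr) and least connection (lc) selection in a few lines. It is a simplification for illustration only and not the kernel's IPVS implementation; the replica names and connection counts are made up.

```python
from itertools import cycle


def round_robin(backends):
    """rr: hand out backends in a fixed rotating order."""
    return cycle(backends)


def least_connection(active):
    """lc: pick the backend that currently has the fewest active connections."""
    return min(active, key=active.get)


replicas = ["db-0", "db-1", "db-2"]

# Round-robin ignores load and simply rotates through the replicas.
rr = round_robin(replicas)
rr_picks = [next(rr) for _ in range(4)]

# Least connection looks at the current load of each replica.
active_connections = {"db-0": 5, "db-1": 2, "db-2": 7}
lc_pick = least_connection(active_connections)

print(rr_picks, lc_pick)
```

For long-lived database connections, a load-aware scheduler such as lc may spread work more evenly than rr, though this would need to be confirmed in the verification environment.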
During the course of this investigation, it was discovered that HAProxy does not work correctly as a DB Load Balancer because it does not support read/write splits. This means that the master/slave deployment model, where the master does writes and the slaves do reads, will not work unless the read and write operations are configured to be on different ports. For this reason HAProxy should not be used in the database configuration for this environment (at least not when load balancing traffic to databases). Instead, there are other database-specific load balancing tools that can do this job. The Bitnami HA helm chart for Postgres uses pgpool-II for this purpose, which supports the read/write split.
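The idea of a read/write split can be sketched as a small router that sends data-modifying statements to the primary and spreads everything else across the read replicas. This is a deliberately naive illustration that classifies statements by their first keyword only; it does not reflect how pgpool-II decides where to send a query, and the class name and host names are hypothetical.

```python
import itertools

# Statements starting with one of these keywords are treated as writes.
WRITE_KEYWORDS = {"INSERT", "UPDATE", "DELETE", "CREATE", "ALTER", "DROP"}


class ReadWriteRouter:
    """Route writes to the primary and rotate reads across the replicas."""

    def __init__(self, primary, read_replicas):
        self.primary = primary
        self._replicas = itertools.cycle(read_replicas)

    def route(self, sql):
        words = sql.split()
        first_word = words[0].upper() if words else ""
        if first_word in WRITE_KEYWORDS:
            return self.primary
        return next(self._replicas)


router = ReadWriteRouter("pg-primary", ["pg-read-0", "pg-read-1"])
targets = [
    router.route("INSERT INTO policy VALUES (1)"),
    router.route("select * from policy"),
    router.route("SELECT count(*) FROM policy"),
    router.route("UPDATE policy SET version = 2"),
]
print(targets)
```

A proxy that cannot inspect statements in this way can only separate reads from writes by other means, such as the separate ports mentioned above.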
Investigated Testing Approaches
This section outlines some of the approaches to testing that are commonly used, as well as some unique/less common approaches.
Chart tests are built into Helm; detail on them can be found here: https://helm.sh/docs/topics/chart_tests/. The task of a chart test is to verify that a chart works as expected once it is installed. Each Helm chart has a templates directory, and the test files live under it. A test file contains the YAML definition of a Kubernetes Job. A Job in Kubernetes is basically a resource that creates a Pod that carries out a specific task and runs it to completion. In the test, the Job runs a container with a specified command and is considered a success if the container exits successfully (exit 0). Example uses of chart tests:
- Validate that your configuration from the values.yaml file was properly injected.
- Make sure your username and password work correctly
- Make sure an incorrect username and password does not work
- Assert that your services are up and correctly load balancing
- Test successful connection to a database using a specified secret
The simplicity of specifying tests in this way is a major advantage. Tests can then simply be run with a "helm test" command.
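The pass/fail rule of a chart test (exit code 0 means success) can be mimicked in a short script. The sketch below is the kind of check a test container might run to assert that a service is reachable; it is not taken from any chart, the function names are ours, and a throwaway local listener stands in for the database service.

```python
import socket


def service_is_reachable(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def exit_code_for(host, port):
    """Map the check onto the chart test convention: 0 is success."""
    return 0 if service_is_reachable(host, port) else 1


# Stand-in for a database service: a local socket listening on a free port.
listener = socket.socket()
listener.bind(("127.0.0.1", 0))
listener.listen(1)
port = listener.getsockname()[1]

up_code = exit_code_for("127.0.0.1", port)    # service is listening
listener.close()
down_code = exit_code_for("127.0.0.1", port)  # service has gone away

print(up_code, down_code)
```

In a real test container, the script would end with `sys.exit(exit_code_for(host, port))` so that the Job reports the result to Helm.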
Helm Unit Test Plugin
There is an open source project that has been defined and is present on GitHub - https://github.com/quintush/helm-unittest. It can be installed easily as it is designed as a helm plugin. The plugin allows definition of tests in yaml to confirm basic functionality of the deployed pod/chart. It is operated very simply. You can define a tests/ directory under your chart e.g. YOUR_CHART/tests/deployment_test.yaml. Then an example test suite is defined below:
The test asserts a few different things: that the template is of kind Deployment, the name of the chart, and the container image used. A simple CLI command is then used to run the test.
Although this library is useful, it does not actually serve to test the functionality of the chart, only the specification.
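To make the distinction concrete, the sketch below shows what "testing the specification" amounts to: assertions over the rendered manifest, with nothing deployed. It does not use the plugin or its API. A plain dictionary stands in for a rendered template, and the helper names and values are hypothetical.

```python
def get_path(document, path):
    """Follow a dotted path such as 'spec.containers.0.image' through dicts and lists."""
    current = document
    for part in path.split("."):
        current = current[int(part)] if isinstance(current, list) else current[part]
    return current


def is_kind(document, kind):
    """Check the resource kind of the rendered manifest."""
    return document.get("kind") == kind


def path_equals(document, path, value):
    """Check that the value found at a dotted path equals the expected one."""
    return get_path(document, path) == value


# Stand-in for the output of rendering a chart's deployment template.
rendered = {
    "kind": "Deployment",
    "metadata": {"name": "release-my-chart"},
    "spec": {"template": {"spec": {"containers": [{"image": "nginx:latest"}]}}},
}

results = [
    is_kind(rendered, "Deployment"),
    rendered["metadata"]["name"].endswith("-my-chart"),
    path_equals(rendered, "spec.template.spec.containers.0.image", "nginx:latest"),
]
print(results)
```

All three checks can pass even if the image fails to start in a cluster, which is why this style of test cannot replace tests that run against a deployed chart.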
The Kyma project is a cloud native application runtime that uses Kubernetes, helm and a number of other components. They used helm tests extensively and appreciated how easy the tests were to specify. However, they did find some shortcomings:
- Running the whole suite of integration tests took a long time, so they needed an easy way of selecting tests they wanted to run.
- The number of flaky tests increased, and they wanted to ensure they are automatically rerun.
- They needed a way of verifying the tests' stability and detecting flaky tests.
- They wanted to run tests concurrently to reduce the overall testing time.
For these reasons, Kyma developed their own tool called Octopus and it tackles all of the issues above: https://github.com/kyma-incubator/octopus/blob/master/README.md
In developing tests using Octopus, the tester defines 2 files
- TestDefinition file: Defines a test for a single component or a cross-component scenario. We can see in the example below that the custom TestDefinition resource is used to define a Pod with a specified image for the container and a simple command is carried out. This is not dissimilar to the way that helm test defines tests for the charts.
- ClusterTestSuite file: This file defines which tests to run on the cluster and how to run them. In the below example, they specify to run only tests with the "service-catalog" label. It specifies how many times a test should be executed and how many retries of the test should be done. Also, concurrency is specified to define the maximum number of tests that should run concurrently.
Although this project seems to make some improvement on the helm chart tests, it is unclear how mature the project is. Documentation details how to define the specified files and how to use kubectl CLI to execute a test - https://github.com/kyma-incubator/octopus/blob/master/docs/tutorial.md
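The three ideas that the suite definition adds (selecting tests by label, retrying failures and limiting concurrency) can be sketched in plain Python. This is not Octopus code and does not use its resource definitions; the function names, the test dictionaries and the default values are assumptions made for the illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def run_with_retries(test, max_retries):
    """Run one test, retrying on failure; return (name, passed, attempts)."""
    attempts = 0
    while attempts <= max_retries:
        attempts += 1
        if test["run"]():
            return test["name"], True, attempts
    return test["name"], False, attempts


def run_suite(tests, label, max_retries=1, concurrency=2):
    """Run only the tests carrying `label`, at most `concurrency` at a time."""
    selected = [test for test in tests if label in test["labels"]]
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        return list(pool.map(lambda test: run_with_retries(test, max_retries), selected))


def fails_once_then_passes():
    """Build a flaky check that fails on its first call only."""
    calls = {"count": 0}

    def run():
        calls["count"] += 1
        return calls["count"] > 1

    return run


tests = [
    {"name": "catalog-api", "labels": {"service-catalog"}, "run": lambda: True},
    {"name": "catalog-flaky", "labels": {"service-catalog"}, "run": fails_once_then_passes()},
    {"name": "monitoring", "labels": {"monitoring"}, "run": lambda: False},
]

report = run_suite(tests, label="service-catalog")
print(report)
```

The number of attempts in the report is what makes flaky tests visible: a test that only passes on a retry still passes, but it can be flagged for follow-up.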
Terratest is a Go testing library from Gruntwork that was originally built for testing Terraform code, but it seems it can be used for helm charts independently of Terraform. All it requires is that you have Kubernetes and Helm installed, along with the Go language. A very simple example of how the tests are created is found here: https://github.com/gruntwork-io/terratest-helm-testing-example. Tests are specified in the Go language and can include the instructions for deploying a chart. An example outlined here: https://blog.gruntwork.io/automated-testing-for-kubernetes-and-helm-charts-using-terratest-a4ddc4e67344 shows how the tests can be specified for 2 different scenarios: template testing and integration testing.
- Template Testing: This is used to catch syntax or logical issues in your defined helm charts. The example shown below points to an example helm chart directory and then sets the image value in the chart. It renders the template but doesn't actually deploy the pod and then confirms that the rendered template has the correct image set. After the test is run, an output is provided that displays the template and whether the test is successful or not. These tests are very quick because they do not actually involve deploying any pods.
This is fine as long as you don't want to test any functionality that depends on your chart being up-and-running.
- Integration Testing: These tests deploy the rendered template from above onto an actual Kubernetes cluster, so the inputs needed to create the actual pods must be provided in the testing script. The test installs the chart with Helm and then, once the test is finished, uninstalls the chart. See the example below:
Although the use of the go language is appealing for this kind of testing, there are some drawbacks when using this method when compared to others.
- Terratest was built to work in Terraform. It will work independently but there could be some "gotchas" here that may result in requiring some Terraform features.
- Terratest seems to do much the same thing as Chart Tests and one could argue that it is easier to use Chart Tests.
- Terratest does not provide the concurrency options that are present in Octopus.
Another tool that is being used for writing integration tests in helm/kubernetes is Kubetest - https://kubetest.readthedocs.io/en/latest/. This is a Python-based pytest plugin that aims to make it easier to write tests on Kubernetes, even allowing us to automate tests for the Kubernetes infrastructure, networking and disaster recovery. It has many interesting features:
- Simple API for common cluster interactions.
- Uses the Kubernetes Python client as the backend, allowing more complex cluster control for actions not covered by our API.
- Load Kubernetes manifest YAMLs into their Kubernetes models.
- Each test is run in its own namespace and the namespace is created and deleted automatically.
- Detailed logging to help debug error cases.
- Wait functions for object readiness, deletion, and test conditions.
- Allows you to search container logs for expected log output.
- RBAC permissions can be set at a test-case granularity using pytest markers.
Although the documentation only gives examples for testing kubernetes directly, there are examples of it being used for helm too - https://github.com/omerlh/helm-chart-tests-demo.
Writing the tests looks straight-forward and documentation is good. However, the solution seems similar to Chart Tests and also does not support running concurrent integration tests.
Spring Contract Testing
Contract testing is a way to ensure that services (an API/messaging provider, a client etc.) can communicate with each other.
- Contract - An API agreement between a producer and a consumer, capturing the expected mutual interactions.
- Producer - Service that exposes an API that provides data the client/consumer needs.
- Consumer - A client that wants to receive some data from the service/producer.
Contract testing allows the producer to provide the contract to the consumer via generated stubs that can be used by a consumer for ensuring they interact with the service in the expected way.
The purpose of contract testing is not to assert functionality but to check whether the producer and consumer can communicate as specified by the contracts in place.
It is not a replacement for other types of testing that assert functionality, such as application and product testing.
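The producer side of this idea can be sketched without any framework: a contract is a recorded request together with the response the consumer expects, and verification replays the request against the producer. The sketch below is not Spring Cloud Contract; the contract layout, the handler and the endpoint behaviour are assumptions made for the illustration.

```python
# Hypothetical contract: one expected request/response pair.
CONTRACT = {
    "request": {"method": "GET", "path": "/validate/prime-number", "query": {"number": "2"}},
    "response": {"status": 200, "body": "Even"},
}


def producer_handler(method, path, query):
    """Stand-in for the producer's REST controller (assumed behaviour)."""
    if method == "GET" and path == "/validate/prime-number":
        body = "Even" if int(query["number"]) % 2 == 0 else "Odd"
        return 200, body
    return 404, "Not Found"


def verify_producer(contract, handler):
    """Replay the contract's request and compare the reply with the contract."""
    request, expected = contract["request"], contract["response"]
    status, body = handler(request["method"], request["path"], request["query"])
    return status == expected["status"] and body == expected["body"]


contract_holds = verify_producer(CONTRACT, producer_handler)
print(contract_holds)
```

Note that the check only covers the agreed interaction; whether the producer's logic is correct for other inputs is left to the other kinds of testing.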
There is an example REST controller that has one endpoint, "/validate/prime-number", defined on the producer side.
The spring-cloud-starter-contract-verifier dependency needs to be added to the pom.xml as well as the spring-cloud-contract-maven-plugin plugin.
A base for the test classes to be defined has to be provided to the plugin as shown above. The base class itself is shown below.
The contract itself can then be added in the default location of "/src/test/resources/contracts/" - this is configurable.
When we run the build, the plugin automatically generates a test class named ContractVerifierTest that extends our BaseTestClass and puts it in "/target/generated-test-sources/contracts/".
The names of the test methods are derived from the prefix “validate_” concatenated with the names of our Groovy test stubs. For the above Groovy file, the generated method name will be “validate_shouldReturnEvenWhenRequestParamIsEven”.
Once the implementation and the test base class are in place, the tests pass, and both the application and the stub artifacts are built and installed in the local Maven repository. You can now merge the changes, and you can publish both the application and the stub artifacts in an online repository.
The consumer side of our consumer-driven contract (CDC) will consume stubs generated by the producer side through HTTP interaction to maintain the contract, so any changes on the producer side would break the contract.
We'll add BasicMathController, which will make an HTTP request to get the response from the generated stubs:
For our consumer, we'll need to add the spring-cloud-contract-wiremock and spring-cloud-contract-stub-runner dependencies:
Now it's time to configure our stub runner, which will inform our consumer of the available stubs in our local (or potentially remote) Maven repository:
If we make any changes on the producer side that directly impact the contract without updating the consumer side, this can result in contract failure.
For example, suppose we change the EvenOddController request URI to /validate/change/prime-number on our producer side.
If we fail to inform our consumer of this change, the consumer will still send its request to the /validate/prime-number URI, and the consumer side test cases will throw org.springframework.web.client.HttpClientErrorException: 404 Not Found.
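The failure described above can be reproduced in miniature. In the sketch below a stub is built from a contract, much as the stub runner serves stubs generated by the producer, and the consumer keeps calling the old URI after the producer's contract has changed. None of this is Spring code: the contract layout and the function names are hypothetical, and a plain status code stands in for the HttpClientErrorException.

```python
def make_stub(contract):
    """Build a stub that answers only the request recorded in the contract."""
    def stub(method, path):
        request = contract["request"]
        if (method, path) == (request["method"], request["path"]):
            return contract["response"]["status"]
        return 404  # anything not in the contract is unknown to the stub
    return stub


def consumer_call(stub):
    """The consumer still uses the URI it was originally written against."""
    return stub("GET", "/validate/prime-number")


old_contract = {
    "request": {"method": "GET", "path": "/validate/prime-number"},
    "response": {"status": 200},
}
new_contract = {
    "request": {"method": "GET", "path": "/validate/change/prime-number"},
    "response": {"status": 200},
}

status_before = consumer_call(make_stub(old_contract))
status_after = consumer_call(make_stub(new_contract))
print(status_before, status_after)
```

Because the consumer's tests run against stubs derived from the producer's contract, the mismatch shows up in the consumer's build rather than in a deployed environment.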