...
Patch | Devices | E2E duration (s) | Fragment Query duration (s) | Service duration (s) | Object Size (MB) | Object Size #Fragments | Graph |
---|---|---|---|---|---|---|---|
1) Baseline | 1,000 | 11.8 | <0.1 * | 11.74012 | 0.3 | 86,000 | |
2,000 | 28.5 | <0.1 * | 28.401 | 0.7 | 172,000 | ||
5,000 | 87.0 | <0.1 * | 86.814 | 1.7 | 430,000 | ||
10,000 | 201.0 | <0.1 * | 201.008 | 3.3 | 860,000 | ||
2) | 1,000 | 0.5 | 0.2 | 0.3 | 0.3 | 86,000 | |
2,000 | 1.0 | 0.4 | 0.6 | 0.7 | 172,000 | ||
5,000 | 2.5 | 1.1 | 1.4 | 1.7 | 430,000 | ||
10,000 | 7.0 | 2.9 | 4.0 | 3.3 | 860,000 | ||
1,000 | 3.0 | 1.3 | 1.7 | 0.3 | 86,000 | ||
2,000 | 5.5 | 2.3 | 3.2 | 0.7 | 172,000 | ||
5,000 | 11.0 | 5.4 | 5.6 | 1.7 | 430,000 | ||
10,000 | 25.4 | 11.7 | 13.6 | 3.3 | 860,000 |
...
Query: cps/api/v1/dataspaces/openroadm/anchors/owb-msa221-anchor/node?xpath=/openroadm-devices/openroadm-device[@device-id='C201-7-13A-5A1']&include-descendants=true
Patch
...
: https://gerrit.onap.org/r/c/cps/+/133511/12
Threads | E2E duration (s) | Success Ratio | Fragment Query duration (s) |
---|---|---|---|
1 | 0.082 | 100% | 0.2 |
2 | 0.091 | 100% | 0.1 |
3 | 0.120 | 100% | 0.1 |
5 | 0.3 | 100% | 0.2 |
10 | 0.3 | 99.9% | 0.3 |
20 | 0.5 | 99.5% | 0.5 |
50 | 1.0 | 99.4% | 1.0 |
100 | 2.3 | 99.7% | 2.3 |
200 | 7.6 | 99.7% | 6.2 |
500 | 17.1 | 41.4% | 13.8 |
1,000 | 15.3 (many connection errors) | 26.0% | 11.9 |
Graphs:
- Average E2E Execution Time
- Internal Method Counts (total)
Observations
- From 10 parallel requests onwards (each consisting of 10 sequential requests), the client cannot always connect and we see timeout errors (success ratio <100%).
- Sequential requests are fired faster than the responses come back, so from the DB perspective they are almost parallel requests as well.
- The database probably already becomes the bottleneck with 2 threads, which effectively fire a total of 20 calls very quickly. It is known that the DB connection pool/internals slow down from 12 or more 'parallel' requests.
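The success ratio reported in the table can be derived from per-request HTTP status codes. The following is a minimal sketch only, not the method used by parallelGetRequestTest.sh (which is not shown on this page). `success_ratio` is a hypothetical helper that reads one status code per line, as could be produced with curl's `--write-out '%{http_code}\n'` option, and counts everything other than 200 as a failure. curl reports `000` when no HTTP response was received, for example on a connection error or timeout.

```shell
#!/usr/bin/env bash

# Hypothetical helper: read one HTTP status code per line on stdin and
# print the percentage of requests that returned 200.
success_ratio() {
    LC_ALL=C awk '
        { total++; if ($1 == 200) ok++ }
        END { if (total) printf "%.1f%%\n", 100 * ok / total }
    '
}

# Example input: three successful requests and one connection failure (000).
printf '200\n200\n000\n200\n' | success_ratio
```

In a real run, the status codes would come from the curl invocations themselves rather than from a fixed list.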
Graphs:
- Average E2E Execution Time
- Internal Method Counts (total)
Observations:
...
Get 1000 nodes in Parallel with varying thread count
In this test, 1000 requests are sent using curl with a varying thread count (the maximum number of concurrent transfers, set with the --parallel-max option).
```shell
# Print a header, then time 1000 parallel GET requests for each thread count.
echo -e "Threads\tTime"
for threads in 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 20 30 40 50; do
    echo -n -e "$threads\t"
    # The URL globs [1-25] and [1-40] expand to 1000 requests; the literal
    # XPath brackets are backslash-escaped so curl does not treat them as globs.
    /usr/bin/time -f "%e" curl --silent --output /dev/null --fail --show-error \
        --header "Authorization: Basic Y3BzdXNlcjpjcHNyMGNrcyE=" \
        --get "http://localhost:8883/cps/api/v1/dataspaces/openroadm/anchors/owb-msa221-anchor/node?xpath=/openroadm-devices/openroadm-device\[@device-id='C201-7-[1-25]A-[1-40]A1'\]&include-descendants=true" \
        --parallel --parallel-max "$threads" --parallel-immediate
done
```
Note that the above curl command performs 1000 requests (25 × 40 combinations). It relies on URL globbing: curl allows ranges such as [1-25] in the URL, for example:
http://example.com/archive[1996-1999]/vol[1-4].html
which would expand into a series of 16 requests to:
- http://example.com/archive1996/vol1.html
- http://example.com/archive1996/vol2.html
- ...
- http://example.com/archive1999/vol4.html
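To make the request count concrete, the two ranges in the test URL can be enumerated explicitly. This is an illustrative sketch: `expand_device_ids` is a hypothetical helper, not part of the test scripts, and the order in which curl itself issues the expanded requests is not assumed here.

```shell
#!/usr/bin/env bash

# Hypothetical helper: list every device ID covered by the glob
# C201-7-[1-25]A-[1-40]A1 used in the test URL.
expand_device_ids() {
    local a b
    for a in $(seq 1 25); do
        for b in $(seq 1 40); do
            echo "C201-7-${a}A-${b}A1"
        done
    done
}

# Count the generated IDs: 25 values x 40 values.
expand_device_ids | wc -l
```

The count confirms that a single curl invocation with these two ranges covers 1000 distinct device IDs.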
Results
Threads | Time (s) | Speedup | Comments |
---|---|---|---|
1 | 140.4 | 1.0 | |
2 | 71.6 | 2.0 | 2 threads is 2x faster than 1 thread |
3 | 48.5 | 2.9 | |
4 | 37.2 | 3.8 | |
5 | 31.0 | 4.5 | |
6 | 26.6 | 5.3 | |
7 | 23.8 | 5.9 | |
8 | 21.6 | 6.5 | |
9 | 20.0 | 7.0 | |
10 | 18.7 | 7.5 | 10 threads is 7.5x faster than 1 thread |
11 | 17.7 | 7.9 | |
12 | 16.8 | 8.4 | There are exactly 12 CPU cores (logical) on test machine |
13 | 16.7 | 8.4 | |
14 | 16.7 | 8.4 | |
15 | 16.8 | 8.4 | |
20 | 16.8 | 8.4 | |
30 | 16.7 | 8.4 | |
40 | 16.8 | 8.4 | |
50 | 16.7 | 8.4 |
Graphs
Observations
- There were no failures during the tests (e.g. timeouts or refused connections).
- Performance increases nearly linearly with thread count at first (3.8x at 4 threads), then with diminishing returns up to the number of CPU cores (8.4x at 12 threads).
- Performance stops increasing once the number of threads reaches the number of CPU cores (expected).
- Verbose statistics show that each individual request takes around 0.14 seconds, regardless of thread count (with multiple CPU cores, requests are genuinely processed in parallel).
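The Speedup column in the results table is the single-thread time divided by the time measured for a given thread count. A minimal sketch of that arithmetic follows; `speedup` and `avg_request_time` are hypothetical helper names introduced here for illustration, and the sample values are taken from the results table.

```shell
#!/usr/bin/env bash

# Hypothetical helper: speedup = time with 1 thread / time with N threads.
speedup() {
    LC_ALL=C awk -v t1="$1" -v tn="$2" 'BEGIN { printf "%.1f\n", t1 / tn }'
}

# Hypothetical helper: average wall-clock seconds per request for a run.
avg_request_time() {
    LC_ALL=C awk -v total="$1" -v n="$2" 'BEGIN { printf "%.2f\n", total / n }'
}

speedup 140.4 18.7            # 1 thread vs 10 threads
speedup 140.4 16.8            # 1 thread vs 12 threads
avg_request_time 140.4 1000   # single-thread run, 1000 requests
```

The last call reproduces the per-request figure of around 0.14 seconds from the single-thread run, where requests are executed one after another.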
Data sheets
Test scripts overview
- performanceTest.sh
Get a single large object 1000 times from thousands of devices (1000, 2000, ..., 10000) and create a metric after each run
- performanceRootTest.sh
Get the whole data tree as one object 10 times from thousands of devices (1000, 2000, ..., 10000) and create a metric after each run
- parallelGetRequestTest.sh
Get one device in parallel from a database with 10000 devices, executed 10 times sequentially
- buildup.sh
Create the dataspace, create the schemaset, create the anchor and create the root node
- owb-msa221.zip
The schemaset for the tests
- outNode.json
The input for the root node creation
- createThousandNode.sh
Helper script for the database creation
- innerNode.json
The input for the sub node creation
- createMetric.sh
Helper script for metric creation