Table of Contents

Summary

This is an assessment of current performance of CPS operations, as of August 2023.

...

Created device nodes	1	100	200	300	400	500	600	700	800	9001000	1,000
Fragments created	86	8,600	17,200	25,800	34,400	43,000	51,600	60,200	68,800	77,400	86,000
Time (seconds)	0.30	2.36	4.36	7.15	9.76	11.50	14.77	18.43	19.79	22.16	26.54
Fragments per second	86287

Image Removed

Observations

3,644

3,945

3,608

3,525

3,739

3,494

3,266

3,477

3493

3,240

Graph of time taken to store device nodes

Image Added

Observations

Storing data nodes has linear time complexity (as expected).
Raw performance is roughly 3000 3,500 fragments per second for the given test setup.
Performance can be improved by enabling write batching (CPS-1795)
There are edge cases with exponential complexity (adding books to the bookstore model).

...

In this scenario, 1,000 Open ROADM device nodes are already defined. A number of these existing data nodes will be updated using CpsDataService::updateDataNodeAndDescendants.

12.86

Updated device nodes	1	100	200	300	400	500	600	700	800	900	1000	1,000
Fragments updated	86	8,600	17,200	25,800	34,400	43,000	51,600	60,200	68,800	77,400	86,000
Time (seconds)	0.22	12.	Time (seconds)	0.22	79	28.38	44.23	51.55	69.46	85.67	95.02	109.16	117.00	131.15
Fragments	391	672	606

...

583

Observations

667

619

602

634

630

662

656

Graph of time taken to update device nodes and descendants

Image Added

Observations

Updating data nodes Updating data nodes has linear time complexity (as expected).
Raw performance is roughly 600 fragments per second for the given model and test setup.
Updating data nodes is 5 times slower than storing data nodes.

...

This is the weak spot of all write operations. A custom algorithm comparing existing data against updated data (delta algorithm) may be required to improve performance here. This could also indicate that Hibernate is not being used effectively.

TODO mention json alphabetical sorting

Updating data leaves

There are a number of issues possibly affecting performance here. For example, after DataNodes are converted to FragmentEntity objects, they are being persisted using the Hibernate persist method, which will always write the changes, regardless of whether the data changes. There is an alternative method called merge which compares existing data to updated data. Even if this were implemented, there are additional issues, such as JSON encoding giving inconsistent ordering. For example, data leaves could be encoded as "{'x':'1', 'y':'2'}" or "{'y':'2','x':'1'}", depending on exact object type for storing leaves (HashMap, LinkedHashMap, ImmutableMap, etc.). There is an option for JsonObjectMapper to order the keys alphabetically during encoding.

Updating data leaves

In this scenario, 1,000 Open ROADM device nodes are already defined. The data leaves of a number of these In this scenario, 1,000 Open ROADM device nodes are already defined. The data leaves of a number of these existing data nodes will be updated using CpsDataService::updateNodeLeaves.

...

Code Block

language	text

{
    'openroadm-device': [
        {'device-id':'C201-7-1A-1', 'status':'fail', 'ne-state':'jeopardy'}
    ]
}

Test Results

Updated device nodes10000.56

Updated device nodes	1	100	200	300	400	500	600	700	800	900	1,000
Fragments updated		1	100	200	300	400	500	600	700	800	900	1,000
Time (seconds)	0.20	0.27	0.28	0.28	0.32	0.38	0.39	0.47	0.49	0.52		0.38	0.39	0.47	0.49	0.52	0.56
Fragments per second	5	370	714	1,071	1,250	1,316	1,539	1,489	1,633	1,731	1,786

Graph of updating data leaves of device nodes

Observations

Updating data leaves has linear time complexity.
Raw performance is about 3000 around 1,500 fragments per second.
This is very fast compared to updating whole data nodes, and should be preferred where possible.

...

In this scenario, 300 Open ROADM device nodes are already defined. A number of these data nodes will be deleted using CpsDataService::deleteDataNodes. The types of nodes will be varied, for example, deleting container nodes, list elements, or whole lists.

Test results

N =	50	100	150	200	250	300	Example xpath
Delete top-level container node	-	-	-	-	-	0.63	/openroadm-devices
Batch delete N/300 container nodes	0.15	0.26	0.38	0.45	0.55	0.69	/openroadm-devices/openroadm-device[@device-id='C201-7-1A-10']/org-openroadm-device
Batch delete N/300 lists elements	0.13	0.25	0.34	0.45	0.55	0.67	/openroadm-devices/openroadm-device[@device-id='C201-7-1A-49']
Batch delete N/300 whole lists	0.51	1.05	1.40	1.85	2.13	2.56	/openroadm-devices/openroadm-device[@device-id='C201-7-1A-293']/org-openroadm-device/degree
Try batch delete N/300 non-existing	0.25	0.54	0.67	0.95	1.15	1.32	/path/to/non-existing/node[@id='27']

...

Delete performance is linear on the amount of data being deleted (as expected).
Raw performance of deleting container nodes is around 4035,000 fragments per second. (So we can delete data nodes more than 10x faster than creating them.)
Deleting lists is much slower than deleting the parent container of the list (this can be improved).
Of note, attempting to delete non-existing data nodes takes longer than actually deleting the equivalent amount of nodes with descendants - it is a slow operation.

...

Reading the top-level container node with no descendants:node with no descendants:

3000

Total device nodes	500	1,000	1,500	2,000	2,500	3,000
Fragments read	1	1	1	1	1	1	Total device nodes	500	1000	1500	2000	2500
Time (milliseconds)	47	52	48	56	48	47

The above data clearly indicates constant time.

Reading the top-level container node with all descendants:

30002.67

Total device nodes	500	1,000	1,500	1000	1500	2000	2500
Fragments read	43,000	86,000	129,000	172,000	215,000	258,000
Time (seconds)	0.42	1.19	1.54	2.16	2.53		2.53	2.67
Fragments per second	102,381	72,269	83,766	79,630	85,657	96,629

Graph of time taken to read top-level container node with all descendants

Observations

Reading a single top-level container node with no descendants has constant time (as expected).
Reading a single top-level container node with all descendants has linear time (as expected).
Raw performance of reading with all descendants is roughly 100,000 fragments per second.

...

Test results

3000

Total device nodes	500	1,000	1,500	2,000	2,500	1000	1500	2000	2500		3,000
Fragments read	43,000	86,000	129,000	172,000	215,000	258,000
Time (seconds)	0.61	1.15	1.52	2.14	2.96	3.97

Special case: attempting to read multiple non-existing data nodes

In this case, we attempt to read many non-existing data nodes:

3000

Total devices nodes	500	1,000	1,500	2,000	2,500	1000	1500	2000	2500	3,000
Fragments read	0	0	0	0	0	0
Time (milliseconds)	10	10	9	9	7	8

The above case appears to be constant time, but theoretically must be linear - we'll call it negligible time.

...

Recently, functionality was added to enable reading whole lists (CPS-1696). Here we compare performance of reading a container node containing a list, versus reading the list (with all descendants).

3000

Total device nodes	500	1000	1500	2000	2500	1,000	1,500	2,000	2,500	3,000	xpath
Reading container	0.386	0.712	1.529	2.667	1.759	3.112	/openroadm-devices
Reading list	0.585	1.335	2.036	2.860	2.769	3.949	/openroadm-devices/openroadm-device

...

CM-handle registration

Release	DateCmHandles	CM-Handles removed	100	500	1000	2000	5000	10000	20000	1,000	2,000
Kohn	Oct 2022	Time taken	8 sec	16 sec	19 sec	33 sec	1 min	3 min	ERROR	ERROR	Error due to DB running out shared memory
London	May 2023	Time taken	6 sec	7 sec	12 sec	22 sec	1 min	2 3 min	10 min	32 min
current	Aug 2023	Time taken	6 sec	7 sec	10 sec	16 sec	31 sec	57 sec	71 sec	108 sec

...

CM-handle de-registration

Release	DateCmHandles	CM-Handles removed	100	500	1000	2000	5000	10000	20000	1,000	2,000
Kohn	Oct 2022	Time taken	7 sec	2 min	7 min	25 min	2.5 hour	est: 10 hour	est: 2 days	est: 7 days	Some values estimated due to time constraints
London	May 2023	Time taken	< 1 sec	2 sec	3 sec	5 sec	17 sec	37 sec	85 sec	ERROR	Error due to 32K limit
current	Aug 2023	Time taken	< 1 sec	1 sec	3 sec	4 sec	14 sec	23 sec	65 sec	2 min

Current release has exactly linear performance for CM-handle de-registration (on a single thread):

CM-handles Removed	Time (sec)	CM-handles/sec
500	1.5	327
1000	2.6	377
5000	13.2	377
10000	25.9	385
20000	56.1	356

Commentary

CM-handle de-registration is a synchronous operation. Since the above tests send a single large batch, the operation is performed in a single thread. The linear performance reflects this. If the operation was broken into many smaller requests, the expectation is that it would complete many times faster - but this has not been tested.

Space shortcuts

Page tree

Versions Compared

Old Version 34

New Version 35

Key

Summary

Updating data leaves

Updating data leaves

CM-handle registration

CM-handle de-registration

Space shortcuts

Page tree

Page History

Versions Compared

Old Version 34

New Version 35

Key

Summary

Updating data leaves

Updating data leaves

CM-handle registration

CM-handle de-registration