...
id | parent_id | anchor_id | xpath | xpath_component |
---|---|---|---|---|
1 | NULL | 3 | /bookstore | bookstore |
2 | 1 | 3 | /bookstore/categories[@code='1'] | categories[@code='1'] |
3 | 2 | 3 | /bookstore/categories[@code='1']/books[@title='Matilda'] | books[@title='Matilda'] |
4 | 2 | 3 | /bookstore/categories[@code='1']/books[@title='The Gruffalo'] | books[@title='The Gruffalo'] |
For example, given this CPS Path:
/bookstore/categories[@code="1"]/books[@title="Matilda"]
The new approach will first look for "bookstore", and using that as the parent IDnode, look for ''categories[@code='1']", and using that as parent ID, look for "books" or an xpath component starting with "books[", before finally applying leaf condition checks.
For example, given this Cps Path:
/bookstore/categories[@code="1"]/books[@title="Matilda"]
the following SQL query is generated:
Given the above CPS Path, the following SQL query is generated:
Code Block | ||
---|---|---|
| ||
Code Block | ||
| ||
SELECT * FROM fragment WHERE parent_id IN ( SELECT id FROM fragment WHERE xpath_component = 'categories[@code=''1'']' AND parent_id = ( SELECT id FROM fragment WHERE xpath_component = 'bookstore' AND anchor_id = 3 AND parent_id IS NULL ) ) AND ( xpath_component = 'books' OR xpath_component LIKE 'books[%' ) AND ( attributes @> '{"title":"Matilda"}' ) |
Design decision: should xpath_component contain the leaf key?
Given this Cps Patha data node with this unique path:
/bookstore/categories[@code="1"]/books[@title="Matilda"]
...
- Storing only 'books' will allow for removing the 'LIKE' operator from the SQL query, potentially speeding queries in the general case.
With further development, this could allow for GET operation (getDataNodes) to return whole lists, using an index-only lookup. (Though However, given how fast the proposed query solution is, the existence of the GET operation is questionable.) - Storing "books[@title='Matilda']" will allow for optimization of queries where the leaf-condition references the key leaf (as defined in the Yang model), thus skipping both the LIKE operator and the attribute check, but only in this specific case. See section "Index-only lookup where leaf-condition is in the xpath" of this document.
A note on fetching descendant nodesA note on fetching descendant nodes
The above sections only partly describe the current and proposed solutions. In both cases, a CPS path query is actually sent as two database queries:
- The first SQL query performs the CPS path query.
- The second SQL query fetches the descendantsdescendant nodes, if required.
- Alternatively, if an ancestor CPS path was provided, the second SQL query would fetch the ancestors (and their descendants if required).
The current implementation uses a SQL query using a LIKE operator to fetch descendants, while the proposed solution will use recursive SQL to fetch descendants of fragment returned from the CPS path query:
Code Block | ||
---|---|---|
| ||
WITH RECURSIVE descendant_search AS ( SELECT id, 0 AS depth FROM fragment WHERE id IN (:fragmentIds) UNION SELECT child.id, depth + 1 FROM fragment child INNER JOIN descendant_search ds ON child.parent_id = ds.id WHERE depth <= :maxDepth ) SELECT f.id, anchor_id AS anchorId, xpath, f.parent_id AS parentId, CAST(attributes AS TEXT) AS attributes FROM fragment f INNER JOIN descendant_search ds ON f.id = ds.idds.id |
Note: using the proposed query solution with the existing descendant search results in no performance improvement; conversely, using the above recursive SQL descendant search with the existing regex-based solution results in degraded performance. Only when the two new queries are combined (path-component lookup and recursive SQL descendant search) that optimal performance is reached.
Proof of Concept
A PoC was developed so that performance could be compared against existing Cps Path Query query algorithms.
Test data
These tests were performed using Groovy tests, directly calling the Java CPS-Service API (thus these tests do not include any HTTP overhead when REST APIs are used).
The test data is using the openroadm model, present in the integration-test module of CPS source code repository.
In this case, each 'device' node is comprised of 86 data nodeshas 86 fragments in the database. In these tests, 4 anchors were populated using the same data.
Performance
...
Comparison
Query one device
...
out of many (1 out of N)
In this case, a query that matches a single device node is executed, such as this for descendant CPS path:
//openroadm-device[@device-id="C201-7-1A-14"]
or this for absolute CPS path:
/openroadm-devices/openroadm-device[@device-id="C201-7-1A-19"]
Here is a summary of test results:
Case | Query one out of many using descendant cps path | Query one out of many using absolute cps path |
---|---|---|
Query | //openroadm-device[@device-id="C201-7-1A-14"] | /openroadm-devices/openroadm-device[@device-id="C201-7-1A-19"] |
Comparison graph | ||
Time complexity of | O(N) | O(N) |
Time complexity of proposed solution | O(1) | O(1) |
As seen in the graphs, query performance for current master branch is linear on the size of the database, while the PoC implementation is constant time (independent of DB database size).
(Note, I have not illustrated all different fetchDescendantOptions, as it has only minor impact in this case of fetching 1 device node.)
Query all devices using descendant cps path
...