...
- For a query that will return 1 out of N items, the current time complexity is linear, O(N).
- For a query that will return all N items, the current time complexity is quadratic, O(N2).
Operation | Existing solution | Proposed solution |
---|---|---|
Query 1 out of N | O(N) | O(1) |
Query all out of N | O(N2) | O(N) |
Proposal
The new approach involves adding a column to the Fragment table, storing the last path component (called xpath_component here). The new column is indexed, to allow constant-time lookups.
...
In above cases, it is observed that the existing algorithm grows quadratically - O(n2), while the PoC algorithm grows linearly - O(n).
Possible improvements of proposed solution
Index-only lookup where leaf-condition is in the xpath
Note in the following CPS path query:
/bookstore/categories[@code='1']/books[@title='Matilda']
In this case, the bookstore model specifies that the 'title' leaf is the key for 'books', and thus the xpath is encoded in the database as: /bookstore/categories[@code='1']/books[@title='Matilda']
We can optimize for this case, using an index-only lookup.
Given the following SQL snippet from the SQL generated from the CPS path query:
Code Block | ||
---|---|---|
| ||
AND (
xpath_component = 'books' OR xpath_component LIKE 'books[%'
)
AND (
attributes @> '{"title":"Matilda"}'
) |
it may be replaced with:
Code Block | ||
---|---|---|
| ||
AND (
xpath_component = 'books[@title=''Matilda'']'
OR (
(
xpath_component = 'books' OR xpath_component LIKE 'books[%'
)
AND (
attributes @> '{"title":"Matilda"}'
)
)
) |
This would allow for an index-only lookup of the item, without needing the LIKE comparison or the attribute check.
Work Breakdown
In addition to the changes outlined above, there is additional work remaining for this change to be production-ready.
...