...
The use of regex potentially results in full table scans, severely impacting performance, and causing query time duration to grow linearly with the fragment table size (i.e. queries get slower as the database gets bigger).
A new algorithm for queries is being proposed, avoiding regular expressions, so that query duration is independent of database size.
Path queries using path-component lookup
...
Code Block | ||
---|---|---|
| ||
SELECT * FROM fragment WHERE parent_id IN ( SELECT id FROM fragment WHERE xpath_component = 'categories[@code=''1'']' AND parent_id = ( SELECT id FROM fragment ` WHERE xpath_component = 'bookstore' AND anchor_id = 3 AND parent_id IS NULL ) ) AND ( xpath_component = 'books' OR xpath_component LIKE 'books[%' ) AND ( attributes @> '{"title":"Matilda"}' ) |
Comparison with existing solution
Given the following (incomplete) fragment table:
id | parent_id | anchor_id | xpath |
---|---|---|---|
1 | NULL | 3 | /bookstore |
2 | 1 | 3 | /bookstore/categories[@code='1'] |
3 | 2 | 3 | /bookstore/categories[@code='1']/books[@title='Matilda'] |
4 | 2 | 3 | /bookstore/categories[@code='1']/books[@title='The Gruffalo'] |
5 | 1 | 3 | /bookstore/categories[@code='2'] |
6 | 5 | 3 | /bookstore/categories[@code='2']/books[@title='Good Omens'] |
7 | 5 | 3 | /bookstore/categories[@code='2']/books[@title='The Colour of Magic'] |
8 | NULL | 4 | /dmi-registry |
9 | 8 | 4 | /dmi-registry/cm-handles[@id='d99ef4cc8d744a749490cceefd4c469e'] |
Here is a visualization of both algorithms:
Existing solution | Proposed solution |
---|---|
Because the existing solution uses a regex, all fragments in anchor 3 need to be examined. | Because the proposed solution uses sub-queries to look up each path component, only relevant fragments are examined. |
A note on fetching descendant nodes
...