Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

The use of regex potentially results in full table scans, severely impacting performance, and causing query time duration to grow linearly with the fragment table size (i.e. queries get slower as the database gets bigger).

A new algorithm for queries is being proposed, avoiding regular expressions, so that query duration is independent of database size.


Path queries using path-component lookup

...

Code Block
languagesql
SELECT
    * 
FROM
    fragment 
WHERE
    parent_id IN (
        SELECT
            id 
        FROM
            fragment 
        WHERE
            xpath_component = 'categories[@code=''1'']'
            AND parent_id = (
                SELECT
                    id 
                FROM
                    fragment 
`                WHERE
                    xpath_component = 'bookstore'
                    AND anchor_id = 3
                    AND parent_id IS NULL
            )
        ) 
        AND (
            xpath_component = 'books'
            OR xpath_component LIKE 'books[%'
        ) 
        AND (
            attributes @> '{"title":"Matilda"}'
        )

Comparison with existing solution

Given the following (incomplete) fragment table:

idparent_idanchor_idxpath
1NULL3/bookstore
213/bookstore/categories[@code='1']
323/bookstore/categories[@code='1']/books[@title='Matilda']
423/bookstore/categories[@code='1']/books[@title='The Gruffalo']
513/bookstore/categories[@code='2']
653/bookstore/categories[@code='2']/books[@title='Good Omens']
753/bookstore/categories[@code='2']/books[@title='The Colour of Magic']
8NULL4/dmi-registry
984/dmi-registry/cm-handles[@id='d99ef4cc8d744a749490cceefd4c469e']

Here is a visualization of both algorithms:

Existing solutionProposed solution

Image Added

Image Added

Because the existing solution uses a regex,
all fragments in anchor 3 need to be examined.
Because the proposed solution uses sub-queries to look up
each path component, only relevant fragments are examined.


A note on fetching descendant nodes

...