...

Presently, only case 1 returns data (because 'code' is the list key for 'categories'); cases 2 and 3 return nothing. Because the proposed solution uses an SQL sub-query for each path component, the query generator can be adapted to support cases 2 and 3 with little expected performance impact. Note that this feature would require storing the container/list name as an indexed field, as described in the previous section.

Other operations can be accelerated

...

Besides the work done in the PoC implementation, additional work is needed for this change to be production-ready. The main algorithm is mostly complete in the PoC (all integration tests pass), so the existing PoC code can be refactored to make it production-ready.

Fetch descendants

The proposed solution can broadly be broken into two major changes:

  1. Cps Path query using path-component look-up
  2. Using recursive SQL to fetch descendants of nodes returned from 1.

I propose that the recursive SQL to fetch descendants be implemented first, as an independent change, because it moves worst-case complexity from O(N²) to O(N). By contrast, the query using path-component look-up will improve best-case performance from O(N) to O(1).
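
As an illustration of the second change, the following is a minimal sketch of what a recursive descendants query could look like, held as a Java string. The table name `fragment`, the columns `id`, `parent_id` and `xpath`, and the `:ids` placeholder are assumptions made for this example and may not match the actual schema or the PoC query.

```java
final class DescendantQuerySketch {

    // Builds a recursive common table expression (PostgreSQL-style syntax).
    // All table, column and parameter names here are hypothetical.
    static String descendantsSql() {
        return String.join("\n",
            "WITH RECURSIVE descendants AS (",
            // Anchor member: the nodes returned by the Cps Path query.
            "    SELECT id, parent_id, xpath FROM fragment WHERE id IN (:ids)",
            "    UNION ALL",
            // Recursive member: children of any row found so far.
            "    SELECT f.id, f.parent_id, f.xpath",
            "    FROM fragment f JOIN descendants d ON f.parent_id = d.id",
            ")",
            "SELECT id, parent_id, xpath FROM descendants");
    }

    public static void main(String[] args) {
        System.out.println(descendantsSql());
    }
}
```

With a query of this shape, the database walks the tree itself, so the application issues one statement for all descendants rather than one per level or per node.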

Cps Path Parser changes

The PoC uses String.split to parse path components, which means that paths containing '/' in the leaf condition, such as //books[@title='GNU/Linux'], are not parsed correctly. The CpsPathBuilder and CpsPathQuery classes from the cps-path-parser module will need to be updated to provide the individual path components (in normalized form).
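
To make the parsing problem concrete, here is a small sketch of a quote-aware splitter. It is not the cps-path-parser implementation: the class `PathComponentSplitter` is hypothetical, it drops the distinction between '/' and '//', and it does no normalization. It only shows that a '/' inside a quoted value must not be treated as a separator.

```java
import java.util.ArrayList;
import java.util.List;

final class PathComponentSplitter {

    // Splits a path on '/' only when the '/' is outside a quoted value.
    static List<String> split(String cpsPath) {
        List<String> components = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        char quote = 0; // 0 means "not inside quotes"
        for (int i = 0; i < cpsPath.length(); i++) {
            char c = cpsPath.charAt(i);
            if (quote != 0) {
                // Inside quotes: keep everything, close on the matching quote.
                if (c == quote) {
                    quote = 0;
                }
                current.append(c);
            } else if (c == '\'' || c == '"') {
                quote = c;
                current.append(c);
            } else if (c == '/') {
                // Separator outside quotes: finish the current component.
                if (current.length() > 0) {
                    components.add(current.toString());
                    current.setLength(0);
                }
            } else {
                current.append(c);
            }
        }
        if (current.length() > 0) {
            components.add(current.toString());
        }
        return components;
    }

    public static void main(String[] args) {
        System.out.println(split("//books[@title='GNU/Linux']"));
    }
}
```

For //books[@title='GNU/Linux'], this sketch yields the single component books[@title='GNU/Linux'], whereas String.split("/") breaks the leaf condition into books[@title='GNU and Linux'].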

...