Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

Table of Contents

Summary

A new Cps Path query algorithm is being proposed. The computational complexity was measured for existing and proposed solutions, and is found to be:

OperationExisting solutionProposed solution
Query 1 out of NO(N)O(1)
Query N out of NO(N2)

O(N)

Based on the data obtained from testing both solutions, performance can be predicted for larger datasets. Given a larger database, say 100 times larger than that used in the study, comprising approximately 25 million datanodes, the following performance is to be expected:

OperationExpected time for existing solutionExpected time for proposed solution
Query 1 out of N10 minutes10 seconds
Query N out of N40 days2 minutes

Background

The current implementation of Cps Path queries relies on regular expressions in the generated SQL queries.

...

Note: using the proposed query solution with without the existing recursive descendant search results in no performance improvement; conversely, using the above recursive SQL descendant search with the existing regex-based solution results in degraded performancethe converse is also true. Only when the two new queries are combined that optimal performance is reached (the two queries being path-component lookup and recursive SQL descendant search) that optimal performance is reached.

Proof of Concept

A PoC was developed so that performance could be compared against existing Cps Path query algorithms.

...