Table of Contents |
---|
Summary
A new Cps Path query algorithm is being proposed. The computational complexity was measured for existing and proposed solutions, and is found to be:
Operation | Existing solution | Proposed solution |
---|---|---|
Query 1 out of N | O(N) | O(1) |
Query N out of N | O(N2) | O(N) |
Based on the data obtained from testing both solutions, performance can be predicted for larger datasets. Given a larger database, say 100 times larger than that used in the study, comprising approximately 25 million datanodes, the following performance is to be expected:
Operation | Expected time for existing solution | Expected time for proposed solution |
---|---|---|
Query 1 out of N | 10 minutes | 10 seconds |
Query N out of N | 40 days | 2 minutes |
Background
The current implementation of Cps Path queries relies on regular expressions in the generated SQL queries.
...
Note: using the proposed query solution with without the existing recursive descendant search results in no performance improvement; conversely, using the above recursive SQL descendant search with the existing regex-based solution results in degraded performancethe converse is also true. Only when the two new queries are combined that optimal performance is reached (the two queries being path-component lookup and recursive SQL descendant search) that optimal performance is reached.
Proof of Concept
A PoC was developed so that performance could be compared against existing Cps Path query algorithms.
...