CPS-251 [WIP] Define REST Query interface for Temporal service

Overview

CPS temporal data query will allow the user to fetch data based on multiple filtering criteria. This data can be used to create graphs and to help analytical systems make decisions.

Filtering based on data

As it is time-series data, it can be filtered based on three main criteria:

Datetime
- Data in last X hours
- Data after a particular DateTime
- Data before a particular DateTime
- Last X network data
DataTypes in CPS Core
- Dataspace & schema-set
- Dataspace & anchor
- Dataspace & multiple anchors - improves performance when data must be fetched for several anchors at once.
- Dataspace
Payload
- Based on a subsection or field in the payload. This criterion requires the schema-set to be fixed so that the search runs over the same set of anchors. The payload filter is passed in JSON format because it is more flexible and does not lead users to expect functionality similar to a cps-path query.

We will implement only the basic APIs first and will add complex ones when required.
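The payload criterion above can be illustrated with a minimal sketch. It assumes that "matching" means the filter JSON is contained in the stored payload (every key and value in the filter appears in the record); the page does not define the matching rule, and the function name `payload_matches` is hypothetical.

```python
from typing import Any


def payload_matches(data: Any, criteria: Any) -> bool:
    """Return True if `criteria` is contained in `data`.

    Assumed rule: for JSON objects, every key in the criteria must exist
    in the data and its value must match recursively; any other value
    (string, number, list, ...) must be equal.
    """
    if isinstance(criteria, dict):
        return isinstance(data, dict) and all(
            key in data and payload_matches(data[key], value)
            for key, value in criteria.items()
        )
    return data == criteria


# A stored payload with more fields than the filter still matches.
stored = {"status": "UP", "cell": {"id": 7, "name": "cell-7"}}
print(payload_matches(stored, {"status": "UP"}))        # True
print(payload_matches(stored, {"cell": {"id": 7}}))     # True
print(payload_matches(stored, {"status": "DOWN"}))      # False
```

A containment rule like this keeps the filter much simpler than a cps-path expression, at the cost of not supporting ranges or wildcards.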

Other Parameters

There are other parameters, listed below, that can have an impact on the response.

No	Name	Type	Default	Purpose	Example
1	index	int	0	Page number, starting at 0. The query output can have many rows, so it is important to limit the fetched data; index and maxSize together limit the number of records and provide pagination.	-
2	maxSize	int	1000	Page size. Used together with index (see above).	-
3	point-in-time	DateTime	CurrentDateTime	Pagination does not work well if new data gets added which fulfils the search criteria. The user must provide this value to avoid this issue.	-
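The interplay of index, maxSize and point-in-time can be sketched as follows. This is only an illustration under stated assumptions: the `Record` class, its `created_at` field (the time the row was stored), the ascending sort order, and the function `fetch_page` are all hypothetical and not part of the proposed API.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Record:
    created_at: int  # assumed: time the row was stored, as an epoch value
    payload: dict


def fetch_page(records: List[Record], point_in_time: int,
               index: int = 0, max_size: int = 1000) -> List[Record]:
    """Return page `index` (0-based) holding at most `max_size` records.

    Rows stored after `point_in_time` are ignored, so data that arrives
    between two page requests cannot shift the contents of later pages.
    """
    visible = sorted(
        (r for r in records if r.created_at <= point_in_time),
        key=lambda r: r.created_at,
    )
    start = index * max_size
    return visible[start:start + max_size]


rows = [Record(created_at=t, payload={"n": t}) for t in range(1, 6)]
first_page = fetch_page(rows, point_in_time=5, index=0, max_size=2)

# A new matching row arrives before the client asks for the last page.
rows.append(Record(created_at=6, payload={"n": 6}))
last_page = fetch_page(rows, point_in_time=5, index=2, max_size=2)
print([r.created_at for r in last_page])  # the new row (6) is not included
```

Reusing the same point-in-time value for every page is what keeps the pages stable.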

Assumptions

Temporal database stores full information in the payload (CPS-192: Design data store for Temporal Service).
It is possible that data stored for the different timestamps is the same, especially if the payload is used to filter data.
Data in the response body will always contain the entire payload data.

Proposed APIs

No	Purpose
1	Return the data entries for an anchor that were recorded after the specified DateTime and that match the payload filter.
2	Return the data entries for the provided schema-set that were recorded after the specified DateTime and that match the payload filter.

Approach 1 - GET

In the GET APIs, all the filtering parameters are passed as query parameters.

No	API endpoint	Description
1.	GET /dataspaces/{dataspace-name}/anchors/{anchor-name}?after=<epoch-time>&payload={"status":"UP"}&maxSize=1000	Return all the data entries for an anchor after the specified epoch time (in nanoseconds).
2.	GET /dataspaces/{dataspace-name}/schema-sets/{schema-set}?after=<epoch-time>&maxSize=500	Return all the data entries based on the provided schema-set after the specified epoch time (in nanoseconds).
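Because the payload filter is JSON, a client has to percent-encode it before placing it in the query string. The sketch below shows one way to build the anchor query URL with Python's standard library; the function name `build_anchor_query_url` and the parameter order are assumptions made for illustration.

```python
import json
from typing import Optional
from urllib.parse import quote, urlencode


def build_anchor_query_url(dataspace: str, anchor: str, after: int,
                           payload: Optional[dict] = None,
                           max_size: int = 1000, index: int = 0) -> str:
    """Build the relative URL for the anchor query (API 1 above)."""
    path = "/dataspaces/{}/anchors/{}".format(
        quote(dataspace, safe=""), quote(anchor, safe=""))
    params = {"after": after, "index": index, "maxSize": max_size}
    if payload is not None:
        # Compact JSON keeps the encoded URL as short as possible.
        params["payload"] = json.dumps(payload, separators=(",", ":"))
    # urlencode percent-encodes braces, quotes and colons in the JSON.
    return path + "?" + urlencode(params)


url = build_anchor_query_url(
    "my-dataspace", "my-anchor", 1616284800000000000, {"status": "UP"})
print(url)
print(len(url))  # useful when checking against a URL length limit
```

Percent-encoding roughly triples the size of each brace, quote and colon, so larger payload filters lengthen the URL quickly.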

Response Body

Name	Type	Description
nextRecordsLink	string	added only if there are remaining records to be fetched for the query.
previousRecordsLink	string	added only if it is not the first set of records.
records	list	contains one record for each timestamp that meets filtering criteria. It contains header information along with data.

{
  "nextRecordsLink": "cps-temporal/api/v1/dataspaces/{dataspace-name}/anchors/{anchor-name}?after=<epoch-time>&maxSize=1000&before=<epoch-time>&index=2&point-in-time=DATE",
  "previousRecordsLink": "cps-temporal/api/v1/dataspaces/{dataspace-name}/anchors/{anchor-name}?after=<epoch-time>&maxSize=1000&before=<epoch-time>&index=0&point-in-time=DATE",
  "records": [
    {
      "timestamp": "1234567788889",
      "dataspace": "my-dataspace",
      "schemaSet": "my-schema-set",
      "anchor": "my-anchor",
      "data": {
        "status" : "UP"
      }
    }
  ]
}
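The rules for nextRecordsLink and previousRecordsLink can be sketched as below. This is a hedged illustration, not the service implementation: the function `build_response`, its parameters, and the assumption that the total number of matching records is known are all hypothetical.

```python
from typing import Any, Dict, List


def build_response(query_url: str, page: List[Dict[str, Any]],
                   total_matching: int, index: int, max_size: int,
                   point_in_time: str) -> Dict[str, Any]:
    """Assemble the response body for one page of results.

    `query_url` is assumed to already contain the filter parameters
    (for example "...?after=<epoch-time>"); paging parameters are appended.
    """
    def page_link(page_index: int) -> str:
        return (f"{query_url}&maxSize={max_size}"
                f"&index={page_index}&point-in-time={point_in_time}")

    body: Dict[str, Any] = {}
    if (index + 1) * max_size < total_matching:  # records remain after this page
        body["nextRecordsLink"] = page_link(index + 1)
    if index > 0:                                # not the first set of records
        body["previousRecordsLink"] = page_link(index - 1)
    body["records"] = page
    return body


record = {"timestamp": "1234567788889", "anchor": "my-anchor",
          "data": {"status": "UP"}}
middle = build_response("/dataspaces/d/anchors/a?after=1", [record],
                        total_matching=3, index=1, max_size=1,
                        point_in_time="DATE")
print(sorted(middle))  # both links are present on a middle page
```

Carrying the same point-in-time value in both links means the client never has to track it separately.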

Pros:

Responses can be cached by URL, although this is of limited use because the data returned can change depending on when the API is called.
The response itself can include links to the next and previous sets of records.
point-in-time can default to the current timestamp, because the resolved value can be included in the next and previous record links in the response.

Cons:

The payload is JSON, which can be long and may not fit within the GET URL length limit.

Approach 2 - POST

As we are querying data, POST is not intuitive, but the search can be considered a resource with different filtering criteria. Only one API is needed because the filtering criteria are provided in the body, which also allows the payload to be large if required.

API URL: cps-temporal/api/v1/filters?maxSize=1000&index=1

Request Body (pointInTime is mandatory)

{
  "dataspaceName": "dataspace-001",
  "anchorName": "anchor-001",
  "schemaSetName": "schemaset-001",
  "after": "2021-03-21T00:00:00+00:00",
  "pointInTime": "2021-04-21T00:00:00+00:00",
  "payload": {
    "status": "UP"
  }
}

Response Body

[
  {
    "timestamp": "1234567788889",
    "dataspace": "my-dataspace",
    "schemaSet": "my-schema-set",
    "anchor": "my-anchor",
    "data": {
      "status": "UP"
    }
  }
]
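A client call for the POST approach could look like the sketch below. It builds the request without sending it, keeps maxSize and index in the query string as in the API URL shown earlier (where these belong is still an open question), and uses a hypothetical host `http://localhost:8080` and a hypothetical function name `build_filter_request`.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request


def build_filter_request(base_url: str, criteria: dict,
                         index: int = 0, max_size: int = 1000) -> Request:
    """Build (but do not send) a POST request for the filters API."""
    query = urlencode({"maxSize": max_size, "index": index})
    url = f"{base_url}/cps-temporal/api/v1/filters?{query}"
    return Request(
        url,
        data=json.dumps(criteria).encode("utf-8"),  # filter criteria in the body
        headers={"Content-Type": "application/json"},
        method="POST",
    )


criteria = {
    "dataspaceName": "dataspace-001",
    "anchorName": "anchor-001",
    "after": "2021-03-21T00:00:00+00:00",
    "pointInTime": "2021-04-21T00:00:00+00:00",
    "payload": {"status": "UP"},
}
request = build_filter_request("http://localhost:8080", criteria, index=1)
print(request.get_method(), request.full_url)
```

Passing the returned object to `urllib.request.urlopen` would send it; the body size is not constrained by any URL length limit.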

Questions

Where should pagination-related variables be kept in the POST API?
GET approach: should we disable caching or ask the user to pass the point-in-time parameter value?
