CPS-251

Overview

The CPS temporal data query will allow the user to fetch data based on multiple filtering criteria. This data can be used to create graphs and to help analytical systems make decisions.

Filtering based on data

As it is time-series data, it can be filtered based on three main criteria:

  • Datetime
    • Data in last X hours
    • Data after a particular DateTime
    • Data before a particular DateTime
    • Last X network data
  • DataTypes in CPS Core
    • Dataspace & schema-set
    • Dataspace & anchor
    • Dataspace & multiple anchors - improves performance when data must be fetched for multiple anchors.
    • Dataspace
  • Payload
    • Based on a subsection or field in the payload. This criterion requires the schema-set to be fixed so that the search runs over the same set of anchors. The payload will be passed in JSON format because it is more flexible, and users will not expect functionality similar to the cps-path query.
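The payload criterion above can be read as a "subset match": a stored record qualifies when every field named in the JSON filter is present in its payload with the same value. The following is a minimal sketch under that assumption; the function name `payload_matches` and the sample payload are hypothetical and are not part of the CPS design.

```python
from typing import Any


def payload_matches(data: Any, criteria: Any) -> bool:
    """Return True if every field in `criteria` appears in `data` with an equal value."""
    if isinstance(criteria, dict):
        if not isinstance(data, dict):
            return False
        # Every key requested in the filter must exist and match recursively.
        return all(
            key in data and payload_matches(data[key], value)
            for key, value in criteria.items()
        )
    # Leaves (strings, numbers, lists, None) are compared for plain equality.
    return data == criteria


# Hypothetical stored payload, used only for illustration.
stored = {"status": "UP", "interface": {"name": "eth0", "speed": 1000}}

print(payload_matches(stored, {"status": "UP"}))                 # True
print(payload_matches(stored, {"interface": {"name": "eth0"}}))  # True
print(payload_matches(stored, {"status": "DOWN"}))               # False
```

A real implementation would more likely push this comparison down to the database rather than filter in application code; the sketch only illustrates the matching rule.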

We will implement only the basic APIs first and will add complex ones when required. 

Other Parameters

There are other parameters, listed below, that can have an impact on the response. 

| No | Name | Type | Default | Purpose | Example |
|----|------|------|---------|---------|---------|
| 1 | index | int | 0 | Page number, starting at 0. | |
| 2 | maxSize | int | 1000 | Page size. | |
| 3 | point-in-time | DateTime | Current DateTime | Pagination does not work well if new data that fulfils the search criteria is added while the pages are being fetched. The user must provide this value to avoid this issue. | |

The query output can have many rows, so it is important to limit the fetched data. The index and maxSize parameters together limit the number of records and provide pagination.
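To make the interaction between these three parameters concrete, here is a small in-memory sketch. It assumes that each record carries a `created` value (the time the row was written), and that point-in-time hides rows written after it; both the field and that interpretation are assumptions for illustration, as are the names `Record` and `fetch_page`.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Record:
    timestamp: int  # observation time of the payload
    created: int    # hypothetical: time the row was written to the database
    data: dict


def fetch_page(records: List[Record], index: int = 0, max_size: int = 1000,
               point_in_time: Optional[int] = None) -> List[Record]:
    """Return page `index` (0-based) of at most `max_size` records."""
    # Ignore rows written after the snapshot so that pages stay stable.
    visible = [r for r in records
               if point_in_time is None or r.created <= point_in_time]
    visible.sort(key=lambda r: r.timestamp)
    start = index * max_size
    return visible[start:start + max_size]


records = [Record(timestamp=t, created=t, data={"status": "UP"})
           for t in (10, 20, 30, 40, 50)]
page0 = fetch_page(records, index=0, max_size=2, point_in_time=50)

# A late row for an older timestamp arrives after the first page was read.
records.append(Record(timestamp=15, created=60, data={"status": "UP"}))
page1 = fetch_page(records, index=1, max_size=2, point_in_time=50)

print([r.timestamp for r in page0])  # [10, 20]
print([r.timestamp for r in page1])  # [30, 40]
```

Without `point_in_time`, the second call in this example would return the timestamps 20 and 30, repeating a record already seen on the first page.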

Assumptions

  • The temporal database stores full information in the payload (CPS-192: Design data store for Temporal Service).
  • It is possible that the data stored for different timestamps is the same, especially if the payload is used to filter data.
  • Data in the response body will always contain the entire payload data.

Proposed APIs

| No | Purpose |
|----|---------|
| 1 | Return data entries for an anchor after the specified DateTime that match the payload format. |
| 2 | Return data entries based on the provided schema-set after the specified DateTime that match the payload format. |

Approach 1 - GET

In the GET APIs, all the filtering parameters are passed as query parameters.

| No | API endpoint | Description | Example |
|----|--------------|-------------|---------|
| 1 | GET /dataspaces/{dataspace-name}/anchors/{anchor-name}?after=<epoch-time>&maxSize=1000&payload={"status" : "UP"} | Return all the data entries for an anchor after the specified epoch in nanoseconds. | |
| 2 | GET /dataspaces/{dataspace-name}/schema-sets/{schema-set}?after=<epochtime>&maxSize=500 | Return all the data entries based on the provided schema-set after the specified epoch in nanoseconds. | |
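Because the payload filter is JSON, it has to be percent-encoded before it can travel as a query parameter. The sketch below shows one way a client might assemble the first endpoint using only the Python standard library; the helper name `build_anchor_query_url` and the sample values are hypothetical.

```python
import json
from urllib.parse import quote, urlencode


def build_anchor_query_url(dataspace, anchor, after, payload=None,
                           max_size=1000, index=0):
    """Build the relative GET URL for querying one anchor."""
    path = (f"/dataspaces/{quote(dataspace, safe='')}"
            f"/anchors/{quote(anchor, safe='')}")
    params = {"after": after, "maxSize": max_size, "index": index}
    if payload is not None:
        # Compact JSON keeps the URL short; urlencode percent-encodes it.
        params["payload"] = json.dumps(payload, separators=(",", ":"))
    return f"{path}?{urlencode(params)}"


# Illustrative values only; `after` is an epoch value in nanoseconds.
url = build_anchor_query_url("my-dataspace", "my-anchor",
                             1616284800000000000, {"status": "UP"})
print(url)
print(len(url))  # the encoded filter counts against any URL length limit
```

Each brace, quote, and colon in the JSON expands to three characters once encoded, so a large filter grows quickly in this form.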




Response Body
| name | type | description |
|------|------|-------------|
| nextRecordsLink | string | Added only if there are remaining records to be fetched for the query. |
| previousRecordsLink | string | Added only if it is not the first set of records. |
| records | list | Contains one record for each timestamp that meets the filtering criteria. It contains header information along with data. |


{
  "nextRecordsLink": "cps-temporal/api/v1/dataspaces/{dataspace-name}/anchors/{anchor-name}?after=<epoch-time>&maxSize=1000&before=<epoch-time>&index=2&point-in-time=DATE",
  "previousRecordsLink": "cps-temporal/api/v1/dataspaces/{dataspace-name}/anchors/{anchor-name}?after=<epoch-time>&maxSize=1000&before=<epoch-time>&index=0&point-in-time=DATE",
  "records": [
    {
      "timestamp": "1234567788889",
      "dataspace": "my-dataspace",
      "schemaSet": "my-schema-set",
      "anchor": "my-anchor",
      "data": {
        "status" : "UP"
      }
    }
  ]
}
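
The rules for when the two links appear can be sketched as follows. This assumes the server knows the total number of matching records; the function name `build_links` and the sample values are hypothetical.

```python
from urllib.parse import urlencode


def build_links(base_url, query, index, max_size, total_records):
    """Return the navigation links that apply to the current page."""
    def link(page):
        params = {**query, "maxSize": max_size, "index": page}
        return f"{base_url}?{urlencode(params)}"

    links = {}
    # A next link exists only if records remain beyond this page.
    if (index + 1) * max_size < total_records:
        links["nextRecordsLink"] = link(index + 1)
    # A previous link exists only if this is not the first page.
    if index > 0:
        links["previousRecordsLink"] = link(index - 1)
    return links


base = "cps-temporal/api/v1/dataspaces/my-dataspace/anchors/my-anchor"
query = {"after": 100, "point-in-time": 200}  # illustrative values
links = build_links(base, query, index=1, max_size=1000, total_records=2500)
print(links["nextRecordsLink"])
print(links["previousRecordsLink"])
```

Carrying point-in-time inside both links means the client does not have to track it between page requests.
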
Pros:
  • Provides the ability to cache responses based on the URL. This is of limited value, however, because the data can change depending on when the API is called.
  • Can provide links to the next and previous sets of records in the response itself.
  • point-in-time can default to the current timestamp, because the value can be included in the next and previous record links returned in the response.
Cons:
  • The payload is JSON, which can be long and may not fit within the GET URL length limit.

Approach 2 - POST

As we are querying data, POST is not intuitive, but the search can be considered a resource with different filtering criteria. Only one API is needed because the filtering criteria are provided in the body, which allows the payload to be large if required.

API URL: cps-temporal/api/v1/filters?maxSize=1000&index=1 

Request Body (pointInTime is mandatory)
{
  "dataspaceName": "dataspace-001",
  "anchorName": "anchor-001",
  "schemaSetName": "schemaset-001",
  "after": "2021-03-21T00:00:00-00:00",
  "pointInTime": "2021-04-21T00:00:00-00:00",
  "payload": {
    "status": "UP"
  }
}
 Response Body
[
  {
    "timestamp": "1234567788889",
    "dataspace": "my-dataspace",
    "schemaSet": "my-schema-set",
    "anchor": "my-anchor",
    "data": {
      "status": "UP"
    }
  }
]
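
As an illustration of the POST approach, the sketch below assembles such a request with the Python standard library without sending it. The host `http://localhost:8080` and the helper name `build_filter_request` are assumptions; keeping maxSize and index in the query string follows the API URL shown above, which is still an open question in this proposal.

```python
import json
from urllib.request import Request


def build_filter_request(base_url, criteria, max_size=1000, index=0):
    """Build (but do not send) a POST request carrying the filter criteria."""
    url = (f"{base_url}/cps-temporal/api/v1/filters"
           f"?maxSize={max_size}&index={index}")
    body = json.dumps(criteria).encode("utf-8")
    return Request(url, data=body, method="POST",
                   headers={"Content-Type": "application/json"})


criteria = {
    "dataspaceName": "dataspace-001",
    "anchorName": "anchor-001",
    "after": "2021-03-21T00:00:00-00:00",
    "pointInTime": "2021-04-21T00:00:00-00:00",
    "payload": {"status": "UP"},
}
# Hypothetical host, used only for illustration.
request = build_filter_request("http://localhost:8080", criteria, index=1)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Because the filter travels in the body, its size is not constrained by URL length, which is the main drawback listed for the GET approach.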

Questions

  1. Where to keep pagination related variables in the POST API?
  2. GET Approach - Should we disable caching or ask the user to pass the point-in-time parameter value?

Open Items

 



