The goal of this document is to investigate PostgreSQL's capability to handle JSON documents.
Introduction
PostgreSQL offers two types for storing JSON data: json and jsonb. The json data type stores an exact copy of the input text, which processing functions must reparse on each execution; while jsonb data is stored in a decomposed binary format that makes it slightly slower to input due to added conversion overhead, but significantly faster to process, since no reparsing is needed. jsonb also supports indexing, which can be a significant advantage.
Because the json type stores an exact copy of the input text, it will preserve semantically-insignificant white space between tokens, as well as the order of keys within JSON objects. Also, if a JSON object within the value contains the same key more than once, all the key/value pairs are kept. (The processing functions consider the last value as the operative one.) By contrast, jsonb does not preserve white space, does not preserve the order of object keys, and does not keep duplicate object keys. If duplicate keys are specified in the input, only the last value is kept.
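The difference can be approximated outside the database. The following is a minimal Python sketch, not PostgreSQL code: keeping the raw string stands in for json, and parsing it stands in for jsonb. Python's `json.loads` also keeps only the last value of a duplicate key, but a Python dict preserves insertion order, so the key reordering done by jsonb is not modelled here.

```python
import json

raw = '{"b": 1,   "a": 2, "b": 3}'

# json-like storage: the input text is kept verbatim, so white space,
# key order and duplicate keys all survive until the value is parsed.
stored_as_text = raw

# jsonb-like storage: the text is parsed once into a structure;
# white space is gone and only the last value of a duplicate key remains.
stored_as_parsed = json.loads(raw)

print(stored_as_text)
print(json.dumps(stored_as_parsed))
```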
Note
If a PostgreSQL-specific type such as "json" or "jsonb" is used rather than "text", all applications lose compatibility with H2 and MariaDB. These types are mentioned in this document but are not recommended.
Type | Data Store | Validation | Query Support | Index | Preserves Order |
---|---|---|---|---|---|
longtext | MariaDB | SELECT | Yes | No | Yes |
text | PostgreSQL | No | No | No | Yes |
json | PostgreSQL | INSERT / UPDATE | Yes | No | Yes |
jsonb | PostgreSQL | INSERT / UPDATE | Yes | Yes | No |
Requirements
- Policy types/Policies/Node Types/Node Templates are first order items
- Data Types have a scope of a first order item, so a data type definition only applies in the scope of a policy type or node type definition
- We should keep our current APIs; all changes should be internal
- We must provide an upgrade path to the new data structure and a rollback to the current structure
ORM Layer using Document Storage
An ORM layer using document storage (PostgreSQL or MongoDB) could be organized in two layers:
- Document layer (the domain model to be converted to JSON): the implementation has no dependency on the DB
- Persistence layer (the domain model depends on the DB used): Entities for PostgreSQL, Documents for MongoDB
An implementation of the Document layer can be found here: https://gerrit.nordix.org/c/onap/policy/models/+/13633
Example
In the example below, DocToscaServiceTemplate should be serialized to JSON.
The example below shows the implementation of JpaToscaServiceTemplate for PostgreSQL/MariaDB (the full implementation can be found here: https://gerrit.nordix.org/c/onap/policy/clamp/+/13642)
The example below shows the implementation of JpaToscaServiceTemplate for MongoDB (the full implementation can be found here: https://gerrit.nordix.org/c/onap/policy/clamp/+/13615)
Converters
Jakarta and Spring do not support the json type, but we can use Converters to convert DocToscaServiceTemplate to a JSON string.
Note: Serialization and deserialization to and from JSON is already used in policy-model (an example can be found in [JpaToscaPolicy.java]). @Converter is just an elegant way to do the same thing.
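The idea behind a converter can be sketched independently of JPA. The Python below is only an illustration of the two conversion directions: `DocServiceTemplate` is a hypothetical, heavily reduced stand-in for DocToscaServiceTemplate, and `ServiceTemplateConverter` with its method names is invented for this sketch, not taken from policy-models.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class DocServiceTemplate:
    """Hypothetical, reduced stand-in for DocToscaServiceTemplate."""
    name: str
    version: str
    node_types: dict = field(default_factory=dict)


class ServiceTemplateConverter:
    """Mirrors the two directions of an attribute converter."""

    def to_database_column(self, template):
        # Domain object -> JSON string stored in the text column.
        return None if template is None else json.dumps(asdict(template))

    def to_entity_attribute(self, db_data):
        # JSON string read from the text column -> domain object.
        if db_data is None:
            return None
        return DocServiceTemplate(**json.loads(db_data))


converter = ServiceTemplateConverter()
template = DocServiceTemplate("PMSH_Test_Instance", "1.0.0",
                              {"org.onap.nodeTypeA": {}})
column_value = converter.to_database_column(template)
print(column_value)
print(converter.to_entity_attribute(column_value) == template)
```

The entity only ever sees the domain object; the JSON string exists solely at the column boundary, which is why the same entity can be stored in any database that offers a text column.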
ToscaServiceTemplate Table
    clampacm=# \d ToscaServiceTemplate
                      Table "public.toscaservicetemplate"
         Column      |          Type          | Collation | Nullable | Default
    -----------------+------------------------+-----------+----------+---------
     name            | character varying(120) |           | not null |
     version         | character varying(20)  |           | not null |
     servicetemplate | text                   |           |          |
    Indexes:
        "toscaservicetemplate_pkey" PRIMARY KEY, btree (name, version)

    clampacm=# select * from public.ToscaServiceTemplate;
            name        | version | servicetemplate
    --------------------+---------+-----------------
     PMSH_Test_Instance | 1.0.0   | 16505
    (1 row)

    clampacm=# select convert_from(lo_get(servicetemplate::oid), 'UTF8') from toscaservicetemplate;
    {"tosca_definitions_version":"tosca_simple_yaml_1_3", ...
Proposals
The new ORM layer should be an optional addition to the existing one. The proposed options are shown below:
- Save ToscaServiceTemplate as a JSON string in a single Entity:
  - ToscaServiceTemplate is saved in a text field as JSON
  - It is compatible with H2, MariaDB and PostgreSQL
  - It requires additional code in policy-models and has a medium impact on all applications that use it
  - This solution is compatible with non-Spring applications
- MongoDB/Cassandra:
  - Document-oriented approach fully supported by Spring Boot (no Converters needed)
  - Compatible only with MongoDB/Cassandra (MongoDB and Cassandra are not compatible with each other)
  - It requires additional code in policy-models and has a huge impact on all applications that use it (all repositories and persistence classes have to change to document-oriented classes)
  - Unit tests need an embedded server (example for Cassandra: EmbeddedCassandraServerHelper or CassandraContainer)
  - Spring Boot has its own annotations for Documents. Applications that do not use Spring Boot need an additional DAO-style implementation
Note
- Using document storage involves only the ORM layer; it does not change the functionality of the application
- Business logic could be optimized by reducing the number of DB accesses: read a service template once and use it during processing, rather than fetching data from the DB for each single search (load a property, load a list of ToscaNodeTemplate, etc.)
- After migration to document storage, it will be possible to adjust the flexibility of Tosca Service Template Handling (POLICY-3236); as a new feature, it will impact the business logic of the application
Benchmark Performance of runtime-acm
To generate the benchmark, I used a Virtual Machine (running on a laptop) with the following configuration:
- 8192 MB of memory
- 2 CPUs
For the tests:
- JMeter to generate requests (the same as used by the performance tests)
- Prometheus for monitoring
- DMaaP simulator
- Participant simulator
- MariaDB/PostgreSQL/MongoDB
The existing system
Hibernate/MariaDB. The Tosca Service Template is saved as an entity-relationship schema.
Using Json in MariaDB
Hibernate/MariaDB. The Tosca Service Template is saved into a longtext column as JSON.
Using Json in Postgres
Hibernate/PostgreSQL. The Tosca Service Template is saved into a text column as JSON.
MongoDB
MongoDB. The Tosca Service Template and all other entities (Participants and AutomationComposition) are saved as MongoDB Documents.
An id cannot contain a dot ('.') in MongoDB: solved with minimal configuration
Discussion
- Each Service Template is stored as a JSON "LOB"
- Each service template has a unique name space
- When a TOSCA entity is referred to by another TOSCA entity, the following rules apply
  - The entity is referred to using
    - name
    - version (optional if there is only one version in the namespace)
    - namespace (optional)
  - The version is optional if the name of the referred entity is unique in the specified namespace; if there is more than one entity with a given name in a namespace, the version MUST be specified
  - Namespace lookup is as follows
    - If a namespace is specified, the Service Template referred to by that namespace is used to look up the TOSCA entity
    - If a namespace is not specified, then the following precedence is used
      - The current service template is checked for the referred TOSCA entity, if it's not found...
      - The default service template is checked for the referred TOSCA entity
- Update and delete of service templates is tricky because we need to make sure that no external references are disrupted.
ServiceTemplate001

    tosca_definitions_version: tosca_simple_yaml_1_1_0
    node_types:
      org.onap.nodeTypeA:
        derived_from: tosca.nodetypes.Root
      org.onap.nodeTypeB:
        derived_from: org.onap.nodeTypeA
        version: 1.2.3
    topology_template:
      node_templates:
        nodeTemplate01:
          version: 4.5.6
          type: org.onap.nodeTypeB
          type_version: 1.2.3

ServiceTemplate002

    tosca_definitions_version: tosca_simple_yaml_1_1_0
    namespace: http://onap.org/service/namespace/ServiceTemplate002
    version: 10.0.1
    imports:
      - repository: DefaultServiceTemplate
        namespace_prefix: defaultST
    node_types:
      org.onap.nodeTypeC:
        derived_from: defaultST:org.onap.nodeTypeA
        version: 2.3.4
    topology_template:
      node_templates:
        nodeTemplate02:
          version: 5.6.7
          type: defaultST:org.onap.nodeTypeB
          type_version: 1.2.3
        nodeTemplate03:
          version: 6.7.8
          type: defaultST:org.onap.nodeTypeB:1.2.3
        nodeTemplate04:
          version: 7.8.9
          type: org.onap.nodeTypeC
          type_version: 2.3.4
        nodeTemplate05:
          version: 8.9.10
          type: org.onap.nodeTypeC:2.3.4

ServiceTemplate003

    tosca_definitions_version: tosca_simple_yaml_1_1_0
    namespace: http://onap.org/service/namespace/ServiceTemplate003
    imports:
      - repository: DefaultServiceTemplate
        namespace_prefix: defaultST
      - repository: ServiceTemplate002:10.0.1
        namespace_prefix: st02
    node_types:
      org.onap.nodeTypeZ:
        derived_from: defaultST:org.onap.nodeTypeA
        version: 9.3.4
      org.onap.nodeTypeY:
        derived_from: st02:org.onap.nodeTypeA
        version: 9.3.4
    topology_template:
      node_templates:
        nodeTemplate10:
          version: 9.6.7
          type: defaultST:org.onap.nodeTypeB:1.2.3
        nodeTemplate11:
          version: 9.8.9
          type: org.onap.nodeTypeZ:2.3.4
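The lookup rules from the discussion can be sketched as follows. This is a minimal Python illustration under several assumptions: a namespace is represented simply by the name of the service template it points to (the namespace_prefix mapping is left out), ServiceTemplate001 is assumed to play the role of the default service template, and nodeTypeA is given the version "0.0.0" because the example does not state one. The function names `find_entity` and `resolve` are hypothetical.

```python
DEFAULT_TEMPLATE = "DefaultServiceTemplate"


def find_entity(entities, name, version=None):
    """Return the (name, version) key matching the reference, or None.

    `entities` is the set of (name, version) keys defined in one template.
    """
    matches = [key for key in entities
               if key[0] == name and (version is None or key[1] == version)]
    if len(matches) > 1:
        # Several versions share the name, so the version MUST be given.
        raise ValueError(f"version is required for {name!r}")
    return matches[0] if matches else None


def resolve(templates, current, name, version=None, namespace=None):
    """Apply the namespace lookup order; return (template, key) or None."""
    if namespace is not None:
        search_order = [namespace]
    else:
        # No namespace: current service template first, then the default.
        search_order = [current, DEFAULT_TEMPLATE]
    for template in search_order:
        key = find_entity(templates.get(template, set()), name, version)
        if key is not None:
            return template, key
    return None


templates = {
    "DefaultServiceTemplate": {("org.onap.nodeTypeA", "0.0.0"),
                               ("org.onap.nodeTypeB", "1.2.3")},
    "ServiceTemplate002": {("org.onap.nodeTypeC", "2.3.4")},
}
print(resolve(templates, "ServiceTemplate002", "org.onap.nodeTypeC"))
print(resolve(templates, "ServiceTemplate002", "org.onap.nodeTypeB"))
```

The first call is answered by the current service template, the second falls through to the default one. When a namespace is given, no fallback takes place.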
Discussion Upgraded
When a TOSCA entity is referred to by another TOSCA entity, the following rules apply:
- Backward compatibility is still valid (type and type_version)
- type_version will be deprecated; type will be used as a formatted string {namespace}:{type}:{type_version}
- The TOSCA language specification can be found here: https://docs.oasis-open.org/tosca/TOSCA-Simple-Profile-YAML/v1.3/os/TOSCA-Simple-Profile-YAML-v1.3-os.pdf
- A YAML file contains only one service template
- A service template can import other service templates
- TOSCA Service Templates MUST always have, as the first line of YAML, the keyword “tosca_definitions_version” with an associated TOSCA Namespace Alias value.
- Specifically, a Service Template's namespace declaration's URI would be used to form a unique, fully qualified Type name when combined with the locally defined, unqualified name of any Type in the same Service Template. The resultant, fully qualified Type name would be used by TOSCA Orchestrators, Processors and tooling when that Service Template was imported into another Service Template to avoid Type name collision.
Note
In the TOSCA language, namespace is supposed to be a URI, and namespace_prefix is a simplification because of that.
The name and version of a service template are currently not used in YAML files; they are not present in any of the examples or unit tests. They are used as the primary key of the ToscaServiceTemplate table and in REST endpoints as the id of a resource.
- A namespace as a URI could be an issue if used as the id of a resource
- name could be extracted from the prefix of the namespace, and version will be deprecated or perhaps extracted from the namespace as well
- name and version will not be present in the YAML file (service template), but they could still be used as the id of a resource because they are extracted from the namespace
Validation
Validation in current ORM layer
"type" and "type_version" | Referenced to |
---|---|
ToscaProperty | ToscaDataType |
ToscaPolicy | ToscaPolicyType |
"type_version" is optional and "0.0.0" is the default value. The "Key" is used during validation to check whether the referenced ToscaEntity exists.
Examples:
Example | Key |
---|---|
type: onap.datatypes.ToscaConceptIdentifier | onap.datatypes.ToscaConceptIdentifier:0.0.0 |
type: org.onap.policy.clamp.acm.PolicyAutomationCompositionElement | org.onap.policy.clamp.acm.PolicyAutomationCompositionElement:1.0.0 |
"derivedFrom" |
---|
ToscaCapabilityAssignment |
ToscaCapabilityType |
ToscaDataType |
ToscaNodeTemplate |
ToscaNodeType |
ToscaPolicy |
ToscaPolicyType |
ToscaRelationshipType |
ToscaRequirement |
"derivedFrom" refers to a ToscaEntity of the same type, located in the same collection.
Example | Key |
---|---|
derivedFrom: onap.datatypes.ToscaConceptIdentifier | onap.datatypes.ToscaConceptIdentifier (any version) |
derivedFrom: org.onap.policy.clamp.acm.PolicyAutomationCompositionElement:1.0.0 | org.onap.policy.clamp.acm.PolicyAutomationCompositionElement:1.0.0 |
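The two lookups described for the current ORM layer can be sketched as below. This is an illustrative Python sketch, not the policy-models implementation: `type_key` and `derived_from_exists` are hypothetical helper names, and keys are modelled as plain "name:version" strings.

```python
DEFAULT_VERSION = "0.0.0"


def type_key(type_name, type_version=None):
    """Build the lookup key for a "type" reference; version defaults to 0.0.0."""
    return f"{type_name}:{type_version or DEFAULT_VERSION}"


def derived_from_exists(reference, collection_keys):
    """Check a "derivedFrom" reference against the keys of one collection.

    With a version the key must match exactly; without one, any version
    of the named entity is accepted.
    """
    name, _, version = reference.partition(":")
    if version:
        return reference in collection_keys
    return any(key.split(":")[0] == name for key in collection_keys)


data_types = {"onap.datatypes.ToscaConceptIdentifier:2.0.0"}
print(type_key("onap.datatypes.ToscaConceptIdentifier"))
print(derived_from_exists("onap.datatypes.ToscaConceptIdentifier", data_types))
```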
Validation in new ORM layer
"type", "type_version" and "namespace" | Referenced to |
---|---|
ToscaProperty | ToscaDataType |
ToscaPolicy | ToscaPolicyType |
ToscaNodeTemplate | ToscaNodeType |
"type_version" is optional and "0.0.0" is the default value. "namespace" is optional and "DefaultNameSpace" is the default value. type can also be given as a formatted string: {namespace}:{name}:{version}. The "Key" is used during validation to check whether the ToscaEntity exists in the same ServiceTemplate or in another one.
Examples:
Example | Key (if defined in same service template) | Key (for external service template) |
---|---|---|
type: onap.datatypes.ToscaConceptIdentifier | "onap.datatypes.ToscaConceptIdentifier:0.0.0" | "DefaultNameSpace:onap.datatypes.ToscaConceptIdentifier:0.0.0" |
type: org.onap.policy.clamp.acm.PolicyAutomationCompositionElement | "org.onap.policy.clamp.acm.PolicyAutomationCompositionElement:1.0.0" | "DefaultNameSpace:org.onap.policy.clamp.acm.PolicyAutomationCompositionElement:1.0.0" |
type: onap.datatype.acm.Target:1.2.3 | "onap.datatype.acm.Target:1.2.3" | "DefaultNameSpace:onap.datatype.acm.Target:1.2.3" |
type: CustomNamespace:onap.datatype.acm.Operation:1.0.1 | "onap.datatype.acm.Operation:1.0.1" | "CustomNamespace:onap.datatype.acm.Operation:1.0.1" |
"derivedFrom" should follow the same logic as before, and it can also be given as a formatted string: {namespace}:{name}:{version}.
Example | Key (if defined in same service template) | Key (for external service template) |
---|---|---|
derivedFrom: onap.datatypes.ToscaConceptIdentifier | "onap.datatypes.ToscaConceptIdentifier:0.0.0" | "DefaultNameSpace:onap.datatypes.ToscaConceptIdentifier:0.0.0" |
derivedFrom: onap.datatype.acm.Target:1.2.3 | "onap.datatype.acm.Target:1.2.3" | "DefaultNameSpace:onap.datatype.acm.Target:1.2.3" |
derivedFrom: CustomNamespace:onap.datatype.acm.Operation:1.0.1 | "onap.datatype.acm.Operation:1.0.1" | "CustomNamespace:onap.datatype.acm.Operation:1.0.1" |
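The keys in the tables above can be derived mechanically from a reference. The Python sketch below is one possible reading, not the proposed implementation: `parse_reference`, `local_key` and `external_key` are hypothetical names, and the rule that a trailing part shaped like `1.2.3` is the version is an assumption made here to tell `name:version` apart from `namespace:name`.

```python
import re

DEFAULT_NAMESPACE = "DefaultNameSpace"
DEFAULT_VERSION = "0.0.0"
# Assumption: versions always look like three dot-separated numbers.
VERSION_PATTERN = re.compile(r"\d+\.\d+\.\d+")


def parse_reference(value, type_version=None, namespace=None):
    """Split a reference into (namespace, name, version), applying defaults."""
    parts = value.split(":")
    # A trailing part that looks like a version is taken as the version.
    if len(parts) > 1 and VERSION_PATTERN.fullmatch(parts[-1]):
        type_version = parts.pop()
    # Whatever is left in front of the name is the namespace.
    if len(parts) > 1:
        namespace = ":".join(parts[:-1])
    return (namespace or DEFAULT_NAMESPACE, parts[-1],
            type_version or DEFAULT_VERSION)


def local_key(reference):
    """Key used when the entity is defined in the same service template."""
    _, name, version = reference
    return f"{name}:{version}"


def external_key(reference):
    """Key used when the entity lives in another service template."""
    return ":".join(reference)


ref = parse_reference("CustomNamespace:onap.datatype.acm.Operation:1.0.1")
print(local_key(ref))
print(external_key(ref))
```

Separate "type_version" and "namespace" fields are passed as the optional arguments, which keeps the backward-compatible form and the formatted string on the same code path.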
Update
There are two kinds of update:
- values (for example "description", "metadata", ...)
- data structure (for example, creating or deleting a ToscaEntity, or updating a reference such as "type", "type_version" or "namespace")
Open questions about participants
- Case scenario: we have two custom service templates, a common service template and two automation compositions. participant-policy creates and deletes policies and policy types by connecting to policy-api.
  - Could we have that scenario?
  - Should participant-policy receive a full service template (custom and common service template)?
  - Should participant-policy collect all policies and policy types via GET from policy-api, to know whether they have already been created?
  - How does participant-policy know whether it can delete a policy or policy type that could be used by another automation composition?
- Case scenario: we have two custom service templates. A participant is defined in only one of them.
  - Could we have that scenario?
  - Should the participant receive only the related custom service template? (A table to save relations between service templates and participants)
Proposals
- References between service templates could be treated as a DAG (directed acyclic graph), where service templates are nodes and references are edges. Cyclic references are not allowed.
- A functionality should be available to load a full version of the service template that contains its own TOSCA entities and also all TOSCA entities referenced from other service templates (this functionality is for validation or business logic purposes and will not be visible to the end user)
- A service template cannot be deleted if there are references to it
- Any Tosca Entity in a common service template cannot be deleted if there are references to that service template
- Any Tosca Entity in a common service template cannot change "namespace", "type" and "type_version" if there are references to that service template
- Any Tosca Entity in a common service template cannot change values if there are automation compositions that reference that service template (and references to that service template?)
- A Tosca Entity cannot be added to a common service template if there are automation compositions that reference that service template
- Functionality such as create, update and delete of a ToscaEntity in a service template should be validated and restricted to the service template itself. (Example: updating property values of a service template should never update property values of other referenced service templates)
- To reduce complexity, it could be useful to save additional information about the service template:
  - A full version of the service template that contains its own TOSCA entities and also all TOSCA entities referenced from other service templates. It will be used read-only for business logic purposes
  - A table to save relations between service templates
  - A table to save relations between service templates and automation compositions (with all service templates referenced)
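Treating references as a graph makes several of these proposals straightforward to check. The following Python sketch is illustrative only: the `references` mapping stands in for the proposed relations table, the template names are taken from the YAML example in the discussion, and `has_cycle`, `can_delete` and `referenced_templates` are hypothetical helper names.

```python
def has_cycle(references):
    """Depth-first search; `references` maps a template to those it refers to."""
    in_progress, done = set(), set()

    def visit(node):
        in_progress.add(node)
        for target in references.get(node, ()):
            if target in in_progress:
                return True  # back edge: a cyclic reference
            if target not in done and visit(target):
                return True
        in_progress.discard(node)
        done.add(node)
        return False

    return any(node not in done and visit(node) for node in references)


def can_delete(template, references):
    """A service template cannot be deleted while another one refers to it."""
    return not any(template in targets
                   for source, targets in references.items()
                   if source != template)


def referenced_templates(template, references):
    """All templates reachable from `template`, needed for its full version."""
    seen, stack = set(), list(references.get(template, ()))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(references.get(current, ()))
    return seen


references = {
    "ServiceTemplate003": ["DefaultServiceTemplate", "ServiceTemplate002"],
    "ServiceTemplate002": ["DefaultServiceTemplate"],
    "DefaultServiceTemplate": [],
}
print(has_cycle(references))
print(can_delete("ServiceTemplate002", references))
print(sorted(referenced_templates("ServiceTemplate003", references)))
```

Running the cycle check before saving a new reference keeps the graph acyclic, and the reachable set is what a stored "full version" of a service template would have to merge.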
Current ORM layer for ToscaServiceTemplate
ORM layer for ToscaServiceTemplate Proposed
Tosca Service Template Handling
https://gerrit.nordix.org/c/onap/policy/clamp/+/13755
Conclusion
- Work in progress