You are viewing an old version of this page. View the current version.

Compare with Current View Page History

« Previous Version 38 Next »

The goal of this document is to investigate about the PostgreSQL capability to handle Json documents.

Introduction

PostgreSQL offers two types for storing JSON data: json and jsonb. The json data type stores an exact copy of the input text, which processing functions must reparse on each execution; while jsonb data is stored in a decomposed binary format that makes it slightly slower to input due to added conversion overhead, but significantly faster to process, since no reparsing is needed. jsonb also supports indexing, which can be a significant advantage.

Because the json type stores an exact copy of the input text, it will preserve semantically-insignificant white space between tokens, as well as the order of keys within JSON objects. Also, if a JSON object within the value contains the same key more than once, all the key/value pairs are kept. (The processing functions consider the last value as the operative one.) By contrast, jsonb does not preserve white space, does not preserve the order of object keys, and does not keep duplicate object keys. If duplicate keys are specified in the input, only the last value is kept.

In general, most applications should prefer to store JSON data as jsonb, unless there are quite specialized needs, such as legacy assumptions about ordering of object keys. (https://www.postgresql.org/docs/current/datatype-json.html)

Demo Application

To better understand these functionality, it has been created a Demo Application. That it is available here (https://gerrit.nordix.org/c/local/onap/clamp-pilot/+/11966).

 Data Transfer Object

The Java code shown below, is the data transfer Object used by Rest Api Controller. (Validations and Swagger annotations have been removed for simplicity).

Model
public class ElementDto {

    private String name;

    private List<String> tags;

..........
}

public class DocumentDto {

    private String name;

    private Map<String, ElementDto> elements;

..........
}

public class ItemDto {
    private String key;

    private String text;

    private Map<String, String> properties;

    private DocumentDto document;
    
..........
}


The Json shown below is an example of the model shown before.

Json Document
{
  "key": "1",
  "text": "This is an example",
  "properties": {
    "additionalProp1": "Val 10",
    "additionalProp2": "11",
    "additionalProp3": "12"
  },
  "document": {
    "name": "DocumentExample",
    "elements": {
      "additionalProp1": {
        "name": "ElementExample 1",
        "tags": [
          "tag1"
        ]
      },
      "additionalProp2": {
        "name": "ElementExample 2",
        "tags": [
          "tag2"
        ]
      },
      "additionalProp3": {
        "name": "ElementExample 3",
        "tags": [
          "tag3"
        ]
      }
    }
  }
}

Domain model

To store data on PostgreSQL we still have to use Entities, but part of Dto classes could be stored as jsonb type or at least we can create new domain classes that are equal to those DTO classes.

In this example we can use just one class as persistence model as shown below.

Persistence Model
@Entity
@Data
public class Item implements Serializable {

    private static final long serialVersionUID = 1L;

    @Id
    private String key;

    @Column(name = "text", nullable = false)
    private String text;

    @Convert(converter = StringToMapConverter.class)
    @Column(columnDefinition = "jsonb", nullable = false)
    private Map<String, String> properties;

    @Convert(converter = StringToDocumentConverter.class)
    @Column(columnDefinition = "jsonb", nullable = false)
    @NotNull
    private DocumentDto document;
}

Converters

Jakarta and Spring do no support json Type, but we can use Converters to convert an Object to a String. The more generic is a converter, more we can reuse it.

Converters
@Slf4j
@Converter(autoApply = true)
public final class StringToMapConverter implements AttributeConverter<Map<String, ? extends Object>, String> {

    private static final ObjectMapper MAPPER = new ObjectMapper();

    @Override
    public String convertToDatabaseColumn(Map<String, ? extends Object> map) {
        try {
            return map == null ? null : MAPPER.writeValueAsString(map);
        } catch (JsonProcessingException ex) {
            throw new IllegalStateException(ex);
        }
    }

    @Override
    public Map<String, ? extends Object> convertToEntityAttribute(String dbData) {
        if (dbData == null) {
            return Map.of();
        }
        try {
            return MAPPER.readValue(dbData, new TypeReference<Map<String, Object>>() {});
        } catch (IOException ex) {
            log.error("Cannot convert String to Map.");
            return Map.of();
        }
    }
}

@Slf4j
@Converter(autoApply = true)
public final class StringToDocumentConverter implements AttributeConverter<DocumentDto, String> {

    private static final ObjectMapper MAPPER = new ObjectMapper();

    @Override
    public String convertToDatabaseColumn(DocumentDto document) {
        try {
            return document == null ? null : MAPPER.writeValueAsString(document);
        } catch (JsonProcessingException ex) {
            throw new IllegalStateException(ex);
        }
    }

    @Override
    public DocumentDto convertToEntityAttribute(String dbData) {
        if (dbData == null) {
            return new DocumentDto();
        }
        try {
            return MAPPER.readValue(dbData, DocumentDto.class);
        } catch (IOException ex) {
            log.error("Cannot convert String to Map.");
            return new DocumentDto();
        }
    }
}

Repository

The java code below shows some examples how to use the functionalities offered by PosgreSLQ about Json Type. We can apply where clauses, extract and manipulate data into the Json data. In order to use these functionalities, all queries have to be native (nativeQuery = true).

Repository
@Repository
public interface ItemRepository extends JpaRepository<Item, String> {

    @Query(value = " SELECT * FROM item where document['name'] @> :value ", nativeQuery = true)
    Stream<Item> findByName(@Param("value") String value);

    @Modifying
    @Query(value = " UPDATE item SET document['name']=:value where key=:key ", nativeQuery = true)
    void updateNameByKey(@Param("key") String key, @Param("value") String value);

    @Query(
            value = " SELECT Cast(jsonb_path_query_array(document['elements'], '$[*].keyvalue().value.tags') as varchar) FROM item ",
            nativeQuery = true)
    Stream<String> findTags();
}

Unit Tests

In order to make unit tests, we can not use H2 due of the incompatibility. Spring supports Testcontainers where we can configure a PostgreSQL Container that will up and running during tests.

Unit Test
@Testcontainers
@ExtendWith(SpringExtension.class)
@SpringBootTest(webEnvironment = WebEnvironment.RANDOM_PORT)
@ContextConfiguration(
        initializers = DemoApplicationTests.Initializer.class,
        classes = DemoApplication.class)
@TestPropertySource(locations = {"classpath:application_test.properties"})
class DemoApplicationTests {

    @Test
    void contextLoads() {
    }

    @Container
    public static PostgreSQLContainer postgreSQLContainer = new PostgreSQLContainer("postgres:latest")
            .withDatabaseName("db_demo").withUsername("sa").withPassword("sa");

    @Configuration
    static class Initializer implements ApplicationContextInitializer<ConfigurableApplicationContext> {
        @Override
        public void initialize(ConfigurableApplicationContext configurableApplicationContext) {
            TestPropertyValues
                    .of("spring.datasource.url=" + postgreSQLContainer.getJdbcUrl(),
                            "spring.datasource.username=" + postgreSQLContainer.getUsername(),
                            "spring.datasource.password=" + postgreSQLContainer.getPassword())
                    .applyTo(configurableApplicationContext.getEnvironment());
        }
    }

}

Item Table

db_demo=# \d item
                         Table "public.item"
   Column   |          Type          | Collation | Nullable | Default
------------+------------------------+-----------+----------+---------
 key        | character varying(255) |           | not null |
 document   | jsonb                  |           | not null |
 properties | jsonb                  |           | not null |
 text       | character varying(255) |           | not null |
Indexes:
    "item_pkey" PRIMARY KEY, btree (key)


db_demo=# select * from public.item;
 key |                                                                                                                      document                                                                                              |                                   properties                    |        text
-----+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------------------------------------------------------------------+--------------------
 1   | {"name": "DocumentExample", "elements": {"additionalProp1": {"name": "ElementExample 1", "tags": ["tag1"]}, "additionalProp2": {"name": "ElementExample 2", "tags": ["tag2"]}, "additionalProp3": {"name": "ElementExample 3", "tags": ["tag3"]}}} | {"additionalProp1": "Val 10", "additionalProp2": "11", "additionalProp3": "12"} | This is an example
(1 row)

db_demo=# SELECT Cast(jsonb_path_query_array(document['elements'], '$[*].keyvalue().value.tags') as varchar) FROM item;
     jsonb_path_query_array
--------------------------------
 [["tag1"], ["tag2"], ["tag3"]]
(1 row)

Example in Policy Admin models

JpaToscaWithTypeAndStringProperties class

JpaToscaWithTypeAndStringProperties
@MappedSuperclass
@Data
@EqualsAndHashCode(callSuper = true)
public abstract class JpaToscaWithTypeAndStringProperties<T extends ToscaWithTypeAndObjectProperties>
        extends JpaToscaEntityType<T> implements PfAuthorative<T> {

    .............................

     @Convert(converter = StringToMapConverter.class)
     @Column(columnDefinition = "json")
     private Map<@NotNull String, @NotNull String> properties;

     .............................
 }

Converters

StringToMapConverter
@Converter(autoApply = true)
public final class StringToMapConverter implements AttributeConverter<Map<String, ? extends Object>, String> {

    private static final Coder coder = new StandardCoder();

    @Override
    public String convertToDatabaseColumn(Map<String, ? extends Object> map) {
        try {
            return map == null ? null : coder.encode(map);
        } catch (CoderException e) {
            throw new PfModelRuntimeException(Response.Status.BAD_REQUEST, e.getMessage(), e);
        }
    }

    @Override
    public Map<String, ? extends Object> convertToEntityAttribute(String dbData) {
        if (dbData == null) {
            return Map.of();
        }
        try {
            return coder.decode(dbData, Map.class);
        } catch (CoderException e) {
            throw new PfModelRuntimeException(Response.Status.BAD_REQUEST, e.getMessage(), e);
        }
    }
}

Table of comparison

TypeDataStoreValidationSupport QueryIndexPreserve Order
longtextMariaDBSELECTYesNoYes
longtextPostgreSQLNoNoNoYes
jsonPostgreSQLINSERT / UPDATEYesNoYes
jsonbPostgreSQLINSERT / UPDATEYesYesNo

Proposals

My propose options are shown below:

  1. replace all ElementCollections with Converters:
    • Small Json instead of Java binary code
    • compatible with H2, MariaDB and PostreSql
    • it impacts only onap/policy/models, persistence classes used into ElementCollections have to change to a Document oriented classes
    • 50% of tables (with postfix as _ATTRIBUTES, _PROPERTIES, _METADATA, _OCCURRENCES, _CONSTRAINTS, _TARGETS, _TRIGGERS and _INPUTS) will be removed
    • Unit tests only onap/policy/models
    • onap/policy/models still compatible with not Spring Application
  2. Unique Json String:
    • One big Json for each model (e.g. saving the whole ToscaServiceTemplace as Json)
    • compatible only with PostreSql
    • it impacts onap/policy/models and all applications related (runtime-acm, api and pap), persistence classes have to change to a Document oriented classes
    • 90% of tables will be removed
    • Unit tests need PostgreSQLContainer (is not compatible with H2)
    • onap/policy/models still compatible with not Spring Application
  3. MongoDB/Cassandra
    • Document oriented approach full supported by SpringBoot (not needs Converters)
    • it impacts onap/policy/models and all applications related (runtime-acm, api and pap), and all repositories and persistence classes have to change to a Document oriented classes
    • compatible only with MongoDB/Cassandra (MongoDB and Cassandra are not compatible to each other)
    • Unit tests need an Embedded Server (example for cassandra: EmbeddedCassandraServerHelper or CassandraContainer)

Benchmark Performance of Controlloop Runtime

In order to generate the benchmark I have used:

  • Jmeter to generate requests (same used by performance tests): Requests/sec: 287
  • Prometheus for monitoring
  • DMaap simulator
  • Participant simulator
  • MariaDB/PostgreSQL/MongoDB

The existing system

Eclipse-Link/Mariadb. Tosca Service template is saved as a schema entity relation.

Postgres with the JSON store

Hibernate/PostgreSQL. Tosca Service Template is saved into a jsonb type.

The equivalent in MariaDB

Hibernate/Mariadb. Tosca Service Template is saved into a longtext as Json.

MongoDB

MongoDB. Tosca Service Template is saved as MongoDB Document.

  • id cannot have dot '.' in MongoDB : solved with minimal configuration

Benchmark Performance of Policy Pap

In order to generate the benchmark I have used:

  • Jmeter to generate requests (same used by performance tests)
  • Prometheus for monitoring
  • DMaap simulator
  • MariaDB

The existing system

Eclipse-Link/Mariadb with dao pattern and atomic transactions. At that time the migration to Spring repositories are not finished yet, so the benchmark is calculate by endpoints performance ('server_requests' metric).


Conclusion

  • Moving to Spring Data JPA there are already improvements
  • NoSql does not have standard, so PostgreSQL jsonb, MongoDB and Cassandra are not compatible to each other
  • There is much more support for MongoDB/Cassandra than PostgreSQL jsonb
  • In MongoDB/Cassandra JSON queries are much more intuitive than PostgreSQL jsonb
  • PostgreSql seems to be the fastest


  • No labels