...

Example heap dump from OOME during CM-handle search

Summary of Test

In a test deployment using one instance of NCMP (limited to 2 CPUs and 1 GB memory), 20,000 CM-handles were registered with some public properties (some handles using different properties). Five CM-handle searches were then run in parallel using curl. An out-of-memory error (OOME) was observed within 3 seconds.

Analysis

Analyzing the heap dump produced from the crash shows that each of the five searches consumed substantial memory. (Hazelcast was also observed to be a lesser but still significant memory consumer.)

From this graph, we see that each search returning ~10K handles consumed around 25 MB.

[Image: heap dump graph of memory consumed by each search]

Looking more closely at each thread executing the queries, we see that there were many ArrayLists in memory, two of which were very large.

[Image: ArrayLists held in memory by a query thread]

Looking more closely at the ArrayLists, we see one contains many thousands of Postgres Tuples, while the other contains CPS FragmentEntities:

[Image: ArrayList of Postgres Tuples]
[Image: ArrayList of CPS FragmentEntities]

This illustrates the core problem: large collections are held in memory in full, and none of their elements can be garbage collected until the whole collection has been processed/transformed.
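The pattern can be sketched in Java as follows. This is a minimal illustration only, not the actual CPS/NCMP code: `MaterializedSearch`, `Row`, and `Entity` are hypothetical stand-ins for the query code, the Postgres Tuples, and the FragmentEntities seen in the heap dump. It shows how two full-size lists are live at the same moment while one is transformed into the other.

```java
import java.util.ArrayList;
import java.util.List;

public class MaterializedSearch {

    // Hypothetical stand-in for a raw database tuple.
    record Row(int id, String cmHandleId) {}

    // Hypothetical stand-in for a mapped entity.
    record Entity(int id, String cmHandleId) {}

    // Simulates a query that returns the whole result set as one list.
    static List<Row> fetchAllRows(int count) {
        List<Row> rows = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            rows.add(new Row(i, "ch-" + i));
        }
        return rows;
    }

    // Maps every row to an entity. While this loop runs, both lists are
    // strongly reachable, so neither can be garbage collected.
    static List<Entity> toEntities(List<Row> rows) {
        List<Entity> entities = new ArrayList<>(rows.size());
        for (Row row : rows) {
            entities.add(new Entity(row.id(), row.cmHandleId()));
        }
        return entities;
    }

    public static void main(String[] args) {
        List<Row> rows = fetchAllRows(10_000);
        List<Entity> entities = toEntities(rows);
        System.out.println("rows=" + rows.size() + ", entities=" + entities.size());
    }
}
```

With several such requests running in parallel, each thread holds its own pair of lists, which is consistent with the per-thread memory use observed in the heap dump.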

Details of Test Setup

In a test deployment using a single instance of NCMP run using Docker (with resources limited to 2 CPUs and 1 GB memory), 20,000 CM-handles were registered with some public properties (10K using different properties):

...

```bash
curl --location 'http://localhost:8883/ncmp/v1/ch/searches' \
--header 'Content-Type: application/json' \
--data '{
    "cmHandleQueryParameters": [
        {
            "conditionName": "hasAllProperties",
            "conditionParameters": [ {"Color": "yellow"}, {"Size": "small"} ]
        }
    ]
}'
```

An OOME was observed within 3 seconds.

...

Proposed Solution

It is proposed to create an end-to-end streaming solution, from Persistence layer to Controller. A Proof of Concept will be constructed to document challenges and investigate performance characteristics.
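As a rough illustration of the streaming idea (not the Proof of Concept itself), the sketch below uses `java.util.stream.Stream` to pull elements through a pipeline one at a time and write each result immediately. All names (`StreamingSearch`, `Row`, `Entity`, `streamRows`, `writeSearchResult`) are hypothetical; in a real service the `Writer` would presumably be backed by the HTTP response rather than an in-memory buffer.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.util.stream.IntStream;
import java.util.stream.Stream;

public class StreamingSearch {

    // Hypothetical stand-ins for a database tuple and a mapped entity.
    record Row(int id, String cmHandleId) {}
    record Entity(int id, String cmHandleId) {}

    // Lazily produces rows one at a time; stands in for a database cursor.
    static Stream<Row> streamRows(int count) {
        return IntStream.range(0, count).mapToObj(i -> new Row(i, "ch-" + i));
    }

    // Pulls rows through the pipeline and writes each result as soon as it
    // is mapped, so no full-size intermediate list is built.
    // Returns the number of entries written.
    static long writeSearchResult(Stream<Row> rows, Writer out) {
        long[] written = {0};
        try (rows) { // closing the stream would release an underlying cursor
            rows.map(row -> new Entity(row.id(), row.cmHandleId()))
                .forEach(entity -> {
                    try {
                        out.write(entity.cmHandleId());
                        out.write('\n');
                        written[0]++;
                    } catch (IOException e) {
                        throw new UncheckedIOException(e);
                    }
                });
        }
        return written[0];
    }

    public static void main(String[] args) {
        // StringWriter is used here only to keep the example self-contained.
        StringWriter out = new StringWriter();
        long count = writeSearchResult(streamRows(3), out);
        System.out.print(out);
        System.out.println("written=" + count);
    }
}
```

Because each element becomes unreachable once it has been written, only a small number of elements need to be live at any moment, regardless of the size of the result set.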

...