...

[Diagram: CPS NCMP dataflow]

Example heap dump from OOME during CM-handle search

In a test deployment with a single NCMP instance running in Docker (limited to 2 CPUs and 1 GB of memory), 20,000 CM-handles were registered with public properties (10K of them using different properties):

Code Block
languagejs
{
    "dmiPlugin": "http://ncmp-dmi-plugin-demo-and-csit-stub:8092",
    "createdCmHandles": [
        {
            "cmHandle": "ch-1",
            "cmHandleProperties": { "neType": "RadioNode" },
            "publicCmHandleProperties": {
                "Color": "yellow",
                "Size": "small",
                "Shape": "cube"
            }
        }
    ]
}

Then five CM-handle searches were run in parallel using curl (each search using two condition parameters):

Code Block
languagebash
curl --location 'http://localhost:8883/ncmp/v1/ch/searches' \
--header 'Content-Type: application/json' \
--data '{
    "cmHandleQueryParameters": [
        {
            "conditionName": "hasAllProperties",
            "conditionParameters": [ {"Color": "yellow"}, {"Size": "small"} ]
        }
    ]
}'

An OutOfMemoryError (OOME) was observed within 3 seconds.

Analysis of the heap dump produced by the crash shows that each of the five searches consumed substantial memory. (Hazelcast was also observed to be a lesser but still significant memory consumer.)

From this graph, we see that each search returning 10K handles consumed around 25 MB.

[Image: heap dump graph of memory consumed per search]

Looking more closely at each thread executing the queries shows that there were many ArrayLists in memory, two of which were very large.

[Image: heap dump view of the ArrayLists held by a query thread]

Looking more closely at the ArrayLists, we see one contains many thousands of Postgres Tuples, while the other contains CPS FragmentEntities:

[Images: contents of the two large ArrayLists (Postgres Tuples and CPS FragmentEntities)]

This illustrates the core problem: large collections are held in memory, and a collection cannot be garbage collected until it has been fully processed or transformed.

Proposed Solution

It is proposed to create an end-to-end streaming solution, from Persistence layer to Controller. A Proof of Concept will be constructed to document challenges and investigate performance characteristics.

...

The use of pagination in the FragmentEntity Stream could later be made to self-optimize using adaptive paging. The use of Java Streams could also allow for faster processing using parallel streams.
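The idea of a paginated stream can be illustrated with a minimal sketch. It is not the CPS implementation: `PagedStreamSketch`, `fetchPage` and `pagedStream` are hypothetical names, an in-memory list stands in for the database, and the page size is fixed rather than adaptive. The point is only that the next page is fetched when the consumer has used up the current one, so at most one page is held at a time.

```java
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.function.BiFunction;
import java.util.stream.Collectors;
import java.util.stream.IntStream;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

class PagedStreamSketch {

    /** Counts how many pages the stand-in data source has served. */
    static int pagesFetched = 0;

    /** Hypothetical stand-in for a paged database query: rows [offset, offset + pageSize). */
    static List<String> fetchPage(List<String> table, int offset, int pageSize) {
        pagesFetched++;
        if (offset >= table.size()) {
            return Collections.emptyList();
        }
        return table.subList(offset, Math.min(offset + pageSize, table.size()));
    }

    /** Lazily streams all rows, fetching the next page only when the current one is exhausted. */
    static <T> Stream<T> pagedStream(BiFunction<Integer, Integer, List<T>> pageFetcher, int pageSize) {
        Iterator<T> iterator = new Iterator<T>() {
            private Iterator<T> currentPage = Collections.emptyIterator();
            private int offset = 0;
            private boolean exhausted = false;

            @Override
            public boolean hasNext() {
                while (!currentPage.hasNext() && !exhausted) {
                    List<T> page = pageFetcher.apply(offset, pageSize);
                    offset += page.size();
                    exhausted = page.size() < pageSize; // a short (or empty) page is the last one
                    currentPage = page.iterator();
                }
                return currentPage.hasNext();
            }

            @Override
            public T next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return currentPage.next();
            }
        };
        return StreamSupport.stream(
                Spliterators.spliteratorUnknownSize(iterator, Spliterator.ORDERED), false);
    }

    public static void main(String[] args) {
        List<String> table = IntStream.rangeClosed(1, 20_000)
                .mapToObj(i -> "ch-" + i)
                .collect(Collectors.toList());
        List<String> firstFive = pagedStream((offset, size) -> fetchPage(table, offset, size), 1_000)
                .limit(5)
                .collect(Collectors.toList());
        System.out.println(firstFive + " using " + pagesFetched + " page fetch(es)");
    }
}
```

An adaptive variant would presumably adjust `pageSize` between fetches (for example based on observed fetch time); that is left out here because the proposal does not specify how it would work.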

Additional details of current memory consumption - data conversions

The read APIs in CPS Core (cps-service and cps-ri) return Collection<DataNode>:

Code Block
languagejava
Collection<DataNode> queryDataNodes(String dataspaceName, String anchorName, String cpsPath, FetchDescendantsOption fetchDescendantsOption);
Collection<DataNode> getDataNodes(String dataspaceName, String anchorName, String xpath, FetchDescendantsOption fetchDescendantsOption);
Collection<DataNode> getDataNodesForMultipleXpaths(String dataspaceName, String anchorName, Collection<String> xpaths, FetchDescendantsOption fetchDescendantsOption);

Additionally, internal APIs in CPS Reference Implementation (cps-ri) use List<FragmentEntity>, e.g.

Code Block
languagejava
List<FragmentEntity> findByAnchorAndCpsPath(AnchorEntity anchorEntity, CpsPathQuery cpsPathQuery);

When a CPS path query is run, the result is a List<FragmentEntity>, which needs to be converted to a Collection<DataNode>. The Fragment Entities cannot be garbage collected until the whole list has been converted to Data Nodes, so both collections are in memory at the same time. This roughly doubles the memory usage.

Additionally, NCMP uses CPS path queries, e.g. to find CM handles in a given state. NCMP will then convert Collection<DataNode> to Collection<YangModelCmHandle>. Again, the Collection<DataNode> cannot be garbage collected until fully converted to YangModelCmHandles. This again results in doubling of memory usage.

The same applies when converting to NcmpServiceCmHandle.
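The chain of conversions described above can be sketched with Java Streams to show how the intermediate collections disappear. The classes below are heavily simplified, hypothetical stand-ins for FragmentEntity, DataNode and YangModelCmHandle (the real types carry far more state), and the xpath shape is an assumption made for illustration. With `Stream.map`, each element passes through both conversions before the next element is read, so no full List<FragmentEntity> or Collection<DataNode> needs to exist.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;
import java.util.stream.IntStream;
import java.util.stream.Stream;

class LazyConversionSketch {

    // Hypothetical, simplified stand-ins for the CPS and NCMP types named in the text.
    static final class FragmentEntity {
        final String xpath;
        FragmentEntity(String xpath) { this.xpath = xpath; }
    }

    static final class DataNode {
        final String xpath;
        DataNode(String xpath) { this.xpath = xpath; }
    }

    static final class YangModelCmHandle {
        final String id;
        YangModelCmHandle(String id) { this.id = id; }
    }

    /** Counts conversions so the laziness of the pipeline can be observed. */
    static final AtomicInteger conversions = new AtomicInteger();

    static DataNode toDataNode(FragmentEntity entity) {
        conversions.incrementAndGet();
        return new DataNode(entity.xpath);
    }

    static YangModelCmHandle toYangModelCmHandle(DataNode dataNode) {
        // Assumed xpath shape for illustration: .../cm-handles[@id='ch-1']
        String xpath = dataNode.xpath;
        String id = xpath.substring(xpath.indexOf('\'') + 1, xpath.lastIndexOf('\''));
        return new YangModelCmHandle(id);
    }

    /** Each element flows through both conversions before the next one is read. */
    static Stream<YangModelCmHandle> toCmHandles(Stream<FragmentEntity> entities) {
        return entities
                .map(LazyConversionSketch::toDataNode)
                .map(LazyConversionSketch::toYangModelCmHandle);
    }

    /** Simulated lazy source of fragment entities. */
    static Stream<FragmentEntity> sampleEntities(int count) {
        return IntStream.rangeClosed(1, count)
                .mapToObj(i -> new FragmentEntity("/dmi-registry/cm-handles[@id='ch-" + i + "']"));
    }

    public static void main(String[] args) {
        Stream<YangModelCmHandle> cmHandles = toCmHandles(sampleEntities(10_000));
        System.out.println("Conversions before the terminal operation: " + conversions.get());
        List<String> ids = cmHandles.map(h -> h.id).collect(Collectors.toList());
        System.out.println("Converted " + ids.size() + " handles with " + conversions.get() + " conversions");
    }
}
```

Only the final result is collected here; in an end-to-end streaming design even that last list could be avoided by handing the stream on to the caller.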

NCMP also contains many queries where only partial results are needed, making a Streams API ideal.
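As a rough illustration of why partial-result queries benefit from streams, the sketch below uses short-circuiting operations (`findFirst`, `limit`) on a simulated source. `PartialResultSketch`, `CmHandle` and the way states are assigned are all hypothetical; the counter simply records how many handles were read before the pipeline stopped.

```java
import java.util.List;
import java.util.Optional;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;
import java.util.stream.IntStream;
import java.util.stream.Stream;

class PartialResultSketch {

    /** Hypothetical, simplified CM handle with just an id and a state. */
    static final class CmHandle {
        final String id;
        final String state;
        CmHandle(String id, String state) {
            this.id = id;
            this.state = state;
        }
    }

    /** Counts how many handles were actually read from the simulated source. */
    static final AtomicInteger handlesRead = new AtomicInteger();

    /** Simulated lazy source: every 100th handle is READY, all others are ADVISED. */
    static Stream<CmHandle> allCmHandles(int total) {
        return IntStream.rangeClosed(1, total).mapToObj(i -> {
            handlesRead.incrementAndGet();
            return new CmHandle("ch-" + i, i % 100 == 0 ? "READY" : "ADVISED");
        });
    }

    /** Stops reading from the source as soon as one match is found. */
    static Optional<String> firstIdInState(Stream<CmHandle> handles, String state) {
        return handles.filter(h -> h.state.equals(state)).map(h -> h.id).findFirst();
    }

    /** Stops reading from the source once maxResults matches are found. */
    static List<String> firstIdsInState(Stream<CmHandle> handles, String state, int maxResults) {
        return handles.filter(h -> h.state.equals(state))
                .map(h -> h.id)
                .limit(maxResults)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Optional<String> first = firstIdInState(allCmHandles(20_000), "READY");
        System.out.println(first.orElse("none") + " found after reading " + handlesRead.get() + " handles");
    }
}
```

With a List-based API the full 20,000 handles would have been loaded and converted before the first match could be returned.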

Additionally, all REST APIs returning query results return Lists. The Spring Framework allows returning Streams, which would avoid the memory overhead of building the full list first.
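To show what streaming at the response end means in principle, here is a plain-Java sketch that writes a JSON array element by element from a Stream to a Writer. It deliberately does not use Spring: `StreamingResponseSketch`, `quote` and `writeJsonArray` are hypothetical names, and the hand-rolled JSON escaping is a minimal stand-in for a real serializer. The only point is that each element can be written and discarded before the next one is produced.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.util.Iterator;
import java.util.stream.Stream;

class StreamingResponseSketch {

    /** Minimal JSON string escaping (quotes, backslashes and control characters). */
    static String quote(String value) {
        StringBuilder sb = new StringBuilder("\"");
        for (char c : value.toCharArray()) {
            if (c == '"' || c == '\\') {
                sb.append('\\').append(c);
            } else if (c < 0x20) {
                sb.append(String.format("\\u%04x", (int) c));
            } else {
                sb.append(c);
            }
        }
        return sb.append('"').toString();
    }

    /** Writes the stream as a JSON array of strings, one element at a time. */
    static void writeJsonArray(Stream<String> cmHandleIds, Writer out) {
        try {
            out.write('[');
            Iterator<String> iterator = cmHandleIds.iterator();
            while (iterator.hasNext()) {
                out.write(quote(iterator.next()));
                if (iterator.hasNext()) {
                    out.write(',');
                }
            }
            out.write(']');
            out.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // A StringWriter stands in for the HTTP response body.
        StringWriter response = new StringWriter();
        writeJsonArray(Stream.of("ch-1", "ch-2", "ch-3"), response);
        System.out.println(response);
    }
}
```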