...

NCMP CM Handle Queries are directly implicated in CPS-2146, as the Out Of Memory errors occur during the NCMP Search and ID Search functions.

Use of Postgres Arrays in Repository methods

Use of Postgres arrays in JpaRepository methods may be using too much memory.

For example, see this partial stack trace:

2024-02-28T05:18:25.049Z@eric-oss-ncmp-04@ncmp@Connection leak detection triggered for org.postgresql.jdbc.PgConnection@b358fc9 on thread qtp1699794502-7604, stack trace follows, logger: com.zaxxer.hikari.pool.ProxyLeakTask, thread_name: CpsDatabasePool housekeeper, stack_trace: java.lang.Exception: Apparent connection leak detected
 org.onap.cps.spi.repository.YangResourceRepository.findAllModuleReferencesByDataspaceAndModuleNames(YangResourceRepository.java:111)
 org.onap.cps.spi.impl.CpsAdminPersistenceServiceImpl.validateDataspaceAndModuleNames(CpsAdminPersistenceServiceImpl.java:206)
 org.onap.cps.spi.impl.CpsAdminPersistenceServiceImpl.queryAnchors(CpsAdminPersistenceServiceImpl.java:143)
 org.onap.cps.api.impl.CpsAnchorServiceImpl.queryAnchorNames(CpsAnchorServiceImpl.java:90)
 org.onap.cps.ncmp.api.impl.inventory.InventoryPersistenceImpl.getCmHandleIdsWithGivenModules(InventoryPersistenceImpl.java:174)
 org.onap.cps.ncmp.api.impl.NetworkCmProxyCmHandleQueryServiceImpl.executeModuleNameQuery(NetworkCmProxyCmHandleQueryServiceImpl.java:167)
 org.onap.cps.ncmp.api.impl.NetworkCmProxyCmHandleQueryServiceImpl.executeQueries(NetworkCmProxyCmHandleQueryServiceImpl.java:256)
 org.onap.cps.ncmp.api.impl.NetworkCmProxyCmHandleQueryServiceImpl.queryCmHandleIds(NetworkCmProxyCmHandleQueryServiceImpl.java:71)
 org.onap.cps.ncmp.api.impl.NetworkCmProxyCmHandleQueryServiceImpl.queryCmHandles(NetworkCmProxyCmHandleQueryServiceImpl.java:95)
 org.onap.cps.ncmp.api.impl.NetworkCmProxyDataServiceImpl.executeCmHandleSearch(NetworkCmProxyDataServiceImpl.java:215)
 org.onap.cps.ncmp.rest.controller.NetworkCmProxyController.searchCmHandles(NetworkCmProxyController.java:253)

The code causing the exception in YangResourceRepository is:

    default Set<YangResourceModuleReference> findAllModuleReferencesByDataspaceAndModuleNames(
        final String dataspaceName, final Collection<String> moduleNames) {
        return findAllModuleReferencesByDataspaceAndModuleNames(dataspaceName, moduleNames.toArray(new String[0]));
    }
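
If the size of the array passed to the query does turn out to be the problem, one possible mitigation is to split the module names into fixed-size batches and merge the results. The following is a minimal sketch of that idea only. The class name ModuleNameBatcher, the batch size, and the queryBatch function (a stand-in for the array-based repository method) are hypothetical and not part of CPS, and the approach has not been measured against CPS-2146.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;

// Hypothetical helper, not part of CPS: bounds the size of each array handed to a query.
final class ModuleNameBatcher {

    private ModuleNameBatcher() {
    }

    // Splits the module names into arrays of at most batchSize elements.
    static List<String[]> toBatches(final Collection<String> moduleNames, final int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        final List<String> names = new ArrayList<>(moduleNames);
        final List<String[]> batches = new ArrayList<>();
        for (int start = 0; start < names.size(); start += batchSize) {
            final int end = Math.min(start + batchSize, names.size());
            batches.add(names.subList(start, end).toArray(new String[0]));
        }
        return batches;
    }

    // Runs queryBatch (assumed to wrap the array-based repository call) once per batch
    // and merges the per-batch results into a single set.
    static <T> Set<T> queryInBatches(final Collection<String> moduleNames, final int batchSize,
                                     final Function<String[], Set<T>> queryBatch) {
        final Set<T> merged = new HashSet<>();
        for (final String[] batch : toBatches(moduleNames, batchSize)) {
            merged.addAll(queryBatch.apply(batch));
        }
        return merged;
    }
}
```

In this sketch the caller would pass a lambda that invokes the existing array-based overload with the dataspace name and one batch at a time. This trades one large query for several smaller ones, so it would need profiling before being adopted.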

Hazelcast

The use of Hazelcast (an In-Memory Data Grid) has been identified as a particular source of high memory usage. Some points of interest:

  • In NCMP, Hazelcast is not used as a cache, so idle eviction is not used, and the structures are configured to have 3 backups. It follows that scaling up the deployment (e.g. Kubernetes auto-scaling) would not help in a low-memory situation, as the new instances would also be storing the whole structure.
  • Given that Hazelcast is configured for synchronous operation, it is likely to perform worse than a database solution.
  • There are additional reasons to avoid Hazelcast: as a distributed asynchronous system, it cannot give strong consistency guarantees like an ACID database, and it is prone to split-brain, among other issues.
  • I strongly advise against the use of Hazelcast for future development.
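
The memory argument in the first bullet can be illustrated with a rough calculation. This is a sketch under stated assumptions, not a measurement: it assumes Hazelcast's partitioning model, in which each entry has one primary copy plus at most the configured number of backups, each on a different member, and it assumes data is spread evenly across members. The class and method names are hypothetical.

```java
// Hypothetical estimate, not part of CPS or Hazelcast.
final class BackupMemoryEstimate {

    private BackupMemoryEstimate() {
    }

    // Approximate fraction of the whole data set held by each member,
    // assuming an even spread and backups placed on distinct members.
    static double fractionPerMember(final int members, final int backupCount) {
        if (members < 1 || backupCount < 0) {
            throw new IllegalArgumentException("members must be >= 1 and backupCount >= 0");
        }
        // A backup cannot live on the member that owns the primary copy,
        // so the effective number of backups is capped at members - 1.
        final int copiesPerEntry = 1 + Math.min(backupCount, members - 1);
        return (double) copiesPerEntry / members;
    }
}
```

Under these assumptions, with 3 backups the estimate stays at 1.0 (every member holds the whole structure) for clusters of up to four members, which is the situation the first bullet describes; it only drops below 1.0 for larger clusters, for example 0.5 with eight members.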

Side note - this was seen in logs of CPS-2146:

2024-02-28T05:23:53.961Z@eric-oss-ncmp-04@ncmp@[192.168.89.193]:5701 ["cps-and-ncmp-common-cache-cluster"] [5.2.4] A split-brain merge validation request was received, but the current member is not a master. The master address will be sent to the request source ([192.168.124.37]:5705), logger: com.hazelcast.internal.cluster.impl.operations.SplitBrainMergeValidationOp, thread_name: hz.hazelCastInstanceCpsCore.priority-generic-operation.thread-0

The following is an overview of Hazelcast structures in CPS and NCMP, along with recommendations.

...