Guangrong Fu mentioned AAI in Baseline Measurements based on Testing Results:

  1. Cache the AAI data and refresh it periodically so that Holmes won't have to make an HTTP call to AAI every time it tries to correlate one alarm to another.

The problem with caching is knowing when to update the cached data. Even though access may be fast for Holmes, the risk is that it uses out-of-date data, in which case the correlations will be wrong anyway. Also, duplicating the AAI data outside of AAI is probably a bad architectural decision. Making AAI faster for these use cases would be better.
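To make the staleness trade-off concrete, here is a minimal sketch of a time-based (TTL) cache. It is not Holmes code: the `TtlCache` class, the `fetch` callback and the `ttl_seconds` parameter are hypothetical names used only for illustration. Any value served within the TTL window may already be out of date in AAI, which is exactly the risk described above.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TtlCache:
    """Hypothetical cache: serve a stored value until it is ttl_seconds old."""

    def __init__(self, fetch: Callable[[str], Any], ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch          # stands in for the HTTP call to AAI
        self._ttl = ttl_seconds      # maximum age of a cached entry
        self._clock = clock          # injectable so the expiry logic can be tested
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def get(self, key: str) -> Any:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            # Fast path: no call is made, but the data may be stale.
            return entry[1]
        # Missing or expired: fetch again and remember when we did so.
        value = self._fetch(key)
        self._entries[key] = (now, value)
        return value
```

A shorter TTL narrows the window in which a correlation can run on stale topology, but it also brings back more of the HTTP calls the cache was meant to avoid.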

Has there been a performance analysis of where the time is spent? Could it help to use ElasticSearch (e.g. as in sparky)? Should Holmes have a batch interface to get more AAI data in fewer calls? Or a better correlation API that results in fewer calls?

31st Oct:

1st Nov:

  • Guangrong Fu will try custom queries for the queries that took too long to return
  • The hardware (mainly storage) influences the query speed - need to find out what hardware the speed test was conducted on (Guangrong Fu will provide HW specs)
  • HOLMES-186

Would the AAI Cacher (AAI-1337) help to improve performance?

5th Mar: Guangrong Fu


Sorry for my late response. It took me a long time to set up AAI in my own env. For Item 10, here's some information:

Main APIs invoked in Holmes for different use cases:


  • Getting the VM query URL via: /search/nodes-query?search-node-type=vserver&filter=vserver-name:EQUALS: - once
  • Getting VM info via: the URL returned by the query above - once
  • Getting the VNF data via: network/generic-vnfs/generic-vnf - once


  • Updating terminal point via: /network/pnfs/pnf/{pnfName}/p-interfaces/p-interface/nodeId-{pnfName}-ltpId-{ifName} - once
  • Getting logical links via: /network/pnfs/pnf/{pnfName}/p-interfaces/p-interface/nodeId-{pnfName}-ltpId-{ifName} - 3 times
  • Getting VPN binding info via: /network/pnfs/pnf/{pnfName}/p-interfaces/p-interface/nodeId-{pnfName}-ltpId-{ifName} - once
  • Getting connectivity info via: /network/vpn-bindings/vpn-binding/{vpnId} - once
  • Getting service instance info via: /network/connectivities/connectivity/{connectivityId} - once


We set up an AAI env on a VM (8 cores, 16GB memory, 160GB storage) following the guidance and ran a VNF query 1000 times using "/aai/v11/cloud-infrastructure/cloud-regions/cloud-region/example-cloud-owner-val-45051/example-cloud-region-id-val-56689/tenants/tenant/example-tenant-id-val-51834/vservers/vserver/example-vserver-id-val-51834" (which is returned by "/search/nodes-query?search-node-type=vserver&filter=vserver-name:EQUALS:"). It took ~95ms per query. We also queried a VNF 1000 times via "/aai/v11/network/generic-vnfs/generic-vnf/example-vnf-id-val-92494" and the average time was ~86ms.

These results show that even a single request costs around 100ms, and several requests are sent to AAI each time Holmes processes an alarm. Taking CCVPN as an example, up to 7 requests are made per alarm, which means Holmes spends around 600-700 ms interacting with AAI. During alarm storms, it is hard for AAI to support such intensive queries.
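The estimate above can be reproduced with a few lines of arithmetic. This is only a sketch: it assumes the 7 requests are issued one after another and that each costs roughly the measured average (~86ms or ~95ms). The function names are hypothetical.

```python
def per_alarm_latency_ms(requests_per_alarm: int, latency_per_request_ms: float) -> float:
    """Total AAI time for one alarm, assuming strictly sequential requests."""
    return requests_per_alarm * latency_per_request_ms


def max_alarms_per_second(requests_per_alarm: int, latency_per_request_ms: float) -> float:
    """Upper bound on alarms per second for a single sequential worker."""
    return 1000.0 / per_alarm_latency_ms(requests_per_alarm, latency_per_request_ms)


if __name__ == "__main__":
    for latency in (86.0, 95.0):  # measured averages quoted in the text
        total = per_alarm_latency_ms(7, latency)
        rate = max_alarms_per_second(7, latency)
        print(f"{latency:.0f} ms/request -> {total:.0f} ms/alarm, {rate:.2f} alarms/s")
```

Under these assumptions, a single worker handles fewer than two alarms per second, which is why alarm storms are a concern.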

6th March: Guangrong Fu

In my opinion, the performance of AAI queries is impacted not only by the computation inside AAI, but also by the HTTP request itself.

I've done another test: I sent requests to the Holmes health check API (which does nothing but return immediately after it receives a request). The average time cost is also ~70ms. So the problem seems to be the time cost of setting up and releasing HTTP connections.
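One way to check whether connection setup and teardown dominate is to compare a new connection per request against a single reused (keep-alive) connection. The sketch below uses only the Python standard library and a local stand-in server. `HealthHandler`, `start_server`, the `/healthcheck` path and the two timing functions are hypothetical and do not reflect the real Holmes or AAI endpoints.

```python
import http.client
import threading
import time
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class HealthHandler(BaseHTTPRequestHandler):
    """Stand-in health check endpoint that returns immediately."""
    protocol_version = "HTTP/1.1"  # required for persistent connections
    wbufsize = -1                  # buffer so headers and body go out in one write

    def do_GET(self):
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the output quiet


def start_server() -> ThreadingHTTPServer:
    """Start the stand-in server on a free local port in a daemon thread."""
    server = ThreadingHTTPServer(("127.0.0.1", 0), HealthHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def timed_new_connection_per_request(host: str, port: int, n: int):
    """Open and close a TCP connection for every request."""
    ok, start = 0, time.perf_counter()
    for _ in range(n):
        conn = http.client.HTTPConnection(host, port, timeout=5)
        conn.request("GET", "/healthcheck")
        resp = conn.getresponse()
        resp.read()
        ok += int(resp.status == 200)
        conn.close()
    return ok, (time.perf_counter() - start) / n * 1000.0  # (successes, avg ms)


def timed_reused_connection(host: str, port: int, n: int):
    """Send every request over one persistent connection."""
    ok, conn = 0, http.client.HTTPConnection(host, port, timeout=5)
    start = time.perf_counter()
    for _ in range(n):
        conn.request("GET", "/healthcheck")
        resp = conn.getresponse()
        resp.read()  # the body must be drained before the connection is reused
        ok += int(resp.status == 200)
    elapsed = time.perf_counter() - start
    conn.close()
    return ok, elapsed / n * 1000.0
```

Calling both timing functions with the host and port from `start_server().server_address` gives the two averages to compare. If the reused connection is clearly faster against the real services, a pooled HTTP client in Holmes would be worth trying before changing anything in AAI.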

6th March: Keong

Regarding these queries:

  • Getting logical links via: /network/pnfs/pnf/{pnfName}/p-interfaces/p-interface/nodeId-{pnfName}-ltpId-{ifName} - 3 times
  • Getting VPN binding info via: /network/pnfs/pnf/{pnfName}/p-interfaces/p-interface/nodeId-{pnfName}-ltpId-{ifName} - once

What depth is used on these GET calls? If they default to depth=0, then perhaps some improvement can be made by using "depth=1" or "depth=2"? Fewer calls returning more data could improve overall performance.

The same could be achieved by changing to a Nodes query, e.g.

GET /aai/v14/nodes/p-interfaces?interface-name=nodeId-{pnfName}-ltpId-{ifName}
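As a sketch of the two options above, the request paths could be built as follows. The helper names `p_interface_url` and `nodes_query_url` are hypothetical, and the default API version `v14` is taken from the example above rather than from the Holmes code. Whether `depth=1` or `depth=2` returns the data Holmes needs would still have to be verified against AAI.

```python
from typing import Optional
from urllib.parse import quote, urlencode


def interface_name(pnf_name: str, if_name: str) -> str:
    """Interface naming pattern used in the paths quoted above."""
    return f"nodeId-{pnf_name}-ltpId-{if_name}"


def p_interface_url(pnf_name: str, if_name: str, depth: Optional[int] = None,
                    api_version: str = "v14") -> str:
    """Resource path for one p-interface, optionally with a depth parameter."""
    path = (
        f"/aai/{api_version}/network/pnfs/pnf/{quote(pnf_name, safe='')}"
        f"/p-interfaces/p-interface/{quote(interface_name(pnf_name, if_name), safe='')}"
    )
    if depth is not None:
        path += "?" + urlencode({"depth": depth})
    return path


def nodes_query_url(pnf_name: str, if_name: str, api_version: str = "v14") -> str:
    """Nodes query path that looks the interface up by its name."""
    query = urlencode({"interface-name": interface_name(pnf_name, if_name)})
    return f"/aai/{api_version}/nodes/p-interfaces?{query}"


if __name__ == "__main__":
    print(p_interface_url("pnf1", "eth0", depth=1))
    print(nodes_query_url("pnf1", "eth0"))
```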

Question 1: Can the Bulk API be used with GET calls? The documentation does not show any examples of GET actions.

Question 2: Would it help to have the Holmes pod co-located with the AAI haproxy and AAI resources pods? Reduced network latency could improve overall performance.

Guangrong: Holmes is actually deployed by DCAE, so I'm not sure whether your proposal is feasible. What's more, the performance data I got was measured with Holmes and AAI deployed on the same VM, sharing the same docker env.
