Project Name:

Project description:


  • The business scope of Holmes is different from that of Policy

Both Holmes and Policy use Drools as their rules engine. The main difference between the two projects is that Holmes targets correlation analysis between alarms, while Policy implements control loops by triggering a series of actions. In short, Holmes performs root cause analysis, whereas Policy handles auto-healing and auto-scaling.

  • Holmes is necessary to reduce the load that a large volume of alarms places on Policy

With Holmes in place, Policy does not need to handle the original alarms directly. Holmes picks out the root cause from all the original alarms, then selects and publishes the most suitable policy ID. This frees Policy from triggering similar or duplicate actions caused by alarms that are internally related.

For example, suppose three events A, B and C could each lead to a power-down fault, and B and C are caused by A. Without Holmes, all three events are sent to Policy and three corresponding actions are triggered. With Holmes added to the closed-loop controller as the upstream system of Policy, only event A is sent to Policy, so only one action is triggered, which makes closed-loop control more precise and efficient.
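
The filtering in this example can be sketched in a few lines of Python. This is only an illustration of the idea, not how Holmes is implemented (Holmes expresses correlation as Drools rules). The `Alarm` class, its `caused_by` field, the `POLICY_IDS` mapping and the policy ID string are all hypothetical names chosen for the sketch.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Alarm:
    name: str
    # Name of the alarm that caused this one, if known (hypothetical field).
    caused_by: Optional[str] = None


def root_causes(alarms: List[Alarm]) -> List[Alarm]:
    """Keep only alarms whose cause is not another alarm in the same batch."""
    present = {a.name for a in alarms}
    return [a for a in alarms if a.caused_by is None or a.caused_by not in present]


# Hypothetical mapping from a root-cause alarm to the policy ID to publish.
POLICY_IDS: Dict[str, str] = {"A": "policy-power-down-recovery"}


def publish_for_policy(alarms: List[Alarm]) -> List[Tuple[str, Optional[str]]]:
    """Return (root-cause alarm, policy ID) pairs that would be sent downstream."""
    return [(a.name, POLICY_IDS.get(a.name)) for a in root_causes(alarms)]


# Events B and C are caused by A, as in the example above.
alarms = [Alarm("A"), Alarm("B", caused_by="A"), Alarm("C", caused_by="A")]
forwarded = publish_for_policy(alarms)
print(forwarded)  # [('A', 'policy-power-down-recovery')]
```

Of the three input events, only A reaches the downstream consumer, so a single action would be triggered instead of three.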

Scope:

Architecture Alignment:

        

Resources:


Role            | Name         | Gerrit ID | Company      | Email                       | TimeZone
----------------+--------------+-----------+--------------+-----------------------------+-----------------------
Primary Contact | Guangrong Fu |           | ZTE          |                             | Beijing, China. UTC +8
Committers      | Guangrong Fu |           |              |                             |
                | Peng Tang    |           | ZTE          | tang.peng5@zte.com.cn       | Beijing, China. UTC +8
Contributors    | Jiaqiang Du  |           | ZTE          | du.jiaqiang@zte.com.cn      | Beijing, China. UTC +8
                | Yi Li        |           | ZTE          | li.yi101@zte.com.cn         | Beijing, China. UTC +8
                | Youbo Wu     |           | ZTE          | wu.youbo@zte.com.cn         | Beijing, China. UTC +8
                | Liang Feng   |           | ZTE          | feng.liang1@zte.com.cn      | Beijing, China. UTC +8
                | Yuan Liu     |           | China Mobile | liuyuanyjy@chinamobile.com  | Beijing, China. UTC +8
                | Chengli Wang |           | China Mobile | wangchengli@chinamobile.com | Beijing, China. UTC +8
                | Xin(Saw) Jin |           | Huawei       | saw.jin@huawei.com          | Beijing, China. UTC +8



Other Information:

TSC Comment Clarification

(Roberto Kung)

Holmes should be looked at together with CLAMP and/or Policy, mainly Policy (with the introduction of engines and so on). Maybe a split is needed (analytics – alarm aggregation, filtering, correlation in DCAE analytics microservices / policy design RCA in Policy). It may not be high priority for R1 (not needed for our use cases), but it is useful to show intents for following releases.

(Lingli Deng)

Just to clarify, cross-layer fault correlation is in scope for the VoLTE use case for auto-healing.

(Mazin Gilbert)

This project should be split and combined with DCAE (for the correlation engine), the Policy engine (for Drools), and CLAMP (for designing the open loop).

(Lingli Deng)

What about the portal demonstrating the alarms gathered and the correlations made? Would DCAE be providing a portal for that?

(Unknown)

What’s the relationship between CLAMP and Holmes?

(Guangrong Fu)

Holmes is essential for control loops, so it should be provisioned by CLAMP to some extent. For instance, if possible, Holmes rules could be deployed and un-deployed via CLAMP. How to implement this is still unclear, because so far we have not received any seed code or API docs for CLAMP, which prevents further analysis.

Key Project Facts

Project Name:

Repo name: holmes
Lifecycle State:
Primary Contact: Guangrong Fu (fu.guangrong@zte.com.cn)
Project Lead: Guangrong Fu (fu.guangrong@zte.com.cn)
mailing list tag [Should match Jira Project Prefix] 
Committers:
Please refer to the table above.

*Link to TSC approval: 
Link to approval of additional submitters: