In the current implementation, ACM supports multiple participants with the same supported element type but different participantIds, so each participant needs its own properties file.
To support replicas, ACM needs to support multiple participants that use the same properties file.
Note:
In a scenario with a high number of compositions, restarting a participant will be slow: ACM-runtime will send the participant a message for each primed composition and each deployed instance. To avoid the restarting action, the participant needs database support.
In a scenario where a participant is stuck in deploying, the instance will go to TIMEOUT and the user can take an action such as deploying again or undeploying. In that scenario the intermediary-participant has to receive the next message, kill the thread that is stuck in deploying and create a new thread.
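The replacement of a stuck deploy thread could look roughly like the following Python sketch. It is only an illustration: Python cannot forcibly kill a thread, so the sketch swaps in cooperative cancellation (a stop flag that the deploy code is assumed to check), and `DeployWorker`, `InstanceExecutor` and `deploy_fn` are hypothetical names, not part of the intermediary-participant.

```python
import threading


class DeployWorker:
    """One deploy attempt running on its own thread (hypothetical helper)."""

    def __init__(self, instance_id, deploy_fn):
        self.instance_id = instance_id
        self.cancelled = threading.Event()  # stop flag that deploy_fn must check
        self._thread = threading.Thread(
            target=deploy_fn, args=(instance_id, self.cancelled), daemon=True
        )

    def start(self):
        self._thread.start()

    def cancel(self, timeout=1.0):
        """Ask the deploy to stop and wait briefly; True if the thread ended."""
        self.cancelled.set()
        self._thread.join(timeout)
        return not self._thread.is_alive()

    def join(self, timeout=None):
        self._thread.join(timeout)


class InstanceExecutor:
    """Keeps at most one active worker per instance."""

    def __init__(self):
        self._workers = {}
        self._lock = threading.Lock()

    def submit(self, instance_id, deploy_fn):
        """Handle the next message for an instance.

        If an earlier worker for the same instance exists (for example one
        stuck in deploying), it is cancelled before the new one is started.
        """
        with self._lock:
            old = self._workers.get(instance_id)
            if old is not None:
                old.cancel()
            worker = DeployWorker(instance_id, deploy_fn)
            self._workers[instance_id] = worker
            worker.start()
        return worker
```

A deploy function that never checks the stop flag cannot be stopped this way; in that case `cancel` returns False after the timeout and the old thread is simply abandoned as a daemon thread.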
Solution
Add dynamic participantId support and add database support.
Can a participant share the database with other replicas? Yes, if participants do not work with the same composition/instance at the same time.
Shared database with the same supported element type: a composition will be connected to a specific participantId, so only one participant will act on that composition and its instances, although every replica can fetch all compositions in the shared database. In a restarting scenario the participant will get a new participantId, and it can still fetch the compositions and instances. ACM-runtime decides which participant has to work with each composition, and the participant acts based on the messages it receives.
Changes in Participant:
UUID participantId will be generated in memory instead of being fetched from the properties file.
consumerGroup will be empty (Kafka configuration): every intermediary-participant will have a unique Kafka queue, so they will all receive the same messages, which are then filtered by participantId.
Add client support for no-sql database.
Add no-sql database or mock for Unit Tests.
Refactor CacheProvider to support insert/update; the intermediary-participant can still use the in-memory cache.
Any new or changed composition and instance will be saved in the database.
Refactor participants that use their own in-memory cache (the Policy Participant saves policies and policy types in memory).
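The participant changes listed above can be pictured with a small Python sketch. It assumes a key-value style database and a simplified message shape (`participantId`, `instanceId`, `state`); `InMemoryDatabase`, `upsert`, `find_all` and `on_message` are hypothetical names chosen for illustration and do not reflect the real CacheProvider API.

```python
import uuid


class InMemoryDatabase:
    """Stand-in for the no-sql database; could also serve as a unit-test mock."""

    def __init__(self):
        self._docs = {}

    def upsert(self, key, doc):
        self._docs[key] = dict(doc)

    def find_all(self):
        return {key: dict(doc) for key, doc in self._docs.items()}


class CacheProvider:
    """In-memory cache that writes every insert/update through to the database."""

    def __init__(self, database):
        self._database = database
        # Reload whatever earlier replicas (or a previous run) have saved.
        self._cache = database.find_all()

    def save_instance(self, instance_id, state):
        self._cache[instance_id] = dict(state)
        self._database.upsert(instance_id, state)

    def get_instance(self, instance_id):
        return self._cache.get(instance_id)


class Participant:
    def __init__(self, database):
        # Generated in memory on every start, not read from a properties file.
        self.participant_id = uuid.uuid4()
        self.cache = CacheProvider(database)

    def on_message(self, message):
        """Every replica receives every message; only the addressee acts."""
        if message["participantId"] != self.participant_id:
            return False
        self.cache.save_instance(message["instanceId"], {"state": message["state"]})
        return True
```

In this sketch a restarted replica gets a fresh participantId but rebuilds its cache from the shared database, which is the property the design relies on to avoid a full restart exchange with ACM-runtime.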
Changes in ACM-runtime:
When a participant goes OFF_LINE:
if there are compositions connected to that participant, ACM-runtime will look for another ON_LINE participant with the same supported element type;
if another ON_LINE participant is present, ACM-runtime will move the connection of all those compositions and instances to it;
after that, it will execute a restart for all those compositions and instances on the ON_LINE participant.
When a participant REGISTER is received:
ACM-runtime will check whether there are compositions connected to an OFF_LINE participant with the same supported element type;
if there are, it will move the connection of all those compositions and instances to the newly registered participant;
after that, it will execute a restart for all the compositions and instances that were changed.
Refactor the restarting scenario so that the restart is applied only to compositions and instances in transition.
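The OFF_LINE and REGISTER handling described above can be sketched as follows. This is a simplified model under stated assumptions: instances are left out, the "in transition" refinement is not modelled, the first matching ON_LINE participant is picked, and `Runtime`, `on_offline`, `on_register` and the `restarts` list are hypothetical names rather than ACM-runtime code.

```python
from dataclasses import dataclass


@dataclass
class Participant:
    participant_id: str
    element_type: str
    online: bool = True


@dataclass
class Composition:
    composition_id: str
    element_type: str
    participant_id: str  # participant currently connected to this composition


class Runtime:
    def __init__(self):
        self.participants = {}
        self.compositions = []
        self.restarts = []  # (participant_id, composition_id) restart messages sent

    def _reassign(self, from_ids, target):
        """Move matching compositions to target and record a restart for each."""
        moved = [
            c for c in self.compositions
            if c.participant_id in from_ids and c.element_type == target.element_type
        ]
        for c in moved:
            c.participant_id = target.participant_id
            self.restarts.append((target.participant_id, c.composition_id))
        return moved

    def on_offline(self, participant_id):
        gone = self.participants[participant_id]
        gone.online = False
        target = next(
            (p for p in self.participants.values()
             if p.online and p.element_type == gone.element_type),
            None,
        )
        if target is None:
            return []  # nothing to do until a matching participant registers
        return self._reassign({participant_id}, target)

    def on_register(self, participant):
        self.participants[participant.participant_id] = participant
        offline_ids = {
            p.participant_id for p in self.participants.values()
            if not p.online and p.element_type == participant.element_type
        }
        return self._reassign(offline_ids, participant)
```

If no replacement is ON_LINE when a participant drops, the compositions stay connected to the OFF_LINE participant and are picked up by the next REGISTER with the same supported element type.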
Changes in docker/Kubernetes environment
Refactor CSIT to support no-sql database
Refactor performance and stability tests to support no-sql database
Refactor OOM to support no-sql database
Database in Ericsson ADP marketplace
The no-sql database could be one that is already in the ADP marketplace:
Distributed Coordinator ED (Etcd): Distributed systems use etcd as a consistent key-value store for configuration management, service discovery, and coordinating distributed work. Many organizations use etcd to implement production systems such as container schedulers, service discovery services, and distributed data storage.