In the current implementation, ACM supports multiple participants with the same supported element type but different participantIds, so each participant needs its own properties file.

In order to support replication, we need to support replicas of participants.

Note: 

  • In a scenario with a high number of compositions, restarting a participant is slow: the ACM runtime sends the participant a message for each composition primed and each instance deployed.
    To avoid this slow restart, we need proper participant replication support;
  • In a scenario where a participant is stuck deploying, the instance goes into TIMEOUT and the user can take an action such as deploying again or undeploying. In that scenario the intermediary participant has to receive the next message, kill the thread that is stuck in deployment, and create a new thread.
  • In a scenario where the number of participants increases, it could be useful to have different topic names for source and sink. This would reduce the number of unnecessary messages in Kafka.
    Example: 
    • for ACM-runtime:
      • sink: POLICY-ACM-PARTICIPANT
      • source: POLICY-ACM-RUNTIME
    • for participant: 
      • sink: POLICY-ACM-RUNTIME
      • source: POLICY-ACM-PARTICIPANT
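The split above could be expressed in the topic parameters of each component. A sketch of what that configuration could look like (the YAML keys here are illustrative only, not the exact ACM configuration schema):

```yaml
# ACM-runtime side (illustrative keys)
topicSinks:
  - topic: POLICY-ACM-PARTICIPANT   # runtime publishes commands here
topicSources:
  - topic: POLICY-ACM-RUNTIME       # runtime reads participant replies here

# participant side (illustrative keys)
topicSinks:
  - topic: POLICY-ACM-RUNTIME       # participant publishes replies here
topicSources:
  - topic: POLICY-ACM-PARTICIPANT   # participant reads commands here
```

With a single shared topic, every participant also receives every reply addressed to the runtime; splitting the two directions removes that useless traffic.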

...

  • Multiple participant replicas are possible - the solution can deal with messages across many participants
  • All participants should have the same group-id in Kafka
  • All should have the same participant-id.
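The effect of the shared group-id can be sketched in a few lines. This is a simplified model of Kafka's consumer-group balancing, not the Kafka client API: consumers that share a group-id split the topic's partitions between them, so each message is processed by exactly one replica.

```python
# Simplified model of Kafka consumer-group balancing (illustrative only).
def assign_partitions(partitions, consumers):
    """Round-robin partition assignment, a stand-in for Kafka's assignor."""
    assignment = {c: [] for c in consumers}
    for i, p in enumerate(partitions):
        assignment[consumers[i % len(consumers)]].append(p)
    return assignment

# Three replicas of the same participant share one group-id,
# so the six partitions are split between them.
replicas = ["replica-1", "replica-2", "replica-3"]
partitions = list(range(6))
assignment = assign_partitions(partitions, replicas)

# Each partition is owned by exactly one replica: no message is handled twice.
assert sorted(p for ps in assignment.values() for p in ps) == partitions
assert assignment["replica-1"] == [0, 3]
```

A different group-id per replica would instead broadcast every message to every replica, which is exactly what the shared group-id avoids.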


Disadvantages of DB use

  1. The participant as a concept envisages an extremely lightweight mechanism, which can be deployed as a very lightweight plugin into the runtime entity implementing a participant
  2. Introducing a database into the participant makes the participant a heavyweight plugin, with cross-participant and participant/ACM-R configuration required
  3. There are multiple sources of truth for participant and element data: the ACM-R DB and this new participant DB hold duplicated data
  4. Participant configuration is no longer transparent to the component implementing the participant; having a DB for each participant forces orthogonal configuration across participant-implementing components
  5. Configuration of components implementing participants becomes much more complex, because the components must manage the participant replication
  6. The replication mechanism forces the use of an additional third-party product (3PP), the database

Solution 4: Distributed Cache

Issues:

  • Not persistent - if the application that handles cache server restarts - data is lost.
  • Approval issues - with Redis, Etcd, Search Engine.

Solution 5: True Participant Replicas

The requirements are:

  1. Participants can be replicated; each participant can have an arbitrary number of replicas
  2. Composition definitions, instances, element instances, and all their data including properties are identical in all participant replicas
  3. When anything is changed in one replica, the change is propagated to all the replicas of a participant
  4. An operation on a composition element can be sent to any replica of a participant, which means that for a given element the deploy could be on replica 1, the update on replica 2, and the delete on replica 3, as one would expect in any HA solution
  5. A single REST operation called on ACM-R will select a participant replica (probably using round robin initially, but other algorithms could be supported in the future) and use that replica for that operation
  6. The ACM runtime will be made HA (more than one replica of ACM-R will be supported) and will run on a HA Postgres
  7. The replication mechanism used between ACM-R and participants is transparent to participant API users
  8. Replicas are "eventually consistent", with consistency typically occurring within hundreds of milliseconds
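The round-robin replica selection in the requirements can be sketched as follows. This is an illustrative sketch only; the class and method names are invented, not ACM-R code:

```python
import itertools

class ReplicaSelector:
    """Round-robin selection of a participant replica (illustrative sketch).

    Other algorithms (e.g. least-loaded) could be plugged in later,
    as the requirement notes.
    """

    def __init__(self, replicas):
        self._cycle = itertools.cycle(replicas)

    def next_replica(self):
        # One REST operation on ACM-R -> one selected replica for that operation.
        return next(self._cycle)

selector = ReplicaSelector(["replica-1", "replica-2", "replica-3"])
picks = [selector.next_replica() for _ in range(4)]
# The fourth call wraps around to the first replica.
assert picks == ["replica-1", "replica-2", "replica-3", "replica-1"]
```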

...

If ACM-R is informed by a replica that an Implementing Component changed composition element properties, Participant Synchronization synchronizes these changes to all other Participant Intermediary replicas.
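This fan-out can be modelled in a few lines. The following is a toy model of the message flow only, not the ACM-R implementation, and all names in it are invented:

```python
# Toy model of Participant Synchronization (invented names, illustrative only).
class Replica:
    def __init__(self, name):
        self.name = name
        self.element_properties = {}

class AcmRuntime:
    """Receives a property change from one replica and fans it out to the rest."""

    def __init__(self, replicas):
        self.replicas = replicas

    def on_properties_changed(self, source, element_id, properties):
        # In the real system the change would be persisted in the ACM-R DB here,
        # then synchronized to the other Participant Intermediary replicas.
        for replica in self.replicas:
            if replica is not source:
                replica.element_properties[element_id] = dict(properties)

replicas = [Replica("r1"), Replica("r2"), Replica("r3")]
runtime = AcmRuntime(replicas)

# r1's implementing component changes element properties...
replicas[0].element_properties["element-A"] = {"state": "DEPLOYED"}
runtime.on_properties_changed(replicas[0], "element-A", {"state": "DEPLOYED"})

# ...and all replicas converge to the same view (eventual consistency).
assert all(r.element_properties["element-A"] == {"state": "DEPLOYED"}
           for r in replicas)
```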

In this solution:

  1. Participant Design backward compatibility is preserved: there is no change to the participant intermediary interface for participant implementations
  2. Participant Configuration backward compatibility is preserved, apart from a new "replicas" parameter (optional, default is 1)
  3. Participant version backward compatibility will not be preserved: because we need to pass replica information in the registration and operational messages, all participants will have to be upgraded to the new version.
  4. ACM-R introduces a new REST API for replica management
  5. ACM-R is made HA so that it can itself scale
  6. We can use Kafka load balancing on the participants and get the load-balancing functionality for nothing
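Passing replica information in the registration message could look like the sketch below. The field names are invented for illustration; the real message schema may differ:

```python
from dataclasses import dataclass
import uuid

@dataclass(frozen=True)
class ParticipantRegister:
    """Illustrative registration message carrying replica identity."""
    participant_id: str   # shared by all replicas of the same participant
    replica_id: str       # unique per replica (invented field name)

def make_register_message(participant_id):
    # Each replica generates its own replica_id at start-up.
    return ParticipantRegister(participant_id, str(uuid.uuid4()))

m1 = make_register_message("participant-http")
m2 = make_register_message("participant-http")

# Same participant, distinct replicas.
assert m1.participant_id == m2.participant_id
assert m1.replica_id != m2.replica_id
```

Because older participant versions cannot supply this replica identity, upgrading all participants is what breaks version backward compatibility in item 3 above.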


Older participant versions support (Regression)

...

  1. A new Kafka topic is used for synchronization

Optimal Solution:

After analysis, it is clear that the best solution to use is Solution 5 (True Participant Replicas).