
Workflow Transition Performance

Performing large-scale imports, as well as promoting or deploying a large number of records, is a fairly intensive operation for the system, involving numerous database and message broker interactions. Tuning this flow for maximum throughput is an important performance consideration for your system.

Efficiency Gains (since release train 2.0.1)

A performance refactor of a number of components was performed as part of this release. Processes involved in the admin import and workflow transition flows have been streamlined across the board. When combined with the concurrency configuration below, upgrading to this release (or beyond) will help realize optimal throughput.

Concurrency

Establishing a configuration that increases parallel processing of messages is important to maximizing throughput. By default, Spring Cloud Stream processes messages on a single thread, which leaves a lot of performance potential on the table. Additional configuration is required for the system to consume messages on more than a single thread.

Example 1: One FlexPackage - application.yml (Kafka)

...
broadleaf:
  composite:
    datasource:
      hikari:
        maximumPoolSize: 30 (1)
spring:
  cloud:
    stream:
      kafka:
        binder:
          auto-add-partitions: true (2)
      bindings:
        changeSummaryInput: (3)
          consumer:
            concurrency: 20 (4)
            partitioned: true (5)
        changeSummaryOutput: (6)
          producer:
            partitionCount: 20 (7)
            partitionKeyExpression: headers.id (8)
        workflowRequestCompletionInput: (9)
          consumer:
            concurrency: 20
            partitioned: true
        workflowRequestCompletionOutput: (10)
          producer:
            partitionCount: 20
            partitionKeyExpression: headers.id
        deploymentInput: (11)
          consumer:
            concurrency: 20
            partitioned: true
        deploymentOutput: (12)
          producer:
            partitionCount: 20
            partitionKeyExpression: headers.id
        promotionInput: (13)
          consumer:
            concurrency: 20
            partitioned: true
        promotionOutput: (14)
          producer:
            partitionCount: 20
            partitionKeyExpression: headers.id
        rejectionInput: (15)
          consumer:
            concurrency: 5
            partitioned: true
        rejectionOutput: (16)
          producer:
            partitionCount: 5
            partitionKeyExpression: headers.id
        reversionInput: (17)
          consumer:
            concurrency: 5
            partitioned: true
        reversionOutput: (18)
          producer:
            partitionCount: 5
            partitionKeyExpression: headers.id
...
  1. maximumPoolSize - This is the database connection pool size. This will generally need to be increased to handle the additional concurrency.

  2. auto-add-partitions - This is unique to the Kafka binder. Kafka handles concurrency by assigning additional partitions. Depending on how Kafka is configured, the system can dynamically create these partitions for you based on this setting, or you may have to create the partitions manually if your Kafka installation is more locked down (see the sketch after this list). See https://spring.io/blog/2021/02/03/demystifying-spring-cloud-stream-producers-with-apache-kafka-partitions for more details.

  3. changeSummaryInput - This is the name of the listening channel in SandboxServices. A message is received here any time a change is made to a sandboxable entity in the admin (or imported into a sandbox).

  4. concurrency - Establishes the number of listening threads. This should match the partition count.

  5. partitioned - Set to true to denote that consumption occurs across several partitions.

  6. changeSummaryOutput - This is the name of the producing channel in any microservice managing sandboxable entities. A message is sent from here to SandboxServices to notify it of the change for overall transition management.

  7. partitionCount - The number of partitions over which the producer will spread messages.

  8. partitionKeyExpression - Expression denoting the message header value used to decide which partition a message is sent to.

  9. workflowRequestCompletionInput - This is the name of the listening channel in SandboxServices. A message is received here any time a microservice completes a request to transition an entity (e.g. promote, deploy).

  10. workflowRequestCompletionOutput - This is the name of the producing channel in any microservice. A message is sent from here when the microservice completes a request to transition an entity.

  11. deploymentInput - This is the name of the listening channel in any microservice. A message is received here any time SandboxServices initiates a deploy request.

  12. deploymentOutput - This is the name of the producing channel in SandboxServices. A message is sent from here when a deployment request is initiated.

  13. promotionInput - This is the name of the listening channel in any microservice. A message is received here any time SandboxServices initiates a promote request.

  14. promotionOutput - This is the name of the producing channel in SandboxServices. A message is sent from here when a promotion request is initiated.

  15. rejectionInput - This is the name of the listening channel in any microservice. A message is received here any time SandboxServices initiates a reject request.

  16. rejectionOutput - This is the name of the producing channel in SandboxServices. A message is sent from here when a rejection request is initiated.

  17. reversionInput - This is the name of the listening channel in any microservice. A message is received here any time SandboxServices initiates a revert request.

  18. reversionOutput - This is the name of the producing channel in SandboxServices. A message is sent from here when a reversion request is initiated.
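
If your Kafka installation is more locked down, the binder's automatic topic management can be disabled entirely so that topics and partitions are provisioned by your Kafka administrators instead. The following is a minimal sketch (not part of the example above) using standard Spring Cloud Stream Kafka binder properties; treat it as an assumption to validate against your own broker policy.

spring:
  cloud:
    stream:
      kafka:
        binder:
          auto-create-topics: false   # topics are provisioned manually rather than created by the binder
          auto-add-partitions: false  # the binder will not attempt to add partitions to existing topics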

As illustrated above, there are a lot of async lifecycle steps between SandboxServices and the other microservices required to complete a transition flow. In the case of Kafka, increasing the number of partitions involved greatly increases the concurrency and more thoroughly utilizes available resources for maximum throughput. This example illustrates the One FlexPackage, where all the microservices are co-located in a single runtime.

Expectations

In the lab, using a configuration similar to the above, we saw a 50x - 60x boost in performance over the previous version of the codebase with default configuration. Your mileage may vary depending on the number of CPUs assigned to a given replica performing the relevant work in k8s, as well as your database sizing and ability to assign a larger connection pool per replica. You should experiment with your workload to determine what infrastructure outlay and associated concurrency configuration make the most sense for you.
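
For reference, the CPU available to a replica is typically set through the container resources in its Kubernetes deployment manifest. The fragment below is a minimal, illustrative sketch only; the deployment name, image, and values are hypothetical and not a sizing recommendation.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: processing                           # hypothetical FlexPackage deployment
spec:
  replicas: 2                                # each replica multiplies consumer threads and DB connections
  selector:
    matchLabels:
      app: processing
  template:
    metadata:
      labels:
        app: processing
    spec:
      containers:
        - name: processing
          image: example.com/processing:1.0  # placeholder image
          resources:
            requests:
              cpu: "2"                       # CPU available to the replica's listener threads
              memory: 4Gi
            limits:
              cpu: "4"
              memory: 4Gi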

Other FlexPackage Combinations

The One FlexPackage is a simple vehicle for illustrating the configuration and interaction. However, it is not intended for production usage. Most likely, the Balanced, Granular, or a custom configuration will be used, which forces splitting the configuration above across multiple application deployment units.

Balanced

  • Processing - The application.yml here should contain all general microservice transition configuration (not SandboxServices). This includes maximumPoolSize, auto-add-partitions, workflowRequestCompletionOutput, changeSummaryOutput, deploymentInput, promotionInput, rejectionInput, reversionInput. See the sketch after this list.

  • Cart - N/A

  • Browse - N/A

  • Supporting - The application.yml here should contain the SandboxServices-related configuration. This includes maximumPoolSize, auto-add-partitions, changeSummaryInput, workflowRequestCompletionInput, deploymentOutput, promotionOutput, rejectionOutput, reversionOutput.
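
The following is a rough sketch of how the single-runtime example above splits between the two deployment units. It is abbreviated to a representative subset of bindings; the remaining bindings listed in the bullets follow the same pattern and use the same values as the example.

# Processing FlexPackage - application.yml (abbreviated sketch)
broadleaf:
  composite:
    datasource:
      hikari:
        maximumPoolSize: 30
spring:
  cloud:
    stream:
      kafka:
        binder:
          auto-add-partitions: true
      bindings:
        changeSummaryOutput:                 # produced by the services managing sandboxable entities
          producer:
            partitionCount: 20
            partitionKeyExpression: headers.id
        workflowRequestCompletionOutput:
          producer:
            partitionCount: 20
            partitionKeyExpression: headers.id
        deploymentInput:                     # plus promotionInput, rejectionInput, reversionInput
          consumer:
            concurrency: 20
            partitioned: true

# Supporting FlexPackage - application.yml (abbreviated sketch)
broadleaf:
  composite:
    datasource:
      hikari:
        maximumPoolSize: 30
spring:
  cloud:
    stream:
      kafka:
        binder:
          auto-add-partitions: true
      bindings:
        changeSummaryInput:                  # consumed by SandboxServices
          consumer:
            concurrency: 20
            partitioned: true
        workflowRequestCompletionInput:
          consumer:
            concurrency: 20
            partitioned: true
        deploymentOutput:                    # plus promotionOutput, rejectionOutput, reversionOutput
          producer:
            partitionCount: 20
            partitionKeyExpression: headers.id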

Granular

There is no specific "Processing" FlexPackage here for performing transition work. Rather, every microservice is separate and not co-located in a runtime with other microservices. As a result, every microservice capable of managing sandboxable entities is a candidate for the (Balanced - Processing) items in the section above. SandboxServices is a special case and matches the (Balanced - Supporting) items in the section above.
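
As a hedged per-service sketch, a granular microservice that manages sandboxable entities (a catalog service is used here purely as a hypothetical example) would carry its own copy of the Processing-style configuration:

# Hypothetical granular catalog service - application.yml (abbreviated sketch)
broadleaf:
  composite:
    datasource:
      hikari:
        maximumPoolSize: 30
spring:
  cloud:
    stream:
      kafka:
        binder:
          auto-add-partitions: true
      bindings:
        changeSummaryOutput:
          producer:
            partitionCount: 20
            partitionKeyExpression: headers.id
        workflowRequestCompletionOutput:
          producer:
            partitionCount: 20
            partitionKeyExpression: headers.id
        promotionInput:                      # plus deploymentInput, rejectionInput, reversionInput
          consumer:
            concurrency: 20
            partitioned: true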

Tip
If you have lesser-used bounded contexts in a granular configuration, you can leave the microservices managing those contexts at their default config to save on overall sizing increases.

Other Brokers

This document focuses on concurrency configuration for Kafka. Other brokers may not have a "partition" concept, for example, and handle concurrency differently, so some of the partition-related settings above may not apply. Refer to the Spring Cloud Stream binder documentation for your broker to determine the correct properties for achieving similar concurrency.
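
For example, with the RabbitMQ binder, the generic consumer concurrency property increases the number of competing consumers on a queue, and the partition-related producer settings are simply omitted. The following is a minimal sketch with illustrative values; consult the RabbitMQ binder documentation for the full set of tuning options.

spring:
  cloud:
    stream:
      bindings:
        changeSummaryInput:
          consumer:
            concurrency: 20    # number of competing consumers on the queue; no partitions involved
        workflowRequestCompletionInput:
          consumer:
            concurrency: 20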