Least Requests Load Balancing

The least-requests load balancing strategy monitors the number of inflight calls and selects the least-used instance.

This strategy keeps track of the inflight calls made by the application and picks the service instance with the smallest number of inflight requests:

- When the selection happens, the service instance with the smallest number of inflight requests is selected, and this number is incremented.
- When the operation completes, successfully or not, the number of inflight requests is decremented.
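
The bookkeeping described above can be sketched in a few lines of Java. This is only an illustration of the idea, not Stork's implementation or API: the class `LeastRequestsSketch`, its methods `select`, `complete`, and `inflight`, and the use of plain strings as instance identifiers are all assumptions made for the example. Ties are broken here by list order, which is also an assumption.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of least-requests selection; not Stork's code. */
public class LeastRequestsSketch {
    private final List<String> instances;
    // Number of inflight requests per instance identifier.
    private final Map<String, Integer> inflight = new HashMap<>();

    public LeastRequestsSketch(List<String> instances) {
        this.instances = List.copyOf(instances);
        for (String instance : this.instances) {
            inflight.put(instance, 0);
        }
    }

    /** Picks the instance with the fewest inflight requests and increments its counter. */
    public synchronized String select() {
        if (instances.isEmpty()) {
            throw new IllegalStateException("No service instance available");
        }
        String best = instances.get(0);
        for (String candidate : instances) {
            // Strict comparison: on a tie, the earlier instance in the list wins.
            if (inflight.get(candidate) < inflight.get(best)) {
                best = candidate;
            }
        }
        inflight.merge(best, 1, Integer::sum);
        return best;
    }

    /** Must be called when the operation completes, whether it succeeded or failed. */
    public synchronized void complete(String instance) {
        inflight.merge(instance, -1, Integer::sum);
    }

    /** Current inflight count for an instance (exposed for inspection only). */
    public synchronized int inflight(String instance) {
        return inflight.get(instance);
    }
}
```

In this sketch the caller is responsible for invoking `complete` in both the success and failure paths (for example from a `finally` block); otherwise the counter of an instance never goes back down and it ends up being avoided.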

Dependency

First, you need to add the least-requests load-balancer to your project:

<dependency>
    <groupId>io.smallrye.stork</groupId>
    <artifactId>stork-load-balancer-least-requests</artifactId>
    <version>1.1.1</version>
</dependency>

Configuration

For each service expected to use least-requests selection, set the load-balancer type to least-requests:

Stork standalone:

stork.my-service.service-discovery.type=...
stork.my-service.service-discovery...=...
stork.my-service.load-balancer.type=least-requests

Stork in Quarkus:

quarkus.stork.my-service.service-discovery.type=...
quarkus.stork.my-service.service-discovery...=...
quarkus.stork.my-service.load-balancer.type=least-requests