Stork Observability API#

Stork provides an observability API that automatically records a set of measurements showing how Stork service discovery and selection behave.

For any observation to happen, you need to provide your own implementation of an ObservationCollector. By default, Stork provides a no-op implementation.

The ObservationCollector is responsible for instantiating the StorkObservation.

The StorkObservation reacts to Stork events thanks to a StorkEventHandler.

You can extend the metrics collection by implementing the StorkEventHandler interface.

The following sequence diagram shows how observability is initialized:

[Sequence diagram: observability initialization]

The StorkObservation records timings, the number of discovered instances, the selected instance, and failures by reacting to the lifecycle events of a Stork call:

  • start: the observation has started and the begin time is recorded. It happens when the ObservationCollector#create() method gets called.
  • service discovery success: a collection of instances has been successfully discovered for a service. The end-of-discovery time and the number of instances are recorded. It happens when StorkObservation#onServiceDiscoverySuccess gets called.
  • service discovery error: an error occurred while discovering a service. The end-of-discovery time and the failure cause are recorded. It happens when StorkObservation#onServiceDiscoveryFailure gets called.
  • service selection success: an instance has been successfully selected from the collection. The end-of-selection time and the selected instance ID are recorded. It happens when StorkObservation#onServiceSelectionSuccess gets called.
  • service selection error: an error occurred while selecting the instance. The end-of-selection time and the failure cause are recorded. It happens when StorkObservation#onServiceSelectionFailure gets called.
  • end: the observation has finished and the overall duration is recorded. It happens when StorkObservation#onServiceSelectionSuccess gets called.
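
To make the lifecycle above concrete, here is a minimal sketch of how timestamps taken at each event can be turned into durations. SketchObservation is a hypothetical stand-in written for this illustration, not the Stork class; the injectable clock and the choice to measure selection time from the end of discovery are assumptions, so check the StorkObservation source for the actual behaviour.

```java
import java.time.Duration;
import java.util.function.LongSupplier;

public class ObservationLifecycleSketch {

    /** Hypothetical stand-in for StorkObservation; NOT the Stork class. */
    static final class SketchObservation {
        private final LongSupplier clock;  // nanosecond clock, injectable so the sketch is testable
        private final long begin;          // "start" event
        private long endOfDiscovery;
        private long endOfSelection;
        private long lastEvent;            // timestamp of the most recent event
        private int instancesCount = -1;   // stays -1 until discovery succeeds
        private long selectedInstanceId = -1L;
        private Throwable failure;

        SketchObservation(LongSupplier clock) {
            this.clock = clock;
            this.begin = clock.getAsLong();
            this.lastEvent = begin;
        }

        void onServiceDiscoverySuccess(int discovered) {
            endOfDiscovery = lastEvent = clock.getAsLong();
            instancesCount = discovered;
        }

        void onServiceDiscoveryFailure(Throwable cause) {
            endOfDiscovery = lastEvent = clock.getAsLong();
            failure = cause;
        }

        void onServiceSelectionSuccess(long instanceId) {
            endOfSelection = lastEvent = clock.getAsLong();
            selectedInstanceId = instanceId;
        }

        void onServiceSelectionFailure(Throwable cause) {
            endOfSelection = lastEvent = clock.getAsLong();
            failure = cause;
        }

        Duration discoveryDuration() { return Duration.ofNanos(endOfDiscovery - begin); }

        // Assumption: selection time is measured from the end of discovery.
        Duration selectionDuration() { return Duration.ofNanos(endOfSelection - endOfDiscovery); }

        // Measured up to the last event seen, so it is also defined on failure paths.
        Duration overallDuration() { return Duration.ofNanos(lastEvent - begin); }

        int instancesCount() { return instancesCount; }
        long selectedInstanceId() { return selectedInstanceId; }
        Throwable failure() { return failure; }
    }

    public static void main(String[] args) {
        SketchObservation observation = new SketchObservation(System::nanoTime);
        observation.onServiceDiscoverySuccess(2);    // two instances discovered
        observation.onServiceSelectionSuccess(42L);  // instance 42 selected
        System.out.println("discovery: " + observation.discoveryDuration());
        System.out.println("selection: " + observation.selectionDuration());
        System.out.println("overall:   " + observation.overallDuration());
    }
}
```

Injecting the clock keeps the sketch deterministic under test; production code would simply read System.nanoTime().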

The following sequence diagram represents the observation process described above:

[Sequence diagram: observation process]

Implementing an observation collector#

An ObservationCollector implementation must override the create method to provide an instance of StorkObservation. In addition, the user can access and enrich the observation data through the StorkEventHandler.

A custom observation collector class should look as follows:

package examples;

import io.smallrye.stork.api.observability.ObservationCollector;
import io.smallrye.stork.api.observability.StorkEventHandler;
import io.smallrye.stork.api.observability.StorkObservation;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class AcmeObservationCollector implements ObservationCollector {

    private static final Logger LOGGER = LoggerFactory.getLogger(AcmeObservationCollector.class);

    private static final StorkEventHandler ACME_HANDLER = event -> {
        // This is the terminal event. Put your custom logic here to extend the metrics collection.

        // E.g. expose metrics to Micrometer, write additional logs...
        LOGGER.info("Service discovery took {}.", event.getServiceDiscoveryDuration());
        LOGGER.info("{} instances have been discovered for {}.", event.getDiscoveredInstancesCount(), event.getServiceName());
        LOGGER.info("Service selection took {}.", event.getServiceSelectionDuration());

        //        ...

    };

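    // Holds only the most recent observation. Declared volatile because it may be
    // written and read from different threads.
    public static volatile StorkObservation ACME_STORK_EVENT;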

    @Override
    public StorkObservation create(String serviceName, String serviceDiscoveryType,
                                   String serviceSelectionType) {
        ACME_STORK_EVENT = new StorkObservation(
                serviceName, serviceDiscoveryType, serviceSelectionType,
                ACME_HANDLER);
        return ACME_STORK_EVENT;
    }
}

The next step is to initialize Stork with an ObservableStorkInfrastructure, which takes an instance of your ObservationCollector as a parameter.

package examples;

import io.smallrye.stork.Stork;
import io.smallrye.stork.integration.ObservableStorkInfrastructure;

public class ObservableInitializationExample {

    public static void main(String[] args) {
        Stork.initialize(new ObservableStorkInfrastructure(new AcmeObservationCollector()));
        Stork stork = Stork.getInstance();
        // ...
    }
}

Stork then uses your implementation to record metrics.
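
The handler shown earlier only logs each observation. As a possible extension, the sketch below accumulates per-service totals that a handler could feed. DiscoveryStats and its methods are hypothetical names introduced for this illustration and are not part of Stork; the class uses only the JDK so that it stays independent of the Stork types.

```java
import java.time.Duration;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical helper, not part of Stork: thread-safe per-service totals. */
public class DiscoveryStats {

    private final Map<String, LongAdder> calls = new ConcurrentHashMap<>();
    private final Map<String, LongAdder> failures = new ConcurrentHashMap<>();
    private final Map<String, LongAdder> discoveryNanos = new ConcurrentHashMap<>();

    /** Records one finished observation; failure is null when nothing went wrong. */
    public void record(String serviceName, Duration discoveryDuration, Throwable failure) {
        add(calls, serviceName, 1);
        add(discoveryNanos, serviceName, discoveryDuration.toNanos());
        if (failure != null) {
            add(failures, serviceName, 1);
        }
    }

    public long calls(String serviceName) {
        return sum(calls, serviceName);
    }

    public long failures(String serviceName) {
        return sum(failures, serviceName);
    }

    /** Mean discovery time, or Duration.ZERO when nothing was recorded for the service. */
    public Duration meanDiscovery(String serviceName) {
        long n = calls(serviceName);
        return n == 0 ? Duration.ZERO : Duration.ofNanos(sum(discoveryNanos, serviceName) / n);
    }

    private static void add(Map<String, LongAdder> map, String key, long amount) {
        map.computeIfAbsent(key, k -> new LongAdder()).add(amount);
    }

    private static long sum(Map<String, LongAdder> map, String key) {
        LongAdder adder = map.get(key);
        return adder == null ? 0L : adder.sum();
    }

    public static void main(String[] args) {
        DiscoveryStats stats = new DiscoveryStats();
        stats.record("my-service", Duration.ofMillis(10), null);
        stats.record("my-service", Duration.ofMillis(30), new RuntimeException("boom"));
        System.out.println("calls:    " + stats.calls("my-service"));
        System.out.println("failures: " + stats.failures("my-service"));
        System.out.println("mean:     " + stats.meanDiscovery("my-service"));
    }
}
```

Inside a handler, one call per completed observation would be enough, for instance stats.record(event.getServiceName(), event.getServiceDiscoveryDuration(), event.failure()), assuming the duration getter returns a non-null value for that event. LongAdder is used because handlers may be invoked concurrently.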

Observing service discovery and selection behaviours#

To access metrics registered by StorkObservation, use the following code:

package examples;

import io.smallrye.stork.Stork;
import io.smallrye.stork.api.Service;
import io.smallrye.stork.api.observability.ObservationCollector;

import java.time.Duration;

import static examples.AcmeObservationCollector.*;

public class ObservationExample {

    public static void example(Stork stork) {
        Service service = stork.getService("my-service");

        ObservationCollector observations = service.getObservations();

        // Gets the time spent in service discovery and service selection, even if an error occurred
        Duration overallDuration = ACME_STORK_EVENT.getOverallDuration();

        // Gets the total number of instances discovered
        int discoveredInstancesCount = ACME_STORK_EVENT.getDiscoveredInstancesCount();

        // Gets the error raised during the process, if any
        Throwable failure = ACME_STORK_EVENT.failure();

        //        ...

    }
}

Stork Observability with Quarkus#

Stork metrics are automatically enabled when using Stork together with the Micrometer extension in a Quarkus application.

Micrometer collects metrics for the REST and gRPC clients that use Stork, as well as for direct use of the Stork API.

As an example, if you export the metrics to Prometheus, you will get:

# HELP stork_load_balancer_failures_total The number of failures during service selection.
# TYPE stork_load_balancer_failures_total counter
stork_load_balancer_failures_total{service_name="hello-service",} 0.0
# HELP stork_service_selection_duration_seconds The duration of the selection operation 
# TYPE stork_service_selection_duration_seconds summary
stork_service_selection_duration_seconds_count{service_name="hello-service",} 13.0
stork_service_selection_duration_seconds_sum{service_name="hello-service",} 0.001049291
# HELP stork_service_selection_duration_seconds_max The duration of the selection operation 
# TYPE stork_service_selection_duration_seconds_max gauge
stork_service_selection_duration_seconds_max{service_name="hello-service",} 0.0
# HELP stork_overall_duration_seconds_max The total duration of the Stork service discovery and selection operations
# TYPE stork_overall_duration_seconds_max gauge
stork_overall_duration_seconds_max{service_name="hello-service",} 0.0
# HELP stork_overall_duration_seconds The total duration of the Stork service discovery and selection operations
# TYPE stork_overall_duration_seconds summary
stork_overall_duration_seconds_count{service_name="hello-service",} 13.0
stork_overall_duration_seconds_sum{service_name="hello-service",} 0.001049291
# HELP stork_service_discovery_failures_total The number of failures during service discovery
# TYPE stork_service_discovery_failures_total counter
stork_service_discovery_failures_total{service_name="hello-service",} 0.0
# HELP stork_service_discovery_duration_seconds_max The duration of the discovery operation
# TYPE stork_service_discovery_duration_seconds_max gauge
stork_service_discovery_duration_seconds_max{service_name="hello-service",} 0.0
# HELP stork_service_discovery_duration_seconds The duration of the discovery operation
# TYPE stork_service_discovery_duration_seconds summary
stork_service_discovery_duration_seconds_count{service_name="hello-service",} 13.0
stork_service_discovery_duration_seconds_sum{service_name="hello-service",} 6.585046209
# HELP stork_instances_count_total The number of service instances discovered
# TYPE stork_instances_count_total counter
stork_instances_count_total{service_name="hello-service",} 26.0
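
The summary metrics above expose a _sum and a _count, so a mean can be derived by dividing one by the other. The sketch below does this for a few of the sample lines. StorkMetricsMath is a hypothetical helper written for this illustration; its parser is deliberately naive, assuming a single service, no timestamps, and one sample per metric name.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper: derives means from Prometheus text-format samples. */
public class StorkMetricsMath {

    /**
     * Parses "name{labels} value" lines, skipping "# HELP" and "# TYPE" comments.
     * Assumption: labels are ignored, so this only works for a single service.
     */
    public static Map<String, Double> parse(String exposition) {
        Map<String, Double> samples = new HashMap<>();
        for (String raw : exposition.split("\n")) {
            String line = raw.trim();
            if (line.isEmpty() || line.startsWith("#")) {
                continue;
            }
            int brace = line.indexOf('{');
            int space = line.lastIndexOf(' ');
            String name = brace >= 0 ? line.substring(0, brace) : line.substring(0, space);
            samples.put(name, Double.parseDouble(line.substring(space + 1)));
        }
        return samples;
    }

    /** Mean of a summary metric: <metric>_sum divided by <metric>_count. */
    public static double mean(Map<String, Double> samples, String metric) {
        return samples.get(metric + "_sum") / samples.get(metric + "_count");
    }

    public static void main(String[] args) {
        String sample =
                "# TYPE stork_service_discovery_duration_seconds summary\n"
              + "stork_service_discovery_duration_seconds_count{service_name=\"hello-service\",} 13.0\n"
              + "stork_service_discovery_duration_seconds_sum{service_name=\"hello-service\",} 6.585046209\n"
              + "stork_instances_count_total{service_name=\"hello-service\",} 26.0\n";

        Map<String, Double> samples = parse(sample);
        double meanDiscoverySeconds = mean(samples, "stork_service_discovery_duration_seconds");
        double instancesPerDiscovery = samples.get("stork_instances_count_total")
                / samples.get("stork_service_discovery_duration_seconds_count");

        System.out.printf("mean discovery time: %.4f s%n", meanDiscoverySeconds);
        System.out.printf("instances per discovery: %.1f%n", instancesPerDiscovery);
    }
}
```

With the sample values shown, this works out to roughly 0.51 seconds per discovery and 2 instances per discovery. In practice the same ratios are usually computed on the monitoring side rather than in application code.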