Observability

Microservices, in addition to great strengths, have a few shortcomings. One of them is the difficulty to understand what is going on in a highly distributed system. To address this issue, microservices get to collect information about their operations in the form of logs, traces, and performance metrics in specialized storages. Then, that information can be used for health monitoring, troubleshooting, security protection, and other important maintenance scenarios.

Correlation (trace) ids

It is not sufficient just to collect logs and traces across multiple microservices. It is critical to connect that information in a logical sequence within the transaction context. To do that Pip.Services employs correlationIds, which are rigorously used as first parameters in all business methods across all microservice components and sent over via all synchronous calls and asynchronous messages. When errors are thrown, logs or traces recorded, correlationIds are always included there.

A correlationId is any value that can uniquely identify business transactions in a system. One way to generate correlationIds is to use natural keys, like “transaction name + timestamp”. Another common way to generate correlationIds is to use string GUIDs. Although, they could be too long and lack meaning, they are unique in the universe and very easy to generate.

Logging

There are myriads of logging libraries in all known programming languages. But Pip.Services includes its own abstractions for logging. Why? There are several reasons for that:

  1. To achieve consistency and symmetry across all languages, as it is the key goal for the toolkit
  2. To use the Pip.Services component model and easily integrate with common patterns like inversion of control or configurations
  3. To enforce the use of correlationIds, as without them the collected information has less value for analysis

The ILogger interface looks pretty standard. It allows logging messages at different levels: FATAL, ERROR, WARNING, INFO, DEBUG and TRACE. The interface is defined in the log package of the components module. In addition to that, Pip.Services includes a variety of loggers:

  • NullLogger: Dummy implementation of a logger with no real effect.
  • ConsoleLogger: Logger that writes log messages to console.
  • FluentdLogger: Logger that dumps execution logs to a Fluentd service.
  • ElasticSearchLogger: Logger that dumps execution logs to ElasticSearch service.
  • CloudWatchLogger: Logger that writes log messages to AWS CloudWatch Log.
  • AppInsightsLogger

It is common in microservices to log simultaneously into several destinations - to console for debugging as well into a distributed logging system. To support this scenario the toolkit includes the CompositeLogger, which is used by components to collect messages and then distribute them across all loggers included in the microservice.

# Console logger
descriptor: "pip-services:logger:console:default:1.0"
level: info

# Elastic search logger
descriptor: "pip-services:logger:elasticsearch:default:1.0"
connection:
  host: {{ELASTICSEARCH_SERVICE_HOST}}
  port: {{ELASTICSEARCH_SERVICE_PORT}}
import { IReferenceable, IReferences } from "pip-services3-commons-nodex";
import { CompositeLogger } from "pip-services3-components-nodex";


class MyComponent implements IReferenceable {
  private _logger: CompositeLogger = new CompositeLogger();

  public setReferences(refs: IReferences) {
    this._logger.setReferences(refs);
  }

  public doSomething(correlationId: string) {
    this._logger.debug(correlationId, "Did something...");
    ...
  }
}

using PipServices3.Commons.Refer;
using PipServices3.Components.Log;


public class MyComponent : IReferenceable
{
    private CompositeLogger _logger = new CompositeLogger();

    public void SetReferences(IReferences references)
    {
        this._logger.SetReferences(references);
    }

    public void DoSomething(string correlationId)
    {
        this._logger.Debug(correlationId, "Did something…");
        ...
    }
}

import (
	"context"

	cref "github.com/pip-services3-gox/pip-services3-commons-gox/refer"
	clog "github.com/pip-services3-gox/pip-services3-components-gox/log"
)

type MyComponent struct {
	// ...
	logger *clog.CompositeLogger
}

func (c *MyComponent) SetReferences(ctx context.Context, references cref.IReferences) {
	c.logger.SetReferences(ctx, references)
}

func (c *MyComponent) DoSomething(ctx context.Context, correlationId string) {
	c.logger.Debug(ctx, correlationId, "Did something...")
	// ...
}

import 'package:pip_services3_commons/pip_services3_commons.dart';
import 'package:pip_services3_components/pip_services3_components.dart';

class MyComponent implements IReferenceable {
   CompositeLogger _logger = CompositeLogger();

  @override
  void setReferences(IReferences refs) {
    _logger.setReferences(refs);
  }

  void doSomething(String? correlationId) {
    _logger.debug(correlationId, 'Did somethin...');
    ...
  }
}

from pip_services3_commons.refer import IReferenceable, IReferences
from pip_services3_components.log import CompositeLogger


class MyComponent(IReferenceable):
    _logger: CompositeLogger = CompositeLogger()

    def set_references(self, refs: IReferences):
        self._logger.set_references(refs)

    def do_something(self, correlation_id: str):
        self._logger.debug(correlation_id, "Did something...")
        ...


Not available

Performance monitoring

While logs tell “what” the system is doing, performance metrics (or counters) tell “how” it’s being done: how much, how fast, how reliable, and so on. They represent the state of the system from a non-functional perspective and are critical to measure and analyze performance, scalability, and reliability characteristics.

The ICounters from the count package in the components module is the standard interface for all performance monitoring components. It allows to increment counters, record values of self-calculated metrics, record timestamps, calculate min/average/max statistics and measure time intervals. In addition, there are several implementations available out-of-the-box:

  • NullCounters: Dummy implementation of performance counters.
  • LogCounters: Performance counters that periodically dump counters' measurements to a logger.
  • PrometheusCounters: Performance counters that send their metrics to Prometheus service.
  • CloudWatchCounters: Performance counters that periodically dump counters to AWS Cloud Watch Metrics.
  • AppInsightsCounters

Similar to logs, microservices can send counters to several destinations: periodically dump them into console logs, and send them to a centralized monitoring system. CompositeCounters components can be used in microservice components to collect metrics and distribute them to all Counters included in the microservice.

The example below shows how to collect several performance metrics from processing incoming messages:

# Log counters
descriptor: "pip-services:counters:log:default:1.0"

# Prometheus counters
descriptor: "pip-services:counters:prometheus:default:1.0"

# Metrics service used by prometheus to collect metrics
descriptor: "pip-services:metrics-service:prometheus:default:1.0"
import { IReferenceable, IReferences } from "pip-services3-commons-nodex";
import { CompositeCounters, CompositeLogger } from "pip-services3-components-nodex";
import { MessageEnvelope } from "pip-services3-messaging-nodex";


class MyComponent implements IReferenceable {
  private _counters: CompositeCounters = new CompositeCounters();

  public setReferences(refs: IReferences) {
    this._counters.setReferences(refs);
  }

  public onMessage(message: MessageEnvelope) {
    let timing = this._counters.beginTiming("mycomponent:msg_time");
    try {
      this._counters.increment("mycomponent:msg_count", 1);
      ...
    } catch (ex) {
      this._counters.increment("mycomponent:msg_errors", 1);
    } finally {
      timing.endTiming();
    }
  }
}
using PipServices3.Commons.Refer;
using PipServices3.Components.Count;
using PipServices3.Messaging.Queues;


public class MyComponent : IReferenceable
{
    private CompositeCounters _counters = new CompositeCounters();

    public void SetReferences(IReferences references)
    {
        this._counters.SetReferences(references);
    }

    public void OnMessage(MessageEnvelope message)
    {
        var timing = this._counters.BeginTiming("mycomponent:msg_time");
        try
        {
            this._counters.Increment("mycomponent:msg_count", 1);
            ...
        }
        catch (Exception ex)
        {
            this._counters.Increment("mycomponent:msg_errors", 1);
        }
        finally
        {
            timing.EndTiming();
        }
    }
}

import (
	"context"

	cref "github.com/pip-services3-gox/pip-services3-commons-gox/refer"
	ccount "github.com/pip-services3-gox/pip-services3-components-gox/count"
	cmsg "github.com/pip-services3-gox/pip-services3-messaging-gox/queues"
)

type MyComponent struct {
	// ...
	counters *ccount.CompositeCounters
}

func (c *MyComponent) SetReferences(ctx context.Context, references cref.IReferences) {
	c.counters.SetReferences(ctx, references)
}

func (c *MyComponent) OnMessage(ctx context.Context, message *cmsg.MessageEnvelope) {
	timing := c.counters.BeginTiming(ctx, "mycomponent:msg_time")
	defer timing.EndTiming(ctx)

	c.counters.Increment(ctx, "mycomponent:msg_count", 1)
	// ...

	if err != nil {
		c.counters.Increment(ctx, "mycomponent:msg_errors", 1)
	}
}

import 'package:pip_services3_commons/pip_services3_commons.dart';
import 'package:pip_services3_components/pip_services3_components.dart';
import 'package:pip_services3_messaging/pip_services3_messaging.dart';

class MyComponent implements IReferenceable {
  CompositeCounters _counters = CompositeCounters();

  @override
  void setReferences(IReferences refs) {
    _counters.setReferences(refs);
  }

  void onMessage(MessageEnvelope message) {
    var timing = _counters.beginTiming('mycomponent:msg_time');
    try {
      _counters.increment('mycomponent:msg_count', 1);
      ...
    } catch (ex) {
      _counters.increment('mycomponent:msg_errors', 1);
    } finally {
      timing.endTiming();
    }
  }
}

from pip_services3_commons.refer import IReferenceable, IReferences
from pip_services3_components.count import CompositeCounters
from pip_services3_messaging.queues import MessageEnvelope


class MyComponent(IReferenceable):
    _counters: CompositeCounters = CompositeCounters()

    def set_references(self, refs: IReferences):
        self._counters.set_references(refs)

    def on_message(self, message: MessageEnvelope):
        timing = self._counters.begin_timing("mycomponent:msg_time")
        try:
            self._counters.increment("mycomponent:msg_count", 1)
            ...
        except Exception as ex:
            self._counters.increment("mycomponent:msg_errors", 1)
        finally:
            timing.end_timing()

Not available

Traces

The most common observability scenario consists of monitoring the invocation of microservice operations. This can be done via a combination of logs and counters: logs will record what operations are executed and errors when they happen, and counters will calculate how often operations were called and what’s their execution time and error rate. However, some monitoring systems like Datadog or Splank have started offering APIs to collect traces separately, enabling rich visualizations and analytics around them.

The Pip.Services toolkit includes the ITraces interface for tracing components defined in the trace package in the components module. There are a few ready-to-use tracing components available in the toolkit:

  • NullTracer: Dummy implementation of tracer that doesn’t do anything.
  • LogTracer: Tracer that dumps recorded traces to a logger.
  • DataDogTracer

Just like logs and metrics, a tracer can be sent to multiple destinations using the CompositeTracer component.

# Log tracer
descriptor: "pip-services:tracer:log:default:1.0"

# DataDog traces
descriptor: "pip-services:tracer:datadog:default:1.0"
connection:
  api_key: {{DATADOG_API_KEY}}

import { IReferenceable, IReferences } from "pip-services3-commons-nodex";
import { CompositeTracer } from "pip-services3-components-nodex";


class MyComponent implements IReferenceable {
  private _tracer: CompositeTracer = new CompositeTracer();

  public setReferences(refs: IReferences) {
    this._tracer.setReferences(refs);
  }

  public doSomething(correlationId: string) {
    let timing = this._tracer.beginTrace(correlationId, "mycomponent", "do_something");
    try {
      ...
      timing.endTrace();
    } catch (ex) {
      timing.endFailure(ex);
    }
  }
}

using PipServices3.Commons.Refer;
using PipServices3.Components.Trace;

public class MyComponent : IReferenceable
{
    private CompositeTracer _tracer = new CompositeTracer();

    public void SetReferences(IReferences references)
    {
        this._tracer.SetReferences(references);
    }

    public void DoSomething(string correlationId)
    {
        var timing = this._tracer.BeginTrace(correlationId, "mycomponent", "do_something");
        try
        {
            // ...
            timing.EndTrace();
        }
        catch (Exception ex)
        {
            timing.EndFailure(ex);
        }
    }
}


import (
	"context"

	cref "github.com/pip-services3-gox/pip-services3-commons-gox/refer"
	ctrace "github.com/pip-services3-gox/pip-services3-components-gox/trace"
)

type MyComponent struct {
	// ...
	tracer *ctrace.CompositeTracer
}

func (c *MyComponent) SetReferences(ctx context.Context, references cref.IReferences) {
	c.tracer.SetReferences(ctx, references)
}

func (c *MyComponent) DoSomething(ctx context.Context, correlationId string) {
	timing := c.tracer.BeginTrace(ctx, correlationId, "mycomponent", "do_something")

	// ...

	if err != nil {
		timing.EndFailure(ctx, err)
	} else {
		timing.EndTrace(ctx)
	}
}


import 'package:pip_services3_commons/pip_services3_commons.dart';
import 'package:pip_services3_components/src/trace/CompositeTracer.dart';

class MyComponent implements IReferenceable {
  CompositeTracer _tracer = new CompositeTracer();

  @override
  void setReferences(IReferences refs) {
    _tracer.setReferences(refs);
  }

  void doSomething(String? correlationId) {
    var timing = _tracer.beginTrace(correlationId, 'mycomponent', 'do_something');
    try {
      ...
      timing.endTrace();
    } catch (ex) {
      timing.endFailure(ex as Exception);
    }
  }
}

from pip_services3_commons.refer import IReferenceable, IReferences
from pip_services3_components.trace import CompositeTracer


class MyComponent(IReferenceable):
    _tracer: CompositeTracer = CompositeTracer()

    def set_references(self, refs: IReferences):
        self._tracer.set_references(refs)

    def do_something(self, correlation_id: str):
        timing = self._tracer.begin_trace(correlation_id, "mycomponent", "do_something")
        try:
            ...
            timing.end_trace()
        except Exception as ex:
            timing.end_failure(ex)


Not available

References

For more information about observability see: