Microservices Monitoring

Overview

This guide covers monitoring strategies, tools, and implementation approaches for microservices architecture, including metrics collection, logging, tracing, and alerting.

Prerequisites

Basic understanding of microservices architecture
Knowledge of Spring Boot Actuator
Familiarity with monitoring tools
Understanding of distributed systems

Learning Objectives

Understand microservices monitoring patterns
Learn metrics collection and visualization
Master distributed tracing
Implement centralized logging
Set up effective alerting

Metrics Collection

Spring Boot Actuator Configuration

management:
  endpoints:
    web:
      exposure:
        include: health,metrics,prometheus
  endpoint:
    health:
      show-details: always
  metrics:
    tags:
      application: ${spring.application.name}

Prometheus Configuration

scrape_configs:
  - job_name: 'spring-actuator'
    metrics_path: '/actuator/prometheus'
    scrape_interval: 5s
    static_configs:
      - targets: ['localhost:8080']

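Custom Metrics

import io.micrometer.core.instrument.MeterRegistry;
import java.util.concurrent.TimeUnit;
import org.springframework.stereotype.Component;

@Component
public class CustomMetricsService {
    private final MeterRegistry registry;

    // Spring injects the auto-configured MeterRegistry.
    public CustomMetricsService(MeterRegistry registry) {
        this.registry = registry;
    }

    // Records one order-processing duration in the "order.processing.time" timer.
    public void recordOrderProcessingTime(long timeInMs) {
        registry.timer("order.processing.time")
            .record(timeInMs, TimeUnit.MILLISECONDS);
    }

    // Counts one processed order.
    public void incrementOrderCounter() {
        registry.counter("order.processed").increment();
    }
}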
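
When these meters are scraped through the Prometheus endpoint, their names no longer contain dots. The sketch below approximates the renaming so the metric names used in PromQL queries are easier to predict. It is a simplified illustration, not Micrometer's own code: the class `PrometheusNamingSketch` and its methods are hypothetical, and it only covers the dot replacement and the usual counter and timer suffixes.

```java
/** Hypothetical helper that approximates how dotted meter names appear in Prometheus. */
final class PrometheusNamingSketch {

    /** Counters: dots become underscores and the name ends with "_total". */
    static String counterName(String meterName) {
        String base = meterName.replace('.', '_');
        return base.endsWith("_total") ? base : base + "_total";
    }

    /** Timers: the base unit "seconds" is appended, then one series each for count, sum and max. */
    static String[] timerNames(String meterName) {
        String base = meterName.replace('.', '_');
        if (!base.endsWith("_seconds")) {
            base = base + "_seconds";
        }
        return new String[] {base + "_count", base + "_sum", base + "_max"};
    }

    public static void main(String[] args) {
        System.out.println(counterName("order.processed")); // prints order_processed_total
        for (String name : timerNames("order.processing.time")) {
            System.out.println(name);
        }
    }
}
```

The same rule explains why the built-in `http.server.requests` timer is queried as `http_server_requests_seconds_count`.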

Distributed Tracing

Sleuth Configuration

spring:
  sleuth:
    sampler:
      probability: 1.0
  zipkin:
    base-url: http://localhost:9411
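
A sampler probability of 1.0 records every request, which is convenient in development but can be expensive under production load. The following sketch shows the idea behind probability-based sampling in plain Java. It is a simplified stand-in rather than Sleuth's own sampler, and the class name `ProbabilitySamplerSketch` is hypothetical.

```java
import java.util.Random;

/** Hypothetical probability sampler; a simplified stand-in, not Sleuth's own class. */
final class ProbabilitySamplerSketch {
    private final double probability;
    private final Random random;

    ProbabilitySamplerSketch(double probability, Random random) {
        if (probability < 0.0 || probability > 1.0) {
            throw new IllegalArgumentException("probability must be between 0.0 and 1.0");
        }
        this.probability = probability;
        this.random = random;
    }

    /** Decides whether a new trace is recorded and exported. */
    boolean shouldSample() {
        // nextDouble() is in [0.0, 1.0), so 1.0 always samples and 0.0 never does.
        return random.nextDouble() < probability;
    }

    public static void main(String[] args) {
        ProbabilitySamplerSketch sampler = new ProbabilitySamplerSketch(0.1, new Random());
        int sampled = 0;
        for (int i = 0; i < 10_000; i++) {
            if (sampler.shouldSample()) {
                sampled++;
            }
        }
        System.out.println("sampled " + sampled + " of 10000 traces");
    }
}
```

With a probability of 0.1, roughly one trace in ten reaches Zipkin, which trades completeness for lower overhead.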

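Trace Implementation

// Sleuth 3.x API; with Sleuth 2.x use brave.Tracer and brave.Span instead.
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.cloud.sleuth.Span;
import org.springframework.cloud.sleuth.Tracer;
import org.springframework.stereotype.Service;

@Service
public class OrderService {
    private static final Logger log = LoggerFactory.getLogger(OrderService.class);

    private final Tracer tracer;

    public OrderService(Tracer tracer) {
        this.tracer = tracer;
    }

    public Order processOrder(OrderRequest request) {
        // currentSpan() returns null when no trace is active on this thread.
        Span span = tracer.currentSpan();
        if (span != null) {
            span.tag("orderId", String.valueOf(request.getOrderId()));
        }

        log.info("Processing order: {}", request.getOrderId());
        // createOrder is a hypothetical helper standing in for the business logic.
        Order order = createOrder(request);
        return order;
    }
}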
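
To show what a tracer passes between services, the sketch below models a trace context, assuming the B3 header format that Zipkin-based setups commonly use. A child span keeps the trace ID, gets a fresh span ID, and records its parent's span ID. The class `B3ContextSketch` is hypothetical and leaves out sampling flags and 128-bit trace IDs.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.ThreadLocalRandom;

/** Hypothetical model of a B3 trace context; not part of Sleuth or Brave. */
final class B3ContextSketch {
    final String traceId;
    final String spanId;
    final String parentSpanId; // null for the root span

    private B3ContextSketch(String traceId, String spanId, String parentSpanId) {
        this.traceId = traceId;
        this.spanId = spanId;
        this.parentSpanId = parentSpanId;
    }

    /** Starts a new trace: here the root span reuses its span ID as the trace ID. */
    static B3ContextSketch newRoot() {
        String id = randomId();
        return new B3ContextSketch(id, id, null);
    }

    /** Creates the context for a downstream call within the same trace. */
    B3ContextSketch newChild() {
        return new B3ContextSketch(traceId, randomId(), spanId);
    }

    /** Headers an HTTP client would attach to the outgoing request. */
    Map<String, String> toHeaders() {
        Map<String, String> headers = new LinkedHashMap<>();
        headers.put("X-B3-TraceId", traceId);
        headers.put("X-B3-SpanId", spanId);
        if (parentSpanId != null) {
            headers.put("X-B3-ParentSpanId", parentSpanId);
        }
        return headers;
    }

    /** 64-bit identifier rendered as 16 lower-case hex characters. */
    private static String randomId() {
        return String.format("%016x", ThreadLocalRandom.current().nextLong());
    }

    public static void main(String[] args) {
        B3ContextSketch root = newRoot();
        B3ContextSketch child = root.newChild();
        System.out.println(root.toHeaders());
        System.out.println(child.toHeaders());
    }
}
```

Because every hop shares the trace ID, Zipkin can stitch the spans from different services into one request timeline.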

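Centralized Logging

Logback Configuration

<!-- Save as logback-spring.xml so that the springProperty element is processed. -->
<configuration>
    <!-- Exposes spring.application.name to Logback as ${springApplicationName}. -->
    <springProperty scope="context" name="springApplicationName" source="spring.application.name"/>

    <appender name="ELK" class="net.logstash.logback.appender.LogstashTcpSocketAppender">
        <destination>localhost:5000</destination>
        <encoder class="net.logstash.logback.encoder.LogstashEncoder">
            <customFields>{"app":"${springApplicationName}"}</customFields>
        </encoder>
    </appender>

    <root level="INFO">
        <appender-ref ref="ELK" />
    </root>
</configuration>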

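Structured Logging

import static net.logstash.logback.argument.StructuredArguments.kv;

import lombok.extern.slf4j.Slf4j;
import org.springframework.stereotype.Service;

@Slf4j
@Service
public class PaymentService {
    public void processPayment(Payment payment) {
        // The single {} consumes the first argument; the kv(...) arguments are
        // written as separate JSON fields by LogstashEncoder.
        log.info("Processing payment: {}", payment.getId(),
            kv("paymentId", payment.getId()),
            kv("amount", payment.getAmount()),
            kv("status", payment.getStatus()));
    }
}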

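Health Checks

Custom Health Indicator

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import javax.sql.DataSource;
import org.springframework.boot.actuate.health.Health;
import org.springframework.boot.actuate.health.HealthIndicator;
import org.springframework.stereotype.Component;

@Component
public class DatabaseHealthIndicator implements HealthIndicator {
    private final DataSource dataSource;

    public DatabaseHealthIndicator(DataSource dataSource) {
        this.dataSource = dataSource;
    }

    @Override
    public Health health() {
        // try-with-resources closes both the connection and the statement.
        try (Connection conn = dataSource.getConnection();
             PreparedStatement ps = conn.prepareStatement("SELECT 1")) {
            ps.executeQuery();
            return Health.up()
                .withDetail("database", "PostgreSQL")
                .withDetail("status", "Connected")
                .build();
        } catch (SQLException ex) {
            return Health.down()
                .withDetail("error", ex.getMessage())
                .build();
        }
    }
}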

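Composite Health Check

import java.util.HashMap;
import java.util.Map;
import org.springframework.boot.actuate.health.CompositeHealthContributor;
import org.springframework.boot.actuate.health.HealthIndicator;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class HealthCheckConfig {
    // CacheHealthIndicator is assumed to be defined elsewhere, like DatabaseHealthIndicator.
    // Spring Boot derives the contributor name from the bean name minus its
    // "HealthContributor" suffix, so this group should be reported as "dependencies".
    @Bean
    public CompositeHealthContributor dependenciesHealthContributor(
            DatabaseHealthIndicator dbHealth,
            CacheHealthIndicator cacheHealth) {
        Map<String, HealthIndicator> indicators = new HashMap<>();
        indicators.put("database", dbHealth);
        indicators.put("cache", cacheHealth);
        return CompositeHealthContributor.fromMap(indicators);
    }
}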
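
A composite reports a single status derived from its members. The sketch below illustrates that aggregation in plain Java, assuming the severity order that Spring Boot uses by default (DOWN, OUT_OF_SERVICE, UP, UNKNOWN). The class `HealthAggregationSketch` is hypothetical and works on status names as strings rather than on Actuator types.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration of status aggregation; not Actuator's own class. */
final class HealthAggregationSketch {
    // Most severe status first; assumed to match Spring Boot's default ordering.
    private static final List<String> SEVERITY_ORDER =
            Arrays.asList("DOWN", "OUT_OF_SERVICE", "UP", "UNKNOWN");

    /** Returns the most severe known status among the components. */
    static String aggregate(Map<String, String> componentStatuses) {
        return componentStatuses.values().stream()
                .filter(SEVERITY_ORDER::contains)
                .min(Comparator.comparingInt((String status) -> SEVERITY_ORDER.indexOf(status)))
                .orElse("UNKNOWN");
    }

    public static void main(String[] args) {
        Map<String, String> statuses = new LinkedHashMap<>();
        statuses.put("database", "UP");
        statuses.put("cache", "DOWN");
        System.out.println(aggregate(statuses)); // prints DOWN
    }
}
```

One failing dependency is enough to mark the whole group as down, so only include components whose failure really makes the service unusable.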

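Alerting

Alert Configuration

# Prometheus rule file, loaded through rule_files in prometheus.yml.
# The group name is a placeholder.
groups:
  - name: service-alerts
    rules:
      - alert: HighErrorRate
        # Micrometer's status label holds the numeric code, so 5xx is matched with a regex.
        expr: rate(http_server_requests_seconds_count{status=~"5.."}[5m]) > 0.1
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: High error rate detected
          # Assumes a service label on the series; the Actuator configuration
          # in this guide sets an application label instead.
          description: "Service {{ $labels.service }} has high error rate"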
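
The threshold in this rule is an absolute number of failed requests per second, not a percentage. The sketch below works through the arithmetic for a 5-minute window. It is a simplification: the real `rate()` function also handles counter resets and extrapolates to the window edges, and the class `ErrorRateSketch` is hypothetical.

```java
/** Hypothetical arithmetic behind a rate-over-window alert condition. */
final class ErrorRateSketch {

    /** Average per-second increase of a counter between two samples. */
    static double perSecondRate(double firstCount, double lastCount, double windowSeconds) {
        if (windowSeconds <= 0) {
            throw new IllegalArgumentException("windowSeconds must be positive");
        }
        return (lastCount - firstCount) / windowSeconds;
    }

    /** True when the rate is strictly above the alert threshold. */
    static boolean breaches(double rate, double threshold) {
        return rate > threshold;
    }

    public static void main(String[] args) {
        // 45 failed requests in 300 seconds is 0.15 errors per second.
        double rate = perSecondRate(1200, 1245, 300);
        System.out.println("rate=" + rate + " breaches=" + breaches(rate, 0.1));
    }
}
```

With a threshold of 0.1, the condition is met from 31 failed requests per 5 minutes upward, and the alert fires only after it has stayed true for the full `for` duration.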

Alertmanager Configuration

# Email delivery also needs SMTP settings such as global.smtp_smarthost and global.smtp_from.
route:
  group_by: ['alertname', 'service']
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 4h
  receiver: 'team-emails'

receivers:
  - name: 'team-emails'
    email_configs:
      - to: 'team@example.com'
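
The `group_by` setting decides which alerts are bundled into one notification. The sketch below mimics that bucketing in plain Java: alerts that share the same values for the listed labels land in the same group. The class `AlertGroupingSketch` and its `alert` helper are hypothetical, and timing settings such as `group_wait` are left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration of label-based alert grouping. */
final class AlertGroupingSketch {

    /** Buckets alerts by the values of the groupBy labels; a missing label counts as empty. */
    static Map<List<String>, List<Map<String, String>>> group(
            List<Map<String, String>> alerts, List<String> groupBy) {
        Map<List<String>, List<Map<String, String>>> groups = new LinkedHashMap<>();
        for (Map<String, String> labels : alerts) {
            List<String> key = new ArrayList<>();
            for (String labelName : groupBy) {
                key.add(labels.getOrDefault(labelName, ""));
            }
            groups.computeIfAbsent(key, k -> new ArrayList<>()).add(labels);
        }
        return groups;
    }

    /** Builds the label set of one alert. */
    static Map<String, String> alert(String alertname, String service) {
        Map<String, String> labels = new LinkedHashMap<>();
        labels.put("alertname", alertname);
        labels.put("service", service);
        return labels;
    }

    public static void main(String[] args) {
        List<Map<String, String>> alerts = Arrays.asList(
                alert("HighErrorRate", "orders"),
                alert("HighErrorRate", "orders"),
                alert("HighErrorRate", "payments"));
        group(alerts, Arrays.asList("alertname", "service"))
                .forEach((key, members) -> System.out.println(key + " -> " + members.size()));
    }
}
```

Grouping by both labels yields one email per affected service instead of one email per firing alert.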

Best Practices

Implement comprehensive metrics collection
Use distributed tracing for request flows
Centralize logs with proper context
Implement meaningful health checks
Set up proper alerting thresholds
Monitor system resources
Implement proper dashboard visualization

Common Pitfalls

Insufficient monitoring coverage
Poor log aggregation
Missing important metrics
Inadequate alerting
Resource-heavy monitoring
Poor visualization

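Implementation Examples

Complete Monitoring Setup

import io.micrometer.core.aop.TimedAspect;
import io.micrometer.core.instrument.MeterRegistry;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.boot.actuate.autoconfigure.metrics.MeterRegistryCustomizer;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class MonitoringConfig {
    // Customize the auto-configured registry instead of replacing it: a hand-built,
    // empty CompositeMeterRegistry has no backing registries, so nothing is exported.
    @Bean
    public MeterRegistryCustomizer<MeterRegistry> commonTags(
            @Value("${spring.application.name}") String applicationName) {
        // A "${...}" placeholder inside a Java string literal is never resolved,
        // so the value is injected with @Value.
        return registry -> registry.config().commonTags("application", applicationName);
    }

    // Enables Micrometer's @Timed annotation on Spring-managed beans.
    @Bean
    public TimedAspect timedAspect(MeterRegistry registry) {
        return new TimedAspect(registry);
    }
}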

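Metrics Aspect

import io.micrometer.core.instrument.MeterRegistry;
import io.micrometer.core.instrument.Timer;
import org.aspectj.lang.ProceedingJoinPoint;
import org.aspectj.lang.annotation.Around;
import org.aspectj.lang.annotation.Aspect;
import org.springframework.stereotype.Component;

// Hand-written alternative to TimedAspect; register only one of the two,
// otherwise each @Timed method is timed twice.
@Aspect
@Component
public class MetricsAspect {
    private final MeterRegistry registry;

    public MetricsAspect(MeterRegistry registry) {
        this.registry = registry;
    }

    // The annotation type must be fully qualified in the pointcut expression.
    @Around("@annotation(io.micrometer.core.annotation.Timed)")
    public Object timeMethod(ProceedingJoinPoint joinPoint) throws Throwable {
        Timer.Sample sample = Timer.start(registry);
        try {
            return joinPoint.proceed();
        } finally {
            // Runs on success and on failure, so failed calls are timed too.
            sample.stop(Timer.builder("method.execution.time")
                .tag("class", joinPoint.getSignature().getDeclaringTypeName())
                .tag("method", joinPoint.getSignature().getName())
                .register(registry));
        }
    }
}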
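
The try/finally structure is what makes the aspect record a timing even when the wrapped method throws. The sketch below reproduces that pattern without Spring or Micrometer so it can be run on its own. The class `TimingSketch` is hypothetical and keeps only a call count and a total duration per name.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;
import java.util.function.Supplier;

/** Hypothetical stand-alone version of the time-around-a-call pattern. */
final class TimingSketch {
    private final Map<String, LongAdder> counts = new ConcurrentHashMap<>();
    private final Map<String, LongAdder> nanos = new ConcurrentHashMap<>();

    /** Runs the body and records its duration, whether it returns or throws. */
    <T> T timed(String name, Supplier<T> body) {
        long start = System.nanoTime();
        try {
            return body.get();
        } finally {
            long elapsed = System.nanoTime() - start;
            counts.computeIfAbsent(name, k -> new LongAdder()).increment();
            nanos.computeIfAbsent(name, k -> new LongAdder()).add(elapsed);
        }
    }

    /** Number of recorded calls for the given name. */
    long count(String name) {
        LongAdder adder = counts.get(name);
        return adder == null ? 0 : adder.sum();
    }

    /** Total recorded time in nanoseconds for the given name. */
    long totalNanos(String name) {
        LongAdder adder = nanos.get(name);
        return adder == null ? 0 : adder.sum();
    }

    public static void main(String[] args) {
        TimingSketch timing = new TimingSketch();
        String result = timing.timed("order.lookup", () -> "order-42");
        System.out.println(result + " calls=" + timing.count("order.lookup"));
    }
}
```

Recording in the finally block keeps slow failures visible in the latency data instead of silently dropping them.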

Resources for Further Learning

Practice Exercises

Set up Spring Boot Actuator with custom metrics
Implement distributed tracing with Sleuth and Zipkin
Configure centralized logging with ELK stack
Create custom health indicators
Set up Prometheus and Grafana dashboards