[ScalaDaysNewYork2016] Monitoring Reactive Applications: New Approaches for a New Paradigm

Reactive applications, built on event-driven and asynchronous foundations, require innovative monitoring strategies. At Scala Days New York 2016, Duncan DeVore and Henrik Engström, both from Lightbend, explored the challenges and solutions for monitoring such systems. They discussed how traditional monitoring falls short for reactive architectures and introduced Lightbend’s approach to addressing these challenges, emphasizing adaptability and precision in observing distributed systems.

The Shift from Traditional Monitoring

Duncan and Henrik began by outlining the limitations of traditional monitoring, which relies on stack traces in synchronous systems to diagnose issues. In reactive applications, built with frameworks like Akka and Play, the asynchronous, message-driven nature disrupts this model. Stack traces lose relevance, as actors communicate without a direct call stack. The speakers categorized monitoring into business process, functional, and technical types, highlighting the need to track metrics like actor counts, message flows, and system performance in distributed environments.
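To make the point about stack traces concrete, here is a minimal sketch using only the Scala standard library. The object and method names (`StackTraceSketch`, `placeOrderSync`, `placeOrderAsync`) are hypothetical and chosen for illustration; a `Future` on the global execution context stands in for any asynchronous hand-off, including an actor message send.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object StackTraceSketch {
  /** Method names currently on the executing thread's call stack. */
  private def currentFrames(): Seq[String] =
    new Exception().getStackTrace.toSeq.map(_.getMethodName)

  /** Synchronous call: the caller is still on the stack while the work runs. */
  def placeOrderSync(): Seq[String] = currentFrames()

  /** Asynchronous call: the work runs on a pool thread, detached from the caller.
    * Await is used here only to bring the result back for inspection. */
  def placeOrderAsync(): Seq[String] =
    Await.result(Future(currentFrames()), 5.seconds)
}
```

In the synchronous case the captured frames include `placeOrderSync`, so a stack trace leads back to the caller. In the asynchronous case the frames belong to the thread pool's worker, and the method that initiated the work no longer appears, which is why a stack trace alone cannot explain where a message came from.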

The Impact of Distributed Systems

The rise of the internet and cloud computing has transformed system design, as Duncan explained. Distributed computing, pioneered by initiatives like ARPANET, and the economic advantages of cloud platforms have enabled businesses to scale rapidly. However, this shift introduces complexities, such as network partitions and variable workloads, necessitating new monitoring approaches. Henrik noted that reactive systems, designed for scalability and resilience, require tools that can handle dynamic data flows and provide insights into system behavior without relying on traditional metrics.

Challenges in Monitoring Reactive Systems

Henrik detailed the difficulties of monitoring asynchronous systems, where monitoring data is collected through either a push or a pull model. In a push-based model, the application sends its data to the monitoring tool, which must absorb high data volumes and risks overload; in a pull-based model, the tool queries the application selectively, which is more efficient. The speakers favored anomaly detection over static thresholds, since thresholds are hard to calibrate and may miss nuanced issues. Anomaly detection, supported by tools like Prometheus, identifies unusual patterns by correlating metrics, reducing false alerts and improving understanding of the system.
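The difference between a static threshold and anomaly detection can be sketched in a few lines. The example below uses a simple z-score over a recent window of samples; this is one common technique picked for illustration, not the method presented in the talk or the one used by any particular tool, and the names `AnomalySketch`, `staticAlert`, and `anomalous` are hypothetical.

```scala
object AnomalySketch {
  /** Static rule: alert whenever the sample exceeds a fixed limit. */
  def staticAlert(sample: Double, limit: Double): Boolean = sample > limit

  /** Mean and population standard deviation of a non-empty window. */
  def meanAndStdDev(window: Seq[Double]): (Double, Double) = {
    val mean = window.sum / window.size
    val variance = window.map(x => math.pow(x - mean, 2)).sum / window.size
    (mean, math.sqrt(variance))
  }

  /** Z-score rule: alert when the sample lies more than `k` standard
    * deviations away from the mean of the recent window. */
  def anomalous(window: Seq[Double], sample: Double, k: Double = 3.0): Boolean = {
    val (mean, sd) = meanAndStdDev(window)
    if (sd == 0.0) sample != mean           // flat baseline: any change is unusual
    else math.abs(sample - mean) / sd > k
  }
}
```

With recent latencies of `Seq(100.0, 102.0, 98.0, 101.0, 99.0)` milliseconds, a sample of 110 stays well under a static limit of 500 and raises no alert, yet the z-score rule flags it because it is far outside the service's normal range. Conversely, a service whose normal latency sits around 600 would trip the static limit constantly while the z-score rule stays quiet.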

Lightbend’s Monitoring Solution

Duncan and Henrik introduced Lightbend Monitoring, a subscription-based tool tailored for reactive applications. It integrates with Akka actors and Lagom circuit breakers, generating metrics and traces for backends like StatsD and Telegraf. The solution supports pull-based monitoring, allowing selective data collection to manage high data volumes. Future enhancements include support for distributed tracing, Prometheus integration, and improved Lagom compatibility, aiming to provide a comprehensive view of system health and performance.
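The idea of pull-based, selective collection can be illustrated with a small in-process registry. This is a sketch under stated assumptions and not Lightbend Monitoring's API: `PullMetricsSketch`, `increment`, `snapshot`, and the metric names are all hypothetical, and only the Scala and Java standard libraries are used.

```scala
import java.util.concurrent.atomic.LongAdder
import scala.collection.concurrent.TrieMap

object PullMetricsSketch {
  // Thread-safe map from metric name to counter (hypothetical registry).
  private val counters = TrieMap.empty[String, LongAdder]

  /** Called on the hot path: bump an in-process counter, no network I/O. */
  def increment(name: String, delta: Long = 1L): Unit =
    counters.getOrElseUpdate(name, new LongAdder).add(delta)

  /** Called by the collector: pull only the metrics under the given prefix. */
  def snapshot(prefix: String): Map[String, Long] =
    counters.iterator.collect {
      case (name, adder) if name.startsWith(prefix) => name -> adder.sum()
    }.toMap
}
```

The application only pays for a counter update when it records an event. A collector that cares about actor metrics can call `snapshot("actor.")` on its own schedule and ignore everything else, which keeps the data volume under the collector's control rather than the application's.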