Observability

Written by Wilco team • January 10, 2025

Understanding Observability

Observability, in the context of systems engineering, is a measure of how well the internal states of a system can be inferred from knowledge of its external outputs. In practice, it means making a system transparent enough that engineers can understand what it is doing from the data it emits.

Why Observability Matters

With the increasing complexity of systems, understanding system behavior and troubleshooting issues have become more challenging than ever. Observability provides the tools and practices necessary to gain insights into system performance, identify problems, and debug efficiently.

Implementing Observability

Metrics

Metrics are the numbers that tell the story of a system. They provide quantitative data about the system's operation. You can monitor things like response time, error rates, or resource usage.


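# A basic example of collecting metrics using Python's 'psutil' library
# (third-party: pip install psutil)
import psutil

# Get system-wide CPU usage as a percentage, sampled over one second.
# Without an interval, the first call returns a meaningless 0.0.
cpu_usage = psutil.cpu_percent(interval=1)

# Get system memory usage as a percentage of total RAM
memory_usage = psutil.virtual_memory().percent

print('CPU Usage (%):', cpu_usage)
print('Memory Usage (%):', memory_usage)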
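
Response time and error rate, also mentioned above, are application-level metrics rather than system-level ones. The following is a minimal sketch of how they might be recorded in-process using only the standard library. The names `RequestMetrics`, `observe`, and `handle_request` are hypothetical and exist only for illustration; a real service would more likely export these values to a metrics backend.

```python
import time
from dataclasses import dataclass, field


@dataclass
class RequestMetrics:
    """Hypothetical in-memory store for per-request measurements."""
    durations: list = field(default_factory=list)
    errors: int = 0

    def observe(self, func, *args, **kwargs):
        """Run func, recording how long it took and whether it raised."""
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        except Exception:
            self.errors += 1
            raise
        finally:
            # Runs on success and on failure, so every call is timed
            self.durations.append(time.perf_counter() - start)

    @property
    def total(self):
        return len(self.durations)

    @property
    def error_rate(self):
        return self.errors / self.total if self.total else 0.0

    @property
    def mean_response_time(self):
        return sum(self.durations) / self.total if self.total else 0.0


def handle_request(ok):
    """Stand-in for real request handling; fails when ok is False."""
    if not ok:
        raise ValueError("simulated failure")
    return "done"


metrics = RequestMetrics()
for ok in (True, True, False, True):
    try:
        metrics.observe(handle_request, ok)
    except ValueError:
        pass  # the failure is already counted in metrics.errors

print("Requests:", metrics.total)
print("Error rate:", metrics.error_rate)
print("Mean response time (s):", metrics.mean_response_time)
```

Wrapping the call in `try`/`finally` means failed requests are timed as well, so slow failures are not hidden from the response-time figure.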

Traces

A trace follows a single request as it flows through your system, broken into timed steps called spans. Traces help you understand the system's behavior at a granular level, such as which step of a request was slow.


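// A basic example of tracing in Go using the 'opentracing' library
package main

import (
    "github.com/opentracing/opentracing-go"
)

func main() {
    // Start a new span using the global tracer. Until a real tracer is
    // registered with opentracing.SetGlobalTracer, this is a no-op tracer.
    span := opentracing.StartSpan("my_span")
    defer span.Finish()

    // Do some work
    // ...
}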
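
To make the idea of a request's journey more concrete, here is a rough sketch in Python of what a tracing library keeps track of: each unit of work is a span, nested spans point at their parent, and all spans of one request share a trace ID. This is not how any particular tracing library is implemented. `start_span`, `finished_spans`, and the span fields are hypothetical names chosen for illustration, and the sketch is single-threaded only.

```python
import time
import uuid
from contextlib import contextmanager

finished_spans = []  # hypothetical in-memory stand-in for an exporter
_active = []         # stack of currently open spans (not thread-safe)


@contextmanager
def start_span(name):
    """Open a span; nested calls become children of the enclosing span."""
    parent = _active[-1] if _active else None
    span = {
        "name": name,
        # Children inherit the trace ID so the whole request can be grouped
        "trace_id": parent["trace_id"] if parent else uuid.uuid4().hex,
        "span_id": uuid.uuid4().hex[:16],
        "parent_id": parent["span_id"] if parent else None,
        "start": time.perf_counter(),
    }
    _active.append(span)
    try:
        yield span
    finally:
        _active.pop()
        span["duration"] = time.perf_counter() - span["start"]
        finished_spans.append(span)


with start_span("handle_request") as root:
    with start_span("query_database"):
        time.sleep(0.01)  # simulated work
    with start_span("render_response"):
        time.sleep(0.01)  # simulated work

for s in finished_spans:
    print(s["name"], "parent:", s["parent_id"], "seconds:", round(s["duration"], 3))
```

Spans are recorded when they finish, so the two child spans appear in `finished_spans` before the root span that encloses them.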

Logs

Logs are the diary of your system. They record the events and transactions that occur in your system. When something goes wrong, logs are usually the first place you look.


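// A basic example of logging in Java using the 'log4j' (1.x) library
import org.apache.log4j.BasicConfigurator;
import org.apache.log4j.Logger;

public class LogExample {
   // Name the logger after this class so its output can be traced back to it
   private static final Logger log = Logger.getLogger(LogExample.class);

   public static void main(String[] args) {
      // Send output to the console; without any configuration,
      // log4j 1.x prints a warning and discards these messages
      BasicConfigurator.configure();

      log.info("This is an info message.");
      log.error("This is an error message.");
   }
}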
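
Logs are easier to search and filter when each entry is structured rather than free text. As a hedged sketch of that idea, the Python example below uses the standard `logging` module to emit each record as one JSON object. `JsonFormatter` and the `request_id` field are hypothetical names introduced here for illustration, not part of any library.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Hypothetical formatter that renders each record as one JSON object."""

    def format(self, record):
        entry = {
            "time": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # request_id is an illustrative field supplied through `extra`
        if hasattr(record, "request_id"):
            entry["request_id"] = record.request_id
        return json.dumps(entry)


handler = logging.StreamHandler()  # writes to stderr by default
handler.setFormatter(JsonFormatter())

log = logging.getLogger("log_example")
log.setLevel(logging.INFO)
log.addHandler(handler)
log.propagate = False  # avoid duplicate output through the root logger

log.info("This is an info message.", extra={"request_id": "abc123"})
log.error("This is an error message.", extra={"request_id": "abc123"})
```

Attaching the same identifier to every entry for a request is one way to pull all related entries together when something goes wrong.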

Top 10 Key Takeaways

Observability is a measure of how well internal states of a system can be inferred from its outputs.
Observability is crucial in understanding system behavior and troubleshooting issues.
Metrics, traces, and logs are the three pillars of observability.
Metrics provide quantitative data about the system's operation.
Traces provide insights into the journey of a request as it flows through your system.
Logs record the events and transactions that occur in your system.
Use appropriate libraries and tools to implement observability in your system.
Always handle errors and exceptions while implementing observability.
Proper observability practices can help in efficient debugging and optimization.
Observability is not just a tool but a culture that promotes transparency and understanding of systems.
