02

🚀 Welcome to the Lab: Centralized Logging with Grafana Loki

📚 What You'll Learn

In this hands-on lab, you'll deploy a complete centralized logging infrastructure using Grafana Loki and Grafana Alloy. You'll learn how Loki's label-based indexing can make it up to 10x more cost-effective than traditional full-text logging systems like Elasticsearch.

By the end of this lab, you'll have logs flowing from Docker containers into Loki, ready to be queried through Grafana's powerful interface.

🎯 Lab Structure

  • Task 1: Understanding Log-Based Observability (15 min)
  • Task 2: Deploy Loki Stack with Docker Compose (20 min)
  • Task 3: Configure Alloy for Log Collection (25 min)
  • Task 4: Deploy Sample Application (15 min)

Q: What makes Loki more cost-effective than traditional logging systems like Elasticsearch? A: It indexes only labels, not full log content.

Loki's key design choice is that it indexes only metadata (labels), not the full text of log lines. This keeps the index small, which significantly reduces storage and compute costs: a query first uses labels to select the relevant log streams and then scans only the content of those streams. In contrast, traditional systems like Elasticsearch build a full-text index over log content, which can make them as much as ten times more expensive at scale.
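
To make the idea concrete, here is a toy Python model of label-only indexing. It is not how Loki is implemented; the class and method names (TinyLabelIndex, push, query) are hypothetical and exist only for illustration. The index holds one entry per unique label set (a stream), while the log lines themselves stay unindexed and are scanned only after the label lookup has narrowed the search.

```python
from collections import defaultdict


class TinyLabelIndex:
    """Toy model: index label sets only, never the log text."""

    def __init__(self):
        # One index entry per unique label set (a "stream").
        self._streams = defaultdict(list)

    def push(self, labels, line):
        # Labels become a hashable, order-independent key.
        key = frozenset(labels.items())
        self._streams[key].append(line)

    def index_size(self):
        # Grows with the number of streams, not the number of lines.
        return len(self._streams)

    def query(self, selector, contains=""):
        # Step 1: use the index to pick streams whose labels match.
        wanted = set(selector.items())
        matched = []
        for key, lines in self._streams.items():
            if wanted <= key:
                # Step 2: brute-force scan of only the selected lines.
                matched.extend(l for l in lines if contains in l)
        return matched


index = TinyLabelIndex()
index.push({"container": "app", "env": "lab"}, "GET /health 200")
index.push({"container": "app", "env": "lab"}, "connection pool exhausted")
index.push({"container": "db", "env": "lab"}, "checkpoint complete")

print(index.index_size())  # number of streams, not number of lines
print(index.query({"container": "app"}, "exhausted"))
```

Three lines were pushed, but the index holds only two entries because two of the lines share a label set. That is the trade-off in miniature: a small index, paid for by scanning line content at query time.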

Q: When would you use logs instead of metrics? A: To find the exact error message for a failed request.

Logs provide context and detail for individual events. They are ideal for debugging specific failures, viewing stack traces, and understanding the exact sequence of events. In contrast, metrics are more suitable for trends and aggregations such as percentiles, rates, and resource usage over time.


Real-World Context

Why This Matters

In production environments, you need more than metrics to troubleshoot issues effectively:

  • Metrics show you that an API's error rate increased from 1% to 15%
  • Logs tell you the exact error message: "Database connection pool exhausted after 30 seconds"

This combination enables rapid root cause analysis and significantly reduces Mean Time to Diagnosis (MTTD).
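
The difference between the two views can be sketched in a few lines of Python. The events list below is made-up sample data, not output from the lab; it only illustrates that a metric is an aggregate over many events, while a log is the individual event itself.

```python
from collections import Counter

# Hypothetical request events, invented for illustration.
events = [
    {"status": 200, "message": "ok"},
    {"status": 500, "message": "Database connection pool exhausted after 30 seconds"},
    {"status": 200, "message": "ok"},
    {"status": 500, "message": "Database connection pool exhausted after 30 seconds"},
]

# Metric view: one aggregated number that shows THAT something is wrong.
error_rate = sum(e["status"] >= 500 for e in events) / len(events)

# Log view: the individual messages that show WHAT went wrong.
error_messages = Counter(e["message"] for e in events if e["status"] >= 500)

print(f"error rate: {error_rate:.0%}")
print(error_messages.most_common(1))
```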

Who Uses Loki?

Grafana Loki is used by:

  • Cloud-native startups building on Kubernetes
  • Fortune 500 companies transitioning from expensive proprietary solutions
  • DevOps teams managing multi-cloud environments
  • SRE teams responsible for high-scale production systems

Loki is an open-source project maintained by Grafana Labs with broad industry adoption, making it a valuable skill for modern operations roles.

Hands-On Activities

You'll perform these practical tasks:

  1. Modify docker-compose.yml to add Loki and Alloy services
  2. Create Loki configuration specifying storage and retention policies
  3. Write Alloy config defining log collection rules and label extraction
  4. Deploy a Python Flask app that generates structured JSON logs
  5. Query logs in Grafana using the Explore interface
  6. Validate log ingestion by checking Loki's API endpoints
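
As a preview of the last step, the sketch below shows one way to check ingestion from Python using Loki's HTTP query_range endpoint. It assumes Loki is reachable at http://localhost:3100 (the port shown in this lab) and that your streams carry a container="app" label; both are assumptions, so adjust them to match your setup. The helper names are hypothetical.

```python
import json
import urllib.parse
import urllib.request

LOKI_URL = "http://localhost:3100"  # assumption: Loki published on the host


def query_range_url(base_url, logql, limit=10):
    """Build a URL for Loki's query_range endpoint."""
    params = urllib.parse.urlencode({"query": logql, "limit": limit})
    return f"{base_url}/loki/api/v1/query_range?{params}"


def count_lines(response_body):
    """Count log lines in a query_range JSON response (streams result)."""
    result = response_body["data"]["result"]
    return sum(len(stream["values"]) for stream in result)


def check_ingestion(base_url=LOKI_URL, logql='{container="app"}'):
    """Return how many recent lines Loki returns for the selector."""
    url = query_range_url(base_url, logql)
    with urllib.request.urlopen(url, timeout=5) as resp:
        return count_lines(json.load(resp))
```

Once the stack is running, calling check_ingestion() should return a number greater than zero if logs are arriving. A result of zero usually means either the label selector does not match any stream or the collector is not shipping logs yet.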

Prerequisites

To succeed in this lab, you should have:

  • Basic Docker knowledge: Understanding of containers and Docker Compose
  • Command-line proficiency: Comfortable with Linux/Unix terminal commands
  • YAML familiarity: Ability to read and edit YAML configuration files
  • Prometheus/Grafana basics (Recommended): Having completed "AIOps Foundations: Intelligent Monitoring with Prometheus & Grafana" is helpful but not required.

Lab Environment

Pre-Configured Components

Your lab environment comes pre-configured with:

  • Alpine Linux with Docker and Docker Compose installed
  • Pre-pulled Docker images for Loki, Alloy, Grafana, and supporting services
  • Network configuration allowing services to communicate
  • Template files ready for you to customize

What You'll Build

By the end of this lab, you'll have:

┌─────────────────────────────────────────────────┐
│             Grafana (Port 3000)                 │
│        (Visualization & Querying)               │
└───────────────┬─────────────────────────────────┘
                │
                │ Query logs via API
                │
┌───────────────▼─────────────────────────────────┐
│           Grafana Loki (Port 3100)              │
│        (Log Aggregation & Storage)              │
└───────────────▲─────────────────────────────────┘
                │
                │ Send logs
                │
┌───────────────┴─────────────────────────────────┐
│         Grafana Alloy (Port 12345)              │
│          (Log Collection Agent)                 │
└───────────────▲─────────────────────────────────┘
                │
                │ Scrape container logs
                │
┌───────────────┴─────────┬───────────────────────┐
│   Sample Flask App      │   Your Other Apps     │
│   (Generates Logs)      │   (Future Services)   │
└─────────────────────────┴───────────────────────┘
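
The "Send logs" arrow in the diagram is an HTTP push into Loki. Alloy handles this for you (including batching and retries), so you will not write this code in the lab. As a rough sketch of what travels over that arrow, the snippet below builds a JSON body in the shape accepted by Loki's /loki/api/v1/push endpoint. The function name and the label values are hypothetical.

```python
import json
import time


def build_push_payload(labels, lines, now_ns=None):
    """Build a JSON body for Loki's /loki/api/v1/push endpoint."""
    if now_ns is None:
        now_ns = time.time_ns()
    # Each entry is [timestamp in nanoseconds as a string, log line].
    values = [[str(now_ns + i), line] for i, line in enumerate(lines)]
    return {"streams": [{"stream": labels, "values": values}]}


payload = build_push_payload(
    {"container": "app"},  # hypothetical label set
    ["request started", "request finished"],
    now_ns=1_700_000_000_000_000_000,  # fixed timestamp for a repeatable example
)
print(json.dumps(payload, indent=2))
```

Note that the labels are attached once per stream, not once per line. This is the same label set that Loki indexes and that you later use in a query selector.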

Q: What is the main advantage of structured logging (JSON format) over plain text logs? A: It's easier to parse and query programmatically.

Structured logging (JSON) provides a consistent, machine-readable format that makes it easy to extract specific fields, filter on values, and aggregate data. With plain text logs, you'd need complex regex patterns to extract the same information. JSON logs enable queries like:

  {container="app"} | json | response_time > 1000

This LogQL query selects the streams labeled container="app", parses each line as JSON, and keeps only the lines whose response_time field is greater than 1000.
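
On the application side, structured logs can be produced with Python's standard logging module alone. The sketch below is a minimal example, not the lab's actual Flask app: the JsonFormatter class and the field names path, status, and response_time are assumptions chosen to line up with the response_time field used in the query.

```python
import json
import logging
import sys


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""

    def format(self, record):
        entry = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Copy selected extra fields (hypothetical names) if present.
        for field in ("path", "status", "response_time"):
            if hasattr(record, field):
                entry[field] = getattr(record, field)
        return json.dumps(entry)


handler = logging.StreamHandler(sys.stdout)
handler.setFormatter(JsonFormatter())
logger = logging.getLogger("app")
logger.setLevel(logging.INFO)
logger.addHandler(handler)

# Extra fields become top-level JSON keys that a JSON parser stage can extract.
logger.info("request completed",
            extra={"path": "/checkout", "status": 200, "response_time": 1250})
```

Writing to stdout matters here: Docker captures a container's stdout and stderr, which is where a collection agent picks the lines up.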


LogQL line filter operators:

  • |= - Contains text
  • != - Does not contain
  • |~ - Regex match
  • !~ - Regex not match
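
To build intuition for the four operators, the sketch below emulates them in Python over a handful of made-up log lines. It is only an approximation: Loki evaluates regexes with Go's RE2 syntax, which differs from Python's re module in some details, and the apply_filter helper is hypothetical.

```python
import re

# Map each LogQL line-filter operator to an equivalent Python predicate.
LINE_FILTERS = {
    "|=": lambda line, arg: arg in line,
    "!=": lambda line, arg: arg not in line,
    "|~": lambda line, arg: re.search(arg, line) is not None,
    "!~": lambda line, arg: re.search(arg, line) is None,
}


def apply_filter(lines, op, arg):
    """Keep only the lines that pass one line-filter stage."""
    keep = LINE_FILTERS[op]
    return [line for line in lines if keep(line, arg)]


# Made-up sample lines for illustration.
logs = [
    "GET /health 200",
    "POST /checkout 500 timeout",
    "GET /cart 404",
]

print(apply_filter(logs, "|=", "timeout"))       # lines containing "timeout"
print(apply_filter(logs, "!=", "/health"))       # lines without "/health"
print(apply_filter(logs, "|~", r" (4|5)\d\d"))   # lines with a 4xx or 5xx status
print(apply_filter(logs, "!~", r" (4|5)\d\d"))   # lines without a 4xx or 5xx status
```

In a real query the filters are chained after a label selector, for example {container="app"} |= "timeout" != "/health", and each stage narrows the result of the previous one.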