# Module 09: Synthetic Advanced Features (Labs 19-23)

Structured Logging, Custom Metrics, Extensions, Alerting, and SLOs
Navigate: [All Slides](../index.html) | [Prev: Browser Testing](../08_Browser_Testing/index.html) | [Next: Observability Integration](../10_Observability_Integration/index.html)
## Module Overview

Advanced k6 and Grafana SM features:

- **Lab 19:** Structured logging (JSON format, log levels, VU context)
- **Lab 20:** Custom metrics (Counter, Gauge, Rate, Trend)
- **Lab 21:** k6 extensions and the xk6 ecosystem
- **Lab 22:** Alerting on synthetic monitoring results
- **Lab 23:** SLOs and error budgets
## Lab 19: Structured Logging

k6 supports four console log methods:

| Method | Level | When to use |
|--------|-------|-------------|
| `console.log` | info | General trace messages |
| `console.info` | info | Informational milestones |
| `console.warn` | warn | Unexpected-but-not-failing conditions |
| `console.error` | error | Failures, assertion errors |

All output goes to **stderr** by default.
## JSON Log Format

Enable structured logging:

```bash
k6 run --log-format json script.js
```

Output:

```json
{"level":"info","msg":"product count: 5","source":"console","time":"2024-01-15T10:23:45.123Z"}
```

Compatible with Grafana Loki without transformation.
## Writing Logs to File

Separate log output from the progress display:

```bash
mkdir -p logs
k6 run --log-output file=logs/test.log \
  --log-format json script.js
```

Filter with jq:

```bash
jq 'select(.level == "error")' logs/test.log
```
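If jq is not available, the same level filter can be done in a few lines of Node. This is an illustrative sketch only: the sample lines below stand in for the contents of `logs/test.log`.

```javascript
// Filter k6 JSON log lines by level, mirroring the jq filter above.
// Hard-coded sample lines stand in for reading logs/test.log.
const lines = [
  '{"level":"info","msg":"product count: 5","source":"console","time":"2024-01-15T10:23:45.123Z"}',
  '{"level":"error","msg":"login FAILED status=500","source":"console","time":"2024-01-15T10:23:46.001Z"}',
];

const errors = lines
  .map((line) => JSON.parse(line))          // each log line is one JSON object
  .filter((entry) => entry.level === 'error');

console.log(errors.map((e) => e.msg));
```

To process a real file, read it with `fs.readFileSync` and split on newlines before applying the same map/filter.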
## Embed VU and Iteration Context

Use the `__VU` and `__ITER` global variables in log messages:

```javascript
const productsRes = http.get('http://localhost:3000/api/products');
console.log(`VU ${__VU} iter ${__ITER}: /api/products → ${productsRes.status}`);
```

Output:

```
INFO[0001] VU 1 iter 0: /api/products → 200
INFO[0001] VU 2 iter 0: /api/products → 200
INFO[0003] VU 1 iter 1: /api/products → 200
```
## Conditional Logging Pattern

Log only on failure to reduce volume:

```javascript
const loginOk = check(loginRes, {
  'login status 200': (r) => r.status === 200,
});

if (!loginOk) {
  console.error(
    `VU ${__VU} iter ${__ITER}: login FAILED — ` +
    `status=${loginRes.status} body=${loginRes.body.substring(0, 200)}`
  );
}
```
## Lab 20: Custom Metrics

Four metric types:

| Type | Tracks | Value Behavior | Example |
|------|--------|----------------|---------|
| Counter | Cumulative sum | Always increases | Total orders placed |
| Gauge | Point-in-time value | Last sample wins | Active sessions |
| Rate | Ratio of truthy adds | Value 0-1 | Login success rate |
| Trend | Statistical distribution | Stores all samples | Cart total, page size |
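To build intuition for the table above, here is a plain-JavaScript model of how each type aggregates `.add()` calls. This is not k6's implementation, just a sketch of the aggregation semantics; the percentile method is a naive nearest-rank calculation.

```javascript
// Illustrative model of k6 metric aggregation semantics (not k6 itself).
const counter = { sum: 0, add(v) { this.sum += v; } };   // cumulative sum
const gauge = { value: 0, add(v) { this.value = v; } };  // last sample wins
const rate = {                                           // ratio of truthy adds
  hits: 0, total: 0,
  add(v) { this.total++; if (v) this.hits++; },
  get value() { return this.hits / this.total; },
};
const trend = {                                          // stores all samples
  samples: [],
  add(v) { this.samples.push(v); },
  p(q) {                                                 // naive nearest-rank percentile
    const s = [...this.samples].sort((a, b) => a - b);
    return s[Math.ceil((q / 100) * s.length) - 1];
  },
};

counter.add(1); counter.add(1);                  // two orders placed
gauge.add(5); gauge.add(3);                      // last wins: 3 active sessions
rate.add(true); rate.add(true); rate.add(false); // 2 of 3 logins succeeded
[40, 55, 120].forEach((v) => trend.add(v));      // cart totals in USD
```

With these samples, `counter.sum` is 2, `gauge.value` is 3, `rate.value` is 2/3, and `trend.p(95)` is 120.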
## Defining Custom Metrics

```javascript
import { Counter, Gauge, Rate, Trend } from 'k6/metrics';

const ordersPlaced = new Counter('orders_placed');
const activeUsers = new Gauge('active_users');
const loginSuccessRate = new Rate('login_success_rate');
const cartValueTrend = new Trend('cart_value_usd', false);
```

The `false` in the Trend constructor means "not a time duration" (it's a dollar value).
## Recording Metric Values

```javascript
// Rate: login success
loginSuccessRate.add(loginRes.status === 200);

// Counter: orders placed
if (checkoutRes.status === 201) {
  ordersPlaced.add(1);
  const body = JSON.parse(checkoutRes.body);
  if (body.total) {
    cartValueTrend.add(parseFloat(body.total));
  }
}

// Gauge: active VU snapshot
activeUsers.add(__VU);
```
## Thresholds on Custom Metrics

```javascript
export const options = {
  thresholds: {
    'login_success_rate': ['rate>0.95'],
    'orders_placed': ['count>10'],
    'cart_value_usd': ['p(95)<200'],
  },
};
```

If any threshold fails, k6 exits with a non-zero status — the CI pipeline fails the build.
## Tagging Metric Samples

Add tags for dashboard filtering:

```javascript
cartValueTrend.add(parseFloat(body.total), { product_category: 'electronics' });
loginSuccessRate.add(loginRes.status === 200, { user_tier: 'premium' });
```

Tags appear as labels in Prometheus and as fields in InfluxDB — filter on them in Grafana panels.
## Lab 21: k6 Extensions

k6 is extensible via the **xk6** (extend k6) build system.

**Experimental built-ins** (no custom build needed):

- `k6/experimental/browser` — Chromium automation
- `k6/experimental/tracing` — distributed trace context
- `k6/experimental/redis` — Redis client
- `k6/experimental/websockets` — WebSocket client
## WebSocket Test Example

```javascript
import { sleep } from 'k6';
import { WebSocket } from 'k6/experimental/websockets';

export default function () {
  const ws = new WebSocket('ws://localhost:8765');

  ws.onopen = () => {
    ws.send('hello from k6');
  };

  ws.onmessage = (event) => {
    console.log('received:', event.data);
    ws.close();
  };

  sleep(2);
}
```
## The xk6 Build System

For extensions not shipped with the binary:

```bash
# 1. Install xk6 (requires Go 1.21+)
go install go.k6.io/xk6/cmd/xk6@latest

# 2. Build a custom k6 binary with extensions
xk6 build --with github.com/grafana/xk6-sql@latest

# 3. Run tests with the custom binary
./k6 run my-sql-test.js
```
## Finding Extensions

Extension catalog: **https://k6.io/docs/extensions/explore/**

Categories:

- **Data formats:** xk6-faker, xk6-csv
- **Messaging:** xk6-kafka, xk6-amqp
- **Databases:** xk6-sql, xk6-redis
- **Protocols:** xk6-grpc-web, xk6-stomp, xk6-ssh
- **Utilities:** xk6-dashboard, xk6-timers

Before using an extension, check its last commit date, whether it has tests, and whether Grafana maintains it.
## Lab 22: Alerting on SM Results

Grafana Synthetic Monitoring integrates with Grafana Alerting.

**Auto-generated rules** per check:

- Uptime alert (fires when uptime drops below a threshold)
- Response time alert (fires when p95 exceeds a threshold)

**Custom rules** on SM metrics:

```promql
avg_over_time(probe_success{job="Workshop Demo"}[5m])
```
## Customizing Alert Rules

Edit the auto-generated uptime alert:

1. Change the threshold from `< 0.75` to `< 0.99` (99% uptime)
2. Set the pending period to `5m` (avoid false positives)
3. Add labels:
   - `severity: warning`
   - `team: platform`

Labels drive notification routing.
## Contact Points and Notification Policies

**Contact Point** — where notifications go (email, Slack, PagerDuty)

**Notification Policy** — routing by labels

```yaml
Matcher: severity = critical
Contact point: PagerDuty
Repeat interval: 1h
```

Use separate policies for warning (Slack) and critical (PagerDuty).
## Mute Timings

Suppress notifications during planned maintenance:

```yaml
Name: Weekly maintenance
Days: Sunday
Time range: 02:00–04:00
```

Apply the mute timing to a notification policy. Alerts still fire and appear in Grafana; notifications are simply not sent.
## Lab 23: SLOs and Error Budgets

Key concepts:

**SLI** — Service Level Indicator (the measurement): `successful_checks / total_checks`

**SLO** — Service Level Objective (the target): "SLI >= 99.5% over the last 30 days"

**Error Budget** — the allowed headroom for failures: a 99.5% target leaves a 0.5% failure budget
## Burn Rate

How fast you consume error budget relative to the baseline rate:

- **1.0×** — on track to use exactly 100% of the budget by the end of the window
- **14×** — failing 14× faster than baseline (a 30-day budget is exhausted in ~2 days)

Burn rate is a **leading indicator** — it tells you when to act before the budget runs out.
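A quick way to reason about burn rate: at a constant rate, time-to-exhaustion is simply the window length divided by the burn rate. A minimal sketch (the function name is illustrative):

```javascript
// Time until the error budget is exhausted at a constant burn rate.
// A burn rate of 1.0 spends exactly 100% of the budget over the full window.
function daysToExhaustion(windowDays, burnRate) {
  return windowDays / burnRate;
}

console.log(daysToExhaustion(30, 1));  // 30 — on track
console.log(daysToExhaustion(30, 14)); // ≈ 2.14 — page someone
```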
## Creating an SLO in SM

```yaml
Name: Workshop Demo SLO
Description: Uptime SLO for Workshop Demo HTTP check
SLI type: Success rate
Check: Workshop Demo
Target: 99.5%
Rolling window: 30 days
```

Grafana auto-generates the error budget dashboard and calculations.
## Burn Rate Alerts

Two horizons:

**Fast burn (page-worthy):**

- Burn rate: 14×
- Window: 5 minutes
- Severity: critical
- At 14×, the 30-day budget is exhausted in ~2 days

**Slow burn (ticket-worthy):**

- Burn rate: 3×
- Window: 60 minutes
- Severity: warning
- At 3×, the budget is exhausted in ~10 days
## Error Budget Policies

A team agreement for decision-making:

| Budget Remaining | Action |
|------------------|--------|
| > 50% | Ship freely; reliability position strong |
| 20-50% | Exercise caution; every deploy carries risk |
| < 20% | Stop features; prioritize reliability |
| 0% | Feature freeze; mandatory reliability sprint |

Post this policy in the team Slack/wiki.
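The policy table lends itself to being encoded directly, e.g. for a dashboard annotation or a deploy gate. A sketch with hypothetical names; the thresholds mirror the table, but adjust them to your team's own agreement:

```javascript
// Map remaining error budget (fraction 0..1) to the agreed action.
// Thresholds follow the policy table; tune them to your own agreement.
function budgetPolicy(remaining) {
  if (remaining > 0.5) return 'ship freely';
  if (remaining >= 0.2) return 'exercise caution';
  if (remaining > 0) return 'stop features, prioritize reliability';
  return 'feature freeze, reliability sprint';
}

console.log(budgetPolicy(0.62)); // ship freely
console.log(budgetPolicy(0.04)); // stop features, prioritize reliability
```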
## Error Budget Math Example

Setup:

- SLO: 99.5% over 30 days
- Check frequency: 1 minute
- Total runs: 43,200

Allowed failures: 216 (0.5% of 43,200)

**Scenario:** a 10-minute outage (10 consecutive failures)

- Budget consumed: 10/216 = 4.6% in 10 minutes
- Burn rate: 200×
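The arithmetic above can be verified in a few lines of plain JavaScript:

```javascript
// Error budget math for a 99.5% SLO over 30 days with 1-minute checks.
const windowMinutes = 30 * 24 * 60;                       // 43,200 check runs
const slo = 0.995;
const allowedFailures = Math.round((1 - slo) * windowMinutes); // 216

// Scenario: a 10-minute outage, i.e. 10 consecutive failed checks.
const failures = 10;
const budgetConsumed = failures / allowedFailures;        // ≈ 0.046 → 4.6%

// Burn rate = observed failure rate vs. the rate that would spend
// the budget exactly over the whole window.
const observedRate = failures / 10;                       // 1 failure per minute
const baselineRate = allowedFailures / windowMinutes;     // 0.005 per minute
const burnRate = observedRate / baselineRate;             // ≈ 200×

console.log(allowedFailures, (budgetConsumed * 100).toFixed(1) + '%', burnRate);
```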
## Key Takeaways

- Structured logging with JSON format integrates with Grafana Loki
- Custom metrics (Counter, Gauge, Rate, Trend) track business outcomes
- k6 extensions add capabilities beyond HTTP (WebSockets, gRPC, SQL, Kafka)
- SM auto-generates alert rules; customize thresholds and add custom rules
- Notification Policies route by labels — decouple routing from rules
- SLOs define reliability targets; error budgets are finite allowances
- Burn rate alerts are leading indicators — act before the budget is exhausted
# Lab Complete!

Ready to integrate observability tools
Navigate: [All Slides](../index.html) | [Prev: Browser Testing](../08_Browser_Testing/index.html) | [Next: Observability Integration](../10_Observability_Integration/index.html)