---
name: kelam-viz
description: 'Visualize Kelam voice-agent call data — transcripts, call volume, durations, talk ratio, tool usage, status breakdowns. Use when the user asks to "visualize calls", "build a call dashboard", "chart/plot call metrics or transcripts", "analyze call data", or wants any report/graphic from `kelam export` / `kelam stats` output.'
metadata:
  author: kelam
  version: "0.1.0"
---

# Kelam call-data visualization

Turn Kelam call logs into beautiful, self-contained visualizations. The pipeline is
always: **export the data with the CLI → inspect what you actually got → build one
self-contained HTML file → open it (or hand it to the user)**.

## 1. Get the data

The `kelam` CLI reads call logs from DynamoDB through the control plane. It is
configured with the `KELAM_API_URL` and `KELAM_WORKSPACE` environment variables; the
defaults work on a configured machine:

```bash
kelam stats                                   # quick aggregate JSON — look here FIRST
kelam stats --agent agt_abc123 --since 7d     # scoped

kelam export -o calls.jsonl                                  # full logs, newest first
kelam export --agent agt_abc123 --since 24h --status completed -o calls.jsonl
kelam export --format csv -o calls.csv        # flat metric rows, no transcript text
kelam export --format json                    # array on stdout (small datasets)
```

`--since` takes a relative window (`30m` / `24h` / `7d` / `2w`) or an ISO-8601
timestamp. Always run `kelam stats` first: it reports how many calls exist, the date
range, and the status mix, which together determine what is worth charting. Agent ids
come from `kelam list`.
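
Once the export exists, a quick inspection pass confirms what actually landed. A minimal
Python sketch, assuming the `.jsonl` export holds one call object per line (the usual
JSON Lines layout); `load_calls` and `summarize` are illustrative helper names, not part
of the `kelam` CLI:

```python
import json
from collections import Counter
from pathlib import Path


def load_calls(path):
    """Read a JSON Lines export: one call object per non-empty line (assumed layout)."""
    calls = []
    with Path(path).open(encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if line:
                calls.append(json.loads(line))
    return calls


def summarize(calls):
    """Return the call count, the status mix, and the started_at range."""
    # ISO-8601 strings that share the same UTC offset sort chronologically as text.
    starts = sorted(c["started_at"] for c in calls if c.get("started_at"))
    return {
        "total": len(calls),
        "by_status": dict(Counter(c.get("status", "unknown") for c in calls)),
        "first_start": starts[0] if starts else None,
        "last_start": starts[-1] if starts else None,
    }
```

Calling `summarize(load_calls("calls.jsonl"))` gives a cross-check against what
`kelam stats` reported before any chart is built.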

## 2. Data contract (one object per call)

```jsonc
{
  "call_id": "kelam-call-1a2b3c4d",
  "agent_id": "agt_abc123", "workspace": "default",
  "direction": "outbound",                  // or "inbound"
  "from_number": "+14155550100", "to_number": "+12065550123",
  "started_at": "2026-06-11T17:03:00+00:00",   // UTC ISO-8601
  "ended_at": "2026-06-11T17:05:30+00:00",     // may be absent (in_progress)
  "status": "completed",  // completed | failed | busy | no_answer | in_progress
  "transcript": [          // ordered turns; empty for failed/no_answer dials
    {"role": "user", "text": "..."},
    {"role": "assistant", "text": "..."},
    {"role": "tool", "text": "→ called book_slot({...})"},   // tool invocation
    {"role": "tool", "text": "← book_slot returned: ..."}    // tool result
  ],
  "recording_url": "s3://kelam-recordings-.../recordings/....ogg",
  "turn_metrics": [        // per user turn, in order; ABSENT on calls recorded before
    {                      // latency capture shipped — treat missing as "not measured"
      "speech_id": "speech_1a2b",
      "eou_delay": 0.45,   // end of speech -> turn considered complete (seconds)
      "llm_ttft": 0.62,    // LLM time to first token
      "tts_ttfb": 0.31     // TTS time to first audio byte
    }                      // any part may be null (e.g. interrupted before TTS)
  ],
  "metrics": {             // derived server-side (kelam/metrics.py) — same on every export
    "duration_seconds": 150.0,        // null when the call never connected
    "user_turns": 9, "agent_turns": 10, "tool_calls": 2,
    "user_words": 210, "agent_words": 480,
    "agent_talk_ratio": 0.696,        // agent_words / spoken words; null if silent
    "words_per_minute": 276.0,        // null without a duration
    "response_latency_p50": 1.38,     // per-turn eou_delay + llm_ttft + tts_ttfb (seconds);
    "response_latency_p95": 2.04      // null when the call has no measured turns
  }
}
```
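
The sample values above are internally consistent, which hints at how the derived
fields relate to the raw ones. The sketch below re-derives them in Python. The formulas
are inferred from the sample numbers and field comments, not taken from
`kelam/metrics.py`, so treat them as assumptions; `derive_metrics` is a hypothetical
helper name:

```python
from datetime import datetime


def derive_metrics(started_at, ended_at, user_words, agent_words):
    """Re-derive duration, talk ratio, and words per minute from raw fields.

    Assumed formulas (inferred from the sample record, not from kelam/metrics.py):
      duration_seconds = ended_at - started_at
      agent_talk_ratio = agent_words / (user_words + agent_words)
      words_per_minute = (user_words + agent_words) / (duration_seconds / 60)
    """
    duration = None
    if started_at and ended_at:
        start = datetime.fromisoformat(started_at)
        end = datetime.fromisoformat(ended_at)
        duration = (end - start).total_seconds()
    spoken = user_words + agent_words
    ratio = round(agent_words / spoken, 3) if spoken else None
    wpm = round(spoken / (duration / 60), 1) if duration else None
    return {
        "duration_seconds": duration,
        "agent_talk_ratio": ratio,
        "words_per_minute": wpm,
    }
```

With the sample record's inputs (17:03:00 to 17:05:30, 210 user words, 480 agent words)
this yields 150.0 seconds, a ratio of 0.696, and 276.0 words per minute, matching the
`metrics` object shown. Prefer the exported `metrics` values in charts; a check like
this is only useful for spotting a malformed export.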

Gotchas:

- `duration_seconds`, `agent_talk_ratio`, and `words_per_minute` are null for calls
  that never connected. Filter those calls out; never coerce null to 0.
- Timestamps are UTC. Bucket in UTC and label the timezone.
- Latency (`turn_metrics`, `response_latency_p50` / `response_latency_p95`) only exists
  on calls recorded after the worker started capturing it. Older calls have no
  `turn_metrics` and null percentiles: filter them out of latency charts and state how
  many calls were measured.
- A latency sample only counts turns where the caller heard a response (`llm_ttft` and
  `tts_ttfb` both present). `eou_delay` may be null under STT-based turn detection.
- `recording_url` is an `s3://` URI and is not embeddable. Presign it with
  `aws s3 presign` only if the user asks for audio playback.
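
The null-handling rules above can be sketched in Python as follows. The helper names
are illustrative, and treating a null `eou_delay` as 0 when summing a turn is an
assumption of this sketch rather than documented Kelam behavior:

```python
def connected_durations(calls):
    """Durations of calls that have one; null durations are dropped, never set to 0."""
    return [
        c["metrics"]["duration_seconds"]
        for c in calls
        if (c.get("metrics") or {}).get("duration_seconds") is not None
    ]


def turn_latency(turn):
    """Total response latency for one turn, or None if the caller heard no response.

    A turn counts only when llm_ttft and tts_ttfb are both present. A null
    eou_delay is treated as 0 here, which is an assumption of this sketch.
    """
    llm, tts = turn.get("llm_ttft"), turn.get("tts_ttfb")
    if llm is None or tts is None:
        return None
    return (turn.get("eou_delay") or 0.0) + llm + tts


def latency_coverage(calls):
    """Return (measured_calls, total_calls) so a chart can state its sample size."""
    measured = sum(
        1
        for c in calls
        if (c.get("metrics") or {}).get("response_latency_p50") is not None
    )
    return measured, len(calls)
```

A latency chart title can then read along the lines of "measured on 42 of 60 calls",
using the pair returned by `latency_coverage` (the numbers here are placeholders).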

## 3. Build the visualization

Produce **one self-contained HTML file** (e.g. `call-dashboard.html`):

- Embed the exported data inline: `<script>const CALLS = [...];</script>`. No fetch,
  no server — the file must work from `open call-dashboard.html`, emailed, anywhere.
- Charts via a CDN library (Chart.js is the default; D3 when you need custom forms
  like timelines or swimlanes). That `<script>` tag is the only external dependency, so
  rendering charts needs network access; everything else is inline CSS/JS, no build step.
- Compute aggregates in JS from `CALLS` (or precompute in Python and embed both), so
  the file doubles as the data's home.
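
One way to produce such a file is to render it from Python with the data embedded as a
JSON literal. This is a minimal sketch, not a required layout: `render_dashboard`, the
template placeholders, and the `volume` canvas id are hypothetical names, and the chart
code itself is left as a stub.

```python
import html
import json

PAGE_TEMPLATE = """<!doctype html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>__TITLE__</title>
<script src="https://cdn.jsdelivr.net/npm/chart.js"></script>
</head>
<body>
<canvas id="volume"></canvas>
<script>const CALLS = __CALLS_JSON__;</script>
<script>
// Chart code goes here and reads from CALLS.
</script>
</body>
</html>
"""


def render_dashboard(calls, title="Call dashboard"):
    """Return a single HTML document with the call data embedded inline."""
    # Escape "</" so transcript text containing a closing script tag cannot
    # terminate the inline script early; "\/" is a valid JSON and JS escape.
    payload = json.dumps(calls, ensure_ascii=False).replace("</", "<\\/")
    page = PAGE_TEMPLATE.replace("__TITLE__", html.escape(title))
    # The payload is substituted last so its contents are never re-scanned.
    return page.replace("__CALLS_JSON__", payload)
```

Writing the result with
`Path("call-dashboard.html").write_text(render_dashboard(calls), encoding="utf-8")`
gives a file that opens directly in a browser. The escaping step matters because
transcripts are free text and may contain markup.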

### Design bar

This is a product surface, not a debug page — make it genuinely beautiful:

- One typeface (system stack is fine), a deliberate palette (one hue per status:
  green=completed, red=failed, amber=busy/no_answer, gray=in_progress — keep it
  consistent across every chart), generous whitespace, a header with the workspace /
  agent / date range / export time.
- Lead with 4-6 stat cards (total calls, answer rate, total talk time, median
  duration, tool calls), then charts, then the transcript explorer.
- Label axes and units, format durations as m:ss, show phone numbers verbatim, and
  round ratios to 2 decimal places. Put tooltips on everything. Handle the empty state
  ("no calls match") instead of rendering blank canvases.
- Respect `prefers-color-scheme` if it is cheap; otherwise pick one polished theme.
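
The stat cards and formatting rules above could be precomputed along these lines. This
is a sketch under stated assumptions: the hex colors are illustrative picks for the
named hues, and "answer rate" is taken here to mean completed calls divided by all
calls, which the export does not define.

```python
from statistics import median

# One hue per status, reused by every chart. Hex values are illustrative.
STATUS_COLORS = {
    "completed": "#2e9e5b",    # green
    "failed": "#d64545",       # red
    "busy": "#e0a100",         # amber
    "no_answer": "#e0a100",    # amber
    "in_progress": "#8a8f98",  # gray
}


def format_duration(seconds):
    """Format seconds as m:ss; None (never connected) renders as a placeholder."""
    if seconds is None:
        return "-"
    total = int(round(seconds))
    return f"{total // 60}:{total % 60:02d}"


def stat_cards(calls):
    """Compute headline numbers; answer rate = completed / all calls (assumed)."""
    durations = [
        c["metrics"]["duration_seconds"]
        for c in calls
        if (c.get("metrics") or {}).get("duration_seconds") is not None
    ]
    completed = sum(1 for c in calls if c.get("status") == "completed")
    return {
        "total_calls": len(calls),
        "answer_rate": round(completed / len(calls), 2) if calls else None,
        "total_talk_time": format_duration(sum(durations)) if durations else "-",
        "median_duration": format_duration(median(durations)) if durations else "-",
        "tool_calls": sum((c.get("metrics") or {}).get("tool_calls") or 0 for c in calls),
    }
```

An empty export yields zero calls, a `None` answer rate, and placeholder durations, so
the page can show its "no calls match" state instead of dividing by zero.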

### Chart recipes — pick by question, do not dump every chart

| Question | Chart |
|---|---|
| When are calls happening? | volume over time (bar, bucketed hour/day to fit range) |
| Are calls connecting? | status breakdown (stacked bar over time, or donut) |
| How long are calls? | duration histogram + p50/p95 markers (from completed only) |
| Who talks more? | `agent_talk_ratio` distribution, or scatter vs duration |
| Are tools being used? | `tool_calls` per call; parse tool names from `→ called X(` for a frequency bar |
| How do agents compare? | grouped bars per `agent_id`: count, answer rate, median duration |
| Is the agent responsive? | `response_latency_p50` / `response_latency_p95` per call over time, or per-turn stacked bars (`eou_delay` / `llm_ttft` / `tts_ttfb` from `turn_metrics`) for one call |
| What was said? | transcript explorer: call list (status dot, time, duration) → chat-bubble view, user left / agent right, tool turns styled as monospace events; client-side text search across transcripts |

For "a dashboard", combine stat cards + volume + status + duration + the transcript
explorer. For a specific question, build only the charts that answer it.
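
Two of the recipes need a small amount of preprocessing: tool names have to be parsed
out of the `→ called X(` marker, and call volume has to be bucketed in UTC. A Python
sketch of both follows; the function names are illustrative, and the set of characters
allowed in a tool name is an assumption.

```python
import re
from collections import Counter
from datetime import datetime, timezone

# Matches the invocation marker from the data contract, e.g. "→ called book_slot({...})".
# The allowed tool-name characters are an assumption of this sketch.
TOOL_CALL_RE = re.compile(r"→ called\s+([A-Za-z_][\w.\-]*)\(")


def tool_frequency(calls):
    """Count tool invocations by name across all transcripts."""
    counts = Counter()
    for call in calls:
        for turn in call.get("transcript") or []:
            if turn.get("role") != "tool":
                continue
            match = TOOL_CALL_RE.search(turn.get("text", ""))
            if match:  # result turns ("← X returned: ...") do not match
                counts[match.group(1)] += 1
    return counts


def volume_by_bucket(calls, bucket="hour"):
    """Count calls per UTC hour or UTC day, keyed by a sortable label."""
    fmt = "%Y-%m-%d %H:00" if bucket == "hour" else "%Y-%m-%d"
    counts = Counter()
    for call in calls:
        # started_at carries an explicit offset; normalize to UTC before bucketing.
        started = datetime.fromisoformat(call["started_at"]).astimezone(timezone.utc)
        counts[started.strftime(fmt)] += 1
    return dict(sorted(counts.items()))
```

Both helpers return plain mappings, so their keys and values can be passed to a bar
chart as labels and data. Pick `bucket="hour"` for short ranges and `"day"` for longer
ones so the bar count stays readable.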

## 4. Conventions

- Code comments: real descriptions, no emojis (repo rule).
- Compare across agents only when the export actually spans agents.
- Quote findings from the data in your summary (e.g. "answer rate 82%, median call
  2:10") — the chart illustrates the numbers, the prose states them.
- After writing the file, open it or send it to the user; do not just mention the path.