
Recorder agent


The recorder agent is the Linux node process that turns local audio hardware into managed Rakkr recordings. It is a Rust crate at crates/recorder-agent (binary rakkr-recorder-agent), entrypoint src/main.rs, configuration in src/config.rs.

This page explains how it works. For every flag and environment variable, see the Recorder agent CLI reference.

Responsibilities

| Area | What it does |
| --- | --- |
| Identity | Reports a stable node ID, alias, site, room, tags, and runtime details (arch, kernel, OS, uptime, IPs, audio backends). |
| Inventory | Discovers ALSA capture devices (arecord -l, falling back to /proc/asound/pcm) and detects PipeWire/JACK availability by probing the PATH. Refreshed every heartbeat. |
| Metering | Samples PCM levels per channel and derives quality fields (RMS/peak dBFS, clipping, speech, channel correlation, …). |
| Capture / jobs | Claims recording jobs, runs bounded-concurrent capture processes, monitors output growth, renders channel maps, re-encodes, and uploads. |
| Health log | Writes lifecycle-managed local evidence (JSONL or SQLite) and syncs events to the controller. |
| System health | Tracks disk pressure, CPU/load pressure, and audio-backend availability transitions. |
| Cache retention | Tracks rendered/raw cache in a manifest and sweeps per controller-supplied policies. |
| Recovery | Reconciles in-flight jobs on startup, detects controller clock skew, and recovers from runtime device loss and disk shortfall with segment stitching. |
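
To make the Metering row concrete, here is a minimal sketch of how per-channel RMS and peak dBFS could be derived from interleaved signed 16-bit PCM. It is not the agent's actual implementation: the names ChannelLevel, measure_channels, and SILENCE_FLOOR_DBFS are hypothetical, and the -120 dBFS floor and the "sample equals full scale" clipping test are assumptions chosen for illustration.

```rust
/// Level summary for one channel (hypothetical type, not from the crate).
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct ChannelLevel {
    pub rms_dbfs: f64,
    pub peak_dbfs: f64,
    pub clipped: bool,
}

/// Assumed floor reported for digital silence, since log10(0) is undefined.
pub const SILENCE_FLOOR_DBFS: f64 = -120.0;

/// Converts a linear amplitude (1.0 = full scale) to dBFS.
fn to_dbfs(linear: f64) -> f64 {
    if linear <= 0.0 {
        SILENCE_FLOOR_DBFS
    } else {
        (20.0 * linear.log10()).max(SILENCE_FLOOR_DBFS)
    }
}

/// Splits interleaved i16 samples by channel and measures each channel.
pub fn measure_channels(interleaved: &[i16], channels: usize) -> Vec<ChannelLevel> {
    if channels == 0 {
        return Vec::new();
    }
    let full_scale = 32768.0_f64;
    (0..channels)
        .map(|ch| {
            let mut sum_sq = 0.0;
            let mut peak = 0.0_f64;
            let mut count = 0usize;
            let mut clipped = false;
            // Every `channels`-th sample starting at `ch` belongs to this channel.
            for &s in interleaved.iter().skip(ch).step_by(channels) {
                let v = f64::from(s) / full_scale;
                sum_sq += v * v;
                peak = peak.max(v.abs());
                clipped |= s == i16::MAX || s == i16::MIN;
                count += 1;
            }
            let rms = if count == 0 { 0.0 } else { (sum_sq / count as f64).sqrt() };
            ChannelLevel { rms_dbfs: to_dbfs(rms), peak_dbfs: to_dbfs(peak), clipped }
        })
        .collect()
}
```

A half-scale signal on one channel comes out at roughly -6 dBFS, while an all-zero channel reports the assumed silence floor.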

Run modes

Without a mode flag the agent runs as a long-lived daemon (the heartbeat loop). Several one-shot modes exist for diagnostics and scripting:

| Mode | Purpose |
| --- | --- |
| --print-inventory | Print node inventory JSON and exit. |
| --print-meter-frame | Capture/generate one meter frame and exit. |
| --print-channel-map-assignments | Fetch and print this node’s channel-map assignments. |
| --run-next-job | Claim and run exactly one queued job, then exit. |
| --capture-recording-id | One-shot capture → render → upload for a recording ID. |
| --attach-cache-file | Upload an existing local file as a recording’s cache. |

The daemon loop

Every heartbeat tick (default 5s), when a controller token is configured, the agent:

  1. Heartbeats: POST /nodes/{id}/heartbeat with the full inventory snapshot; the response Date header is used to compute controller clock skew.
  2. Posts a meter frame: POST /nodes/{id}/meter-frame.
  3. Posts a monitor chunk (if enabled): POST /nodes/{id}/listen/chunk with recent WAV audio for live listen-in.
  4. Pulls node config: GET /nodes/{id}/config returns audio defaults, recorder-cache policies, and recording capacity, all applied live (no restart needed to change capture defaults or concurrency).
  5. Syncs health events: POST /nodes/{id}/health-events for anything logged locally.
  6. Sweeps cache: when idle and policies exist, runs retention cleanup.
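
The tick sequence above can be sketched as a pure planning function. This is an illustration only: Step, TickInputs, plan_tick, and clock_skew_secs are hypothetical names, the paths are relative to the API base, and the skew sign convention (controller minus local) is an assumption. Parsing the Date header into a Unix timestamp is left out.

```rust
/// One unit of work in a heartbeat tick (hypothetical type).
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Step {
    Http { method: &'static str, path: String },
    SweepCache,
}

/// The conditions that decide which optional steps run (hypothetical type).
pub struct TickInputs {
    pub monitor_enabled: bool,
    pub idle: bool,
    pub has_cache_policies: bool,
}

/// Builds the ordered list of steps for one tick, mirroring the numbered list.
pub fn plan_tick(node_id: &str, inputs: &TickInputs) -> Vec<Step> {
    let http = |method: &'static str, suffix: &str| Step::Http {
        method,
        path: format!("/nodes/{node_id}/{suffix}"),
    };
    let mut steps = vec![http("POST", "heartbeat"), http("POST", "meter-frame")];
    if inputs.monitor_enabled {
        steps.push(http("POST", "listen/chunk"));
    }
    steps.push(http("GET", "config"));
    steps.push(http("POST", "health-events"));
    // Retention cleanup only runs on an idle node that has policies to apply.
    if inputs.idle && inputs.has_cache_policies {
        steps.push(Step::SweepCache);
    }
    steps
}

/// Skew in seconds, assuming the Date header was already parsed to Unix time.
/// Positive means the controller clock is ahead of the local clock.
pub fn clock_skew_secs(controller_date_unix: i64, local_unix: i64) -> i64 {
    controller_date_unix - local_unix
}
```

Keeping the plan separate from the HTTP calls makes the ordering easy to test without a controller.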

Recording-job workers run alongside the loop, up to the (controller-overridable) concurrency limit. Each worker claims a job, runs capture, heartbeats the job, watches for controller-driven stop/cancel, then renders and uploads. On startup the agent reconciles any job left in-flight from its persisted state file.
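
One way to model a concurrency limit that the controller can change at runtime is a slot counter with a guard that frees the slot on drop. The sketch below is an assumption about how such a limit could work, not the agent's code; JobSlots and SlotGuard are hypothetical names.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

/// Tracks how many recording jobs are running against an adjustable limit.
pub struct JobSlots {
    limit: AtomicUsize,
    active: AtomicUsize,
}

/// Held by a worker for the lifetime of one job; frees the slot when dropped.
pub struct SlotGuard {
    slots: Arc<JobSlots>,
}

impl JobSlots {
    pub fn new(limit: usize) -> Arc<Self> {
        Arc::new(Self { limit: AtomicUsize::new(limit), active: AtomicUsize::new(0) })
    }

    /// Applies a new limit live, e.g. after a node-config pull.
    /// Running jobs are not interrupted; only new claims are affected.
    pub fn set_limit(&self, limit: usize) {
        self.limit.store(limit, Ordering::SeqCst);
    }

    pub fn active(&self) -> usize {
        self.active.load(Ordering::SeqCst)
    }

    /// Returns a guard if a slot is free, or None when the node is at capacity.
    pub fn try_acquire(self: &Arc<Self>) -> Option<SlotGuard> {
        let mut current = self.active.load(Ordering::SeqCst);
        loop {
            if current >= self.limit.load(Ordering::SeqCst) {
                return None;
            }
            match self.active.compare_exchange(current, current + 1, Ordering::SeqCst, Ordering::SeqCst) {
                Ok(_) => return Some(SlotGuard { slots: Arc::clone(self) }),
                Err(observed) => current = observed,
            }
        }
    }
}

impl Drop for SlotGuard {
    fn drop(&mut self) {
        self.slots.active.fetch_sub(1, Ordering::SeqCst);
    }
}
```

A worker would call try_acquire before claiming a job and keep the guard until upload finishes, so a lowered limit takes effect as running jobs drain.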

Capture and backends

The default capture path is ALSA via arecord. PipeWire (pw-record) and JACK (jack_capture) are first-class presets; a synthetic meter backend keeps development hosts working without real hardware. If the capture command is left at the arecord default, the agent auto-selects the right command for the chosen backend.

Operators can fully override the argument list with command templates (--capture-args-template, --meter-args-template) using placeholders like {device}, {format}, {sample_rate}, {channels}, {seconds}, {output} — so non-arecord tools or site-specific flags can be plugged in without code changes.
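
As an illustration of placeholder substitution, the sketch below expands a whitespace-separated template into an argument list. The function names and the exact template syntax (space-separated tokens, no quoting or escaping rules) are assumptions for this example; the agent's real parser may differ.

```rust
use std::collections::HashMap;

/// Expands `{name}` placeholders in each whitespace-separated token.
/// Splitting happens before substitution, so a value containing spaces
/// (for example an output path) still ends up as a single argument.
pub fn render_args(template: &str, values: &HashMap<&str, String>) -> Result<Vec<String>, String> {
    template
        .split_whitespace()
        .map(|token| render_token(token, values))
        .collect()
}

fn render_token(token: &str, values: &HashMap<&str, String>) -> Result<String, String> {
    let mut out = String::new();
    let mut rest = token;
    while let Some(open) = rest.find('{') {
        out.push_str(&rest[..open]);
        let after = &rest[open + 1..];
        let close = after
            .find('}')
            .ok_or_else(|| format!("unclosed placeholder in {token:?}"))?;
        let name = &after[..close];
        let value = values
            .get(name)
            .ok_or_else(|| format!("unknown placeholder {{{name}}}"))?;
        out.push_str(value);
        rest = &after[close + 1..];
    }
    out.push_str(rest);
    Ok(out)
}
```

Rejecting unknown placeholders up front, rather than passing them through literally, surfaces template typos before a capture process is spawned.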

Capture is guarded: a minimum output size rejects empty files, and a growth-stall detector (grace period + stall timeout) fails captures whose output stops growing, with structured evidence.
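
The two guards can be sketched as a small state machine fed with periodic file-size observations. The names GrowthWatch, Verdict, and check_min_size are hypothetical, and the exact rule used here (no stall verdicts during the grace period, after which the time since the last growth is compared with the stall timeout) is an assumption about how the grace period and timeout interact.

```rust
use std::time::Duration;

#[derive(Debug, PartialEq, Eq)]
pub enum Verdict {
    Healthy,
    Stalled { stuck_at_bytes: u64, stalled_for: Duration },
}

/// Watches the size of a capture output file over time.
pub struct GrowthWatch {
    grace: Duration,
    stall_timeout: Duration,
    last_size: u64,
    last_growth_at: Duration,
}

impl GrowthWatch {
    pub fn new(grace: Duration, stall_timeout: Duration) -> Self {
        Self { grace, stall_timeout, last_size: 0, last_growth_at: Duration::ZERO }
    }

    /// `elapsed` is the time since capture start; `size_bytes` is the current file size.
    pub fn observe(&mut self, elapsed: Duration, size_bytes: u64) -> Verdict {
        if size_bytes > self.last_size {
            self.last_size = size_bytes;
            self.last_growth_at = elapsed;
            return Verdict::Healthy;
        }
        let stalled_for = elapsed.saturating_sub(self.last_growth_at);
        if elapsed >= self.grace && stalled_for >= self.stall_timeout {
            // The fields carry the structured evidence for the failure event.
            Verdict::Stalled { stuck_at_bytes: self.last_size, stalled_for }
        } else {
            Verdict::Healthy
        }
    }
}

/// Rejects empty or truncated outputs once capture has finished.
pub fn check_min_size(size_bytes: u64, min_bytes: u64) -> Result<(), String> {
    if size_bytes < min_bytes {
        Err(format!("output is {size_bytes} bytes, below the {min_bytes}-byte minimum"))
    } else {
        Ok(())
    }
}
```

Passing elapsed time in, instead of reading a clock inside observe, keeps the detector deterministic and easy to test.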

Health evidence

The agent writes a local health log — rotating JSONL by default, or a SQLite store — then (with a token) syncs each event to the controller. Local logging never blocks on sync. Event families include:

  • Meter capture: capture-failed, device-unavailable, xrun, recovered.
  • Meter quality: clipping, flatline, low-signal, channel-correlation (+ each recovery).
  • Sync health: heartbeat / meter-frame / monitor-chunk / node-config sync failures and recoveries.
  • Recording job: capture start/stall/render/upload failures, channel-map applied/lookup-failed, control-plane/status-poll failures, segment stitching, disk-space recovery.
  • System health: disk pressure, CPU pressure, audio-backend unavailable/recovered.
  • Cache retention: sweep completed, delete failures, tracking sync/failure.

Quality thresholds (clip/flatline/low-signal dBFS), log rotation, disk/CPU thresholds, and inventory probe paths are all configurable — see the CLI reference.
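
The "local logging never blocks on sync" property described in this section can be illustrated with a bounded queue and a non-blocking send. Everything in this sketch is assumed for illustration: the HealthLog type, the family/event/detail field names, and the choice to count events that could not be queued. The real agent may track unsynced events differently, for example by re-reading the local log.

```rust
use std::io::{self, Write};
use std::sync::mpsc::{sync_channel, Receiver, SyncSender};

/// Writes each event locally first, then offers it to a sync queue.
pub struct HealthLog<W: Write> {
    local: W,
    sync_tx: SyncSender<String>,
    pub dropped_from_sync: u64,
}

/// Minimal JSON string escaping for the hand-built JSONL line.
fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

impl<W: Write> HealthLog<W> {
    /// Returns the log plus the receiving end that a sync task would drain.
    pub fn new(local: W, queue_depth: usize) -> (Self, Receiver<String>) {
        let (sync_tx, sync_rx) = sync_channel(queue_depth);
        (Self { local, sync_tx, dropped_from_sync: 0 }, sync_rx)
    }

    pub fn record(&mut self, family: &str, event: &str, detail: &str) -> io::Result<()> {
        let line = format!(
            "{{\"family\":\"{}\",\"event\":\"{}\",\"detail\":\"{}\"}}",
            json_escape(family),
            json_escape(event),
            json_escape(detail)
        );
        // The local write always happens, regardless of the sync path.
        self.local.write_all(line.as_bytes())?;
        self.local.write_all(b"\n")?;
        // try_send never blocks: a full or closed queue is only counted.
        if self.sync_tx.try_send(line).is_err() {
            self.dropped_from_sync += 1;
        }
        Ok(())
    }

    pub fn into_local(self) -> W {
        self.local
    }
}
```

With a queue depth of 1 and nothing draining it, a second event is still written locally but is only counted on the sync side.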

Transport

The agent talks to <controller_url>/api/v1. It refuses non-loopback http:// controllers unless explicitly allowed for development, and can trust an internal controller CA bundle for TLS. All calls send the node bearer token; node-scoped calls also send an x-rakkr-agent-id header. See Transport security.