hermes/docs/HUD-PROTOCOL.md
Nico ccee249618 v0.6.42: Hermes chat UI — Vue3/TS/Vite, audio STT/TTS, sidebar rail, MCP event loop
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-30 19:35:10 +02:00


HUD Protocol — Structured Activity Feed

Status: Design — not yet implemented
Author: Titan
Created: 2026-03-15


Overview

The HUD (Heads-Up Display) is a real-time activity feed in the webchat UI showing what the agent is doing — tool calls, reasoning, session events. It replaces the previous flat string[] log with a structured, hierarchical, machine-readable event stream.


Goals

  • Structured — args and results are objects, not truncated strings
  • Hierarchical — tools nest inside turns; thinking nests inside turns
  • Incremental — events stream as they happen (_start / _end pairs)
  • Machine-readable — Titan can inspect full tool output from HUD panel
  • Replay-aware — history replay emits same events, flagged replay: true
  • Extensible — new event types add without breaking existing consumers

Wire Format

All HUD events are sent as WebSocket messages with type: "hud".

Base shape

interface HudEvent {
  type: 'hud'
  event: HudEventKind
  id: string              // always crypto.randomUUID() — globally unique
  correlationId?: string  // pairs _start/_end: provider call_* id (tools), turnId (turns), or a UUID generated at think_start (thinks)
  parentId?: string       // correlationId of containing turn
  ts: number              // Unix ms timestamp
  replay?: boolean        // true when emitted from history replay
}

Event kinds

tool_start   tool_end
think_start  think_end
turn_start   turn_end
received                  // instantaneous — no _end counterpart
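The base shape and kind list above could be checked at the socket boundary along these lines. This is a minimal sketch, not part of the design: `HUD_EVENT_KINDS` and `parseHudEvent` are illustrative names, and dropping unknown kinds is only one possible way to honor the Extensible goal.

```typescript
// Event kinds from the list above, as a string-literal union.
const HUD_EVENT_KINDS = [
  "tool_start", "tool_end",
  "think_start", "think_end",
  "turn_start", "turn_end",
  "received",
] as const;

type HudEventKind = (typeof HUD_EVENT_KINDS)[number];

interface HudEvent {
  type: "hud";
  event: HudEventKind;
  id: string;
  correlationId?: string;
  parentId?: string;
  ts: number;
  replay?: boolean;
}

// Hypothetical helper: parse one WebSocket text frame and return null for
// anything that is not a well-formed HUD event (including unknown kinds).
function parseHudEvent(raw: string): HudEvent | null {
  let msg: unknown;
  try {
    msg = JSON.parse(raw);
  } catch {
    return null;
  }
  if (typeof msg !== "object" || msg === null) return null;
  const m = msg as Record<string, unknown>;
  const kind = m["event"];
  if (m["type"] !== "hud" || typeof kind !== "string") return null;
  if ((HUD_EVENT_KINDS as readonly string[]).indexOf(kind) === -1) return null;
  if (typeof m["id"] !== "string" || typeof m["ts"] !== "number") return null;
  return m as unknown as HudEvent;
}
```

A consumer would call `parseHudEvent` on each incoming frame and skip `null` results, so messages of other types and future event kinds pass through without breaking the HUD.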

Event Shapes

turn_start / turn_end

{ "type": "hud", "event": "turn_start", "id": "<uuid>", "correlationId": "<turnId>", "ts": 1741995000000 }
{ "type": "hud", "event": "turn_end",   "id": "<uuid>", "correlationId": "<turnId>", "ts": 1741995004500, "durationMs": 4500 }

correlationId = session.turnId. Frontend pairs turn_start/turn_end by matching correlationId.


think_start / think_end

{ "type": "hud", "event": "think_start", "id": "<uuid>", "correlationId": "<uuid-think>", "parentId": "<turnId>", "ts": 1741995000050 }
{ "type": "hud", "event": "think_end",   "id": "<uuid>", "correlationId": "<uuid-think>", "parentId": "<turnId>", "ts": 1741995000820, "durationMs": 770 }

correlationId = crypto.randomUUID() generated at think_start, held in session state until think_end. parentId = containing turn's correlationId.
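Holding the thinking correlationId in session state until think_end might look like the sketch below. `SessionState`, `thinkStart`, and `thinkEnd` are hypothetical names, and the `newId` parameter stands in for crypto.randomUUID() so the example stays deterministic.

```typescript
interface ThinkState {
  correlationId: string;
  startedAt: number;
}

// Hypothetical session shape: only the fields this sketch needs.
interface SessionState {
  turnId: string;
  think?: ThinkState; // present only while a thinking block is open
}

interface ThinkEvent {
  type: "hud";
  event: "think_start" | "think_end";
  id: string;
  correlationId: string;
  parentId: string;
  ts: number;
  durationMs?: number;
}

// Generates the correlationId once and remembers it on the session.
function thinkStart(s: SessionState, newId: () => string, now: number): ThinkEvent {
  const correlationId = newId();
  s.think = { correlationId, startedAt: now };
  return { type: "hud", event: "think_start", id: newId(), correlationId, parentId: s.turnId, ts: now };
}

// Reuses the stored correlationId; returns null if no thinking block is open.
function thinkEnd(s: SessionState, newId: () => string, now: number): ThinkEvent | null {
  const open = s.think;
  if (open === undefined) return null;
  delete s.think;
  return {
    type: "hud",
    event: "think_end",
    id: newId(), // every event gets a fresh id, even within a pair
    correlationId: open.correlationId,
    parentId: s.turnId,
    ts: now,
    durationMs: now - open.startedAt,
  };
}
```

With the timestamps from the example above (1741995000050 and 1741995000820), `thinkEnd` yields `durationMs: 770`.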


tool_start / tool_end

{ "type": "hud", "event": "tool_start",
  "id": "<uuid>",
  "correlationId": "call_123f898fc88346afaec098e0",
  "parentId": "<turnId>",
  "tool": "read",
  "args": { "path": "workspace-titan/SOUL.md" },
  "ts": 1741995001000 }

{ "type": "hud", "event": "tool_end",
  "id": "<uuid>",
  "correlationId": "call_123f898fc88346afaec098e0",
  "parentId": "<turnId>",
  "tool": "read",
  "result": { "ok": true, "text": "# SOUL\n…", "bytes": 2048 },
  "ts": 1741995001210, "durationMs": 210 }

id = always crypto.randomUUID(). correlationId = provider tool call id (call_*). Frontend pairs tool_start/tool_end by matching correlationId.

Result shapes by tool:

// Shared — viewer-navigable file reference
interface FileArea {
  startLine: number    // 1-indexed, in resulting file
  endLine: number
}

interface FileMeta {
  path: string         // raw as passed to tool
  viewerPath: string   // normalized: /home/openclaw/.openclaw/ stripped
  area?: FileArea
}

// read
args:   { path: string, offset?: number, limit?: number }
result: { ok: boolean, file: FileMeta, area: FileArea, text: string, bytes: number, truncated: boolean }
// area inferred from offset/limit → { startLine: offset, endLine: offset+limit-1 } (assuming offset is 1-indexed; endLine is inclusive, as in write/append)

// write
args:   { path: string, operation: 'write' }
result: { ok: boolean, file: FileMeta, area: FileArea, bytes: number }
// area: { startLine: 1, endLine: lineCount(written content) }

// edit
args:   { path: string, operation: 'edit' }
result: { ok: boolean, file: FileMeta, area: FileArea }
// area: line range of replaced block in resulting file

// append
args:   { path: string, operation: 'append' }
result: { ok: boolean, file: FileMeta, area: FileArea, bytes: number }
// area: { startLine: prevLineCount+1, endLine: newLineCount }

// exec
args:   { command: string }
result: { ok: boolean, exitCode: number, stdout: string, truncated: boolean, mentionedPaths?: FileMeta[], error?: string }
// mentionedPaths: file paths parsed from command string via regex

// web_search / web_fetch
args:   { query?: string, url?: string }
result: { ok: boolean, text: string, url?: string, truncated: boolean }

// browser / canvas / nodes / message / sessions_*
args:   { action: string, [key: string]: any }
result: { ok: boolean, summary: string, raw?: any }

// fallback (unknown tool)
result: { ok: boolean, raw: any }
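The viewerPath normalization and the write/append area rules above can be sketched as follows. The helper names (`toFileMeta`, `lineCount`, `writeArea`, `appendArea`) are illustrative, and the sketch assumes that a trailing newline terminates the last line and that the existing file ends with a newline before an append.

```typescript
interface FileArea { startLine: number; endLine: number }
interface FileMeta { path: string; viewerPath: string; area?: FileArea }

// Prefix named in the FileMeta comment above.
const VIEWER_PREFIX = "/home/openclaw/.openclaw/";

function toFileMeta(path: string, area?: FileArea): FileMeta {
  const viewerPath = path.startsWith(VIEWER_PREFIX) ? path.slice(VIEWER_PREFIX.length) : path;
  const meta: FileMeta = { path, viewerPath };
  if (area !== undefined) meta.area = area;
  return meta;
}

// Assumption: "a\nb\n" and "a\nb" both count as two lines.
function lineCount(text: string): number {
  if (text.length === 0) return 0;
  const breaks = text.split("\n").length - 1;
  return text.endsWith("\n") ? breaks : breaks + 1;
}

// write: the area covers the whole written file.
function writeArea(content: string): FileArea {
  return { startLine: 1, endLine: lineCount(content) };
}

// append: the area starts right after the previous last line.
function appendArea(prevLineCount: number, appended: string): FileArea {
  return { startLine: prevLineCount + 1, endLine: prevLineCount + lineCount(appended) };
}
```

Appending 16 lines to a 179-line file gives `{ startLine: 180, endLine: 195 }`. An empty write produces `endLine: 0`, which a real implementation would probably want to special-case.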

UI labels derived from operation:

| Tool + operation | Label |
|---|---|
| read | 👁 path:L10-L50 |
| write | ✏️ path (overwrite) |
| edit | ✏️ path:L22-L28 |
| append | ✏️ path:L180-L195 |
| exec | ⚡ command |
| web_fetch | 🌐 url |
| web_search | 🔍 query |
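One way to derive these labels from a tool event is sketched below. `hudLabel` and its input type are hypothetical, the `-` range separator is an assumption, and unknown tools simply fall back to the tool name.

```typescript
interface LabelInput {
  tool: string;
  args: { path?: string; command?: string; url?: string; query?: string };
  area?: { startLine: number; endLine: number };
}

// Builds the row label for one tool node, following the table above.
function hudLabel(input: LabelInput): string {
  const { tool, args, area } = input;
  // Line range suffix, e.g. ":L10-L50"; omitted when no area is known yet.
  const range = area !== undefined ? `:L${area.startLine}-L${area.endLine}` : "";
  switch (tool) {
    case "read":
      return `👁 ${args.path ?? ""}${range}`;
    case "write":
      return `✏️ ${args.path ?? ""} (overwrite)`;
    case "edit":
    case "append":
      return `✏️ ${args.path ?? ""}${range}`;
    case "exec":
      return `⚡ ${args.command ?? ""}`;
    case "web_fetch":
      return `🌐 ${args.url ?? ""}`;
    case "web_search":
      return `🔍 ${args.query ?? ""}`;
    default:
      return tool; // fallback for tools without a dedicated label
  }
}
```

Because the area is only known from the result, a tool_start row would show the bare path and gain its line range once tool_end arrives.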

received (instantaneous)

No _end counterpart. Emitted when the backend acknowledges a control action.

interface ReceivedEvent extends HudEvent {
  event: 'received'
  subtype: ReceivedSubtype
  label: string               // human readable description
  payload?: Record<string, any>
}

type ReceivedSubtype =
  | 'new_session'
  | 'agent_switch'
  | 'stop'
  | 'kill'
  | 'handover'
  | 'reconnect'
  | 'message'

Examples:

{ "type": "hud", "event": "received", "id": "<uuid>", "subtype": "new_session",
  "label": "/new received — resetting session",
  "payload": { "previousAgent": "tester" }, "ts": 1741995010000 }

{ "type": "hud", "event": "received", "id": "<uuid>", "subtype": "agent_switch",
  "label": "switch → titan",
  "payload": { "from": "tester", "to": "titan" }, "ts": 1741995020000 }

{ "type": "hud", "event": "received", "id": "<uuid>", "subtype": "stop",
  "label": "stop received — aborting turn",
  "payload": { "state": "AGENT_RUNNING" }, "ts": 1741995030000 }

{ "type": "hud", "event": "received", "id": "<uuid>", "subtype": "reconnect",
  "label": "reconnected — replaying history",
  "payload": { "sessionKey": "agent:titan:web:nico" }, "ts": 1741995040000 }

{ "type": "hud", "event": "received", "id": "<uuid>", "subtype": "message",
  "label": "message received",
  "payload": { "preview": "hello world" }, "ts": 1741995050000 }

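On the backend, received events like the examples above could be produced by a small builder. This is a sketch under assumptions: `makeReceived` and `agentSwitchReceived` are hypothetical helpers, and callers pass in the id (crypto.randomUUID() in the real system) and the timestamp.

```typescript
type ReceivedSubtype =
  | "new_session"
  | "agent_switch"
  | "stop"
  | "kill"
  | "handover"
  | "reconnect"
  | "message";

interface ReceivedEvent {
  type: "hud";
  event: "received";
  id: string;
  ts: number;
  subtype: ReceivedSubtype;
  label: string;
  payload?: Record<string, unknown>;
}

// Generic builder: received events carry no correlationId because
// there is no _end counterpart to pair with.
function makeReceived(
  subtype: ReceivedSubtype,
  label: string,
  id: string,
  ts: number,
  payload?: Record<string, unknown>,
): ReceivedEvent {
  const ev: ReceivedEvent = { type: "hud", event: "received", id, ts, subtype, label };
  if (payload !== undefined) ev.payload = payload;
  return ev;
}

// Convenience wrapper reproducing the agent_switch example.
function agentSwitchReceived(from: string, to: string, id: string, ts: number): ReceivedEvent {
  return makeReceived("agent_switch", `switch → ${to}`, id, ts, { from, to });
}
```

Calling `agentSwitchReceived("tester", "titan", id, 1741995020000)` produces the same label and payload as the second example.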
ID Policy

All id fields are always crypto.randomUUID() — globally unique, no exceptions.

correlationId carries the external or domain identifier used for _start/_end pairing:

| Event | correlationId source |
|---|---|
| tool_start / tool_end | Provider tool call id (call_*) |
| turn_start / turn_end | session.turnId |
| think_start / think_end | crypto.randomUUID() generated at think_start, reused at think_end |
| received | none (no pairing needed) |

Frontend pairing logic:

  • *_start → create node, index by correlationId (or id if no correlationId)
  • *_end → look up by correlationId → merge result, set state: 'done', set durationMs
  • FIFO fallback if correlationId is missing or unmatched — match oldest running node of same tool/type
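The lookup with FIFO fallback described above can be sketched as a small index of running nodes. `PendingNodes` and `RunningNode` are illustrative names. The sketch relies on `Map` iterating in insertion order, so the first match found is the oldest running node.

```typescript
interface RunningNode {
  key: string; // correlationId, or id when the event has no correlationId
  type: "turn" | "tool" | "think";
  tool?: string;
}

class PendingNodes {
  // Map iteration follows insertion order, which gives FIFO for free.
  private byKey = new Map<string, RunningNode>();

  start(node: RunningNode): void {
    this.byKey.set(node.key, node);
  }

  // Resolves the node a *_end event belongs to and removes it from the index.
  end(type: RunningNode["type"], correlationId?: string, tool?: string): RunningNode | undefined {
    let node = correlationId !== undefined ? this.byKey.get(correlationId) : undefined;
    if (node === undefined) {
      // FIFO fallback: oldest running node of the same type and tool.
      node = Array.from(this.byKey.values()).find(
        (n) => n.type === type && n.tool === tool,
      );
    }
    if (node !== undefined) this.byKey.delete(node.key);
    return node;
  }
}
```

The fallback can mispair events when two calls to the same tool overlap and the ids are lost, so it is best treated as a last resort rather than a normal path.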

Frontend Data Model

interface HudNode {
  id: string
  type: 'turn' | 'tool' | 'think' | 'received'
  subtype?: string
  state: 'running' | 'done' | 'error'
  label: string                  // human readable
  tool?: string
  args?: Record<string, any>    // full, structured
  result?: Record<string, any>  // full, structured
  startedAt: number
  endedAt?: number
  durationMs?: number
  children: HudNode[]           // tools/thinks nest inside turns
  replay: boolean
}

Pairing logic:

  • Maintain Map<correlationId, HudNode> (pending nodes; keyed by id when an event has no correlationId)
  • *_start → create node with state: 'running', insert into map + tree
  • *_end → look up by correlationId, merge result, set state: 'done', set durationMs, remove from map
  • received → create complete node immediately (state: 'done')
  • If *_end arrives with a missing or unknown correlationId → FIFO fallback (match oldest running node of same tool/type)
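Putting the data model together, a frontend reducer that builds the tree might look like this sketch. It keys pending nodes by correlationId as the ID Policy section specifies, leaves out the FIFO fallback and the 'error' state for brevity, and uses hypothetical names (`WireEvent`, `HudTree`, `makeNode`).

```typescript
interface HudNode {
  id: string;
  type: "turn" | "tool" | "think" | "received";
  subtype?: string;
  state: "running" | "done" | "error";
  label: string;
  tool?: string;
  args?: Record<string, unknown>;
  result?: Record<string, unknown>;
  startedAt: number;
  endedAt?: number;
  durationMs?: number;
  children: HudNode[];
  replay: boolean;
}

// Loose view of an incoming HUD message: only the fields used here.
interface WireEvent {
  event: string; id: string; ts: number;
  correlationId?: string; parentId?: string; replay?: boolean;
  tool?: string; subtype?: string; label?: string; durationMs?: number;
  args?: Record<string, unknown>; result?: Record<string, unknown>;
}

function makeNode(ev: WireEvent, type: HudNode["type"], state: HudNode["state"]): HudNode {
  const node: HudNode = {
    id: ev.id, type, state,
    label: ev.label ?? ev.tool ?? type,
    startedAt: ev.ts, children: [], replay: ev.replay === true,
  };
  if (ev.tool !== undefined) node.tool = ev.tool;
  if (ev.subtype !== undefined) node.subtype = ev.subtype;
  if (ev.args !== undefined) node.args = ev.args;
  return node;
}

class HudTree {
  readonly roots: HudNode[] = [];
  private pending = new Map<string, HudNode>(); // running nodes by correlationId

  apply(ev: WireEvent): void {
    if (ev.event === "received") {
      this.roots.push(makeNode(ev, "received", "done")); // complete immediately
      return;
    }
    const m = /^(turn|tool|think)_(start|end)$/.exec(ev.event);
    if (m === null) return; // unknown kinds are ignored
    const key = ev.correlationId ?? ev.id;
    if (m[2] === "start") {
      const node = makeNode(ev, m[1] as "turn" | "tool" | "think", "running");
      // parentId is the containing turn's correlationId; the turn is still pending.
      const parent = ev.parentId !== undefined ? this.pending.get(ev.parentId) : undefined;
      (parent !== undefined ? parent.children : this.roots).push(node);
      this.pending.set(key, node);
      return;
    }
    const node = this.pending.get(key);
    if (node === undefined) return; // FIFO fallback omitted in this sketch
    this.pending.delete(key);
    node.state = "done";
    if (ev.result !== undefined) node.result = ev.result;
    node.endedAt = ev.ts;
    node.durationMs = ev.durationMs ?? ev.ts - node.startedAt;
  }
}
```

Feeding the turn and tool examples from the Event Shapes section through `apply` yields one root turn with a single `read` child, both in state 'done'.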

Emission Points

| Source | Events emitted |
|---|---|
| gateway.ts chat.tool_call | tool_start |
| gateway.ts chat.tool_result | tool_end |
| gateway.ts chat.thinking | think_start (on first chunk) |
| gateway.ts chat.done / chat.final | think_end (if thinking was open), turn_end |
| gateway.ts chat.delta / turn_start | turn_start (on first delta) |
| server.ts handleNew | received subtype=new_session |
| server.ts handleSwitchAgent | received subtype=agent_switch |
| server.ts handleStopKill | received subtype=stop or kill |
| server.ts handleHandoverRequest | received subtype=handover |
| server.ts reconnect path | received subtype=reconnect |
| session-watcher.ts history | all of the above with replay: true |
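The gateway.ts rows can be read as a small per-turn state machine. The sketch below is an illustration only: the `ChatEvent` field names (`kind`, `callId`, `tool`) and the `TurnEmitter` class are assumptions, args and results are omitted, and turn_end is emitted only if turn_start was.

```typescript
type ChatEvent =
  | { kind: "chat.delta" }
  | { kind: "chat.thinking" }
  | { kind: "chat.done" }
  | { kind: "chat.tool_call"; callId: string; tool: string }
  | { kind: "chat.tool_result"; callId: string; tool: string };

interface HudOut {
  type: "hud"; event: string; id: string; correlationId: string; ts: number;
  parentId?: string; durationMs?: number; tool?: string;
}

class TurnEmitter {
  private turnId: string;
  private newId: () => string; // crypto.randomUUID in the real gateway
  private turnStartedAt: number | null = null;
  private think: { correlationId: string; startedAt: number } | null = null;
  private toolStartedAt = new Map<string, number>();

  constructor(turnId: string, newId: () => string) {
    this.turnId = turnId;
    this.newId = newId;
  }

  private mk(event: string, correlationId: string, ts: number, startedAt?: number): HudOut {
    const out: HudOut = { type: "hud", event, id: this.newId(), correlationId, ts };
    if (event.indexOf("turn_") !== 0) out.parentId = this.turnId; // tools/thinks nest in the turn
    if (startedAt !== undefined) out.durationMs = ts - startedAt;
    return out;
  }

  // Maps one upstream chat event to zero or more HUD events.
  on(ev: ChatEvent, now: number): HudOut[] {
    const out: HudOut[] = [];
    switch (ev.kind) {
      case "chat.delta":
        if (this.turnStartedAt === null) { // first delta only
          this.turnStartedAt = now;
          out.push(this.mk("turn_start", this.turnId, now));
        }
        break;
      case "chat.thinking":
        if (this.think === null) { // first chunk only
          const correlationId = this.newId();
          this.think = { correlationId, startedAt: now };
          out.push(this.mk("think_start", correlationId, now));
        }
        break;
      case "chat.tool_call": {
        this.toolStartedAt.set(ev.callId, now);
        const e = this.mk("tool_start", ev.callId, now);
        e.tool = ev.tool;
        out.push(e);
        break;
      }
      case "chat.tool_result": {
        const e = this.mk("tool_end", ev.callId, now, this.toolStartedAt.get(ev.callId));
        this.toolStartedAt.delete(ev.callId);
        e.tool = ev.tool;
        out.push(e);
        break;
      }
      case "chat.done":
        if (this.think !== null) {
          out.push(this.mk("think_end", this.think.correlationId, now, this.think.startedAt));
          this.think = null;
        }
        if (this.turnStartedAt !== null) {
          out.push(this.mk("turn_end", this.turnId, now, this.turnStartedAt));
          this.turnStartedAt = null;
        }
        break;
    }
    return out;
  }
}
```

For history replay, the same emitter could be driven from stored entries with `replay: true` added to each output, which keeps live and replayed streams structurally identical.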

Rendering (HudActions.vue)

  • Tree view: turns at root, tools/thinks as children
  • Each row: [state-dot] [icon] [label] [duration-badge]
  • Expandable: click row → show args/result as formatted JSON
  • replay: true nodes rendered at reduced opacity
  • Running nodes animate (pulse dot)
  • Max visible: last 50 nodes (configurable)
  • History replay nodes collapsible as a group
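The visibility cap and the replay grouping from the list above could be computed before rendering, roughly as follows. This is a sketch with assumptions: `layoutRows` and `Section` are hypothetical names, the cap is applied to root-level nodes, and consecutive replay nodes form one collapsible group.

```typescript
interface Row {
  id: string;
  replay: boolean;
}

type Section<T> =
  | { kind: "replay-group"; nodes: T[] } // rendered collapsed, at reduced opacity
  | { kind: "live"; node: T };

// Keeps the last `maxVisible` root nodes and folds runs of replay nodes
// into a single group section.
function layoutRows<T extends Row>(roots: T[], maxVisible: number = 50): Section<T>[] {
  // slice(-0) would return everything, so handle non-positive caps explicitly.
  const visible = maxVisible > 0 ? roots.slice(-maxVisible) : [];
  const sections: Section<T>[] = [];
  for (const node of visible) {
    const last = sections[sections.length - 1];
    if (node.replay && last !== undefined && last.kind === "replay-group") {
      last.nodes.push(node);
    } else if (node.replay) {
      sections.push({ kind: "replay-group", nodes: [node] });
    } else {
      sections.push({ kind: "live", node });
    }
  }
  return sections;
}
```

In HudActions.vue this would typically sit behind a computed property, so the tree itself keeps every node and only the rendered window is capped.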

Migration

Replaces:

  • hudActionsLog: ref<string[]> in sessionHistory.ts
  • String-building in handleSessionEntry / handleSessionHistory
  • Raw string push in useAgentSocket.ts lines 111-114

Preserved:

  • chatStore.pushSystem() — chat bubble system messages (errors, stop confirm) — different concern
  • lastSystemMsgRef — status text in HudRow

Open Questions

  • Should received.message events be emitted for every user message? (could be noisy)
  • Should thinking content be stored in the node (for expand) or discarded?
  • Cap on result.text size stored in node? (full fidelity vs memory)