- testcases/*.md: declarative test definitions (send, expect_response,
expect_state, expect_actions, action)
- runtime_test.py: standalone runner + pytest integration via conftest.py
- /tests route: web UI showing last run results from results.json
- /api/tests: serves results JSON
- Two initial testcases: counter_state (UI actions) and pub_conversation
(multi-turn, language switch, tool use, memorizer state)
- pub_conversation: 19/20 passed on first run
- Fix nm-text vertical overflow in node metrics bar
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
6-node pipeline: Input -> Thinker -> Output (voice) + UI (screen) in parallel
- Output: text only (markdown, emoji). Never emits HTML or controls.
- UI: dedicated node for labels, buttons, tables. Tracks workspace state.
Replaces entire workspace on each update. Runs parallel with Output.
- Input: strict one-sentence perception. No more hallucinating responses.
- Thinker: controls removed from prompt, focuses on reasoning + tools.
- Frontend: markdown rendered in chat (bold, italic, code blocks, lists).
Label control type added. UI node meter in top bar.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Thinker tool results stream directly to user, skipping Output node (halves latency)
- ProcessManager process_start/process_done events render as live cards in chat
- UI controls sent before response text, not after
- Button clicks route to handle_action(), skip Input, go straight to Thinker
- Fix Thinker model: gemini-2.5-flash-preview -> gemini-2.5-flash (old ID expired)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- ThinkerNode: reasons about perception, decides tool use vs direct answer
- Python tool: subprocess execution with 10s timeout
- Auto-detects python code blocks in LLM output and executes them
- Tool call/result visible in trace + HUD
- Thinker meter in frontend (token budget: 4K)
- Flow: Input (perceive) -> Thinker (reason + tools) -> Output (speak)
- Tested: math (42*137=5754), SQLite (create+query), time, greetings
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Per-node context fill bars (input/output/memorizer/sensor)
- Color-coded: green <50%, amber 50-80%, red >80%
- Sensor meter shows tick count + latest deltas
- Token info in trace context events
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>