- Harness reports to /api/test/status with suite_start/step_result/suite_end
- Frontend shows x/44 progress, per-test duration, total elapsed time
- Auto-discovers test count from test modules (no hardcoded number)
- run_all.py --report URL pushes live results to browser
- Fix: suite_start with a count resets progress only on the first call
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
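
A minimal sketch of how a status endpoint could track the three events named above. The `SuiteStatus` class, the event field names (`type`, `count`, `name`, `ok`, `duration`), and the reading of the fix as "a repeated suite_start during a run must not wipe earlier results" are assumptions for illustration, not the project's actual code.

```python
from dataclasses import dataclass, field


@dataclass
class SuiteStatus:
    """Hypothetical in-memory progress state for one test run."""
    total: int = 0
    running: bool = False
    results: list = field(default_factory=list)

    def handle(self, event: dict) -> None:
        kind = event["type"]
        if kind == "suite_start":
            # Reset only on the first suite_start of a run; a repeated
            # call while the run is active keeps earlier results.
            if not self.running:
                self.results.clear()
                self.total = 0
                self.running = True
            self.total = max(self.total, event.get("count", 0))
        elif kind == "step_result":
            self.results.append({
                "name": event["name"],
                "ok": event["ok"],
                "duration": event.get("duration", 0.0),
            })
        elif kind == "suite_end":
            self.running = False

    def progress(self) -> str:
        # Rendered by the frontend as "x/N".
        return f"{len(self.results)}/{self.total}"
```

Taking the count from the event rather than from a constant is what lets the frontend show x/N without a hardcoded total.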
- OutputSink: collects output, optionally streams to attached WS
- Runtime no longer requires a WebSocket; it works headless for MCP
- WS connects/disconnects via attach_ws()/detach_ws(), runtime persists
- /api/send/check + /api/send (async) + /api/result (poll with progress)
- Graph switch destroys old runtime, next request creates new one
- Director v2 model: claude-opus-4 (was claude-sonnet-4, reserved)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
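
The OutputSink idea above can be sketched roughly as follows. This is an illustration under assumptions: the `emit` method and the `FakeWS` stand-in are hypothetical, `attach_ws()`/`detach_ws()` are placed on the sink here although the commit may define them on the runtime, and the socket is assumed to expose an async `send_json` method.

```python
import asyncio


class OutputSink:
    """Collects every message; forwards to a WebSocket only if one is attached."""

    def __init__(self):
        self.buffer = []
        self._ws = None

    def attach_ws(self, ws):
        self._ws = ws

    def detach_ws(self):
        # The sink (and whatever owns it) outlives the connection.
        self._ws = None

    async def emit(self, message: dict):
        self.buffer.append(message)          # always collected
        if self._ws is not None:
            await self._ws.send_json(message)  # streamed only when attached


class FakeWS:
    """Hypothetical stand-in for a real WebSocket connection."""

    def __init__(self):
        self.sent = []

    async def send_json(self, message):
        self.sent.append(message)


async def demo():
    sink = OutputSink()
    await sink.emit({"text": "headless"})
    ws = FakeWS()
    sink.attach_ws(ws)
    await sink.emit({"text": "streamed"})
    sink.detach_ws()
    await sink.emit({"text": "headless again"})
    return sink.buffer, ws.sent
```

Because the buffer fills regardless of whether a socket is attached, a polling endpoint such as /api/result can read from it in headless mode.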
RED->GREEN->REFACTOR cycle:
- UI node has state store (key-value), action bindings (op/var), and
  local action handlers (inc/dec/set/toggle, no LLM round-trip)
- Thinker self-model: knows its environment, that ACTIONS create real
  buttons, and that the UI handles state locally. Emits var/op payload for
  stateful actions.
- Thinker's context includes UI state so it can report current values
- /api/clear resets UI state, bindings, and controls
- Test runner: action_match for fuzzy action names, persistent actions
  across steps, _stream_text restored
- Counter test: 16/16 passed (create, read, inc, inc, dec, verify)
- Pub test: 20/20 passed (conversation, language switch, tool use, mood)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
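
A rough sketch of the local state handling described above, assuming a binding is a mapping from an action name to an op/var pair. The `UIState` class, its method names, and the `value` field used by `set` are hypothetical; only the four ops (inc/dec/set/toggle) and the clear behavior come from the commit text.

```python
class UIState:
    """Key-value state plus action bindings, resolved without an LLM call."""

    def __init__(self):
        self.state = {}
        self.bindings = {}

    def bind(self, action, op, var, value=None):
        self.bindings[action] = {"op": op, "var": var, "value": value}

    def handle_action(self, action):
        """Apply a bound action locally; return False if it is not bound."""
        binding = self.bindings.get(action)
        if binding is None:
            return False  # caller would fall back to the Thinker
        op, var = binding["op"], binding["var"]
        if op == "inc":
            self.state[var] = self.state.get(var, 0) + 1
        elif op == "dec":
            self.state[var] = self.state.get(var, 0) - 1
        elif op == "set":
            self.state[var] = binding["value"]
        elif op == "toggle":
            self.state[var] = not self.state.get(var, False)
        else:
            return False
        return True

    def clear(self):
        # Mirrors what /api/clear is described as resetting.
        self.state.clear()
        self.bindings.clear()


ui = UIState()
ui.bind("increment", "inc", "count")
ui.bind("decrement", "dec", "count")
for action in ("increment", "increment", "decrement"):
    ui.handle_action(action)
```

The sequence at the end follows the counter test's inc, inc, dec steps and leaves `count` at 1.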
- testcases/*.md: declarative test definitions (send, expect_response,
  expect_state, expect_actions, action)
- runtime_test.py: standalone runner + pytest integration via conftest.py
- /tests route: web UI showing last run results from results.json
- /api/tests: serves results JSON
- Two initial testcases: counter_state (UI actions) and pub_conversation
  (multi-turn, language switch, tool use, memorizer state)
- pub_conversation: 19/20 passed on first run
- Fix nm-text vertical overflow in node metrics bar
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Thinker tool results stream directly to user, skipping Output node (halves latency)
- ProcessManager process_start/process_done events render as live cards in chat
- UI controls sent before response text, not after
- Button clicks route to handle_action(), skip Input, go straight to Thinker
- Fix Thinker model: gemini-2.5-flash-preview -> gemini-2.5-flash (old ID expired)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>