Nico a2bc6347fc v0.13.0: Graph engine, versioned nodes, S3* audit, DB tools, Cytoscape

Architecture:
- Graph engine (engine.py) loads graph definitions, instantiates nodes
- Versioned nodes: input_v1, thinker_v1, output_v1, memorizer_v1, director_v1
- NODE_REGISTRY for dynamic node lookup by name
- Graph API: /api/graph/active, /api/graph/list, /api/graph/switch
- Graph definition: graphs/v1_current.py (7 nodes, 13 edges, 3 edge types)
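
A minimal sketch of how a name-based registry can back dynamic node lookup. Only the name NODE_REGISTRY and the versioned node names come from this commit; the `register` decorator, the `instantiate` helper, the node bodies, and the shape of the graph dict are assumptions for illustration and may not match engine.py.

```python
from typing import Callable, Dict

# Hypothetical registry shape; the real NODE_REGISTRY may differ.
NODE_REGISTRY: Dict[str, type] = {}


def register(name: str) -> Callable[[type], type]:
    """Class decorator that records a node class under a versioned name."""
    def wrap(cls: type) -> type:
        NODE_REGISTRY[name] = cls
        return cls
    return wrap


@register("input_v1")
class InputV1:
    def run(self, payload: str) -> str:
        # Illustrative behaviour only: normalise whitespace.
        return payload.strip()


@register("output_v1")
class OutputV1:
    def run(self, payload: str) -> str:
        # Illustrative behaviour only: tag the payload.
        return f"[out] {payload}"


def instantiate(graph: dict) -> dict:
    """Build one node instance per name listed in the graph definition."""
    return {name: NODE_REGISTRY[name]() for name in graph["nodes"]}


# Assumed graph definition layout: node names plus (source, target) edges.
graph = {"nodes": ["input_v1", "output_v1"], "edges": [("input_v1", "output_v1")]}
nodes = instantiate(graph)
result = nodes["output_v1"].run(nodes["input_v1"].run("  hello  "))
```

Keying the registry on the versioned name lets a graph definition switch from one node version to another by changing a string rather than an import.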

S3* Audit system:
- Workspace mismatch detection (server vs browser controls)
- Code-without-tools retry (Thinker wrote code but made no tool calls)
- Intent-without-action retry (request intent but Thinker only produced text)
- Dashboard feedback: browser sends workspace state on every message
- Sensor continuous comparison on 5s tick

State machines:
- create_machine / add_state / reset_machine / destroy_machine via function calling
- Local transitions (go:) resolve without LLM round-trip
- Button persistence across turns
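
A sketch of how a local transition might resolve without an LLM round-trip. It assumes an action string of the form `go:<state>`, which is a guess at what the `go:` prefix means; the `Machine` dataclass and `handle_action` are hypothetical names, not the project's API.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Machine:
    current: str
    # Assumed layout: state name -> {button label -> action string}.
    states: Dict[str, Dict[str, str]] = field(default_factory=dict)


def handle_action(machine: Machine, action: str) -> Optional[str]:
    """Return the new state for a local 'go:' action.

    Returns None when the action is not a local transition (or targets an
    unknown state), meaning the caller should fall back to the LLM.
    """
    if not action.startswith("go:"):
        return None
    target = action[len("go:"):]
    if target not in machine.states:
        return None
    machine.current = target
    return target


machine = Machine(
    current="list",
    states={"list": {"Details": "go:detail"}, "detail": {"Back": "go:list"}},
)
local = handle_action(machine, "go:detail")       # handled locally
remote = handle_action(machine, "ask: summarize")  # needs the LLM
```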

Database tools:
- query_db tool via pymysql to MariaDB K3s pod (eras2_production)
- Table rendering in workspace (tab-separated parsing)
- Director pre-planning with Opus for complex data requests
- Error retry with corrected SQL
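
A sketch of the tab-separated parsing step, assuming the query result reaches the workspace as text with a header line followed by one row per line. The function name `parse_tsv` and the returned dict layout are illustrative assumptions.

```python
from typing import Dict, List


def parse_tsv(raw: str) -> Dict[str, List]:
    """Split tab-separated text into a header list and a list of rows."""
    # Skip blank lines so a trailing newline does not produce an empty row.
    lines = [ln for ln in raw.splitlines() if ln.strip()]
    if not lines:
        return {"columns": [], "rows": []}
    columns = lines[0].split("\t")
    rows = [ln.split("\t") for ln in lines[1:]]
    return {"columns": columns, "rows": rows}


table = parse_tsv("id\tname\n1\tAda\n2\tBob\n")
```

All values stay strings in this sketch; any type conversion would be a separate step.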

Frontend:
- Cytoscape.js pipeline graph with real-time node animations
- Overlay scrollbars (CSS-only, no reflow)
- Tool call/result trace events
- S3* audit events in trace

Testing:
- 167 integration tests (11 test suites)
- 22 node-level unit tests (test_nodes/)
- Three test levels: node unit, graph integration, scenario

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

2026-03-29 00:18:45 +01:00


DB Exploration

Tests that the agent queries the database, renders results as tables in the workspace (not as text in chat), and creates an interactive exploration UI.

Setup

clear history

Steps

1. Query renders table in workspace

send: show me 5 customers from the database
expect_trace: has tool_call
expect_actions: has table
expect_response: not contains "---|" or "| ID"

2. Chat summarizes, does not dump data

expect_response: contains "customer" or "Kunde" or "5" or "table"
expect_response: length > 10

3. Thinker builds exploration UI (not describes it)

send: select customer 2 Kathrin Jager, add buttons to explore her objects and devices
expect_actions: length >= 1
expect_response: not contains "UI team" or "will add" or "will create"

4. Error recovery on bad query

send: SELECT * FROM nichtexistiert LIMIT 5
expect_trace: has tool_call
expect_response: not contains "1146"
expect_response: length > 10
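
The `expect_response` lines above could be evaluated roughly as follows. This is a sketch under stated assumptions, not the project's test runner: the function name `check` is hypothetical, `contains "a" or "b"` is read as "at least one quoted string appears", and `not contains "a" or "b"` is read as "none of the quoted strings appears".

```python
import operator
import re

# Comparison operators accepted after "length" (assumed set).
_OPS = {">": operator.gt, ">=": operator.ge, "<": operator.lt, "<=": operator.le}


def check(expectation: str, response: str) -> bool:
    """Evaluate one expectation string against a response text."""
    expectation = expectation.strip()
    quoted = re.findall(r'"([^"]*)"', expectation)
    if expectation.startswith("not contains"):
        # Passes only if none of the quoted strings occurs.
        return not any(q in response for q in quoted)
    if expectation.startswith("contains"):
        # Passes if any of the quoted strings occurs.
        return any(q in response for q in quoted)
    m = re.fullmatch(r"length\s*(>=|<=|>|<)\s*(\d+)", expectation)
    if m:
        return _OPS[m.group(1)](len(response), int(m.group(2)))
    raise ValueError(f"unsupported expectation: {expectation!r}")


reply = "Here are 5 customers, shown in the table."
no_markdown_table = check('not contains "---|" or "| ID"', reply)
mentions_data = check('contains "customer" or "Kunde" or "5" or "table"', reply)
long_enough = check("length > 10", reply)
```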
