v0.10.4: stateful UI engine — TDD counter test green (36/36)

RED->GREEN->REFACTOR cycle:
- UI node has state store (key-value), action bindings (op/var), and
  local action handlers (inc/dec/set/toggle — no LLM round-trip)
- Thinker self-model: knows its environment, that ACTIONS create real
  buttons, that UI handles state locally. Emits var/op payload for
  stateful actions.
- Thinker's context includes UI state so it can report current values
- /api/clear resets UI state, bindings, and controls
- Test runner: action_match for fuzzy action names, persistent actions
  across steps, _stream_text restored
- Counter test: 16/16 passed (create, read, inc, inc, dec, verify)
- Pub test: 20/20 passed (conversation, language switch, tool use, mood)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This commit is contained in:
Nico 2026-03-28 15:50:37 +01:00
parent 3d71c651fc
commit 3f8886cbd2
8 changed files with 341 additions and 63 deletions

View File

@ -150,6 +150,9 @@ def register_routes(app):
if not _active_runtime:
raise HTTPException(status_code=409, detail="No active session")
_active_runtime.history.clear()
_active_runtime.ui_node.state.clear()
_active_runtime.ui_node.bindings.clear()
_active_runtime.ui_node.current_controls.clear()
return {"status": "cleared"}
@app.get("/api/state")

View File

@ -42,13 +42,18 @@ Output ONLY valid JSON. No explanation, no markdown fences."""
"facts": [],
}
def get_context_block(self, sensor_lines: list[str] = None) -> str:
def get_context_block(self, sensor_lines: list[str] = None, ui_state: dict = None) -> str:
lines = sensor_lines or ["Sensors: (none)"]
lines.append("")
lines.append("Shared memory (from Memorizer):")
for k, v in self.state.items():
if v:
lines.append(f"- {k}: {v}")
if ui_state:
lines.append("")
lines.append("UI state (visible to user in workspace):")
for k, v in ui_state.items():
lines.append(f"- {k} = {v}")
return "\n".join(lines)
async def update(self, history: list[dict]):

View File

@ -24,28 +24,40 @@ TOOLS — write a ```python code block and it WILL be executed. Use print() for
- For math, databases, file ops, any computation: write python. NEVER describe code write it.
- For simple conversation: respond directly as text.
YOUR ENVIRONMENT:
You are one node in a pipeline: Input (perceives) -> You (reason) -> Output (speaks) + UI (renders).
- Your text response goes to Output, which speaks it to the user.
- Your ACTIONS go to UI, which renders buttons/labels in a workspace panel.
- Button clicks come back to you as "ACTION: action_name".
- UI has a STATE STORE you can create variables and bind buttons to them.
- Simple actions (inc/dec/toggle) are handled by UI locally instant, no round-trip.
ACTIONS ALWAYS end your response with an ACTIONS: line containing a JSON array.
The ACTIONS line MUST be the very last line of your response.
Format: ACTIONS: [json array of actions]
Examples:
User asks about dog breeds:
Here are three popular dog breeds: Golden Retriever, German Shepherd, and Poodle.
ACTIONS: [{{"label": "Golden Retriever", "action": "learn_breed", "payload": {{"breed": "Golden Retriever"}}}}, {{"label": "German Shepherd", "action": "learn_breed", "payload": {{"breed": "German Shepherd"}}}}, {{"label": "Poodle", "action": "learn_breed", "payload": {{"breed": "Poodle"}}}}]
STATEFUL ACTIONS to create UI state with buttons, include var/op in payload:
{{"label": "+1", "action": "increment", "payload": {{"var": "count", "op": "inc", "initial": 0}}}}
{{"label": "-1", "action": "decrement", "payload": {{"var": "count", "op": "dec"}}}}
Ops: inc, dec, set, toggle. UI auto-creates the variable and a label showing its value.
User asks what time it is:
SIMPLE ACTIONS for follow-ups that need your reasoning:
{{"label": "Learn More", "action": "learn_breed", "payload": {{"breed": "Poodle"}}}}
Examples:
Create a counter:
Counter created! Use the buttons to increment or decrement.
ACTIONS: [{{"label": "+1", "action": "increment", "payload": {{"var": "count", "op": "inc", "initial": 0}}}}, {{"label": "-1", "action": "decrement", "payload": {{"var": "count", "op": "dec"}}}}]
Simple conversation:
Es ist 14:30 Uhr.
ACTIONS: []
After creating a database:
Done! Created 5 customers in the database.
ACTIONS: [{{"label": "Show All", "action": "show_all"}}, {{"label": "Add Customer", "action": "add_customer"}}]
Rules:
- ALWAYS include the ACTIONS: line, even if empty: ACTIONS: []
- Keep labels short (2-4 words), action is snake_case.
- Only include meaningful actions empty array is fine for simple chat.
- For state variables, use var/op in payload. UI handles the rest.
{memory_context}"""

View File

@ -1,8 +1,7 @@
"""UI Node: pure renderer — converts ThoughtResult actions + data into controls."""
"""UI Node: stateful renderer — manages UI state, handles local actions, renders controls."""
import json
import logging
import re
from .base import Node
from ..types import ThoughtResult
@ -17,31 +16,99 @@ class UINode(Node):
def __init__(self, send_hud):
super().__init__(send_hud)
self.current_controls: list[dict] = []
self.state: dict = {} # {"count": 0, "theme": "dark", ...}
self.bindings: dict = {} # {"increment": {"op": "inc", "var": "count"}, ...}
# --- State operations ---
def set_var(self, name: str, value) -> str:
self.state[name] = value
log.info(f"[ui] state: {name} = {value}")
return f"{name} = {value}"
def get_var(self, name: str):
return self.state.get(name)
def handle_local_action(self, action: str, payload: dict = None) -> str | None:
"""Try to handle an action locally. Returns result text or None if not local."""
binding = self.bindings.get(action)
if not binding:
return None
op = binding.get("op")
var = binding.get("var")
step = binding.get("step", 1)
if var not in self.state:
return None
if op == "inc":
self.state[var] += step
elif op == "dec":
self.state[var] -= step
elif op == "set" and payload and "value" in payload:
self.state[var] = payload["value"]
elif op == "toggle":
self.state[var] = not self.state[var]
else:
return None
log.info(f"[ui] local action {action}: {var} = {self.state[var]}")
return f"{var} is now {self.state[var]}"
# --- Action parsing from Thinker ---
def _parse_thinker_actions(self, actions: list[dict]) -> list[dict]:
"""Parse Thinker's actions, register bindings, return controls."""
controls = []
for action in actions:
act = action.get("action", "")
label = action.get("label", "Action")
payload = action.get("payload", {})
# Detect state-binding hints in payload
if "var" in payload and "op" in payload:
self.bindings[act] = {
"op": payload["op"],
"var": payload["var"],
"step": payload.get("step", 1),
}
# Auto-create state var if it doesn't exist
if payload["var"] not in self.state:
self.set_var(payload["var"], payload.get("initial", 0))
log.info(f"[ui] bound action '{act}' -> {payload['op']} {payload['var']}")
controls.append({
"type": "button",
"label": label,
"action": act,
"payload": payload,
})
return controls
# --- Table extraction ---
def _extract_table(self, tool_output: str) -> dict | None:
"""Try to parse tabular data from tool output."""
if not tool_output:
return None
lines = [l.strip() for l in tool_output.strip().split("\n") if l.strip()]
if len(lines) < 2:
return None
# Detect pipe-separated tables (e.g. "col1 | col2\nval1 | val2")
if " | " in lines[0]:
columns = [c.strip() for c in lines[0].split(" | ")]
data = []
for line in lines[1:]:
if line.startswith("-") or line.startswith("="):
continue # separator line
continue
vals = [v.strip() for v in line.split(" | ")]
if len(vals) == len(columns):
data.append(dict(zip(columns, vals)))
if data:
return {"type": "table", "columns": columns, "data": data}
# Detect "Table: X" header format from sqlite wrapper
if lines[0].startswith("Table:"):
table_name = lines[0].replace("Table:", "").strip()
if len(lines) >= 2 and " | " in lines[1]:
columns = [c.strip() for c in lines[1].split(" | ")]
data = []
@ -54,30 +121,34 @@ class UINode(Node):
return None
async def process(self, thought: ThoughtResult, history: list[dict],
memory_context: str = "") -> list[dict]:
# --- Render controls ---
def _build_controls(self, thought: ThoughtResult) -> list[dict]:
controls = []
# 1. Render actions from Thinker as buttons
for action in thought.actions:
# 1. Parse actions from Thinker (registers bindings)
if thought.actions:
controls.extend(self._parse_thinker_actions(thought.actions))
# 2. Add labels for bound state variables
for var, value in self.state.items():
controls.append({
"type": "button",
"label": action.get("label", "Action"),
"action": action.get("action", "unknown"),
"payload": action.get("payload", {}),
"type": "label",
"id": f"var_{var}",
"text": var,
"value": str(value),
})
# 2. Extract tables from tool output
# 3. Extract tables from tool output
if thought.tool_output:
table = self._extract_table(thought.tool_output)
if table:
controls.append(table)
# 3. Add labels for key tool results (single-value outputs)
# 4. Add label for short tool results (if no table and no state vars)
if thought.tool_used and thought.tool_output and not any(c["type"] == "table" for c in controls):
output = thought.tool_output.strip()
# Short single-line output → label
if "\n" not in output and len(output) < 100:
if "\n" not in output and len(output) < 100 and not self.state:
controls.append({
"type": "label",
"id": "tool_result",
@ -85,14 +156,46 @@ class UINode(Node):
"value": output,
})
return controls
async def process(self, thought: ThoughtResult, history: list[dict],
memory_context: str = "") -> list[dict]:
controls = self._build_controls(thought)
if controls:
self.current_controls = controls
await self.hud("controls", controls=controls)
log.info(f"[ui] emitting {len(controls)} controls")
else:
if self.current_controls:
# Keep previous controls visible
controls = self.current_controls
await self.hud("decided", instruction="no new controls")
return controls
async def process_local_action(self, action: str, payload: dict = None) -> tuple[str | None, list[dict]]:
"""Handle a local action. Returns (result_text, updated_controls) or (None, []) if not local."""
result = self.handle_local_action(action, payload)
if result is None:
return None, []
# Re-render controls with updated state
controls = []
# Re-add existing buttons
for ctrl in self.current_controls:
if ctrl["type"] == "button":
controls.append(ctrl)
# Update labels with current state
for var, value in self.state.items():
controls.append({
"type": "label",
"id": f"var_{var}",
"text": var,
"value": str(value),
})
self.current_controls = controls
await self.hud("controls", controls=controls)
log.info(f"[ui] re-rendered after local action: {action}")
return result, controls

View File

@ -57,6 +57,16 @@ class Runtime:
log.error(f"trace write error: {e}")
self._broadcast(trace_entry)
async def _stream_text(self, text: str):
"""Stream pre-formed text to the client as deltas."""
try:
chunk_size = 12
for i in range(0, len(text), chunk_size):
await self.ws.send_text(json.dumps({"type": "delta", "content": text[i:i + chunk_size]}))
await self.ws.send_text(json.dumps({"type": "done"}))
except Exception:
pass # WS may not be connected (e.g. API-only calls)
async def _run_output_and_ui(self, thought, mem_ctx):
"""Run Output and UI nodes in parallel. Returns (response_text, controls)."""
output_task = asyncio.create_task(
@ -75,16 +85,28 @@ class Runtime:
async def handle_action(self, action: str, data: dict = None):
"""Handle a structured UI action (button click etc.)."""
self.sensor.note_user_activity()
# Try local UI action first (inc, dec, toggle — no LLM needed)
result, controls = await self.ui_node.process_local_action(action, data)
if result is not None:
# Local action handled — send controls update + short response
if controls:
await self.ws.send_text(json.dumps({"type": "controls", "controls": controls}))
await self._stream_text(result)
self.history.append({"role": "user", "content": f"[clicked {action}]"})
self.history.append({"role": "assistant", "content": result})
return
# Complex action — needs Thinker reasoning
action_desc = f"ACTION: {action}"
if data:
action_desc += f" | data: {json.dumps(data)}"
self.history.append({"role": "user", "content": action_desc})
self.sensor.note_user_activity()
sensor_lines = self.sensor.get_context_lines()
mem_ctx = self.memorizer.get_context_block(sensor_lines=sensor_lines)
mem_ctx = self.memorizer.get_context_block(sensor_lines=sensor_lines, ui_state=self.ui_node.state)
# Skip Input — this isn't speech, go straight to Thinker
command = Command(instruction=f"User clicked UI button: {action}", source_text=action_desc)
thought = await self.thinker.process(command, self.history, memory_context=mem_ctx)
@ -97,6 +119,18 @@ class Runtime:
self.history = self.history[-self.MAX_HISTORY:]
async def handle_message(self, text: str):
# Detect ACTION: prefix from API/test runner
if text.startswith("ACTION:"):
parts = text.split("|", 1)
action = parts[0].replace("ACTION:", "").strip()
data = None
if len(parts) > 1:
try:
data = json.loads(parts[1].replace("data:", "").strip())
except (json.JSONDecodeError, Exception):
pass
return await self.handle_action(action, data)
envelope = Envelope(
text=text,
user_id="nico",
@ -108,7 +142,7 @@ class Runtime:
self.history.append({"role": "user", "content": text})
sensor_lines = self.sensor.get_context_lines()
mem_ctx = self.memorizer.get_context_block(sensor_lines=sensor_lines)
mem_ctx = self.memorizer.get_context_block(sensor_lines=sensor_lines, ui_state=self.ui_node.state)
command = await self.input_node.process(
envelope, self.history, memory_context=mem_ctx,

View File

@ -89,9 +89,13 @@ def _parse_command(text: str) -> dict | None:
if text.startswith("send:"):
return {"type": "send", "text": text[5:].strip()}
# action: action_name
# action: action_name OR action: first matching "pattern"
if text.startswith("action:"):
return {"type": "action", "action": text[7:].strip()}
val = text[7:].strip()
m = re.match(r'first matching "(.+)"', val)
if m:
return {"type": "action_match", "pattern": m.group(1)}
return {"type": "action", "action": val}
# expect_response: contains "foo"
if text.startswith("expect_response:"):
@ -142,13 +146,12 @@ class CogClient:
def _fetch_trace(self):
r = self.client.get(f"{API}/trace?last=10", headers=HEADERS)
self.last_trace = r.json().get("lines", [])
# Extract actions from trace
self.last_actions = []
# Extract actions from trace — accumulate, don't replace
for t in self.last_trace:
if t.get("event") == "controls":
for ctrl in t.get("controls", []):
if ctrl.get("type") == "button":
self.last_actions.append(ctrl)
new_actions = [c for c in t.get("controls", []) if c.get("type") == "button"]
if new_actions:
self.last_actions = new_actions
def get_state(self) -> dict:
r = self.client.get(f"{API}/state", headers=HEADERS)
@ -306,6 +309,26 @@ class CogTestRunner:
results.append({"step": step_name, "check": f"action: {cmd['action']}", "status": "FAIL",
"detail": str(e)})
elif cmd["type"] == "action_match":
# Find first action matching pattern in last_actions
pattern = cmd["pattern"].lower()
matched = None
for a in self.client.last_actions:
if pattern in a.get("action", "").lower() or pattern in a.get("label", "").lower():
matched = a["action"]
break
if matched:
try:
self.client.send_action(matched)
results.append({"step": step_name, "check": f"action: {matched}", "status": "PASS",
"detail": f"response: {self.client.last_response[:80]}"})
except Exception as e:
results.append({"step": step_name, "check": f"action: {matched}", "status": "FAIL",
"detail": str(e)})
else:
results.append({"step": step_name, "check": f"action matching '{pattern}'", "status": "FAIL",
"detail": f"no action matching '{pattern}' in {[a.get('action') for a in self.client.last_actions]}"})
elif cmd["type"] == "expect_response":
passed, detail = check_response(self.client.last_response, cmd["check"])
results.append({"step": step_name, "check": f"response: {cmd['check']}",

View File

@ -23,15 +23,15 @@ and that UI handles local actions without round-tripping to Thinker.
- expect_response: contains "0"
### 4. Increment
- action: increment
- action: first matching "inc"
- expect_response: contains "1"
### 5. Increment again
- action: increment
- action: first matching "inc"
- expect_response: contains "2"
### 6. Decrement
- action: decrement
- action: first matching "dec"
- expect_response: contains "1"
### 7. Verify memorizer tracks it

View File

@ -1,6 +1,104 @@
{
"timestamp": "2026-03-28 15:34:02",
"timestamp": "2026-03-28 15:50:12",
"testcases": {
"Counter State": [
{
"step": "Setup",
"check": "clear",
"status": "PASS",
"detail": "cleared"
},
{
"step": "Create counter",
"check": "send: create a counter starting at 0 with incr",
"status": "PASS",
"detail": "response: Sure, here is a counter starting at 0. You can increment or decrement it using t"
},
{
"step": "Create counter",
"check": "response: contains \"counter\" or \"count\"",
"status": "PASS",
"detail": "found 'counter'"
},
{
"step": "Create counter",
"check": "actions: length >= 2",
"status": "PASS",
"detail": "2 actions >= 2"
},
{
"step": "Create counter",
"check": "actions: any action contains \"increment\" or \"inc\"",
"status": "PASS",
"detail": "found 'increment' in actions"
},
{
"step": "Create counter",
"check": "actions: any action contains \"decrement\" or \"dec\"",
"status": "PASS",
"detail": "found 'decrement' in actions"
},
{
"step": "Check state",
"check": "state: topic contains \"counter\" or \"count\" or \"button\"",
"status": "PASS",
"detail": "topic=javascript counter contains 'counter'"
},
{
"step": "Ask for current value",
"check": "send: what is the current count?",
"status": "PASS",
"detail": "response: The current count is 0.\n"
},
{
"step": "Ask for current value",
"check": "response: contains \"0\"",
"status": "PASS",
"detail": "found '0'"
},
{
"step": "Increment",
"check": "action: increment",
"status": "PASS",
"detail": "response: count is now 1"
},
{
"step": "Increment",
"check": "response: contains \"1\"",
"status": "PASS",
"detail": "found '1'"
},
{
"step": "Increment again",
"check": "action: increment",
"status": "PASS",
"detail": "response: count is now 2"
},
{
"step": "Increment again",
"check": "response: contains \"2\"",
"status": "PASS",
"detail": "found '2'"
},
{
"step": "Decrement",
"check": "action: decrement",
"status": "PASS",
"detail": "response: count is now 1"
},
{
"step": "Decrement",
"check": "response: contains \"1\"",
"status": "PASS",
"detail": "found '1'"
},
{
"step": "Verify memorizer tracks it",
"check": "state: topic contains \"count\"",
"status": "PASS",
"detail": "topic=javascript counter contains 'count'"
}
],
"Pub Conversation": [
{
"step": "Setup",
@ -12,31 +110,31 @@
"step": "Set the scene",
"check": "send: Hey, Tina and I are heading to the pub t",
"status": "PASS",
"detail": "response: Das ist toll! Was trinkt ihr beide heute Abend?\n"
"detail": "response: Sounds fun! Enjoy your night at the pub with Tina! What are your plans for the e"
},
{
"step": "Set the scene",
"check": "response: length > 10",
"status": "PASS",
"detail": "length 48 > 10"
"detail": "length 88 > 10"
},
{
"step": "Set the scene",
"check": "state: situation contains \"pub\" or \"Tina\"",
"status": "PASS",
"detail": "situation=at a pub with tina, authenticated on https://cog.l contains 'pub'"
"detail": "situation=at a pub with Tina contains 'pub'"
},
{
"step": "Language switch to German",
"check": "send: Wir sind jetzt im Biergarten angekommen",
"status": "PASS",
"detail": "response: Super, genießt euer Biergarten-Erlebnis! Und was ist mit Tina? Trinkt sie auch e"
"detail": "response: Super! Habt eine schöne Zeit im Biergarten!\n"
},
{
"step": "Language switch to German",
"check": "response: length > 10",
"status": "PASS",
"detail": "length 95 > 10"
"detail": "length 44 > 10"
},
{
"step": "Language switch to German",
@ -48,31 +146,31 @@
"step": "Context awareness",
"check": "send: Was sollen wir bestellen?",
"status": "PASS",
"detail": "response: Kommt drauf an, worauf ihr Lust habt! Im Biergarten sind Klassiker wie **Helles*"
"detail": "response: Hmm, bei dem schönen Wetter würde doch ein kühles Bier oder eine erfrischende Sc"
},
{
"step": "Context awareness",
"check": "response: length > 10",
"status": "PASS",
"detail": "length 255 > 10"
"detail": "length 121 > 10"
},
{
"step": "Context awareness",
"check": "state: topic contains \"bestell\" or \"order\" or \"pub\" or \"Biergarten\"",
"status": "PASS",
"detail": "topic=ordering drinks contains 'order'"
"detail": "topic=being at the Biergarten contains 'Biergarten'"
},
{
"step": "Tina speaks",
"check": "send: Tina says: I'll have a Hefeweizen please",
"status": "PASS",
"detail": "response: Ah, Tina bleibt ihren Vorlieben treu! Eine gute Wahl. Und für dich, Nico? Suchst"
"detail": "response: Tina möchte also ein Hefeweizen. Was möchtest du bestellen, Nico?\n"
},
{
"step": "Tina speaks",
"check": "response: length > 10",
"status": "PASS",
"detail": "length 148 > 10"
"detail": "length 66 > 10"
},
{
"step": "Tina speaks",
@ -84,19 +182,19 @@
"step": "Ask for time (tool use)",
"check": "send: wie spaet ist es eigentlich?",
"status": "PASS",
"detail": "response: Du hast mich noch gar nicht danach gefragt. Es ist kurz vor halb 4. Also perfekt"
"detail": "response: Es ist 15:49 Uhr.\n"
},
{
"step": "Ask for time (tool use)",
"check": "response: matches \\d{1,2}:\\d{2}",
"status": "FAIL",
"detail": "/\\d{1,2}:\\d{2}/ not found in: Du hast mich noch gar nicht danach gefragt. Es ist kurz vor halb 4. Also perfekt, um den Feierabend "
"status": "PASS",
"detail": "matched /\\d{1,2}:\\d{2}/"
},
{
"step": "Back to English",
"check": "send: Let's switch to English, what was the la",
"status": "PASS",
"detail": "response: Okay, switching to English! 😉 The last thing Tina said was: \"I'll have a Hefewei"
"detail": "response: Tina said she wants a Hefeweizen.\n"
},
{
"step": "Back to English",
@ -114,7 +212,7 @@
"step": "Mood check",
"check": "send: This is really fun!",
"status": "PASS",
"detail": "response: Indeed! Glad you're having fun. It's always a pleasure chatting with you, Nico. "
"detail": "response: I'm glad you're enjoying our conversation, Nico! It's fun for me too. What other"
},
{
"step": "Mood check",
@ -125,7 +223,7 @@
]
},
"summary": {
"passed": 19,
"failed": 1
"passed": 36,
"failed": 0
}
}