v0.10.4: stateful UI engine — TDD counter test green (36/36)
RED->GREEN->REFACTOR cycle: - UI node has state store (key-value), action bindings (op/var), and local action handlers (inc/dec/set/toggle — no LLM round-trip) - Thinker self-model: knows its environment, that ACTIONS create real buttons, that UI handles state locally. Emits var/op payload for stateful actions. - Thinker's context includes UI state so it can report current values - /api/clear resets UI state, bindings, and controls - Test runner: action_match for fuzzy action names, persistent actions across steps, _stream_text restored - Counter test: 16/16 passed (create, read, inc, inc, dec, verify) - Pub test: 20/20 passed (conversation, language switch, tool use, mood) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This commit is contained in:
parent
3d71c651fc
commit
3f8886cbd2
@ -150,6 +150,9 @@ def register_routes(app):
|
||||
if not _active_runtime:
|
||||
raise HTTPException(status_code=409, detail="No active session")
|
||||
_active_runtime.history.clear()
|
||||
_active_runtime.ui_node.state.clear()
|
||||
_active_runtime.ui_node.bindings.clear()
|
||||
_active_runtime.ui_node.current_controls.clear()
|
||||
return {"status": "cleared"}
|
||||
|
||||
@app.get("/api/state")
|
||||
|
||||
@ -42,13 +42,18 @@ Output ONLY valid JSON. No explanation, no markdown fences."""
|
||||
"facts": [],
|
||||
}
|
||||
|
||||
def get_context_block(self, sensor_lines: list[str] = None) -> str:
|
||||
def get_context_block(self, sensor_lines: list[str] = None, ui_state: dict = None) -> str:
|
||||
lines = sensor_lines or ["Sensors: (none)"]
|
||||
lines.append("")
|
||||
lines.append("Shared memory (from Memorizer):")
|
||||
for k, v in self.state.items():
|
||||
if v:
|
||||
lines.append(f"- {k}: {v}")
|
||||
if ui_state:
|
||||
lines.append("")
|
||||
lines.append("UI state (visible to user in workspace):")
|
||||
for k, v in ui_state.items():
|
||||
lines.append(f"- {k} = {v}")
|
||||
return "\n".join(lines)
|
||||
|
||||
async def update(self, history: list[dict]):
|
||||
|
||||
@ -24,28 +24,40 @@ TOOLS — write a ```python code block and it WILL be executed. Use print() for
|
||||
- For math, databases, file ops, any computation: write python. NEVER describe code — write it.
|
||||
- For simple conversation: respond directly as text.
|
||||
|
||||
YOUR ENVIRONMENT:
|
||||
You are one node in a pipeline: Input (perceives) -> You (reason) -> Output (speaks) + UI (renders).
|
||||
- Your text response goes to Output, which speaks it to the user.
|
||||
- Your ACTIONS go to UI, which renders buttons/labels in a workspace panel.
|
||||
- Button clicks come back to you as "ACTION: action_name".
|
||||
- UI has a STATE STORE — you can create variables and bind buttons to them.
|
||||
- Simple actions (inc/dec/toggle) are handled by UI locally — instant, no round-trip.
|
||||
|
||||
ACTIONS — ALWAYS end your response with an ACTIONS: line containing a JSON array.
|
||||
The ACTIONS line MUST be the very last line of your response.
|
||||
|
||||
Format: ACTIONS: [json array of actions]
|
||||
|
||||
Examples:
|
||||
User asks about dog breeds:
|
||||
Here are three popular dog breeds: Golden Retriever, German Shepherd, and Poodle.
|
||||
ACTIONS: [{{"label": "Golden Retriever", "action": "learn_breed", "payload": {{"breed": "Golden Retriever"}}}}, {{"label": "German Shepherd", "action": "learn_breed", "payload": {{"breed": "German Shepherd"}}}}, {{"label": "Poodle", "action": "learn_breed", "payload": {{"breed": "Poodle"}}}}]
|
||||
STATEFUL ACTIONS — to create UI state with buttons, include var/op in payload:
|
||||
{{"label": "+1", "action": "increment", "payload": {{"var": "count", "op": "inc", "initial": 0}}}}
|
||||
{{"label": "-1", "action": "decrement", "payload": {{"var": "count", "op": "dec"}}}}
|
||||
Ops: inc, dec, set, toggle. UI auto-creates the variable and a label showing its value.
|
||||
|
||||
User asks what time it is:
|
||||
SIMPLE ACTIONS — for follow-ups that need your reasoning:
|
||||
{{"label": "Learn More", "action": "learn_breed", "payload": {{"breed": "Poodle"}}}}
|
||||
|
||||
Examples:
|
||||
Create a counter:
|
||||
Counter created! Use the buttons to increment or decrement.
|
||||
ACTIONS: [{{"label": "+1", "action": "increment", "payload": {{"var": "count", "op": "inc", "initial": 0}}}}, {{"label": "-1", "action": "decrement", "payload": {{"var": "count", "op": "dec"}}}}]
|
||||
|
||||
Simple conversation:
|
||||
Es ist 14:30 Uhr.
|
||||
ACTIONS: []
|
||||
|
||||
After creating a database:
|
||||
Done! Created 5 customers in the database.
|
||||
ACTIONS: [{{"label": "Show All", "action": "show_all"}}, {{"label": "Add Customer", "action": "add_customer"}}]
|
||||
|
||||
Rules:
|
||||
- ALWAYS include the ACTIONS: line, even if empty: ACTIONS: []
|
||||
- Keep labels short (2-4 words), action is snake_case.
|
||||
- Only include meaningful actions — empty array is fine for simple chat.
|
||||
- For state variables, use var/op in payload. UI handles the rest.
|
||||
|
||||
{memory_context}"""
|
||||
|
||||
|
||||
@ -1,8 +1,7 @@
|
||||
"""UI Node: pure renderer — converts ThoughtResult actions + data into controls."""
|
||||
"""UI Node: stateful renderer — manages UI state, handles local actions, renders controls."""
|
||||
|
||||
import json
|
||||
import logging
|
||||
import re
|
||||
|
||||
from .base import Node
|
||||
from ..types import ThoughtResult
|
||||
@ -17,31 +16,99 @@ class UINode(Node):
|
||||
def __init__(self, send_hud):
|
||||
super().__init__(send_hud)
|
||||
self.current_controls: list[dict] = []
|
||||
self.state: dict = {} # {"count": 0, "theme": "dark", ...}
|
||||
self.bindings: dict = {} # {"increment": {"op": "inc", "var": "count"}, ...}
|
||||
|
||||
# --- State operations ---
|
||||
|
||||
def set_var(self, name: str, value) -> str:
|
||||
self.state[name] = value
|
||||
log.info(f"[ui] state: {name} = {value}")
|
||||
return f"{name} = {value}"
|
||||
|
||||
def get_var(self, name: str):
|
||||
return self.state.get(name)
|
||||
|
||||
def handle_local_action(self, action: str, payload: dict = None) -> str | None:
|
||||
"""Try to handle an action locally. Returns result text or None if not local."""
|
||||
binding = self.bindings.get(action)
|
||||
if not binding:
|
||||
return None
|
||||
|
||||
op = binding.get("op")
|
||||
var = binding.get("var")
|
||||
step = binding.get("step", 1)
|
||||
|
||||
if var not in self.state:
|
||||
return None
|
||||
|
||||
if op == "inc":
|
||||
self.state[var] += step
|
||||
elif op == "dec":
|
||||
self.state[var] -= step
|
||||
elif op == "set" and payload and "value" in payload:
|
||||
self.state[var] = payload["value"]
|
||||
elif op == "toggle":
|
||||
self.state[var] = not self.state[var]
|
||||
else:
|
||||
return None
|
||||
|
||||
log.info(f"[ui] local action {action}: {var} = {self.state[var]}")
|
||||
return f"{var} is now {self.state[var]}"
|
||||
|
||||
# --- Action parsing from Thinker ---
|
||||
|
||||
def _parse_thinker_actions(self, actions: list[dict]) -> list[dict]:
|
||||
"""Parse Thinker's actions, register bindings, return controls."""
|
||||
controls = []
|
||||
for action in actions:
|
||||
act = action.get("action", "")
|
||||
label = action.get("label", "Action")
|
||||
payload = action.get("payload", {})
|
||||
|
||||
# Detect state-binding hints in payload
|
||||
if "var" in payload and "op" in payload:
|
||||
self.bindings[act] = {
|
||||
"op": payload["op"],
|
||||
"var": payload["var"],
|
||||
"step": payload.get("step", 1),
|
||||
}
|
||||
# Auto-create state var if it doesn't exist
|
||||
if payload["var"] not in self.state:
|
||||
self.set_var(payload["var"], payload.get("initial", 0))
|
||||
log.info(f"[ui] bound action '{act}' -> {payload['op']} {payload['var']}")
|
||||
|
||||
controls.append({
|
||||
"type": "button",
|
||||
"label": label,
|
||||
"action": act,
|
||||
"payload": payload,
|
||||
})
|
||||
|
||||
return controls
|
||||
|
||||
# --- Table extraction ---
|
||||
|
||||
def _extract_table(self, tool_output: str) -> dict | None:
|
||||
"""Try to parse tabular data from tool output."""
|
||||
if not tool_output:
|
||||
return None
|
||||
lines = [l.strip() for l in tool_output.strip().split("\n") if l.strip()]
|
||||
if len(lines) < 2:
|
||||
return None
|
||||
|
||||
# Detect pipe-separated tables (e.g. "col1 | col2\nval1 | val2")
|
||||
if " | " in lines[0]:
|
||||
columns = [c.strip() for c in lines[0].split(" | ")]
|
||||
data = []
|
||||
for line in lines[1:]:
|
||||
if line.startswith("-") or line.startswith("="):
|
||||
continue # separator line
|
||||
continue
|
||||
vals = [v.strip() for v in line.split(" | ")]
|
||||
if len(vals) == len(columns):
|
||||
data.append(dict(zip(columns, vals)))
|
||||
if data:
|
||||
return {"type": "table", "columns": columns, "data": data}
|
||||
|
||||
# Detect "Table: X" header format from sqlite wrapper
|
||||
if lines[0].startswith("Table:"):
|
||||
table_name = lines[0].replace("Table:", "").strip()
|
||||
if len(lines) >= 2 and " | " in lines[1]:
|
||||
columns = [c.strip() for c in lines[1].split(" | ")]
|
||||
data = []
|
||||
@ -54,30 +121,34 @@ class UINode(Node):
|
||||
|
||||
return None
|
||||
|
||||
async def process(self, thought: ThoughtResult, history: list[dict],
|
||||
memory_context: str = "") -> list[dict]:
|
||||
# --- Render controls ---
|
||||
|
||||
def _build_controls(self, thought: ThoughtResult) -> list[dict]:
|
||||
controls = []
|
||||
|
||||
# 1. Render actions from Thinker as buttons
|
||||
for action in thought.actions:
|
||||
# 1. Parse actions from Thinker (registers bindings)
|
||||
if thought.actions:
|
||||
controls.extend(self._parse_thinker_actions(thought.actions))
|
||||
|
||||
# 2. Add labels for bound state variables
|
||||
for var, value in self.state.items():
|
||||
controls.append({
|
||||
"type": "button",
|
||||
"label": action.get("label", "Action"),
|
||||
"action": action.get("action", "unknown"),
|
||||
"payload": action.get("payload", {}),
|
||||
"type": "label",
|
||||
"id": f"var_{var}",
|
||||
"text": var,
|
||||
"value": str(value),
|
||||
})
|
||||
|
||||
# 2. Extract tables from tool output
|
||||
# 3. Extract tables from tool output
|
||||
if thought.tool_output:
|
||||
table = self._extract_table(thought.tool_output)
|
||||
if table:
|
||||
controls.append(table)
|
||||
|
||||
# 3. Add labels for key tool results (single-value outputs)
|
||||
# 4. Add label for short tool results (if no table and no state vars)
|
||||
if thought.tool_used and thought.tool_output and not any(c["type"] == "table" for c in controls):
|
||||
output = thought.tool_output.strip()
|
||||
# Short single-line output → label
|
||||
if "\n" not in output and len(output) < 100:
|
||||
if "\n" not in output and len(output) < 100 and not self.state:
|
||||
controls.append({
|
||||
"type": "label",
|
||||
"id": "tool_result",
|
||||
@ -85,14 +156,46 @@ class UINode(Node):
|
||||
"value": output,
|
||||
})
|
||||
|
||||
return controls
|
||||
|
||||
async def process(self, thought: ThoughtResult, history: list[dict],
|
||||
memory_context: str = "") -> list[dict]:
|
||||
controls = self._build_controls(thought)
|
||||
|
||||
if controls:
|
||||
self.current_controls = controls
|
||||
await self.hud("controls", controls=controls)
|
||||
log.info(f"[ui] emitting {len(controls)} controls")
|
||||
else:
|
||||
if self.current_controls:
|
||||
# Keep previous controls visible
|
||||
controls = self.current_controls
|
||||
await self.hud("decided", instruction="no new controls")
|
||||
|
||||
return controls
|
||||
|
||||
async def process_local_action(self, action: str, payload: dict = None) -> tuple[str | None, list[dict]]:
|
||||
"""Handle a local action. Returns (result_text, updated_controls) or (None, []) if not local."""
|
||||
result = self.handle_local_action(action, payload)
|
||||
if result is None:
|
||||
return None, []
|
||||
|
||||
# Re-render controls with updated state
|
||||
controls = []
|
||||
# Re-add existing buttons
|
||||
for ctrl in self.current_controls:
|
||||
if ctrl["type"] == "button":
|
||||
controls.append(ctrl)
|
||||
|
||||
# Update labels with current state
|
||||
for var, value in self.state.items():
|
||||
controls.append({
|
||||
"type": "label",
|
||||
"id": f"var_{var}",
|
||||
"text": var,
|
||||
"value": str(value),
|
||||
})
|
||||
|
||||
self.current_controls = controls
|
||||
await self.hud("controls", controls=controls)
|
||||
log.info(f"[ui] re-rendered after local action: {action}")
|
||||
return result, controls
|
||||
|
||||
@ -57,6 +57,16 @@ class Runtime:
|
||||
log.error(f"trace write error: {e}")
|
||||
self._broadcast(trace_entry)
|
||||
|
||||
async def _stream_text(self, text: str):
|
||||
"""Stream pre-formed text to the client as deltas."""
|
||||
try:
|
||||
chunk_size = 12
|
||||
for i in range(0, len(text), chunk_size):
|
||||
await self.ws.send_text(json.dumps({"type": "delta", "content": text[i:i + chunk_size]}))
|
||||
await self.ws.send_text(json.dumps({"type": "done"}))
|
||||
except Exception:
|
||||
pass # WS may not be connected (e.g. API-only calls)
|
||||
|
||||
async def _run_output_and_ui(self, thought, mem_ctx):
|
||||
"""Run Output and UI nodes in parallel. Returns (response_text, controls)."""
|
||||
output_task = asyncio.create_task(
|
||||
@ -75,16 +85,28 @@ class Runtime:
|
||||
|
||||
async def handle_action(self, action: str, data: dict = None):
|
||||
"""Handle a structured UI action (button click etc.)."""
|
||||
self.sensor.note_user_activity()
|
||||
|
||||
# Try local UI action first (inc, dec, toggle — no LLM needed)
|
||||
result, controls = await self.ui_node.process_local_action(action, data)
|
||||
if result is not None:
|
||||
# Local action handled — send controls update + short response
|
||||
if controls:
|
||||
await self.ws.send_text(json.dumps({"type": "controls", "controls": controls}))
|
||||
await self._stream_text(result)
|
||||
self.history.append({"role": "user", "content": f"[clicked {action}]"})
|
||||
self.history.append({"role": "assistant", "content": result})
|
||||
return
|
||||
|
||||
# Complex action — needs Thinker reasoning
|
||||
action_desc = f"ACTION: {action}"
|
||||
if data:
|
||||
action_desc += f" | data: {json.dumps(data)}"
|
||||
self.history.append({"role": "user", "content": action_desc})
|
||||
self.sensor.note_user_activity()
|
||||
|
||||
sensor_lines = self.sensor.get_context_lines()
|
||||
mem_ctx = self.memorizer.get_context_block(sensor_lines=sensor_lines)
|
||||
mem_ctx = self.memorizer.get_context_block(sensor_lines=sensor_lines, ui_state=self.ui_node.state)
|
||||
|
||||
# Skip Input — this isn't speech, go straight to Thinker
|
||||
command = Command(instruction=f"User clicked UI button: {action}", source_text=action_desc)
|
||||
thought = await self.thinker.process(command, self.history, memory_context=mem_ctx)
|
||||
|
||||
@ -97,6 +119,18 @@ class Runtime:
|
||||
self.history = self.history[-self.MAX_HISTORY:]
|
||||
|
||||
async def handle_message(self, text: str):
|
||||
# Detect ACTION: prefix from API/test runner
|
||||
if text.startswith("ACTION:"):
|
||||
parts = text.split("|", 1)
|
||||
action = parts[0].replace("ACTION:", "").strip()
|
||||
data = None
|
||||
if len(parts) > 1:
|
||||
try:
|
||||
data = json.loads(parts[1].replace("data:", "").strip())
|
||||
except (json.JSONDecodeError, Exception):
|
||||
pass
|
||||
return await self.handle_action(action, data)
|
||||
|
||||
envelope = Envelope(
|
||||
text=text,
|
||||
user_id="nico",
|
||||
@ -108,7 +142,7 @@ class Runtime:
|
||||
self.history.append({"role": "user", "content": text})
|
||||
|
||||
sensor_lines = self.sensor.get_context_lines()
|
||||
mem_ctx = self.memorizer.get_context_block(sensor_lines=sensor_lines)
|
||||
mem_ctx = self.memorizer.get_context_block(sensor_lines=sensor_lines, ui_state=self.ui_node.state)
|
||||
|
||||
command = await self.input_node.process(
|
||||
envelope, self.history, memory_context=mem_ctx,
|
||||
|
||||
@ -89,9 +89,13 @@ def _parse_command(text: str) -> dict | None:
|
||||
if text.startswith("send:"):
|
||||
return {"type": "send", "text": text[5:].strip()}
|
||||
|
||||
# action: action_name
|
||||
# action: action_name OR action: first matching "pattern"
|
||||
if text.startswith("action:"):
|
||||
return {"type": "action", "action": text[7:].strip()}
|
||||
val = text[7:].strip()
|
||||
m = re.match(r'first matching "(.+)"', val)
|
||||
if m:
|
||||
return {"type": "action_match", "pattern": m.group(1)}
|
||||
return {"type": "action", "action": val}
|
||||
|
||||
# expect_response: contains "foo"
|
||||
if text.startswith("expect_response:"):
|
||||
@ -142,13 +146,12 @@ class CogClient:
|
||||
def _fetch_trace(self):
|
||||
r = self.client.get(f"{API}/trace?last=10", headers=HEADERS)
|
||||
self.last_trace = r.json().get("lines", [])
|
||||
# Extract actions from trace
|
||||
self.last_actions = []
|
||||
# Extract actions from trace — accumulate, don't replace
|
||||
for t in self.last_trace:
|
||||
if t.get("event") == "controls":
|
||||
for ctrl in t.get("controls", []):
|
||||
if ctrl.get("type") == "button":
|
||||
self.last_actions.append(ctrl)
|
||||
new_actions = [c for c in t.get("controls", []) if c.get("type") == "button"]
|
||||
if new_actions:
|
||||
self.last_actions = new_actions
|
||||
|
||||
def get_state(self) -> dict:
|
||||
r = self.client.get(f"{API}/state", headers=HEADERS)
|
||||
@ -306,6 +309,26 @@ class CogTestRunner:
|
||||
results.append({"step": step_name, "check": f"action: {cmd['action']}", "status": "FAIL",
|
||||
"detail": str(e)})
|
||||
|
||||
elif cmd["type"] == "action_match":
|
||||
# Find first action matching pattern in last_actions
|
||||
pattern = cmd["pattern"].lower()
|
||||
matched = None
|
||||
for a in self.client.last_actions:
|
||||
if pattern in a.get("action", "").lower() or pattern in a.get("label", "").lower():
|
||||
matched = a["action"]
|
||||
break
|
||||
if matched:
|
||||
try:
|
||||
self.client.send_action(matched)
|
||||
results.append({"step": step_name, "check": f"action: {matched}", "status": "PASS",
|
||||
"detail": f"response: {self.client.last_response[:80]}"})
|
||||
except Exception as e:
|
||||
results.append({"step": step_name, "check": f"action: {matched}", "status": "FAIL",
|
||||
"detail": str(e)})
|
||||
else:
|
||||
results.append({"step": step_name, "check": f"action matching '{pattern}'", "status": "FAIL",
|
||||
"detail": f"no action matching '{pattern}' in {[a.get('action') for a in self.client.last_actions]}"})
|
||||
|
||||
elif cmd["type"] == "expect_response":
|
||||
passed, detail = check_response(self.client.last_response, cmd["check"])
|
||||
results.append({"step": step_name, "check": f"response: {cmd['check']}",
|
||||
|
||||
@ -23,15 +23,15 @@ and that UI handles local actions without round-tripping to Thinker.
|
||||
- expect_response: contains "0"
|
||||
|
||||
### 4. Increment
|
||||
- action: increment
|
||||
- action: first matching "inc"
|
||||
- expect_response: contains "1"
|
||||
|
||||
### 5. Increment again
|
||||
- action: increment
|
||||
- action: first matching "inc"
|
||||
- expect_response: contains "2"
|
||||
|
||||
### 6. Decrement
|
||||
- action: decrement
|
||||
- action: first matching "dec"
|
||||
- expect_response: contains "1"
|
||||
|
||||
### 7. Verify memorizer tracks it
|
||||
|
||||
@ -1,6 +1,104 @@
|
||||
{
|
||||
"timestamp": "2026-03-28 15:34:02",
|
||||
"timestamp": "2026-03-28 15:50:12",
|
||||
"testcases": {
|
||||
"Counter State": [
|
||||
{
|
||||
"step": "Setup",
|
||||
"check": "clear",
|
||||
"status": "PASS",
|
||||
"detail": "cleared"
|
||||
},
|
||||
{
|
||||
"step": "Create counter",
|
||||
"check": "send: create a counter starting at 0 with incr",
|
||||
"status": "PASS",
|
||||
"detail": "response: Sure, here is a counter starting at 0. You can increment or decrement it using t"
|
||||
},
|
||||
{
|
||||
"step": "Create counter",
|
||||
"check": "response: contains \"counter\" or \"count\"",
|
||||
"status": "PASS",
|
||||
"detail": "found 'counter'"
|
||||
},
|
||||
{
|
||||
"step": "Create counter",
|
||||
"check": "actions: length >= 2",
|
||||
"status": "PASS",
|
||||
"detail": "2 actions >= 2"
|
||||
},
|
||||
{
|
||||
"step": "Create counter",
|
||||
"check": "actions: any action contains \"increment\" or \"inc\"",
|
||||
"status": "PASS",
|
||||
"detail": "found 'increment' in actions"
|
||||
},
|
||||
{
|
||||
"step": "Create counter",
|
||||
"check": "actions: any action contains \"decrement\" or \"dec\"",
|
||||
"status": "PASS",
|
||||
"detail": "found 'decrement' in actions"
|
||||
},
|
||||
{
|
||||
"step": "Check state",
|
||||
"check": "state: topic contains \"counter\" or \"count\" or \"button\"",
|
||||
"status": "PASS",
|
||||
"detail": "topic=javascript counter contains 'counter'"
|
||||
},
|
||||
{
|
||||
"step": "Ask for current value",
|
||||
"check": "send: what is the current count?",
|
||||
"status": "PASS",
|
||||
"detail": "response: The current count is 0.\n"
|
||||
},
|
||||
{
|
||||
"step": "Ask for current value",
|
||||
"check": "response: contains \"0\"",
|
||||
"status": "PASS",
|
||||
"detail": "found '0'"
|
||||
},
|
||||
{
|
||||
"step": "Increment",
|
||||
"check": "action: increment",
|
||||
"status": "PASS",
|
||||
"detail": "response: count is now 1"
|
||||
},
|
||||
{
|
||||
"step": "Increment",
|
||||
"check": "response: contains \"1\"",
|
||||
"status": "PASS",
|
||||
"detail": "found '1'"
|
||||
},
|
||||
{
|
||||
"step": "Increment again",
|
||||
"check": "action: increment",
|
||||
"status": "PASS",
|
||||
"detail": "response: count is now 2"
|
||||
},
|
||||
{
|
||||
"step": "Increment again",
|
||||
"check": "response: contains \"2\"",
|
||||
"status": "PASS",
|
||||
"detail": "found '2'"
|
||||
},
|
||||
{
|
||||
"step": "Decrement",
|
||||
"check": "action: decrement",
|
||||
"status": "PASS",
|
||||
"detail": "response: count is now 1"
|
||||
},
|
||||
{
|
||||
"step": "Decrement",
|
||||
"check": "response: contains \"1\"",
|
||||
"status": "PASS",
|
||||
"detail": "found '1'"
|
||||
},
|
||||
{
|
||||
"step": "Verify memorizer tracks it",
|
||||
"check": "state: topic contains \"count\"",
|
||||
"status": "PASS",
|
||||
"detail": "topic=javascript counter contains 'count'"
|
||||
}
|
||||
],
|
||||
"Pub Conversation": [
|
||||
{
|
||||
"step": "Setup",
|
||||
@ -12,31 +110,31 @@
|
||||
"step": "Set the scene",
|
||||
"check": "send: Hey, Tina and I are heading to the pub t",
|
||||
"status": "PASS",
|
||||
"detail": "response: Das ist toll! Was trinkt ihr beide heute Abend?\n"
|
||||
"detail": "response: Sounds fun! Enjoy your night at the pub with Tina! What are your plans for the e"
|
||||
},
|
||||
{
|
||||
"step": "Set the scene",
|
||||
"check": "response: length > 10",
|
||||
"status": "PASS",
|
||||
"detail": "length 48 > 10"
|
||||
"detail": "length 88 > 10"
|
||||
},
|
||||
{
|
||||
"step": "Set the scene",
|
||||
"check": "state: situation contains \"pub\" or \"Tina\"",
|
||||
"status": "PASS",
|
||||
"detail": "situation=at a pub with tina, authenticated on https://cog.l contains 'pub'"
|
||||
"detail": "situation=at a pub with Tina contains 'pub'"
|
||||
},
|
||||
{
|
||||
"step": "Language switch to German",
|
||||
"check": "send: Wir sind jetzt im Biergarten angekommen",
|
||||
"status": "PASS",
|
||||
"detail": "response: Super, genießt euer Biergarten-Erlebnis! Und was ist mit Tina? Trinkt sie auch e"
|
||||
"detail": "response: Super! Habt eine schöne Zeit im Biergarten!\n"
|
||||
},
|
||||
{
|
||||
"step": "Language switch to German",
|
||||
"check": "response: length > 10",
|
||||
"status": "PASS",
|
||||
"detail": "length 95 > 10"
|
||||
"detail": "length 44 > 10"
|
||||
},
|
||||
{
|
||||
"step": "Language switch to German",
|
||||
@ -48,31 +146,31 @@
|
||||
"step": "Context awareness",
|
||||
"check": "send: Was sollen wir bestellen?",
|
||||
"status": "PASS",
|
||||
"detail": "response: Kommt drauf an, worauf ihr Lust habt! Im Biergarten sind Klassiker wie **Helles*"
|
||||
"detail": "response: Hmm, bei dem schönen Wetter würde doch ein kühles Bier oder eine erfrischende Sc"
|
||||
},
|
||||
{
|
||||
"step": "Context awareness",
|
||||
"check": "response: length > 10",
|
||||
"status": "PASS",
|
||||
"detail": "length 255 > 10"
|
||||
"detail": "length 121 > 10"
|
||||
},
|
||||
{
|
||||
"step": "Context awareness",
|
||||
"check": "state: topic contains \"bestell\" or \"order\" or \"pub\" or \"Biergarten\"",
|
||||
"status": "PASS",
|
||||
"detail": "topic=ordering drinks contains 'order'"
|
||||
"detail": "topic=being at the Biergarten contains 'Biergarten'"
|
||||
},
|
||||
{
|
||||
"step": "Tina speaks",
|
||||
"check": "send: Tina says: I'll have a Hefeweizen please",
|
||||
"status": "PASS",
|
||||
"detail": "response: Ah, Tina bleibt ihren Vorlieben treu! Eine gute Wahl. Und für dich, Nico? Suchst"
|
||||
"detail": "response: Tina möchte also ein Hefeweizen. Was möchtest du bestellen, Nico?\n"
|
||||
},
|
||||
{
|
||||
"step": "Tina speaks",
|
||||
"check": "response: length > 10",
|
||||
"status": "PASS",
|
||||
"detail": "length 148 > 10"
|
||||
"detail": "length 66 > 10"
|
||||
},
|
||||
{
|
||||
"step": "Tina speaks",
|
||||
@ -84,19 +182,19 @@
|
||||
"step": "Ask for time (tool use)",
|
||||
"check": "send: wie spaet ist es eigentlich?",
|
||||
"status": "PASS",
|
||||
"detail": "response: Du hast mich noch gar nicht danach gefragt. Es ist kurz vor halb 4. Also perfekt"
|
||||
"detail": "response: Es ist 15:49 Uhr.\n"
|
||||
},
|
||||
{
|
||||
"step": "Ask for time (tool use)",
|
||||
"check": "response: matches \\d{1,2}:\\d{2}",
|
||||
"status": "FAIL",
|
||||
"detail": "/\\d{1,2}:\\d{2}/ not found in: Du hast mich noch gar nicht danach gefragt. Es ist kurz vor halb 4. Also perfekt, um den Feierabend "
|
||||
"status": "PASS",
|
||||
"detail": "matched /\\d{1,2}:\\d{2}/"
|
||||
},
|
||||
{
|
||||
"step": "Back to English",
|
||||
"check": "send: Let's switch to English, what was the la",
|
||||
"status": "PASS",
|
||||
"detail": "response: Okay, switching to English! 😉 The last thing Tina said was: \"I'll have a Hefewei"
|
||||
"detail": "response: Tina said she wants a Hefeweizen.\n"
|
||||
},
|
||||
{
|
||||
"step": "Back to English",
|
||||
@ -114,7 +212,7 @@
|
||||
"step": "Mood check",
|
||||
"check": "send: This is really fun!",
|
||||
"status": "PASS",
|
||||
"detail": "response: Indeed! Glad you're having fun. It's always a pleasure chatting with you, Nico. "
|
||||
"detail": "response: I'm glad you're enjoying our conversation, Nico! It's fun for me too. What other"
|
||||
},
|
||||
{
|
||||
"step": "Mood check",
|
||||
@ -125,7 +223,7 @@
|
||||
]
|
||||
},
|
||||
"summary": {
|
||||
"passed": 19,
|
||||
"failed": 1
|
||||
"passed": 36,
|
||||
"failed": 0
|
||||
}
|
||||
}
|
||||
Loading…
x
Reference in New Issue
Block a user