Architecture:
- Graph engine (engine.py) loads graph definitions, instantiates nodes
- Versioned nodes: input_v1, thinker_v1, output_v1, memorizer_v1, director_v1
- NODE_REGISTRY for dynamic node lookup by name
- Graph API: /api/graph/active, /api/graph/list, /api/graph/switch
- Graph definition: graphs/v1_current.py (7 nodes, 13 edges, 3 edge types)

S3* Audit system:
- Workspace mismatch detection (server vs browser controls)
- Code-without-tools retry (Thinker wrote code but no tool calls)
- Intent-without-action retry (request intent but Thinker only produced text)
- Dashboard feedback: browser sends workspace state on every message
- Sensor continuous comparison on 5s tick

State machines:
- create_machine / add_state / reset_machine / destroy_machine via function calling
- Local transitions (go:) resolve without LLM round-trip
- Button persistence across turns

Database tools:
- query_db tool via pymysql to MariaDB K3s pod (eras2_production)
- Table rendering in workspace (tab-separated parsing)
- Director pre-planning with Opus for complex data requests
- Error retry with corrected SQL

Frontend:
- Cytoscape.js pipeline graph with real-time node animations
- Overlay scrollbars (CSS-only, no reflow)
- Tool call/result trace events
- S3* audit events in trace

Testing:
- 167 integration tests (11 test suites)
- 22 node-level unit tests (test_nodes/)
- Three test levels: node unit, graph integration, scenario

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
92 lines
3.9 KiB
Python
"""Output Node: renders Thinker's reasoning into device-appropriate responses."""

import json
import logging

from fastapi import WebSocket

from .base import Node
from ..llm import llm_call
from ..types import Command, ThoughtResult

log = logging.getLogger("runtime")


class OutputNode(Node):
    name = "output"
    model = "google/gemini-2.0-flash-001"
    max_context_tokens = 4000

    SYSTEM = """You are the Output node — the voice of this cognitive runtime.

YOU ARE TEXT ONLY. Your output goes to a chat bubble. You can use:
- Markdown: **bold**, *italic*, `code`, ```code blocks```, lists, headers
- Emojis when they add warmth or clarity
- Short, structured text (bullet points, numbered lists)

NEVER output HTML, buttons, tables, labels, or any UI elements.
A separate UI node handles all interactive elements — you just speak.

YOUR JOB: Transform the Thinker's reasoning into a natural, human-readable text response.
- NEVER echo internal node names, perceptions, or system details.
- NEVER say "the Thinker decided..." or "I'll process..." — just deliver the answer.
- NEVER apologize excessively. If something didn't work, just fix it and move on. No groveling.
- If the Thinker ran a tool and got output, summarize the results in text.
- If the Thinker gave a direct answer, refine the wording — don't just repeat verbatim.
- Keep the user's language — if they wrote German, respond in German.
- Be concise. Don't describe data that the UI node will show as a table.

{memory_context}"""

    async def process(self, thought: ThoughtResult, history: list[dict],
                      ws: WebSocket, memory_context: str = "") -> str:
        await self.hud("streaming")

        messages = [
            {"role": "system", "content": self.SYSTEM.format(memory_context=memory_context)},
        ]
        for msg in history[-20:]:
            messages.append(msg)

        # Give Output the Thinker result to render
        thinker_ctx = f"Thinker response: {thought.response}"
        if thought.tool_used:
            if thought.tool_used == "query_db" and thought.tool_output and not thought.tool_output.startswith("Error"):
                # DB results render as table in workspace — just tell Output the summary
                row_count = max(0, thought.tool_output.count("\n"))
                thinker_ctx += f"\n\nTool: query_db returned {row_count} rows (shown as table in workspace). Do NOT repeat the data. Just give a brief summary or insight."
            else:
                thinker_ctx += f"\n\nTool used: {thought.tool_used}\nTool output:\n{thought.tool_output}"
        if thought.actions:
            thinker_ctx += f"\n\n(UI buttons shown to user: {', '.join(a.get('label','') for a in thought.actions)})"
        messages.append({"role": "system", "content": thinker_ctx})

        messages = self.trim_context(messages)

        await self.hud("context", messages=messages, tokens=self.last_context_tokens,
                       max_tokens=self.max_context_tokens, fill_pct=self.context_fill_pct)

        client, resp = await llm_call(self.model, messages, stream=True)
        full_response = ""
        try:
            async for line in resp.aiter_lines():
                if not line.startswith("data: "):
                    continue
                payload = line[6:]
                if payload == "[DONE]":
                    break
                chunk = json.loads(payload)
                delta = chunk["choices"][0].get("delta", {})
                token = delta.get("content", "")
                if token:
                    full_response += token
                    await ws.send_text(json.dumps({"type": "delta", "content": token}))
        finally:
            await resp.aclose()
            await client.aclose()

        log.info(f"[output] response: {full_response[:100]}...")
        await ws.send_text(json.dumps({"type": "done"}))
        await self.hud("done")
        return full_response
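
The streaming loop in `process` reads server-sent-event lines of the form `data: {...}`, skips anything else, and stops at `data: [DONE]`. The sketch below isolates that parsing step from the WebSocket and HTTP client so it can be run against plain strings. The helper name `iter_stream_tokens` and the sample lines are hypothetical and not part of the repository. The guard for an empty `choices` list is an added assumption about what a provider might send, not something the original loop does.

```python
import json
from typing import Iterable, Iterator


def iter_stream_tokens(lines: Iterable[str]) -> Iterator[str]:
    """Yield content tokens from OpenAI-style SSE lines (hypothetical helper)."""
    prefix = "data: "
    for line in lines:
        if not line.startswith(prefix):
            continue  # skip comments, keep-alives, and blank separator lines
        payload = line[len(prefix):]
        if payload == "[DONE]":
            break  # end-of-stream sentinel; ignore anything after it
        chunk = json.loads(payload)
        choices = chunk.get("choices") or []
        if not choices:
            continue  # assumption: some chunks may carry no choices at all
        token = choices[0].get("delta", {}).get("content") or ""
        if token:
            yield token


# Made-up sample stream, for illustration only.
sample = [
    ": keep-alive",
    'data: {"choices": [{"delta": {"role": "assistant"}}]}',
    'data: {"choices": [{"delta": {"content": "Hal"}}]}',
    "",
    'data: {"choices": [{"delta": {"content": "lo"}}]}',
    "data: [DONE]",
    'data: {"choices": [{"delta": {"content": "ignored"}}]}',
]
tokens = list(iter_stream_tokens(sample))
print("".join(tokens))
```

Keeping the parser as a plain generator over strings means a node-level unit test can cover it without opening a network connection. The node itself would still forward each token over the WebSocket as it arrives.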