agent-runtime/agent/nodes/thinker_v2.py
Nico 1000411eb2 v0.15.0: Frame engine (v3), PA + Expert architecture (v4-eras), live test streaming
Frame Engine (v3-framed):
- Tick-based deterministic pipeline: frames advance on completion, not timers
- FrameRecord/FrameTrace dataclasses for structured per-message tracing
- /api/frames endpoint: queryable frame trace history (last 20 messages)
- frame_trace HUD event with full pipeline visibility
- Reflex=2F, Director=4F, Director+Interpreter=5F deterministic frame counts
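The per-message tracing above can be sketched as a pair of dataclasses. This is a minimal sketch: the commit names FrameRecord/FrameTrace but not their fields, so every field and method name here is an assumption.

```python
from dataclasses import dataclass, field

@dataclass
class FrameRecord:
    # One completed pipeline stage for a single tick; fields are illustrative.
    frame: int          # tick index within this message (1-based)
    node: str           # e.g. "reflex", "director", "thinker_v2"
    event: str          # e.g. "perceived", "decided"
    detail: str = ""

@dataclass
class FrameTrace:
    message_id: str
    frames: list[FrameRecord] = field(default_factory=list)

    def advance(self, node: str, event: str, detail: str = "") -> FrameRecord:
        # Frames advance on completion of a stage, not on a timer,
        # which is what makes the per-path frame counts deterministic.
        rec = FrameRecord(frame=len(self.frames) + 1, node=node,
                          event=event, detail=detail)
        self.frames.append(rec)
        return rec
```

With this shape, a Reflex-only message would always append exactly two records, matching the deterministic counts above.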

Expert Architecture (v4-eras):
- PA node (pa_v1): routes to domain experts, holds user context
- ExpertNode base: stateless executor with plan+execute two-LLM-call pattern
- ErasExpertNode: eras2_production DB specialist with DESCRIBE-first discipline
- Schema caching: DESCRIBE results reused across queries within session
- Progress streaming: PA streams thinking message, expert streams per-tool progress
- PARouting type for structured routing decisions

UI Controls Split:
- Separate thinker_controls from machine controls (current_controls is now a property)
- Machine buttons persist across Thinker responses
- Machine state parser handles both dict and list formats from Director
- Normalized button format with go/payload field mapping
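Normalizing the two Director formats might look like this. The field names `label`, `go`, and `payload` come from the commit; the exact input shapes are assumptions.

```python
def normalize_buttons(raw) -> list[dict]:
    """Accept machine buttons as either a dict ({label: payload}) or a list
    of dicts that may use `go` or `payload`; emit one uniform shape."""
    if isinstance(raw, dict):
        return [{"label": k, "payload": v} for k, v in raw.items()]
    return [{"label": b.get("label", ""),
             "payload": b.get("payload", b.get("go", ""))}
            for b in raw]
```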

WebSocket Architecture:
- /ws/test: dedicated debug socket for test runner progress
- /ws/trace: dedicated debug socket for HUD/frame trace events
- /ws (chat): cleaned up, only deltas/controls/done/cleared
- WS survives graph switch (re-attaches to new runtime)
- Pipeline result reset on clear
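The chat/debug split can be sketched as an event router. This is a sketch under assumptions: the real runtime attaches actual WebSocket connections, not lists, and the event-type names beyond the four chat events are illustrative.

```python
class WsHub:
    """Route events by channel: chat sockets receive only the four chat
    event types; everything else (HUD, frame traces) goes to debug sockets."""
    CHAT_EVENTS = {"delta", "controls", "done", "cleared"}

    def __init__(self):
        self.chat_queues: list[list] = []   # stand-ins for /ws connections
        self.trace_queues: list[list] = []  # stand-ins for /ws/trace connections

    def publish(self, event: dict) -> None:
        targets = (self.chat_queues if event.get("type") in self.CHAT_EVENTS
                   else self.trace_queues)
        for q in targets:
            q.append(event)
```

Routing by event type at one choke point is what keeps the chat socket "cleaned up": new debug event kinds default to the trace channel without touching chat clients.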

Test Infrastructure:
- Live test streaming: on_result callback fires per check during execution
- Frontend polling fallback (500ms) for proxy-buffered WS
- frame_trace-first trace assertion (fixes stale perceived event bug)
- action_match supports "or" patterns and multi-pattern matching
- Trace window increased to 40 events
- Graph-agnostic assertions (has X or Y)
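The "or"-pattern matching might behave like this. Semantics are inferred from the bullets above (any alternative may match any observed action as a substring); this is not the project's exact implementation.

```python
def action_match(observed: list[str], pattern: str) -> bool:
    """True if any alternative in an `a or b` pattern appears as a substring
    of some observed action (graph-agnostic 'has X or Y' assertion)."""
    alternatives = [p.strip() for p in pattern.split(" or ")]
    return any(alt in act for alt in alternatives for act in observed)
```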

Test Suites:
- smoketest.md: 12 steps covering all categories (~2min)
- fast.md: 10 quick checks (~1min)
- fast_v4.md: 10 v4-eras specific checks
- expert_eras.md: eras domain tests (routing, DB, schema, errors)
- expert_progress.md: progress streaming tests

Other:
- Shared db.py extracted from thinker_v2 (reused by experts)
- InputNode prompt: few-shot examples, history as context summary
- Director prompt: full tool signatures for add_state/reset_machine/destroy_machine
- nginx no-cache headers for static files during development
- Cache-busted static file references
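The development no-cache setup might look like the following nginx fragment. This is a sketch: the location path and exact directives are assumptions, not the deployed config.

```nginx
location /static/ {
    # Development only: force revalidation so edited JS/CSS load immediately.
    add_header Cache-Control "no-store, no-cache, must-revalidate" always;
    expires off;
}
```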

Scores: v3 smoketest 39/40, v4-eras fast 28/28, expert_eras 23/23

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-29 17:10:31 +02:00


"""Thinker Node v2: pure executor — runs tools as directed by Director."""
import asyncio
import json
import logging
from .base import Node
from ..llm import llm_call
from ..db import run_db_query
from ..process import ProcessManager
from ..types import Command, DirectorPlan, ThoughtResult
log = logging.getLogger("runtime")
class ThinkerV2Node(Node):
name = "thinker_v2"
model = "google/gemini-2.0-flash-001" # Fast model — just executes
max_context_tokens = 4000
RESPONSE_SYSTEM = """You are the Thinker — a fast executor in a cognitive runtime.
The Director (a smart model) already decided what to do. You just executed the tools.
Now write a natural response to the user based on the results.
{hint}
Rules:
- Be concise and natural.
- If tool results contain data, summarize it clearly.
- NEVER apologize. NEVER say "I" — you are part of a team.
- Keep it short: 1-3 sentences for simple responses.
- For data: reference the numbers, don't repeat raw output."""
def __init__(self, send_hud, process_manager: ProcessManager = None):
super().__init__(send_hud)
self.pm = process_manager
async def process(self, command: Command, plan: DirectorPlan,
history: list[dict], memory_context: str = "") -> ThoughtResult:
"""Execute Director's plan and produce ThoughtResult."""
await self.hud("thinking", detail=f"executing plan: {plan.goal}")
actions = []
state_updates = {}
display_items = []
machine_ops = []
tool_used = ""
tool_output = ""
# Execute tool_sequence in order
for step in plan.tool_sequence:
tool = step.get("tool", "")
args = step.get("args", {})
await self.hud("tool_call", tool=tool, args=args)
if tool == "emit_actions":
actions.extend(args.get("actions", []))
elif tool == "set_state":
key = args.get("key", "")
if key:
state_updates[key] = args.get("value")
elif tool == "emit_display":
display_items.extend(args.get("items", []))
elif tool == "create_machine":
machine_ops.append({"op": "create", **args})
elif tool == "add_state":
machine_ops.append({"op": "add_state", **args})
elif tool == "reset_machine":
machine_ops.append({"op": "reset", **args})
elif tool == "destroy_machine":
machine_ops.append({"op": "destroy", **args})
elif tool == "query_db":
query = args.get("query", "")
database = args.get("database", "eras2_production")
try:
result = await asyncio.to_thread(run_db_query, query, database)
tool_used = "query_db"
tool_output = result
await self.hud("tool_result", tool="query_db", output=result[:200])
except Exception as e:
tool_used = "query_db"
tool_output = f"Error: {e}"
await self.hud("tool_result", tool="query_db", output=str(e)[:200])
# Generate text response
hint = plan.response_hint or f"Goal: {plan.goal}"
if tool_output:
hint += f"\nTool result:\n{tool_output[:500]}"
messages = [
{"role": "system", "content": self.RESPONSE_SYSTEM.format(hint=hint)},
]
for msg in history[-8:]:
messages.append(msg)
messages.append({"role": "user", "content": command.source_text})
messages = self.trim_context(messages)
response = await llm_call(self.model, messages)
if not response:
response = "[no response]"
await self.hud("decided", instruction=response[:200])
return ThoughtResult(
response=response,
tool_used=tool_used,
tool_output=tool_output,
actions=actions,
state_updates=state_updates,
display_items=display_items,
machine_ops=machine_ops,
)