v0.15.6: Baked schema expert — no DESCRIBE at runtime, domain mastery 38/38

Expert knows the full eras2_production schema cold:
- All PKs, FKs, column names verified from DESCRIBE
- Junction tables: objektkunde (kunden↔objekte), objektadressen, kundenadressen
- Exact JOIN patterns baked into prompt
- No DESCRIBE/SHOW at runtime — plan once, execute
- Domain language responses (not SQL dumps)
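
The "no DESCRIBE/SHOW at runtime" rule in this commit lives in the prompt only. A minimal sketch of how a plan could additionally be checked in code, assuming the step shape `{"tool": "query_db", "args": {"query": ...}}` seen in the diff; `is_schema_probe` and `split_plan` are hypothetical helpers, not part of this commit:

```python
def is_schema_probe(step: dict) -> bool:
    """True if the step is a query_db call that starts with DESCRIBE or SHOW."""
    if step.get("tool") != "query_db":
        return False
    query = step.get("args", {}).get("query", "").strip().upper()
    return query.startswith(("DESCRIBE", "SHOW"))


def split_plan(tool_sequence: list[dict]) -> tuple[list[dict], list[dict]]:
    """Separate a planned tool sequence into allowed steps and schema probes."""
    allowed = [s for s in tool_sequence if not is_schema_probe(s)]
    rejected = [s for s in tool_sequence if is_schema_probe(s)]
    return allowed, rejected
```

The prefix check mirrors the one the removed re-plan loop used to detect DESCRIBE steps; here it would flag them instead of executing them.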

Simplified ExpertNode.execute():
- Removed iterative DESCRIBE→re-plan loop
- Single plan+execute pass (schema is known)
- Faster: 1 LLM call for plan instead of 2-3
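
A rough sketch of the single plan+execute pass, assuming the plan comes back as a JSON list of tool steps. `fake_llm_call`, `fake_run_db_query` and `execute_single_pass` are stand-ins invented for illustration; the real `ExpertNode.execute()` uses `llm_call`, `_parse_plan` and `run_db_query` as shown in the diff:

```python
import asyncio
import json

llm_calls = []  # records every planning call so the count can be inspected


async def fake_llm_call(messages):
    """Stand-in for the real LLM call; returns a fixed one-step plan."""
    llm_calls.append(messages)
    return json.dumps([{"tool": "query_db",
                        "args": {"query": "SELECT COUNT(*) FROM kunden"}}])


def fake_run_db_query(query):
    """Stand-in for the blocking DB helper; returns a canned value."""
    return "693"


async def execute_single_pass(job, schema, llm=fake_llm_call,
                              run_query=fake_run_db_query):
    """Plan once with the baked schema, then run every planned query."""
    messages = [
        {"role": "system", "content": f"Schema:\n{schema}"},
        {"role": "user", "content": f"Job: {job}"},
    ]
    plan = json.loads(await llm(messages))  # the only planning call
    results = []
    for step in plan:
        if step.get("tool") == "query_db":
            # Run the blocking query off the event loop, as the diff does.
            results.append(await asyncio.to_thread(run_query,
                                                   step["args"]["query"]))
    return results
```

Because the schema is part of the system prompt, there is no second planning round to feed DESCRIBE output back in.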

Domain mastery test (eras_domain.md): 38/38
- Customer overview, junction table JOINs, full hierarchy traversal
- Address lookup, Verbrauchsdaten, domain language, no DESCRIBE check
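
The full hierarchy traversal can be tried out against an in-memory SQLite database. This is only a sketch: production is eras2_production, the tables below carry just the key columns named in the baked schema, and all rows are made-up sample data:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE kunden      (ID INTEGER PRIMARY KEY, Name1 TEXT);
CREATE TABLE objekte     (ID INTEGER PRIMARY KEY, Objektnummer TEXT);
CREATE TABLE objektkunde (ID INTEGER PRIMARY KEY, KundeID INT, ObjektID INT);
CREATE TABLE nutzeinheit (ID INTEGER PRIMARY KEY, ObjektID INT);
CREATE TABLE nutzer      (ID INTEGER PRIMARY KEY, NutzeinheitID INT, Name1 TEXT);

-- Made-up sample rows, for illustration only.
INSERT INTO kunden      VALUES (1, 'Kunde A'), (2, 'Kunde B');
INSERT INTO objekte     VALUES (10, 'O-10'), (11, 'O-11');
INSERT INTO objektkunde VALUES (1, 2, 10), (2, 1, 11);
INSERT INTO nutzeinheit VALUES (100, 10), (101, 11);
INSERT INTO nutzer      VALUES (1000, 100, 'Mieter X'),
                               (1001, 100, 'Mieter Y'),
                               (1002, 101, 'Mieter Z');
""")

# Kunde -> objektkunde -> objekte -> nutzeinheit -> nutzer,
# using the JOIN patterns from the prompt.
HIERARCHY_SQL = """
SELECT nu.Name1
FROM kunden k
JOIN objektkunde ok ON ok.KundeID = k.ID
JOIN objekte o ON o.ID = ok.ObjektID
JOIN nutzeinheit ne ON ne.ObjektID = o.ID
JOIN nutzer nu ON nu.NutzeinheitID = ne.ID
WHERE k.ID = ?
ORDER BY nu.ID
LIMIT 50
"""


def nutzer_for_kunde(kunde_id: int) -> list[str]:
    """Return tenant names reachable from one Kunde via the junction table."""
    return [row[0] for row in conn.execute(HIERARCHY_SQL, (kunde_id,))]
```

Calling `nutzer_for_kunde(2)` walks through objektkunde rather than joining kunden to objekte directly, which is the point of the junction-table test steps.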

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Nico 2026-03-29 19:03:52 +02:00
parent b9320693ed
commit 84fa0830d8
3 changed files with 158 additions and 146 deletions


@@ -1,7 +1,7 @@
-"""Eras Expert: heating cost billing domain specialist.
-Eras is a German software company for Heizkostenabrechnung (heating cost billing).
-Users are Hausverwaltungen and Messdienste who manage properties, meters, and billings.
+"""Eras Expert: Heizkostenabrechnung domain specialist.
+The expert knows the full database schema. No DESCRIBE at runtime.
+All queries use verified column names and JOIN patterns.
 """
 import asyncio
@@ -17,117 +17,109 @@ class ErasExpertNode(ExpertNode):
     name = "eras_expert"
     default_database = "eras2_production"
-    DOMAIN_SYSTEM = """You are the Eras domain expert — specialist for heating cost billing (Heizkostenabrechnung).
+    DOMAIN_SYSTEM = """You are the Eras domain expert for Heizkostenabrechnung (German heating cost billing).
 BUSINESS CONTEXT:
-Eras is a German software company. The software manages Heizkostenabrechnung according to German law (HeizKV).
-The USER of this software is a Hausverwaltung (property management) or Messdienst (metering service).
-They use Eras to manage their customers' properties, meters, consumption readings, and billings.
-DOMAIN MODEL (how the data relates):
-- Kunden (customers) = the Hausverwaltungen or property managers that the Eras user serves
-Each Kunde has a Kundennummer and contact data (Name, Adresse, etc.)
-- Objekte (properties/buildings/Liegenschaften) = physical buildings managed by a Kunde
-A Kunde can have many Objekte. Each Objekt has an address and is linked to a Kunde.
-- Nutzeinheiten (usage units/apartments) = individual units within an Objekt
-An Objekt contains multiple Nutzeinheiten (e.g., Wohnung 1, Wohnung 2).
-Each Nutzeinheit has Nutzer (tenants/occupants).
-- Geraete (devices/meters) = measurement devices installed in Nutzeinheiten
-Heizkostenverteiler, Waermezaehler, Wasserzaehler, etc.
-Each Geraet is linked to a Nutzeinheit and has a Geraetetyp.
-- Geraeteverbraeuche (consumption readings) = measured values from Geraete
-Ablesewerte collected by Monteure or remote reading systems.
-- Abrechnungen (billings) = Heizkostenabrechnungen generated per Objekt/period
-The core output: distributes heating costs to Nutzeinheiten based on consumption.
-- Auftraege (work orders) = tasks for Monteure (technicians)
-Device installation, reading collection, maintenance.
-HIERARCHY (via JOINs):
-Kunde → objektkunde → Objekt (many-to-many via junction table!)
-Objekt → Nutzeinheiten → Geraete → Verbraeuche
-Nutzeinheit → Nutzer
-Kunde → Abrechnungen
-Kunde → Auftraege
-CRITICAL: kunden and objekte are linked through the objektkunde junction table, NOT directly.
-IMPORTANT NOTES:
-- All table/column names are German, lowercase
-- Foreign keys often use patterns like KundenID, ObjektID, NutzeinheitID
-- The database is eras2_production
-- Always DESCRIBE tables before writing JOINs to verify actual column names
-- Common user questions: customer overview, device counts, billing status, Objekt details"""
-    SCHEMA = """Known tables (eras2_production):
-- kunden customers (Hausverwaltungen)
-- objekte properties/buildings (Liegenschaften)
-- nutzeinheit apartments/units within Objekte
-- nutzer tenants/occupants of Nutzeinheiten
-- geraete measurement devices (Heizkostenverteiler, etc.)
-- geraeteverbraeuche consumption readings
-- abrechnungen heating cost billings
-- auftraege work orders for Monteure
-- auftragspositionen line items within Auftraege
-- geraetetypen device type catalog
-- geraetekatalog device model catalog
-- heizbetriebskosten heating operation costs
-- nebenkosten additional costs (Nebenkosten)
-KNOWN SCHEMA (verified: ONLY use these column names without DESCRIBE):
-All tables use ID (int, auto_increment) as primary key.
-- kunden: PK=ID. Known columns: Name1, Name2, Name3, Kundennummer
-- objekte: PK=ID. Known columns: Objektnummer
-- objektkunde: JUNCTION TABLE for kunden↔objekte (many-to-many!)
-PK=ID, FK: KundeID→kunden.ID, ObjektID→objekte.ID
-- nutzeinheit: PK=ID, FK: ObjektID→objekte.ID
-- geraete: PK=ID, FK: NutzeinheitID→nutzeinheit.ID
-- geraeteverbraeuche: linked to geraete
-- nutzer: linked to nutzeinheit (DESCRIBE to find FK column name)
-For ANY column not listed above, you MUST DESCRIBE the table first.
-JOIN PATTERNS (use these exactly):
-- Kunde → Objekte: JOIN objektkunde ok ON ok.KundeID = k.ID JOIN objekte o ON o.ID = ok.ObjektID
-- Objekt → Nutzeinheiten: JOIN nutzeinheit n ON n.ObjektID = o.ID
-- Nutzeinheit → Geraete: JOIN geraete g ON g.NutzeinheitID = n.ID
-IMPORTANT: For tables not listed above, always DESCRIBE first.
-The junction table objektkunde is REQUIRED to link kunden and objekte.
-Example for "how many Objekte per Kunde":
-[
-{{"tool": "query_db", "args": {{"query": "SELECT k.ID, k.Name1, COUNT(DISTINCT o.ID) as AnzahlObjekte FROM kunden k JOIN objektkunde ok ON ok.KundeID = k.ID JOIN objekte o ON o.ID = ok.ObjektID GROUP BY k.ID, k.Name1 ORDER BY AnzahlObjekte DESC LIMIT 20", "database": "eras2_production"}}}}
-]"""
-    def __init__(self, send_hud, process_manager=None):
-        super().__init__(send_hud, process_manager)
-        self._schema_cache: dict[str, str] = {}
-    async def execute(self, job: str, language: str = "de"):
-        """Execute with schema auto-discovery. Caches DESCRIBE results."""
-        if self._schema_cache:
-            schema_ctx = "Known column names from previous DESCRIBE:\n"
-            for table, desc in self._schema_cache.items():
-                lines = desc.strip().split("\n")[:8]
-                schema_ctx += f"\n{table}:\n" + "\n".join(lines) + "\n"
-            job = job + "\n\n" + schema_ctx
-        result = await super().execute(job, language)
-        # Cache DESCRIBE results
-        if result.tool_output and "Field\t" in result.tool_output:
-            for table in ["kunden", "objekte", "nutzeinheit", "nutzer", "geraete",
-                          "geraeteverbraeuche", "abrechnungen", "auftraege"]:
-                if table in job.lower() or table in result.tool_output.lower():
-                    self._schema_cache[table] = result.tool_output
-                    log.info(f"[eras] cached schema for {table}")
-                    break
-        return result
+Eras is software for Hausverwaltungen and Messdienste who manage properties, meters, and billings.
+The USER of this agent is an Eras customer exploring their data. They think in domain terms
+(Kunden, Objekte, Wohnungen, Zaehler) NOT in SQL. Never expose SQL or table names to the user.
+DOMAIN MODEL:
+- Kunden = property managers (Hausverwaltungen). 693 in the system.
+- Objekte = buildings/Liegenschaften managed by Kunden. 780 total. Linked via objektkunde (m:n).
+- Nutzeinheiten = apartments/units inside Objekte. 4578 total.
+- Nutzer = tenants/occupants of Nutzeinheiten. 8206 total.
+- Geraete = measurement devices (Heizkostenverteiler, Zaehler). 56726 total.
+- Verbraeuche = consumption readings from Geraete. 1.3M readings.
+- Adressen = postal addresses, linked via objektadressen/kundenadressen.
+RESPOND IN DOMAIN LANGUAGE:
+- Say "Kunde Jaeger hat 3 Objekte" not "SELECT COUNT..."
+- Say "12 Wohnungen mit 45 Geraeten" not "nutzeinheit rows"
+- Present data as summaries, not raw tables"""
+    SCHEMA = """COMPLETE DATABASE SCHEMA (eras2_production) — use these exact column names:
+=== kunden (693 rows) ===
+PK: ID (int)
+Name1, Name2, Name3 (longtext) customer name parts
+Kundennummer (longtext) customer number
+AnredeID (FK), BriefanredeID (FK), ZugeordneterKomplettdruckID (FK)
+Anmerkung, Fremdnummer, Ansprechpartner (longtext)
+Steuernummer, UmsatzsteuerID (longtext)
+HatHistorie, IstWebkunde, IstNettoKunde, BrennstoffkostenNachFIFO, BelegePerEmail (bool)
+MietpreisAnpassungProzent (decimal)
+=== objektkunde (911 rows) JUNCTION: kunden ↔ objekte (many-to-many) ===
+PK: ID (int)
+KundeID (FK kunden.ID)
+ObjektID (FK objekte.ID)
+ZeitraumVon, ZeitraumBis (datetime)
+IstKunde, IstEigentuemer, IstRechnungsempfaenger, IstAbrechnungsempfaenger (bool)
+=== objekte (780 rows) ===
+PK: ID (int)
+Objektnummer (longtext) building reference number
+AbleserID, MonteurID, UVIRefObjektID, ZugeordneterKomplettdruckID (FK)
+Anmerkung, AnmerkungIntern (longtext)
+HatHistorie, VorauszahlungGetrennt, Selbstablesung, IstObjektFreigegeben (bool)
+=== objektadressen JUNCTION: objekte ↔ adressen ===
+PK: ID, ObjektID (FK objekte.ID), AdresseID (FK adressen.ID), IstPrimaer (bool)
+=== kundenadressen JUNCTION: kunden ↔ adressen ===
+PK: ID, KundeID (FK kunden.ID), AdresseID (FK adressen.ID), TypDerAdresseID (FK)
+=== adressen (1762 rows) ===
+PK: ID (int)
+Strasse, Hausnummer, Postleitzahl, Ort, Adresszusatz, Postfach (longtext)
+LandID (FK), Laengengrad, Breitengrad (double)
+=== nutzeinheit (4578 rows) ===
+PK: ID (int)
+ObjektID (FK objekte.ID)
+NeNummerInt (longtext) unit number
+Lage, Stockwerk, Flaeche, Nutzflaeche (various)
+AdresseID (FK), CustomStatusKeyID (FK)
+=== kundenutzeinheit JUNCTION: kunden ↔ nutzeinheit ===
+PK: ID, KundeID (FK kunden.ID), NutzeinheitID (FK nutzeinheit.ID), Von, Bis (datetime)
+=== nutzer (8206 rows) tenants/occupants ===
+PK: ID (int)
+NutzeinheitID (FK nutzeinheit.ID)
+Name1, Name2, Name3, Name4 (longtext) tenant name
+NutzungVon, NutzungBis (datetime)
+ArtDerNutzung (int), AnredeID (FK), BriefanredeID (FK)
+IstGesperrt, Selbstableser (bool)
+=== geraete (56726 rows) meters/devices ===
+PK: ID (int)
+NutzeinheitID (FK nutzeinheit.ID)
+ArtikelID (FK geraetekatalog), GeraeteTypID (FK)
+Fabriknummer, Funkkennung (longtext) serial numbers
+Einbaudatum, Ausbaudatum, GeeichtBis (datetime)
+AnsprechpartnerID, ZugeordneterRaumID, CustomStatusKeyID (FK)
+=== geraeteverbraeuche (1.3M rows) consumption readings ===
+PK: ID (int)
+GeraetID (FK geraete.ID)
+Ablesedatum (datetime), Ablesung, Verbrauch, Faktor (double)
+AbleseartID (FK), Schaetzung (int), Status (int)
+IstRekonstruiert (bool), Herkunft (int)
+JOIN PATTERNS (use exactly):
+Kunde → Objekte: JOIN objektkunde ok ON ok.KundeID = k.ID JOIN objekte o ON o.ID = ok.ObjektID
+Objekt → Adresse: JOIN objektadressen oa ON oa.ObjektID = o.ID JOIN adressen a ON a.ID = oa.AdresseID
+Kunde → Adresse: JOIN kundenadressen ka ON ka.KundeID = k.ID JOIN adressen a ON a.ID = ka.AdresseID
+Objekt → NE: JOIN nutzeinheit ne ON ne.ObjektID = o.ID
+NE → Nutzer: JOIN nutzer nu ON nu.NutzeinheitID = ne.ID
+NE → Geraete: JOIN geraete g ON g.NutzeinheitID = ne.ID
+Geraet → Verbrauch: JOIN geraeteverbraeuche gv ON gv.GeraetID = g.ID
+RULES:
+- NEVER use DESCRIBE at runtime. You know the schema.
+- NEVER guess column names. Use ONLY columns listed above.
+- For unknown tables: return an error, do not explore.
+- Always LIMIT large queries (max 50 rows).
+- Use LEFT JOIN when results might be empty."""


@@ -78,63 +78,19 @@ Write a concise, natural response. 1-3 sentences.
     async def execute(self, job: str, language: str = "de") -> ThoughtResult:
         """Execute a self-contained job. Returns ThoughtResult.
-        Uses iterative plan-execute: if DESCRIBE queries are in the plan,
-        execute them first, inject results into a re-plan, then execute the rest."""
+        Expert knows the schema: plan once, execute, respond."""
         await self.hud("thinking", detail=f"planning: {job[:80]}")
-        # Step 1: Plan tool sequence
-        schema_context = self.SCHEMA
+        # Step 1: Plan tool sequence (expert knows schema, no DESCRIBE needed)
         plan_messages = [
             {"role": "system", "content": self.PLAN_SYSTEM.format(
-                domain=self.DOMAIN_SYSTEM, schema=schema_context,
+                domain=self.DOMAIN_SYSTEM, schema=self.SCHEMA,
                 database=self.default_database)},
             {"role": "user", "content": f"Job: {job}"},
         ]
         plan_raw = await llm_call(self.model, plan_messages)
         tool_sequence, response_hint = self._parse_plan(plan_raw)
-        # Step 1b: Execute DESCRIBE queries first, then re-plan with actual schema
-        describe_results = {}
-        remaining_tools = []
-        for step in tool_sequence:
-            if step.get("tool") == "query_db":
-                query = step.get("args", {}).get("query", "").strip().upper()
-                if query.startswith("DESCRIBE") or query.startswith("SHOW"):
-                    await self.hud("tool_call", tool="query_db", args=step.get("args", {}))
-                    try:
-                        result = await asyncio.to_thread(
-                            run_db_query, step["args"]["query"],
-                            step["args"].get("database", self.default_database))
-                        describe_results[step["args"]["query"]] = result
-                        await self.hud("tool_result", tool="query_db", output=result[:200])
-                    except Exception as e:
-                        await self.hud("tool_result", tool="query_db", output=str(e)[:200])
-                else:
-                    remaining_tools.append(step)
-            else:
-                remaining_tools.append(step)
-        # Re-plan if we got DESCRIBE results (now we know actual column names)
-        if describe_results:
-            schema_update = "Actual column names from DESCRIBE:\n"
-            for q, result in describe_results.items():
-                schema_update += f"\n{q}:\n{result[:500]}\n"
-            replan_messages = [
-                {"role": "system", "content": self.PLAN_SYSTEM.format(
-                    domain=self.DOMAIN_SYSTEM,
-                    schema=schema_context + "\n\n" + schema_update,
-                    database=self.default_database)},
-                {"role": "user", "content": f"Job: {job}\n\nUse ONLY the actual column names from DESCRIBE above. Do NOT include DESCRIBE steps — they are already done."},
-            ]
-            replan_raw = await llm_call(self.model, replan_messages)
-            new_tools, new_hint = self._parse_plan(replan_raw)
-            if new_tools:
-                remaining_tools = new_tools
-            if new_hint:
-                response_hint = new_hint
-        tool_sequence = remaining_tools
         await self.hud("planned", tools=len(tool_sequence), hint=response_hint[:80])
         # Step 2: Execute remaining tools

testcases/eras_domain.md Normal file

@@ -0,0 +1,64 @@
# Eras Domain Mastery
Tests that the expert knows the schema cold — no DESCRIBE at runtime, no SQL errors,
domain-correct responses. The expert is a Heizkostenabrechnung specialist, not a SQL explorer.
## Setup
- clear history
## Steps
### 1. Customer overview
- send: zeig mir die ersten 5 Kunden
- expect_trace: has tool_call
- expect_response: not contains "Unknown column" or "1054" or "error" or "Error"
- expect_response: length > 20
### 2. Objekte per Kunde (junction table)
- send: welcher Kunde hat die meisten Objekte?
- expect_trace: has tool_call
- expect_response: not contains "Unknown column" or "1054" or "error" or "Error"
- expect_response: length > 20
### 3. Nutzeinheiten in an Objekt
- send: wie viele Nutzeinheiten hat Objekt 4?
- expect_trace: has tool_call
- expect_response: not contains "Unknown column" or "1054" or "error" or "Error"
- expect_response: length > 5
### 4. Geraete count per Objekt
- send: welches Objekt hat die meisten Geraete?
- expect_trace: has tool_call
- expect_response: not contains "Unknown column" or "1054" or "error" or "Error"
- expect_response: length > 20
### 5. Full hierarchy traversal (4 tables)
- send: zeig mir alle Nutzer von Kunde 2
- expect_trace: has tool_call
- expect_response: not contains "Unknown column" or "1054" or "error" or "Error"
- expect_response: length > 10
### 6. Address lookup via junction
- send: was ist die Adresse von Objekt 4?
- expect_trace: has tool_call
- expect_response: not contains "Unknown column" or "1054" or "error" or "Error"
- expect_response: length > 10
### 7. Verbrauchsdaten query
- send: zeig mir die letzten 5 Verbrauchswerte von Geraet 100
- expect_trace: has tool_call
- expect_response: not contains "Unknown column" or "1054" or "error" or "Error"
- expect_response: length > 10
### 8. Domain language response (not SQL dump)
- send: gib mir eine Zusammenfassung von Kunde 103
- expect_trace: has tool_call
- expect_response: not contains "SELECT" or "JOIN" or "FROM"
- expect_response: length > 30
### 9. Expert does NOT describe at runtime
- send: wie viele Geraete hat Kunde 63?
- expect_trace: has tool_call
- expect_response: not contains "Unknown column" or "1054" or "error" or "Error"
- expect_response: not contains "DESCRIBE" or "describe"
- expect_response: length > 5
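
The `expect_response` lines above could be evaluated roughly as follows. This is a sketch under two assumptions not stated in the test file: `not contains "A" or "B"` passes only if none of the quoted strings occurs in the response (case-sensitive), and `length > N` compares the character count. `check_response` is a hypothetical helper, not the project's actual test runner:

```python
import re


def check_response(expectation: str, response: str) -> bool:
    """Evaluate one expect_response rule against a response string."""
    expectation = expectation.strip()
    if expectation.startswith("not contains"):
        # Collect every double-quoted term; all of them are forbidden.
        forbidden = re.findall(r'"([^"]*)"', expectation)
        return not any(term in response for term in forbidden)
    match = re.fullmatch(r"length\s*>\s*(\d+)", expectation)
    if match:
        return len(response) > int(match.group(1))
    raise ValueError(f"unsupported expectation: {expectation!r}")
```

Under this reading, step 8 fails as soon as the response contains the literal text SELECT, JOIN or FROM, which is how a raw SQL dump would be caught.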