- PA prompt updated: routes ANY task needing tools (DB, UI, buttons, machines) to expert. Only social chat stays with PA.
- Expert descriptions include UI capabilities (buttons, machines, tables)
- Dashboard integration test: expert creates/replaces buttons, machines, tables; all persist correctly across queries
- v4-eras scores: fast 27/28, expert 23/23, dashboard 15/15, progress 11/11

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
# Dashboard Integration

Tests that experts can build UI on the shared dashboard:
buttons, machines, tables, state, all through the PA→Expert pipeline.

## Setup
- clear history

## Steps

### 1. Expert creates buttons
- send: create two buttons on my dashboard: Report and Export
- expect_actions: length >= 2
- expect_actions: any action contains "report" or "Report"

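
Each step above is a list of `- kind: argument` directives. The runner that consumes this file is not shown here, so the following is only a minimal sketch of how such lines could be split into a kind and an argument. `Directive` and `parse_directives` are hypothetical names, and the regular expression is an assumption about the format, not the runner's actual grammar.

```python
import re
from typing import NamedTuple


class Directive(NamedTuple):
    kind: str  # e.g. "send", "expect_actions"
    arg: str   # everything after the first colon


# Assumed shape: "- <kind>: <argument>". Lines without a colon after the
# first word (such as "- clear history") are skipped by this sketch.
_DIRECTIVE = re.compile(r"^-\s+(\w+):\s*(.*)$")


def parse_directives(step_text: str) -> list[Directive]:
    """Return the directives found in one step, in file order."""
    directives = []
    for line in step_text.splitlines():
        match = _DIRECTIVE.match(line.strip())
        if match:
            directives.append(Directive(match.group(1), match.group(2).strip()))
    return directives


step = """\
- send: create two buttons on my dashboard: Report and Export
- expect_actions: length >= 2
- expect_actions: any action contains "report" or "Report"
"""

for directive in parse_directives(step):
    print(directive.kind, "->", directive.arg)
```

Only the first colon separates kind from argument, so a message that itself contains a colon (as the `send` line does) stays intact.
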
### 2. Buttons survive a query
- send: how many customers are there?
- expect_response: length > 5
- expect_actions: any action contains "report" or "Report"

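
The persistence check in this step relies on `any action contains "report" or "Report"` still matching after an unrelated query. As a rough illustration, that condition could be evaluated as below, assuming each action is a JSON-serializable dict and that "contains" means a case-sensitive substring match on its serialized form. Both assumptions and the name `any_action_contains` are hypothetical.

```python
import json
import re


def any_action_contains(actions: list[dict], condition: str) -> bool:
    """True if any action's JSON text contains any quoted alternative."""
    # Pull every double-quoted alternative out of the condition text.
    needles = re.findall(r'"([^"]*)"', condition)
    # Assumption: search the serialized action, not one specific field.
    return any(
        needle in json.dumps(action)
        for action in actions
        for needle in needles
    )


# Hypothetical dashboard state returned alongside the answer to the query.
actions_after_query = [
    {"type": "button", "label": "Report"},
    {"type": "button", "label": "Export"},
]
condition = 'any action contains "report" or "Report"'
print(any_action_contains(actions_after_query, condition))  # True
```

Listing both spellings in the spec suggests the match is case-sensitive, which is why this sketch does not lowercase either side.
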
### 3. Expert creates a machine
- send: create a navigation machine called "workflow" with initial state "start" showing a Next button that goes to "step2"
- expect_trace: has tool_call create_machine

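
The `expect_trace` directive comes in two forms in this spec: `has tool_call create_machine` names a tool, while the bare `has tool_call` accepts any tool. A possible evaluator is sketched here under the assumption that the trace is a list of event dicts with `type` and `name` keys. That trace shape and the function name `trace_has_tool_call` are illustrative guesses, not the pipeline's real data model.

```python
def trace_has_tool_call(trace: list[dict], condition: str) -> bool:
    """Evaluate 'has tool_call' or 'has tool_call <name>' against a trace."""
    parts = condition.split()
    if parts[:2] != ["has", "tool_call"]:
        raise ValueError(f"unsupported trace condition: {condition!r}")
    # An optional third word restricts the match to one tool name.
    wanted = parts[2] if len(parts) > 2 else None
    return any(
        event.get("type") == "tool_call"
        and (wanted is None or event.get("name") == wanted)
        for event in trace
    )


# Hypothetical trace: the PA routes to the expert, which calls one tool.
trace = [
    {"type": "route", "name": "expert"},
    {"type": "tool_call", "name": "create_machine"},
]
print(trace_has_tool_call(trace, "has tool_call create_machine"))  # True
print(trace_has_tool_call(trace, "has tool_call"))                 # True
```
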
### 4. Expert shows data table
- send: show me 5 customers in a table
- expect_trace: has tool_call
- expect_response: length > 10

### 5. Expert replaces buttons
- send: remove all buttons and create one button called Reset
- expect_actions: length >= 1
- expect_actions: any action contains "reset" or "Reset"
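
Several steps use length conditions such as `length >= 1` on the action list and `length > 5` on the response text. One way these could be checked is sketched below, assuming the condition is always `length <operator> <integer>` and applies to anything with a `len()`. The helper name `check_length` and the set of supported operators are assumptions.

```python
import operator
import re

# Comparison operators this sketch understands (an assumed set).
_OPS = {
    ">=": operator.ge,
    "<=": operator.le,
    "==": operator.eq,
    ">": operator.gt,
    "<": operator.lt,
}


def check_length(value, condition: str) -> bool:
    """Evaluate 'length <op> <n>' against len(value)."""
    # Two-character operators come first so ">=" is not read as ">".
    match = re.fullmatch(r"length\s*(>=|<=|==|>|<)\s*(\d+)", condition.strip())
    if match is None:
        raise ValueError(f"unsupported length condition: {condition!r}")
    op, threshold = match.groups()
    return _OPS[op](len(value), int(threshold))


# Hypothetical results for the final step and for a short text response.
print(check_length([{"type": "button", "label": "Reset"}], "length >= 1"))  # True
print(check_length("42", "length > 5"))                                     # False
```

Note that `length >= 1` only proves the dashboard is non-empty; it does not prove the old Report and Export buttons were removed.
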