Part 3 of 6
The Worker Engine
The Loop That Runs Everything
The engine is a 6-phase state machine shared by every Employee. It handles state initialization, pre-enrichment, the main recursive loop, model escalation, synthesis fallback, and response streaming. No Employee has its own loop code — they all run through this.
The 6-Phase State Machine
From the moment a user message arrives to the moment a response is emitted, the engine progresses through six phases. The main loop (Phase 2) is the recursive core — it runs until an exit condition fires or the AI decides it's ready to respond.
State Initialization
Create the WorkerState object — the engine's memory for this entire session
Pre-Enrichment
Run context-loading tools before the loop starts (parallel, with timeout)
Main Loop
while(true) — build prompt → AI decision → update document → execute tools → repeat
Model Escalation
Automatic switch to a stronger model when the AI is stuck
Response Synthesis
Fallback: use all gathered data to produce a final answer if loop exits early
Stream & Return
Emit the final response event and close the generator
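Put together, the six phases can be sketched as one async generator. This is a condensed illustration only: the phase bodies are stubs, and runWorkerEngine and its event shape are assumptions, not the engine's real code.

```typescript
// Condensed sketch of how the six phases compose. Phase bodies are stubs;
// runWorkerEngine and WorkerEvent are illustrative names.
type WorkerEvent = { type: 'status' | 'response'; message: string };

async function* runWorkerEngine(goal: string): AsyncGenerator<WorkerEvent> {
  // Phase 0: state initialization
  const state = { goal, passCount: 0, maxPasses: 3, response: '' };

  // Phase 1: pre-enrichment (runs once, before the loop)
  yield { type: 'status', message: 'Loading context...' };

  // Phase 2: main loop; exit conditions are checked at the top of each pass
  while (true) {
    if (state.passCount >= state.maxPasses) break; // exit -> synthesis fallback
    state.passCount++;
    // ... build prompt, get AI decision, update document, execute tools ...
    // (Phase 3, model escalation, would also be evaluated here)
  }

  // Phase 4: synthesis fallback; the user always gets a response
  if (!state.response) state.response = `Synthesized answer for: ${goal}`;

  // Phase 5: stream & return
  yield { type: 'response', message: state.response };
}
```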
Phase 0: State Initialization
createWorkerState()
Build the engine's memory object
Before anything runs, the engine creates a WorkerState object that holds all mutable state for this session. This object is passed through every phase and updated in place.
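A createWorkerState() factory might initialize the object roughly as follows. This is a sketch: the input shape and the zeroed defaults are assumptions, though the field names follow the WorkerState interface.

```typescript
// Illustrative createWorkerState() factory; defaults are assumptions.
function createWorkerState(input: {
  workerId: string;
  goal: string;
  chatHistory: { role: string; content: string }[];
  organizationId: string; // from the session, never from the request body
  userId: string;
  config: { maxPasses: number; costBudget: number; tokenBudget: number; thinkModel: string };
}) {
  return {
    ...input,
    document: { sections: {} as Record<string, string> }, // living document starts empty
    toolResults: [] as unknown[],
    executedToolSignatures: new Set<string>(), // deduplication set
    passCount: 0,
    totalCost: 0,
    tokensUsed: 0,
    pendingApprovals: [] as unknown[],
    confidenceHistory: [] as string[],
    activeModel: input.config.thinkModel, // may escalate in Phase 3
  };
}
```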
WorkerState shape
interface WorkerState {
workerId: string;
goal: string; // The user's current message
chatHistory: ChatMessage[]; // All previous turns in this conversation
organizationId: string; // From the session — never from request body
userId: string; // From the session
// Living document — the AI's working memory
document: {
sections: Record<string, string>; // Named sections, accumulated across passes
};
// Tool execution history
toolResults: ToolResultRecord[];
executedToolSignatures: Set<string>; // Deduplication set
// Loop control
passCount: number;
totalCost: number;
tokensUsed: number;
config: LoopConfig; // Merged global + worker overrides
// Approval queue
pendingApprovals: PendingApproval[];
// Escalation tracking
confidenceHistory: string[];
activeModel: string; // Starts as thinkModel, may escalate
// Pre-enrichment results (injected into Pass 1 only)
preEnrichedContext?: string;
}

Phase 1: Pre-Enrichment
runWorkerPreEnrichment()
Load context before the loop starts
A lightweight AI call (with an 8-second timeout) selects which tools to run before the main loop. All selected tools run in parallel. Results are stored in state.preEnrichedContext and injected into Pass 1 only.
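The parseJsonWithRecovery helper used below is not shown in this part. A minimal version might look like this; treat it as an assumption, since the source only implies "parse, and recover a JSON payload from a wrapped reply":

```typescript
// Hypothetical sketch of parseJsonWithRecovery; the real helper's recovery
// logic beyond handling fenced or prose-wrapped JSON is not shown here.
function parseJsonWithRecovery<T>(raw: string): T | null {
  try {
    return JSON.parse(raw) as T;
  } catch {
    // Recover a JSON object/array embedded in a fenced block or leading prose
    const match = raw.match(/\{[\s\S]*\}|\[[\s\S]*\]/);
    if (!match) return null;
    try {
      return JSON.parse(match[0]) as T;
    } catch {
      return null;
    }
  }
}
```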
Pre-enrichment flow
async function runWorkerPreEnrichment(
worker: WorkerDefinition,
goal: string,
organizationId: string,
userId: string
): Promise<PreEnrichmentResult> {
// Step 1: Ask a small AI model which tools to run
const selectionPrompt = worker.preEnrichment.buildPrompt(goal);
const selection = await chatCompletion(
[{ role: 'user', content: selectionPrompt }],
'google/gemini-3-flash-preview',
{ maxTokens: 512, timeout: 8000 } // Hard 8-second timeout
);
// Step 2: Parse the AI's tool selection
let toolsToRun = parseJsonWithRecovery<ToolSelection[]>(selection.content);
// Step 3: Fall back to defaults if AI selection fails or times out
if (!toolsToRun || toolsToRun.length === 0) {
toolsToRun = worker.preEnrichment.defaultTools.map(t => ({
tool: t.tool,
params: t.params,
reason: 'default fallback'
}));
}
// Step 4: Run all selected tools in PARALLEL (12-second timeout each)
const results = await Promise.allSettled(
toolsToRun.slice(0, worker.preEnrichment.maxTools).map(async (t) => {
// Race each tool against a 12-second timeout so one slow tool can't stall pre-enrichment
const result = await Promise.race([
executeToolByName(t.tool, t.params, { organizationId, userId }),
new Promise<never>((_, reject) =>
setTimeout(() => reject(new Error(`${t.tool} timed out`)), 12_000)
)
]);
return { tool: t.tool, reason: t.reason, result };
})
);
// Step 5: Assemble results into a markdown context block
const contextMarkdown = assemblePreEnrichmentContext(results);
return { contextMarkdown, toolsRun: toolsToRun };
}

Phase 2: The Main Loop
while(true)
The recursive core — runs until exit conditions fire
Step A: Check Exit Conditions
At the start of every pass, the engine checks whether it should stop. Exit conditions are checked in priority order:
| Condition | Reason Code | What Happens |
|---|---|---|
| passCount >= maxPasses | max_passes | Loop exits → synthesis fallback |
| tokensUsed >= tokenBudget | token_budget | Loop exits → synthesis fallback |
| totalCost >= costBudget | budget_exceeded | Loop exits → synthesis fallback |
| pendingApprovals.length > 0 | approval_needed | Loop pauses → approval card shown |
| stale confidence (2x same non-high) | stale_confidence | Loop exits → synthesis fallback |
| all tool requests are duplicates | all_tools_duplicate | Loop exits → synthesis fallback |
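The priority order in the table can be sketched as a single function. checkExitConditions and its exact signature are illustrative; the all_tools_duplicate condition needs the AI decision's tool calls, so it is omitted from this sketch.

```typescript
// Priority-ordered exit check (illustrative). Field names follow WorkerState.
type ExitReason =
  | 'max_passes' | 'token_budget' | 'budget_exceeded'
  | 'approval_needed' | 'stale_confidence' | null;

function checkExitConditions(s: {
  passCount: number; tokensUsed: number; totalCost: number;
  pendingApprovals: unknown[]; confidenceHistory: string[];
  config: { maxPasses: number; tokenBudget: number; costBudget: number };
}): ExitReason {
  if (s.passCount >= s.config.maxPasses) return 'max_passes';
  if (s.tokensUsed >= s.config.tokenBudget) return 'token_budget';
  if (s.totalCost >= s.config.costBudget) return 'budget_exceeded';
  if (s.pendingApprovals.length > 0) return 'approval_needed';
  // Stale confidence: same non-high confidence on two consecutive passes
  const last2 = s.confidenceHistory.slice(-2);
  if (last2.length === 2 && last2[0] === last2[1] && last2[0] !== 'high') {
    return 'stale_confidence';
  }
  // all_tools_duplicate is evaluated after the AI decision, once tool_calls are known
  return null; // keep looping
}
```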
Step B: Build the Decision Prompt
The prompt is assembled as a message array with four layers. The last message is always a system message ending with "Return JSON." — this triggers the AI's structured decision.
Decision prompt layers
function buildDecisionPrompt(worker: WorkerDefinition, state: WorkerState) {
const messages = [];
// Layer 1: System prompt (expertise, tools, rules, JSON contract)
messages.push({
role: 'system',
content: worker.buildSystemPrompt({ organizationId: state.organizationId, ... })
});
// Layer 2: Pre-enriched context (Pass 1 ONLY)
if (state.preEnrichedContext && state.passCount === 1) {
messages.push({ role: 'system', content: state.preEnrichedContext });
}
// Layer 3: Chat history (all previous turns)
for (const msg of state.chatHistory) {
messages.push({ role: msg.role, content: msg.content });
}
// Layer 4: Current state (pass count, budget, tool history, living doc)
const passesLeft = state.config.maxPasses - state.passCount;
const budgetPct = Math.round((state.totalCost / state.config.costBudget) * 100);
messages.push({
role: 'system',
content: `## CURRENT STATE (Pass ${state.passCount + 1}/${state.config.maxPasses}
· ${passesLeft} passes remaining · $${state.totalCost.toFixed(4)} budget · ${budgetPct}% used)
### User Goal
${state.goal}
### Tool Results So Far
${formatToolHistory(state) || 'None yet.'}
### Living Document
${formatDocument(state) || 'Empty — no data gathered yet.'}
Decide your next action. Call multiple tools in parallel when possible.
Return JSON.`
});
return messages;
}

Step C: Get the AI Decision
The AI is called with temperature: 0.2 (low randomness — consistent, structured decisions) and must return JSON in the exact shape defined in the system prompt's JSON contract.
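Before acting on the decision, the engine presumably validates the parsed JSON against the contract. A minimal guard might look like this (isValidDecision and the Decision interface are illustrative names, not confirmed engine code):

```typescript
// Illustrative type guard for the decision JSON; the shape mirrors the
// contract this section describes, but the validator itself is an assumption.
interface Decision {
  thinking: string;
  tool_calls: { tool: string; params: Record<string, unknown> }[];
  should_respond: boolean;
  confidence: 'low' | 'medium' | 'high';
  response?: string;
  document_updates?: Record<string, string>;
}

function isValidDecision(raw: unknown): raw is Decision {
  if (typeof raw !== 'object' || raw === null) return false;
  const d = raw as Record<string, unknown>;
  return (
    Array.isArray(d.tool_calls) &&
    typeof d.should_respond === 'boolean' &&
    ['low', 'medium', 'high'].includes(d.confidence as string) &&
    // A "ready to respond" decision must carry the response text
    (!d.should_respond || typeof d.response === 'string')
  );
}
```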
AI decision JSON contract
// When the AI needs to gather more data:
{
"thinking": "I need to search for records in Region A. I also want stats.",
"tool_calls": [
{ "tool": "data.searchRecords", "params": { "region": "A", "category": "enterprise" } },
{ "tool": "data.getStats", "params": { "group_by": "category" } }
],
"should_respond": false,
"confidence": "low",
"document_updates": {
"criteria": "Region A, enterprise category",
"plan": "Search by region+category, then cross-reference, then score"
}
}
// When the AI is ready to respond:
{
"thinking": "I have a complete scored list of 50 records to present",
"tool_calls": [],
"should_respond": true,
"response": "## Summary\n- Searched: 847 records\n...",
"confidence": "high"
}

Step D: Update the Living Document
The document_updates field in the AI decision is applied to the named sections. Each update is appended with a pass label — the AI can see its own history across passes.
Living document update logic
function updateDocument(state: WorkerState, updates: Record<string, string>): void {
for (const [key, value] of Object.entries(updates)) {
if (key === 'confidence') {
// Confidence is replaced, not appended
state.document.sections[key] = value;
} else {
// All other sections are APPENDED with a pass label
const existing = state.document.sections[key] || '';
state.document.sections[key] = existing
? `${existing}\n[Pass ${state.passCount}] ${value}`
: `[Pass ${state.passCount}] ${value}`;
}
}
}
// After 3 passes, work_queue might look like:
// [Pass 2] rec-001, rec-002, rec-003 (from Region A search)
// [Pass 3] rec-004, rec-005 (additional from broader search)

Phase 3: Model Escalation
maybeEscalateModel()
Automatic switch to a stronger model when stuck
Two automatic escalation paths exist. Both switch from the cheap think model to the escalation model (e.g., Claude Sonnet instead of Gemini Flash).
Escalation path 1: Stale confidence
function maybeEscalateModel(state: WorkerState): void {
// Don't escalate if already on the escalation model
if (state.activeModel === state.config.escalationModel) return;
const history = state.confidenceHistory;
if (history.length >= 2) {
// Two consecutive "low" confidence passes → escalate
const recentLow = history.slice(-2).every((c) => c === 'low');
if (recentLow) {
state.activeModel = state.config.escalationModel;
// The engine emits a status event: "Escalating to a stronger model..."
}
}
}

Escalation path 2: No tools requested
let consecutiveNoToolPasses = 0;
// In the main loop, after getting the AI decision:
if (decision.tool_calls.length === 0 && !decision.should_respond) {
consecutiveNoToolPasses++;
// First no-tool pass: escalate model
if (state.activeModel !== state.config.escalationModel) {
state.activeModel = state.config.escalationModel;
yield { type: 'status', data: { message: 'Escalating to a stronger model...' } };
}
// Three consecutive no-tool passes even after escalation: force synthesis
if (consecutiveNoToolPasses >= 3) {
return await synthesizeResponse(worker, state);
}
continue; // Try again with the stronger model
}

Phase 4: Response Synthesis
synthesizeResponse()
Fallback: always produce a response, even if the loop exits early
When the loop exits for any reason other than should_respond: true, the engine calls the synthesis step. This uses the synthesizeModel (cheaper and faster) with all gathered data and produces a final answer. The user always gets a response — even if the loop hit its budget.
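Neither formatDocument nor formatToolHistory is shown in this part. A plausible formatDocument might render the living document's named sections as markdown; this is an assumption about the helper's behavior, inferred from how both prompts use its output:

```typescript
// Hypothetical formatDocument(): renders the living document's sections as
// markdown headings. The real helper's output format is not shown in this part.
function formatDocument(state: { document: { sections: Record<string, string> } }): string {
  return Object.entries(state.document.sections)
    .map(([name, body]) => `### ${name}\n${body}`)
    .join('\n\n');
}
```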
Synthesis fallback
async function synthesizeResponse(
worker: WorkerDefinition,
state: WorkerState
): Promise<LoopResult> {
if (state.toolResults.length === 0) {
// No data gathered at all — return a graceful error
return {
type: 'response',
message: "I wasn't able to gather enough data to complete this task. Please try again."
};
}
// Use the worker's custom synthesis prompt if defined
const synthesisSystemPrompt = worker.buildSynthesisPrompt
? worker.buildSynthesisPrompt({ organizationId: state.organizationId, ... })
: `You are ${worker.name}. Synthesize a final response using the gathered data below.`;
const messages = [
{ role: 'system', content: synthesisSystemPrompt },
...state.chatHistory,
{
role: 'system',
content: `## GATHERED DATA\n\n${formatDocument(state)}
## TOOL RESULTS\n\n${formatToolHistory(state)}
Respond to the user using the actual data above.`
},
];
// Use synthesizeModel (cheaper than thinkModel, faster than escalationModel)
const response = await chatCompletion(
messages,
state.config.synthesizeModel,
{ temperature: 0.4, maxTokens: 65536 }
);
return { type: 'response', message: response.content };
}