Part 3 of 6
The Worker Engine
The Loop That Runs Everything
The engine is a 6-phase state machine shared by every Employee. It handles state initialization, pre-enrichment, the main recursive loop, model escalation, synthesis fallback, and response streaming. No Employee has its own loop code — they all run through this.
The 6-Phase State Machine
From the moment a user message arrives to the moment a response is emitted, the engine progresses through six phases. The main loop (Phase 2) is the recursive core — it runs until an exit condition fires or the AI decides it's ready to respond.
State Initialization
Create the WorkerState object — the engine's memory for this entire session
Pre-Enrichment
Run context-loading tools before the loop starts (parallel, with timeout)
Main Loop
while(true) — build prompt → AI decision → update document → execute tools → repeat
Model Escalation
Automatic switch to a stronger model when the AI is stuck
Response Synthesis
Fallback: use all gathered data to produce a final answer if loop exits early
Stream & Return
Emit the final response event and close the generator
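Put together, the six phases can be sketched as one async generator. This is a condensed illustration only: the phase bodies are stubs, and runWorkerEngine and its event shape are assumptions, not the engine's real code.

```typescript
// Condensed sketch of how the six phases compose. Phase bodies are stubs;
// runWorkerEngine and WorkerEvent are illustrative names.
type WorkerEvent = { type: 'status' | 'response'; message: string };

async function* runWorkerEngine(goal: string): AsyncGenerator<WorkerEvent> {
  // Phase 0: state initialization
  const state = { goal, passCount: 0, maxPasses: 3, response: '' };

  // Phase 1: pre-enrichment (runs once, before the loop)
  yield { type: 'status', message: 'Loading context...' };

  // Phase 2: main loop; exit conditions are checked at the top of each pass
  while (true) {
    if (state.passCount >= state.maxPasses) break; // exit -> synthesis fallback
    state.passCount++;
    // ... build prompt, get AI decision, update document, execute tools ...
    // (Phase 3, model escalation, would also be evaluated here)
  }

  // Phase 4: synthesis fallback; the user always gets a response
  if (!state.response) state.response = `Synthesized answer for: ${goal}`;

  // Phase 5: stream & return
  yield { type: 'response', message: state.response };
}
```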
Phase 0: State Initialization
createWorkerState()
Build the engine's memory object
Before anything runs, the engine creates a WorkerState object that holds all mutable state for this session. This object is passed through every phase and updated in place.
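A createWorkerState() factory might initialize the object roughly as follows. This is a sketch: the input shape and the zeroed defaults are assumptions, though the field names follow the WorkerState interface.

```typescript
// Illustrative createWorkerState() factory; defaults are assumptions.
function createWorkerState(input: {
  workerId: string;
  goal: string;
  chatHistory: { role: string; content: string }[];
  organizationId: string; // from the session, never from the request body
  userId: string;
  config: { maxPasses: number; costBudget: number; tokenBudget: number; thinkModel: string };
}) {
  return {
    ...input,
    document: { sections: {} as Record<string, string> }, // living document starts empty
    toolResults: [] as unknown[],
    executedToolSignatures: new Set<string>(), // deduplication set
    passCount: 0,
    totalCost: 0,
    tokensUsed: 0,
    pendingApprovals: [] as unknown[],
    confidenceHistory: [] as string[],
    activeModel: input.config.thinkModel, // may escalate in Phase 3
  };
}
```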
WorkerState shape
interface WorkerState {
workerId: string;
goal: string; // The user's current message
chatHistory: ChatMessage[]; // All previous turns in this conversation
organizationId: string; // From the session — never from request body
userId: string; // From the session
// Living document — the AI's working memory
document: {
sections: Record<string, string>; // Named sections, accumulated across passes
};
// Tool execution history
toolResults: ToolResultRecord[];
executedToolSignatures: Set<string>; // Deduplication set
// Loop control
passCount: number;
totalCost: number;
tokensUsed: number;
config: LoopConfig; // Merged global + worker overrides
// Approval queue
pendingApprovals: PendingApproval[];
// Escalation tracking
confidenceHistory: string[];
activeModel: string; // Starts as thinkModel, may escalate
// Pre-enrichment results (injected into Pass 1 only)
preEnrichedContext?: string;
}

Phase 1: Pre-Enrichment
runWorkerPreEnrichment()
Load context before the loop starts
A lightweight AI call (with an 8-second timeout) selects which tools to run before the main loop. All selected tools run in parallel. Results are stored in state.preEnrichedContext and injected into Pass 1 only.
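The parseJsonWithRecovery helper used below is not shown in this part. A minimal version might look like this; treat it as an assumption, since the source only implies "parse, and recover a JSON payload from a wrapped reply":

```typescript
// Hypothetical sketch of parseJsonWithRecovery; the real helper's recovery
// logic beyond handling fenced or prose-wrapped JSON is not shown here.
function parseJsonWithRecovery<T>(raw: string): T | null {
  try {
    return JSON.parse(raw) as T;
  } catch {
    // Recover a JSON object/array embedded in a fenced block or leading prose
    const match = raw.match(/\{[\s\S]*\}|\[[\s\S]*\]/);
    if (!match) return null;
    try {
      return JSON.parse(match[0]) as T;
    } catch {
      return null;
    }
  }
}
```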
Pre-enrichment flow
async function runWorkerPreEnrichment(
worker: WorkerDefinition,
goal: string,
organizationId: string,
userId: string
): Promise<PreEnrichmentResult> {
// Step 1: Ask a small AI model which tools to run
const selectionPrompt = worker.preEnrichment.buildPrompt(goal);
const selection = await chatCompletion(
[{ role: 'user', content: selectionPrompt }],
'google/gemini-3-flash-preview',
{ maxTokens: 512, timeout: 8000 } // Hard 8-second timeout
);
// Step 2: Parse the AI's tool selection
let toolsToRun = parseJsonWithRecovery<ToolSelection[]>(selection.content);
// Step 3: Fall back to defaults if AI selection fails or times out
if (!toolsToRun || toolsToRun.length === 0) {
toolsToRun = worker.preEnrichment.defaultTools.map(t => ({
tool: t.tool,
params: t.params,
reason: 'default fallback'
}));
}
// Step 4: Run all selected tools in PARALLEL (12-second timeout each)
const results = await Promise.allSettled(
toolsToRun.slice(0, worker.preEnrichment.maxTools).map(async (t) => {
// Race each tool against a 12-second timeout so one slow tool can't stall pre-enrichment
const result = await Promise.race([
executeToolByName(t.tool, t.params, { organizationId, userId }),
new Promise<never>((_, reject) =>
setTimeout(() => reject(new Error(`${t.tool} timed out`)), 12_000)
)
]);
return { tool: t.tool, reason: t.reason, result };
})
);
// Step 5: Assemble results into a markdown context block
const contextMarkdown = assemblePreEnrichmentContext(results);
return { contextMarkdown, toolsRun: toolsToRun };
}

Phase 2: The Main Loop
while(true)
The recursive core — runs until exit conditions fire
Step A: Check Exit Conditions
At the start of every pass, the engine checks whether it should stop. Exit conditions are checked in priority order:
| Condition | Reason Code | What Happens |
|---|---|---|
| passCount >= maxPasses | max_passes | Loop exits → synthesis fallback |
| tokensUsed >= tokenBudget | token_budget | Loop exits → synthesis fallback |
| totalCost >= costBudget | budget_exceeded | Loop exits → synthesis fallback |
| pendingApprovals.length > 0 | approval_needed | Loop pauses → approval card shown |
| stale confidence (2x same non-high) | stale_confidence | Loop exits → synthesis fallback |
| all tool requests are duplicates | all_tools_duplicate | Loop exits → synthesis fallback |
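The priority order in the table can be sketched as a single function. checkExitConditions and its exact signature are illustrative; the all_tools_duplicate condition needs the AI decision's tool calls, so it is omitted from this sketch.

```typescript
// Priority-ordered exit check (illustrative). Field names follow WorkerState.
type ExitReason =
  | 'max_passes' | 'token_budget' | 'budget_exceeded'
  | 'approval_needed' | 'stale_confidence' | null;

function checkExitConditions(s: {
  passCount: number; tokensUsed: number; totalCost: number;
  pendingApprovals: unknown[]; confidenceHistory: string[];
  config: { maxPasses: number; tokenBudget: number; costBudget: number };
}): ExitReason {
  if (s.passCount >= s.config.maxPasses) return 'max_passes';
  if (s.tokensUsed >= s.config.tokenBudget) return 'token_budget';
  if (s.totalCost >= s.config.costBudget) return 'budget_exceeded';
  if (s.pendingApprovals.length > 0) return 'approval_needed';
  // Stale confidence: same non-high confidence on two consecutive passes
  const last2 = s.confidenceHistory.slice(-2);
  if (last2.length === 2 && last2[0] === last2[1] && last2[0] !== 'high') {
    return 'stale_confidence';
  }
  // all_tools_duplicate is evaluated after the AI decision, once tool_calls are known
  return null; // keep looping
}
```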
Step B: Build the Decision Prompt
The prompt is assembled as a message array with four layers. The last message is always a system message ending with "Return JSON." — this triggers the AI's structured decision.
Decision prompt layers
function buildDecisionPrompt(worker: WorkerDefinition, state: WorkerState) {
const messages = [];
// Layer 1: System prompt (expertise, tools, rules, JSON contract)
messages.push({
role: 'system',
content: worker.buildSystemPrompt({ organizationId: state.organizationId, ... })
});
// Layer 2: Pre-enriched context (Pass 1 ONLY)
if (state.preEnrichedContext && state.passCount === 1) {
messages.push({ role: 'system', content: state.preEnrichedContext });
}
// Layer 3: Chat history (all previous turns)
for (const msg of state.chatHistory) {
messages.push({ role: msg.role, content: msg.content });
}
// Layer 4: Current state (pass count, budget, tool history, living doc)
const passesLeft = state.config.maxPasses - state.passCount;
const budgetPct = Math.round((state.totalCost / state.config.costBudget) * 100);
messages.push({
role: 'system',
content: `## CURRENT STATE (Pass ${state.passCount + 1}/${state.config.maxPasses}
· ${passesLeft} passes remaining · $${state.totalCost.toFixed(4)} budget · ${budgetPct}% used)
### User Goal
${state.goal}
### Tool Results So Far
${formatToolHistory(state) || 'None yet.'}
### Living Document
${formatDocument(state) || 'Empty — no data gathered yet.'}
Decide your next action. Call multiple tools in parallel when possible.
Return JSON.`
});
return messages;
}

Step C: Get the AI Decision
The AI is called with temperature: 0.2 (low randomness — consistent, structured decisions) and must return JSON in the exact shape defined in the system prompt's JSON contract.
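Before acting on the decision, the engine presumably validates the parsed JSON against the contract. A minimal guard might look like this (isValidDecision and the Decision interface are illustrative names, not confirmed engine code):

```typescript
// Illustrative type guard for the decision JSON; the shape mirrors the
// contract this section describes, but the validator itself is an assumption.
interface Decision {
  thinking: string;
  tool_calls: { tool: string; params: Record<string, unknown> }[];
  should_respond: boolean;
  confidence: 'low' | 'medium' | 'high';
  response?: string;
  document_updates?: Record<string, string>;
}

function isValidDecision(raw: unknown): raw is Decision {
  if (typeof raw !== 'object' || raw === null) return false;
  const d = raw as Record<string, unknown>;
  return (
    Array.isArray(d.tool_calls) &&
    typeof d.should_respond === 'boolean' &&
    ['low', 'medium', 'high'].includes(d.confidence as string) &&
    // A "ready to respond" decision must carry the response text
    (!d.should_respond || typeof d.response === 'string')
  );
}
```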
AI decision JSON contract
// When the AI needs to gather more data:
{
"thinking": "I need to search for records in Region A. I also want stats.",
"tool_calls": [
{ "tool": "data.searchRecords", "params": { "region": "A", "category": "enterprise" } },
{ "tool": "data.getStats", "params": { "group_by": "category" } }
],
"should_respond": false,
"confidence": "low",
"document_updates": {
"criteria": "Region A, enterprise category",
"plan": "Search by region+category, then cross-reference, then score"
}
}
// When the AI is ready to respond:
{
"thinking": "I have a complete scored list of 50 records to present",
"tool_calls": [],
"should_respond": true,
"response": "## Summary\n- Searched: 847 records\n...",
"confidence": "high"
}

Step D: Update the Living Document
The document_updates field in the AI decision is applied to the named sections. Each update is appended with a pass label — the AI can see its own history across passes.
Living document update logic
function updateDocument(state: WorkerState, updates: Record<string, string>): void {
for (const [key, value] of Object.entries(updates)) {
if (key === 'confidence') {
// Confidence is replaced, not appended
state.document.sections[key] = value;
} else {
// All other sections are APPENDED with a pass label
const existing = state.document.sections[key] || '';
state.document.sections[key] = existing
? `${existing}\n[Pass ${state.passCount}] ${value}`
: `[Pass ${state.passCount}] ${value}`;
}
}
}
// After 3 passes, work_queue might look like:
// [Pass 2] rec-001, rec-002, rec-003 (from Region A search)
// [Pass 3] rec-004, rec-005 (additional from broader search)

Phase 3: Model Escalation
maybeEscalateModel()
Automatic switch to a stronger model when stuck
Two automatic escalation paths exist. Both switch from the cheap think model to the escalation model (e.g., Claude Sonnet instead of Gemini Flash).
Escalation path 1: Stale confidence
function maybeEscalateModel(state: WorkerState): void {
// Don't escalate if already on the escalation model
if (state.activeModel === state.config.escalationModel) return;
const history = state.confidenceHistory;
if (history.length >= 2) {
// Two consecutive "low" confidence passes → escalate
const recentLow = history.slice(-2).every((c) => c === 'low');
if (recentLow) {
state.activeModel = state.config.escalationModel;
// The engine emits a status event: "Escalating to a stronger model..."
}
}
}

Escalation path 2: No tools requested
let consecutiveNoToolPasses = 0;
// In the main loop, after getting the AI decision:
if (decision.tool_calls.length === 0 && !decision.should_respond) {
consecutiveNoToolPasses++;
// First no-tool pass: escalate model
if (state.activeModel !== state.config.escalationModel) {
state.activeModel = state.config.escalationModel;
yield { type: 'status', data: { message: 'Escalating to a stronger model...' } };
}
// Three consecutive no-tool passes even after escalation: force synthesis
if (consecutiveNoToolPasses >= 3) {
return await synthesizeResponse(worker, state);
}
continue; // Try again with the stronger model
}

Phase 4: Response Synthesis
synthesizeResponse()
Fallback: always produce a response, even if the loop exits early
When the loop exits for any reason other than should_respond: true, the engine calls the synthesis step. This uses the synthesizeModel (cheaper and faster) with all gathered data and produces a final answer. The user always gets a response — even if the loop hit its budget.
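Neither formatDocument nor formatToolHistory is shown in this part. A plausible formatDocument might render the living document's named sections as markdown; this is an assumption about the helper's behavior, inferred from how both prompts use its output:

```typescript
// Hypothetical formatDocument(): renders the living document's sections as
// markdown headings. The real helper's output format is not shown in this part.
function formatDocument(state: { document: { sections: Record<string, string> } }): string {
  return Object.entries(state.document.sections)
    .map(([name, body]) => `### ${name}\n${body}`)
    .join('\n\n');
}
```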
Synthesis fallback
async function synthesizeResponse(
worker: WorkerDefinition,
state: WorkerState
): Promise<LoopResult> {
if (state.toolResults.length === 0) {
// No data gathered at all — return a graceful error
return {
type: 'response',
message: "I wasn't able to gather enough data to complete this task. Please try again."
};
}
// Use the worker's custom synthesis prompt if defined
const synthesisSystemPrompt = worker.buildSynthesisPrompt
? worker.buildSynthesisPrompt({ organizationId: state.organizationId, ... })
: `You are ${worker.name}. Synthesize a final response using the gathered data below.`;
const messages = [
{ role: 'system', content: synthesisSystemPrompt },
...state.chatHistory,
{
role: 'system',
content: `## GATHERED DATA\n\n${formatDocument(state)}
## TOOL RESULTS\n\n${formatToolHistory(state)}
Respond to the user using the actual data above.`
},
];
// Use synthesizeModel (cheaper than thinkModel, faster than escalationModel)
const response = await chatCompletion(
messages,
state.config.synthesizeModel,
{ temperature: 0.4, maxTokens: 65536 }
);
return { type: 'response', message: response.content };
}