---
title: "Bringing the Agent Loop to the Web"
description: "Demystify AI agents by exploring the case for moving the loop to the browser. Learn how client-side orchestration enables tight UI integration and data control. "
slug: "bringing-the-agent-loop-to-the-web"
created: 2026-04-23T15:07:00Z
updated: 2026-05-06T16:19:30.731Z
tags:
  - "ai"
  - "agents"
  - "browser"
  - "web"
  - "development"
  - "typescript"
  - "architecture"
  - "llm"
  - "javascript"
hero_image: "/images/demystifying-ai-agents-the-loop-in-the-hero-1.jpg"
ai_assisted: true
---

![](/images/demystifying-ai-agents-the-loop-in-the-hero-1.jpg)


In [Demystifying AI Agents: Learning the Mechanics with Rust](https://bandarra.me/posts/demystifying-ai-agents), we saw that an AI agent is just a `while` loop wrapping a stateless LLM. It asks the model to act, runs a tool, updates history, and repeats. No magic, just plumbing.

## Introduction

My PM colleague mentioned that most of his day-to-day work had moved from the browser into an IDE, with AI supporting a lot of it. I noticed the same thing when writing for this blog. I'd built a custom admin interface for a better writing experience, but once I started using AI agents to help with posts, I found myself moving back to an IDE.

That got me thinking. IDEs have deep integration with the device. They can read files, run shell commands, and know what you're looking at. Web agents don't have any of that today. But for something like writing a blog post, I don't see a fundamental reason why the web can't offer something comparable. The gap isn't capability; it's where the agent loop runs.

Most agent frameworks assume agents belong on the server, treating the browser as a "dumb terminal" for sending prompts and displaying text. The result is often a chat panel bolted onto a product, not built into it. It can answer questions, but it can't interact with the application or react to what's on screen.

If the agent's loop runs on a remote server, it lacks awareness of the user's browser environment. Reading the text the user selected, checking local storage, or reading UI state requires cumbersome piping through websockets or polling, fighting the environment rather than using what's already there.

What if the browser *is* the agent? Consider a [browser-based text editor agent](https://bandarra.me/apps/agent-text-editor/). It reads highlighted text, renders surgical edits as diffs, and pauses for user approval.

![](/images/agent-text-editor.png)

That kind of integration is far more natural when the loop runs directly in the client. The tool executes in the browser and the loop pauses until the user responds.

<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 750 430" width="100%" height="430">
  <defs>
    <filter id="shadow-seq" x="-5%" y="-5%" width="110%" height="110%">
      <feDropShadow dx="0" dy="2" stdDeviation="3" flood-opacity="0.1" />
    </filter>
    <linearGradient id="bg-seq" x1="0" y1="0" x2="0" y2="1">
      <stop offset="0%" stop-color="#f8fafc" />
      <stop offset="100%" stop-color="#f1f5f9" />
    </linearGradient>
    <marker id="arrow-seq" viewBox="0 0 10 10" refX="8" refY="5" markerWidth="6" markerHeight="6" orient="auto">
      <path d="M 0 0 L 10 5 L 0 10 z" fill="#94a3b8" />
    </marker>
    <marker id="arrow-seq-green" viewBox="0 0 10 10" refX="8" refY="5" markerWidth="6" markerHeight="6" orient="auto">
      <path d="M 0 0 L 10 5 L 0 10 z" fill="#10b981" />
    </marker>
  </defs>

  <!-- Background -->
  <rect x="0" y="0" width="750" height="430" rx="12" fill="url(#bg-seq)" stroke="#cbd5e1" stroke-width="2" />

  <!-- Column headers -->
  <rect x="20" y="15" width="160" height="58" rx="8" fill="#ffffff" stroke="#94a3b8" stroke-width="2" filter="url(#shadow-seq)" />
  <text x="100" y="42" font-family="system-ui, -apple-system, sans-serif" font-size="13" font-weight="bold" fill="#475569" text-anchor="middle">UI / Application</text>
  <text x="100" y="60" font-family="system-ui, -apple-system, sans-serif" font-size="11" fill="#94a3b8" text-anchor="middle">in the browser</text>

  <rect x="285" y="15" width="180" height="58" rx="8" fill="#ffffff" stroke="#1a73e8" stroke-width="2" filter="url(#shadow-seq)" />
  <text x="375" y="42" font-family="system-ui, -apple-system, sans-serif" font-size="13" font-weight="bold" fill="#1e293b" text-anchor="middle">Agent Loop</text>
  <text x="375" y="60" font-family="system-ui, -apple-system, sans-serif" font-size="11" fill="#94a3b8" text-anchor="middle">in the browser</text>

  <rect x="555" y="15" width="160" height="58" rx="8" fill="#ffffff" stroke="#8b5cf6" stroke-width="2" filter="url(#shadow-seq)" />
  <text x="635" y="42" font-family="system-ui, -apple-system, sans-serif" font-size="13" font-weight="bold" fill="#1e293b" text-anchor="middle">Cloud LLM</text>
  <text x="635" y="60" font-family="system-ui, -apple-system, sans-serif" font-size="11" fill="#94a3b8" text-anchor="middle">remote</text>

  <!-- Lifelines -->
  <line x1="100" y1="73" x2="100" y2="415" stroke="#cbd5e1" stroke-width="1.5" stroke-dasharray="4,4" />
  <line x1="375" y1="73" x2="375" y2="415" stroke="#cbd5e1" stroke-width="1.5" stroke-dasharray="4,4" />
  <line x1="635" y1="73" x2="635" y2="415" stroke="#cbd5e1" stroke-width="1.5" stroke-dasharray="4,4" />

  <!-- Step 1: UI → Agent (reads selection, direct) -->
  <path d="M 163 110 L 278 110" stroke="#10b981" stroke-width="2" fill="none" marker-end="url(#arrow-seq-green)" />
  <text x="220" y="102" font-family="system-ui, -apple-system, sans-serif" font-size="11" font-weight="bold" fill="#059669" text-anchor="middle">1. reads selection</text>
  <text x="220" y="124" font-family="system-ui, -apple-system, sans-serif" font-size="10" fill="#059669" text-anchor="middle">direct — no network</text>

  <!-- Step 2: Agent → LLM (history) -->
  <path d="M 465 155 L 548 155" stroke="#94a3b8" stroke-width="2" stroke-dasharray="5,4" fill="none" marker-end="url(#arrow-seq)" />
  <text x="506" y="147" font-family="system-ui, -apple-system, sans-serif" font-size="11" fill="#64748b" text-anchor="middle">2. send history</text>

  <!-- Step 3: LLM → Agent (tool call instruction) -->
  <path d="M 548 188 L 465 188" stroke="#94a3b8" stroke-width="2" stroke-dasharray="5,4" fill="none" marker-end="url(#arrow-seq)" />
  <text x="506" y="180" font-family="system-ui, -apple-system, sans-serif" font-size="11" fill="#64748b" text-anchor="middle">3. call render_diff tool</text>

  <!-- Step 4: Agent executes tool → UI (direct) -->
  <path d="M 278 230 L 163 230" stroke="#10b981" stroke-width="2" fill="none" marker-end="url(#arrow-seq-green)" />
  <text x="220" y="222" font-family="system-ui, -apple-system, sans-serif" font-size="11" font-weight="bold" fill="#059669" text-anchor="middle">4. execute tool → renders diff</text>
  <text x="220" y="244" font-family="system-ui, -apple-system, sans-serif" font-size="10" fill="#059669" text-anchor="middle">direct — no network</text>

  <!-- Pause box on Agent lifeline -->
  <rect x="330" y="258" width="90" height="32" rx="6" fill="#ffffff" stroke="#f97316" stroke-width="2" filter="url(#shadow-seq)" />
  <text x="375" y="272" font-family="system-ui, -apple-system, sans-serif" font-size="11" font-weight="bold" fill="#ea580c" text-anchor="middle">5. paused</text>
  <text x="375" y="285" font-family="system-ui, -apple-system, sans-serif" font-size="10" fill="#ea580c" text-anchor="middle">awaiting input</text>

  <!-- Step 6: UI → Agent (user accepts/rejects) -->
  <path d="M 163 325 L 322 325" stroke="#94a3b8" stroke-width="2" fill="none" marker-end="url(#arrow-seq)" />
  <text x="220" y="317" font-family="system-ui, -apple-system, sans-serif" font-size="11" fill="#64748b" text-anchor="middle">6. accept / reject</text>
</svg>
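
The pause in step 5 can be sketched as a promise that the tool awaits and the UI settles. This is a minimal illustration, not the text editor's actual code: `createApprovalGate`, `proposeEdit`, and the `showDiff` callback are hypothetical names.

```typescript
// Hypothetical names throughout; this only illustrates the pause/resume idea.
type Decision = "accept" | "reject";

interface ApprovalGate {
  // Called by the tool: the promise stays pending until the user decides.
  waitForDecision(): Promise<Decision>;
  // Called by the UI, for example from Accept / Reject click handlers.
  resolve(decision: Decision): void;
}

function createApprovalGate(): ApprovalGate {
  let pending: ((decision: Decision) => void) | null = null;
  return {
    waitForDecision() {
      return new Promise<Decision>((resolve) => {
        pending = resolve;
      });
    },
    resolve(decision) {
      if (pending) {
        pending(decision);
        pending = null;
      }
    },
  };
}

// A tool that shows a diff, then suspends until the user responds.
// `showDiff` stands in for whatever renders the diff in the UI.
async function proposeEdit(
  gate: ApprovalGate,
  showDiff: (before: string, after: string) => void,
  before: string,
  after: string,
): Promise<string> {
  showDiff(before, after);
  const decision = await gate.waitForDecision();
  return decision === "accept" ? "User accepted the edit." : "User rejected the edit.";
}

// In the UI, the buttons settle the gate (acceptButton is hypothetical):
//   acceptButton.onclick = () => gate.resolve("accept");
```

Because the agent loop simply awaits the tool, it stays suspended for as long as the user takes to decide, with no polling and no server round trip.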

## The case for client-side agents

Server-bound agents have two related limitations: they can't see client-side state, and synchronizing UI changes triggered by tool calls is cumbersome.

A server-side agent has to wait for the client to send it whatever it needs from the page. To interact with the app, it has to predict an action, send it to the client, and wait for a callback to run the JavaScript.

A client-side agent lives inside the application. It reads client-side state directly: it can check the value of a React state hook, inspect local storage, update the interface, or prompt the user, all without a network round trip.
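
As a rough sketch of what "reads the state directly" can look like, the tools below read the current selection and a stored setting. The tool names and the `ClientEnv` wrapper are assumptions made for illustration; only `window.getSelection()` and `localStorage`, shown in the trailing comment, are real browser APIs.

```typescript
// Hypothetical tool shape and names. ClientEnv exists only so the sketch
// can run outside a browser; in a page it is backed by real browser APIs.
interface ClientEnv {
  getSelectionText(): string;
  storage: { getItem(key: string): string | null };
}

type LocalTool = (args: Record<string, string | undefined>) => Promise<string>;

function createLocalTools(env: ClientEnv): Record<string, LocalTool> {
  return {
    // Reads whatever the user has highlighted right now.
    get_selection: async () => env.getSelectionText() || "(nothing selected)",
    // Looks up a value the app has already persisted in the browser.
    read_setting: async (args) => env.storage.getItem(args["key"] ?? "") ?? "(not set)",
  };
}

// In a page, the environment is the browser itself:
//   createLocalTools({
//     getSelectionText: () => window.getSelection()?.toString() ?? "",
//     storage: localStorage,
//   });
```

Each tool is an ordinary function call inside the page, so the agent sees the same state the user sees at the moment the tool runs.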

> [!NOTE]
> CLI agent tools like [Claude Code](https://claude.ai/code) and [Gemini CLI](https://geminicli.com/) already use this pattern. The loop runs in a local process, tools touch the file system and shell, and the LLM is still a stateless remote endpoint. The browser is the same idea in a different runtime, with different local resources: the DOM, browser storage, and the user's active sessions.

## The browser as orchestrator

The architecture is simpler than it sounds. Move the loop to the browser, and the browser becomes the orchestrator.

### Hybrid orchestration map

<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 800 450" width="100%" height="450">
  <defs>
    <filter id="shadow-hybrid" x="-5%" y="-5%" width="110%" height="110%">
      <feDropShadow dx="0" dy="2" stdDeviation="3" flood-opacity="0.1" />
    </filter>
    <linearGradient id="bg-hybrid" x1="0" y1="0" x2="0" y2="1">
      <stop offset="0%" stop-color="#f8fafc" />
      <stop offset="100%" stop-color="#f1f5f9" />
    </linearGradient>
    <marker id="arrow-hybrid" viewBox="0 0 10 10" refX="8" refY="5" markerWidth="6" markerHeight="6" orient="auto">
      <path d="M 0 0 L 10 5 L 0 10 z" fill="#94a3b8" />
    </marker>
  </defs>

  <!-- Background -->
  <rect x="0" y="0" width="800" height="450" rx="12" fill="url(#bg-hybrid)" stroke="#cbd5e1" stroke-width="2" />

  <!-- Browser Environment Box -->
  <rect x="20" y="40" width="360" height="385" rx="8" fill="none" stroke="#cbd5e1" stroke-width="1.5" stroke-dasharray="6,4" />
  <text x="200" y="65" font-family="system-ui, -apple-system, sans-serif" font-size="14" font-weight="bold" fill="#94a3b8" text-anchor="middle">Browser Environment</text>

  <!-- Server Environment Box -->
  <rect x="420" y="40" width="360" height="385" rx="8" fill="none" stroke="#cbd5e1" stroke-width="1.5" stroke-dasharray="6,4" />
  <text x="600" y="65" font-family="system-ui, -apple-system, sans-serif" font-size="14" font-weight="bold" fill="#94a3b8" text-anchor="middle">Server Environment</text>

  <!-- Hub: Browser Agent Loop -->
  <rect x="100" y="130" width="200" height="80" rx="8" fill="#ffffff" stroke="#1a73e8" stroke-width="2" filter="url(#shadow-hybrid)" />
  <text x="200" y="165" font-family="system-ui, -apple-system, sans-serif" font-size="14" font-weight="bold" fill="#1e293b" text-anchor="middle">Agent Loop</text>
  <text x="200" y="185" font-family="system-ui, -apple-system, sans-serif" font-size="12" fill="#64748b" text-anchor="middle">(Orchestrator &amp; State)</text>

  <!-- Local Tools -->
  <rect x="100" y="330" width="200" height="60" rx="8" fill="#ffffff" stroke="#10b981" stroke-width="2" filter="url(#shadow-hybrid)" />
  <text x="200" y="360" font-family="system-ui, -apple-system, sans-serif" font-size="14" font-weight="bold" fill="#1e293b" text-anchor="middle">Local Tools</text>
  <text x="200" y="378" font-family="system-ui, -apple-system, sans-serif" font-size="12" fill="#64748b" text-anchor="middle">(DOM, Local Storage)</text>

  <!-- Remote Brain -->
  <rect x="500" y="100" width="200" height="60" rx="8" fill="#ffffff" stroke="#8b5cf6" stroke-width="2" filter="url(#shadow-hybrid)" />
  <text x="600" y="130" font-family="system-ui, -apple-system, sans-serif" font-size="14" font-weight="bold" fill="#1e293b" text-anchor="middle">Remote Brain</text>
  <text x="600" y="148" font-family="system-ui, -apple-system, sans-serif" font-size="12" fill="#64748b" text-anchor="middle">(Cloud LLM + Prompt)</text>

  <!-- Server Tools -->
  <rect x="500" y="330" width="200" height="60" rx="8" fill="#ffffff" stroke="#ef4444" stroke-width="2" filter="url(#shadow-hybrid)" />
  <text x="600" y="360" font-family="system-ui, -apple-system, sans-serif" font-size="14" font-weight="bold" fill="#1e293b" text-anchor="middle">Server Tools</text>
  <text x="600" y="378" font-family="system-ui, -apple-system, sans-serif" font-size="12" fill="#64748b" text-anchor="middle">(API, DB, Compute)</text>

  <!-- Connections -->
  <g stroke="#94a3b8" stroke-width="2" fill="none" marker-end="url(#arrow-hybrid)">
    <!-- Hub to Remote -->
    <path d="M 300 155 L 490 125" />
    <!-- Remote to Hub -->
    <path d="M 500 145 L 310 175" />
    <!-- Hub to Local -->
    <path d="M 200 210 L 200 320" />
    <!-- Hub to Server Tools -->
    <path d="M 285 210 L 500 345" />
  </g>

  <!-- Labels -->
  <g font-family="system-ui, -apple-system, sans-serif" font-size="12" fill="#64748b" text-anchor="middle">
    <text x="390" y="128">1. History</text>
    <text x="415" y="178">2. Decision</text>
    <text x="158" y="272">3a. Execute</text>
    <text x="385" y="295">3b. Delegate</text>
  </g>
</svg>

### Shifting the source of truth

In a traditional server-centric agent, the backend runs everything. It holds the conversation history, calls the LLM in the loop, and executes the tool calls. The frontend is just a display layer, and deferring tool calls or sub-agents to the client-side is architecturally complex.

When the loop runs on the client side, the web application owns the conversation state and invokes the LLM, which can still live in the cloud, on each iteration. Tool calls can be handled on the client or, when required, deferred to the server through REST API calls. Similarly, sub-agents can live on either side.
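
One way to picture this split is a tool registry where some entries run locally and others forward to the server. The sketch below is an assumption-laden illustration: the `/api/tools/` path, the `HttpPost` signature, and the function names are all hypothetical.

```typescript
// Hypothetical sketch: names, the /api/tools/ path, and HttpPost are
// assumptions made for illustration.
type ToolHandler = (args: unknown) => Promise<string>;

// Minimal stand-in for an HTTP client so the sketch is not tied to a runtime.
type HttpPost = (url: string, body: string) => Promise<string>;

// Wraps a server endpoint so it looks like any other tool to the loop.
function remoteTool(post: HttpPost, name: string): ToolHandler {
  return (args) => post(`/api/tools/${name}`, JSON.stringify(args));
}

function createExecuteTool(tools: Map<string, ToolHandler>) {
  return async (name: string, args: unknown): Promise<string> => {
    const handler = tools.get(name);
    // Report unknown tools back to the model instead of throwing,
    // so it has a chance to recover on the next turn.
    if (!handler) return `Error: unknown tool "${name}"`;
    return handler(args);
  };
}

// In the browser, `post` could be a thin wrapper over fetch:
//   const post: HttpPost = async (url, body) =>
//     (await fetch(url, {
//       method: "POST",
//       headers: { "Content-Type": "application/json" },
//       body,
//     })).text();
```

The loop never needs to know which side a tool runs on; it calls `executeTool` and gets a string back either way.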

### Protecting your system prompts

A concern I often hear from developers is how to protect their prompts when the loop runs on the client. Because a client-side agent loop can still call a cloud LLM, the "secret sauce" (the application's system prompts) can be stored on the server and injected into the request there, so it never ships in the client bundle.
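
A server endpoint for this could look roughly like the sketch below. It is a minimal illustration under stated assumptions: the message shape, `SYSTEM_PROMPT`, and the `callModel` parameter are hypothetical, and a real endpoint would also authenticate the caller.

```typescript
// Hypothetical server-side sketch; the message shape, SYSTEM_PROMPT, and
// callModel are assumptions, not a specific provider's API.
type Role = "system" | "user" | "assistant" | "tool";

interface ChatMessage {
  role: Role;
  content: string;
}

// Lives only on the server and is never sent to the browser.
const SYSTEM_PROMPT = "You are a careful writing assistant.";

function withSystemPrompt(clientHistory: ChatMessage[]): ChatMessage[] {
  // Drop any system messages sent by the client so it cannot override ours.
  const safe = clientHistory.filter((message) => message.role !== "system");
  return [{ role: "system", content: SYSTEM_PROMPT }, ...safe];
}

// The browser posts its history; the server adds the prompt and forwards it.
async function handleGenerate(
  clientHistory: ChatMessage[],
  callModel: (messages: ChatMessage[]) => Promise<string>,
): Promise<string> {
  return callModel(withSystemPrompt(clientHistory));
}
```

The browser still owns the history and drives the loop; the server only decorates each request on its way to the model.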

## Choosing your architecture

To recap, the fundamental difference is where the agent loop (the orchestrator) lives. That doesn't mean that **all** agents need to run on the client side. If an agent primarily interacts with backend systems and requires no integration with the user interface other than displaying results, a server-side loop might be a great choice, as it also lets the same agent run across other surfaces.

But if you want your agent to integrate tightly with the user interface (pulling client-side data, showing confirmation dialogs for tool calls, reading and updating UI state), a client-side agent will likely give you more flexibility.

## Building the loop in TypeScript

The architecture sounds sophisticated. The code is not.

The loop has four steps:

<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 700 340" width="100%" height="340">
  <defs>
    <filter id="shadow-loop" x="-5%" y="-5%" width="110%" height="110%">
      <feDropShadow dx="0" dy="2" stdDeviation="3" flood-opacity="0.1" />
    </filter>
    <linearGradient id="bg-loop" x1="0" y1="0" x2="0" y2="1">
      <stop offset="0%" stop-color="#f8fafc" />
      <stop offset="100%" stop-color="#f1f5f9" />
    </linearGradient>
    <marker id="arrow-loop" viewBox="0 0 10 10" refX="8" refY="5" markerWidth="6" markerHeight="6" orient="auto">
      <path d="M 0 0 L 10 5 L 0 10 z" fill="#94a3b8" />
    </marker>
  </defs>

  <!-- Background -->
  <rect x="0" y="0" width="700" height="340" rx="12" fill="url(#bg-loop)" stroke="#cbd5e1" stroke-width="2" />

  <!-- 1. History -->
  <rect x="260" y="20" width="180" height="55" rx="8" fill="#ffffff" stroke="#94a3b8" stroke-width="2" filter="url(#shadow-loop)" />
  <text x="350" y="48" font-family="system-ui, -apple-system, sans-serif" font-size="14" font-weight="bold" fill="#475569" text-anchor="middle">1. History</text>
  <text x="350" y="66" font-family="monospace" font-size="12" fill="#64748b" text-anchor="middle">Message[]</text>

  <!-- 2. Generate -->
  <rect x="260" y="115" width="180" height="55" rx="8" fill="#ffffff" stroke="#8b5cf6" stroke-width="2" filter="url(#shadow-loop)" />
  <text x="350" y="143" font-family="system-ui, -apple-system, sans-serif" font-size="14" font-weight="bold" fill="#1e293b" text-anchor="middle">2. Generate</text>
  <text x="350" y="161" font-family="system-ui, -apple-system, sans-serif" font-size="12" fill="#64748b" text-anchor="middle">"What's next?"</text>

  <!-- 3. Decision Diamond -->
  <path d="M 350 200 L 400 232 L 350 264 L 300 232 Z" fill="#ffffff" stroke="#3b82f6" stroke-width="2" filter="url(#shadow-loop)" />
  <text x="350" y="237" font-family="system-ui, -apple-system, sans-serif" font-size="12" font-weight="bold" fill="#1e293b" text-anchor="middle">3. Decision</text>

  <!-- 4a. Final Text -->
  <rect x="80" y="205" width="140" height="55" rx="28" fill="#f1f5f9" stroke="#cbd5e1" stroke-width="2" filter="url(#shadow-loop)" />
  <text x="150" y="238" font-family="system-ui, -apple-system, sans-serif" font-size="14" font-weight="bold" fill="#475569" text-anchor="middle">Final Text</text>

  <!-- 4b. Execute Tools -->
  <rect x="480" y="205" width="160" height="55" rx="8" fill="#ffffff" stroke="#10b981" stroke-width="2" filter="url(#shadow-loop)" />
  <text x="560" y="233" font-family="system-ui, -apple-system, sans-serif" font-size="14" font-weight="bold" fill="#1e293b" text-anchor="middle">4. Execute Tools</text>
  <text x="560" y="251" font-family="system-ui, -apple-system, sans-serif" font-size="12" fill="#64748b" text-anchor="middle">append results</text>

  <!-- Connections -->
  <g stroke="#94a3b8" stroke-width="2" fill="none" marker-end="url(#arrow-loop)">
    <!-- History to Generate -->
    <path d="M 350 75 L 350 107" />
    <!-- Generate to Decision -->
    <path d="M 350 170 L 350 195" />
    <!-- Decision to Final Text -->
    <path d="M 300 232 L 228 232" />
    <!-- Decision to Execute Tools -->
    <path d="M 400 232 L 472 232" />
    <!-- Execute Tools back to History (the loop) -->
    <path d="M 560 205 L 560 47 L 448 47" />
  </g>

</svg>

Here's that loop in TypeScript, stripped to its essentials:

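```typescript
// Minimal shapes for the sketch. `model`, `tools`, and `executeTool` are
// assumed to be provided elsewhere by your application.
interface ToolCall { id: string; name: string; args: unknown; }
interface Message {
  role: 'user' | 'assistant' | 'tool';
  content?: string;
  toolCalls?: ToolCall[];
  toolCallId?: string;
}
interface ModelResponse { text?: string; toolCalls?: ToolCall[]; }

declare const tools: unknown[];
declare const model: {
  generate(history: Message[], tools: unknown[]): Promise<ModelResponse>;
};
declare function executeTool(name: string, args: unknown): Promise<string>;

async function runAgent(prompt: string): Promise<string> {
  // The browser owns the conversation state
  const history: Message[] = [{ role: 'user', content: prompt }];

  while (true) {
    // 1. Ask the model what to do next
    const response = await model.generate(history, tools);
    const toolCalls = response.toolCalls ?? [];

    // 2a. No tool calls: the model has given its final answer.
    // Checking tool calls first avoids exiting early when the model sends
    // text alongside a tool call, and avoids spinning on an empty response.
    if (toolCalls.length === 0) {
      const text = response.text ?? '';
      history.push({ role: 'assistant', content: text });
      return text;
    }

    // 2b. The model wants to call tools
    history.push({ role: 'assistant', toolCalls });

    for (const call of toolCalls) {
      // 3. The browser executes the tool locally
      const result = await executeTool(call.name, call.args); // error handling omitted for clarity

      // 4. Record what happened
      history.push({ role: 'tool', toolCallId: call.id, content: result });
    }
    // 5. The loop repeats, sending the updated history back to the model
  }
}
```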

### Handling the cycle

The loop handles the classic agentic cycle:
1. Send the full history to the model.
2. If the model returns tool calls, execute them.
3. Append the tool calls and their results to the history.
4. Repeat until the model answers with text instead of tool calls.

This is what most frameworks hide behind layers of abstraction. Once you understand this loop, you can build your own agent framework in a few hundred lines of code. If you want to see a concrete implementation of this loop, check out [`AgentRunner`](https://github.com/andreban/mast-ai/blob/main/packages/core/src/runner.ts) in the mast-ai repository.

> [!NOTE]
> The history array grows with every turn. For long-running agents, you will eventually hit the model's context window limit. Plan for this early: common strategies include summarising older turns into a single message, or dropping tool results once their content has been acknowledged by the model.
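
The second strategy, dropping old tool results, can be sketched in a few lines. This is one possible approach rather than a recommendation: the `Message` shape is a simplified version of the one used by the loop, and `keepRecent` and the placeholder text are arbitrary choices.

```typescript
// Hypothetical sketch; Message is a simplified version of the loop's shape.
interface Message {
  role: "user" | "assistant" | "tool";
  content?: string;
  toolCallId?: string;
}

// Returns a new array; the original history is left untouched.
function compactHistory(history: Message[], keepRecent: number): Message[] {
  const cutoff = Math.max(0, history.length - keepRecent);
  return history.map((message, index) => {
    // Recent messages and non-tool messages pass through unchanged.
    if (index >= cutoff || message.role !== "tool") return message;
    // Keep the tool message itself so every tool call still has a matching
    // result, but replace the bulky payload with a short placeholder.
    return { ...message, content: "[tool result omitted]" };
  });
}
```

Calling `compactHistory(history, 6)` before each model request would keep the six most recent messages intact and shrink older tool output, at the cost of the model no longer seeing those details.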

## Delegating to specialized agents

One more thing worth knowing: agents can delegate tasks to other, specialized agents.

### Agent delegation tree

<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 800 400" width="100%" height="400">
  <defs>
    <filter id="shadow-tree" x="-5%" y="-5%" width="110%" height="110%">
      <feDropShadow dx="0" dy="2" stdDeviation="3" flood-opacity="0.1" />
    </filter>
    <linearGradient id="bg-tree" x1="0" y1="0" x2="0" y2="1">
      <stop offset="0%" stop-color="#f8fafc" />
      <stop offset="100%" stop-color="#f1f5f9" />
    </linearGradient>
    <marker id="arrowhead-4" viewBox="0 0 10 10" refX="10" refY="5" markerWidth="6" markerHeight="6" orient="auto">
      <path d="M 0 0 L 10 5 L 0 10 z" fill="#94a3b8" />
    </marker>
  </defs>

  <!-- Background -->
  <rect x="0" y="0" width="800" height="400" rx="12" fill="url(#bg-tree)" stroke="#cbd5e1" stroke-width="2" />

  <!-- Manager Agent -->
  <rect x="300" y="50" width="200" height="60" fill="#e8f0fe" stroke="#1a73e8" stroke-width="3" rx="8" filter="url(#shadow-tree)" />
  <text x="400" y="80" font-family="system-ui, -apple-system, sans-serif" font-size="14" font-weight="bold" fill="#1e293b" text-anchor="middle">Manager Agent (Parent)</text>
  <text x="400" y="98" font-family="system-ui, -apple-system, sans-serif" font-size="12" fill="#64748b" text-anchor="middle">Orchestrates user request</text>

  <!-- Researcher Sub-Agent -->
  <rect x="300" y="185" width="200" height="60" fill="#ffffff" stroke="#94a3b8" stroke-width="2" rx="8" filter="url(#shadow-tree)" />
  <text x="400" y="215" font-family="system-ui, -apple-system, sans-serif" font-size="14" font-weight="bold" fill="#1e293b" text-anchor="middle">Researcher Agent (Child)</text>
  <text x="400" y="233" font-family="system-ui, -apple-system, sans-serif" font-size="12" fill="#64748b" text-anchor="middle">Exposed as a tool</text>

  <!-- Tools -->
  <rect x="300" y="315" width="200" height="60" fill="#ffffff" stroke="#94a3b8" stroke-width="2" rx="8" filter="url(#shadow-tree)" />
  <text x="400" y="345" font-family="system-ui, -apple-system, sans-serif" font-size="14" font-weight="bold" fill="#1e293b" text-anchor="middle">Search &amp; Fetch Tools</text>
  <text x="400" y="363" font-family="system-ui, -apple-system, sans-serif" font-size="12" fill="#64748b" text-anchor="middle">Used by researcher</text>

  <!-- Manager to Researcher (Call) -->
  <path d="M 380 110 L 380 175" stroke="#94a3b8" stroke-width="2" fill="none" marker-end="url(#arrowhead-4)" />
  <text x="325" y="148" font-family="system-ui, -apple-system, sans-serif" font-size="12" fill="#64748b" text-anchor="middle">1. Call Tool</text>

  <!-- Researcher to Manager (Return) -->
  <path d="M 420 185 L 420 118" stroke="#94a3b8" stroke-width="2" fill="none" marker-end="url(#arrowhead-4)" />
  <text x="475" y="148" font-family="system-ui, -apple-system, sans-serif" font-size="12" fill="#64748b" text-anchor="middle">4. Return Result</text>

  <!-- Researcher to Tools -->
  <path d="M 380 245 L 380 305" stroke="#94a3b8" stroke-width="2" fill="none" marker-end="url(#arrowhead-4)" />
  <text x="318" y="280" font-family="system-ui, -apple-system, sans-serif" font-size="12" fill="#64748b" text-anchor="middle">2. Use Tools</text>

  <!-- Tools to Researcher (Return) -->
  <path d="M 420 315 L 420 255" stroke="#94a3b8" stroke-width="2" fill="none" marker-end="url(#arrowhead-4)" />
  <text x="482" y="280" font-family="system-ui, -apple-system, sans-serif" font-size="12" fill="#64748b" text-anchor="middle">3. Get Results</text>
</svg>


The agent loop takes conversation history and calls tools. A tool is just a function that returns a string, and that function can be another agent loop.

Imagine a general "Assistant Agent" that handles user requests. If the user asks for a deep research report on a topic, the main agent doesn't need to do the research itself. It can call a specialized "Research Sub-Agent" exposed as a tool.

The main agent pauses its loop, calls the research tool with a query, and the sub-agent starts its own loop to fetch URLs, summarize pages, and synthesize a report. When the sub-agent finishes, it returns the report as a string to the main agent, which resumes its loop.

The sub-agent could be running on the same main thread, in a background Web Worker to keep the UI responsive, or on a remote server entirely. It might use the same LLM or a different model optimized for the task. To the parent, it's just another tool call.
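
In code, the idea can be as small as an adapter that turns any agent loop into a tool. The names below (`AgentRunner`, `asTool`, `research`) are hypothetical, and `researchAgent` is a stand-in rather than a working researcher.

```typescript
// Hypothetical names throughout.
// An agent loop, seen from outside: prompt in, final text out.
type AgentRunner = (prompt: string) => Promise<string>;

interface Tool {
  description: string;
  execute(args: { query: string }): Promise<string>;
}

// Adapts any agent loop into a tool the parent can call.
function asTool(description: string, runSubAgent: AgentRunner): Tool {
  return {
    description,
    // The parent only sees arguments going in and a string coming out.
    execute: (args) => runSubAgent(args.query),
  };
}

// Stand-in for a real research loop, which would make its own model calls
// and use its own search and fetch tools before returning a report.
const researchAgent: AgentRunner = async (prompt) => `Report on: ${prompt}`;

const parentTools: Record<string, Tool> = {
  research: asTool("Researches a topic and returns a written report.", researchAgent),
};
```

Swapping `researchAgent` for a function that talks to a Worker or a server endpoint would not change anything on the parent's side, since the `AgentRunner` signature stays the same.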

> [!TIP]
> Sub-agents that do heavy work (fetching URLs, scraping pages, running long loops) are natural candidates for [Web Workers](https://developer.mozilla.org/en-US/docs/Web/API/Web_Workers_API). Wrapping a sub-agent in a Worker keeps it off the main thread so the UI stays responsive while it runs.

## Conclusion

The server doesn't go away when you move the loop to the browser. It still runs the LLM, protects your credentials, and handles heavy compute. But the browser decides when to call it, what to send, and what to do with the response.

If you're building an AI agent for the web, consider running the loop in the browser. Think about the [text editor from the introduction](https://bandarra.me/apps/agent-text-editor/): an agent that reads your selection, queries your workspace, delegates to a reviewer or writer sub-agent when needed, and renders the result as a diff for your approval. That's not a chatbot bolted onto a sidebar. That's a first-class feature, and it only works naturally when the loop lives where the UI lives.

If you want to try the text editor yourself, head over to [bandarra.me/apps/agent-text-editor](https://bandarra.me/apps/agent-text-editor/) (you'll need a Gemini API key, which you can get free at [Google AI Studio](https://aistudio.google.com/)). The source is at [github.com/andreban/agent-text-editor](https://github.com/andreban/agent-text-editor).

If you want to see how the agent loop is implemented, or want a foundation to build your own browser agents, check out [mast-ai](https://github.com/andreban/mast-ai) on GitHub.
