The Concept: A2UI is not a static file download. It is a live broadcast. Just like subtitles appear instantly while a person is speaking, A2UI components appear instantly while the AI is “thinking”.
## The “Puppeteer” Architecture
Imagine your AI Agent is a Puppeteer, and your User Interface is the Marionette.
- The Puppeteer pulls strings (sends JSON events).
- The Marionette moves (UI updates).
- The key: The Puppeteer doesn’t need to rebuild the doll every time. It just sends small commands like “Raise Left Hand” (Update Data).
```mermaid
graph LR
  A["Agent (Puppeteer)"] -->|1. Stream JSON| B["A2UI Runtime"]
  B -->|2. Parse & Diff| C["React/Vue Component"]
  C -->|3. Update DOM| D["User Screen"]
```
## The Protocol: Streaming JSONL
A2UI messages are sent as a stream of JSON Lines (JSONL): each line is a complete command that the client can act on the moment it arrives. This is crucial for performance over slow networks.
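Because a network chunk can end mid-line, a client has to buffer until a newline arrives before parsing. The following is a minimal sketch of such a reader; the `JsonlReader` class and `onMessage` callback are illustrative names, not part of the A2UI spec:

```typescript
type Message = Record<string, unknown>;

// Minimal JSONL stream reader: buffers raw chunks and emits one parsed
// message per complete line. Hypothetical helper, for illustration only.
class JsonlReader {
  private buffer = "";

  constructor(private onMessage: (msg: Message) => void) {}

  // Feed a raw network chunk; any trailing partial line stays buffered.
  feed(chunk: string): void {
    this.buffer += chunk;
    const lines = this.buffer.split("\n");
    this.buffer = lines.pop() ?? ""; // keep the incomplete tail
    for (const line of lines) {
      if (line.trim().length === 0) continue;
      this.onMessage(JSON.parse(line) as Message);
    }
  }
}

// Demo: a message split across two chunks is emitted exactly once,
// only after its line is complete.
const received: Message[] = [];
const reader = new JsonlReader((m) => received.push(m));
reader.feed('{"dataModelUpdate":{"surfaceId":"main","contents":[{"key":"airline",');
reader.feed('"value":"United"}]}}\n');
```

Splitting on `\n` and re-buffering the tail is what makes "each line is a complete command" safe to rely on even when the transport fragments the stream arbitrarily.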
### 1. The Setup (The Blueprint)
First, the Agent describes what to build.
```json
{"surfaceUpdate":{"surfaceId":"main","components":[{"type":"flight-card"}]}}
```
- Effect: The client sees a Skeleton Loader immediately. The “shape” is there, but the data is empty.
### 2. The Data (The Content)
As the LLM generates tokens (“United… Airlines…”), A2UI streams them to the component.
```json
{"dataModelUpdate":{"surfaceId":"main","contents":[{"key":"airline","value":"United"}]}}
```
- Effect: The skeleton placeholder fills in and the text “United” appears.
### 3. The Interactive Mode (The Handover)
Once the stream finishes, the UI becomes fully interactive.
```json
{"beginRendering":{"surfaceId":"main"}}
```
- Effect: Buttons become clickable. Forms become editable.
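Putting the three message types together, a client runtime can be sketched as a dispatcher over a map of surfaces. The `Surface` shape and `handle` function below are illustrative assumptions; a real runtime would drive a React/Vue component tree rather than a plain object:

```typescript
// Illustrative client-side surface state (not the official A2UI runtime).
interface Surface {
  components: { type: string }[];   // the blueprint (skeleton)
  data: Record<string, unknown>;    // content streamed in later
  interactive: boolean;             // true once beginRendering arrives
}

const surfaces = new Map<string, Surface>();

function handle(msg: any): void {
  if (msg.surfaceUpdate) {
    // 1. Setup: create the surface with empty data -> skeleton loader.
    const { surfaceId, components } = msg.surfaceUpdate;
    surfaces.set(surfaceId, { components, data: {}, interactive: false });
  } else if (msg.dataModelUpdate) {
    // 2. Data: merge streamed key/value pairs into the model.
    const s = surfaces.get(msg.dataModelUpdate.surfaceId);
    if (!s) return;
    for (const { key, value } of msg.dataModelUpdate.contents) s.data[key] = value;
  } else if (msg.beginRendering) {
    // 3. Handover: the surface becomes fully interactive.
    const s = surfaces.get(msg.beginRendering.surfaceId);
    if (s) s.interactive = true;
  }
}

// Replaying the three example messages from this section, in order:
handle({ surfaceUpdate: { surfaceId: "main", components: [{ type: "flight-card" }] } });
handle({ dataModelUpdate: { surfaceId: "main", contents: [{ key: "airline", value: "United" }] } });
handle({ beginRendering: { surfaceId: "main" } });
```

Note the ordering guarantee this design relies on: the blueprint always arrives before the data that fills it, so the skeleton is on screen before the first token lands.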
## Lifecycle Example: Booking a Table
Let’s trace a single request from start to finish.
User: “Book a table for 2 at 7pm.”
- Agent (Think): “I need to show a booking form.”
- Agent (Stream): Sends `surfaceUpdate` with a `ReservationForm` component.
- Client (Render): User sees the form appear instantly.
- User (Action): Changes “Guests” from 2 to 3.
  - Note: This basic interaction happens LOCALLY. Zero latency.
- User (Action): Clicks “Confirm”.
- Client (Send): Sends the final state back to the Agent: `{"userAction": {"name": "confirm", "data": {"guests": 3}}}`
- Agent (Close): “Great, booked!” -> Sends `deleteSurface` to remove the form.
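The client side of this exchange can be sketched in a few lines: local edits only touch client state (which is why they have zero latency), and the agent hears nothing until “Confirm” serializes the final state into a `userAction` message. The `send` function and `localState` object are hypothetical stand-ins for the real transport and form state:

```typescript
// Illustrative form state and outbound transport (assumed names).
const localState: Record<string, unknown> = { guests: 2 };
const outbox: string[] = [];

function send(msg: object): void {
  outbox.push(JSON.stringify(msg)); // stand-in for the real network send
}

// User edits the form locally: no message is sent, no round-trip occurs.
localState.guests = 3;

// User clicks "Confirm": only now does the final state travel to the Agent.
send({ userAction: { name: "confirm", data: { ...localState } } });
```

This split is the core of the handover model: the host owns the interaction loop, and the agent only sees discrete, named actions with their final payloads.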
## Why Streaming Matters
- Perceived Performance: The user sees the UI before the AI finishes thinking.
- Error Resilience: If the network cuts out, the user still sees the partial UI (e.g., the airline name) instead of a broken generic error.
- Native Feel: Because the host handles the rendering, the UI runs at 60fps, regardless of how slow the LLM is.
## Error Handling
- Malformed messages: Skip and continue, or request correction.
- Network interruptions: Client handles reconnection state.
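“Skip and continue” can be as simple as a `try`/`catch` around each line, so one corrupted message cannot kill the whole stream. This is a minimal sketch with an assumed `parseStream` helper; a real client might also ask the agent to resend:

```typescript
// Parse a batch of JSONL lines, dropping any malformed ones instead of
// aborting the stream. Hypothetical helper, for illustration only.
function parseStream(lines: string[]): object[] {
  const messages: object[] = [];
  for (const line of lines) {
    try {
      messages.push(JSON.parse(line));
    } catch {
      // Malformed line: log it, skip it, keep the stream alive.
      console.warn("skipping malformed A2UI line:", line);
    }
  }
  return messages;
}

// Demo: the truncated middle line is skipped; the other two survive.
const ok = parseStream([
  '{"beginRendering":{"surfaceId":"main"}}',
  '{"oops": truncated',
  '{"deleteSurface":{"surfaceId":"main"}}',
]);
```

Because each line is independent, the blast radius of a bad message is exactly one command, which is what makes this recovery strategy viable at all.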
## Performance
- Batching: Buffer updates (e.g., 16ms) to batch render.
- Diffing: Compare old/new components to minimize updates.
- Granular updates: Update `/user/name`, not the entire data model.
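The batching idea can be sketched with a small buffer that flushes at most once per interval, so a fast token stream never forces one render per token. The `UpdateBatcher` class below is an illustrative assumption (the 16 ms figure comes from the text); a browser client would more likely flush on `requestAnimationFrame`:

```typescript
type Update = { key: string; value: unknown };

// Buffers updates and flushes them as one batch per interval (~16 ms),
// collapsing many tiny data updates into a single render.
class UpdateBatcher {
  private pending: Update[] = [];
  private timer: ReturnType<typeof setTimeout> | null = null;

  constructor(
    private flush: (batch: Update[]) => void,
    private intervalMs = 16,
  ) {}

  push(update: Update): void {
    this.pending.push(update);
    if (this.timer === null) {
      // First update in this window: schedule one flush for the batch.
      this.timer = setTimeout(() => this.flushNow(), this.intervalMs);
    }
  }

  // Force an immediate flush (e.g. when the stream ends).
  flushNow(): void {
    if (this.timer !== null) {
      clearTimeout(this.timer);
      this.timer = null;
    }
    if (this.pending.length === 0) return;
    const batch = this.pending;
    this.pending = [];
    this.flush(batch); // one render for the whole batch
  }
}

// Demo: two rapid-fire updates arrive, but the renderer runs only once.
const batches: Update[][] = [];
const batcher = new UpdateBatcher((b) => batches.push(b));
batcher.push({ key: "airline", value: "United" });
batcher.push({ key: "flight", value: "UA 123" });
batcher.flushNow();
```

Batching composes naturally with granular updates: each flushed batch carries only the keys that changed, so the diff step has very little work to do.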