The Concept: A2UI is not a static file download. It is a live broadcast. Just like subtitles appear instantly while a person is speaking, A2UI components appear instantly while the AI is “thinking”.
## The “Puppeteer” Architecture
Imagine your AI Agent is a Puppeteer, and your User Interface is the Marionette.
- The Puppeteer pulls strings (sends JSON events).
- The Marionette moves (UI updates).
- The key: The Puppeteer doesn’t need to rebuild the doll every time. It just sends small commands like “Raise Left Hand” (Update Data).
```mermaid
graph LR
  A["Agent (Puppeteer)"] -->|1. Stream JSON| B["A2UI Runtime"]
  B -->|2. Parse & Diff| C["React/Vue Component"]
  C -->|3. Update DOM| D["User Screen"]
```
## The Protocol: Streaming JSONL
A2UI messages are sent as a stream of JSON Lines (JSONL): each line is a complete command that the client can act on the moment it arrives. This is crucial for performance over slow networks.
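Because a network chunk can end mid-line, a client has to buffer until a newline arrives before parsing. The following is a minimal sketch of such a reader; the `JsonlReader` class and `onMessage` callback are illustrative names, not part of the A2UI spec:

```typescript
type Message = Record<string, unknown>;

// Minimal JSONL stream reader: buffers raw chunks and emits one parsed
// message per complete line. Hypothetical helper, for illustration only.
class JsonlReader {
  private buffer = "";

  constructor(private onMessage: (msg: Message) => void) {}

  // Feed a raw network chunk; any trailing partial line stays buffered.
  feed(chunk: string): void {
    this.buffer += chunk;
    const lines = this.buffer.split("\n");
    this.buffer = lines.pop() ?? ""; // keep the incomplete tail
    for (const line of lines) {
      if (line.trim().length === 0) continue;
      this.onMessage(JSON.parse(line) as Message);
    }
  }
}

// Demo: a message split across two chunks is emitted exactly once,
// only after its line is complete.
const received: Message[] = [];
const reader = new JsonlReader((m) => received.push(m));
reader.feed('{"dataModelUpdate":{"surfaceId":"main","contents":[{"key":"airline",');
reader.feed('"value":"United"}]}}\n');
```

Splitting on `\n` and re-buffering the tail is what makes "each line is a complete command" safe to rely on even when the transport fragments the stream arbitrarily.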
### 1. The Setup (The Blueprint)
First, the Agent describes what to build.
```json
{"surfaceUpdate":{"surfaceId":"main","components":[{"type":"flight-card"}]}}
```
- Effect: The client sees a Skeleton Loader immediately. The “shape” is there, but the data is empty.
### 2. The Data (The Content)
As the LLM generates tokens (“United… Airlines…”), A2UI streams them to the component.
```json
{"dataModelUpdate":{"surfaceId":"main","contents":[{"key":"airline","value":"United"}]}}
```
- Effect: The skeleton placeholder fills in and the text “United” appears.
### 3. The Interactive Mode (The Handover)
Once the stream finishes, the UI becomes fully interactive.
```json
{"beginRendering":{"surfaceId":"main"}}
```
- Effect: Buttons become clickable. Forms become editable.
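Putting the three message types together, a client runtime can be sketched as a dispatcher over a map of surfaces. The `Surface` shape and `handle` function below are illustrative assumptions; a real runtime would drive a React/Vue component tree rather than a plain object:

```typescript
// Illustrative client-side surface state (not the official A2UI runtime).
interface Surface {
  components: { type: string }[];   // the blueprint (skeleton)
  data: Record<string, unknown>;    // content streamed in later
  interactive: boolean;             // true once beginRendering arrives
}

const surfaces = new Map<string, Surface>();

function handle(msg: any): void {
  if (msg.surfaceUpdate) {
    // 1. Setup: create the surface with empty data -> skeleton loader.
    const { surfaceId, components } = msg.surfaceUpdate;
    surfaces.set(surfaceId, { components, data: {}, interactive: false });
  } else if (msg.dataModelUpdate) {
    // 2. Data: merge streamed key/value pairs into the model.
    const s = surfaces.get(msg.dataModelUpdate.surfaceId);
    if (!s) return;
    for (const { key, value } of msg.dataModelUpdate.contents) s.data[key] = value;
  } else if (msg.beginRendering) {
    // 3. Handover: the surface becomes fully interactive.
    const s = surfaces.get(msg.beginRendering.surfaceId);
    if (s) s.interactive = true;
  }
}

// Replaying the three example messages from this section, in order:
handle({ surfaceUpdate: { surfaceId: "main", components: [{ type: "flight-card" }] } });
handle({ dataModelUpdate: { surfaceId: "main", contents: [{ key: "airline", value: "United" }] } });
handle({ beginRendering: { surfaceId: "main" } });
```

Note the ordering guarantee this design relies on: the blueprint always arrives before the data that fills it, so the skeleton is on screen before the first token lands.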
## Lifecycle Example: Booking a Table
Let’s trace a single request from start to finish.
User: “Book a table for 2 at 7pm.”
- Agent (Think): “I need to show a booking form.”
- Agent (Stream): Sends `surfaceUpdate` with a `ReservationForm` component.
- Client (Render): User sees the form appear instantly.
- User (Action): Changes “Guests” from 2 to 3.
  - Note: This basic interaction happens LOCALLY. Zero latency.
- User (Action): Clicks “Confirm”.
- Client (Send): Sends the final state back to the Agent: `{"userAction": {"name": "confirm", "data": {"guests": 3}}}`
- Agent (Close): “Great, booked!” -> Sends `deleteSurface` to remove the form.
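The client side of this exchange can be sketched in a few lines: local edits only touch client state (which is why they have zero latency), and the agent hears nothing until “Confirm” serializes the final state into a `userAction` message. The `send` function and `localState` object are hypothetical stand-ins for the real transport and form state:

```typescript
// Illustrative form state and outbound transport (assumed names).
const localState: Record<string, unknown> = { guests: 2 };
const outbox: string[] = [];

function send(msg: object): void {
  outbox.push(JSON.stringify(msg)); // stand-in for the real network send
}

// User edits the form locally: no message is sent, no round-trip occurs.
localState.guests = 3;

// User clicks "Confirm": only now does the final state travel to the Agent.
send({ userAction: { name: "confirm", data: { ...localState } } });
```

This split is the core of the handover model: the host owns the interaction loop, and the agent only sees discrete, named actions with their final payloads.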
## Why Streaming Matters
- Perceived Performance: The user sees the UI before the AI finishes thinking.
- Error Resilience: If the network cuts out, the user still sees the partial UI (e.g., the airline name) instead of a broken generic error.
- Native Feel: Because the host handles the rendering, the UI runs at 60fps, regardless of how slow the LLM is.
## Error Handling
- Malformed messages: Skip and continue, or request correction.
- Network interruptions: Client handles reconnection state.
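“Skip and continue” can be as simple as a `try`/`catch` around each line, so one corrupted message cannot kill the whole stream. This is a minimal sketch with an assumed `parseStream` helper; a real client might also ask the agent to resend:

```typescript
// Parse a batch of JSONL lines, dropping any malformed ones instead of
// aborting the stream. Hypothetical helper, for illustration only.
function parseStream(lines: string[]): object[] {
  const messages: object[] = [];
  for (const line of lines) {
    try {
      messages.push(JSON.parse(line));
    } catch {
      // Malformed line: log it, skip it, keep the stream alive.
      console.warn("skipping malformed A2UI line:", line);
    }
  }
  return messages;
}

// Demo: the truncated middle line is skipped; the other two survive.
const ok = parseStream([
  '{"beginRendering":{"surfaceId":"main"}}',
  '{"oops": truncated',
  '{"deleteSurface":{"surfaceId":"main"}}',
]);
```

Because each line is independent, the blast radius of a bad message is exactly one command, which is what makes this recovery strategy viable at all.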
## Performance
- Batching: Buffer updates (e.g., 16ms) to batch render.
- Diffing: Compare old/new components to minimize updates.
- Granular updates: Update `/user/name`, not the entire data model.
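The batching idea can be sketched with a small buffer that flushes at most once per interval, so a fast token stream never forces one render per token. The `UpdateBatcher` class below is an illustrative assumption (the 16 ms figure comes from the text); a browser client would more likely flush on `requestAnimationFrame`:

```typescript
type Update = { key: string; value: unknown };

// Buffers updates and flushes them as one batch per interval (~16 ms),
// collapsing many tiny data updates into a single render.
class UpdateBatcher {
  private pending: Update[] = [];
  private timer: ReturnType<typeof setTimeout> | null = null;

  constructor(
    private flush: (batch: Update[]) => void,
    private intervalMs = 16,
  ) {}

  push(update: Update): void {
    this.pending.push(update);
    if (this.timer === null) {
      // First update in this window: schedule one flush for the batch.
      this.timer = setTimeout(() => this.flushNow(), this.intervalMs);
    }
  }

  // Force an immediate flush (e.g. when the stream ends).
  flushNow(): void {
    if (this.timer !== null) {
      clearTimeout(this.timer);
      this.timer = null;
    }
    if (this.pending.length === 0) return;
    const batch = this.pending;
    this.pending = [];
    this.flush(batch); // one render for the whole batch
  }
}

// Demo: two rapid-fire updates arrive, but the renderer runs only once.
const batches: Update[][] = [];
const batcher = new UpdateBatcher((b) => batches.push(b));
batcher.push({ key: "airline", value: "United" });
batcher.push({ key: "flight", value: "UA 123" });
batcher.flushNow();
```

Batching composes naturally with granular updates: each flushed batch carries only the keys that changed, so the diff step has very little work to do.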