Vercel AI SDK Patterns: Streaming, Tool Use & Typed Output
Practical Vercel AI SDK patterns for production: streaming responses, tool calling, and structured output that actually parses. From real shipped code.

Vercel's AI SDK has become the backbone of how I build AI features across all of my products. After shipping AI-powered features in Aviation Infinity, ClickAi, blog management tools, and various internal utilities, I've converged on a set of patterns that work reliably in production.
These are not theoretical best practices. These are patterns extracted from real code serving real users, refined through months of iteration and debugging.
Why AI SDK
Before diving into patterns, the quick case for AI SDK: it provides a unified interface across model providers, handles streaming elegantly, supports tool use natively, and integrates naturally with Next.js App Router. Since my entire stack is Next.js and TypeScript, AI SDK fits like a glove.
The alternative (building custom streaming, tool use, and output parsing on top of raw API calls) is doable but tedious. AI SDK saves weeks of boilerplate and edge-case handling.
Pattern 1: Streaming with Progressive UI
The most basic but most impactful pattern is streaming AI responses with a progressive UI that shows content as it arrives.
// app/api/chat/route.ts
import { streamText } from 'ai'
import { anthropic } from '@ai-sdk/anthropic'

// systemPrompt is assumed to be a string defined elsewhere in this module.
export async function POST(req: Request) {
  const { messages } = await req.json()

  const result = streamText({
    model: anthropic('claude-3-5-sonnet-20241022'),
    system: systemPrompt,
    messages,
  })

  // Stream the response back in the format the useChat hook consumes
  return result.toDataStreamResponse()
}
On the client side, the useChat hook handles the streaming state:
import { useChat } from '@ai-sdk/react'

// Inside a client component
const { messages, input, handleInputChange, handleSubmit, isLoading } = useChat()
The key insight is that streaming isn't just a performance optimization. It fundamentally changes the user experience. When an AI feature is analyzing a complex situation, users see the reasoning unfold in real time. This builds trust in a way that a loading spinner followed by a wall of text never can.
Production refinement: I always add a minimum display delay between streamed chunks on the client side. Without it, fast responses feel jarring because text appears almost instantly and users don't have time to read. A subtle delay of 10-20ms between rendered chunks makes the experience feel natural.
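As a rough sketch of that pacing idea, chunks can be queued as they arrive and released no faster than one per minimum delay. The createChunkPacer helper and its push/next API are hypothetical names for illustration, not part of AI SDK, and the 15ms value used with it is just a point inside the 10-20ms range mentioned above.

```typescript
// Hypothetical helper, not part of AI SDK: queues streamed chunks and
// releases at most one chunk per `minDelayMs`.
type ChunkPacer = {
  push: (chunk: string) => void
  next: (nowMs: number) => string | null
}

function createChunkPacer(minDelayMs: number): ChunkPacer {
  const queue: string[] = []
  let lastEmitMs = Number.NEGATIVE_INFINITY

  return {
    // Call whenever a new chunk arrives from the stream.
    push(chunk) {
      queue.push(chunk)
    },
    // Call on every render tick with the current time. Returns the next
    // chunk to display, or null if the queue is empty or it is too early.
    next(nowMs) {
      if (queue.length === 0) return null
      if (nowMs - lastEmitMs < minDelayMs) return null
      lastEmitMs = nowMs
      return queue.shift() ?? null
    },
  }
}
```

In a React component, next(performance.now()) could be called from a requestAnimationFrame loop, appending each returned chunk to the text on screen. Taking the time as an argument keeps the pacing logic free of timers, which makes it easy to test.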
Pattern 2: Tool Use with Confirmation
Tool use is where AI stops being a text generator and starts being an agent. AI SDK makes tool definition and execution straightforward:
import { streamText } from 'ai'
import { anthropic } from '@ai-sdk/anthropic'
import { z } from 'zod'

const result = streamText({
  model: anthropic('claude-3-5-sonnet-20241022'),
  messages,
  tools: {
    searchAviationData: {
      description: 'Search aviation databases for relevant information',
      // The Zod schema defines and validates the arguments the model may pass
      parameters: z.object({
        query: z.string().describe('Aviation search query'),
        category: z.string().optional().describe('Aircraft type, airport, or regulation category'),
        dateRange: z.object({
          from: z.string().optional(),
          to: z.string().optional(),
        }).optional(),
      }),
      execute: async ({ query, category, dateRange }) => {
        // Actual search implementation
        return await searchAviationDatabase(query, category, dateRange)
      },
    },
  },
})
Production refinement: For tools that modify data (as opposed to reading it), I always add a confirmation step. The AI proposes the tool call, the UI shows the user what is about to happen, and the tool only executes after explicit confirmation. This pattern is essential for products where an incorrect action could have real consequences.
The confirmation flow uses AI SDK's tool call streaming events. When the model decides to call a tool, I intercept the tool call before execution, render a confirmation UI, and only proceed when the user approves.
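The core of that flow can be sketched without any SDK wiring: a proposed call is parked under an id, and the tool runs only when that id is approved. Everything below (createConfirmationGate, the deleteRecord tool, the rec_42 id) is hypothetical and for illustration only. With AI SDK, one way to get the pause is to define the mutating tool without an execute function, so the call is forwarded to the client instead of running on the server; that wiring is not shown here.

```typescript
// Hypothetical confirmation gate, independent of AI SDK. A proposed tool
// call is parked until the user approves or rejects it.
type ProposedCall<Args> = { toolName: string; args: Args }

function createConfirmationGate<Args, Result>(run: (args: Args) => Result) {
  const pending = new Map<string, ProposedCall<Args>>()
  let nextId = 0

  return {
    // Park the call and return an id the UI can attach to its dialog.
    propose(call: ProposedCall<Args>): string {
      nextId += 1
      const id = `call-${nextId}`
      pending.set(id, call)
      return id
    },
    // Run the tool only after explicit approval. Each id works once.
    approve(id: string): Result {
      const call = pending.get(id)
      if (call === undefined) throw new Error(`No pending tool call: ${id}`)
      pending.delete(id)
      return run(call.args)
    },
    // Drop the call without running it. Returns false if the id is unknown.
    reject(id: string): boolean {
      return pending.delete(id)
    },
  }
}

// Usage with a made-up mutating tool
const deleted: string[] = []
const gate = createConfirmationGate((args: { recordId: string }) => {
  deleted.push(args.recordId)
  return `Deleted ${args.recordId}`
})

const id = gate.propose({ toolName: 'deleteRecord', args: { recordId: 'rec_42' } })
// ...render a confirmation dialog here, then on approval:
const outcome = gate.approve(id)
```

Deleting the pending entry before running the tool means a double-click on the confirm button cannot execute the same action twice.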
Pattern 3: Structured Output with Zod Schemas
When the AI needs to produce structured data, not just text, AI SDK's generateObject with Zod schemas is invaluable:
import { generateObject } from 'ai'
import { anthropic } from '@ai-sdk/anthropic'
import { z } from 'zod'

const analysisSchema = z.object({
  situation: z.string().describe('Plain language summary of the situation'),
  relevantData: z.array(z.object({
    name: z.string(),
    source: z.string(),
    relevance: z.string(),
  })),
  recommendedActions: z.array(z.object({
    action: z.string(),
    priority: z.enum(['immediate', 'soon', 'when-ready']),
    requiresExpert: z.boolean(),
  })),
  confidenceLevel: z.enum(['high', 'medium', 'low']),
  caveats: z.array(z.string()),
})

// `object` is typed from the schema and validated before it is returned
const { object } = await generateObject({
  model: anthropic('claude-3-5-sonnet-20241022'),
  schema: analysisSchema,
  prompt: `Analyze the following situation: ${userSituation}`,
})
The Zod schema does double duty: it tells the model exactly what structure to produce, and it validates the output at runtime. If the model produces something that doesn't match the schema, you get a typed error instead of a silent failure.
Production refinement: I always include .describe() annotations on schema fields. These descriptions are sent to the model and significantly improve output quality. A field named priority without a description might get inconsistent values. A field described as "Priority level: immediate for time-sensitive legal deadlines, soon for actions needed within 2 weeks, when-ready for non-urgent steps" gets consistently appropriate values.
Pattern 4: Error Boundaries and Fallbacks
AI is inherently non-deterministic. Sometimes the model produces unexpected output, the API times out, or a tool execution fails. Production AI features need error handling that covers all three.
My standard error boundary pattern wraps every AI interaction:
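import { APICallError } from 'ai'

// Inside the route handler. In AI SDK 4, streamText returns immediately and
// reports most provider errors through the stream itself, so this catch mainly
// covers failures raised before streaming starts. Awaited calls such as
// generateText and generateObject reject here directly.
try {
  const result = streamText({ ... })
  return result.toDataStreamResponse()
} catch (error) {
  if (error instanceof APICallError) {
    // Rate limit, timeout, or API error
    return Response.json(
      { error: 'The AI service is temporarily unavailable. Please try again.' },
      { status: 503 }
    )
  }
  // Unexpected error
  console.error('AI generation error:', error)
  return Response.json(
    { error: 'Something went wrong. Your message was not lost. Please try again.' },
    { status: 500 }
  )
}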
The key is that error messages should be user-friendly and reassuring. "Your message was not lost" matters to a user in the middle of a complex interaction.
Production refinement: I implement automatic retries with exponential backoff for transient errors (rate limits, timeouts), but never for content errors (malformed output, schema validation failures). Retrying a content error with the same prompt usually produces the same error.
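A minimal sketch of that retry policy, under stated assumptions: errors from the provider are assumed to carry a numeric statusCode field, timeouts are assumed to surface as errors named TimeoutError, and the 500ms base and 8s cap are arbitrary illustrative values. The helper names (classifyError, backoffDelayMs, withRetry) are hypothetical. AI SDK calls also accept a maxRetries option, which may be enough for simple cases.

```typescript
type ErrorKind = 'transient' | 'content'

// Assumption: provider errors expose a numeric `statusCode`, and timeouts
// surface as errors named 'TimeoutError'. Everything else is treated as a
// content error and is never retried.
function classifyError(error: unknown): ErrorKind {
  const code = (error as { statusCode?: unknown } | null | undefined)?.statusCode
  if (typeof code === 'number' && (code === 408 || code === 429 || code >= 500)) {
    return 'transient'
  }
  if (error instanceof Error && error.name === 'TimeoutError') return 'transient'
  return 'content'
}

// Exponential backoff: base, 2x base, 4x base, ... capped at maxMs.
function backoffDelayMs(attempt: number, baseMs = 500, maxMs = 8000): number {
  return Math.min(maxMs, baseMs * 2 ** attempt)
}

async function withRetry<T>(
  fn: () => Promise<T>,
  maxAttempts = 3,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn()
    } catch (error) {
      const isLastAttempt = attempt >= maxAttempts - 1
      // Content errors and exhausted retries go straight to the caller.
      if (isLastAttempt || classifyError(error) !== 'transient') throw error
      await sleep(backoffDelayMs(attempt))
    }
  }
}
```

The sleep function is injectable so tests can skip the real delays. In production code, adding random jitter to the delay is a common refinement that this sketch leaves out.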
Pattern 5: Context Window Management
When conversations get long, you hit context window limits. My pattern is a sliding window with summarization:
For conversations that approach the context limit, I use a separate AI call to summarize the earlier portion of the conversation, then replace the old messages with the summary. This preserves the essential context while freeing up space for new interaction.
The summarization prompt is specific: "Summarize the conversation so far, preserving all factual details the user has provided, all recommendations given, and all decisions made. Do not lose any specific names, dates, or numbers."
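The windowing step might look roughly like the sketch below. The names (ChatMessage, planCompaction, applySummary) are hypothetical, the four-characters-per-token estimate is a crude heuristic rather than a real tokenizer, and the summary itself would come from the separate AI call described above, which is left out here.

```typescript
type ChatMessage = { role: 'system' | 'user' | 'assistant'; content: string }

// Rough heuristic (assumption): about 4 characters per token.
function estimateTokens(messages: ChatMessage[]): number {
  const chars = messages.reduce((total, m) => total + m.content.length, 0)
  return Math.ceil(chars / 4)
}

// Decide which messages to summarize and which recent ones to keep verbatim.
function planCompaction(
  messages: ChatMessage[],
  tokenBudget: number,
  keepRecent: number,
): { toSummarize: ChatMessage[]; toKeep: ChatMessage[] } {
  const withinBudget = estimateTokens(messages) <= tokenBudget
  if (withinBudget || messages.length <= keepRecent) {
    return { toSummarize: [], toKeep: messages }
  }
  const cut = messages.length - keepRecent
  return { toSummarize: messages.slice(0, cut), toKeep: messages.slice(cut) }
}

// Replace the summarized portion with a single system message.
function applySummary(summary: string, toKeep: ChatMessage[]): ChatMessage[] {
  return [
    { role: 'system', content: `Summary of the earlier conversation: ${summary}` },
    ...toKeep,
  ]
}
```

A caller would run planCompaction before each request, send toSummarize through the summarization prompt when it is non-empty, and pass the result of applySummary to the model in place of the full history.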
The Meta-Pattern
The overarching pattern across all of these is: start simple, add complexity only where production use demands it. AI SDK makes the simple case easy (stream a response, display it) and the complex case possible (tools, structured output, error handling, context management).
Every pattern I described started as the simplest possible implementation and evolved through real usage. That is the best way to build with AI: ship the simple version, watch how users interact with it, and add the refinements that actual usage reveals you need.