# Server Setup

Configure your backend API for the Copilot SDK.

Your server receives messages from the Copilot SDK frontend, calls the LLM, and returns the response.
## Quick Start

### 1. Install Dependencies
**OpenAI**

```bash
npm install @yourgpt/llm-sdk openai
```

See the OpenAI Provider for all available models.

**Anthropic**

```bash
npm install @yourgpt/llm-sdk @anthropic-ai/sdk
```

Anthropic requires its own SDK (`@anthropic-ai/sdk`). See the Anthropic Provider for models and extended thinking.

**Google**

```bash
npm install @yourgpt/llm-sdk openai
```

Google uses an OpenAI-compatible API. See the Google Provider for Gemini models.

**xAI**

```bash
npm install @yourgpt/llm-sdk openai
```

xAI uses an OpenAI-compatible API. See the xAI Provider for Grok models.
### 2. Set Environment Variables

```bash
OPENAI_API_KEY=sk-...
# or ANTHROPIC_API_KEY=sk-ant-...
# or GOOGLE_API_KEY=...
# or XAI_API_KEY=...
```

### 3. Create Runtime
**OpenAI**

```ts
import { createRuntime } from '@yourgpt/llm-sdk';
import { createOpenAI } from '@yourgpt/llm-sdk/openai';

const runtime = createRuntime({
  provider: createOpenAI({ apiKey: process.env.OPENAI_API_KEY }),
  model: 'gpt-4o',
  systemPrompt: 'You are a helpful assistant.',
});
```

**Anthropic**

```ts
import { createRuntime } from '@yourgpt/llm-sdk';
import { createAnthropic } from '@yourgpt/llm-sdk/anthropic';

const runtime = createRuntime({
  provider: createAnthropic({ apiKey: process.env.ANTHROPIC_API_KEY }),
  model: 'claude-sonnet-4-20250514',
  systemPrompt: 'You are a helpful assistant.',
});
```

**Google**

```ts
import { createRuntime } from '@yourgpt/llm-sdk';
import { createGoogle } from '@yourgpt/llm-sdk/google';

const runtime = createRuntime({
  provider: createGoogle({ apiKey: process.env.GOOGLE_API_KEY }),
  model: 'gemini-2.0-flash',
  systemPrompt: 'You are a helpful assistant.',
});
```

**xAI**

```ts
import { createRuntime } from '@yourgpt/llm-sdk';
import { createXAI } from '@yourgpt/llm-sdk/xai';

const runtime = createRuntime({
  provider: createXAI({ apiKey: process.env.XAI_API_KEY }),
  model: 'grok-3-fast-beta',
  systemPrompt: 'You are a helpful assistant.',
});
```

## Usage
**Next.js (App Router)**

```ts
import { createRuntime } from '@yourgpt/llm-sdk';
import { createOpenAI } from '@yourgpt/llm-sdk/openai';

const runtime = createRuntime({
  provider: createOpenAI({ apiKey: process.env.OPENAI_API_KEY }),
  model: 'gpt-4o',
  systemPrompt: 'You are a helpful assistant.',
});

export async function POST(req: Request) {
  const body = await req.json();

  // Streaming - real-time response
  return runtime.stream(body).toResponse();

  // Non-streaming - wait for the complete response
  // const result = await runtime.chat(body);
  // return Response.json(result);
}
```

**Express**

```ts
import express from 'express';
import { createRuntime } from '@yourgpt/llm-sdk';
import { createOpenAI } from '@yourgpt/llm-sdk/openai';

const app = express();
app.use(express.json());

const runtime = createRuntime({
  provider: createOpenAI({ apiKey: process.env.OPENAI_API_KEY }),
  model: 'gpt-4o',
  systemPrompt: 'You are a helpful assistant.',
});

// Copilot SDK - Streaming (SSE)
app.post('/api/copilot/stream', async (req, res) => {
  await runtime.stream(req.body).pipeToResponse(res);
});

// Copilot SDK - Non-streaming (JSON)
app.post('/api/copilot/chat', async (req, res) => {
  const result = await runtime.chat(req.body);
  res.json(result);
});

app.listen(3001);
```

See the full Express Demo for all endpoint variations, including raw streaming and text-only responses.
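As one example of those variations, a text-only Express endpoint might look like this (a sketch using the `pipeTextToResponse()` method from the response-method tables below; the route path is illustrative):

```ts
// Plain-text streaming endpoint - emits text/plain, which the
// Copilot SDK cannot parse; only for direct, non-Copilot clients.
app.post('/api/copilot/text', async (req, res) => {
  await runtime.stream(req.body).pipeTextToResponse(res);
});
```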
**Node.js (http)**

```ts
import { createServer, IncomingMessage } from 'http';
import { createRuntime } from '@yourgpt/llm-sdk';
import { createOpenAI } from '@yourgpt/llm-sdk/openai';

const runtime = createRuntime({
  provider: createOpenAI({ apiKey: process.env.OPENAI_API_KEY }),
  model: 'gpt-4o',
  systemPrompt: 'You are a helpful assistant.',
});

createServer(async (req, res) => {
  // Reject non-POST requests instead of leaving the connection hanging
  if (req.method !== 'POST') {
    res.writeHead(405);
    return res.end();
  }

  const body = JSON.parse(await getBody(req));

  if (req.url === '/api/copilot/stream') {
    // Copilot SDK - Streaming (SSE)
    await runtime.stream(body).pipeToResponse(res);
  } else if (req.url === '/api/copilot/chat') {
    // Copilot SDK - Non-streaming (JSON)
    const result = await runtime.chat(body);
    res.writeHead(200, { 'Content-Type': 'application/json' });
    res.end(JSON.stringify(result));
  }
}).listen(3001);

// Collect the raw request body as a string
function getBody(req: IncomingMessage): Promise<string> {
  return new Promise((resolve) => {
    let data = '';
    req.on('data', (chunk) => (data += chunk));
    req.on('end', () => resolve(data));
  });
}
```

| Method | Use Case | Returns |
|---|---|---|
| `runtime.stream(body)` | Real-time chat, interactive UI | `StreamResult` with `.toResponse()`, `.pipeToResponse()` |
| `runtime.chat(body)` | Background tasks, batch processing | `{ text, messages, toolCalls }` |
## All Response Methods

The `runtime.stream()` method returns a `StreamResult` with multiple ways to consume the response.

### For Copilot SDK (SSE format)

| Method | Returns | Framework |
|---|---|---|
| `toResponse()` | Web `Response` (SSE) | Next.js, Hono, Deno |
| `pipeToResponse(res)` | Pipes the SSE stream to `res` | Express, Node.js |
```ts
// Next.js / Hono
return runtime.stream(body).toResponse();

// Express
await runtime.stream(body).pipeToResponse(res);
```

### For Non-Streaming

| Method | Returns | Description |
|---|---|---|
| `collect()` | `CollectedResult` | Wait for the full response |
| `text()` | `string` | Just get the final text |
```ts
// Get the full result
const { text, messages, toolCalls } = await runtime.stream(body).collect();

// Or just the text
const text = await runtime.stream(body).text();

// Or use the convenience method
const result = await runtime.chat(body); // Same as stream().collect()
```

### Using Runtime generate() (CopilotChat Compatible)

The `generate()` method returns a `GenerateResult` with both raw property access and a `toResponse()` method that produces the CopilotChat-compatible format:
**Next.js**

```ts
import { createRuntime } from '@yourgpt/llm-sdk';
import { createOpenAI } from '@yourgpt/llm-sdk/openai';

const runtime = createRuntime({
  provider: createOpenAI({ apiKey: process.env.OPENAI_API_KEY }),
  model: 'gpt-4o',
  systemPrompt: 'You are a helpful assistant.',
});

export async function POST(req: Request) {
  const body = await req.json();
  const result = await runtime.generate(body);

  // CopilotChat-compatible format
  return Response.json(result.toResponse());

  // Or access raw properties
  // return Response.json({ text: result.text, toolCalls: result.toolCalls });
}
```

**Express**

```ts
import express from 'express';
import { createRuntime } from '@yourgpt/llm-sdk';
import { createOpenAI } from '@yourgpt/llm-sdk/openai';

const app = express();
app.use(express.json());

const runtime = createRuntime({
  provider: createOpenAI({ apiKey: process.env.OPENAI_API_KEY }),
  model: 'gpt-4o',
  systemPrompt: 'You are a helpful assistant.',
});

app.post('/api/chat', async (req, res) => {
  const result = await runtime.generate(req.body);
  res.json(result.toResponse());
});

app.listen(3001);
```

**Hono**

```ts
import { Hono } from 'hono';
import { createRuntime } from '@yourgpt/llm-sdk';
import { createOpenAI } from '@yourgpt/llm-sdk/openai';

const app = new Hono();

const runtime = createRuntime({
  provider: createOpenAI({ apiKey: process.env.OPENAI_API_KEY }),
  model: 'gpt-4o',
  systemPrompt: 'You are a helpful assistant.',
});

app.post('/api/chat', async (c) => {
  const result = await runtime.generate(await c.req.json());
  return c.json(result.toResponse());
});

export default app;
```

`GenerateResult` properties:

- `result.text` - Generated text content
- `result.messages` - All conversation messages
- `result.toolCalls` - Tool calls made during generation
- `result.toolResults` - Results from tool executions
- `result.success` - Whether generation was successful
- `result.toResponse()` - CopilotChat-compatible JSON format
When to use non-streaming:
- Background processing or batch operations
- When you need the full response before taking action
- Simpler integration without SSE handling
- Logging or analytics that need complete responses
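As a sketch of the batch case (`saveTranscript` is a hypothetical persistence helper, shown only to illustrate acting on a complete response):

```ts
// Non-streaming: wait for the full result, then act on it.
const result = await runtime.generate(body);

if (result.success) {
  // saveTranscript is a hypothetical helper, not part of the SDK
  await saveTranscript({ text: result.text, toolCalls: result.toolCalls });
}
```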
### For Direct Text Streaming (Not Copilot SDK)

These methods return plain text (`text/plain`), which the Copilot SDK cannot parse. Use them only for direct streaming to non-Copilot clients.

| Method | Returns | Framework |
|---|---|---|
| `toTextResponse()` | Web `Response` (`text/plain`) | Next.js, Hono, Deno |
| `pipeTextToResponse(res)` | Pipes the text stream to `res` | Express, Node.js |
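Usage mirrors the SSE variants above (a brief sketch using the methods from this table):

```ts
// Next.js / Hono - plain-text stream (not parseable by the Copilot SDK)
return runtime.stream(body).toTextResponse();

// Express
await runtime.stream(body).pipeTextToResponse(res);
```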
### For Custom Handling

| Method | Returns | Description |
|---|---|---|
| `toReadableStream()` | `ReadableStream<Uint8Array>` | Raw stream for custom processing |
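For example, you could read and decode the raw bytes yourself (a minimal sketch using the standard Web Streams API; the chunk contents are whatever the underlying stream emits):

```ts
const stream = runtime.stream(body).toReadableStream();
const reader = stream.getReader();
const decoder = new TextDecoder();

// Read raw Uint8Array chunks until the stream closes
while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  console.log(decoder.decode(value, { stream: true }));
}
```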
### Event Handlers

Process events as they stream (similar to the Anthropic SDK):

```ts
const result = runtime.stream(body)
  .on('text', (text) => console.log('Text:', text))
  .on('toolCall', (call) => console.log('Tool:', call.name))
  .on('done', (result) => console.log('Done:', result.text))
  .on('error', (err) => console.error('Error:', err));

await result.pipeToResponse(res);
```

## Connect Frontend
Point your Copilot SDK frontend to your API endpoint:
```tsx
'use client';

import { CopilotProvider } from '@yourgpt/copilot-sdk/react';

export function Providers({ children }: { children: React.ReactNode }) {
  return (
    <CopilotProvider runtimeUrl="/api/copilot/stream">
      {children}
    </CopilotProvider>
  );
}
```

For a separate backend server:
```tsx
// Streaming (default)
<CopilotProvider runtimeUrl="http://localhost:3001/api/copilot/stream">

// Non-streaming
<CopilotProvider
  runtimeUrl="http://localhost:3001/api/copilot/chat"
  streaming={false}
>
```

| Mode | Endpoint | CopilotProvider |
|---|---|---|
| Streaming (SSE) | `/api/copilot/stream` | `streaming={true}` (default) |
| Non-streaming (JSON) | `/api/copilot/chat` | `streaming={false}` |
## Advanced

### Direct AI Functions

For more control, or for standalone usage without the Runtime, you can use the AI functions directly:

| Function | Description | Link |
|---|---|---|
| `streamText()` | Stream text in real time | Documentation |
| `generateText()` | Generate complete text (non-streaming) | Documentation |
```ts
import { streamText, generateText } from '@yourgpt/llm-sdk';
import { openai } from '@yourgpt/llm-sdk/openai';

// Streaming
const stream = await streamText({
  model: openai('gpt-4o'),
  messages,
});
return stream.toDataStreamResponse();

// Non-streaming
const result = await generateText({
  model: openai('gpt-4o'),
  messages,
});
return Response.json({ text: result.text });
```

**Note:** When using `streamText()` with the Copilot SDK, use `toDataStreamResponse()` (not `toTextStreamResponse()`). See the streamText documentation for details.
## Next Steps

- Tools - Learn more about frontend and backend tools
- Providers - Provider-specific configuration
- Chat History - Persist conversations across sessions
- LLM SDK - Low-level AI functions