# Server Setup

Configure your backend API for the Copilot SDK.

Your server receives messages from the Copilot SDK frontend, calls the LLM, and returns the response.
## Quick Start

### 1. Install Dependencies
**OpenAI**

```bash
npm install @yourgpt/llm-sdk openai
```

See the OpenAI Provider for all available models.

**Anthropic**

```bash
npm install @yourgpt/llm-sdk @anthropic-ai/sdk
```

Anthropic requires its own SDK (`@anthropic-ai/sdk`). See the Anthropic Provider for models and extended thinking.

**Google**

```bash
npm install @yourgpt/llm-sdk openai
```

Google uses an OpenAI-compatible API. See the Google Provider for Gemini models.

**xAI**

```bash
npm install @yourgpt/llm-sdk openai
```

xAI uses an OpenAI-compatible API. See the xAI Provider for Grok models.
### 2. Set Environment Variables

```bash
OPENAI_API_KEY=sk-...
# or ANTHROPIC_API_KEY=sk-ant-...
# or GOOGLE_API_KEY=...
# or XAI_API_KEY=...
```

### 3. Create Runtime
**OpenAI**

```ts
import { createRuntime } from '@yourgpt/llm-sdk';
import { createOpenAI } from '@yourgpt/llm-sdk/openai';

const runtime = createRuntime({
  provider: createOpenAI({ apiKey: process.env.OPENAI_API_KEY }),
  model: 'gpt-4o',
  systemPrompt: 'You are a helpful assistant.',
});
```

**Anthropic**

```ts
import { createRuntime } from '@yourgpt/llm-sdk';
import { createAnthropic } from '@yourgpt/llm-sdk/anthropic';

const runtime = createRuntime({
  provider: createAnthropic({ apiKey: process.env.ANTHROPIC_API_KEY }),
  model: 'claude-sonnet-4-20250514',
  systemPrompt: 'You are a helpful assistant.',
});
```

**Google**

```ts
import { createRuntime } from '@yourgpt/llm-sdk';
import { createGoogle } from '@yourgpt/llm-sdk/google';

const runtime = createRuntime({
  provider: createGoogle({ apiKey: process.env.GOOGLE_API_KEY }),
  model: 'gemini-2.0-flash',
  systemPrompt: 'You are a helpful assistant.',
});
```

**xAI**

```ts
import { createRuntime } from '@yourgpt/llm-sdk';
import { createXAI } from '@yourgpt/llm-sdk/xai';

const runtime = createRuntime({
  provider: createXAI({ apiKey: process.env.XAI_API_KEY }),
  model: 'grok-3-fast-beta',
  systemPrompt: 'You are a helpful assistant.',
});
```

## Usage
**Next.js (App Router)**

```ts
import { createRuntime } from '@yourgpt/llm-sdk';
import { createOpenAI } from '@yourgpt/llm-sdk/openai';

const runtime = createRuntime({
  provider: createOpenAI({ apiKey: process.env.OPENAI_API_KEY }),
  model: 'gpt-4o',
  systemPrompt: 'You are a helpful assistant.',
});

export async function POST(req: Request) {
  const body = await req.json();

  // Streaming - real-time response
  return runtime.stream(body).toResponse();

  // Non-streaming - wait for the complete response
  // const result = await runtime.chat(body);
  // return Response.json(result);
}
```

**Express**

```ts
import express from 'express';
import { createRuntime } from '@yourgpt/llm-sdk';
import { createOpenAI } from '@yourgpt/llm-sdk/openai';

const app = express();
app.use(express.json());

const runtime = createRuntime({
  provider: createOpenAI({ apiKey: process.env.OPENAI_API_KEY }),
  model: 'gpt-4o',
  systemPrompt: 'You are a helpful assistant.',
});

// Copilot SDK - Streaming (SSE)
app.post('/api/copilot/stream', async (req, res) => {
  await runtime.stream(req.body).pipeToResponse(res);
});

// Copilot SDK - Non-streaming (JSON)
app.post('/api/copilot/chat', async (req, res) => {
  const result = await runtime.chat(req.body);
  res.json(result);
});

app.listen(3001);
```

See the full Express Demo for all endpoint variations, including raw streaming and text-only responses.
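As one example of those variations, a text-only Express endpoint might look like this (a sketch using the `pipeTextToResponse()` method from the response-method tables below; the route path is illustrative):

```ts
// Plain-text streaming endpoint - emits text/plain, which the
// Copilot SDK cannot parse; only for direct, non-Copilot clients.
app.post('/api/copilot/text', async (req, res) => {
  await runtime.stream(req.body).pipeTextToResponse(res);
});
```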
**Node.js (http)**

```ts
import { createServer, IncomingMessage } from 'http';
import { createRuntime } from '@yourgpt/llm-sdk';
import { createOpenAI } from '@yourgpt/llm-sdk/openai';

const runtime = createRuntime({
  provider: createOpenAI({ apiKey: process.env.OPENAI_API_KEY }),
  model: 'gpt-4o',
  systemPrompt: 'You are a helpful assistant.',
});

createServer(async (req, res) => {
  // Reject non-POST requests instead of leaving the connection hanging
  if (req.method !== 'POST') {
    res.writeHead(405);
    return res.end();
  }

  const body = JSON.parse(await getBody(req));

  if (req.url === '/api/copilot/stream') {
    // Copilot SDK - Streaming (SSE)
    await runtime.stream(body).pipeToResponse(res);
  } else if (req.url === '/api/copilot/chat') {
    // Copilot SDK - Non-streaming (JSON)
    const result = await runtime.chat(body);
    res.writeHead(200, { 'Content-Type': 'application/json' });
    res.end(JSON.stringify(result));
  }
}).listen(3001);

// Collect the raw request body as a string
function getBody(req: IncomingMessage): Promise<string> {
  return new Promise((resolve) => {
    let data = '';
    req.on('data', (chunk) => (data += chunk));
    req.on('end', () => resolve(data));
  });
}
```

| Method | Use Case | Returns |
|---|---|---|
| `runtime.stream(body)` | Real-time chat, interactive UI | `StreamResult` with `.toResponse()`, `.pipeToResponse()` |
| `runtime.chat(body)` | Background tasks, batch processing | `{ text, messages, toolCalls }` |
## All Response Methods

The `runtime.stream()` method returns a `StreamResult` with multiple ways to consume the response.

### For Copilot SDK (SSE format)

| Method | Returns | Framework |
|---|---|---|
| `toResponse()` | Web `Response` (SSE) | Next.js, Hono, Deno |
| `pipeToResponse(res)` | Pipes the SSE stream to `res` | Express, Node.js |
```ts
// Next.js / Hono
return runtime.stream(body).toResponse();

// Express
await runtime.stream(body).pipeToResponse(res);
```

### For Non-Streaming

| Method | Returns | Description |
|---|---|---|
| `collect()` | `CollectedResult` | Wait for the full response |
| `text()` | `string` | Just get the final text |
```ts
// Get the full result
const { text, messages, toolCalls } = await runtime.stream(body).collect();

// Or just the text
const text = await runtime.stream(body).text();

// Or use the convenience method
const result = await runtime.chat(body); // Same as stream().collect()
```

### Using Runtime generate() (CopilotChat Compatible)

The `generate()` method returns a `GenerateResult` with both raw property access and a `toResponse()` method that produces the CopilotChat-compatible format:
**Next.js**

```ts
import { createRuntime } from '@yourgpt/llm-sdk';
import { createOpenAI } from '@yourgpt/llm-sdk/openai';

const runtime = createRuntime({
  provider: createOpenAI({ apiKey: process.env.OPENAI_API_KEY }),
  model: 'gpt-4o',
  systemPrompt: 'You are a helpful assistant.',
});

export async function POST(req: Request) {
  const body = await req.json();
  const result = await runtime.generate(body);

  // CopilotChat-compatible format
  return Response.json(result.toResponse());

  // Or access raw properties
  // return Response.json({ text: result.text, toolCalls: result.toolCalls });
}
```

**Express**

```ts
import express from 'express';
import { createRuntime } from '@yourgpt/llm-sdk';
import { createOpenAI } from '@yourgpt/llm-sdk/openai';

const app = express();
app.use(express.json());

const runtime = createRuntime({
  provider: createOpenAI({ apiKey: process.env.OPENAI_API_KEY }),
  model: 'gpt-4o',
  systemPrompt: 'You are a helpful assistant.',
});

app.post('/api/chat', async (req, res) => {
  const result = await runtime.generate(req.body);
  res.json(result.toResponse());
});

app.listen(3001);
```

**Hono**

```ts
import { Hono } from 'hono';
import { createRuntime } from '@yourgpt/llm-sdk';
import { createOpenAI } from '@yourgpt/llm-sdk/openai';

const app = new Hono();

const runtime = createRuntime({
  provider: createOpenAI({ apiKey: process.env.OPENAI_API_KEY }),
  model: 'gpt-4o',
  systemPrompt: 'You are a helpful assistant.',
});

app.post('/api/chat', async (c) => {
  const result = await runtime.generate(await c.req.json());
  return c.json(result.toResponse());
});

export default app;
```

`GenerateResult` properties:

- `result.text` - Generated text content
- `result.messages` - All conversation messages
- `result.toolCalls` - Tool calls made during generation
- `result.toolResults` - Results from tool executions
- `result.success` - Whether generation was successful
- `result.toResponse()` - CopilotChat-compatible JSON format
When to use non-streaming:
- Background processing or batch operations
- When you need the full response before taking action
- Simpler integration without SSE handling
- Logging or analytics that need complete responses
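As a sketch of the batch case (`saveTranscript` is a hypothetical persistence helper, shown only to illustrate acting on a complete response):

```ts
// Non-streaming: wait for the full result, then act on it.
const result = await runtime.generate(body);

if (result.success) {
  // saveTranscript is a hypothetical helper, not part of the SDK
  await saveTranscript({ text: result.text, toolCalls: result.toolCalls });
}
```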
### For Direct Text Streaming (Not Copilot SDK)

These methods return plain text (`text/plain`), which the Copilot SDK cannot parse. Use them only for direct streaming to non-Copilot clients.

| Method | Returns | Framework |
|---|---|---|
| `toTextResponse()` | Web `Response` (`text/plain`) | Next.js, Hono, Deno |
| `pipeTextToResponse(res)` | Pipes the text stream to `res` | Express, Node.js |
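Usage mirrors the SSE variants above (a brief sketch using the methods from this table):

```ts
// Next.js / Hono - plain-text stream (not parseable by the Copilot SDK)
return runtime.stream(body).toTextResponse();

// Express
await runtime.stream(body).pipeTextToResponse(res);
```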
### For Custom Handling

| Method | Returns | Description |
|---|---|---|
| `toReadableStream()` | `ReadableStream<Uint8Array>` | Raw stream for custom processing |
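For example, you could read and decode the raw bytes yourself (a minimal sketch using the standard Web Streams API; the chunk contents are whatever the underlying stream emits):

```ts
const stream = runtime.stream(body).toReadableStream();
const reader = stream.getReader();
const decoder = new TextDecoder();

// Read raw Uint8Array chunks until the stream closes
while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  console.log(decoder.decode(value, { stream: true }));
}
```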
### Event Handlers

Process events as they stream (similar to the Anthropic SDK):

```ts
const result = runtime.stream(body)
  .on('text', (text) => console.log('Text:', text))
  .on('toolCall', (call) => console.log('Tool:', call.name))
  .on('done', (result) => console.log('Done:', result.text))
  .on('error', (err) => console.error('Error:', err));

await result.pipeToResponse(res);
```

## Connect Frontend
Point your Copilot SDK frontend to your API endpoint:
```tsx
'use client';

import { CopilotProvider } from '@yourgpt/copilot-sdk/react';

export function Providers({ children }: { children: React.ReactNode }) {
  return (
    <CopilotProvider runtimeUrl="/api/copilot/stream">
      {children}
    </CopilotProvider>
  );
}
```

For a separate backend server:
```tsx
// Streaming (default)
<CopilotProvider runtimeUrl="http://localhost:3001/api/copilot/stream">

// Non-streaming
<CopilotProvider
  runtimeUrl="http://localhost:3001/api/copilot/chat"
  streaming={false}
>
```

| Mode | Endpoint | CopilotProvider |
|---|---|---|
| Streaming (SSE) | `/api/copilot/stream` | `streaming={true}` (default) |
| Non-streaming (JSON) | `/api/copilot/chat` | `streaming={false}` |
## Advanced

### Direct AI Functions

For more control, or for standalone usage without the Runtime, you can use the AI functions directly:

| Function | Description | Link |
|---|---|---|
| `streamText()` | Stream text in real time | Documentation |
| `generateText()` | Generate complete text (non-streaming) | Documentation |
```ts
import { streamText, generateText } from '@yourgpt/llm-sdk';
import { openai } from '@yourgpt/llm-sdk/openai';

// Streaming
const stream = await streamText({
  model: openai('gpt-4o'),
  messages,
});
return stream.toDataStreamResponse();

// Non-streaming
const result = await generateText({
  model: openai('gpt-4o'),
  messages,
});
return Response.json({ text: result.text });
```

**Note:** When using `streamText()` with the Copilot SDK, use `toDataStreamResponse()` (not `toTextStreamResponse()`). See the streamText documentation for details.
## Next Steps

- Tools - Learn more about frontend and backend tools
- Providers - Provider-specific configuration
- Chat History - Persist conversations across sessions
- LLM SDK - Low-level AI functions