
Streaming

Some chat models provide a streaming response. This means that instead of waiting for the entire response to be returned, you can start processing it as soon as it's available. This is useful if you want to display the response to the user as it's being generated, or to process it as it arrives.

Using .stream()

The easiest way to stream is to use the .stream() method. This returns a readable stream that you can also iterate over:

npm install @langchain/openai
tip

We're unifying model params across all packages. We now suggest using model instead of modelName, and apiKey for API keys.
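For example, the unified constructor params look like this (a minimal sketch; the model name shown is an assumption):

import { ChatOpenAI } from "@langchain/openai";

const chat = new ChatOpenAI({
  model: "gpt-3.5-turbo", // previously `modelName`
  apiKey: process.env.OPENAI_API_KEY, // previously `openAIApiKey`
});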

import { ChatOpenAI } from "@langchain/openai";

const chat = new ChatOpenAI({
  maxTokens: 25,
});

// Pass in a human message. Also accepts a raw string, which is automatically
// inferred to be a human message.
const stream = await chat.stream([["human", "Tell me a joke about bears."]]);

for await (const chunk of stream) {
  console.log(chunk);
}
/*
AIMessageChunk {
  content: '',
  additional_kwargs: {}
}
AIMessageChunk {
  content: 'Why',
  additional_kwargs: {}
}
AIMessageChunk {
  content: ' did',
  additional_kwargs: {}
}
AIMessageChunk {
  content: ' the',
  additional_kwargs: {}
}
AIMessageChunk {
  content: ' bear',
  additional_kwargs: {}
}
AIMessageChunk {
  content: ' bring',
  additional_kwargs: {}
}
AIMessageChunk {
  content: ' a',
  additional_kwargs: {}
}
...
*/
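Each chunk is additive, so if you also want the final, complete message you can aggregate the chunks as they arrive (a minimal sketch, assuming the .concat() method on message chunks):

import type { AIMessageChunk } from "@langchain/core/messages";

let final: AIMessageChunk | undefined;
for await (const chunk of await chat.stream("Tell me a joke about bears.")) {
  // Chunks concatenate field-by-field into a single running message
  final = final === undefined ? chunk : final.concat(chunk);
}
console.log(final?.content);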


For models that do not support streaming, the entire response will be returned as a single chunk.

For convenience, you can also pipe a chat model into a StringOutputParser to extract just the raw string values from each chunk:

import { ChatOpenAI } from "@langchain/openai";
import { StringOutputParser } from "@langchain/core/output_parsers";

const parser = new StringOutputParser();

const model = new ChatOpenAI({ temperature: 0 });

const stream = await model.pipe(parser).stream("Hello there!");

for await (const chunk of stream) {
  console.log(chunk);
}

/*
Hello
!
How
can
I
assist
you
today
?
*/
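Since each chunk is now a plain string, collecting the full output is just string concatenation (a small sketch reusing the model and parser from above):

let output = "";
for await (const chunk of await model.pipe(parser).stream("Hello there!")) {
  output += chunk; // append each streamed string fragment
}
console.log(output); // the complete response as one string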


You can also do something similar to stream bytes directly (e.g. for returning a stream in an HTTP response) using the HttpResponseOutputParser:

import { ChatOpenAI } from "@langchain/openai";
import { HttpResponseOutputParser } from "langchain/output_parsers";

const handler = async () => {
  const parser = new HttpResponseOutputParser();

  const model = new ChatOpenAI({ temperature: 0 });

  const stream = await model.pipe(parser).stream("Hello there!");

  const httpResponse = new Response(stream, {
    headers: {
      "Content-Type": "text/plain; charset=utf-8",
    },
  });

  return httpResponse;
};

await handler();
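If you want to inspect the streamed bytes locally rather than returning them from a server route, you can read the Response body directly (a sketch using the standard web streams API):

const res = await handler();
const reader = res.body!.getReader();
const decoder = new TextDecoder();
while (true) {
  const { value, done } = await reader.read();
  if (done) break;
  // Each value is a Uint8Array of UTF-8 bytes produced by the parser
  console.log(decoder.decode(value));
}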


Using a callback handler

You can also use a CallbackHandler like so:

import { ChatOpenAI } from "@langchain/openai";
import { HumanMessage } from "@langchain/core/messages";

const chat = new ChatOpenAI({
  maxTokens: 25,
  streaming: true,
});

const response = await chat.invoke([new HumanMessage("Tell me a joke.")], {
  callbacks: [
    {
      handleLLMNewToken(token: string) {
        console.log({ token });
      },
    },
  ],
});

console.log(response);
// { token: '' }
// { token: '\n\n' }
// { token: 'Why' }
// { token: ' don' }
// { token: "'t" }
// { token: ' scientists' }
// { token: ' trust' }
// { token: ' atoms' }
// { token: '?\n\n' }
// { token: 'Because' }
// { token: ' they' }
// { token: ' make' }
// { token: ' up' }
// { token: ' everything' }
// { token: '.' }
// { token: '' }
// AIMessage {
//   text: "\n\nWhy don't scientists trust atoms?\n\nBecause they make up everything."
// }
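The handleLLMNewToken callback fires once per token while invoke() still resolves to the final message, so you can forward tokens elsewhere as they arrive (a minimal sketch; the tokens array is illustrative):

const tokens: string[] = [];
const message = await chat.invoke([new HumanMessage("Tell me a joke.")], {
  callbacks: [
    {
      handleLLMNewToken(token: string) {
        tokens.push(token); // e.g. forward each token to a client here
      },
    },
  ],
});
// The accumulated tokens typically reassemble the final message content
console.log(tokens.join(""));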


