ChatOpenAI
OpenAI is an artificial intelligence (AI) research laboratory.
This guide will help you get started with ChatOpenAI chat models. For detailed documentation of all ChatOpenAI features and configurations, head to the API reference.
Overview
Integration details
Class | Package | Local | Serializable | PY support | Package downloads | Package latest |
---|---|---|---|---|---|---|
ChatOpenAI | @langchain/openai | ❌ | ✅ | ✅ | | |
Model features
See the links in the table headers below for guides on how to use specific features.
Tool calling | Structured output | JSON mode | Image input | Audio input | Video input | Token-level streaming | Token usage | Logprobs |
---|---|---|---|---|---|---|---|---|
✅ | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ | ✅ | ✅ |
Setup
To access OpenAI chat models you'll need to create an OpenAI account, get an API key, and install the @langchain/openai integration package.
Credentials
Head to OpenAI's website to sign up for OpenAI and generate an API key. Once you've done this, set the OPENAI_API_KEY environment variable:
export OPENAI_API_KEY="your-api-key"
If you want automated tracing of your model calls, you can also set your LangSmith API key by uncommenting the lines below:
# export LANGCHAIN_TRACING_V2="true"
# export LANGCHAIN_API_KEY="your-api-key"
Installation
The LangChain ChatOpenAI
integration lives in the @langchain/openai
package:
- npm
- yarn
- pnpm
npm i @langchain/openai @langchain/core
yarn add @langchain/openai @langchain/core
pnpm add @langchain/openai @langchain/core
Instantiation
Now we can instantiate our model object and generate chat completions:
import { ChatOpenAI } from "@langchain/openai";
const llm = new ChatOpenAI({
model: "gpt-4o",
temperature: 0,
// other params...
});
Invocation
const aiMsg = await llm.invoke([
{
role: "system",
content:
"You are a helpful assistant that translates English to French. Translate the user sentence.",
},
{
role: "user",
content: "I love programming.",
},
]);
aiMsg;
AIMessage {
"id": "chatcmpl-ADItECqSPuuEuBHHPjeCkh9wIO1H5",
"content": "J'adore la programmation.",
"additional_kwargs": {},
"response_metadata": {
"tokenUsage": {
"completionTokens": 5,
"promptTokens": 31,
"totalTokens": 36
},
"finish_reason": "stop",
"system_fingerprint": "fp_5796ac6771"
},
"tool_calls": [],
"invalid_tool_calls": [],
"usage_metadata": {
"input_tokens": 31,
"output_tokens": 5,
"total_tokens": 36
}
}
console.log(aiMsg.content);
J'adore la programmation.
Chaining
We can chain our model with a prompt template like so:
import { ChatPromptTemplate } from "@langchain/core/prompts";
const prompt = ChatPromptTemplate.fromMessages([
[
"system",
"You are a helpful assistant that translates {input_language} to {output_language}.",
],
["human", "{input}"],
]);
const chain = prompt.pipe(llm);
await chain.invoke({
input_language: "English",
output_language: "German",
input: "I love programming.",
});
AIMessage {
"id": "chatcmpl-ADItFaWFNqkSjSmlxeGk6HxcBHzVN",
"content": "Ich liebe Programmieren.",
"additional_kwargs": {},
"response_metadata": {
"tokenUsage": {
"completionTokens": 5,
"promptTokens": 26,
"totalTokens": 31
},
"finish_reason": "stop",
"system_fingerprint": "fp_5796ac6771"
},
"tool_calls": [],
"invalid_tool_calls": [],
"usage_metadata": {
"input_tokens": 26,
"output_tokens": 5,
"total_tokens": 31
}
}
Custom URLs
You can customize the base URL the SDK sends requests to by passing a
configuration
parameter like this:
import { ChatOpenAI } from "@langchain/openai";
const llmWithCustomURL = new ChatOpenAI({
temperature: 0.9,
configuration: {
baseURL: "https://your_custom_url.com",
},
});
await llmWithCustomURL.invoke("Hi there!");
The configuration field also accepts other ClientOptions parameters supported by the official SDK.
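For example, here is a minimal sketch passing a few common ClientOptions; the specific values shown are illustrative placeholders, not requirements:

import { ChatOpenAI } from "@langchain/openai";

const llmWithClientOptions = new ChatOpenAI({
  model: "gpt-4o-mini",
  configuration: {
    // Standard ClientOptions from the official OpenAI SDK; values below are placeholders.
    organization: "org-your-org-id",
    timeout: 60000, // request timeout in milliseconds
    maxRetries: 2,
  },
});

await llmWithClientOptions.invoke("Hi there!");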
If you are hosting on Azure OpenAI, see the dedicated page instead.
Custom headers
You can specify custom headers in the same configuration
field:
import { ChatOpenAI } from "@langchain/openai";
const llmWithCustomHeaders = new ChatOpenAI({
temperature: 0.9,
configuration: {
defaultHeaders: {
Authorization: `Bearer SOME_CUSTOM_VALUE`,
},
},
});
await llmWithCustomHeaders.invoke("Hi there!");
Disabling streaming usage metadata
Some proxies or third-party providers present largely the same API interface as OpenAI, but don't support the more recently added stream_options parameter to return streaming usage. You can use ChatOpenAI to access these providers by disabling streaming usage like this:
import { ChatOpenAI } from "@langchain/openai";
const llmWithoutStreamUsage = new ChatOpenAI({
temperature: 0.9,
streamUsage: false,
configuration: {
baseURL: "https://proxy.com",
},
});
await llmWithoutStreamUsage.invoke("Hi there!");
Calling fine-tuned models
You can call fine-tuned OpenAI models by passing the fine-tuned model name via the model parameter. This generally takes the form of ft:{OPENAI_MODEL_NAME}:{ORG_NAME}::{MODEL_ID}. For example:
import { ChatOpenAI } from "@langchain/openai";
const fineTunedLlm = new ChatOpenAI({
temperature: 0.9,
model: "ft:gpt-3.5-turbo-0613:{ORG_NAME}::{MODEL_ID}",
});
await fineTunedLlm.invoke("Hi there!");
Generation metadata
If you need additional information like logprobs or token usage, these
will be returned directly in the .invoke
response within the
response_metadata
field on the message.
Requires @langchain/core
version >=0.1.48.
import { ChatOpenAI } from "@langchain/openai";
// See https://cookbook.openai.com/examples/using_logprobs for details
const llmWithLogprobs = new ChatOpenAI({
logprobs: true,
// topLogprobs: 5,
});
const responseMessageWithLogprobs = await llmWithLogprobs.invoke("Hi there!");
console.dir(responseMessageWithLogprobs.response_metadata.logprobs, {
depth: null,
});
{
content: [
{
token: 'Hello',
logprob: -0.0004740447,
bytes: [ 72, 101, 108, 108, 111 ],
top_logprobs: []
},
{
token: '!',
logprob: -0.00004334534,
bytes: [ 33 ],
top_logprobs: []
},
{
token: ' How',
logprob: -0.000030113732,
bytes: [ 32, 72, 111, 119 ],
top_logprobs: []
},
{
token: ' can',
logprob: -0.0004797665,
bytes: [ 32, 99, 97, 110 ],
top_logprobs: []
},
{
token: ' I',
logprob: -7.89631e-7,
bytes: [ 32, 73 ],
top_logprobs: []
},
{
token: ' assist',
logprob: -0.114006,
bytes: [
32, 97, 115,
115, 105, 115,
116
],
top_logprobs: []
},
{
token: ' you',
logprob: -4.3202e-7,
bytes: [ 32, 121, 111, 117 ],
top_logprobs: []
},
{
token: ' today',
logprob: -0.00004501419,
bytes: [ 32, 116, 111, 100, 97, 121 ],
top_logprobs: []
},
{
token: '?',
logprob: -0.000010206721,
bytes: [ 63 ],
top_logprobs: []
}
],
refusal: null
}
Tool calling
Tool calling with OpenAI models works similarly to other models (a minimal example is shown after the list below). Additionally, the following guides have some information especially relevant to OpenAI:
- How to: disable parallel tool calling
- How to: force a tool call
- How to: bind model-specific tool formats to a model.
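For reference, here is a minimal tool calling sketch with ChatOpenAI. The add tool and its schema are illustrative only:

import { ChatOpenAI } from "@langchain/openai";
import { tool } from "@langchain/core/tools";
import { z } from "zod";

// An illustrative tool that adds two numbers
const addTool = tool(
  ({ a, b }: { a: number; b: number }) => String(a + b),
  {
    name: "add",
    description: "Add two numbers together",
    schema: z.object({ a: z.number(), b: z.number() }),
  }
);

const llmWithAddTool = new ChatOpenAI({ model: "gpt-4o" }).bindTools([addTool]);

// The model should respond with a tool call to `add`
const toolCallMsg = await llmWithAddTool.invoke("What is 2 + 3?");
console.log(toolCallMsg.tool_calls);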
strict: true
As of Aug 6, 2024, OpenAI supports a strict
argument when calling
tools that will enforce that the tool argument schema is respected by
the model. See more here:
https://platform.openai.com/docs/guides/function-calling.
Requires @langchain/openai version >=0.2.6.
Note: If strict: true is set, the tool definition will also be validated, and only a subset of JSON Schema is accepted. Crucially, schemas cannot have optional arguments (those with default values). Read the full docs on what types of schema are supported here: https://platform.openai.com/docs/guides/structured-outputs/supported-schemas.
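For instance, here is a minimal illustration of that restriction; the schemas below are hypothetical:

import { z } from "zod";

// Rejected with strict: true — optional/defaulted fields are not supported
const unsupportedSchema = z.object({
  location: z.string().optional().default("London"),
});

// Accepted with strict: true — every field is required
const supportedSchema = z.object({
  location: z.string(),
});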
Here's an example with tool calling. Passing an extra strict: true argument to .bindTools will pass the parameter through to all tool definitions:
import { ChatOpenAI } from "@langchain/openai";
import { tool } from "@langchain/core/tools";
import { z } from "zod";
const weatherTool = tool((_) => "no-op", {
name: "get_current_weather",
description: "Get the current weather",
schema: z.object({
location: z.string(),
}),
});
const llmWithStrictTrue = new ChatOpenAI({
model: "gpt-4o",
}).bindTools([weatherTool], {
strict: true,
tool_choice: weatherTool.name,
});
// Although the question is not about the weather, it will call the tool with the correct arguments
// because we passed `tool_choice` and `strict: true`.
const strictTrueResult = await llmWithStrictTrue.invoke(
"What is 127862 times 12898 divided by 2?"
);
console.dir(strictTrueResult.tool_calls, { depth: null });
[
{
name: 'get_current_weather',
args: { location: 'current' },
type: 'tool_call',
id: 'call_hVFyYNRwc6CoTgr9AQFQVjm9'
}
]
If you only want to apply this parameter to a select number of tools, you can also pass OpenAI formatted tool schemas directly:
import { zodToJsonSchema } from "zod-to-json-schema";
const toolSchema = {
type: "function",
function: {
name: "get_current_weather",
description: "Get the current weather",
strict: true,
parameters: zodToJsonSchema(
z.object({
location: z.string(),
})
),
},
};
const llmWithStrictTrueTools = new ChatOpenAI({
model: "gpt-4o",
}).bindTools([toolSchema], {
strict: true,
});
const weatherToolResult = await llmWithStrictTrueTools.invoke([
{
role: "user",
content: "What is the current weather in London?",
},
]);
weatherToolResult.tool_calls;
[
{
name: 'get_current_weather',
args: { location: 'London' },
type: 'tool_call',
id: 'call_EOSejtax8aYtqpchY8n8O82l'
}
]
Structured output
We can also pass strict: true to .withStructuredOutput(). Here's an example:
import { ChatOpenAI } from "@langchain/openai";
const traitSchema = z.object({
traits: z
.array(z.string())
.describe("A list of traits contained in the input"),
});
const structuredLlm = new ChatOpenAI({
model: "gpt-4o-mini",
}).withStructuredOutput(traitSchema, {
name: "extract_traits",
strict: true,
});
await structuredLlm.invoke([
{
role: "user",
content: `I am 6'5" tall and love fruit.`,
},
]);
{ traits: [ `6'5" tall`, 'love fruit' ] }
Prompt caching
Newer OpenAI models will automatically cache parts of your prompt if your inputs are above a certain size (1024 tokens at the time of writing) in order to reduce costs for use-cases that require long context.
Note: The number of tokens cached for a given query is not yet standardized in AIMessage.usage_metadata, and is instead contained in the AIMessage.response_metadata field. Here's an example:
import { ChatOpenAI } from "@langchain/openai";
const modelWithCaching = new ChatOpenAI({
model: "gpt-4o-mini-2024-07-18",
});
// CACHED_TEXT is some string longer than 1024 tokens
const LONG_TEXT = `You are a pirate. Always respond in pirate dialect.
Use the following as context when answering questions:
${CACHED_TEXT}`;
const longMessages = [
{
role: "system",
content: LONG_TEXT,
},
{
role: "user",
content: "What types of messages are supported in LangChain?",
},
];
const originalRes = await modelWithCaching.invoke(longMessages);
console.log("USAGE:", originalRes.response_metadata.usage);
USAGE: {
prompt_tokens: 2624,
completion_tokens: 263,
total_tokens: 2887,
prompt_tokens_details: { cached_tokens: 0 },
completion_tokens_details: { reasoning_tokens: 0 }
}
const resWithCaching = await modelWithCaching.invoke(longMessages);
console.log("USAGE:", resWithCaching.response_metadata.usage);
USAGE: {
prompt_tokens: 2624,
completion_tokens: 272,
total_tokens: 2896,
prompt_tokens_details: { cached_tokens: 2432 },
completion_tokens_details: { reasoning_tokens: 0 }
}
Predicted output
Some OpenAI models (such as their gpt-4o and gpt-4o-mini series) support Predicted Outputs, which allow you to pass in a known portion of the LLM's expected output ahead of time to reduce latency. This is useful for cases such as editing text or code, where only a small part of the model's output will change.
Here's an example:
import { ChatOpenAI } from "@langchain/openai";
const modelWithPredictions = new ChatOpenAI({
model: "gpt-4o-mini",
});
const codeSample = `
/// <summary>
/// Represents a user with a first name, last name, and username.
/// </summary>
public class User
{
/// <summary>
/// Gets or sets the user's first name.
/// </summary>
public string FirstName { get; set; }
/// <summary>
/// Gets or sets the user's last name.
/// </summary>
public string LastName { get; set; }
/// <summary>
/// Gets or sets the user's username.
/// </summary>
public string Username { get; set; }
}
`;
// Can also be attached ahead of time
// using `model.bind({ prediction: {...} })`;
await modelWithPredictions.invoke(
[
{
role: "user",
content:
"Replace the Username property with an Email property. Respond only with code, and with no markdown formatting.",
},
{
role: "user",
content: codeSample,
},
],
{
prediction: {
type: "content",
content: codeSample,
},
}
);
AIMessage {
"id": "chatcmpl-AQLyQKnazr7lEV7ejLTo1UqhzHDBl",
"content": "/// <summary>\n/// Represents a user with a first name, last name, and email.\n/// </summary>\npublic class User\n{\n/// <summary>\n/// Gets or sets the user's first name.\n/// </summary>\npublic string FirstName { get; set; }\n\n/// <summary>\n/// Gets or sets the user's last name.\n/// </summary>\npublic string LastName { get; set; }\n\n/// <summary>\n/// Gets or sets the user's email.\n/// </summary>\npublic string Email { get; set; }\n}",
"additional_kwargs": {},
"response_metadata": {
"tokenUsage": {
"promptTokens": 148,
"completionTokens": 217,
"totalTokens": 365
},
"finish_reason": "stop",
"usage": {
"prompt_tokens": 148,
"completion_tokens": 217,
"total_tokens": 365,
"prompt_tokens_details": {
"cached_tokens": 0
},
"completion_tokens_details": {
"reasoning_tokens": 0,
"accepted_prediction_tokens": 36,
"rejected_prediction_tokens": 116
}
},
"system_fingerprint": "fp_0ba0d124f1"
},
"tool_calls": [],
"invalid_tool_calls": [],
"usage_metadata": {
"output_tokens": 217,
"input_tokens": 148,
"total_tokens": 365,
"input_token_details": {
"cache_read": 0
},
"output_token_details": {
"reasoning": 0
}
}
}
Note that currently predictions are billed as additional tokens and will increase your usage and costs in exchange for this reduced latency.
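As mentioned in the comment in the example above, you can also attach the prediction ahead of time with .bind rather than passing it on each call. Here is a minimal sketch reusing the codeSample and model from the previous example:

const modelBoundWithPrediction = modelWithPredictions.bind({
  prediction: {
    type: "content",
    content: codeSample,
  },
});

await modelBoundWithPrediction.invoke([
  {
    role: "user",
    content:
      "Replace the Username property with an Email property. Respond only with code, and with no markdown formatting.",
  },
  {
    role: "user",
    content: codeSample,
  },
]);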
Audio output
Some OpenAI models (such as gpt-4o-audio-preview
) support generating
audio output. This example shows how to use that feature:
import { ChatOpenAI } from "@langchain/openai";
const modelWithAudioOutput = new ChatOpenAI({
model: "gpt-4o-audio-preview",
// You may also pass these fields to `.bind` as a call argument.
modalities: ["text", "audio"], // Specifies that the model should output audio.
audio: {
voice: "alloy",
format: "wav",
},
});
const audioOutputResult = await modelWithAudioOutput.invoke(
"Tell me a joke about cats."
);
const castAudioContent = audioOutputResult.additional_kwargs.audio as Record<
string,
any
>;
console.log({
...castAudioContent,
data: castAudioContent.data.slice(0, 100), // Sliced for brevity
});
{
id: 'audio_67129e9466f48190be70372922464162',
data: 'UklGRgZ4BABXQVZFZm10IBAAAAABAAEAwF0AAIC7AAACABAATElTVBoAAABJTkZPSVNGVA4AAABMYXZmNTguMjkuMTAwAGRhdGHA',
expires_at: 1729277092,
transcript: "Why did the cat sit on the computer's keyboard? Because it wanted to keep an eye on the mouse!"
}
We see that the audio data is returned inside the data field. We are also provided an expires_at field, which represents the date after which the audio response will no longer be accessible on the server for use in multi-turn conversations.
Streaming audio output
OpenAI also supports streaming audio output. Here's an example:
import { AIMessageChunk } from "@langchain/core/messages";
import { concat } from "@langchain/core/utils/stream";
import { ChatOpenAI } from "@langchain/openai";
const modelWithStreamingAudioOutput = new ChatOpenAI({
model: "gpt-4o-audio-preview",
modalities: ["text", "audio"],
audio: {
voice: "alloy",
format: "pcm16", // Format must be `pcm16` for streaming
},
});
const audioOutputStream = await modelWithStreamingAudioOutput.stream(
"Tell me a joke about cats."
);
let finalAudioOutputMsg: AIMessageChunk | undefined;
for await (const chunk of audioOutputStream) {
finalAudioOutputMsg = finalAudioOutputMsg
? concat(finalAudioOutputMsg, chunk)
: chunk;
}
const castStreamedAudioContent = finalAudioOutputMsg?.additional_kwargs
.audio as Record<string, any>;
console.log({
...castStreamedAudioContent,
data: castStreamedAudioContent.data.slice(0, 100), // Sliced for brevity
});
{
id: 'audio_67129e976ce081908103ba4947399a3eaudio_67129e976ce081908103ba4947399a3e',
transcript: 'Why was the cat sitting on the computer? Because it wanted to keep an eye on the mouse!',
index: 0,
data: 'CgAGAAIADAAAAA0AAwAJAAcACQAJAAQABQABAAgABQAPAAAACAADAAUAAwD8/wUA+f8MAPv/CAD7/wUA///8/wUA/f8DAPj/AgD6',
expires_at: 1729277096
}
Audio input
These models also support passing audio as input. For this, you must
specify input_audio
fields as seen below:
import { HumanMessage } from "@langchain/core/messages";
const userInput = new HumanMessage({
content: [
{
type: "input_audio",
input_audio: {
data: castAudioContent.data, // Re-use the base64 data from the first example
format: "wav",
},
},
],
});
// Re-use the same model instance
const userInputAudioRes = await modelWithAudioOutput.invoke([userInput]);
console.log(
(userInputAudioRes.additional_kwargs.audio as Record<string, any>).transcript
);
That's a great joke! It's always fun to imagine why cats do the funny things they do. Keeping an eye on the "mouse" is a creatively punny way to describe it!
API reference
For detailed documentation of all ChatOpenAI features and configurations head to the API reference: https://api.js.langchain.com/classes/langchain_openai.ChatOpenAI.html
Related
- Chat model conceptual guide
- Chat model how-to guides