ChatCerebras

Cerebras is a model provider that serves open-source models with an emphasis on speed. The Cerebras CS-3 system, powered by the Wafer-Scale Engine-3 (WSE-3), represents a new class of AI supercomputer that sets the standard for generative AI training and inference with unparalleled performance and scalability.

With Cerebras as your inference provider, you can:

Achieve unprecedented speed for AI inference workloads
Build commercially with high throughput
Effortlessly scale your AI workloads with our seamless clustering technology

Our CS-3 systems can be quickly and easily clustered to create the largest AI supercomputers in the world, making it simple to place and run the largest models. Leading corporations, research institutions, and governments are already using Cerebras solutions to develop proprietary models and train popular open-source models.

This guide will help you get started with ChatCerebras chat models. For detailed documentation of all ChatCerebras features and configurations, head to the API reference.

Overview

Integration details

Class	Package	Local	Serializable	PY support
ChatCerebras	`@langchain/cerebras`	❌	❌	✅

Model features

See the links in the table headers below for guides on how to use specific features.

Tool calling	Structured output	JSON mode	Image input	Audio input	Video input	Token-level streaming	Token usage	Logprobs
✅	✅	✅	❌	❌	❌	✅	✅	❌

Setup

To access ChatCerebras models you’ll need to create a Cerebras account, get an API key, and install the @langchain/cerebras integration package.

Credentials

Get an API Key from cloud.cerebras.ai and add it to your environment variables:

export CEREBRAS_API_KEY="your-api-key"
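
A missing key usually only surfaces as an authentication error on the first request, so it can help to check for it at startup. The following is a minimal sketch, not part of `@langchain/cerebras`: `requireCerebrasApiKey` and the `Env` type are hypothetical names, and the helper takes an environment-like object so it can be exercised without touching the real environment.

```typescript
// Hypothetical helper (not provided by @langchain/cerebras): reads the key
// from an environment-like object and fails early with a readable message.
type Env = Record<string, string | undefined>;

function requireCerebrasApiKey(env: Env): string {
  // Treat an empty or whitespace-only value the same as a missing one.
  const key = env["CEREBRAS_API_KEY"]?.trim();
  if (!key) {
    throw new Error(
      "CEREBRAS_API_KEY is not set. Get a key from cloud.cerebras.ai and export it first."
    );
  }
  return key;
}
```

In a Node.js application this would be called as `requireCerebrasApiKey(process.env)` before creating the model.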

To enable automated tracing of your model calls, you can also set your LangSmith API key by uncommenting the lines below:

# export LANGSMITH_TRACING="true"
# export LANGSMITH_API_KEY="your-api-key"

Installation

The LangChain ChatCerebras integration lives in the @langchain/cerebras package:

Tip: See this section for general instructions on installing integration packages.

npm i @langchain/cerebras @langchain/core

yarn add @langchain/cerebras @langchain/core

pnpm add @langchain/cerebras @langchain/core

Instantiation

Now we can instantiate our model object and generate chat completions:

import { ChatCerebras } from "@langchain/cerebras";

const llm = new ChatCerebras({
  model: "llama-3.3-70b",
  temperature: 0,
  maxTokens: undefined,
  maxRetries: 2,
  // other params...
});

Invocation

const aiMsg = await llm.invoke([
  {
    role: "system",
    content:
      "You are a helpful assistant that translates English to French. Translate the user sentence.",
  },
  { role: "user", content: "I love programming." },
]);
aiMsg;

AIMessage {
  "id": "run-17c7d62d-67ac-4677-b33a-18298fc85e35",
  "content": "J'adore la programmation.",
  "additional_kwargs": {},
  "response_metadata": {
    "id": "chatcmpl-2d1e2de5-4239-46fb-af2a-6200d89d7dde",
    "created": 1735785598,
    "model": "llama-3.3-70b",
    "system_fingerprint": "fp_2e2a2a083c",
    "object": "chat.completion",
    "time_info": {
      "queue_time": 0.00009063,
      "prompt_time": 0.002163031,
      "completion_time": 0.012339628,
      "total_time": 0.01640915870666504,
      "created": 1735785598
    }
  },
  "tool_calls": [],
  "invalid_tool_calls": [],
  "usage_metadata": {
    "input_tokens": 55,
    "output_tokens": 9,
    "total_tokens": 64
  }
}

console.log(aiMsg.content);

J'adore la programmation.
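
The `usage_metadata` and `time_info` fields in the sample output above are enough to estimate generation throughput. The sketch below assumes the `time_info` values are expressed in seconds, which the source does not state explicitly; `outputTokensPerSecond` and the two interfaces are hypothetical names that only mirror the fields shown in the sample.

```typescript
// Shapes mirror the sample AIMessage output above; only the fields used here are typed.
interface TimeInfo {
  queue_time: number;
  prompt_time: number;
  completion_time: number;
  total_time: number;
}

interface UsageMetadata {
  input_tokens: number;
  output_tokens: number;
  total_tokens: number;
}

// Hypothetical helper: output tokens generated per second of completion time.
// Assumption: time_info values are in seconds.
function outputTokensPerSecond(usage: UsageMetadata, time: TimeInfo): number {
  if (time.completion_time <= 0) return 0; // avoid dividing by zero
  return usage.output_tokens / time.completion_time;
}

// Values copied from the sample response above.
const rate = outputTokensPerSecond(
  { input_tokens: 55, output_tokens: 9, total_tokens: 64 },
  {
    queue_time: 0.00009063,
    prompt_time: 0.002163031,
    completion_time: 0.012339628,
    total_time: 0.01640915870666504,
  }
);
console.log(Math.round(rate)); // 729
```

With a live response, the same inputs would come from `aiMsg.usage_metadata` and `aiMsg.response_metadata.time_info`. A nine-token reply is too short to be a representative measurement, so treat the figure as illustrative only.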

JSON invocation

const messages = [
  {
    role: "system",
    content:
      "You are a math tutor that handles math exercises and makes output in json in format { result: number }.",
  },
  { role: "user", content: "2 + 2" },
];

const aiInvokeMsg = await llm.invoke(messages, {
  response_format: { type: "json_object" },
});

// To avoid passing response_format on every invoke, bind it to the instance instead
const llmWithResponseFormat = llm.bind({
  response_format: { type: "json_object" },
});
const aiBindMsg = await llmWithResponseFormat.invoke(messages);

// Both approaches produce the same content
console.log({
  aiInvokeMsgContent: aiInvokeMsg.content,
  aiBindMsg: aiBindMsg.content,
});

{ aiInvokeMsgContent: '{"result":4}', aiBindMsg: '{"result":4}' }
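
In JSON mode the content shown above is still a string, so it has to be parsed, and ideally validated, before the number can be used. The following is a minimal sketch under that assumption; `parseResult` is a hypothetical helper, not something the integration provides, and it checks only the `{ result: number }` shape requested in the system prompt.

```typescript
// Hypothetical helper: parses JSON-mode content and checks the { result: number } shape.
function parseResult(content: unknown): number {
  // Message content can in general be non-string, so guard before parsing.
  if (typeof content !== "string") {
    throw new Error("Expected string content from the model");
  }
  const parsed: unknown = JSON.parse(content);
  if (
    typeof parsed !== "object" ||
    parsed === null ||
    typeof (parsed as { result?: unknown }).result !== "number"
  ) {
    throw new Error(`Unexpected JSON shape: ${content}`);
  }
  return (parsed as { result: number }).result;
}

console.log(parseResult('{"result":4}')); // 4
```

With the example above, this would be called as `parseResult(aiInvokeMsg.content)`. Validating the shape matters because JSON mode constrains the output to be valid JSON, while the specific keys are only requested through the prompt.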

Chaining

We can chain our model with a prompt template like so:

import { ChatPromptTemplate } from "@langchain/core/prompts";

const prompt = ChatPromptTemplate.fromMessages([
  [
    "system",
    "You are a helpful assistant that translates {input_language} to {output_language}.",
  ],
  ["human", "{input}"],
]);

const chain = prompt.pipe(llm);
await chain.invoke({
  input_language: "English",
  output_language: "German",
  input: "I love programming.",
});

AIMessage {
  "id": "run-5c8a9f25-0f57-499b-9c2b-87bd07135feb",
  "content": "Ich liebe das Programmieren.",
  "additional_kwargs": {},
  "response_metadata": {
    "id": "chatcmpl-abd1e9eb-b873-492e-9e30-0d13dfc3a145",
    "created": 1735785607,
    "model": "llama-3.3-70b",
    "system_fingerprint": "fp_2e2a2a083c",
    "object": "chat.completion",
    "time_info": {
      "queue_time": 0.00009499,
      "prompt_time": 0.002095266,
      "completion_time": 0.008807576,
      "total_time": 0.012718439102172852,
      "created": 1735785607
    }
  },
  "tool_calls": [],
  "invalid_tool_calls": [],
  "usage_metadata": {
    "input_tokens": 50,
    "output_tokens": 7,
    "total_tokens": 57
  }
}

API reference

For detailed documentation of all ChatCerebras features and configurations, head to the API reference: https://api.js.langchain.com/classes/_langchain_cerebras.ChatCerebras.html

Related

Chat model conceptual guide
Chat model how-to guides
