ChatVertexAI

Google Vertex is a service that exposes all foundation models available in Google Cloud, like gemini-1.5-pro, gemini-2.0-flash-exp, etc. It also provides some non-Google models such as Anthropic's Claude.

This will help you get started with ChatVertexAI chat models. For detailed documentation of all ChatVertexAI features and configurations, head to the API reference.

Overview

Integration details

| Class | Package | Local | Serializable | PY support |
| :--- | :--- | :---: | :---: | :---: |
| ChatVertexAI | @langchain/google-vertexai | ❌ | ✅ | ✅ |

Model features

See the links in the table headers below for guides on how to use specific features.

| Tool calling | Structured output | JSON mode | Image input | Audio input | Video input | Token-level streaming | Token usage | Logprobs |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| ✅ | ✅ | ❌ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |

Note that while logprobs are supported, Gemini has fairly restricted usage of them.
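A minimal sketch of requesting them, assuming your package version exposes the underlying Gemini generation-config fields responseLogprobs and logprobs (check the API reference before relying on these names):

import { ChatVertexAI } from "@langchain/google-vertexai";

// Assumption: these fields mirror the Gemini generation config;
// not all Gemini models accept them.
const llmWithLogprobs = new ChatVertexAI({
  model: "gemini-1.5-flash",
  responseLogprobs: true, // ask the model to return logprobs
  logprobs: 3, // number of top alternative tokens to report per position
});

const res = await llmWithLogprobs.invoke("Hi!");
// When returned, logprobs surface on the response metadata.
console.log(res.response_metadata);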

Setup

LangChain.js supports two different authentication methods based on whether you're running in a Node.js environment or a web environment. It also supports the authentication method used by Vertex AI Express Mode with either package.

To access ChatVertexAI models you'll need to set up Google Vertex AI in your Google Cloud Platform (GCP) account, save the credentials file, and install the @langchain/google-vertexai integration package.

Credentials

Head to your GCP account and generate a credentials file. Once you've done this, set the GOOGLE_APPLICATION_CREDENTIALS environment variable:

export GOOGLE_APPLICATION_CREDENTIALS="path/to/your/credentials.json"

If running in a web environment, you should set the GOOGLE_VERTEX_AI_WEB_CREDENTIALS environment variable as a JSON-stringified object, and install the @langchain/google-vertexai-web package:

GOOGLE_VERTEX_AI_WEB_CREDENTIALS={"type":"service_account","project_id":"YOUR_PROJECT-12345",...}
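If you have a downloaded service-account credentials file, one way to produce that stringified value, assuming the file lives at ./credentials.json, is:

node -e "console.log(JSON.stringify(require('./credentials.json')))"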

If you are using Vertex AI Express Mode, you can install either the @langchain/google-vertexai or @langchain/google-vertexai-web package. You can then go to the Express Mode API Key page and set your API Key in the GOOGLE_API_KEY environment variable:

export GOOGLE_API_KEY="api_key_value"

If you want automated tracing of your model calls, you can also set your LangSmith API key by uncommenting the lines below:

# export LANGSMITH_TRACING="true"
# export LANGSMITH_API_KEY="your-api-key"

Installation

The LangChain ChatVertexAI integration lives in the @langchain/google-vertexai package:

yarn add @langchain/google-vertexai @langchain/core

Or if using in a web environment like a Vercel Edge function:

yarn add @langchain/google-vertexai-web @langchain/core

Instantiation

Now we can instantiate our model object and generate chat completions:

import { ChatVertexAI } from "@langchain/google-vertexai";
// Uncomment the following line if you're running in a web environment:
// import { ChatVertexAI } from "@langchain/google-vertexai-web"

const llm = new ChatVertexAI({
  model: "gemini-2.0-flash-exp",
  temperature: 0,
  maxRetries: 2,
  // For web, set credentials via authOptions.credentials:
  // authOptions: { ... },
  // other params...
});

Invocation

const aiMsg = await llm.invoke([
  [
    "system",
    "You are a helpful assistant that translates English to French. Translate the user sentence.",
  ],
  ["human", "I love programming."],
]);
aiMsg;
AIMessageChunk {
  "content": "J'adore programmer. \n",
  "additional_kwargs": {},
  "response_metadata": {},
  "tool_calls": [],
  "tool_call_chunks": [],
  "invalid_tool_calls": [],
  "usage_metadata": {
    "input_tokens": 20,
    "output_tokens": 7,
    "total_tokens": 27
  }
}
console.log(aiMsg.content);
J'adore programmer.
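Since token-level streaming is supported (see the feature table above), you can also stream the response instead of waiting for the full message; a minimal sketch reusing the llm instance defined above:

const stream = await llm.stream("Translate to French: I love programming.");
for await (const chunk of stream) {
  // Each chunk is a partial AIMessageChunk; log tokens as they arrive.
  console.log(chunk.content);
}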

Tool Calling with Google Search Retrieval

It is possible to call the model with a Google Search tool, which you can use to ground content generation in real-world information and reduce hallucinations.

Grounding is currently not supported by gemini-2.0-flash-exp.

You can choose to either ground using Google Search or by using a custom data store. Here are examples of both:

Google Search Retrieval

Grounding example that uses Google Search:

import { ChatVertexAI } from "@langchain/google-vertexai";

const searchRetrievalTool = {
  googleSearchRetrieval: {
    dynamicRetrievalConfig: {
      mode: "MODE_DYNAMIC", // Use Dynamic Retrieval
      dynamicThreshold: 0.7, // Default for Dynamic Retrieval threshold
    },
  },
};

const searchRetrievalModel = new ChatVertexAI({
  model: "gemini-1.5-pro",
  temperature: 0,
  maxRetries: 0,
}).bindTools([searchRetrievalTool]);

const searchRetrievalResult = await searchRetrievalModel.invoke(
  "Who won the 2024 NBA Finals?"
);

console.log(searchRetrievalResult.content);
The Boston Celtics won the 2024 NBA Finals, defeating the Dallas Mavericks 4-1 in the series to claim their 18th NBA championship. This victory marked their first title since 2008 and established them as the team with the most NBA championships, surpassing the Los Angeles Lakers' 17 titles.
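If you want to inspect what the model grounded its answer on, the grounding information, when returned, is attached to the message's metadata; a minimal sketch (the exact shape varies by model and tool version):

// Grounding attributions, when present, surface in the response metadata.
console.log(searchRetrievalResult.response_metadata);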

Google Search Retrieval with Data Store

First, set up your data store (below is the schema of an example data store):

| ID | Date | Team 1 | Score | Team 2 |
| :--- | :--- | :--- | :--- | :--- |
| 3001 | 2023-09-07 | Argentina | 1 - 0 | Ecuador |
| 3002 | 2023-09-12 | Venezuela | 1 - 0 | Paraguay |
| 3003 | 2023-09-12 | Chile | 0 - 0 | Colombia |
| 3004 | 2023-09-12 | Peru | 0 - 1 | Brazil |
| 3005 | 2024-10-15 | Argentina | 6 - 0 | Bolivia |

Then, use this data store in the example below (note that you have to substitute your own values for projectId and datastoreId):

import { ChatVertexAI } from "@langchain/google-vertexai";

const projectId = "YOUR_PROJECT_ID";
const datastoreId = "YOUR_DATASTORE_ID";

const searchRetrievalToolWithDataset = {
  retrieval: {
    vertexAiSearch: {
      datastore: `projects/${projectId}/locations/global/collections/default_collection/dataStores/${datastoreId}`,
    },
    disableAttribution: false,
  },
};

const searchRetrievalModelWithDataset = new ChatVertexAI({
  model: "gemini-1.5-pro",
  temperature: 0,
  maxRetries: 0,
}).bindTools([searchRetrievalToolWithDataset]);

const searchRetrievalModelResult = await searchRetrievalModelWithDataset.invoke(
  "What is the score of Argentina vs Bolivia football game?"
);

console.log(searchRetrievalModelResult.content);
Argentina won against Bolivia with a score of 6-0 on October 15, 2024.

You should now get results that are grounded in the data from your provided data store.

Context Caching

Vertex AI offers context caching functionality, which helps optimize costs by storing and reusing long blocks of message content across multiple API requests. This is particularly useful when you have lengthy conversation histories or message segments that appear frequently in your interactions.

To use this feature, first create a context cache by following this official guide.

Once you've created a cache, you can pass its ID in as a runtime param as follows:

import { ChatVertexAI } from "@langchain/google-vertexai";

const modelWithCachedContent = new ChatVertexAI({
  model: "gemini-1.5-pro-002",
  location: "us-east5",
});

await modelWithCachedContent.invoke("What is in the content?", {
  cachedContent:
    "projects/PROJECT_NUMBER/locations/LOCATION/cachedContents/CACHE_ID",
});

You can also bind this field directly onto the model instance:

const modelWithBoundCachedContent = new ChatVertexAI({
  model: "gemini-1.5-pro-002",
  location: "us-east5",
}).bind({
  cachedContent:
    "projects/PROJECT_NUMBER/locations/LOCATION/cachedContents/CACHE_ID",
});
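Subsequent calls on the bound model then reuse the cached content without passing it on each request:

await modelWithBoundCachedContent.invoke("What is in the content?");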

Note that not all models currently support context caching.

Chaining

We can chain our model with a prompt template like so:

import { ChatPromptTemplate } from "@langchain/core/prompts";

const prompt = ChatPromptTemplate.fromMessages([
  [
    "system",
    "You are a helpful assistant that translates {input_language} to {output_language}.",
  ],
  ["human", "{input}"],
]);

const chain = prompt.pipe(llm);
await chain.invoke({
  input_language: "English",
  output_language: "German",
  input: "I love programming.",
});
AIMessageChunk {
  "content": "Ich liebe das Programmieren. \n",
  "additional_kwargs": {},
  "response_metadata": {},
  "tool_calls": [],
  "tool_call_chunks": [],
  "invalid_tool_calls": [],
  "usage_metadata": {
    "input_tokens": 15,
    "output_tokens": 9,
    "total_tokens": 24
  }
}
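As with any runnable, you can extend this chain further, for example by piping the model output into an output parser from @langchain/core to get back a plain string instead of an AIMessage; a minimal sketch reusing the prompt and llm defined above:

import { StringOutputParser } from "@langchain/core/output_parsers";

const stringChain = prompt.pipe(llm).pipe(new StringOutputParser());

const translation = await stringChain.invoke({
  input_language: "English",
  output_language: "German",
  input: "I love programming.",
});
// `translation` is now a plain string rather than an AIMessage.
console.log(translation);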

API reference

For detailed documentation of all ChatVertexAI features and configurations, head to the API reference: https://api.js.langchain.com/classes/langchain_google_vertexai.ChatVertexAI.html

