ChatGoogleGenerativeAI
You can access Google's gemini and gemini-vision models, as well as other generative models, in LangChain through the ChatGoogleGenerativeAI class in the @langchain/google-genai integration package.
You can also access Google's gemini family of models via the LangChain VertexAI and VertexAI-web integrations. See the VertexAI integration docs for details.
Get an API key here: https://ai.google.dev/tutorials/setup
You'll first need to install the @langchain/google-genai package:
- npm: npm install @langchain/google-genai
- Yarn: yarn add @langchain/google-genai
- pnpm: pnpm add @langchain/google-genai
Usage
We're unifying model params across all packages. We now suggest using model instead of modelName, and apiKey for API keys.
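For example, here is a minimal sketch of passing the key explicitly via the apiKey field instead of relying on the GOOGLE_API_KEY environment variable (the key value below is an illustrative placeholder):

import { ChatGoogleGenerativeAI } from "@langchain/google-genai";

// Minimal sketch: configure the key via the `apiKey` field rather than
// the GOOGLE_API_KEY environment variable. The value is a placeholder.
const llm = new ChatGoogleGenerativeAI({
  model: "gemini-pro",
  apiKey: "YOUR_GOOGLE_API_KEY",
});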
import { ChatGoogleGenerativeAI } from "@langchain/google-genai";
import { HarmBlockThreshold, HarmCategory } from "@google/generative-ai";
/*
* Before running this, you should make sure you have created a
* Google Cloud Project that has `generativelanguage` API enabled.
*
* You will also need to generate an API key and set
* an environment variable GOOGLE_API_KEY
*
*/
// Text
const model = new ChatGoogleGenerativeAI({
  model: "gemini-pro",
  maxOutputTokens: 2048,
  safetySettings: [
    {
      category: HarmCategory.HARM_CATEGORY_HARASSMENT,
      threshold: HarmBlockThreshold.BLOCK_LOW_AND_ABOVE,
    },
  ],
});
// Batch and stream are also supported
const res = await model.invoke([
  [
    "human",
    "What would be a good company name for a company that makes colorful socks?",
  ],
]);
console.log(res);
/*
AIMessage {
content: '1. Rainbow Soles\n' +
'2. Toe-tally Colorful\n' +
'3. Bright Sock Creations\n' +
'4. Hue Knew Socks\n' +
'5. The Happy Sock Factory\n' +
'6. Color Pop Hosiery\n' +
'7. Sock It to Me!\n' +
'8. Mismatched Masterpieces\n' +
'9. Threads of Joy\n' +
'10. Funky Feet Emporium\n' +
'11. Colorful Threads\n' +
'12. Sole Mates\n' +
'13. Colorful Soles\n' +
'14. Sock Appeal\n' +
'15. Happy Feet Unlimited\n' +
'16. The Sock Stop\n' +
'17. The Sock Drawer\n' +
'18. Sole-diers\n' +
'19. Footloose Footwear\n' +
'20. Step into Color',
name: 'model',
additional_kwargs: {}
}
*/
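As noted in the comment above, batch and stream are also supported. Here is a minimal sketch of streaming the same prompt and batching several prompts (the variable names and exact chunk contents are illustrative):

// Minimal sketch: stream the response chunk-by-chunk instead of waiting
// for the full message.
const stream = await model.stream([
  [
    "human",
    "What would be a good company name for a company that makes colorful socks?",
  ],
]);

for await (const chunk of stream) {
  console.log(chunk.content);
}

// Minimal sketch: send several prompts in a single batch call.
const batchRes = await model.batch([
  [["human", "Name one color."]],
  [["human", "Name one animal."]],
]);

console.log(batchRes.map((m) => m.content));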
API Reference:
- ChatGoogleGenerativeAI from @langchain/google-genai
Multimodal support
To provide an image, pass a human message with a content field set to an array of content objects. Each content object must contain either an image value (type of image_url) or a text value (type of text). The value of image_url must be a base64-encoded image (e.g., data:image/png;base64,abcd124):
import fs from "fs";
import { ChatGoogleGenerativeAI } from "@langchain/google-genai";
import { HumanMessage } from "@langchain/core/messages";
// Multi-modal
const vision = new ChatGoogleGenerativeAI({
  model: "gemini-pro-vision",
  maxOutputTokens: 2048,
});
const image = fs.readFileSync("./hotdog.jpg").toString("base64");
const input2 = [
  new HumanMessage({
    content: [
      {
        type: "text",
        text: "Describe the following image.",
      },
      {
        type: "image_url",
        image_url: `data:image/png;base64,${image}`,
      },
    ],
  }),
];
const res2 = await vision.invoke(input2);
console.log(res2);
/*
AIMessage {
content: ' The image shows a hot dog in a bun. The hot dog is grilled and has a dark brown color. The bun is toasted and has a light brown color. The hot dog is in the center of the bun.',
name: 'model',
additional_kwargs: {}
}
*/
// Multi-modal streaming
const res3 = await vision.stream(input2);
for await (const chunk of res3) {
  console.log(chunk);
}
/*
AIMessageChunk {
content: ' The image shows a hot dog in a bun. The hot dog is grilled and has grill marks on it. The bun is toasted and has a light golden',
name: 'model',
additional_kwargs: {}
}
AIMessageChunk {
content: ' brown color. The hot dog is in the center of the bun.',
name: 'model',
additional_kwargs: {}
}
*/
API Reference:
- ChatGoogleGenerativeAI from @langchain/google-genai
- HumanMessage from @langchain/core/messages
Gemini Prompting FAQs
As of the time this doc was written (2023/12/12), Gemini has some restrictions on the types and structure of prompts it accepts. Specifically:
- When providing multimodal (image) inputs, you are restricted to at most one message of "human" (user) type. You cannot pass multiple messages (though the single human message may have multiple content entries).
- System messages are not natively supported and will be merged with the first human message if present (see the sketch after this list).
- For regular chat conversations, messages must follow the human/ai/human/ai alternating pattern. You may not provide two AI or human messages in sequence.
- Messages may be blocked if they violate the safety checks of the LLM. In this case, the model will return an empty response.
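As a minimal sketch of the system-message and alternating-pattern points (assuming GOOGLE_API_KEY is set; the variable names are illustrative), a system message can still be passed and will be merged into the first human message, while the remaining messages alternate between human and AI:

import { ChatGoogleGenerativeAI } from "@langchain/google-genai";

// Minimal sketch: the system message is merged into the first human message,
// and the conversation strictly alternates human/ai/human.
const chat = new ChatGoogleGenerativeAI({ model: "gemini-pro" });

const faqRes = await chat.invoke([
  ["system", "You are a helpful assistant that answers concisely."],
  ["human", "What is the capital of France?"],
  ["ai", "Paris."],
  ["human", "And what about Italy?"],
]);

console.log(faqRes.content);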