ChatGoogleGenerativeAI
You can access Google's gemini and gemini-vision models, as well as other generative models, in LangChain through the ChatGoogleGenerativeAI class in the @langchain/google-genai integration package.
You can also access Google's gemini family of models via the LangChain VertexAI and VertexAI-web integrations. See the VertexAI integration docs for details.
Get an API key here: https://ai.google.dev/tutorials/setup
You'll first need to install the @langchain/google-genai package:
- npm: npm install @langchain/google-genai
- Yarn: yarn add @langchain/google-genai
- pnpm: pnpm add @langchain/google-genai
Usage
We're unifying model params across all packages. We now suggest using model instead of modelName, and apiKey for API keys.
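For example, here is a minimal sketch of passing the key explicitly via the apiKey field instead of relying on the GOOGLE_API_KEY environment variable (the key value below is an illustrative placeholder):

import { ChatGoogleGenerativeAI } from "@langchain/google-genai";

// Minimal sketch: configure the key via the `apiKey` field rather than
// the GOOGLE_API_KEY environment variable. The value is a placeholder.
const llm = new ChatGoogleGenerativeAI({
  model: "gemini-pro",
  apiKey: "YOUR_GOOGLE_API_KEY",
});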
import { ChatGoogleGenerativeAI } from "@langchain/google-genai";
import { HarmBlockThreshold, HarmCategory } from "@google/generative-ai";
/*
* Before running this, you should make sure you have created a
* Google Cloud Project that has `generativelanguage` API enabled.
*
* You will also need to generate an API key and set
* an environment variable GOOGLE_API_KEY
*
*/
// Text
const model = new ChatGoogleGenerativeAI({
  model: "gemini-pro",
  maxOutputTokens: 2048,
  safetySettings: [
    {
      category: HarmCategory.HARM_CATEGORY_HARASSMENT,
      threshold: HarmBlockThreshold.BLOCK_LOW_AND_ABOVE,
    },
  ],
});
// Batch and stream are also supported
const res = await model.invoke([
  [
    "human",
    "What would be a good company name for a company that makes colorful socks?",
  ],
]);
console.log(res);
/*
AIMessage {
content: '1. Rainbow Soles\n' +
'2. Toe-tally Colorful\n' +
'3. Bright Sock Creations\n' +
'4. Hue Knew Socks\n' +
'5. The Happy Sock Factory\n' +
'6. Color Pop Hosiery\n' +
'7. Sock It to Me!\n' +
'8. Mismatched Masterpieces\n' +
'9. Threads of Joy\n' +
'10. Funky Feet Emporium\n' +
'11. Colorful Threads\n' +
'12. Sole Mates\n' +
'13. Colorful Soles\n' +
'14. Sock Appeal\n' +
'15. Happy Feet Unlimited\n' +
'16. The Sock Stop\n' +
'17. The Sock Drawer\n' +
'18. Sole-diers\n' +
'19. Footloose Footwear\n' +
'20. Step into Color',
name: 'model',
additional_kwargs: {}
}
*/
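As noted in the comment above, batch and stream are also supported. Here is a minimal sketch of streaming the same prompt and batching several prompts (the variable names and exact chunk contents are illustrative):

// Minimal sketch: stream the response chunk-by-chunk instead of waiting
// for the full message.
const stream = await model.stream([
  [
    "human",
    "What would be a good company name for a company that makes colorful socks?",
  ],
]);

for await (const chunk of stream) {
  console.log(chunk.content);
}

// Minimal sketch: send several prompts in a single batch call.
const batchRes = await model.batch([
  [["human", "Name one color."]],
  [["human", "Name one animal."]],
]);

console.log(batchRes.map((m) => m.content));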
API Reference:
- ChatGoogleGenerativeAI from @langchain/google-genai
Multimodal support
To provide an image, pass a human message with a content field set to an array of content objects. Each content object must contain either an image value (type of image_url) or a text value (type of text). The value of image_url must be a base64-encoded image (e.g., data:image/png;base64,abcd124):
import fs from "fs";
import { ChatGoogleGenerativeAI } from "@langchain/google-genai";
import { HumanMessage } from "@langchain/core/messages";
// Multi-modal
const vision = new ChatGoogleGenerativeAI({
  model: "gemini-pro-vision",
  maxOutputTokens: 2048,
});
const image = fs.readFileSync("./hotdog.jpg").toString("base64");
const input2 = [
  new HumanMessage({
    content: [
      {
        type: "text",
        text: "Describe the following image.",
      },
      {
        type: "image_url",
        image_url: `data:image/png;base64,${image}`,
      },
    ],
  }),
];
const res2 = await vision.invoke(input2);
console.log(res2);
/*
AIMessage {
content: ' The image shows a hot dog in a bun. The hot dog is grilled and has a dark brown color. The bun is toasted and has a light brown color. The hot dog is in the center of the bun.',
name: 'model',
additional_kwargs: {}
}
*/
// Multi-modal streaming
const res3 = await vision.stream(input2);
for await (const chunk of res3) {
  console.log(chunk);
}
/*
AIMessageChunk {
content: ' The image shows a hot dog in a bun. The hot dog is grilled and has grill marks on it. The bun is toasted and has a light golden',
name: 'model',
additional_kwargs: {}
}
AIMessageChunk {
content: ' brown color. The hot dog is in the center of the bun.',
name: 'model',
additional_kwargs: {}
}
*/
API Reference:
- ChatGoogleGenerativeAI from @langchain/google-genai
- HumanMessage from @langchain/core/messages
Gemini Prompting FAQs
As of the time this doc was written (2023/12/12), Gemini has some restrictions on the types and structure of prompts it accepts. Specifically:
- When providing multimodal (image) inputs, you are restricted to at most one message of "human" (user) type. You cannot pass multiple messages (though the single human message may have multiple content entries).
- System messages are not natively supported and will be merged with the first human message if present (see the sketch after this list).
- For regular chat conversations, messages must follow the human/ai/human/ai alternating pattern. You may not provide two AI or human messages in sequence.
- Messages may be blocked if they violate the safety checks of the LLM. In this case, the model will return an empty response.
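As a minimal sketch of the system-message and alternating-pattern points (assuming GOOGLE_API_KEY is set; the variable names are illustrative), a system message can still be passed and will be merged into the first human message, while the remaining messages alternate between human and AI:

import { ChatGoogleGenerativeAI } from "@langchain/google-genai";

// Minimal sketch: the system message is merged into the first human message,
// and the conversation strictly alternates human/ai/human.
const chat = new ChatGoogleGenerativeAI({ model: "gemini-pro" });

const faqRes = await chat.invoke([
  ["system", "You are a helpful assistant that answers concisely."],
  ["human", "What is the capital of France?"],
  ["ai", "Paris."],
  ["human", "And what about Italy?"],
]);

console.log(faqRes.content);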