Skip to main content

Integrations: Embeddings

LangChain offers a number of Embeddings implementations that integrate with various model providers. These are:

OpenAIEmbeddings

The OpenAIEmbeddings class uses the OpenAI API to generate embeddings for a given text. By default it strips new line characters from the text, as recommended by OpenAI, but you can disable this by passing stripNewLines: false to the constructor.

import { OpenAIEmbeddings } from "langchain/embeddings/openai";

const embeddings = new OpenAIEmbeddings({
openAIApiKey: "YOUR-API-KEY", // In Node.js defaults to process.env.OPENAI_API_KEY
});

Azure OpenAIEmbeddings

The OpenAIEmbeddings class uses the OpenAI API on Azure to generate embeddings for a given text. By default it strips new line characters from the text, as recommended by OpenAI, but you can disable this by passing stripNewLines: false to the constructor.

import { OpenAIEmbeddings } from "langchain/embeddings/openai";

const embeddings = new OpenAIEmbeddings({
azureOpenAIApiKey: "YOUR-API-KEY", // In Node.js defaults to process.env.AZURE_OPENAI_API_KEY
azureOpenAIApiInstanceName: "YOUR-INSTANCE-NAME", // In Node.js defaults to process.env.AZURE_OPENAI_API_INSTANCE_NAME
azureOpenAIApiDeploymentName: "YOUR-DEPLOYMENT-NAME", // In Node.js defaults to process.env.AZURE_OPENAI_API_DEPLOYMENT_NAME
azureOpenAIApiVersion: "YOUR-API-VERSION", // In Node.js defaults to process.env.AZURE_OPENAI_API_VERSION
});

Google Vertex AI

The GoogleVertexAIEmbeddings class uses Google's Vertex AI PaLM models to generate embeddings for a given text.

The Vertex AI implementation is meant to be used in Node.js and not directly in a browser, since it requires a service account to use.

Before running this code, you should make sure the Vertex AI API is enabled for the relevant project in your Google Cloud dashboard and that you've authenticated to Google Cloud using one of these methods:

  • You are logged into an account (using gcloud auth application-default login) permitted to that project.
  • You are running on a machine using a service account that is permitted to the project.
  • You have downloaded the credentials for a service account that is permitted to the project and set the GOOGLE_APPLICATION_CREDENTIALS environment variable to the path of this file.
npm install google-auth-library
import { GoogleVertexAIEmbeddings } from "langchain/embeddings/googlevertexai";

export const run = async () => {
const model = new GoogleVertexAIEmbeddings();
const res = await model.embedQuery(
"What would be a good company name for a company that makes colorful socks?"
);
console.log({ res });
};

API Reference:

Note: The default Google Vertex AI embeddings model, textembedding-gecko, has a different number of dimensions than OpenAI's text-embedding-ada-002 model and may not be supported by all vector store providers.

CohereEmbeddings

The CohereEmbeddings class uses the Cohere API to generate embeddings for a given text.

npm install cohere-ai
import { CohereEmbeddings } from "langchain/embeddings/cohere";

const embeddings = new CohereEmbeddings({
apiKey: "YOUR-API-KEY", // In Node.js defaults to process.env.COHERE_API_KEY
});

TensorFlowEmbeddings

This Embeddings integration runs the embeddings entirely in your browser or Node.js environment, using TensorFlow.js. This means that your data isn't sent to any third party, and you don't need to sign up for any API keys. However, it does require more memory and processing power than the other integrations.

npm install @tensorflow/tfjs-core @tensorflow/tfjs-converter @tensorflow-models/universal-sentence-encoder @tensorflow/tfjs-backend-cpu
import "@tensorflow/tfjs-backend-cpu";
import { TensorFlowEmbeddings } from "langchain/embeddings/tensorflow";

const embeddings = new TensorFlowEmbeddings();

This example uses the CPU backend, which works in any JS environment. However, you can use any of the backends supported by TensorFlow.js, including GPU and WebAssembly, which will be a lot faster. For Node.js you can use the @tensorflow/tfjs-node package, and for the browser you can use the @tensorflow/tfjs-backend-webgl package. See the TensorFlow.js documentation for more information.

HuggingFaceInferenceEmbeddings

This Embeddings integration uses the HuggingFace Inference API to generate embeddings for a given text using by default the sentence-transformers/distilbert-base-nli-mean-tokens model. You can pass a different model name to the constructor to use a different model.

npm install @huggingface/inference@1
import { HuggingFaceInferenceEmbeddings } from "langchain/embeddings/hf";

const embeddings = new HuggingFaceInferenceEmbeddings({
apiKey: "YOUR-API-KEY", // In Node.js defaults to process.env.HUGGINGFACEHUB_API_KEY
});