Integrations: Embeddings
LangChain offers a number of Embeddings implementations that integrate with various model providers. These are:
OpenAIEmbeddings
The OpenAIEmbeddings
class uses the OpenAI API to generate embeddings for a given text. By default it strips new line characters from the text, as recommended by OpenAI, but you can disable this by passing stripNewLines: false
to the constructor.
import { OpenAIEmbeddings } from "langchain/embeddings/openai";
const embeddings = new OpenAIEmbeddings({
openAIApiKey: "YOUR-API-KEY", // In Node.js defaults to process.env.OPENAI_API_KEY
});
Azure OpenAIEmbeddings
The OpenAIEmbeddings
class uses the OpenAI API on Azure to generate embeddings for a given text. By default it strips new line characters from the text, as recommended by OpenAI, but you can disable this by passing stripNewLines: false
to the constructor.
import { OpenAIEmbeddings } from "langchain/embeddings/openai";
const embeddings = new OpenAIEmbeddings({
azureOpenAIApiKey: "YOUR-API-KEY", // In Node.js defaults to process.env.AZURE_OPENAI_API_KEY
azureOpenAIApiInstanceName: "YOUR-INSTANCE-NAME", // In Node.js defaults to process.env.AZURE_OPENAI_API_INSTANCE_NAME
azureOpenAIApiDeploymentName: "YOUR-DEPLOYMENT-NAME", // In Node.js defaults to process.env.AZURE_OPENAI_API_DEPLOYMENT_NAME
azureOpenAIApiVersion: "YOUR-API-VERSION", // In Node.js defaults to process.env.AZURE_OPENAI_API_VERSION
});
Google Vertex AI
The GoogleVertexAIEmbeddings
class uses Google's Vertex AI PaLM models
to generate embeddings for a given text.
The Vertex AI implementation is meant to be used in Node.js and not directly in a browser, since it requires a service account to use.
Before running this code, you should make sure the Vertex AI API is enabled for the relevant project in your Google Cloud dashboard and that you've authenticated to Google Cloud using one of these methods:
- You are logged into an account (using
gcloud auth application-default login
) permitted to that project. - You are running on a machine using a service account that is permitted to the project.
- You have downloaded the credentials for a service account that is permitted
to the project and set the
GOOGLE_APPLICATION_CREDENTIALS
environment variable to the path of this file.
- npm
- Yarn
- pnpm
npm install google-auth-library
yarn add google-auth-library
pnpm add google-auth-library
import { GoogleVertexAIEmbeddings } from "langchain/embeddings/googlevertexai";
export const run = async () => {
const model = new GoogleVertexAIEmbeddings();
const res = await model.embedQuery(
"What would be a good company name for a company that makes colorful socks?"
);
console.log({ res });
};
API Reference:
- GoogleVertexAIEmbeddings from
langchain/embeddings/googlevertexai
Note: The default Google Vertex AI embeddings model, textembedding-gecko
, has a different number of dimensions than OpenAI's text-embedding-ada-002
model
and may not be supported by all vector store providers.
CohereEmbeddings
The CohereEmbeddings
class uses the Cohere API to generate embeddings for a given text.
- npm
- Yarn
- pnpm
npm install cohere-ai
yarn add cohere-ai
pnpm add cohere-ai
import { CohereEmbeddings } from "langchain/embeddings/cohere";
const embeddings = new CohereEmbeddings({
apiKey: "YOUR-API-KEY", // In Node.js defaults to process.env.COHERE_API_KEY
});
TensorFlowEmbeddings
This Embeddings integration runs the embeddings entirely in your browser or Node.js environment, using TensorFlow.js. This means that your data isn't sent to any third party, and you don't need to sign up for any API keys. However, it does require more memory and processing power than the other integrations.
- npm
- Yarn
- pnpm
npm install @tensorflow/tfjs-core @tensorflow/tfjs-converter @tensorflow-models/universal-sentence-encoder @tensorflow/tfjs-backend-cpu
yarn add @tensorflow/tfjs-core @tensorflow/tfjs-converter @tensorflow-models/universal-sentence-encoder @tensorflow/tfjs-backend-cpu
pnpm add @tensorflow/tfjs-core @tensorflow/tfjs-converter @tensorflow-models/universal-sentence-encoder @tensorflow/tfjs-backend-cpu
import "@tensorflow/tfjs-backend-cpu";
import { TensorFlowEmbeddings } from "langchain/embeddings/tensorflow";
const embeddings = new TensorFlowEmbeddings();
This example uses the CPU backend, which works in any JS environment. However, you can use any of the backends supported by TensorFlow.js, including GPU and WebAssembly, which will be a lot faster. For Node.js you can use the @tensorflow/tfjs-node
package, and for the browser you can use the @tensorflow/tfjs-backend-webgl
package. See the TensorFlow.js documentation for more information.
HuggingFaceInferenceEmbeddings
This Embeddings integration uses the HuggingFace Inference API to generate embeddings for a given text using by default the sentence-transformers/distilbert-base-nli-mean-tokens
model. You can pass a different model name to the constructor to use a different model.
- npm
- Yarn
- pnpm
npm install @huggingface/inference@1
yarn add @huggingface/inference@1
pnpm add @huggingface/inference@1
import { HuggingFaceInferenceEmbeddings } from "langchain/embeddings/hf";
const embeddings = new HuggingFaceInferenceEmbeddings({
apiKey: "YOUR-API-KEY", // In Node.js defaults to process.env.HUGGINGFACEHUB_API_KEY
});