Weaviate

Weaviate is an open-source vector database that stores both objects and vectors, which lets you combine vector search with structured filtering. LangChain connects to Weaviate via the weaviate-ts-client package, the official TypeScript client for Weaviate.

LangChain inserts vectors directly into Weaviate and queries Weaviate for the nearest neighbors of a given vector, so you can use any of the LangChain Embeddings integrations with Weaviate.
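
To make "nearest neighbors of a given vector" concrete, here is a minimal sketch of a brute-force nearest-neighbor search using cosine similarity over in-memory vectors. It is an illustration only: the names `StoredObject`, `cosineSimilarity`, and `nearestNeighbors` are hypothetical, the vectors are made up rather than produced by an embedding model, and Weaviate itself relies on its own vector index rather than a linear scan like this.

```typescript
// Illustrative sketch only: brute-force nearest-neighbor search.
// None of these names come from Weaviate or LangChain.
type StoredObject = { text: string; vector: number[] };

// Cosine similarity between two vectors of equal length.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i += 1) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // Treat a zero vector as having no similarity to anything.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return the k stored objects whose vectors are most similar to the query.
function nearestNeighbors(
  query: number[],
  stored: StoredObject[],
  k: number
): StoredObject[] {
  return stored
    .map((object) => ({ object, score: cosineSimilarity(query, object.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((entry) => entry.object);
}

// Made-up 3-dimensional vectors standing in for real embeddings.
const objects: StoredObject[] = [
  { text: "hello world", vector: [1, 0, 0] },
  { text: "hi there", vector: [0.9, 0.1, 0] },
  { text: "bye now", vector: [0, 0, 1] },
];

const top = nearestNeighbors([1, 0, 0], objects, 2);
console.log(top.map((o) => o.text)); // [ 'hello world', 'hi there' ]
```

In the real integration, the embedding model turns each text into a vector and Weaviate performs the search, so none of this code is needed in practice.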

Setup

Weaviate has its own standalone integration package with LangChain, available as @langchain/weaviate on npm.

npm install @langchain/weaviate @langchain/openai @langchain/community

You'll need to run Weaviate either locally or on a server; see the Weaviate documentation for more information.

Usage, insert documents

/* eslint-disable @typescript-eslint/no-explicit-any */
import weaviate, { ApiKey } from "weaviate-ts-client";
import { WeaviateStore } from "@langchain/weaviate";
import { OpenAIEmbeddings } from "@langchain/openai";

export async function run() {
  // The weaviate-ts-client types are off here, so we cast to `any`
  // (hence the eslint-disable comment above)
  const client = (weaviate as any).client({
    scheme: process.env.WEAVIATE_SCHEME || "https",
    host: process.env.WEAVIATE_HOST || "localhost",
    apiKey: new ApiKey(process.env.WEAVIATE_API_KEY || "default"),
  });

  // Create a store and fill it with some texts + metadata
  await WeaviateStore.fromTexts(
    ["hello world", "hi there", "how are you", "bye now"],
    [{ foo: "bar" }, { foo: "baz" }, { foo: "qux" }, { foo: "bar" }],
    new OpenAIEmbeddings(),
    {
      client,
      indexName: "Test",
      textKey: "text",
      metadataKeys: ["foo"],
    }
  );
}

Usage, query documents

/* eslint-disable @typescript-eslint/no-explicit-any */
import weaviate, { ApiKey } from "weaviate-ts-client";
import { WeaviateStore } from "@langchain/weaviate";
import { OpenAIEmbeddings } from "@langchain/openai";

export async function run() {
  // The weaviate-ts-client types are off here, so we cast to `any`
  // (hence the eslint-disable comment above)
  const client = (weaviate as any).client({
    scheme: process.env.WEAVIATE_SCHEME || "https",
    host: process.env.WEAVIATE_HOST || "localhost",
    apiKey: new ApiKey(process.env.WEAVIATE_API_KEY || "default"),
  });

  // Create a store for an existing index
  const store = await WeaviateStore.fromExistingIndex(new OpenAIEmbeddings(), {
    client,
    indexName: "Test",
    metadataKeys: ["foo"],
  });

  // Search the index without any filters
  const results = await store.similaritySearch("hello world", 1);
  console.log(results);
  /*
  [ Document { pageContent: 'hello world', metadata: { foo: 'bar' } } ]
  */

  // Search the index with a filter. Here, only return results where the
  // "foo" metadata key equals "baz". See the Weaviate docs for more:
  // https://weaviate.io/developers/weaviate/api/graphql/filters
  const results2 = await store.similaritySearch("hello world", 1, {
    where: {
      operator: "Equal",
      path: ["foo"],
      valueText: "baz",
    },
  });
  console.log(results2);
  /*
  [ Document { pageContent: 'hi there', metadata: { foo: 'baz' } } ]
  */
}
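
The `where` filter passed to `similaritySearch` above is a plain object, so it can be built with a small helper when the same shape is needed in several places. The sketch below is an assumption-laden convenience, not part of @langchain/weaviate: the `equalTextFilter` function and `EqualTextFilter` type are hypothetical names, and it covers only the `Equal` operator on a text value as used in the example.

```typescript
// Hypothetical helper (not part of @langchain/weaviate or weaviate-ts-client).
// It builds the same filter shape used in the query example above.
type EqualTextFilter = {
  where: { operator: "Equal"; path: string[]; valueText: string };
};

// Build a filter matching objects whose `key` property equals `value`.
function equalTextFilter(key: string, value: string): EqualTextFilter {
  return { where: { operator: "Equal", path: [key], valueText: value } };
}

const filter = equalTextFilter("foo", "baz");
console.log(JSON.stringify(filter));
// {"where":{"operator":"Equal","path":["foo"],"valueText":"baz"}}
```

The resulting object has the same shape as the third argument in the example, so it could presumably be passed as `store.similaritySearch("hello world", 1, equalTextFilter("foo", "baz"))`, though the store's own filter type may require a cast.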

Usage, maximal marginal relevance

You can use maximal marginal relevance search, which optimizes for both similarity to the query and diversity among the selected results.

/* eslint-disable @typescript-eslint/no-explicit-any */
import weaviate, { ApiKey } from "weaviate-ts-client";
import { WeaviateStore } from "@langchain/weaviate";
import { OpenAIEmbeddings } from "@langchain/openai";

export async function run() {
  // The weaviate-ts-client types are off here, so we cast to `any`
  // (hence the eslint-disable comment above)
  const client = (weaviate as any).client({
    scheme: process.env.WEAVIATE_SCHEME || "https",
    host: process.env.WEAVIATE_HOST || "localhost",
    apiKey: new ApiKey(process.env.WEAVIATE_API_KEY || "default"),
  });

  // Create a store for an existing index
  const store = await WeaviateStore.fromExistingIndex(new OpenAIEmbeddings(), {
    client,
    indexName: "Test",
    metadataKeys: ["foo"],
  });

  // Return the single best result under maximal marginal relevance
  const resultOne = await store.maxMarginalRelevanceSearch("Hello world", {
    k: 1,
  });

  console.log(resultOne);
}
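
To show how maximal marginal relevance trades similarity against diversity, here is a minimal, generic sketch of greedy MMR selection over in-memory vectors. It is not taken from the @langchain/weaviate source: the names `cosine` and `mmrSelect`, the `lambda` weighting parameter, and the example vectors are all assumptions for illustration.

```typescript
// Generic MMR sketch; not the implementation used by @langchain/weaviate.
function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i += 1) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Greedily pick k candidate indices. Each step maximizes
//   lambda * sim(query, candidate) - (1 - lambda) * max sim(candidate, selected)
// so lambda = 1 is pure relevance and smaller values favor diversity.
function mmrSelect(
  query: number[],
  candidates: number[][],
  k: number,
  lambda = 0.5
): number[] {
  const selected: number[] = [];
  const remaining = candidates.map((_, index) => index);

  while (selected.length < k && remaining.length > 0) {
    let bestPos = 0;
    let bestScore = -Infinity;
    for (let pos = 0; pos < remaining.length; pos += 1) {
      const idx = remaining[pos];
      const relevance = cosine(query, candidates[idx]);
      // Redundancy is the highest similarity to anything already chosen.
      const redundancy =
        selected.length === 0
          ? 0
          : Math.max(...selected.map((s) => cosine(candidates[idx], candidates[s])));
      const score = lambda * relevance - (1 - lambda) * redundancy;
      if (score > bestScore) {
        bestScore = score;
        bestPos = pos;
      }
    }
    selected.push(remaining[bestPos]);
    remaining.splice(bestPos, 1);
  }
  return selected;
}

// Candidates 0 and 1 are near-duplicates; candidate 2 is less relevant but different.
const candidateVectors = [
  [1, 0.1],
  [1, 0.2],
  [1, -1],
];
console.log(mmrSelect([1, 0], candidateVectors, 2, 0.5)); // [ 0, 2 ]
console.log(mmrSelect([1, 0], candidateVectors, 2, 1)); // [ 0, 1 ]
```

With `lambda` at 0.5 the second pick skips the near-duplicate in favor of the more distinct vector, while `lambda` at 1 reduces to a plain similarity ranking.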

Usage, delete documents

/* eslint-disable @typescript-eslint/no-explicit-any */
import weaviate, { ApiKey } from "weaviate-ts-client";
import { WeaviateStore } from "@langchain/weaviate";
import { OpenAIEmbeddings } from "@langchain/openai";

export async function run() {
  // The weaviate-ts-client types are off here, so we cast to `any`
  // (hence the eslint-disable comment above)
  const client = (weaviate as any).client({
    scheme: process.env.WEAVIATE_SCHEME || "https",
    host: process.env.WEAVIATE_HOST || "localhost",
    apiKey: new ApiKey(process.env.WEAVIATE_API_KEY || "default"),
  });

  // Create a store for an existing index
  const store = await WeaviateStore.fromExistingIndex(new OpenAIEmbeddings(), {
    client,
    indexName: "Test",
    metadataKeys: ["foo"],
  });

  const docs = [{ pageContent: "see ya!", metadata: { foo: "bar" } }];

  // addDocuments also supports an additional { ids: [] } parameter for upserts
  const ids = await store.addDocuments(docs);

  // Search the index without any filters
  const results = await store.similaritySearch("see ya!", 1);
  console.log(results);
  /*
  [ Document { pageContent: 'see ya!', metadata: { foo: 'bar' } } ]
  */

  // Delete documents by id
  await store.delete({ ids });

  const results2 = await store.similaritySearch("see ya!", 1);
  console.log(results2);
  /*
  []
  */

  const docs2 = [
    { pageContent: "hello world", metadata: { foo: "bar" } },
    { pageContent: "hi there", metadata: { foo: "baz" } },
    { pageContent: "how are you", metadata: { foo: "qux" } },
    { pageContent: "hello world", metadata: { foo: "bar" } },
    { pageContent: "bye now", metadata: { foo: "bar" } },
  ];

  await store.addDocuments(docs2);

  const results3 = await store.similaritySearch("hello world", 1);
  console.log(results3);
  /*
  [ Document { pageContent: 'hello world', metadata: { foo: 'bar' } } ]
  */

  // Delete documents matching a filter
  await store.delete({
    filter: {
      where: {
        operator: "Equal",
        path: ["foo"],
        valueText: "bar",
      },
    },
  });

  // Searching with the same filter now returns nothing
  const results4 = await store.similaritySearch("hello world", 1, {
    where: {
      operator: "Equal",
      path: ["foo"],
      valueText: "bar",
    },
  });
  console.log(results4);
  /*
  []
  */
}
