Google Cloud SQL for PostgreSQL
Cloud SQL is a fully managed relational database service that offers high performance, seamless integration, and impressive scalability and offers database engines such as PostgreSQL.
This guide provides a quick overview of how to use Cloud SQL for
PostgreSQL to store vector embeddings with the PostgresVectorStore
class.
Overview
Integration details
Class | Package | PY support | Package latest |
---|---|---|---|
PostgresVectorStore | @langchain/google-cloud-sql-pg | ✅ | 0.0.1 |
Before you begin
In order to use this package, you first need to go throught the following steps:
- Select or create a Cloud Platform project.
- Enable billing for your project.
- Enable the Cloud SQL Admin API.
- Setup Authentication.
- Create a CloudSQL instance
- Create a CloudSQL database
- Add a user to the database
Authentication
Authenticate locally to your Google Cloud account using the
gcloud auth login
command.
Set Your Google Cloud Project
Set your Google Cloud project ID to leverage Google Cloud resources locally:
gcloud config set project YOUR-PROJECT-ID
If you don’t know your project ID, try the following: * Run
gcloud config list
. * Run gcloud projects list
. * See the support
page: Locate the project
ID.
Setting up a PostgresVectorStore instance
To use the PostgresVectorStore library, you’ll need to install the
@langchain/google-cloud-sql-pg
package and then follow the steps
bellow.
First, you’ll need to log in to your Google Cloud account and set the following environment variables based on your Google Cloud project; these will be defined based on how you want to configure (fromInstance, fromEngine, fromEngineArgs) your PostgresEngine instance :
PROJECT_ID="your-project-id"
REGION="your-project-region" // example: "us-central1"
INSTANCE_NAME="your-instance"
DB_NAME="your-database-name"
DB_USER="your-database-user"
PASSWORD="your-database-password"
Setting up an instance
To instantiate a PostgresVectorStore, you’ll first need to create a
database connection through the PostgresEngine, then initialize the
vector store table and finally call the .initialize()
method to
instantiate the vector store.
import {
Column,
PostgresEngine,
PostgresEngineArgs,
PostgresVectorStore,
PostgresVectorStoreArgs,
VectorStoreTableArgs,
} from "@langchain/google-cloud-sql-pg";
import { SyntheticEmbeddings } from "@langchain/core/utils/testing"; // This is used as an Embedding service
import * as dotenv from "dotenv";
dotenv.config();
const peArgs: PostgresEngineArgs = {
user: process.env.DB_USER ?? "",
password: process.env.PASSWORD ?? "",
};
// PostgresEngine instantiation
const engine: PostgresEngine = await PostgresEngine.fromInstance(
process.env.PROJECT_ID ?? "",
process.env.REGION ?? "",
process.env.INSTANCE_NAME ?? "",
process.env.DB_NAME ?? "",
peArgs
);
const vectorStoreArgs: VectorStoreTableArgs = {
metadataColumns: [new Column("page", "TEXT"), new Column("source", "TEXT")],
};
// Vector store table initilization
await engine.initVectorstoreTable("my_vector_store_table", 768, vectorStoreArgs);
const embeddingService = new SyntheticEmbeddings({ vectorSize: 768 });
const pvectorArgs: PostgresVectorStoreArgs = {
metadataColumns: ["page", "source"],
};
// PostgresVectorStore instantiation
const vectorStore = await PostgresVectorStore.initialize(
engine,
embeddingService,
"my_vector_store_table",
pvectorArgs
);
Manage Vector Store
Add Documents to vector store
To add Documents to the vector store, you would be able to it by passing or not the ids
import { v4 as uuidv4 } from "uuid";
import type { Document } from "@langchain/core/documents";
const document1: Document = {
pageContent: "The powerhouse of the cell is the mitochondria",
metadata: { page: 0, source: "https://example.com" },
};
const document2: Document = {
pageContent: "Buildings are made out of brick",
metadata: { page: 1, source: "https://example.com" },
};
const document3: Document = {
pageContent: "Mitochondria are made out of lipids",
metadata: { page: 2, source: "https://example.com" },
};
const document4: Document = {
pageContent: "The 2024 Olympics are in Paris",
metadata: { page: 3, source: "https://example.com" },
};
const documents = [document1, document2, document3, document4];
const ids = [uuidv4(), uuidv4(), uuidv4(), uuidv4()];
await vectorStore.addDocuments(documents, { ids: ids });
Delete Documents from vector store
You can delete one or more Documents from the vector store by passing the arrays of ids to be deleted:
// deleting a document
const id1 = ids[0];
await vectorStore.delete({ ids: [id1] });
// deleting more than one document
await vectorStore.delete({ ids: ids });
Search for documents
Once your vector store has been created and the relevant documents have been added you will most likely wish to query it during the running of your chain or agent.
Query directly
Performing a simple similarity search can be done as follows:
const filter = `"source" = "https://example.com"`;
const results = await vectorStore.similaritySearch("biology", 2, filter);
for (const doc of results) {
console.log(`* ${doc.pageContent} [${JSON.stringify(doc.metadata, null)}]`);
}
If you want to execute a similarity search and receive the corresponding scores you can run:
const filter = `"source" = "https://example.com"`;
const resultsWithScores = await vectorStore.similaritySearchWithScore(
"biology",
2,
filter
);
for (const [doc, score] of resultsWithScores) {
console.log(
`* [SIM=${score.toFixed(3)}] ${doc.pageContent} [${JSON.stringify(doc.metadata)}]`
);
}
Query by using the max marginal relevance search
The Maximal marginal relevance optimizes for similarity to the query and diversity among selected documents.
const options = {
k: 4,
filter: `"source" = 'https://example.com'`,
};
const results = await vectorStoreInstance.maxMarginalRelevanceSearch("biology", options);
for (const doc of results) {
console.log(`* ${doc.pageContent} [${JSON.stringify(doc.metadata, null)}]`);
}
Related
- Vector store conceptual guide
- Vector store how-to guides