Skip to main content

Google Cloud SQL for PostgreSQL

Cloud SQL is a fully managed relational database service that offers high performance, seamless integration, and impressive scalability and offers database engines such as PostgreSQL.

This guide provides a quick overview of how to use Cloud SQL for PostgreSQL to store vector embeddings with the PostgresVectorStore class.

Overview

Integration details

ClassPackagePY supportPackage latest
PostgresVectorStore@langchain/google-cloud-sql-pg0.0.1

Before you begin

In order to use this package, you first need to go throught the following steps:

  1. Select or create a Cloud Platform project.
  2. Enable billing for your project.
  3. Enable the Cloud SQL Admin API.
  4. Setup Authentication.
  5. Create a CloudSQL instance
  6. Create a CloudSQL database
  7. Add a user to the database

Authentication

Authenticate locally to your Google Cloud account using the gcloud auth login command.

Set Your Google Cloud Project

Set your Google Cloud project ID to leverage Google Cloud resources locally:

gcloud config set project YOUR-PROJECT-ID

If you don’t know your project ID, try the following: * Run gcloud config list. * Run gcloud projects list. * See the support page: Locate the project ID.

Setting up a PostgresVectorStore instance

To use the PostgresVectorStore library, you’ll need to install the @langchain/google-cloud-sql-pg package and then follow the steps bellow.

First, you’ll need to log in to your Google Cloud account and set the following environment variables based on your Google Cloud project; these will be defined based on how you want to configure (fromInstance, fromEngine, fromEngineArgs) your PostgresEngine instance :

PROJECT_ID="your-project-id"
REGION="your-project-region" // example: "us-central1"
INSTANCE_NAME="your-instance"
DB_NAME="your-database-name"
DB_USER="your-database-user"
PASSWORD="your-database-password"

Setting up an instance

To instantiate a PostgresVectorStore, you’ll first need to create a database connection through the PostgresEngine, then initialize the vector store table and finally call the .initialize() method to instantiate the vector store.

import {
Column,
PostgresEngine,
PostgresEngineArgs,
PostgresVectorStore,
PostgresVectorStoreArgs,
VectorStoreTableArgs,
} from "@langchain/google-cloud-sql-pg";
import { SyntheticEmbeddings } from "@langchain/core/utils/testing"; // This is used as an Embedding service
import * as dotenv from "dotenv";

dotenv.config();

const peArgs: PostgresEngineArgs = {
user: process.env.DB_USER ?? "",
password: process.env.PASSWORD ?? "",
};

// PostgresEngine instantiation
const engine: PostgresEngine = await PostgresEngine.fromInstance(
process.env.PROJECT_ID ?? "",
process.env.REGION ?? "",
process.env.INSTANCE_NAME ?? "",
process.env.DB_NAME ?? "",
peArgs
);

const vectorStoreArgs: VectorStoreTableArgs = {
metadataColumns: [new Column("page", "TEXT"), new Column("source", "TEXT")],
};

// Vector store table initilization
await engine.initVectorstoreTable("my_vector_store_table", 768, vectorStoreArgs);
const embeddingService = new SyntheticEmbeddings({ vectorSize: 768 });

const pvectorArgs: PostgresVectorStoreArgs = {
metadataColumns: ["page", "source"],
};

// PostgresVectorStore instantiation
const vectorStore = await PostgresVectorStore.initialize(
engine,
embeddingService,
"my_vector_store_table",
pvectorArgs
);

Manage Vector Store

Add Documents to vector store

To add Documents to the vector store, you would be able to it by passing or not the ids

import { v4 as uuidv4 } from "uuid";
import type { Document } from "@langchain/core/documents";

const document1: Document = {
pageContent: "The powerhouse of the cell is the mitochondria",
metadata: { page: 0, source: "https://example.com" },
};

const document2: Document = {
pageContent: "Buildings are made out of brick",
metadata: { page: 1, source: "https://example.com" },
};

const document3: Document = {
pageContent: "Mitochondria are made out of lipids",
metadata: { page: 2, source: "https://example.com" },
};

const document4: Document = {
pageContent: "The 2024 Olympics are in Paris",
metadata: { page: 3, source: "https://example.com" },
};

const documents = [document1, document2, document3, document4];

const ids = [uuidv4(), uuidv4(), uuidv4(), uuidv4()];

await vectorStore.addDocuments(documents, { ids: ids });

Delete Documents from vector store

You can delete one or more Documents from the vector store by passing the arrays of ids to be deleted:

// deleting a document
const id1 = ids[0];
await vectorStore.delete({ ids: [id1] });

// deleting more than one document
await vectorStore.delete({ ids: ids });

Search for documents

Once your vector store has been created and the relevant documents have been added you will most likely wish to query it during the running of your chain or agent.

Query directly

Performing a simple similarity search can be done as follows:

const filter = `"source" = "https://example.com"`;

const results = await vectorStore.similaritySearch("biology", 2, filter);

for (const doc of results) {
console.log(`* ${doc.pageContent} [${JSON.stringify(doc.metadata, null)}]`);
}

If you want to execute a similarity search and receive the corresponding scores you can run:

const filter = `"source" = "https://example.com"`;
const resultsWithScores = await vectorStore.similaritySearchWithScore(
"biology",
2,
filter
);

for (const [doc, score] of resultsWithScores) {
console.log(
`* [SIM=${score.toFixed(3)}] ${doc.pageContent} [${JSON.stringify(doc.metadata)}]`
);
}

The Maximal marginal relevance optimizes for similarity to the query and diversity among selected documents.

const options = {
k: 4,
filter: `"source" = 'https://example.com'`,
};

const results = await vectorStoreInstance.maxMarginalRelevanceSearch("biology", options);

for (const doc of results) {
console.log(`* ${doc.pageContent} [${JSON.stringify(doc.metadata, null)}]`);
}

Was this page helpful?


You can also leave detailed feedback on GitHub.