
Google Cloud SQL for PostgreSQL

Cloud SQL is a fully managed relational database service that offers high performance, seamless integration, and impressive scalability, with support for database engines such as PostgreSQL.

This guide provides a quick overview of how to use Cloud SQL for PostgreSQL to load Documents with the PostgresLoader class.

Overview

Before you begin

To use this package, you first need to complete the following steps:

  1. Select or create a Cloud Platform project.
  2. Enable billing for your project.
  3. Enable the Cloud SQL Admin API.
  4. Set up authentication.
  5. Create a Cloud SQL instance.
  6. Create a Cloud SQL database.
  7. Add a user to the database.

Authentication

Authenticate locally to your Google Cloud account using the gcloud auth login command.
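For example, from a terminal:

```shell
gcloud auth login
```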

Set Your Google Cloud Project

Set your Google Cloud project ID to leverage Google Cloud resources locally:

gcloud config set project YOUR-PROJECT-ID

If you don't know your project ID, you can look it up with the gcloud CLI.
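The gcloud CLI can list the projects your account has access to, along with their IDs:

```shell
gcloud projects list
```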

Setting up a PostgresLoader instance

To use the PostgresLoader class, you'll need to install the @langchain/google-cloud-sql-pg package and then follow the steps below.

First, you'll need to log in to your Google Cloud account and set the following environment variables based on your Google Cloud project. Which variables you need depends on how you configure your PostgresEngine instance (fromInstance, fromEngine, or fromEngineArgs):

PROJECT_ID="your-project-id"
REGION="your-project-region" # example: "us-central1"
INSTANCE_NAME="your-instance"
DB_NAME="your-database-name"
DB_USER="your-database-user"
PASSWORD="your-database-password"

Setting up an instance

To instantiate a PostgresLoader, you'll first need to create a database connection through the PostgresEngine.

import {
  PostgresLoader,
  PostgresEngine,
  PostgresEngineArgs,
} from "@langchain/google-cloud-sql-pg";
import * as dotenv from "dotenv";

dotenv.config();

const peArgs: PostgresEngineArgs = {
  user: process.env.DB_USER ?? "",
  password: process.env.PASSWORD ?? "",
};

// PostgresEngine instantiation
const engine: PostgresEngine = await PostgresEngine.fromInstance(
  process.env.PROJECT_ID ?? "",
  process.env.REGION ?? "",
  process.env.INSTANCE_NAME ?? "",
  process.env.DB_NAME ?? "",
  peArgs
);

Load Documents using the table_name argument

The loader returns a list of Documents, one per table row, using the specified content columns as page_content and the specified metadata columns as metadata. By default, the first column of the table is used as page_content and the second column as metadata (JSON).

const documentLoaderArgs: PostgresLoaderOptions = {
  tableName: "test_table_custom",
  contentColumns: ["fruit_name", "variety"],
  metadataColumns: [
    "fruit_id",
    "quantity_in_stock",
    "price_per_unit",
    "organic",
  ],
  format: "text",
};

const documentLoaderInstance = await PostgresLoader.initialize(
  engine,
  documentLoaderArgs
);
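Once initialized, the loader can fetch the table's rows as Documents. A minimal sketch, assuming the engine and table above exist in your Cloud SQL instance:

```typescript
// Load all rows from the configured table as Documents.
const documents = await documentLoaderInstance.load();

for (const doc of documents) {
  // pageContent holds the concatenated content columns;
  // metadata holds the configured metadata columns.
  console.log(doc.pageContent);
  console.log(doc.metadata);
}
```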

Load Documents using a SQL query

The query parameter lets you specify a custom SQL query, which can include filters to load specific documents from the database.

const documentLoaderArgs: PostgresLoaderOptions = {
  query: "SELECT * FROM my_fruit_table",
  contentColumns: ["fruit_name", "variety"],
  metadataColumns: [
    "fruit_id",
    "quantity_in_stock",
    "price_per_unit",
    "organic",
  ],
  format: "text",
};

const documentLoaderInstance = await PostgresLoader.initialize(
  engine,
  documentLoaderArgs
);

Set page content format

The loader returns a list of Documents, one per row, with the page content rendered in the specified string format: text (space-separated concatenation), JSON, YAML, or CSV. The JSON and YAML formats include field headers, while text and CSV do not.
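To illustrate the difference, here is a plain sketch (not the library's implementation) of roughly how a row with content columns fruit_name and variety renders under each format:

```typescript
// Hypothetical row and content columns, mirroring the examples above.
const row: Record<string, string> = { fruit_name: "Apple", variety: "Granny Smith" };
const contentColumns = ["fruit_name", "variety"];

// "text": space-separated concatenation of the column values, no headers.
const asText = contentColumns.map((col) => row[col]).join(" ");

// "json": includes the field headers (column names) as keys.
const asJson = JSON.stringify(
  Object.fromEntries(contentColumns.map((col) => [col, row[col]]))
);

// "csv": values only, comma-separated, without field headers.
const asCsv = contentColumns.map((col) => row[col]).join(",");

console.log(asText); // "Apple Granny Smith"
console.log(asJson); // {"fruit_name":"Apple","variety":"Granny Smith"}
console.log(asCsv);  // "Apple,Granny Smith"
```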

