Skip to main content

How to create a time-weighted retriever

Prerequisites

This guide assumes familiarity with the following concepts:

This guide covers the TimeWeightedVectorStoreRetriever, which uses a combination of semantic similarity and a time decay.

The algorithm for scoring them is:

semantic_similarity + (1.0 - decay_rate) ^ hours_passed

Notably, hours_passed refers to the hours passed since the object in the retriever was last accessed, not since it was created. This means that frequently accessed objects remain "fresh."

let score = (1.0 - this.decayRate) ** hoursPassed + vectorRelevance;

this.decayRate is a configurable decimal number between 0 and 1. A lower number means that documents will be "remembered" for longer, while a higher number strongly weights more recently accessed documents.

Note that setting a decay rate of exactly 0 or 1 makes hoursPassed irrelevant and makes this retriever equivalent to a standard vector lookup.

It is important to note that due to required metadata, all documents must be added to the backing vector store using the addDocuments method on the retriever, not the vector store itself.

npm install @langchain/openai
import { TimeWeightedVectorStoreRetriever } from "langchain/retrievers/time_weighted";
import { MemoryVectorStore } from "langchain/vectorstores/memory";
import { OpenAIEmbeddings } from "@langchain/openai";

const vectorStore = new MemoryVectorStore(new OpenAIEmbeddings());

const retriever = new TimeWeightedVectorStoreRetriever({
vectorStore,
memoryStream: [],
searchKwargs: 2,
});

const documents = [
"My name is John.",
"My name is Bob.",
"My favourite food is pizza.",
"My favourite food is pasta.",
"My favourite food is sushi.",
].map((pageContent) => ({ pageContent, metadata: {} }));

// All documents must be added using this method on the retriever (not the vector store!)
// so that the correct access history metadata is populated
await retriever.addDocuments(documents);

const results1 = await retriever.invoke("What is my favourite food?");

console.log(results1);

/*
[
Document { pageContent: 'My favourite food is pasta.', metadata: {} }
]
*/

const results2 = await retriever.invoke("What is my favourite food?");

console.log(results2);

/*
[
Document { pageContent: 'My favourite food is pasta.', metadata: {} }
]
*/

API Reference:

Next steps​

You've now learned how to use time as a factor when performing retrieval.

Next, check out the broader tutorial on RAG, or this section to learn how to create your own custom retriever over any data source.


Was this page helpful?


You can also leave detailed feedback on GitHub.