Zep Cloud Memory

Recall, understand, and extract data from chat histories. Power personalized AI experiences.

Zep is a long-term memory service for AI assistant apps. With Zep, you can give AI assistants the ability to recall past conversations, no matter how distant, while also reducing hallucinations, latency, and cost.

How Zep Cloud works

Zep persists and recalls chat histories, and automatically generates summaries and other artifacts from them. It also embeds messages and summaries, so you can search Zep for relevant context from past conversations. Zep does all of this asynchronously, so these operations don't slow down your users' chat experience. Data is persisted to a database, allowing you to scale out when growth demands.

Zep also provides a simple, easy-to-use abstraction for document vector search called Document Collections. It is designed to complement Zep's core memory features, not to serve as a general-purpose vector database.

Zep allows you to be more intentional about constructing your prompt by including:

  • a few recent messages, added automatically, with the number customized for your app;
  • a summary of the conversation prior to those messages;
  • and/or contextually relevant summaries or messages surfaced from the entire chat session;
  • and/or relevant business data from Zep Document Collections.
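
The pieces listed above can be combined into a single context string before it is handed to the model. The following is a minimal sketch of that assembly step, not Zep's implementation: the `ChatMessage` and `PromptParts` shapes, the `buildPromptContext` name, and the section labels are all hypothetical and chosen only for illustration.

```typescript
// Hypothetical shapes for illustration; these are not Zep SDK types.
interface ChatMessage {
  role: "human" | "ai";
  content: string;
}

interface PromptParts {
  recentMessages: ChatMessage[]; // message history, oldest first
  summary?: string; // summary of the conversation before the recent messages
  relevantContext?: string[]; // messages or summaries surfaced by search
  businessData?: string[]; // passages from a document collection
}

// Assemble a context string, keeping only the last `lastN` messages.
function buildPromptContext(parts: PromptParts, lastN = 4): string {
  const sections: string[] = [];
  if (parts.summary) {
    sections.push(`Summary of earlier conversation:\n${parts.summary}`);
  }
  if (parts.relevantContext && parts.relevantContext.length > 0) {
    sections.push(`Relevant past context:\n${parts.relevantContext.join("\n")}`);
  }
  if (parts.businessData && parts.businessData.length > 0) {
    sections.push(`Business data:\n${parts.businessData.join("\n")}`);
  }
  // slice(-0) would return the whole array, so guard against lastN <= 0.
  const recent = lastN > 0 ? parts.recentMessages.slice(-lastN) : [];
  if (recent.length > 0) {
    const lines = recent.map((m) => `${m.role}: ${m.content}`);
    sections.push(`Recent messages:\n${lines.join("\n")}`);
  }
  return sections.join("\n\n");
}
```

Sections with no content are skipped, so the same function works whether or not a summary, search results, or business data are available for a given turn.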

Zep Cloud offers:

  • Fact Extraction: Automatically build fact tables from conversations, without having to define a data schema upfront.
  • Dialog Classification: Instantly and accurately classify chat dialog. Understand user intent and emotion, segment users, and more. Route chains based on semantic context, and trigger events.
  • Structured Data Extraction: Quickly extract business data from chat conversations using a schema you define. Understand what your Assistant should ask for next in order to complete its task.

Installation

Sign up for Zep Cloud and create a project.

Follow the Zep Cloud TypeScript SDK Installation Guide to install and get started with Zep.

You'll need your Zep Cloud Project API Key to use Zep Cloud Memory. See the Zep Cloud docs for more information.

npm install @getzep/zep-cloud @langchain/openai @langchain/community
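
The examples on this page show the API key as the placeholder `<Zep Api Key>`. In a real project you would usually keep the key out of source code. The sketch below shows one way to do that; the `requireEnv` helper and the `ZEP_API_KEY` variable name are assumptions for illustration and are not defined by Zep or LangChain.

```typescript
// Hypothetical helper: read a required setting from an environment-like map
// and fail early with a clear message if it is missing or blank.
function requireEnv(
  name: string,
  env: Record<string, string | undefined>
): string {
  const value = env[name];
  if (value === undefined || value.trim() === "") {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// In Node.js you would typically pass process.env, for example:
// const apiKey = requireEnv("ZEP_API_KEY", process.env);
```

Failing at startup makes a missing key easier to spot than an authentication error on the first request.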

ZepCloudChatMessageHistory + RunnableWithMessageHistory usage

import { ZepClient } from "@getzep/zep-cloud";
import {
  ChatPromptTemplate,
  MessagesPlaceholder,
} from "@langchain/core/prompts";
import { ConsoleCallbackHandler } from "@langchain/core/tracers/console";
import { ChatOpenAI } from "@langchain/openai";
import { RunnableWithMessageHistory } from "@langchain/core/runnables";
import { ZepCloudChatMessageHistory } from "@langchain/community/stores/message/zep_cloud";

// Your Zep Session ID.
const sessionId = "<Zep Session ID>";
const zepClient = new ZepClient({
  // Your Zep Cloud Project API key https://help.getzep.com/projects
  apiKey: "<Zep Api Key>",
});

const prompt = ChatPromptTemplate.fromMessages([
  ["system", "Answer the user's question below. Be polite and helpful:"],
  new MessagesPlaceholder("history"),
  ["human", "{question}"],
]);

const chain = prompt
  .pipe(
    new ChatOpenAI({
      temperature: 0.8,
      modelName: "gpt-3.5-turbo-1106",
    })
  )
  .withConfig({
    callbacks: [new ConsoleCallbackHandler()],
  });

const chainWithHistory = new RunnableWithMessageHistory({
  runnable: chain,
  getMessageHistory: (sessionId) =>
    new ZepCloudChatMessageHistory({
      client: zepClient,
      sessionId,
      memoryType: "perpetual",
    }),
  inputMessagesKey: "question",
  historyMessagesKey: "history",
});

const result = await chainWithHistory.invoke(
  {
    question: "What did we talk about earlier?",
  },
  {
    configurable: {
      sessionId,
    },
  }
);

console.log("result", result);
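
To make the role of `getMessageHistory` and the session ID more concrete, here is a simplified, in-memory sketch of the pattern: look up the history for a session, pass it to the model together with the new question, then record both sides of the exchange. The `InMemoryHistory`, `getHistory`, and `runTurn` names are hypothetical stand-ins, not the LangChain or Zep classes, and the real wrapper does more than this.

```typescript
// Hypothetical, simplified stand-ins; not the LangChain or Zep classes.
interface StoredMessage {
  role: "human" | "ai";
  content: string;
}

class InMemoryHistory {
  private messages: StoredMessage[] = [];

  addMessage(message: StoredMessage): void {
    this.messages.push(message);
  }

  getMessages(): StoredMessage[] {
    return [...this.messages];
  }
}

const histories = new Map<string, InMemoryHistory>();

// Plays the role of `getMessageHistory`: one history object per session ID.
function getHistory(sessionId: string): InMemoryHistory {
  let history = histories.get(sessionId);
  if (!history) {
    history = new InMemoryHistory();
    histories.set(sessionId, history);
  }
  return history;
}

// One turn: load prior messages, call the model, then record both sides.
function runTurn(
  sessionId: string,
  question: string,
  model: (history: StoredMessage[], question: string) => string
): string {
  const history = getHistory(sessionId);
  const answer = model(history.getMessages(), question);
  history.addMessage({ role: "human", content: question });
  history.addMessage({ role: "ai", content: answer });
  return answer;
}
```

Because the history is keyed by session ID, two sessions never see each other's messages, which is why the example above passes `sessionId` in `configurable` on every call.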

ZepCloudChatMessageHistory + RunnableWithMessageHistory + ZepCloudVectorStore (as retriever) usage

import { ZepClient } from "@getzep/zep-cloud";
import {
  ChatPromptTemplate,
  MessagesPlaceholder,
} from "@langchain/core/prompts";
import { ConsoleCallbackHandler } from "@langchain/core/tracers/console";
import { ChatOpenAI } from "@langchain/openai";
import { Document } from "@langchain/core/documents";
import {
  RunnableLambda,
  RunnableMap,
  RunnablePassthrough,
  RunnableWithMessageHistory,
} from "@langchain/core/runnables";
import { ZepCloudVectorStore } from "@langchain/community/vectorstores/zep_cloud";
import { StringOutputParser } from "@langchain/core/output_parsers";
import { ZepCloudChatMessageHistory } from "@langchain/community/stores/message/zep_cloud";

interface ChainInput {
  question: string;
  sessionId: string;
}

// Join the page content of the retrieved documents into one context string.
function combineDocuments(
  docs: Document[],
  documentSeparator = "\n\n"
): string {
  return docs.map((doc) => doc.pageContent).join(documentSeparator);
}

// Your Zep Session ID.
const sessionId = "<Zep Session ID>";

const collectionName = "<Zep Collection Name>";

const zepClient = new ZepClient({
  // Your Zep Cloud Project API key https://help.getzep.com/projects
  apiKey: "<Zep Api Key>",
});

const vectorStore = await ZepCloudVectorStore.init({
  client: zepClient,
  collectionName,
});

const prompt = ChatPromptTemplate.fromMessages([
  [
    "system",
    `Answer the question based only on the following context and conversation history: {context}`,
  ],
  new MessagesPlaceholder("history"),
  ["human", "{question}"],
]);

const model = new ChatOpenAI({
  temperature: 0.8,
  modelName: "gpt-3.5-turbo-1106",
});
const retriever = vectorStore.asRetriever();
const searchQuery = new RunnableLambda({
  func: async (input: any) => {
    // Zep can synthesize a standalone question from the user input and session context.
    // This helps when the user types something like "yes" or "ok", which is a poor query for vector store retrieval.
    const { question } = await zepClient.memory.synthesizeQuestion(
      input.session_id
    );
    console.log("Synthesized question: ", question);
    return question;
  },
});
const retrieverLambda = new RunnableLambda({
  func: async (question: string) => {
    const response = await retriever.invoke(question);
    return combineDocuments(response);
  },
});
const setupAndRetrieval = RunnableMap.from({
  context: searchQuery.pipe(retrieverLambda),
  question: (x: any) => x.question,
  history: (x: any) => x.history,
});
const outputParser = new StringOutputParser();

const ragChain = setupAndRetrieval.pipe(prompt).pipe(model).pipe(outputParser);

const invokeChain = (chainInput: ChainInput) => {
  const chainWithHistory = new RunnableWithMessageHistory({
    runnable: RunnablePassthrough.assign({
      session_id: () => chainInput.sessionId,
    }).pipe(ragChain),
    getMessageHistory: (sessionId) =>
      new ZepCloudChatMessageHistory({
        client: zepClient,
        sessionId,
        memoryType: "perpetual",
      }),
    inputMessagesKey: "question",
    historyMessagesKey: "history",
  });

  return chainWithHistory.invoke(
    { question: chainInput.question },
    {
      configurable: {
        sessionId: chainInput.sessionId,
      },
    }
  );
};

const chain = new RunnableLambda({
  func: invokeChain,
}).withConfig({
  callbacks: [new ConsoleCallbackHandler()],
});

const result = await chain.invoke({
  question: "Project Gutenberg",
  sessionId,
});

console.log("result", result);
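
The example above always searches with the question Zep synthesizes. A possible variation, shown here only as a sketch, is to keep the user's own wording when it is already descriptive and fall back to the synthesized question for short replies such as "yes" or "ok". The `chooseSearchQuery` function and its `minWords` threshold are hypothetical and are not part of Zep or LangChain.

```typescript
// Hypothetical heuristic: prefer the user's input when it has at least
// `minWords` words; otherwise use the synthesized question if one exists.
function chooseSearchQuery(
  userInput: string,
  synthesized?: string,
  minWords = 3
): string {
  const trimmedInput = userInput.trim();
  const trimmedSynth = (synthesized ?? "").trim();
  const wordCount = trimmedInput === "" ? 0 : trimmedInput.split(/\s+/).length;
  if (wordCount >= minWords || trimmedSynth === "") {
    return trimmedInput;
  }
  return trimmedSynth;
}
```

A word-count threshold is a crude signal; whether it beats always using the synthesized question depends on your data, so treat it as a starting point to evaluate rather than a recommendation.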

Memory Usage

import { ChatOpenAI } from "@langchain/openai";
import { ConversationChain } from "langchain/chains";
import { ZepCloudMemory } from "@langchain/community/memory/zep_cloud";
import { randomUUID } from "crypto";

const sessionId = randomUUID(); // This should be unique for each user or each user's session.

const memory = new ZepCloudMemory({
  sessionId,
  // Your Zep Cloud Project API key https://help.getzep.com/projects
  apiKey: "<Zep Api Key>",
});

const model = new ChatOpenAI({
  modelName: "gpt-3.5-turbo",
  temperature: 0,
});

const chain = new ConversationChain({ llm: model, memory });
console.log("Memory Keys:", memory.memoryKeys);

const res1 = await chain.invoke({ input: "Hi! I'm Jim." });
console.log({ res1 });
/*
{
  res1: {
    text: "Hello Jim! It's nice to meet you. My name is AI. How may I assist you today?"
  }
}
*/

const res2 = await chain.invoke({ input: "What did I just say my name was?" });
console.log({ res2 });

/*
{
  res2: {
    text: "You said your name was Jim."
  }
}
*/
console.log("Session ID: ", sessionId);
console.log("Memory: ", await memory.loadMemoryVariables({}));
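
The comment in the example notes that the session ID should be unique per user or per user session. One way to honor that, sketched below under assumptions, is a small registry that creates an ID the first time a user appears and returns the same ID afterwards. The `SessionRegistry` class is hypothetical, keeps its state in memory only, and takes the ID generator as a parameter so the sketch has no runtime dependencies.

```typescript
// Hypothetical registry: hands out one stable session ID per user.
// State lives in memory only; a real app would persist the mapping.
class SessionRegistry {
  private readonly sessions = new Map<string, string>();
  private readonly generateId: () => string;

  constructor(generateId: () => string) {
    this.generateId = generateId;
  }

  sessionFor(userId: string): string {
    const existing = this.sessions.get(userId);
    if (existing !== undefined) {
      return existing;
    }
    const created = this.generateId();
    this.sessions.set(userId, created);
    return created;
  }
}
```

In Node.js, the `randomUUID` function imported in the example above could be passed as the generator, for instance `new SessionRegistry(randomUUID)`.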
