
Violation of Expectations Chain

This page demonstrates how to use the ViolationOfExpectationsChain. This chain extracts insights from chat conversations by comparing an LLM's prediction of the user's next message (informed by its model of the user's mental state) against the message the user actually sends. It is intended to provide a form of reflection for long-term memory.

The ViolationOfExpectationsChain was implemented using the results of a paper by Plastic Labs. Their paper, Violation of Expectation via Metacognitive Prompting Reduces Theory of Mind Prediction Error in Large Language Models, can be found here.


```bash
npm install @langchain/openai @langchain/community
```

The example below features a chat between a human and an AI about a journal entry the user made.

```typescript
import { ViolationOfExpectationsChain } from "langchain/experimental/chains/violation_of_expectations";
import { ChatOpenAI, OpenAIEmbeddings } from "@langchain/openai";
import { HNSWLib } from "@langchain/community/vectorstores/hnswlib";
import { AIMessage, HumanMessage } from "@langchain/core/messages";

// Short GPT-generated conversation between a human and an AI.
const dummyMessages = [
  new HumanMessage(
    "I've been thinking about the importance of time with myself to discover my voice. I feel like 1-2 hours is never enough."
  ),
  new AIMessage(
    "The concept of 'adequate time' varies. Have you tried different formats of introspection, such as morning pages or long-form writing, to see if they make the process more efficient?"
  ),
  new HumanMessage(
    "I have tried journaling but never consistently. Sometimes it feels like writing doesn't capture everything."
  ),
  new AIMessage(
    "Writing has its limits. What about other mediums like digital art, or interactive journal apps with dynamic prompts that dig deeper? Even coding a personal project can be a form of self-discovery."
  ),
  new HumanMessage(
    "That's an interesting idea. I've never thought about coding as a form of self-discovery."
  ),
  new AIMessage(
    "Since you're comfortable with code, consider building a tool to log and analyze your emotional state, thoughts, or personal growth metrics. It merges skill with introspection, makes the data quantifiable."
  ),
  new HumanMessage(
    "The idea of quantifying emotions and personal growth is fascinating. But I wonder how much it can really capture the 'dark zone' within us."
  ),
  new AIMessage(
    "Good point. The 'dark zone' isn't fully quantifiable. But a tool could serve as a scaffold to explore those areas. It gives a structured approach to an unstructured problem."
  ),
  new HumanMessage(
    "You might be onto something. A structured approach could help unearth patterns or triggers I hadn't noticed."
  ),
  new AIMessage(
    "Exactly. It's about creating a framework to understand what can't easily be understood. Then you can allocate those 5+ hours more effectively, targeting areas that your data flags."
  ),
];

// Instantiate with a placeholder document to start, since we have no data yet.
const vectorStore = await HNSWLib.fromTexts(
  [" "],
  [{ id: 1 }],
  new OpenAIEmbeddings()
);
const retriever = vectorStore.asRetriever();

// Instantiate the LLM,
const llm = new ChatOpenAI({
  model: "gpt-4",
});

// And the chain.
const voeChain = ViolationOfExpectationsChain.fromLLM(llm, retriever);

// Requires an input key of "chat_history" with an array of messages.
const result = await voeChain.invoke({
  chat_history: dummyMessages,
});
```


Output:

```
result: [
  'The user has experience with coding and has tried journaling before, but struggles with maintaining consistency and fully expressing their thoughts and feelings through writing.',
  'The user shows a thoughtful approach towards new concepts and is willing to engage with and contemplate novel ideas before making a decision. They also consider time effectiveness as a crucial factor in their decision-making process.',
  'The user is curious and open-minded about new concepts, but also values introspection and self-discovery in understanding emotions and personal growth.',
  'The user is open to new ideas and strategies, specifically those that involve a structured approach to identifying patterns or triggers.',
  'The user may not always respond or engage with prompts, indicating a need for varied and adaptable communication strategies.'
]
```


Now let's go over everything the chain is doing, step by step.

Under the hood, the ViolationOfExpectationsChain performs four main steps:

Step 1. Predict the user's next message using only the chat history.

The LLM is tasked with generating three key pieces of information:

  • Concise reasoning about the user's internal mental state.
  • A prediction of how they will respond to the AI's most recent message.
  • A concise list of any additional insights that would be useful to improve the prediction.

Once the LLM response is returned, we query our retriever with each insight, mapping over all of them. From each result we extract the first retrieved document. Then, all retrieved documents and generated insights are sorted to remove duplicates, and returned.
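The post-processing described above can be sketched as a small helper. This is a simplified illustration (not the chain's actual source): it merges the LLM's generated insights with the contents of the documents retrieved for each one, then sorts and deduplicates so the same fact isn't passed downstream twice.

```typescript
// Combine generated insights with retrieved document contents,
// then sort and drop duplicates (a sorted array places identical
// strings next to each other, so the filter keeps only the first).
function combineInsights(
  generatedInsights: string[],
  retrievedDocContents: string[]
): string[] {
  const all = [...generatedInsights, ...retrievedDocContents];
  return all
    .sort()
    .filter((insight, i, arr) => i === 0 || insight !== arr[i - 1]);
}
```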

Step 2. Generate prediction violations.

Using the results from step 1, we query the LLM to generate the following:

  • How exactly was the original prediction violated? Which parts were wrong? State the exact differences.
  • If there were errors in the prediction, what were they, and why?

We pass the LLM our predicted response, the generated (along with any retrieved) insights from step 1, and the actual response from the user.

Once we have the difference between the predicted and actual response, we can move on to step 3.
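As a rough sketch of this step's input (the chain's real prompt templates live inside ViolationOfExpectationsChain and differ from this; the labels below are hypothetical), the LLM sees all three pieces of context side by side:

```typescript
// Hypothetical assembly of the violation-analysis input: the predicted
// response, the insights from step 1, and the user's actual response.
function buildViolationInput(
  predictedResponse: string,
  insights: string[],
  actualResponse: string
): string {
  return [
    `Predicted response: ${predictedResponse}`,
    `Key insights:\n${insights.map((i) => `- ${i}`).join("\n")}`,
    `Actual response: ${actualResponse}`,
  ].join("\n\n");
}
```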

Step 3. Regenerate the prediction.

Using the original prediction, the key insights, and the differences between the actual response and our prediction, we can generate a new, more accurate prediction. This prediction will help us in the next step to generate an insight that isn't just parts of the user's conversation repeated verbatim.

Step 4. Generate an insight.

Lastly, we prompt the LLM to generate one concise insight given the following context:

  • Ways in which our original prediction was violated.
  • Our revised prediction (step 3).
  • The actual response from the user.

Given these three data points, we prompt the LLM to return one fact relevant to the specific user response. A key point here is giving it the ways in which our original prediction was violated: this list contains the exact differences, and often specific facts themselves, between the predicted and actual response.

We perform these steps on every human message, so if you have a conversation with 10 messages (5 human, 5 AI), you'll get 5 insights. The list of messages is chunked by iterating over the entire chat history, stopping at each AI message and returning it along with all the messages that preceded it.
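The chunking described above can be sketched as a standalone function. The simplified message type here is for illustration only; the chain itself operates on LangChain's `BaseMessage` objects.

```typescript
// Minimal stand-in for a chat message, for illustration.
type Message = { role: "human" | "ai"; content: string };

// Walk the chat history and, at each AI message, emit a chunk containing
// that message plus everything that preceded it. Each chunk is the context
// used to predict (and later compare against) the next human message.
function chunkMessagesByAIMessage(messages: Message[]): Message[][] {
  const chunks: Message[][] = [];
  messages.forEach((message, i) => {
    if (message.role === "ai") {
      chunks.push(messages.slice(0, i + 1));
    }
  });
  return chunks;
}
```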

Once our .invoke({...}) method returns the array of insights, we can save them to our vector store. Later, we can retrieve them during future insight generations, or use them elsewhere, for example as context in a chatbot.
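One way to persist the insights is to convert them into Document-shaped objects, which LangChain vector stores accept via `addDocuments`. The helper below is a sketch: the `source` metadata value is an arbitrary label chosen here, not something the chain requires.

```typescript
// Wrap each insight string in a { pageContent, metadata } object,
// the shape LangChain's Document interface uses.
function insightsToDocuments(
  insights: string[]
): { pageContent: string; metadata: { source: string } }[] {
  return insights.map((insight) => ({
    pageContent: insight,
    metadata: { source: "violation_of_expectations" },
  }));
}
```

Assuming `result` holds the array of insights and `vectorStore` is the store from the example above, usage might look like `await vectorStore.addDocuments(insightsToDocuments(result));`.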
