
Quickstart

Overview​

We'll go over an example of how to design and implement an LLM-powered chatbot. Here are a few of the high-level components we'll be working with:

  • Chat Models. The chatbot interface is based around messages rather than raw text, and therefore is best suited to Chat Models rather than text LLMs. See here for a list of chat model integrations and here for documentation on the chat model interface in LangChain. You can use LLMs (see here) for chatbots as well, but chat models have a more conversational tone and natively support a message interface.
  • Prompt Templates, which simplify the process of assembling prompts that combine default messages, user input, chat history, and (optionally) additional retrieved context.
  • Chat History, which allows a chatbot to "remember" past interactions and take them into account when responding to followup questions. See here for more information.
  • Retrievers (optional), which are useful if you want to build a chatbot that can use domain-specific, up-to-date knowledge as context to augment its responses. See here for in-depth documentation on retrieval systems.

We'll cover how to fit the above components together to create a powerful conversational chatbot.

Quickstart​

We'll use OpenAI for this quickstart. Install the integration package and set an OPENAI_API_KEY environment variable:

npm install @langchain/openai
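For example, in a bash shell (replace the placeholder with your own key):

export OPENAI_API_KEY="your-api-key"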

Let’s initialize the chat model which will serve as the chatbot’s brain:

import { ChatOpenAI } from "@langchain/openai";

const chat = new ChatOpenAI({
  model: "gpt-3.5-turbo-1106",
  temperature: 0.2,
});

If we invoke our chat model, the output is an AIMessage:

import { HumanMessage } from "@langchain/core/messages";

await chat.invoke([
  new HumanMessage(
    "Translate this sentence from English to French: I love programming."
  ),
]);
AIMessage {
  content: "J'adore la programmation."
}

The model on its own does not have any concept of state. For example, if you ask a followup question:

await chat.invoke([new HumanMessage("What did you just say?")]);
AIMessage {
  content: 'I said, "What did you just say?"'
}

We can see that it doesn’t take the previous conversation turn into context, and cannot answer the question.

To get around this, we need to pass the entire conversation history into the model. Let’s see what happens when we do that:

import { AIMessage } from "@langchain/core/messages";

await chat.invoke([
  new HumanMessage(
    "Translate this sentence from English to French: I love programming."
  ),
  new AIMessage("J'adore la programmation."),
  new HumanMessage("What did you just say?"),
]);
AIMessage {
  content: `I said, "J'adore la programmation," which means "I love programming" in French.`
}

And now we can see that we get a good response!

This is the basic idea underpinning a chatbot’s ability to interact conversationally.
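In other words, as long as you pass the accumulated messages back in on every call, the model can respond as if it "remembers" the conversation. Here is a minimal sketch of that pattern using the chat model and message classes from above (the history array is our own illustration; we'll see a more convenient helper for this later):

import type { BaseMessage } from "@langchain/core/messages";

// Keep an array of messages, append each turn, and re-send the full history
// on every call so the model can see the prior conversation.
const history: BaseMessage[] = [];

history.push(
  new HumanMessage(
    "Translate this sentence from English to French: I love programming."
  )
);
history.push(await chat.invoke(history));

history.push(new HumanMessage("What did you just say?"));
console.log(await chat.invoke(history));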

Prompt templates​

Let’s define a prompt template to make formatting a bit easier. We can create a chain by piping it into the model:

import {
  ChatPromptTemplate,
  MessagesPlaceholder,
} from "@langchain/core/prompts";

const prompt = ChatPromptTemplate.fromMessages([
  [
    "system",
    "You are a helpful assistant. Answer all questions to the best of your ability.",
  ],
  new MessagesPlaceholder("messages"),
]);

const chain = prompt.pipe(chat);

The MessagesPlaceholder above inserts chat messages passed into the chain’s input as messages directly into the prompt. Then, we can invoke the chain like this:

await chain.invoke({
  messages: [
    new HumanMessage(
      "Translate this sentence from English to French: I love programming."
    ),
    new AIMessage("J'adore la programmation."),
    new HumanMessage("What did you just say?"),
  ],
});
AIMessage {
  content: `I said, "J'adore la programmation," which means "I love programming" in French.`
}

Message history​

As a shortcut for managing the chat history, we can use a MessageHistory class, which is responsible for saving and loading chat messages. There are many built-in message history integrations that persist messages to a variety of databases, but for this quickstart we'll use an in-memory, demo message history called ChatMessageHistory.

Here’s an example of using it directly:

import { ChatMessageHistory } from "langchain/stores/message/in_memory";

const demoEphemeralChatMessageHistory = new ChatMessageHistory();

await demoEphemeralChatMessageHistory.addMessage(new HumanMessage("hi!"));

await demoEphemeralChatMessageHistory.addMessage(new AIMessage("whats up?"));

await demoEphemeralChatMessageHistory.getMessages();
[
  HumanMessage {
    content: 'hi!'
  },
  AIMessage {
    content: 'whats up?'
  }
]

Once we do that, we can pass the stored messages directly into our chain as a parameter:

await demoEphemeralChatMessageHistory.addMessage(
  new HumanMessage(
    "Translate this sentence from English to French: I love programming."
  )
);

const responseMessage = await chain.invoke({
  messages: await demoEphemeralChatMessageHistory.getMessages(),
});

console.log(responseMessage);
AIMessage {
  content: 'The translation of "I love programming" in French is "J\'adore la programmation".'
}
await demoEphemeralChatMessageHistory.addMessage(responseMessage);

await demoEphemeralChatMessageHistory.addMessage(
  new HumanMessage("What did you just say?")
);

const responseMessage2 = await chain.invoke({
  messages: await demoEphemeralChatMessageHistory.getMessages(),
});

console.log(responseMessage2);
AIMessage {
  content: `I said, "J'adore la programmation," which means "I love programming" in French.`
}

And now we have a basic chatbot!
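
If you find yourself repeating the add-message/invoke/save-response steps, you can wrap them in a small helper. This is just a convenience sketch using the chain and message history defined above; the sendMessage name is our own:

// Hypothetical helper: save the user's message, invoke the chain with the
// full history, save the model's response, and return it.
async function sendMessage(text: string) {
  await demoEphemeralChatMessageHistory.addMessage(new HumanMessage(text));
  const response = await chain.invoke({
    messages: await demoEphemeralChatMessageHistory.getMessages(),
  });
  await demoEphemeralChatMessageHistory.addMessage(response);
  return response;
}

console.log(await sendMessage("And how would I say it in German?"));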

While this chain can serve as a useful chatbot on its own with just the model’s internal knowledge, it’s often useful to introduce some form of retrieval-augmented generation, or RAG for short, over domain-specific knowledge to make our chatbot more focused. We’ll cover this next.

Retrievers​

We can set up and use a Retriever to pull domain-specific knowledge for our chatbot. To show this, let’s expand the simple chatbot we created above to be able to answer questions about LangSmith.

We’ll use the LangSmith documentation as source material and store it in a vectorstore for later retrieval. Note that this example will gloss over some of the specifics around parsing and storing a data source - you can see more in-depth documentation on creating retrieval systems here.

Let’s set up our retriever. First, we’ll install some required dependencies:

npm install cheerio

Next, we’ll use a document loader to pull data from a webpage:

import { CheerioWebBaseLoader } from "langchain/document_loaders/web/cheerio";

const loader = new CheerioWebBaseLoader(
  "https://docs.smith.langchain.com/user_guide"
);

const rawDocs = await loader.load();

Next, we split it into smaller chunks that the LLM’s context window can handle:

import { RecursiveCharacterTextSplitter } from "langchain/text_splitter";

const textSplitter = new RecursiveCharacterTextSplitter({
  chunkSize: 500,
  chunkOverlap: 0,
});

const allSplits = await textSplitter.splitDocuments(rawDocs);

Then we embed and store those chunks in a vector database:

import { OpenAIEmbeddings } from "@langchain/openai";
import { MemoryVectorStore } from "langchain/vectorstores/memory";

const vectorstore = await MemoryVectorStore.fromDocuments(
  allSplits,
  new OpenAIEmbeddings()
);

And finally, let’s create a retriever from our initialized vectorstore:

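// Return the 4 most similar chunks for each query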
const retriever = vectorstore.asRetriever(4);

const docs = await retriever.invoke("how can langsmith help with testing?");

console.log(docs);
[
Document {
pageContent: 'You can also quickly edit examples and add them to datasets to expand the surface area of your evaluation sets or to fine-tune a model for improved quality or reduced costs.Monitoring​After all this, your app might finally ready to go in production. LangSmith can also be used to monitor your application in much the same way that you used for debugging. You can log all traces, visualize latency and token usage statistics, and troubleshoot specific issues as they arise. Each run can also be',
metadata: {
source: 'https://docs.smith.langchain.com/user_guide',
loc: [Object]
}
},
Document {
pageContent: 'inputs, and see what happens. At some point though, our application is performing\n' +
'well and we want to be more rigorous about testing changes. We can use a dataset\n' +
'that we’ve constructed along the way (see above). Alternatively, we could spend some\n' +
'time constructing a small dataset by hand. For these situations, LangSmith simplifies',
metadata: {
source: 'https://docs.smith.langchain.com/user_guide',
loc: [Object]
}
},
Document {
pageContent: 'chains and agents up the level where they are reliable enough to be used in production.Over the past two months, we at LangChain have been building and using LangSmith with the goal of bridging this gap. This is our tactical user guide to outline effective ways to use LangSmith and maximize its benefits.On by default​At LangChain, all of us have LangSmith’s tracing running in the background by default. On the Python side, this is achieved by setting environment variables, which we establish',
metadata: {
source: 'https://docs.smith.langchain.com/user_guide',
loc: [Object]
}
},
Document {
pageContent: 'has helped us debug bugs in formatting logic, unexpected transformations to user input, and straight up missing user input.To a much lesser extent, this is also true of the output of an LLM. Oftentimes the output of an LLM is technically a string but that string may contain some structure (json, yaml) that is intended to be parsed into a structured representation. Understanding what the exact output is can help determine if there may be a need for different parsing.LangSmith provides a',
metadata: {
source: 'https://docs.smith.langchain.com/user_guide',
loc: [Object]
}
}
]

We can see that invoking the retriever above results in some parts of the LangSmith docs that contain information about testing that our chatbot can use as context when answering questions.

Handling documents​

Let’s modify our previous prompt to accept documents as context. We’ll use a createStuffDocumentsChain helper function to “stuff” all of the input documents into the prompt, which also conveniently handles formatting. Other arguments (like messages) will be passed directly through into the prompt:

import { createStuffDocumentsChain } from "langchain/chains/combine_documents";

const questionAnsweringPrompt = ChatPromptTemplate.fromMessages([
  [
    "system",
    "Answer the user's questions based on the below context:\n\n{context}",
  ],
  new MessagesPlaceholder("messages"),
]);

const documentChain = await createStuffDocumentsChain({
  llm: chat,
  prompt: questionAnsweringPrompt,
});

We can invoke this documentChain with the raw documents we retrieved above:

const demoEphemeralChatMessageHistory2 = new ChatMessageHistory();

await demoEphemeralChatMessageHistory2.addMessage(
  new HumanMessage("how can langsmith help with testing?")
);

await documentChain.invoke({
  messages: await demoEphemeralChatMessageHistory2.getMessages(),
  context: docs,
});
LangSmith can help with testing in several ways. It allows you to quickly edit examples and add them to datasets, expanding the surface area of your evaluation sets. This can help fine-tune a model for improved quality or reduced costs. Additionally, LangSmith simplifies the construction of small datasets by hand, which can be useful for rigorous testing of changes. It also provides tracing and monitoring capabilities to visualize latency, log all traces, and troubleshoot specific issues as they arise, ensuring that your application is performing well and reliable enough for production use.

Awesome! We see an answer synthesized from information in the input documents.

Creating a retrieval chain​

Next, let’s integrate our retriever into the chain. Our retriever should retrieve information relevant to the last message we pass in from the user, so we extract it and use that as input to fetch relevant docs, which we add to the current chain as context. We pass context plus the previous messages into our document chain to generate a final answer.

We also use the RunnablePassthrough.assign() method to pass intermediate steps through at each invocation. Here’s what it looks like:

import type { BaseMessage } from "@langchain/core/messages";
import {
  RunnablePassthrough,
  RunnableSequence,
} from "@langchain/core/runnables";

const parseRetrieverInput = (params: { messages: BaseMessage[] }) => {
  return params.messages[params.messages.length - 1].content;
};

const retrievalChain = RunnablePassthrough.assign({
  context: RunnableSequence.from([parseRetrieverInput, retriever]),
}).assign({
  answer: documentChain,
});

const response3 = await retrievalChain.invoke({
  messages: await demoEphemeralChatMessageHistory2.getMessages(),
});

console.log(response3);
{
messages: [
HumanMessage {
content: 'how can langsmith help with testing?'
}
],
context: [
Document {
pageContent: 'You can also quickly edit examples and add them to datasets to expand the surface area of your evaluation sets or to fine-tune a model for improved quality or reduced costs.Monitoring​After all this, your app might finally ready to go in production. LangSmith can also be used to monitor your application in much the same way that you used for debugging. You can log all traces, visualize latency and token usage statistics, and troubleshoot specific issues as they arise. Each run can also be',
metadata: [Object]
},
Document {
pageContent: 'inputs, and see what happens. At some point though, our application is performing\n' +
'well and we want to be more rigorous about testing changes. We can use a dataset\n' +
'that we’ve constructed along the way (see above). Alternatively, we could spend some\n' +
'time constructing a small dataset by hand. For these situations, LangSmith simplifies',
metadata: [Object]
},
Document {
pageContent: 'chains and agents up the level where they are reliable enough to be used in production.Over the past two months, we at LangChain have been building and using LangSmith with the goal of bridging this gap. This is our tactical user guide to outline effective ways to use LangSmith and maximize its benefits.On by default​At LangChain, all of us have LangSmith’s tracing running in the background by default. On the Python side, this is achieved by setting environment variables, which we establish',
metadata: [Object]
},
Document {
pageContent: 'has helped us debug bugs in formatting logic, unexpected transformations to user input, and straight up missing user input.To a much lesser extent, this is also true of the output of an LLM. Oftentimes the output of an LLM is technically a string but that string may contain some structure (json, yaml) that is intended to be parsed into a structured representation. Understanding what the exact output is can help determine if there may be a need for different parsing.LangSmith provides a',
metadata: [Object]
}
],
answer: 'LangSmith can help with testing by allowing you to quickly edit examples and add them to datasets, expanding the surface area of your evaluation sets. This can help in fine-tuning a model for improved quality or reduced costs. Additionally, LangSmith simplifies the process of constructing small datasets by hand, which can be useful for rigorous testing of changes in your application. It also facilitates monitoring of your application, allowing you to log all traces, visualize latency and token usage statistics, and troubleshoot specific issues as they arise.'
}
await demoEphemeralChatMessageHistory2.addMessage(
  new AIMessage(response3.answer)
);

await demoEphemeralChatMessageHistory2.addMessage(
  new HumanMessage("tell me more about that!")
);

await retrievalChain.invoke({
  messages: await demoEphemeralChatMessageHistory2.getMessages(),
});
{
messages: [
HumanMessage {
content: 'how can langsmith help with testing?'
},
AIMessage {
content: 'LangSmith can help with testing by allowing you to quickly edit examples and add them to datasets, expanding the surface area of your evaluation sets. This can help in fine-tuning a model for improved quality or reduced costs. Additionally, LangSmith simplifies the process of constructing small datasets by hand, which can be useful for rigorous testing of changes in your application. It also facilitates monitoring of your application, allowing you to log all traces, visualize latency and token usage statistics, and troubleshoot specific issues as they arise.'
},
HumanMessage {
content: 'tell me more about that!'
}
],
context: [
Document { pageContent: 'shadowRing,', metadata: [Object] },
Document {
pageContent: 'however, there is still no complete substitute for human review to get the utmost quality and reliability from your application.',
metadata: [Object]
},
Document {
pageContent: 'You can also quickly edit examples and add them to datasets to expand the surface area of your evaluation sets or to fine-tune a model for improved quality or reduced costs.Monitoring​After all this, your app might finally ready to go in production. LangSmith can also be used to monitor your application in much the same way that you used for debugging. You can log all traces, visualize latency and token usage statistics, and troubleshoot specific issues as they arise. Each run can also be',
metadata: [Object]
},
Document {
pageContent: 'whenever we launch a virtual environment or open our bash shell and leave them set. The same principle applies to most JavaScript environments through process.env1.The benefit here is that all calls to LLMs, chains, agents, tools, and retrievers are logged to LangSmith. Around 90% of the time we don’t even look at the traces, but the 10% of the time that we do… it’s so helpful. We can use LangSmith to debug:An unexpected end resultWhy an agent is loopingWhy a chain was slower than expectedHow',
metadata: [Object]
}
],
answer: 'LangSmith provides a platform for monitoring and debugging your application during testing. It allows you to log all traces, visualize latency and token usage statistics, and troubleshoot specific issues as they arise. This can be particularly helpful in identifying and addressing unexpected end results, looping agents, slower-than-expected chains, and other issues that may arise during testing. By using LangSmith, you can gain insights into the performance and behavior of your application, enabling you to make necessary adjustments and improvements for a more reliable and high-quality application.'
}

Nice! Our chatbot can now answer domain-specific questions in a conversational way.

As an aside, if you don’t want to return all the intermediate steps, you can define your retrieval chain like this, piping directly into the document chain instead of using the final .assign() call:

const retrievalChainWithOnlyAnswer = RunnablePassthrough.assign({
  context: RunnableSequence.from([parseRetrieverInput, retriever]),
}).pipe(documentChain);

await retrievalChainWithOnlyAnswer.invoke({
  messages: await demoEphemeralChatMessageHistory2.getMessages(),
});
LangSmith provides the capability to monitor your application by logging all traces, visualizing latency and token usage statistics, and troubleshooting specific issues as they arise. This monitoring feature allows you to track the performance of your application and identify any unexpected behavior or issues. Additionally, LangSmith can be used to debug various scenarios, such as unexpected end results, looping agents, or slower-than-expected chains, providing valuable insights and assistance in optimizing the performance and reliability of your application.

Query transformation​

There’s one more optimization we’ll cover here. In the above example, when we asked the followup question, tell me more about that!, you might notice that the retrieved docs don’t directly include information about testing. This is because we’re passing tell me more about that! verbatim as a query to the retriever. The output of the retrieval chain is still okay, because the document chain can generate an answer based on the chat history, but we could be retrieving richer and more informative documents:

await retriever.invoke("how can langsmith help with testing?");
[
Document {
pageContent: 'You can also quickly edit examples and add them to datasets to expand the surface area of your evaluation sets or to fine-tune a model for improved quality or reduced costs.Monitoring​After all this, your app might finally ready to go in production. LangSmith can also be used to monitor your application in much the same way that you used for debugging. You can log all traces, visualize latency and token usage statistics, and troubleshoot specific issues as they arise. Each run can also be',
metadata: {
source: 'https://docs.smith.langchain.com/user_guide',
loc: [Object]
}
},
Document {
pageContent: 'inputs, and see what happens. At some point though, our application is performing\n' +
'well and we want to be more rigorous about testing changes. We can use a dataset\n' +
'that we’ve constructed along the way (see above). Alternatively, we could spend some\n' +
'time constructing a small dataset by hand. For these situations, LangSmith simplifies',
metadata: {
source: 'https://docs.smith.langchain.com/user_guide',
loc: [Object]
}
},
Document {
pageContent: 'chains and agents up the level where they are reliable enough to be used in production.Over the past two months, we at LangChain have been building and using LangSmith with the goal of bridging this gap. This is our tactical user guide to outline effective ways to use LangSmith and maximize its benefits.On by default​At LangChain, all of us have LangSmith’s tracing running in the background by default. On the Python side, this is achieved by setting environment variables, which we establish',
metadata: {
source: 'https://docs.smith.langchain.com/user_guide',
loc: [Object]
}
},
Document {
pageContent: 'has helped us debug bugs in formatting logic, unexpected transformations to user input, and straight up missing user input.To a much lesser extent, this is also true of the output of an LLM. Oftentimes the output of an LLM is technically a string but that string may contain some structure (json, yaml) that is intended to be parsed into a structured representation. Understanding what the exact output is can help determine if there may be a need for different parsing.LangSmith provides a',
metadata: {
source: 'https://docs.smith.langchain.com/user_guide',
loc: [Object]
}
}
]
await retriever.invoke("tell me more about that!");
[
Document {
pageContent: 'shadowRing,',
metadata: {
source: 'https://docs.smith.langchain.com/user_guide',
loc: [Object]
}
},
Document {
pageContent: 'however, there is still no complete substitute for human review to get the utmost quality and reliability from your application.',
metadata: {
source: 'https://docs.smith.langchain.com/user_guide',
loc: [Object]
}
},
Document {
pageContent: 'You can also quickly edit examples and add them to datasets to expand the surface area of your evaluation sets or to fine-tune a model for improved quality or reduced costs.Monitoring​After all this, your app might finally ready to go in production. LangSmith can also be used to monitor your application in much the same way that you used for debugging. You can log all traces, visualize latency and token usage statistics, and troubleshoot specific issues as they arise. Each run can also be',
metadata: {
source: 'https://docs.smith.langchain.com/user_guide',
loc: [Object]
}
},
Document {
pageContent: 'whenever we launch a virtual environment or open our bash shell and leave them set. The same principle applies to most JavaScript environments through process.env1.The benefit here is that all calls to LLMs, chains, agents, tools, and retrievers are logged to LangSmith. Around 90% of the time we don’t even look at the traces, but the 10% of the time that we do… it’s so helpful. We can use LangSmith to debug:An unexpected end resultWhy an agent is loopingWhy a chain was slower than expectedHow',
metadata: {
source: 'https://docs.smith.langchain.com/user_guide',
loc: [Object]
}
}
]

To get around this common problem, let’s add a "query transformation" step that rephrases the input to remove references to previous conversation turns. We’ll wrap our old retriever as follows:

import { RunnableBranch } from "@langchain/core/runnables";
import { StringOutputParser } from "@langchain/core/output_parsers";

const queryTransformPrompt = ChatPromptTemplate.fromMessages([
  new MessagesPlaceholder("messages"),
  [
    "user",
    "Given the above conversation, generate a search query to look up in order to get information relevant to the conversation. Only respond with the query, nothing else.",
  ],
]);

const queryTransformingRetrieverChain = RunnableBranch.from([
  [
    // If there is only a single message (no prior history), pass it straight
    // to the retriever as-is
    (params: { messages: BaseMessage[] }) => params.messages.length === 1,
    RunnableSequence.from([parseRetrieverInput, retriever]),
  ],
  // Otherwise, rephrase the conversation into a standalone search query first
  queryTransformPrompt
    .pipe(chat)
    .pipe(new StringOutputParser())
    .pipe(retriever),
]).withConfig({ runName: "chat_retriever_chain" });

Above, we handle initial queries by passing them directly to the retriever as before, but we handle followup questions by rephrasing them according to a prompt. This removes references to chat history, which the retriever is unaware of.
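
If you’d like to see what the rephrased query looks like on its own, you can run just the transform portion of the branch directly. This is a sketch using the prompt, model, and chat history defined above; the exact wording of the generated query will vary:

// Run only the query-transformation step to inspect the standalone search
// query generated from the conversation so far.
const standaloneQuery = await queryTransformPrompt
  .pipe(chat)
  .pipe(new StringOutputParser())
  .invoke({ messages: await demoEphemeralChatMessageHistory2.getMessages() });

console.log(standaloneQuery);
// e.g. something like "LangSmith testing features"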

Now let’s recreate our earlier chain with this new queryTransformingRetrieverChain. Note that this new chain accepts an object as input and parses a string to pass to the retriever, so we don’t have to do additional parsing at the top level:

const conversationalRetrievalChain = RunnablePassthrough.assign({
  context: queryTransformingRetrieverChain,
}).assign({
  answer: documentChain,
});

const demoEphemeralChatMessageHistory3 = new ChatMessageHistory();

And finally, let’s invoke it!

await demoEphemeralChatMessageHistory3.addMessage(
  new HumanMessage("how can langsmith help with testing?")
);

const response4 = await conversationalRetrievalChain.invoke({
  messages: await demoEphemeralChatMessageHistory3.getMessages(),
});

console.log(response4);
{
messages: [
HumanMessage {
content: 'how can langsmith help with testing?',
}
],
context: [
Document {
pageContent: 'You can also quickly edit examples and add them to datasets to expand the surface area of your evaluation sets or to fine-tune a model for improved quality or reduced costs.Monitoring​After all this, your app might finally ready to go in production. LangSmith can also be used to monitor your application in much the same way that you used for debugging. You can log all traces, visualize latency and token usage statistics, and troubleshoot specific issues as they arise. Each run can also be',
metadata: [Object]
},
Document {
pageContent: 'inputs, and see what happens. At some point though, our application is performing\n' +
'well and we want to be more rigorous about testing changes. We can use a dataset\n' +
'that we’ve constructed along the way (see above). Alternatively, we could spend some\n' +
'time constructing a small dataset by hand. For these situations, LangSmith simplifies',
metadata: [Object]
},
Document {
pageContent: 'chains and agents up the level where they are reliable enough to be used in production.Over the past two months, we at LangChain have been building and using LangSmith with the goal of bridging this gap. This is our tactical user guide to outline effective ways to use LangSmith and maximize its benefits.On by default​At LangChain, all of us have LangSmith’s tracing running in the background by default. On the Python side, this is achieved by setting environment variables, which we establish',
metadata: [Object]
},
Document {
pageContent: 'has helped us debug bugs in formatting logic, unexpected transformations to user input, and straight up missing user input.To a much lesser extent, this is also true of the output of an LLM. Oftentimes the output of an LLM is technically a string but that string may contain some structure (json, yaml) that is intended to be parsed into a structured representation. Understanding what the exact output is can help determine if there may be a need for different parsing.LangSmith provides a',
metadata: [Object]
}
],
answer: 'LangSmith can help with testing in several ways. It allows you to quickly edit examples and add them to datasets, which expands the surface area of your evaluation sets. This can be useful for fine-tuning a model for improved quality or reduced costs. Additionally, LangSmith simplifies the process of constructing small datasets by hand, which can be valuable for rigorous testing of changes. It also provides tracing capabilities to monitor your application, visualize latency and token usage statistics, and troubleshoot specific issues as they arise. Overall, LangSmith helps in testing by facilitating dataset construction, monitoring, and debugging.'
}
await demoEphemeralChatMessageHistory3.addMessage(
  new AIMessage(response4.answer)
);

await demoEphemeralChatMessageHistory3.addMessage(
  new HumanMessage("tell me more about that!")
);

await conversationalRetrievalChain.invoke({
  messages: await demoEphemeralChatMessageHistory3.getMessages(),
});
{
messages: [
HumanMessage {
content: 'how can langsmith help with testing?'
},
AIMessage {
content: 'LangSmith can help with testing in several ways. It allows you to quickly edit examples and add them to datasets, which expands the surface area of your evaluation sets. This can be useful for fine-tuning a model for improved quality or reduced costs. Additionally, LangSmith simplifies the process of constructing small datasets by hand, which can be valuable for rigorous testing of changes. It also provides tracing capabilities to monitor your application, visualize latency and token usage statistics, and troubleshoot specific issues as they arise. Overall, LangSmith helps in testing by facilitating dataset construction, monitoring, and debugging.'
},
HumanMessage {
content: 'tell me more about that!'
}
],
context: [
Document {
pageContent: 'You can also quickly edit examples and add them to datasets to expand the surface area of your evaluation sets or to fine-tune a model for improved quality or reduced costs.Monitoring​After all this, your app might finally ready to go in production. LangSmith can also be used to monitor your application in much the same way that you used for debugging. You can log all traces, visualize latency and token usage statistics, and troubleshoot specific issues as they arise. Each run can also be',
metadata: [Object]
},
Document {
pageContent: 'LangSmith makes it easy to manually review and annotate runs through annotation queues.These queues allow you to select any runs based on criteria like model type or automatic evaluation scores, and queue them up for human review. As a reviewer, you can then quickly step through the runs, viewing the input, output, and any existing tags before adding your own feedback.We often use this for a couple of reasons:To assess subjective qualities that automatic evaluators struggle with, like',
metadata: [Object]
},
Document {
pageContent: 'inputs, and see what happens. At some point though, our application is performing\n' +
'well and we want to be more rigorous about testing changes. We can use a dataset\n' +
'that we’ve constructed along the way (see above). Alternatively, we could spend some\n' +
'time constructing a small dataset by hand. For these situations, LangSmith simplifies',
metadata: [Object]
},
Document {
pageContent: 'datasets​LangSmith makes it easy to curate datasets. However, these aren’t just useful inside LangSmith; they can be exported for use in other contexts. Notable applications include exporting for use in OpenAI Evals or fine-tuning, such as with FireworksAI.To set up tracing in Deno, web browsers, or other runtime environments without access to the environment, check out the FAQs.↩PreviousLangSmithNextTracingOn by defaultDebuggingWhat was the exact input to the LLM?If I edit the prompt, how does',
metadata: [Object]
}
],
answer: 'Certainly! LangSmith simplifies the process of constructing datasets by allowing you to quickly edit examples and add them to datasets. This is valuable for expanding the surface area of your evaluation sets, which can lead to improved model quality or reduced costs. Additionally, LangSmith provides tracing capabilities, allowing you to monitor your application, visualize latency and token usage statistics, and troubleshoot specific issues as they arise. This monitoring functionality helps ensure that your application is performing well and allows for rigorous testing of changes. Furthermore, LangSmith enables the curation of datasets, which can be exported for use in other contexts, such as OpenAI Evals or fine-tuning with platforms like FireworksAI. Overall, LangSmith offers a comprehensive set of tools for testing, monitoring, and dataset management.'
}

To help you understand what’s happening internally, this LangSmith trace shows the first invocation. You can see that the user’s initial query is passed directly to the retriever, which returns suitable docs.

The invocation for the followup question, illustrated by this LangSmith trace, rephrases the user’s followup into a standalone query about testing with LangSmith, resulting in higher-quality retrieved docs.

And we now have a chatbot capable of conversational retrieval!

Next steps​

You now know how to build a conversational chatbot that can integrate past messages and domain-specific knowledge into its generations. There are many other optimizations you can make around this - check out the following pages for more information:

  • Memory management: This includes a guide on automatically updating chat history, as well as trimming, summarizing, or otherwise modifying long conversations to keep your bot focused (a minimal trimming sketch follows this list).
  • Retrieval: A deeper dive into using different types of retrieval with your chatbot.
  • Tool usage: How to allow your chatbot to use tools that interact with other APIs and systems.
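
As a quick preview of the memory-management idea, one simple approach is to trim the history to the most recent turns before invoking the chain. Here is a minimal sketch using the objects defined above; the cutoff of 10 messages is arbitrary:

// Hypothetical trimming step: only pass the last 10 messages to the chain so
// very long conversations don't overflow the model's context window.
const allMessages = await demoEphemeralChatMessageHistory.getMessages();
const recentMessages = allMessages.slice(-10);

await chain.invoke({ messages: recentMessages });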
