Docx files

The DocxLoader extracts text from Microsoft Word documents. It supports both the modern .docx format and the legacy .doc format. Each format requires its own parsing dependency, listed below.

Setup

To use DocxLoader, you'll need the @langchain/community integration package along with either the mammoth or the word-extractor package, depending on the file format:

mammoth: For processing .docx files.
word-extractor: For handling .doc files.

Installation

For `.docx` Files

npm:

npm install @langchain/community @langchain/core mammoth

Yarn:

yarn add @langchain/community @langchain/core mammoth

pnpm:

pnpm add @langchain/community @langchain/core mammoth

For `.doc` Files

npm:

npm install @langchain/community @langchain/core word-extractor

Yarn:

yarn add @langchain/community @langchain/core word-extractor

pnpm:

pnpm add @langchain/community @langchain/core word-extractor

Usage

Loading `.docx` Files

For .docx files, pass only the file path when initializing the loader; no additional parameters are needed:

import { DocxLoader } from "@langchain/community/document_loaders/fs/docx";

const loader = new DocxLoader(
  "src/document_loaders/tests/example_data/attention.docx"
);

const docs = await loader.load();
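
Once loaded, the result can be inspected like any other list of documents. The following is a minimal sketch, assuming each returned document exposes a `pageContent` string and a `metadata` object. The `LoadedDoc` interface and the `summarizeDocs` helper are hypothetical names introduced here for illustration and are not part of the library.

```typescript
// Minimal structural type mirroring the two fields this sketch reads.
// Assumption: documents returned by the loader have this shape.
interface LoadedDoc {
  pageContent: string;
  metadata: Record<string, unknown>;
}

// Hypothetical helper: builds a one-line summary per document, with the
// original character count and a whitespace-collapsed preview of the text.
function summarizeDocs(docs: LoadedDoc[], previewLength = 40): string[] {
  return docs.map((doc, i) => {
    const text = doc.pageContent.replace(/\s+/g, " ").trim();
    const preview =
      text.length > previewLength ? text.slice(0, previewLength) + "..." : text;
    return `#${i} (${doc.pageContent.length} chars): ${preview}`;
  });
}
```

For example, `summarizeDocs(docs).forEach((line) => console.log(line));` would print one summary line per loaded document.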

Loading `.doc` Files

For .doc files, you must pass an options object with type set to "doc" when initializing the loader:

import { DocxLoader } from "@langchain/community/document_loaders/fs/docx";

const loader = new DocxLoader(
  "src/document_loaders/tests/example_data/attention.doc",
  {
    type: "doc",
  }
);

const docs = await loader.load();
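
When a directory mixes both formats, the choice of options can be derived from the file extension. The following is a rough sketch of that idea, not part of the library: `loaderOptionsFor` and `WordLoaderOptions` are hypothetical names, and the mapping simply follows the two cases described above (no options for .docx, `type: "doc"` for .doc).

```typescript
// Hypothetical options type covering only the two cases described above:
// undefined means "pass no options" (.docx), { type: "doc" } is for .doc.
type WordLoaderOptions = { type: "doc" } | undefined;

// Hypothetical helper: picks loader options from the file extension.
function loaderOptionsFor(filePath: string): WordLoaderOptions {
  const lower = filePath.toLowerCase();
  if (lower.endsWith(".docx")) {
    return undefined; // .docx needs no explicit options
  }
  if (lower.endsWith(".doc")) {
    return { type: "doc" }; // legacy format must be declared explicitly
  }
  throw new Error(`Unsupported Word file extension: ${filePath}`);
}
```

A caller could then construct the loader with only the path when the helper returns `undefined`, and with the path plus the returned options otherwise. Files with any other extension are rejected early instead of being passed to a parser that cannot read them.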
