Manage Documents

In a knowledge base, each imported item—whether a local file, a Notion page, or a web page—becomes a document. From the document list, you can view and manage all these documents to keep your knowledge accurate, relevant, and up-to-date.
Click the knowledge base name at the top to quickly switch between knowledge bases.
Manage Knowledge Documents
Action | Description
Add | Import a new document.
Modify Chunk Settings | Modify a document’s chunking settings (excluding the chunk structure).
Each document can have its own chunking settings, while the chunk structure is shared across the knowledge base and cannot be changed once set.
Delete | Permanently remove a document. Deletion cannot be undone.
Enable / Disable | Temporarily include or exclude a document from retrieval.
On Dify Cloud, documents that have not been updated or retrieved for a certain period are automatically disabled to optimize performance.
The inactivity period varies by subscription plan:
  • Sandbox: 7 days
  • Professional & Team: 30 days
Professional and Team users can re-enable these documents with one click.
Archive / Unarchive | Archive a document that you no longer need for retrieval but still want to keep. Archived documents are read-only and can be unarchived at any time.
Edit | Modify the content of a document by editing its chunks. See Manage Chunks for details.
Rename | Change the name of a document.
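
The auto-disable rule for Dify Cloud described in the table above can be pictured as a simple date comparison. The following is a minimal sketch, not Dify's implementation: the function name, the plan keys, and the choice to treat the later of "last updated" and "last retrieved" as the last activity are all assumptions made for illustration.

```python
from datetime import datetime, timedelta

# Inactivity periods taken from the plan list above.
# The dictionary keys are hypothetical labels, not Dify identifiers.
INACTIVITY_DAYS = {"sandbox": 7, "professional": 30, "team": 30}


def is_auto_disabled(plan, last_updated, last_retrieved, now):
    """Return True if the document has been inactive longer than the plan allows.

    Assumption: a document counts as inactive only when it has been
    neither updated nor retrieved within the period.
    """
    last_activity = max(last_updated, last_retrieved)
    return now - last_activity > timedelta(days=INACTIVITY_DAYS[plan])
```

Under these assumptions, a document last retrieved 11 days ago would be disabled on Sandbox but would stay enabled on Professional or Team.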

Manage Chunks

Every document is split into content chunks, the basic units for retrieval, according to its chunk settings. From the chunk list within a document, you can view and manage all of its chunks to improve retrieval efficiency and accuracy.
Click the document name in the upper-left corner to quickly switch between documents.
Manage Knowledge Chunks
Action | Description
Add | Add a single chunk or batch-add multiple chunks.
For documents chunked with Parent-child mode, both new parent and child chunks can be added.
Adding chunks is a paid feature on Dify Cloud. Upgrade to Professional or Team to use it.
Delete | Permanently remove a chunk. Deletion cannot be undone.
Enable / Disable | Temporarily include or exclude a chunk from retrieval. Disabled chunks cannot be edited.
Edit | Modify the content of a chunk. Edited chunks are marked Edited.
For documents chunked with Parent-child mode:
  • When editing a parent chunk, you can choose to regenerate its child chunks or keep them unchanged.
  • Editing a child chunk does not update its parent chunk.
When images in documents are extracted as chunk attachments, their URLs remain in the chunk text. Deleting these URLs won’t affect the extracted image attachments.
Add / Edit / Delete Keywords | In knowledge bases using the Economical index method, you can add or modify keywords for each chunk to improve its retrievability.
Each chunk can have up to 10 keywords.
Add / Delete Image Attachments | Delete images extracted from documents or upload new ones within their corresponding chunk.
Image attachments and their chunks can be edited independently without affecting each other.
Each chunk can have up to 10 image attachments, which are returned alongside it during retrieval; images beyond this limit will not be extracted.
For self-hosted deployments, you can adjust this limit via the environment variable SINGLE_CHUNK_ATTACHMENT_LIMIT.
To enable cross-modal retrieval, which retrieves both text and images based on semantic relevance, choose a multimodal embedding model (indicated by the Vision icon) for the knowledge base.
Image attachments will then be embedded and indexed for retrieval.
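
To make the per-chunk attachment limit concrete, here is a small sketch of how a configurable cap with a default of 10 might behave. Only the variable name SINGLE_CHUNK_ATTACHMENT_LIMIT comes from this page; the helper functions, the integer parsing, and the assumption that images are kept in document order are hypothetical and may differ from how Dify actually applies the limit.

```python
import os

DEFAULT_LIMIT = 10  # per-chunk limit stated above


def attachment_limit(env=os.environ):
    """Read SINGLE_CHUNK_ATTACHMENT_LIMIT, falling back to the default of 10.

    Assumption: the value is a plain integer string.
    """
    return int(env.get("SINGLE_CHUNK_ATTACHMENT_LIMIT", DEFAULT_LIMIT))


def extract_attachments(image_urls, limit):
    """Keep the first `limit` images; images beyond the limit are not extracted.

    Assumption: images are considered in the order they appear in the chunk.
    """
    return image_urls[:limit]
```

With the variable unset, a chunk that references 12 images would keep the first 10 under this sketch; with the variable set to 3, only the first 3.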

Best Practices

Check Chunk Quality

After a document is chunked, carefully review each chunk to ensure it’s semantically complete and appropriately sized for optimal retrieval accuracy and response relevance. Common issues to watch for:
  • Chunks are too short: they may lack sufficient context, leading to semantic loss and inaccurate answers.
  • Chunks are too long: they may include irrelevant information, introducing semantic noise and lowering retrieval precision.
  • Chunks are semantically incomplete: forced chunking that cuts through sentences or paragraphs can result in missing or misleading content during retrieval.
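
A manual review can be supplemented with a rough automated pass over exported chunk text. The sketch below flags the three issues listed above using simple heuristics. The function name, the character thresholds, and the end-of-sentence check are illustrative assumptions rather than Dify features, and the thresholds should be tuned to your own content.

```python
def check_chunk(text, min_chars=100, max_chars=2000):
    """Return a list of possible quality issues for one chunk.

    The thresholds are arbitrary illustrative defaults.
    """
    issues = []
    stripped = text.strip()
    if len(stripped) < min_chars:
        issues.append("too short")
    if len(stripped) > max_chars:
        issues.append("too long")
    # Heuristic: a chunk that does not end with closing punctuation
    # may have been cut in the middle of a sentence.
    if not stripped.endswith((".", "!", "?", '"', "'", ")")):
        issues.append("may end mid-sentence")
    return issues
```

A heuristic like this only narrows down which chunks to inspect first; whether a chunk is semantically complete still needs a human read.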

Use Child Chunks as Retrieval Hooks for Parent Chunks

For documents chunked with Parent-child mode, the system searches across child chunks but returns the parent chunks. Since editing a child chunk does not update its parent, you can treat child chunks as semantic tags or retrieval hints for their parent chunks. To do this, rewrite child chunks into keywords, summaries, or common user queries. For example, if a parent chunk covers the full Return Policy, you could rephrase its child chunks as:
  • How do I return an item?
  • What’s the refund period?
  • Are there any return shipping fees?
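
The idea of child chunks acting as retrieval hooks can be illustrated with a toy retriever that scores the query against child chunks and returns the text of the matching parent. This is a sketch only: it uses plain word overlap as a stand-in for the real retrieval method, and the function names and data layout are hypothetical.

```python
import re


def tokens(text):
    """Lowercase word set; a crude stand-in for embedding-based similarity."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve_parent(query, parents, children):
    """Score child chunks against the query and return the best child's parent text.

    parents:  {parent_id: parent_text}
    children: list of (parent_id, child_text) pairs
    Returns None when no child chunk shares a word with the query.
    """
    query_tokens = tokens(query)
    best_parent_id, best_score = None, 0
    for parent_id, child_text in children:
        score = len(query_tokens & tokens(child_text))
        if score > best_score:
            best_parent_id, best_score = parent_id, score
    return parents.get(best_parent_id)
```

With the three example questions above stored as child chunks of a Return Policy parent, a query such as "how can I return my item" matches the first child and the full parent chunk is what gets returned.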