GET /datasets/{dataset_id}/documents/{document_id}
{
  "id": "a8e0e5b5-78c6-4130-a5ce-25feb0e0b4ac",
  "position": 1,
  "data_source_type": "upload_file",
  "data_source_info": {
    "upload_file_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890"
  },
  "dataset_process_rule_id": "e1f2a3b4-c5d6-7890-ef12-345678901234",
  "dataset_process_rule": {
    "id": "e1f2a3b4-c5d6-7890-ef12-345678901234",
    "mode": "custom"
  },
  "document_process_rule": {
    "mode": "custom",
    "rules": {
      "pre_processing_rules": [],
      "segmentation": {
        "separator": "###",
        "max_tokens": 500,
        "chunk_overlap": 50
      }
    }
  },
  "name": "guide.txt",
  "created_from": "api",
  "created_by": "ad313dd6-ef04-4dd1-a5b0-c0f0b9e2e7e4",
  "created_at": 1741267200,
  "tokens": 512,
  "indexing_status": "completed",
  "error": null,
  "enabled": true,
  "disabled_at": null,
  "disabled_by": null,
  "archived": false,
  "display_status": "available",
  "word_count": 350,
  "hit_count": 0,
  "doc_form": "text_model",
  "doc_language": "English",
  "doc_type": null,
  "doc_metadata": [],
  "completed_at": 1741267260,
  "updated_at": 1741267260,
  "indexing_latency": 60,
  "segment_count": 5,
  "average_segment_length": 70,
  "summary_index_status": null,
  "need_summary": false
}

Authorizations

Authorization
string
header
required

API key authentication. For every API request, include your API key in the Authorization HTTP header, prefixed with "Bearer " (note the trailing space). Example: Authorization: Bearer {API_KEY}. We strongly recommend storing your API key on the server side and never sharing or storing it on the client side, because a leaked API key can have serious consequences.
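
As a minimal sketch of the header format described above, the following builds (without sending) a GET request using only the Python standard library. The function name authorized_get, the host example.invalid, and the placeholder IDs and key are illustrative assumptions, not part of the API.

```python
import urllib.request


def authorized_get(url: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a GET request carrying the Bearer header."""
    if not api_key:
        raise ValueError("api_key must be a non-empty string")
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


# Hypothetical URL and key, for illustration only.
request = authorized_get(
    "https://example.invalid/v1/datasets/DATASET_ID/documents/DOCUMENT_ID",
    api_key="YOUR_API_KEY",
)
print(request.get_header("Authorization"))
```

In a real service the key would typically be read from server-side configuration (for example an environment variable) rather than hard-coded.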

Path Parameters

dataset_id
string<uuid>
required

Knowledge base ID.

document_id
string<uuid>
required

Document ID.

Query Parameters

metadata
enum<string>
default:all

all returns every field, including metadata. only returns just id, doc_type, and doc_metadata. without returns every field except doc_type and doc_metadata.

Available options:
all,
only,
without
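
The path and query parameters above can be assembled into a request URL roughly as follows. This is a sketch under stated assumptions: the helper name document_url and the base URL are hypothetical, and the client-side UUID and enum checks are a convenience, not something the API requires.

```python
import uuid
from urllib.parse import urlencode

# The three values listed under "Available options".
METADATA_MODES = ("all", "only", "without")


def document_url(base_url: str, dataset_id: str, document_id: str,
                 metadata: str = "all") -> str:
    """Return the GET URL for one document, validating inputs locally."""
    if metadata not in METADATA_MODES:
        raise ValueError(f"metadata must be one of {METADATA_MODES}")
    # uuid.UUID raises ValueError if the string is not a well-formed UUID.
    dataset_id = str(uuid.UUID(dataset_id))
    document_id = str(uuid.UUID(document_id))
    path = f"{base_url.rstrip('/')}/datasets/{dataset_id}/documents/{document_id}"
    return f"{path}?{urlencode({'metadata': metadata})}"


# Hypothetical base URL and dataset ID; the document ID is the one
# from the sample response.
print(document_url(
    "https://example.invalid/v1",
    "11111111-2222-3333-4444-555555555555",
    "a8e0e5b5-78c6-4130-a5ce-25feb0e0b4ac",
    metadata="only",
))
```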

Response

Document details. The response shape varies based on the metadata query parameter. When metadata is only, only id, doc_type, and doc_metadata are returned. When metadata is without, doc_type and doc_metadata are omitted.
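
Because the response shape depends on the metadata parameter, client code may want to avoid assuming that every field is present. The sketch below guesses which shape a parsed response has from its keys alone; the function name response_shape is hypothetical, and the key-based heuristic is an assumption drawn from the description above rather than documented behavior.

```python
# Fields returned when metadata is "only", per the description above.
METADATA_ONLY_FIELDS = {"id", "doc_type", "doc_metadata"}


def response_shape(doc: dict) -> str:
    """Guess which metadata mode produced a parsed response body."""
    keys = set(doc)
    if keys <= METADATA_ONLY_FIELDS:
        return "only"
    if "doc_metadata" not in keys:
        return "without"
    return "all"


print(response_shape({"id": "a8e0e5b5", "doc_type": None, "doc_metadata": []}))
print(response_shape({"id": "a8e0e5b5", "name": "guide.txt"}))
```

Reading optional fields with doc.get("doc_metadata") rather than doc["doc_metadata"] keeps the same code path working across all three shapes.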

id
string

Document identifier.

position
integer

Position index within the knowledge base.

data_source_type
string

How the document was uploaded. upload_file for file uploads, notion_import for Notion imports.

data_source_info
object

Raw data source information.

dataset_process_rule_id
string

ID of the processing rule applied to this document.

dataset_process_rule
object

Knowledge-base-level processing rule configuration.

document_process_rule
object

Document-level processing rule configuration.

name
string

Document name.

created_from
string

Origin of the document. api for API creation, web for UI creation.

created_by
string

ID of the user who created the document.

created_at
number

Unix timestamp of document creation.

tokens
integer

Number of tokens in the document.

indexing_status
string

Current indexing status, e.g. waiting, parsing, cleaning, splitting, indexing, completed, error, paused.

error
string | null

Error message if indexing failed, null otherwise.
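
A caller that polls this endpoint might reduce indexing_status and error to a simple outcome, for example as sketched below. The function name and the three outcome labels are illustrative assumptions; the status list above is introduced with "e.g.", so any unrecognized value is treated here as still pending.

```python
def summarize_indexing(doc: dict) -> str:
    """Collapse indexing_status/error into ready, failed, or pending."""
    status = doc.get("indexing_status")
    if status == "completed":
        return "ready"
    if status == "error":
        # error holds the failure message when indexing failed.
        return f"failed: {doc.get('error') or 'unknown error'}"
    # waiting, parsing, cleaning, splitting, indexing, paused, or anything else.
    return f"pending ({status})"


print(summarize_indexing({"indexing_status": "completed", "error": None}))
print(summarize_indexing({"indexing_status": "splitting", "error": None}))
```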

enabled
boolean

Whether the document is enabled for retrieval.

disabled_at
number | null

Unix timestamp when the document was disabled, null if enabled.

disabled_by
string | null

ID of the user who disabled the document, null if enabled.

archived
boolean

Whether the document is archived.

display_status
string

Display-friendly indexing status for the UI.

word_count
integer

Total word count of the document.

hit_count
integer

Number of times this document has been retrieved.

doc_form
string

Document chunking mode. text_model for standard text, hierarchical_model for parent-child, qa_model for QA pairs.

doc_language
string

Language of the document content.

doc_type
string | null

Document type classification, null if not set.

doc_metadata
object[]

Custom metadata key-value pairs for this document.

completed_at
number | null

Unix timestamp when processing completed, null if not yet completed.

updated_at
number | null

Unix timestamp of last update, null if never updated.

indexing_latency
number | null

Time taken for indexing in seconds, null if not completed.
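
The timestamp fields are Unix timestamps and several of them may be null, so a small conversion helper can be handy. The sketch below uses only the standard library; the helper name to_utc is hypothetical, and interpreting the values as seconds in UTC is an assumption based on the sample response.

```python
from datetime import datetime, timezone
from typing import Optional


def to_utc(ts: Optional[float]) -> Optional[datetime]:
    """Convert a Unix timestamp in seconds to an aware UTC datetime."""
    if ts is None:
        return None  # e.g. completed_at before processing finishes
    return datetime.fromtimestamp(ts, tz=timezone.utc)


# Values taken from the sample response.
doc = {"created_at": 1741267200, "completed_at": 1741267260, "disabled_at": None}
for field in ("created_at", "completed_at", "disabled_at"):
    print(field, to_utc(doc[field]))
```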

segment_count
integer

Number of chunks in the document.

average_segment_length
number

Average character length of chunks.

summary_index_status
string | null

Status of summary indexing, null if summary index is not enabled.

need_summary
boolean

Whether the document needs summary generation.