GET /datasets/{dataset_id}/documents/{document_id}
{
  "id": "a8e0e5b5-78c6-4130-a5ce-25feb0e0b4ac",
  "position": 1,
  "data_source_type": "upload_file",
  "data_source_info": {
    "upload_file_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890"
  },
  "dataset_process_rule_id": "e1f2a3b4-c5d6-7890-ef12-345678901234",
  "dataset_process_rule": {
    "id": "e1f2a3b4-c5d6-7890-ef12-345678901234",
    "mode": "custom"
  },
  "document_process_rule": {
    "mode": "custom",
    "rules": {
      "pre_processing_rules": [],
      "segmentation": {
        "separator": "###",
        "max_tokens": 500,
        "chunk_overlap": 50
      }
    }
  },
  "name": "guide.txt",
  "created_from": "api",
  "created_by": "ad313dd6-ef04-4dd1-a5b0-c0f0b9e2e7e4",
  "created_at": 1741267200,
  "tokens": 512,
  "indexing_status": "completed",
  "error": null,
  "enabled": true,
  "disabled_at": null,
  "disabled_by": null,
  "archived": false,
  "display_status": "available",
  "word_count": 350,
  "hit_count": 0,
  "doc_form": "text_model",
  "doc_language": "English",
  "doc_type": null,
  "doc_metadata": [],
  "completed_at": 1741267260,
  "updated_at": 1741267260,
  "indexing_latency": 60,
  "segment_count": 5,
  "average_segment_length": 70,
  "summary_index_status": null,
  "need_summary": false
}

Authorizations

Authorization
string
header
required

API Key authentication. For all API requests, include your API Key in the Authorization HTTP header, prefixed with Bearer. Example: Authorization: Bearer {API_KEY}. We strongly recommend storing your API Key on the server side and never sharing or storing it on the client side, since a leaked API Key can have serious consequences.
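
A minimal sketch of building that header on the server side. The helper name build_auth_headers and the environment variable KNOWLEDGE_API_KEY are assumptions made for illustration; neither is defined by this API.

```python
import os


def build_auth_headers(api_key: str) -> dict[str, str]:
    """Return the Authorization header in the 'Bearer {API_KEY}' form."""
    if not api_key:
        raise ValueError("API key is empty")
    return {"Authorization": f"Bearer {api_key}"}


# Hypothetical variable name: read the key from the server environment
# instead of hard-coding it or shipping it to a client.
api_key = os.environ.get("KNOWLEDGE_API_KEY", "")
if api_key:
    headers = build_auth_headers(api_key)
```

Reading the key from the environment keeps it out of source control and out of any client-side bundle.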

Path Parameters

dataset_id
string<uuid>
required

Knowledge base ID.

document_id
string<uuid>
required

Document ID.

Query Parameters

metadata
enum<string>
default:all

all returns every field, including metadata. only returns just id, doc_type, and doc_metadata. without returns every field except doc_metadata.

Available options:
all,
only,
without
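
The path and query parameters above can be combined into a request URL along these lines. This is a sketch only: the base URL https://api.example.com/v1, the helper name document_detail_url, and the UUIDs in the usage line are placeholders, not values taken from this API.

```python
from urllib.parse import urlencode
from uuid import UUID

# The three documented options for the metadata query parameter.
METADATA_OPTIONS = ("all", "only", "without")


def document_detail_url(base_url: str, dataset_id: str, document_id: str,
                        metadata: str = "all") -> str:
    """Build the GET URL for a single document's details."""
    if metadata not in METADATA_OPTIONS:
        raise ValueError(f"metadata must be one of {METADATA_OPTIONS}")
    # Both path parameters are documented as UUID strings;
    # UUID() raises ValueError for anything else.
    dataset = str(UUID(dataset_id))
    document = str(UUID(document_id))
    path = f"/datasets/{dataset}/documents/{document}"
    return f"{base_url.rstrip('/')}{path}?{urlencode({'metadata': metadata})}"


# Placeholder base URL and IDs, for illustration only.
url = document_detail_url(
    "https://api.example.com/v1",
    "e1f2a3b4-c5d6-7890-ef12-345678901234",
    "a8e0e5b5-78c6-4130-a5ce-25feb0e0b4ac",
    metadata="only",
)
```

Validating the option and the IDs before sending avoids a round trip for requests that cannot succeed.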

Response

Document details. The response shape varies based on the metadata query parameter. When metadata is only, only id, doc_type, and doc_metadata are returned. When metadata is without, doc_type and doc_metadata are omitted.
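
Because the response shape depends on the metadata parameter, client code may be safer checking which keys are present than assuming a fixed set. A minimal sketch, with split_document as a hypothetical helper name:

```python
from typing import Any


def split_document(doc: dict[str, Any]) -> tuple[dict[str, Any], dict[str, Any]]:
    """Separate the metadata fields from the rest of a document response.

    Works for any value of the metadata query parameter because it only
    looks at keys that are actually present in the response.
    """
    metadata_keys = ("doc_type", "doc_metadata")
    metadata = {k: doc[k] for k in metadata_keys if k in doc}
    # id is kept with the core fields; it is present in every shape.
    core = {k: v for k, v in doc.items() if k not in metadata_keys}
    return core, metadata


# A response as it might look with metadata=only.
core, metadata = split_document(
    {"id": "a8e0e5b5-78c6-4130-a5ce-25feb0e0b4ac", "doc_type": None, "doc_metadata": []}
)
```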

id
string

Document identifier.

position
integer

Position index within the knowledge base.

data_source_type
string

Source the document came from. upload_file for file uploads, notion_import for Notion imports.

data_source_info
object

Raw data source information.

dataset_process_rule_id
string

ID of the processing rule applied to this document.

dataset_process_rule
object

Knowledge-base-level processing rule configuration.

document_process_rule
object

Document-level processing rule configuration.

name
string

Document name.

created_from
string

Origin of the document. api for API creation, web for UI creation.

created_by
string

ID of the user who created the document.

created_at
number

Unix timestamp of document creation.

tokens
integer

Number of tokens in the document.

indexing_status
string

Current indexing status, e.g. waiting, parsing, cleaning, splitting, indexing, completed, error, paused.

error
string | null

Error message if indexing failed, null otherwise.
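
The indexing_status and error fields can be read together to decide whether a document is usable. The sketch below is one possible interpretation: the helper name indexing_outcome and its return labels are assumptions, and since the status list above is given as examples, any unlisted status is treated as still in progress.

```python
from typing import Any


def indexing_outcome(doc: dict[str, Any]) -> str:
    """Classify a document response by its indexing_status field."""
    status = doc.get("indexing_status")
    if status is None:
        # For example, a response fetched with metadata=only.
        return "unknown"
    if status == "completed":
        return "ready"
    if status == "error":
        # error carries the failure message when indexing failed.
        return "failed: " + (doc.get("error") or "no message provided")
    if status == "paused":
        return "paused"
    # waiting, parsing, cleaning, splitting, indexing, or an unlisted value.
    return "in progress"
```

A caller that polls this endpoint would typically stop once the result is anything other than "in progress".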

enabled
boolean

Whether the document is enabled for retrieval.

disabled_at
number | null

Unix timestamp when the document was disabled, null if enabled.

disabled_by
string | null

ID of the user who disabled the document, null if enabled.

archived
boolean

Whether the document is archived.

display_status
string

Display-friendly indexing status for the UI.

word_count
integer

Total word count of the document.

hit_count
integer

Number of times this document has been retrieved.

doc_form
string

Document chunking mode. text_model for standard text, hierarchical_model for parent-child, qa_model for QA pairs.

doc_language
string

Language of the document content.

doc_type
string | null

Document type classification, null if not set.

doc_metadata
object[]

Custom metadata key-value pairs for this document.

completed_at
number | null

Unix timestamp when processing completed, null if not yet completed.

updated_at
number | null

Unix timestamp of last update, null if never updated.

indexing_latency
number | null

Time taken for indexing in seconds, null if not completed.
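
The timestamp fields (created_at, disabled_at, completed_at, updated_at) are Unix timestamps, several of which may be null. A small sketch of converting them, assuming the values are in seconds, as the sample response suggests (completed_at minus created_at equals the 60-second indexing_latency). The helper name to_utc is hypothetical.

```python
from datetime import datetime, timezone
from typing import Optional


def to_utc(ts: Optional[float]) -> Optional[datetime]:
    """Convert a Unix timestamp in seconds to an aware UTC datetime."""
    if ts is None:
        # Nullable fields such as disabled_at stay None.
        return None
    return datetime.fromtimestamp(ts, tz=timezone.utc)


# Values taken from the sample response.
created = to_utc(1741267200)
completed = to_utc(1741267260)
print(created.isoformat())                      # 2025-03-06T13:20:00+00:00
print((completed - created).total_seconds())    # 60.0
```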

segment_count
integer

Number of chunks in the document.

average_segment_length
number

Average character length of chunks.

summary_index_status
string | null

Status of summary indexing, null if summary index is not enabled.

need_summary
boolean

Whether the document needs summary generation.