POST /datasets/{dataset_id}/document/create-by-text
{
  "document": {
    "id": "a8e0e5b5-78c6-4130-a5ce-25feb0e0b4ac",
    "position": 1,
    "data_source_type": "upload_file",
    "data_source_info": {
      "upload_file_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890"
    },
    "data_source_detail_dict": {
      "upload_file": {
        "id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
        "name": "guide.txt",
        "size": 2048,
        "extension": "txt",
        "mime_type": "text/plain",
        "created_by": "ad313dd6-ef04-4dd1-a5b0-c0f0b9e2e7e4",
        "created_at": 1741267200
      }
    },
    "dataset_process_rule_id": "e1f2a3b4-c5d6-7890-ef12-345678901234",
    "name": "guide.txt",
    "created_from": "api",
    "created_by": "ad313dd6-ef04-4dd1-a5b0-c0f0b9e2e7e4",
    "created_at": 1741267200,
    "tokens": 0,
    "indexing_status": "indexing",
    "error": null,
    "enabled": true,
    "disabled_at": null,
    "disabled_by": null,
    "archived": false,
    "display_status": "indexing",
    "word_count": 0,
    "hit_count": 0,
    "doc_form": "text_model",
    "doc_metadata": [],
    "summary_index_status": null,
    "need_summary": false
  },
  "batch": "20250306150245647595"
}

Authorizations

Authorization
string
header
required

API key authentication. Include your API key in the Authorization HTTP header of every request, prefixed with Bearer. Example: Authorization: Bearer {API_KEY}. Store the API key on the server side only; never share it or embed it in client-side code, because a leaked key can have serious consequences.
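
As a minimal sketch of the header format described above, the helper below builds the request headers from a key read out of the environment. The function name `auth_headers` and the environment variable `KNOWLEDGE_API_KEY` are hypothetical names chosen for illustration; they are not part of the API.

```python
import os


def auth_headers(api_key: str) -> dict[str, str]:
    """Return headers for an authenticated JSON request (hypothetical helper)."""
    if not api_key:
        raise ValueError("API key is empty")
    return {
        # The key is sent as a Bearer token in the Authorization header.
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }


# Assumption: the key is kept server-side in an environment variable
# (the variable name is illustrative), never hard-coded in source.
headers = auth_headers(os.environ.get("KNOWLEDGE_API_KEY", "placeholder-key"))
```

Reading the key from the environment keeps it out of version control and out of any code shipped to a client.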

Path Parameters

dataset_id
string<uuid>
required

Knowledge base ID.

Body

application/json
name
string
required

Document name.

text
string
required

Document text content.

indexing_technique
enum<string>

Required when adding the first document to a knowledge base. Subsequent documents inherit the knowledge base's indexing technique if omitted. high_quality uses embedding models for precise search; economy uses keyword-based indexing.

Available options: high_quality, economy

doc_form
enum<string>
default:text_model

text_model for standard text chunking, hierarchical_model for parent-child chunk structure, qa_model for question-answer pair extraction.

Available options: text_model, hierarchical_model, qa_model

doc_language
string
default:English

Language of the document for processing optimization.

process_rule
object

Processing rules for chunking.

retrieval_model
object

Retrieval model configuration. Controls how chunks are searched and ranked when querying this knowledge base.

embedding_model
string

Embedding model name. Use the model field from Get Available Models with model_type=text-embedding.

embedding_model_provider
string

Embedding model provider. Use the provider field from Get Available Models with model_type=text-embedding.

original_document_id
string

Original document ID for versioning.
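
The body fields above could be assembled into a request along the following lines. This is a sketch under stated assumptions, not a definitive client: `BASE_URL` is a placeholder you would replace with your own API base URL, and `build_create_by_text_request` is a hypothetical helper name. Only `name` and `text` are always sent; `indexing_technique` is included when provided, which the parameter description says is required for the first document in a knowledge base.

```python
import json
import urllib.request

# Assumption: placeholder base URL; substitute the base URL of your deployment.
BASE_URL = "https://api.example.com/v1"

INDEXING_TECHNIQUES = ("high_quality", "economy")
DOC_FORMS = ("text_model", "hierarchical_model", "qa_model")


def build_create_by_text_request(dataset_id, api_key, name, text,
                                 indexing_technique=None,
                                 doc_form="text_model"):
    """Build (but do not send) a create-by-text request (hypothetical helper)."""
    if doc_form not in DOC_FORMS:
        raise ValueError(f"doc_form must be one of {DOC_FORMS}")
    body = {"name": name, "text": text, "doc_form": doc_form}
    if indexing_technique is not None:
        if indexing_technique not in INDEXING_TECHNIQUES:
            raise ValueError(
                f"indexing_technique must be one of {INDEXING_TECHNIQUES}")
        body["indexing_technique"] = indexing_technique
    return urllib.request.Request(
        url=f"{BASE_URL}/datasets/{dataset_id}/document/create-by-text",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_create_by_text_request(
    dataset_id="00000000-0000-0000-0000-000000000000",  # illustrative UUID
    api_key="placeholder-key",
    name="guide.txt",
    text="Document text content.",
    indexing_technique="high_quality",
)
```

To send the request, pass `req` to `urllib.request.urlopen` and decode the response body as JSON. Validating the enum values locally catches typos before a network round trip.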

Response

Document created successfully.

document
object
batch
string

Batch ID for tracking indexing progress.
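
A short sketch of reading the response: the function below pulls out the document ID, the batch ID, and the initial indexing status from a response body shaped like the sample at the top of this page. The name `parse_create_response` is hypothetical, and the sample string is trimmed to the fields used here.

```python
import json


def parse_create_response(raw: str) -> tuple[str, str, str]:
    """Return (document_id, batch, indexing_status) from a response body."""
    payload = json.loads(raw)
    document = payload["document"]
    return document["id"], payload["batch"], document["indexing_status"]


# Trimmed copy of the sample response shown on this page.
sample = """
{
  "document": {
    "id": "a8e0e5b5-78c6-4130-a5ce-25feb0e0b4ac",
    "indexing_status": "indexing"
  },
  "batch": "20250306150245647595"
}
"""

document_id, batch, status = parse_create_response(sample)
```

Keep the `batch` value: it is the identifier the API returns for tracking indexing progress, and a freshly created document reports `tokens` and `word_count` of 0 while indexing is still under way.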