Model API Interface
This document provides the detailed interface specifications required for Dify model plugin development, including model provider implementation, interface definitions for the model types (LLM, TextEmbedding, Rerank, Speech2text, Text2speech, Moderation), and complete specifications for related data structures such as PromptMessage and LLMResult. The document serves as a development reference for developers implementing model integrations.
This section introduces the interface methods and parameter descriptions that providers and each model type need to implement. Before developing a model plugin, you may want to read Model Design Rules and Model Plugin Introduction first.
Model Provider
Inherit the `__base.model_provider.ModelProvider` base class and implement the following interface:

- Provider credential validation

  Parameters:

  - `credentials` (object) Credential information

  The credential parameters are defined by the provider YAML configuration file's `provider_credential_schema` and are passed in as `api_key`, etc. If validation fails, throw an `errors.validate.CredentialsValidateFailedError`. Note: predefined-model providers need to fully implement this interface, while custom-model providers only need a simple implementation, as in the sketch below.
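A minimal sketch of that simple implementation (the provider class name is hypothetical, and the import path is an assumption based on a typical plugin scaffold):

```python
from dify_plugin import ModelProvider  # assumed import path; maps to __base.model_provider.ModelProvider


class MyProvider(ModelProvider):  # hypothetical provider class
    def validate_provider_credentials(self, credentials: dict) -> None:
        # Custom-model providers validate credentials per model instead,
        # so the provider-level validation can simply pass.
        pass
```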
Models
Models are divided into six different types, each with a different base class to inherit from and different methods to implement.
Common Interfaces
All models need to implement the following two methods:
- Model credential validation

  Similar to provider credential validation, this validates the credentials for an individual model.

  Parameters:

  - `model` (string) Model name
  - `credentials` (object) Credential information

  The credential parameters are defined by the provider YAML configuration file's `provider_credential_schema` or `model_credential_schema` and are passed in as `api_key`, etc. If validation fails, throw an `errors.validate.CredentialsValidateFailedError`.
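A sketch of how model credential validation is commonly wired up; the probe request is illustrative, and the class name and import paths are assumptions:

```python
from dify_plugin.entities.model.message import UserPromptMessage  # assumed path
from dify_plugin.errors.model import CredentialsValidateFailedError  # assumed path
from dify_plugin.interfaces.model.large_language_model import LargeLanguageModel  # assumed path


class MyLargeLanguageModel(LargeLanguageModel):  # hypothetical model class
    def validate_credentials(self, model: str, credentials: dict) -> None:
        try:
            # Illustrative: issue a minimal request to verify the credentials.
            self._invoke(
                model=model,
                credentials=credentials,
                prompt_messages=[UserPromptMessage(content="ping")],
                model_parameters={},
                stream=False,
            )
        except Exception as ex:
            raise CredentialsValidateFailedError(str(ex))
```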
- Invocation error mapping table

  When a model invocation raises an exception, it needs to be mapped to one of the Runtime's `InvokeError` types, which lets Dify handle different errors differently.

  Runtime errors:

  - `InvokeConnectionError` Connection error during invocation
  - `InvokeServerUnavailableError` Service provider unavailable
  - `InvokeRateLimitError` Rate limit reached
  - `InvokeAuthorizationError` Authentication failed
  - `InvokeBadRequestError` Incorrect parameters passed

  You can also define and throw the corresponding errors directly, so that subsequent calls can raise exceptions such as `InvokeConnectionError` without going through the mapping.
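The mapping is typically exposed as a property on the model class. A sketch, assuming an HTTP client that raises `requests` exceptions; the concrete exception classes are placeholders for whatever your provider SDK actually throws:

```python
import requests

from dify_plugin.errors.model import (  # assumed import path
    InvokeAuthorizationError,
    InvokeBadRequestError,
    InvokeConnectionError,
    InvokeError,
    InvokeRateLimitError,
    InvokeServerUnavailableError,
)
from dify_plugin.interfaces.model.large_language_model import LargeLanguageModel  # assumed path


class MyLargeLanguageModel(LargeLanguageModel):  # hypothetical model class
    @property
    def _invoke_error_mapping(self) -> dict[type[InvokeError], list[type[Exception]]]:
        # Map client/provider exceptions onto the Runtime's InvokeError types.
        return {
            InvokeConnectionError: [requests.exceptions.ConnectionError],
            InvokeServerUnavailableError: [requests.exceptions.HTTPError],
            InvokeRateLimitError: [],      # fill in the provider's rate-limit error
            InvokeAuthorizationError: [],  # fill in the provider's auth error
            InvokeBadRequestError: [ValueError],
        }
```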
LLM
Inherit the `__base.large_language_model.LargeLanguageModel` base class and implement the following interface:
- LLM Invocation

  Implement the core LLM invocation method, supporting both streaming and synchronous responses.

  Parameters:

  - `model` (string) Model name
  - `credentials` (object) Credential information

    The credential parameters are defined by the provider YAML configuration file's `provider_credential_schema` or `model_credential_schema` and are passed in as `api_key`, etc.
  - `prompt_messages` (array[PromptMessage]) Prompt list

    If the model is of the `Completion` type, the list only needs to include one `UserPromptMessage` element; if the model is of the `Chat` type, messages are passed as a list of `SystemPromptMessage`, `UserPromptMessage`, `AssistantPromptMessage`, and `ToolPromptMessage` elements.
  - `model_parameters` (object) Model parameters, defined by the model YAML configuration's `parameter_rules`.
  - `tools` (array[PromptMessageTool]) [optional] Tool list, equivalent to `function` in `function calling`. This is the tool list passed in for tool calling.
  - `stop` (array[string]) [optional] Stop sequences; the model stops generating before emitting any of the defined strings.
  - `stream` (bool) Whether to stream the output; defaults to True.
  - `user` (string) [optional] A unique identifier for the end user that can help the provider monitor and detect abuse.

  Return value:

  For streaming output it returns Generator[LLMResultChunk]; for non-streaming output it returns LLMResult (see the signature sketch below).
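Putting the parameters and the return value together, the method shape looks like this sketch (entity import paths are assumptions based on a typical plugin scaffold):

```python
from collections.abc import Generator
from typing import Optional, Union

from dify_plugin.entities.model.llm import LLMResult  # assumed path
from dify_plugin.entities.model.message import PromptMessage, PromptMessageTool  # assumed path
from dify_plugin.interfaces.model.large_language_model import LargeLanguageModel  # assumed path


class MyLargeLanguageModel(LargeLanguageModel):  # hypothetical model class
    def _invoke(
        self,
        model: str,
        credentials: dict,
        prompt_messages: list[PromptMessage],
        model_parameters: dict,
        tools: Optional[list[PromptMessageTool]] = None,
        stop: Optional[list[str]] = None,
        stream: bool = True,
        user: Optional[str] = None,
    ) -> Union[LLMResult, Generator]:
        # Call the provider API here; yield LLMResultChunk items when
        # stream=True, or return a single LLMResult otherwise.
        ...
```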
- Pre-calculate input tokens

  If the model does not provide an interface for pre-calculating tokens, you can directly return 0.

  Parameter explanations are the same as for LLM Invocation above. This interface needs to calculate tokens with the tokenizer appropriate for the corresponding model. If the corresponding model does not provide a tokenizer, you can use the `_get_num_tokens_by_gpt2(text: str)` method in the `AIModel` base class for an approximate calculation, as in the sketch below.
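A sketch of a fallback implementation using the GPT-2 helper; it assumes plain string contents (multimodal content lists would need their text parts extracted first):

```python
from typing import Optional

# PromptMessage / PromptMessageTool imports as in the invocation sketch above.


class MyLargeLanguageModel(LargeLanguageModel):  # hypothetical model class
    def get_num_tokens(
        self,
        model: str,
        credentials: dict,
        prompt_messages: list[PromptMessage],
        tools: Optional[list[PromptMessageTool]] = None,
    ) -> int:
        # Fallback: approximate with the GPT-2 tokenizer helper from AIModel.
        return sum(
            self._get_num_tokens_by_gpt2(message.content)
            for message in prompt_messages
            if isinstance(message.content, str)
        )
```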
- Get custom model rules [optional]

  When a provider supports adding custom LLMs, this method can be implemented so that custom models can obtain model rules. By default it returns None.

  For most fine-tuned models under the `OpenAI` provider, the base model can be derived from the fine-tuned model name (such as `gpt-3.5-turbo-1106`), and the predefined parameter rules of the base model can then be returned. Refer to the OpenAI provider's specific implementation.
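A sketch of the default behaviour (the method name and the `AIModelEntity` return type follow the common base-class convention and should be treated as assumptions):

```python
from typing import Optional

from dify_plugin.entities.model import AIModelEntity  # assumed path


class MyLargeLanguageModel(LargeLanguageModel):  # hypothetical model class
    def get_customizable_model_schema(
        self, model: str, credentials: dict
    ) -> Optional[AIModelEntity]:
        # Return parameter rules for a custom model; None means no rules.
        return None
```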
TextEmbedding
Inherit the `__base.text_embedding_model.TextEmbeddingModel` base class and implement the following interface:
- Embedding Invocation

  Parameters:

  - `model` (string) Model name
  - `credentials` (object) Credential information

    The credential parameters are defined by the provider YAML configuration file's `provider_credential_schema` or `model_credential_schema` and are passed in as `api_key`, etc.
  - `texts` (array[string]) Text list; can be processed in batches
  - `user` (string) [optional] A unique identifier for the end user that can help the provider monitor and detect abuse.

  Return:

  TextEmbeddingResult entity.
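A signature sketch for the embedding invocation (class name and import paths are assumptions):

```python
from typing import Optional

from dify_plugin.entities.model.text_embedding import TextEmbeddingResult  # assumed path
from dify_plugin.interfaces.model.text_embedding_model import TextEmbeddingModel  # assumed path


class MyTextEmbeddingModel(TextEmbeddingModel):  # hypothetical model class
    def _invoke(
        self,
        model: str,
        credentials: dict,
        texts: list[str],
        user: Optional[str] = None,
    ) -> TextEmbeddingResult:
        # Batch the texts against the provider API and return one vector
        # per input string, in the same order.
        ...
```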
- Pre-calculate tokens

  Parameter explanations can be found in the Embedding Invocation section above.

  As with `LargeLanguageModel` above, this interface needs to calculate tokens with the tokenizer appropriate for the corresponding model. If the corresponding model does not provide a tokenizer, you can use the `_get_num_tokens_by_gpt2(text: str)` method in the `AIModel` base class for an approximate calculation.
Rerank
Inherit the `__base.rerank_model.RerankModel` base class and implement the following interface:
- Rerank Invocation

  Parameters:

  - `model` (string) Model name
  - `credentials` (object) Credential information

    The credential parameters are defined by the provider YAML configuration file's `provider_credential_schema` or `model_credential_schema` and are passed in as `api_key`, etc.
  - `query` (string) Query content
  - `docs` (array[string]) List of segments to be reranked
  - `score_threshold` (float) [optional] Score threshold
  - `top_n` (int) [optional] Return the top n segments
  - `user` (string) [optional] A unique identifier for the end user that can help the provider monitor and detect abuse.

  Return:

  RerankResult entity.
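A signature sketch for the rerank invocation (class name and import paths are assumptions):

```python
from typing import Optional

from dify_plugin.entities.model.rerank import RerankResult  # assumed path
from dify_plugin.interfaces.model.rerank_model import RerankModel  # assumed path


class MyRerankModel(RerankModel):  # hypothetical model class
    def _invoke(
        self,
        model: str,
        credentials: dict,
        query: str,
        docs: list[str],
        score_threshold: Optional[float] = None,
        top_n: Optional[int] = None,
        user: Optional[str] = None,
    ) -> RerankResult:
        # Score each doc against the query, filter by score_threshold,
        # and keep the top_n highest-scoring segments.
        ...
```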
Speech2text
Inherit the `__base.speech2text_model.Speech2TextModel` base class and implement the following interface:
- Invoke

  Parameters:

  - `model` (string) Model name
  - `credentials` (object) Credential information

    The credential parameters are defined by the provider YAML configuration file's `provider_credential_schema` or `model_credential_schema` and are passed in as `api_key`, etc.
  - `file` (File) File stream
  - `user` (string) [optional] A unique identifier for the end user that can help the provider monitor and detect abuse.

  Return:

  The string produced by the speech-to-text conversion.
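A signature sketch (the `IO[bytes]` type for the file stream, the class name, and the import path are assumptions):

```python
from typing import IO, Optional

from dify_plugin.interfaces.model.speech2text_model import Speech2TextModel  # assumed path


class MySpeech2TextModel(Speech2TextModel):  # hypothetical model class
    def _invoke(
        self,
        model: str,
        credentials: dict,
        file: IO[bytes],
        user: Optional[str] = None,
    ) -> str:
        # Send the audio stream to the provider and return the transcript.
        ...
```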
Text2speech
Inherit the `__base.text2speech_model.Text2SpeechModel` base class and implement the following interface:
- Invoke

  Parameters:

  - `model` (string) Model name
  - `credentials` (object) Credential information

    The credential parameters are defined by the provider YAML configuration file's `provider_credential_schema` or `model_credential_schema` and are passed in as `api_key`, etc.
  - `content_text` (string) Text content to be converted
  - `streaming` (bool) Whether to stream the output
  - `user` (string) [optional] A unique identifier for the end user that can help the provider monitor and detect abuse.

  Return:

  The audio stream produced by the text-to-speech conversion.
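A signature sketch (the return annotation is left open since the source only says "audio stream"; class name and import path are assumptions):

```python
from typing import Optional

from dify_plugin.interfaces.model.text2speech_model import Text2SpeechModel  # assumed path


class MyText2SpeechModel(Text2SpeechModel):  # hypothetical model class
    def _invoke(
        self,
        model: str,
        credentials: dict,
        content_text: str,
        streaming: bool,
        user: Optional[str] = None,
    ):
        # Return the synthesized audio; when streaming=True, yield audio
        # chunks as they are produced.
        ...
```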
Moderation
Inherit the `__base.moderation_model.ModerationModel` base class and implement the following interface:
- Invoke

  Parameters:

  - `model` (string) Model name
  - `credentials` (object) Credential information

    The credential parameters are defined by the provider YAML configuration file's `provider_credential_schema` or `model_credential_schema` and are passed in as `api_key`, etc.
  - `text` (string) Text content
  - `user` (string) [optional] A unique identifier for the end user that can help the provider monitor and detect abuse.

  Return:

  False indicates the input text is safe; True indicates it is not.
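A signature sketch (class name and import path are assumptions):

```python
from typing import Optional

from dify_plugin.interfaces.model.moderation_model import ModerationModel  # assumed path


class MyModerationModel(ModerationModel):  # hypothetical model class
    def _invoke(
        self,
        model: str,
        credentials: dict,
        text: str,
        user: Optional[str] = None,
    ) -> bool:
        # True means the input text was flagged as unsafe; False means safe.
        ...
```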
Entities
PromptMessageRole
Message role
PromptMessageContentType
Message content type, divided into plain text and images.
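As a rough sketch, the two enums can be pictured like this (the member values follow the common Dify definitions and should be treated as assumptions):

```python
from enum import Enum


class PromptMessageRole(Enum):
    SYSTEM = "system"
    USER = "user"
    ASSISTANT = "assistant"
    TOOL = "tool"


class PromptMessageContentType(Enum):
    TEXT = "text"
    IMAGE = "image"
```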
PromptMessageContent
Message content base class, used only for parameter declaration; it cannot be initialized.
Currently two content types are supported, text and images, and a message can carry text and multiple images at the same time. `TextPromptMessageContent` and `ImagePromptMessageContent` need to be initialized separately.
TextPromptMessageContent
When passing in text together with images, the text needs to be constructed as this entity and included in the `content` list.
ImagePromptMessageContent
When passing in text together with images, the images need to be constructed as this entity and included in the `content` list. `data` can be a `url` or a base64-encoded image string.
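For example, a user message mixing text and an image might be constructed like this sketch (the example URL is illustrative, and the entity classes are assumed to be imported from the plugin SDK as above):

```python
# Assumes UserPromptMessage, TextPromptMessageContent and
# ImagePromptMessageContent are imported from the plugin SDK's entities.
message = UserPromptMessage(
    content=[
        TextPromptMessageContent(data="Describe this image."),
        ImagePromptMessageContent(data="https://example.com/photo.png"),
    ]
)
```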
PromptMessage
Base class for all role-specific message bodies, used only for parameter declaration; it cannot be initialized.
UserPromptMessage
UserMessage message body, represents user messages.
AssistantPromptMessage
Represents a model response message, typically used for few-shot examples or chat history input. Here `tool_calls` is the list of tool calls returned by the model after `tools` were passed in.
SystemPromptMessage
Represents system messages, typically used to set system instructions for the model.
ToolPromptMessage
Represents a tool message, used to pass a tool's execution result to the model for next-step planning after the tool has been executed. The base class's `content` field carries the tool execution result.
PromptMessageTool
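Its shape is roughly the following sketch (the field names follow the common definition and are assumptions here):

```python
from pydantic import BaseModel


class PromptMessageTool(BaseModel):
    name: str
    description: str
    parameters: dict  # JSON Schema describing the tool's arguments
```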
LLMResult
LLMResultChunkDelta
Delta entity within each iteration of a streaming response.
LLMResultChunk
One iteration entity in a streaming response.
LLMUsage
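The LLM result entities fit together roughly as follows (an abridged sketch; field names follow the common definitions, and `LLMUsage` carries further per-unit pricing fields omitted here):

```python
from decimal import Decimal
from typing import Optional

from pydantic import BaseModel

# PromptMessage / AssistantPromptMessage are the message entities above.


class LLMUsage(BaseModel):
    # Abridged: the real entity also tracks unit prices and price units.
    prompt_tokens: int
    completion_tokens: int
    total_tokens: int
    total_price: Decimal
    currency: str
    latency: float


class LLMResult(BaseModel):
    model: str
    prompt_messages: list[PromptMessage]
    message: AssistantPromptMessage
    usage: LLMUsage


class LLMResultChunkDelta(BaseModel):
    index: int
    message: AssistantPromptMessage
    usage: Optional[LLMUsage] = None
    finish_reason: Optional[str] = None


class LLMResultChunk(BaseModel):
    model: str
    prompt_messages: list[PromptMessage]
    delta: LLMResultChunkDelta
```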
TextEmbeddingResult
EmbeddingUsage
RerankResult
RerankDocument
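The embedding and rerank result entities can be sketched in the same abridged way (field names follow the common definitions and are assumptions here):

```python
from decimal import Decimal

from pydantic import BaseModel


class EmbeddingUsage(BaseModel):
    # Abridged: the real entity also tracks unit prices and price units.
    tokens: int
    total_tokens: int
    total_price: Decimal
    currency: str
    latency: float


class TextEmbeddingResult(BaseModel):
    model: str
    embeddings: list[list[float]]  # one vector per input text, in order
    usage: EmbeddingUsage


class RerankDocument(BaseModel):
    index: int  # position of the segment in the original docs list
    text: str
    score: float


class RerankResult(BaseModel):
    model: str
    docs: list[RerankDocument]
```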
Related Resources
- Model Design Rules - Understand the standards for model configuration
- Model Plugin Introduction - Quickly understand the basic concepts of model plugins
- Quickly Integrate a New Model - Learn how to add new models to existing providers
- Create a New Model Provider - Learn how to develop brand new model providers