This document provides the detailed interface specifications required for Dify model plugin development, including the model provider implementation, interface definitions for six model types (LLM, TextEmbedding, Rerank, Speech2text, Text2speech, Moderation), and complete specifications for related data structures such as PromptMessage and LLMResult. It serves as a development reference for implementing model integrations.
Inherit from the `__base.model_provider.ModelProvider` base class and implement the following interface:

- `credentials` (object) Credential information. The credential parameters are defined by the provider YAML configuration file's `provider_credential_schema` and are passed in as `api_key`, etc. If validation fails, throw an `errors.validate.CredentialsValidateFailedError` error.

Note: predefined models need to fully implement this interface, while custom model providers only need the simple implementation sketched below.
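A minimal sketch for a custom model provider such as Xinference, assuming credential validation is deferred to the model level:

```python
class XinferenceProvider(Provider):
    def validate_provider_credentials(self, credentials: dict) -> None:
        # Custom model providers validate credentials per model,
        # so the provider-level check can be a no-op.
        pass
```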
To validate the credentials of a single model, the following parameters are passed in; a sketch of the method follows.

- `model` (string) Model name
- `credentials` (object) Credential information. The credential parameters are defined by the provider YAML configuration file's `provider_credential_schema` or `model_credential_schema` and are passed in as `api_key`, etc. If validation fails, throw an `errors.validate.CredentialsValidateFailedError` error.
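A sketch of the model-level validation method; the health-check call is a hypothetical placeholder for whatever lightweight request your provider supports:

```python
def validate_credentials(self, model: str, credentials: dict) -> None:
    try:
        # Hypothetical: issue a minimal request to verify the api_key works.
        self._ping(model=model, api_key=credentials["api_key"])
    except Exception as ex:
        # Map any failure to the error type Dify expects.
        raise CredentialsValidateFailedError(str(ex)) from ex
```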
When an error occurs during invocation, map it to the corresponding `InvokeError` type in Runtime, which helps Dify handle different errors differently. Runtime errors:

- `InvokeConnectionError` Connection error during invocation
- `InvokeServerUnavailableError` Service provider unavailable
- `InvokeRateLimitError` Rate limit reached
- `InvokeAuthorizationError` Authentication failed
- `InvokeBadRequestError` Incorrect parameters passed

For example, a connection timeout raised by the provider SDK should be mapped to `InvokeConnectionError`; a sketch of the mapping follows.
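A sketch assuming a `_invoke_error_mapping` property on the model class and `requests`-style exceptions; the SDK exception classes on the right are assumptions to be replaced with whatever your provider actually raises:

```python
import requests

@property
def _invoke_error_mapping(self) -> dict[type[InvokeError], list[type[Exception]]]:
    # Keys are Runtime error types; values are provider exceptions to fold into them.
    return {
        InvokeConnectionError: [requests.exceptions.ConnectionError,
                                requests.exceptions.Timeout],
        InvokeServerUnavailableError: [requests.exceptions.HTTPError],
        InvokeRateLimitError: [ProviderRateLimitError],      # hypothetical SDK error
        InvokeAuthorizationError: [ProviderAuthError],       # hypothetical SDK error
        InvokeBadRequestError: [ValueError],
    }
```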
Inherit from the `__base.large_language_model.LargeLanguageModel` base class and implement the following interface:

- `model` (string) Model name
- `credentials` (object) Credential information. The credential parameters are defined by the provider YAML configuration file's `provider_credential_schema` or `model_credential_schema` and are passed in as `api_key`, etc.
- `prompt_messages` (array[PromptMessage]) Prompt list. If the model is of the Completion type, the list only needs to include one UserPromptMessage element; if the model is of the Chat type, messages need to be passed in as a list of SystemPromptMessage, UserPromptMessage, AssistantPromptMessage, and ToolPromptMessage elements.
- `model_parameters` (object) Model parameters, defined by the `parameter_rules` in the model's YAML configuration.
- `tools` (array[PromptMessageTool]) [optional] Tool list, equivalent to `function` in function calling; this is the tool list passed in for tool calling.
- `stop` (array[string]) [optional] Stop sequences. The model stops outputting before the strings defined in the stop sequences.
- `stream` (bool) Whether to stream the output; defaults to True. Streaming output returns Generator[LLMResultChunk], while non-streaming output returns LLMResult.
- `user` (string) [optional] A unique identifier for the end user that can help the provider monitor and detect abuse.

A sketch of the invocation signature follows this list.
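A minimal sketch of the `_invoke` signature implied by the parameters above (typing imports shown; the entity import paths depend on your plugin scaffold):

```python
from typing import Generator, Optional, Union

def _invoke(self, model: str, credentials: dict,
            prompt_messages: list[PromptMessage], model_parameters: dict,
            tools: Optional[list[PromptMessageTool]] = None,
            stop: Optional[list[str]] = None,
            stream: bool = True, user: Optional[str] = None) \
        -> Union[LLMResult, Generator]:
    # Yield LLMResultChunk items when stream=True,
    # otherwise return a single LLMResult.
    ...
```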
The parameters are the same as those of LLM Invocation above. This interface needs to calculate tokens using the tokenizer appropriate for the corresponding model. If the corresponding model does not provide a tokenizer, the `_get_num_tokens_by_gpt2(text: str)` method in the `AIModel` base class can be used for the calculation; a sketch follows.
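A naive sketch of the token pre-computation using the GPT-2 fallback; flattening prompt_messages into plain text is a simplification:

```python
def get_num_tokens(self, model: str, credentials: dict,
                   prompt_messages: list[PromptMessage],
                   tools: Optional[list[PromptMessageTool]] = None) -> int:
    # No model-specific tokenizer available: fall back to the GPT-2
    # tokenizer shipped with the AIModel base class.
    text = " ".join(str(message.content) for message in prompt_messages)
    return self._get_num_tokens_by_gpt2(text)
```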
Under the OpenAI provider, for example, the base model of a fine-tuned model can be obtained from the fine-tuned model name, such as `gpt-3.5-turbo-1106`, and the predefined parameter rules of that base model are then returned. Refer to OpenAI's specific implementation; a hypothetical sketch of the pattern follows.
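In this sketch, the `ft:`-prefixed name format and the rule-lookup helper are assumptions for illustration:

```python
def get_customizable_model_schema(self, model: str, credentials: dict) -> Optional[AIModelEntity]:
    # Fine-tuned names such as "ft:gpt-3.5-turbo-1106:org::id" embed the
    # base model; extract it and reuse the base model's predefined rules.
    base_model = model.split(":")[1] if model.startswith("ft:") else model
    return self._get_predefined_model_schema(base_model)  # hypothetical helper
```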
Inherit from the `__base.text_embedding_model.TextEmbeddingModel` base class and implement the following interface:
- `model` (string) Model name
- `credentials` (object) Credential information. The credential parameters are defined by the provider YAML configuration file's `provider_credential_schema` or `model_credential_schema` and are passed in as `api_key`, etc.
- `texts` (array[string]) Text list, which can be processed in batches
- `user` (string) [optional] A unique identifier for the end user that can help the provider monitor and detect abuse.

A sketch of the invocation signature follows this list.
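A sketch of the embedding `_invoke` signature under the parameters above; the `TextEmbeddingResult` return type follows Dify's entity naming:

```python
from typing import Optional

def _invoke(self, model: str, credentials: dict, texts: list[str],
            user: Optional[str] = None) -> TextEmbeddingResult:
    # Return one embedding vector per input text, preserving order.
    ...
```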
The parameters are explained in the Embedding Invocation section above. As with the LargeLanguageModel above, this interface needs to calculate tokens using the tokenizer appropriate for the corresponding model. If the corresponding model does not provide a tokenizer, the `_get_num_tokens_by_gpt2(text: str)` method in the `AIModel` base class can be used for the calculation.
Inherit from the `__base.rerank_model.RerankModel` base class and implement the following interface:
- `model` (string) Model name
- `credentials` (object) Credential information. The credential parameters are defined by the provider YAML configuration file's `provider_credential_schema` or `model_credential_schema` and are passed in as `api_key`, etc.
- `query` (string) Query request content
- `docs` (array[string]) List of segments that need to be reranked
- `score_threshold` (float) [optional] Score threshold
- `top_n` (int) [optional] Take the top n segments
- `user` (string) [optional] A unique identifier for the end user that can help the provider monitor and detect abuse.

A sketch of the invocation signature follows this list.
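A sketch of the rerank `_invoke` signature under the parameters above; the `RerankResult` return type follows Dify's entity naming:

```python
from typing import Optional

def _invoke(self, model: str, credentials: dict, query: str, docs: list[str],
            score_threshold: Optional[float] = None, top_n: Optional[int] = None,
            user: Optional[str] = None) -> RerankResult:
    # Score each segment against the query, drop those below
    # score_threshold, and keep the top_n highest-scoring segments.
    ...
```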
Inherit from the `__base.speech2text_model.Speech2TextModel` base class and implement the following interface:
- `model` (string) Model name
- `credentials` (object) Credential information. The credential parameters are defined by the provider YAML configuration file's `provider_credential_schema` or `model_credential_schema` and are passed in as `api_key`, etc.
- `file` (File) File stream
- `user` (string) [optional] A unique identifier for the end user that can help the provider monitor and detect abuse.

A sketch of the invocation signature follows this list.
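A sketch of the speech-to-text `_invoke` signature; returning the recognized text as a plain string is the assumed convention:

```python
from typing import IO, Optional

def _invoke(self, model: str, credentials: dict, file: IO[bytes],
            user: Optional[str] = None) -> str:
    # Transcribe the audio file stream and return the recognized text.
    ...
```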
Inherit from the `__base.text2speech_model.Text2SpeechModel` base class and implement the following interface:
- `model` (string) Model name
- `credentials` (object) Credential information. The credential parameters are defined by the provider YAML configuration file's `provider_credential_schema` or `model_credential_schema` and are passed in as `api_key`, etc.
- `content_text` (string) Text content to be converted
- `streaming` (bool) Whether to stream the output
- `user` (string) [optional] A unique identifier for the end user that can help the provider monitor and detect abuse.

A sketch of the invocation signature follows this list.
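A sketch of the text-to-speech `_invoke` signature; the exact return type (full audio bytes vs. a chunk generator when streaming) is an assumption that depends on the base class:

```python
from typing import Generator, Optional, Union

def _invoke(self, model: str, credentials: dict, content_text: str,
            streaming: bool, user: Optional[str] = None) -> Union[bytes, Generator]:
    # Synthesize speech from content_text; yield audio chunks when
    # streaming is True, otherwise return the complete audio payload.
    ...
```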
Inherit from the `__base.moderation_model.ModerationModel` base class and implement the following interface:
- `model` (string) Model name
- `credentials` (object) Credential information. The credential parameters are defined by the provider YAML configuration file's `provider_credential_schema` or `model_credential_schema` and are passed in as `api_key`, etc.
- `text` (string) Text content
- `user` (string) [optional] A unique identifier for the end user that can help the provider monitor and detect abuse.

A sketch of the invocation signature follows this list.
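A sketch of the moderation `_invoke` signature; the boolean return convention (True when the text is flagged) is an assumption:

```python
from typing import Optional

def _invoke(self, model: str, credentials: dict, text: str,
            user: Optional[str] = None) -> bool:
    # Return True when the input text violates the moderation policy,
    # False when it is safe (assumed convention).
    ...
```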
When multimodal data, i.e. text together with images, needs to be passed in, `TextPromptMessageContent` and `ImagePromptMessageContent` need to be constructed separately:

- `TextPromptMessageContent`: the text part is constructed as this entity and included as part of the `content` list.
- `ImagePromptMessageContent`: the image part is constructed as this entity and included as part of the `content` list; its `data` can be a `url` or a base64-encoded image string.

A construction sketch follows this list.
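A minimal sketch of a multimodal user message; the image URL is a placeholder:

```python
prompt_messages = [
    UserPromptMessage(content=[
        TextPromptMessageContent(data="What is in this picture?"),
        # data may also be a base64-encoded image string instead of a URL.
        ImagePromptMessageContent(data="https://example.com/picture.png"),
    ])
]
```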
`AssistantPromptMessage` represents a message returned by the model and is typically used for few-shots or chat history input; its `tool_calls` field is the list of tool calls returned by the model after `tools` have been passed in. `ToolPromptMessage` passes a tool's execution result back to the model for the next round of generation; its `content` field passes in the tool execution result. A round-trip sketch follows.
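In this sketch, the tool-call IDs and the nested ToolCall field names mirror the OpenAI-style schema and should be treated as assumptions:

```python
messages = [
    UserPromptMessage(content="What's the weather in Berlin?"),
    # Previous assistant turn in which the model requested a tool call.
    AssistantPromptMessage(
        content="",
        tool_calls=[AssistantPromptMessage.ToolCall(
            id="call_1", type="function",
            function=AssistantPromptMessage.ToolCall.ToolCallFunction(
                name="get_weather", arguments='{"city": "Berlin"}'),
        )],
    ),
    # Feed the tool's execution result back via content for the next round.
    ToolPromptMessage(content='{"temp_c": 21}', tool_call_id="call_1"),
]
```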