This section introduces the interface methods and parameters that providers and each model type need to implement. Before developing a model plugin, it is recommended that you first read Model Design Rules and Model Plugin Introduction.

Model Provider

Inherit the __base.model_provider.ModelProvider base class and implement the following interface:

def validate_provider_credentials(self, credentials: dict) -> None:
    """
    Validate provider credentials
    You can choose any validate_credentials method of model type or implement validate method by yourself,
    such as: get model list api

    if validate failed, raise exception

    :param credentials: provider credentials, credentials form defined in `provider_credential_schema`.
    """
Parameters:

  • credentials (object) Credential information

The credential parameters are defined by the provider YAML configuration file’s provider_credential_schema and passed in as api_key, etc. If validation fails, throw an errors.validate.CredentialsValidateFailedError error. Note: providers with predefined models need to fully implement this interface (a sketch follows the example below), while providers that only support custom models can simply implement it as follows:

class XinferenceProvider(Provider):
    def validate_provider_credentials(self, credentials: dict) -> None:
        pass
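
For a provider with predefined models, a minimal sketch of a full implementation might look like the following. It assumes a hypothetical list-models endpoint, and the import paths mirror the dotted names used in this document (adjust both to your project).

import requests

# Import paths mirror the dotted names used in this document; adjust them to your project layout.
from __base.model_provider import ModelProvider
from errors.validate import CredentialsValidateFailedError


class ExampleProvider(ModelProvider):
    def validate_provider_credentials(self, credentials: dict) -> None:
        # Hypothetical "list models" endpoint, used only to verify the API key.
        try:
            response = requests.get(
                "https://api.example.com/v1/models",
                headers={"Authorization": f"Bearer {credentials['api_key']}"},
                timeout=10,
            )
        except requests.RequestException as e:
            raise CredentialsValidateFailedError(f"Connection failed: {e}")

        if response.status_code != 200:
            raise CredentialsValidateFailedError(
                f"Credential validation failed with status code {response.status_code}"
            )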

Models

Models are divided into six different types, each with its own base class to inherit from and its own set of methods to implement.

Common Interfaces

All models need to implement the following 2 methods consistently:

  • Model credential validation

Similar to provider credential validation, this validates individual models.

def validate_credentials(self, model: str, credentials: dict) -> None:
    """
    Validate model credentials

    :param model: model name
    :param credentials: model credentials
    :return:
    """

Parameters:

  • model (string) Model name
  • credentials (object) Credential information

The credential parameters are defined by the provider YAML configuration file’s provider_credential_schema or model_credential_schema and passed in as api_key, etc. If validation fails, throw an errors.validate.CredentialsValidateFailedError error, as in the sketch below.
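
For example, a model-level validation sketch for an LLM, assuming a hypothetical chat completions endpoint (the method body belongs to your model class):

import requests

# Import path mirrors the dotted name used in this document; adjust to your project layout.
from errors.validate import CredentialsValidateFailedError


def validate_credentials(self, model: str, credentials: dict) -> None:
    # Issue a minimal request against the specific model to verify the credentials.
    try:
        response = requests.post(
            "https://api.example.com/v1/chat/completions",  # hypothetical endpoint
            headers={"Authorization": f"Bearer {credentials['api_key']}"},
            json={
                "model": model,
                "messages": [{"role": "user", "content": "ping"}],
                "max_tokens": 1,
            },
            timeout=10,
        )
    except requests.RequestException as e:
        raise CredentialsValidateFailedError(f"Connection failed: {e}")

    if response.status_code != 200:
        raise CredentialsValidateFailedError(response.text)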

  • Invocation error mapping table

When a model invocation raises an exception, it needs to be mapped to one of the Runtime's InvokeError types, which allows Dify to handle different errors differently. Runtime Errors:

  • InvokeConnectionError Connection error during invocation
  • InvokeServerUnavailableError Service provider unavailable
  • InvokeRateLimitError Rate limit reached
  • InvokeAuthorizationError Authentication failed
  • InvokeBadRequestError Incorrect parameters passed
@property
def _invoke_error_mapping(self) -> dict[type[InvokeError], list[type[Exception]]]:
    """
    Map model invoke error to unified error
    The key is the error type thrown to the caller
    The value is the error type thrown by the model,
    which needs to be converted into a unified error type for the caller.

    :return: Invoke error mapping
    """

You can also map each unified error type to itself, so that in subsequent calls you can directly raise exceptions such as InvokeConnectionError; see the sketch below.
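
A sketch of such a mapping, where each unified error type maps to itself so implementation code can raise it directly (the import path is an assumption; adjust to your project):

# Import path is an assumption; adjust to your project layout.
from errors.invoke import (
    InvokeError,
    InvokeConnectionError,
    InvokeServerUnavailableError,
    InvokeRateLimitError,
    InvokeAuthorizationError,
    InvokeBadRequestError,
)


@property
def _invoke_error_mapping(self) -> dict[type[InvokeError], list[type[Exception]]]:
    # Identity mapping: code inside _invoke can raise these unified errors directly.
    return {
        InvokeConnectionError: [InvokeConnectionError],
        InvokeServerUnavailableError: [InvokeServerUnavailableError],
        InvokeRateLimitError: [InvokeRateLimitError],
        InvokeAuthorizationError: [InvokeAuthorizationError],
        InvokeBadRequestError: [InvokeBadRequestError],
    }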

LLM

Inherit the __base.large_language_model.LargeLanguageModel base class and implement the following interface:

  • LLM Invocation

Implement the core method for LLM invocation, which can support both streaming and synchronous responses.

def _invoke(self, model: str, credentials: dict,
            prompt_messages: list[PromptMessage], model_parameters: dict,
            tools: Optional[list[PromptMessageTool]] = None, stop: Optional[list[str]] = None,
            stream: bool = True, user: Optional[str] = None) \
        -> Union[LLMResult, Generator]:
    """
    Invoke large language model

    :param model: model name
    :param credentials: model credentials
    :param prompt_messages: prompt messages
    :param model_parameters: model parameters
    :param tools: tools for tool calling
    :param stop: stop words
    :param stream: is stream response
    :param user: unique user id
    :return: full response or stream response chunk generator result
    """
  • Parameters:
    • model (string) Model name
    • credentials (object) Credential information

The credential parameters are defined by the provider YAML configuration file’s provider_credential_schema or model_credential_schema, passed in as api_key, etc.

  • prompt_messages (array[PromptMessage]) Prompt message list

If the model is of the Completion type, the list only needs to include one UserPromptMessage element; if the model is of the Chat type, different messages need to be passed in as a list of SystemPromptMessage, UserPromptMessage, AssistantPromptMessage, and ToolPromptMessage elements.

  • model_parameters (object) Model parameters defined by the model YAML configuration’s parameter_rules.

  • tools (array[PromptMessageTool]) [optional] Tool list, equivalent to function in function calling; this is the list of tools passed in for tool calling.

  • stop (array[string]) [optional] Stop sequences. The model will stop generating before outputting any of the defined stop strings.

  • stream (bool) Whether to stream output, default is True. Streaming output returns Generator[LLMResultChunk]; non-streaming output returns LLMResult.

  • user (string) [optional] A unique identifier for the user that can help the provider monitor and detect abuse.

  • Return Value

Streaming output returns Generator[LLMResultChunk]; non-streaming output returns LLMResult.
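
A structural sketch of _invoke that branches on stream. The helpers _chat_generate_stream and _chat_generate are hypothetical (not part of the base class); they would call the upstream API and assemble LLMResultChunk objects or a full LLMResult, and the entity import paths are illustrative:

from typing import Generator, Optional, Union

# Entity import paths are illustrative; use the ones from your plugin template.
from entities.llm import LLMResult
from entities.message_entities import PromptMessage, PromptMessageTool


def _invoke(self, model: str, credentials: dict,
            prompt_messages: list[PromptMessage], model_parameters: dict,
            tools: Optional[list[PromptMessageTool]] = None, stop: Optional[list[str]] = None,
            stream: bool = True, user: Optional[str] = None) \
        -> Union[LLMResult, Generator]:
    if stream:
        # Hypothetical helper: yields LLMResultChunk objects as the upstream API
        # streams tokens back; usage is attached to the last chunk only.
        return self._chat_generate_stream(
            model, credentials, prompt_messages, model_parameters, tools, stop, user
        )

    # Hypothetical helper: performs a blocking request and assembles an LLMResult
    # (reply message plus LLMUsage) from the complete response.
    return self._chat_generate(
        model, credentials, prompt_messages, model_parameters, tools, stop, user
    )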

  • Pre-calculate input tokens

If the model does not provide a token pre-calculation interface, you can directly return 0.

def get_num_tokens(self, model: str, credentials: dict, prompt_messages: list[PromptMessage],
                   tools: Optional[list[PromptMessageTool]] = None) -> int:
    """
    Get number of tokens for given prompt messages

    :param model: model name
    :param credentials: model credentials
    :param prompt_messages: prompt messages
    :param tools: tools for tool calling
    :return:
    """

Parameter explanations are the same as in LLM Invocation above. This interface should count tokens using the tokenizer appropriate to the given model. If the model does not provide a tokenizer, you can use the _get_num_tokens_by_gpt2(text: str) method of the AIModel base class for an approximate count, as in the sketch below.
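
A minimal sketch using that GPT-2 fallback; it simply flattens the plain-text message contents (a simplification that ignores multimodal content and tools, and the entity import path is illustrative):

from typing import Optional

# Entity import path is illustrative; use the one from your plugin template.
from entities.message_entities import PromptMessage, PromptMessageTool


def get_num_tokens(self, model: str, credentials: dict, prompt_messages: list[PromptMessage],
                   tools: Optional[list[PromptMessageTool]] = None) -> int:
    # Join the plain-text contents and count tokens with the GPT-2 tokenizer
    # provided by the AIModel base class.
    text = "\n".join(
        message.content for message in prompt_messages if isinstance(message.content, str)
    )
    return self._get_num_tokens_by_gpt2(text)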

  • Get custom model rules [optional]
def get_customizable_model_schema(self, model: str, credentials: dict) -> Optional[AIModelEntity]:
    """
    Get customizable model schema

    :param model: model name
    :param credentials: model credentials
    :return: model schema
    """

When a provider supports adding custom LLMs, this method can be implemented to allow custom models to obtain model rules. By default, it returns None.

For example, for most fine-tuned models under the OpenAI provider, the base model (such as gpt-3.5-turbo-1106) can be derived from the fine-tuned model's name, and the predefined parameter rules of that base model can then be returned; refer to the OpenAI implementation for details.
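
As a sketch, assuming fine-tuned model names of the form ft:<base_model>:... and a predefined_models() helper on the base class (both assumptions), the base model's predefined schema can be reused:

from typing import Optional

# Entity import path is illustrative; use the one from your plugin template.
from entities.model_entities import AIModelEntity


def get_customizable_model_schema(self, model: str, credentials: dict) -> Optional[AIModelEntity]:
    # Assumption: fine-tuned names look like "ft:<base_model>:...";
    # otherwise the name itself is treated as the base model.
    base_model = model.split(":")[1] if model.startswith("ft:") else model

    # Reuse the parameter rules of the matching predefined model, if any.
    for schema in self.predefined_models():  # assumed base-class helper
        if schema.model == base_model:
            return schema
    return None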

TextEmbedding

Inherit the __base.text_embedding_model.TextEmbeddingModel base class and implement the following interface:

  • Embedding Invocation
def _invoke(self, model: str, credentials: dict,
            texts: list[str], user: Optional[str] = None) \
        -> TextEmbeddingResult:
    """
    Invoke text embedding model

    :param model: model name
    :param credentials: model credentials
    :param texts: texts to embed
    :param user: unique user id
    :return: embeddings result
    """
  • Parameters:

  • model (string) Model name

  • credentials (object) Credential information

The credential parameters are defined by the provider YAML configuration file’s provider_credential_schema or model_credential_schema, passed in as api_key, etc.

  • texts (array[string]) Text list, can be processed in batch

  • user (string) [optional] A unique identifier for the user, which can help the provider monitor and detect abuse.

  • Return:

TextEmbeddingResult entity.
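
A minimal sketch assuming a hypothetical embeddings endpoint; price fields are filled with zeros for brevity, and the entity import paths are illustrative:

import time
from decimal import Decimal
from typing import Optional

import requests

# Entity import paths are illustrative; use the ones from your plugin template.
from entities.text_embedding import TextEmbeddingResult, EmbeddingUsage


def _invoke(self, model: str, credentials: dict,
            texts: list[str], user: Optional[str] = None) -> TextEmbeddingResult:
    started_at = time.perf_counter()

    # Hypothetical embeddings endpoint.
    response = requests.post(
        "https://api.example.com/v1/embeddings",
        headers={"Authorization": f"Bearer {credentials['api_key']}"},
        json={"model": model, "input": texts},
        timeout=30,
    )
    response.raise_for_status()
    data = response.json()

    usage = EmbeddingUsage(
        tokens=data["usage"]["total_tokens"],
        total_tokens=data["usage"]["total_tokens"],
        unit_price=Decimal("0"),
        price_unit=Decimal("0"),
        total_price=Decimal("0"),
        currency="USD",
        latency=time.perf_counter() - started_at,
    )

    return TextEmbeddingResult(
        model=model,
        embeddings=[item["embedding"] for item in data["data"]],
        usage=usage,
    )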

  • Pre-calculate tokens
def get_num_tokens(self, model: str, credentials: dict, texts: list[str]) -> int:
    """
    Get number of tokens for given texts

    :param model: model name
    :param credentials: model credentials
    :param texts: texts to embed
    :return:
    """

Parameter explanations can be found in the Embedding Invocation section above.

As with LargeLanguageModel above, this interface should count tokens using the tokenizer appropriate to the given model. If the model does not provide a tokenizer, you can use the _get_num_tokens_by_gpt2(text: str) method of the AIModel base class for an approximate count.

Rerank

Inherit the __base.rerank_model.RerankModel base class and implement the following interface:

  • Rerank Invocation
def _invoke(self, model: str, credentials: dict,
            query: str, docs: list[str], score_threshold: Optional[float] = None, top_n: Optional[int] = None,
            user: Optional[str] = None) \
        -> RerankResult:
    """
    Invoke rerank model

    :param model: model name
    :param credentials: model credentials
    :param query: search query
    :param docs: docs for reranking
    :param score_threshold: score threshold
    :param top_n: top n
    :param user: unique user id
    :return: rerank result
    """
  • Parameters:

  • model (string) Model name

  • credentials (object) Credential information. The credential parameters are defined by the provider YAML configuration file’s provider_credential_schema or model_credential_schema, passed in as api_key, etc.

  • query (string) Query request content

  • docs (array[string]) List of segments that need to be reranked

  • score_threshold (float) [optional] Score threshold

  • top_n (int) [optional] Take the top n segments

  • user (string) [optional] A unique identifier for the user, which can help the provider monitor and detect abuse.

  • Return:

RerankResult entity.
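
A minimal sketch assuming a hypothetical rerank endpoint that returns a relevance score per document; score_threshold and top_n are applied locally, and the entity import paths are illustrative:

from typing import Optional

import requests

# Entity import paths are illustrative; use the ones from your plugin template.
from entities.rerank import RerankResult, RerankDocument


def _invoke(self, model: str, credentials: dict,
            query: str, docs: list[str], score_threshold: Optional[float] = None,
            top_n: Optional[int] = None, user: Optional[str] = None) -> RerankResult:
    # Hypothetical endpoint returning {"results": [{"index": i, "score": s}, ...]}.
    response = requests.post(
        "https://api.example.com/v1/rerank",
        headers={"Authorization": f"Bearer {credentials['api_key']}"},
        json={"model": model, "query": query, "documents": docs},
        timeout=30,
    )
    response.raise_for_status()
    results = response.json()["results"]

    rerank_documents = [
        RerankDocument(index=item["index"], text=docs[item["index"]], score=item["score"])
        for item in results
        if score_threshold is None or item["score"] >= score_threshold
    ]
    rerank_documents.sort(key=lambda doc: doc.score, reverse=True)
    if top_n is not None:
        rerank_documents = rerank_documents[:top_n]

    return RerankResult(model=model, docs=rerank_documents)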

Speech2text

Inherit the __base.speech2text_model.Speech2TextModel base class and implement the following interface:

  • Invoke
def _invoke(self, model: str, credentials: dict,
            file: IO[bytes], user: Optional[str] = None) \
        -> str:
    """
    Invoke speech-to-text model

    :param model: model name
    :param credentials: model credentials
    :param file: audio file
    :param user: unique user id
    :return: text for given audio file
    """        
  • Parameters:

  • model (string) Model name

  • credentials (object) Credential information. The credential parameters are defined by the provider YAML configuration file’s provider_credential_schema or model_credential_schema, passed in as api_key, etc.

  • file (File) File stream

  • user (string) [optional] A unique identifier for the user, which can help the provider monitor and detect abuse.

  • Return:

String after speech conversion.
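
A minimal sketch assuming a hypothetical transcription endpoint that accepts a multipart file upload and returns a JSON body with a text field:

from typing import IO, Optional

import requests


def _invoke(self, model: str, credentials: dict,
            file: IO[bytes], user: Optional[str] = None) -> str:
    # Hypothetical transcription endpoint.
    response = requests.post(
        "https://api.example.com/v1/audio/transcriptions",
        headers={"Authorization": f"Bearer {credentials['api_key']}"},
        files={"file": file},
        data={"model": model},
        timeout=60,
    )
    response.raise_for_status()
    return response.json()["text"]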

Text2speech

Inherit the __base.text2speech_model.Text2SpeechModel base class and implement the following interface:

  • Invoke
def _invoke(self, model: str, credentials: dict, content_text: str, streaming: bool, user: Optional[str] = None):
    """
    Invoke text-to-speech model

    :param model: model name
    :param credentials: model credentials
    :param content_text: text content to be converted
    :param streaming: output is streaming
    :param user: unique user id
    :return: converted audio file
    """        
  • Parameters:

  • model (string) Model name

  • credentials (object) Credential information. The credential parameters are defined by the provider YAML configuration file’s provider_credential_schema or model_credential_schema, passed in as api_key, etc.

  • content_text (string) Text content to be converted

  • streaming (bool) Whether to stream output

  • user (string) [optional] A unique identifier for the user, which can help the provider monitor and detect abuse.

  • Return:

Audio stream after text conversion.

Moderation

Inherit the __base.moderation_model.ModerationModel base class and implement the following interface:

  • Invoke
def _invoke(self, model: str, credentials: dict,
            text: str, user: Optional[str] = None) \
        -> bool:
    """
    Invoke moderation model

    :param model: model name
    :param credentials: model credentials
    :param text: text to moderate
    :param user: unique user id
    :return: false if text is safe, true otherwise
    """
  • Parameters:

  • model (string) Model name

  • credentials (object) Credential information. The credential parameters are defined by the provider YAML configuration file’s provider_credential_schema or model_credential_schema, passed in as api_key, etc.

  • text (string) Text content

  • user (string) [optional] A unique identifier for the user, which can help the provider monitor and detect abuse.

  • Return:

False indicates the input text is safe; True indicates it is not.
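
A minimal sketch assuming a hypothetical moderation endpoint that returns a JSON body with a flagged boolean:

from typing import Optional

import requests


def _invoke(self, model: str, credentials: dict,
            text: str, user: Optional[str] = None) -> bool:
    # Hypothetical moderation endpoint; True means the text was flagged as unsafe.
    response = requests.post(
        "https://api.example.com/v1/moderations",
        headers={"Authorization": f"Bearer {credentials['api_key']}"},
        json={"model": model, "input": text},
        timeout=30,
    )
    response.raise_for_status()
    return bool(response.json()["flagged"])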

Entities

PromptMessageRole

Message role

class PromptMessageRole(Enum):
    """
    Enum class for prompt message.
    """
    SYSTEM = "system"
    USER = "user"
    ASSISTANT = "assistant"
    TOOL = "tool"

PromptMessageContentType

Message content type, divided into plain text and images.

class PromptMessageContentType(Enum):
    """
    Enum class for prompt message content type.
    """
    TEXT = 'text'
    IMAGE = 'image'

PromptMessageContent

Message content base class, used only for parameter declaration, cannot be initialized.

class PromptMessageContent(BaseModel):
    """
    Model class for prompt message content.
    """
    type: PromptMessageContentType
    data: str  # Content data

Currently, two content types are supported, text and image, and a single message can carry text together with multiple images. TextPromptMessageContent and ImagePromptMessageContent need to be initialized separately.

TextPromptMessageContent

class TextPromptMessageContent(PromptMessageContent):
    """
    Model class for text prompt message content.
    """
    type: PromptMessageContentType = PromptMessageContentType.TEXT

When passing in both text and images, the text needs to be constructed as this entity and included as part of the content list.

ImagePromptMessageContent

class ImagePromptMessageContent(PromptMessageContent):
    """
    Model class for image prompt message content.
    """
    class DETAIL(Enum):
        LOW = 'low'
        HIGH = 'high'

    type: PromptMessageContentType = PromptMessageContentType.IMAGE
    detail: DETAIL = DETAIL.LOW  # Resolution

When passing in both text and images, the images need to be constructed as this entity and included as part of the content list. data can be a URL or a base64-encoded image string; see the usage sketch under UserPromptMessage below.

PromptMessage

Base class for all Role message bodies, used only for parameter declaration, cannot be initialized.

class PromptMessage(ABC, BaseModel):
    """
    Model class for prompt message.
    """
    role: PromptMessageRole  # Message role
    content: Optional[str | list[PromptMessageContent]] = None  # Supports two types: string and content list. The content list is for multimodal needs, see PromptMessageContent for details.
    name: Optional[str] = None  # Name, optional.

UserPromptMessage

UserMessage message body, represents user messages.

class UserPromptMessage(PromptMessage):
    """
    Model class for user prompt message.
    """
    role: PromptMessageRole = PromptMessageRole.USER
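
A usage sketch combining both content types in one user message (the image data is a placeholder, and the entity import path is illustrative):

# Entity import path is illustrative; use the one from your plugin template.
from entities.message_entities import (
    UserPromptMessage,
    TextPromptMessageContent,
    ImagePromptMessageContent,
)

user_message = UserPromptMessage(
    content=[
        TextPromptMessageContent(data="Describe this picture."),
        ImagePromptMessageContent(
            data="data:image/png;base64,...",  # placeholder: URL or base64-encoded image
            detail=ImagePromptMessageContent.DETAIL.HIGH,
        ),
    ]
)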

AssistantPromptMessage

Represents model response messages, typically used for few-shot examples or chat history input.

class AssistantPromptMessage(PromptMessage):
    """
    Model class for assistant prompt message.
    """
    class ToolCall(BaseModel):
        """
        Model class for assistant prompt message tool call.
        """
        class ToolCallFunction(BaseModel):
            """
            Model class for assistant prompt message tool call function.
            """
            name: str  # Tool name
            arguments: str  # Tool parameters

        id: str  # Tool call ID; only meaningful for OpenAI tool calls. It uniquely identifies one invocation, and the same tool can be called multiple times
        type: str  # Default is function
        function: ToolCallFunction  # Tool call information

    role: PromptMessageRole = PromptMessageRole.ASSISTANT
    tool_calls: list[ToolCall] = []  # Model's tool call results (only returned when tools are passed in and the model decides to call them)

Here, tool_calls is the list of tool calls returned by the model after tools have been passed to it.

SystemPromptMessage

Represents system messages, typically used to set system instructions for the model.

class SystemPromptMessage(PromptMessage):
    """
    Model class for system prompt message.
    """
    role: PromptMessageRole = PromptMessageRole.SYSTEM

ToolPromptMessage

Represents tool messages, used to pass results to the model for next-step planning after a tool has been executed.

class ToolPromptMessage(PromptMessage):
    """
    Model class for tool prompt message.
    """
    role: PromptMessageRole = PromptMessageRole.TOOL
    tool_call_id: str  # Tool call ID; if OpenAI tool calls are not supported, the tool name can be passed instead

The content of the base class carries the tool execution result; see the combined sketch after PromptMessageTool below.

PromptMessageTool

class PromptMessageTool(BaseModel):
    """
    Model class for prompt message tool.
    """
    name: str  # Tool name
    description: str  # Tool description
    parameters: dict  # Tool parameters dict
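
A sketch tying the tool-calling entities together: a tool definition passed via the tools parameter, the tool_calls an assistant message may carry, and the ToolPromptMessage that feeds the execution result back to the model (import path illustrative):

# Entity import path is illustrative; use the one from your plugin template.
from entities.message_entities import (
    PromptMessageTool,
    AssistantPromptMessage,
    ToolPromptMessage,
)

# 1. Tool definition passed via the `tools` parameter of _invoke.
weather_tool = PromptMessageTool(
    name="get_weather",
    description="Get the current weather for a city",
    parameters={
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
)

# 2. The model replies with tool_calls (constructed by hand here for illustration).
assistant_message = AssistantPromptMessage(
    content="",
    tool_calls=[
        AssistantPromptMessage.ToolCall(
            id="call_0",
            type="function",
            function=AssistantPromptMessage.ToolCall.ToolCallFunction(
                name="get_weather",
                arguments='{"city": "Berlin"}',
            ),
        )
    ],
)

# 3. The tool's execution result is passed back as a ToolPromptMessage.
tool_message = ToolPromptMessage(
    content='{"temperature_c": 21}',
    tool_call_id="call_0",
)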


LLMResult

class LLMResult(BaseModel):
    """
    Model class for llm result.
    """
    model: str  # Actually used model
    prompt_messages: list[PromptMessage]  # Prompt message list
    message: AssistantPromptMessage  # Reply message
    usage: LLMUsage  # Tokens used and cost information
    system_fingerprint: Optional[str] = None  # Request fingerprint, refer to OpenAI parameter definition

LLMResultChunkDelta

Delta entity within each iteration in streaming response

class LLMResultChunkDelta(BaseModel):
    """
    Model class for llm result chunk delta.
    """
    index: int  # Sequence number
    message: AssistantPromptMessage  # Reply message
    usage: Optional[LLMUsage] = None  # Tokens used and cost information, only returned in the last message
    finish_reason: Optional[str] = None  # Completion reason, only returned in the last message

LLMResultChunk

Iteration entity in streaming response

class LLMResultChunk(BaseModel):
    """
    Model class for llm result chunk.
    """
    model: str  # Actually used model
    prompt_messages: list[PromptMessage]  # Prompt message list
    system_fingerprint: Optional[str] = None  # Request fingerprint, refer to OpenAI parameter definition
    delta: LLMResultChunkDelta  # Changes in content for each iteration

LLMUsage

class LLMUsage(ModelUsage):
    """
    Model class for llm usage.
    """
    prompt_tokens: int  # Tokens used by prompt
    prompt_unit_price: Decimal  # Prompt unit price
    prompt_price_unit: Decimal  # Prompt price unit, i.e., the number of tokens the unit price applies to
    prompt_price: Decimal  # Prompt cost
    completion_tokens: int  # Tokens used by completion
    completion_unit_price: Decimal  # Completion unit price
    completion_price_unit: Decimal  # Completion price unit, i.e., the number of tokens the unit price applies to
    completion_price: Decimal  # Completion cost
    total_tokens: int  # Total tokens used
    total_price: Decimal  # Total cost
    currency: str  # Currency unit
    latency: float  # Request time (s)

TextEmbeddingResult

class TextEmbeddingResult(BaseModel):
    """
    Model class for text embedding result.
    """
    model: str  # Actually used model
    embeddings: list[list[float]]  # Embedding vector list, corresponding to the input texts list
    usage: EmbeddingUsage  # Usage information

EmbeddingUsage

class EmbeddingUsage(ModelUsage):
    """
    Model class for embedding usage.
    """
    tokens: int  # Tokens used
    total_tokens: int  # Total tokens used
    unit_price: Decimal  # Unit price
    price_unit: Decimal  # Price unit, i.e., the number of tokens the unit price applies to
    total_price: Decimal  # Total cost
    currency: str  # Currency unit
    latency: float  # Request time (s)

RerankResult

class RerankResult(BaseModel):
    """
    Model class for rerank result.
    """
    model: str  # Actually used model
    docs: list[RerankDocument]  # List of reranked segments        

RerankDocument

class RerankDocument(BaseModel):
    """
    Model class for rerank document.
    """
    index: int  # Original sequence number
    text: str  # Segment text content
    score: float  # Score