Agent
This document details the development process for Dify’s Agent strategy plugins, including adding Agent strategy fields in the Manifest file, defining Agent providers, and the core steps for implementing Agent strategies. It provides complete example code for getting parameters, invoking models, invoking tools, and generating and managing logs.
An Agent strategy is an extensible template that defines standard input content and output formats. By implementing the interfaces of a specific Agent strategy, you can build strategies such as CoT (Chain of Thought), ToT (Tree of Thoughts), GoT (Graph of Thoughts), and BoT (Backbone of Thought), as well as more complex strategies like Semantic Kernel.
Add Fields in Manifest
To add an Agent strategy to a plugin, add the plugins.agent_strategies field to the manifest.yaml file and define the Agent provider. Here is an example:
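The manifest itself is YAML. As a rough sketch only, the relevant structure can be mirrored as a Python dict to show where the field sits. The provider path provider/agent.yaml and the helper agent_provider_files are assumptions made for illustration, not names taken from the SDK.

```python
# Mirrors only the part of manifest.yaml discussed above.
# The path "provider/agent.yaml" is an assumed location for the provider file.
manifest = {
    "plugins": {
        "agent_strategies": ["provider/agent.yaml"],
    },
}


def agent_provider_files(manifest: dict) -> list[str]:
    """Return the Agent provider files declared under plugins.agent_strategies."""
    return list(manifest.get("plugins", {}).get("agent_strategies", []))


print(agent_provider_files(manifest))
```

Each entry under plugins.agent_strategies points at one Agent provider file, which is described in the next section.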
Some unrelated fields in the manifest file have been omitted here. For the detailed format of the Manifest, please refer to the Define Plugin Information via Manifest File document.
Define Agent Provider
Next, create a new agent.yaml file and fill in the basic Agent provider information.
It mainly contains basic descriptive content and specifies which strategies the current provider includes. In the example code above, only the most basic function_calling.yaml strategy file is specified.
Define and Implement Agent Strategy
Definition
Next, define the Agent strategy itself. Create a new function_calling.yaml file:
The format is similar to the Tool standard format. It defines four parameters (model, tools, query, and max_iterations), which are enough to implement the most basic Agent strategy. With this definition, users can select a model and the tools to use, configure the maximum number of iterations, and finally enter a query to start the Agent.
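To make the four parameters concrete, here is a minimal sketch that mirrors them as Python data and applies defaults. The types model-selector and array[tools] come from this document. The string and number types, the required flags, the default of 5, and the helper apply_defaults are assumptions for illustration.

```python
from typing import Any

# Assumed mirror of the four parameter definitions in function_calling.yaml.
STRATEGY_PARAMETERS = [
    {"name": "model", "type": "model-selector", "required": True},
    {"name": "tools", "type": "array[tools]", "required": True},
    {"name": "query", "type": "string", "required": True},
    # The default of 5 is an assumption, not a documented value.
    {"name": "max_iterations", "type": "number", "required": False, "default": 5},
]


def apply_defaults(raw: dict[str, Any]) -> dict[str, Any]:
    """Fill in optional defaults and reject input that lacks required parameters."""
    resolved = dict(raw)
    missing = []
    for spec in STRATEGY_PARAMETERS:
        name = spec["name"]
        if name in resolved:
            continue
        if spec["required"]:
            missing.append(name)
        elif "default" in spec:
            resolved[name] = spec["default"]
    if missing:
        raise ValueError(f"missing required parameters: {', '.join(missing)}")
    return resolved


print(apply_defaults({"model": {}, "tools": [], "query": "hello"}))
```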
Write Functional Implementation Code
Get Parameters
Of the four parameters defined above, model has the type model-selector and tools has the special type array[tools]. The raw values received in the parameters can be converted with the SDK's built-in AgentModelConfig and list[ToolEntity].
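The conversion can be sketched as follows. This sketch does not import the SDK: ModelConfigStub and ToolStub are hypothetical stand-ins for AgentModelConfig and ToolEntity, their fields are assumptions, and the default of 5 iterations is also assumed.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class ModelConfigStub:
    """Hypothetical stand-in for the SDK's AgentModelConfig."""
    provider: str
    model: str


@dataclass
class ToolStub:
    """Hypothetical stand-in for the SDK's ToolEntity."""
    name: str
    description: str = ""


@dataclass
class FunctionCallingParams:
    """Typed view of the four strategy parameters."""
    query: str
    model: ModelConfigStub
    tools: list[ToolStub] = field(default_factory=list)
    max_iterations: int = 5  # assumed default

    @classmethod
    def from_raw(cls, raw: dict[str, Any]) -> "FunctionCallingParams":
        """Convert the raw parameter dict into typed objects."""
        return cls(
            query=raw["query"],
            model=ModelConfigStub(**raw["model"]),
            tools=[ToolStub(**tool) for tool in raw.get("tools") or []],
            max_iterations=int(raw.get("max_iterations", 5)),
        )


params = FunctionCallingParams.from_raw({
    "query": "What is the weather in Paris?",
    "model": {"provider": "openai", "model": "gpt-4o"},
    "tools": [{"name": "get_weather", "description": "Look up the weather"}],
})
print(params.model.model, [tool.name for tool in params.tools])
```

Converting once at the top of the strategy keeps the rest of the code free of raw dictionary lookups.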
Invoke Model
Invoking the specified model is an essential capability of an Agent plugin. Use the session.model.invoke() function in the SDK to invoke the model. The required inputs can be taken from the model parameter.
Example method signature for invoking the model:
You need to pass the model information model_config, the prompt information prompt_messages, and the tool information tools.
For how to build prompt_messages, refer to the example code below; the tools must first be converted into the form the model expects.
Please refer to the example code for using invoke model:
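As a minimal sketch of this flow, the code below converts tool definitions into the form handed to the model and then consumes a streamed response. Nothing here calls the real SDK: FakeModelSession stands in for session.model, and PromptTool, Chunk, to_prompt_tools, and run_model are hypothetical names whose fields are assumptions.

```python
from dataclasses import dataclass, field
from typing import Any, Iterator


@dataclass
class PromptTool:
    """Hypothetical stand-in for the tool description passed to the model."""
    name: str
    description: str
    parameters: dict[str, Any]


@dataclass
class Chunk:
    """Hypothetical stand-in for one streamed model chunk."""
    text: str = ""
    tool_calls: list[dict[str, Any]] = field(default_factory=list)


def to_prompt_tools(tools: list[dict[str, Any]]) -> list[PromptTool]:
    """Convert selected tool definitions into model-facing tool descriptions."""
    return [
        PromptTool(
            name=tool["name"],
            description=tool.get("description", ""),
            parameters={"type": "object", "properties": tool.get("properties", {})},
        )
        for tool in tools
    ]


class FakeModelSession:
    """Stand-in for session.model; a real call would reach the selected LLM."""

    def invoke(self, model_config, prompt_messages, tools=None) -> Iterator[Chunk]:
        yield Chunk(text="Checking ")
        yield Chunk(text="the weather.")
        if tools:
            yield Chunk(tool_calls=[{"name": tools[0].name, "arguments": {"city": "Paris"}}])


def run_model(session, model_config, prompt_messages, tools):
    """Accumulate streamed text and any tool calls the model requested."""
    text, calls = "", []
    for chunk in session.invoke(
        model_config=model_config, prompt_messages=prompt_messages, tools=tools
    ):
        text += chunk.text
        calls.extend(chunk.tool_calls)
    return text, calls


prompt_tools = to_prompt_tools([{"name": "get_weather", "description": "Look up the weather"}])
text, calls = run_model(FakeModelSession(), {"model": "demo"}, ["What is the weather?"], prompt_tools)
print(text, calls)
```

In a Function Calling loop, an empty list of tool calls usually means the model has produced its final answer, while a non-empty list means the tools should be invoked before the next iteration.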
Invoke Tool
Invoking tools is also an essential capability of an Agent plugin. You can use self.session.tool.invoke() to call them. Example method signature for invoking a tool:
The required parameters are provider_type, provider, tool_name, and parameters. In Function Calling, tool_name and parameters are often generated by the LLM. Example code for using invoke tool:
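The dispatch step can be sketched like this, assuming the LLM returns a tool name and arguments. FakeToolSession stands in for self.session.tool, and the registry layout, the runtime_parameters key, and the helper call_tool are hypothetical. Only the four keyword names passed to invoke come from this document.

```python
from typing import Any, Iterator


class FakeToolSession:
    """Stand-in for session.tool; a real call would run the selected plugin tool."""

    def invoke(
        self,
        provider_type: str,
        provider: str,
        tool_name: str,
        parameters: dict[str, Any],
    ) -> Iterator[dict[str, Any]]:
        # Yields results one at a time, like the streamed output of a real tool.
        yield {"tool": tool_name, "provider": provider, "parameters": parameters}


def call_tool(session, registry: dict[str, dict[str, Any]], tool_call: dict[str, Any]) -> list:
    """Dispatch one LLM-generated tool call to the matching configured tool."""
    tool = registry.get(tool_call["name"])
    if tool is None:
        # Report unknown tools back to the model instead of raising.
        return [{"error": f"there is no tool named {tool_call['name']}"}]
    # Preset values are overridden by the arguments the LLM generated.
    merged = {**tool.get("runtime_parameters", {}), **tool_call["arguments"]}
    return list(
        session.invoke(
            provider_type=tool["provider_type"],
            provider=tool["provider"],
            tool_name=tool["name"],
            parameters=merged,
        )
    )


registry = {
    "get_weather": {
        "name": "get_weather",
        "provider_type": "builtin",
        "provider": "weather",
        "runtime_parameters": {"unit": "celsius"},
    }
}
print(call_tool(FakeToolSession(), registry, {"name": "get_weather", "arguments": {"city": "Paris"}}))
```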
The output of the self.session.tool.invoke() function is a Generator, which means it also needs to be parsed as a stream.
Please refer to the following function for the parsing method:
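One way to parse such a stream is sketched below. ToolMessage is a hypothetical stand-in for the SDK's message type, and the message kinds text, link, and json are assumptions chosen for illustration.

```python
import json
from dataclasses import dataclass
from typing import Any, Iterable


@dataclass
class ToolMessage:
    """Hypothetical stand-in for one streamed tool message."""
    type: str  # assumed kinds: "text", "link", "json"
    payload: Any


def parse_invoke_response(messages: Iterable[ToolMessage]) -> str:
    """Consume the stream message by message and build one text result."""
    parts = []
    for message in messages:
        if message.type == "text":
            parts.append(str(message.payload))
        elif message.type == "link":
            parts.append(f"result link: {message.payload}")
        elif message.type == "json":
            parts.append(json.dumps(message.payload, ensure_ascii=False))
        else:
            # Fall back to the plain string form for unknown message kinds.
            parts.append(str(message.payload))
    return "".join(parts)


def demo_stream():
    yield ToolMessage("text", "Sunny, ")
    yield ToolMessage("json", {"temp_c": 21})


print(parse_invoke_response(demo_stream()))
```

The resulting string can then be appended to the prompt messages as the tool's answer for the next model call.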
Log
If you want to see the Agent’s thinking process, besides viewing the normally returned messages, you can use a dedicated interface to display the Agent’s entire thinking process as a tree.
Create Log
- This interface creates and returns an AgentLogMessage, which represents a node in the log tree.
- If parent is passed, the node is attached to that parent node.
- The status defaults to “Success”. However, if you want to better display the task execution process, you can first set the status to “start” to show a “running” log, and then update the log’s status to “Success” after the task is completed. This allows users to clearly see the entire process from start to finish.
- label is the log title displayed to the user.
Finish Log
If you chose the start status as the initial state in the previous step, you can use the finish log interface to change the status.
Example
This example shows a simple two-step execution process: first, output a “Thinking” log in the start status, then complete the actual task processing and finish the log.
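The two steps can be sketched without the SDK as follows. LogNode is a hypothetical stand-in for AgentLogMessage, and create_log_message and finish_log_message here are local helpers that mimic the create and finish interfaces described above. Their exact signatures in the SDK are not confirmed by this document.

```python
from dataclasses import dataclass, field
from itertools import count
from typing import Any, Iterator, Optional

_ids = count(1)


@dataclass
class LogNode:
    """Hypothetical stand-in for a node in the Agent log tree."""
    label: str
    data: dict[str, Any]
    status: str = "success"
    parent_id: Optional[int] = None
    id: int = field(default_factory=lambda: next(_ids))


def create_log_message(label, data, status="success", parent=None) -> LogNode:
    """Create a log node, optionally attached to a parent node."""
    return LogNode(label=label, data=data, status=status,
                   parent_id=parent.id if parent else None)


def finish_log_message(log: LogNode, status: str = "success") -> LogNode:
    """Move a running log node to its final status."""
    log.status = status
    return log


def run_step() -> Iterator[LogNode]:
    thinking = create_log_message("Thinking", {"step": 1}, status="start")
    yield thinking  # displayed as running at this point
    _answer = "processed"  # placeholder for the actual task processing
    yield finish_log_message(thinking)  # same node, now marked as finished


steps = run_step()
first = next(steps)
print(first.label, first.status)
second = next(steps)
print(second.label, second.status)
```

Yielding the same node twice, first in the start status and then finished, is what lets the interface show the step as running before it completes.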
Related Resources
- Getting Started with Dify Plugins - Understand the overall architecture of plugin development
- Agent Strategy Plugin Example - A practical example of Agent strategy plugin development
- Define Plugin Information via Manifest File - Understand the detailed format of the Manifest file
- Reverse Invocation: Model - Learn how to invoke model capabilities within the platform
- Reverse Invocation: Tool - Learn how to invoke other plugins