Integrate Models on LiteLLM Proxy
LiteLLM Proxy is a proxy server that allows:
- Calling 100+ LLMs (OpenAI, Azure, Vertex, Bedrock) in the OpenAI format
- Using Virtual Keys to set budgets, rate limits, and track usage
Dify supports integrating the LLM and Text Embedding models available on LiteLLM Proxy.
Quick Integration
Step 1. Prepare the LiteLLM Proxy Config
LiteLLM requires a config file with all your models defined; we will call this file litellm_config.yaml.
Detailed docs on how to set up the LiteLLM config are available here.
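As a minimal sketch, a litellm_config.yaml exposing a single OpenAI model might look like the following; the model names and the environment-variable key reference are placeholders to adjust to your own provider (see the LiteLLM docs for the full set of options):

```yaml
model_list:
  - model_name: gpt-4                      # the name Dify (and any client) will request
    litellm_params:
      model: openai/gpt-4                  # provider/model that LiteLLM routes the request to
      api_key: os.environ/OPENAI_API_KEY   # read the provider key from an environment variable
```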
Step 2. Start LiteLLM Proxy
On success, the proxy will start running on http://localhost:4000
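As a sketch of this step, assuming LiteLLM was installed with `pip install 'litellm[proxy]'`, the proxy can be started with the config from Step 1 and then checked with a plain OpenAI-format request; the sk-1234 key below is a placeholder for a virtual key and can be omitted if no authentication is configured:

```bash
# Start the proxy using the config file from Step 1
litellm --config litellm_config.yaml

# From another terminal, send a test request in the OpenAI format
curl http://localhost:4000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-1234" \
  -d '{
    "model": "gpt-4",
    "messages": [{"role": "user", "content": "Hello"}]
  }'
```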
Step 3. Integrate LiteLLM Proxy in Dify
In Settings > Model Providers > OpenAI-API-compatible, fill in:

- Model Name: gpt-4
- Base URL: http://localhost:4000. Enter the base URL where the LiteLLM Proxy service is accessible.
- Model Type: Chat
- Model Context Length: 4096. The maximum context length of the model. If unsure, use the default value of 4096.
- Maximum Token Limit: 4096. The maximum number of tokens the model can return. If the model has no specific requirement, this can match the model context length.
- Support for Vision: Yes. Check this option if the model supports image understanding (multimodal), such as gpt-4o.
Click "Save" to use the model in the application after verifying that there are no errors.
The integration method for Embedding models is similar to that for LLMs; just change the model type to Text Embedding.
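As an illustrative sketch, an embedding model can be exposed on the same proxy by adding another entry to litellm_config.yaml; the text-embedding-ada-002 names below are placeholders for whichever embedding model your provider offers:

```yaml
model_list:
  - model_name: text-embedding-ada-002           # name to enter in Dify
    litellm_params:
      model: openai/text-embedding-ada-002       # provider/model that LiteLLM routes the request to
      api_key: os.environ/OPENAI_API_KEY         # read the provider key from an environment variable
```

In Dify, enter this model name with the same base URL and select Text Embedding as the model type.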
More Information
For more information on LiteLLM, please refer to the official LiteLLM documentation.