Integrate Models on LiteLLM Proxy
LiteLLM Proxy is a proxy server that allows:
Calling 100+ LLMs (OpenAI, Azure, Vertex, Bedrock) in the OpenAI format
Using Virtual Keys to set Budgets, Rate limits and track usage
Dify supports integrating the LLM and Text Embedding models available on LiteLLM Proxy.
LiteLLM requires a config file with all your models defined; we will call this file litellm_config.yaml.
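A minimal sketch of such a config follows; the model name, provider, and environment variable are illustrative and should be replaced with your own:

```yaml
model_list:
  - model_name: gpt-4                      # name exposed by the proxy (Dify will reference this)
    litellm_params:
      model: openai/gpt-4                  # underlying provider/model
      api_key: os.environ/OPENAI_API_KEY   # read the API key from an environment variable
```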
Next, start LiteLLM Proxy with this config file. On success, the proxy will be running on http://localhost:4000.
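For example, assuming LiteLLM is installed with the proxy extra (e.g. pip install 'litellm[proxy]'):

```bash
litellm --config litellm_config.yaml
```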
In Settings > Model Providers > OpenAI-API-compatible, fill in:
Model Name: gpt-4
Base URL: http://localhost:4000
Enter the base URL where the LiteLLM service is accessible.
Model Type: Chat
Model Context Length: 4096
The maximum context length of the model. If unsure, use the default value of 4096.
Maximum Token Limit: 4096
The maximum number of tokens returned by the model. If there are no specific requirements for the model, this can be consistent with the model context length.
Support for Vision: Yes
Check this option if the model supports image understanding (multimodal), like gpt-4o.
Click "Save" to use the model in the application after verifying that there are no errors.
The integration method for Embedding models is similar to LLM, just change the model type to Text Embedding.
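For example, an embedding entry in litellm_config.yaml might look like the sketch below; the model name and environment variable are illustrative:

```yaml
model_list:
  - model_name: text-embedding-ada-002       # name Dify will reference
    litellm_params:
      model: openai/text-embedding-ada-002   # underlying provider/model
      api_key: os.environ/OPENAI_API_KEY     # read the API key from an environment variable
```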
For more information on LiteLLM, please refer to the LiteLLM documentation.