> ## Documentation Index
> Fetch the complete documentation index at: https://docs.dify.ai/llms.txt
> Use this file to discover all available pages before exploring further.

# LLM

> Invoke language models for text generation and analysis

The LLM node invokes language models to process text, images, and documents. It sends prompts to your configured models and captures their responses, supporting structured outputs, context management, and multimodal inputs.

<Frame caption="LLM Node Configuration Interface">
  ![LLM Node Configuration Interface](https://assets-docs.dify.ai/dify-enterprise-mintlify/en/guides/workflow/node/85730fbfa1d441d12d969b89adf2670e.png)
</Frame>

<Info>
  Configure at least one model provider in **System Settings → Model Providers** before using LLM nodes.
</Info>

## Model Selection and Parameters

Choose from any model provider you've configured. Different models excel at different tasks - GPT-4 and Claude 3.5 handle complex reasoning well but cost more, while GPT-3.5 Turbo balances capability with affordability. For local deployment, use Ollama, LocalAI, or Xinference.

<Frame caption="Model Selection and Parameter Configuration">
  ![Model Selection and Parameter Configuration](https://assets-docs.dify.ai/dify-enterprise-mintlify/en/guides/workflow/node/43f81418ea70d4d79e3705505e777b1b.png)
</Frame>

Model parameters control response generation. **Temperature** ranges from 0 (deterministic) to 1 (creative). **Top P** limits word choices by probability. **Frequency Penalty** reduces repetition. **Presence Penalty** encourages new topics. You can also use presets: **Precise**, **Balanced**, or **Creative**.

## Prompt Configuration

Your interface adapts based on model type. Chat models use message roles (**System** for behavior, **User** for input, **Assistant** for examples), while completion models use simple text continuation.

Reference workflow variables in prompts using double curly braces: `{{variable_name}}`. Variables are replaced with actual values before reaching the model.

```text theme={null}
System: You are a technical documentation expert.
User: {{user_input}}
```

## Context Variables

Context variables inject external knowledge while preserving source attribution. This enables RAG applications where LLMs answer questions using your specific documents.

<Frame caption="Using Context Variables for RAG Applications">
  ![Using Context Variables for RAG Applications](https://assets-docs.dify.ai/dify-enterprise-mintlify/en/guides/workflow/node/5aefed96962bd994f8f05bac96b11e22.png)
</Frame>

Connect a Knowledge Retrieval node's output to your LLM node's context input, then reference it:

```text theme={null}
Answer using only this context:
{{knowledge_retrieval.result}}

Question: {{user_question}}
```

When using context variables from knowledge retrieval, Dify automatically tracks citations so users see information sources.

## Structured Outputs

Force models to return specific data formats like JSON for programmatic use. Configure through three methods:

<Tabs>
  <Tab title="Visual Editor">
    User-friendly interface for simple structures. Add fields with names and types, mark required fields, set descriptions. The editor generates JSON Schema automatically.
  </Tab>

  <Tab title="JSON Schema">
    Write schemas directly for complex structures with nested objects, arrays, and validation rules.

    ```json theme={null}
    {
      "type": "object",
      "properties": {
        "sentiment": {
          "type": "string",
          "enum": ["positive", "negative", "neutral"]
        }
      },
      "required": ["sentiment"]
    }
    ```
  </Tab>

  <Tab title="AI Generation">
    Describe needs in plain language and let AI generate the schema.
  </Tab>
</Tabs>

<Warning>
  Models with native JSON support handle structured outputs reliably. For others, Dify includes the schema in prompts, but results may vary.
</Warning>

## Memory and File Processing

<Frame>
  <img src="https://mintcdn.com/dify-6c0370d8/gyesM3ime6gTaYSO/images/use-dify/workflow/llm-memory.png?fit=max&auto=format&n=gyesM3ime6gTaYSO&q=85&s=64bb594e25b3fc81ddc1149a1ad8b73c" alt="LLM Memory" width="1084" height="668" data-path="images/use-dify/workflow/llm-memory.png" />
</Frame>

Enable Memory to maintain context across multiple LLM calls within a chatflow conversation. When enabled, previous interactions will be included in subsequent prompts as formatted user - assistant outputs. You can customize what goes into the user prompts by editing the `USER` template. Memory is node-specific and doesn't persist between different conversations.

For **File Processing**, add file variables to prompts for multimodal models. GPT-4V handles images, Claude processes PDFs directly, while other models might need preprocessing.

### Vision Configuration

When processing images, you can control the detail level:

* **High detail** - Better accuracy for complex images but uses more tokens
* **Low detail** - Faster processing with fewer tokens for simple images

The default variable selector for vision is `userinput.files` which automatically picks up files from the User Input node.

<Frame caption="File Processing with Multimodal LLMs">
  ![File Processing with Multimodal LLMs](https://assets-docs.dify.ai/2024/11/05b3d4a78038bc7afbb157078e3b2b26.png)
</Frame>

## Jinja2 Template Support

LLM prompts support Jinja2 templating for advanced variable handling. When you use Jinja2 mode (`edition_type: "jinja2"`), you can:

```jinja theme={null}
{% for item in search_results %}
{{ loop.index }}. {{ item.title }}: {{ item.content }}
{% endfor %}
```

Jinja2 variables are processed separately from regular variable substitution, allowing for loops, conditionals, and complex data transformations within prompts.

## Streaming Output

LLM nodes support streaming output by default. Each text chunk is yielded as a `RunStreamChunkEvent`, enabling real-time response display. File outputs (images, documents) are processed and saved automatically during streaming.

## Error Handling

Configure retry behavior for failed LLM calls. Set maximum retry attempts, intervals between retries, and backoff multipliers. Define fallback strategies like default values, error routing, or alternative models when retries aren't sufficient.
