
Introduction

You can use the Knowledge Retrieval node to integrate existing knowledge bases into your Chatflows or Workflows. The node searches the selected knowledge base(s) for information relevant to a query and outputs the results as contextual content for use in downstream nodes (e.g., LLMs). Below is an example of using the Knowledge Retrieval node in a Chatflow:
  1. The User Input node collects the user query.
  2. The Knowledge Retrieval node searches the selected knowledge base(s) for content related to the user query and outputs the retrieval results.
  3. The LLM node generates a response based on both the user query and retrieved knowledge.
  4. The Answer node returns the LLM’s response to the user.
Knowledge Retrieval Node Use Case
Before using a Knowledge Retrieval node, ensure that you have at least one available knowledge base. To learn about creating knowledge bases, see Knowledge.

Configure a Knowledge Retrieval Node

To make the Knowledge Retrieval node work properly, you need to specify:
  • What it should search for (the query)
  • Where it should search (the knowledge base)
  • How to process the retrieval results (the node-level retrieval settings)
You can also use document metadata to enable filter-based searches and further improve retrieval precision.

Specify the Query

Provide the query content that the node should search for in the selected knowledge base(s).
  • Query Text: Select a text variable. For example, use userinput.query to reference user input in Chatflows, or a custom text-type user input variable in Workflows.
  • Query Images: Select an image variable, e.g., the image(s) uploaded by the user through a User Input node, to search by image. The image size limit is 2 MB.
    For self-hosted deployments, you can adjust the image size limit via the environment variable ATTACHMENT_IMAGE_FILE_SIZE_LIMIT.
    The Query Images option is available only when at least one multimodal knowledge base is added. Such knowledge bases are marked with the Vision icon, indicating that they use a multimodal embedding model.
Add one or more existing knowledge bases for the node to search for content relevant to the query. When multiple knowledge bases are added, knowledge is first retrieved from all of them simultaneously, then combined and processed according to the node-level retrieval settings.
Knowledge bases marked with the Vision icon support cross-modal retrieval—retrieving both text and images based on semantic relevance.
Click the Edit icon next to any added knowledge base to modify its settings directly within the Knowledge Retrieval node. To learn more about these settings, see Manage Knowledge Settings.
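
The "retrieve from all, then combine" behavior described above can be pictured with a small Python sketch. This is only an illustration, not how Dify implements it: the `retrieve_from_all` function, the search callables, and the `source` field are all hypothetical names introduced here.

```python
from concurrent.futures import ThreadPoolExecutor


def retrieve_from_all(knowledge_bases, query):
    """Query every knowledge base concurrently and pool the results.

    `knowledge_bases` maps a (hypothetical) name to a (hypothetical) search
    callable that takes the query and returns a list of chunk dicts.
    """
    with ThreadPoolExecutor() as pool:
        # Submit one search per knowledge base so they run side by side.
        futures = {
            name: pool.submit(search, query)
            for name, search in knowledge_bases.items()
        }
        combined = []
        for name, future in futures.items():
            for chunk in future.result():
                # Tag each chunk with the knowledge base it came from.
                combined.append({**chunk, "source": name})
    return combined
```

The combined list is what the node-level retrieval settings would then rerank and trim.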

Configure Node-Level Retrieval Settings

Further fine-tune how the node processes retrieval results after they are fetched from the knowledge base(s).
There are two layers of retrieval settings: the knowledge base level and the Knowledge Retrieval node level. Think of them as two consecutive filters: the knowledge base settings determine the initial pool of results, and the node settings further rerank the results or narrow down the pool.
  • Rerank Settings
    • Weighted Score: The relative weight between semantic similarity and keyword matching during reranking. Higher semantic weight favors meaning relevance, while higher keyword weight favors exact matches.
      Weighted Score is available only when all added knowledge bases are high-quality ones.
    • Rerank Model: The rerank model to re-score and reorder all the results based on their relevance to the query.
    If any multimodal knowledge bases are added, select a multimodal rerank model (indicated by the Vision icon) as well. Otherwise, retrieved images will be excluded from reranking and the final output.
  • Top K: The maximum number of top results to return after reranking. When a rerank model is selected, this value will be automatically adjusted based on the model’s maximum input capacity (how much text the model can process at once).
  • Score Threshold: The minimum similarity score for returned results. Results scoring below this threshold are excluded. Use higher thresholds for stricter relevance or lower thresholds to include broader matches.
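
To make the interaction between Weighted Score, Top K, and Score Threshold more concrete, here is a minimal Python sketch. It assumes each chunk already carries a semantic score and a keyword score between 0 and 1, that the two weights sum to 1, and that the threshold is applied before the Top K cut. The function name, field names, and default values are hypothetical, and the actual scoring in Dify may differ.

```python
def process_results(chunks, semantic_weight=0.7, top_k=3, score_threshold=0.5):
    """Rerank pooled chunks with a weighted score, then filter and truncate."""
    # Assumption: keyword weight is whatever is left after the semantic weight.
    keyword_weight = 1.0 - semantic_weight

    scored = []
    for chunk in chunks:
        score = (
            semantic_weight * chunk["semantic"]
            + keyword_weight * chunk["keyword"]
        )
        scored.append({**chunk, "score": score})

    # Drop results below the threshold, then keep the best `top_k`.
    kept = [c for c in scored if c["score"] >= score_threshold]
    kept.sort(key=lambda c: c["score"], reverse=True)
    return kept[:top_k]
```

With a high semantic weight, a chunk that matches the meaning of the query outranks one that only shares keywords. Lowering the semantic weight reverses that preference, and raising the threshold shrinks the list regardless of Top K.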

Enable Metadata Filtering

Use existing document metadata to restrict retrieval to specific documents within your knowledge base, improving retrieval precision. With metadata filtering enabled, the Knowledge Retrieval node only searches documents that match the specified metadata conditions, rather than searching across the entire knowledge base. This is especially useful for targeted searching in large and diverse knowledge bases.
To learn more about creating and managing document metadata, see Metadata.
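
Conceptually, metadata filtering narrows the set of documents before any search happens. The Python sketch below illustrates the idea with exact-match conditions only. The function names, the `metadata` layout, and the sample keys are hypothetical, and the real node may support other comparison operators.

```python
def matches(metadata, conditions):
    """Return True when every condition equals the document's metadata value."""
    return all(metadata.get(key) == value for key, value in conditions.items())


def filter_documents(documents, conditions):
    """Keep only the documents whose metadata satisfies all conditions."""
    return [doc for doc in documents if matches(doc["metadata"], conditions)]


documents = [
    {"title": "2023 Pricing", "metadata": {"year": 2023, "team": "sales"}},
    {"title": "2024 Pricing", "metadata": {"year": 2024, "team": "sales"}},
    {"title": "2024 Roadmap", "metadata": {"year": 2024, "team": "product"}},
]

# Only documents from the sales team in 2024 remain searchable.
searchable = filter_documents(documents, {"year": 2024, "team": "sales"})
```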

Output

The Knowledge Retrieval node outputs the retrieval results as a variable named result, which is an array of retrieved document chunks containing their content, metadata, title, and other attributes. When the retrieval results contain image attachments, the result variable also includes a field named files containing image details.
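
As a rough picture of how a downstream step might read this variable, the Python sketch below walks over a hand-written sample. Only the names `result`, `content`, `metadata`, `title`, and `files` come from the description above. The exact nesting (for example, `files` sitting on each chunk) and the sample values are assumptions made for illustration.

```python
def summarize(result):
    """Return (title, content preview, image count) for each retrieved chunk."""
    rows = []
    for chunk in result:
        # Assumption: `files` is present only when the chunk has image attachments.
        files = chunk.get("files", [])
        rows.append((chunk["title"], chunk["content"][:40], len(files)))
    return rows


# Hypothetical sample shaped like the description above.
sample_result = [
    {
        "title": "Returns Policy",
        "content": "Items can be returned within 30 days of delivery.",
        "metadata": {"score": 0.82},
    },
    {
        "title": "Assembly Guide",
        "content": "Attach the legs before mounting the shelf.",
        "metadata": {"score": 0.64},
        "files": [{"name": "step1.png"}],
    },
]

rows = summarize(sample_result)
```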

Use with LLM Nodes

To use the retrieval results to answer user questions in an LLM node:
  1. In the Context field, select the result variable of the Knowledge Retrieval node.
  2. In the LLM prompt field, reference both the Context variable and the user input variable (e.g., userinput.query in Chatflows).
  3. (Optional) If the LLM supports vision capabilities, enable Vision to allow it to interpret image attachments in the Context variable.
    Once Vision is enabled, the LLM can directly understand images in the Context variable—you don’t have to set a Vision input variable for this.
LLM Node Configuration Example
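
The LLM node handles the Context variable for you. The Python sketch below only illustrates the underlying idea of combining retrieved chunks with the user query in a single prompt. The `build_prompt` function, the prompt wording, and the chunk layout are hypothetical and do not reflect how Dify assembles prompts internally.

```python
def build_prompt(result, query):
    """Join retrieved chunks into a context block and append the user query."""
    context = "\n\n".join(
        f"[{chunk['title']}]\n{chunk['content']}" for chunk in result
    )
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )


prompt = build_prompt(
    [{"title": "Returns Policy", "content": "Items can be returned within 30 days."}],
    "How long do I have to return an item?",
)
```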
On Dify Cloud, knowledge retrieval operations are subject to rate limits based on the subscription plan. For more information, see Knowledge Request Rate Limit.