# Convert Audio to Text

> Convert an audio file to text. Supported formats: `mp3`, `mp4`, `mpeg`, `mpga`, `m4a`, `wav`, `webm`. The file size limit is `30 MB`.
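
The format and size limits above can be checked on the client before uploading, which avoids a round trip that would end in `415` or `413`. The following is a minimal sketch, not part of the API: `check_audio_file` is a hypothetical helper, it assumes the format can be judged from the file extension (the server may inspect the content type instead), and it assumes `30 MB` means 30 × 1024 × 1024 bytes.

```python
from pathlib import Path

# Formats listed in the endpoint description.
SUPPORTED_EXTENSIONS = {"mp3", "mp4", "mpeg", "mpga", "m4a", "wav", "webm"}

# Assumption: "30 MB" is read as 30 * 1024 * 1024 bytes; the server may count differently.
MAX_BYTES = 30 * 1024 * 1024


def check_audio_file(path: str) -> Path:
    """Hypothetical pre-upload check: return the path if it looks uploadable."""
    p = Path(path)
    ext = p.suffix.lower().lstrip(".")
    if ext not in SUPPORTED_EXTENSIONS:
        raise ValueError(f"unsupported audio type: {ext or '(none)'}")
    size = p.stat().st_size
    if size > MAX_BYTES:
        raise ValueError(f"audio file too large: {size} bytes")
    return p
```

A passing check does not guarantee that the server accepts the file; it only filters out the obvious rejections early.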



## OpenAPI

````yaml /en/api-reference/openapi_completion.json post /audio-to-text
openapi: 3.0.1
info:
  title: Completion App API
  description: >-
    The text generation application offers non-session support and is ideal for
    translation, article writing, summarization AI, and more.
  version: 1.0.0
servers:
  - url: '{api_base_url}'
    description: >-
      The base URL for the Completion App API. Replace {api_base_url} with the
      actual API base URL provided for your application.
    variables:
      api_base_url:
        default: https://api.dify.ai/v1
        description: Actual base URL of the API
security:
  - ApiKeyAuth: []
tags:
  - name: Completions
    description: Operations related to text generation and completion.
  - name: Files
    description: Operations related to file management.
  - name: End Users
    description: Operations related to end user information.
  - name: Feedback
    description: Operations related to user feedback.
  - name: TTS
    description: Operations related to Text-to-Speech and Speech-to-Text.
  - name: Applications
    description: Operations to retrieve application settings and information.
paths:
  /audio-to-text:
    post:
      tags:
        - TTS
      summary: Convert Audio to Text
      description: >-
        Convert an audio file to text. Supported formats: `mp3`, `mp4`, `mpeg`,
        `mpga`, `m4a`, `wav`, `webm`. The file size limit is `30 MB`.
      operationId: completionAudioToText
      requestBody:
        required: true
        content:
          multipart/form-data:
            schema:
              $ref: '#/components/schemas/AudioToTextRequest'
      responses:
        '200':
          description: Successfully converted audio to text.
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/AudioToTextResponse'
              examples:
                audioToTextSuccess:
                  summary: Response Example
                  value:
                    text: >-
                      Hello, I would like to know more about the iPhone 13 Pro
                      Max.
        '400':
          description: >-
            - `app_unavailable` : App unavailable or misconfigured.

            - `no_audio_uploaded` : No audio file was uploaded.

            - `provider_not_support_speech_to_text` : Model provider does not
            support speech-to-text.

            - `provider_not_initialize` : No valid model provider credentials
            found.

            - `provider_quota_exceeded` : Model provider quota exhausted.

            - `model_currently_not_support` : Current model does not support
            this operation.

            - `completion_request_error` : Speech recognition request failed.
          content:
            application/json:
              examples:
                app_unavailable:
                  summary: app_unavailable
                  value:
                    status: 400
                    code: app_unavailable
                    message: App unavailable, please check your app configurations.
                no_audio_uploaded:
                  summary: no_audio_uploaded
                  value:
                    status: 400
                    code: no_audio_uploaded
                    message: Please upload your audio.
                provider_not_support_speech_to_text:
                  summary: provider_not_support_speech_to_text
                  value:
                    status: 400
                    code: provider_not_support_speech_to_text
                    message: Provider not support speech to text.
                provider_not_initialize:
                  summary: provider_not_initialize
                  value:
                    status: 400
                    code: provider_not_initialize
                    message: >-
                      No valid model provider credentials found. Please go to
                      Settings -> Model Provider to complete your provider
                      credentials.
                provider_quota_exceeded:
                  summary: provider_quota_exceeded
                  value:
                    status: 400
                    code: provider_quota_exceeded
                    message: >-
                      Your quota for Dify Hosted OpenAI has been exhausted.
                      Please go to Settings -> Model Provider to complete your
                      own provider credentials.
                model_currently_not_support:
                  summary: model_currently_not_support
                  value:
                    status: 400
                    code: model_currently_not_support
                    message: >-
                      Dify Hosted OpenAI trial currently not support the GPT-4
                      model.
                completion_request_error:
                  summary: completion_request_error
                  value:
                    status: 400
                    code: completion_request_error
                    message: Completion request failed.
        '413':
          description: '`audio_too_large` : Audio file size exceeded the limit.'
          content:
            application/json:
              examples:
                audio_too_large:
                  summary: audio_too_large
                  value:
                    status: 413
                    code: audio_too_large
                    message: Audio size exceeded.
        '415':
          description: '`unsupported_audio_type` : Audio type is not allowed.'
          content:
            application/json:
              examples:
                unsupported_audio_type:
                  summary: unsupported_audio_type
                  value:
                    status: 415
                    code: unsupported_audio_type
                    message: Audio type not allowed.
        '500':
          description: '`internal_server_error` : Internal server error.'
          content:
            application/json:
              examples:
                internal_server_error:
                  summary: internal_server_error
                  value:
                    status: 500
                    code: internal_server_error
                    message: Internal server error.
components:
  schemas:
    AudioToTextRequest:
      type: object
      description: Request body for audio-to-text conversion.
      required:
        - file
      properties:
        file:
          type: string
          format: binary
          description: >-
            Audio file to convert. Supported formats: `mp3`, `mp4`, `mpeg`,
            `mpga`, `m4a`, `wav`, `webm`. Size limit: `30 MB`.
        user:
          type: string
          description: User identifier.
    AudioToTextResponse:
      type: object
      properties:
        text:
          type: string
          description: Output text from speech recognition.
  securitySchemes:
    ApiKeyAuth:
      type: http
      scheme: bearer
      bearerFormat: API_KEY
      description: >-
        API Key authentication. For all API requests, include your API Key in
        the `Authorization` HTTP Header, prefixed with `Bearer `. Example:
        `Authorization: Bearer {API_KEY}`. **Store your API Key on the server
        side. Do not share it or store it on the client side, because a leaked
        API Key can have serious consequences.**

````
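
As an illustration of the spec above, the request can be sketched with the Python standard library alone: a `multipart/form-data` body carrying the required `file` field and the optional `user` field, sent with a `Bearer` token. The helper names (`build_multipart`, `format_error`, `audio_to_text`) and the default `audio/mpeg` content type are assumptions made for this example. The base URL is the spec's default server and may differ for your deployment.

```python
import json
import urllib.error
import urllib.request
import uuid

API_BASE_URL = "https://api.dify.ai/v1"  # default server URL from the spec


def build_multipart(filename: str, data: bytes, content_type: str, user: str = ""):
    """Encode `file` (and optionally `user`) as a multipart/form-data body."""
    boundary = uuid.uuid4().hex
    parts = []
    if user:
        parts.append(
            f'--{boundary}\r\nContent-Disposition: form-data; name="user"'
            f"\r\n\r\n{user}\r\n".encode()
        )
    header = (
        f'--{boundary}\r\nContent-Disposition: form-data; name="file"; '
        f'filename="{filename}"\r\nContent-Type: {content_type}\r\n\r\n'
    )
    parts.append(header.encode() + data + b"\r\n")
    parts.append(f"--{boundary}--\r\n".encode())
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def format_error(status: int, raw: bytes) -> str:
    """Summarize an error body shaped like the examples (`status`, `code`, `message`)."""
    try:
        detail = json.loads(raw.decode() or "{}")
    except ValueError:
        detail = {}
    return f"{status} {detail.get('code')}: {detail.get('message')}"


def audio_to_text(api_key, filename, data, content_type="audio/mpeg", user=""):
    """POST the audio to /audio-to-text and return the `text` field of the response."""
    body, ctype = build_multipart(filename, data, content_type, user)
    req = urllib.request.Request(
        f"{API_BASE_URL}/audio-to-text",
        data=body,
        method="POST",
        headers={"Authorization": f"Bearer {api_key}", "Content-Type": ctype},
    )
    try:
        with urllib.request.urlopen(req) as resp:
            return json.load(resp)["text"]
    except urllib.error.HTTPError as err:
        raise RuntimeError(format_error(err.code, err.read())) from err
```

A call might look like `audio_to_text(os.environ["DIFY_API_KEY"], "question.mp3", audio_bytes, user="user-123")`, where the environment variable name and the `user` value are placeholders. Reading the key from the server environment follows the spec's advice to keep it off the client.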