

Newapi exposes three audio endpoints, all compatible with the OpenAI audio API: text-to-speech synthesis, speech-to-text transcription, and spoken audio translation into English.

Text to speech

POST https://YOUR_NEWAPI_BASE_URL/v1/audio/speech
Generate spoken audio from text. The response is a binary audio file in the format you specify.

Request parameters

model
string
required
The TTS model to use. Common values: tts-1 (faster, lower latency) and tts-1-hd (higher quality).
input
string
required
The text to convert to speech. Maximum 4,096 characters.
voice
string
required
The voice to use. Supported values: alloy, echo, fable, onyx, nova, shimmer.
response_format
string
default:"mp3"
Audio format of the response. Accepted values: mp3, opus, aac, flac, wav, pcm.
speed
number
default:"1"
Playback speed multiplier. Valid range: 0.25 to 4.0.

Response

The response body is the raw audio file as binary data in the requested format (default: audio/mpeg). Stream or save it directly to a file.

Examples

curl -X POST "https://YOUR_NEWAPI_BASE_URL/v1/audio/speech" \
  -H "Authorization: Bearer sk-your-token" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "tts-1",
    "input": "Hello! Welcome to Newapi.",
    "voice": "nova",
    "response_format": "mp3"
  }' \
  --output speech.mp3
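
The same request can be sketched in Python using only the standard library. The base URL and token below are placeholders, and the client-side checks simply mirror the parameter constraints documented above (4,096-character limit, the six supported voices, and the 0.25–4.0 speed range); the server enforces its own validation regardless.

```python
import json
import urllib.request

VOICES = {"alloy", "echo", "fable", "onyx", "nova", "shimmer"}
FORMATS = {"mp3", "opus", "aac", "flac", "wav", "pcm"}

def build_speech_payload(text, voice="nova", model="tts-1",
                         response_format="mp3", speed=1.0):
    """Validate parameters against the documented limits and return
    the JSON body for POST /v1/audio/speech."""
    if len(text) > 4096:
        raise ValueError("input must be at most 4,096 characters")
    if voice not in VOICES:
        raise ValueError(f"unsupported voice: {voice}")
    if response_format not in FORMATS:
        raise ValueError(f"unsupported format: {response_format}")
    if not 0.25 <= speed <= 4.0:
        raise ValueError("speed must be between 0.25 and 4.0")
    return {"model": model, "input": text, "voice": voice,
            "response_format": response_format, "speed": speed}

def synthesize(base_url, token, out_path, **kwargs):
    """POST the payload and write the binary audio response to a file."""
    body = json.dumps(build_speech_payload(**kwargs)).encode()
    req = urllib.request.Request(
        f"{base_url}/v1/audio/speech",
        data=body,
        headers={"Authorization": f"Bearer {token}",
                 "Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp, open(out_path, "wb") as f:
        f.write(resp.read())
```

Because the response is raw binary audio, write it to disk directly; there is no JSON envelope to parse.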

Audio transcription

POST https://YOUR_NEWAPI_BASE_URL/v1/audio/transcriptions
Transcribe spoken audio to text. The request uses multipart/form-data.

Request parameters

file
file
required
The audio file to transcribe. Supported formats: flac, mp3, mp4, mpeg, mpga, m4a, ogg, wav, webm.
model
string
required
The transcription model to use. whisper-1 is the standard model.
language
string
The language of the audio, as an ISO-639-1 code (for example en, fr, zh). Providing this improves accuracy and speed.
prompt
string
Optional text to guide the model’s style or supply context (for example, a list of proper nouns or a transcript excerpt). Must be in the same language as the audio.
response_format
string
default:"json"
Format of the transcription output. Options: json (default), text, srt, verbose_json, vtt.
temperature
number
default:"0"
Sampling temperature between 0 and 1. Lower values produce more consistent transcriptions.
timestamp_granularities
string[]
Granularity of word or segment timestamps. Pass ["word"] or ["segment"]. Requires response_format to be verbose_json.

Response fields

text
string
The transcribed text. When response_format is verbose_json, additional fields include language, duration, words, and segments.

Examples

curl -X POST "https://YOUR_NEWAPI_BASE_URL/v1/audio/transcriptions" \
  -H "Authorization: Bearer sk-your-token" \
  -F file="@recording.mp3" \
  -F model="whisper-1" \
  -F language="en"
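
When response_format is verbose_json and timestamp_granularities includes "word", the response carries structured timing data alongside the text. A minimal parsing sketch follows; the sample payload is illustrative (made-up values), with field names taken from the response description above.

```python
import json

# Illustrative verbose_json payload; the field names (language, duration,
# text, words) follow the docs above, but the values are made up.
sample = json.loads("""
{
  "language": "en",
  "duration": 1.2,
  "text": "Hello world",
  "words": [
    {"word": "Hello", "start": 0.0, "end": 0.5},
    {"word": "world", "start": 0.6, "end": 1.1}
  ]
}
""")

def word_timeline(payload):
    """Return (word, start_seconds, end_seconds) tuples from a
    verbose_json transcription response."""
    return [(w["word"], w["start"], w["end"])
            for w in payload.get("words", [])]

timeline = word_timeline(sample)
```

Remember that word-level timestamps are only present when the request asked for them via timestamp_granularities=["word"] with response_format=verbose_json.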

Audio translation

POST https://YOUR_NEWAPI_BASE_URL/v1/audio/translations
Transcribe audio in any supported language and translate the result into English. The request uses multipart/form-data.

Request parameters

file
file
required
The audio file to translate. Supported formats are the same as for transcriptions.
model
string
required
The model to use. whisper-1 is the standard model.
prompt
string
Optional English-language text to guide the model’s output style.
response_format
string
default:"json"
Output format: json, text, srt, verbose_json, or vtt.
temperature
number
default:"0"
Sampling temperature between 0 and 1.

Response fields

text
string
The translated English transcription.

Example

curl -X POST "https://YOUR_NEWAPI_BASE_URL/v1/audio/translations" \
  -H "Authorization: Bearer sk-your-token" \
  -F file="@french_audio.mp3" \
  -F model="whisper-1"

Example response

{
  "text": "Hello, how are you doing today?"
}
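
Because the transcription and translation endpoints return either a JSON object or plain text depending on response_format, client code needs to branch when extracting the transcript. A small helper sketch (the raw bodies here are illustrative):

```python
import json

def extract_text(raw_body: str, response_format: str = "json") -> str:
    """Pull the transcript text out of a response body.

    json / verbose_json bodies are JSON objects with a "text" field;
    text, srt, and vtt bodies are already plain text, returned as-is.
    """
    if response_format in ("json", "verbose_json"):
        return json.loads(raw_body)["text"]
    return raw_body

# The JSON case matches the example response shown above.
english = extract_text('{"text": "Hello, how are you doing today?"}')
```

This keeps one code path for all five output formats instead of special-casing each caller.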