Text to Speech Multilingual v2

Text to Speech Multilingual v2 helps you create usable drafts quickly, then refine quality, timing, and style with small, predictable edits. Use it for creator content, localization, podcasts, games, and production workflows where speed and consistency matter.

Get Started

Create natural speech across languages with multilingual voice support.

Text (required)

The text to convert to speech. Max length: 5000

Voice

The voice to use for speech generation

Stability (default: 0.5)

Voice stability (0-1)

Similarity Boost (default: 0.75)

Similarity boost (0-1)

Style (default: 0)

Style exaggeration (0-1)

Speed (default: 1)

Speech speed (0.7-1.2). Values below 1.0 slow down the speech, above 1.0 speed it up. Extreme values may affect quality.

Timestamps

Whether to return timestamps for each word in the generated speech

Previous Text

The text that precedes the text of the current request. Use it to improve continuity when concatenating multiple generations, or to influence continuity within the current generation. Max length: 5000

Next Text

The text that follows the text of the current request. Use it to improve continuity when concatenating multiple generations, or to influence continuity within the current generation. Max length: 5000
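As a rough illustration of how Previous Text and Next Text could be used when a script exceeds the 5000 limit on Text, the sketch below splits a script at sentence boundaries and attaches each chunk's neighbors as context. The field names (text, previous_text, next_text) and both helper functions are hypothetical and chosen for this example only; they are not confirmed names from the tool.

```python
import re
from typing import Dict, List, Optional

MAX_LEN = 5000  # "Max length: 5000" from the Text field


def split_script(script: str, max_len: int = MAX_LEN) -> List[str]:
    """Greedily pack whole sentences into chunks of at most max_len characters."""
    sentences = re.split(r"(?<=[.!?])\s+", script.strip())
    chunks: List[str] = []
    current = ""
    for sentence in sentences:
        if not sentence:
            continue
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_len:
            current = candidate
            continue
        if current:
            chunks.append(current)
        # A single sentence longer than max_len is cut at the limit.
        while len(sentence) > max_len:
            chunks.append(sentence[:max_len])
            sentence = sentence[max_len:]
        current = sentence
    if current:
        chunks.append(current)
    return chunks


def build_requests(script: str, max_len: int = MAX_LEN) -> List[Dict[str, Optional[str]]]:
    """Pair every chunk with its neighbors (hypothetical field names)."""
    chunks = split_script(script, max_len)
    requests = []
    for i, chunk in enumerate(chunks):
        requests.append({
            "text": chunk,
            "previous_text": chunks[i - 1] if i > 0 else None,
            "next_text": chunks[i + 1] if i + 1 < len(chunks) else None,
        })
    return requests
```

Because every chunk is at most max_len long, the neighbor passed as context also stays within the 5000 limit stated for Previous Text and Next Text.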

Language Code

Language code (ISO 639-1) used to enforce a language for the model. Currently only Turbo v2.5 and Flash v2.5 support language enforcement; for other models, providing a language code returns an error. Max length: 500
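The ranges and limits listed for the fields above can be checked before a request is sent. The following is a minimal sketch, assuming snake_case field names derived from the form labels and the values shown next to each label as defaults; the SpeechRequest class and validate function are hypothetical and not part of any published API.

```python
from dataclasses import dataclass
from typing import List, Optional

MAX_TEXT_LEN = 5000  # limit stated for Text, Previous Text, and Next Text


@dataclass
class SpeechRequest:
    # Field names are assumptions based on the form labels.
    text: str
    voice: str
    stability: float = 0.5
    similarity_boost: float = 0.75
    style: float = 0.0
    speed: float = 1.0
    timestamps: bool = False
    previous_text: Optional[str] = None
    next_text: Optional[str] = None
    language_code: Optional[str] = None


def validate(req: SpeechRequest) -> List[str]:
    """Return a list of problems; an empty list means the request looks valid."""
    errors: List[str] = []
    if not req.text:
        errors.append("text is required")
    elif len(req.text) > MAX_TEXT_LEN:
        errors.append(f"text exceeds {MAX_TEXT_LEN}")
    for name in ("stability", "similarity_boost", "style"):
        if not 0.0 <= getattr(req, name) <= 1.0:
            errors.append(f"{name} must be between 0 and 1")
    if not 0.7 <= req.speed <= 1.2:
        errors.append("speed must be between 0.7 and 1.2")
    for name in ("previous_text", "next_text"):
        value = getattr(req, name)
        if value is not None and len(value) > MAX_TEXT_LEN:
            errors.append(f"{name} exceeds {MAX_TEXT_LEN}")
    # Language enforcement is listed as Turbo v2.5 / Flash v2.5 only.
    if req.language_code is not None:
        errors.append("language_code is not supported by this model")
    return errors
```

Collecting every problem in a list, rather than raising on the first one, makes it easier to show all invalid fields to the user at once.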


Why Text to Speech Multilingual v2

Built for real creator workflows where speed and control matter. Start with a clear goal, then iterate with small edits so improvements are easy to compare and keep.

Natural voice

Turn scripts into speech that sounds human, with controllable pacing and tone for narration, ads, support flows, and product demos.

Multilingual workflow

Localize content across languages while keeping a consistent voice direction, then fine-tune speed and emphasis for each locale.

Voice steering

Adjust stability, similarity, and style so revisions stay predictable and easy to compare across iterations.

Fast iteration

Draft quickly, review pronunciation and pacing, then rerender final takes once the direction is approved.

Key advantages

Designed to reduce rework by making outputs easier to predict, compare, and refine across iterations. Move from concept to usable drafts faster with practical control over quality and consistency.

Keep narration consistent across revisions and languages by reusing the same voice and adjusting settings in small steps.

Features

Everyday capabilities for ideation and production: guided generation, controllable settings, and practical outputs that help you iterate quickly and ship usable results.

Voice selection

Choose a voice that matches your brand or character, then reuse it for consistent narration.

Stability and style

Control delivery by adjusting stability, similarity, and style exaggeration.

Speed control

Tune speaking speed for accessibility, pacing, and platform requirements.

Timestamps

Enable timing output when available to support subtitles and editing workflows.
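To show how word-level timing could feed a subtitle workflow, here is a small sketch that groups words into SubRip (SRT) cues. The input shape, a list of dicts with "word", "start", and "end" in seconds, is an assumption made for this example; the actual timestamp format returned by the tool may differ.

```python
from typing import Dict, List, Union

Word = Dict[str, Union[str, float]]


def to_srt_time(seconds: float) -> str:
    """Format seconds as HH:MM:SS,mmm, the time format used by SRT."""
    ms = round(seconds * 1000)
    hours, rem = divmod(ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, millis = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def words_to_srt(words: List[Word], max_words: int = 7) -> str:
    """Group consecutive words into cues of at most max_words each."""
    cues = []
    for i in range(0, len(words), max_words):
        group = words[i:i + max_words]
        start = to_srt_time(float(group[0]["start"]))
        end = to_srt_time(float(group[-1]["end"]))
        text = " ".join(str(w["word"]) for w in group)
        cues.append(f"{len(cues) + 1}\n{start} --> {end}\n{text}\n")
    return "\n".join(cues)
```

Grouping by a fixed word count is the simplest choice; splitting on punctuation or on pauses between words would usually give more readable cues.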

Prompt templates

Use a repeatable structure so results are easier to compare across iterations.

Variation workflow

Generate 3–5 variants, pick the best, then refine with small edits.
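One way to produce a handful of comparable variants is to change a single setting in small steps around its current value. The sketch below does this under stated assumptions: the settings dict and its key names are hypothetical, and make_variants is an illustrative helper, not a feature of the tool.

```python
from typing import Dict, List


def make_variants(
    base: Dict[str, float],
    setting: str,
    step: float,
    count: int = 3,
    lo: float = 0.0,
    hi: float = 1.0,
) -> List[Dict[str, float]]:
    """Return `count` copies of `base` with one setting spread around its value.

    Values are clamped to [lo, hi], so variants near a limit may repeat.
    """
    first_offset = -(count // 2)
    variants = []
    for k in range(count):
        value = base[setting] + (first_offset + k) * step
        value = round(min(hi, max(lo, value)), 2)
        variants.append({**base, setting: value})
    return variants


# Example settings using the values shown on the form.
base_settings = {"stability": 0.5, "similarity_boost": 0.75, "style": 0.0, "speed": 1.0}
stability_variants = make_variants(base_settings, "stability", step=0.1, count=3)
speed_variants = make_variants(base_settings, "speed", step=0.1, count=5, lo=0.7, hi=1.2)
```

Changing one setting per batch keeps the comparison fair: any audible difference between variants can be attributed to that setting alone.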

Export-ready drafts

Create outputs you can share for review before finalizing.

Version tracking

Keep prompts and outputs together so you can reproduce your best results later.
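Keeping prompts and outputs together can be as simple as writing one record per generation to a log file. The following sketch assumes you have the audio bytes in hand after a generation; the record layout, the field names, and both helper functions are hypothetical and only illustrate the idea.

```python
import hashlib
import json
from typing import Any, Dict


def make_record(text: str, voice: str, settings: Dict[str, Any], audio: bytes) -> Dict[str, Any]:
    """Bundle the inputs with a fingerprint of the audio they produced."""
    payload = json.dumps(
        {"text": text, "voice": voice, "settings": settings}, sort_keys=True
    )
    return {
        # Same inputs always yield the same short identifier.
        "request_id": hashlib.sha256(payload.encode("utf-8")).hexdigest()[:12],
        "text": text,
        "voice": voice,
        "settings": settings,
        "audio_sha256": hashlib.sha256(audio).hexdigest(),
    }


def append_record(path: str, record: Dict[str, Any]) -> None:
    """Append the record as one JSON line so the log can be diffed and grepped."""
    with open(path, "a", encoding="utf-8") as handle:
        handle.write(json.dumps(record, sort_keys=True) + "\n")
```

Deriving request_id from the inputs means two takes made with identical text, voice, and settings share an identifier, while audio_sha256 tells them apart if the generated audio differs.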

FAQ

Common questions about quality, workflow, credits, privacy, and best practices.

Start with Text to Speech Multilingual v2

Create a first draft quickly, then refine quality and timing with fast iterations.


Text to Speech Multilingual v2

Audio Generator

Why Text to Speech Multilingual v2

Natural voice

Multilingual workflow

Voice steering

Fast iteration

Key advantages

Voice consistency

Production-friendly outputs

Team collaboration

Repeatable pipeline

Features

Voice selection

Stability and style

Speed control

Timestamps

Prompt templates

Variation workflow

Export-ready drafts

Version tracking

FAQ

What is Text to Speech Multilingual v2 best for?

How do I get better results consistently?

How do I write scripts for better speech?

How do I keep the same voice across languages?

How are credits calculated?

What about privacy for prompts and uploads?

How do I manage versions for team review?

How do I avoid generic or repetitive outputs?

Can I use outputs commercially?

What formats can I export?

How can I improve quality without increasing cost too much?

What is a good workflow from draft to final?

How do I troubleshoot artifacts or odd outputs?

How do I build a repeatable workflow for a series?

Start with Text to Speech Multilingual v2
