Changelog

Stay up to date with the latest changes in our product

Voice Isolator — Remove Background Noise

2.5.0

Extract clean vocals from any audio file. Remove music, background noise, and unwanted sounds with one click.

2026/03/20

What's New

Meet Voice Isolator — a tool that separates human voice from everything else in an audio file.

Upload a recording with background music, street noise, or echo, and Voice Isolator extracts a clean vocal track. One click, no manual editing required.

How It Works

  1. Go to Voice Isolator in the sidebar
  2. Upload your audio file (MP3, WAV, OGG, AAC, or FLAC; up to 500 MB and 1 hour)
  3. The AI processes the audio and separates voice from background
  4. Preview the isolated voice
  5. Download the clean audio file

Why You'll Use This

  • Clean up podcast recordings — remove air conditioning hum, keyboard clicks, or street noise that slipped into the take
  • Extract vocals from music — isolate the singing voice from an instrumental track
  • Improve video voiceovers — strip noise from phone recordings and use the clean audio in your video
  • Prepare audio for voice cloning — get a clean sample from a noisy recording before creating a voice clone
  • Remixes and mashups: extract a cappella vocals for creative projects

Before & After Player

The results page shows a side-by-side comparison. Toggle between the original and isolated audio to hear exactly what changed. Download either version.

Supported Formats

Format                      Max File Size    Max Duration
MP3, WAV, OGG, AAC, FLAC    500 MB           1 hour
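
If you prepare files in a script before uploading, the limits above can be checked locally first. This is a minimal sketch, not AnySpeech code: the function name check_upload is hypothetical, and treating 1 MB as 1,000,000 bytes is an assumption, since the changelog does not say how the size limit is measured.

```python
from pathlib import Path

# Limits copied from the Supported Formats table.
# Assumption: 1 MB is counted as 1,000,000 bytes.
ALLOWED_EXTENSIONS = {".mp3", ".wav", ".ogg", ".aac", ".flac"}
MAX_SIZE_BYTES = 500 * 1_000_000
MAX_DURATION_SECONDS = 60 * 60


def check_upload(filename: str, size_bytes: int, duration_seconds: float) -> list[str]:
    """Return a list of problems; an empty list means the file fits the limits."""
    problems = []
    extension = Path(filename).suffix.lower()
    if extension not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported format: {extension or 'none'}")
    if size_bytes > MAX_SIZE_BYTES:
        problems.append("file is larger than 500 MB")
    if duration_seconds > MAX_DURATION_SECONDS:
        problems.append("audio is longer than 1 hour")
    return problems
```

The check only looks at the file extension, so it will not catch a file whose contents do not match its name.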

Pricing

Voice Isolator uses credits based on audio duration. Available to all logged-in users.

Find it under Voice Isolator in the sidebar.

AI Podcast Generator

2.3.0

Turn any topic into a multi-speaker podcast episode with AI. Choose voices, customize the script, and download studio-quality audio.

2026/03/10

What's New

You can now generate full podcast episodes with AI — directly inside AnySpeech.

Give it a topic (or paste your own script), pick two voices, and the system produces a natural-sounding conversation with back-and-forth dialogue.

How It Works

  1. Enter a topic or paste a script
  2. Choose two AI voices for the hosts
  3. Review and edit the generated script
  4. Generate the full audio episode
  5. Download the finished podcast

The AI writes the dialogue, handles speaker transitions, and generates everything as a single downloadable audio file.

What Makes It Different

  • Real conversation flow — not just two voices reading alternating lines. The AI generates natural transitions, reactions, and follow-up questions.
  • Script editing — review and tweak the script before generating audio. Remove sections, add your own lines, or restructure the conversation.
  • Voice variety — choose from any of our 200+ voices. Mix and match languages for multilingual episodes.
  • Studio quality — the output uses our Advanced voice engine for broadcast-ready audio.

Use Cases

  • Content repurposing — turn a blog post or article into a listenable podcast
  • Educational content — create interview-style lessons on any subject
  • Marketing — produce thought-leadership podcasts without booking guests
  • Prototyping — draft podcast episodes before committing to a full production

Access

Available to all users with an AnySpeech account. Podcast generation uses credits from your plan (same rate as Advanced voices).

Find it under AI Podcast Generator in the sidebar.

Speech to Text is Live

2.4.0

Convert audio and video files to accurate text transcriptions. Upload a file and get a full transcript in minutes.

2026/03/01

What's New

AnySpeech now works in reverse. Upload any audio or video file and get an accurate text transcription.

We've been a text-to-speech platform since day one. Now we handle speech-to-text too — making AnySpeech a complete audio-text toolkit.

How It Works

  1. Go to Speech to Text in the sidebar
  2. Upload an audio file (MP3, WAV, M4A) or video file
  3. The AI transcribes the content
  4. Review, edit, and download your transcript

Key Features

  • High accuracy — powered by state-of-the-art speech recognition
  • Multiple languages — transcribe audio in dozens of languages
  • Speaker detection — identifies different speakers in the conversation
  • Timestamps — every segment includes timing information
  • Real-time progress: watch the transcription happen live, streamed via Server-Sent Events (SSE)
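
For readers new to SSE (Server-Sent Events): the server sends plain-text lines, and a blank line marks the end of each event. The sketch below parses that wire format in general. It is not the AnySpeech client, the changelog does not document the event names or payloads, and the `id` and `retry` fields are ignored for brevity.

```python
from typing import Iterable, Iterator


def parse_sse(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one dict per event from the lines of a text/event-stream body."""
    event_name, data_lines = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line ends the current event.
            if data_lines:
                yield {"event": event_name, "data": "\n".join(data_lines)}
            event_name, data_lines = "message", []
        elif line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        else:
            field, _, value = line.partition(":")
            # One leading space after the colon is not part of the value.
            value = value[1:] if value.startswith(" ") else value
            if field == "event":
                event_name = value
            elif field == "data":
                data_lines.append(value)
```

A progress UI would typically feed each yielded event into a progress bar and stop when the stream closes.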

What You Can Do With It

  • Transcribe meetings and interviews — turn recordings into searchable text
  • Create subtitles — use transcripts as a starting point for video captions
  • Repurpose content — convert podcast episodes into blog posts
  • Accessibility — make audio content accessible to deaf and hard-of-hearing users
  • Round-trip workflow — transcribe audio → edit the text → regenerate with a better AI voice
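
As an illustration of the subtitles use case, timestamped segments can be turned into an SRT caption file with a few lines of code. This is a sketch under assumptions: the segment shape (a dict with start and end in seconds plus text) is hypothetical, since the changelog does not describe the transcript export format.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as HH:MM:SS,mmm, the form SRT files use."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def to_srt(segments: list[dict]) -> str:
    """Build SRT text from segments shaped like {'start', 'end', 'text'}."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n"
            f"{srt_timestamp(seg['start'])} --> {srt_timestamp(seg['end'])}\n"
            f"{seg['text'].strip()}\n"
        )
    # Joining with a newline leaves one blank line between caption blocks.
    return "\n".join(blocks)
```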

Pricing

Speech to Text uses credits from your plan. The cost is based on audio duration. Check your dashboard for real-time credit usage.

Available to all logged-in users.

Now Available in 10 Languages

2.2.0

AnySpeech is now fully translated into 10 languages — English, Chinese, Spanish, Portuguese, French, German, Turkish, Japanese, Korean, and Italian.

2026/02/15

What's New

AnySpeech is now available in 10 languages. Every page, every button, every settings menu — fully translated.

Supported Languages

Language                  Code    Flag
English                   en      🇺🇸
中文 (Chinese)             zh      🇨🇳
Español (Spanish)         es      🇪🇸
Português (Portuguese)    pt      🇧🇷
Français (French)         fr      🇫🇷
Deutsch (German)          de      🇩🇪
Türkçe (Turkish)          tr      🇹🇷
日本語 (Japanese)          ja      🇯🇵
한국어 (Korean)            ko      🇰🇷
Italiano (Italian)        it      🇮🇹
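
If you link to AnySpeech from your own site, you may want to map a visitor's browser locale to one of the codes above. The sketch below is one way to do that. The fallback to English and the function name resolve_language are assumptions, as the changelog does not say how unsupported locales are handled.

```python
# Codes copied from the Supported Languages table.
SUPPORTED = ("en", "zh", "es", "pt", "fr", "de", "tr", "ja", "ko", "it")


def resolve_language(requested: str, default: str = "en") -> str:
    """Map a locale such as 'pt-BR' or 'zh_CN' to a supported two-letter code."""
    primary = requested.replace("_", "-").split("-")[0].lower()
    # Assumption: anything unsupported falls back to the default language.
    return primary if primary in SUPPORTED else default
```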

What's Translated

Everything:

  • Full website interface (workbench, dashboard, settings)
  • All SEO content and landing pages
  • Blog posts
  • Legal pages (privacy policy, terms, cookie policy)
  • Email templates
  • Pricing and plan descriptions

How to Switch

Click the language selector in the navigation bar. Your preference is saved automatically and persists across sessions.

Why This Matters

Over 60% of internet users prefer browsing in their native language. By supporting 10 languages, we're making AI text to speech accessible to billions more people worldwide.

More languages are coming. Let us know which one you'd like to see next.

Free Text to Speech Page

2.1.0

A dedicated free TTS tool with no signup and no credit card. Just paste text and generate speech in 100+ languages.

2026/02/01

What's New

We launched a dedicated free text to speech page at /free-text-to-speech.

No account needed. No credit card. No daily limit tricks. Just open the page, type your text, and generate speech.

Why a Separate Free Page?

The main Text to Speech workbench uses our Advanced and Pro voices — which require an account and credits. That's great for professional work, but not everyone needs premium quality.

The free page uses our Basic voice engine and is designed for:

  • Quick one-off conversions
  • Students and researchers
  • Testing scripts before investing in premium audio
  • Anyone who just wants free TTS without friction

What You Get

  • 100+ languages supported
  • No signup required — works instantly
  • MP3 download for every generation
  • No watermarks on the audio
  • Commercial use allowed

Character Limits

User Type           Characters per Request
Not logged in       1,000
Logged in (free)    5,000
Paid plan           Plan default

Upgrade Path

Want better voice quality? The free page includes a comparison between Basic and Advanced voices, so you can hear the difference and upgrade when you're ready.

AI Voice Cloning is Here

2.0.0

Clone any voice from a 10-second audio clip — with emotion control. Create a consistent brand voice across all your content.

2026/01/15

What's New

This is a big one. You can now clone any voice using just a 10-second audio sample.

Upload a short clip of yourself (or any voice you have permission to use), and AnySpeech creates a digital replica that you can use to generate unlimited speech.

How It Works

  1. Go to Voice Cloning in the sidebar
  2. Upload an audio file (MP3, WAV, or M4A) — 10 to 30 seconds is ideal
  3. Name your voice and confirm consent
  4. Start generating speech with your cloned voice

That's it. No training time. No waiting hours. The voice is ready to use immediately.

Emotion Control

This is what sets our cloning apart. Every cloned voice supports adjustable emotions:

  • Happy
  • Calm
  • Excited
  • Sad
  • Angry
  • Neutral

You pick the emotion per generation. Same voice, different delivery — matching the mood of each piece of content.

Exaggeration Slider

Control how dramatic the emotion sounds. Dial it up for comedy sketches, keep it subtle for professional narration.

Who Gets It

Plan            Voice Clones Allowed
Free            1
Basic           3
Standard        5
Professional    10
Premium         20
Max             50
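
The plan limits above amount to a simple lookup. The sketch below shows the arithmetic only; the function names are hypothetical and this is not how AnySpeech enforces the quota internally.

```python
# Limits copied from the Who Gets It table.
CLONE_LIMITS = {
    "Free": 1,
    "Basic": 3,
    "Standard": 5,
    "Professional": 10,
    "Premium": 20,
    "Max": 50,
}


def remaining_clones(plan: str, existing_clones: int) -> int:
    """How many more voice clones the plan allows; never negative."""
    return max(CLONE_LIMITS[plan] - existing_clones, 0)


def can_create_clone(plan: str, existing_clones: int) -> bool:
    """True if the account is still under its plan's clone limit."""
    return remaining_clones(plan, existing_clones) > 0
```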

Pricing

Voice cloning uses the same credit rate as Advanced voices (1x). No extra charge for the cloning feature itself.

Free users get one voice clone to try. Upgrade to create more.

New Voice Selection Experience

1.3.0

We've completely redesigned the voice selection interface with preview playback and recent voices.

2025/12/15

What's New

We've completely redesigned the voice selection experience to make it easier and faster to find the perfect voice for your content.

Voice Preview

You can now preview any voice directly from the voice list. Simply click the play button next to any voice to hear a sample before selecting it.

Recently Used Voices

A new "Recently Used" section appears at the top of the voice selector, giving you quick access to voices you've used before. No more scrolling through the entire list!

Search & Filter

  • Search voices by name
  • Filter by language
  • Voices are now grouped by language for easier browsing

Improved Voice Cards

Each voice card now shows:

  • Voice name and language
  • Preview play button
  • Clear selection indicator

This update makes it much faster to find and switch between voices, especially if you work with multiple languages.

Basic Voice Now Supports 60+ Languages

1.2.0

The free Basic voice model now supports over 60 languages, including Cantonese, Japanese, and Korean.

2025/12/10

What's New

The Basic voice model now supports over 60 languages, making professional text-to-speech accessible to users worldwide.

New Languages Added

Here are some of the new languages now available in Basic voice:

Asian Languages:

  • Cantonese (粤语)
  • Japanese (日本語)
  • Korean (한국어)
  • Vietnamese (Tiếng Việt)
  • Thai (ภาษาไทย)
  • Indonesian (Bahasa Indonesia)

European Languages:

  • French (Français)
  • German (Deutsch)
  • Spanish (Español)
  • Italian (Italiano)
  • Portuguese (Português)
  • Dutch (Nederlands)
  • Polish (Polski)
  • Russian (Русский)

And Many More:

  • Arabic (العربية)
  • Hindi (हिन्दी)
  • Turkish (Türkçe)
  • Greek (Ελληνικά)
  • Hebrew (עברית)
  • And 40+ additional languages

How to Use

  1. Select "Basic" from the model selector
  2. Choose a voice from the language you need
  3. Enter your text and generate

All Basic voice languages are free to use and don't consume any credits!

Basic Voice is Now Free

1.1.0

The Basic voice model is now completely free to use, with no credits required.

2025/12/05

What's New

We're making AI text-to-speech more accessible! Basic voice is now completely free for all users.

Free Usage Limits

User Type        Daily Requests    Characters per Request
Not logged in    10                1,000
Free account     20                5,000
Paid account     Unlimited         Unlimited
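
To see how the two limits combine, here is a minimal sketch that checks a single Basic voice request against the table above. The user-type keys and the function name are hypothetical, and the real service may count requests or characters differently.

```python
from typing import Optional

# (daily requests, characters per request) from the table; None means unlimited.
BASIC_LIMITS = {
    "anonymous": (10, 1_000),
    "free": (20, 5_000),
    "paid": (None, None),
}


def check_basic_request(user_type: str, requests_today: int, text: str) -> Optional[str]:
    """Return the reason a request would be refused, or None if it is allowed."""
    max_requests, max_chars = BASIC_LIMITS[user_type]
    if max_requests is not None and requests_today >= max_requests:
        return "daily request limit reached"
    if max_chars is not None and len(text) > max_chars:
        return "text exceeds the per-request character limit"
    return None
```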

Why We Made This Change

We believe everyone should be able to try AI text-to-speech without barriers. Basic voice provides great quality for:

  • Quick demos and tests
  • Short content pieces
  • Learning and experimentation
  • Multi-language content

No Credits Required

Unlike Advanced and Pro voices, Basic voice does not consume any credits. Use it as much as you want within the daily limits!

Want More?

If you need higher quality voices or unlimited Basic usage:

  • Advanced voice: Premium quality, 1x credit per character
  • Pro voice: Studio-grade quality, 2x credits per character
  • Paid plans: Unlimited Basic voice with no daily limits
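
To estimate what a script will cost before generating it, multiply its length by the rate for the voice model. This sketch reads "1x" and "2x" as 1 and 2 credits per character, which is an assumption. How the service counts whitespace and punctuation is not stated here, so treat the result as a rough estimate.

```python
# Assumed rates: Basic is free, Advanced is 1 credit and Pro is 2 credits per character.
CREDITS_PER_CHARACTER = {"Basic": 0, "Advanced": 1, "Pro": 2}


def estimate_credits(text: str, model: str) -> int:
    """Estimate the credits needed to generate speech for the given text."""
    return len(text) * CREDITS_PER_CHARACTER[model]
```

For example, a 1,200-character script would come to about 1,200 credits with Advanced and 2,400 with Pro under these assumptions.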

Try Basic voice today and see what AI text-to-speech can do for your content!

Faster Audio Generation

1.0.0

Audio generation is now up to 50% faster with real-time progress tracking.

2025/12/01

What's New

We've significantly improved our audio generation infrastructure, resulting in faster generation times and a better user experience.

Speed Improvements

  • 50% faster average generation time
  • More stable processing for long texts
  • Better handling of multiple concurrent requests

Real-Time Progress

You can now see exactly how your audio generation is progressing:

  • Live progress bar showing completion percentage
  • Status updates as your audio is being created
  • Clear indication when generation is complete

Long Text Support

For longer content, we now handle text more intelligently:

  • Automatic chunking for optimal processing
  • Seamless audio merging
  • Consistent quality across the entire output
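
The changelog does not describe how the chunking works, so the following is only an illustration of the general idea: split long text at sentence boundaries so that each piece stays under a size limit. The 1,000-character default and the function name chunk_text are assumptions, and merging the resulting audio is not covered.

```python
import re


def chunk_text(text: str, max_chars: int = 1000) -> list[str]:
    """Split text into chunks of at most max_chars, breaking at sentence ends."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        if not sentence:
            continue
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= max_chars:
            current = candidate
            continue
        if current:
            chunks.append(current)
        # A single sentence longer than max_chars is split at fixed offsets.
        while len(sentence) > max_chars:
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Breaking at sentence ends rather than at fixed offsets keeps each chunk a natural unit of speech, which helps the joins between generated clips sound less abrupt.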

What This Means for You

  • Less waiting time for your audio
  • Better visibility into the generation process
  • More reliable results, especially for longer content

These improvements work across all voice models (Basic, Advanced, and Pro).