Changelog
Stay up to date with the latest changes in our product
Voice Isolator — Remove Background Noise
2.5.0
Extract clean vocals from any audio file. Remove music, background noise, and unwanted sounds with one click.
2026/03/20
What's New
Meet Voice Isolator — a tool that separates human voice from everything else in an audio file.
Upload a recording with background music, street noise, or echo, and Voice Isolator extracts a clean vocal track. One click, no manual editing required.
How It Works
- Go to Voice Isolator in the sidebar
- Upload your audio file (MP3, WAV, OGG, AAC, FLAC — up to 500 MB and 1 hour long)
- The AI processes the audio and separates voice from background
- Preview the isolated voice
- Download the clean audio file
Why You'll Use This
- Clean up podcast recordings — remove air conditioning hum, keyboard clicks, or street noise that slipped into the take
- Extract vocals from music — isolate the singing voice from an instrumental track
- Improve video voiceovers — strip noise from phone recordings and use the clean audio in your video
- Prepare audio for voice cloning — get a clean sample from a noisy recording before creating a voice clone
- Remixes and mashups — extract a cappella vocals for creative projects
Before & After Player
The results page shows a side-by-side comparison. Toggle between the original and isolated audio to hear exactly what changed. Download either version.
Supported Formats
| Format | Max File Size | Max Duration |
|---|---|---|
| MP3, WAV, OGG, AAC, FLAC | 500 MB | 1 hour |
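The limits in the table can be checked locally before uploading. The following is a minimal sketch, not the service's own validation: the function name `check_upload` is hypothetical, and it assumes "500 MB" means 500 × 1024 × 1024 bytes and that the format is judged by file extension.

```python
from pathlib import Path

# Formats and duration limit taken from the table above.
ALLOWED_EXTENSIONS = {".mp3", ".wav", ".ogg", ".aac", ".flac"}
MAX_DURATION_SECONDS = 60 * 60  # 1 hour
# Assumption: "500 MB" is read as 500 * 1024 * 1024 bytes.
MAX_SIZE_BYTES = 500 * 1024 * 1024


def check_upload(filename: str, size_bytes: int, duration_seconds: float) -> list[str]:
    """Return a list of problems; an empty list means the file fits the limits."""
    problems = []
    extension = Path(filename).suffix.lower()
    if extension not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported format: {extension or 'none'}")
    if size_bytes > MAX_SIZE_BYTES:
        problems.append("file is larger than 500 MB")
    if duration_seconds > MAX_DURATION_SECONDS:
        problems.append("audio is longer than 1 hour")
    return problems
```

Returning every problem at once, rather than stopping at the first, lets a caller show the user everything that needs fixing in one pass.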
Pricing
Voice Isolator uses credits based on audio duration. Available to all logged-in users.
Find it under Voice Isolator in the sidebar.
AI Podcast Generator
2.3.0
Turn any topic into a multi-speaker podcast episode with AI. Choose voices, customize the script, and download studio-quality audio.
2026/03/10
What's New
You can now generate full podcast episodes with AI — directly inside AnySpeech.
Give it a topic (or paste your own script), pick two voices, and the system produces a natural-sounding conversation with back-and-forth dialogue.
How It Works
- Enter a topic or paste a script
- Choose two AI voices for the hosts
- Review and edit the generated script
- Generate the full audio episode
- Download the finished podcast
The AI writes the dialogue, handles speaker transitions, and generates everything as a single downloadable audio file.
What Makes It Different
- Real conversation flow — not just two voices reading alternating lines. The AI generates natural transitions, reactions, and follow-up questions.
- Script editing — review and tweak the script before generating audio. Remove sections, add your own lines, or restructure the conversation.
- Voice variety — choose from any of our 200+ voices. Mix and match languages for multilingual episodes.
- Studio quality — the output uses our Advanced voice engine for broadcast-ready audio.
Use Cases
- Content repurposing — turn a blog post or article into a listenable podcast
- Educational content — create interview-style lessons on any subject
- Marketing — produce thought-leadership podcasts without booking guests
- Prototyping — draft podcast episodes before committing to a full production
Access
Available to all users with an AnySpeech account. Podcast generation uses credits from your plan (same rate as Advanced voices).
Find it under AI Podcast Generator in the sidebar.
Speech to Text is Live
2.4.0
Convert audio and video files to accurate text transcriptions. Upload a file and get a full transcript in minutes.
2026/03/01
What's New
AnySpeech now works in reverse. Upload any audio or video file and get an accurate text transcription.
We've been a text-to-speech platform since day one. Now we handle speech-to-text too — making AnySpeech a complete audio-text toolkit.
How It Works
- Go to Speech to Text in the sidebar
- Upload an audio file (MP3, WAV, M4A) or video file
- The AI transcribes the content
- Review, edit, and download your transcript
Key Features
- High accuracy — powered by state-of-the-art speech recognition
- Multiple languages — transcribe audio in dozens of languages
- Speaker detection — identifies different speakers in the conversation
- Timestamps — every segment includes timing information
- Real-time progress — watch the transcription happen live via SSE streaming
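The changelog says progress is delivered over SSE (server-sent events) but does not describe the events themselves. As a rough sketch of how a client might read such a stream, the parser below handles the standard `data:` lines of the `text/event-stream` format. It assumes each event carries a JSON payload. The `progress` and `text` keys in the usage example are hypothetical and are not taken from any AnySpeech documentation.

```python
import json


def parse_sse(stream_text: str) -> list[dict]:
    """Parse a text/event-stream body into a list of decoded JSON payloads."""
    events = []
    data_lines = []
    for line in stream_text.splitlines():
        if line.startswith("data:"):
            # The SSE format allows one optional space after the colon.
            value = line[len("data:"):]
            if value.startswith(" "):
                value = value[1:]
            data_lines.append(value)
        elif line == "" and data_lines:
            # A blank line ends the event; multiple data lines join with newlines.
            events.append(json.loads("\n".join(data_lines)))
            data_lines = []
        # Other fields (event:, id:, retry:) and comments are ignored in this sketch.
    return events


# Usage with a made-up stream; the payload keys are illustrative only.
sample = 'data: {"progress": 40}\n\ndata: {"progress": 100, "text": "hello"}\n\n'
for event in parse_sse(sample):
    print(event)
```

A real client would read the stream incrementally instead of parsing a complete string, but the line-level rules are the same.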
What You Can Do With It
- Transcribe meetings and interviews — turn recordings into searchable text
- Create subtitles — use transcripts as a starting point for video captions
- Repurpose content — convert podcast episodes into blog posts
- Accessibility — make audio content accessible to deaf and hard-of-hearing users
- Round-trip workflow — transcribe audio → edit the text → regenerate with a better AI voice
Pricing
Speech to Text uses credits from your plan. The cost is based on audio duration. Check your dashboard for real-time credit usage.
Available to all logged-in users.
Now Available in 10 Languages
2.2.0
AnySpeech is now fully translated into 10 languages — English, Chinese, Spanish, Portuguese, French, German, Turkish, Japanese, Korean, and Italian.
2026/02/15
What's New
AnySpeech is now available in 10 languages. Every page, every button, every settings menu — fully translated.
Supported Languages
| Language | Code | Flag |
|---|---|---|
| English | en | 🇺🇸 |
| 中文 (Chinese) | zh | 🇨🇳 |
| Español (Spanish) | es | 🇪🇸 |
| Português (Portuguese) | pt | 🇧🇷 |
| Français (French) | fr | 🇫🇷 |
| Deutsch (German) | de | 🇩🇪 |
| Türkçe (Turkish) | tr | 🇹🇷 |
| 日本語 (Japanese) | ja | 🇯🇵 |
| 한국어 (Korean) | ko | 🇰🇷 |
| Italiano (Italian) | it | 🇮🇹 |
What's Translated
Everything:
- Full website interface (workbench, dashboard, settings)
- All SEO content and landing pages
- Blog posts
- Legal pages (privacy policy, terms, cookie policy)
- Email templates
- Pricing and plan descriptions
How to Switch
Click the language selector in the navigation bar. Your preference is saved automatically and persists across sessions.
Why This Matters
Over 60% of internet users prefer browsing in their native language. By supporting 10 languages, we're making AI text to speech accessible to billions more people worldwide.
More languages are coming. Let us know which one you'd like to see next.
Free Text to Speech Page
2.1.0
A dedicated free TTS tool — no signup, no daily limits, no credit card. Just paste text and generate speech in 100+ languages.
2026/02/01
What's New
We launched a dedicated free text to speech page at /free-text-to-speech.
No account needed. No credit card. No daily limit tricks. Just open the page, type your text, and generate speech.
Why a Separate Free Page?
The main Text to Speech workbench uses our Advanced and Pro voices — which require an account and credits. That's great for professional work, but not everyone needs premium quality.
The free page uses our Basic voice engine and is designed for:
- Quick one-off conversions
- Students and researchers
- Testing scripts before investing in premium audio
- Anyone who just wants free TTS without friction
What You Get
- 100+ languages supported
- No signup required — works instantly
- MP3 download for every generation
- No watermarks on the audio
- Commercial use allowed
Character Limits
| User Type | Characters per Request |
|---|---|
| Not logged in | 1,000 |
| Logged in (free) | 5,000 |
| Paid plan | Plan default |
Upgrade Path
Want better voice quality? The free page includes a comparison between Basic and Advanced voices, so you can hear the difference and upgrade when you're ready.
AI Voice Cloning is Here
2.0.0
Clone any voice from a 10-second audio clip — with emotion control. Create a consistent brand voice across all your content.
2026/01/15
What's New
This is a big one. You can now clone any voice using just a 10-second audio sample.
Upload a short clip of yourself (or any voice you have permission to use), and AnySpeech creates a digital replica that you can use to generate unlimited speech.
How It Works
- Go to Voice Cloning in the sidebar
- Upload an audio file (MP3, WAV, or M4A) — 10 to 30 seconds is ideal
- Name your voice and confirm consent
- Start generating speech with your cloned voice
That's it. No training time. No waiting hours. The voice is ready to use immediately.
Emotion Control
This is what sets our cloning apart. Every cloned voice supports adjustable emotions:
- Happy
- Calm
- Excited
- Sad
- Angry
- Neutral
You pick the emotion per generation. Same voice, different delivery — matching the mood of each piece of content.
Exaggeration Slider
Control how dramatic the emotion sounds. Dial it up for comedy sketches, keep it subtle for professional narration.
Who Gets It
| Plan | Voice Clones Allowed |
|---|---|
| Free | 1 |
| Basic | 3 |
| Standard | 5 |
| Professional | 10 |
| Premium | 20 |
| Max | 50 |
Pricing
Voice cloning uses the same credit rate as Advanced voices (1x). No extra charge for the cloning feature itself.
Free users get one voice clone to try. Upgrade to create more.
New Voice Selection Experience
1.3.0
We've completely redesigned the voice selection interface with preview playback and recent voices.
2025/12/15
What's New
We've completely redesigned the voice selection experience to make it easier and faster to find the perfect voice for your content.
Voice Preview
You can now preview any voice directly from the voice list. Simply click the play button next to any voice to hear a sample before selecting it.
Recently Used Voices
A new "Recently Used" section appears at the top of the voice selector, giving you quick access to voices you've used before. No more scrolling through the entire list!
Search & Filter
- Search voices by name
- Filter by language
- Voices are now grouped by language for easier browsing
Improved Voice Cards
Each voice card now shows:
- Voice name and language
- Preview play button
- Clear selection indicator
This update makes it much faster to find and switch between voices, especially if you work with multiple languages.
Basic Voice Now Supports 60+ Languages
1.2.0
The free Basic voice model now supports over 60 languages, including Cantonese, Japanese, Korean, and more.
2025/12/10
What's New
The Basic voice model now supports over 60 languages, making professional text-to-speech accessible to users worldwide.
New Languages Added
Here are some of the new languages now available in Basic voice:
Asian Languages:
- Cantonese (粤语)
- Japanese (日本語)
- Korean (한국어)
- Vietnamese (Tiếng Việt)
- Thai (ภาษาไทย)
- Indonesian (Bahasa Indonesia)
European Languages:
- French (Français)
- German (Deutsch)
- Spanish (Español)
- Italian (Italiano)
- Portuguese (Português)
- Dutch (Nederlands)
- Polish (Polski)
- Russian (Русский)
And Many More:
- Arabic (العربية)
- Hindi (हिन्दी)
- Turkish (Türkçe)
- Greek (Ελληνικά)
- Hebrew (עברית)
- And 40+ additional languages
How to Use
- Select "Basic" from the model selector
- Choose a voice from the language you need
- Enter your text and generate
All Basic voice languages are free to use and don't consume any credits!
Basic Voice is Now Free
1.1.0
The Basic voice model is now completely free to use, with no credits required.
2025/12/05
What's New
We're making AI text-to-speech more accessible! Basic voice is now completely free for all users.
Free Usage Limits
| User Type | Daily Requests | Characters per Request |
|---|---|---|
| Not logged in | 10 | 1,000 |
| Free account | 20 | 5,000 |
| Paid account | Unlimited | Unlimited |
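Read as rules, the table comes down to two checks per request: a daily request count and a per-request character count. Here is a small sketch under stated assumptions: the user-type keys and the `can_generate` helper are hypothetical, and characters are counted with Python's `len`, which may differ from how the service counts them.

```python
# (daily requests, characters per request) from the table above.
# None stands for "Unlimited".
BASIC_LIMITS = {
    "anonymous": (10, 1_000),  # not logged in
    "free": (20, 5_000),       # free account
    "paid": (None, None),      # paid account
}


def can_generate(user_type: str, requests_today: int, text: str) -> bool:
    """Return True if one more Basic voice request fits within the limits."""
    daily_limit, char_limit = BASIC_LIMITS[user_type]
    if daily_limit is not None and requests_today >= daily_limit:
        return False
    if char_limit is not None and len(text) > char_limit:
        return False
    return True
```

For example, `can_generate("anonymous", 10, "hi")` is `False` because the tenth request of the day has already been used.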
Why We Made This Change
We believe everyone should be able to try AI text-to-speech without barriers. Basic voice provides great quality for:
- Quick demos and tests
- Short content pieces
- Learning and experimentation
- Multi-language content
No Credits Required
Unlike Advanced and Pro voices, Basic voice does not consume any credits. Use it as much as you want within the daily limits!
Want More?
If you need higher-quality voices or unlimited Basic usage:
- Advanced voice: Premium quality, 1x credit per character
- Pro voice: Studio-grade quality, 2x credits per character
- Paid plans: Unlimited Basic voice with no daily limits
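To make the credit rates concrete, here is a short worked sketch. It assumes every character, including spaces and punctuation, counts toward the total. The `credits_needed` helper is illustrative only.

```python
# Credits per character, from the list above. Basic consumes no credits.
CREDIT_RATE = {"basic": 0, "advanced": 1, "pro": 2}


def credits_needed(model: str, text: str) -> int:
    """Estimate the credits one generation would consume."""
    return CREDIT_RATE[model] * len(text)


script = "Hello, world"  # 12 characters
print(credits_needed("basic", script))     # 0
print(credits_needed("advanced", script))  # 12
print(credits_needed("pro", script))       # 24
```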
Try Basic voice today and see what AI text-to-speech can do for your content!
Faster Audio Generation
1.0.0
Audio generation is now up to 50% faster with real-time progress tracking.
2025/12/01
What's New
We've significantly improved our audio generation infrastructure, resulting in faster generation times and a better user experience.
Speed Improvements
- Up to 50% faster generation time
- More stable processing for long texts
- Better handling of multiple concurrent requests
Real-Time Progress
You can now see exactly how your audio generation is progressing:
- Live progress bar showing completion percentage
- Status updates as your audio is being created
- Clear indication when generation is complete
Long Text Support
For longer content, we now handle text more intelligently:
- Automatic chunking for optimal processing
- Seamless audio merging
- Consistent quality across the entire output
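The changelog does not say how the chunking works. One common approach, shown here only as an illustration, is to split at sentence boundaries and pack sentences into chunks under a size limit. The `chunk_text` function and the 1,000-character default are assumptions, not the service's actual values.

```python
import re


def chunk_text(text: str, max_chars: int = 1000) -> list[str]:
    """Split text into chunks of at most max_chars, breaking at sentence ends where possible."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks = []
    current = ""
    for sentence in sentences:
        # Hard-split any single sentence that is longer than the limit.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each chunk would then be synthesized separately and the resulting audio segments joined in the same order. Breaking at sentence ends keeps the joins at natural pauses.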
What This Means for You
- Less waiting time for your audio
- Better visibility into the generation process
- More reliable results, especially for longer content
These improvements work across all voice models (Basic, Advanced, and Pro).