
Published 5/12/2026

Introducing Voice finder: a new tool to quickly find the right voice for your app from 600+ voices

A new tool from Together AI for searching, filtering, and auditioning voices across leading TTS models.

Summary

  • Voice finder: Search 600+ voices across TTS models from MiniMax, Cartesia, Deepgram, Rime, and other providers available through Together AI.
  • Search by prompt or audio: Describe the voice you need, or upload a short voice sample to find similar voices with playable recommendations.
  • Model-aware metadata: Each voice is tagged across 15+ attributes, including pitch, accent, language, age, emotion, and speaking style.

Choosing the right voice for a voice agent is still too manual. Provider catalogs can include dozens or hundreds of voices, and the documentation rarely tells you which one fits a fintech support agent, a meditation guide, or a game show host.

Voice finder gives developers a faster way to search the Together AI voice catalog. Type in what you are building or upload a short audio sample of the voice you have in mind, then compare ranked recommendations, listen inline, and filter by the attributes that matter for your use case.

Voice finder demo animation

How it works

Voice finder indexes 600+ voices across 10 TTS models on Together AI. Each voice is playable directly in the tool.

Behind the ranking layer, an omni-model has listened to every voice and generated structured metadata across 15+ dimensions, including pitch, gender, accent, language, age, emotion, and speaking style. That metadata powers both natural-language search and manual filtering.
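To make the manual-filtering idea concrete, here is a minimal sketch of filtering a voice catalog by structured attributes. It is not the Voice finder implementation or a Together AI API: the `Voice` record, the `filter_voices` helper, the field names, and the sample voices are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Voice:
    """Hypothetical metadata record; field names are assumptions."""
    voice_id: str
    model: str
    gender: str
    pitch: str
    accent: str
    language: str
    age: str
    emotion: str
    speaking_style: str


def filter_voices(voices, **criteria):
    """Return the voices whose attributes equal every given criterion."""
    return [
        v for v in voices
        if all(getattr(v, field) == wanted for field, wanted in criteria.items())
    ]


# Made-up catalog entries, not real voices or models.
CATALOG = [
    Voice("voice-a", "model-x", "female", "low", "american", "en",
          "adult", "calm", "conversational"),
    Voice("voice-b", "model-y", "male", "medium", "british", "en",
          "adult", "confident", "professional"),
    Voice("voice-c", "model-x", "female", "high", "american", "en",
          "young adult", "energetic", "announcer"),
]

matches = filter_voices(CATALOG, gender="female", emotion="calm")
print([v.voice_id for v in matches])
```

Exact-match filtering like this only works when every voice carries the same attribute set, which is one reason to generate the metadata with a single consistent process.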

A few example searches:

  • “a calm female voice for a meditation app”
  • “a confident voice for a fintech support agent”
  • “an energetic host for a game show”
  • “a warm bilingual voice for customer service”

The goal is simple: get from a use case to a short list of voices quickly enough to keep building.
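As a rough illustration of going from a prompt to a short list, the sketch below ranks voices by how many prompt words appear in their metadata tags. This is a deliberately naive keyword-overlap stand-in, not how Voice finder ranks results. The `rank_voices` and `tokenize` helpers, the tag vocabulary, and the sample catalog are hypothetical.

```python
import re


def tokenize(text):
    """Lowercase the text and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def rank_voices(prompt, catalog, top_k=3):
    """Rank voice IDs by the number of prompt words found in their tags.

    Voices with no overlap are dropped; ties are broken by voice ID.
    """
    query = tokenize(prompt)
    scored = []
    for voice_id, tags in catalog.items():
        score = len(query & tokenize(" ".join(tags)))
        if score > 0:
            scored.append((score, voice_id))
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [voice_id for _, voice_id in scored[:top_k]]


# Made-up tags, not real Voice finder metadata.
CATALOG = {
    "voice-a": ["calm", "female", "low", "soothing"],
    "voice-b": ["confident", "male", "professional"],
    "voice-c": ["energetic", "female", "announcer"],
}

print(rank_voices("a calm female voice for a meditation app", CATALOG))
```

A production ranker would more plausibly use semantic matching, so that "meditation app" can surface a "soothing" voice even without a shared keyword. The sketch only shows the overall shape: a prompt goes in, a ranked short list comes out.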

Why this matters for voice agents

Voice agents depend on more than model quality. The voice has to fit the product, the customer, and the moment. A healthcare intake agent, a restaurant ordering agent, and an entertainment companion should not sound interchangeable.

Together AI gives teams a single platform for building real-time voice agents across STT, LLM, and TTS. The full pipeline runs co-located on one cloud, holding end-to-end latency under 500 ms, fast enough for real-time turn-taking. Voice finder makes the voice-selection step easier by giving developers a faster way to explore the voices available across that stack.

Get started

→ Try Voice finder at findtherightvoice.com
→ Explore the Together AI Voice Platform
→ Read the voice agent documentation
→ Contact Sales for dedicated endpoints and production deployment