Deepgram

Realtime voice AI API platform

A developer speech AI API platform for realtime STT, TTS, and voice agents through Nova, Flux, Aura, and the Voice Agent API.

Korean I/O | API available | Commercial use OK
Edge

vs. similar tools: It focuses on realtime voice-agent infrastructure, including turn detection and interruption handling, beyond STT and TTS.

Overview

At a glance

  • Offers Nova-3 STT and Flux realtime conversational recognition
  • Supports 50+ languages and is designed for low latency
  • Deepgram has publicly announced a Series C round and more than 1,300 organizations building with its APIs
  • Requires more API integration work than a finished transcription app
  • Advanced voice-agent usage needs cost modeling
  • Best for: Developer teams building realtime voice agents and high-scale transcription APIs

Deepgram is a speech AI API platform for developer teams adding realtime recognition and voice agents to products. Nova-3 handles high-accuracy transcription, Flux focuses on realtime conversational recognition and turn detection, and Aura covers speech generation. It is especially strong for embedding live conversational voice experiences into products, not just batch file transcription.

Its strength is latency and scale. Pricing is usage-based and starts with a $200 free credit; published rates start at $0.0048 per minute for Nova-3 and $0.0065 per minute for Flux. Deepgram says Nova-3 supports more than 50 languages, and its 2026 announcement reported a $130 million Series C, a $1.3 billion valuation, and more than 1,300 organizations building with its APIs. Both cloud APIs and self-hosted or on-premises deployments are part of its positioning.

The tradeoff is operational complexity. Because Deepgram is an API embedded into a product, teams need to design audio capture, streaming connections, error handling, and cost monitoring themselves. The Voice Agent API and advanced capabilities are billed by connection time or add-on usage, so contact-center and high-volume products should simulate usage before rollout. For a ready-made meeting-notes workflow, Otter.ai is simpler.
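As a rough illustration of the usage simulation mentioned above, the sketch below estimates a bill from the starting per-minute rates and the $200 free credit quoted in this review. The function names are hypothetical, the rates are "from" prices that may differ by plan or feature, and Voice Agent API or add-on charges are not modeled because this page does not list them.

```python
# Starting rates and free credit as quoted in this review (assumptions,
# not a complete price list; actual rates depend on plan and options).
NOVA3_RATE_PER_MIN = 0.0048   # USD per audio minute, Nova-3 "from" price
FLUX_RATE_PER_MIN = 0.0065    # USD per audio minute, Flux "from" price
FREE_CREDIT_USD = 200.0       # one-time free credit


def estimate_cost(minutes: float, rate_per_min: float,
                  credit: float = FREE_CREDIT_USD) -> dict:
    """Return gross usage cost and the amount billed after the credit."""
    gross = minutes * rate_per_min
    billed = max(0.0, gross - credit)
    return {"gross_usd": round(gross, 2), "billed_usd": round(billed, 2)}


def minutes_covered_by_credit(rate_per_min: float,
                              credit: float = FREE_CREDIT_USD) -> float:
    """Return how many audio minutes the credit covers at a given rate."""
    return credit / rate_per_min


if __name__ == "__main__":
    # Hypothetical workload: 100,000 minutes of Nova-3 transcription.
    print(estimate_cost(100_000, NOVA3_RATE_PER_MIN))
    print(int(minutes_covered_by_credit(NOVA3_RATE_PER_MIN)))
    print(int(minutes_covered_by_credit(FLUX_RATE_PER_MIN)))
```

At these starting rates, the credit covers roughly 41,600 minutes of Nova-3 or 30,700 minutes of Flux, so a small pilot may never leave the free tier while a contact-center workload will exceed it quickly.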

Deepgram is a strong fit for product teams that need low latency and high-volume realtime voice infrastructure. AssemblyAI can be easier to evaluate when the focus is file transcription, summaries, and speech-understanding APIs. If the goal is no-setup meeting notes and searchable conversations for a team, compare Otter.ai as well.

Pricing

Plan | Monthly price | Limits
Pay As You Go | 200 credits | $200 free credit, then Nova-3 starts at $0.0048 per minute
Growth | - | Annual prepaid credits, public model endpoints, and higher concurrency
Custom | - | Custom contract for custom models, enterprise support, and deployment needs

Specs

Languages: 50+
Real-time: Supported
API: Yes
Open source: No
Self-hosting: Available
Korean support: Input/output only
Commercial use: Allowed

Popularity

Buzz and recognition on absolute thresholds

Absolute-threshold score: 83

High confidence (4/4 signals)

  • Hacker News buzz: 6.3/10. Criterion: sum of Hacker News story points from strict title matches. Raw value: 475 pts. Updated: 2026-06-16.
  • Google Trends: 6.6/10. Criterion: Google Trends relative search interest over the past 30 days. Raw value: 63 / 100. Updated: 2026-06-16.
  • YouTube recent results: 4.8/10. Criterion: YouTube Data API recent video search result estimate over the past 30 days. Raw value: 190 results. Updated: 2026-06-16.
  • Verified public benchmark: 9/10. Criterion: public adoption evidence confirmed from official sites, official docs, filings, company announcements, or credible reporting. Raw value: $1.3B valuation and 1,300+ organizations reported by Deepgram. Updated: 2026-01-13.

Each axis maps to a 1-10 absolute threshold where 10 means broadly recognizable. Collected: 2026-06-16.

Verified public benchmark: $1.3B valuation and 1,300+ organizations reported by Deepgram (as of 2026-01-13)

By popularity

  • 93
    ElevenLabs

    Its expressive, multilingual voice quality and rich API and ecosystem have made it an industry standard.

  • 91
    Otter.ai

    It delivers realtime transcription and meeting summaries as a finished app, then extends into connected-app search and follow-up workflows.

  • 88
    HeyGen

    It leads in avatar presenter videos and multilingual dubbing quality.

  • 87
    Fireflies.ai

    With 100+ transcription languages, AskFred, CRM/work-app integrations, and APIs, it is built to turn meeting notes into automated workflows.

  • 86
    AssemblyAI

    It goes beyond transcription by packaging natural-language prompting, keyterm boosts, medical mode, and voice agent APIs in one platform.


Last updated: 2026-06-16
