Argmax Pro SDK 2

April 7, 2026

The next generation of our flagship SDK is here to give you new superpowers:

  • Transcribe speech in real time with speaker attribution
  • Reach frontier accuracy for your use case with custom vocabulary
  • Ship across iOS, macOS, and now, Android!


Real-time Transcription with Speakers

Argmax Pro SDK serves Nvidia Parakeet and OpenAI Whisper for streaming speech-to-text in real time. Argmax Pro SDK 2 adds speaker attribution (who said what) to real-time mode!

The best part: contrary to what the industry has come to expect, real-time mode carries no accuracy penalty. It retains the full accuracy of pre-recorded mode!

If you are an Enterprise customer bringing your own fine-tuned models to Argmax SDK and still want the real-time speakers feature, please reach out on your dedicated Slack channel.

In our benchmark preview, Argmax Pro SDK's Real-time Transcription with Speakers surpasses top cloud APIs such as Deepgram in speaker-attributed transcription accuracy, as measured by cpWER (concatenated minimum-permutation word error rate; lower is better) on the CALLHOME dataset of telephone conversations.
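
For readers unfamiliar with the metric, here is a minimal Python sketch of how cpWER is commonly defined: each speaker's utterances are concatenated into one word stream, every mapping between reference and hypothesis speakers is tried, and the mapping with the fewest word errors is scored against the total number of reference words. This is not Argmax's benchmark code; the function names and the dictionary input format are assumptions made for illustration, and the brute-force search over permutations is only practical for a handful of speakers.

```python
from itertools import permutations


def edit_distance(ref, hyp):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,             # reference word deleted
                curr[j - 1] + 1,         # hypothesis word inserted
                prev[j - 1] + (r != h),  # substitution, or free match
            ))
        prev = curr
    return prev[-1]


def cpwer(reference, hypothesis):
    """Illustrative cpWER (hypothetical helper, not an SDK API).

    Both arguments map a speaker label to that speaker's list of utterances.
    """
    # Concatenate each speaker's utterances into a single word stream.
    ref_streams = [" ".join(utts).split() for utts in reference.values()]
    hyp_streams = [" ".join(utts).split() for utts in hypothesis.values()]

    # Pad the shorter side with empty streams so speaker counts match.
    n = max(len(ref_streams), len(hyp_streams))
    ref_streams += [[] for _ in range(n - len(ref_streams))]
    hyp_streams += [[] for _ in range(n - len(hyp_streams))]

    total_ref_words = sum(len(s) for s in ref_streams)
    if total_ref_words == 0:
        raise ValueError("reference contains no words")

    # Try every speaker mapping and keep the one with the fewest errors.
    best_errors = min(
        sum(edit_distance(r, h) for r, h in zip(ref_streams, perm))
        for perm in permutations(hyp_streams)
    )
    return best_errors / total_ref_words


if __name__ == "__main__":
    reference = {"A": ["hello there", "how are you"], "B": ["fine thanks"]}
    # "you" is attributed to the wrong speaker in this hypothesis.
    hypothesis = {"spk0": ["fine thanks you"], "spk1": ["hello there how are"]}
    print(f"cpWER = {cpwer(reference, hypothesis):.3f}")
```

Because speaker labels in the hypothesis are arbitrary, the metric penalizes only wrong words and wrongly attributed words, not the label names themselves. In the example, the single misattributed word counts as one deletion for speaker A and one insertion for speaker B.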

Argmax Pro SDK 2, powered by Nvidia Sortformer, delivers a generational leap in speaker diarization accuracy over Argmax Pro SDK 1's SpeakerKit, which is powered by pyannote 4. We have therefore open-sourced SpeakerKit's pyannote 4 engine alongside the general availability of Argmax Pro SDK 2.

Argmax is committed to democratizing on-device AI with free and open-source developer tools while building frontier-level capabilities into our commercial SDK for developers and Enterprises with the most demanding requirements.

Custom Vocabulary

Frontier accuracy for speech-to-text systems is not achievable with a single general-purpose model. Every use case has domain-specific jargon. Every user has personal context.

Argmax Pro SDK 2 brings an improved Custom Vocabulary feature that lets developers customize the system for their domain or for each user, bringing to each app and end user the frontier accuracy that is only achievable with context.

The best part: Argmax's research breakthrough enables this feature to scale to 3,000 keywords, while most cloud APIs are limited to a few hundred. This scale unlocks unprecedented use cases, such as specialty-specific AI medical scribes that bring frontier accuracy to every specialty. Here is the guide to applications in Healthcare.

Argmax Pro SDK for Android

Real-time transcription with Nvidia Parakeet is now supported on Android!

We designed Argmax Pro SDK for Android to be Kotlin-first for an unapologetically platform-native developer experience with top-tier reliability and familiar abstractions for Android teams.

We collaborated with Google while building Android support to ensure that devices from popular hardware vendors are tested and validated, covering a meaningful share of Android devices on Day 1.

Getting Started

Current customers can seamlessly update their dependencies in Xcode.

If you are new to Argmax, you can get started with Argmax Pro SDK 2 today.
