Partnership

pyannoteAI on Argmax SDK

June 23, 2025

TL;DR

  • Argmax partners with pyannoteAI, the industry-leading speaker recognition technology company. pyannoteAI's flagship commercial model for speaker diarization is now available on the Argmax Marketplace, our gallery of leading proprietary models optimized for on-device deployment with Argmax SDK.
  • SpeakerKit, part of Argmax SDK, was built on pyannote's open-source model to bring speaker diarization on-device. It will now expand support to their commercial model, which sets the frontier in speaker recognition accuracy. The APIs stay the same, so upgrading requires no code changes!
  • Our Interspeech 2025 research paper shows that pyannoteAI's flagship commercial model attains the highest accuracy among several server-side and on-device competitors, while Argmax SpeakerKit achieves the second-highest accuracy and is roughly 10x faster than any other solution. The combination of the two is... Submit this form to get started.
  • If you are a model builder and want to expand your distribution while earning 100% of the model upgrade revenue through Argmax's device install base like pyannoteAI did, drop us a note!

pyannoteAI's flagship model is a drastic quality improvement over open-source pyannote-3.1 (OSS) across many commercial use cases (Source: pyannoteAI)

Speaker Diarization

We have been fans of the pyannoteAI team and what they have built ever since we started building SpeakerKit for on-device speaker diarization earlier this year. As part of that work, we benchmarked 5 systems (open-source and proprietary) across 13 datasets with a unified evaluation methodology and found that pyannote-3.1 consistently achieved lower error rates than the others. Our benchmarks are published at Interspeech 2025 and the code is open-source:

Argmax's Interspeech 2025 paper that benchmarks 5 systems across 13 datasets representing various use cases

In these benchmarks, pyannoteAI's commercial model (denoted PyAnnote-AI in the plot), served via their cloud API, ranked first on quality, significantly lowering the Diarization Error Rate (DER). Meanwhile, SpeakerKit matched the DER of pyannote-3.1 OSS on almost all datasets, as expected. The first cohort of SpeakerKit customers noted that the system is already faster than they expected and that any future improvements should come in the form of additional features or even higher accuracy (lower DER).
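For readers new to the metric, DER is conventionally defined as the sum of false-alarm, missed-detection, and speaker-confusion durations divided by the total duration of reference speech. The sketch below only illustrates that standard definition. The function name and the numbers are made up for the example and are not taken from our benchmark or its code.

```python
def diarization_error_rate(false_alarm, missed_detection, confusion, total_reference):
    """Standard DER definition; all arguments are durations in seconds.

    false_alarm:      non-speech that the system labeled as speech
    missed_detection: reference speech that the system missed
    confusion:        speech attributed to the wrong speaker
    total_reference:  total reference speech duration
    """
    if total_reference <= 0:
        raise ValueError("total reference speech duration must be positive")
    return (false_alarm + missed_detection + confusion) / total_reference


# Illustrative durations only (hypothetical, not benchmark results)
der = diarization_error_rate(
    false_alarm=3.0,
    missed_detection=6.0,
    confusion=9.0,
    total_reference=120.0,
)
print(f"DER = {der:.1%}")
```

Because the numerator includes false alarms, DER can exceed 100% on hard audio, and lower is always better.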

We are excited to announce that pyannoteAI's commercial model is now available on Argmax Marketplace as a SpeakerKit-compatible model upgrade!

Real-time Streaming Diarization

Many of our current and prospective customers have been asking about real-time streaming diarization. Although there are some products in the market today, we do not think this technology has achieved commercial-grade status yet. We will ship our version of this technology when the bleeding-edge accuracy meets our bar. In the meantime, please enjoy ultra-fast inference by rerunning speaker diarization after each and every transcribed sentence.
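The rerun-after-each-sentence pattern can be sketched roughly as follows. This is a minimal illustration and not the Argmax SDK API: `rerun_after_each_sentence`, the `diarize` callable, and the `fake_diarize` stand-in are all hypothetical names, and the audio is a plain list of samples.

```python
from typing import Callable, Iterable, Iterator, List, Tuple

Segment = Tuple[float, float, str]  # (start_sec, end_sec, speaker_label)
SAMPLE_RATE = 16_000  # assumed sample rate for this example


def rerun_after_each_sentence(
    sentences: Iterable[Tuple[List[float], str]],
    diarize: Callable[[List[float]], List[Segment]],
) -> Iterator[Tuple[str, List[Segment]]]:
    """Re-diarize all audio received so far each time a sentence is transcribed.

    sentences: (audio_samples, text) pairs, one per transcribed sentence
    diarize:   hypothetical offline diarizer run over the accumulated audio
    """
    audio_so_far: List[float] = []
    for samples, text in sentences:
        audio_so_far.extend(samples)
        # Full rerun: speaker labels for earlier audio may be revised
        yield text, diarize(audio_so_far)


def fake_diarize(samples: List[float]) -> List[Segment]:
    """Stand-in diarizer: assigns the whole recording to a single speaker."""
    return [(0.0, len(samples) / SAMPLE_RATE, "SPEAKER_0")]


stream = [
    ([0.0] * SAMPLE_RATE, "Hello there."),       # 1 second of audio
    ([0.0] * (2 * SAMPLE_RATE), "How are you?"),  # 2 more seconds
]
results = list(rerun_after_each_sentence(stream, fake_diarize))
for text, segments in results:
    print(text, segments)
```

Each rerun processes the whole recording again, so this approach is only practical when offline inference is fast relative to the audio length.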

This commercial model is just the first chapter of our partnership. In the future, we intend to make other breakthrough technologies from pyannoteAI easy to deploy on device everywhere with Argmax SDK. If you are interested in evaluating this technology, please submit this form to get started.
