Apple SpeechAnalyzer and Argmax WhisperKit

June 20, 2025

TL;DR

At WWDC 2025, Apple introduced SpeechAnalyzer to modernize its on-device speech recognition framework with a new proprietary Apple model.
In our benchmarks, Apple matches the speed and accuracy of mid-tier OpenAI Whisper models on long-form conversational speech transcription.
Developers looking for a free offering with this specific mid-tier speed-accuracy trade-off can pick either Apple SpeechAnalyzer or a smaller model in Argmax WhisperKit, depending on their other requirements. We publish a comprehensive feature set comparison below.
For those with even more demanding requirements, Argmax Pro SDK offers frontier accuracy for speech AND speaker recognition while achieving ~5x higher transcription speed compared to either framework.

Benchmarks

SDK	Model	Error Rate (↓)	Speed Factor (↑)	Size (↓)
Argmax WhisperKit	openai/whisper-base.en	15.2	111	145 MB
Apple SpeechAnalyzer	Apple SpeechTranscriber	14.0	70	133 MB
Argmax WhisperKit	openai/whisper-small.en	12.8	35	216 MB
Argmax Pro SDK	nvidia/parakeet-v2	11.7	359	420 MB

↓: Lower is better ↑: Higher is better

Speed Factor (↑)

Speed factor indicates the number of seconds of input audio processed by the transcription system in one second of wall-clock time. For example, a speed factor of 60 means that a system can process 1 minute of audio in 1 second.
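The definition above can be sketched in a few lines of Python. The function names `speed_factor` and `processing_time` are hypothetical helpers introduced for illustration only and are not part of any SDK. The speed factors in the loop are the ones reported in the benchmark table.

```python
def speed_factor(audio_seconds: float, wall_clock_seconds: float) -> float:
    """Seconds of input audio processed per second of wall-clock time."""
    if wall_clock_seconds <= 0:
        raise ValueError("wall_clock_seconds must be positive")
    return audio_seconds / wall_clock_seconds


def processing_time(audio_seconds: float, factor: float) -> float:
    """Wall-clock seconds needed to process audio at a given speed factor."""
    if factor <= 0:
        raise ValueError("factor must be positive")
    return audio_seconds / factor


# One minute of audio transcribed in one second gives a speed factor of 60.
print(speed_factor(60.0, 1.0))  # 60.0

# Estimated wall-clock time for one hour of audio at each reported speed factor.
reported = [
    ("openai/whisper-base.en", 111),
    ("Apple SpeechTranscriber", 70),
    ("openai/whisper-small.en", 35),
    ("nvidia/parakeet-v2", 359),
]
for model, factor in reported:
    print(f"{model}: {processing_time(3600.0, factor):.1f} s")
```

Because the relationship is a simple ratio, halving the speed factor doubles the wall-clock time for the same audio.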

All results are computed on an M4 Mac mini running macOS 26 Beta Seed 1. Apple results are obtained through this open-source benchmark script and can be easily reproduced. Argmax results are obtained in our Playground app on TestFlight and can be reproduced even more easily.

Error Rate (↓)

This is the Word Error Rate (WER) metric computed on a random 10% subset of the earnings22 dataset, consisting of ~12 hours of English conversations from earnings calls with analysts. The reason for picking this dataset is that Apple mentions long-form conversational speech as the primary improvement with their new SpeechTranscriber model.
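For readers unfamiliar with the metric, here is a minimal sketch of how WER is commonly computed: the word-level edit distance between a reference transcript and the system output, divided by the number of reference words. The post does not specify the text normalization behind the reported numbers, so the lowercasing and whitespace tokenization below are assumptions, and `word_error_rate` is a hypothetical helper rather than the benchmark's actual code.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER in percent: (substitutions + deletions + insertions) / reference words."""
    # Assumed normalization: lowercase and split on whitespace.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference words processed so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing from output
                curr[j - 1] + 1,     # insertion: extra word in output
                prev[j - 1] + cost,  # substitution or exact match
            )
        prev = curr
    return 100.0 * prev[-1] / len(ref)


# One substitution ("ten" -> "tin") and one deletion ("this") over 6 reference words.
wer = word_error_rate(
    "revenue grew ten percent this quarter",
    "revenue grew tin percent quarter",
)
print(f"{wer:.1f}")  # 33.3
```

On this scale, the 14.0 reported for Apple SpeechTranscriber corresponds to roughly 14 word errors per 100 reference words.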

Size (↓)

This is the total download size of the model in megabytes (MB). If installed, Apple’s SpeechTranscriber model assets can be found under /System/Library/AssetsV2.

Feature Set Comparison

Feature	Apple SpeechAnalyzer	Argmax WhisperKit	Argmax Pro SDK
Offline Transcription	✓	✓	✓
Real-time Transcription	✓	✓	✓
Word Timestamps	✓	✓	✓
Voice Activity Detection	✓	✓	✓
Speaker Diarization	–	–	✓
Diarized Transcripts	–	–	✓
Language Detection	–	✓	✓
Languages Supported	10	100	100
Model Support	Apple	Whisper: OpenAI models or your fine-tuned version	Best available on the market: From any model vendor or your custom model
Model Updates	At Apple’s discretion	Developer’s discretion	Developer’s discretion

Deployment Considerations

Consideration	Apple SpeechTranscriber Model	Argmax Models
Is the model pre-installed in the operating system?	No	No
Is the model automatically downloaded during first use?	Yes	Yes
Does the model increase my app download size?	No	No
Does the model itself increase my app’s memory usage?	No	No
Compute Engine	Neural Engine + CPU (Hardcoded)	Neural Engine (GPU and CPU usage is configurable if desired)
iOS compatibility	iOS 26 and newer	iOS 17 and newer
macOS compatibility	macOS 26 and newer	macOS 14 and newer

Support Considerations

	Apple SpeechAnalyzer	Argmax WhisperKit	Argmax Pro SDK
Debug-ability	Source not available	Open-source	Open-core
How are issues reported & fixed?	1) File a Feedback Assistant ticket and check if issue gets fixed in the next OS update	1) Self-troubleshoot 2) File a GitHub issue 3) Get help on Discord	1) Get priority support on Slack
How fast can issues be fixed?	Next OS update at the earliest	Immediate hot-fix possible	Immediate hot-fix possible

Commercial Considerations

	Apple SpeechAnalyzer	Argmax WhisperKit	Argmax Pro SDK
Cost	Free	Free	Pricing
License	Apple Proprietary	MIT	Argmax Standard License

Comprehensive Benchmarks

This post covers just the first stage of our benchmarks for offline transcription. We will include Apple SpeechAnalyzer in our upcoming real-time streaming transcription benchmarks, which will also include top cloud speech-to-text API providers. Achieving high accuracy and low latency at the same time in real-time streaming mode is a hard problem, and we are curious to see what Apple cooked!

Argmax will integrate Apple

We were slightly disappointed to see that Apple’s model still requires a download and does not come pre-installed with iOS or macOS. However, if Apple SpeechAnalyzer is widely adopted, a newly installed app will find that the model was already downloaded by another app on the same device (including Apple’s first-party apps) and skip the download! This removes a significant obstacle for on-device deployment: the latency from app install to first inference, which is dominated by model download time.

For this purpose, Argmax will integrate Apple SpeechAnalyzer so that Argmax WhisperKit and Argmax Pro SDK users may also benefit from a pre-downloaded model while their Argmax model is being downloaded for them.

Browse Apple SpeechAnalyzer documentation.

Start with Argmax WhisperKit on GitHub.

Get access to Argmax Pro SDK.
