March 18, 2026

Our first foray into Android was a collaboration with Qualcomm. We deployed OpenAI Whisper on Qualcomm Snapdragon NPUs for speech-to-text on pre-recorded audio. That deployment was built on the Google TFLite inference runtime, the predecessor to LiteRT.
This was a promising start, but the approach had several limitations:
Google LiteRT supplanted TFLite as the next-gen inference runtime for Android. Most importantly:
Thanks to our collaboration with Google LiteRT, Argmax became the first commercial SDK to ship with LiteRT that supports virtually all of the major NPU vendors in the Android market.
Since our first foray into Android, Nvidia Parakeet has overtaken OpenAI Whisper as the leading speech-to-text model. We built the first and only real-time streaming implementation of Nvidia Parakeet on the market, reusing the techniques we published in our ICML 2025 paper. This algorithm raises real-time transcription accuracy to match the reference accuracy on pre-recorded audio. The same accuracy is now available across Android and Apple platforms via Argmax Pro SDK.
Argmax Pro SDK for Apple recently expanded to support real-time transcription with speakers and custom vocabulary. The launch version of Argmax Pro SDK for Android supports only real-time transcription and is expected to reach feature parity over time.
Finally, we redesigned Argmax Pro SDK for Android from the ground up to be Kotlin-first. The result is an unapologetically platform-native developer experience, with top-tier reliability and abstractions familiar to Android teams.
Try Argmax Pro SDK for Android today: