Argmax Local Server

August 8, 2025

  • Argmax's real-time transcription matches top cloud APIs on feature set and accuracy while providing consistently lower latency because we optimized it to run locally on the powerful Apple Neural Engine.
  • Until now, however, Argmax SDK required native app integration, so migrating from the cloud meant writing new code.
  • Due to popular demand, we built Argmax Local Server so developers can migrate from cloud APIs to on-device without changing their implementation!
  • It is now also easier to work with Argmax in non-native apps (Electron, Tauri), using clients in JavaScript, Rust, Python and more.
  • Argmax Local Server is feature-complete for AI Meeting Notes apps with multi-stream real-time transcription of system audio and microphone, custom vocabulary for names and more.

Multi-stream Real-time Transcription

Stream multiple audio inputs and get a real-time text transcript stream for each, without additional memory consumption or slowdown. AI Meeting Notes apps need multi-stream because they transcribe two sources concurrently: the system audio, e.g. Google Meet audio from remote participants, and the system microphone, e.g. Google Meet audio from local participants.
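As a rough illustration of what multi-stream looks like on the client side, here is a minimal Python sketch that runs two audio streams concurrently and tags each transcript with its source. Everything in it is a stand-in: `fake_audio` and `fake_transcribe` are hypothetical placeholders for a real audio capture and for a real connection to the server, and whether a client opens one connection per stream is an assumption, not something this post specifies.

```python
import asyncio
from typing import AsyncIterator, Awaitable, Callable

# Hypothetical signature: takes one audio chunk, returns transcript text.
Transcribe = Callable[[bytes], Awaitable[str]]


async def run_stream(source: str, audio: AsyncIterator[bytes],
                     transcribe: Transcribe, out: asyncio.Queue) -> None:
    """Forward every chunk of one audio source and tag results with its name."""
    async for chunk in audio:
        text = await transcribe(chunk)
        await out.put((source, text))


async def transcribe_meeting(mic: AsyncIterator[bytes],
                             system: AsyncIterator[bytes],
                             transcribe: Transcribe) -> list[tuple[str, str]]:
    """Run the microphone and system-audio streams concurrently."""
    out: asyncio.Queue = asyncio.Queue()
    await asyncio.gather(
        run_stream("microphone", mic, transcribe, out),
        run_stream("system", system, transcribe, out),
    )
    results = []
    while not out.empty():
        results.append(out.get_nowait())
    return results


# Stand-ins so the sketch runs without audio hardware or a server.
async def fake_audio(chunks: list[bytes]) -> AsyncIterator[bytes]:
    for chunk in chunks:
        await asyncio.sleep(0)  # yield control, as real capture would
        yield chunk


async def fake_transcribe(chunk: bytes) -> str:
    return chunk.decode()
```

In a real client, `fake_transcribe` would be replaced by code that sends each chunk to the server and awaits the transcript, and the `(source, text)` pairs would feed the meeting notes UI.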

WebSocket API & Cloud API Compatibility

Argmax Local Server's WebSocket API is compatible with that of Deepgram Streaming Speech-to-text. The demo video above shows a popular Electron app using Argmax Local Server with ZERO changes to app code when migrating from Deepgram. All it takes is switching the inference host URL from the remote host (api.deepgram.com) to the local host (localhost).
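To make the host switch concrete, here is a minimal Python sketch that rewrites a Deepgram streaming URL to point at a local server while keeping the path and query parameters untouched. The port `8080`, the plain `ws` scheme for local connections, and the assumption that the local server listens on the same `/v1/listen` path are illustrative guesses, not documented Argmax values; `streaming_url` and `to_local` are hypothetical helper names.

```python
from urllib.parse import urlencode, urlsplit, urlunsplit

# Deepgram's hosted streaming endpoint.
DEEPGRAM_URL = "wss://api.deepgram.com/v1/listen"


def streaming_url(base_url: str, **params) -> str:
    """Append query parameters (sorted for stable output) to an endpoint URL."""
    parts = urlsplit(base_url)
    query = urlencode(sorted(params.items()))
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, ""))


def to_local(url: str, host: str = "localhost", port: int = 8080) -> str:
    """Swap the remote host for a local one; path and query stay as they are.

    The port and the unencrypted "ws" scheme are assumptions for illustration.
    """
    parts = urlsplit(url)
    return urlunsplit(("ws", f"{host}:{port}", parts.path, parts.query, ""))


remote = streaming_url(DEEPGRAM_URL, encoding="linear16", sample_rate=16000)
local = to_local(remote)
print(remote)
print(local)
```

Because only the scheme and host change, the rest of the client (message handling, audio framing, query parameters) can stay exactly as it was written for the cloud API.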

Speaker Diarization

If your user experience requires speaker separation beyond system audio versus microphone, Argmax SDK SpeakerKit implements leading open-source speaker diarization systems such as pyannote-4.0 to separate and identify each and every meeting attendee. SpeakerKit also exclusively offers the best-in-market speaker diarization system on-device: pyannoteAI Precision-2.

SpeakerKit supports speaker diarization with prerecorded audio, which means that speaker diarization can be run at the end of each meeting. Real-time speaker diarization will be added to Argmax Local Server (and Enterprise SDK) in early 2026. Get on the waitlist to be the first to hear and onboard when it ships!

Other Apps Do Not Slow Down

Before Argmax Local Server hit the market, many apps like Granola tried on-device inference and decided against it because CPU and GPU resource contention with other apps led to slowdowns and user complaints.

In our mission to make on-device the obvious architectural choice for audio model inference infrastructure, we solved this problem. See 1:31 in the video above for details on how.

To learn more about Argmax Local Server, request an Argmax Enterprise demo today!
