GraySoft
Model Intelligence Sheet

omi-health/omi-med-stt-v1-gguf overview

Omi Med STT v1 GGUF: GGUF export of Omi Med STT v1 (https://huggingface.co/omi-health/omi-med-stt-v1) for Linux and Windows CPU use through the omi-med-stt CLI. T…

gguf · automatic-speech-recognition · medical · parakeet · parakeet.cpp · omi-med-stt · en · base_model:nvidia/parakeet-tdt-0.6b-v2 · base_model:quantized:nvidia/parakeet-tdt-0.6b-v2 · license:cc-by-4.0 · region:us

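Runs locally from ~886.2 MB on disk (the q8_0 file). Per the model README, these are not llama.cpp text-model GGUF files: they require a Parakeet ASR runtime and target CPU use through the omi-med-stt CLI.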

Downloads: 50
Likes: 0
Pipeline: automatic-speech-recognition

Repository Files & Downloads

2 GGUF files detected. Direct downloads for local inference:

| File | Type | Quantization | Size |
|---|---|---|---:|
| omi-med-stt-v1-f16.gguf | GGUF | F16 | 1.33 GB |
| omi-med-stt-v1-q8_0.gguf | GGUF | Q8_0 | 886.2 MB |

Model Details

Model ID: omi-health/omi-med-stt-v1-gguf
Author: omi-health
Pipeline: automatic-speech-recognition
License: cc-by-4.0
Base model: nvidia/parakeet-tdt-0.6b-v2
Last modified: 2026-06-09T00:37:54.000Z

Model README

---
license: cc-by-4.0
language:
  - en
library_name: gguf
tags:
  - automatic-speech-recognition
  - medical
  - parakeet
  - gguf
  - parakeet.cpp
  - omi-med-stt
pipeline_tag: automatic-speech-recognition
base_model: nvidia/parakeet-tdt-0.6b-v2
---

Omi Med STT v1 GGUF

GGUF export of Omi Med STT v1 for Linux and Windows CPU use through the omi-med-stt CLI.

This is the portability path. If you have Apple Silicon, use the MLX q8 repo. If you have an NVIDIA GPU, use the canonical NeMo checkpoint.

Quickstart

pip install -U omi-med-stt
omi-med-stt install-cpp --cpp-backend cpu
omi-med-stt audio.wav --runtime cpp
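
For batch use, the documented invocation can be wrapped from Python. The following is a minimal sketch, not part of the omi-med-stt package: the helper names `build_transcribe_command` and `transcribe` are hypothetical, and it assumes the CLI prints the transcript to standard output, which the README does not state. Only the command line shown in the Quickstart is used.

```python
import shutil
import subprocess
from pathlib import Path

CLI_NAME = "omi-med-stt"  # executable installed by `pip install -U omi-med-stt`


def build_transcribe_command(audio_path: str) -> list[str]:
    """Return the argv for the CPU invocation shown in the Quickstart."""
    return [CLI_NAME, str(audio_path), "--runtime", "cpp"]


def transcribe(audio_path: str) -> str:
    """Run the CLI on one file and return whatever it writes to stdout.

    Assumption: the transcript goes to stdout; the README does not say so.
    """
    if shutil.which(CLI_NAME) is None:
        raise FileNotFoundError(f"{CLI_NAME} not found on PATH; run the install steps first")
    if not Path(audio_path).is_file():
        raise FileNotFoundError(f"audio file not found: {audio_path}")
    result = subprocess.run(
        build_transcribe_command(audio_path),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout
```

Keeping the argv construction separate from the subprocess call makes the command easy to inspect or log before anything is executed.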

Files

| File | Status |
|---|---|
| omi-med-stt-v1-q8_0.gguf | Default CPU artifact, benchmarked |
| omi-med-stt-v1-f16.gguf | Provided for conversion/experimentation; not independently benchmarked |

Evaluation

Full evaluation details: omi.health/research/omi-med-stt.

Benchmark: 7.18h of real and synthetic clinical speech across dialogue, dictation, medication review, procedures/devices/tests, and general speech. Speed is shown as time to process one hour of audio; lower is faster.

NeMo vs Open / Local Models

Local GPU baselines were run on A10 where applicable; VibeVoice-ASR 9B used H100.

| Model | WER | M-WER | Drug M-WER | Medical Recall | Speed: time / 1 hour audio (formula-derived x realtime) |
|---|---:|---:|---:|---:|---:|
| VibeVoice-ASR 9B | 11.10% | 1.78% | 1.36% | 98.71% | 5m 20s (11.2x) |
| Omi Med STT v1 NeMo | 8.30% | 2.37% | 4.75% | 97.95% | 25s (146.3x) |
| Qwen3 ASR 1.7B | 10.72% | 3.13% | 6.11% | 97.21% | 44s (81.1x) |
| Whisper Large v3 Turbo (A10) | 11.98% | 3.93% | 5.88% | 96.45% | 1m 19s (45.8x) |
| Cohere Transcribe 03-2026 | 14.88% | 5.05% | 11.09% | 95.16% | 25s (146.3x) |
| Parakeet TDT 0.6B v3 | 15.26% | 8.01% | 9.50% | 96.34% | 23s (157.9x) |
| Parakeet TDT 0.6B v2 base | 16.45% | 8.36% | 8.60% | 96.20% | 23s (153.8x) |
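
The speed column pairs a wall-clock time per hour of audio with a realtime multiple. The relationship can be sketched as 3600 seconds of audio divided by processing seconds. The helper names below are hypothetical. Recomputing from the rounded durations only approximates the listed multiples (25s gives 144.0x, while the table lists 146.3x), which suggests the listed values were derived from unrounded timings; that is an inference, not something the source states.

```python
import re


def duration_to_seconds(text: str) -> int:
    """Parse durations written like '25s' or '5m 20s' into whole seconds."""
    match = re.fullmatch(r"(?:(\d+)m)?\s*(?:(\d+)s)?", text.strip())
    if match is None or not any(match.groups()):
        raise ValueError(f"unrecognised duration: {text!r}")
    minutes, seconds = (int(g) if g else 0 for g in match.groups())
    return minutes * 60 + seconds


def realtime_factor(processing_seconds: float, audio_seconds: float = 3600.0) -> float:
    """Seconds of audio handled per second of processing time."""
    return audio_seconds / processing_seconds


if __name__ == "__main__":
    for shown in ("5m 20s", "25s", "44s"):
        secs = duration_to_seconds(shown)
        print(f"{shown}: {secs} s, about {realtime_factor(secs):.1f}x realtime")
```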

Runtime Artifacts

Same internal evaluation as the canonical checkpoint.

| Artifact | WER | M-WER | Drug M-WER | Medical Recall | Speed: time / 1 hour audio (formula-derived x realtime) |
|---|---:|---:|---:|---:|---:|
| NeMo canonical | 8.30% | 2.37% | 4.75% | 97.95% | 25s (146.3x) |
| MLX q8 | 8.61% | 2.75% | 5.20% | 97.63% | 53s (67.4x) |
| GGUF q8_0 | 9.12% | 3.20% | 6.33% | 97.53% | 2m 53s (20.8x) |

The GGUF q8_0 build is useful when CPU portability matters. It is not the quality-leading artifact.
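
To put the trade-off in numbers, the gap between each artifact and the NeMo checkpoint can be worked out from the Runtime Artifacts table. This is an illustrative sketch only: the figures are copied from that table, the dictionary layout and the function name `compare_to_baseline` are hypothetical, and the speed ratio uses the listed realtime multiples as given.

```python
# Values copied from the Runtime Artifacts table (WER in percent, speed as x realtime).
ARTIFACTS = {
    "NeMo canonical": {"wer": 8.30, "realtime_x": 146.3},
    "MLX q8": {"wer": 8.61, "realtime_x": 67.4},
    "GGUF q8_0": {"wer": 9.12, "realtime_x": 20.8},
}


def compare_to_baseline(name: str, baseline: str = "NeMo canonical") -> dict[str, float]:
    """Return the WER gap and slowdown of one artifact relative to the baseline."""
    art, base = ARTIFACTS[name], ARTIFACTS[baseline]
    return {
        "wer_abs_points": art["wer"] - base["wer"],
        "wer_rel_percent": 100.0 * (art["wer"] - base["wer"]) / base["wer"],
        "slowdown_x": base["realtime_x"] / art["realtime_x"],
    }


if __name__ == "__main__":
    for name in ("MLX q8", "GGUF q8_0"):
        gap = compare_to_baseline(name)
        print(
            f"{name}: +{gap['wer_abs_points']:.2f} WER points "
            f"({gap['wer_rel_percent']:.1f}% relative), "
            f"{gap['slowdown_x']:.1f}x slower"
        )
```

For GGUF q8_0 this works out to roughly 0.82 WER points above the NeMo checkpoint (about 9.9% relative) at about one seventh of the listed speed.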

Compatibility

These files are not llama.cpp text-model GGUF files. They require a Parakeet ASR runtime. The supported path is:

omi-med-stt audio.wav --runtime cpp

The CLI installs the patched parakeet.cpp runtime needed for Omi Med STT v1.
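
GGUF is a container format, so a file can be a valid GGUF file and still be unusable by a text-model runtime. As an illustration, the sketch below reads the fixed-size header at the start of a GGUF file. It assumes the little-endian layout of GGUF version 2 and later (4-byte magic, uint32 version, uint64 tensor count, uint64 metadata key/value count); the function and class names are hypothetical. Passing this check says nothing about which runtime can load the model.

```python
import struct
from typing import BinaryIO, NamedTuple

GGUF_MAGIC = b"GGUF"


class GGUFHeader(NamedTuple):
    version: int
    tensor_count: int
    metadata_kv_count: int


def read_gguf_header(stream: BinaryIO) -> GGUFHeader:
    """Read the fixed-size header (assumes little-endian GGUF version 2 or later)."""
    magic = stream.read(4)
    if magic != GGUF_MAGIC:
        raise ValueError(f"not a GGUF file: magic={magic!r}")
    # uint32 version, uint64 tensor count, uint64 metadata key/value count.
    raw = stream.read(20)
    if len(raw) != 20:
        raise ValueError("truncated GGUF header")
    version, tensor_count, kv_count = struct.unpack("<IQQ", raw)
    return GGUFHeader(version, tensor_count, kv_count)


if __name__ == "__main__":
    # Example path taken from the file list; download the file first.
    with open("omi-med-stt-v1-q8_0.gguf", "rb") as fh:
        print(read_gguf_header(fh))
```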


Safety

Omi Med STT v1 is speech-to-text only. It is not a diagnostic, triage, prescribing, or clinical decision model, and it is not clinically validated. Transcripts must be reviewed before any clinical use.


Source: Hugging Face