Benchmarks for audio models that aren't just speech.
audiobench is an open, reproducible evaluation suite for ASR, separation, tagging, and the long tail of audio ML tasks. One command. Real workloads. No marketing-grade numbers.
Single-number vanity metrics: WER on LibriSpeech-clean tells you nothing about how a model handles noisy restaurants, code-switching, or 8 kHz phone audio.
Train-test contamination: Public eval sets leak into pretraining corpora. audiobench includes held-out, freshly licensed clips, so scores reflect generalization rather than memorization.
Reproducibility by accident: audiobench makes it deliberate, with pinned data revisions, deterministic decoding, and hash-verified outputs.
Built by audio engineers: Sample rate, dynamic range, and codec round-trips are first-class concerns.
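To make the point about single-number metrics concrete, here is a minimal sketch of reporting WER per recording condition alongside the pooled figure. It is not audiobench's API: the names `edit_distance`, `wer_by_condition`, and `SAMPLES` are hypothetical, and the transcripts are made-up toy data chosen only to illustrate the arithmetic.

```python
from collections import defaultdict


def edit_distance(ref, hyp):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def wer_by_condition(samples):
    """Return WER per condition plus a pooled "overall" entry.

    samples: iterable of (condition, reference, hypothesis) strings.
    Assumes every condition has at least one non-empty reference.
    """
    errors = defaultdict(int)
    words = defaultdict(int)
    for condition, reference, hypothesis in samples:
        ref_words = reference.split()
        hyp_words = hypothesis.split()
        distance = edit_distance(ref_words, hyp_words)
        for key in (condition, "overall"):
            errors[key] += distance
            words[key] += len(ref_words)
    return {key: errors[key] / words[key] for key in errors}


# Toy data (invented for illustration, not from any real eval set).
SAMPLES = [
    ("clean", "turn the lights off", "turn the lights off"),
    ("clean", "play some jazz music", "play some jazz music"),
    ("noisy", "book a table for two", "look at table two"),
]

scores = wer_by_condition(SAMPLES)
for name, value in sorted(scores.items()):
    print(f"{name}: {value:.3f}")
```

On this toy set the pooled WER comes out well below the noisy-condition WER, which is the failure a single headline number hides.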