Whisper to me like lovers do
TL;DR
Longer story
So, I needed a transcription solution and stumbled upon mystic.ai: they are nice, AND they currently give $50/mo in free credits (that’s bordering on “I love you”).
Obviously, the first choice was OpenAI’s Whisper: it’s good, and it has a lot of optimized implementations.
Stumbling upon insanely-fast-whisper was a blessing: since mystic.ai charges per compute time, faster transcription means I could potentially make my psychotherapy diary infinite (yeah, like the glove, but the stones are in my head).
Whisper optimizations
Batching: batch_size=8 (depends on your VRAM)
Half-precision: torch_dtype=torch.float16
BetterTransformer: yada yada, but it’s actually included now in torch>=2.1.1
Flash Attention 2: wasn’t able to install, frankly speaking. That --no-build-isolation’s a killer!
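For the curious, here is a rough sketch of how the first three settings could be wired into a Hugging Face transformers pipeline. It is not my exact deployment code. The checkpoint name, the 30-second chunk length, the `call_kwargs` and `transcribe` helpers, and the `"sdpa"` attention setting (which assumes transformers>=4.36) are all assumptions for illustration. Decoding an .oga file also assumes ffmpeg is installed.

```python
MODEL_ID = "openai/whisper-large-v3"  # assumption: the post doesn't name a checkpoint


def call_kwargs(batch_size: int = 8, chunk_length_s: int = 30) -> dict:
    """Per-call settings: split long audio into chunks and decode them in batches."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return {"chunk_length_s": chunk_length_s, "batch_size": batch_size}


def transcribe(audio_path: str, batch_size: int = 8) -> str:
    """Hypothetical helper: transcribe one audio file and return its text."""
    # Heavy imports live here so the helpers above work without a GPU stack.
    import torch
    from transformers import pipeline

    use_gpu = torch.cuda.is_available()
    pipe = pipeline(
        "automatic-speech-recognition",
        model=MODEL_ID,
        # Half precision on GPU; fall back to float32 on CPU.
        torch_dtype=torch.float16 if use_gpu else torch.float32,
        device="cuda:0" if use_gpu else "cpu",
        # PyTorch's built-in scaled-dot-product attention, no flash-attn build needed.
        model_kwargs={"attn_implementation": "sdpa"},
    )
    return pipe(audio_path, **call_kwargs(batch_size))["text"]
```

Bump `batch_size` until you run out of VRAM, then back off a notch.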
Why all the fuss?
I’ll omit the quirks of pipeline management, as those were largely down to my own low skills, but will post some benchmarks of the same 5-minute .oga file.
Okay, it’s not a 10x gain, but still.
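If you want to run a comparison like this yourself, a minimal timing sketch could look like the one below. The helper names (`time_call`, `speedup`, `realtime_factor`) are made up for illustration, and it only measures wall-clock time, which is a stand-in for (not the same as) whatever a provider actually bills.

```python
import time
from typing import Callable


def time_call(fn: Callable[[], object], repeats: int = 3) -> float:
    """Return the best wall-clock time in seconds over `repeats` runs of fn."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best


def speedup(baseline_s: float, optimized_s: float) -> float:
    """How many times faster the optimized run is than the baseline."""
    return baseline_s / optimized_s


def realtime_factor(audio_s: float, elapsed_s: float) -> float:
    """Seconds of audio transcribed per second of compute."""
    return audio_s / elapsed_s
```

Pass `time_call` any zero-argument callable that runs your transcription once, with and without the optimizations, and feed the two timings to `speedup`. Taking the best of several runs keeps one-off costs like model loading and warm-up out of the comparison.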
Links
A directory of free and out-of-copyright awesome books
Some databases by 80k Hours:
Reminiscences of 139 YC startups. TL;DR: AI, AIOps, Finance, Health, B2B, etc. Everyone’s incorporating this or that copilot-style project, i.e. “Use AI to insert word”. So it’s Hype -> Money -> Health -> Sales, in a vicious cycle.
Citation
@online{kogan2023,
author = {Kogan, Zakhar},
title = {Whisper to Me Like Lovers Do},
date = {2023-12-19},
url = {https://teleogenic.com/posts/231219-whisper/},
langid = {en}
}