<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Self-Hosted on steeman.be</title><link>https://www.steeman.be/categories/self-hosted/</link><description>Recent content in Self-Hosted on steeman.be</description><generator>Hugo -- gohugo.io</generator><language>en-us</language><lastBuildDate>Sat, 18 Apr 2026 14:00:00 +0200</lastBuildDate><atom:link href="https://www.steeman.be/categories/self-hosted/index.xml" rel="self" type="application/rss+xml"/><item><title>Local Whisper Transcription with Speaker Diarization: My GPU-Powered Docker Setup</title><link>https://www.steeman.be/posts/local-whisper-transcription-with-speaker-diarization/</link><pubDate>Sat, 18 Apr 2026 14:00:00 +0200</pubDate><guid>https://www.steeman.be/posts/local-whisper-transcription-with-speaker-diarization/</guid><description>&lt;p&gt;I wanted a transcription tool that runs entirely on my own hardware — no audio leaves the machine, no cloud APIs, no subscriptions. Something that handles any language (including tricky ones like Flemish dialect), produces speaker-labeled output, and can be tuned with domain-specific vocabulary for whatever context I&amp;rsquo;m transcribing.&lt;/p&gt;
&lt;p&gt;What I ended up with is a Docker container powered by an NVIDIA RTX 3090 that transcribes audio with Whisper, aligns every word to a precise timestamp, and identifies who said what — all in about two minutes for a 42-minute recording.&lt;/p&gt;</description></item></channel></rss>