RiverScript: how it was built
This is a presentation of a solo SaaS project developed from scratch over 6 months. RiverScript is a workspace for speech-to-text conversion powered by AI. It is designed for people who work with calls, YouTube videos, webinars, podcasts, and long-form content every day.
About this project
An AI SaaS. Built solo in 6 months.
Built independently: no tasks, no meetings. Just deep focus and day-to-day work.
Superpowers
What makes RiverScript special.
Heavy files without slowdowns
A desktop client built with Tauri + Rust handles files up to 50 GB. Files are not uploaded to the cloud in full, which saves both time and bandwidth from the start.
Desktop Audio Capture
You can record and transcribe audio directly from any application or browser tab, using WASAPI on Windows and ScreenCaptureKit on macOS. YouTube, meetings, streams: everything is captured without extra steps.
Fully on-device Voice Activity Detection
Silero VAD runs locally on ONNX Runtime. Silence is removed before anything is sent to the server. This significantly reduces costs, bandwidth usage, AI hallucinations, and transcription time.
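The idea of dropping silence before upload can be sketched as follows. This is not the Silero model: a simple energy threshold stands in for the neural VAD so the example stays dependency-free, and the frame length, the threshold, and the function names are all assumptions made for illustration.

```rust
/// Frame length in samples: 32 ms at 16 kHz (an assumed value for this sketch).
const FRAME_LEN: usize = 512;

/// Root-mean-square energy of one frame of samples in [-1.0, 1.0].
fn rms(frame: &[f32]) -> f32 {
    if frame.is_empty() {
        return 0.0;
    }
    let sum_sq: f32 = frame.iter().map(|s| s * s).sum();
    (sum_sq / frame.len() as f32).sqrt()
}

/// Keeps only the frames whose energy reaches `threshold`.
/// A real VAD model would score each frame with a speech probability
/// instead of this plain energy check (hypothetical stand-in).
fn drop_silence(samples: &[f32], threshold: f32) -> Vec<f32> {
    samples
        .chunks(FRAME_LEN)
        .filter(|frame| rms(frame) >= threshold)
        .flatten()
        .copied()
        .collect()
}
```

Calling `drop_silence(&samples, 0.01)` on a buffer with silent stretches returns only the louder frames, so less audio has to leave the machine.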
Supports basically everything
Works with any audio or video format supported by FFmpeg (MP3, MP4, MKV, MOV, FLAC, OGG, WEBM, and 100+ more). If FFmpeg can read it, RiverScript can transcribe it.
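One way to get this kind of format coverage is to let FFmpeg decode every input into a single audio format before transcription. The sketch below only builds such a command. The 16 kHz mono WAV target is an assumption (a common input format for speech models), not something RiverScript documents, and `ffmpeg_to_wav` is a hypothetical helper name.

```rust
use std::path::Path;
use std::process::Command;

/// Builds (but does not run) an FFmpeg command that decodes any readable
/// input to 16 kHz mono WAV. The target format is an assumption.
fn ffmpeg_to_wav(input: &Path, output: &Path) -> Command {
    let mut cmd = Command::new("ffmpeg");
    cmd.arg("-y") // overwrite the output file if it already exists
        .arg("-i")
        .arg(input) // any container or codec FFmpeg can read
        .arg("-vn") // ignore video streams, keep audio only
        .args(["-ac", "1"]) // downmix to mono
        .args(["-ar", "16000"]) // resample to 16 kHz
        .args(["-f", "wav"]) // force WAV output
        .arg(output);
    cmd
}
```

Running the returned command with `.status()` requires an `ffmpeg` binary on the `PATH`. The same call works unchanged for an MKV, a FLAC, or a WEBM input, because FFmpeg detects the format itself.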
Tech stack: how this was built
Fully cross-platform application
One frontend, three platforms. Where possible, we split the logic. Where necessary, we use native capabilities.
Web Application
Desktop client · Windows
Desktop client · macOS
Infrastructure
Self-hosted. Multi-server Architecture.
I didn't want to depend on third-party cloud services like Vercel and Supabase, so I built my own reliable, self-hosted infrastructure on four Hetzner servers.
Web Server
Rust Worker
DB Primary
DB Replica
Tools & services
AI Integrations
Multi-provider AI. Automatic fallback.
If one provider goes down, the system automatically switches to another.
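A minimal sketch of provider fallback, assuming providers are tried in a fixed priority order. The `Transcriber` trait, the `Fake` provider, and the string-based errors are hypothetical names introduced for illustration. The real system may add health checks, timeouts, or retries that are not shown here.

```rust
/// A speech-to-text backend. The trait and its methods are hypothetical.
trait Transcriber {
    fn name(&self) -> &str;
    fn transcribe(&self, audio: &[u8]) -> Result<String, String>;
}

/// Tries each provider in order. Returns the name of the first provider
/// that succeeds together with its transcript, or every error collected
/// along the way if all providers fail.
fn transcribe_with_fallback(
    providers: &[Box<dyn Transcriber>],
    audio: &[u8],
) -> Result<(String, String), Vec<String>> {
    let mut errors = Vec::new();
    for provider in providers {
        match provider.transcribe(audio) {
            Ok(text) => return Ok((provider.name().to_string(), text)),
            Err(e) => errors.push(format!("{}: {}", provider.name(), e)),
        }
    }
    Err(errors)
}

/// Stand-in provider that is either up or down, for illustration only.
struct Fake {
    name: &'static str,
    up: bool,
}

impl Transcriber for Fake {
    fn name(&self) -> &str {
        self.name
    }

    fn transcribe(&self, _audio: &[u8]) -> Result<String, String> {
        if self.up {
            Ok(format!("transcript from {}", self.name))
        } else {
            Err("unavailable".to_string())
        }
    }
}
```

With a failing `Fake` listed first and a working one second, `transcribe_with_fallback` returns the second provider's transcript, and the caller never has to know the first attempt failed.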
Speech to Text
Text Processing · LLM
Audio Processing
Monitoring & reliability
Full transparency of everything happening.
I can see everything happening in the system in real time.
Third-party services
Integrations that make it run.
The pre-launch demo is live.
The landing page is locked until June 2026, but the app is already fully deployed, functional, and available on the free tier. You can start using it right away.