RiverScript: how it was built

This is a presentation of a solo SaaS project developed from scratch over 8 months. RiverScript is a workspace for speech-to-text conversion powered by AI. It is designed for people who work with calls, YouTube videos, webinars, podcasts, and long-form content every day.

Go to RiverScript

About this project

An AI SaaS. Built solo in 8 months.

Built independently: no tasks, no meetings. Just day-to-day work.

Solo SaaS Project

8 Months Development

Launch: July 2026

Superpowers

What makes RiverScript special.

Heavy files without slowdowns

A desktop client built with Tauri + Rust handles files up to 50 GB. Files are never uploaded to the cloud in full, which saves time and bandwidth from the first run.

Live Recording Transcription

You can record and transcribe audio directly from any application or browser tab, using WASAPI on Windows and ScreenCaptureKit on macOS. YouTube, meetings, streams: everything is captured without extra steps.

Fully on-device Voice Activity Detection

Silero VAD + ONNX Runtime runs locally. Silence is removed before anything is sent to the server. This significantly reduces costs, bandwidth usage, AI hallucinations and transcription time.
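As a rough illustration of the idea, not RiverScript's actual pipeline: a VAD model such as Silero VAD produces a speech probability for each short audio frame, and the client keeps only the spans that look like speech. The sketch below assumes 32 ms frames and a 0.5 threshold; `Segment`, `speech_segments`, `kept_ms`, and the `max_gap_ms` setting are hypothetical names chosen for this example.

```rust
/// A span of detected speech, in milliseconds from the start of the audio.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Segment {
    pub start_ms: u64,
    pub end_ms: u64,
}

/// Turn per-frame speech probabilities into speech segments.
/// `frame_ms`, `threshold`, and `max_gap_ms` are assumed tuning values.
pub fn speech_segments(
    probs: &[f32],
    frame_ms: u64,
    threshold: f32,
    max_gap_ms: u64,
) -> Vec<Segment> {
    let mut segments: Vec<Segment> = Vec::new();
    for (i, &p) in probs.iter().enumerate() {
        if p < threshold {
            continue; // silent frame: it is simply not kept
        }
        let start = i as u64 * frame_ms;
        let end = start + frame_ms;
        if let Some(last) = segments.last_mut() {
            // A short pause since the previous speech frame: extend that segment.
            if start - last.end_ms <= max_gap_ms {
                last.end_ms = end;
                continue;
            }
        }
        segments.push(Segment { start_ms: start, end_ms: end });
    }
    segments
}

/// Total duration that would still be sent for transcription.
pub fn kept_ms(segments: &[Segment]) -> u64 {
    segments.iter().map(|s| s.end_ms - s.start_ms).sum()
}
```

Only the audio inside the returned segments would need to leave the device. Bridging short pauses with `max_gap_ms` is a design choice that keeps sentences from being cut mid-breath.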

Supports basically everything

Works with any audio or video format supported by FFmpeg (MP3, MP4, MKV, MOV, FLAC, OGG, WEBM, and 100+ more). If FFmpeg can read it, RiverScript can transcribe it.
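One common way to get this kind of format coverage is to let FFmpeg decode every input into a single uniform audio format before further processing. The sketch below assumes 16 kHz mono 16-bit PCM WAV as that target, which is the input Whisper-style models and Silero VAD work with. The function names and the choice of target are assumptions made for illustration, not a description of RiverScript's code.

```rust
use std::process::Command;

/// Build FFmpeg arguments that decode any readable input into
/// 16 kHz mono 16-bit PCM (an assumed normalization target).
pub fn ffmpeg_normalize_args(input: &str, output: &str) -> Vec<String> {
    [
        "-hide_banner",
        "-y",           // overwrite the output file if it exists
        "-i", input,
        "-vn",          // drop any video stream
        "-ac", "1",     // downmix to mono
        "-ar", "16000", // resample to 16 kHz
        "-c:a", "pcm_s16le",
        output,
    ]
    .iter()
    .map(|s| s.to_string())
    .collect()
}

/// Run FFmpeg with those arguments; requires `ffmpeg` on the PATH.
/// Returns Ok(true) when FFmpeg exits successfully.
pub fn normalize(input: &str, output: &str) -> std::io::Result<bool> {
    let status = Command::new("ffmpeg")
        .args(ffmpeg_normalize_args(input, output))
        .status()?;
    Ok(status.success())
}
```

Calling `normalize("talk.mkv", "talk.wav")` would then produce a WAV file regardless of the container or codec of the source, as long as FFmpeg can decode it.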

Tech stack: how this was built

Fully cross-platform application

One frontend, three platforms. Where possible, the logic is shared. Where necessary, native capabilities are used.

Web Application

Next.js

React

TypeScript

Tailwind CSS

Zustand

NextAuth

Desktop client · Windows

Tauri

Rust

React

FFmpeg

WASAPI

Silero VAD

ONNX Runtime

Desktop client · macOS

Tauri

Rust

React

FFmpeg

ScreenCaptureKit

Silero VAD

ONNX Runtime

Infrastructure

Self-hosted. Multi-server Architecture.

I didn’t want to depend on third-party cloud services like Vercel and Supabase, so I built my own self-hosted and reliable infrastructure using four Hetzner servers.

Web Server

Coolify · Next.js · Redis · Umami · Barman

Rust Worker

Audio & video processing · FFmpeg · VAD pipeline · Redis queue

DB Primary

PostgreSQL 16 · pgBouncer · PostgREST

DB Replica

PostgreSQL 16 · Streaming replication · Auto-failover · Floating IP

Tools & services

Coolify

Docker

GitHub

Redis

PostgreSQL

pgBouncer

PostgREST

Barman

AI Integrations

Multi-provider AI. Automatic fallback.

If one provider goes down, the system instantly switches to another.
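A minimal sketch of what such a fallback chain can look like, assuming each provider is tried in a fixed priority order and the first successful result wins. `Provider` and `transcribe_with_fallback` are hypothetical names; a real integration would make network calls and would likely add timeouts and retries.

```rust
/// A provider modelled as a name plus a function from audio bytes to text.
/// Real providers would call remote APIs; plain closures keep the sketch simple.
pub struct Provider {
    pub name: &'static str,
    pub transcribe: Box<dyn Fn(&[u8]) -> Result<String, String>>,
}

/// Try providers in priority order. On success, return the name of the
/// provider that answered together with its text. If every provider
/// fails, return all collected error messages.
pub fn transcribe_with_fallback(
    providers: &[Provider],
    audio: &[u8],
) -> Result<(String, String), Vec<String>> {
    let mut errors = Vec::new();
    for p in providers {
        match (p.transcribe)(audio) {
            Ok(text) => return Ok((p.name.to_string(), text)),
            // Record the failure and move on to the next provider.
            Err(e) => errors.push(format!("{}: {}", p.name, e)),
        }
    }
    Err(errors)
}
```

Returning the provider name alongside the text makes it easy to log which provider actually served a request, and the collected errors show why the earlier ones were skipped.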

Speech to Text

Self-hosted Whisper V3

Deepgram Nova-3

ElevenLabs Scribe v2

Text Processing · LLM

W&B Qwen3

OpenAI GPT-4o

DeepInfra Qwen3

Audio Processing

Silero VAD

ONNX Runtime

Monitoring & reliability

Full visibility into the system.

I can see what is happening across the system in real time.

Grafana Cloud

Sentry

PostHog

Umami

UptimeRobot

Third-party services

Integrations that make it run.

Polar

Resend

Cloudflare

Hetzner

RiverScript is live.

Built. Deployed. Running. The full product is live right now.

Web app dashboard · Download desktop client