OpenSub

A professional desktop subtitle editor with AI-powered transcription. Create, edit, and style subtitles with word-level timing precision — Descript-style editing for everyone.


Built with AI — This entire application was developed using Claude Code, Anthropic's AI coding assistant. From architecture decisions to implementation details, every line of code was written collaboratively with AI.

Standalone App — No Python or FFmpeg Installation Required


What is OpenSub?

OpenSub is a macOS desktop application that uses AI-powered speech recognition to automatically generate subtitles from video files. The app uses WhisperX (based on OpenAI's Whisper) and is optimized for Apple Silicon Macs (M1/M2/M3/M4).

The Workflow

  1. Import Video — Drag and drop any video file into the app
  2. Automatic Transcription — WhisperX transcribes speech with precise word-level timing
  3. Edit Subtitles — Correct text, adjust timing, and customize styling
  4. Export — Save as subtitle file (ASS/SRT) or burn directly into video

Features

AI-Powered Transcription

  • WhisperX Integration — State-of-the-art speech recognition based on OpenAI's Whisper
  • Word-Level Timing — Precise timestamps for every single word (karaoke-ready)
  • Speaker Diarization — Automatic identification of different speakers (via pyannote.audio)
  • Multilingual Support — Over 99 languages supported
  • GPU Accelerated — Optimized for Apple Silicon with Metal Performance Shaders (MPS)

Professional Subtitle Editing

  • Visual Timeline — Waveform display for precise timing adjustments
  • Drag & Drop — Simply drag a video to get started
  • Real-Time Preview — Subtitles are rendered live on the video
  • Segment Editing — Split, merge, and adjust subtitle segments

Advanced Styling

  • Typography Control — Full control over fonts, sizes, and colors
  • Karaoke Animation — Word-by-word highlighting synchronized with speech
  • Multiple Animations — Karaoke, fade-in, appear, scale effects
  • Free Positioning — Place subtitles anywhere with magnetic guides
  • Text Effects — Outlines, shadows, and backgrounds for better readability

Export Options

  • ASS Format — Advanced SubStation Alpha with full styling support (see the karaoke sketch below)
  • SRT Format — Universal compatibility with all players
  • Burned-In Subtitles — Render subtitles directly into the video via FFmpeg
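
As an illustration of how the word-level timestamps feed the ASS export, the sketch below builds an ASS karaoke line with \k tags (durations are in centiseconds). This is a minimal sketch: the Word shape and helper name are illustrative, not OpenSub's actual API.

// Minimal sketch: converting word-level timestamps into an ASS karaoke line.
// The Word shape and helper name are illustrative, not OpenSub's actual API.
interface Word {
  text: string
  start: number // seconds
  end: number   // seconds
}

// ASS \k durations are in centiseconds; each word highlights for its duration.
function wordsToKaraokeLine(words: Word[]): string {
  return words
    .map((w) => `{\\k${Math.round((w.end - w.start) * 100)}}${w.text}`)
    .join(' ')
}

// Produces "{\k50}Hello {\k30}world": 0.5 s on "Hello", 0.3 s on "world"
wordsToKaraokeLine([
  { text: 'Hello', start: 0.0, end: 0.5 },
  { text: 'world', start: 0.5, end: 0.8 },
])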

System Requirements

Supported Systems

| Component | Minimum | Recommended |
| --- | --- | --- |
| Operating System | macOS 12 (Monterey) | macOS 13+ (Ventura/Sonoma/Sequoia) |
| Processor | Intel Mac | Apple Silicon (M1/M2/M3/M4) |
| Memory | 8 GB RAM | 16 GB RAM or more |
| Disk Space | 10 GB free | 20 GB free (for AI models) |

Note: Intel Macs are supported, but transcription runs on CPU only and is significantly slower. Apple Silicon Macs use GPU acceleration for up to 3x faster processing.


Installation

Standalone App (Recommended)

OpenSub is distributed as a fully standalone application — no external dependencies required. Python, FFmpeg, PyTorch, and all AI models are bundled directly into the app.

  1. Download the latest release from GitHub Releases
  2. Open the .dmg file
  3. Drag OpenSub to your Applications folder
  4. Launch OpenSub from Applications

⚠️ Important - macOS Security Warning

Since the app is not code-signed, macOS will block it with a "damaged" or "unverified developer" warning. To fix this, open Terminal and run:

xattr -cr /Applications/OpenSub.app

Then launch the app normally. This only needs to be done once after installation.

What's Included

The standalone build bundles everything you need:

| Component | Version | Purpose |
| --- | --- | --- |
| Python | 3.12 | Runtime for AI services |
| PyTorch | 2.x | Machine learning framework |
| WhisperX | Latest | Speech recognition engine |
| FFmpeg | 6.x | Video/audio processing |
| AI Models | Downloaded on first use | Whisper model files |

First Transcription

When you run your first transcription, OpenSub will automatically download the required AI models (~1-3 GB depending on model size). This is a one-time download stored in the app's data directory.

Model sizes:

  • tiny — ~75 MB (fastest, lower accuracy)
  • base — ~150 MB
  • small — ~500 MB
  • medium — ~1.5 GB
  • large-v3 — ~3 GB (most accurate, recommended)

Development Setup

Want to contribute or build from source? Follow these steps to set up the development environment.

Prerequisites

  • macOS 12 (Monterey) or later
  • Node.js 18+ and npm
  • Git
  • Xcode Command Line Tools (xcode-select --install)

Clone and Install

# Clone the repository
git clone https://github.com/ibimspumo/OpenSub.git
cd OpenSub

# Install Node.js dependencies
npm install

Development Mode

For rapid development with hot-reload:

npm run dev

This starts Electron in development mode with:

  • Hot module replacement for the renderer process
  • Automatic restart on main process changes
  • Source maps for debugging

Building for Production

Quick Build (Development Testing)

# Build without setting up the bundled Python environment
npm run dist:unsigned

Full Standalone Build

To create a distributable .dmg with all dependencies bundled:

# Download portable Python and set up the AI environment
npm run setup:python

# Build the complete standalone app
npm run dist

The setup:python script:

  1. Downloads a standalone Python 3.12 build (~50 MB)
  2. Creates a virtual environment in build-resources/python-env
  3. Installs WhisperX, PyTorch, MLX, and all AI dependencies (~2-3 GB)

Available Scripts

| Script | Description |
| --- | --- |
| npm run dev | Start development server with hot-reload |
| npm run build | Build the app without packaging |
| npm run build:mac | Build and package for macOS |
| npm run dist | Full standalone build (with Python) |
| npm run dist:unsigned | Build without Python bundling (for testing) |
| npm run setup:python | Download Python and install AI dependencies |
| npm run typecheck | Run TypeScript type checking |
| npm run lint | Run ESLint |
| npm run test | Run Playwright tests |
| npm run test:ui | Run tests with Playwright UI |

Project Structure

OpenSub/
├── src/
│   ├── main/           # Electron main process
│   │   ├── services/   # Core services (FFmpeg, Python, etc.)
│   │   └── ipc/        # IPC handlers for renderer communication
│   ├── renderer/       # React frontend
│   │   ├── components/ # UI components
│   │   ├── store/      # Zustand state management
│   │   └── i18n/       # Internationalization
│   ├── preload/        # Electron preload scripts
│   └── shared/         # Shared types and utilities
├── python-service/     # Python AI service
│   ├── whisper_service/# WhisperX transcription
│   └── requirements.txt
├── scripts/            # Build and setup scripts
├── resources/          # App icons and assets
└── build-resources/    # Python environment (generated)

Python Environment Details

The AI transcription runs as a separate Python process managed by Electron. For development:

  • Local Python: You can use your system Python with a virtual environment
  • Bundled Python: The npm run setup:python script creates a portable Python installation

To set up a local development environment with your system Python:

cd python-service
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt

Note: The bundled Python is required for distribution builds but not for development.


Architecture

OpenSub follows a multi-process architecture with clear separation of concerns:

┌─────────────────────────────────────────────────────────────────┐
│                         Electron App                             │
├─────────────────┬─────────────────┬─────────────────────────────┤
│   Main Process  │  Renderer (UI)  │     Python Service          │
│   (Node.js)     │  (React)        │     (WhisperX-MLX)          │
├─────────────────┼─────────────────┼─────────────────────────────┤
│ • IPC handlers  │ • React 18      │ • WhisperX transcription    │
│ • FFmpeg        │ • Zustand state │ • MLX Metal backend         │
│ • File I/O      │ • Tailwind CSS  │ • Word-level alignment      │
│ • SQLite DB     │ • Video player  │ • JSON-RPC communication    │
└─────────────────┴─────────────────┴─────────────────────────────┘

Process Architecture

Main Process (src/main/)

The Electron main process handles all system-level operations:

| Service | File | Purpose |
| --- | --- | --- |
| WhisperService | services/WhisperService.ts | Spawns and manages the Python transcription process |
| FFmpegService | services/FFmpegService.ts | Video/audio processing, export with burned-in subtitles |
| ProjectDatabase | services/ProjectDatabase.ts | SQLite persistence for projects and settings |
| OpenRouterService | services/OpenRouterService.ts | AI-powered subtitle correction via Gemini API |
| ModelManager | services/ModelManager.ts | Whisper model download and selection |
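
The Python process is spawned from the main process and managed over stdio. A minimal sketch of that pattern follows; the binary path, module name, and message shape are assumptions, not the actual WhisperService code:

// Sketch of the spawn-and-stream pattern behind WhisperService.
// The Python path, module name, and message shape are assumptions.
import { spawn } from 'node:child_process'

const python = spawn('/path/to/bundled/python3', ['-m', 'whisper_service'], {
  stdio: ['pipe', 'pipe', 'pipe'],
})

// One JSON message per line on stdin...
python.stdin.write(
  JSON.stringify({ method: 'transcribe', params: { audio: 'clip.wav' } }) + '\n'
)

// ...and newline-delimited JSON replies on stdout.
python.stdout.on('data', (chunk: Buffer) => {
  for (const line of chunk.toString().split('\n').filter(Boolean)) {
    console.log('python:', JSON.parse(line))
  }
})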

Renderer Process (src/renderer/)

The UI is built with React 18 and uses Zustand for state management; a minimal store sketch follows the table below:

| Store | Purpose |
| --- | --- |
| projectStore | Current project, subtitles, video metadata |
| styleProfileStore | Saved style presets, import/export |
| uiStore | UI state, dialogs, loading indicators |
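
As a concrete illustration of the store pattern, here is a minimal Zustand slice in the spirit of projectStore; field and action names are illustrative, not the actual store shape:

// Minimal Zustand store sketch in the spirit of projectStore.
// Field and action names are illustrative, not OpenSub's actual shape.
import { create } from 'zustand'

interface Subtitle {
  id: string
  start: number
  end: number
  text: string
}

interface ProjectState {
  videoPath: string | null
  subtitles: Subtitle[]
  setVideo: (path: string) => void
  updateSubtitleText: (id: string, text: string) => void
}

export const useProjectStore = create<ProjectState>((set) => ({
  videoPath: null,
  subtitles: [],
  setVideo: (path) => set({ videoPath: path }),
  updateSubtitleText: (id, text) =>
    set((state) => ({
      subtitles: state.subtitles.map((s) => (s.id === id ? { ...s, text } : s)),
    })),
}))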

Key components:

  • VideoPlayer — HTML5 video with custom controls and subtitle overlay
  • Timeline — Waveform display with draggable subtitle segments
  • SubtitleEditor — Text editing with word-level timing
  • StylePanel — Typography, colors, animations, karaoke effects

Python Service (python-service/)

The AI transcription runs in a separate Python process:

Electron Main ←──JSON-RPC over stdio──→ Python Service
                                              │
                                              ├── WhisperX-MLX (transcription)
                                              ├── PyTorch (model runtime)
                                              └── MLX (Metal GPU acceleration)

  • Communication: JSON-RPC 2.0 over stdin/stdout (sketched below)
  • Model: WhisperX with MLX backend for Apple Silicon
  • Alignment: Word-level timestamps via forced alignment
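
A round trip over that channel looks roughly like the following; method names and parameter fields are assumptions about the protocol, not its documented shape:

// Illustrative JSON-RPC 2.0 messages exchanged over stdin/stdout.
// Method and field names are assumptions, not the documented protocol.

// Electron → Python: request, one JSON object per line on stdin
const request = {
  jsonrpc: '2.0',
  id: 1,
  method: 'transcribe',
  params: { audioPath: '/tmp/audio.wav', model: 'large-v3' },
}

// Python → Electron: response on stdout, carrying word-level timestamps
const response = {
  jsonrpc: '2.0',
  id: 1,
  result: {
    segments: [
      {
        start: 0.0,
        end: 1.0,
        text: 'Hello world',
        words: [
          { word: 'Hello', start: 0.0, end: 0.5 },
          { word: 'world', start: 0.6, end: 1.0 },
        ],
      },
    ],
  },
}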

IPC Communication

The main and renderer processes communicate via Electron IPC:

// Renderer → Main (invoke)
const result = await window.api.whisper.transcribe(audioPath, options)

// Main → Renderer (events)
mainWindow.webContents.send('whisper:progress', { percent: 50, stage: 'transcribing' })

IPC channels are defined in src/shared/types.ts (IPC_CHANNELS).
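
The window.api surface used above is exposed from the preload layer. A minimal sketch of that bridge, where the channel names follow the examples above and everything else is illustrative:

// Preload sketch exposing a typed IPC surface to the renderer.
// Channel names follow the examples above; option shapes are illustrative.
import { contextBridge, ipcRenderer } from 'electron'

contextBridge.exposeInMainWorld('api', {
  whisper: {
    transcribe: (audioPath: string, options?: object) =>
      ipcRenderer.invoke('whisper:transcribe', audioPath, options),
    onProgress: (cb: (p: { percent: number; stage: string }) => void) =>
      ipcRenderer.on('whisper:progress', (_event, payload) => cb(payload)),
  },
})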

Data Flow: Transcription

  1. User drops video → Renderer sends ffmpeg:extract-audio
  2. Audio extracted → Renderer sends whisper:transcribe
  3. Main process → Spawns/reuses Python service
  4. Python service → Runs WhisperX with MLX backend
  5. Progress events → Streamed back via whisper:progress
  6. Result → Word-level timestamps stored in project state
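
Put together on the renderer side, the flow might look like this sketch; window.api.ffmpeg.extractAudio is a hypothetical name, while whisper.transcribe matches the IPC example above:

// Renderer-side sketch of the transcription flow.
// window.api.ffmpeg.extractAudio is a hypothetical name.
async function transcribeVideo(videoPath: string) {
  const api = (window as any).api

  // Steps 1-2: extract the audio track from the dropped video
  const audioPath: string = await api.ffmpeg.extractAudio(videoPath)

  // Step 5: progress events stream back while transcription runs
  api.whisper.onProgress(({ percent, stage }: { percent: number; stage: string }) =>
    console.log(`${stage}: ${percent}%`)
  )

  // Steps 3-6: the main process runs WhisperX and returns word-level timestamps
  const result = await api.whisper.transcribe(audioPath, { model: 'large-v3' })
  return result.segments
}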

Data Flow: Export

  1. User clicks Export → Opens export dialog
  2. Subtitle frames rendered → Canvas API generates PNG frames
  3. Frames saved → Main process writes to temp directory
  4. FFmpeg overlay → Frames composited onto video
  5. Output → MP4 with burned-in subtitles
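
Under that model, the compositing step can be expressed with fluent-ffmpeg roughly as follows; filenames, frame rate, and filter wiring are illustrative, not OpenSub's actual pipeline:

// Sketch of the FFmpeg overlay step with fluent-ffmpeg.
// Filenames, frame rate, and filter wiring are illustrative.
import ffmpeg from 'fluent-ffmpeg'

ffmpeg('/path/to/source.mp4')
  // Transparent PNG frames rendered by the Canvas step
  .input('/tmp/subtitle-frames/frame_%05d.png')
  .inputOptions(['-framerate 30'])
  // Composite the subtitle frames over the video track
  .complexFilter(['[0:v][1:v] overlay=0:0 [out]'], 'out')
  // Keep the original audio track unchanged
  .outputOptions(['-map 0:a?', '-c:a copy'])
  .save('/path/to/output.mp4')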

Tech Stack

| Layer | Technology | Purpose |
| --- | --- | --- |
| Frontend | React 18, TypeScript, Zustand | UI components and state |
| Styling | Tailwind CSS, Radix UI | Design system |
| Desktop | Electron 28 | Native app container |
| Build | electron-vite, Vite | Fast bundling |
| Video | FFmpeg, fluent-ffmpeg | Media processing |
| AI/ML | WhisperX, PyTorch | Speech recognition |
| Database | better-sqlite3 | Project persistence |

Keyboard Shortcuts

| Key | Function |
| --- | --- |
| Space | Play/Pause |
| ← / → | Skip 5 seconds back/forward |
| Cmd + O | Open video |
| Cmd + S | Save project |
| Enter | Confirm subtitle edit |
| Escape | Cancel edit |

Supported Formats

Video Import: MP4, MOV, AVI, MKV, WebM, M4V

Export:

  • ASS — Full style support, recommended for media players
  • SRT — Universal compatibility
  • MP4 — Video with burned-in subtitles

Contributing

Contributions are welcome! Whether you're fixing bugs, adding features, improving documentation, or translating the UI — your help is appreciated.

How to Contribute

  1. Fork the repository
  2. Create a branch for your feature or fix: git checkout -b feature/your-feature-name
  3. Make your changes following the code style guidelines below
  4. Test your changes: npm run typecheck && npm run lint
  5. Commit with a clear message describing what you changed
  6. Push to your fork and open a Pull Request

Code Style

  • TypeScript — Use strict typing, avoid any when possible
  • React — Functional components with hooks, no class components
  • Naming — camelCase for variables/functions, PascalCase for components/types
  • Formatting — Run npm run lint before committing

Good First Issues

New to the project? Look for issues labeled good first issue — these are smaller tasks that are great for getting familiar with the codebase.

Reporting Bugs

Found a bug? Please open an issue with:

  • A clear description of the problem
  • Steps to reproduce
  • Expected vs. actual behavior
  • Your macOS version and hardware (Intel/Apple Silicon)

Feature Requests

Have an idea? Open an issue with the enhancement label. Describe the feature and why it would be useful.


License

This project is licensed under the MIT License — see the LICENSE file for details.

What This Means

  • Free to use — Use OpenSub for personal or commercial projects
  • Free to modify — Fork it, customize it, make it your own
  • Free to distribute — Share your modified versions
  • ⚠️ No warranty — Provided "as is" without guarantees

Third-Party Licenses

OpenSub uses several open-source projects:

| Project | License | Purpose |
| --- | --- | --- |
| Electron | MIT | Desktop app framework |
| React | MIT | UI library |
| WhisperX | BSD-2-Clause | Speech recognition |
| PyTorch | BSD-3-Clause | Machine learning |
| FFmpeg | LGPL/GPL | Media processing |



Built with Electron, React, and WhisperX
Optimized for Apple Silicon
