VTT to TXT Converter

A Go tool that converts WebVTT (.vtt) subtitle files to a simplified text format, merging consecutive entries from the same speaker and removing metadata.

Features

Removes WEBVTT header and UUID metadata
Extracts speaker names from voice tags
Simplifies timestamps (HH:MM:SS format, no milliseconds)
Combines consecutive entries from the same speaker
Sorts cues by timestamp (handles overlapping times)
Decodes HTML entities
Supports stdin/stdout piping
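
As a rough illustration of the per-cue steps listed above (voice tag extraction, HTML entity decoding, timestamp simplification), here is a minimal Go sketch. It is not taken from the vtt2txt source: the names parseVoice and simplifyTimestamp are hypothetical, and truncating (rather than rounding) the milliseconds is an assumption that happens to match the example in this README.

```go
package main

import (
	"fmt"
	"html"
	"regexp"
	"strings"
)

// voiceTag matches a WebVTT voice span such as <v John Doe>text</v>.
// The closing </v> is treated as optional.
var voiceTag = regexp.MustCompile(`^<v\s+([^>]+)>(.*?)(?:</v>)?$`)

// parseVoice (hypothetical helper) returns the speaker name and the cue
// text with HTML entities decoded. If the line has no voice tag, the
// speaker is empty and the whole line is returned as text.
func parseVoice(line string) (speaker, text string) {
	line = strings.TrimSpace(line)
	m := voiceTag.FindStringSubmatch(line)
	if m == nil {
		return "", html.UnescapeString(line)
	}
	return strings.TrimSpace(m[1]), html.UnescapeString(m[2])
}

// simplifyTimestamp (hypothetical helper) drops the fractional seconds,
// e.g. "00:00:10.500" becomes "00:00:10". WebVTT also permits MM:SS.mmm,
// so a missing hours field is padded with "00:".
func simplifyTimestamp(ts string) string {
	if i := strings.IndexByte(ts, '.'); i >= 0 {
		ts = ts[:i]
	}
	if strings.Count(ts, ":") == 1 {
		ts = "00:" + ts
	}
	return ts
}

func main() {
	speaker, text := parseVoice("<v John Doe>Q&amp;A starts now.</v>")
	fmt.Printf("%s: %s\n", speaker, text)
	fmt.Println(simplifyTimestamp("00:00:10.500"))
}
```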

Quick Start

# Download and install
sudo wget -O /usr/local/bin/vtt2txt https://github.com/gbm-dev/vtt2txt/releases/latest/download/vtt2txt-linux-amd64
sudo chmod +x /usr/local/bin/vtt2txt

# Use it
vtt2txt recording.vtt

Installation

Download Pre-built Binary

sudo wget -O /usr/local/bin/vtt2txt https://github.com/gbm-dev/vtt2txt/releases/latest/download/vtt2txt-linux-amd64
sudo chmod +x /usr/local/bin/vtt2txt

Or download from the Releases page.

Build from Source

git clone https://github.com/gbm-dev/vtt2txt.git
cd vtt2txt
go build -o vtt2txt
sudo cp vtt2txt /usr/local/bin/

Usage

# Auto-generate output filename (creates recording.txt)
vtt2txt recording.vtt

# Specify output file
vtt2txt recording.vtt transcript.txt

# Pipe to another command
vtt2txt recording.vtt | less

# Show help
vtt2txt --help

# Show version
vtt2txt --version

Batch Processing

Process multiple files at once using bash:

# Process all .vtt files in parallel
for f in *.vtt; do vtt2txt "$f" & done; wait

# Controlled parallelism (4 at a time); NUL-delimited so filenames with spaces or quotes are handled safely
printf '%s\0' *.vtt | xargs -0 -P 4 -I {} vtt2txt {}

# Using GNU parallel (if installed)
parallel vtt2txt ::: *.vtt

Each file automatically outputs to its corresponding .txt file (e.g., meeting.vtt → meeting.txt).
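
The filename mapping described above can be sketched in a few lines of Go. This is an illustration only, not the tool's actual code; outputPath is a hypothetical name, and the sketch assumes the extension is simply swapped for .txt.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// outputPath (hypothetical helper) replaces the input file's extension
// with ".txt", keeping the directory and base name unchanged.
func outputPath(in string) string {
	return strings.TrimSuffix(in, filepath.Ext(in)) + ".txt"
}

func main() {
	fmt.Println(outputPath("meeting.vtt"))
	fmt.Println(outputPath("notes/team.sync.vtt"))
}
```

Only the final extension is replaced, so a name such as team.sync.vtt keeps its inner dot.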

Output Format

Speaker Name (HH:MM:SS - HH:MM:SS):
Combined text from all consecutive entries by this speaker.

Next Speaker (HH:MM:SS - HH:MM:SS):
Their combined text...

Example

Input (WebVTT):

WEBVTT

uuid-identifier-123/0
00:00:10.500 --> 00:00:15.750
<v John Doe>Hello everyone, welcome to the meeting.</v>

uuid-identifier-123/1
00:00:15.750 --> 00:00:20.100
<v John Doe>Today we'll discuss the project timeline.</v>

Output (Text):

John Doe (00:00:10 - 00:00:20):
Hello everyone, welcome to the meeting. Today we'll discuss the project timeline.

License

MIT
