A Go tool that converts WebVTT (.vtt) subtitle files to a simplified text format, combining consecutive entries from the same speaker and removing metadata.
- Removes WEBVTT header and UUID metadata
- Extracts speaker names from voice tags
- Simplifies timestamps (HH:MM:SS format, no milliseconds)
- Combines consecutive entries from the same speaker
- Sorts cues by timestamp (handles overlapping times)
- Decodes HTML entities
- Supports stdin/stdout piping
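As a rough illustration of three of the features above (speaker extraction, entity decoding, and timestamp simplification), here is a minimal Go sketch. It is not taken from the vtt2txt source: the names `voiceTag`, `parseVoice`, and `simplifyTimestamp` are hypothetical, and truncating rather than rounding the milliseconds is an assumption.

```go
package main

import (
	"fmt"
	"html"
	"regexp"
	"strings"
)

// voiceTag matches a WebVTT voice span such as "<v John Doe>Hello</v>".
// An optional class annotation (<v.loud John>) is skipped, and the
// closing </v> is treated as optional.
// Hypothetical helper, not the tool's actual pattern.
var voiceTag = regexp.MustCompile(`(?s)^<v(?:\.[^\s>]*)?\s+([^>]+)>(.*?)(?:</v>)?$`)

// parseVoice splits a cue payload into speaker and text and decodes
// HTML entities in the text. Payloads without a voice tag yield an
// empty speaker.
func parseVoice(payload string) (speaker, text string) {
	payload = strings.TrimSpace(payload)
	if m := voiceTag.FindStringSubmatch(payload); m != nil {
		return strings.TrimSpace(m[1]), html.UnescapeString(strings.TrimSpace(m[2]))
	}
	return "", html.UnescapeString(payload)
}

// simplifyTimestamp drops the millisecond part of a WebVTT timestamp
// (assumption: truncation, not rounding) and pads the short MM:SS form
// to HH:MM:SS.
func simplifyTimestamp(ts string) string {
	if i := strings.IndexByte(ts, '.'); i >= 0 {
		ts = ts[:i]
	}
	if strings.Count(ts, ":") == 1 {
		ts = "00:" + ts
	}
	return ts
}

func main() {
	speaker, text := parseVoice("<v John Doe>Q&amp;A starts now</v>")
	fmt.Printf("%q %q\n", speaker, text)
	fmt.Println(simplifyTimestamp("00:00:10.500"))
}
```
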
# Download and install
sudo wget -O /usr/local/bin/vtt2txt https://github.com/gbm-dev/vtt2txt/releases/latest/download/vtt2txt-linux-amd64
sudo chmod +x /usr/local/bin/vtt2txt
# Use it
vtt2txt recording.vtt

Alternatively, download a binary from the Releases page, or build from source:
git clone https://github.com/gbm-dev/vtt2txt.git
cd vtt2txt
go build -o vtt2txt
sudo cp vtt2txt /usr/local/bin/

# Auto-generate output filename (creates recording.txt)
vtt2txt recording.vtt
# Specify output file
vtt2txt recording.vtt transcript.txt
# Pipe to another command
vtt2txt recording.vtt | less
# Show help
vtt2txt --help
# Show version
vtt2txt --version

Process multiple files at once using bash:
# Process all .vtt files in parallel
for f in *.vtt; do vtt2txt "$f" & done; wait
# Controlled parallelism (4 at a time)
printf '%s\0' *.vtt | xargs -0 -P 4 -I {} vtt2txt {}
# Using GNU parallel (if installed)
parallel vtt2txt ::: *.vtt

Each file automatically outputs to its corresponding .txt file (e.g., meeting.vtt → meeting.txt).
Speaker Name (HH:MM:SS - HH:MM:SS):
Combined text from all consecutive entries by this speaker.
Next Speaker (HH:MM:SS - HH:MM:SS):
Their combined text...
Input (WebVTT):
WEBVTT
uuid-identifier-123/0
00:00:10.500 --> 00:00:15.750
<v John Doe>Hello everyone, welcome to the meeting.</v>
uuid-identifier-123/1
00:00:15.750 --> 00:00:20.100
<v John Doe>Today we'll discuss the project timeline.</v>
Output (Text):
John Doe (00:00:10 - 00:00:20):
Hello everyone, welcome to the meeting. Today we'll discuss the project timeline.
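The sorting and merging step behind this example can be sketched in Go as follows. This is an illustrative reconstruction under stated assumptions, not the tool's actual code: the `Cue` type and the `mergeCues`, `clock`, and `render` functions are hypothetical, times are held as milliseconds, and a merged entry is assumed to keep the earliest start and the latest end.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Cue is a hypothetical parsed cue; Start and End are in milliseconds.
type Cue struct {
	Start, End    int
	Speaker, Text string
}

// mergeCues sorts cues by start time, then folds each cue into the
// previous one when both have the same speaker.
func mergeCues(cues []Cue) []Cue {
	sorted := append([]Cue(nil), cues...) // leave the input untouched
	sort.SliceStable(sorted, func(i, j int) bool {
		return sorted[i].Start < sorted[j].Start
	})
	var out []Cue
	for _, c := range sorted {
		if n := len(out); n > 0 && out[n-1].Speaker == c.Speaker {
			last := &out[n-1]
			last.Text += " " + c.Text
			if c.End > last.End { // overlapping cues may end earlier
				last.End = c.End
			}
			continue
		}
		out = append(out, c)
	}
	return out
}

// clock formats milliseconds as HH:MM:SS, dropping the fraction.
func clock(ms int) string {
	s := ms / 1000
	return fmt.Sprintf("%02d:%02d:%02d", s/3600, s/60%60, s%60)
}

// render writes each merged cue in the output format shown above.
func render(cues []Cue) string {
	var b strings.Builder
	for _, c := range cues {
		fmt.Fprintf(&b, "%s (%s - %s):\n%s\n", c.Speaker, clock(c.Start), clock(c.End), c.Text)
	}
	return b.String()
}

func main() {
	// The two cues from the example, deliberately out of order.
	cues := []Cue{
		{15750, 20100, "John Doe", "Today we'll discuss the project timeline."},
		{10500, 15750, "John Doe", "Hello everyone, welcome to the meeting."},
	}
	fmt.Print(render(mergeCues(cues)))
}
```
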
License: MIT