berkaycubuk/content-reader

Content Reader

A lightweight CLI tool that converts web page content to audio using Kokoro TTS. Simply provide a URL, and the tool fetches the content, extracts readable text, and generates an MP3 audio file.

Features

  • Fetches and extracts readable content from web pages
  • Handles long articles by splitting the text into chunks of roughly 1500 characters
  • Uses Kokoro TTS API with af_heart voice
  • Combines multiple audio chunks into a single MP3 file
  • Generates timestamped filenames for easy organization
  • Comprehensive unit tests for all components

Prerequisites

  • Go 1.21 or higher
  • Kokoro TTS running locally at http://localhost:8880
  • ffmpeg (for combining audio chunks)

Installing ffmpeg

macOS:

brew install ffmpeg

Ubuntu/Debian:

sudo apt-get install ffmpeg

Windows: Download from ffmpeg.org

Installation

# Clone the repository
git clone <repository-url>
cd content-reader

# Install dependencies
go mod download

# Build the application
go build -o content-reader

Usage

./content-reader <url>

Examples

# Convert a blog post to audio
./content-reader "https://steipete.me/posts/just-talk-to-it"

# Convert a Wikipedia article
./content-reader "https://en.wikipedia.org/wiki/Go_(programming_language)"

# Convert another blog post
./content-reader "https://simonwillison.net/2026/Feb/7/claude-fast-mode/#atom-everything"

The tool will:

  1. Fetch the webpage content
  2. Extract readable text (removing navigation, ads, etc.)
  3. Split text into manageable chunks
  4. Generate speech for each chunk using Kokoro TTS
  5. Combine all chunks into a single MP3 file
  6. Save the output with a filename based on the URL and timestamp

Architecture

The project is organized into focused packages:

  • fetcher - Handles HTTP requests with proper headers
  • extractor - Extracts readable content from HTML using go-readability
  • chunker - Splits large text into optimal-sized chunks for TTS processing
  • tts - Communicates with Kokoro TTS API
  • main - CLI interface that orchestrates the workflow

Testing

Run the unit tests:

# Run all tests
go test ./... -v

# Run tests for a specific package
go test ./fetcher -v
go test ./extractor -v
go test ./chunker -v
go test ./tts -v

# Run tests with coverage
go test ./... -cover

The tests cover the core functionality of each package.

Configuration

The tool uses the following defaults:

  • Kokoro TTS URL: http://localhost:8880
  • Voice: af_heart
  • Chunk Size: 1500 characters
  • Output Format: MP3

These values are set in the source code; to change them, edit the source and rebuild.

Known Issues

  • Some heading titles may be missing in the generated audio

License

This project is licensed under the MIT License - see the LICENSE file for details.

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.
