maxxlife/cleanreader-ai

CleanReader AI 📖

A distraction-free, AI-powered article reader that intelligently reconstructs web content using Google Gemini.

🚀 Overview

CleanReader AI is more than a web scraper: it is a content reconstruction engine. Using the Gemini 3 Pro model with Google Search Grounding, it strips away clutter, aggressive ads, and client-side layout shifts to deliver a clean reading experience.

It relies on details of modern web architecture, such as CDN caching of static HTML, to reach content that might otherwise be hidden behind client-side paywalls or popups, and reformats it as clean, readable Markdown.

✨ Features

  • AI-Powered Extraction: Uses gemini-3-pro-preview to reason about page structure and extract main content while ignoring boilerplate.
  • Search Grounding: Automatically searches for cached versions or alternative sources if the direct URL is difficult to parse.
  • Distraction-Free UI: Removes ads, sidebars, and popups.
  • Theme Support:
    • ☀️ Light Mode: Standard paper-like reading.
    • Sepia Mode: Low-contrast, easy on the eyes.
    • 🌙 Dark Mode: High-contrast OLED friendly.
  • Source Verification: Provides a list of "Sources & References" used by the AI to verify the content.
  • Markdown Rendering: Clean typography using Inter and Merriweather fonts.
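
The "Sources & References" list could be derived from the grounding metadata that accompanies a grounded Gemini response. The sketch below is a minimal illustration, not the project's actual code: the `GroundingMetadata` interface declares only the subset of fields assumed here (`groundingChunks[].web.uri` and `.title`), and `SourceLink` and `extractSources` are hypothetical names.

```typescript
// Assumed shape: a minimal subset of the grounding metadata attached to a
// grounded Gemini response. Only the fields used below are declared.
interface GroundingChunk {
  web?: { uri?: string; title?: string };
}

interface GroundingMetadata {
  groundingChunks?: GroundingChunk[];
}

// Hypothetical view model for one entry in the "Sources & References" list.
interface SourceLink {
  url: string;
  title: string;
}

// Collect unique web sources, preserving the order in which they first appear.
function extractSources(metadata: GroundingMetadata | undefined): SourceLink[] {
  const seen = new Set<string>();
  const sources: SourceLink[] = [];
  for (const chunk of metadata?.groundingChunks ?? []) {
    const url = chunk.web?.uri;
    if (!url || seen.has(url)) continue; // skip non-web chunks and duplicates
    seen.add(url);
    // Fall back to the URL itself when the chunk carries no usable title.
    sources.push({ url, title: chunk.web?.title?.trim() || url });
  }
  return sources;
}
```

Deduplicating by URL keeps the list short when several grounding chunks point at the same page.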

🛠️ How It Works

  1. Input: The user provides a URL.
  2. Analysis: The app sends a prompt to the Google Gemini API.
  3. Grounding:
    • Instead of a standard HTTP fetch (which a browser typically blocks under CORS and which may return paywalled markup), the AI uses the Google Search tool.
    • It looks for the content in the live index and caches.
    • Architectural Insight: Many paywalls are implemented client-side or after HTML generation. CDNs often cache the full content. The AI is instructed to look for these "leaked" versions.
  4. Reconstruction: The model reconstructs the narrative into structured JSON containing the Title, Author, Summary, and Markdown Body.
  5. Rendering: The React frontend renders the sanitized Markdown in the user's selected theme.
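
Steps 2 and 4 can be sketched as a prompt builder and a reply parser. This is a hedged illustration under stated assumptions: the prompt wording, the field names in `Article` (`title`, `author`, `summary`, `markdownBody`), and the helper names `buildPrompt` and `parseArticle` are all hypothetical, since the README only says the JSON contains a title, author, summary, and Markdown body. The call to the Gemini API itself is left out.

```typescript
// Hypothetical field names for the structured JSON described in step 4.
interface Article {
  title: string;
  author: string;
  summary: string;
  markdownBody: string;
}

// Step 2: build the instruction sent to the model. The wording is illustrative.
function buildPrompt(url: string): string {
  return [
    `Find the full text of the article at ${url} using search.`,
    "Ignore ads, navigation, and other boilerplate.",
    'Reply with JSON only, using the keys "title", "author", "summary", "markdownBody".',
  ].join("\n");
}

// Step 4: parse the model's reply. Models sometimes wrap JSON in a Markdown
// code fence, so one outer fence is stripped if present (\u0060 is a backtick).
function parseArticle(reply: string): Article {
  const fenced = reply.match(/^\s*\u0060{3}(?:json)?\s*([\s\S]*?)\s*\u0060{3}\s*$/);
  const jsonText = (fenced?.[1] ?? reply).trim();
  const data: unknown = JSON.parse(jsonText); // throws on malformed JSON
  if (typeof data !== "object" || data === null) {
    throw new Error("Reply is not a JSON object");
  }
  const record = data as Record<string, unknown>;
  // Read one required string field, failing loudly if it is absent.
  const field = (key: keyof Article): string => {
    const value = record[key];
    if (typeof value !== "string") {
      throw new Error(`Missing or non-string field: ${key}`);
    }
    return value;
  };
  return {
    title: field("title"),
    author: field("author"),
    summary: field("summary"),
    markdownBody: field("markdownBody"),
  };
}
```

Validating every field before rendering means a malformed reply surfaces as an explicit error rather than a half-empty article view.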

💻 Tech Stack

  • Frontend: React 19, TypeScript
  • Styling: Tailwind CSS
  • AI SDK: Google GenAI SDK (@google/genai)
  • Model: gemini-3-pro-preview
  • Icons: Lucide React

🚀 Getting Started

Prerequisites

  • Node.js (v18+)
  • A Google Gemini API key with access to gemini-3-pro-preview and a paid tier for Search Grounding.

Installation

  1. Clone the repository

    git clone https://github.com/maxxlife/cleanreader-ai.git
    cd cleanreader-ai
  2. Install dependencies

    npm install
  3. Environment setup: create a .env file in the root directory

    API_KEY=your_google_gemini_api_key_here
  4. Run the application

    npm start
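
A small guard can catch a missing or placeholder key before the first request is made. The helper below is a hypothetical sketch: `requireApiKey` is not part of the project as described, and how the `.env` file is loaded into the environment depends on the build tooling, which this README does not specify.

```typescript
// Hypothetical helper: returns the trimmed key or fails fast with a clear message.
// The environment is passed in explicitly so the function stays easy to test.
function requireApiKey(env: Record<string, string | undefined>): string {
  const key = env["API_KEY"]?.trim();
  // Reject both an absent key and the placeholder value from the setup step.
  if (!key || key === "your_google_gemini_api_key_here") {
    throw new Error("API_KEY is missing; set it in .env before starting the app");
  }
  return key;
}
```

In a Node.js context this would typically be called as `requireApiKey(process.env)`.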

⚠️ Disclaimer

This tool is intended for educational purposes and personal use to improve accessibility and reading experience. It demonstrates how Large Language Models (LLMs) can interact with web content. Users should respect the terms of service of the websites they visit and support content creators.

📄 License

This project is licensed under the MIT License; see the LICENSE file for details.

About

Read online text articles that you want without a subscription
