A distraction-free, AI-powered article reader that intelligently reconstructs web content using Google Gemini.
CleanReader AI is not just a web scraper; it is an intelligent content reconstruction engine. By leveraging the Gemini 3 Pro model with Google Search Grounding, it works around clutter, aggressive ads, and client-side layout shifts to deliver a clean reading experience.
It relies on how modern sites are served (for example, CDN caching of static HTML) to reach content that might otherwise be obscured by complex client-side paywalls or popups, and reformats that content into clean, readable Markdown.
- AI-Powered Extraction: Uses `gemini-3-pro-preview` to reason about page structure and extract main content while ignoring boilerplate.
- Search Grounding: Automatically searches for cached versions or alternative sources if the direct URL is difficult to parse.
- Distraction-Free UI: Removes ads, sidebars, and popups.
- Theme Support:
  - ☀️ Light Mode: Standard paper-like reading.
  - ☕ Sepia Mode: Low-contrast, easy on the eyes.
  - 🌙 Dark Mode: High-contrast and OLED-friendly.
- Source Verification: Provides a list of "Sources & References" used by the AI to verify the content.
- Markdown Rendering: Clean typography using Inter and Merriweather fonts.
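The Source Verification feature above could be backed by a small helper that turns the grounding data returned with a response into a deduplicated "Sources & References" list. The following is a minimal sketch, not the project's actual code: the `GroundingChunk` shape (a `web` object with `uri` and `title`) and the names `SourceLink` and `collectSources` are assumptions made for illustration.

```typescript
// Assumed shape of one grounding chunk; only the fields this sketch reads.
interface GroundingChunk {
  web?: { uri?: string; title?: string };
}

// Hypothetical display type for one entry in "Sources & References".
interface SourceLink {
  title: string;
  url: string;
}

// Collect unique web sources, keeping the order in which they first appear.
function collectSources(chunks: GroundingChunk[]): SourceLink[] {
  const seen = new Set<string>();
  const sources: SourceLink[] = [];
  for (const chunk of chunks) {
    const url = chunk.web?.uri?.trim();
    // Skip chunks without a URL and URLs that were already listed.
    if (!url || seen.has(url)) continue;
    seen.add(url);
    // Fall back to the URL itself when no title is available.
    sources.push({ title: chunk.web?.title?.trim() || url, url });
  }
  return sources;
}
```

Deduplicating by URL keeps the list short when the model cites the same page several times.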
- Input: The user provides a URL.
- Analysis: The app sends a prompt to the Google Gemini API.
- Grounding:
  - Instead of a standard HTTP fetch (which can hit paywalls or be blocked by CORS), the AI uses the Google Search Tool.
  - It looks for the content in the live index and caches.
  - Architectural Insight: Many paywalls are implemented client-side or after HTML generation. CDNs often cache the full content. The AI is instructed to look for these "leaked" versions.
- Reconstruction: The model reconstructs the narrative into structured JSON containing the Title, Author, Summary, and Markdown Body.
- Rendering: The React frontend renders the sanitized Markdown in the user's selected theme.
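The Analysis and Reconstruction steps above can be sketched as a prompt builder plus a strict parser for the model's reply. This is an illustrative sketch under stated assumptions, not the app's real prompt or schema: the field names (`title`, `author`, `summary`, `markdownBody`), the prompt wording, and the helpers `buildPrompt` and `parseArticle` are all hypothetical. It also assumes the reply arrives as text that may or may not be wrapped in a Markdown code fence, so the parser tolerates both.

```typescript
// Hypothetical shape for the structured result described above.
interface Article {
  title: string;
  author: string;
  summary: string;
  markdownBody: string;
}

// Three backticks, built at runtime so this example can sit inside a fence.
const FENCE = "`".repeat(3);

// Build the instruction text sent to the model (wording is an assumption).
function buildPrompt(url: string): string {
  return [
    `Use search to find the full text of the article at ${url}.`,
    "Ignore ads, navigation, and other boilerplate.",
    "Reply with one JSON object containing the string fields:",
    "title, author, summary, markdownBody.",
  ].join("\n");
}

// Parse the model's text reply into an Article, failing loudly on bad input.
function parseArticle(responseText: string): Article {
  const trimmed = responseText.trim();
  // Unwrap the reply only if the whole text is a single fenced block.
  const wrapper = new RegExp(
    "^" + FENCE + "(?:json)?\\s*([\\s\\S]*?)\\s*" + FENCE + "$"
  );
  const match = trimmed.match(wrapper);
  const raw = match ? match[1] : trimmed;

  const data: unknown = JSON.parse(raw);
  if (typeof data !== "object" || data === null) {
    throw new Error("Expected a JSON object");
  }
  const record = data as Record<string, unknown>;
  const field = (name: string): string => {
    const value = record[name];
    if (typeof value !== "string") {
      throw new Error(`Missing string field: ${name}`);
    }
    return value;
  };
  return {
    title: field("title"),
    author: field("author"),
    summary: field("summary"),
    markdownBody: field("markdownBody"),
  };
}
```

Validating each field before rendering means a malformed reply surfaces as an error the UI can report instead of a half-empty article.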
- Frontend: React 19, TypeScript
- Styling: Tailwind CSS
- AI SDK: Google GenAI SDK (`@google/genai`)
- Model: `gemini-3-pro-preview`
- Icons: Lucide React
- Node.js (v18+)
- A Google Gemini API key (must have access to `gemini-3-pro-preview` and the paid tier for Search Grounding).
- Clone the repository:

      git clone https://github.com/yourusername/cleanreader-ai.git
      cd cleanreader-ai

- Install dependencies:

      npm install

- Environment setup: create a `.env` file in the root directory:

      API_KEY=your_google_gemini_api_key_here

- Run the application:

      npm start
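A missing or placeholder key is an easy setup mistake, so it can help to check `API_KEY` before the first request. The sketch below is one possible guard, assuming the environment is passed in as a plain object; the helper name `readApiKey` is hypothetical, and how the `.env` file is loaded into the environment depends on the build tooling, which is not covered here.

```typescript
// Placeholder value taken from the .env example in the setup steps.
const PLACEHOLDER_KEY = "your_google_gemini_api_key_here";

// Return the configured key, or throw a descriptive error if it is unusable.
function readApiKey(env: Record<string, string | undefined>): string {
  const key = env["API_KEY"]?.trim();
  if (!key || key === PLACEHOLDER_KEY) {
    throw new Error(
      "API_KEY is not set. Add it to the .env file in the project root."
    );
  }
  return key;
}
```

In a Node.js context this could be called as `readApiKey(process.env)`; taking the environment as a parameter keeps the function easy to test.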
This tool is intended for educational purposes and personal use to improve accessibility and reading experience. It demonstrates how Large Language Models (LLMs) can interact with web content. Users should respect the terms of service of the websites they visit and support content creators.
This project is licensed under the MIT License - see the LICENSE file for details.