PDF craft converts PDF files into various other formats. This project focuses on processing PDF files of scanned books.
Updated Dec 10, 2025 - Python
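天枢 (Tianshu) - Enterprise-grade one-stop AI data preprocessing platform | PDF/Office to Markdown | MCP protocol support for AI assistant integration | Vue3 + FastAPI full-stack solution | Document parsing | Multimodal information extraction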
A local, offline (after initial setup), portable OCR application that processes images and PDF files using DeepSeek-OCR, running directly on your machine.
Un-LOCC: Universal Lossy Optical Context Compression for Vision-Based Language Models. Achieve nearly 3x token compression at over 93% retrieval accuracy using existing Vision-Language Models.
🐊 Snappy's unique approach unifies vision-language late interaction with structured OCR for region-level knowledge retrieval. Like the project? Drop a star! ⭐
An out-of-the-box local Web UI for DeepSeek-OCR. Built with FastAPI + Vue.js, it supports PDF/Image uploads, progress tracking, and result visualization with bounding boxes. Easily experience the power of a top-tier OCR model.
Multi-tenant RAG API powered by LightRAG/RAG-Anything. Auto-selects the best parser (DeepSeek-OCR/MinerU/Docling) via complexity scoring.
A Windows-based screenshot OCR utility powered by DeepSeek-OCR. This tool allows users to quickly capture screen regions and perform high-accuracy Optical Character Recognition (OCR) directly on the captured image, leveraging the powerful DeepSeek-OCR model. It supports local model deployment and features real-time model output streaming.
A monorepo containing various utility scripts, tools, and applications for development, automation, and AI-powered tasks.
A Gradio-based demo application for comparing state-of-the-art OCR models: DeepSeek-OCR, Dots.OCR, HunyuanOCR, and Nanonets-OCR2-3B.
Host your own DeepSeek-OCR the easy way using Modal serverless compute
Self-hosting your own DeepSeek-OCR model on AWS
Here is a way to self-host the DeepSeek-OCR model on AWS without Bedrock. This lets you run OCR jobs at the scale you need without being limited by token costs.
A Gradio-powered web interface for performing advanced OCR tasks using the DeepSeek-OCR model. This experimental app leverages Hugging Face Transformers to process images for text extraction, document conversion, figure parsing, and object localization.
A high-performance, highly customizable reverse OCR tool that renders text or Hugging Face-compatible datasets to images. Dimensions, DPI, and CSS are configurable!
A modal.com deployment of DeepSeek-OCR as a FastAPI serverless app
AI-powered document ingestion and knowledge base API using DeepSeek-OCR.
This repository provides a fully containerized development setup for running DeepSeek-OCR with GPU acceleration.