Hierarchical RAG architecture scaling to 693K chunks on consumer hardware (4GB VRAM). Features 3-address routing, hybrid vector+graph fusion, and SetFit classification.
Updated Feb 11, 2026 - Python
Adaptive Hybrid Quantization Framework for deploying 7B+ LLMs on low-VRAM devices (e.g., GTX 1050). Features surgical block alignment and Numba-accelerated inference.
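The framework's "surgical block alignment" isn't documented here, but the core idea behind most low-VRAM quantization schemes is blockwise absmax quantization: split the weights into small blocks and store 4-bit codes plus one scale per block. A minimal NumPy sketch (not this repo's implementation; function names are illustrative):

```python
import numpy as np

def quantize_blockwise(weights, block_size=64):
    """Absmax 4-bit blockwise quantization: each block stores int4 codes
    in [-7, 7] plus one float32 scale (the block's max absolute value)."""
    flat = weights.ravel().astype(np.float32)
    pad = (-len(flat)) % block_size          # pad so blocks divide evenly
    flat = np.pad(flat, (0, pad))
    blocks = flat.reshape(-1, block_size)
    scales = np.abs(blocks).max(axis=1, keepdims=True)
    scales[scales == 0] = 1.0                # avoid divide-by-zero on all-zero blocks
    codes = np.round(blocks / scales * 7).astype(np.int8)  # fits in 4 bits
    return codes, scales

def dequantize_blockwise(codes, scales, shape):
    """Reconstruct float weights from codes and per-block scales."""
    blocks = codes.astype(np.float32) / 7 * scales
    return blocks.ravel()[: int(np.prod(shape))].reshape(shape)

w = np.random.randn(8, 8).astype(np.float32)
codes, scales = quantize_blockwise(w, block_size=16)
w_hat = dequantize_blockwise(codes, scales, w.shape)
```

Each block's worst-case error is bounded by its scale divided by 14, which is why smaller blocks trade a little extra scale storage for noticeably better reconstruction.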
One-click Windows installer for Z-Image Turbo AI image generation. Optimized for low-VRAM GPUs (4GB+). Features Gradio web UI, automatic setup, and GGUF model support.
Lightweight 6GB VRAM Gradio web app with auto-installer for running AuraFlow locally — no cloud, no clutter.
A ComfyUI workflow for low-VRAM users.
🎥 Generate high-quality videos on budget hardware with the Wan 2.2 14B Low-VRAM Workflow for ComfyUI, optimized for smooth performance and quick results.
Notebooks and workflows configured to run Wan 2.2 Animate inference smoothly with ComfyUI on Kaggle T4 GPUs.
A privacy-first Generative AI pipeline for prototyping 3D-style game assets on consumer hardware. Optimized for low-VRAM (4GB) GPUs using PyTorch, Diffusers, and Streamlit.
Technical Showcase: 22B True-MoE Engine running on 6GB VRAM (GTX 1060). Demonstrates "Surgical" NF4 quantization, dynamic expert swapping, and the custom "Grace Hopper" pipeline.
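The repo's "surgical" swap logic isn't shown on this page, but dynamic expert swapping in a Mixture-of-Experts model generally reduces to a budget-capped cache: keep only the experts the router recently used resident in VRAM and evict the least-recently-used one when the budget is hit. A toy Python sketch under that assumption (class and parameter names are illustrative):

```python
from collections import OrderedDict

class ExpertCache:
    """Keep at most `budget` experts 'resident' (e.g. on the GPU);
    evict the least-recently-used expert when the budget is exceeded."""
    def __init__(self, budget, load_fn):
        self.budget = budget
        self.load_fn = load_fn        # loads expert weights (e.g. from CPU RAM)
        self.resident = OrderedDict() # insertion order tracks recency

    def get(self, expert_id):
        if expert_id in self.resident:
            self.resident.move_to_end(expert_id)   # mark as recently used
        else:
            if len(self.resident) >= self.budget:
                self.resident.popitem(last=False)  # evict least-recently-used
            self.resident[expert_id] = self.load_fn(expert_id)
        return self.resident[expert_id]

cache = ExpertCache(budget=2, load_fn=lambda i: f"weights[{i}]")
for eid in [0, 1, 0, 2]:          # router's expert picks, token by token
    cache.get(eid)
print(list(cache.resident))       # [0, 2] — expert 1 was evicted
```

Because only a few experts fire per token, a small resident set plus fast host-to-device transfer is what lets a 22B MoE run in a 6GB budget.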
A production-ready, frugal, sovereign AI system that orchestrates India's open-source language models to achieve state-of-the-art reasoning on consumer hardware through Test-Time Compute (TTC) and Cognitive Serialization.
Audit local LLM function calling and agentic reliability. Visual tool-use benchmarking for quantized models on YOUR hardware.
🚀 Run modern 7B LLMs on legacy 4GB GPUs without crashes, breaking the VRAM barrier for developers facing GPU limitations.