LLaMA-Factory FP8 training environment for NVIDIA Hopper GPUs. Fixes common configuration issues that cause a 2x slowdown with FP8 mixed precision.
Updated Nov 16, 2025 - Python
GEN3C: Generative Novel 3D Captions - Adapted for NVIDIA Blackwell GPU architecture (sm_120). Includes automatic GPU detection, CPU-based T5 text encoding for Blackwell compatibility, and full backward compatibility with older GPUs.