Welcome to the official code repository for "Quantization Meets dLLMs: A Systematic Study of Post-training Quantization for Diffusion LLMs".
Your star means a lot to us as we develop this project! ⭐⭐⭐
- [2025/10/15] 🔥 We release the code for quantizing dLLMs!
- [2025/08/20] 🚀 Our paper is available on arXiv!
- We present the first systematic study on quantizing diffusion-based large language models (dLLMs).
- This repository implements state-of-the-art post-training quantization (PTQ) methods for dLLMs, including GPTQ, AWQ, SmoothQuant, QuaRot, and DuQuant.
- We comprehensively investigate the impact of quantization on dLLMs across four key dimensions: bit-width, quantization method, task category, and model architecture (a minimal quantization sketch follows this list).
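For readers new to PTQ, here is a minimal sketch (not the repository's implementation; function and variable names are illustrative) of the round-to-nearest baseline that the methods above improve upon. Weight-only settings such as W4A16 quantize only the weights, weight-activation settings such as W8A8 also quantize activations, and `n_bits` corresponds to the bit-width axis of the study.

```python
import torch

def quantize_rtn(t: torch.Tensor, n_bits: int, per_channel: bool = True) -> torch.Tensor:
    """Fake-quantize `t` with asymmetric uniform (round-to-nearest) quantization."""
    if per_channel:  # one scale/zero-point per output row (typical for weights)
        t_min = t.amin(dim=-1, keepdim=True)
        t_max = t.amax(dim=-1, keepdim=True)
    else:            # one scale/zero-point for the whole tensor (typical for activations)
        t_min, t_max = t.min(), t.max()
    qmax = 2 ** n_bits - 1
    scale = (t_max - t_min).clamp(min=1e-8) / qmax
    zero_point = (-t_min / scale).round()
    q = (t / scale + zero_point).round().clamp(0, qmax)
    return (q - zero_point) * scale  # dequantize back to float ("fake quant")

# e.g. a W4A8 setting: 4-bit per-channel weights, 8-bit per-tensor activations
w, x = torch.randn(4096, 11008), torch.randn(16, 4096)
w_err = (w - quantize_rtn(w, n_bits=4, per_channel=True)).abs().mean()
x_err = (x - quantize_rtn(x, n_bits=8, per_channel=False)).abs().mean()
print(f"mean |dw| = {w_err:.4f}, mean |dx| = {x_err:.4f}")
```

GPTQ and AWQ refine the weight side of this baseline, while SmoothQuant, QuaRot, and DuQuant additionally tame activation outliers before quantization.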
```bash
# Create the environment and install dependencies
conda create -n qdlm python=3.10 -y
conda activate qdlm
git clone https://github.com/FelixMessi/QDLM
cd QDLM
pip install --upgrade qdlm
pip install -r requirements.txt
pip install math-verify==0.8.0 antlr4-python3-runtime==4.11.0 sympy==1.14.0

# Install the evaluation harness
cd ./lm-evaluation-harness && pip install -e .
```

To run evaluation for QuaRot, please download and install fast-hadamard-transform built against your CUDA version.
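As a quick, illustrative sanity check that the CUDA kernel is usable (assuming the package exposes `hadamard_transform(x, scale)` over the last dimension, as documented upstream; requires a CUDA GPU):

```python
import math
import torch
from fast_hadamard_transform import hadamard_transform  # QuaRot dependency

x = torch.randn(2, 4096, device="cuda", dtype=torch.float16)
scale = 1.0 / math.sqrt(x.shape[-1])  # makes the transform orthonormal
y = hadamard_transform(x, scale=scale)
# Applying the orthonormal transform twice should recover the input.
x_rec = hadamard_transform(y, scale=scale)
print(torch.allclose(x, x_rec, atol=1e-2))  # expect True
```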
Please refer to the scripts folder for running different weight-only quantization methods (AWQ, GPTQ) and weight–activation quantization methods (SmoothQuant, QuaRot, DuQuant).
Please download the LLaDA-Base/LLaDA-Instruct or Dream models and replace MODEL_PATH in the scripts with your local paths.
Detailed usage instructions are provided in the corresponding shell scripts; a quick model-loading check is sketched below.
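The snippet below is an illustrative check (not one of the provided scripts) that the directory you substitute for MODEL_PATH loads correctly. LLaDA and Dream ship custom modeling code, so `trust_remote_code=True` is required; the path shown is a placeholder.

```python
import torch
from transformers import AutoModel, AutoTokenizer

MODEL_PATH = "/path/to/LLaDA-8B-Instruct"  # placeholder: your local checkpoint directory
tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH, trust_remote_code=True)
model = AutoModel.from_pretrained(
    MODEL_PATH, trust_remote_code=True, torch_dtype=torch.bfloat16
).eval()
print(model.config)
```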
If you have further questions, please open an issue or contact haokun.lin@cripac.ia.ac.cn or xuhb2001@gmail.com.
Discussions and potential collaborations are also welcome.
This repo is built upon the following projects: AutoGPTQ, AWQ, QuaRot, DuQuant, and lm-eval.
We thank the authors for their code.
Please cite our work if you use our code or discuss our findings in your own research:
```bibtex
@article{lin2025quantization,
  title={Quantization Meets dLLMs: A Systematic Study of Post-training Quantization for Diffusion LLMs},
  author={Lin, Haokun and Xu, Haobo and Wu, Yichen and Guo, Ziyu and Zhang, Renrui and Lu, Zhichao and Wei, Ying and Zhang, Qingfu and Sun, Zhenan},
  journal={arXiv preprint arXiv:2508.14896},
  year={2025}
}
```

Explore our additional research on Post-training Quantization and Network Pruning:
- [DuQuant] DuQuant: Distributing Outliers via Dual Transformation Makes Stronger Quantized LLMs
- [IntactKV] IntactKV: Improving Large Language Model Quantization by Keeping Pivot Tokens Intact
- [LRQ-DiT] LRQ-DiT: Log-Rotation Post-Training Quantization of Diffusion Transformers for Image and Video Generation
- [DopQ-ViT] DopQ-ViT: Towards Distribution-Friendly and Outlier-Aware Post-Training Quantization for Vision Transformers
- [RIA] Plug-and-Play: An Efficient Post-training Pruning Method for Large Language Models
- [MoPE-CLIP] MoPE-CLIP: Structured Pruning for Efficient Vision-Language Models with Module-wise Pruning Error Metric