How to quantize both the vision encoder and the LLM together? #1998

I am trying to quantize the dots_ocr model, whose submodules are `['model', 'vision_tower', 'lm_head']`, as shown below:

<img width="398" height="82" alt="Image" src="https://github.com/user-attachments/assets/2e416a7b-bead-4a2b-bd02-4289d0d7c5d8" />

Tracing through the code, I found that this [code](https://github.com/ModelCloud/GPTQModel/blob/main/gptqmodel/looper/module_looper.py#L772) parses the modules to quantize, but it only returns one module. In my experiments, I can quantize either `vision_tower` or `model` separately and load the result with vLLM, but I can't quantize both because of this mechanism. Could you help me figure out how to solve this? Thanks ^_^
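To make the request concrete, here is a minimal sketch of the behavior I am after: selecting layers under more than one root submodule instead of a single one. It is plain Python and does not use GPTQModel's API. The helper `group_by_root` and the layer names are hypothetical; in practice the names would come from something like PyTorch's `model.named_modules()`.

```python
from collections import defaultdict


def group_by_root(layer_names, roots):
    """Group dotted layer names by their top-level submodule.

    Only names whose first path component is in `roots` are kept.
    """
    groups = defaultdict(list)
    for name in layer_names:
        root = name.split(".", 1)[0]
        if root in roots:
            groups[root].append(name)
    return dict(groups)


# Hypothetical layer names, for illustration only (not the real dots_ocr layout).
layer_names = [
    "vision_tower.blocks.0.attn.qkv",
    "vision_tower.blocks.0.mlp.fc1",
    "model.layers.0.self_attn.q_proj",
    "model.layers.0.mlp.gate_proj",
    "lm_head",
]

# Ask for both roots at once; lm_head is left out on purpose.
targets = group_by_root(layer_names, roots={"vision_tower", "model"})
for root, names in targets.items():
    print(root, names)
```

Ideally the module parser would return something like `targets` (layers from both `vision_tower` and `model`) rather than the layers of one module only.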

