Hi,
thanks for sharing the code. I have tried to use your repo with bitsandbytes for model quantization. Unfortunately, the training process does not work: the layers defined in modelling_llama.py as
```python
self.dropout = nn.Dropout(classifier_dropout)
self.classifier = nn.Linear(config.hidden_size, config.num_labels)
```
do not get trained, and after finetuning they contain only NaN values. I suspect it is a data type conflict, since the hidden layers are loaded in 4/8 bit while the classifier is still kept in memory as float16. Any clue/plan on how to fix that?
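For reference, here is a rough sketch of the kind of workaround I had in mind, not a tested fix: keep the non-quantized classification head in float32 and make sure it stays trainable while the quantized backbone is frozen. The checkpoint name, num_labels, and the head attribute names ("classifier" from your modelling_llama.py, "score" in stock transformers) are assumptions on my side.

```python
import torch
from transformers import AutoModelForSequenceClassification, BitsAndBytesConfig
from peft import prepare_model_for_kbit_training

# 4-bit bitsandbytes config; float16 compute dtype matches what I described above.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
)

model = AutoModelForSequenceClassification.from_pretrained(
    "meta-llama/Llama-2-7b-hf",  # placeholder checkpoint, not from your repo
    num_labels=2,                # placeholder
    quantization_config=bnb_config,
)

# Usual peft preparation step for k-bit training: freezes parameters, upcasts the
# small non-quantized modules to float32, enables gradient checkpointing.
model = prepare_model_for_kbit_training(model)

# Force the classification head itself to float32 and re-enable its gradients so
# it actually trains; "classifier" is the name in your modelling_llama.py,
# "score" is the name used by the stock transformers Llama classification model.
for name, module in model.named_modules():
    if name.split(".")[-1] in {"classifier", "score"}:
        module.to(torch.float32)
        for p in module.parameters():
            p.requires_grad = True
```

Does something along these lines match what you had planned, or is there a different fix on your roadmap?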