Thank you for your interest in our work! This repository contains the original implementation of "AIMMerging: Adaptive Iterative Model Merging Using Training Trajectories for Language Model Continual Learning".
Reproducing the results from our paper is straightforward: follow the steps outlined below.
conda create -n AimMerging python=3.8
conda activate AimMerging
pip install -r requirements.txt
Important: Please ensure the following package versions:
transformers==4.28.1
peft==0.4.0
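As a quick sanity check, the pinned versions can be verified from inside the activated environment. The snippet below is a minimal sketch, not part of this repository: `find_mismatches` and `REQUIRED` are hypothetical names, and it relies only on the standard-library `importlib.metadata` module.

```python
from importlib import metadata

# Pins taken from the note above.
REQUIRED = {"transformers": "4.28.1", "peft": "0.4.0"}


def find_mismatches(required, get_version=metadata.version):
    """Return {package: (expected, found)} for every pin that is not satisfied."""
    mismatches = {}
    for name, expected in required.items():
        try:
            found = get_version(name)
        except metadata.PackageNotFoundError:
            found = None  # package is not installed at all
        if found != expected:
            mismatches[name] = (expected, found)
    return mismatches


if __name__ == "__main__":
    problems = find_mismatches(REQUIRED)
    for name, (expected, found) in problems.items():
        print(f"{name}: expected {expected}, found {found or 'not installed'}")
    if not problems:
        print("All pinned versions match.")
```

If a mismatch is reported, reinstalling the package at the pinned version should resolve it.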
Then, replace the corresponding files in the transformers package (typically located at anaconda_path/envs/AimMerging/lib/python3.8/site-packages/transformers/) with the modified versions of trainer.py and training_args.py.
These modifications are required to support our Adaptive Iterative Model Merging (AIMMerging) framework.
Detailed comments are included in the modified files to help you understand the changes.
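The file replacement described above can also be scripted. The following is a rough sketch under stated assumptions: `installed_package_dir` and `replace_files` are hypothetical helpers that are not shipped with this repository, and the directory holding the modified files must be supplied by you. The sketch keeps a `.orig` backup of each stock file so the change can be undone.

```python
import importlib.util
import shutil
from pathlib import Path

# File names come from the instructions above.
MODIFIED_FILES = ("trainer.py", "training_args.py")


def installed_package_dir(package="transformers"):
    """Return the directory of an installed package without importing it."""
    spec = importlib.util.find_spec(package)
    if spec is None or spec.origin is None:
        raise ModuleNotFoundError(f"{package} is not installed in this environment")
    return Path(spec.origin).parent


def replace_files(source_dir, package_dir, names=MODIFIED_FILES):
    """Copy each modified file over the installed one, keeping a .orig backup."""
    source_dir, package_dir = Path(source_dir), Path(package_dir)
    replaced = []
    for name in names:
        src, dst = source_dir / name, package_dir / name
        if not src.is_file():
            raise FileNotFoundError(f"modified file not found: {src}")
        backup = dst.with_name(dst.name + ".orig")
        if dst.exists() and not backup.exists():
            shutil.copy2(dst, backup)  # preserve the stock file once
        shutil.copy2(src, dst)
        replaced.append(dst)
    return replaced
```

A call would look like `replace_files(modified_dir, installed_package_dir())`, where `modified_dir` is a placeholder for wherever the modified `trainer.py` and `training_args.py` live in your checkout. Run it with the AimMerging environment active so that the correct `transformers` installation is patched.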
Our data preprocessing follows the approach used in O-LoRA. We also provide preprocessed datasets that are ready to use.
Download the required backbone models from Hugging Face:
To fine-tune the models, run the following command. This will also generate predictions for three metrics: Overall Performance (OP), Forward Transfer (FWT), and Backward Transfer (BWT).
./scripts/run_train_ours.sh
Notes:
- Use the model_path argument to specify the location of your downloaded models.
- We use LoRA for parameter-efficient fine-tuning.
- Fine-tuned model weights will be saved to $checkpoint_files.
- All the visualized results presented in our paper (Section "Visualization") will be saved in the ./Fig folder for easy access.
- The prediction results will be stored in the $output folder.
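For readers unfamiliar with the three metrics, here is a small sketch of one common formulation from the continual learning literature. It is an illustration only: the evaluation scripts in this repository may define the metrics differently, and the function names, the `scores` matrix layout, and the `baseline` argument are all assumptions. `scores[i][j]` is taken to be the test score on task `j` after training on tasks `0..i`.

```python
def overall_performance(scores):
    """Average score over all tasks after training on the final task."""
    last = scores[-1]
    return sum(last) / len(last)


def backward_transfer(scores):
    """Average change on earlier tasks between just-learned and final model."""
    num_tasks = len(scores)
    if num_tasks < 2:
        raise ValueError("backward transfer needs at least two tasks")
    diffs = [scores[-1][j] - scores[j][j] for j in range(num_tasks - 1)]
    return sum(diffs) / (num_tasks - 1)


def forward_transfer(scores, baseline):
    """Average gain on each task before it is trained, relative to a baseline.

    baseline[j] is an assumed reference score for task j, for example the
    score of the untrained backbone.
    """
    num_tasks = len(scores)
    if num_tasks < 2:
        raise ValueError("forward transfer needs at least two tasks")
    diffs = [scores[j - 1][j] - baseline[j] for j in range(1, num_tasks)]
    return sum(diffs) / (num_tasks - 1)


if __name__ == "__main__":
    # Made-up numbers for three tasks, purely for illustration.
    scores = [
        [0.8, 0.2, 0.1],
        [0.7, 0.9, 0.3],
        [0.6, 0.8, 0.9],
    ]
    baseline = [0.1, 0.1, 0.1]
    print("OP :", overall_performance(scores))
    print("BWT:", backward_transfer(scores))
    print("FWT:", forward_transfer(scores, baseline))
```

Under this formulation a negative BWT indicates forgetting of earlier tasks, while a positive FWT indicates that earlier training helped on tasks not yet seen.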
To calculate the metrics, run:
./src/eval_avgPerf.py
./src/eval_fwt.py
./src/eval_bwt.py
We hope you find this repository useful! If you encounter any issues or have questions, feel free to open an issue or contact us.
If you find this work useful or use our code in your research, we would greatly appreciate a citation of our paper.
@misc{feng2025aimmerging,
title={AIMMerging: Adaptive Iterative Model Merging Using Training Trajectories for Language Model Continual Learning},
author={Yujie Feng and Jian Li and Xiaoyu Dong and Pengfei Xu and Xiaohui Zhou and Yujia Zhang and Zexin LU and Yasha Wang and Alan Zhao and Xu Chu and Xiao-Ming Wu},
year={2025},
eprint={2509.17348},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2509.17348},
}