LLM-FE: Automated Feature Engineering for Tabular Data with LLMs as Evolutionary Optimizers


Official implementation of LLM-FE: Automated Feature Engineering for Tabular Data with LLMs as Evolutionary Optimizers.

📄 Overview

LLM-FE is a novel framework that leverages Large Language Models (LLMs) as evolutionary optimizers to automate feature engineering for tabular datasets. LLM-FE iteratively generates and refines features using structured prompts, selecting high-impact transformations based on model performance. This approach enables the discovery of interpretable and high-quality features, enhancing the performance of various machine learning models across diverse classification and regression tasks.
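
The generate, evaluate, and select loop described above can be illustrated with a minimal sketch. This is not the repository's implementation: `propose_candidates` is a hypothetical stub that stands in for the LLM call and returns hand-written transformations, and the fitness function uses absolute Pearson correlation with the target as a cheap stand-in for the downstream model performance that LLM-FE scores against. The column names `a` and `b` and the toy data are likewise assumptions made for illustration.

```python
import math
from typing import Callable, Dict, List

Row = Dict[str, float]
Feature = Callable[[Row], float]


def pearson(xs: List[float], ys: List[float]) -> float:
    """Plain Pearson correlation; returns 0.0 when either side is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def score(feature: Feature, rows: List[Row], target: List[float]) -> float:
    """Stand-in fitness (assumption): |correlation| of the feature with the target."""
    return abs(pearson([feature(r) for r in rows], target))


def propose_candidates(best: Dict[str, Feature], round_idx: int) -> Dict[str, Feature]:
    """Hypothetical stub for the LLM call.

    A real proposer would build a prompt from the current best features;
    this one ignores `best` and cycles through a fixed, hand-written pool.
    """
    pool = [
        {"sum": lambda r: r["a"] + r["b"], "diff": lambda r: r["a"] - r["b"]},
        {"product": lambda r: r["a"] * r["b"], "ratio": lambda r: r["a"] / (r["b"] + 1.0)},
    ]
    return pool[round_idx % len(pool)]


def evolve(rows: List[Row], target: List[float], rounds: int = 2, keep: int = 2) -> Dict[str, float]:
    """Run a few propose/score/select rounds and return the surviving scores."""
    best: Dict[str, Feature] = {}
    scores: Dict[str, float] = {}
    for i in range(rounds):
        # Generation and evaluation: score every newly proposed feature.
        for name, fn in propose_candidates(best, i).items():
            best[name] = fn
            scores[name] = score(fn, rows, target)
        # Selection: keep only the top-`keep` features by score.
        ranked = sorted(scores, key=scores.get, reverse=True)[:keep]
        best = {n: best[n] for n in ranked}
        scores = {n: scores[n] for n in ranked}
    return scores


# Toy data in which the target is exactly a * b.
rows = [
    {"a": 1.0, "b": 4.0},
    {"a": 2.0, "b": 1.0},
    {"a": 3.0, "b": 5.0},
    {"a": 4.0, "b": 2.0},
    {"a": 5.0, "b": 3.0},
]
target = [r["a"] * r["b"] for r in rows]
result = evolve(rows, target)
print(result)
```

On this toy data the `product` feature survives selection because it reproduces the target exactly. Swapping the correlation score for a cross-validated model metric would bring the sketch closer to the performance-based selection the overview describes.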

⚙️ Installation

To run the code, create a conda environment and install the dependencies using requirements.txt:

conda create -n llmfe python=3.11.7
conda activate llmfe
pip install -r requirements.txt

🔧 Usage

In the run_llmfe.sh file, set your OpenAI API key:

export API_KEY=<ENTER YOUR API KEY>

To run the LLM-FE pipeline on a sample dataset:

bash run_llmfe.sh

📝 Citation

@article{abhyankar2025llm,
  title={LLM-FE: Automated Feature Engineering for Tabular Data with LLMs as Evolutionary Optimizers},
  author={Abhyankar, Nikhil and Shojaee, Parshin and Reddy, Chandan K},
  journal={arXiv preprint arXiv:2503.14434},
  year={2025}
}

📄 License

This repository is licensed under the MIT License.

This work builds on other open-source projects, including FunSearch and LLM-SR. We thank the original contributors for open-sourcing their code.

📬 Contact Us

For any questions or issues, you are welcome to open an issue in this repo, or contact us at nikhilsa@vt.edu and parshinshojaee@vt.edu.
