Feature Election algorithm implementation using the Flower framework #6182
Hi @christofilojohn, thanks for creating this PR! The example looks very promising. I have a few general comments to help get it merged into the Flower main branch:
Thanks again for your contribution! Don’t hesitate to reach out if you have any questions or concerns. I’m more than happy to help, especially with the migration to the Message API.
Thank you for your comments and feedback.
Hi Ioannis @christofilojohn, maybe the |
Yes, I will fix it, thank you.
I will have the updated file structure and the Message API migration uploaded soon, but I have a question.
Hi Ioannis @christofilojohn , we don't need to include licensing headers in the files, instead we have a |
Thank you for your comment. I changed it, and the change will be in my next commit (I am still running some tests). May I add my email and the conference name to the README, with the full citation pending because the paper URL is not yet available, and update it once the paper is published?
Absolutely! Apart from the readme, you can also add your name and email in |
Perfect, thanks for the info. I uploaded a new version based on the Message API (feel free to comment on anything) and added an optional auto-tuning method for the 'freedom degree' hyperparameter, based on a hill-climbing algorithm. The core behavior of the algorithm also changed slightly: it now performs feature election, decides on a global feature mask, and then continues with normal FL aggregation rounds, mirroring the implementation in our paper. The README and tests are also updated.
Hello @yan-gao-GY, I uploaded a new version that passes black, ilint and mypy, because I saw that similar checks were in the workflow file. Please let me know if everything is OK or if I need to make any changes.
yan-gao-GY
left a comment
Hi @christofilojohn, thanks a lot for the update, and sorry for the delay! I've just made some suggested changes, mainly to the README.
In pyproject.toml, we can remove the tool.pytest, tool.mypy and tool.isort sections. Instead, we can use Flower's built-in tooling to format and test the code. Could you run ./dev/format.sh and ./dev/test.sh from the Flower root directory?
Co-authored-by: Yan Gao <yan@flower.ai>
Perfect, I accepted the suggested changes and removed the formatting tools from the project. After running the format script and editing the README to remove special characters, the local tests pass, with only a deprecation warning from the script: site-packages/beautysh.py:7: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. Everything else is fine in the latest commit.
Hello @yan-gao-GY, please let me know if there are any more tests for me to run or changes to make for the pull request to proceed.
Issue
Description
This work originates from "FLASH: A framework for Federated Learning with Attribute Selection and Hyperparameter optimization", a paper presented at FLTA IEEE 2025, where it received the Best Student Paper Award.
Feature Election enables multiple clients with tabular datasets to collaboratively identify the most relevant features without sharing raw data. It works by using conventional feature selection algorithms on the client side and performing a weighted aggregation of their results.
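The client-side step and the weighted aggregation described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the function names `select_features_locally` and `weighted_vote` are hypothetical, absolute Pearson correlation stands in for a conventional selector such as Lasso, and weighting by sample count is an assumption. Only the mask and scores would leave the client, never `X` or `y`.

```python
import numpy as np


def select_features_locally(X, y, k):
    """Hypothetical client-side selector: keep the k features whose
    absolute Pearson correlation with the target is highest."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    # Constant columns get a score of 0 instead of dividing by zero.
    scores = np.abs(Xc.T @ yc) / np.where(denom > 0, denom, 1.0)
    mask = np.zeros(X.shape[1], dtype=bool)
    mask[np.argsort(-scores)[:k]] = True
    return mask, scores


def weighted_vote(masks, n_samples):
    """Hypothetical server-side step: per-feature share of votes,
    with each client weighted by its number of samples (assumed)."""
    weights = np.asarray(n_samples, dtype=float)
    weights = weights / weights.sum()
    return weights @ np.asarray(masks, dtype=float)


# One client with 5 features, of which only features 0 and 2 matter.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = 3.0 * X[:, 0] - 2.0 * X[:, 2] + 0.1 * rng.normal(size=200)
mask, scores = select_features_locally(X, y, k=2)

# Two clients vote on two features; the second client holds 3x the data.
votes = weighted_vote([[1, 0], [1, 1]], n_samples=[1, 3])
```

Here `votes` gives the first feature full support and the second feature the support of the larger client only.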
Related issues/PRs
Proposal
Explanation
This PR introduces a complete Feature Election workflow including:
Custom Strategy (FeatureElectionStrategy): Implements the core aggregation logic using a "Freedom Degree" to balance between Feature Intersection (strict consensus) and Feature Union.
Modular Feature Selection: A FeatureSelector utility supporting multiple methods including Lasso, Random Forest, and PyImpetus (Markov Blanket).
Synthetic Data Generation: A task.py module that generates synthetic data with consistent informative features across clients (fixed random seed) while allowing for non-IID partitioning, ensuring valid consensus is mathematically possible.
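To make the "Freedom Degree" idea concrete, here is one plausible reading of it as a sketch. It is not taken from FeatureElectionStrategy: the function name `elect_features`, the linear interpolation between intersection and union, and the ranking of contested features by sample-weighted score are all assumptions made for illustration.

```python
import numpy as np


def elect_features(masks, scores, weights, freedom_degree):
    """Hypothetical election rule.

    freedom_degree = 0.0 keeps only features every client selected
    (intersection); 1.0 keeps every feature any client selected (union).
    Values in between admit the best-scoring contested features.
    """
    if not 0.0 <= freedom_degree <= 1.0:
        raise ValueError("freedom_degree must lie in [0, 1]")
    masks = np.asarray(masks, dtype=bool)      # (n_clients, n_features)
    scores = np.asarray(scores, dtype=float)   # same shape as masks
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()

    intersection = masks.all(axis=0)
    union = masks.any(axis=0)
    contested = union & ~intersection

    # Weighted mean score; a client contributes only where it voted.
    aggregated = weights @ (scores * masks)

    n_extra = int(round(freedom_degree * contested.sum()))
    global_mask = intersection.copy()
    if n_extra > 0:
        idx = np.flatnonzero(contested)
        order = np.argsort(-aggregated[idx], kind="stable")
        global_mask[idx[order[:n_extra]]] = True
    return global_mask


masks = [[1, 1, 0, 0], [1, 0, 1, 0], [1, 1, 1, 0]]
scores = [[0.9, 0.5, 0.0, 0.0], [0.8, 0.0, 0.7, 0.0], [0.9, 0.4, 0.2, 0.0]]
weights = [100, 100, 200]  # e.g. number of samples per client

strict = elect_features(masks, scores, weights, freedom_degree=0.0)
middle = elect_features(masks, scores, weights, freedom_degree=0.5)
loose = elect_features(masks, scores, weights, freedom_degree=1.0)
```

With these inputs, `strict` keeps only feature 0, `loose` keeps features 0 to 2, and `middle` additionally admits feature 1, the contested feature with the higher weighted score.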
Checklist