-
-
Notifications
You must be signed in to change notification settings - Fork 11.8k
[Bugfix] fix fuse_allreduce_rms when tp =1 #30178
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request introduces a bugfix in AllReduceFusionPass to prevent a potential AttributeError in is_applicable_for_range. The change adds checks to ensure the pass is not applied if it has been disabled during initialization or if a required attribute (max_token_num) is missing. This correctly handles cases where the pass initialization exits early. While the fix is effective, I've noted a small redundancy in the added checks that could be simplified for better code clarity and maintainability.
| if not hasattr(self, "max_token_num"): | ||
| logger.warning_once( | ||
| "AllReduce fusion pass missing max token bound; skipping", | ||
| ) | ||
| return False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This check for max_token_num appears to be redundant. A review of the __init__ method for AllReduceFusionPass shows that self.disabled is only False if the initialization completes successfully, which includes setting self.max_token_num. In all early-exit scenarios, self.disabled remains True, and self.max_token_num is not set. Therefore, the initial check for self.disabled on line 1191 is sufficient to guard against the AttributeError. Removing this redundant block will make the code cleaner and easier to reason about.
|
Thanks, merging! |
Purpose
Fix #24252 (comment)
cc @ProExpertProg @hjjq
Test Plan
Test Result
Essential Elements of an Effective PR Description Checklist
supported_models.mdandexamplesfor a new model.