Skip to content

Conversation

@ZJY0516
Copy link
Contributor

@ZJY0516 ZJY0516 commented Dec 6, 2025

Purpose

Fix #24252 (comment)

cc @ProExpertProg @hjjq

Test Plan

Test Result


Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing test command.
  • The test results, such as pasting the results comparison before and after, or e2e results
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
  • (Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

Signed-off-by: zjy0516 <riverclouds.zhu@qq.com>
@ZJY0516 ZJY0516 changed the title [Bugfix] [Bugfix] fix fuse_allreduce_rms when tp =1 Dec 6, 2025
Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a bugfix in AllReduceFusionPass to prevent a potential AttributeError in is_applicable_for_range. The change adds checks to ensure the pass is not applied if it has been disabled during initialization or if a required attribute (max_token_num) is missing. This correctly handles cases where the pass initialization exits early. While the fix is effective, I've noted a small redundancy in the added checks that could be simplified for better code clarity and maintainability.

Comment on lines 1198 to 1202
if not hasattr(self, "max_token_num"):
logger.warning_once(
"AllReduce fusion pass missing max token bound; skipping",
)
return False
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

This check for max_token_num appears to be redundant. A review of the __init__ method for AllReduceFusionPass shows that self.disabled is only False if the initialization completes successfully, which includes setting self.max_token_num. In all early-exit scenarios, self.disabled remains True, and self.max_token_num is not set. Therefore, the initial check for self.disabled on line 1191 is sufficient to guard against the AttributeError. Removing this redundant block will make the code cleaner and easier to reason about.

Signed-off-by: zjy0516 <riverclouds.zhu@qq.com>
@ProExpertProg ProExpertProg added the ready ONLY add when PR is ready to merge/full CI is needed label Dec 7, 2025
Signed-off-by: zjy0516 <riverclouds.zhu@qq.com>
Signed-off-by: zjy0516 <riverclouds.zhu@qq.com>
@ProExpertProg ProExpertProg enabled auto-merge (squash) December 7, 2025 16:29
@ProExpertProg
Copy link
Collaborator

Thanks, merging!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ready ONLY add when PR is ready to merge/full CI is needed

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants