-
Notifications
You must be signed in to change notification settings - Fork 1.4k
sched/wdog: Fix-up for SMP and Improvements. #17640
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
acassis
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Fix-Point really good description and explanation about the issue! Kudos!!
4d539bf to
87f046e
Compare
This commit Removed wd_recover. Signed-off-by: ouyangxiangzhen <ouyangxiangzhen@xiaomi.com>
This commit reduced 1 write operation for wdog deleting and decoupled the WDOG_ISACTIVE with the list implementation. Signed-off-by: ouyangxiangzhen <ouyangxiangzhen@xiaomi.com>
This commit removed the workaround for wdog latency. Signed-off-by: ouyangxiangzhen <ouyangxiangzhen@xiaomi.com>
…ection).
If the wdog use the fine-grained spin-lock, and allow the callback execution without the lock held, there will be incorrect interleaving.
E.g. The first `nxsem_timeout` callback function caused the second semaphore wait to fail.
Core 0 [nxsem_clockwait] | Core 1
enter_critical_section() | ...
wd_start(nxsem_timeout) | ...
nxsem_wait(sem) | wd_expiration() --> nxsem_timeout
wd_cancel(&rtcb->waitdog) | try enter_critical_section()
leave_critical_section() | Failed retry...
....nxsem_clockwait | Failed retry...
enter_critical_section() | Failed retry...
wd_start(nxsem_timeout) | Failed retry...
nxsem_wait(sem) | Core 1 enter the critical section
| nxsem_wait_irq(wtcb, ETIMEDOUT) -> incorrectly wake-up the rtcb.
Signed-off-by: ouyangxiangzhen <ouyangxiangzhen@xiaomi.com>
This commit simplified wd_timer() to reduce WCET when `noswitches` is false. Signed-off-by: ouyangxiangzhen <ouyangxiangzhen@xiaomi.com>
87f046e to
c9dbe72
Compare
|
Great improvements for wdog. and I really hope we can work together to improve scheduler and hrtimer separately with you focusing on scheduler related and I focusing on hrtimer related. You can refer to #17642 , where I believe the issues you raised regarding the current hrtimer behavior in SMP mode have been addressed. |
c9dbe72 to
ebc4ea4
Compare
anchao
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why was critical section changed back from spinlock? Weren't you optimizing SMP performance?
This commit fixed the MISRA C-2004 violation rule 11.1, 10.4 and more. Signed-off-by: jiangtao16 <jiangtao16@xiaomi.com> Signed-off-by: wangzhi16 <wangzhi16@xiaomi.com> Signed-off-by: ouyangxiangzhen <ouyangxiangzhen@xiaomi.com>
ebc4ea4 to
e685cc1
Compare
There's a functional correctness issue using spinlock.
|
This commmit simplified the code and correct the critical_section. Signed-off-by: ouyangxiangzhen <ouyangxiangzhen@xiaomi.com>
e685cc1 to
7012342
Compare
anchao
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Summary
This PR is part III of the #17556 and based on #17584, it primarily introduces fix-up and improvements to the WDOG module, including the following key changes:
wd_recoverinterface to simplify the WDOG module, reduce coupling, and decrease code size.list_delete_fastinterface to improve WDOG deletion performance. Theoretically, this saves at least 1 write operation on the path where a WDOG needs to be deleted, and 2 write operations when a WDOG needs to be reinserted.enter_critical_section) to fix concurrency-related bugs.wd_timerto reduce performance overhead and Worst-Case Execution Time (WCET) for timeout handling.Special attention is required regarding why we reverted the fine-grained WDOG spin-lock to the big kernel lock (
enter_critical_section).In the earliest implementation, the WDOG had only two states:
%%{ init: { 'theme': 'base', 'themeVariables': { 'primaryColor': '#FFFFFF', 'primaryTextColor' : '#000000', 'mermiad-container': "#FFFFFF", 'primaryBorderColor': '#000000', 'lineColor': '#000000', 'secondaryColor': '#FFFFFF', 'tertiaryColor': '#000000' }, 'sequence': { 'mirrorActors': false } } }%% stateDiagram-v2 WDOG_INACTIVE --> WDOG_INACTIVE : wd_cancel WDOG_INACTIVE --> WDOG_ACTIVE : wd_start WDOG_ACTIVE --> WDOG_ACTIVE : wd_start WDOG_ACTIVE --> WDOG_INACTIVE : wd_cancel or callback finishedFigure 1. Correct WDOG state diagram.
This implementation was functionally correct in an SMP environment. However, after switching to a spin-lock and relaxing the critical section, the WDOG entered three states:
%%{ init: { 'theme': 'base', 'themeVariables': { 'primaryColor': '#FFFFFF', 'primaryTextColor' : '#000000', 'mermiad-container': "#FFFFFF", 'primaryBorderColor': '#000000', 'lineColor': '#000000', 'secondaryColor': '#FFFFFF', 'tertiaryColor': '#000000' }, 'sequence': { 'mirrorActors': false } } }%% stateDiagram-v2 WDOG_INACTIVE|private --> WDOG_ACTIVE|shared : wd_start WDOG_ACTIVE|shared --> WDOG_INACTIVE|shared : execute callback WDOG_INACTIVE|shared --> WDOG_ACTIVE|shared : wd_start in callback to restart WDOG_INACTIVE|shared --> WDOG_INACTIVE|private : callback finished WDOG_ACTIVE|shared --> WDOG_INACTIVE|private : wd_cancel WDOG_INACTIVE|shared --> WDOG_INACTIVE|shared : wd_cancel failedFigure 2. Incorrect WDOG state diagram in current implementation.
After modifying the critical section, the execution of the WDOG callback function is no longer atomic. This inevitably introduces a new state:
WDOG_INACTIVE | shared. In this state, even thoughwdog->func == NULL, its ownership does not belong to the user thread, as shown in Figure 2. If the user repurposes the WDOG memory space at this moment, a race condition occurs.Race Condition Example:
wd_cancel, readswdog->func == NULL, mistakenly assumes the WDOG is in theWDOG_INACTIVE | privatestate, and returns directly. The user thread then callsmemsetto set the WDOG memory space to0x55aaaa55.wd_start, reads a non-NULLwdog->func(0x55aaaa55), and attempts to remove the WDOG.0x55aaaa55, causing a memory access fault.To address this issue, a straightforward idea is to call
wd_cancelbefore using the WDOG to ensure it enters theWDOG_INACTIVE | privatestate. However, even this simple approach is not easily achievable. Closer inspection of the state diagram reveals:wd_cancelin theWDOG_ACTIVE | sharedstate can the WDOG return to the user-availableWDOG_INACTIVE | privatestate.WDOG_INACTIVE | sharedstate,wd_cancelfails. If the user thread mistakenly believes the WDOG is in theWDOG_INACTIVE | privatestate and requeues the WDOG, a race condition still occurs, leading to runtime errors.In this scenario, to ensure WDOG concurrency correctness, it becomes necessary to loop
wd_canceluntil it succeeds. However, this looping approach introduces the following problems:WDOG_INACTIVE | sharedWDOG_ACTIVE | sharedWDOG_INACTIVE | sharedas a periodic timer. Ifwd_cancelis called precisely when the WDOG is in theWDOG_INACTIVE | sharedstate (executing the callback),wd_cancelmight fail indefinitely. In other words,wd_cancelcan starve, violating the system's real-time and deterministic properties.WDOG_INACTIVE | sharedWDOG_ACTIVE | sharedtransition), using the WDOG still requires waiting for the callback to finish. However, the current implementation lacks such a waiting mechanism.Explanation of other related points:
Why not improve the state machine based on the new fine-grained spin-lock?
After introducing the third state
WDOG_INACTIVE | shared, we could enhance the state machine to achieve the following:wd_cancel.wd_cancel. Consequently, we would have to reimplement all code that uses WDOGs.For example, under the asynchronous
wd_cancelsemantics the following problematic thread interleaving could still occur:wd_start/wd_cancelare called. Reviewing and rewriting all related logic would clearly be impractical.Why not use a recursive spin-lock (
rspin_lock) but instead revert to a big kernel lock?Because it could lead to circular waiting and deadlock:
tcb->waitdogis restarted within a critical section.Why is the most time-consuming operation,
nxsched_reassess_timer(), also placed within the critical section?Impact
This change will affect the WDOG module's code size, performance, critical section entry/exit time, and the WCET (Worst-Case Execution Time) of timeout handling.
wd_recovershould result in an overall reduction in code size.list_delete_fastand replacing the original WDOG deletion interface reduces code size and slightly improves performance.wd_timeroptimizes the performance and WCET of timer timeout handling, reverting the spin-lock to a big-kernel-lock significantly increases the execution time within the critical section. However, this sacrifice is necessary for functional correctness. Without functional correctness, high performance is meaningless. Once the HRTimer is integrated, we will gradually replace WDOG calls with hrtimer and apply optimizations to minimize the size of WDOG's critical sections as much as possible.Testing
We ran
ostestonrv-virt:smpto verify functional correctness, withCONFIG_SCHED_TICKLESS=yandCONFIG_RR_INTERVAL=0enabled.We also launched 4 instances and executed
ostest64 times in total. All test results passed.