ReadonlyFilesystem Condition with Auto-Recovery Detection #1234
CharudathGopal wants to merge 1 commit into kubernetes:master
@Random-Liu @sjenning Can you please take a look at this PR?
ISSUE: #474
Problem Statement
The existing NPD SystemLogMonitor for ReadonlyFilesystem is a one-way trigger: it sets the condition to True when it detects "Remounting filesystem read-only" in kernel logs, but never clears it. The condition remains stuck until the NPD pod restarts or the node reboots, which causes some Kubernetes platforms to delete the node.
Solution: CustomPluginMonitor with Active Recovery Detection
This PR introduces a CustomPluginMonitor (check_ro_filesystem.sh) that actively checks for both the onset of read-only filesystems and their recovery.
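As a rough illustration of the detect-and-recover idea, the sketch below checks whether devices previously reported in the kernel log are still mounted read-only and maps the result to an exit status. It is not the PR's check_ro_filesystem.sh: the function names `is_mounted_ro` and `check_devices`, the `/dev/<name>` matching, and the output messages are all assumptions made for illustration. The exit codes assume NPD's custom-plugin convention (0 for OK, 1 for a detected problem, anything else treated as unknown).

```shell
#!/bin/sh
# Hypothetical sketch, not the script from this PR.
# Assumed NPD custom-plugin exit codes.
OK=0
NONOK=1
UNKNOWN=2

# is_mounted_ro DEVICE MOUNTS_FILE
# Succeeds when DEVICE is listed in MOUNTS_FILE (same layout as /proc/mounts)
# with "ro" as one of its comma-separated mount options.
is_mounted_ro() {
    awk -v dev="$1" '
        $1 == dev {
            n = split($4, opts, ",")
            # Exact match, so "errors=remount-ro" does not count as "ro".
            for (i = 1; i <= n; i++) if (opts[i] == "ro") found = 1
        }
        END { exit(found ? 0 : 1) }
    ' "$2"
}

# check_devices MOUNTS_FILE DEVICE...
# DEVICE is the short name taken from the kernel log, e.g. "sda1".
# Assumption: the device shows up in the mounts file as /dev/<name>.
check_devices() {
    mounts=$1
    shift
    [ -r "$mounts" ] || return "$UNKNOWN"
    for dev in "$@"; do
        if is_mounted_ro "/dev/$dev" "$mounts"; then
            echo "$dev is mounted read-only"
            return "$NONOK"
        fi
    done
    echo "no read-only remount in effect"
    return "$OK"
}
```

Called as `check_devices /host/proc/1/mounts sda1`, this would return 1 while sda1 is still read-only and 0 once it is mounted read-write again, which is the kind of signal that lets the condition clear without restarting NPD.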
Key Features
NPD Design Adherence & Detection (5-minute lookback)
- /dev/kmsg for "Remounting filesystem read-only" messages
- /proc/uptime
- /host/proc/1/mounts
Recovery Check (all-time lookback)
- /dev/kmsg with no time limit
Targeted Approach (Avoids False Positives)
- Avoids scanning /proc/mounts (which would flag legitimate read-only mounts like /boot, CD-ROMs, ConfigMaps)
Deployment Changes
The ReadonlyFilesystem CustomPluginMonitor is not enabled by default. Users can enable it by following the steps documented in docs/readonly-recovery-plugin-monitor.md:
- Replace --config.system-log-monitor=/config/readonly-monitor.json with --config.custom-plugin-monitor=/config/readonly-recovery-plugin-monitor.json
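The 5-minute detection window and the all-time recovery lookback listed under Key Features could be implemented along the following lines. This is a minimal sketch, not the PR's code: the function name `recent_ro_devices` and its arguments are hypothetical, and it assumes that message age is computed by comparing the /dev/kmsg record timestamp (microseconds since boot) against the first field of /proc/uptime.

```shell
#!/bin/sh
# Hypothetical sketch, not the script from this PR.

# recent_ro_devices UPTIME_SECONDS LOOKBACK_SECONDS
# Reads /dev/kmsg-format records on stdin and prints the device named in
# each "Remounting filesystem read-only" message that is at most
# LOOKBACK_SECONDS old. A negative LOOKBACK_SECONDS disables the age
# limit (all-time lookback).
recent_ro_devices() {
    awk -v up="$1" -v win="$2" '
        /Remounting filesystem read-only/ {
            # Record layout: "prio,seq,timestamp_usec,flags;message".
            split($0, rec, ";")
            split(rec[1], hdr, ",")
            age = up - hdr[3] / 1000000
            if (win >= 0 && age > win) next
            # Message looks like "EXT4-fs (sda1): Remounting ...".
            if (match($0, /\([^)]+\)/))
                print substr($0, RSTART + 1, RLENGTH - 2)
        }
    '
}
```

With the uptime taken from `cut -d' ' -f1 /proc/uptime`, a lookback of 300 would correspond to the detection window and -1 to the recovery check. Note that a plain read of /dev/kmsg blocks once the buffer is exhausted, so a real script needs a timeout or a non-blocking read when feeding this function.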