fix(): update CR with all tunnel status fields including latency #454
Merged
rajendra-avesha merged 1 commit into master on Nov 6, 2025
Rahul-D78 approved these changes on Nov 5, 2025
908c2e5 to 9e02b28
Enhanced isGWPodStatusChanged to compare ALL TunnelStatus fields:
- Latency: Gateway latency in milliseconds
- RxRate/TxRate: Throughput metrics
- PacketLoss: Packet loss percentage
- RemoteIP/LocalIP: Tunnel endpoint IPs
- IntfName: Interface name
- TunnelState: Tunnel state string

Signed-off-by: Rajendra <rajendra@aveshasystems.com>
9e02b28 to 8984466
fix(slicegateway): update CR with all tunnel status fields including latency
Description
This PR fixes a critical bug where gateway metrics (latency, packet loss, throughput) were not being updated in the SliceGateway Custom Resource after the initial tunnel establishment.
Problem
The isGWPodStatusChanged() function in controllers/slicegateway/utils.go was only checking two fields:
- TunnelStatus.Status (tunnel state: UP/DOWN)
- PeerPodName (remote gateway pod name)

This caused the function to return "unchanged" even when critical metrics like latency, packet loss, and throughput changed. As a result, the SliceGateway CR was only updated when the tunnel state changed (UP↔DOWN), effectively freezing all metrics after the initial tunnel establishment.
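To make the failure mode concrete, here is a minimal sketch of a two-field check of the kind described above. The type definitions, field types, and the helper name legacyUnchanged are illustrative assumptions, not the operator's actual code.

```go
package main

import "fmt"

// TunnelStatus is a trimmed-down, hypothetical stand-in for the real type.
// Field types are assumptions made for this sketch.
type TunnelStatus struct {
	Status  int32  // tunnel state, e.g. 1 for UP
	Latency uint64 // gateway latency in milliseconds
}

// GwPodInfo is a hypothetical stand-in for one gateway pod's status entry.
type GwPodInfo struct {
	PodName      string
	PeerPodName  string
	TunnelStatus TunnelStatus
}

// legacyUnchanged mirrors the two-field check described in the PR: it reports
// true when Status and PeerPodName match, and ignores every other field.
func legacyUnchanged(stored, observed GwPodInfo) bool {
	return stored.TunnelStatus.Status == observed.TunnelStatus.Status &&
		stored.PeerPodName == observed.PeerPodName
}

func main() {
	stored := GwPodInfo{
		PodName:      "gw-0",
		PeerPodName:  "gw-peer-0",
		TunnelStatus: TunnelStatus{Status: 1, Latency: 1},
	}
	observed := stored
	observed.TunnelStatus.Latency = 25 // latency moved, tunnel still UP

	// The latency change is invisible to the two-field check,
	// so a caller relying on it would skip the CR update.
	fmt.Println("unchanged:", legacyUnchanged(stored, observed)) // prints "unchanged: true"
}
```

With only Status and PeerPodName compared, any reconciliation in which the tunnel stays UP looks identical to the previous one, no matter how the metrics moved.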
Root Cause
The function returns true if these fields are unchanged, which, when negated in isGatewayStatusChanged(), prevents CR updates. Since tunnel status rarely changes after being UP, metrics were never updated.
Solution
Enhanced the function to compare ALL TunnelStatus fields:
Now the CR updates whenever ANY metric changes, ensuring fresh monitoring data every reconciliation cycle (120 seconds).
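The comparison described above can be sketched roughly as follows. The field names come from this PR; the Go types, the struct layout, and the helper names tunnelStatusEqual and gwPodUnchanged are assumptions for illustration and may differ from the code in controllers/slicegateway/utils.go.

```go
package main

import "fmt"

// TunnelStatus lists the fields named in this PR; the Go types are assumptions.
type TunnelStatus struct {
	IntfName    string
	LocalIP     string
	RemoteIP    string
	Latency     uint64
	TxRate      uint64
	RxRate      uint64
	PacketLoss  uint64
	Status      int32
	TunnelState string
}

// GwPodInfo is a hypothetical stand-in for one gateway pod's status entry.
type GwPodInfo struct {
	PodName      string
	PeerPodName  string
	TunnelStatus TunnelStatus
}

// tunnelStatusEqual compares every TunnelStatus field explicitly, so a change
// in any single metric makes the two values differ.
func tunnelStatusEqual(a, b TunnelStatus) bool {
	return a.Status == b.Status &&
		a.TunnelState == b.TunnelState &&
		a.IntfName == b.IntfName &&
		a.LocalIP == b.LocalIP &&
		a.RemoteIP == b.RemoteIP &&
		a.Latency == b.Latency &&
		a.TxRate == b.TxRate &&
		a.RxRate == b.RxRate &&
		a.PacketLoss == b.PacketLoss
}

// gwPodUnchanged reports true only when the peer pod name and all tunnel
// status fields match the stored values.
func gwPodUnchanged(stored, observed GwPodInfo) bool {
	return stored.PeerPodName == observed.PeerPodName &&
		tunnelStatusEqual(stored.TunnelStatus, observed.TunnelStatus)
}

func main() {
	stored := GwPodInfo{
		PodName:     "gw-0",
		PeerPodName: "gw-peer-0",
		TunnelStatus: TunnelStatus{
			IntfName: "tun0", LocalIP: "10.70.255.1", RemoteIP: "10.70.255.2",
			Latency: 1, TxRate: 8, RxRate: 8, PacketLoss: 0,
			Status: 1, TunnelState: "UP",
		},
	}
	observed := stored
	observed.TunnelStatus.Latency = 3 // only the latency moved

	fmt.Println("unchanged:", gwPodUnchanged(stored, observed)) // prints "unchanged: false"
}
```

Spelling out each field keeps the check explicit about what triggers an update. If every field of the real struct is comparable, a plain == on the struct would behave the same way.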
Impact
Before Fix:
After Fix:
Fields Now Being Monitored
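| Field | Value |
| --- | --- |
| Latency | 1 |
| RxRate | 8 |
| TxRate | 8 |
| PacketLoss | 0 |
| RemoteIP | "10.70.255.2" |
| LocalIP | "10.70.255.1" |
| IntfName | "tun0" |
| TunnelState | "UP" |
| Status | 1 |

Fixes #[issue-number]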
How Has This Been Tested?
Test Environment
- worker-single-net
- worker-multi-net
- aveshadev/worker-operator-ent-egs:1.16.0-SNAPSHOT-228fd10c
Test Cases
Test Case A: Latency Field Population
Test Case B: Metrics Update Frequency
Test Case C: PacketLoss Tracking
Test Case D: Throughput Metrics
Test Case E: RemoteIP Field
Test Case F: Single-Network Mode
worker-single-netslice
Test Case G: Multi-Network Mode
worker-multi-netslice
Test Case H: Both Worker Clusters
Verification Commands
Test Results Summary
Evidence from Sidecar Logs
Before Fix:
After Fix:
Checklist:
go fmt
Does this PR introduce a breaking change?
NO - This PR does not introduce any breaking changes. It is a bug fix that enhances existing functionality without changing any APIs or interfaces.
All existing deployments will benefit from this fix immediately upon upgrade with no migration or configuration changes required.
Additional Notes
Backward Compatibility
Performance Impact
Monitoring Recommendations
After deploying this fix, users can now:
For real-time metrics (< 120s granularity), continue using Prometheus endpoints on gateway sidecars (port 18080).
Documentation Updates
Documentation should be updated to reflect: