Add auto-disable for explorers with unreachable RPCs #437
When an explorer's RPC becomes unreachable, the explorerSyncCheck job keeps trying to start PM2 processes that immediately fail. This creates unnecessary load on the PM2 server.

This change adds automatic sync deactivation after 3 consecutive failures:

- Track sync failures on Explorer model (syncFailedAttempts, syncDisabledAt, syncDisabledReason, nextRecoveryCheckAt)
- Auto-disable shouldSync after 3 consecutive RPC failures or PM2 timeouts
- Add syncRecoveryCheck job that periodically tests disabled explorers
- Use exponential backoff for recovery checks (5m -> 15m -> 1h -> 6h)
- Reset failure tracking when sync is manually re-enabled

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
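The failure tracking described in this commit can be sketched roughly as follows. This is a minimal in-memory illustration, not the code from the PR: the field names come from the commit message, but the helper names (`createExplorer`, `recordSyncFailure`, `recordSyncSuccess`, `enableSyncManually`) are hypothetical, and a plain object stands in for the Explorer model.

```javascript
const MAX_SYNC_FAILURES = 3; // threshold stated in the commit message
const MINUTE_MS = 60 * 1000;
// Backoff steps from the commit message: 5m -> 15m -> 1h -> 6h.
const RECOVERY_BACKOFF_MS = [5, 15, 60, 360].map((minutes) => minutes * MINUTE_MS);

// Plain object standing in for an Explorer row (field names from the commit message).
function createExplorer() {
    return {
        shouldSync: true,
        syncFailedAttempts: 0,
        syncDisabledAt: null,
        syncDisabledReason: null,
        nextRecoveryCheckAt: null
    };
}

// Hypothetical helper: record one failed sync attempt and disable sync at the threshold.
function recordSyncFailure(explorer, reason, now = new Date()) {
    explorer.syncFailedAttempts += 1;
    if (explorer.shouldSync && explorer.syncFailedAttempts >= MAX_SYNC_FAILURES) {
        explorer.shouldSync = false;
        explorer.syncDisabledAt = now;
        explorer.syncDisabledReason = reason;
        // Assumption: the first recovery check uses the first backoff step (5m).
        explorer.nextRecoveryCheckAt = new Date(now.getTime() + RECOVERY_BACKOFF_MS[0]);
    }
    return explorer;
}

// Hypothetical helper: a successful sync start breaks the "consecutive" streak.
function recordSyncSuccess(explorer) {
    explorer.syncFailedAttempts = 0;
    return explorer;
}

// Hypothetical helper: manually re-enabling sync clears all failure tracking.
function enableSyncManually(explorer) {
    return Object.assign(explorer, createExplorer());
}
```

In the real job the counter would live in the database rather than on a local object, so the increment and the threshold check would need to be safe under concurrent job runs.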
CodeAnt AI is reviewing your PR.
CodeAnt AI finished reviewing your PR.
- Fix race condition: use atomic increment for syncFailedAttempts
- Add null check for explorer.workspace in updateExplorerSyncingProcess
- Simplify backoff calculation using recoveryAttempts counter
- Add max recovery attempts (10) after which manual intervention required
- Add batching (50 explorers) to syncRecoveryCheck job
- Add random jitter (up to 2 min) to stagger recovery checks
- Remove dead code (resetSyncState method)
- Add recoveryAttempts column to migration

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
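As a rough illustration of the backoff-related items in this commit, the sketch below derives the next recovery delay from a `recoveryAttempts` counter, adds up to 2 minutes of jitter, gives up after 10 attempts, and splits explorers into groups of 50. The constants come from the commit messages; the function names, the way the counter indexes the schedule, and the reading of "batching" as chunking are assumptions.

```javascript
const MINUTE_MS = 60 * 1000;
// Backoff schedule from the first commit: 5m -> 15m -> 1h -> 6h.
const RECOVERY_BACKOFF_MS = [5, 15, 60, 360].map((minutes) => minutes * MINUTE_MS);
const MAX_RECOVERY_ATTEMPTS = 10; // after this, manual intervention is required
const MAX_JITTER_MS = 2 * MINUTE_MS; // "up to 2 min"
const BATCH_SIZE = 50;

// Returns the delay before the next recovery check, or null once the
// attempt budget is exhausted. `random` is injectable so the jitter
// can be made deterministic in tests.
function nextRecoveryDelayMs(recoveryAttempts, random = Math.random) {
    if (recoveryAttempts >= MAX_RECOVERY_ATTEMPTS) return null;
    // Assumption: attempts past the end of the schedule stay on the last (6h) step.
    const index = Math.min(recoveryAttempts, RECOVERY_BACKOFF_MS.length - 1);
    const jitter = Math.floor(random() * MAX_JITTER_MS);
    return RECOVERY_BACKOFF_MS[index] + jitter;
}

// Splits a list of explorers into batches of at most `size` items.
function toBatches(items, size = BATCH_SIZE) {
    const batches = [];
    for (let i = 0; i < items.length; i += size) {
        batches.push(items.slice(i, i + size));
    }
    return batches;
}
```

The jitter keeps explorers that were disabled at the same moment from all being retested in the same scheduler tick.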
The syncRecoveryCheck job was registered in jobs/index.js and scheduled in scheduler.js, but was missing from the priorities list, which caused a crash on startup when trying to enqueue to a non-existent queue.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
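A mismatch like this one could be caught by a small consistency check. The sketch below is hypothetical: the shape of the priorities list, the priority names, and the assignment of jobs to priorities are invented for illustration and may not match the repository.

```javascript
// Returns the names of jobs that do not appear under any priority,
// i.e. jobs that would have no queue to be enqueued to.
function findJobsWithoutQueue(jobNames, priorities) {
    const queued = new Set(Object.values(priorities).flat());
    return jobNames.filter((name) => !queued.has(name));
}

// Illustrative data only: the real priorities list may be structured differently.
const examplePriorities = {
    high: ['updateExplorerSyncingProcess'],
    low: ['explorerSyncCheck']
};
const exampleRegisteredJobs = [
    'updateExplorerSyncingProcess',
    'explorerSyncCheck',
    'syncRecoveryCheck'
];

const jobsWithoutQueue = findJobsWithoutQueue(exampleRegisteredJobs, examplePriorities);
if (jobsWithoutQueue.length > 0) {
    console.warn(`Jobs missing from priorities: ${jobsWithoutQueue.join(', ')}`);
}
```

Running a check of this kind in a test would surface the missing entry before startup rather than as a crash.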
User description
Summary
Changes
New files:
- `run/migrations/20260109161142-add-sync-failure-tracking.js` - Adds tracking columns to explorers table
- `run/jobs/syncRecoveryCheck.js` - Periodic recovery check job with exponential backoff
- `run/tests/jobs/syncRecoveryCheck.test.js` - Tests for recovery job

Modified files:
- `run/models/explorer.js` - New fields and helper methods for failure tracking
- `run/jobs/updateExplorerSyncingProcess.js` - Track failures and auto-disable
- `run/jobs/index.js` - Register new job
- `run/scheduler.js` - Schedule recovery job (runs every 5 minutes)

How it works
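1. Each RPC failure or PM2 timeout increments `syncFailedAttempts`
2. After 3 consecutive failures, sync is disabled (`shouldSync=false`) with reason
3. `syncRecoveryCheck` tests disabled explorers and re-enables when RPC is back

Test plan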
- `updateExplorerSyncingProcess` tests - 16 passed
- `syncRecoveryCheck` tests - 5 passed

🤖 Generated with Claude Code
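The recovery step from "How it works" might look roughly like the sketch below. It is an assumption-laden illustration rather than the contents of `syncRecoveryCheck.js`: `runRecoveryCheck`, the injected `isRpcReachable` probe, and the `rpcServer` field are hypothetical names, and a plain object stands in for the Explorer model.

```javascript
const MINUTE_MS = 60 * 1000;
// Backoff steps from the PR: 5m -> 15m -> 1h -> 6h.
const RECOVERY_BACKOFF_MS = [5, 15, 60, 360].map((minutes) => minutes * MINUTE_MS);

// Tests one disabled explorer. `isRpcReachable` is a hypothetical async
// probe supplied by the caller; a thrown error counts as "unreachable".
async function runRecoveryCheck(explorer, isRpcReachable, now = new Date()) {
    let reachable = false;
    try {
        reachable = Boolean(await isRpcReachable(explorer.rpcServer));
    } catch (error) {
        reachable = false;
    }

    if (reachable) {
        // RPC is back: re-enable sync and clear all failure tracking.
        explorer.shouldSync = true;
        explorer.syncFailedAttempts = 0;
        explorer.syncDisabledAt = null;
        explorer.syncDisabledReason = null;
        explorer.nextRecoveryCheckAt = null;
        explorer.recoveryAttempts = 0;
        return 'recovered';
    }

    // Still unreachable: move one step along the backoff schedule.
    explorer.recoveryAttempts += 1;
    const index = Math.min(explorer.recoveryAttempts, RECOVERY_BACKOFF_MS.length - 1);
    explorer.nextRecoveryCheckAt = new Date(now.getTime() + RECOVERY_BACKOFF_MS[index]);
    return 'still_unreachable';
}
```

Injecting the probe keeps the sketch testable without a network; the real job would presumably call the explorer's RPC endpoint directly.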
CodeAnt-AI Description
Auto-disable explorer sync after repeated RPC or PM2 failures and automatically recover with exponential backoff
What Changed
Impact
✅ Fewer PM2 restarts and related server load
✅ Fewer repeated sync attempts against unreachable RPCs
✅ Automatic re-enable when an explorer's RPC recovers