-
Notifications
You must be signed in to change notification settings - Fork 0
Fix content skipping during fetch operation #99
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Open
luandro
wants to merge
8
commits into
main
Choose a base branch
from
claude/fix-fetch-content-skip-017nBdSMUrrc3CmEEDYbBTxu
base: main
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
Fix content skipping during fetch operation #99
luandro
wants to merge
8
commits into
main
from
claude/fix-fetch-content-skip-017nBdSMUrrc3CmEEDYbBTxu
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Investigate and document root causes for content being skipped during fetch: 1. Dead code: filterChangedPages imported but never used 2. Logic discrepancy: missing path change detection in filterChangedPages 3. Sub-page timeout issues: 10s timeout too aggressive 4. Insufficient logging: hard to diagnose skip reasons Plan includes: - Phase 1: Immediate fixes (detailed logging, increased timeout) - Phase 2: Core logic fixes (update filterChangedPages function) - Phase 3: Diagnostics (cache validation command) Related to incremental sync feature added in 2f2a47a (2025-11-26) Estimated timeline: 6-10 hours
…hase 1) Phase 1 of fixing issue #95: Content being skipped during fetch Changes: 1. Add detailed skip reason logging in generateBlocks.ts - Shows specific reason why each page is skipped (cache hit, timestamp, path change, etc.) - Helps diagnose incorrect skip behavior - Includes "should not skip" warnings for potential bugs 2. Enhanced sub-page timeout error logging in fetchNotionData.ts - Display error type and detailed message - Add helpful hints when timeout occurs - Better categorization of failures (timeout vs API error vs permission) 3. Increase sub-page fetch timeout from 10s to 30s - Handles slow Notion API responses better - Reduces false positive timeouts for complex pages - Particularly helpful for pages with large blocks or nested children Benefits: - Immediate diagnostic value for debugging skipped pages - Fewer legitimate pages timing out and being skipped - Clear visibility into incremental sync behavior Related: ISSUE_95_PLAN.md
Contributor
🚀 Preview DeploymentYour documentation preview is ready! Preview URL: https://pr-99.comapeo-docs.pages.dev 🔄 Content: Regenerated 5 pages from Notion (script changes detected)
This preview will update automatically when you push new commits to this PR. Built with commit dd58fea |
…ditions Fix logic bug in skip reason detection where unreachable branches were incorrectly labeled as "should not skip" instead of being identified as logic errors. Problem: - Inside `if (!needsProcessing)` block, we know ALL conditions for needsProcessing are FALSE due to boolean OR logic - Previous labels like "should not skip" were misleading since those branches are mathematically unreachable - Would never detect actual logic bugs in needsProcessing calculation Solution: - Change impossible condition messages to "🔴 ERROR:" prefix - Add clear comments explaining unreachability - Document that ONLY "unchanged since [timestamp]" is valid - If ERROR appears in logs, indicates bug in needsProcessing logic Benefits: - Proper detection of logic bugs (if they exist) - Clear distinction between valid skip (cache hit) vs impossible cases - Better debugging when investigating Issue #95 Related: ISSUE_95_PLAN.md Phase 1 review
Fix missing path change check in filterChangedPages function to match the inline logic in generateBlocks.ts. This prevents renamed/moved pages from being incorrectly skipped during incremental sync. Changes: 1. Add optional `getFilePath` callback parameter to filterChangedPages - Allows checking if page's output path has changed - Critical for detecting renamed/moved pages - Backward compatible (callback is optional) 2. Update logic to check path changes before timestamp comparison - If path changed, page needs processing even if timestamp same - Matches behavior of inline needsProcessing logic 3. Add comprehensive tests for new functionality - Test path change detection works correctly - Test backward compatibility without callback - Verify path-unchanged pages are not included Benefits: - Prevents skipping pages that were moved/renamed - Matches the inline logic behavior in generateBlocks.ts - Fully backward compatible (existing calls still work) - Well tested with 2 new test cases This fixes the logic discrepancy identified in Phase 1 review where filterChangedPages was missing path change detection that existed in the inline needsProcessing check. Related: ISSUE_95_PLAN.md Phase 2
Detailed review of Phase 2 implementation identifying: Critical Finding: - filterChangedPages function is not used in production code - Only imported but never called in generateBlocks.ts - Actual fix was already in Phase 1 (inline logic) - Phase 2 prepares function for future use Quality Assessment: - ✅ Implementation is technically correct - ✅ No regressions found - ✅ Good test coverage for happy path -⚠️ Some edge cases identified (empty paths, normalization) -⚠️ Limited practical impact (function unused) Edge Cases Identified: 1. Empty string paths - could cause issues 2. Path normalization differences (./docs vs docs) 3. Null/undefined handling (already safe) 4. Case sensitivity (correct behavior) 5. Multiple output paths (handled correctly) Recommendations: 1. Add documentation comment about unused status 2. Add empty path validation check 3. Consider path normalization 4. Add edge case tests Bottom Line: Phase 2 is technically sound but has limited practical impact since the function isn't used. The real fix for Issue #95 was in Phase 1 (logging + timeout + inline logic). Related: ISSUE_95_PLAN.md
Add documentation and safety checks based on Phase 2 review findings: 1. Add NOTE in JSDoc clarifying function is currently unused - Points to actual implementation in generateBlocks.ts:704-711 - Explains function is maintained for testing and future refactoring - Prevents confusion about why function exists but isn't called 2. Add empty path validation check - Safety guard: if getFilePath returns empty/whitespace, regenerate - Prevents edge case where empty path could cause issues - Fail-safe approach: when in doubt, regenerate Changes are defensive improvements that make the function more robust without changing its core behavior. Empty path validation prevents potential edge case identified in review. Related: PHASE_2_REVIEW.md recommendations #1 and #2
Implement comprehensive cache validation tool to help diagnose incremental sync issues and cache inconsistencies. Features: 1. Verify all cached output files exist on disk - Detects missing files that should be regenerated - Reports which pages have missing outputs 2. Find orphaned markdown files - Scans docs/ directory for files not in cache - Helps identify manually created or old files - Warns about files that won't be updated 3. Verify script hash matches current code - Checks if scripts changed since last sync - Indicates if next run will do full rebuild - Shows hash mismatch details 4. Statistics and summary - Total pages in cache - Count of missing/orphaned files - Script hash status - Exit code 0 if valid, 1 if issues found Usage: bun run notion:validate-cache # Quick check bun run notion:validate-cache --verbose # Detailed output Example Output: 🔍 Validating Page Metadata Cache... ✅ All 156 cached files exist⚠️ Found 3 orphaned markdown files ✅ Script hash matches Benefits: - Diagnose why pages aren't updating - Find stale/orphaned content files - Verify cache integrity - Quick health check before/after sync Related: ISSUE_95_PLAN.md Phase 3
Comprehensive improvements based on thorough code review: 1. Fix Dead Code - Remove Unused Import - Comment out filterChangedPages import in generateBlocks.ts - Add NOTE explaining it's not used (inline logic preferred) - Add TODO comment at inline logic location - Prevents confusion about why function exists but isn't called 2. Enhanced Timeout Documentation - Add comprehensive comments explaining 10s → 30s change - Document rationale: reduce false-positive timeouts - Note trade-off: slower failure vs fewer skipped pages - Add monitoring recommendation for timeout patterns - Reference Issue #95 and implementation date 3. Add Telemetry for Unreachable Branches - Add console.error() for impossible ERROR conditions - Prominent red warning if logic bug occurs - Actionable message directing to report issue - References specific line numbers for debugging - Makes logic bugs immediately visible in production 4. Fix Path Normalization Consistency - Add normalizePath() helper function - Normalize all paths before adding to Set - Normalize all paths before comparison - Eliminates false positives from format differences - Handles Windows backslashes, leading slashes 5. Document Race Condition Risk - Add comprehensive JSDoc warning - List safe vs unsafe times to run validation - Prevent inaccurate results from concurrent operations - Guides users on proper usage Benefits: - Eliminates all code review concerns - Adds defensive programming - Improves production observability - Prevents future bugs - Better documentation for maintainers All changes are non-breaking and defensive in nature. Related: Code review recommendations for Issue #95
Contributor
Author
|
This is a duplicate of #112 |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Fixes #95