-
Notifications
You must be signed in to change notification settings - Fork 0
Optimize directory traversal to skip excluded directories during enumeration #15
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Optimize directory traversal to skip excluded directories during enumeration #15
Conversation
… during traversal Co-authored-by: HandyS11 <62420910+HandyS11@users.noreply.github.com>
…efaults Co-authored-by: HandyS11 <62420910+HandyS11@users.noreply.github.com>
|
@copilot fix the conflicts |
Co-authored-by: HandyS11 <62420910+HandyS11@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR aims to reduce unnecessary I/O during EF entity/base-class discovery by avoiding traversal into commonly excluded directories (e.g., bin, obj, .git, node_modules) and updates EF relationship analysis behavior for self-references.
Changes:
- Refactors entity and base-class file discovery to use manual recursion that skips excluded directories before descending.
- Adds/uses a recursion depth cap for base-class discovery and includes early termination when all base classes are found.
- Changes self-referencing relationship classification in
RelationshipAnalyzerand updates the corresponding unit test.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
src/ProjGraph.Lib/Services/EfAnalysis/EntityFileDiscovery.cs |
Implements manual directory traversal to skip excluded directories and refactors base-class searching with a depth limit. |
src/ProjGraph.Lib/Services/EfAnalysis/RelationshipAnalyzer.cs |
Adjusts reference-navigation relationship typing (notably self-reference handling). |
tests/ProjGraph.Tests.Unit/Services/EfAnalysis/EfAnalysisAdvancedTests.cs |
Updates the expected relationship type for the self-referencing entity test. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| foreach (var csFile in Directory.EnumerateFiles(currentDir, EfAnalysisConstants.FilePatterns.CSharpFiles, options)) | ||
| { | ||
| var fullPath = Path.GetFullPath(csFile); | ||
| if (fullPath.Equals(normalizedContextPath, StringComparison.OrdinalIgnoreCase)) | ||
| { | ||
| continue; | ||
| } | ||
|
|
||
| // Skip common non-source directories that can be large | ||
| var pathSegments = fullPath.Split(Path.DirectorySeparatorChar, Path.AltDirectorySeparatorChar); | ||
| if (pathSegments.Any(s => s is "bin" or "obj" or ".git" or "node_modules")) | ||
| await ProcessSourceFileAsync(fullPath, entityTypeNames, entityFiles); | ||
| } | ||
|
|
||
| // Recursively process subdirectories, skipping excluded directories | ||
| foreach (var subDir in Directory.EnumerateDirectories(currentDir, "*", options)) | ||
| { | ||
| var dirName = Path.GetFileName(subDir); | ||
| if (dirName is "bin" or "obj" or ".git" or "node_modules") | ||
| { | ||
| continue; | ||
| } | ||
|
|
||
| await ProcessSourceFileAsync(fullPath, entityTypeNames, entityFiles); | ||
| await SearchDirectoryRecursiveAsync(subDir, entityTypeNames, normalizedContextPath, entityFiles); | ||
| } |
Copilot
AI
Feb 3, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Within SearchDirectoryRecursiveAsync, traversal continues even after all entityTypeNames have been found, which reduces the benefit of early directory skipping on large trees. Add an early-exit check (e.g., after processing files and before descending into each subdir) to return once entityTypeNames are all present in entityFiles.
| relationship.SourceEntity = targetEntity.Name; | ||
| relationship.TargetEntity = sourceEntity.Name; | ||
| } | ||
| else if (NavigationPropertyAnalyzer.HasInverseReference(prop, targetType) || | ||
| sourceEntity.Name == targetEntity.Name) | ||
| else if (NavigationPropertyAnalyzer.HasInverseReference(prop, targetType)) | ||
| { | ||
| // If it has an inverse reference OR it's a self-reference, treat as One-to-One | ||
| // If it has an inverse reference (including self-references with an explicit inverse), treat as One-to-One | ||
| relationship.Type = EfRelationshipType.OneToOne; |
Copilot
AI
Feb 3, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR changes relationship classification behavior (self-references without an inverse are no longer treated as OneToOne). That behavioral change isn’t mentioned in the PR title/description, which currently focuses on directory traversal performance. Please document this behavior change (or split it into a separate PR) so reviewers/users can evaluate it independently.
| else if (NavigationPropertyAnalyzer.HasInverseReference(prop, targetType)) | ||
| { | ||
| // If it has an inverse reference OR it's a self-reference, treat as One-to-One | ||
| // If it has an inverse reference (including self-references with an explicit inverse), treat as One-to-One |
Copilot
AI
Feb 3, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comment says "including self-references with an explicit inverse", but the code checks only for the presence of any inverse reference navigation property (it does not verify an explicit inverse configuration like [InverseProperty]). Please reword the comment to match what the analyzer actually detects.
| // If it has an inverse reference (including self-references with an explicit inverse), treat as One-to-One | |
| // If there is an inverse reference navigation property, treat as One-to-One |
| foreach (var subDir in Directory.EnumerateDirectories(currentDir, "*", options)) | ||
| { | ||
| var dirName = Path.GetFileName(subDir); | ||
| if (dirName is "bin" or "obj" or ".git" or "node_modules") | ||
| { |
Copilot
AI
Feb 3, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The excluded-directory list ("bin", "obj", ".git", "node_modules") is duplicated in both the entity search and base-class search. To avoid future divergence and make additions easier, consider centralizing this into a single constant/HashSet (ideally with an explicit StringComparer) and reusing it in both places.
| /// <summary> | ||
| /// Maximum recursion depth when searching for base class files to prevent infinite recursion | ||
| /// and limit search scope to reasonable project structures. | ||
| /// </summary> | ||
| private const int MaxSearchDepth = 10; |
Copilot
AI
Feb 3, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PR description mentions a recursion depth cap of 100, but the new MaxSearchDepth constant is set to 10. Please confirm the intended limit and either update the constant or adjust the PR description/docs so they match the actual behavior.
| private static async Task SearchDirectoryRecursiveAsync( | ||
| string currentDir, | ||
| HashSet<string> entityTypeNames, | ||
| string normalizedContextPath, | ||
| Dictionary<string, string> entityFiles) |
Copilot
AI
Feb 3, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SearchDirectoryRecursiveAsync introduces unbounded call-stack recursion over the filesystem (no max depth / no reparse-point guard). This can blow the stack on unusually deep directory trees or get stuck following cyclic links. Consider adding a depth parameter + limit (similar to the base-class search) and/or skipping reparse points (FileAttributes.ReparsePoint) / switching to an explicit stack-based traversal.
| foreach (var csFile in Directory.EnumerateFiles(currentDir, EfAnalysisConstants.FilePatterns.CSharpFiles, options)) | ||
| { | ||
| var fullPath = Path.GetFullPath(csFile); | ||
| if (fullPath.Equals(normalizedContextPath, StringComparison.OrdinalIgnoreCase)) | ||
| { | ||
| continue; | ||
| } | ||
|
|
||
| // Skip common non-source directories that can be large | ||
| var pathSegments = fullPath.Split(Path.DirectorySeparatorChar, Path.AltDirectorySeparatorChar); | ||
| if (pathSegments.Any(s => s is "bin" or "obj" or ".git" or "node_modules")) | ||
| await ProcessSourceFileAsync(fullPath, entityTypeNames, entityFiles); | ||
| } |
Copilot
AI
Feb 3, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This foreach loop immediately maps its iteration variable to another variable - consider mapping the sequence explicitly using '.Select(...)'.
Addresses inefficient directory enumeration in
EntityFileDiscovery.cswherebin,obj,.git, andnode_modulesdirectories were being fully enumerated before filtering, causing unnecessary I/O in large projects.Changes
SearchDirectoryRecursivelyAsync: Manual recursive traversal that checks directory names before descending, preventing enumeration of excluded paths entirelySearchDirectoryForEntitiesAsync: ReplacedDirectory.EnumerateFileswith recursive enumeration + post-filter pattern with upfront directory filteringSearchForBaseClassFiles: Same pattern with early termination when all base classes are foundBefore/After
Before:
After:
Performance improvement scales with size of excluded directories.
💬 We'd love your input! Share your thoughts on Copilot coding agent in our 2 minute survey.