Merged
123 changes: 100 additions & 23 deletions src/ProjGraph.Lib/Services/EfAnalysis/EntityFileDiscovery.cs
@@ -16,6 +16,11 @@ namespace ProjGraph.Lib.Services.EfAnalysis;
/// </remarks>
public static class EntityFileDiscovery
{
/// <summary>
/// Maximum recursion depth when searching for base class files to prevent infinite recursion
Copilot AI Feb 3, 2026

The documentation for MaxSearchDepth indicates it's only used for "base class files", but this constant should also apply to entity file searching in SearchDirectoryRecursiveAsync to prevent unbounded recursion. Update the documentation to reflect that this constant applies to all recursive file search operations in this class.

Suggested change
- /// Maximum recursion depth when searching for base class files to prevent infinite recursion
+ /// Maximum recursion depth for recursive file search operations in this class to prevent infinite recursion

Copilot uses AI. Check for mistakes.
/// and limit search scope to reasonable project structures.
/// </summary>
private const int MaxSearchDepth = 10;
/// <summary>
/// Discovers the file paths of entity files within the specified search directories.
/// </summary>
@@ -173,37 +178,69 @@ public static HashSet<string> ExtractEntityTypeNames(ClassDeclarationSyntax cont
/// For each file, it calls <see cref="ProcessSourceFileAsync"/> to process the file and add matching
/// entity type names and their file paths to the <paramref name="entityFiles"/> dictionary.
/// Any access errors encountered during directory traversal are ignored.
/// Directories like bin, obj, .git, and node_modules are skipped during traversal for performance.
/// </remarks>
private static async Task SearchDirectoryForEntitiesAsync(
string searchDir,
HashSet<string> entityTypeNames,
string normalizedContextPath,
Dictionary<string, string> entityFiles)
{
Copilot AI Feb 3, 2026

The SearchDirectoryForEntitiesAsync method is now a thin wrapper that calls SearchDirectoryRecursiveAsync without adding any logic of its own. Consider either removing this wrapper and calling SearchDirectoryRecursiveAsync directly from DiscoverEntityFilesAsync (line 49), or adding depth parameters to SearchDirectoryRecursiveAsync and initializing them in this wrapper so that it serves a clear purpose.

Suggested change
- {
+ {
+ // Basic validation to ensure we do not attempt to search with invalid inputs.
+ if (string.IsNullOrWhiteSpace(searchDir) ||
+ entityTypeNames == null ||
+ entityFiles == null)
+ {
+ return;
+ }
+ // Avoid starting a recursive search on a directory that does not exist.
+ if (!System.IO.Directory.Exists(searchDir))
+ {
+ return;
+ }
await SearchDirectoryRecursiveAsync(searchDir, entityTypeNames, normalizedContextPath, entityFiles);
}

/// <summary>
/// Recursively searches a directory for C# files, skipping common build and version control directories.
/// </summary>
/// <param name="currentDir">The current directory to search.</param>
/// <param name="entityTypeNames">A set of entity type names to search for in the C# files.</param>
/// <param name="normalizedContextPath">The normalized file path of the context file to exclude from the search.</param>
/// <param name="entityFiles">
/// A dictionary where the keys are entity type names and the values are the corresponding file paths.
/// </param>
/// <returns>A task that represents the asynchronous operation.</returns>
/// <remarks>
/// This method implements manual recursion to avoid descending into directories that typically
/// contain build artifacts or dependencies (bin, obj, .git, node_modules), improving performance
/// for large projects.
/// </remarks>
private static async Task SearchDirectoryRecursiveAsync(
string currentDir,
HashSet<string> entityTypeNames,
string normalizedContextPath,
Dictionary<string, string> entityFiles)
Comment on lines +207 to +211
Copilot AI Feb 3, 2026

The SearchDirectoryRecursiveAsync method lacks depth tracking and maximum depth enforcement, unlike SearchForBaseClassFilesRecursive, which has these safeguards. This is inconsistent and risks unbounded recursion if the file system contains symbolic links or deeply nested directory structures. Consider adding currentDepth and maxDepth parameters to match the pattern used in SearchForBaseClassFilesRecursive, and adding an early-return check similar to the one at line 461.
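
A minimal sketch of the depth guard described above, written as a standalone program rather than a patch to this file. `CollectCSharpFiles` is a hypothetical stand-in for SearchDirectoryRecursiveAsync (synchronous, and it only collects paths); the `currentDepth`/`maxDepth` parameters and the temporary directory layout are assumptions for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Build a throwaway tree: root/A.cs, root/level1/B.cs, root/level1/level2/C.cs
var root = Path.Combine(Path.GetTempPath(), Path.GetRandomFileName());
var level2 = Path.Combine(root, "level1", "level2");
Directory.CreateDirectory(level2);
File.WriteAllText(Path.Combine(root, "A.cs"), "");
File.WriteAllText(Path.Combine(root, "level1", "B.cs"), "");
File.WriteAllText(Path.Combine(level2, "C.cs"), "");

var found = new List<string>();
CollectCSharpFiles(root, found, currentDepth: 0, maxDepth: 2);
Console.WriteLine(found.Count); // 2: C.cs sits below the depth limit

Directory.Delete(root, recursive: true);

// Hypothetical helper: collects *.cs files and stops descending
// once currentDepth reaches maxDepth.
static void CollectCSharpFiles(string currentDir, List<string> found, int currentDepth, int maxDepth)
{
    if (currentDepth >= maxDepth)
    {
        return; // same guard shape as SearchForBaseClassFilesRecursive
    }

    var options = new EnumerationOptions
    {
        RecurseSubdirectories = false,
        IgnoreInaccessible = true,
        AttributesToSkip = FileAttributes.System
    };

    // Files directly inside the current directory.
    foreach (var file in Directory.EnumerateFiles(currentDir, "*.cs", options))
    {
        found.Add(Path.GetFullPath(file));
    }

    // Descend one level at a time, incrementing the depth counter.
    foreach (var subDir in Directory.EnumerateDirectories(currentDir, "*", options))
    {
        CollectCSharpFiles(subDir, found, currentDepth + 1, maxDepth);
    }
}
```

In the real method the wrapper would presumably pass `0` and `MaxSearchDepth`, as SearchForBaseClassFiles already does for the base class search.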
{
try
{
// Process files in the current directory
var options = new EnumerationOptions
{
RecurseSubdirectories = true, IgnoreInaccessible = true, AttributesToSkip = FileAttributes.System
RecurseSubdirectories = false,
IgnoreInaccessible = true,
AttributesToSkip = FileAttributes.System
};

foreach (var csFile in Directory.EnumerateFiles(searchDir, EfAnalysisConstants.FilePatterns.CSharpFiles,
options))
foreach (var csFile in Directory.EnumerateFiles(currentDir, EfAnalysisConstants.FilePatterns.CSharpFiles, options))
{
var fullPath = Path.GetFullPath(csFile);
if (fullPath.Equals(normalizedContextPath, StringComparison.OrdinalIgnoreCase))
{
continue;
}

// Skip common non-source directories that can be large
var pathSegments = fullPath.Split(Path.DirectorySeparatorChar, Path.AltDirectorySeparatorChar);
if (pathSegments.Any(s => s is "bin" or "obj" or ".git" or "node_modules"))
await ProcessSourceFileAsync(fullPath, entityTypeNames, entityFiles);
}
Comment on lines +223 to +232
Copilot AI Feb 3, 2026

This foreach loop immediately maps its iteration variable to another variable; consider mapping the sequence explicitly using '.Select(...)'.
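
A sketch of what the `.Select(...)` form might look like, as a standalone program. The temporary directory and the `Context.cs`/`Order.cs` file names are made up for illustration; `normalizedContextPath` plays the same role as the parameter in the diff.

```csharp
using System;
using System.IO;
using System.Linq;

// Throwaway directory with a "context" file and one other source file.
var dir = Path.Combine(Path.GetTempPath(), Path.GetRandomFileName());
Directory.CreateDirectory(dir);
File.WriteAllText(Path.Combine(dir, "Context.cs"), "");
File.WriteAllText(Path.Combine(dir, "Order.cs"), "");

// Stand-in for the normalizedContextPath parameter in the diff.
var normalizedContextPath = Path.GetFullPath(Path.Combine(dir, "Context.cs"));

// Map each file to its full path up front, then filter out the context file,
// instead of doing both inside the loop body.
var candidates = Directory.EnumerateFiles(dir, "*.cs")
    .Select(csFile => Path.GetFullPath(csFile))
    .Where(fullPath => !fullPath.Equals(normalizedContextPath, StringComparison.OrdinalIgnoreCase))
    .ToList();

foreach (var fullPath in candidates)
{
    Console.WriteLine(Path.GetFileName(fullPath)); // prints Order.cs
}

Directory.Delete(dir, recursive: true);
```

The loop body can still `await ProcessSourceFileAsync(...)`, since awaiting inside a foreach over a LINQ sequence is allowed.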

// Recursively process subdirectories, skipping excluded directories
foreach (var subDir in Directory.EnumerateDirectories(currentDir, "*", options))
{
var dirName = Path.GetFileName(subDir);
if (dirName is "bin" or "obj" or ".git" or "node_modules")
Copilot AI Feb 3, 2026

Directory name filtering uses pattern matching, which is case-sensitive ("bin" or "obj" or ".git" or "node_modules"). On Windows, directory names are case-insensitive, so "Bin", "BIN", "Obj", "OBJ" would not be filtered out. Consider using a case-insensitive comparison, similar to how path comparisons are handled elsewhere in the codebase (e.g., line 226 uses StringComparison.OrdinalIgnoreCase). Example: if (dirName.Equals("bin", StringComparison.OrdinalIgnoreCase) || dirName.Equals("obj", StringComparison.OrdinalIgnoreCase) || ...)

Suggested change
- if (dirName is "bin" or "obj" or ".git" or "node_modules")
+ if (string.Equals(dirName, "bin", System.StringComparison.OrdinalIgnoreCase)
+ || string.Equals(dirName, "obj", System.StringComparison.OrdinalIgnoreCase)
+ || string.Equals(dirName, ".git", System.StringComparison.OrdinalIgnoreCase)
+ || string.Equals(dirName, "node_modules", System.StringComparison.OrdinalIgnoreCase))
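
As an alternative to the chained comparisons in the suggestion, the excluded names could live in a single case-insensitive set. This is only a sketch, and `excludedDirectories` is a hypothetical name that does not appear in the PR.

```csharp
using System;
using System.Collections.Generic;

// One shared set with an ordinal, case-insensitive comparer.
var excludedDirectories = new HashSet<string>(StringComparer.OrdinalIgnoreCase)
{
    "bin", "obj", ".git", "node_modules"
};

Console.WriteLine(excludedDirectories.Contains("BIN")); // True
Console.WriteLine(excludedDirectories.Contains("Obj")); // True
Console.WriteLine(excludedDirectories.Contains("src")); // False
```

Both recursive methods could then share the same check, `if (excludedDirectories.Contains(dirName)) continue;`, which would also keep the two exclusion lists from drifting apart.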
{
continue;
}

await ProcessSourceFileAsync(fullPath, entityTypeNames, entityFiles);
await SearchDirectoryRecursiveAsync(subDir, entityTypeNames, normalizedContextPath, entityFiles);
Comment on lines +207 to +243
Copilot AI Feb 3, 2026

The file WorkspaceTypeDiscovery.cs (lines 61-71) still uses the old pattern of RecurseSubdirectories = true with post-traversal filtering. For consistency and performance, consider applying the same optimization pattern used in this PR to that file as well.
}
Comment on lines +234 to 244
Copilot AI Feb 3, 2026

SearchDirectoryRecursiveAsync should check if all entities have been found and return early, similar to the optimization in SearchForBaseClassFilesRecursive (lines 484-487, 502-505). This would prevent unnecessary traversal once all entity files have been discovered. Consider adding a check after processing files and before recursing into subdirectories: if (entityTypeNames.All(entityFiles.ContainsKey)) return;
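
A small sketch of the proposed completion check in isolation. `AllEntitiesFound` is a hypothetical helper name, and the entity names and paths are made up; it assumes, as the XML docs in this file state, that the dictionary is keyed by entity type name.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var entityTypeNames = new HashSet<string> { "Order", "Customer" };
var entityFiles = new Dictionary<string, string> { ["Order"] = "/src/Order.cs" };

// Only one of the two entities has been located so far.
var before = AllEntitiesFound(entityTypeNames, entityFiles);
Console.WriteLine(before); // False

entityFiles["Customer"] = "/src/Customer.cs";

// Every requested entity now has a file, so traversal could stop.
var after = AllEntitiesFound(entityTypeNames, entityFiles);
Console.WriteLine(after); // True

// Hypothetical helper: true once every requested entity name is a key in the dictionary.
static bool AllEntitiesFound(HashSet<string> names, Dictionary<string, string> files) =>
    names.All(files.ContainsKey);
```

Unlike the `Count == Count` comparison used in SearchForBaseClassFilesRecursive, this check still holds if the dictionary ever contains keys outside the requested set, at the cost of a linear scan per call.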
}
catch
@@ -386,51 +423,91 @@ private static DirectoryInfo FindSolutionRoot(string startDirectory, int maxLeve
/// if the files are found; otherwise, the dictionary will be empty.
/// </returns>
/// <remarks>
/// This method iterates through the provided base class names and attempts to locate their corresponding
/// file paths by calling the <see cref="TryFindBaseClassFile"/> method. If a file is found, it is added
/// to the resulting dictionary. If no file is found for a base class name, it is skipped.
/// This method recursively searches for base class files while skipping common build and version control
/// directories (bin, obj, .git, node_modules) to improve performance for large projects.
/// </remarks>
public static Dictionary<string, string> SearchForBaseClassFiles(
HashSet<string> baseClassNames,
DirectoryInfo solutionRoot)
{
var baseClassFiles = new Dictionary<string, string>();
SearchForBaseClassFilesRecursive(solutionRoot.FullName, baseClassNames, baseClassFiles, 0, MaxSearchDepth);
return baseClassFiles;
}

/// <summary>
/// Recursively searches for base class files, skipping common build and version control directories.
/// </summary>
/// <param name="currentDir">The current directory to search.</param>
/// <param name="baseClassNames">A set of base class names to search for.</param>
/// <param name="baseClassFiles">
/// A dictionary to store the found base class files where the keys are base class names
/// and the values are the corresponding file paths.
/// </param>
/// <param name="currentDepth">The current recursion depth.</param>
/// <param name="maxDepth">The maximum recursion depth to prevent infinite recursion.</param>
/// <remarks>
/// This method implements manual recursion to avoid descending into directories that typically
/// contain build artifacts or dependencies (bin, obj, .git, node_modules), improving performance
/// for large projects.
/// </remarks>
private static void SearchForBaseClassFilesRecursive(
string currentDir,
HashSet<string> baseClassNames,
Dictionary<string, string> baseClassFiles,
int currentDepth,
int maxDepth)
{
if (currentDepth >= maxDepth || baseClassFiles.Count == baseClassNames.Count)
{
return;
}

try
{
var options = new EnumerationOptions
{
RecurseSubdirectories = true, IgnoreInaccessible = true, MaxRecursionDepth = 10
RecurseSubdirectories = false,
IgnoreInaccessible = true,
AttributesToSkip = FileAttributes.System
};

foreach (var file in Directory.EnumerateFiles(solutionRoot.FullName, "*.cs", options))
// Process files in the current directory
foreach (var file in Directory.EnumerateFiles(currentDir, "*.cs", options))
{
var fullPath = Path.GetFullPath(file);
// Skip common non-source directories that can be large
var pathSegments = fullPath.Split(Path.DirectorySeparatorChar, Path.AltDirectorySeparatorChar);
if (pathSegments.Any(s => s is "bin" or "obj" or ".git" or "node_modules"))
var fileName = Path.GetFileNameWithoutExtension(file);
if (baseClassNames.Contains(fileName))
{
continue;
var fullPath = Path.GetFullPath(file);
baseClassFiles.TryAdd(fileName, fullPath);

if (baseClassFiles.Count == baseClassNames.Count)
{
return;
}
}
}

var fileName = Path.GetFileNameWithoutExtension(file);
if (!baseClassNames.Contains(fileName))
// Recursively process subdirectories, skipping excluded directories
foreach (var subDir in Directory.EnumerateDirectories(currentDir, "*", options))
{
var dirName = Path.GetFileName(subDir);
if (dirName is "bin" or "obj" or ".git" or "node_modules")
Copilot AI Feb 3, 2026

Directory name filtering uses pattern matching, which is case-sensitive ("bin" or "obj" or ".git" or "node_modules"). On Windows, directory names are case-insensitive, so "Bin", "BIN", "Obj", "OBJ" would not be filtered out. Consider using a case-insensitive comparison, similar to how path comparisons are handled elsewhere in the codebase (e.g., line 226 uses StringComparison.OrdinalIgnoreCase). Example: if (dirName.Equals("bin", StringComparison.OrdinalIgnoreCase) || dirName.Equals("obj", StringComparison.OrdinalIgnoreCase) || ...)

Suggested change
- if (dirName is "bin" or "obj" or ".git" or "node_modules")
+ if (dirName.Equals("bin", System.StringComparison.OrdinalIgnoreCase)
+ || dirName.Equals("obj", System.StringComparison.OrdinalIgnoreCase)
+ || dirName.Equals(".git", System.StringComparison.OrdinalIgnoreCase)
+ || dirName.Equals("node_modules", System.StringComparison.OrdinalIgnoreCase))
{
continue;
}

baseClassFiles.TryAdd(fileName, fullPath);
SearchForBaseClassFilesRecursive(subDir, baseClassNames, baseClassFiles, currentDepth + 1, maxDepth);

if (baseClassFiles.Count == baseClassNames.Count)
{
break;
return;
}
}
}
catch
{
// Ignore access errors
}

return baseClassFiles;
}
}