Skip to content

Conversation

Copy link
Contributor

Copilot AI commented Dec 5, 2025

  • Add AddRange method to ArrayBuilder<T> in /src/libraries/Common/src/System/Collections/Generic/ArrayBuilder.cs
  • Refactor Directory.cs to use ArrayBuilder<T> with AddRange instead of new List<T>(enumerable).ToArray()
  • Refactor DirectoryInfo.cs to use ArrayBuilder<T> with AddRange instead of new List<T>(enumerable).ToArray()
  • Add tests for ArrayBuilder<T>.AddRange method
  • Optimize AddRange to pre-allocate for ICollection<T> inputs and use ICollection.CopyTo for better performance
  • Add AddRange method to NativeAOT's ArrayBuilder<T> in /src/coreclr/tools/Common/System/Collections/Generic/ArrayBuilder.cs
  • Rename EnsureCapacity to Grow in Common ArrayBuilder for clarity (the method expects capacity checks before calling)
  • Add tests specifically for ICollection<T> optimization:
    • AddRange_ICollection_PreallocatesCapacity - verifies capacity is pre-allocated for ICollection inputs
    • AddRange_ICollection_EmptyCollection - verifies empty ICollection works correctly
    • AddRange_ICollection_AfterExistingItems - verifies ICollection optimization works when items already exist
    • AddRange_NonICollection_GrowsIncrementally - verifies non-ICollection path grows incrementally
  • Build and test to verify changes work correctly
Original prompt

There are a few locations in src/libraries/System.Private.CoreLib/src/System/IO/Directory.cs src/System/IO/DirectoryInfo.cs, and src/libraries/System.Private.CoreLib/src/System/Threading/Timer.cs that have code of the pattern new List<concretetype>(enumerableData).ToArray(). They do this because they're in System.Private.Corelib, and don't have access to LINQ's ToArray method. However, these locations can be improved by using System.Collections.Generic.ArrayBuilder<T>, which is in corelib and can potentially avoid intermediate allocations. Refactor these locations to use ArrayBuilder<T>, and do it by adding an AddRange helper to ArrayBuilder to encapsulate the enumeration loop and keep callsites fairly simple. You don't need to search for any instances of this code pattern outside the files I've listed; they exist, but they're all tests and don't really need to be adjusted now.


✨ Let Copilot coding agent set things up for you — coding agent works faster and does higher quality work when set up for your repo.

Copilot AI and others added 2 commits December 5, 2025 02:48
…Info, and Timer to use it

Co-authored-by: 333fred <2371880+333fred@users.noreply.github.com>
Co-authored-by: 333fred <2371880+333fred@users.noreply.github.com>
Copilot AI changed the title [WIP] Refactor code to use ArrayBuilder for better performance Add ArrayBuilder.AddRange and use it instead of List.ToArray pattern Dec 5, 2025
Copilot AI requested a review from 333fred December 5, 2025 03:00
Co-authored-by: 333fred <2371880+333fred@users.noreply.github.com>
Copilot AI requested a review from 333fred December 5, 2025 03:35
@333fred 333fred marked this pull request as ready for review December 5, 2025 04:14
Copilot AI review requested due to automatic review settings December 5, 2025 04:14
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds an AddRange method to ArrayBuilder<T> to improve performance in locations that previously used the new List<T>(enumerable).ToArray() pattern. The implementation optimizes for ICollection<T> inputs by pre-allocating capacity, while falling back to incremental growth for general IEnumerable<T> inputs.

Key changes:

  • Added AddRange(IEnumerable<T>) method to both Common and NativeAOT ArrayBuilder<T> implementations with ICollection<T> optimization
  • Refactored I/O methods in Directory.cs, DirectoryInfo.cs, and Timer.cs to use ArrayBuilder<T>.AddRange() instead of List<T>.ToArray()
  • Added comprehensive test coverage including tests for the ICollection optimization path, empty collections, and incremental growth

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
src/libraries/Common/src/System/Collections/Generic/ArrayBuilder.cs Adds AddRange method with ICollection optimization for pre-allocating capacity
src/coreclr/tools/Common/System/Collections/Generic/ArrayBuilder.cs Adds AddRange method to NativeAOT's ArrayBuilder with same optimization
src/libraries/System.Private.CoreLib/src/System/IO/Directory.cs Refactors three methods (GetFiles, GetDirectories, GetFileSystemEntries) to use ArrayBuilder.AddRange instead of List.ToArray pattern
src/libraries/System.Private.CoreLib/src/System/IO/DirectoryInfo.cs Refactors three methods (GetFiles, GetFileSystemInfos, GetDirectories) to use ArrayBuilder.AddRange instead of List.ToArray pattern
src/libraries/System.Private.CoreLib/src/System/Threading/Timer.cs Refactors debugger proxy property to use ArrayBuilder.AddRange instead of List.ToArray pattern
src/libraries/Common/tests/Tests/System/Collections/Generic/ArrayBuilderTests.cs Adds comprehensive test coverage for AddRange including ICollection optimization, empty collections, and non-ICollection paths


[DebuggerBrowsable(DebuggerBrowsableState.RootHidden)]
public TimerQueueTimer[] Items => new List<TimerQueueTimer>(_queue.GetTimersForDebugger()).ToArray();
public TimerQueueTimer[] Items
Copy link
Member

@stephentoub stephentoub Dec 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is in a debugger type proxy.... the extra allocations here don't matter.... better to keep the code simpler.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@copilot please make this change.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reverted Timer.cs to use the original List approach in 9491f52.

foreach (T item in collection)
{
UncheckedAdd(item);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a deoptimization over using List, as it's copying element by element via enumeration rather than using ICollection.CopyTo.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@copilot please make this change.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed to use ICollection.CopyTo instead of element-by-element enumeration in 9491f52. Both the Common and NativeAOT ArrayBuilder implementations now use CopyTo for better performance.

{
Debug.Assert(items != null);

if (items is ICollection<T> collection)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@333fred, all of the cited uses won't have an ICollection, as they all appear to be iterators. Are there other callers where this code will be used?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There aren't currently. However, if AddRange is an overload that is offered on ArrayBuilder, then it seems likely that at some point, someone will try to call it with an ICollection; while I didn't suggest that Copilot add this optimization, it seems like a reasonable thing to keep (assuming, of course, it uses CopyTo.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The internal helper types like ArrayBuilder are implemented to be minimal. They are often not behaving like you would expect from a public BCL API and they tend to expose methods to squeeze bits of extra performance (e.g. ArrayBuilder.UncheckedAdd).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm happy to go either way here. Just tell whether you want me to keep the optimization around ICollection or not 🙂.

@stephentoub
Copy link
Member

stephentoub commented Dec 7, 2025

@333fred, I'm wondering if the additional complexity here is warranted. The possible allocation savings are:

  • the list object itself
  • the array backing the list oif it ends up being of exactly the right size

Presumably the latter rarely occurs, so this is primarily about avoiding the list object on code paths that are already allocating a bunch of FileSystemInfo objects and an array to store them all?

@333fred
Copy link
Member

333fred commented Dec 8, 2025

Presumably the latter rarely occurs, so this is primarily about avoiding the list object on code paths that are already allocating a bunch of FileSystemInfo objects and an array to store them all?

Right. Perfectly fine if we want to just keep the Lists here, but it seemed like a reasonable potential perf optimization.

Co-authored-by: 333fred <2371880+333fred@users.noreply.github.com>
Co-authored-by: 333fred <2371880+333fred@users.noreply.github.com>
@tannergooding
Copy link
Member

tannergooding commented Dec 9, 2025

The possible allocation savings are:

I'm not sure the allocation savings are particularly meaningful, except maybe in some more niche edge cases. I think the bigger thing is avoiding repeated copies by ensuring that the backing collection is big enough.

Allocations really aren't that expensive. The GC is also very good at its job, we have plenty of analyzers and tools to detect allocations when they do get meaningful, etc.

The copy cost, however, can be much more important as it can impact the rest of the program due to how it invalidates the cache and involves touching two or more times the memory than needed. I think this is more what makes this a meaningful thing to handle

The backing collection isn't likely to be the right size, so at least one copy is likely needed. But there's still all the intermediate growth copies that happen in the interim as we go 4->8->16->32... items. At 8x int, we're touching at least a full cache line of memory (and likely a couple due to alignment, allocations being in different areas, etc). Then we double that at each growth boundary.

@stephentoub
Copy link
Member

The possible allocation savings are:

I'm not sure the allocation savings are particularly meaningful, except maybe in some more niche edge cases. I think the bigger thing is avoiding repeated copies by ensuring that the backing collection is big enough.

Allocations really aren't that expensive. The GC is also very good at its job, we have plenty of analyzers and tools to detect allocations when they do get meaningful, etc.

The copy cost, however, can be much more important as it can impact the rest of the program due to how it invalidates the cache and involves touching two or more times the memory than needed. I think this is more what makes this a meaningful thing to handle

How does this meaningfully help avoid copies? There's no difference here in presizing, and the only time a copy will be avoided is if the resulting array happens to be exactly the same size as what's required, which is unlikely.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants