Skip to content

Conversation

@dscho
Copy link
Member

@dscho dscho commented Dec 18, 2025

Similar to 1475d9f (mimalloc: use "weak" random seed when statically linked, 2023-05-12) (which was actually already applied upstream in microsoft/mimalloc@3e1d800e (potential fix for windows static linking with thread creation in dll's, 2022-11-07), this ensures that another code path uses "weak" random seeds when initializing mimalloc.

The scenario when this happens is likely the exact same as back when the original fix was implemented (but I can only hypothesize about this, as I was unable to find a reproducer in my setup, and even the reporters could not find a reliable reproducer, either): an atexit() handler is called after mimalloc's own atexit() handler deinitializes mimalloc. This atexit() handler (namely post_command_hook_atexit()) then calls getenv(), which internally resolves to mingw_getenv(), which wants to calloc() the result so that it is not overwritten while in use. This, in turn, uses mimalloc, which dutifully initializes the random seed. It apparently takes a different code path than back in 2023, though, where it now tries to initialize a strong random generator implemented in bcrypt.dll. However, that initialization fails because it happens during the phase when only atexit() handlers are called, and bcrypt has obviously already been deinitialized, in a way that causes a crash.

The stack trace in question looks like this:

ntdll!ZwWaitForMultipleObjects+0x140 ...
KERNELBASE!WaitForMultipleObjectsEx+0x123 ...
KERNELBASE!WaitForMultipleObjects+0x11 ...
kernel32!WerpReportFaultInternal+0x62c ...
kernel32!WerpReportFault+0xc5 ...
KERNELBASE!UnhandledExceptionFilter+0x34c ...
ntdll!RtlpThreadExceptionFilter+0x2e ...
ntdll!RtlUserThreadStart$filt$0+0x3f0 ...
ntdll!_C_specific_handler+0x93 ...
ntdll!RtlpExecuteHandlerForException+0xf ...
ntdll!RtlDispatchException+0x437 ...
ntdll!KiUserExceptionDispatch+0x2e() ...
bcryptPrimitives!AesRNGState_generate+0x79 ...
bcryptPrimitives!GenRandomAes+0xf70 ...
bcryptPrimitives!MSCryptGenRandom+0x15f ...
bcrypt!BCryptGenRandom+0x7a ...
bcrypt!BCryptGenSysternPreferredRandom+0x340 ....
bcrypt!BCryptGenRandom+0x119 ...
git!_mi_prim_random_buf+0x23() [compat\mimalloc\prim\windows\prim.c @ 648]
git!mi_random_init_ex+0x61() [compat\mimalloc\random.c @ 179]
git!_mi_heap_init+0xe8() [compat\mimalloc\heap.c @ 226]
git!mi_thread_init+0x21b() [compat\mimalloc\init.c @ 396]
git!mi_heap_get_default+0x9() [compat\mimalloc\mimalloc\prim.h @ 415]
git!_mi_malloc_generic+0x9d() [compat\mimalloc\page.c @ 993]
git!mingw_getenv+0x4f() [compat\mingw.c @ 2446]
git!run_post_command_hook+0x84() [git.c @ 515]

Let's treat this code path in the exact same way as the code path in mi_heap_main_init().

dscho and others added 30 commits November 17, 2025 19:13
If we are going to write an object there is no use in calling
the read object hook to get an object from a potentially remote
source.  We would rather just write out the object and avoid the
potential round trip for an object that doesn't exist.

This change adds a flag to the check_and_freshen() and
freshen_loose_object() functions' signatures so that the hook
is bypassed when the functions are called before writing loose
objects. The check for a local object is still performed so we
don't overwrite something that has already been written to one
of the objects directories.

Based on a patch by Kevin Willford.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Alejandro Pauly <alpauly@microsoft.com>
When using the sparse-checkout feature, the file might not be on disk
because the skip-worktree bit is on. This used to be a bug in the
(hence deleted) `recursive` strategy. Let's ensure that this bug does
not resurface.

Signed-off-by: Kevin Willford <kewillf@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
The 'git worktree' command was marked as BLOCK_ON_GVFS_REPO because it
does not interact well with the virtual filesystem of VFS for Git. When
a Scalar clone uses the GVFS protocol, it enables the
GVFS_BLOCK_COMMANDS flag, since commands like 'git gc' do not work well
with the GVFS protocol.

However, 'git worktree' works just fine with the GVFS protocol since it
isn't doing anything special. It copies the sparse-checkout from the
current worktree, so it does not have performance issues.

This is a highly requested option.

The solution is to stop using the BLOCK_ON_GVFS_REPO option and instead
add a special-case check in cmd_worktree() specifically for a particular
bit of the 'core_gvfs' global variable (loaded by very early config
reading) that corresponds to the virtual filesystem. The bit that most
closely resembled this behavior was non-obviously named, but does
provide a signal that we are in a Scalar clone and not a VFS for Git
clone. The error message is copied from git.c, so it will have the same
output as before if a user runs this in a VFS for Git clone.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
When using the sparse-checkout feature git should not write to the working
directory for files with the skip-worktree bit on.  With the skip-worktree
bit on the file may or may not be in the working directory and if it is
not we don't want or need to create it by calling checkout_entry.

There are two callers of checkout_target.  Both of which check that the
file does not exist before calling checkout_target.  load_current which
make a call to lstat right before calling checkout_target and
check_preimage which will only run checkout_taret it stat_ret is less than
zero.  It sets stat_ret to zero and only if !stat->cached will it lstat
the file and set stat_ret to something other than zero.

This patch checks if skip-worktree bit is on in checkout_target and just
returns so that the entry doesn't not end up in the working directory.
This is so that apply will not create a file in the working directory,
then update the index but not keep the working directory up to date with
the changes that happened in the index.

Signed-off-by: Kevin Willford <kewillf@microsoft.com>
Signed-off-by: Kevin Willford <kewillf@microsoft.com>
Add the ability to block built-in commands based on if the `core.gvfs`
setting has the `GVFS_USE_VIRTUAL_FILESYSTEM` bit set. This allows us
to selectively block commands that use the GVFS protocol, but don't use
VFS for Git (for example repos cloned via `scalar clone` against Azure
DevOps).

Signed-off-by: Matthew John Cheetham <mjcheetham@outlook.com>
As of 9e59b38 (object-file: emit corruption errors when detected,
2022-12-14), Git will loudly complain about corrupt objects.

That is fine, as long as the idea isn't to re-download locally-corrupted
objects. But that's exactly what we want to do in VFS for Git via the
`read-object` hook, as per the `GitCorruptObjectTests` code
added in microsoft/VFSForGit@2db0c030eb25 (New
features: [...] -  GVFS can now recover from corrupted git object files
[...] , 2018-02-16).

So let's support precisely that, and add a regression test that ensures
that re-downloading corrupt objects via the `read-object` hook works.

While at it, avoid the XOR operator to flip the bits, when we actually
want to make sure that they are turned off: Use the AND-NOT operator for
that purpose.

Helped-by: Matthew John Cheetham <mjcheetham@outlook.com>
Helped-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Loosen the blocking of the `repack` command from all "GVFS repos" (those
that have `core.gvfs` set) to only those that actually use the virtual
file system (VFS for Git only). This allows for `repack` to be used in
Scalar clones.

Signed-off-by: Matthew John Cheetham <mjcheetham@outlook.com>
String formatting can be a performance issue when there are
hundreds of thousands of trees.

Change to stop using the strbuf_addf and just add the strings
or characters individually.

There are a limited number of modes so added a switch for the
known ones and a default case if something comes through that
are not a known one for git.

In one scenario regarding a huge worktree, this reduces the
time required for a `git checkout <branch>` from 44 seconds
to 38 seconds, i.e. it is a non-negligible performance
improvement.

Signed-off-by: Kevin Willford <kewillf@microsoft.com>
Loosen the blocking of the `fsck` command from all "GVFS repos" (those
that have `core.gvfs` set) to only those that actually use the virtual
file system (VFS for Git only). This allows for `fsck` to be used in
Scalar clones.

Signed-off-by: Matthew John Cheetham <mjcheetham@outlook.com>
The following commands and options are not currently supported when working
in a GVFS repo.  Add code to detect and block these commands from executing.

1) fsck
2) gc
4) prune
5) repack
6) submodule
8) update-index --split-index
9) update-index --index-version (other than 4)
10) update-index --[no-]skip-worktree
11) worktree

Signed-off-by: Ben Peart <benpeart@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Loosen the blocking of the `prune` command from all "GVFS repos" (those
that have `core.gvfs` set) to only those that actually use the virtual
file system (VFS for Git only). This allows for `prune` to be used in
Scalar clones.

Signed-off-by: Matthew John Cheetham <mjcheetham@outlook.com>
In earlier versions of `microsoft/git`, we found a user who had set
`core.gvfs = false` in their global config. This should not have been
necessary, but it also should not have caused a problem. However, it
did.

The reason was that `gvfs_load_config_value()` was called from
`config.c` when reading config key/value pairs from all the config
files. The local config should override the global config, and this is
done by `config.c` reading the global config first then reading the
local config. However, our logic only allowed writing the `core_gvfs`
variable once.

In v2.51.0, we had to adapt to upstream changes that changed way the
`core.gvfs` config value is read, and the special handling is no longer
necessary, yet we still want the test case that ensures that this bug
does not experience a regression.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Replace the special casing of the `worktree` command being blocked on
VFS-enabled repos with the new `BLOCK_ON_VFS_ENABLED` flag.

Signed-off-by: Matthew John Cheetham <mjcheetham@outlook.com>
Emit a warning message when the `gvfs.sharedCache` option is set that
the `repack` command will not perform repacking on the shared cache.

In the future we can teach `repack` to operate on the shared cache, at
which point we can drop this commit.

Signed-off-by: Matthew John Cheetham <mjcheetham@outlook.com>
Teach STATUS to optionally serialize the results of a
status computation to a file.

Teach STATUS to optionally read an existing serialization
file and simply print the results, rather than actually
scanning.

This is intended for immediate status results on extremely
large repos and assumes the use of a service/daemon to
maintain a fresh current status snapshot.

2021-10-30: packet_read() changed its prototype in ec9a37d (pkt-line.[ch]:
remove unused packet_read_line_buf(), 2021-10-14).

2021-10-30: sscanf() now does an extra check that "%d" goes into an "int"
and complains about "uint32_t". Replacing with "%u" fixes the compile-time
error.

2021-10-30: string_list_init() was removed by abf897b (string-list.[ch]:
remove string_list_init() compatibility function, 2021-09-28), so we need to
initialize manually.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Teach status serialization to take an optional pathname on
the command line to direct that cache data be written there
rather than to stdout.  When used this way, normal status
results will still be written to stdout.

When no path is given, only binary serialization data is
written to stdout.

Usage:
    git status --serialize[=<path>]

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Teach status deserialize code to reject status cache
when printing in porcelain V2 and there are unresolved
conflicts in the cache file.  A follow-on task might
extend the cache format to include this additiona data.

See code for longer explanation.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
On index load, clear/set the skip worktree bits based on the virtual
file system data. Use virtual file system data to update skip-worktree
bit in unpack-trees. Use virtual file system data to exclude files and
folders not explicitly requested.

Update 2022-04-05: disable the "present-despite-SKIP_WORKTREE" file removal
behavior when 'core.virtualfilesystem' is enabled.

Signed-off-by: Ben Peart <benpeart@microsoft.com>
Changes to the global or repo-local excludes files can change the
results returned by "git status" for untracked files.  Therefore,
it is important that the exclude-file values used during serialization
are still current at the time of deserialization.

Teach "git status --serialize" to report metadata on the user's global
exclude file (which defaults to "$XDG_HOME/git/ignore") and for the
repo-local excludes file (which is in ".git/info/excludes").  Serialize
will record the pathnames and mtimes for these files in the serialization
header (next to the mtime data for the .git/index file).

Teach "git status --deserialize" to validate this new metadata.  If either
exclude file has changed since the serialization-cache-file was written,
then deserialize will reject the cache file and force a full/normal status
run.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
…x has been redirected

Fixes #13

Some git commands spawn helpers and redirect the index to a different
location.  These include "difftool -d" and the sequencer
(i.e. `git rebase -i`, `git cherry-pick` and `git revert`) and others.
In those instances we don't want to update their temporary index with
our virtualization data.

Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Ben Peart <Ben.Peart@microsoft.com>
Teach `git status --deserialize` to either wait indefintely
or immediately fail if the status serialization cache file
is stale.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Add check to see if a directory is included in the virtualfilesystem
before checking the directory hashmap.  This allows a directory entry
like foo/ to find all untracked files in subdirectories.
With the "--untracked-files=complete" option status computes a
superset of the untracked files.  We use this when writing the
status cache.  If subsequent deserialize commands ask for either
the complete set or one of the "no", "normal", or "all" subsets,
it can still use the cache file because of filtering in the
deserialize parser.

When running status with the "-uno" option, the long format
status would print a "(use -u to show untracked files)" hint.

When deserializing with the "-uno" option and using a cache computed
with "-ucomplete", the "nothing to commit, working tree clean" message
would be printed instead of the hint.

It was easy to miss because the correct hint message was printed
if the cache was rejected for any reason (and status did the full
fallback).

The "struct wt_status des" structure was initialized with the
content of the status cache (and thus defaulted to "complete").
This change sets "des.show_untracked_files" to the requested
subset from the command-line or config.  This allows the long
format to print the hint.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
When our patches to support that hook were upstreamed, the hook's name
was eliciting some reviewer suggestions, and it was renamed to
`post-index-change`. These patches (with the new name) made it into
v2.22.0.

However, VFSforGit users may very well have checkouts with that hook
installed under the original name.

To support this, let's just introduce a hack where we look a bit more
closely when we just failed to find the `post-index-change` hook, and
allow any `post-indexchanged` hook to run instead (if it exists).
When using fsmonitor the CE_FSMONITOR_VALID flag should be checked when
wanting to know if the entry has been updated. If the flag is set the
entry should be considered up to date and the same as if the CE_UPTODATE
is set.

In order to trust the CE_FSMONITOR_VALID flag, the fsmonitor data needs to
be refreshed when the fsmonitor bitmap is applied to the index in
tweak_fsmonitor. Since the fsmonitor data is kept up to date for every
command, some tests needed to be updated to take that into account.

istate->untracked->use_fsmonitor was set in tweak_fsmonitor when the
fsmonitor bitmap data was loaded and is now in refresh_fsmonitor since
that is being called in tweak_fsmonitor. refresh_fsmonitor will only be
called once and any other callers should be setting it when refreshing
the fsmonitor data so that code can use the fsmonitor data when checking
untracked files.

When writing the index, fsmonitor_last_update is used to determine if
the fsmonitor bitmap should be created and the extension data written to
the index. When running through unpack-trees this is not copied to the
result index. This makes the next time a git command is ran do all the
work of lstating all files to determine what is clean since all entries
in the index are marked as dirty since there wasn't any fsmonitor data
saved in the index extension.

Copying the fsmonitor_last_update to the result index will cause the
extension data for fsmonitor to be in the index for the next git command
to use.

Signed-off-by: Kevin Willford <Kevin.Willford@microsoft.com>
dscho and others added 27 commits November 17, 2025 19:13
On the off-chance that it's NULL...

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
CodeQL is GitHub's native offering of a static code analyzer, and hence
integrates with GitHub Actions better than any other static code
analyzer.

By default, it comes with a large range of "queries" that test for
common code patterns that should be avoided.

For now, we only target source code written in C, via the `language:
cpp` directive. Just in case that other languages should be targeted,
too, this GitHub workflow job is set up as a matrix job to make that
easier in the future.

For full documentation, see
https://docs.github.com/en/code-security/code-scanning/introduction-to-code-scanning/about-code-scanning-with-codeql

Co-authored-by: Pierre Tempel <turbo@github.com>
Co-authored-by: Arthur Baars <aibaars@github.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
CodeQL points out that `branch_get()` can return NULL values.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
As pointed out by CodeQL, `lookup_commit()` can return NULL.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
In some instances, CodeQL's web UI on github.com leaves questions
unanswered. For example, in some alerts it is really necessary to follow
the entire "taint flow" to understand why something might be an issue.

The alerts for the `cpp/uncontrolled-allocation-size` rule, for example,
are all false positives, and only when inspecting the exact flow does it
become obvious that one alert wants to point out that the size of a
binary patch hunk, which is specified in the patch, is then used to
determine how much memory to allocate, which may potentially run out of
memory (and is hence just Git doing what it is asked to, and does not
need to be changed).

To help with those issues, publish the `.sarif` file as part of every
workflow run; This allows downloading that file and inspecting it e.g.
with the SARIF viewer extension in VS Code (for details, see
https://marketplace.visualstudio.com/items?itemName=MS-SarifVSCode.sarif-viewer).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
CodeQL points out that `branch_get()` can return NULL values.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
The code is a bit too hard to reason about for CodeQL to figure out
whether the `fill_commit_graph_info()`  function is at all called after
`write_commit_graph()` returns (and hence whether `topo_levels` goes out
of context before it is used again).

The Git project insists that this is correct (and does not want to make
the code more obviously correct), so let's silence CodeQL's complaints
in this instance.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
A couple of CodeQL's queries are opinionated in a way that is obviously
not shared by Git's source code's state, and apparently intentionally so.

For example, the "For loop variable changed in body" query as well as
the "No trivial switch statements" one result in too many results that
are apparently intentional in Git's source code. Let's not worry about
those, then. Also, Git has plenty of instances where variables shadow
other variables.

Other valid yet not quite critical issues identified by CodeQL include
complex conditionals and nested switch statements spanning several
pages.

We probably want to address these issues at some stage, but they are not
as critical as other problems pointed out by CodeQL, so let's silence
those queries for now and take care of them at a later stage.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
…ray past end

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
A few places where CodeQL thinks that variables might be uninitialized.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
…oes NUL-terminate correctly

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
These patches implement some defensive programming to address complaints
some static analyzers might have.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Let's exclude GitWeb from being scanned; It is not distributed by us.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
CodeQL pointed out a couple of issues, which are addressed in this patch
series.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
This patch series has been long in the making, ever since Johannes
Nicolai and myself spiked this in November/December 2020.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
The microsoft/git fork includes pre- and post-command hooks, with the
initial intention of using these for VFS for Git. In that environment,
these are important hooks to avoid concurrent issues when the
virtualization is incomplete.

However, in the Office monorepo the post-command hook is used in a
different way. A custom hook is used to update the sparse-checkout, if
necessary. To avoid this hook from being incredibly slow on every Git
command, this hook checks for the existence of a "sentinel file" that is
written by a custom post-index-change hook and no-ops if that file does
not exist.

However, even this "no-op" is 200ms due to the use of two scripts (one
simple script in .git/hooks/ does some environment checking and then
calls a script from the working directory which actually contains the
logic).

Add a new config option, 'postCommand.strategy', that will allow for
multiple possible strategies in the future. For now, the one we are
adding is 'post-index-change' which states that we should write a
sentinel file instead of running the 'post-index-change' hook and then
skip the 'post-command' hook if the proper sentinel file doesn't exist.
(If it does exist, then delete it and run the hook.)

--- 

This fork contains changes specific to monorepo scenarios. If you are an
external contributor, then please detail your reason for submitting to
this fork:

* [ ] This is an early version of work already under review upstream.
* [ ] This change only applies to interactions with Azure DevOps and the
      GVFS Protocol.
* [ ] This change only applies to the virtualization hook and VFS for
Git.
* [x] This change only applies to custom bits in the microsoft/git fork.
This patch series has been long in the making, ever since Johannes
Nicolai and myself spiked this in November/December 2020.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
This fixup updates b46ad0a (gvfs-helper: create tool to fetch objects
using the GVFS Protocol, 2019-08-13) in reaction to f6f236d
(packfile: refactor `install_packed_git()` to work on packfile store,
2025-09-23) which is included in Git 2.52.0.

The refactored packfile store includes an automatic inclusion of new
packifles into the MRU list. This introduces a bug in microsoft/git's
use of the GVFS protocol in the following scenario in 'git fetch':

 1. If the prefetch downloads at least one prefetch packfile, then it is
    added to the MRU list twice, creating an infinite loop.

 2. If the refs that are updated include commits that are not present in
    the packfile list, then the MRU lookup will iterate through without
    interruption, hitting the infinite loop.

The fix is to modify this patch to no longer include a custom "add to
MRU" method now that the default implementation does this for us.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
This makes the commit trivial and will be dropped in a future update.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
This fixup updates b46ad0a (gvfs-helper: create tool to fetch objects
using the GVFS Protocol, 2019-08-13) in reaction to f6f236d
(packfile: refactor `install_packed_git()` to work on packfile store,
2025-09-23) which is included in Git 2.52.0.

This PR is organized to fixup the commit mentioned above and to drop
4e743e6 (packfile: add install_packed_git_and_mru(), 2019-09-25) now
that the packfile method is no longer used.

The refactored packfile store includes an automatic inclusion of new
packifles into the MRU list. This introduces a bug in microsoft/git's
use of the GVFS protocol in the following scenario in 'git fetch':

1. If the prefetch downloads at least one prefetch packfile, then it is
added to the MRU list twice, creating an infinite loop.

2. If the refs that are updated include commits that are not present in
the packfile list, then the MRU lookup will iterate through without
interruption, hitting the infinite loop.

The fix is to modify this patch to no longer include a custom "add to
MRU" method now that the default implementation does this for us.

* [X] This change only applies to interactions with Azure DevOps and the
      GVFS Protocol.
Signed-off-by: Derrick Stolee <stolee@gmail.com>
This is a fix for a performance regression introduced by #822. That
change updated the packfile installation process to call a different
packfile installation method after some upstream changes to the packfile
data structures.

However, while I thought I read the new packfile methods as including a
modification of the MRU cache, it apparently does not fully install the
packfile in the list.

This manifests in something like `git checkout` where the missing blobs
are queued for download, downloaded in blob packfiles, and then the
checkout process continues. During the process of writing the data to
the worktree, the "updating files" progress indicator slows to a crawl
because it was "missing" the blobs and then downloaded them on-demand.

This fixes the issue by being less fancy about packfiles and using
`packfile_store_reprepare()` to just reset the full packfile list. This
is more future-proof and isn't very expensive compared to the packfile
download.

I augmented a test to include tracing that shows a necessary blob is
queued for packfile download and is not later downloaded via an
immediate request. Without the code change, that test would fail.

* [X] This change only applies to interactions with Azure DevOps and the
GVFS Protocol.
Similar to 1475d9f (mimalloc: use "weak" random seed when
statically linked, 2023-05-12) (which was actually already applied
upstream in microsoft/mimalloc@3e1d800e
(potential fix for windows static linking with thread creation in dll's,
2022-11-07), this ensures that another code path uses "weak" random
seeds when initializing mimalloc.

The scenario when this happens is likely the exact same as back when
the original fix was implemented (but I can only hypothesize about this,
as I was unable to find a reproducer in my setup, and even the reporters
could not find a reliable reproducer, either): an `atexit()` handler
is called after mimalloc's own `atexit()` handler deinitializes
mimalloc. This `atexit()` handler (namely `post_command_hook_atexit()`)
then calls `getenv()`, which internally resolves to `mingw_getenv()`,
which wants to `calloc()` the result so that it is not overwritten
while in use. This, in turn, uses mimalloc, which dutifully initializes
the random seed. It apparently takes a different code path than back in
2023, though, where it now tries to initialize a strong random generator
implemented in bcrypt.dll. However, that initialization fails because
it happens during the phase when only `atexit()` handlers are called,
and bcrypt has obviously already been deinitialized, in a way that
causes a crash.

The stack trace in question looks like this:

 ntdll!ZwWaitForMultipleObjects+0x140 ...
 KERNELBASE!WaitForMultipleObjectsEx+0x123 ...
 KERNELBASE!WaitForMultipleObjects+0x11 ...
 kernel32!WerpReportFaultInternal+0x62c ...
 kernel32!WerpReportFault+0xc5 ...
 KERNELBASE!UnhandledExceptionFilter+0x34c ...
 ntdll!RtlpThreadExceptionFilter+0x2e ...
 ntdll!RtlUserThreadStart$filt$0+0x3f0 ...
 ntdll !_ C_specific_handler+0x93 ...
 ntdll!RtlpExecuteHandlerForException+0xf ...
 ntdll!RtlDispatchException+0x437 ...
 ntdll!KiUserExceptionDispatch+0x2e() ...
 bcryptPrimitives!AesRNGState_generate+0x79 ...
 bcryptPrimitives!GenRandomAes+0xf70 ...
 bcryptPrimitives!MSCryptGenRandom+0x15f ...
 bcrypt!BCryptGenRandom+0x7a ...
 bcrypt!BCryptGenSysternPreferredRandom+0x340 ....
 bcrypt!BCryptGenRandom+0x119 ...
 git!_ mi_prim_random_buf+0x23() [compat\mimalloc\prim\windows\prim.c @ 648]
 git!mi_random_init_ex+0x61() [compat\mimalloc\random.c @ 179]
 git!_ mi_heap_init+0xe8() [compat\mimalloc\heap.c @ 226]
 git!mi_thread_init+0x21b() [compat\mimalloc\init.c @ 396]
 git!mi_heap_get_default+0x9() [compat\mimalloc\mimalloc\prim.h @ 415]
 git!_ mi_malloc_generic+0x9d() [compat\mimalloc\page.c @ 993]
 git!mingw_getenv+0x4f() [compat\mingw.c @ 2446]
 git!run_post_command_hook+0x84() [git.c @ 515]

Let's treat this code path in the exact same way as the code path in
`mi_heap_main_init()`.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
@dscho dscho closed this Dec 18, 2025
@dscho
Copy link
Member Author

dscho commented Dec 18, 2025

Wrong repo...

@dscho dscho deleted the fix-mimalloc-crash-in-post-command-v2.52 branch December 18, 2025 17:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.