@edgchen1

Background: On a Windows ARM system, I observed that cpuinfo_has_arm_fp16_arith() started to return false after upgrading to a more recent cpuinfo version.

In #333, the initialization of cpuinfo_isa.fp16arith was updated to use IsProcessorFeaturePresent(PF_ARM_V82_FP16_INSTRUCTIONS_AVAILABLE). I suspect that this is not supported on older Windows versions.

This change adds a fallback path to set cpuinfo_isa.fp16arith the old way.

// Assume that Dot Product support implies FP16
// arithmetics and RDM support. ARM manuals don't
// guarantee that, but it holds in practice.
cpuinfo_isa.fp16arith = dotprod;
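
The fallback described above can be sketched roughly as follows. This is a minimal, platform-independent illustration, not the actual patch: `resolve_fp16arith` and its parameters are hypothetical names, and the OS query result is passed in as a plain boolean so the logic can be read (and compiled) without the Windows headers.

```c
#include <stdbool.h>

/*
 * Hypothetical helper illustrating the fallback logic only.
 *
 * os_reports_fp16: result of the OS feature query. On Windows this would
 *   correspond to
 *   IsProcessorFeaturePresent(PF_ARM_V82_FP16_INSTRUCTIONS_AVAILABLE) != 0.
 *   A false value is ambiguous: the feature may be absent, or the running
 *   Windows version may not know how to report it.
 * dotprod: whether Dot Product support was detected.
 */
bool resolve_fp16arith(bool os_reports_fp16, bool dotprod) {
    bool fp16arith = os_reports_fp16;
    if (!fp16arith) {
        /*
         * Fall back to the old heuristic: assume that Dot Product support
         * implies FP16 arithmetic. ARM manuals don't guarantee that.
         */
        fp16arith = dotprod;
    }
    return fp16arith;
}
```

With this shape, a positive answer from the OS is always kept, and the heuristic only applies when the OS query returns false, which is exactly the case where the two possible meanings of false cannot be told apart.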

@meta-cla meta-cla bot added the cla signed label Nov 21, 2025
@edgchen1
Author

Hi @tonybaloney, would you mind taking a look at this one?

@tonybaloney
Contributor

What chip is it?

@edgchen1
Author

What chip is it?

This was observed on an Azure Standard D16pds v5 machine with an Ampere Altra processor.


// PF_ARM_V82_FP16_INSTRUCTIONS_AVAILABLE may not be available in older
// Windows versions. If fp16arith was not detected with
// IsProcessorFeaturePresent(PF_ARM_V82_FP16_INSTRUCTIONS_AVAILABLE), fall
Collaborator

Although this is true in practice, if the detection for FP16 is available and used, and it says false, we should respect that?
There have been recent A75 cores without FP16. They also don't have dot product.
Can you clarify a case for when this is necessary?

Author

Although this is true in practice, if the detection for FP16 is available and used, and it says false, we should respect that?

Yes, I agree that if detection is available, we should respect it. However, I am not sure that it is actually available. IsProcessorFeaturePresent() returns 0 both when the feature is actually not detected to be present and also when detection is not available.

Can you clarify a case for when this is necessary?

The behavior I observed was that, on a CI build system with an Ampere Altra processor, cpuinfo_has_arm_fp16_arith() used to return true (prior to the use of IsProcessorFeaturePresent(PF_ARM_V82_FP16_INSTRUCTIONS_AVAILABLE) in src/arm/windows/init.c) and then started to return false. On that same system, I was able to run a test program with NEON FP16 intrinsics successfully, so it appears that FP16 instructions are actually available.
