Regenerate binaries on ISPC 1.29.1 #60
Turns out there are a bunch of new |
On the MacBook Air M4
Main @ 6e7b616 (ISPC 1.20...)
This
|
|
Looks like performance is not restored in 1.28, or we're still doing something wrong. Barely any change compared against 1.27 (which was 24% slower than
This PR @ 00d0256
❯ cargo bench
Downsample `square_test.png` using ispc_downsampler
time: [48.412 ms 48.475 ms 48.539 ms]
change: [+29.098% +29.471% +29.821%] (p = 0.00 < 0.05)
Performance has regressed. |
|
Re-running this test on my host, recompiled on this ISPC version:
❯ ispc --version
Intel(r) Implicit SPMD Program Compiler (Intel(r) ISPC), 1.28.2 (build commit @ 20250924, LLVM 20.1.8)
On latest
❯ cargo bench
Downsample `square_test.png` using ispc_downsampler
time: [46.776 ms 46.875 ms 46.969 ms]
Then, following the suggestion from @Jasper-Bekkers in Traverse-Research/intel-tex-rs-2#42 to only use i32x4 because NEON is 128 bits wide, performance regresses slightly:
❯ cargo bench
Downsample `square_test.png` using ispc_downsampler
time: [48.003 ms 48.101 ms 48.196 ms]
change: [+2.3395% +2.6161% +2.9034%] (p = 0.00 < 0.05)
Performance has regressed.
Also, this M4 chip is supposed to have SME (Scalable Matrix Extension) but not SVE (Scalable Vector Extension) and confirmed with
Perhaps this needs to be reported upstream, as I'm slightly out of ideas on how to best bisect this compiler performance regression.
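For anyone wanting to sanity-check the reported deltas by hand, here is a minimal sketch that computes the relative change between two median timings. The helper name `percent_change` is hypothetical and not part of this repository or of the benchmark harness. It uses a plain ratio of the two medians quoted above, so it will only approximate the `change:` line, which the harness derives from its own statistical estimate.

```rust
/// Relative change of `new_ms` against `baseline_ms`, in percent.
/// Hypothetical helper for illustration; not part of this crate.
fn percent_change(baseline_ms: f64, new_ms: f64) -> f64 {
    (new_ms - baseline_ms) / baseline_ms * 100.0
}

fn main() {
    // Median timings copied from the two `cargo bench` runs above.
    let baseline_ms = 46.875; // before switching to i32x4 only
    let candidate_ms = 48.101; // after switching to i32x4 only

    // A positive value means the candidate is slower than the baseline.
    println!("{:+.2}%", percent_change(baseline_ms, candidate_ms));
}
```

The result should land close to the reported regression of roughly +2.6%, which is small next to the ~29% gap measured against the older ISPC build.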
|
Just went back in history to generate the blobs for all missing versions: ISPC
|
|
Yeah, I closed that PR because I later realized why there was a big delta: I was profiling on battery. |
|
@Jasper-Bekkers Oh I'm also exclusively developing on battery (the perks of Apple putting RTGs in these MacBooks 🤤) but the ±37ms vs ±45ms regression remains consistent. |
https://github.com/ispc/ispc/releases/tag/v1.29.1
TODO: Still need to compare performance, but perhaps this helps on newer architectures. Might also have to evaluate if we're simply missing some `TargetISA` flags relevant for newer SoCs?