Accelerate calculation of standard deviations #261

Merged
tammojan merged 2 commits into lofar-astron:master from bnikolic:stdevimprove
Aug 5, 2025
@bnikolic
Contributor

Surprisingly, calculating the standard deviation takes an appreciable fraction of our self-cal pipeline's runtime. I'll probably come back with a more complete solution in the future, but for now this reuses the mean (which was already calculated) and also reuses the computed deviations from the raw mean (calculated only once). Unfortunately this adds a bit of complexity, but it makes quite a big impact for us (~20% in runtime).

Bojan Nikolic added 2 commits July 30, 2025 15:05
Reuse means, and reuse squared deviations from the raw mean. The
numerical stability of this approach should be very good as long as
clipped mean is not very far from the raw mean.
@tammojan
Collaborator

tammojan commented Aug 5, 2025

Thanks! Sorry this took some time to test. I made some unit tests which I'll submit as a new pull request.

With %timeit I found a performance improvement of 35% in just the function bstat, so well worth the few extra lines of code.

Unfortunately the C implementation of bstat does not give the same answer anymore (as documented in the function). It is a lot faster though.

%timeit bstat_master(r, None, 2)
624 μs ± 1.96 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
%timeit bstat_pr261(r, None, 2)
403 μs ± 889 ns per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
%timeit bstat_c(r, None, 2)
173 μs ± 758 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)

@tammojan tammojan merged commit 353680e into lofar-astron:master Aug 5, 2025
1 check passed
@gmloose
Collaborator

gmloose commented Aug 6, 2025

Any idea why the C version now gives different results? That cannot be the result of this PR, or can it?

@bnikolic
Contributor Author

Sorry, I should have highlighted that the results will not be the same to machine precision (it is alluded to in one of the commit messages). The reason is that this approach uses the original raw mean to calculate the deviations, then corrects to the actual refined clipped mean. But hopefully the changes you are seeing are very small? In my tests with the SKA dataset some intermediate results changed very slightly, but it did not affect the final result at all.
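
For reference, one way to write such a correction is the standard shift identity below. Whether the PR uses exactly this form is an assumption, and the symbols are introduced only for illustration: $\bar{x}$ is the raw mean over all samples, $K$ is the set of $n_K$ samples that survive clipping, and $\bar{x}_K$ is their mean.

$$
\sum_{i \in K} (x_i - \bar{x}_K)^2 = \sum_{i \in K} (x_i - \bar{x})^2 - n_K \, (\bar{x}_K - \bar{x})^2
$$

The two sides are equal in exact arithmetic but round differently in floating point, which would account for differences at the level of machine precision. If $\bar{x}_K$ were far from $\bar{x}$, the right-hand side would subtract two large, nearly equal numbers, which is why the approach relies on the clipped mean staying close to the raw mean.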
