Accellerate calculation of standard deviations by bnikolic · Pull Request #261 · lofar-astron/PyBDSF

bnikolic · 2025-07-30T16:03:40Z

Surprisingly, calculating the standard deviation takes an appreciable fraction of our self-cal pipeline's runtime. I'll probably come back to a more complete solution in the future, but for now this reuses the mean (which was already calculated) and also reuses the computed deviations from the raw mean (calculated only once). Unfortunately this adds a bit of complexity, but it makes quite a big impact for us (~20% in runtime).

Reuse means, and reuse squared deviations from the raw mean. The numerical stability of this approach should be very good as long as the clipped mean is not very far from the raw mean.
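
The idea can be illustrated with a minimal sketch. This is not the code from this PR: the function name `clipped_stats_reusing_deviations`, its parameters, and the simple kappa-sigma clipping loop are assumptions made for illustration only. The squared deviations from the raw mean are computed once, and each clipped standard deviation is then obtained by correcting for the shift between the raw mean and the clipped mean.

```python
import numpy as np


def clipped_stats_reusing_deviations(data, kappa=3.0, max_iter=10):
    """Hypothetical sketch of kappa-sigma clipping that reuses squared
    deviations from the raw mean (population statistics, ddof=0).

    Returns (mean, std, keep), where keep is the boolean mask of the
    samples that survived clipping.
    """
    x = np.asarray(data, dtype=float).ravel()
    raw_mean = x.mean()
    sq_dev = (x - raw_mean) ** 2  # computed once, reused in every iteration

    keep = np.ones(x.size, dtype=bool)
    mean = raw_mean
    std = np.sqrt(sq_dev.mean())

    for _ in range(max_iter):
        new_keep = np.abs(x - mean) <= kappa * std
        n = np.count_nonzero(new_keep)
        if n == 0 or np.array_equal(new_keep, keep):
            break  # nothing left to clip, or the mask has converged
        keep = new_keep
        mean = x[keep].mean()
        # Shifted-mean correction:
        # sum (x - mean)^2 = sum (x - raw_mean)^2 - n * (mean - raw_mean)^2
        var = sq_dev[keep].sum() / n - (mean - raw_mean) ** 2
        std = np.sqrt(max(var, 0.0))  # guard against tiny negative round-off

    return mean, std, keep
```

The subtraction in the correction step loses precision when the clipped mean is far from the raw mean relative to the spread of the data, which is why the stability caveat above matters.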

tammojan · 2025-08-05T18:09:11Z

Thanks! Sorry this took some time to test. I made some unit tests which I'll submit as a new pull request.

With %timeit I found a performance improvement of 35% in just the function bstat, so well worth the few extra lines of code.

Unfortunately the C implementation of bstat does not give the same answer anymore (as documented in the function). It is a lot faster though.

%timeit bstat_master(r, None, 2)
624 μs ± 1.96 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
%timeit bstat_pr261(r, None, 2)
403 μs ± 889 ns per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
%timeit bstat_c(r, None, 2)
173 μs ± 758 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
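
For reference, the 35% figure follows directly from the timings above. The short sketch below redoes the arithmetic; the helper name `runtime_reduction` is made up for illustration, and the numbers are copied from the `%timeit` output.

```python
# Timings in microseconds, copied from the %timeit output above.
timings_us = {"bstat_master": 624.0, "bstat_pr261": 403.0, "bstat_c": 173.0}


def runtime_reduction(baseline_us, new_us):
    """Fractional reduction in runtime relative to the baseline."""
    return (baseline_us - new_us) / baseline_us


base = timings_us["bstat_master"]
for name in ("bstat_pr261", "bstat_c"):
    t = timings_us[name]
    print(f"{name}: {runtime_reduction(base, t):.1%} less time, "
          f"{base / t:.2f}x faster")
```

Relative to master this gives a reduction of about 35% for this PR and about 72% for the C implementation.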

gmloose · 2025-08-06T07:49:07Z

Any idea why the C version now gives different results? That cannot be the result of this PR, or can it?

bnikolic · 2025-08-11T13:54:25Z

Sorry, I should have highlighted that the results will not be the same to machine precision (it is alluded to in one of the commit messages). The reason is that this approach uses the original raw mean to calculate the deviations, then corrects to the actual refined clipped mean. But hopefully the changes you are seeing are very small? In my tests with the SKA dataset some intermediate results changed very slightly, but it did not affect the final result at all.
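
Written out, and assuming the correction takes the usual shifted-mean form (the notation here is introduced for illustration and is not taken from the PR), let $K$ be the set of $n$ samples that survive clipping, $\mu_r$ the raw mean of all samples, and $\mu_c$ the mean of the samples in $K$:

```latex
\sum_{i \in K} (x_i - \mu_r)^2
  = \sum_{i \in K} (x_i - \mu_c)^2
    + 2(\mu_c - \mu_r) \sum_{i \in K} (x_i - \mu_c)
    + n\,(\mu_c - \mu_r)^2
  = \sum_{i \in K} (x_i - \mu_c)^2 + n\,(\mu_c - \mu_r)^2 ,

\qquad\text{so}\qquad

\sigma_c^2 = \frac{1}{n} \sum_{i \in K} (x_i - \mu_r)^2 - (\mu_c - \mu_r)^2 .
```

The middle term vanishes because $\sum_{i \in K} (x_i - \mu_c) = 0$ by the definition of $\mu_c$. The two sides agree in exact arithmetic, but in floating point they are evaluated through different sequences of operations, so the last few bits can differ. The subtraction also loses precision when $|\mu_c - \mu_r|$ is large compared to $\sigma_c$.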

Bojan Nikolic added 2 commits July 30, 2025 15:05

Save on standard deviation calculation

1d97097

Reuse means, and reuse squared deviations from the raw mean. The numerical stability of this approach should be very good as long as the clipped mean is not very far from the raw mean.

Adapt also the last sigma calculation

34070da

tammojan merged commit 353680e into lofar-astron:master Aug 5, 2025
1 check passed

tammojan merged 2 commits into lofar-astron:master from bnikolic:stdevimprove