Dev barra2 pipeline #186

christianstassen · 2025-10-24T04:00:58Z

This introduces a notebook showing how to create a pipeline for BARRA-R2 and ERA5.

I also added a bunch of new Transforms, which introduce arithmetics (add, subtract, multiply and divide) and a new Operation to round valid time steps.

The transforms are part of the notebook, I thought it might be helpful for other to see how to create their own but, I also added them to the PET source code.

All changes have been tested by me in the notebook, by pytest, and I have run the black tool.

coveralls · 2025-10-24T04:08:47Z

Pull Request Test Coverage Report for Build 18769161211

Details

35 of 125 (28.0%) changed or added relevant lines in 3 files are covered.
No unchanged relevant lines lost coverage.
Overall coverage decreased (-0.3%) to 61.023%

Changes Missing Coverage	Covered Lines	Changed/Added Lines	%
packages/pipeline/src/pyearthtools/pipeline/operations/xarray/round_time.py	8	26	30.77%
packages/data/src/pyearthtools/data/transforms/arithmetic.py	26	98	26.53%

Totals
Change from base Build 18765016601:	-0.3%
Covered Lines:	9592
Relevant Lines:	15297

💛 - Coveralls

nikeethr · 2026-01-06T00:48:43Z

Thanks for your work, appreciate it. I have a design comment/question:

With the simplistic operations subtract, add etc. any reason why they need to be defined so explicitly as transforms? Just for my own understanding.

My guess is that this puts them within the category of "pipeline operations", so that they can be represented in some form - e.g. used for example to derive declarative versions of the pipeline. There may be other uses, but this is the main one I can think of. It's totally fine to do that, but there is a "but":

It seems troublesome to have to do this for every operation - I think there is a better way to code this sort of pattern e.g. via higher order functions. From a functional standpoint, it seems the datasets are dissected into individual arrays (using keys retrieved) with the operands applied to them and this seems to be common for all (most?) of the operations.

If, on the other hand, the reason for the transforms is for "consistency" just for consistency sake - and not for what I described above (or some other justifiable purpose), then using xarray directly seems more sensible to me. Especially if what is done is already a wrapper around something xarray is capable of. More specific/complicated transforms like temporal manipulation/rounding, can be an exception.

\cc @tennlee, since this is more of a design comment, that may also apply to pre-existing code.

christianstassen · 2026-01-07T03:51:36Z

Thanks @nikeethr!
Some of what you describe above is above my head :) Probably best for @tennlee to comment on this.
I added those transforms for 'physical' reasons. Those being that the units of some variables in ERA5 do not match those in BARRA2. For example, precipitation in ERA5 is accumulated in meters, whereas in BARRA2 it is a flux and in kg m-2 s-1. To convert between the two I need to multiply the field in ERA5 by 1000. Some other variables have more complicated conversions (e.g., the radiation fields).
Rather than post-processing ERA5 and storing a copy of it, I thought it be easier to implement this in the pipeline.

Happy to hear other thoughts in this :)

nikeethr · 2026-01-07T07:29:05Z

Hey @christianstassen, thanks for your explanation. I understand what you’re trying to do.

To elaborate - I’m thinking about maintenance and abstraction. Not saying you need to fix anything. It’s more an unavoidable juncture when it comes to any codebase that deals with pipelines. You have the operators and the structure around the operator conveying “meta” information.

For basic binary operations I’d expect the code bloat/overhead to be minimal, and there to be some pre-existing convenience wrapper that helps construct the pattern for each operation, so that it doesn’t have to be repeated. If we decide to change the backend that does the computations (or even bump a version), or the wrapping code itself - we’d have to apply any fixes to each class - making it an error prone and cumbersome exercise.

I’d say there’s nothing actually to fix in this case, but more something to take note of or think about if we see the same pattern repeating.

(It’s just that it’s quite apparent in this PR since they are “simple” operators - so it’s easier for me to raise the point to the maintainers. Also this would not even be in my radar for new or lightweight codebase, but this one is particularly big already)

christianstassen added 8 commits October 17, 2025 09:06

Initial commit of new BARRA2 pipeline notebook

e744cf4

Updated issue of wrong valid time with better examples.

03a33c6

Working pipeline for BARRA-R2

8a72a5f

Merge branch 'develop' into dev-barra2-pipeline

030d710

Remove any local changes to PYTHONPATH

e0b8ab1

Added arithmetic functionality

ff096ee

Added functionality to round time steps.

85b714e

blacked

b3119de

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Dev barra2 pipeline #186

Dev barra2 pipeline #186

Uh oh!

christianstassen commented Oct 24, 2025

Uh oh!

coveralls commented Oct 24, 2025

Uh oh!

nikeethr commented Jan 6, 2026 •

edited

Loading

Uh oh!

christianstassen commented Jan 7, 2026

Uh oh!

nikeethr commented Jan 7, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Dev barra2 pipeline #186

Are you sure you want to change the base?

Dev barra2 pipeline #186

Uh oh!

Conversation

christianstassen commented Oct 24, 2025

Uh oh!

coveralls commented Oct 24, 2025

Pull Request Test Coverage Report for Build 18769161211

Details

💛 - Coveralls

Uh oh!

nikeethr commented Jan 6, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

christianstassen commented Jan 7, 2026

Uh oh!

nikeethr commented Jan 7, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

nikeethr commented Jan 6, 2026 •

edited

Loading

nikeethr commented Jan 7, 2026 •

edited

Loading