-
Notifications
You must be signed in to change notification settings - Fork 24
Dev barra2 pipeline #186
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: develop
Are you sure you want to change the base?
Dev barra2 pipeline #186
Conversation
Pull Request Test Coverage Report for Build 18769161211Details
💛 - Coveralls |
|
Thanks for your work, appreciate it. I have a design comment/question: With the simplistic operations subtract, add etc. any reason why they need to be defined so explicitly as transforms? Just for my own understanding. My guess is that this puts them within the category of "pipeline operations", so that they can be represented in some form - e.g. used for example to derive declarative versions of the pipeline. There may be other uses, but this is the main one I can think of. It's totally fine to do that, but there is a "but": It seems troublesome to have to do this for every operation - I think there is a better way to code this sort of pattern e.g. via higher order functions. From a functional standpoint, it seems the datasets are dissected into individual arrays (using keys retrieved) with the operands applied to them and this seems to be common for all (most?) of the operations. If, on the other hand, the reason for the transforms is for "consistency" just for consistency sake - and not for what I described above (or some other justifiable purpose), then using \cc @tennlee, since this is more of a design comment, that may also apply to pre-existing code. |
|
Thanks @nikeethr! Happy to hear other thoughts in this :) |
|
Hey @christianstassen, thanks for your explanation. I understand what you’re trying to do. To elaborate - I’m thinking about maintenance and abstraction. Not saying you need to fix anything. It’s more an unavoidable juncture when it comes to any codebase that deals with pipelines. You have the operators and the structure around the operator conveying “meta” information. For basic binary operations I’d expect the code bloat/overhead to be minimal, and there to be some pre-existing convenience wrapper that helps construct the pattern for each operation, so that it doesn’t have to be repeated. If we decide to change the backend that does the computations (or even bump a version), or the wrapping code itself - we’d have to apply any fixes to each class - making it an error prone and cumbersome exercise. I’d say there’s nothing actually to fix in this case, but more something to take note of or think about if we see the same pattern repeating. (It’s just that it’s quite apparent in this PR since they are “simple” operators - so it’s easier for me to raise the point to the maintainers. Also this would not even be in my radar for new or lightweight codebase, but this one is particularly big already) |
This introduces a notebook showing how to create a pipeline for BARRA-R2 and ERA5.
I also added a bunch of new Transforms, which introduce arithmetics (add, subtract, multiply and divide) and a new Operation to round valid time steps.
The transforms are part of the notebook, I thought it might be helpful for other to see how to create their own but, I also added them to the PET source code.
All changes have been tested by me in the notebook, by pytest, and I have run the
blacktool.