Skip to content

Comments

introducing flow output#850

Draft
gpetretto wants to merge 2 commits intomaterialsproject:mainfrom
gpetretto:flowout
Draft

introducing flow output#850
gpetretto wants to merge 2 commits intomaterialsproject:mainfrom
gpetretto:flowout

Conversation

@gpetretto
Copy link
Contributor

Partially related to #842, but aiming at more general modification of the code.

On top of the issues described in #842, in general there are a few issues that users have pointed out

  1. Flows are often created from makers and their name can be easily set, however often the outputs need to be retrieved from the output of the last job (consider for example in the phonon workflow in atomate2 one needs to to retrieve the output of the Job named generate_frequencies_eigenvectors). For users that are not very familiar with the whole internal structure of the Flow this results to be unintuitive and requires knowing the name of the last job for the different kind of Flows.
  2. While the input of each single job is usually stored in the outputs, it is not easy to keep track of the exact inputs that were used to generate a Flow. In particular, it would be useful to be able to store also the Maker with all the parameters used.

After discussing with @davidwaroquiers we though that one possible way to address the previous issues would be to introduce an output document representing a Flow output. So, in this draft we attempt to introduce this new entity. Here are the main changes:

  • An index attribute is added to the Flow.
  • All Flows and subflows produce a JobStoreDocument when all the contained Jobs are processed. Stored in the standard JobStore.
  • This output of a Flow document contains the references to the output of the jobs chosen when creating the Flow (similarly to what is done in a store_inputs output document).
  • If a Flow is generated from a Maker, the instance of Maker and the arguments passed to make() are added as an attribute to the Flow object and are then stored in the Flow output document.
  • When a Job returns a replace containing a Flow, there is no store_inputs Job anymore, but the replacing Flow gets the uuid of the original Job and index=job-index+1. The Flow output takes the role of the store_inputs.
  • currently only supported in the run_locally for testing purposes. Tested it with several combinations of Flows/Jobs/Makers.

So, to make an example, consider the following code:

@dataclass
class TestMaker(Maker):
    x: int = 1

    def make(self, a: int):
        j = add(self.x, a)

        return Flow([j], output=j.output)


f = TestMaker(x=2).make(a=2)

After everything is completed, the output store will contain an output from the add job, with the numerical output, and another document corresponding to the uuid of the Flow f. The latter contains a reference to the output of the job j and the details of the Maker object. The advantage here is that a user can fetch the results as an output of a Flow, without having to know the actual internal structure of the Flow and also have the inputs used to generate it.

Currently, this approach does not really solve problem 1 in #842, but I think that there are options to use the Flow output to solve it in a similar way to how it is done in #843.

There are of course several critical points in this proposal:

  • The introduction of some backward incompatible changes. The main one being that, since in this implementation Flow has an index, the hosts attribute of Job and Flow is not a list of uuids anymore, but a list of uuid+index pairs.
  • not still defined how this could/would work for jobflow-remote and fireworks
  • The creation of additional output documents to the JobStore, althought small.

We are aware that such a change would not be a trivial one, but we thought that the potential improvements in the user experience would be worth opening this PR so that it can be better discussed.

@utf @JaGeo @Andrew-S-Rosen What do you think?

@gpetretto gpetretto marked this pull request as draft February 11, 2026 15:37
@Andrew-S-Rosen
Copy link
Member

I am in favor of the proposed changes!

@JaGeo
Copy link
Member

JaGeo commented Feb 11, 2026

I am also in favor. I would just like to know downstream complications you mught expect.

Would this require changes for atomate2 itself?

@utf
Copy link
Member

utf commented Feb 23, 2026

This looks very nice @gpetretto, thanks for tackling it. I need to spend some time to sit down and go through the changes and what they might mean. But I hope to get time early next week.

@gpetretto
Copy link
Contributor Author

gpetretto commented Feb 24, 2026

This looks very nice @gpetretto, thanks for tackling it. I need to spend some time to sit down and go through the changes and what they might mean. But I hope to get time early next week.

Great, thanks! Let me know if you have any doubt or you want to have a discussion on some aspects of the PR.

I am also in favor. I would just like to know downstream complications you mught expect.

Would this require changes for atomate2 itself?

Sorry for the delay. I have spent some time checking the atomate2 tests and potential downsides.
First I can say that most of the atomate2 tests pass directly with the current changes. ~25 do not pass, but almost all of them have trivial fixes. Just to give an idea here the most common failures and fixes are:

  • the output access needs to be dereferenced, because now the last output is the output of the Flow, that only contains a reference to the real output
  • the number of resposes from run_locally is different (since also the Flow output end up there) and this number is checked or the output is found in a specific order in the list of responses and needs to be adapted.

However, there are two tests that fail due to a change in the behaviour in the current PR. One example is the Gruneises workflow, due to run_phonon_jobs Job:
https://github.com/materialsproject/atomate2/blob/b94214aab55c977d195dca33d0bfd0b3193083f9/src/atomate2/common/jobs/gruneisen.py#L113

The problem here is that the Response contains a replace (which is a Flow with no output) and an output. In the original implementation this means that when referring to the original_job.output the selected output is correctly fetched, while in this implementation the replacing Flow output "hides" the original job output, since it has the same uuid but index=2.
There are a few potential solutions to this, but none fully satisfying. Possibly the best option would involve that OutputReference may optionally include an index, so that one can use it to point to the exact reference. This will not be a breaking change, but it may be good to discuss about such a solution before proceeding.

Another issue that emerged, unrelated from the atomate2 tests, is how to deal with the Flow output when some of its jobs fail. Since there may be jobs with on_missing_references=None it could be preferable that the output is generated when all the jobs in the subflow are either completed or failed (this is what I did now in run_locally). This may require some discussions about how to proceed.

Concerning the breaking change of the hosts attribute that goes from being a string to a tuple, one option could be to use f"{uuid}_{index}" as values instead of the (uuid, index) tuple. This would avoid changing the type in the JobStoreDocument and could be easier to handle (the list of tuples is a bit annoying to check, if you look at the code). I am not sure if this could make queries to the DB more cumbersome if the code needs to query just by uuid at some point.

Overall, we think that some minor downsides may be worth for the addition of these functionalities.
For us the next steps would be to check how this would impact jobflow-remote and fireworks and ensure that both can still be used without problems. If the change is not convenient or not possible for fireworks we would consider having a flag that preserves the legacy behaviour. Since this may be time consuming, before proceeding with further investigations we would wait for some indications from your side if you think that these changes are potentially acceptable.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants