Commit 47a1881

chu11Al Chu11
authored and committed
fast_job_submission: add new guide
Add new guide to improve job submission speed in Flux.
1 parent dc61a2f commit 47a1881

2 files changed

+240
-1
lines changed
jobs/fast-job-submission.rst

Lines changed: 238 additions & 0 deletions
@@ -0,0 +1,238 @@
1+
.. _fast-job-submission-tutorial:
2+
3+
===================
4+
Fast Job Submission
5+
===================
6+
7+
One of Flux's biggest advantages is for users who deal with very large numbers of jobs, from hundreds up to millions.
8+
9+
This tutorial goes over the basics of submitting a large number of jobs quickly.
10+
11+
--------------------
12+
Basic Job Submission
13+
--------------------
14+
15+
Let's assume you need to submit a number of scripts to run. Traditionally, this might be done with a script like so:
16+
17+
.. code-block:: sh
18+
19+
#!/bin/sh
20+
21+
for myjob in my_job_scripts/myjob*.sh
22+
do
23+
flux mini submit ${myjob}
24+
done
25+
26+
flux queue drain
27+
28+
In this example, I have a directory (``my_job_scripts``) with a number of job scripts prefixed with ``myjob``. I iterate through the job scripts in the directory one by one, submitting each with a job submission command (in this case ``flux mini submit``). Then I wait for all the jobs to finish with ``flux queue drain``.
29+
30+
Let's run this quickly under a local Flux instance. In my example directory, I have 1000 scripts, each suffixed with a number (i.e. ``myjob1.sh`` through ``myjob1000.sh``). Each script just runs ``sleep 0``. I also added some simple timings to see how long the job submissions take.
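If you want to reproduce this setup, the test scripts can be generated with a few lines of shell. This is only a minimal sketch: the helper name ``make_job_scripts`` is hypothetical (it is not part of Flux), and it assumes the directory and file naming described above.

.. code-block:: sh

   #!/bin/sh
   # Sketch: create COUNT trivial job scripts named myjob1.sh .. myjobCOUNT.sh.
   # make_job_scripts is a hypothetical helper, not a Flux command.

   make_job_scripts() {
       dir=$1
       count=$2
       mkdir -p "$dir"
       i=1
       while [ "$i" -le "$count" ]; do
           # Each job script just runs "sleep 0", as in this tutorial.
           printf '#!/bin/sh\nsleep 0\n' > "$dir/myjob$i.sh"
           chmod +x "$dir/myjob$i.sh"
           i=$((i + 1))
       done
   }

   make_job_scripts my_job_scripts 1000

Passing a different directory or count to the helper produces a smaller or larger test set.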
31+
32+
.. code-block:: sh
33+
34+
#!/bin/sh
35+
# filename: job_submit_loop.sh
36+
37+
start=`date +%s`
38+
for myjob in my_job_scripts/myjob*.sh
39+
do
40+
flux mini submit ${myjob}
41+
done
42+
end=`date +%s`
43+
runtime=$((end-start))
44+
echo "Job submissions took $runtime seconds"
45+
46+
flux queue drain
47+
end=`date +%s`
48+
runtime=$((end-start))
49+
echo "Job submissions and runtime took $runtime seconds"
50+
51+
.. code-block:: console
52+
53+
> ./job_submit_loop.sh
54+
<snip, many job ids printed out>
55+
Job submissions took 351 seconds
56+
Job submissions and runtime took 352 seconds
57+
58+
As you can see, it took 351 seconds to submit all of these jobs.
59+
60+
This can be slow for several reasons:
61+
62+
* If you have a lot of job scripts, this is a slow ``O(n)`` process. Each call to ``flux mini submit`` starts a new command and involves another round of messages sent to and received from Flux.
63+
64+
* You are competing with other users who are also submitting jobs and doing other things with the Flux system instance.
65+
66+
---------------------------
67+
Asynchronous Job Submission
68+
---------------------------
69+
70+
Jobs can be submitted asynchronously via several mechanisms. This lets us avoid most of the slow, iterative process of submitting jobs one by one.
71+
72+
The first mechanism is the ``--cc`` (carbon copy) option in ``flux mini submit``. It replicates the job once for every id specified in an :ref:`IDSET<idset>`. Along with the ``{cc}`` substitution string, we can submit all 1000 scripts on the command line like so:
73+
74+
.. code-block:: console
75+
76+
> flux mini submit --cc="1-1000" "my_job_scripts/myjob{cc}.sh"
77+
78+
This substitution is convenient and largely replaces the loop from the above script.
79+
80+
The real benefit is what goes on behind the scenes. Instead of iterating through job submissions one by one, the job submissions are sent asynchronously internally, so we no longer have to call and wait after every ``flux mini submit`` call. This allows job submissions to go a lot faster. How much faster?
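To get a feel for why not waiting after each call matters, here is a rough shell analogy. It is not how Flux implements asynchronous submission: ``fake_submit`` is a hypothetical stand-in that just sleeps for an assumed one second of round-trip latency, and backgrounding with ``&`` plus ``wait`` stands in for sending requests without blocking on each response.

.. code-block:: sh

   #!/bin/sh
   # Analogy only: compare "call and wait" against "start all, then wait once".

   fake_submit() {
       # Hypothetical stand-in for one submission round trip (assumed 1 second).
       sleep 1
   }

   run_sequential() {
       t0=$(date +%s)
       n=0
       while [ "$n" -lt "$1" ]; do
           fake_submit            # block on each "submission" before the next
           n=$((n + 1))
       done
       echo $(( $(date +%s) - t0 ))
   }

   run_concurrent() {
       t0=$(date +%s)
       n=0
       while [ "$n" -lt "$1" ]; do
           fake_submit &          # start the next one without waiting
           n=$((n + 1))
       done
       wait                       # collect all outstanding "submissions"
       echo $(( $(date +%s) - t0 ))
   }

   seq_secs=$(run_sequential 3)
   conc_secs=$(run_concurrent 3)
   echo "sequential: ${seq_secs}s, concurrent: ${conc_secs}s"

The sequential version pays the latency once per call, while the concurrent version pays it roughly once in total.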
81+
82+
.. code-block:: console
83+
84+
> time flux mini submit --cc="1-1000" "my_job_scripts/myjob{cc}.sh"
85+
<snip, many job ids printed out>
86+
real 0m3.281s
87+
user 0m1.426s
88+
sys 0m0.140s
89+
90+
We're looking at a reduction in wallclock time of about 99% here (351 seconds vs. 3 seconds), or roughly a 100x speedup. And just to show that the submission time was the bottleneck before and not the runtime, let's use the ``--wait`` option with ``flux mini submit``. This tells ``flux mini submit`` to return only after all the jobs have run to completion.
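The arithmetic behind that comparison can be checked with shell integer math. This sketch uses the rounded timings from the runs above (351 and 3 seconds); the variable names are just for illustration.

.. code-block:: sh

   #!/bin/sh
   # Compare the loop's submission time with the --cc submission time.
   loop_secs=351
   async_secs=3

   # Integer division is close enough for a rough comparison.
   speedup=$((loop_secs / async_secs))
   reduction=$(( (loop_secs - async_secs) * 100 / loop_secs ))

   echo "speedup: ${speedup}x, time reduced by ${reduction}%"
   # prints "speedup: 117x, time reduced by 99%"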
91+
92+
.. code-block:: console
93+
94+
> time flux mini submit --wait --cc="1-1000" "my_job_scripts/myjob{cc}.sh"
95+
<snip, many job ids printed out>
96+
real 0m50.428s
97+
user 0m3.235s
98+
sys 0m0.384s
99+
100+
Now that the job submission is so fast, the bottleneck becomes the actual running of the jobs, not the job submission time. The total submission and runtime of the jobs fell from 352 seconds to 50 seconds.
101+
102+
Another way to submit jobs asynchronously is with ``flux mini bulksubmit``. The interface may be familiar to those who know the `GNU parallel command <https://www.gnu.org/software/parallel/>`_. The following example is the bulksubmit equivalent of our original loop script.
103+
104+
.. code-block:: console
105+
106+
> time flux mini bulksubmit my_job_scripts/myjob{}.sh ::: $(seq 1 1000)
107+
<snip, many job ids printed out>
108+
real 0m3.133s
109+
user 0m1.445s
110+
sys 0m0.145s
111+
112+
--------------------------
113+
Subinstance Job Submission
114+
--------------------------
115+
116+
To solve competition with other users, we can launch a :ref:`subinstance<subinstance>` of Flux.
117+
118+
What are we doing by launching a subinstance? We're basically launching another Flux instance as a job. And once we do that, we have our own Flux resource manager and scheduler that is independent of other users.
119+
120+
We can launch a subinstance of Flux via ``flux mini batch`` and run our job submission loop from earlier.
121+
122+
.. code-block:: console
123+
124+
> flux mini batch -n24 ./job_submit_loop.sh
125+
126+
Because I'm writing this tutorial against my own Flux instance, and the node I'm on isn't that busy, this isn't really going to gain us much in terms of performance.
127+
128+
But we can think about how this can be done if we scale it up. We could divide up our resources and launch multiple Flux instances and divide up the job submissions amongst them.
129+
130+
I'm going to go back to the first loop example from before. Both ``--cc`` and ``bulksubmit`` are so fast with 1000 jobs that we wouldn't really see any performance difference using subinstances.
131+
132+
I'll also use a slightly altered loop script I call ``job_submit_loop_range.sh``. It takes two numbers on the command line and iterates only between those numbers.
133+
134+
.. code-block:: sh
135+
136+
#!/bin/sh
137+
# filename: job_submit_loop_range.sh
138+
139+
for i in `seq $1 $2`
140+
do
141+
flux mini submit my_job_scripts/myjob${i}.sh
142+
done
143+
144+
Let's launch two subinstances in the following script.
145+
146+
.. code-block:: sh
147+
148+
#!/bin/sh
149+
# filename: subinstance_2.sh
150+
151+
start=`date +%s`
152+
flux mini batch -n12 ./job_submit_loop_range.sh 1 500
153+
flux mini batch -n12 ./job_submit_loop_range.sh 501 1000
154+
flux queue drain
155+
end=`date +%s`
156+
runtime=$((end-start))
157+
echo "Job submissions and runtime took $runtime seconds"
158+
159+
My node happens to have 24 cores, so I divide those cores evenly between the two subinstances (12 cores each), with each one handling the submission of 500 jobs (1-500 in one, 501-1000 in the other). Because it is difficult to time ONLY the job submissions across multiple subinstances, I'm outputting only the combined submission and runtime rather than the submission time alone.
160+
161+
.. code-block:: console
162+
163+
> ./subinstance_2.sh
164+
fXE52ptMd
165+
fXEFWfou1
166+
Job submissions and runtime took 177 seconds
167+
168+
The result is about what we expected: roughly half the time from before (352 seconds vs. 177 seconds).
169+
170+
What if we launched four subinstances instead of two? Let's do the same experiment, dividing up the cores (6 for each subinstance) and jobs (250 for each subinstance) evenly.
171+
172+
.. code-block:: sh
173+
174+
#!/bin/sh
175+
# filename: subinstance_4.sh
176+
177+
start=`date +%s`
178+
flux mini batch -n6 ./job_submit_loop_range.sh 1 250
179+
flux mini batch -n6 ./job_submit_loop_range.sh 251 500
180+
flux mini batch -n6 ./job_submit_loop_range.sh 501 750
181+
flux mini batch -n6 ./job_submit_loop_range.sh 751 1000
182+
flux queue drain
183+
end=`date +%s`
184+
runtime=$((end-start))
185+
echo "Job submissions and runtime took $runtime seconds"
186+
187+
.. code-block:: console
188+
189+
> ./subinstance_4.sh
190+
fYYku2CNj
191+
fYYvQXbD1
192+
fYZ6DpqRR
193+
fYZFonC6j
194+
Job submissions and runtime took 93 seconds
195+
196+
Not surprisingly, we've cut our combined job submission and runtime down even more, to 93 seconds.
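The two scripts above hard-code the job ranges. For other subinstance counts, the ranges could be computed instead. The following is a sketch under assumptions: ``split_ranges`` is a hypothetical helper (not a Flux command), and the loop only prints the ``flux mini batch`` commands as a dry run rather than running them.

.. code-block:: sh

   #!/bin/sh
   # split_ranges TOTAL PARTS: print "START END" for each near-equal slice
   # of 1..TOTAL. Hypothetical helper for illustration.
   split_ranges() {
       total=$1
       parts=$2
       base=$((total / parts))
       extra=$((total % parts))
       start=1
       p=1
       while [ "$p" -le "$parts" ]; do
           size=$base
           # The first "extra" slices take one additional job each.
           if [ "$p" -le "$extra" ]; then
               size=$((size + 1))
           fi
           end=$((start + size - 1))
           echo "$start $end"
           start=$((end + 1))
           p=$((p + 1))
       done
   }

   cores=24
   parts=4
   split_ranges 1000 "$parts" | while read -r first last; do
       # Dry run: print the command instead of running it.
       echo flux mini batch -n$((cores / parts)) ./job_submit_loop_range.sh "$first" "$last"
   done

With 24 cores and 4 parts this prints the same four ``flux mini batch`` lines used in ``subinstance_4.sh``. Dropping the ``echo`` would actually submit them.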
197+
198+
Although I haven't gone into it in these examples, one could also launch a subinstance within a subinstance.
199+
200+
-------------------------
201+
Combining Things Together
202+
-------------------------
203+
204+
Let's try to put this all together and have the subinstances use ``flux mini submit`` with the ``--cc`` option. We'll run the experiment with 10000 jobs. Based on our original loop taking 352 seconds for 1000 jobs, we could estimate this would normally take 3520 seconds, or just under 59 minutes.
205+
206+
.. code-block:: sh
207+
208+
#!/bin/sh
209+
# filename: job_submit_async_range.sh
210+
211+
flux mini submit --wait --cc="$1-$2" "my_job_scripts/myjob{cc}.sh"
212+
213+
.. code-block:: sh
214+
215+
#!/bin/sh
216+
# filename: subinstance_4_async.sh
217+
218+
start=`date +%s`
219+
flux mini batch -n6 ./job_submit_async_range.sh 1 2500
220+
flux mini batch -n6 ./job_submit_async_range.sh 2501 5000
221+
flux mini batch -n6 ./job_submit_async_range.sh 5001 7500
222+
flux mini batch -n6 ./job_submit_async_range.sh 7501 10000
223+
flux queue drain
224+
end=`date +%s`
225+
runtime=$((end-start))
226+
echo "Job submissions and runtime took $runtime seconds"
227+
228+
.. code-block:: console
229+
230+
> ./subinstance_4_async.sh
231+
fRnvTBcsq
232+
fRo6q6bHq
233+
fRoG5n7Jf
234+
fRoQvjpoy
235+
Job submissions and runtime took 106 seconds
236+
237+
Given that our original loop took 352 seconds for 1000 jobs, 106 seconds for 10000 jobs is a pretty good improvement :-)
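To put a rough number on it, we can compare against the extrapolated loop time for 10000 jobs. The extrapolation assumes the loop scales linearly, which was not measured here, so treat the factor as an estimate.

.. code-block:: sh

   #!/bin/sh
   # Extrapolate the original loop to 10000 jobs and compare with the
   # measured combined run. Linear scaling is an assumption.
   loop_secs_per_1000=352
   combined_secs_for_10000=106

   estimated_loop_secs=$((loop_secs_per_1000 * 10))
   factor=$((estimated_loop_secs / combined_secs_for_10000))

   echo "estimated loop time for 10000 jobs: ${estimated_loop_secs}s"
   echo "measured combined time: ${combined_secs_for_10000}s (about ${factor}x faster)"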
238+

jobs/index.rst

Lines changed: 2 additions & 1 deletion
@@ -12,4 +12,5 @@ Do you have a question? `let us know <https://github.com/flux-framework/flux-doc
1212

1313
debugging
1414
batch
15-
hierarchies
15+
hierarchies
16+
fast-job-submission
