This is how to run the rc_kappa code. There are two ways this code runs: in parallel on a single computer, and distributed on the cluster.
rc_cluster.R - Sets up a new distributed cluster run.
rc_kappa.sh - A shell script to qsub the single-machine version of the code. This was failing on the cluster because R's fork-cluster would get a SIGPIPE signal.
rc_worker.sh - A shell script to qsub for distributed runs.
rc_kappa_run.R - A main() to run the single-machine version of the code.
rc_run_plan.R - A main() to run the first distributed step.
rc_run_worker.R - What runs in tasks for the main distributed step.
rc_run_assemble.R - A main() to run the distributed assembly step.
rc_kappa.R - The main file with most of the code.
serialize.R - Saves and loads HDF chunks of data, for each tile.
plan_io.R - Saves and loads the plan to decompose the data for parallel computation.
hilbert.R - Orders tiles so that nearby tiles are nearby in a linear list.
test-*.R - Testing files for the code.
There is some introductory information for using the cluster in run_ramp.md. For rc_kappa, in particular, there are three steps: plan, worker, and assemble.
All three steps take the same command-line arguments. A single command, rc_cluster.R, sets up a distributed run. The first item below describes using it, and the following items break it down into parts to explain what happens inside that first command.
If you look in scripts/Makefile
, you’ll see a call like this one:
/ihme/singularity-images/rstudio/shells/execRscript.sh \
-i /ihme/singularity-images/rstudio/ihme_rstudio_4051.img \
-s rc_cluster.R --config=210908_world.toml --outvars=210908_fewer \
--years=2019:2019 --draws=1000 --tasks=1000
Inside this command's main() function, it qsubs the steps described below. It does this by substituting values into a template shell script. You can edit the shell scripts. If, for instance, there is less work than there are tasks, that will lead to the assembly failing. In that case, reduce the number of cluster tasks to be less than the length of the plan file (minus one), and run again.
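As a minimal sketch of that check, assuming the plan length here means the number of rows in the work.csv file that the plan step writes (described below), and using the example output directory from this document:

# Count the work items the plan produced; subtract one if the file has a header row.
wc -l /ihme/malaria_modeling/projects/globalrc/outputs/basicr/201121_split100/work.csv
# If that count is not larger than the number of tasks, lower --tasks and the -t range
# in the worker shell script before running again.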
Your job is to log in to the cluster, submit the first script, and look at its job number. That job number will have digits and then a period, like this: 234247929.1-1000. The part before the period is the Job ID, and the rest is the Task ID. When you submit the assembly step, you can qsub it and append a flag to that qsub command to tell it to wait for the previous Job ID.
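A hedged sketch of that hand-off, assuming the assembly step is wrapped in its own qsub script (called rc_assemble.sh here purely for illustration; only rc_worker.sh is named in this document):

# Submit the worker array job and note the Job ID that qsub prints,
# e.g. 234247929 from 234247929.1-1000.
qsub rc_worker.sh
# Submit the assembly so SGE holds it until that job finishes.
qsub -hold_jid 234247929 rc_assemble.sh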
The first script is rc_run_plan.R. It takes these arguments:
--config=rc_kappa.toml - This one uses parameters from the rc_kappa.toml configuration file.
--years=2000:2019 - You have to give it a years argument, and that argument must be at least two years, for instance, 2017:2018.
--draws=100 - The draws can be 1, in which case it uses mean values.
--tasks=100 - Not used for this step, but I like to keep the arguments consistent. This doesn't match draws in any way. It's the number of SGE jobs we will run.
--outvars=201121_split100 - This is the output directory. It will be /ihme/malaria_modeling/projects/globalrc/outputs/basicr/201121_split100.

/ihme/singularity-images/rstudio/shells/execRscript.sh -i /ihme/singularity-images/rstudio/ihme_rstudio_4030.img -s rc_run_plan.R --config=rc_kappa.toml --outvars=201121_split100 --years=2000:2019 --draws=100 --tasks=100
This step is quick to run. I do it interactively. It creates two files in the outvars
directory:
plan.json - This is the extent of the map to process and information about the blocksize and coordinates in lat-long.
work.csv - This is a list of tiles which have a nonzero PfPR value, where a tile is a 32x32 set of pixels in each image. The blocksize is set in the rc_kappa.toml file.
The worker runs under SGE. The shell script, submitted with qsub rc_worker.sh, is set up to handle both the initial run and rerunning any tasks that failed to complete. In it, you'll see these choices:
-cwd - Start the process in the same directory as the one in which I invoke qsub rc_worker.sh.
-l fthread=2 - Each R process takes one thread, but it can be helpful to allow another thread because I/O can take its own thread sometimes.
-o /share/temp/sgeoutput... - This says that each task gets its own output file, containing both output and error, because of the -j y argument.
-t 1-100 - This tells the scheduler to run 100 copies of this script as what the scheduler calls tasks. Each task has its own SGE_TASK_ID set to its index in the 100, starting from 1, going to 100. Each time the R code runs, it checks the SGE_TASK_ID environment variable in order to figure out its task.
-q all.q - This is the general queue. Works fine.
-l h_rt=3:00:00 - Allow three hours to run. One hour was too little, and sometimes I/O can be much slower than normal, so leave room, but not 72 hours, because then it won't get scheduled soon.
-l m_mem_free=4G - Request 4GB of RAM. This seems to need about 2GB.
This worker took over an hour at 9 tiles per worker with 100 draws. That means 1 tile for 1000 draws could be 2 hours.
If I haven't run this script on a given number of draws, I run it first with tasks -t 100-100 and record its job id, in order to see how long it takes and how much memory it needs. These show up in qacct -j <job_id> as RSS (in kB) and wallclock time. Another, easier way is to run the command interactively, prefixed by /usr/bin/time --verbose and with --task=1 added at the end. This reports both the real time and the maximum resident set size, in kilobytes. Divide by 1024^2 to get GB for the qsub command. It should be near 1-4 GB.
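As a hedged example of that interactive measurement, reusing the worker invocation from the script below and assuming --task=1 is accepted on the command line in the same way:

/usr/bin/time --verbose \
  /ihme/singularity-images/rstudio/shells/execRscript.sh \
  -i /ihme/singularity-images/rstudio/ihme_rstudio_4030.img \
  -s rc_run_worker.R --config=rc_kappa.toml --outvars=201121_split100 \
  --years=2000:2019 --tasks=100 --draws=100 --task=1
# "Maximum resident set size (kbytes)" / 1024^2 gives the GB to request with m_mem_free.
# "Elapsed (wall clock) time" guides the h_rt request.

The rc_worker.sh script that qsub runs is the following.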
#!/bin/sh
#$ -P proj_mmc
#$ -cwd
#$ -j y
#$ -o /share/temp/sgeoutput/adolgert/rc_kappa_$JOB_ID_$TASK_ID.txt
#$ -N rc_split
#$ -l fthread=2
#$ -t 1-100
#$ -q all.q
#$ -l h_rt=3:00:00
#$ -l m_mem_free=4G
#$ -S /bin/bash
# If there were tasks that didn't run, paste their numbers here.
# This way you can submit one job that works through the missing ones.
# MISSING=40,44,47,49,50,52
MISSING=
if [[ -n "${MISSING}" ]]
then
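# Remap this SGE task index to the corresponding entry in the MISSING list.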
export SGE_TASK_ID=`echo $MISSING | cut -d"," -f"${SGE_TASK_ID}"`
fi
VERSION=201121_split100
/ihme/singularity-images/rstudio/shells/execRscript.sh \
-i /ihme/singularity-images/rstudio/ihme_rstudio_4030.img \
-s rc_run_worker.R --config=rc_kappa.toml --outvars=${VERSION} \
--years=2000:2019 --tasks=100 --draws=100
The MISSING lines are there in case any tasks fail, which is a common problem on this cluster. You'll see that the next step, which assembles the data, will report missing tasks to the command line. You can copy those values here. Then, if there are 6 missing, change the task range to -t 1-6, and, instead of executing the 1st-6th tasks, it will do the missing ones.
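As a sketch, using the six example task numbers already shown in the script's comment, the edited lines in rc_worker.sh would look like this:

MISSING=40,44,47,49,50,52
#$ -t 1-6
# With these edits, SGE task 1 runs missing task 40, task 2 runs 44, and so on.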
These runs generate one HDF5 file for each worker, in the same outvars directory as above.
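A quick, hedged way to confirm that every worker wrote its file (this assumes the per-task files end in .h5; the actual names come from serialize.R and may differ):

ls /ihme/malaria_modeling/projects/globalrc/outputs/basicr/201121_split100/*.h5 | wc -l
# This count should equal the number of tasks, here 100.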
Finally, we read all of the HDF5 files and create GeoTIFFs, one for each variable, for each median and confidence interval, for each year.
The rc_run_assemble.R
script takes the same arguments as those above. If this script doesn’t find all of its inputs, it will list those that are missing and print the missing tasks in a format suitable to place in the MISSING
variable in the qsub script above. If you know some jobs failed, running this is an easy way to find the failures.
This script takes about 3GB to read data from 100 files. It works variable by variable, so the memory usage is low. It would be possible to run this in parallel over the variables. It takes about 20 minutes now.
/ihme/singularity-images/rstudio/shells/execRscript.sh -i /ihme/singularity-images/rstudio/ihme_rstudio_4030.img -s rc_run_assemble.R --config=rc_kappa.toml --outvars=201121_split100 --years=2000:2019 --draws=100 --tasks=100
The assembly workers can run in parallel with GNU parallel. You use the task ID to say which variable each worker will save. There are currently 18 variables, but that may change.
parallel Rscript scripts/rc_run_assemble.R --config=scripts/sam_mean.toml --outvars=201124_africa_mean --years=2017:2017 --draws=100 --tasks=217 --overwrite --task={} ::: {1..18}