Nautilus Architecture
Nautilus is an SGI Altix UV1000 system and is the primary resource of the NICS Remote Data Analysis and Visualization Center (RDAV). It has 1024 cores (Intel Nehalem EX processors), 4 terabytes of global shared memory, and 8 GPUs in a single system image. Nautilus currently has a 960 TB GPFS file system, a CPU speed of 2.0 GHz, and a peak performance of 8.2 Teraflops. In addition, Nautilus is fully integrated with the NICS network which includes Kraken, Athena and the High Performance Storage System (HPSS).
Using Software via Modules
All software on Nautilus is managed through the module system. Here are descriptions of the various module commands:
module list
- gives listing of currently loaded modules
module avail
- shows all modules available on Nautilus
module avail app
- shows all modules available whose name matches app*
module load app
- sets up environment to use app
module unload app
- clears environment of references to app
module swap app1 app2
- unloads app1 and loads app2; usually used to swap versions or compilers
module help app
- gives specific usage info about app
module whatis app
- gives descriptive one-liner about app
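For example, a typical session might look like the following (the module name app and the version numbers are placeholders, not actual modules on Nautilus):

module avail app              # see which versions of app are installed
module load app/1.0           # set up the environment for version 1.0
module swap app/1.0 app/2.0   # switch from version 1.0 to version 2.0
module list                   # confirm which modules are now loaded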
The NICS webpage lists available software on Nautilus with descriptions and notes on usage.
Running Jobs via PBS
PBS options
Jobs are run on Nautilus through the Portable Batch System (PBS, also called Torque) which includes a job scheduler called Moab. Jobs can be run interactively or submitted to the scheduler as a batch script. When you submit a job (interactive or batch) you must supply PBS options which give the scheduler information about your run. Common options are:
-l ncpus=n
- this one is mandatory; requests n cores for your job
-l mem=m
- specifies m memory for your job; defaults to 4GB for every core requested
-l walltime=t
- specifies time limit t for your job; defaults to 1 hour for interactive or batch jobs
-A account
- specifies account to charge for your job
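For example, to request 16 cores, 64 GB of memory, and a 2 hour time limit (the specific values here are only illustrative), the resource options would be combined as:

-l ncpus=16,mem=64GB,walltime=2:00:00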
See the NICS website for a complete list and more information on all PBS options.
Jobs are submitted using the qsub
command as seen in the examples
below.
Interactive Job Example
To submit an interactive job, use the -I
PBS option. For example:
qsub -I -l ncpus=8,walltime=3:00:00 -A UT-TENN0038
will request an interactive job using 8 cores for 3 hours for account
UT-TENN0038. When your job starts, you will be on a compute node on Nautilus
with the requested cores available. To end the interactive job, simply type
exit
.
Batch Script Example
Here's an example PBS batch script to run an MPI program:
#PBS -A UT-TENN0033
#PBS -N mpi_job
#PBS -j oe
#PBS -q analysis
#PBS -l ncpus=512,walltime=10:00:00

cd $PBS_O_WORKDIR
mpiexec ./hwmpi
If this script is called script.pbs
, you would submit it with:
qsub script.pbs
This script requests 512 cores and 10 hours of runtime. The additional PBS
options specify a name for the job (-N mpi_job
),
the queue to run in (-q analysis
) and the format for job
information output (-j oe
specifies one output file combining stdout and stderr).
$PBS_O_WORKDIR
is an environment variable provided by PBS. It is
set to the path of the directory from which you executed qsub
.
Let's say in this case that the hwmpi program is in ~/foo/bar. So from within
the ~/foo/bar directory, you issue qsub script.pbs
. Now when your
job runs on a compute node, it will start in your home directory (~/). So
before it can run ./hwmpi successfully, it must cd to the directory ~/foo/bar--this is what
cd $PBS_O_WORKDIR
accomplishes in the batch script.
Job Monitoring
Two helpful commands for monitoring your job's status are showq
and qstat
. The qstat
command generally gives more
detailed information about jobs in the system. You can use the -r
option with both of these commands to limit the information to only running
jobs.
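For example, to list only the running jobs:

showq -r
qstat -r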
Running Multiple Serial Jobs in Parallel
A Common Situation
Most of the time, code that is run through the batch system in an HPC
environment is parallel--either using MPI, OpenMP or pthreads. However, a
common situation is when a user needs to run multiple instances of some serial
code simultaneously to achieve parallelism.
On some systems, the mpiexec or mpirun command can be used to run non-MPI code as a
solution to this problem; however, on Nautilus this is not an option.
Here's a batch script that would achieve this on Nautilus:
#PBS -A UT-TENN0033
#PBS -N multi_job
#PBS -j oe
#PBS -q analysis
#PBS -l ncpus=512,walltime=10:00:00

cd $PBS_O_WORKDIR
./my_program &
./my_program &
./my_program &
.
.
.
./my_program &
wait
Essentially, this script will run my_program
512 times in the background
and then wait on all processes to finish. The operating system on Nautilus will
take care of actually distributing the processes over the 512 cores allocated.
While this approach is doable, there are some problems:
- What if you need to pass different command line arguments for each run of my_program? This can be very tedious to set up.
- What if you need to run 3000 instances and there are only 128 cores available?
- How will you manage the output from all of the different runs?
Enter Eden
We developed a tool called Eden as an answer to these problems. Eden makes it very easy to run as many serial jobs as you need on whatever resources are available.
Eden
Eden is entirely script-based (Bourne shell and some Tcl), making it very portable and easy to use. To use Eden, all you need is a list of commands that you want to execute. This can be an explicit list that you generate yourself, or you can give Eden a command template and a list of parameters from which Eden will generate the command list. Both ways are shown in the examples below.
Just as with any other software on Nautilus, load Eden through the module system:
module load eden
The basic usage of Eden is:
eden {commands|params} {PBS options}
The first argument tells Eden whether you're providing a list of commands or
a parameter file. This is followed by the PBS options for your job in the
same format as you would use for a qsub
command when submitting an interactive
job.
Starting with Command List
Let's say you have the following commands that you want to run:
./my_program -t 0.2
./my_program -t 0.4
./my_program -t 0.6
./my_program -t 0.8
./my_program -t 1.0
./my_program -t 1.2
./my_program -t 1.4
./my_program -t 1.6
First make a directory to use for your run.
Name your list of commands commands
and place it in the directory.
Now you're ready to run Eden from your directory. This will be a very small example using only
4 cores to run the 8 commands:
module load eden
eden commands -l ncpus=4,walltime=1:00:00 -A UT-TENN0033
Notice that the PBS options are in the same format as you would normally use with qsub. Eden will spit out some messages informing you of its actions and then submit your job to the queue. When your job is complete, you'll have a number of files in your directory:
commands_done
- as commands are completed, their index number is written to this file; if a job is stopped prematurely, this file is used as a restart file for Eden; also, you can check this file as your job is running to monitor its progress
eden.config
- this file lists all of the information regarding your job; it is used to coordinate the different processes of Eden
eden_job.pbs
- this is the actual PBS script generated by Eden to submit your job
- You'll also have the usual PBS output files from your job submission
In addition, you'll have a directory named with the timestamp of your Eden run.
Within that directory are all of the output files from the individual commands.
Each command will have a .out
, a .err
and a .time
file.
Note that Eden treats each line of your commands
file as a command
to run--but a single line could contain multiple commands, separated by
semicolons, if you wish.
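For example, a single line like the following (the directory name is hypothetical) is treated as one command by Eden, even though it runs two shell commands:

cd run1; ./my_program -t 0.2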
Automatic Command Generation
Now let's say that your command list is going to be too much trouble to produce manually. You would rather just say, "Here's my program. Here are my parameters that I want to run through. Go." Eden makes this possible.
Here's an example params
file that you would use (instead of the
commands
file) to generate the command list from the above example:
./my_program -t $threshold
threshold 0.2 0.4 0.6 0.8 1.0 1.2 1.4 1.6
Now you would call this file params
and then run Eden with:
eden params -l ncpus=4
and Eden will automatically generate the command list to run your jobs.
Params File
The first line of the params file is your command template. This is
simply the command (or a series of commands separated by semicolons) that
you wish to run with placeholders for the
parameters that will vary. In the above example, $threshold
is the
placeholder. Placeholders are signified by prepending a $
just as
you would for variables in a shell script.
Following the command template, you simply list the parameters (that correspond
to the placeholders) along with a list of values they can take.
This makes it very easy to accomplish runs in which you need to try all possible
permutations of a number of parameters. Here's another example params
file:
./blackbox -t $threshold -p output$i.png $alpha $beta $gamma
threshold 0.2 0.4 0.6 0.8 1.0 1.2 1.4 1.6
alpha NULL -a
beta NULL -b
gamma NULL -g
There are two additional things in this example. First of all, you can use the
built-in placeholder $i
which is simply the index number of each
command. In this case, the output file signified by the -p
option will be appended with an incrementing index for each run. Also, if you
want a parameter to be either present or not present, use the NULL
keyword as one of the possible values for the parameter. Here are the first
few commands generated by the above params file:
./blackbox -t 0.2 -p output0.png
./blackbox -t 0.2 -p output1.png -g
./blackbox -t 0.2 -p output2.png -b
./blackbox -t 0.2 -p output3.png -b -g
./blackbox -t 0.2 -p output4.png -a
./blackbox -t 0.2 -p output5.png -a -g
./blackbox -t 0.2 -p output6.png -a -b
./blackbox -t 0.2 -p output7.png -a -b -g
./blackbox -t 0.4 -p output8.png
As you can see, Eden will generate every possible permutation of the values for threshold, alpha, beta and gamma.
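For comparison, here is a rough sketch of what Eden automates behind the scenes: building the same kind of command list with nested shell loops (the ordering and exact spacing of the commands Eden actually produces may differ).

# Sketch only: generate one command per combination of parameter values,
# numbered by $i, and write the result to a file named commands.
i=0
for threshold in 0.2 0.4 0.6 0.8 1.0 1.2 1.4 1.6; do
  for alpha in "" "-a"; do
    for beta in "" "-b"; do
      for gamma in "" "-g"; do
        echo "./blackbox -t $threshold -p output$i.png $alpha $beta $gamma"
        i=$((i+1))
      done
    done
  done
done > commands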
Large-scale parameter studies with serial code can now be set up and managed quickly and easily with Eden on Nautilus.