Overview
The Unity cluster is a high-performance computing (HPC) environment maintained by Arts and Sciences Technology Services (ASCTech). Unity mirrors the environment at the Ohio Supercomputer Center (OSC), and provides researchers with convenient computational resources while relieving them of the burdens of administering servers and storage, maintaining an appropriate physical environment, and complying with many security requirements.
The cluster consists of many individual computers called nodes that handle specific tasks. All of the nodes run Red Hat Enterprise Linux. Research groups can buy a compute node and have exclusive use of it or allow shared use by all Unity users (or alternate those models—exclusive use when needed by the group, shared when not in use by the group). ASCTech provides several compute nodes for shared use. We maintain a list of nodes and their specifications.
ASCTech maintains a software stack of popular research applications, and you can request that we install additional software. See the Module environment section below for how to list and use these applications. In addition, you can install your own software in your home directory; in fact, that's how you install packages or modules for R, Python, and Perl.
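For example, a user-level install of a Python package (the package name is just a placeholder) looks like this:
$ module load python
$ pip install --user numpy
The --user option tells pip to install under your home directory instead of the system-wide location. Similarly, when you run install.packages() from an R prompt and the system library is not writable, R offers to create a personal library in your home directory.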
You get a 100-GB home directory that is available on all the nodes in the cluster. Additional project storage is available on request.
Getting Started
Before you can use Unity, we need to enable your account. With that done, you log in to Unity with your ASC account using ssh (Windows, Mac, and Linux computers typically include the SSH suite of utilities, but there are other applications that make this easier).
$ ssh name.#@unity.asc.ohio-state.edu
This will log you in to a head or login node. You may be able to tell that from the operating system prompt that you see—by default, your prompt is your name.# followed by the name of the node that you're on. So on a login node, your prompt will be something like name.#@unity-login1, where unity-login1 is the name of the login node.
If you are not on an ASC network (for example, from OSU Wireless or from home), you’ll need to connect through ASCTech's jump host (preferred) or connect to the ASC VPN first.
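If your SSH client supports the ProxyJump option (-J), you can hop through the jump host in a single command; the jump host name below is only a placeholder—use the hostname ASCTech provides:
$ ssh -J name.#@jumphost.asc.ohio-state.edu name.#@unity.asc.ohio-state.edu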
Basic Use
Module environment
To accommodate users who may need different versions of applications, Unity (like OSC) uses the Lmod Environment Modules package. This provides a common stack of requested software, but it requires that you explicitly load many applications before you can use them.
To see the modules that are available on Unity, at a command prompt type
$ module avail
This reports the applications (and their versions) that are installed in the module system. For example, module avail reports multiple versions of Python, including python/2.7 and python/3.5, and python/3.5 is followed by (D). The (D) indicates that 3.5 is the default version of Python; you can load Python 3.5 for use by typing
$ module load python
If you really want Python 2.7, enter
$ module load python/2.7
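Once a module is loaded, you can confirm which version ended up on your path, for example:
$ python --version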
You can use module load either at the command line (on the head node or in interactive mode—see below) or in a batch script. If you routinely use a module, you can put it in your .bashrc file (see the example below).
To see what modules you've already loaded, enter module list.
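For example, appending a line like the following to your .bashrc loads R at every login (the version shown is only an illustration—check module avail for what is actually installed):
$ echo 'module load R/4.0.2' >> ~/.bashrc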
Login nodes
As noted above, when you initially log in, you’re on a login node. You’ll be able to interact with Unity through the Linux command line.
A login node is a reasonable place to copy data to or from the cluster (with utilities such as scp, sftp, or rsync) or to download data from public repositories. You can also edit text files and compile code on a login node.
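For example, from your own computer you could copy a data file to your Unity home directory with scp (the file name and target path are just placeholders):
$ scp mydata.csv name.#@unity.asc.ohio-state.edu:~/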
However, a login node is not where you want to run computations. The login nodes are limited in their capabilities; any computation run on one of these nodes could impact other users. Instead, you’ll want to use either an interactive session on a compute node or submit a job using the scheduler.
Interactive mode
The simplest way to get an interactive session on a compute node is by typing (on a login node)
$ sinteractive
Your prompt will change to your name.# followed by the name of the compute node (see the Unity online documentation for a list of nodes and their specifications).
You can change the defaults for sinteractive (by default an interactive session gives you one hour with one core on one compute node and 3 GB of memory) by passing it arguments; you can get more information on sinteractive by running it with the --help argument.
$ sinteractive --help
Usage: sinteractive [-p] [-N] [-n] [-c] [-m] [-M] [-g] [-G] [-L] [-t] [-J] [-A] [-w]
Optional arguments:
-p: partition of where to run job (default: debug)
-N: number of nodes to request (default: 1)
-n: number of tasks to request (default: 1)
-c: number of CPU cores to request (default: 1)
-m: memory per CPU (default: Partition default)
-M: memory per node (default: Partition default)
-g: number of GPUs to request (default: None)
-G: GRES to request (default: None)
-L: Licenses to request (default: None)
-t: Time limit (default: Partition default)
-J: job name (default: interactive)
-w: node name
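For example, to ask for four cores, 8 GB of memory per node, and a two-hour limit (adjust these values to what your work actually needs; memory and time values follow the same Slurm conventions used elsewhere on this page):
$ sinteractive -c 4 -M 8g -t 2:00:00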
sinteractive is actually a script that calls two Slurm commands, salloc and srun. You may get more control over an interactive job by doing something like this:
$ salloc --nodes=1 --ntasks-per-node=4 --mem=16g --time=04:00:00
salloc: WARNING: Due to a bug in SLURM your request of --ntasks-per-node=4 is being replaced with --ntasks=4
salloc: Pending job allocation 3021226
salloc: job 3021226 queued and waiting for resources
salloc: job 3021226 has been allocated resources
salloc: Granted job allocation 3021226
salloc: Waiting for resource configuration
salloc: Nodes p0084 are ready for job
$ srun --jobid=3021226 --pty /bin/bash
Type exit to leave the interactive session.
Batch mode
In typical use, you write Slurm scripts using a text editor and submit them as jobs to Unity using the sbatch command. For example, if your script is named myscript.sh, you would submit your job from an operating system prompt on a login node like this:
$ sbatch myscript.sh
A simple script file might look like this:
#!/usr/bin/env bash
#SBATCH --time=04:00:00
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=8
#SBATCH --job-name=R-script-test
#SBATCH --mail-type=ALL
#SBATCH --mail-user=shew.1@osu.edu
# cd to dir from which I submit job so that
# Rscript can find R code file
cd $SLURM_SUBMIT_DIR
# COMMANDS TO RUN
ml gnu/9.1.0
ml R/4.0.2
Rscript my-R-script.R
The lines that begin with #SBATCH are directives that set options in the scheduling system. For example, the line that begins #SBATCH --time sets a walltime limit of four hours. If the job is not complete in four hours it will abort, so it's important to make a generous estimate of how long you expect your job to take, but just guessing the maximum wall time (which is 336 hours, or 14 days) plays havoc with the scheduler. The line that begins #SBATCH --nodes tells the job to run on a single node, and the next line specifies eight cores on that node. The line #SBATCH --mail-type=ALL tells the scheduler to send email when the job starts, ends, or has an error, and the #SBATCH --mail-user line tells the scheduler where to send those emails (put your own email address in your script). Other lines that begin with # are comments (this is just a shell script).
Next comes the executable part of the script. Here we load a compiler and R, and then run an R script (that is itself in another text file).
The job enters Unity’s queue and runs when resources become available. You’ll get an email when the job begins and an email when it either aborts with an error or completes successfully.
You can get more immediate information about the progress of your job by using the squeue command. OSC has a page describing tools for monitoring your job.
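For example, to list only your own jobs, pass your username to squeue with the -u option:
$ squeue -u name.#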
You can cancel a running job by typing scancel <job_id> at a command prompt on Unity.