General information

The HPCC is a CentOS 7 Linux compute cluster that is actively growing through the combination of separate computing clusters into a single, shared resource.

About HPCC

Forged from ancient fires deep in the Hill Center underworld



Xeon Phis, NVIDIA Teslas & Quadros, and multiple generations of Xeon CPU cores



Benchmark results and performance summaries for specific applications


The HPCC includes the following hardware (this list may already be outdated since the cluster is actively growing):

53 CPU-only nodes, each with 16 Intel Xeon E5-2670 (Sandy Bridge) cores + 128 GB RAM
5 CPU-only nodes, each with 16 Intel Xeon E5-2670 (Ivy Bridge) cores + 128 GB RAM
26 CPU-only nodes, each with 16 Intel Xeon E5-2670 (Haswell) cores + 128 GB RAM
4 CPU-only nodes, each with 16 Intel Xeon E5-2680 (Broadwell) cores + 128 GB RAM
3 16-core E5-2670 nodes with 8 NVIDIA Tesla M2070 GPUs onboard
2 28-core E5-2680 nodes with 4 NVIDIA Quadro M6000 GPUs onboard
1 16-core E5-2670 node with 8 Xeon Phi 5110P accelerators onboard

Default run time = 2 hours in the main partition (30 min in the testing partition)
Maximum run time = 14 days

Connecting to the HPCC

The HPCC is currently accessed using a single login node, fen2 (“fen” = front end node).

ssh [your NetID]

Moving files

There are many different ways to do this: secure copy (scp), remote sync (rsync), an FTP client (FileZilla), etc. Let’s assume you’re logged in to a local workstation or laptop (not already logged in to the HPCC). To send files from your local system to your HPCC /home directory,

scp file-1.txt file-2.txt [NetID][NetID]

To pull a file from your HPCC /home directory to your laptop (note the “.” at the end of this command),

scp [NetID][NetID]/file-1.txt  . 

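If you want to copy an entire directory and its contents, you can “package” your directory into a single, compressed file before moving it with scp (scp -r can also copy a directory recursively, but a single archive is usually quicker to transfer when the directory holds many small files):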

tar -czf my-directory.tar.gz my-directory

After moving it, you can “unpackage” that .tar.gz file to get your original directory and contents:

tar -xzf my-directory.tar.gz
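The pack and unpack steps can be tried end to end without touching real data. The following is a minimal sketch that runs in a throwaway directory; the file name notes.txt and the restored directory are hypothetical placeholders, not part of the HPCC setup.

```shell
#!/bin/bash
# Round-trip sketch: pack a directory, inspect the archive, unpack it elsewhere.
# my-directory/notes.txt and "restored" are hypothetical names for illustration.
set -e

workdir=$(mktemp -d)            # scratch area so nothing real is touched
cd "$workdir"

mkdir -p my-directory
echo "hello" > my-directory/notes.txt

# -c create, -z gzip-compress, -f archive file name
tar -czf my-directory.tar.gz my-directory

# -t lists the archive contents without extracting (a quick sanity check)
tar -tzf my-directory.tar.gz

# -x extract; -C chooses the directory to extract into
mkdir -p restored
tar -xzf my-directory.tar.gz -C restored

# diff -r exits non-zero if the restored copy differs from the original
diff -r my-directory restored/my-directory && echo "round trip OK"
```

Listing the archive with -t before transferring it is a cheap way to confirm that the expected files were actually packed.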

Listing available resources

Before requesting resources (compute nodes), it’s helpful to see what resources are available and what cluster partitions (job queues) to use for certain resources.

Example of using the sinfo command:

$ sinfo

PARTITION    AVAIL TIMELIMIT  NODES  STATE NODELIST
testing*     up    2:00:00      1    mix slepner001
testing*     up    2:00:00      1  alloc slepner002
main         up 14-00:00:0      1 drain* slepner041
main         up 14-00:00:0      1   fail slepner057
main         up 14-00:00:0     53    mix gpu[001-004],slepner[001-082]
main         up 14-00:00:0     26  alloc slepner[002,004,006,008-079]
tesla        up 14-00:00:0      3    mix gpu[001-003]
xeonphi      up 14-00:00:0      1    mix gpu004
maxwell      up 14-00:00:0      2   idle gpu[005-006]
admin        up    2:00:00      1   idle slepnert001

Understanding this output:

Slepner? Norse Mythology, "Sleipnir" 8-legged war horse (this made more sense when CPUs had 8 cores).

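This output lists 6 partitions. The 4 described here are testing (traditional compute nodes, CPUs only), main (traditional compute nodes, CPUs only), tesla (nodes with general-purpose GPU accelerators), and xeonphi (nodes with Xeon Phi coprocessors). The maxwell (gpu[005-006]) and admin partitions also appear in the listing.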

The upper limit for a job’s run time is 14 days (336 hours), but the testing partition has a limit of 2 hours.

Allocated (alloc) nodes are currently running jobs.

Mixed (mix) nodes have jobs using some, but not all, CPU cores onboard.

Idle nodes are currently available for new jobs.

Drained (drain, drng) nodes are not available for use and may be offline for maintenance.
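When the listing is long, it can help to total the node counts per state for one partition. The sketch below does that with awk; the function name nodes_by_state is hypothetical, and the sample rows are copied from the sinfo listing above so that it runs without a scheduler.

```shell
# Sum the NODES column per STATE for one partition.
# Reads sinfo-style rows on stdin:
#   PARTITION AVAIL TIMELIMIT NODES STATE NODELIST
nodes_by_state() {
    local partition=$1
    awk -v p="$partition" '
        { name = $1; sub(/\*$/, "", name) }   # strip the "*" that marks the default partition
        name == p {
            state = $5
            sub(/\*$/, "", state)             # strip the "*" that sinfo appends to some states
            count[state] += $4
        }
        END { for (s in count) print s, count[s] }
    ' | sort
}

# Sample rows copied from the sinfo listing above.
sample='testing*     up    2:00:00      1    mix slepner001
testing*     up    2:00:00      1  alloc slepner002
main         up 14-00:00:0      1 drain* slepner041
main         up 14-00:00:0      1   fail slepner057
main         up 14-00:00:0     53    mix gpu[001-004],slepner[001-082]
main         up 14-00:00:0     26  alloc slepner[002,004,006,008-079]'

printf '%s\n' "$sample" | nodes_by_state main
```

On the cluster, the same function could presumably be fed live data with sinfo | nodes_by_state main; a header row is ignored because its first field does not match a partition name.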

Loading software modules

When you first log in, only basic system-wide tools are available automatically. To use a specific software package that is already installed, you can set up your environment using the module system.

The module avail command will show a list of the core (primary) modules available:

$ module avail

---------------------- /opt/sw/modulefiles/Core ----------------------

   ARACNE/20110228     bowtie2/2.2.9          (D)    gcc/5.3          (D)    java/1.7.0_79        python/3.5.0
   HISAT2/2.0.4        bwa/0.7.12                    hdf5/1.8.16             java/1.8.0_66        python/3.5.2        (D)
   OpenCV/2.3.1        bwa/0.7.13             (D)    intel/16.0.1            java/1.8.0_73 (D)    samtools/0.1.19
   STAR/2.5.2a         cuda/7.5                      intel/16.0.3     (D)    modeller/9.16        samtools/1.2
   Trinotate/2.0.2     cufflinks/2.2.1               intel/17.0.0            mvapich2/2.1         samtools/1.3.1      (D)
   bamtools/2.4.0      delly/0.7.6                   intel/17.0.1            pgi/16.9             test/666
   bcftools/1.2        gaussian/g03revE01            intel_mkl/16.0.1        pgi/16.10     (D)    trinityrnaseq/2.1.1
   bedtools2/2.25.0    gaussian/09revD01.orig        intel_mkl/16.0.3 (D)    python/2.7.10
   blat/35             gaussian/09revD01      (D)    intel_mkl/17.0.0        python/2.7.11
   bowtie2/2.2.6       gcc/4.9.3                     intel_mkl/17.0.1        python/2.7.12

Understanding this output:

The packages with a (D) are the default versions for packages where multiple versions are available.

To see a comprehensive list of all available modules (not just the core modules) use the module spider command.

$ module spider

The following is a list of the modules currently available:
  ARACNE: ARACNE/20110228
    ARACNE: an algorithm for the reconstruction of gene regulatory networks in a mammalian cellular context

  HISAT2: HISAT2/2.0.4
    HISAT2: graph-based alignment of next generation sequencing reads to a population of genomes

  HMMER: HMMER/3.1b2
    HMMER: biosequence analysis using profile hidden Markov models

  NAMD: NAMD/2.10
    NAMD: Scalable Molecular Dynamics

  ORCA: ORCA/3.0.3
    ORCA: An ab initio, DFT and semiempirical SCF-MO package

  OpenCV: OpenCV/2.3.1
    OpenCV: Open Source Computer Vision

  PETSc: PETSc/3.6.3
    PETSc: Portable, Extensible Toolkit for Scientific Computation

Loading a software module changes your environment settings so that the executable binaries, needed libraries, etc. are available for use.

To load a software module, use the module load command, followed by the name and version desired.

To remove select modules, use the module unload command. To remove all loaded software modules, use the module purge command.

To load the default version of any software package, use the module load command but only specify the name of the package, not the version number.

Below are some examples.

$ module load intel/16.0.3 mvapich2/2.1
$ module list
Currently Loaded Modules:
  1) intel/16.0.3   2) mvapich2/2.1

$ module unload mvapich2/2.1
$ module list
Currently Loaded Modules:
  1) intel/16.0.3

$ module purge
$ module list
No modules loaded

$ module load intel
$ module list
Currently Loaded Modules:
  1) intel/16.0.3

If you always use the same software modules, your ~/.bashrc (a hidden login script located in your /home directory) can be configured to load those modules automatically every time you log in. Just add your desired module load command(s) to the end of that file. You can always edit your ~/.bashrc file to change or remove those commands later.
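One way to add such a line without creating duplicates on repeated runs is sketched below. The helper name append_once is hypothetical, and the demonstration writes to a temporary file rather than a real ~/.bashrc.

```shell
# Append a line to a file only if that exact line is not already present.
append_once() {
    local line=$1 file=$2
    touch "$file"
    # -F fixed string, -x whole-line match, -q quiet
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Demonstrated on a throwaway file; on the cluster the target would be ~/.bashrc.
rc_file=$(mktemp)
append_once 'module load intel/16.0.3 mvapich2/2.1' "$rc_file"
append_once 'module load intel/16.0.3 mvapich2/2.1' "$rc_file"   # second call adds nothing
cat "$rc_file"
```

Checking for the line first keeps the file from accumulating repeated module load commands if the setup step is run more than once.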

Running a serial (single-core) job

Here’s an example of a SLURM job script for a serial job. I’m running a program called “zipper” which is in my /scratch (temporary work) directory. I plan to run my entire job from within my /scratch directory because that offers the best filesystem I/O performance.


#!/bin/bash
#SBATCH --partition=main         # Partition (job queue)
#SBATCH --job-name=zipx001a      # Assign an 8-character name to your job, no spaces
#SBATCH --nodes=1                # Number of compute nodes
#SBATCH --ntasks=1               # Total number of tasks (processes, usually = cores)
#SBATCH --cpus-per-task=1        # Threads per process (or per core)
#SBATCH --mem=2000               # Total real memory required (MB) for each node
#SBATCH --time=02:00:00          # Total run time limit (HH:MM:SS)
#SBATCH --output=slurm.%N.%j.out # Combined STDOUT and STDERR output file
#SBATCH --export=ALL             # Export your current environment settings to the job environment

cd /scratch/[your NetID]

/scratch/[your NetID]/zipper/2.4.1/bin/zipper <

Understanding this job script:

A job script contains the instructions for the SLURM workload manager (cluster job scheduler) to manage resource allocation, scheduling, and execution of your job.

The lines beginning with #SBATCH contain commands intended only for the workload manager.

My job will be assigned to the “main” partition (job queue).

This job will only use 1 CPU core and should not require much memory, so I have requested only 2 GB of RAM — it’s a good practice to request only about 2 GB per core for any job unless you know that your job will require more than that.

My job will be terminated when the run time limit has been reached, even if the program I’m running is not finished. It is not possible to extend this time after a job starts running.

Any output that would normally go to the command line will be redirected into the output file I have specified, and that file will be named using the compute node name and the job ID number.

Here’s how to run a serial batch job, loading modules and using the sbatch command:

First, be sure to configure your environment as needed for running your job. This usually means loading any needed modules.

$ module purge
$ module load intel/16.0.3 fftw/3.3.1
$ sbatch

The sbatch command reads the contents of your job script and forwards those instructions to the SLURM workload manager. Depending on the level of activity on the cluster, your job may wait in the job queue for minutes or hours before it begins running.
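After a successful submission, sbatch prints a confirmation line of the form "Submitted batch job <ID>". Capturing that ID makes it easier to monitor or cancel the job later. The sketch below parses a simulated confirmation line so that it runs without a scheduler; the function name job_id_from_sbatch is hypothetical.

```shell
# Extract the numeric job ID from sbatch's confirmation line.
# Assumes the standard message format: "Submitted batch job <ID>".
job_id_from_sbatch() {
    sed -n 's/^Submitted batch job \([0-9][0-9]*\).*$/\1/p'
}

# Simulated sbatch output, so the sketch runs without a scheduler.
echo "Submitted batch job 1633383" | job_id_from_sbatch
```

On the cluster, the real sbatch output would be piped into the function in place of the echo, and the resulting ID could then be passed to commands such as squeue -j or scancel.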

Running a parallel (multicore MPI) job

Here’s an example of a SLURM job script for a parallel job. See the previous (serial) example for some important details omitted here.


#!/bin/bash
#SBATCH --partition=main         # Partition (job queue)
#SBATCH --job-name=zipx001a      # Assign an 8-character name to your job, no spaces
#SBATCH --nodes=1                # Number of compute nodes
#SBATCH --ntasks=16              # Total number of tasks to run (usually = cores)
#SBATCH --cpus-per-task=1        # Threads per process (or per core)
#SBATCH --mem=124000             # Total real memory required (MB) for each node
#SBATCH --time=02:00:00          # Total run time limit (HH:MM:SS)
#SBATCH --output=slurm.%N.%j.out # Combined STDOUT and STDERR output file
#SBATCH --export=ALL             # Export your current environment settings to the job environment

cd /scratch/[your NetID]

srun --mpi=pmi2 /scratch/[your NetID]/zipper/2.4.1/bin/zipper <

Understanding this job script:

The srun command launches the parallel tasks of your job and coordinates communication among them. The number of tasks usually matches the --ntasks value in your job’s hardware allocation request; when srun is run inside a job script without an explicit task count, as it is here, it uses that value.

This job will use 16 CPU cores and nearly 8 GB of RAM per core, so I have requested a total of 124 GB of RAM — it’s a good practice to request only about 2 GB per core for any job unless you know that your job will require more than that.
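Because --mem is specified per node, the value is the number of cores used on the node multiplied by the memory wanted per core. A small sketch of that arithmetic follows; the helper name mem_per_node_mb is hypothetical, and the script follows this guide's convention of treating 2000 MB as roughly 2 GB.

```shell
# Compute a --mem value (MB per node) from cores per node and MB per core.
mem_per_node_mb() {
    local cores=$1 mb_per_core=$2
    echo $(( cores * mb_per_core ))
}

mem_per_node_mb 1 2000     # serial job at the ~2 GB/core guideline
mem_per_node_mb 16 2000    # 16 cores at the ~2 GB/core guideline
mem_per_node_mb 16 7750    # 16 cores at 7750 MB/core, the 124000 MB requested above
```

The last call shows where "nearly 8 GB of RAM per core" comes from: 124000 MB spread over 16 cores is 7750 MB each.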

Here’s how to run a parallel batch job, loading modules and using the sbatch command:

$ module purge
$ module load intel/16.0.3 fftw/3.3.1 mvapich2/2.1
$ sbatch

Note here that I’m also loading the module for the parallel communication libraries (MPI libraries) needed by my parallel executable.

Running an interactive job

An interactive job gives you an active connection to a compute node (or collection of compute nodes) where you will have a login shell and you can run commands directly on the command line. This can be useful for testing, short analysis tasks, computational steering, or for running GUI-based applications.

When submitting an interactive job, you can request resources (single or multiple cores, memory, GPU nodes, etc.) just like you would in a batch job:

[NetID@fen2 ~]$ srun --partition=main --nodes=1 --ntasks=1 --cpus-per-task=1 --mem=2000 --time=00:30:00 --export=ALL --pty bash -i

srun: job 1365471 queued and waiting for resources
srun: job 1365471 has been allocated resources

[NetID@slepner045 ~]$

Notice that, when the interactive job is ready, the command prompt changes from NetID@fen2 to NetID@slepner045. This change shows that I’ve been automatically logged in to slepner045 and I’m now ready to run commands there. To exit this shell and return to the shell running on the fen2 login node, type the exit command.

Monitoring the status of jobs

The simplest way to quickly check on the status of active jobs is by using the squeue command:

$ squeue -u [your NetID]

  JOBID PARTITION     NAME     USER  ST       TIME  NODES NODELIST(REASON)
1633383      main   zipper    xx345   R       1:15      1 slepner036

Here, the state of each job is typically listed as either PD (pending) or R (running), along with the amount of allocated time that has been used (DD-HH:MM:SS).
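To compare elapsed time against a limit, it can be convenient to convert these time strings to seconds. The sketch below handles the MM:SS, HH:MM:SS, and DD-HH:MM:SS forms; the function name slurm_time_to_seconds is hypothetical, and other forms that SLURM accepts when specifying limits are deliberately rejected.

```shell
# Convert a time string in MM:SS, HH:MM:SS, or DD-HH:MM:SS form to seconds.
slurm_time_to_seconds() {
    local t=$1 days=0 h=0 m=0 s=0
    local -a parts
    if [[ $t == *-* ]]; then
        days=${t%%-*}              # text before the first "-"
        t=${t#*-}                  # text after it
    fi
    IFS=: read -r -a parts <<< "$t"
    case ${#parts[@]} in
        3) h=${parts[0]}; m=${parts[1]}; s=${parts[2]} ;;
        2) m=${parts[0]}; s=${parts[1]} ;;
        *) echo "unsupported time format: $1" >&2; return 1 ;;
    esac
    # 10# forces base 10 so that values like "08" are not read as octal
    echo $(( 10#$days * 86400 + 10#$h * 3600 + 10#$m * 60 + 10#$s ))
}

slurm_time_to_seconds 1:15          # elapsed time from the squeue example
slurm_time_to_seconds 02:00:00      # the 2-hour limit
slurm_time_to_seconds 14-00:00:00   # the 14-day maximum
```

The 14-day maximum works out to 1209600 seconds, which is the same 336 hours quoted earlier for the upper run time limit.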

For summary accounting information (including jobs that have already completed), you can use the sacct command:

$ sacct

       JobID    JobName  Partition    Account  AllocCPUS      State ExitCode 
------------ ---------- ---------- ---------- ---------- ---------- -------- 
1633383          zipper       main      statx         16    RUNNING      0:0

Here, the state of each job is listed as being either PENDING, RUNNING, COMPLETED, or FAILED.

Killing / cancelling / terminating jobs

To terminate a job, regardless of whether it is running or just waiting in the job queue, use the scancel command and specify the JobID number of the job you wish to terminate:

$ scancel 1633383

A job can only be cancelled by the owner of that job. When you terminate a job, a message from the SLURM workload manager will be directed to STDERR and that message will look like this:

slurmstepd: *** JOB 1633383 ON slepner036 CANCELLED AT 2016-10-04T15:38:07 ***

Installing your own software

Package management systems like yum or apt-get, which are used to install software in typical Linux systems, are not available to users of shared computing resources like the HPCC. Thus, most packages need to be compiled from their source code and then installed. Further, most packages are generally configured to be installed in /usr or /opt, but these locations are inaccessible to (not writeable for) general users. Special care must be taken by users to ensure that the packages will be installed in their own /home directory (/home/[NetID]).

As an example, here are the steps for installing ZIPPER, a generic example package that doesn’t actually exist:

(1) Download your software package. You can usually download a software package to your laptop, and then transfer the downloaded package to your /home/[NetID] directory on the HPCC for installation. Alternatively, if you have the http or ftp address for the package, you can transfer that package directly to your home directory while logged in to the HPCC using the wget utility:

$ wget

(2) Unzip and unpack the .tar.gz (or .tgz) file. Most software packages are distributed as a .zip, .tar, or .tar.gz file. You can use the tar utility to unpack .tar and .tar.gz (or .tgz) files; a .zip file needs the unzip utility instead:

$ tar -zxf zipper-4.1.5.tar.gz

(3) Read the instructions for installing. Many packages come with an INSTALL or README file with instructions for setting up that package. Many will also explicitly include instructions on how to do so on a system where you do not have root access. Alternatively, the installation instructions may be posted on the website from which you downloaded the software.

$ cd zipper-4.1.5
$ less README

(4) Load the required software modules for installation. Software packages generally have dependencies, i.e., they require other software packages in order to be installed. The README or INSTALL file will generally list these dependencies. Often, you can use the available modules to satisfy these dependencies. But sometimes, you may also need to install the dependencies for yourself. Here, we load the dependencies for ZIPPER:

$ module load intel/16.0.3 mvapich2/2.1

(5) Perform the installation. The next few steps vary widely but instructions almost always come with the downloaded source package. Guidance on the special arguments passed to the configure script is often available by running the ./configure --help command. What you see below is just a typical example of special options that might be specified.

$ ./configure --prefix=/home/[NetID]/zipper/4.1.5 --disable-float --enable-mpi --without-x --disable-shared
$ make -j 4
$ make install

Several packages are set up in a similar way, i.e., using configure, then make, and make install. Note the options provided to the configure script – these differ from package to package, and are documented as part of the setup instructions, but the prefix option is almost always supported. It specifies where the package will be installed. Unless this special argument is provided, the package will generally be installed to a location such as /usr/local or /opt, but users do not have write-access to those directories. So, here, I'm installing software in my /home/[NetID]/zipper/4.1.5 directory. The following directories are created after installation:

/home/[NetID]/zipper/4.1.5/bin where executables will be placed

/home/[NetID]/zipper/4.1.5/lib where library files will be placed

/home/[NetID]/zipper/4.1.5/include where header files will be placed

/home/[NetID]/zipper/4.1.5/share/man where documentation will be placed

(6) Configure environment settings. The above bin, lib, include and share directories are generally not part of the shell environment, i.e., the shell and other programs don’t “know” about these directories. Therefore, the last step in the installation process is to add these directories to the shell environment:

export PATH=/home/[NetID]/zipper/4.1.5/bin:$PATH
export C_INCLUDE_PATH=/home/[NetID]/zipper/4.1.5/include:$C_INCLUDE_PATH
export CPLUS_INCLUDE_PATH=/home/[NetID]/zipper/4.1.5/include:$CPLUS_INCLUDE_PATH
export LIBRARY_PATH=/home/[NetID]/zipper/4.1.5/lib:$LIBRARY_PATH
export LD_LIBRARY_PATH=/home/[NetID]/zipper/4.1.5/lib:$LD_LIBRARY_PATH
export MANPATH=/home/[NetID]/zipper/4.1.5/share/man:$MANPATH

These export commands are standalone commands that change the shell environment, but these new settings are only valid for the current shell session. Rather than executing these commands for every shell session, they can be added to the end of your ~/.bashrc file, which will result in those commands being executed every time you log in to the HPCC.
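One subtlety with the export lines above: if a variable such as LD_LIBRARY_PATH starts out empty, the result ends with a stray ":", and an empty entry in a library search path is treated as the current directory. The sketch below avoids that for the executable and library paths. The helper name prepend_path is hypothetical, and $HOME stands in for /home/[NetID].

```shell
# Prepend DIR to the colon-separated variable named VAR, without leaving
# a stray ":" when VAR starts out empty or unset.
prepend_path() {
    local var=$1 dir=$2
    local current=${!var:-}      # indirect expansion: value of the variable named by $var
    if [ -n "$current" ]; then
        export "$var=$dir:$current"
    else
        export "$var=$dir"
    fi
}

prefix=$HOME/zipper/4.1.5        # install location used in this example
prepend_path PATH            "$prefix/bin"
prepend_path LIBRARY_PATH    "$prefix/lib"
prepend_path LD_LIBRARY_PATH "$prefix/lib"

echo "$LD_LIBRARY_PATH"
```

The function definition and its calls could be placed in ~/.bashrc in the same way as the plain export lines.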

Example: Running GROMACS

Here is a simple example procedure that demonstrates how to use GROMACS 2016 on the HPCC. In this example, we’ll start with a downloaded PDB file and proceed through importing that file into GROMACS, solvating the protein, a quick energy minimization, and then an MD equilibration. This example is not intended to teach anyone how to use GROMACS. Instead, it is intended to assist new GROMACS users in learning to use GROMACS on the HPCC.

(1) Download a PDB file.


(2) Load the GROMACS 2016 module and its prerequisite modules.

module load intel/17.0.1 mvapich2/2.2 gromacs/2016.1

(3) Import the PDB into GROMACS, while defining the force field and water model to be used for this system.

gmx_mpi pdb2gmx -f 5EWT.pdb -ff charmm27 -water tip3p -ignh -o 5EWT.gro -p -i 5EWT.itp

(4) Increase the size of the unit cell to accommodate a reasonable volume of solvent around the protein.

gmx_mpi editconf -f 5EWT.gro -o 5EWT_newbox.gro -box 10 10 10 -center 5 5 5

(5) Now add water molecules into the empty space in the unit cell to solvate the protein.

gmx_mpi solvate -cp 5EWT_newbox.gro -p -o 5EWT_solv.gro

(6) Prepare your SLURM job script(s). The 2 mdrun commands in the following steps can be executed from within an interactive session or they can be run in batch mode using job scripts. If your mdrun commands/job might take more than a few minutes to run, it would be best to run them in batch mode using a job script. Here’s an example job script for a GROMACS MD simulation. To run the 2 mdrun commands below, simply replace the example mdrun command in this script with one of the mdrun commands from the steps below and submit that job after preparing the simulation with the appropriate grompp step.

#!/bin/bash
#SBATCH --partition=main                # Partition (job queue)
#SBATCH --job-name=gmdrun               # Assign an 8-character name to your job
#SBATCH --nodes=1                       # Number of nodes
#SBATCH --ntasks=16                     # Total number of tasks (processes, usually = cores)
#SBATCH --cpus-per-task=1               # Threads per process (or per core)
#SBATCH --mem=124000                    # Memory per node (MB)
#SBATCH --time=00:20:00                 # Total run time limit (HH:MM:SS)
#SBATCH --output=slurm.%N.%j.out        # Combined STDOUT and STDERR output file
#SBATCH --export=ALL                    # Export your current env to the job env
srun --mpi=pmi2 gmx_mpi mdrun -v -s 5EWT_solv_prod.tpr \
                -o 5EWT_solv_prod.trr -c 5EWT_solv_prod.gro \
                -e 5EWT_solv_prod.edr -g
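Since the minimization and equilibration runs differ only in their file names, the job script can be generated from a template instead of being edited by hand. The following is a sketch under that assumption; the function name write_mdrun_job and the run_<stage>.sh naming scheme are hypothetical, and the 5EWT_solv_<stage> file names follow the pattern used in the steps of this example.

```shell
# Write a job script for one GROMACS stage (for example "mini" or "equil").
# The output file name run_<stage>.sh is an assumption made for this sketch.
write_mdrun_job() {
    local stage=$1
    local base="5EWT_solv_${stage}"
    # Unquoted heredoc: ${base} is expanded now, and "\\" becomes a literal
    # backslash so the generated script keeps its line continuations.
    cat > "run_${stage}.sh" <<EOF
#!/bin/bash
#SBATCH --partition=main
#SBATCH --job-name=gmdrun
#SBATCH --nodes=1
#SBATCH --ntasks=16
#SBATCH --cpus-per-task=1
#SBATCH --mem=124000
#SBATCH --time=00:20:00
#SBATCH --output=slurm.%N.%j.out
#SBATCH --export=ALL
srun --mpi=pmi2 gmx_mpi mdrun -v -s ${base}.tpr \\
                -o ${base}.trr -c ${base}.gro \\
                -e ${base}.edr -g
EOF
}

cd "$(mktemp -d)"                 # throwaway directory for the demonstration
for stage in mini equil; do
    write_mdrun_job "$stage"
    # bash -n parses the script without running it (a syntax check only)
    bash -n "run_${stage}.sh" && echo "run_${stage}.sh: syntax OK"
done
```

Each generated script would still need its .tpr input prepared by the matching grompp step before it is submitted with sbatch.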

(7) Perform an initial, quick energy minimization. Here, we’re using a customized MD parameters file named em.mdp, which contains these instructions:

integrator     = steep
nsteps         = 200
cutoff-scheme  = Verlet
coulombtype    = PME
pbc            = xyz
emtol          = 100

These are the commands (both the grompp step and the mdrun step) used to prepare and run the minimization:

gmx_mpi grompp -f em.mdp -c 5EWT_solv.gro -p -o 5EWT_solv_mini.tpr -po 5EWT_solv_mini.mdp

gmx_mpi mdrun -v -s 5EWT_solv_mini.tpr -o 5EWT_solv_mini.trr -c 5EWT_solv_mini.gro -e 5EWT_solv_mini.edr -g

(8) Perform a quick MD equilibration (same syntax/commands for a regular MD run). Here, we’re using a customized MD parameters file named equil.mdp, which contains these instructions:

integrator               = md
dt                       = 0.002
nsteps                   = 5000
nstlog                   = 50
nstenergy                = 50
nstxout                  = 50
continuation             = yes
constraints              = all-bonds
constraint-algorithm     = lincs
cutoff-scheme            = Verlet
coulombtype              = PME
rcoulomb                 = 1.0
vdwtype                  = Cut-off
rvdw                     = 1.0
DispCorr                 = EnerPres
tcoupl                   = V-rescale
tc-grps                  = Protein  SOL
tau-t                    = 0.1      0.1
ref-t                    = 300      300
pcoupl                   = Parrinello-Rahman
tau-p                    = 2.0
compressibility          = 4.5e-5
ref-p                    = 1.0

These are the commands (both the grompp step and the mdrun step) used to prepare and run the equilibration:

gmx_mpi grompp -f equil.mdp -c 5EWT_solv_mini.gro -p -o 5EWT_solv_equil.tpr -po 5EWT_solv_equil.mdp

gmx_mpi mdrun -v -s 5EWT_solv_equil.tpr -o 5EWT_solv_equil.trr -c 5EWT_solv_equil.gro -e 5EWT_solv_equil.edr -g