UNH Premise Cluster


Overview of the Premise cluster

High-performance computing (HPC) is the use of parallel processing for running advanced application programs efficiently, reliably and quickly.

Quick Start guide: HPC Quick Start.pdf

What is the "Condo Compute Model"?

This cluster was funded by what is commonly called the "Condo Compute Model".

For Free?

The Premise core infrastructure is provided for UNH researchers. It includes: racks, power distribution, cooling, the InfiniBand network mesh, Lustre file storage, a head node, and four compute nodes. You may thank UNH RCC, UNH Central IT, and the Research Office for providing this infrastructure.

Users compete equally for available resources in a "shared" job queue. All Premise users are expected to play nicely. An HPC Advisory Board has been created to provide RCC with direction on enforcement for the common good. Every attempt will be made to utilize available resources fairly.

Buy-in

Your budget should include "HPC buy-in" funding to satisfy your project's minimum needs. There is no other way to guarantee the required resources will be available when you need them. The "Description of Hardware" section below defines three standard node configurations (approximate prices as of 12/1/16): base ($8k), hi-ram ($12.5k), and gpu ($13.5k). Contact RCC for current pricing for your proposal budget. Your grant retains ownership of any hardware you purchase.

Owners are provided a restricted job queue with priority scheduling on any hardware they own. When no owner priority work exists, "shared" queue jobs may be scheduled on the idle hardware. Owners should expect active shared jobs to be allowed to complete in a "reasonable amount of time", which may cause wait times for some priority jobs.

Description of Hardware

The Premise cluster is an HPC system made up of:

What is the theoretical performance of this cluster?

CPU performance only

Premise has 14 compute nodes with two CPUs per node for a total of 28 CPUs. Each 12-core "Intel(R) Xeon(R) CPU E5-2680 v3 @2.50GHz" is rated at 356.50 double precision GFlops.

(Total CPU performance) 
  = (14 nodes) * (2 CPUs/node) * (356.50 GFlops/CPU) 
  = (9982 GFlops) 
  ≈ (9.98 TFlops)

GPU performance only

Four nodes of the Premise cluster each contain an NVIDIA K80 GPU card. NVIDIA specs claim up to 2.91 TFlops double precision for a single K80 card.

(Total GPU performance)
  = (4 GPU) * (2.91 TFlops/GPU) 
  = (11.64 TFlops)

Combined CPU + GPU = (21.62 TFlops)

(Combined Performance)
  = (Total CPU performance) + (Total GPU performance)
  = (9.98 TFlops) + (11.64 TFlops) 
  = (21.62 TFlops)

Usage

Premise is managed by UNH Research Computing Center staff. Please email administrative and technical requests to: rccops@sr.unh.edu

The focus of the Premise cluster is to support UNH research. If you are seeking academic student HPC experience, we currently suggest using XSEDE resources. For more information on XSEDE, please contact the UNH Campus Champion, Grace Wilson Caudill (Grace.WilsonCaudill@unh.edu).

Utilize Premise for Research

Establish a Premise account

Create a Premise account by emailing UNH Research Computing Center staff at: rccops@sr.unh.edu

For account creation requests, we suggest including the following information:

Full Name:
Email and phone:
Requested login id:
Research Group:
Expected use case / Research area:

(Other relevant information)

Examples of other relevant information:

Connecting to Premise

The only way to connect to Premise is with a Secure Shell (SSH) encrypted connection. You will need an SSH client program on your internet-connected computer to reach Premise. Often this can be done from your local command line by typing:

ssh premise.sr.unh.edu

You may need to install an SSH Client on your computer.
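
If the login id on your local computer differs from your Premise login id, you can give the Premise login id explicitly on the command line. A minimal sketch, where "username" is a placeholder for the login id you requested when your account was created:

ssh username@premise.sr.unh.edu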

Getting started

HPC software is most often field specific. You probably have a better idea than we do of where to look for relevant software tools in your field, but you are welcome to ask RCC what we might know.

If you are bringing your own source code or using common Linux tools, they may already exist on the cluster. Some software packages may be available as "modules" (more information below).

How do I run my program?

You should not run your programs directly on the Premise login head node.

First, copy any required data onto the Premise system, preferably into a subdirectory of your home area. The same home area is mounted on all of the Premise nodes.
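
For example, assuming your data lives in a local directory named "input_data" and you want it under a project directory in your Premise home area (the directory names and "username" below are placeholders):

# create a project directory in your Premise home area (name is an example only)
ssh username@premise.sr.unh.edu "mkdir -p ~/myproject"
# recursively copy the local data directory into it
scp -r ./input_data username@premise.sr.unh.edu:~/myproject/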

Once you have the application and necessary data ready, submit it as a job to the batch system using Slurm. For details on using Slurm, start with the local slurm usage notes.
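
As a starting point, a minimal batch script might look like the sketch below; the job name, resource requests, time limit, and program are placeholders, and the partitions and limits that actually apply on Premise are covered in the local slurm usage notes.

#!/bin/bash
#SBATCH --job-name=myjob          # name shown by squeue and sacct
#SBATCH --output=myjob-%j.out     # stdout/stderr file; %j expands to the job id
#SBATCH --ntasks=1                # number of tasks (processes) to launch
#SBATCH --cpus-per-task=1         # cores reserved for each task
#SBATCH --time=01:00:00           # wall-clock limit, hh:mm:ss

# The script runs on a compute node, starting in the directory it was
# submitted from. Replace the line below with your own program.
./my_program input.dat

Save it as, for example, myjob.slurm and submit it with sbatch (see the slurm section below).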

More questions?

You may ask other Premise questions via email to ops@sr.unh.edu. This is RCC's general support email, so please indicate that your question is related to "Premise".

Visualize Current Usage

RCC chose XDMoD to visualize Premise usage. The Premise XDMoD webpage may be viewed when on campus or using the VPN.

Major Software Packages

slurm

Slurm is the batch system used to schedule jobs on the compute nodes. Each Slurm command supports a --help option and has a Unix man page.

More information can be found here:

Common commands include (an example session follows the list):

sbatch
Submit a script to be run when the required resources are available.
squeue
Shows your jobs currently running or waiting to be run.
sacct
Shows your jobs that have completed or failed to run.
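
For example, a typical cycle of submitting and monitoring a job from the head node might look like this (the script name and job id are placeholders):

[username@premise ~]$ sbatch myjob.slurm    # submit the batch script; Slurm prints the new job id
[username@premise ~]$ squeue -u $USER       # list your jobs that are still queued or running
[username@premise ~]$ sacct -j <jobid>      # after the job finishes, show its final state and exit code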

modules

Used to select specific software packages and exact versions. See the official module man page for more information.

Common commands include (an example session follows the list):

module avail
Provides a list of modules available on this cluster
module load X
Load the package X into the current shell's environment. If more than one version is available, it is specified as X/version.
module list
Display the list of packages currently loaded in this shell.
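
For example, loading a specific package version into your shell before submitting a job might look like this; the package name and version shown are illustrative only, so run "module avail" to see what is actually installed on Premise:

[username@premise ~]$ module avail             # list everything installed on the cluster
[username@premise ~]$ module load gcc/6.2.0    # load one specific version (example name only)
[username@premise ~]$ module list              # confirm what is loaded in this shell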

MATLAB

Users of MATLAB on the Premise compute cluster should not run it graphically on the Premise head node. Unlike running on your desktop, MATLAB jobs must be submitted to the Slurm job queue. A helper script has been created to submit your MATLAB .m scripts for you.

Run your MATLAB script with:

[rea@premise ~]$ module load matlab
[rea@premise ~]$ sMATLAB.py matlabscriptname.m

Use "sMATLAB.py --help" to describe available options and defaults.

Adding the "--verbose" option to sMATLAB.py displays both the Slurm sbatch command line and helper job script that is being generated for you. This could be used as a starting point for users wishing to create their own Slurm scripts.

Note that MATLAB does not automatically use all the cores on a node, or split a job across multiple nodes for you. These features must be coded into your scripts. Some web documentation (WEBDOC) links exist in the autogenerated script that might be helpful.

Here is an example MATLAB script utilizing "parfor" to iterate work across all the cores on one node. Using MATLAB on more than one node is not supported. Premise nodes currently all have 24 cores.

parpool(str2num(getenv('SLURM_JOB_CPUS_PER_NODE')));   % workers=24 cores per Premise node
tic   % start timer
ticBytes(gcp);   % time should include distribution transfers
n = 1024;
A = zeros(n);
parfor (i = 1:n)   % Distribute these "n" iterations over workers in parpool.
    A(i,:) = (1:n) .* sin(i*2*pi/1024);
end
tocBytes(gcp)   % timer should include collection transfers
toc   % stop & display elapsed time.

Installed Modules