
Sheffield HPC Documentation

Release

October 06, 2016


Contents

1 Research Computing Team

2 Research Software Engineering Team


The current High Performance Computing (HPC) system at Sheffield is the Iceberg cluster.
A new system, ShARC (Sheffield Advanced Research Computer), is currently under development. It is not yet ready
for use.

CHAPTER 1

Research Computing Team

The Research Computing Team is responsible for the iceberg service, as well as all other aspects of research
computing. If you require support with iceberg, training or software for your workstations, the research computing
team would be happy to help. Take a look at the Research Computing website or email research-it@sheffield.ac.uk.

CHAPTER 2

Research Software Engineering Team

The Sheffield Research Software Engineering Team is an academically led group that collaborates closely with CiCS.
They can assist with code optimisation, training and all aspects of High Performance Computing, including GPU
computing and local, national, regional and cloud computing services. Take a look at the Research Software
Engineering website or email rse@sheffield.ac.uk.

2.1 Using the HPC Systems

2.1.1 Getting Started

If you have not used a High Performance Computing (HPC) cluster, Linux or even a command line before, this is the
place to start. This guide will get you set up using iceberg in the easiest way that fits your requirements.

Getting an Account

Before you can start using iceberg you need to register for an account. Accounts are available for staff by emailing
helpdesk@sheffield.ac.uk.
The following categories of students can also have an account on iceberg with the permission of their supervisors:
• Research Postgraduates
• Taught Postgraduates - project work
• Undergraduates 3rd & 4th year - project work
Students’ supervisors can request an account for the student by emailing helpdesk@sheffield.ac.uk.

Note: Once you have obtained your iceberg username, you need to initialize your iceberg password to be the same as
your normal password by using the CICS synchronize passwords system.

Connecting to iceberg (Terminal)

Accessing iceberg through an SSH terminal is easy and is the most flexible way of using iceberg, as it is the native way
of interfacing with the Linux cluster. It also allows you to access iceberg from any remote location without having to
set up VPN connections.

The rest of this page summarizes the recommended methods of accessing iceberg from commonly used
platforms. A more comprehensive guide to accessing iceberg and transferring files is located at
http://www.sheffield.ac.uk/cics/research/hpc/using/access

Windows

Download and install the Installer edition of MobaXterm.


After starting MobaXterm you should see something like this:

Click Start local terminal and if you see something like the following then please continue to the Connect to
iceberg section.

Mac OS/X and Linux

Linux and Mac OS/X both have a terminal emulator program pre-installed.
Open a terminal and then go to Connect to iceberg.

Connect to iceberg

Once you have a terminal open run the following command:

ssh -X <username>@iceberg.shef.ac.uk

where you replace <username> with your CICS username.


This should give you a prompt resembling the one below:
[te1st@iceberg-login2 ~]$

at this prompt type:


qsh

like this:
[te1st@iceberg-login2 ~]$ qsh
Your job 135355 ("INTERACTIVE") has been submitted
waiting for interactive job to be scheduled ....
Your interactive job 135355 has been successfully scheduled.

which will pop up another terminal window, which supports graphical applications.

Note: Iceberg is a compute cluster. When you login to the cluster you reach one of two login nodes. You should not
run applications on the login nodes. Running qsh gives you an interactive terminal on one of the many worker nodes
in the cluster.
If you only need terminal based (CLI) applications you can run the qrsh command, which will give you a shell on a
worker node but without graphical application (X server) support.

What Next?

Now that you have connected to iceberg, you can look at how to submit jobs with Iceberg’s Queue System or look at
Software on iceberg.

2.1.2 Scheduler

qhost

qhost is a scheduler command that shows the status of Sun Grid Engine hosts.

Documentation

Documentation is available on the system using the command:


man qhost

Examples

Get an overview of the nodes and CPUs on Iceberg


qhost

This shows every node in the cluster. Some of these nodes may be reserved/purchased for specific research groups and
hence may not be available for general use.

qrsh

qrsh is a scheduler command that requests an interactive session on a worker node. The resulting session will not
support graphical applications. You will usually run this command from the head node.

Examples

Request an interactive session that provides the default amount of memory resources
qrsh

Request an interactive session that provides 10 Gigabytes of real and virtual memory
qrsh -l rmem=10G -l mem=10G

qrshx

qrshx is a scheduler command that requests an interactive session on a worker node. The resulting session will support
graphical applications. You will usually run this command from the head node.

Examples

Request an interactive X-Windows session that provides the default amount of memory resources and launch the gedit
text editor
qrshx
gedit

Request an interactive X-Windows session that provides 10 Gigabytes of real and virtual memory and launch the latest
version of MATLAB
qrshx -l mem=10G -l rmem=10G
module load apps/matlab
matlab

Request an interactive X-Windows session that provides 10 Gigabytes of real and virtual memory and 4 CPU cores
qrshx -l rmem=10G -l mem=10G -pe openmp 4

Sysadmin notes

qrshx is a Sheffield-developed modification to the standard set of scheduler commands. It is at /usr/local/bin/qrshx


and contains the following
#!/bin/sh
exec qrsh -v DISPLAY -pty y "$@" bash

qsh

qsh is a scheduler command that requests an interactive X-windows session on a worker node. The resulting terminal
is not user-friendly and we recommend that you use our qrshx command instead.

Examples

Request an interactive X-Windows session that provides the default amount of memory resources
qsh

Request an interactive X-Windows session that provides 10 Gigabytes of real and virtual memory
qsh -l rmem=10G -l mem=10G

Request an interactive X-Windows session that provides 10 Gigabytes of real and virtual memory and 4 CPU cores
qsh -l rmem=10G -l mem=10G -pe openmp 4

qstat

qstat is a scheduler command that displays the status of the queues.

Examples

Display all jobs queued on the system


qstat

Display all jobs queued by the username foo1bar


qstat -u foo1bar

Display all jobs in the openmp parallel environment


qstat -pe openmp

Display all jobs in the queue named foobar


qstat -q foobar.q

qsub

qsub is a scheduler command that submits a batch job to the system.

Examples

Submit a batch job called myjob.sh to the system


qsub myjob.sh

qtop

qtop is a scheduler command that provides a summary of all processes running on the cluster for a given user.

Examples

qtop is only available on the worker nodes. As such, you need to start an interactive session on a worker node using
qrsh or qrshx in order to use it.
To give a summary of all of your currently running jobs
qtop

Summary for job 256127

HOST VIRTUAL-MEM RSS-MEM %CPU %MEM CPUTIME+ COMMAND


testnode03 106.22 MB 1.79 MB 0.0 0.0 00:00:00 bash
testnode03 105.62 MB 1.27 MB 0.0 0.0 00:00:00 qtop
testnode03 57.86 MB 3.30 MB 0.0 0.0 00:00:00 ssh
--------- --------
TOTAL: 0.26 GB 0.01 GB

Iceberg’s Queue System

To manage use of the Iceberg cluster, there is a queue system (SoGE, a derivative of the Sun Grid Engine).
The queue system works by a user requesting that some task, either a script or an interactive session, be run on the
cluster; the scheduler then takes tasks from the queue based on a set of rules and priorities.

Using Iceberg Interactively

If you wish to use the cluster interactively, for example to run applications such as MATLAB or Ansys or to compile
software, you will need to request that the scheduler gives you an interactive session. For an introduction to this, see
Getting Started.
There are three commands which give you an interactive shell:
• qrsh - Requests an interactive session on a worker node. No support for graphical applications.
• qrshx - Requests an interactive session on a worker node. Supports graphical applications. Superior to qsh.
• qsh - Requests an interactive session on a worker node. Supports graphical applications.
You can configure the resources available to the interactive session by specifying them as command line options to the
qrsh, qrshx or qsh commands. For example, to run a qrshx session with access to 16 GB of virtual RAM
[te1st@iceberg-login2 ~]$ qrshx -l mem=16G

or a session with access to 8 cores:


[te1st@iceberg-login2 ~]$ qrshx -pe openmp 8

A table of Common Interactive Job Options is given below; any of these can be combined to request more
resources.

Note: Long running jobs should use the batch submission system rather than requesting an interactive session for a
very long time. Doing this will lead to better cluster performance for all users.

Common Interactive Job Options

Command            Description
-l h_rt=hh:mm:ss   Specify the total maximum execution time for the job.
-l mem=xxG         Specify the maximum amount (xx) of memory to be used (per process or core).
-pe <env> <nn>     Specify a parallel environment and number of processors.
-pe openmp <nn>    The openmp parallel environment provides multiple threads on one node. <nn> is the maximum number of threads.
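
For example, the options in the table can be combined in a single command. The following (a hypothetical but representative combination) requests a graphical interactive session with an 8 hour runtime limit, 16 gigabytes of virtual memory and 4 OpenMP threads:

qrshx -l h_rt=08:00:00 -l mem=16G -pe openmp 4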

Running Batch Jobs on iceberg

The power of iceberg really comes from the ‘batch job’ queue submission process. Using this system, you write a
script which requests various resources, initializes the computational environment and then executes your program(s).
The scheduler will run your job when resources are available. As the task is running, the terminal output and any
errors are captured and saved to disk, so that you can see the output and verify the execution of the task.
Any task that can be executed without user intervention while it is running can be submitted as a batch job to
Iceberg. This excludes jobs that require a Graphical User Interface (GUI); however, many common GUI applications
such as Ansys or MATLAB can also be used without their GUIs.
When you submit a batch job, you provide an executable file that will be run by the scheduler. This is normally a bash
script file which provides commands and options to the program you are using. Once you have a script file, or other
executable file, you can submit it to the queue by running:
qsub myscript.sh

Here is an example batch submission script that runs a fictitious program called foo
#!/bin/bash
# Request 5 gigabytes of real memory (rmem)
# and 5 gigabytes of virtual memory (mem)
#$ -l mem=5G -l rmem=5G

# load the module for the program we want to run


module load apps/gcc/foo

#Run the program foo with input foo.dat


#and output foo.res
foo < foo.dat > foo.res

Some things to note:


• The first line always needs to be #!/bin/bash to tell the scheduler that this is a bash batch script.
• Comments start with a #
• Scheduler options, such as the amount of memory requested, start with #$
• You will usually require one or more module commands in your submission file. These make programs and
libraries available to your scripts.
Here is a more complex example that requests more resources
#!/bin/bash
# Request 16 gigabytes of real memory (rmem)
# and 16 gigabytes of virtual memory (mem)
#$ -l mem=16G -l rmem=16G
# Request 4 cores in an OpenMP environment
#$ -pe openmp 4

# Email notifications to me@somedomain.com


#$ -M me@somedomain.com
# Email notifications if the job aborts
#$ -m a

# load the modules required by our program


module load compilers/gcc/5.2
module load apps/gcc/foo

#Set the OMP_NUM_THREADS environment variable to 4


export OMP_NUM_THREADS=4

#Run the program foo with input foo.dat


#and output foo.res
foo < foo.dat > foo.res

Scheduler Options

Command            Description
-l h_rt=hh:mm:ss   Specify the total maximum execution time for the job.
-l mem=xxG         Specify the maximum amount (xx) of memory to be used.
-l hostname=       Target a node by name. Not recommended for normal use.
-l arch=           Target a processor architecture. Options on Iceberg include intel-e5-2650v2 and intel-x5650.
-N                 Job name, used to name output files and in the queue list.
-j                 Join the error and normal output into one file rather than two.
-M                 Email address to send notifications to.
-m bea             Type of notifications to send. Can be any combination of begin (b), end (e) or abort (a), e.g. -m ea for end and abort messages.
-a                 Specify the earliest time for a job to start, in the format MMDDhhmm, e.g. -a 01011130 will schedule the job to begin no sooner than 11:30 on 1st January.
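
As an illustration, the sketch below combines several of these options in one submission script; the job name, email address and program are placeholders rather than a prescribed recipe:

#!/bin/bash
# Name the job and merge error and normal output into a single file
#$ -N myanalysis
#$ -j y
# Request a maximum run time of 2 hours and 8 gigabytes of virtual memory
#$ -l h_rt=02:00:00
#$ -l mem=8G
# Email notifications when the job begins, ends or aborts
#$ -M me@somedomain.com
#$ -m bea

# Run a placeholder program
./myprogram < input.dat > output.res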

All scheduler commands

All available scheduler commands are listed in the Scheduler section.

Frequently Asked SGE Questions

How many jobs can I submit at any one time?


You can submit up to 2000 jobs to the cluster, and the scheduler will allow up to 200 of your jobs to run simultaneously
(we occasionally alter this value depending on the load on the cluster).
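
To check how many jobs you currently have in the queue, one simple approach (a sketch using the qstat command described in the Scheduler section) is:

# Count your queued and running jobs; subtract 2 for the qstat header lines
qstat -u $USER | wc -l
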
How do I specify the processor type on Iceberg?
Add the following line to your submission script
#$ -l arch=intel-e5-2650v2

This specifies nodes that have the Ivybridge E5-2650 CPU. All such nodes on Iceberg have 16 cores.

To only target the older, 12 core nodes that contain X5650 CPUs add the following line to your submission script
#$ -l arch=intel-x5650

How do I specify multiple email addresses for job notifications?


Specify each additional email with its own -M option
#$ -M foo@example.com
#$ -M bar@example.com

How do you ensure that a job starts after a specified time?


Add the following line to your submission script
#$ -a time

but replace time with a time in the format MMDDhhmm


For example, for 22nd July at 14:10, you’d do
#$ -a 07221410

This won’t guarantee that it will run precisely at this time since that depends on available resources. It will, however,
ensure that the job runs after this time. If your resource requirements aren’t too heavy, it will be pretty soon after.
When I tried it, it started about 10 seconds afterwards but this will vary.
This is a reference section for commands that allow you to interact with the scheduler.
• qhost - Shows the status of Sun Grid Engine hosts.
• qrsh - Requests an interactive session on a worker node. No support for graphical applications.
• qrshx - Requests an interactive session on a worker node. Supports graphical applications. Superior to qsh.
• qsh - Requests an interactive session on a worker node. Supports graphical applications.
• qstat - Displays the status of jobs and queues.
• qsub - Submits a batch job to the system.
• qtop - Provides a summary of all processes running on the cluster for a given user.

2.2 Iceberg

2.2.1 Software on iceberg

These pages list the software available on iceberg. If you notice an error or an omission, or wish to request new
software, please email the research computing team at research-it@sheffield.ac.uk.

Modules on Iceberg

In general, the software available on iceberg is loaded and unloaded via the modules system (http://modules.sourceforge.net/).
Modules make it easy for us to install many versions of different applications, compilers and libraries side by side, and
allow users to set up the computing environment to include exactly what they need.


Note: Modules are not available on the login node. You must move to a worker node using either qrsh or qsh (see
Getting Started) before any of the following commands will work.

Available modules can be listed using the following command:


module avail

Modules have names that look like apps/python/2.7. To load this module, you’d do:
module load apps/python/2.7

You can unload this module with:


module unload apps/python/2.7

It is possible to load multiple modules at once, to create your own environment with just the software you need. For
example, perhaps you want to use version 4.8.2 of the gcc compiler along with MATLAB 2014a
module load compilers/gcc/4.8.2
module load apps/matlab/2014a

Confirm that you have loaded these modules with


module list

Remove the MATLAB module with


module unload apps/matlab/2014a

Remove all modules to return to the base environment


module purge

Module Command Reference

Here is a list of the most useful module commands. For full details, type man module at an iceberg command
prompt.
• module list – lists currently loaded modules
• module avail – lists all available modules
• module load modulename – loads module modulename
• module unload modulename – unloads module modulename
• module switch oldmodulename newmodulename – switches between two modules
• module initadd modulename – run this command once to automatically ensure that a module is loaded
when you log in. (It creates a .modules file in your home dir which acts as your personal configuration.)
• module show modulename - Shows how loading modulename will affect your environment
• module purge – unload all modules
• module help modulename – may show longer description of the module if present in the modulefile
• man module – detailed explanation of the above commands and others
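
For instance, two of the less obvious commands in action (the module names are examples taken from elsewhere on this page):

# Swap one compiler version for another in the current session
module switch compilers/gcc/4.8.2 compilers/gcc/5.2

# Show how loading the MATLAB module would change your environment
module show apps/matlab/2014a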


Applications

CASTEP

CASTEP
Latest Version 16.1
URL http://www.castep.org/

Licensing Only licensed users of CASTEP are entitled to use it and license details are available on CASTEP’s
website. Access to CASTEP on the system is controlled using a Unix group. That is, only members of the castep
group can access and run the program. To be added to this group, you will need to email the team
at research-it@sheffield.ac.uk and provide evidence of your eligibility to use CASTEP.

Interactive Usage The serial version of CASTEP should be used for interactive usage. After connecting to iceberg
(see Connect to iceberg), start an interactive session with the qrsh or qsh command. Make the serial version of
CASTEP available using one of the following commands
module load apps/intel/15/castep/16.1-serial
module load apps/intel/15/castep/8.0-serial

The CASTEP executable is called castep.serial, so if you execute


castep.serial

You should get the following


Usage:
castep <seedname> : Run files <seedname>.cell [and <seedname>.param]
" [-d|--dryrun] <seedanme> : Perform a dryrun calculation on files <seedname>.cell
" [-s|--search] <text> : print list of keywords with <text> match in description
" [-v|--version] : print version information
" [-h|--help] <keyword> : describe specific keyword in <>.cell or <>.param
" " all : print list of all keywords
" " basic : print list of basic-level keywords
" " inter : print list of intermediate-level keywords
" " expert : print list of expert-level keywords
" " dummy : print list of dummy keywords

If, instead, you get


-bash: castep.serial: command not found

It is probably because you are not a member of the castep group. See the section on Licensing above for details on
how to be added to this group.
Interactive usage is fine for small CASTEP jobs such as the Silicon example given at
http://www.castep.org/Tutorials/BasicsAndBonding
To run this example, you can do
# Get the files, decompress them and enter the directory containing them
wget http://www.castep.org/files/Si2.tgz
tar -xvzf ./Si2.tgz
cd Si2


#Run the CASTEP job in serial


castep.serial Si2

#Read the output using the more command


more Si2.castep

CASTEP has a built in help system. To get more information on using castep use
castep.serial -help

Alternatively you can search for help on a particular topic


castep.serial -help search keyword

or list all of the input parameters


castep.serial -help search all

Batch Submission - Parallel The parallel version of CASTEP is called castep.mpi. To make the parallel
environment available, use one of the following module commands
module load apps/intel/15/castep/16.1-parallel
module load apps/intel/15/castep/8.0-parallel

As an example of a parallel submission, we will calculate the bandstructure of graphite following the tutorial at
http://www.castep.org/Tutorials/BandStructureAndDOS
After connecting to iceberg (see Connect to iceberg), start an interactive session with the qrsh or qsh command.
Download and decompress the example input files with the commands
wget http://www.castep.org/files/bandstructure.tgz
tar -xvzf ./bandstructure.tgz

Enter the directory containing the input files for graphite


cd bandstructure/graphite/

Create a file called submit.sge that contains the following


#!/bin/bash
#$ -pe openmpi-ib 4 # Run the calculation on 4 CPU cores
#$ -l rmem=4G # Request 4 Gigabytes of real memory per core
#$ -l mem=4G # Request 4 Gigabytes of virtual memory per core
module load apps/intel/15/castep/16.1-parallel

mpirun castep.mpi graphite

Submit it to the system with the command


qsub submit.sge

After the calculation has completed, get an overview of the calculation by looking at the file graphite.castep
more graphite.castep

Installation Notes These are primarily for system administrators.


CASTEP Version 16.1


The jump in version numbers from 8 to 16.1 is a result of CASTEP’s change of version numbering. There are no
versions 9-15.
Serial (1 CPU core) and Parallel versions of CASTEP were compiled. Both versions were compiled with version
15.0.3 of the Intel Compiler Suite and the Intel MKL versions of BLAS and FFT were used. The parallel version made
use of OpenMPI 1.8.8
The Serial version was compiled and installed with
module load compilers/intel/15.0.3
install_dir=/usr/local/packages6/apps/intel/15/castep/16.1
mkdir -p $install_dir

tar -xzf ./CASTEP-16.1.tar.gz


cd CASTEP-16.1

#Compile Serial version


make INSTALL_DIR=$install_dir FFT=mkl MATHLIBS=mkl10
make INSTALL_DIR=$install_dir FFT=mkl MATHLIBS=mkl10 install install-tools

The directory CASTEP-16.1 was then deleted and the parallel version was installed with
#!/bin/bash
module load libs/intel/15/openmpi/1.8.8
#The above command also loads Intel Compilers 15.0.3
#It also places the MKL in LD_LIBRARY_PATH

install_dir=/usr/local/packages6/apps/intel/15/castep/16.1

tar -xzf ./CASTEP-16.1.tar.gz


cd CASTEP-16.1

#Workaround for bug described at http://www.cmth.ph.ic.ac.uk/computing/software/castep.html


sed 's/-static-intel/-shared-intel/' obj/platforms/linux_x86_64_ifort15.mk -i

#Compile parallel version


make COMMS_ARCH=mpi FFT=mkl MATHLIBS=mkl10
mv ./obj/linux_x86_64_ifort15/castep.mpi $install_dir

CASTEP Version 8
Serial (1 CPU core) and Parallel versions of CASTEP were compiled. Both versions were compiled with version
15.0.3 of the Intel Compiler Suite and the Intel MKL versions of BLAS and FFT were used. The parallel version made
use of OpenMPI 1.8.8
The Serial version was compiled and installed with
module load compilers/intel/15.0.3
install_dir=/usr/local/packages6/apps/intel/15/castep/8.0

tar -xzf ./CASTEP-8.0.tar.gz


cd CASTEP-8.0

#Compile Serial version


make INSTALL_DIR=$install_dir FFT=mkl MATHLIBS=mkl10
make INSTALL_DIR=$install_dir FFT=mkl MATHLIBS=mkl10 install install-tools

The directory CASTEP-8.0 was then deleted and the parallel version was installed with
#!/bin/bash
module load libs/intel/15/openmpi/1.8.8


#The above command also loads Intel Compilers 15.0.3


#It also places the MKL in LD_LIBRARY_PATH

install_dir=/usr/local/packages6/apps/intel/15/castep/8.0
mkdir -p $install_dir

tar -xzf ./CASTEP-8.0.tar.gz


cd CASTEP-8.0

#Compile parallel version


make COMMS_ARCH=mpi FFT=mkl MATHLIBS=mkl10
mv ./obj/linux_x86_64_ifort15/castep.mpi $install_dir

Modulefiles
• CASTEP 16.1-serial
• CASTEP 16.1-parallel
• CASTEP 8.0-serial
• CASTEP 8.0-parallel

Testing Version 16.1 Serial


The following script was submitted via qsub from inside the build directory:
#!/bin/bash
#$ -l mem=10G
#$ -l rmem=10G
module load compilers/intel/15.0.3

cd CASTEP-16.1/Test
../bin/testcode.py -q --total-processors=1 -e /home/fe1mpc/CASTEP/CASTEP-16.1/obj/linux_x86_64_ifort

All but one of the tests passed. It seems that the failed test is one that fails for everyone for
this version since there is a missing input file. The output from the test run is on the system at
/usr/local/packages6/apps/intel/15/castep/16.1/CASTEP_SERIAL_tests_09022016.txt
Version 16.1 Parallel
The following script was submitted via qsub from inside the build directory
#!/bin/bash
#$ -pe openmpi-ib 4
#$ -l mem=10G
#$ -l rmem=10G
module load libs/intel/15/openmpi/1.8.8

cd CASTEP-16.1/Test
../bin/testcode.py -q --total-processors=4 --processors=4 -e /home/fe1mpc/CASTEP/CASTEP-16.1/obj/lin

All but one of the tests passed. It seems that the failed test is one that fails for everyone for
this version since there is a missing input file. The output from the test run is on the system at
/usr/local/packages6/apps/intel/15/castep/16.1/CASTEP_Parallel_tests_09022016.txt
Version 8 Parallel The following script was submitted via qsub


#!/bin/bash
#$ -pe openmpi-ib 4
module load libs/intel/15/openmpi/1.8.8

cd CASTEP-8.0
make check COMMS_ARCH=mpi MAX_PROCS=4 PARALLEL="--total-processors=4 --processors=4"

All tests passed.

GATK

GATK
Version 3.4-46
URL https://www.broadinstitute.org/gatk/

The Genome Analysis Toolkit or GATK is a software package for analysis of high-throughput sequencing data, devel-
oped by the Data Science and Data Engineering group at the Broad Institute. The toolkit offers a wide variety of tools,
with a primary focus on variant discovery and genotyping as well as strong emphasis on data quality assurance. Its
robust architecture, powerful processing engine and high-performance computing features make it capable of taking
on projects of any size.

Interactive Usage After connecting to Iceberg (see Connect to iceberg), start an interactive session with the qsh or
qrsh command.
The latest version of GATK (currently 3.4-46) is made available with the command
module load apps/binapps/GATK

Alternatively, you can load a specific version with


module load apps/binapps/GATK/3.4-46
module load apps/binapps/GATK/2.6-5

Version 3.4-46 of GATK also changes the environment to use Java 1.7 since this is required by GATK 3.4-46. An
environment variable called GATKHOME is created by the module command that contains the path to the requested
version of GATK.
Thus, you can run the program with the command
java -jar $GATKHOME/GenomeAnalysisTK.jar -h

This will give a large amount of help, beginning with the version information
---------------------------------------------------------------------------------
The Genome Analysis Toolkit (GATK) v3.4-46-gbc02625, Compiled 2015/07/09 17:38:12
Copyright (c) 2010 The Broad Institute
For support and documentation go to http://www.broadinstitute.org/gatk
---------------------------------------------------------------------------------
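
For non-interactive work, a GATK tool can be run from a batch script in the same way. The sketch below is only an illustration: the choice of HaplotypeCaller and the file names reference.fasta, sample.bam and output.vcf are placeholders, not a recommended workflow.

#!/bin/bash
#$ -l mem=16G -l rmem=16G

module load apps/binapps/GATK/3.4-46

# Run the HaplotypeCaller tool against placeholder input files
java -Xmx12g -jar $GATKHOME/GenomeAnalysisTK.jar \
    -T HaplotypeCaller \
    -R reference.fasta \
    -I sample.bam \
    -o output.vcf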

Documentation The GATK manual is available online https://www.broadinstitute.org/gatk/guide/

Installation notes The entire install is just a .jar file. Put it in the install directory and you’re done.


Modulefile Version 3.4-46


• The module file is on the system at /usr/local/modulefiles/apps/binapps/GATK/3.4-46
Its contents are
#%Module1.0#####################################################################
##
## GATK 3.4-46 modulefile
##

## Module file logging


source /usr/local/etc/module_logging.tcl
##

#This version of GATK needs Java 1.7


module load apps/java/1.7

proc ModulesHelp { } {
puts stderr "Makes GATK 3.4-46 available"
}

set version 3.4-46


set GATK_DIR /usr/local/packages6/apps/binapps/GATK/$version

module-whatis "Makes GATK 3.4-46 available"

prepend-path GATKHOME $GATK_DIR

Version 2.6-5
• The module file is on the system at /usr/local/modulefiles/apps/binapps/GATK/2.6-5
Its contents are
#%Module1.0#####################################################################
##
## GATK 2.6-5 modulefile
##

## Module file logging


source /usr/local/etc/module_logging.tcl
##

proc ModulesHelp { } {
puts stderr "Makes GATK 3.4-46 available"
}

set version 2.6.5


set GATK_DIR /usr/local/packages6/apps/binapps/GATK/$version

module-whatis "Makes GATK 2.6-5 available"

prepend-path GATKHOME $GATK_DIR

MOMFBD


MOMFBD
Version 2016-04-14
URL http://dubshen.astro.su.se/wiki/index.php?title=MOMFBD

MOMFBD or Multi-Object Multi-Frame Blind Deconvolution is an image processing application for removing the
effects of atmospheric seeing from solar image data.

Usage The MOMFBD binaries can be added to your path with


module load apps/gcc/5.2/MOMFBD

Installation notes MOMFBD was installed using gcc 5.2 using this script and is loaded by this modulefile.

STAR

STAR
Latest version 2.5.0c
URL https://github.com/alexdobin/STAR

Spliced Transcripts Alignment to a Reference.

Interactive Usage After connecting to Iceberg (see Connect to iceberg), start an interactive session with the qrsh
command.
The latest version of STAR (currently 2.5.0c) is made available with the command
module load apps/gcc/5.2/STAR

Alternatively, you can load a specific version with


module load apps/gcc/5.2/STAR/2.5.0c

This command makes the STAR binary available to your session and also loads the gcc 5.2 compiler environment
which was used to build STAR. Check that STAR is working correctly by displaying the version
STAR --version

This should give the output


STAR_2.5.0c
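
A typical alignment run is better suited to the batch queue. The following is only a sketch: the genome index directory and FASTQ file names are placeholders and the resource requests should be adjusted to your data.

#!/bin/bash
#$ -pe openmp 4
#$ -l mem=8G -l rmem=8G   # Per core

module load apps/gcc/5.2/STAR/2.5.0c

# Align paired-end reads against a pre-built genome index (placeholder paths)
STAR --runThreadN 4 \
     --genomeDir ./genome_index \
     --readFilesIn reads_1.fastq reads_2.fastq \
     --outFileNamePrefix sample_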

Documentation The STAR manual is available online: https://github.com/alexdobin/STAR/blob/master/doc/STARmanual.pdf

Installation notes STAR was installed using gcc 5.2


module load compilers/gcc/5.2

mkdir STAR
mv ./STAR-2.5.0c.zip ./STAR


cd ./STAR/
unzip STAR-2.5.0c.zip
cd STAR-2.5.0c
make
cd source
mkdir -p /usr/local/packages6/apps/gcc/5.2/STAR/2.5.0c
mv ./STAR /usr/local/packages6/apps/gcc/5.2/STAR/2.5.0c/

Testing No test suite was found.

Modulefile
• The module file is on the system at /usr/local/modulefiles/apps/gcc/5.2/STAR/2.5.0c
Its contents are
#%Module1.0#####################################################################
##
## STAR 2.5.0c module file
##

## Module file logging


source /usr/local/etc/module_logging.tcl
##

module load compilers/gcc/5.2

module-whatis "Makes version 2.5.0c of STAR available"

set STAR_DIR /usr/local/packages6/apps/gcc/5.2/STAR/2.5.0c/

prepend-path PATH $STAR_DIR

Abaqus

Abaqus
Versions 6.13, 6.12 and 6.11
Support Level FULL
Dependencies Intel Compiler
URL http://www.3ds.com/products-services/simulia/products/abaqus/
Local URL https://www.shef.ac.uk/wrgrid/software/abaqus

Abaqus is a software suite for Finite Element Analysis (FEA) developed by Dassault Systèmes.

Interactive usage After connecting to iceberg (see Connect to iceberg), start an interactive session with the qsh
command. Alternatively, if you require more memory, for example 16 gigabytes, use the command qsh -l
mem=16G
The latest version of Abaqus (currently version 6.13) is made available with the command
module load apps/abaqus


Alternatively, you can make a specific version available with one of the following commands
module load apps/abaqus/613
module load apps/abaqus/612
module load apps/abaqus/611

After that, simply type abaqus to get the command-line interface to abaqus or type abaqus cae to get the GUI
interface.

Abaqus example problems Abaqus contains a large number of example problems which can be used to become
familiar with Abaqus on the system. These example problems are described in the Abaqus documentation, and can be
obtained using the Abaqus fetch command. For example, after loading the Abaqus module enter the following at the
command line to extract the input file for test problem s4d
abaqus fetch job=s4d

This will extract the input file s4d.inp. To run the computation defined by this input file, replace input=myabaqusjob
with input=s4d in the commands and scripts below.

Batch submission of a single core job In this example, we will run the s4d.inp file on a single core using 8 Gigabytes
of memory. After connecting to iceberg (see Connect to iceberg), start an interactive session with the qrsh command.
Load version 6.13-3 of Abaqus and fetch the s4d example by running the following commands
module load apps/abaqus/613
abaqus fetch job=s4d

Now, you need to write a batch submission file. We assume you’ll call this my_job.sge
#!/bin/bash
#$ -S /bin/bash
#$ -cwd
#$ -l rmem=8G
#$ -l mem=8G

module load apps/abaqus

abq6133 job=my_job input=s4d.inp scratch=/scratch memory="8gb" interactive

Submit the job with the command qsub my_job.sge


Important notes:
• We have requested 8 gigabytes of memory in the above job. The memory="8gb" switch tells abaqus to use
8 gigabytes. The #$ -l rmem=8G and #$ -l mem=8G options tell the system to reserve 8 gigabytes of real and
virtual memory respectively. It is important that these numbers match.
• Note the word interactive at the end of the abaqus command. Your job will not run without it.

Batch submission of a single core job with user subroutine In this example, we will fetch a simulation from
Abaqus’ built in set of problems that makes use of user subroutines (UMATs) and run it in batch on a single core.
After connecting to iceberg (see Connect to iceberg), start an interactive session with the qrsh command.
Load version 6.13-3 of Abaqus and fetch the umatmst3 example by running the following commands
module load apps/abaqus/613
abaqus fetch job=umatmst3*


This will produce 2 files: The input file umatmst3.inp and the Fortran user subroutine umatmst3.f.
Now, you need to write a batch submission file. We assume you’ll call this my_user_job.sge
#!/bin/bash
#$ -S /bin/bash
#$ -cwd
#$ -l rmem=8G
#$ -l mem=8G

module load apps/abaqus/613


module load compilers/intel/12.1.15

abq6133 job=my_user_job input=umatmst3.inp user=umatmst3.f scratch=/scratch memory="8gb" interactive

Submit the job with the command qsub my_user_job.sge


Important notes:
• In order to use user subroutines, it is necessary to load the module for the Intel compiler.
• The user-subroutine itself is passed to Abaqus with the switch user=umatmst3.f

Annovar

Annovar
Version 2015DEC14
URL http://annovar.openbioinformatics.org/en/latest/

ANNOVAR is an efficient software tool to utilize up-to-date information to functionally annotate genetic variants
detected from diverse genomes (including human genome hg18, hg19, hg38, as well as mouse, worm, fly, yeast and
many others).

Interactive Usage After connecting to Iceberg (see Connect to iceberg), start an interactive session with the qsh or
qrsh command.
To add the annovar binaries to the system PATH, execute the following command
module load apps/binapps/annovar/2015DEC14
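
Once the module is loaded, the ANNOVAR Perl scripts can be called directly. As a sketch (the humandb directory and example.avinput file are placeholders):

# Download the refGene annotation database for hg19 into a local humandb directory
annotate_variation.pl -buildver hg19 -downdb -webfrom annovar refGene humandb/

# Gene-based annotation of a variant file in ANNOVAR input format
annotate_variation.pl -geneanno -buildver hg19 example.avinput humandb/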

Documentation The annovar manual is available online at http://annovar.openbioinformatics.org/en/latest/

Installation notes The install is a collection of executable Perl scripts. Installation involves copying the directory
and adding it to the PATH. It seems that annovar uses dates to distinguish between releases rather than version numbers
tar -xvzf ./annovar.latest.tar.gz
mkdir -p /usr/local/packages6/apps/binapps/annovar
mv annovar /usr/local/packages6/apps/binapps/annovar/2015DEC14

Modulefile
• The module file is on the system at /usr/local/modulefiles/apps/binapps/annovar/2015DEC14


Its contents are


#%Module1.0#####################################################################
##
## annovar modulefile
##

## Module file logging


source /usr/local/etc/module_logging.tcl
##

proc ModulesHelp { } {
puts stderr "Makes annovar 2015DEC14 available"
}

set ANNOVAR_DIR /usr/local/packages6/apps/binapps/annovar/2015Dec14

module-whatis "Makes annovar 2015DEC14 available"

prepend-path PATH $ANNOVAR_DIR

Ansys

Ansys
Versions 14, 14.5, 15
Support Level FULL
Dependencies If using User Defined Functions (UDF) you will also need the following: for Ansys
Mechanical, Workbench, CFX and AutoDYN, the Intel 14.0 or above compiler; for Fluent,
GCC 4.6.1 or above
URL http://www.ansys.com/en_uk
Local URL http://www.shef.ac.uk/cics/research/software/fluent

The ANSYS suite of programs can be used to numerically simulate a large variety of structural and fluid dynamics problems
found in many engineering, physics, medical, aeronautics and automotive industry applications.

Interactive usage After connecting to iceberg (see Connect to iceberg), start an interactive session with the
qsh command. Alternatively, if you require more memory, for example 16 gigabytes, use the command qsh -l
rmem=16G
To make the latest version of Ansys available, run the following module command
module load apps/ansys

Alternatively, you can make a specific version available with one of the following commands
module load apps/ansys/14
module load apps/ansys/14.5
module load apps/ansys/15.0

After loading the modules, you can issue the following commands to run an ansys product. The teaching versions
mentioned below are installed for use during ANSYS and FLUENT teaching labs and will only allow models of up to
500,000 elements.


ansyswb : to run Ansys Workbench
ansys : to run Ansys Mechanical outside the workbench
ansystext : to run the line-mode version of Ansys
Ansys and Ansystext : to run the teaching license version of the above two commands
fluent : to run Fluent outside the workbench
fluentext : to run the line-mode version of Fluent
Fluent and Fluentext : to run the teaching license version of the above two commands
icemcfx or icem : to run icemcfd outside the workbench

Running Batch fluent and ansys jobs The easiest way of running batch ansys and fluent jobs is as follows: load the
module for the version you require, for example module load apps/ansys/15.0, and then use runfluent or runansys.

The runfluent and runansys commands submit a fluent journal or ansys input file into the batch system and can take a
number of different parameters, according to your requirements.

runfluent command Just typing runfluent will display information on how to use it.
Usage: runfluent [2d,2ddp,3d or 3ddp] fluent_journal_file -time hh:mm:ss [-mem=nn] [-rmem=nn] [-mail
your_email_address] [-nq] [-parallel nprocs][optional_extra_fluent_params].
Where all but the first two parameters are optional.
First parameter [2d , 2ddp , etc ] is the dimensionality of the problem.
Second parameter, fluent_journal_file, is the file containing the fluent commands.
Other 'optional' parameters are:
-time hh:mm:ss is the cpu time needed in hours:minutes:seconds
-mem=nn is the virtual memory needed (Default=8G). Example: -mem 12G (for 12 GBytes)
-rmem=nn is the real memory needed.(Default=2G). Example: -rmem 4G (for 4 GBytes)
-mail email_address. You will receive emails about the progress of your job.
Example:-mail J.Bloggs@sheffield.ac.uk
-nq is an optional parameter to submit without confirming
-parallel nprocs : Only needed for parallel jobs to specify the no.of processors.
-project project_name : The job will use a project allocation.
fluent_params : any parameter not recognised will be passed to fluent itself.

Example: runfluent 3d nozzle.jou -time 00:30:00 -mem=10G


Fluent journal files are essentially a sequence of Fluent Commands you would have entered by starting fluent in
non-gui mode.
Here is an example journal file:
/file/read-case test.cas
/file/read-data test.dat
/solve iter 200
/file/write-data testv5b.dat
yes
/exit
yes

Note that there can be no graphics output related commands in the journal file as the job will be run in batch mode.
Please see fluent documents for further details of journal files and how to create them.
By using the -g parameter, you can start up an interactive fluent session in non-gui mode to experiment. For example:
fluent 3d -g


runansys command RUNANSYS COMMAND SUBMITS ANSYS JOBS TO THE SUN GRID ENGINE
Usage: runansys ansys_inp_file [-time hh:mm:ss][-mem=nn] [-rmem=nn] [-parallel n] [-project proj_name] [-mail
email_address] [-fastdata] [other qsub parameters]
Where; ansys_inp_file is a file containing a series of Ansys commands.
-time hh:mm:ss is the cpu time needed in hours:minutes:seconds, if not specified 1 hour will be assu
-mem=nn is the virtual memory requirement.
-rmem=nn is the real memory requirement.
-parallel n request an n-way parallel ansys job
-gpu use GPU. Note for GPU users: -mem= must be greater than 18G.
-project project_name : The job will use a project's allocation.
-mail your_email_address : Job progress report is emailed to you.
-fastdata use /fastdata/$USER/$JOB_ID as the working directory

As well as time and memory, any other valid qsub parameter can be specified. In particular, users of UPF functions will
need to specify -v ANS_USER_PATH=the_working_directory
All parameters except the ansys_inp file are optional.
Output files created by Ansys take their names from the jobname specified by the user. You will be prompted for a
jobname as well as any other startup parameter you wish to pass to Ansys. Example: runansys test1.dat -time 00:30:00
-mem 8G -rmem=3G -mail j.bloggs@shef.ac.uk

bcbio

bcbio
Latest version 0.9.6a
Dependencies gcc 5.2, R 3.2.1, Anaconda Python 2.3
URL http://bcbio-nextgen.readthedocs.org/en/latest/

A python toolkit providing best-practice pipelines for fully automated high throughput sequencing analysis. You write
a high level configuration file specifying your inputs and analysis parameters. This input drives a parallel pipeline that
handles distributed execution, idempotent processing restarts and safe transactional steps. The goal is to provide a
shared community resource that handles the data processing component of sequencing analysis, providing researchers
with more time to focus on the downstream biology.

Usage Load version 0.9.6a of bcbio with the command


module load apps/gcc/5.2/bcbio/0.9.6a

There is also a development version of bcbio installed on iceberg. This could change without warning and should not
be used for production
module load apps/gcc/5.2/bcbio/devel

These module commands add bcbio commands to the PATH, load any supporting environments and correctly configure
the system for bcbio usage.
Once the module is loaded you can, for example, check the version of bcbio
bcbio_nextgen.py -v

/usr/local/packages6/apps/gcc/5.2/bcbio/0.9.6a/anaconda/lib/python2.7/site-packages/matplotlib/__init


warnings.warn(self.msg_depr % (key, alt_key))


0.9.6a

To check how the loaded version of bcbio has been configured


more $BCBIO_DIR/config/install-params.yaml

At the time of writing, the output from the above command is


aligners:
- bwa
- bowtie2
- rtg
- hisat2
genomes:
- hg38
- hg19
- GRCh37
isolate: true
tooldir: /usr/local/packages6/apps/gcc/5.2/bcbio/0.9.6a/tools
toolplus: []

Example batch submission TODO
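
Pending proper documentation, a minimal sketch of a batch submission is given below; it assumes a standard bcbio project layout with a hypothetical configuration file called project.yaml and should be adapted to your own pipeline.

#!/bin/bash
#$ -pe openmp 8
#$ -l mem=4G -l rmem=4G   # Per core

module load apps/gcc/5.2/bcbio/0.9.6a

# Run the pipeline described by the (hypothetical) project.yaml on 8 cores
cd work
bcbio_nextgen.py ../config/project.yaml -n 8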

Integration with SGE TODO

Installation Notes These are primarily for system administrators.


0.9.6a
Version 0.9.6a was installed using gcc 5.2, R 3.2.1 and Anaconda Python 2.3. The install was performed in two parts.
The first step was to run the SGE script below in batch mode. Note that the install often fails due to external services
being flaky. See https://github.com/rcgsheffield/iceberg_software/issues/219 for details. Depending on the reason for
the failure, it should be OK to simply restart the install. This particular install was done in one-shot...no restarts
necessary.
• install_bcbio_0.96a
The output from this batch run can be found in /usr/local/packages6/apps/gcc/5.2/bcbio/0.9.6a/install_output/
Once the install completed, the module file (see Modulefile section) was created and loaded and the following upgrades
were performed
bcbio_nextgen.py upgrade --toolplus gatk=./GenomeAnalysisTK.jar
bcbio_nextgen.py upgrade --genomes hg38 --aligners hisat2

The GATK .jar file was obtained from https://www.broadinstitute.org/gatk/download/


A further upgrade was performed on 13th January 2016. STAR had to be run directly because the bcbio upgrade
command that made use of it kept stalling (bcbio_nextgen.py upgrade --data --genomes GRCh37 --aligners bwa
--aligners star). We have no idea why this made a difference, but at least the direct STAR run could make use of
multiple cores whereas the bcbio installer only uses 1
#!/bin/bash
#$ -l rmem=3G -l mem=3G
#$ -P radiant
#$ -pe openmp 16


module load apps/gcc/5.2/bcbio/0.9.6a


STAR --genomeDir /usr/local/packages6/apps/gcc/5.2/bcbio/0.9.6a/genomes/Hsapiens/GRCh37/star --genome

bcbio_nextgen.py upgrade --data --genomes GRCh37 --aligners bwa

Another upgrade was performed on 25th February 2016


module load apps/gcc/5.2/bcbio/0.9.6a
bcbio_nextgen.py upgrade -u stable --data --genomes mm10 --aligners star --aligners bwa

As is usually the case for us, this stalled on the final STAR command. The exact call to STAR was found in
/usr/local/packages6/apps/gcc/5.2/bcbio/0.9.6a/genomes/Mmusculus/mm10/star/Log.out and run manually in a 16
core OpenMP script:
STAR --runMode genomeGenerate --runThreadN 16 --genomeDir /usr/local/packages6/apps/gcc/5.2/bcb

This failed (see https://github.com/rcgsheffield/iceberg_software/issues/272). The fix was to add the line
index mm10 /usr/local/packages6/apps/gcc/5.2/bcbio/0.9.6a/genomes/Mmusculus/mm10/seq/mm10.fa

to the file
usr/local/packages6/apps/gcc/5.2/bcbio/0.9.6a/galaxy/tool-data/sam_fa_indices.loc

Update: 14th March 2016


Another issue required us to modify /usr/local/packages6/apps/gcc/5.2/bcbio/0.9.6a/genomes/Mmusculus/mm10/seq/mm10-resources.yaml so that it read
version: 16

aliases:
snpeff: GRCm38.82
ensembl: mus_musculus_vep_83_GRCm38

variation:
dbsnp: ../variation/mm10-dbSNP-2013-09-12.vcf.gz
lcr: ../coverage/problem_regions/repeats/LCR.bed.gz

rnaseq:
transcripts: ../rnaseq/ref-transcripts.gtf
transcripts_mask: ../rnaseq/ref-transcripts-mask.gtf
transcriptome_index:
tophat: ../rnaseq/tophat/mm10_transcriptome.ver
dexseq: ../rnaseq/ref-transcripts.dexseq.gff3
refflat: ../rnaseq/ref-transcripts.refFlat
rRNA_fa: ../rnaseq/rRNA.fa

srnaseq:
srna-transcripts: ../srnaseq/srna-transcripts.gtf
mirbase-hairpin: ../srnaseq/hairpin.fa
mirbase-mature: ../srnaseq/hairpin.fa
mirdeep2-fasta: ../srnaseq/Rfam_for_miRDeep.fa

Development version
The development version was installed using gcc 5.2, R 3.2.1 and Anaconda Python 2.3.
• install_bcbio_devel.sge This is an SGE submit script. The long running time of the installer made it better suited
to being run as a batch job.


• bcbio-devel modulefile located on the system at /usr/local/modulefiles/apps/gcc/5.2/bcbio/devel


The first install attempt failed with the error
To debug, please try re-running the install command with verbose output:
export CC=${CC:-`which gcc`} && export CXX=${CXX:-`which g++`} && export SHELL=${SHELL:-/bin/bash} &&
Traceback (most recent call last):
File "bcbio_nextgen_install.py", line 276, in <module>
main(parser.parse_args(), sys.argv[1:])
File "bcbio_nextgen_install.py", line 46, in main
subprocess.check_call([bcbio["bcbio_nextgen.py"], "upgrade"] + _clean_args(sys_argv, args, bcbio)
File "/usr/local/packages6/apps/binapps/anacondapython/2.3/lib/python2.7/subprocess.py", line 540,
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['/usr/local/packages6/apps/gcc/5.2/bcbio/devel/anaconda/bin/

I manually ran the command


export CC=${CC:-`which gcc`} && export CXX=${CXX:-`which g++`} && export SHELL=${SHELL:-/bin/bash} &&

and it completed successfully. I then resubmitted the submit script which eventually completed successfully. It took
several hours! At this point, I created the module file.
Bcbio was upgraded to the development version with the following interactive commands
module load apps/gcc/5.2/bcbio/devel
bcbio_nextgen.py upgrade -u development

The GATK .jar file was obtained from https://www.broadinstitute.org/gatk/download/ and installed to bcbio by running
the following commands interactively
module load apps/gcc/5.2/bcbio/devel
bcbio_nextgen.py upgrade --tools --toolplus gatk=./cooper/GenomeAnalysisTK.jar

Module files
• 0.9.6a

Testing Version 0.9.6a


The following test script was submitted to the system as an SGE batch script
#!/bin/bash
#$ -pe openmp 12
#$ -l mem=4G #Per Core!
#$ -l rmem=4G #Per Core!

module add apps/gcc/5.2/bcbio/0.9.6a

git clone https://github.com/chapmanb/bcbio-nextgen.git


cd bcbio-nextgen/tests
./run_tests.sh devel
./run_tests.sh rnaseq

The tests failed due to a lack of pandoc


[2016-01-07T09:40Z] Error: pandoc version 1.12.3 or higher is required and was not found.
[2016-01-07T09:40Z] Execution halted
[2016-01-07T09:40Z] Skipping generation of coverage report: Command 'set -o pipefail; /usr/local/pack
tput/report/qc-coverage-report-run.R


Error: pandoc version 1.12.3 or higher is required and was not found.
Execution halted
' returned non-zero exit status 1

The full output of this testrun is on the system at /usr/local/packages6/apps/gcc/5.2/bcbio/0.9.6a/tests/7-jan-2016/


Pandoc has been added to the list of applications that need to be installed on iceberg.
Development version
The following test script was submitted to the system. All tests passed. The output is at
/usr/local/packages6/apps/gcc/5.2/bcbio/0.9.6a/tests/tests_07_01_2016/
#!/bin/bash
#$ -pe openmp 12
#$ -l mem=4G #Per Core!
#$ -l rmem=4G #Per Core!

module add apps/gcc/5.2/bcbio/0.9.6a

git clone https://github.com/chapmanb/bcbio-nextgen.git


cd bcbio-nextgen/tests
./run_tests.sh devel
./run_tests.sh rnaseq

bcl2fastq

bcl2fastq
Versions 1.8.4
Support Level Bronze
URL http://support.illumina.com/downloads/bcl2fastq_conversion_software_184.html

Illumina sequencing instruments generate per-cycle BCL basecall files as primary sequencing output, but many
downstream analysis applications use per-read FASTQ files as input. bcl2fastq combines these per-cycle BCL files from
a run and translates them into FASTQ files. bcl2fastq can begin bcl conversion as soon as the first read has been
completely sequenced.

Usage To make bcl2fastq available, use the following module command in your submission scripts
module load apps/bcl2fastq/1.8.4

Installation Notes These notes are primarily for system administrators.


Compilation was done using gcc 4.4.7. I tried it with gcc 4.8 but ended up with a lot of errors. The package is also
dependent on Perl. Perl 5.10.1 was used which was the system Perl installed at the time. The RPM Perl-XML-Simple
also needed installing.
export TMP=/tmp
export SOURCE=${TMP}/bcl2fastq
export BUILD=${TMP}/bcl2fastq-1.8.4-build
mkdir -p /usr/local/packages6/apps/gcc/4.4.7/bcl2fastq/1.8.4
export INSTALL=/usr/local/packages6/apps/gcc/4.4.7/bcl2fastq/1.8.4


mv bcl2fastq-1.8.4.tar.bz2 ${TMP}
cd ${TMP}
tar xjf bcl2fastq-1.8.4.tar.bz2

mkdir ${BUILD}
cd ${BUILD}
${SOURCE}/src/configure --prefix=${INSTALL}

make
make install

bedtools

bedtools
Versions 2.25.0
Dependencies compilers/gcc/5.2
URL https://bedtools.readthedocs.org/en/latest/

Collectively, the bedtools utilities are a swiss-army knife of tools for a wide-range of genomics analysis tasks.

Interactive usage After connecting to iceberg (see Connect to iceberg), start an interactive session with the qsh
command.
The latest version of bedtools (currently version 2.25.0) is made available with the command:
module load apps/gcc/5.2/bedtools

Alternatively, you can make a specific version available:


module load apps/gcc/5.2/bedtools/2.25.0

After that, any of the bedtools commands can be run from the prompt.
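
For example (the BED files here are hypothetical):

# Report the intervals where features in a.bed overlap features in b.bed
bedtools intersect -a a.bed -b b.bed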

Installation notes bedtools was installed using gcc 5.2 with the script install_bedtools.sh

Testing No test suite was found.

Modulefile
• The module file is on the system at /usr/local/modulefiles/apps/gcc/5.2/bedtools/2.25.0
The contents of the module file are
#%Module1.0#####################################################################
##
## bedtools module file
##

## Module file logging


source /usr/local/etc/module_logging.tcl
##


proc ModulesHelp { } {
global bedtools-version

puts stderr " Adds `bedtools-$bedtools-version' to your PATH environment variable and necessa
}

set bedtools-version 2.25.0


module load compilers/gcc/5.2

prepend-path PATH /usr/local/packages6/apps/gcc/5.2/bedtools/2.25.0/bin

bitseq

bitseq
Versions 0.7.5
URL http://bitseq.github.io/

Transcript isoform level expression and differential expression estimation for RNA-seq

Interactive Usage After connecting to Iceberg (see Connect to iceberg), start an interactive session with the qrsh
command. The latest version of bitseq (currently 0.7.5) is made available with the command
module load apps/gcc/5.2/bitseq

Alternatively, you can load a specific version with


module load apps/gcc/5.2/bitseq/0.7.5

This command adds the BitSeq binaries to your PATH.

Documentation Documentation is available online http://bitseq.github.io/howto/index

Installation notes BitSeq was installed using gcc 5.2


module load compilers/gcc/5.2
tar -xvzf ./BitSeq-0.7.5.tar.gz
cd BitSeq-0.7.5

make
cd ..
mkdir -p /usr/local/packages6/apps/gcc/5.2/bitseq
mv ./BitSeq-0.7.5 /usr/local/packages6/apps/gcc/5.2/bitseq/

Testing No test suite was found.

Modulefile
• The module file is on the system at /usr/local/modulefiles/apps/gcc/5.2/bitseq/0.7.5. Its contents are


#%Module1.0#####################################################################
##
## bitseq 0.7.5 modulefile
##

## Module file logging


source /usr/local/etc/module_logging.tcl
##

module load compilers/gcc/5.2

proc ModulesHelp { } {
puts stderr "Makes bitseq 0.7.5 available"
}

set version 0.7.5


set BITSEQ_DIR /usr/local/packages6/apps/gcc/5.2/bitseq/BitSeq-$version

module-whatis "Makes bitseq 0.7.5 available"

prepend-path PATH $BITSEQ_DIR

BLAST

ncbi-blast
Version 2.3.0
URL https://blast.ncbi.nlm.nih.gov/Blast.cgi?PAGE_TYPE=BlastDocs&DOC_TYPE=Download

BLAST+ is a new suite of BLAST tools that utilizes the NCBI C++ Toolkit. The BLAST+ applications have a number
of performance and feature improvements over the legacy BLAST applications. For details, please see the BLAST+
user manual.

Interactive Usage After connecting to Iceberg (see Connect to iceberg), start an interactive session with the qrshx
or qrsh command.
The latest version of BLAST+ (currently 2.3.0) is made available with the command
module load apps/binapps/ncbi-blast

Alternatively, you can load a specific version with


module load apps/binapps/ncbi-blast/2.3.0

This command makes the BLAST+ executables available to your session by adding the install directory to your PATH
variable. It also sets the BLASTDB database environment variable.
You can now run commands directly. e.g.
blastn -help

Databases The following databases have been installed following a user request
• nr.*tar.gz Non-redundant protein sequences from GenPept, Swissprot, PIR, PDF, PDB, and NCBI RefSeq


• nt.*tar.gz Partially non-redundant nucleotide sequences from all traditional divisions of GenBank, EMBL, and
DDBJ excluding GSS,STS, PAT, EST, HTG, and WGS.
A full list of databases available on the NCBI FTP site is at ftp://ftp.ncbi.nlm.nih.gov/blast/documents/blastdb.html
If you need any of these installing, please make a request on our github issues log.
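
Below is a minimal sketch of a batch job that searches the centrally installed nt database; the query file name and memory request are illustrative assumptions.

#!/bin/bash
#Request 8 gigabytes of real and virtual memory
#$ -l mem=8G -l rmem=8G

module load apps/binapps/ncbi-blast/2.3.0

#my_query.fasta is a hypothetical input file. The -db nt option finds the
#centrally installed nucleotide database via the BLASTDB environment variable
#set by the module.
blastn -query my_query.fasta -db nt -out my_results.txt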

Installation notes This was an install from binaries


#get binaries and put in the correct location
wget ftp://ftp.ncbi.nlm.nih.gov/blast/executables/LATEST/ncbi-blast-2.3.0+-x64-linux.tar.gz
tar -xzf ./ncbi-blast-2.3.0+-x64-linux.tar.gz
mkdir -p /usr/local/packages6/apps/binapps/ncbi-blast/
mv ncbi-blast-2.3.0+ /usr/local/packages6/apps/binapps/ncbi-blast/

#Create database directory


mkdir -p /usr/local/packages6/apps/binapps/ncbi-blast/ncbi-blast-2.3.0+/db

#Install the nr database


cd /usr/local/packages6/apps/binapps/ncbi-blast/ncbi-blast-2.3.0+/db
for num in `seq 0 48`;
do
paddednum=`printf "%02d" $num`
`wget ftp://ftp.ncbi.nlm.nih.gov/blast/db/nr.$paddednum.tar.gz`
done

#Install the nt database


for num in `seq 0 36`;
do
paddednum=`printf "%02d" $num`
`wget ftp://ftp.ncbi.nlm.nih.gov/blast/db/nt.$paddednum.tar.gz`
done

for f in *.tar.gz; do tar -xvzf $f; done

Testing No testing has been performed. If you can suggest a suitable test suite, please contact us.

Modulefile
• The module file is on the system at /usr/local/modulefiles/apps/binapps/ncbi-blast/2.3.0
The contents of the module file are
#%Module1.0#####################################################################
##
## BLAST 2.3.0 modulefile
##

## Module file logging


source /usr/local/etc/module_logging.tcl
##

proc ModulesHelp { } {
puts stderr "Makes BLAST 2.3.0 available"
}


set BLAST_DIR /usr/local/packages6/apps/binapps/ncbi-blast/ncbi-blast-2.3.0+

module-whatis "Makes BLAST 2.3.0 available"

prepend-path PATH $BLAST_DIR/bin


prepend-path BLASTDB $BLAST_DIR/db

bless

Bless
Version 1.02
URL http://sourceforge.net/p/bless-ec/wiki/Home/

BLESS: Bloom-filter-based Error Correction Solution for High-throughput Sequencing Reads

Interactive Usage Bless uses MPI and we currently have no interactive MPI environments available. As such, it is
not possible to run Bless interactively on Iceberg.

Batch Usage The latest version of bless (currently 1.02) is made available with the command
module load apps/gcc/4.9.2/bless

Alternatively, you can load a specific version with


module load apps/gcc/4.9.2/bless/1.02

The module creates the environment variable BLESS_PATH, which points to the Bless installation directory. It also
loads the dependent modules
• compilers/gcc/4.9.2
• mpi/gcc/openmpi/1.8.3
Here is an example batch submission script that makes use of two example input files we have included in our installation of BLESS
#!/bin/bash
#Next line is memory per slot
#$ -l mem=5G -l rmem=5G
#Ask for 4 slots in an OpenMP/MPI Hybrid queue
#These 4 slots will all be on the same node
#$ -pe openmpi-hybrid-4 4

#load the module


module load apps/gcc/4.9.2/bless/1.02

#Output information about nodes where code is running


cat $PE_HOSTFILE > nodes

data_folder=$BLESS_PATH/test_data
file1=test_small_1.fq
file2=test_small_2.fq

#Number of cores per node we are going to use.


cores=4
export OMP_NUM_THREADS=$cores

#Run BLESS
mpirun $BLESS_PATH/bless -kmerlength 21 -smpthread $cores -prefix test_bless -read1 $data_folder/$fil

BLESS makes use of both MPI and OpenMP parallelisation frameworks. As such, it is necessary to use the hybrid
MPI/OpenMP queues. The current build of BLESS does not work on more than one node. This will limit you to the
maximum number of cores available on one node.
For example, to use all 16 cores on a 16 core node you would request the following parallel environment
#$ -pe openmpi-hybrid-16 16

Remember that memory is allocated on a per-slot basis. You should ensure that you do not request more memory than
is available on a single node or your job will be permanently stuck in a queue-waiting (qw) status.

Installation notes Various issues were encountered while attempting to install bless. See
https://github.com/rcgsheffield/iceberg_software/issues/143 for details. It was necessary to install gcc 4.9.2 in
order to build bless. No other compiler worked!
Here are the install steps
tar -xvzf ./bless.v1p02.tgz

mkdir -p /usr/local/modulefiles/apps/gcc/4.9.2/bless/
cd v1p02/

Load Modules
module load compilers/gcc/4.9.2
module load mpi/gcc/openmpi/1.8.3

Modify the Makefile. Change the line


cd zlib; ./compile

to
cd zlib;

Manually compile zlib


cd zlib/
./compile

Finish the compilation


cd ..
make

Copy the bless folder to the central location


cd ..
cp -r ./v1p02/ /usr/local/packages6/apps/gcc/4.9.2/bless/

Testing No test suite was found.


Modulefile
• The module file is on the system at /usr/local/modulefiles/apps/gcc/4.9.2/bless/1.02
• The module file is on github.

Bowtie2

bowtie2
Versions 2.2.6
URL http://bowtie-bio.sourceforge.net/bowtie2/index.shtml

Bowtie 2 is an ultrafast and memory-efficient tool for aligning sequencing reads to long reference sequences. It is
particularly good at aligning reads of about 50 up to 100s or 1,000s of characters, and particularly good at aligning
to relatively long (e.g. mammalian) genomes. Bowtie 2 indexes the genome with an FM Index to keep its memory
footprint small: for the human genome, its memory footprint is typically around 3.2 GB. Bowtie 2 supports gapped,
local, and paired-end alignment modes.

Interactive Usage After connecting to Iceberg (see Connect to iceberg), start an interactive session with the qsh or
qrsh command.
The latest version of bowtie2 (currently 2.2.6) is made available with the command
module load apps/gcc/5.2/bowtie2

Alternatively, you can load a specific version with


module load apps/gcc/5.2/bowtie2/2.2.6

This command makes the bowtie2 executables available to your session by adding the install directory to your PATH
variable. This allows you to simply do something like the following
bowtie2 --version

which gives results that looks something like


/usr/local/packages6/apps/gcc/5.2/bowtie2/2.2.6/bowtie2-align-s version 2.2.6
64-bit
Built on node063
Fri Oct 23 08:40:38 BST 2015
Compiler: gcc version 5.2.0 (GCC)
Options: -O3 -m64 -msse2 -funroll-loops -g3 -DPOPCNT_CAPABILITY
Sizeof {int, long, long long, void*, size_t, off_t}: {4, 8, 8, 8, 8, 8}
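
As a brief illustration, a typical workflow builds an index from a reference sequence and then aligns reads against it; the file names below are hypothetical.

#Build an index from a hypothetical reference sequence
bowtie2-build my_genome.fa my_index

#Align hypothetical paired-end reads against the index
bowtie2 -x my_index -1 reads_1.fq -2 reads_2.fq -S alignments.sam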

Installation notes bowtie2 2.2.6 was installed using gcc 5.2


#Build
module load compilers/gcc/5.2
unzip bowtie2-2.2.6-source.zip
cd bowtie2-2.2.6
make

#Install
cd ..


mkdir -p /usr/local/packages6/apps/gcc/5.2/bowtie2
mv ./bowtie2-2.2.6 /usr/local/packages6/apps/gcc/5.2/bowtie2/2.2.6

Testing No test suite was found.

Modulefile
• The module file is on the system at /usr/local/modulefiles/apps/gcc/5.2/bowtie2/2.2.6
The contents of the module file are
#%Module1.0#####################################################################
##
## bowtie2 2.2.6 modulefile
##

## Module file logging


source /usr/local/etc/module_logging.tcl
##

module load compilers/gcc/5.2

proc ModulesHelp { } {
puts stderr "Makes bowtie 2.2.6 available"
}

set version 2.2.6


set BOWTIE2_DIR /usr/local/packages6/apps/gcc/5.2/bowtie2/$version

module-whatis "Makes bowtie2 v2.2.6 available"

prepend-path PATH $BOWTIE2_DIR

bwa

bwa
Versions 0.7.12
URL http://bio-bwa.sourceforge.net/

BWA (Burrows-Wheeler Aligner) is a software package for mapping low-divergent sequences against a large reference
genome, such as the human genome. It consists of three algorithms: BWA-backtrack, BWA-SW and BWA-MEM. The
first algorithm is designed for Illumina sequence reads up to 100bp, while the other two are for longer sequences ranging
from 70bp to 1Mbp. BWA-MEM and BWA-SW share similar features such as long-read support and split alignment,
but BWA-MEM, which is the latest, is generally recommended for high-quality queries as it is faster and more accurate.
BWA-MEM also has better performance than BWA-backtrack for 70-100bp Illumina reads.

Interactive Usage After connecting to Iceberg (see Connect to iceberg), start an interactive session with the qrshx
command.
The latest version of bwa (currently 0.7.12) is made available with the command


module load apps/gcc/5.2/bwa

Alternatively, you can load a specific version with


module load apps/gcc/5.2/bwa/0.7.12
module load apps/gcc/5.2/bwa/0.7.5a

This command makes the bwa binary available to your session.

Documentation Once you have made bwa available to the system using the module command above, you can read
the man pages by typing
man bwa
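
As a brief illustration, a typical workflow indexes a reference sequence and then aligns reads against it with BWA-MEM; the file names below are hypothetical.

#Index a hypothetical reference sequence
bwa index reference.fa

#Align hypothetical single-end reads using BWA-MEM
bwa mem reference.fa reads.fq > aligned.sam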

Installation notes bwa 0.7.12


bwa 0.7.12 was installed using gcc 5.2
module load compilers/gcc/5.2

#build
module load compilers/gcc/5.2
tar -xvjf ./bwa-0.7.12.tar.bz2
cd bwa-0.7.12
make

#Sort out manfile


mkdir -p share/man/man1
mv bwa.1 ./share/man/man1/

#Install
mkdir -p /usr/local/packages6/apps/gcc/5.2/bwa/
cd ..
mv bwa-0.7.12 /usr/local/packages6/apps/gcc/5.2/bwa/0.7.12/

bwa 0.7.5a
bwa 0.7.5a was installed using gcc 5.2
module load compilers/gcc/5.2

#build
module load compilers/gcc/5.2
tar -xvjf bwa-0.7.5a.tar.bz2
cd bwa-0.7.5a
make

#Sort out manfile


mkdir -p share/man/man1
mv bwa.1 ./share/man/man1/

#Install
mkdir -p /usr/local/packages6/apps/gcc/5.2/bwa/
cd ..
mv bwa-0.7.5a /usr/local/packages6/apps/gcc/5.2/bwa/

Testing No test suite was found.


Module files The default version is controlled by the .version file at /usr/local/modulefiles/apps/gcc/5.2/bwa/.version
#%Module1.0#####################################################################
##
## version file for bwa
##
set ModulesVersion "0.7.12"

Version 0.7.12
• The module file is on the system at /usr/local/modulefiles/apps/gcc/5.2/bwa/0.7.12
• On github: 0.7.12.
Version 0.7.5a
• The module file is on the system at /usr/local/modulefiles/apps/gcc/5.2/bwa/0.7.5a
• On github: 0.7.5a.

CDO

CDO
Version 1.7.2
URL https://code.zmaw.de/projects/cdo

CDO is a collection of command line Operators to manipulate and analyse Climate and NWP model Data. Supported
data formats are GRIB 1/2, netCDF 3/4, SERVICE, EXTRA and IEG. There are more than 600 operators available.

Interactive Usage After connecting to Iceberg (see Connect to iceberg), start an interactive session with the qsh or
qrsh command.
To add the cdo command to the system PATH, execute the following command
module load apps/gcc/5.3/cdo/1.7.2
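
For example, two commonly used operators are shown below; input.nc is a hypothetical netCDF file.

#Print a short summary of a hypothetical netCDF file
cdo sinfo input.nc

#Compute monthly means and write them to a new file
cdo monmean input.nc output_monthly.nc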

Documentation The CDO manual is available online at https://code.zmaw.de/projects/cdo/embedded/index.html

Installation notes
Installation
(1) Download the latest current version:

wget https://code.zmaw.de/attachments/download/12350/cdo-current.tar.gz

(2) Extract the files into a working directory ( I have used /data/cs1ds/2016/cdo ) :

gunzip cdo-current.tar.gz
tar -xvf cdo-current.tar

(3) Install the program ( I have used /usr/local/extras/CDO ):

module load compilers/gcc/5.3


./configure --prefix=/usr/local/extras/CDO --with-netcdf=/usr/local/extras/netcdf/4.3.2


make install

Modulefile
• The module file is on the system at /usr/local/modulefiles/apps/gcc/5.3/cdo
Its contents are
#%Module1.0#####################################################################
##
## CDO Climate Data Operators Module file.
##

## Module file logging


source /usr/local/etc/module_logging.tcl
##

proc ModulesHelp { } {
puts stderr "Makes Climate Data Operator $version available"
}

set version 1.7.2


set BIN_DIR /usr/local/extras/CDO/$version

module-whatis "Makes CDO (Climate Data Operators) $version available"

prepend-path PATH $BIN_DIR/bin

Code Saturne 4.0

Code Saturne
Version 4.0
URL http://code-saturne.org/cms/
Documentation http://code-saturne.org/cms/documentation
Location /usr/local/packages6/apps/gcc/4.4.7/code_saturne/4.0

Code_Saturne solves the Navier-Stokes equations for 2D, 2D-axisymmetric and 3D flows, steady or unsteady, laminar
or turbulent, incompressible or weakly dilatable, isothermal or not, with scalars transport if required.

Usage To make code saturne available, run the following module command after starting a qsh session.
module load apps/code_saturne/4.0.0

Troubleshooting If you run Code Saturne jobs from /fastdata they will fail since the underlying filesystem used
by /fastdata does not support posix locks. If you need more space than is available in your home directory, use
/data instead. If you use /fastdata with Code Saturne you may get an error message similar to the one below
File locking failed in ADIOI_Set_lock(fd 14,cmd F_SETLKW/7,type F_WRLCK/1,whence 0) with return value
- If the file system is NFS, you need to use NFS version 3, ensure that the lockd daemon is running o
- If the file system is LUSTRE, ensure that the directory is mounted with the 'flock' option.
ADIOI_Set_lock:: Function not implemented


ADIOI_Set_lock:offset 0, length 8
--------------------------------------------------------------------------
MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD
with errorcode 1.
NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
--------------------------------------------------------------------------
solver script exited with status 1.
Error running the calculation.
Check Code_Saturne log (listing) and error* files for details.

Installation Notes These notes are primarily for system administrators.


Installation notes for the version referenced by module module load apps/code_saturne/4.0.0:
Pre-requisites: This version of Code Saturne was built with the following:-
• gcc 4.4.7
• cgns 3.2.1
• MED 3.0.8
• OpenMPI 1.8.3
• HDF5 1.8.14 built using gcc
module load libs/gcc/4.4.7/cgns/3.2.1

tar -xvzf code_saturne-4.0.0.tar.gz


mkdir code_saturn_build
cd code_saturn_build/
./../code_saturne-4.0.0/configure --prefix=/usr/local/packages6/apps/gcc/4.4.7/code_saturne/4.0 --wit

This gave the following configuration


Configuration options:
use debugging code: no
MPI (Message Passing Interface) support: yes
OpenMP support: no

The package has been configured. Type:


make
make install

To generate and install the PLE package

Configuration options:
use debugging code: no
use malloc hooks: no
use graphical user interface: yes
use long integers: yes
Zlib (gzipped file) support: yes
MPI (Message Passing Interface) support: yes
MPI I/O support: yes
MPI2 one-sided communication support: yes
OpenMP support: no
BLAS (Basic Linear Algebra Subprograms) support: no


Libxml2 (XML Reader) support: yes


ParMETIS (Parallel Graph Partitioning) support: no
METIS (Graph Partitioning) support: no
PT-SCOTCH (Parallel Graph Partitioning) support: no
SCOTCH (Graph Partitioning) support: no
CCM support: no
HDF (Hierarchical Data Format) support: yes
CGNS (CFD General Notation System) support: yes
MED (Model for Exchange of Data) support: yes
MED MPI I/O support: yes
MEDCoupling support: no
Catalyst (ParaView co-processing) support: no
EOS support: no
freesteam support: no
SALOME GUI support: yes
SALOME Kernel support: yes
Dynamic loader support (for YACS): dlopen

I then did
make
make install

Post Install Steps To make Code Saturne aware of the SGE system:
• Created /usr/local/packages6/apps/gcc/4.4.7/code_saturne/4.0/etc/code_saturne.cfg:
See code_saturne.cfg 4.0
• Modified /usr/local/packages6/apps/gcc/4.4.7/code_saturne/4.0/share/code_saturne/batch/ba
See: batch.SGE 4.0

Testing This module has not yet been properly tested and so should be considered experimental.
Several users' jobs of up to 8 cores have been submitted and have run to completion.

Module File Module File Location: /usr/local/modulefiles/apps/code_saturne/4.0.0


#%Module1.0#####################################################################
##
## code_saturne 4.0 module file
##

## Module file logging


source /usr/local/etc/module_logging.tcl
##

proc ModulesHelp { } {
global code-saturneversion

puts stderr " Adds `code_saturn-$codesaturneversion' to your PATH environment variable and ne
}

set codesaturneversion 4.0.


module load mpi/gcc/openmpi/1.8.3

module-whatis "loads the necessary `code_saturne-$codesaturneversion' library paths"


set cspath /usr/local/packages6/apps/gcc/4.4.7/code_saturne/4.0


prepend-path MANPATH $cspath/share/man
prepend-path PATH $cspath/bin

ctffind

ctffind
Versions 3.140609
URL http://ctffind.readthedocs.org/en/latest/

CTFFIND3 and CTFTILT are two programs for finding CTFs of electron micrographs. The program CTFFIND3 is
an updated version of the program CTFFIND2, which was developed in 1998 by Nikolaus Grigorieff at the MRC
Laboratory of Molecular Biology in Cambridge, UK with financial support from the MRC. This software is licensed
under the terms of the Janelia Research Campus Software Copyright 1.1.

Making ctffind available The following module command makes the latest version of ctffind available to your
session
module load apps/gcc/4.4.7/ctffind

Alternatively, you can make a specific version available


module load apps/gcc/4.4.7/ctffind/3.140609

Installation notes These are primarily for system administrators.


• ctffind was installed using the gcc 4.4.7 compiler
• install_ctffind.sh

Modulefile The module file is on the system at /usr/local/modulefiles/apps/gcc/4.4.7/ctffind/3.140609


The contents of the module file are
#%Module1.0#####################################################################
##
## ctfind module file
##

## Module file logging


source /usr/local/etc/module_logging.tcl
##

proc ModulesHelp { } {
global bedtools-version

puts stderr " Adds `ctffind-$ctffindversion' to your PATH environment variable"


}

set ctffindversion 3.140609

prepend-path PATH /usr/local/packages6/apps/gcc/4.4.7/ctffind/3.140609/


Cufflinks

Cufflinks
Version 2.2.1
URL http://cole-trapnell-lab.github.io/cufflinks

Cufflinks assembles transcripts, estimates their abundances, and tests for differential expression and regulation in
RNA-Seq samples. It accepts aligned RNA-Seq reads and assembles the alignments into a parsimonious set of transcripts.
Cufflinks then estimates the relative abundances of these transcripts based on how many reads support each
one, taking into account biases in library preparation protocols.

Interactive Usage After connecting to Iceberg (see Connect to iceberg), start an interactive session with the qsh or
qrsh command.
The latest version of Cufflinks (currently 2.2.1) is made available with the command
module load apps/binapps/cufflinks

Alternatively, you can load a specific version with


module load apps/binapps/cufflinks/2.2.1

This command makes the cufflinks binary directory available to your session by adding it to the PATH environment
variable.
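
As a brief illustration, cufflinks can be run directly on a file of aligned RNA-Seq reads; the input file name below is hypothetical.

#accepted_hits.bam is a hypothetical file of aligned RNA-Seq reads;
#-o sets the output directory
cufflinks -o cufflinks_output accepted_hits.bam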

Documentation The Cufflinks manual is available online at http://cole-trapnell-lab.github.io/cufflinks/manual/

Installation notes A binary install was used


tar -xvzf cufflinks-2.2.1.Linux_x86_64.tar.gz
mkdir -p /usr/local/packages6/apps/binapps/cufflinks
mv cufflinks-2.2.1.Linux_x86_64 /usr/local/packages6/apps/binapps/cufflinks/2.2.1

Modulefile
• The module file is on the system at /usr/local/modulefiles/apps/binapps/cufflinks/2.2.1
Its contents are
## cufflinks 2.2.1 modulefile
##

## Module file logging


source /usr/local/etc/module_logging.tcl
##

proc ModulesHelp { } {
puts stderr "Makes cufflinks 2.2.1 available"
}

set version 2.2.1


set CUFF_DIR /usr/local/packages6/apps/binapps/cufflinks/$version


module-whatis "Makes cufflinks 2.2.1 available"

prepend-path PATH $CUFF_DIR

ffmpeg

ffmpeg
Version 2.8.3
URL https://www.ffmpeg.org/

FFmpeg is the leading multimedia framework, able to decode, encode, transcode, mux, demux, stream, filter and play
pretty much anything that humans and machines have created. It supports the most obscure ancient formats up to the
cutting edge. No matter if they were designed by some standards committee, the community or a corporation. It is
also highly portable: FFmpeg compiles, runs, and passes our testing infrastructure FATE across Linux, Mac OS X,
Microsoft Windows, the BSDs, Solaris, etc. under a wide variety of build environments, machine architectures, and
configurations.

Interactive Usage After connecting to iceberg (see Connect to iceberg), start an interactive session with the qsh or
qrsh command. The latest version of ffmpeg (currently 2.8.3) is made available with the command
module load apps/gcc/5.2/ffmpeg

Alternatively, you can load a specific version with


module load apps/gcc/5.2/ffmpeg/2.8.3

This command makes the ffmpeg binaries available to your session. It also loads version 5.2 of the gcc compiler
environment since gcc 5.2 was used to compile ffmpeg 2.8.3
You can now run ffmpeg. For example, to confirm the version loaded
ffmpeg -version

and to get help


ffmpeg -h
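
As a brief illustration, ffmpeg can convert between container formats and extract individual streams; the file names below are hypothetical.

#Convert a hypothetical AVI file to MP4
ffmpeg -i input.avi output.mp4

#Extract the audio track from a hypothetical video file, discarding the video (-vn)
ffmpeg -i input.mp4 -vn audio.mp3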

Documentation Once you have made ffmpeg available to the system using the module command above, you can
read the man pages by typing
man ffmpeg

Installation notes ffmpeg was installed using gcc 5.2


module load compilers/gcc/5.2

tar xf ./ffmpeg-2.8.3.tar.xz
cd ffmpeg-2.8.3
mkdir -p /usr/local/packages6/apps/gcc/5.2/ffmpeg/2.8.3
./configure --prefix=/usr/local/packages6/apps/gcc/5.2/ffmpeg/2.8.3
make
make install


Testing The test suite was executed


make check

All tests passed.

Modulefile
• The module file is on the system at /usr/local/modulefiles/apps/gcc/5.2/ffmpeg/2.8.3
• The module file is on github.

Freesurfer

Freesurfer
Versions 5.3.0
URL http://freesurfer.net/

An open source software suite for processing and analyzing (human) brain MRI images.

Interactive Usage After connecting to iceberg (see Connect to iceberg), start an interactive graphical session with
the qsh command.
The latest version of Freesurfer is made available with the command
module load apps/binapps/freesurfer

Alternatively, you can load a specific version with


module load apps/binapps/freesurfer/5.3.0
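
The module sets the SUBJECTS_DIR environment variable to the subjects directory inside the central install, which you cannot write to, so you will usually want to point it at a directory of your own first. A minimal sketch of a typical session is shown below; the subject name and input scan are hypothetical.

#Point SUBJECTS_DIR at a directory you can write to
export SUBJECTS_DIR=$HOME/freesurfer_subjects
mkdir -p $SUBJECTS_DIR

#Run the full cortical reconstruction pipeline on a hypothetical input scan
recon-all -s subject01 -i scan01.nii.gz -all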

Important note Freesurfer is known to produce differing results when used across platforms. See
http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0038234 for details. We recommend that you run all
of your analyses for any given study on the same operating system/hardware combination and record details of these
as part of your results.

Testing No formal test suite was found. A set of commands to try were suggested at
https://surfer.nmr.mgh.harvard.edu/fswiki/TestingFreeSurfer (Accessed November 6th 2015)
The following were tried
freeview -v $SUBJECTS_DIR/bert/mri/brainmask.mgz -v $SUBJECTS_DIR/bert/mri/aseg.mgz:colormap=lut:opac

tkmedit bert orig.mgz

tkmedit bert norm.mgz -segmentation aseg.mgz $FREESURFER_HOME/FreeSurferColorLUT.txt

tksurfer bert rh pial

qdec


Installation notes These are primarily for administrators of the system. The github issue for the original install
request is at https://github.com/rcgsheffield/iceberg_software/issues/164
Freesurfer was installed as follows
wget -c ftp://surfer.nmr.mgh.harvard.edu/pub/dist/freesurfer/5.3.0/freesurfer-Linux-centos6_x86_64-st
tar -xvzf ./freesurfer-Linux-centos6_x86_64-stable-pub-v5.3.0.tar.gz
mv freesurfer 5.3.0
mkdir -p /usr/local/packages6/apps/binapps/freesurfer
mv ./5.3.0/ /usr/local/packages6/apps/binapps/freesurfer/

The license file was obtained from the vendor and placed in /usr/local/packages6/apps/binapps/freesurfer/license.txt.
The vendors were asked if it was OK to use this license on a central system. The answer is ‘yes’ - details at
http://www.mail-archive.com/freesurfer@nmr.mgh.harvard.edu/msg43872.html
To find the information necessary to create the module file
env >base.env
source /usr/local/packages6/apps/binapps/freesurfer/5.3.0/SetUpFreeSurfer.sh
env > after.env
diff base.env after.env

The script SetUpFreeSurfer.sh additionally creates a MATLAB startup.m file in the user’s home directory if it does
not already exist. This is for MATLAB support only, has not been replicated in the module file and is currently not
supported in this install.
The module file is at /usr/local/modulefiles/apps/binapps/freesurfer/5.3.0
#%Module10.2#####################################################################

## Module file logging


source /usr/local/etc/module_logging.tcl
##

# Freesurfer version (not in the user's environment)


set ver 5.3.0

proc ModulesHelp { } {
global ver

puts stderr "Makes Freesurfer $ver available to the system."


}

module-whatis "Sets the necessary Freesurfer $ver paths"

prepend-path PATH /usr/local/packages6/apps/binapps/freesurfer/$ver/bin


prepend-path FREESURFER_HOME /usr/local/packages6/apps/binapps/freesurfer/$ver

# The following emulates the results of 'source $FREESURFER_HOME/SetUpFreeSurfer.csh'


setenv FS_OVERRIDE 0
setenv PERL5LIB /usr/local/packages6/apps/binapps/freesurfer/5.3.0/mni/lib/perl5/5.8.5
setenv OS Linux
setenv LOCAL_DIR /usr/local/packages6/apps/binapps/freesurfer/5.3.0/local
setenv FSFAST_HOME /usr/local/packages6/apps/binapps/freesurfer/5.3.0/fsfast
setenv MNI_PERL5LIB /usr/local/packages6/apps/binapps/freesurfer/5.3.0/mni/lib/perl5/5.8.5
setenv FMRI_ANALYSIS_DIR /usr/local/packages6/apps/binapps/freesurfer/5.3.0/fsfast
setenv FSF_OUTPUT_FORMAT nii.gz
setenv MINC_BIN_DIR /usr/local/packages6/apps/binapps/freesurfer/5.3.0/mni/bin
setenv SUBJECTS_DIR /usr/local/packages6/apps/binapps/freesurfer/5.3.0/subjects


prepend-path PATH /usr/local/packages6/apps/binapps/freesurfer/5.3.0/fsfast/bin:/usr/local/packages6/

setenv FUNCTIONALS_DIR /usr/local/packages6/apps/binapps/freesurfer/5.3.0/sessions


setenv MINC_LIB_DIR /usr/local/packages6/apps/binapps/freesurfer/5.3.0/mni/lib
setenv MNI_DIR /usr/local/packages6/apps/binapps/freesurfer/5.3.0/mni
#setenv FIX_VERTEX_AREA #How do you set this to the empty string? This was done in the original scrip
setenv MNI_DATAPATH /usr/local/packages6/apps/binapps/freesurfer/5.3.0/mni/data

gemini

gemini
Versions 0.18.0
URL http://gemini.readthedocs.org/en/latest/

GEMINI (GEnome MINIng) is a flexible framework for exploring genetic variation in the context of the wealth of
genome annotations available for the human genome. By placing genetic variants, sample phenotypes and genotypes,
as well as genome annotations into an integrated database framework, GEMINI provides a simple, flexible, and
powerful system for exploring genetic variation for disease and population genetics.

Making gemini available The following module command makes the latest version of gemini available to your
session
module load apps/gcc/4.4.7/gemini

Alternatively, you can make a specific version available


module load apps/gcc/4.4.7/gemini/0.18
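
As a brief illustration, gemini loads an annotated VCF file into a database and then queries it; the file names below are hypothetical and the VCF is assumed to have been annotated with snpEff.

#Load a hypothetical, snpEff-annotated VCF file and pedigree into a new database
gemini load -v my_variants.vcf -p my_samples.ped -t snpEff my_study.db

#Query the resulting database
gemini query -q "select chrom, start, end, ref, alt from variants" my_study.db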

Installation notes These are primarily for system administrators.


• gemini was installed using the gcc 4.4.7 compiler
• install_gemini.sh

Testing The test script used was


• test_gemini.sh
The full output from the test run is on the system at /usr/local/packages6/apps/gcc/4.4.7/gemini/0.18/test_results/
There was one failure
effstring.t02...\c
ok
effstring.t03...\c
15d14
< gene impact impact_severity biotype is_exonic is_coding is_lof
64a64
> gene impact impact_severity biotype is_exonic is_coding is_lof
fail
updated 10 variants
annotate-tool.t1 ... ok


updated 10 variants
annotate-tool.t2 ... ok

This was reported to the developers, who indicated that it was nothing to worry about.

Modulefile The module file is on the system at /usr/local/modulefiles/apps/gcc/4.4.7/gemini/0.18


The contents of the module file are
#%Module1.0#####################################################################
##
## Gemini module file
##

## Module file logging


source /usr/local/etc/module_logging.tcl
##

proc ModulesHelp { } {
global bedtools-version

puts stderr " Adds `gemini-$geminiversion' to your PATH environment variable"


}

set geminiversion 0.18

prepend-path PATH /usr/local/packages6/apps/gcc/4.4.7/gemini/0.18/bin/

GROMACS

GROMACS
Latest version 5.1
Dependencies mpi/intel/openmpi/1.10.0
URL http://www.gromacs.org/

Interactive Usage After connecting to iceberg (see Connect to iceberg), start an interactive session with the qsh or
qrsh command. To make GROMACS available in this session, run the following command:
source /usr/local/packages6/apps/intel/15/gromacs/5.1/bin/GMXRC
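
A minimal sketch of running a simulation interactively is shown below. The gmx_mpi binary name is an assumption based on the MPI (-DGMX_MPI=on) build described in the installation notes below, and topol.tpr is a hypothetical, pre-prepared run input file.

source /usr/local/packages6/apps/intel/15/gromacs/5.1/bin/GMXRC

#Run a simulation from a hypothetical run input file
gmx_mpi mdrun -s topol.tpr -deffnm my_run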

Installation notes

Version 5.1 Compilation Choices:


• Use latest intel compilers
• -DGMX_MPI=on
• -DCMAKE_PREFIX_PATH=/usr/local/packages6/apps/intel/15/gromacs
• -DGMX_FFT_LIBRARY=”fftw3”
• -DGMX_BUILD_OWN_FFTW=ON


The script used to build gromacs can be found here.

htop

htop
Versions 2.0
URL http://hisham.hm/htop/

This is htop, an interactive process viewer for Unix systems.

Interactive Usage After connecting to Iceberg (see Connect to iceberg), start an interactive session with the qrsh or
qsh command.
The latest version of htop (currently 2.0) is made available with the command
module load apps/gcc/4.4.7/htop

Alternatively, you can load a specific version with


module load apps/gcc/4.4.7/htop/2.0

This command makes the htop binary available to your session.

Installation notes htop was installed using gcc 4.4.7


tar -xvzf ./htop-2.0.0.tar.gz
cd htop-2.0.0
mkdir -p /usr/local/packages6/apps/gcc/4.4.7/htop/2.0
./configure --prefix=/usr/local/packages6/apps/gcc/4.4.7/htop/2.0
make
make install

Testing No test suite was found.

Modulefile
• The module file is on the system at /usr/local/modulefiles/apps/gcc/4.4.7/htop/2.0
• The module file is on github.

IDL

IDL
Version 8.5
Support Level extras
Dependencies java
URL http://www.exelisvis.co.uk/ProductsServices/IDL.aspx
Documentation http://www.exelisvis.com/docs/using_idl_home.html


IDL is a data analysis language that first appeared in 1977.

Usage If you wish to use the IDLDE then you may need to request more memory for the interactive session using
something like qsh -l mem=8G.
IDL can be activated using the module file:
module load apps/idl/8.5

then run using idl or idlde for the interactive development environment.
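
For non-interactive use, a statement can be passed to IDL on the command line. A minimal sketch, assuming IDL's standard -e switch:

idl -e "print, 2 + 2"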

Installation notes Extract the supplied linux x86-64 tar file, run the install shell script and point it at the install
directory.

Integrative Genomics Viewer (IGV)

Integrative Genomics Viewer (IGV)


Version 2.3.63
URL https://www.broadinstitute.org/igv/

The Integrative Genomics Viewer (IGV) is a high-performance visualization tool for interactive exploration of large,
integrated genomic datasets. It supports a wide variety of data types, including array-based and next-generation
sequence data, and genomic annotations.

Interactive Usage After connecting to Iceberg (see Connect to iceberg), start a graphical interactive session with
the qsh command. In testing, we determined that you need to request at least 7 Gigabytes of memory to launch IGV
qsh -l mem=7G

The latest version of IGV (currently 2.3.63) is made available with the command
module load apps/binapps/igv

Alternatively, you can load a specific version with


module load apps/binapps/igv/2.3.63

This command makes the IGV binary directory available to your session by adding it to the PATH environment
variable. Launch the application with the command
igv.sh

Documentation The IGV user guide is available online at https://www.broadinstitute.org/igv/UserGuide

Installation notes A binary install was used


unzip IGV_2.3.63.zip
mkdir -p /usr/local/packages6/apps/binapps/IGV
mv ./IGV_2.3.63 /usr/local/packages6/apps/binapps/IGV/2.3.63
rm /usr/local/packages6/apps/binapps/IGV/2.3.63/igv.bat


Modulefile
• The module file is on the system at /usr/local/modulefiles/apps/binapps/igv/2.3.63
Its contents
#%Module1.0#####################################################################
##
## IGV 2.3.63 modulefile
##

## Module file logging


source /usr/local/etc/module_logging.tcl
##

proc ModulesHelp { } {
puts stderr "Makes IGV 2.3.63 available"
}

set version 2.3.63


set IGV_DIR /usr/local/packages6/apps/binapps/IGV/$version

module-whatis "Makes IGV 2.3.63 available"

prepend-path PATH $IGV_DIR

ImageJ

ImageJ
Version 1.50g
URL http://imagej.nih.gov/ij/

ImageJ is a public domain, Java-based image processing program developed at the National Institutes of Health.

Interactive Usage After connecting to Iceberg (see Connect to iceberg), start an interactive session with the qrshx
command.
The latest version of ImageJ (currently 1.50g) is made available with the command
module load apps/binapps/imagej

Alternatively, you can load a specific version with


module load apps/binapps/imagej/1.50g

This module command also changes the environment to use Java 1.8 and Java 3D 1.5.2. An environment variable called
IMAGEJ_DIR is created by the module command that contains the path to the requested version of ImageJ.
You can start ImageJ, with default settings, with the command
imagej

This starts imagej with 512Mb of Java heap space. To request more, you need to run the Java .jar file directly and use
the -Xmx Java flag. For example, to request 1 Gigabyte


java -Xmx1G -Dplugins.dir=$IMAGEJ_DIR/plugins/ -jar $IMAGEJ_DIR/ij.jar

You will need to ensure that you’ve requested enough virtual memory from the scheduler to support the above request.
For more details, please refer to the Virtual Memory section of our Java documentation.
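
ImageJ can also run a macro without its GUI via the -batch option described in the ImageJ documentation. A minimal sketch, where mymacro.ijm is a hypothetical macro file:

java -Xmx1G -Dplugins.dir=$IMAGEJ_DIR/plugins/ -jar $IMAGEJ_DIR/ij.jar -batch mymacro.ijm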

Documentation Links to documentation can be found at http://imagej.nih.gov/ij/download.html

Installation notes Install version 1.49 using the install_imagej.sh script


Run the GUI and update to the latest version by clicking on Help > Update ImageJ
Rename the run script to imagej since run is too generic.
Modify the imagej script so that it reads
java -Xmx512m -jar -Dplugins.dir=/usr/local/packages6/apps/binapps/imagej/1.50g/plugins/ /usr/local/p

ImageJ’s 3D plugin requires Java 3D which needs to be included in the JRE used by imagej.
I attempted a module-controlled version of Java3D but the plugin refused to recognise it. See
https://github.com/rcgsheffield/iceberg_software/issues/279 for details.
The plug-in will install Java3D within the JRE for you if you launch it and click Plugins>3D>3D Viewer

Modulefile
• The module file is on the system at /usr/local/modulefiles/apps/binapps/imagej/1.50g
Its contents are
#%Module1.0#####################################################################
##
## ImageJ 1.50g modulefile
##

## Module file logging


source /usr/local/etc/module_logging.tcl
##

module load apps/java/1.8u71

proc ModulesHelp { } {
puts stderr "Makes ImageJ 1.50g available"
}

set version 1.50g


set IMAGEJ_DIR /usr/local/packages6/apps/binapps/imagej/$version

module-whatis "Makes ImageJ 1.50g available"

prepend-path PATH $IMAGEJ_DIR


prepend-path CLASSPATH $IMAGEJ_DIR/
prepend-path IMAGEJ_DIR $IMAGEJ_DIR

R (Intel Build)


R
Dependencies BLAS
URL http://www.r-project.org/
Documentation http://www.r-project.org/

R is a statistical computing language. This version of R is built using the Intel Compilers and the Intel Math Kernel
Library. This combination can result in significantly faster runtimes in certain circumstances.
Most R extensions are written and tested for the gcc suite of compilers so it is recommended that you perform testing
before switching to this version of R.

CPU Architecture The Intel build of R makes use of CPU instructions that are only present on the most modern of
Iceberg’s nodes. In order to use them, you should add the following to your batch submission scripts
#$ -l arch=intel-e5-2650v2

If you do not do this, you will receive the following error message when you try to run R
Please verify that both the operating system and the processor support Intel(R) F16C and AVX instruct

Loading the modules After connecting to iceberg (see Connect to iceberg), start an interactive session with the
qrshx command.
There are two Intel builds of R, sequential and parallel. The sequential build makes use of one CPU core and can
be used as a drop-in replacement for the standard version of R installed on Iceberg.
module load apps/intel/15/R/3.3.1_sequential

The parallel version makes use of multiple CPU cores for certain linear algebra routines since it is linked to the parallel
version of the Intel MKL. Note that only linear algebra routines are automatically parallelised.
module load apps/intel/15/R/3.3.1_parallel

When using the parallel module, you must also ensure that you set the bash environment variable
OMP_NUM_THREADS to the number of cores you require and also use the openmp parallel environment. E.g.
Add the following to your submission script
#$ -pe openmp 8
export OMP_NUM_THREADS=8

module load apps/intel/15/R/3.3.1_parallel

Example batch jobs Here is how to run the R script called linear_algebra_bench.r from the HPC Examples github
repository
#!/bin/bash
#This script runs the linear algebra benchmark multiple times using the intel-compiled version of R
#that's linked with the sequential MKL
#$ -l mem=8G -l rmem=8G
# Target the Ivy Bridge Processors
#$ -l arch=intel-e5-2650v2

module load apps/intel/15/R/3.3.1_sequential


echo "Intel R with sequential MKL on intel-e5-2650v2"


Rscript linear_algebra_bench.r output_data.rds

Here is how to run the same code using 8 cores


#!/bin/bash
#$ -l mem=3G -l rmem=3G # Memory per core
# Target the Ivy Bridge Processors
#$ -l arch=intel-e5-2650v2
#$ -pe openmp 8
export OMP_NUM_THREADS=8

module load apps/intel/15/R/3.3.1_parallel

echo "Intel R with parallel MKL on intel-e5-2650v2"


echo "8 cores"
Rscript linear_algebra_bench.r 8core_output_data.rds

Installing additional packages By default, the standard version of R allows you to install packages into the location
~/R/x86_64-unknown-linux-gnu-library/3.3/, where ~ refers to your home directory.
To ensure that the Intel builds do not contaminate the standard gcc builds, the Intel R module files set the environment
variable R_LIBS_USER to point to ~/R/intel_R/3.3.1
As a user, you should not need to worry about this detail; just install packages as you usually would from within R.
e.g.
install.packages("dplyr")

The Intel build of R will ignore any packages installed in your home directory for the standard version of R and vice
versa

Installation Notes These notes are primarily for administrators of the system.
version 3.3.1
This was a scripted install. It was compiled from source with Intel Compiler 15.0.3 and with --enable-R-shlib enabled.
It was run in batch mode.
This build required several external modules including xz utils, curl, bzip2 and zlib
• install_intel_r_sequential.sh Downloads, compiles, tests and installs R 3.3.1 using Intel Compilers and
the sequential MKL. The install and test logs are at /usr/local/packages6/apps/intel/15/R/sequential-
3.3.1/install_logs/
• install_intel_r_parallel.sh Downloads, compiles, tests and installs R 3.3.1 using Intel Compilers and the parallel
MKL. The install and test logs are at /usr/local/packages6/apps/intel/15/R/sequential-3.3.1/install_logs/
• 3.3.1_parallel Parallel Module File
• 3.3.1_sequential Sequential Module File

JAGS


JAGS
Latest version 4.2
URL http://mcmc-jags.sourceforge.net/

JAGS is Just Another Gibbs Sampler. It is a program for analysis of Bayesian hierarchical models using Markov Chain
Monte Carlo (MCMC) simulation not wholly unlike BUGS. JAGS was written with three aims in mind:
• To have a cross-platform engine for the BUGS language
• To be extensible, allowing users to write their own functions, distributions and samplers.
• To be a platform for experimentation with ideas in Bayesian modeling

Interactive Usage After connecting to Iceberg (see Connect to iceberg), start an interactive session with the qrshx
or qrsh command. To make JAGS available in this session, run one of the following module commands
module load apps/gcc/4.8.2/JAGS/4.2
module load apps/gcc/4.8.2/JAGS/3.4
module load apps/gcc/4.8.2/JAGS/3.1

You can now run the jags command


jags
Welcome to JAGS 4.2.0 on Thu Jun 30 09:21:17 2016
JAGS is free software and comes with ABSOLUTELY NO WARRANTY
Loading module: basemod: ok
Loading module: bugs: ok
.

The rjags and runjags interfaces in R rjags and runjags are CRAN packages that provide an R interface to jags.
They are not installed in R by default.
After connecting to iceberg (see Connect to iceberg), start an interactive session with the qrshx command. Run the
following module commands
module load compilers/gcc/4.8.2
module load apps/gcc/4.8.2/JAGS/4.2
module load apps/R/3.3.0

Launch R by typing R and pressing return. Within R, execute the following commands
install.packages('rjags')
install.packages('runjags')

and follow the on-screen instructions. Answer y to any questions about the creation of a personal library should they
be asked.
The packages will be stored in a directory called R within your home directory.
You should only need to run the install.packages commands once. When you log into the system in future,
you will only need to run the module commands above to make JAGS available to the system.
You load the rjags and runjags packages in the same way as you would any other R package
library('rjags')
library('runjags')


If you receive an error message such as


Error : .onLoad failed in loadNamespace() for 'rjags', details:
call: dyn.load(file, DLLpath = DLLpath, ...)
error: unable to load shared object '/home/fe1mpc/R/x86_64-unknown-linux-gnu-library/3.2/rjags/libs
libjags.so.3: cannot open shared object file: No such file or directory
Error: package or namespace load failed for 'rjags'

the most likely cause is that you forgot to load the necessary modules before starting R.

Installation notes Version 4.2


JAGS 4.2 was built with gcc 4.8.2
• Install script on github - https://github.com/mikecroucher/HPC_Installers/blob/master/apps/jags/4.2.0/sheffield/iceberg/install_jags
• Module file on github - https://github.com/mikecroucher/HPC_Installers/blob/master/apps/jags/4.2.0/sheffield/iceberg/4.2
Version 3.4
JAGS 3.4 was built with gcc 4.8.2
module load compilers/gcc/4.8.2
tar -xvzf ./JAGS-3.4.0.tar.gz
cd JAGS-3.4.0
mkdir -p /usr/local/packages6/apps/gcc/4.8.2/JAGS/3.4
./configure --prefix=/usr/local/packages6/apps/gcc/4.8.2/JAGS/3.4
make
make install

Version 3.1
JAGS 3.1 was built with gcc 4.8.2
module load compilers/gcc/4.8.2
tar -xvzf ./JAGS-3.1.0.tar.gz
cd JAGS-3.1.0
mkdir -p /usr/local/packages6/apps/gcc/4.8.2/JAGS/3.1
./configure --prefix=/usr/local/packages6/apps/gcc/4.8.2/JAGS/3.1
make
make install

Java

Java
Latest Version 1.8u71
URL https://www.java.com/en/download/

Java is a programming language and computing platform first released by Sun Microsystems in 1995.

Interactive Usage After connecting to Iceberg (see Connect to iceberg), start an interactive session with the qrsh or
qsh command.
The latest version of Java (currently 1.8u71) is made available with the command


module load apps/java

Alternatively, you can load a specific version with one of


module load apps/java/1.8u71
module load apps/java/1.7.0u55
module load apps/java/1.7
module load apps/java/1.6

Check that you have the version you expect. First, the runtime
java -version

java version "1.8.0_71"


Java(TM) SE Runtime Environment (build 1.8.0_71-b15)
Java HotSpot(TM) 64-Bit Server VM (build 25.71-b15, mixed mode)

Now, the compiler


javac -version

javac 1.8.0_71

Virtual Memory By default, Java requests a lot of virtual memory on startup. This is usually a given fraction of the
physical memory on a node, which can be quite a lot on Iceberg. This then exceeds a user’s virtual memory limit set
by the scheduler and causes a job to fail.
To fix this, we have created a wrapper script to Java that uses the -Xmx1G switch to force Java to only request one
Gigabyte of memory for its heap size. If this is insufficient, you are free to allocate as much memory as you require
but be sure to request enough from the scheduler as well. You’ll typically need to request more virtual memory from
the scheduler than you specify in the Java -Xmx switch.
For example, consider the following submission script. Note that it was necessary to request 9 Gigabytes of memory
from the scheduler even though we only allocated 5 Gigabytes heap size in Java. The requirement for 9 gigabytes was
determined empirically
#!/bin/bash
#Request 9 gigabytes of real memory from the scheduler (mem)
#and 9 gigabytes of virtual memory from the scheduler (mem)
#$ -l mem=9G -l rmem=9G

# load the Java module


module load apps/java/1.8u71

#Run java program allocating 5 Gigabytes


java -Xmx5G HelloWorld

Installation notes These are primarily for administrators of the system.


Unzip and copy the install directory to /usr/local/packages6/apps/binapps/java/jdk1.8.0_71/
To fix the virtual memory issue described above, we use a wrapper around the java install that sets Java’s Xmx
parameter to a reasonable value.
Create the file /usr/local/packages6/apps/binapps/java/jdk1.8.0_71/shef/java with contents
#!/bin/bash
#


# Java version 1.8 cannot be invoked without specifying the java virtual
# machine size due to the limitations imposed by us via SGE on memory usage.
# Therefore this script intercepts the java invocations and adds a
# memory constraint parameter to java engine unless there was one already
# specified on the command parameter.
#
#
if test -z "`echo $* | grep -e -Xmx`"; then
# user has not specified -Xmx memory requirement flag, so add it.
/usr/local/packages6/apps/binapps/java/jdk1.8.0_71/bin/java -Xmx1G $*
else
# user specified the -Xmx flag, so don't add it.
/usr/local/packages6/apps/binapps/java/jdk1.8.0_71/bin/java $*
fi

The module file is at /usr/local/modulefiles/apps/java/1.8u71. Its contents are


#%Module10.2#####################################################################

## Module file logging


source /usr/local/etc/module_logging.tcl
##

proc ModulesHelp { } {
global helpmsg
puts stderr "\t$helpmsg\n"
}

set version 1.8

set javahome /usr/local/packages6/apps/binapps/java/jdk1.8.0_71/

if [ file isdirectory $javahome/bin ] {


module-whatis "Sets JAVA to version $version"
set helpmsg "Changes the default version of Java to Version $version"
# bring in new version
setenv JAVA_HOME $javahome
prepend-path PATH $javahome/bin
prepend-path PATH $javahome/shef
prepend-path MANPATH $javahome/man
} else {
module-whatis "JAVA $version not installed"
set helpmsg "JAVA $version not installed"
if [ expr [ module-info mode load ] || [ module-info mode display ] ] {
# bring in new version
puts stderr "JAVA $version not installed on [uname nodename]"
}
}

Julia


Julia
Versions 0.5.0-rc3
URL http://julialang.org/

Julia is a high-level, high-performance dynamic programming language for technical computing, with syntax that is
familiar to users of other technical computing environments.

Interactive Usage After connecting to iceberg (see Connect to iceberg), start an interactive session with either the
qrsh or qrshx command.
The latest version of Julia (currently 0.5.0-rc3) is made available with the commands
module load compilers/gcc/5.2
module load apps/gcc/5.2/julia/0.5.0-rc3

This adds Julia to your PATH and also loads the gcc 5.2 compiler environment with which Julia was built.
Start Julia by executing the command
julia

You can exit a Julia session with the quit() function.
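
Julia scripts can also be run as batch jobs. Below is a minimal sketch of a submission script; the script name and memory request are illustrative.

#!/bin/bash
#Request 4 gigabytes of real and virtual memory
#$ -l mem=4G -l rmem=4G

module load compilers/gcc/5.2
module load apps/gcc/5.2/julia/0.5.0-rc3

#my_script.jl is a hypothetical Julia script
julia my_script.jl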

Installation notes These are primarily for administrators of the system.


Julia was installed using gcc 5.2
module load apps/gcc/5.2/git/2.5
module load compilers/gcc/5.2
module load apps/python/anaconda2-2.5.0

git clone git://github.com/JuliaLang/julia.git


cd julia
git checkout release-0.5
#The next line targets the NEHALEM CPU architecture. This is the lowest architecture available on
#Iceberg and so the resulting binary will be supported on all nodes. Performance will not be as good as it
#could be on modern nodes.
sed -i s/OPENBLAS_TARGET_ARCH:=/OPENBLAS_TARGET_ARCH:=NEHALEM/ ./Make.inc
make

Module File The module file is as below


#%Module10.2####################################################################
#

## Module file logging


source /usr/local/etc/module_logging.tcl
##

proc ModulesHelp { } {
global ver

puts stderr " Adds Julia $ver to your environment variables."


}

# Mathematica version (not in the user's environment)


set ver 0.5.0-rc3

module-whatis "sets the necessary Julia $ver paths"

prepend-path PATH /usr/local/packages6/apps/gcc/5.2/julia/0.5.0-rc3

Knime

Knime
URL https://www.knime.org
Documentation http://tech.knime.org/documentation

KNIME Analytics Platform is an open solution for data-driven innovation, helping you discover the potential hidden
in your data, mine for fresh insights, or predict new futures.

Interactive Usage After connecting to iceberg (see Connect to iceberg), start an interactive session with the qrshx
command.
The latest version of Knime including all of the extensions can be loaded with
module load apps/knime

Alternatively, you can load a specific version of Knime excluding all of the extensions using one of the following
module load apps/knime/3.1.2

With extensions
module load apps/knime/3.1.2ext

Knime can then be run with


$ knime

Batch usage There is a command line option allowing the user to run KNIME in batch mode:
knime -application org.knime.product.KNIME_BATCH_APPLICATION -nosplash -workflowDir="<path>"

• -application org.knime.product.KNIME_BATCH_APPLICATION launches the KNIME batch application.


• -nosplash does not show the initial splash window.
• -workflowDir=”<path>” provides the path of the directory containing the workflow.
Full list of options:
• -nosave do not save the workflow after execution has finished
• -reset reset workflow prior to execution
• -failonloaderror don’t execute if there are errors during workflow loading
• -updateLinks update meta node links to latest version
• -credential=name[;login[;password]] for each credential enter credential name and optional login/password


• -masterkey[=...] prompt for master password (used in e.g. database nodes); if provided with an argument, use the
argument instead of prompting
• -preferences=... path to the file containing eclipse/knime preferences,
• -workflowFile=... ZIP file with a ready-to-execute workflow in the root of the ZIP
• -workflowDir=... directory with a ready-to-execute workflow
• -destFile=... ZIP file where the executed workflow should be written to if omitted the workflow is only saved in
place
• -destDir=... directory where the executed workflow is saved to if omitted the workflow is only saved in place
• -workflow.variable=name,value,type define or overwrite workflow variable 'name' with value 'value' (possibly
enclosed by quotes). The 'type' must be one of “String”, “int” or “double”.
The following return codes are defined:
• 0 upon successful execution
• 2 if parameters are wrong or missing
• 3 when an error occurs during loading a workflow
• 4 if an error during execution occurred
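
Putting these options together, a complete batch invocation might look like the sketch below; the workflow directory path is hypothetical.

knime -application org.knime.product.KNIME_BATCH_APPLICATION -nosplash -reset -workflowDir="$HOME/knime-workspace/my_workflow"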

Installation Notes These notes are primarily for administrators of the system.
Version 3.1.2 without extensions
• Download with wget https://download.knime.org/analytics-platform/linux/knime_3.1.2.linux.gtk.x86_64.tar.gz
• Move to /usr/local/extras/knime_analytics/3.1.2
• Unzip tar -xvzf knime_3.1.2.linux.gtk.x86_64.tar.gz
The modulefile is at /usr/local/extras/modulefiles/apps/knime/3.1.2 and contains
#%Module1.0

proc ModulesHelp { } {
puts stderr " Adds KNIME to your PATH environment variable and necessary libraries"
}

prepend-path PATH /usr/local/extras/knime_analytics/3.1.2

Version 3.1.2 with extensions


• Download with wget https://download.knime.org/analytics-platform/linux/knime-full_3.1.2.linux.gtk.x86_64.tar.gz
• Move to /usr/local/extras/knime_analytics/3.1.2ext
• Unzip tar -xvzf knime-full_3.1.2.linux.gtk.x86_64.tar.gz
The modulefile is at /usr/local/extras/modulefiles/apps/knime/3.1.2ext and contains
#%Module1.0

proc ModulesHelp { } {
puts stderr " Adds KNIME to your PATH environment variable and necessary libraries"


prepend-path PATH /usr/local/extras/knime_analytics/3.1.2ext

Lua

Lua
URL https://www.lua.org/
Documentation https://www.lua.org/docs.html

Lua is a powerful, efficient, lightweight, embeddable scripting language. It supports procedural programming, object-
oriented programming, functional programming, data-driven programming, and data description.

Interactive Usage After connecting to iceberg (see Connect to iceberg), start an interactive session with the qrshx
command.
Lua can be loaded with
module load apps/lua/5.3.3

Lua can then be run with


$ lua
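
A statement can also be passed directly on the command line, or a script file can be executed; the script name below is hypothetical.

#Run a statement directly
lua -e 'print("Hello from Lua")'

#Run a hypothetical script file
lua my_script.lua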

Installation Notes These notes are primarily for administrators of the system.
Version 5.3.3
• curl -R -O http://www.lua.org/ftp/lua-5.3.3.tar.gz
• tar zxf lua-5.3.3.tar.gz
• cd lua-5.3.3
• make linux test
The modulefile is at /usr/local/extras/modulefiles/apps/lua/5.3.3 and contains
#%Module1.0

proc ModulesHelp { } {
puts stderr " Adds Lua to your PATH environment variable and necessary libraries"
}

prepend-path PATH /usr/local/extras/Lua/lua-5.3.3/src

Maple


Maple
Latest Version 2016
URL http://www.maplesoft.com/products/maple/

Scientific Computing and Visualisation

Interactive Usage After connecting to iceberg (see Connect to iceberg), start an interactive session with the qrshx
command.
The latest version of Maple (currently 2016) is made available with the command
module load apps/binapps/maple

Alternatively, you can load a specific version with


module load apps/binapps/maple/2016
module load apps/binapps/maple/2015

You can then run the graphical version of Maple by entering xmaple or the command line version by entering maple.

Batch usage It is not possible to run Maple worksheets in batch mode. Instead, you must convert your worksheet to
a pure text file that contains a set of maple input commands. You can do this in Maple by opening your worksheet and
clicking on File->Export As->Maple Input. The result will have the file extension .mpl
An example Sun Grid Engine submission script that makes use of a .mpl file called, for example, mycode.mpl is
#!/bin/bash
# Request 4 gigabytes of real memory (mem)
# and 4 gigabytes of virtual memory (mem)
#$ -l mem=4G -l rmem=4G

module load apps/binapps/maple/2015

maple < mycode.mpl

For general information on how to submit batch jobs refer to Running Batch Jobs on iceberg

Tutorials
• High Performance Computing with Maple A tutorial from the Sheffield Research Software Engineering group
on how to use Maple in a High Performance Computing environment

Installation notes These are primarily for administrators of the system.


Maple 2016
• Run the installer Maple2016.1LinuxX64Installer.run and follow instructions.
• Choose Install folder /usr/local/packages6/apps/binapps/maple2016
• Do not configure MATLAB
• Choose a network license. Details on CiCS internal wiki.
• Uncheck ‘Enable periodic checking for Maple 2016 updates’
• Check ‘Check for updates now’


The module file is at /usr/local/modulefiles/apps/binapps/maple/2016


#%Module10.2#####################################################################

## Module file logging


source /usr/local/etc/module_logging.tcl
##

proc ModulesHelp { } {
global ver

puts stderr " Makes Maple $ver available to the system."


}

# Maple version (not in the user's environment)


set ver 2016

module-whatis "sets the necessary Maple $ver paths"

prepend-path PATH /usr/local/packages6/apps/binapps/maple2016/bin

Maple 2015
The module file is at /usr/local/modulefiles/apps/binapps/maple/2015
#%Module10.2####################################################################
#

## Module file logging


source /usr/local/etc/module_logging.tcl
##

proc ModulesHelp { } {
global ver

puts stderr " Makes Maple $ver available to the system."


}

# Maple version (not in the user's environment)


set ver 2015

module-whatis "sets the necessary Maple $ver paths"

prepend-path PATH /usr/local/packages6/maple/bin/

Mathematica

Wolfram Mathematica
Dependencies None
URL http://www.wolfram.com/mathematica/
Latest version 10.3.1

Mathematica is a technical computing environment and programming language with strong symbolic and numerical
abilities.


Single Core Interactive Usage After connecting to iceberg (see Connect to iceberg), start an interactive session
with qsh.
The latest version of Mathematica can be loaded with
module load apps/binapps/mathematica

Alternatively, you can load a specific version of Mathematica using


module load apps/binapps/mathematica/10.3.1
module load apps/binapps/mathematica/10.2

Mathematica can then be started with the mathematica command


mathematica

Multicore Interactive Usage Mathematica has extensive parallel functionality. To use it, you should request a
parallel interactive session. For example, to request 4 cores
qsh -pe openmp 4

Load and launch Mathematica


module load apps/binapps/mathematica
mathematica

In Mathematica, let’s time how long it takes to calculate the first 20 million primes on 1 CPU core
AbsoluteTiming[primelist = Table[Prime[k], {k, 1, 20000000}];]

When I tried this, I got 78 seconds. Your results may vary greatly. Now, let’s launch 4 ParallelKernels and redo the
calculation in parallel
LaunchKernels[4]
AbsoluteTiming[primelist =
ParallelTable[Prime[k], {k, 1, 20000000},
Method -> "CoarsestGrained"];]

When I tried this, I got 29 seconds, around 2.7 times faster. This illustrates a couple of points:
• You should always request the same number of kernels as you requested in your qsh command (in this case, 4).
If you request more, you will damage performance for yourself and other users of the system.
• N kernels doesn’t always translate to N times faster.

Batch Submission Unfortunately, it is not possible to run Mathematica notebook .nb files directly in batch. Instead,
you need to create a simple text file that contains all of the Mathematica commands you want to execute. Typically,
such files are given the extension .m. Let’s run the following simple Mathematica script as a batch job.
(*Find out what version of Mathematica this machine is running*)
Print["Mathematica version is " <> $Version]

(*Find out the name of the machine we are running on*)


Print["The machine name is " <> $MachineName]

(*Do a calculation*)
Print["The result of the integral is "]
Print [ Integrate[Sin[x]^2, x]]


Copy and paste the above into a text file called very_simple_mathematica.m
An example batch submission script for this file is
#!/bin/bash
# Request 4 gigabytes of real memory (rmem)
# and 4 gigabytes of virtual memory (mem)
#$ -l mem=4G -l rmem=4G

module load apps/binapps/mathematica/10.3.1

math -script very_simple_mathematica.m

Copy and paste the above into a file called run_job.sh and submit with
qsub run_job.sh

Once the job has successfully completed, the output will be in a file named like run_job.sh.o396699. The number
at the end refers to the job-ID given to this job by the system and will be different for you. Let’s take a look at the
contents of this file
more run_job.sh.o396699

Mathematica version is 10.2.0 for Linux x86 (64-bit) (July 28, 2015)
The machine name is node131
The result of the integral is
x/2 - Sin[2*x]/4

Installation notes These are primarily for administrators of the system


For Version 10.3.1
mkdir -p /usr/local/packages6/apps/binapps/mathematica/10.3.1
chmod +x ./Mathematica_10.3.1_LINUX.sh
./Mathematica_10.3.1_LINUX.sh

The installer is interactive. Here’s the session output


--------------------------------------------------------------------------------
Wolfram Mathematica 10.3 Installer
--------------------------------------------------------------------------------

Copyright (c) 1988-2015 Wolfram Research, Inc. All rights reserved.

WARNING: Wolfram Mathematica is protected by copyright law and international


treaties. Unauthorized reproduction or distribution may result in severe
civil and criminal penalties and will be prosecuted to the maximum extent
possible under law.

Enter the installation directory, or press ENTER to select


/usr/local/Wolfram/Mathematica/10.3:
> /usr/local/packages6/apps/binapps/mathematica/10.3.1

Now installing...

[*****************************************************************************]

Type the directory path in which the Wolfram Mathematica script(s) will be
created, or press ENTER to select /usr/local/bin:
> /usr/local/packages6/apps/binapps/mathematica/10.3.1/scripts


Create directory (y/n)?


> y

WARNING: No Avahi Daemon was detected so some Kernel Discovery features will
not be available. You can install Avahi Daemon using your distribution's
package management system.

For Red Hat based distributions, try running (as root):

yum install avahi

Installation complete.

Install the University network mathpass file at /usr/local/packages6/apps/binapps/mathematica/10.3.1/Config


For Version 10.2
mkdir -p /usr/local/packages6/apps/binapps/mathematica/10.2
chmod +x ./Mathematica_10.2.0_LINUX.sh
./Mathematica_10.2.0_LINUX.sh

The installer is interactive. Here’s the session output


-----------------------------------------------------------------------------------------------------
Wolfram Mathemati
-----------------------------------------------------------------------------------------------------

Copyright (c) 1988-2015 Wolfram Research, Inc. All rights reserved.

WARNING: Wolfram Mathematica is protected by copyright law and international treaties. Unauthorized r
prosecuted to the maximum extent possible under law.

Enter the installation directory, or press ENTER to select /usr/local/Wolfram/Mathematica/10.2:


>

Error: Cannot create directory /usr/local/Wolfram/Mathematica/10.2.

You may need to be logged in as root to continue with this installation.

Enter the installation directory, or press ENTER to select /usr/local/Wolfram/Mathematica/10.2:


> /usr/local/packages6/apps/binapps/mathematica/10.2

Now installing...

[****************************************************************************************************

Type the directory path in which the Wolfram Mathematica script(s) will be created, or press ENTER to
> /usr/local/packages6/apps/binapps/mathematica/10.2/scripts

Create directory (y/n)?


> y

WARNING: No Avahi Daemon was detected so some Kernel Discovery features will not be available. You ca

For Red Hat based distributions, try running (as root):

yum install avahi


Installation complete.

Remove the playerpass file


rm /usr/local/packages6/apps/binapps/mathematica/10.2/Configuration/Licensing/playerpass

Install the University network mathpass file at /usr/local/packages6/apps/binapps/mathematica/10.2/Configur

Modulefiles
• The 10.3.1 module file.
• The 10.2 module file.

MATLAB

MATLAB
Versions 2013a, 2013b, 2014a, 2015a, 2016a
Support Level FULL
Dependencies None
URL http://uk.mathworks.com/products/matlab
Local URL http://www.shef.ac.uk/wrgrid/software/matlab
Documentation http://uk.mathworks.com/help/matlab

Scientific Computing and Visualisation

Interactive Usage After connecting to iceberg (see Connect to iceberg), start an interactive session with the qsh
command.
The latest version of MATLAB (currently 2016a) is made available with the command
module load apps/matlab

Alternatively, you can load a specific version with one of the following commands
module load apps/matlab/2013a
module load apps/matlab/2013b
module load apps/matlab/2014a
module load apps/matlab/2015a
module load apps/matlab/2016a

You can then run MATLAB by entering matlab

Serial (one CPU) Batch usage Here, we assume that you wish to run the program hello.m on the system.
First, you need to write a batch submission file. We assume you’ll call this my_job.sge.
#!/bin/bash
#$ -l rmem=4G # Request 4 Gigabytes of real memory
#$ -l mem=16G # Request 16 Gigabytes of virtual memory
#$ -cwd # Run job from current directory
module load apps/matlab # Make latest version of MATLAB available


matlab -nodesktop -r 'hello'

Ensuring that hello.m and my_job.sge are both in your current working directory, submit your job to the batch system.
qsub my_job.sge

Some notes about this example:


• We are running the script hello.m but we drop the .m in the call to MATLAB. That is, we do -r ’hello’
rather than -r hello.m.
• All of the module commands introduced in the Interactive usage section will also work in batch mode. This
allows you to select a specific version of MATLAB if you wish.

Easy Way of Running MATLAB Jobs on the Batch Queue Firstly prepare a MATLAB script that contains all the
commands for running a MATLAB task. Let us assume that this script is called mymatlabwork.m. Next select the
version of MATLAB you wish to use by using the module load command, for example;
module load apps/matlab/2016a

Now submit a job that runs this MATLAB script as a batch job.
runmatlab mymatlabwork.m

That is all there is to it!


The runmatlab command can take a number of parameters to refine the control of your MATLAB batch job, such
as the maximum time and memory needs. To get a full listing of these parameters simply type runmatlab on iceberg
command line.

MATLAB Compiler and running free-standing compiled MATLAB programs The MATLAB compiler mcc is installed on iceberg and can be used to generate free-standing executables. Such executables can then be run on other computers that do not have MATLAB installed. We strongly recommend you use R2016a or later versions to take advantage of this feature.
To compile a MATLAB function or script for example called myscript.m the following steps are required.
module load apps/matlab/2016a # Load the matlab 2016a module
mcc -m myscript.m # Compile your program to generate the executable myscript and
# also generate a shell script named run_myscript.sh
./run_myscript.sh $MCRROOT # Finally run your program

If myscript.m is a MATLAB function that requires inputs, these can be supplied on the command line. For example, if the first line of myscript.m reads:
function out = myscript ( a , b , c )

then to run it with 1.0, 2.0, 3.0 as its parameters you will need to type:
./run_myscript.sh $MCRROOT 1.0 2.0 3.0

After a successful compilation and run, you can transfer your executable and the generated run script to another computer. That computer does not need to have MATLAB installed or licensed on it, but it does need the MATLAB runtime system installed. This can be done either by downloading the MATLAB runtime environment from the Mathworks web site or by copying the installer file from iceberg itself, which resides in:


/usr/local/packages6/matlab/R2016a/toolbox/compiler/deploy/glnxa64/MCRInstaller.zip

Unzip this file in a temporary area and run the setup script it contains to install the MATLAB runtime environment. Finally, set the environment variable $MCRROOT to the directory containing the runtime environment.

Parallel MATLAB on iceberg Currently we recommend the 2015a version of MATLAB for parallel work.
The default cluster configuration named local provides a parallel working environment by using the CPUs of the worker node that is running the current MATLAB session. Each iceberg worker node can run multiple users' jobs simultaneously, so depending on who else is using that node at the time, parallel MATLAB jobs can create contention between jobs and slow them considerably. It is therefore advisable to start parallel MATLAB jobs that will use the local profile from a parallel SGE job. For example, to use the local profile with 5 workers, do the following:
Start a parallel OpenMP job with 6 workers:
qsh -pe openmp 6

Run MATLAB in that session and select 5 workers:


matlab
parpool ('local' , 5 )

The above example will use 5 MATLAB workers on a single iceberg node to run a parallel task.
To take advantage of multiple iceberg nodes, you will need to make use of a parallel cluster profile named sge. This profile is imported by issuing a locally provided MATLAB command named iceberg, and it uses the SGE scheduler to run larger parallel jobs.
When using the sge profile, MATLAB will be able to submit multiple MATLAB jobs to the SGE scheduler from within MATLAB itself. However, each job will have the default resource requirements unless the following trick is deployed. For example, during your MATLAB session type:
global sge_params
sge_params='-l mem=16G -l h_rt=36:00:00'

to make sure that all the MATLAB batch jobs will use up to 16GBytes of memory and will not be killed unless they
exceed 36 hours of run time.

Training
• CiCS run an Introduction to Matlab course
• In November 2015, CiCS hosted a Parallel Computing in MATLAB Masterclass. The materials are available at
http://rcg.group.shef.ac.uk/courses/mathworks-parallelmatlab/

Installation notes These notes are primarily for system administrators.


Requires the floating license server licserv4.shef.ac.uk to serve the licenses. An install script named
installer_input.txt and associated files are downloadable from Mathworks site along with all the required
toolbox specific installation files.
The following steps are performed to install MATLAB on iceberg.
1. If necessary, update the floating license keys on licserv4.shef.ac.uk to ensure that the licenses are
served for the versions to install.
2. Log onto Mathworks site to download the MATLAB installer package for 64-bit Linux ( for R2016a this was
called matlab_R2016a_glnxa64.zip )


3. Unzip the installer package in a temporary directory: unzip matlab_R2016a_glnxa64.zip ( This will
create a few items including files named install and installer_input.txt)
4. Run the installer: ./install
5. Select install choice of Log in to Mathworks Account
6. Select Download only.
7. Select the offered default Download path ( this will be in your home area
$HOME/Downloads/MathWorks/... ) Note: This is the default download location that is later
used by the silent installer. Another option is to move all downloaded files to the same directory where install
script resides.
8. Finally run the installer using our customized installer_input.txt script as input ( ./install
-inputFile installer_input.txt ).
Installation should finish with exit status 0 if all has worked.
Note: A template installer_input file for 2016a is available at /usr/local/packages6/matlab directory named
2016a_installer_input.txt. This will need minor edits to install the next versions in the same way.

MrBayes

MrBayes
Version 3.2.6
URL https://sourceforge.net/projects/mrbayes/

MrBayes is a program for the Bayesian estimation of phylogeny.

Interactive Usage After connecting to Iceberg (see Connect to iceberg), start an interactive session with the qrshx
command.
The latest version of MrBayes (currently 3.2.6) is made available with the command
module load apps/gcc/4.4.7/mrbayes

Alternatively, you can load a specific version with


module load apps/gcc/4.4.7/mrbayes/3.2.6

This command makes the MrBayes mb binary available to your session.
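
Batch usage MrBayes analyses are often long running, so you may prefer to submit them as batch jobs. The following is a minimal sketch of an SGE submission script; mydata.nex is a hypothetical NEXUS file containing your data together with a mrbayes command block.

#!/bin/bash
# Request 8 gigabytes of real and virtual memory
#$ -l mem=8G -l rmem=8G

module load apps/gcc/4.4.7/mrbayes/3.2.6

# mydata.nex is a hypothetical input file; mb executes the mrbayes block it contains
mb mydata.nex

Submit with qsub as described in Running Batch Jobs on iceberg.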

Installation notes MrBayes was installed with gcc 4.4.7


tar -xvzf ./mrbayes-3.2.6.tar.gz
ls
cd mrbayes-3.2.6
autoconf
./configure --with-beagle=/usr/local/packages6/libs/gcc/4.4.7/beagle/2.1.2 --prefix=/usr/local/packag
make
make install


Modulefile
• The module file is on the system at /usr/local/modulefiles/apps/gcc/4.4.7/mrbayes/3.2.6
• The module file is on github.

Octave

Octave
Versions 4.0.0
URL https://www.gnu.org/software/octave/

GNU Octave is a high-level interpreted language, primarily intended for numerical computations. It provides capabilities for the numerical solution of linear and nonlinear problems, and for performing other numerical experiments. It also provides extensive graphics capabilities for data visualization and manipulation. Octave is normally used through its interactive command line interface, but it can also be used to write non-interactive programs. The Octave language is quite similar to MATLAB so that most programs are easily portable.

Interactive Usage After connecting to iceberg (see Connect to iceberg), start an interactive session with either the
qsh or qrsh commands.
The latest version of Octave (currently 4.0.0) is made available with the command
module load apps/gcc/5.2/octave

Alternatively, you can load a specific version with


module load apps/gcc/5.2/octave/4.0

This adds Octave to your PATH and also loads all of the supporting libraries and compilers required by the system.
Start Octave by executing the command
octave

If you are using a qsh session, the graphical user interface version will begin. If you are using a qrsh session, you will
only be able to use the text-only terminal version.

Batch Usage Here is an example batch submission script that will run an Octave program called foo.m
#!/bin/bash
# Request 5 gigabytes of real memory (rmem)
# and 5 gigabytes of virtual memory (mem)
#$ -l mem=5G -l rmem=5G
# Request 64 hours of run time
#$ -l h_rt=64:00:00

module load apps/gcc/5.2/octave/4.0

octave foo.m


Using Packages (Toolboxes) Octave toolboxes are referred to as packages. To see which ones are installed, use the
command ver from within Octave.
Unlike MATLAB, Octave does not load all of its packages at startup. It is necessary to load the package before its
commands are available to your session. For example, as with MATLAB, the pdist command is part of the statistics
package. Unlike MATLAB, pdist is not immediately available to you
octave:1> pdist([1 2 3; 2 3 4; 1 1 1])
warning: the 'pdist' function belongs to the statistics package from Octave
Forge which you have installed but not loaded. To load the package, run
`pkg load statistics' from the Octave prompt.

Please read `http://www.octave.org/missing.html' to learn how you can


contribute missing functionality.
warning: called from
__unimplemented__ at line 524 column 5
error: 'pdist' undefined near line 1 column 1

As the error message suggests, you need to load the statistics package
octave:1> pkg load statistics
octave:2> pdist([1 2 3; 2 3 4; 1 1 1])
ans =

1.7321 2.2361 3.7417

Installation notes These are primarily for administrators of the system.


Octave was installed using gcc 5.2 and the following libraries:
• Java 1.8u60
• FLTK 1.3.3
• fftw 3.3.4
• Octave was installed using a SGE batch job. The install script is on github
• The make log is on the system at /usr/local/packages6/apps/gcc/5.2/octave/4.0/make_octave4.0.0.log
• The configure log is on the system at /usr/local/packages6/apps/gcc/5.2/octave/4.0/configure_octave4.0.0.log
For full functionality, Octave requires a large number of additional libraries to be installed. We have currently not
installed all of these but will do so should they be required.
For information, here is the relevant part of the Configure log that describes how Octave was configured
Source directory: .
Installation prefix: /usr/local/packages6/apps/gcc/5.2/octave/4.0
C compiler: gcc -pthread -fopenmp -Wall -W -Wshadow -Wforma
t -Wpointer-arith -Wmissing-prototypes -Wstrict-prototypes -Wwrite-strings -Wcas
t-align -Wcast-qual -I/usr/local/packages6/compilers/gcc/5.2.0/include
C++ compiler: g++ -pthread -fopenmp -Wall -W -Wshadow -Wold-s
tyle-cast -Wformat -Wpointer-arith -Wwrite-strings -Wcast-align -Wcast-qual -g -
O2
Fortran compiler: gfortran -O
Fortran libraries: -L/usr/local/packages6/compilers/gcc/5.2.0/lib -
L/usr/local/packages6/compilers/gcc/5.2.0/lib64 -L/usr/local/packages6/compilers
/gcc/5.2.0/lib/gcc/x86_64-unknown-linux-gnu/5.2.0 -L/usr/local/packages6/compile
rs/gcc/5.2.0/lib/gcc/x86_64-unknown-linux-gnu/5.2.0/../../../../lib64 -L/lib/../
lib64 -L/usr/lib/../lib64 -L/usr/local/packages6/libs/gcc/5.2/fftw/3.3.4/lib -L/
usr/local/packages6/libs/gcc/5.2/fltk/1.3.3/lib -L/usr/local/packages6/compilers


/gcc/5.2.0/lib/gcc/x86_64-unknown-linux-gnu/5.2.0/../../.. -lgfortran -lm -lquad


math
Lex libraries:
LIBS: -lutil -lm

AMD CPPFLAGS:
AMD LDFLAGS:
AMD libraries:
ARPACK CPPFLAGS:
ARPACK LDFLAGS:
ARPACK libraries:
BLAS libraries: -lblas
CAMD CPPFLAGS:
CAMD LDFLAGS:
CAMD libraries:
CARBON libraries:
CCOLAMD CPPFLAGS:
CCOLAMD LDFLAGS:
CCOLAMD libraries:
CHOLMOD CPPFLAGS:
CHOLMOD LDFLAGS:
CHOLMOD libraries:
COLAMD CPPFLAGS:
COLAMD LDFLAGS:
COLAMD libraries:
CURL CPPFLAGS:
CURL LDFLAGS:
CURL libraries: -lcurl
CXSPARSE CPPFLAGS:
CXSPARSE LDFLAGS:
CXSPARSE libraries:
DL libraries:
FFTW3 CPPFLAGS:
FFTW3 LDFLAGS:
FFTW3 libraries: -lfftw3_threads -lfftw3
FFTW3F CPPFLAGS:
FFTW3F LDFLAGS:
FFTW3F libraries: -lfftw3f_threads -lfftw3f
FLTK CPPFLAGS: -I/usr/local/packages6/libs/gcc/5.2/fltk/1.3.3/in
clude -I/usr/include/freetype2 -I/usr/local/packages6/compilers/gcc/5.2.0/includ
e -D_LARGEFILE_SOURCE -D_LARGEFILE64_SOURCE -D_THREAD_SAFE -D_REENTRANT
FLTK LDFLAGS: -L/usr/local/packages6/libs/gcc/5.2/fltk/1.3.3/li
b -Wl,-rpath,/usr/local/packages6/libs/gcc/5.2/fltk/1.3.3/lib -L/usr/local/packa
ges6/compilers/gcc/5.2.0/lib -L/usr/local/packages6/compilers/gcc/5.2.0/lib64 -l
fltk_gl -lGLU -lGL -lfltk -lXcursor -lXfixes -lXext -lXft -lfontconfig -lXineram
a -lpthread -ldl -lm -lX11
FLTK libraries:
fontconfig CPPFLAGS:
fontconfig libraries: -lfontconfig
FreeType2 CPPFLAGS: -I/usr/include/freetype2
FreeType2 libraries: -lfreetype
GLPK CPPFLAGS:
GLPK LDFLAGS:
GLPK libraries:
HDF5 CPPFLAGS:
HDF5 LDFLAGS:
HDF5 libraries: -lhdf5
Java home: /usr/local/packages6/apps/binapps/java/jre1.8.0_6


0/
Java JVM path: /usr/local/packages6/apps/binapps/java/jre1.8.0_6
0/lib/amd64/server
Java CPPFLAGS: -I/usr/local/packages6/apps/binapps/java/jre1.8.0
_60//include -I/usr/local/packages6/apps/binapps/java/jre1.8.0_60//include/linux
Java libraries:
LAPACK libraries: -llapack
LLVM CPPFLAGS:
LLVM LDFLAGS:
LLVM libraries:
Magick++ CPPFLAGS:
Magick++ LDFLAGS:
Magick++ libraries:
OPENGL libraries: -lfontconfig -lGL -lGLU
OSMesa CPPFLAGS:
OSMesa LDFLAGS:
OSMesa libraries:
PCRE CPPFLAGS:
PCRE libraries: -lpcre
PortAudio CPPFLAGS:
PortAudio LDFLAGS:
PortAudio libraries:
PTHREAD flags: -pthread
PTHREAD libraries:
QHULL CPPFLAGS:
QHULL LDFLAGS:
QHULL libraries:
QRUPDATE CPPFLAGS:
QRUPDATE LDFLAGS:
QRUPDATE libraries:
Qt CPPFLAGS: -I/usr/include/QtCore -I/usr/include/QtGui -I/usr
/include/QtNetwork -I/usr/include/QtOpenGL
Qt LDFLAGS:
Qt libraries: -lQtNetwork -lQtOpenGL -lQtGui -lQtCore
READLINE libraries: -lreadline
Sndfile CPPFLAGS:
Sndfile LDFLAGS:
Sndfile libraries:
TERM libraries: -lncurses
UMFPACK CPPFLAGS:
UMFPACK LDFLAGS:
UMFPACK libraries:
X11 include flags:
X11 libraries: -lX11
Z CPPFLAGS:
Z LDFLAGS:
Z libraries: -lz

Default pager: less


gnuplot: gnuplot

Build Octave GUI: yes


JIT compiler for loops: no
Build Java interface: no
Do internal array bounds checking: no
Build static libraries: no
Build shared libraries: yes
Dynamic Linking: yes (dlopen)


Include support for GNU readline: yes


64-bit array dims and indexing: no
OpenMP SMP multithreading: yes
Build cross tools: no

configure: WARNING:

I didn't find gperf, but it's only a problem if you need to


reconstruct oct-gperf.h

configure: WARNING:

I didn't find icotool, but it's only a problem if you need to


reconstruct octave-logo.ico, which is the case if you're building from
VCS sources.

configure: WARNING: Qhull library not found. This will result in loss of functi
onality of some geometry functions.
configure: WARNING: GLPK library not found. The glpk function for solving linea
r programs will be disabled.
configure: WARNING: gl2ps library not found. OpenGL printing is disabled.
configure: WARNING: OSMesa library not found. Offscreen rendering with OpenGL w
ill be disabled.
configure: WARNING: qrupdate not found. The QR & Cholesky updating functions wi
ll be slow.
configure: WARNING: AMD library not found. This will result in some lack of fun
ctionality for sparse matrices.
configure: WARNING: CAMD library not found. This will result in some lack of fu
nctionality for sparse matrices.
configure: WARNING: COLAMD library not found. This will result in some lack of
functionality for sparse matrices.
configure: WARNING: CCOLAMD library not found. This will result in some lack of
functionality for sparse matrices.
configure: WARNING: CHOLMOD library not found. This will result in some lack of
functionality for sparse matrices.
configure: WARNING: CXSparse library not found. This will result in some lack o
f functionality for sparse matrices.
configure: WARNING: UMFPACK not found. This will result in some lack of functio
nality for sparse matrices.
configure: WARNING: ARPACK not found. The eigs function will be disabled.
configure: WARNING: Include file <jni.h> not found. Octave will not be able to
call Java methods.
configure: WARNING: Qscintilla library not found -- disabling built-in GUI editor
configure:

• Some commonly-used packages were additionally installed from Octave Forge using the following commands
from within Octave
pkg install -global -forge io
pkg install -global -forge statistics
pkg install -global -forge mapping
pkg install -global -forge image
pkg install -global -forge struct
pkg install -global -forge optim

Module File The module file is octave_4.0


OpenFOAM

OpenFOAM
Version 3.0.0
Dependencies gcc/5.2 mpi/gcc/openmpi/1.10.0 scotch/6.0.4
URL http://www.openfoam.org
Documentation http://www.openfoam.org/docs/

OpenFOAM is free, open source software for computational fluid dynamics (CFD).

Usage The latest version of OpenFOAM can be activated using the module file:
module load apps/gcc/5.2/openfoam

Alternatively, you can load a specific version of OpenFOAM:


module load apps/gcc/5.2/openfoam/2.4.0
module load apps/gcc/5.2/openfoam/3.0.0

This has the same effect as sourcing the openfoam bashrc file, so that should not be needed.
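
OpenFOAM solver runs are usually submitted as batch jobs. The following is a minimal sketch of an SGE submission script that runs a solver in parallel over 4 MPI processes; the solver name simpleFoam is only an example and the case must already have been decomposed (for example with decomposePar) into the same number of subdomains.

#!/bin/bash
# Request 4 MPI processes
#$ -pe openmpi-ib 4
# Request 4 gigabytes of real and virtual memory per process
#$ -l mem=4G -l rmem=4G

module load apps/gcc/5.2/openfoam/3.0.0

# Run the solver in parallel from within the (already decomposed) case directory
mpirun -np 4 simpleFoam -parallel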

Installation notes OpenFOAM was compiled using the install_openfoam.sh script; the module file is 3.0.0.

OpenSim

OpenSim
Support Level bronze
Dependencies None
URL https://simtk.org/home/opensim
Version 3.3

OpenSim is a freely available, user extensible software system that lets users develop models of musculoskeletal
structures and create dynamic simulations of movement.

Usage The latest version of OpenSim can be loaded with


module load apps/gcc/4.8.2/opensim

Installation Notes These are primarily for administrators of the system.


Built using:
cmake /home/cs1sjm/Downloads/OpenSim33-source/
-DCMAKE_INSTALL_PREFIX=/usr/local/packages6/apps/gcc/4.8.2/opensim/3.3/
-DSIMBODY_HOME=/usr/local/packages6/libs/gcc/4.8.2/simbody/3.5.3/
-DOPENSIM_STANDARD_11=ON

make -j 8


make install

orca

orca
Versions 3.0.3
URL https://orcaforum.cec.mpg.de/

ORCA is a flexible, efficient and easy-to-use general purpose tool for quantum chemistry with specific emphasis on
spectroscopic properties of open-shell molecules. It features a wide variety of standard quantum chemical methods
ranging from semiempirical methods to DFT to single- and multireference correlated ab initio methods. It can also
treat environmental and relativistic effects.

Making orca available The following module command makes the latest version of orca available to your session
module load apps/binapps/orca

Alternatively, you can make a specific version available


module load apps/binapps/orca/3.0.3

Example single core job Create a file called orca_serial.inp that contains the following orca commands
#
# My first ORCA calculation :-)
#
# Taken from the Orca manual
# https://orcaforum.cec.mpg.de/OrcaManual.pdf
! HF SVP
* xyz 0 1
C 0 0 0
O 0 0 1.13
*

Create a Sun Grid Engine submission file called submit_serial.sh that looks like this
#!/bin/bash
# Request 4 Gig of virtual memory per process
#$ -l mem=4G
# Request 4 Gig of real memory per process
#$ -l rmem=4G

module load apps/binapps/orca/3.0.3


$ORCAHOME/orca orca_serial.inp

Submit the job to the queue with the command


qsub submit_serial.sh

Example parallel job An example Sun Grid Engine submission script is


#!/bin/bash
#Request 4 Processes
#Ensure that this matches the number requested in your Orca input file
#$ -pe openmpi-ib 4
# Request 4 Gig of virtual memory per process
#$ -l mem=4G
# Request 4 Gig of real memory per process
#$ -l rmem=4G

module load mpi/gcc/openmpi/1.8.8


module load apps/binapps/orca/3.0.3

ORCAPATH=/usr/local/packages6/apps/binapps/orca/3.0.3/
$ORCAPATH/orca example2_parallel.inp

Register as a user You are encouraged to register as a user of Orca at https://orcaforum.cec.mpg.de/ in order to take
advantage of updates, announcements and also of the users forum.

Documentation A comprehensive .pdf manual is available online.

Installation notes These are primarily for system administrators. Orca was a binary install
tar -xjf ./orca_3_0_3_linux_x86-64.tbz
cd orca_3_0_3_linux_x86-64
mkdir -p /usr/local/packages6/apps/binapps/orca/3.0.3
mv * /usr/local/packages6/apps/binapps/orca/3.0.3/

Modulefile The module file is on the system at /usr/local/modulefiles/apps/binapps/orca/3.0.3


The contents of the module file are
#%Module1.0#####################################################################
##
## Orca module file
##

## Module file logging


source /usr/local/etc/module_logging.tcl
##

proc ModulesHelp { } {
global bedtools-version

puts stderr " Adds `orca-$orcaversion' to your PATH environment variable"


}

set orcaversion 3.0.3


prepend-path ORCAHOME /usr/local/packages6/apps/binapps/orca/3.0.3/
prepend-path PATH /usr/local/packages6/apps/binapps/orca/3.0.3/

Paraview


Paraview
Version 4.3
Support Level extras
Dependencies openmpi (1.8)
URL http://paraview.org/
Documentation http://www.paraview.org/documentation/

Paraview is a parallel visualisation tool.

Using Paraview on iceberg This guide describes how to use Paraview from iceberg. Paraview is a parallel visualisation tool designed for use on clusters like iceberg. Because iceberg is designed primarily as a headless computational cluster, paraview has not been installed in such a way that you can run the GUI remotely on iceberg 1 . Paraview therefore runs on iceberg in client + server mode, with the server running on iceberg and the client on your local machine.

Configuring the Client To use Paraview on iceberg, first download and install Paraview 4.3 2 . Once you have
installed Paraview locally (the client) you need to configure the connection to iceberg. In Paraview go to File >
Connect, then click Add Server, name the connection iceberg, and select ‘Client / Server (reverse connection)’ for
the Server Type, the port should retain the default value of 11111. Then click configure, on the next screen leave the
connection as manual and click save. Once you are back at the original connect menu, click connect to start listening
for the connection from iceberg.

Starting the Server Once you have configured the local paraview client, login to iceberg from the client machine via ssh 3 and run qsub-paraview. This will submit a job to the scheduler queue for 16 processes with 4GB of RAM each. This is designed to be used for large visualisation tasks; smaller jobs can be requested by passing standard qsub options to qsub-paraview, e.g. qsub-paraview -pe openmpi-ib 1 will request only one process.
Assuming you still have the client listening for connections, once the paraview job starts in the queue it should connect to your client and you should be able to start accessing data stored on iceberg and rendering images.

A Note on Performance When you run Paraview locally it will use the graphics hardware of your local machine
for rendering. This is using hardware in your computer to create and display these images. On iceberg there is no
such hardware, it is simulated via software. Therefore for small datasets you will probably find you are paying a
performance penalty for using Paraview on iceberg.
The advantage however, is that the renderer is on the same cluster as your data, so no data transfer is needed, and you
have access to very large amounts of memory for visualising very large datasets.

Manually Starting the Server The qsub-paraview command is a wrapper that automatically detects the client IP address from the SSH connection and submits the job. It is possible to customise this behaviour by copying and modifying this script. This would, for instance, allow you to start the paraview server via MyApps or from a different computer to the one with the client installed. The script used by qsub-paraview also serves as a good example script and can be copied into your home directory by running cp /usr/local/bin/pvserver_submit.sh ~/. This script can then be submitted as normal with qsub. The client IP address can be added manually by replacing echo $SSH_CLIENT | awk '{ print $1 }' with the IP address. More information on Paraview client/server can be found here.
1 It is not possible to install the latest version of the paraview GUI on iceberg due to the Qt version shipped with Scientific Linux 5.
2 The client and server versions have to match.
3 Connecting to Paraview via the automatic method described here is not supported on the MyApps portal.


Installation Custom build scripts are available in /usr/local/extras/paraview/build_scripts which can be used to recompile.

Perl

Perl
Latest Version 5.10.1
URL https://www.perl.org/

Perl 5 is a highly capable, feature-rich programming language with over 27 years of development. Perl 5 runs on over
100 platforms from portables to mainframes and is suitable for both rapid prototyping and large scale development
projects.

Usage Perl 5.10.1 is installed by default on the system so no module command is required to use it
perl --version

This is perl, v5.10.1 (*) built for x86_64-linux-thread-multi

Copyright 1987-2009, Larry Wall
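
Perl scripts can be run as batch jobs like any other program. Below is a minimal sketch of an SGE submission script; myscript.pl is a hypothetical file name.

#!/bin/bash
# Request 4 gigabytes of real and virtual memory
#$ -l mem=4G -l rmem=4G

# Perl 5.10.1 is available by default, so no module command is needed
perl myscript.pl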

Phyluce

Phyluce
Version 1.5.0
Dependencies apps/python/conda
URL https://github.com/faircloth-lab/phyluce
Documentation http://faircloth-lab.github.io/phyluce/

Phyluce (phy-loo-chee) is a software package that was initially developed for analyzing data collected from ultraconserved elements in organismal genomes.

Usage Phyluce can be activated using the module file:


module load apps/binapps/phyluce/1.5.0

Phyluce makes use of the apps/python/conda module, therefore this module will be loaded by loading Phyluce.
As Phyluce is a Python package your default Python interpreter will be changed by loading Phyluce.

Installation notes As root:


$ module load apps/python/conda
$ conda create -p /usr/local/packages6/apps/binapps/conda/phyluce python=2
$ source activate /usr/local/packages6/apps/binapps/conda/phyluce
$ conda install -c https://conda.binstar.org/faircloth-lab phyluce

This installs Phyluce as a conda environment in the /usr/local/packages6/apps/binapps/conda/phyluce folder, which is then loaded by the module file phyluce/1.5.0, which is a modification of the anaconda module files.


Picard

Picard
Version 1.129
URL https://github.com/broadinstitute/picard/

A set of Java command line tools for manipulating high-throughput sequencing (HTS) data and formats. Picard is implemented using the HTSJDK Java library, supporting access to common file formats, such as SAM and VCF, used for high-throughput sequencing data.

Interactive Usage After connecting to Iceberg (see Connect to iceberg), start an interactive session with the qrshx
or qrsh command, preferably requesting at least 8 Gigabytes of memory
qrsh -l mem=8G -l rmem=8G

The latest version of Picard (currently 1.129) is made available with the command
module load apps/binapps/picard

Alternatively, you can load a specific version with


module load apps/binapps/picard/1.129
module load apps/binapps/picard/1.101

These module commands also change the environment to use Java 1.6 since this is required by Picard 1.129. An environment variable called PICARDHOME is created by the module command that contains the path to the requested version of Picard.
Thus, you can run the program with the command
java -jar $PICARDHOME/picard.jar
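
For example, the following is a minimal sketch of running the bundled SortSam tool to coordinate-sort a SAM file; input.sam and sorted.bam are hypothetical file names.

# Sort a SAM file by coordinate (hypothetical input/output names)
java -jar $PICARDHOME/picard.jar SortSam INPUT=input.sam OUTPUT=sorted.bam SORT_ORDER=coordinate

The same command can be placed in an SGE submission script (after the module load command above) to run as a batch job.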

Installation notes Version 1.129


A binary install was used. The binary came from the releases page of the project’s github repo
unzip picard-tools-1.129.zip
mkdir -p /usr/local/packages6/apps/binapps/picard
mv ./picard-tools-1.129 /usr/local/packages6/apps/binapps/picard/1.129

Version 1.101
A binary install was used. The binary came from the project’s sourceforge site
https://sourceforge.net/projects/picard/files/picard-tools/
unzip picard-tools-1.101.zip
mv ./picard-tools-1.101 /usr/local/packages6/apps/binapps/picard/1.101/

Modulefile Version 1.129


The module file is on the system at /usr/local/modulefiles/apps/binapps/picard/1.129
Its contents are


#%Module1.0#####################################################################
##
## Picard 1.129 modulefile
##

## Module file logging


source /usr/local/etc/module_logging.tcl
##

#This version of Picard needs Java 1.6


module load apps/java/1.6

proc ModulesHelp { } {
puts stderr "Makes Picard 1.129 available"
}

set version 1.129


set PICARD_DIR /usr/local/packages6/apps/binapps/picard/$version

module-whatis "Makes Picard 1.129 available"

prepend-path PICARDHOME $PICARD_DIR

Version 1.101
The module file is on the system at /usr/local/modulefiles/apps/binapps/picard/1.101
Its contents are
#%Module1.0#####################################################################
##
## Picard 1.101 modulefile
##

## Module file logging


source /usr/local/etc/module_logging.tcl
##

#This version of Picard needs Java 1.6


module load apps/java/1.6

proc ModulesHelp { } {
puts stderr "Makes Picard 1.101 available"
}

set version 1.101


set PICARD_DIR /usr/local/packages6/apps/binapps/picard/$version

module-whatis "Makes Picard 1.101 available"

prepend-path PICARDHOME $PICARD_DIR

Plink


Plink
Versions 1.9 beta 3
Support Level Bronze
Dependencies None
URL https://www.cog-genomics.org/plink2

PLINK is a free, open-source whole genome association analysis toolset, designed to perform a range of basic, large-
scale analyses in a computationally efficient manner.

Interactive Usage After connecting to iceberg (see Connect to iceberg), start an interactive session with the qsh or qrsh command.
The latest version of Plink is made available with the command
module load apps/binapps/plink

Alternatively, you can load a specific version with


module load apps/binapps/plink/1.9

You can now execute the plink command on the command line.
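
Plink analyses can also be run as batch jobs. The following is a minimal sketch of an SGE submission script; the file set stem mydata (i.e. mydata.bed, mydata.bim and mydata.fam) and the choice of a basic association test are hypothetical examples.

#!/bin/bash
# Request 8 gigabytes of real and virtual memory
#$ -l mem=8G -l rmem=8G

module load apps/binapps/plink/1.9

# mydata is a hypothetical binary file set; results are written to mydata_assoc.*
plink --bfile mydata --assoc --out mydata_assoc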

Installation notes These are primarily for administrators of the system.


The binary version of Plink was installed
mkdir plink_build
cd plink_build
unzip plink_linux_x86_64.zip
rm plink_linux_x86_64.zip
mkdir -p /usr/local/packages6/apps/binapps/plink/1.9
mv * /usr/local/packages6/apps/binapps/plink/1.9

The module file is at /usr/local/modulefiles/apps/binapps/plink/1.9


#%Module10.2####################################################################
#

## Module file logging


source /usr/local/etc/module_logging.tcl
##

proc ModulesHelp { } {
global ver

puts stderr " Makes Plink $ver available to the system."


}

# Plink version (not in the user's environment)


set ver 1.9

module-whatis "sets the necessary Plink $ver paths"

prepend-path PATH /usr/local/packages6/apps/binapps/plink/$ver


povray

povray
Version 3.7.0
URL http://www.povray.org/

The Persistence of Vision Raytracer is a high-quality, Free Software tool for creating stunning three-dimensional
graphics. The source code is available for those wanting to do their own ports.

Interactive Usage After connecting to iceberg (see Connect to iceberg), start an interactive session with the qsh or
qrsh command. The latest version of povray (currently 3.7) is made available with the command
module load apps/gcc/5.2/povray

Alternatively, you can load a specific version with


module load apps/gcc/5.2/povray/3.7

This command makes the povray binary available to your session. It also loads version 5.2 of the gcc compiler
environment since gcc 5.2 was used to compile povray 3.7.
You can now run povray. For example, to confirm the version loaded
povray --version

and to get help


povray --help
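
Rendering is usually best done as a batch job. The following is a minimal sketch of an SGE submission script; scene.pov is a hypothetical scene file and the image size is just an example.

#!/bin/bash
# Request 4 gigabytes of real and virtual memory
#$ -l mem=4G -l rmem=4G

module load apps/gcc/5.2/povray/3.7

# Render the hypothetical scene.pov to an 800x600 image
povray +Iscene.pov +Oscene.png +W800 +H600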

Documentation Once you have made povray available to the system using the module command above, you can
read the man pages by typing
man povray

Installation notes povray 3.7.0 was installed using gcc 5.2 using the following script
• install_povray-3.7.sh

Testing The test suite was executed


make check

All tests passed.

Modulefile
• The module file is on the system at /usr/local/modulefiles/apps/gcc/5.2/povray/3.7
• The module file is on github.


Python

Python
Support Level Gold
Dependencies None
URL https://python.org
Version All

This page documents the “miniconda” installation on iceberg. This is the recommended way of using Python on
iceberg, and the best way to be able to configure custom sets of packages for your use.
“conda”, a Python package manager, allows you to create “environments” which are sets of packages that you can modify. It does this by installing them in your home area. This page will guide you through loading conda and then creating and modifying environments so you can install and use whatever Python packages you need.

Using conda Python After connecting to iceberg (see Connect to iceberg), start an interactive session with the qsh
or qrsh command.
Conda Python can be loaded with:
module load apps/python/conda

The root conda environment (the default) provides Python 3 and no extra packages. It is automatically updated and is not recommended for general use; use it only as a base for your own environments. There is also a python2 environment, which is the same but with a Python 2 installation.

Quickly Loading Anaconda Environments There are a small number of environments provided for everyone to
use, these are the default root and python2 environments as well as various versions of Anaconda for Python 3
and Python 2.
The anaconda environments can be loaded through provided module files:
module load apps/python/anaconda2-2.4.0
module load apps/python/anaconda3-2.4.0
module load apps/python/anaconda3-2.5.0

Where anaconda2 represents Python 2 installations and anaconda3 represents Python 3 installations. These
commands will also load the apps/python/conda module and then activate the anaconda environment specified.

Note: Anaconda 2.5.0 is compiled with Intel MKL libraries which should result in higher numerical performance.

Using conda Environments Once the conda module is loaded you have to load or create the desired conda environ-
ments. For the documentation on conda environments see the conda documentation.
You can load a conda environment with:
source activate python2

where python2 is the name of the environment, and unload one with:
source deactivate


which will return you to the root environment.


It is possible to list all the available environments with:
conda env list

A set of anaconda environments is provided system-wide. These are installed with the anaconda version number in the environment name and are never modified; they therefore provide a static base for derivative environments or for using directly.

Creating an Environment Every user can create their own environments. Packages shared with the system-wide environments will not be reinstalled or copied to your file store; they will be symlinked. This reduces the space you need in your /home directory to install many different Python environments.
To create a clean environment with just Python 2 and numpy you can run:
conda create -n mynumpy python=2.7 numpy

This will download the latest release of Python 2.7 and numpy, and create an environment named mynumpy.
Any version of Python or list of packages can be provided:
conda create -n myscience python=3.5 numpy=1.8.1 scipy

If you wish to modify an existing environment, such as one of the anaconda installations, you can clone that environment:
conda create --clone anaconda3-2.3.0 -n myexperiment

This will create an environment called myexperiment which has all the anaconda 2.3.0 packages installed with
Python 3.

Installing Packages Inside an Environment Once you have created your own environment you can install additional packages or different versions of packages into it. There are two methods for doing this: conda and pip. If a package is available through conda, it is strongly recommended that you use conda to install it. You can search for packages using conda:
conda search pandas

then install the package using:


conda install pandas

If you are not in your environment you will get a permission denied error when trying to install packages. If this happens, create or activate an environment you own.
If a package is not available through conda you can search for and install it using pip:
pip search colormath

pip install colormath
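
Using an Environment in a Batch Job To use one of your conda environments in a batch job, load the conda module and activate the environment inside your submission script. The following is a minimal sketch; the environment name mynumpy (created as above) and the script name analysis.py are hypothetical examples.

#!/bin/bash
# Request 8 gigabytes of real and virtual memory
#$ -l mem=8G -l rmem=8G

module load apps/python/conda
source activate mynumpy

# analysis.py is a hypothetical Python script
python analysis.py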

Previous Anaconda Installation There is a legacy anaconda installation which is accessible through the
binapps/anacondapython/2.3 module. This module should be considered deprecated and should no longer
be used.


Using Python with MPI There is an experimental set of packages for conda that have been compiled by the iceberg team, which allow you to use an MPI stack entirely managed by conda. This allows you to easily create complex environments and use MPI without worrying about other modules or system libraries.
To get access to these packages you need to run the following command to add the repo to your conda config:
conda config --add channels file:///usr/local/packages6/conda/conda-bld/

you should then be able to install the packages with the openmpi feature, which currently include openmpi, hdf5,
mpi4py and h5py:
conda create -n mpi python=3.5 openmpi mpi4py

Currently, there are Python 2.7, 3.4 and 3.5 versions of mpi4py and h5py compiled in this repository.
The build scripts for these packages can be found in this GitHub repository.

Installation Notes These are primarily for administrators of the system.


The conda package manager is installed in /usr/share/packages6/conda; it was installed using the miniconda installer.
The two “root” environments root and python2 can be updated using the update script located at /usr/local/packages6/conda/_envronments/conda-autoupdate.sh. This should be run regularly to keep these base environments up to date with Python and, more importantly, with the conda package manager itself.

Installing a New Version of Anaconda Perform the following:


$ cd /usr/local/packages6/conda/_envronments/
$ cp anaconda2-2.3.0.yml anaconda2-x.y.z.yml

then edit that file modifying the environment name and the anaconda version under requirements then run:
$ conda env create -f anaconda2-x.y.z.yml

then repeat for the Python 3 installation.


Then copy the modulefile for the previous version of anaconda to the new version and update the name of the environment. You will also need to append the new module to the conflict line in apps/python/.conda-environments.tcl.

R
Dependencies BLAS
URL http://www.r-project.org/
Documentation http://www.r-project.org/

R is a statistical computing language.

Interactive Usage After connecting to iceberg (see Connect to iceberg), start an interactive session with the qrshx
command.
The latest version of R can be loaded with


module load apps/R

Alternatively, you can load a specific version of R using one of the following
module load apps/R/3.3.1
module load apps/R/3.3.0
module load apps/R/3.2.4
module load apps/R/3.2.3
module load apps/R/3.2.2
module load apps/R/3.2.1
module load apps/R/3.2.0
module load apps/R/3.1.2

R can then be run with


$ R

Serial (one CPU) Batch usage Here, we assume that you wish to run the program my_code.R on the system.
With batch usage it is recommended to load a specific version of R, for example module load apps/R/3.2.4,
to ensure the expected output is achieved.
First, you need to write a batch submission file. We assume you’ll call this my_job.sge
#!/bin/bash
#$ -S /bin/bash
#$ -cwd # Run job from current directory

module load apps/R/3.3.0 # Recommended to load a specific version of R

R CMD BATCH my_code.R my_code.R.o$JOB_ID

Note that R must be called with both the CMD and BATCH options which tell it to run an R program, in this case
my_code.R. If you do not do this, R will attempt to open an interactive prompt.
The final argument, my_code.R.o$JOB_ID, tells R to send output to a file with this name. Since $JOB_ID will always be unique, this ensures that all of your output files are unique. Without this argument R sends all output to a file called my_code.Rout.
Ensuring that my_code.R and my_job.sge are both in your current working directory, submit your job to the
batch system
qsub my_job.sge

Replace my_job.sge with the name of your submission script.

Graphical output By default, graphical output from batch jobs is sent to a file called Rplots.pdf

Installing additional packages As you will not have permissions to install packages to the default folder, additional
R packages can be installed to your home folder ~/. To create the appropriate folder, install your first package in R in
interactive mode. Load an interactive R session as described above, and install a package with
install.packages()

You will be prompted to create a personal package library. Choose yes. The package will download and install from a
CRAN mirror (you may be asked to select a nearby mirror, which you can do simply by entering the number of your
preferred mirror).


Once the chosen package has been installed, additional packages can be installed either in the same way, or by creating
a .R script. An example script might look like
install.packages("dplyr")
install.packages("devtools")

Call this using source(). For example if your script is called packages.R and is stored in your home folder,
source this from an interactive R session with
source("~/packages.R")

These additional packages will be installed without prompting to your personal package library.
To check your packages are up to date, and update them if necessary, run the following line from an R interactive
session
update.packages(lib.loc = "~/R/x86_64-unknown-linux-gnu-library/3.3/")

The folder name after ~/R/ will likely change, but this can be completed with tab autocompletion from the R session.
Ensure lib.loc folder is specified, or R will attempt to update the wrong library.

R Packages that require external libraries Some R packages require external libraries to be installed before you
can install and use them. Since there are so many, we only install those libraries that have been explicitly requested by
users of the system.
The associated R packages are not included in the system install of R, so you will need to install them yourself to your
home directory following the instructions linked to below.
• geos This is the library required for the rgeos package.
• JAGS This is the library required for the rjags and runjags packages

Using the Rmath library in C Programs The Rmath library allows you to access some of R’s functionality from a
C program. For example, consider the C-program below
#include <stdio.h>
#define MATHLIB_STANDALONE
#include "Rmath.h"

int main(void){
    double shape1, shape2, prob;

    shape1 = 1.0;
    shape2 = 2.0;
    prob = 0.5;

    printf("Critical value is %lf\n", qbeta(prob, shape1, shape2, 1, 0));

    return 0;
}

This makes use of R’s qbeta function. You can compile and run this on a worker node as follows.
Start a session on a worker node with qrsh or qsh and load the R module
module load apps/R/3.3.0

Assuming the program is called test_rmath.c, compile with


gcc test_rmath.c -lRmath -lm -o test_rmath
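
You can then run the resulting executable directly from the same session:

./test_rmath

This should print the critical value computed by qbeta, approximately 0.292893 for the parameters used above.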


For full details about the functions made available by the Rmath library, see section 6.7 of the document Writing R
extensions

Accelerated version of R There is an experimental, accelerated version of R installed on Iceberg that makes use of
the Intel Compilers and the Intel MKL. See R (Intel Build) for details.

Installation Notes These notes are primarily for administrators of the system.
version 3.3.1
• What’s new in R version 3.3.1
This was a scripted install. It was compiled from source with gcc 4.4.7 and with --enable-R-shlib enabled. It was run in batch mode.
This build required several external modules including xz utils, curl, bzip2 and zlib
• install_R_3.3.1.sh Downloads, compiles, tests and installs R 3.3.1 and the Rmath library.
• R 3.3.1 Modulefile located on the system at /usr/local/modulefiles/apps/R/3.3.1
• Install log-files, including the output of the make check tests are available on the system at
/usr/local/packages6/R/3.3.1/install_logs
version 3.3.0
• What’s new in R version 3.3.0
This was a scripted install. It was compiled from source with gcc 4.4.7 and with --enable-R-shlib enabled. You will need a large memory qrshx session in order to successfully run the build script. I used qrshx -l rmem=8G -l mem=8G
This build required several external modules including xz utils, curl, bzip2 and zlib
• install_R_3.3.0.sh Downloads, compiles, tests and installs R 3.3.0 and the Rmath library.
• R 3.3.0 Modulefile located on the system at /usr/local/modulefiles/apps/R/3.3.0
• Install log-files, including the output of the make check tests are available on the system at
/usr/local/packages6/R/3.3.0/install_logs
Version 3.2.4
• What’s new in R version 3.2.4
This was a scripted install. It was compiled from source with gcc 4.4.7 and with --enable-R-shlib enabled. You will need a large memory qrshx session in order to successfully run the build script. I used qrshx -l rmem=8G -l mem=8G
This build made use of new versions of xz utils and curl
• install_R_3.2.4.sh Downloads, compiles, tests and installs R 3.2.4 and the Rmath library.
• R 3.2.4 Modulefile located on the system at /usr/local/modulefiles/apps/R/3.2.4
• Install log-files, including the output of the make check tests are available on the system at
/usr/local/packages6/R/3.2.4/install_logs
Version 3.2.3
• What’s new in R version 3.2.3
This was a scripted install. It was compiled from source with gcc 4.4.7 and with --enable-R-shlib enabled. You
will need a large memory qrsh session in order to successfully run the build script. I used qrsh -l rmem=8G -l
mem=16G
• install_R_3.2.3.sh Downloads, compiles, tests and installs R 3.2.3 and the Rmath library.


• R 3.2.3 Modulefile located on the system at /usr/local/modulefiles/apps/R/3.2.3


• Install log-files, including the output of the make check tests are available on the system at
/usr/local/packages6/R/3.2.3/install_logs
Version 3.2.2
• What’s new in R version 3.2.2
This was a scripted install. It was compiled from source with gcc 4.4.7 and with --enable-R-shlib enabled. You
will need a large memory qrsh session in order to successfully run the build script. I used qrsh -l rmem=8G -l
mem=16G
• install_R_3.2.2.sh Downloads, compiles and installs R 3.2.2 and the Rmath library.
• R 3.2.2 Modulefile located on the system at /usr/local/modulefiles/apps/R/3.2.2
• Install log-files were manually copied to /usr/local/packages6/R/3.2.2/install_logs on the
system. This step should be included in the next version of the install script.
Version 3.2.1
This was a manual install. It was compiled from source with gcc 4.4.7 and with --enable-R-shlib enabled.
• Install notes
• R 3.2.1 Modulefile located on the system at /usr/local/modulefiles/apps/R/3.2.1
Older versions
Install notes for older versions of R are not available.

relion

relion
Versions 1.4
URL http://relion.readthedocs.org/en/latest/

RELION is a software package that performs an empirical Bayesian approach to (cryo-EM) structure determination
by single-particle analysis. Note that RELION is distributed under a GPL license.

Making RELION available The following module command makes the latest version of RELION available to your
session
module load apps/gcc/4.4.7/relion

Alternatively, you can make a specific version available


module load apps/gcc/4.4.7/relion/1.4

Installation notes These are primarily for system administrators.


Install instructions: http://www2.mrc-lmb.cam.ac.uk/relion/index.php/Download_%26_install
• RELION was installed using the gcc 4.4.7 compiler and Openmpi 1.8.8
• install_relion.sh


• Note - the environment variable RELION_QSUB_TEMPLATE points to an SGE qsub template, which needs
customizing to work with our environment

Modulefile The module file is on the system at /usr/local/modulefiles/apps/gcc/4.4.7/relion/1.4


The contents of the module file are
#%Module1.0#####################################################################
##
## relion module file
##

## Module file logging


source /usr/local/etc/module_logging.tcl
##

proc ModulesHelp { } {
global bedtools-version

puts stderr " Setups `relion-$relionversion' environment variables"


}

set relionversion 1.4

module load apps/gcc/4.4.7/ctffind/3.140609


module load apps/binapps/resmap/1.1.4
module load mpi/gcc/openmpi/1.8.8

prepend-path PATH /usr/local/packages6/apps/gcc/4.4.7/relion/1.4/bin


prepend-path LD_LIBRARY_PATH /usr/local/packages6/apps/gcc/4.4.7/relion/1.4/lib
setenv RELION_QSUB_TEMPLATE /usr/local/packages6/apps/gcc/4.4.7/relion/1.4/bin/relion_qsub.csh
setenv RELION_CTFFIND_EXECUTABLE ctffind3_mp.exe
setenv RELION_RESMAP_EXECUTABLE /usr/local/packages6/apps/binapps/resmap/1.1.4

ResMap

ResMap
Versions 1.1.4
URL http://resmap.readthedocs.org/en/latest/

ResMap (Resolution Map) is a software package for computing the local resolution of 3D density maps studied in
structural biology, primarily electron cryo-microscopy.

Making ResMap available The following module command makes the latest version of ResMap available to your
session
module load apps/binapps/resmap

Alternatively, you can make a specific version available


module load apps/binapps/resmap/1.1.4


Installation notes These are primarily for system administrators.


• ResMap was installed as a precompiled binary from https://sourceforge.net/projects/resmap/files/ResMap-1.1.4-
linux64/download

Modulefile The module file is on the system at /usr/local/modulefiles/apps/binapps/resmap/1.1.4


The contents of the module file are
#%Module1.0#####################################################################
##
## ResMap module file
##

## Module file logging


source /usr/local/etc/module_logging.tcl
##

proc ModulesHelp { } {
global bedtools-version

puts stderr " Adds `ResMap-$resmapversion' to your PATH environment variable"


}

set resmapversion 1.1.4

prepend-path PATH /usr/local/packages6/apps/binapps/resmap/$resmapversion/

Samtools

Samtools
Versions 1.2
URL http://samtools.sourceforge.net/

SAM (Sequence Alignment/Map) format is a generic format for storing large nucleotide sequence alignments.

Interactive Usage After connecting to Iceberg (see Connect to iceberg), start an interactive session with the qsh
command.
The latest version of Samtools (currently 1.2) is made available with the command
module load apps/gcc/5.2/samtools

Alternatively, you can load a specific version with


module load apps/gcc/5.2/samtools/1.2

This command makes the samtools binary directory available to your session.
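
As an illustrative sketch (aln.bam is a placeholder for your own BAM file), typical commands include:

samtools view -H aln.bam      # print just the header of a BAM file
samtools flagstat aln.bam     # simple statistics about the alignments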

Documentation Once you have made samtools available to the system using the module command above, you can
read the man pages by typing


man samtools

Installation notes Samtools was installed using gcc 5.2


module load compilers/gcc/5.2

tar -xvjf ./samtools-1.2.tar.bz2


cd samtools-1.2
mkdir -p /usr/local/packages6/apps/gcc/5.2/samtools/1.2
make prefix=/usr/local/packages6/apps/gcc/5.2/samtools/1.2
make prefix=/usr/local/packages6/apps/gcc/5.2/samtools/1.2 install
#tabix and bgzip are not installed by the above procedure.
#We can get them by doing the following
cd htslib-1.2.1/
make
mv ./tabix /usr/local/packages6/apps/gcc/5.2/samtools/1.2/bin/
mv ./bgzip /usr/local/packages6/apps/gcc/5.2/samtools/1.2/bin/

Testing The test suite was run with


make test 2>&1 | tee make_tests.log

The summary of the test output was


Test output:
Number of tests:
total .. 368
passed .. 336
failed .. 0
expected failure .. 32
unexpected pass .. 0

test/merge/test_bam_translate test/merge/test_bam_translate.tmp
test/merge/test_pretty_header
test/merge/test_rtrans_build
test/merge/test_trans_tbl_init
cd test/mpileup && ./regression.sh
Samtools mpileup tests:

EXPECTED FAIL: Task failed, but expected to fail;


when running $samtools mpileup -x -d 8500 -B -f mpileup.ref.fa deep.sam|awk '{print $4}'

Expected passes: 123


Unexpected passes: 0
Expected failures: 1
Unexpected failures: 0

The full log is on the system at /usr/local/packages6/apps/gcc/5.2/samtools/1.2/make_tests.log

Modulefile
• The module file is on the system at /usr/local/modulefiles/apps/gcc/5.2/samtools/1.2
• The module file is on github.


spark

Spark
Version 2.0
URL http://spark.apache.org/

Apache Spark is a fast and general engine for large-scale data processing.

Interactive Usage After connecting to Iceberg (see Connect to iceberg), start an interactive session with the qrsh or
qrshx command.
To make Spark available, execute the following module command
module load apps/gcc/4.4.7/spark/2.0
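
The module sets the environment variable MASTER to local[1], so Spark defaults to a single core. As an illustrative sketch, you can ask an interactive Spark shell for more cores explicitly (4 is an arbitrary choice; request a matching number of slots from the scheduler):

spark-shell --master local[4]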

Installation notes Spark was built using the system gcc 4.4.7
tar -xvzf ./spark-2.0.0.tgz
cd spark-2.0.0
build/mvn -DskipTests clean package

mkdir /usr/local/packages6/apps/gcc/4.4.7/spark
mv spark-2.0.0 /usr/local/packages6/apps/gcc/4.4.7/spark/

Modulefile Version 2.0


• The module file is on the system at /usr/local/modulefiles/apps/gcc/4.4.7/spark/2.0
Its contents are
#%Module1.0#####################################################################
##
## Spark module file
##

## Module file logging


source /usr/local/etc/module_logging.tcl
##

# Use only one core. User can override this if they want
setenv MASTER local\[1\]
prepend-path PATH /usr/local/packages6/apps/gcc/4.4.7/spark/spark-2.0.0/bin

SU2

SU2
Version 4.1.0
Dependencies gcc/4.4.7 mpi/gcc/openmpi/1.10.1
URL http://su2.stanford.edu/
Documentation https://github.com/su2code/SU2/wiki


The SU2 suite is an open-source collection of C++ based software tools for performing Partial Differential Equation
(PDE) analysis and solving PDE constrained optimization problems.

Usage SU2 can be activated using the module file:


module load apps/gcc/4.4.7/su2

Installation notes SU2 was compiled using the install_su2.sh script. SU2 also has its own version of CGNS, which
was compiled with the script install_cgns.sh. The module file is 4.1.0.

Tophat

Tophat
Versions 2.1.0
URL https://ccb.jhu.edu/software/tophat/index.shtml

TopHat is a fast splice junction mapper for RNA-Seq reads. It aligns RNA-Seq reads to mammalian-sized genomes
using the ultra high-throughput short read aligner Bowtie, and then analyzes the mapping results to identify splice
junctions between exons.

Interactive Usage After connecting to Iceberg (see Connect to iceberg), start an interactive session with either the
qsh or qrsh command.
The latest version of tophat (currently 2.1.0) is made available with the command:
module load apps/gcc/4.8.2/tophat

Alternatively, you can load a specific version with


module load apps/gcc/4.8.2/tophat/2.1.0

This command makes the tophat binary available to your session.

Installation notes Tophat 2.1.0 was installed using gcc 4.8.2. Installs were attempted using gcc 5.2 and gcc 4.4.7
but both failed (see this issue on github).
This install has dependencies on the following
• GNU Compiler Collection (gcc) 4.8.2
• Boost C++ Library 1.58
• Bowtie2 (not needed at install time but is needed at runtime)
Install details
module load compilers/gcc/4.8.2
module load libs/gcc/4.8.2/boost/1.58

mkdir -p /usr/local/packages6/apps/gcc/4.8.2/tophat/2.1.0

tar -xvzf ./tophat-2.1.0.tar.gz


cd tophat-2.1.0
./configure --with-boost=/usr/local/packages6/libs/gcc/4.8.2/boost/1.58.0/ --prefix=/usr/local/packag


Configuration results
-- tophat 2.1.0 Configuration Results --
C++ compiler: g++ -Wall -Wno-strict-aliasing -g -gdwarf-2 -Wuninitialized -O3 -DNDEBUG -I./s
Linker flags: -L./samtools-0.1.18 -L/usr/local/packages6/libs/gcc/4.8.2/boost/1.58.0//lib
BOOST libraries: -lboost_thread -lboost_system
GCC version: gcc (GCC) 4.8.2 20140120 (Red Hat 4.8.2-15)
Host System type: x86_64-unknown-linux-gnu
Install prefix: /usr/local/packages6/apps/gcc/4.8.2/tophat/2.1.0
Install eprefix: ${prefix}

Built with
make
make install

Testing A test script was executed based on the documentation on the tophat website (retrieved 29th October 2015).
It only proves that the code can run without error, not that the result is correct.
• tophat_test.sh

Modulefile
• The module file is on the system at /usr/local/modulefiles/apps/gcc/4.8.2/tophat/2.1.0
• The module file is on github.

VCFtools

VCFtools
Version 0.1.14
Dependencies gcc/5.2
URL https://vcftools.github.io/
Documentation https://vcftools.github.io/examples.html

VCFtools is a program package designed for working with VCF files, such as those generated by the 1000 Genomes
Project. The aim of VCFtools is to provide easily accessible methods for working with complex genetic variation data
in the form of VCF files.

Usage VCFtools can be activated using the module file:


module load apps/gcc/5.2/vcftools/0.1.14

This will both add the binary files to the $PATH and the Perl module to $PERL5LIB.
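
As an illustrative sketch (my_data.vcf is a placeholder for your own file), you could then compute allele frequencies with:

vcftools --vcf my_data.vcf --freq --out my_data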

Installation notes VCFtools was compiled using the install_vcftools.sh script; the module file is 0.1.14.

Velvet


Velvet
Version 1.2.10
URL https://www.ebi.ac.uk/~zerbino/velvet/

Sequence assembler for very short reads.

Interactive Usage After connecting to Iceberg (see Connect to iceberg), start an interactive session with the qrshx
or qrsh command.
To add the velvet binaries to the system PATH, execute the following command
module load apps/gcc/4.4.7/velvet/1.2.10

This makes two programs available:


• velvetg - de Bruijn graph construction, error removal and repeat resolution
• velveth - simple hashing program

Example submission script If the command you want to run is velvetg /fastdata/foo1bar/velvet/assembly_31
-exp_cov auto -cov_cutoff auto, here is an example submission script that requests 60 gigabytes of memory
#!/bin/bash
#$ -l mem=60G
#$ -l rmem=60G

module load apps/gcc/4.4.7/velvet/1.2.10

velvetg /fastdata/foo1bar/velvet/assembly_31 -exp_cov auto -cov_cutoff auto

Put the above into a text file called submit_velvet.sh and submit it to the queue with the command qsub submit_velvet.sh

Velvet Performance Velvet has multicore capabilities but these have not been compiled into our version. This is
because there is evidence that the performance is better running in serial than in parallel. See
http://www.ehu.eus/ehusfera/hpc/2012/06/20/benchmarking-genetic-assemblers-abyss-vs-velvet/ for details.

Installation notes Velvet was compiled with gcc 4.4.7


tar -xvzf ./velvet_1.2.10.tgz
cd velvet_1.2.10
make

mkdir -p /usr/local/packages6/apps/gcc/4.4.7/velvet/1.2.10
mv * /usr/local/packages6/apps/gcc/4.4.7/velvet/1.2.10

Modulefile The module file is on the system at /usr/local/modulefiles/apps/gcc/4.4.7/velvet/1.2.10


Its contents are
#%Module1.0#####################################################################
##
## velvet 1.2.10 module file
##


## Module file logging


source /usr/local/etc/module_logging.tcl
##

proc ModulesHelp { } {
puts stderr "Makes velvet 1.2.10 available"
}

set VELVET_DIR /usr/local/packages6/apps/gcc/4.4.7/velvet/1.2.10

module-whatis "Makes velevt 1.2.10 available"

prepend-path PATH $VELVET_DIR

xz utils

xz utils
Latest version 5.2.2
URL http://tukaani.org/xz/

XZ Utils is free general-purpose data compression software with a high compression ratio. XZ Utils were written for
POSIX-like systems, but also work on some not-so-POSIX systems. XZ Utils are the successor to LZMA Utils.
The core of the XZ Utils compression code is based on LZMA SDK, but it has been modified quite a lot to be suitable
for XZ Utils. The primary compression algorithm is currently LZMA2, which is used inside the .xz container format.
With typical files, XZ Utils create 30 % smaller output than gzip and 15 % smaller output than bzip2.
XZ Utils consist of several components:
• liblzma is a compression library with an API similar to that of zlib.
• xz is a command line tool with syntax similar to that of gzip.
• xzdec is a decompression-only tool smaller than the full-featured xz tool.
• A set of shell scripts (xzgrep, xzdiff, etc.) have been adapted from gzip to ease viewing, grepping, and comparing
compressed files.
• Emulation of command line tools of LZMA Utils eases transition from LZMA Utils to XZ Utils.
While liblzma has a zlib-like API, liblzma doesn’t include any file I/O functions. A separate I/O library is planned,
which would abstract handling of .gz, .bz2, and .xz files with an easy to use API.

Usage There is an old version of xz utils available on the system by default. We can see its version with
xz --version

which gives
xz (XZ Utils) 4.999.9beta
liblzma 4.999.9beta

Version 4.999.9beta of xzutils was released in 2009.


To make version 5.2.2 (released in 2015) available, run the following module command


module load apps/gcc/4.4.7/xzutils/5.2.2
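
As an illustrative sketch (mydata.tar is a placeholder file name):

xz -9 mydata.tar           # compress with maximum compression; replaces mydata.tar with mydata.tar.xz
xz -d mydata.tar.xz        # decompress again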

Documentation Standard man pages are available. The documentation version you get depends on whether or not
you’ve loaded the module.
man xz

Installation notes This section is primarily for administrators of the system. xz utils 5.2.2 was compiled with gcc
4.4.7
tar -xvzf ./xz-5.2.2.tar.gz
cd xz-5.2.2
mkdir -p /usr/local/packages6/apps/gcc/4.4.7/xzutils/5.2.2
./configure --prefix=/usr/local/packages6/apps/gcc/4.4.7/xzutils/5.2.2
make
make install

Testing was performed with


make check

Final piece of output was


==================
All 9 tests passed
==================

Module file Modulefile is on the system at /usr/local/modulefiles/apps/gcc/4.4.7/xzutils/5.2.2


#%Module1.0#####################################################################
##
## xzutils 5.2.2 module file
##

## Module file logging


source /usr/local/etc/module_logging.tcl
##

proc ModulesHelp { } {
puts stderr "Makes xzutils 5.2.2 available"
}

set XZUTILS_DIR /usr/local/packages6/apps/gcc/4.4.7/xzutils/5.2.2

module-whatis "Makes xzutils 5.2.2 available"

prepend-path PATH $XZUTILS_DIR/bin


prepend-path LD_LIBRARY_PATH $XZUTILS_DIR/lib
prepend-path CPLUS_INCLUDE_PATH $XZUTILS_DIR/include
prepend-path CPATH $XZUTILS_DIR/include
prepend-path LIBRARY_PATH $XZUTILS_DIR/lib
prepend-path MANPATH $XZUTILS_DIR/share/man/


Libraries

bclr

bclr
Latest version UNKNOWN
URL http://crd.lbl.gov/departments/computer-science/CLaSS/research/BLCR/

Future Technologies Group researchers are developing a hybrid kernel/user implementation of checkpoint/restart.
Their goal is to provide a robust, production quality implementation that checkpoints a wide range of applications,
without requiring changes to be made to application code. This work focuses on checkpointing parallel applications
that communicate through MPI, and on compatibility with the software suite produced by the SciDAC Scalable
Systems Software ISIC. This work is broken down into four main areas.

Usage BLCR was installed before we started using the module system. As such, it is currently always available on
worker nodes. It should be considered experimental.
The checkpointing is performed at the kernel level, so any batch code should be checkpointable without modification
(it may not work with our MPI environment though... although it should cope with SMP codes).
To run a code, use
cr_run ./executable

To checkpoint a process with process id PID


cr_checkpoint -f checkpoint.file PID

Use the --term flag if you want to checkpoint and kill the process.
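
For example:

cr_checkpoint --term -f checkpoint.file PID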
To restart the process from a checkpoint
cr_restart checkpoint.file

Using with SGE in batch A checkpoint environment called BLCR has been set up. An example of a checkpointing
job would look something like
#!/bin/bash
#$ -l h_rt=168:00:00
#$ -c sx
#$ -ckpt blcr
cr_run ./executable >> output.file

The -c sx option tells SGE to checkpoint if the queue is suspended, or if the execution daemon is killed. You can also
specify checkpoints to occur after a given time period.
A checkpoint file will be produced before the job is terminated. This file will be called checkpoint.[jobid].[pid]. This
file will contain the complete in-memory state of your program at the time that it terminates, so make sure that you
have enough disk space to save the file.
To resume a checkpointed job submit a job file which looks like
#!/bin/bash
#$ -l h_rt=168:00:00
#$ -c sx


#$ -ckpt blcr
cr_restart [name of checkpoint file]

Installation notes Installation notes are not available

beagle

beagle
Latest version 2.1.2
URL https://github.com/beagle-dev/beagle-lib
Location /usr/local/packages6/libs/gcc/4.4.7/beagle/2.1.2/

General purpose library for evaluating the likelihood of sequence evolution on trees

Usage To make this library available, run the following module command
module load libs/gcc/4.4.7/beagle/2.1.2

This populates the environment variables C_INCLUDE_PATH, LD_LIBRARY_PATH and LD_RUN_PATH with the
relevant directories.

Installation notes This section is primarily for administrators of the system.


Beagle 2.1.2 was compiled with gcc 4.4.7
• Install script - https://github.com/mikecroucher/HPC_Installers/blob/master/libs/beagle/2.1.2/sheffield/iceberg/install_beagle_2_
• Module file - https://github.com/mikecroucher/HPC_Installers/blob/master/libs/beagle/2.1.2/sheffield/iceberg/2.1.2

Boost C++ Library

Boost C++ Library


Latest version 1.59
URL www.boost.org

Boost provides free, peer-reviewed and portable C++ source libraries.

Usage On Iceberg, different versions of Boost were built using different versions of gcc. We suggest that you use
the matching version of gcc to build your code.
The latest version of Boost, version 1.59, was built using gcc 5.2. To make both the compiler and Boost library
available to the system, execute the following module commands while in a qrsh or qsh session
module load compilers/gcc/5.2
module load libs/gcc/5.2/boost/1.59

Boost version 1.58 was built using gcc 4.8.2. To make both the compiler and Boost library available to the system,
execute the following module commands while in a qrsh or qsh session


module load compilers/gcc/4.8.2


module load libs/gcc/4.8.2/boost/1.58

Version 1.41 of Boost uses version 4.4.7 of the gcc compiler. Since this is the default version of gcc on the system,
you only need to load the module for the library
module load libs/gcc/4.4.7/boost/1.41

Build a simple program using Boost Many boost libraries are header-only which makes them particularly simple
to compile. The following program reads a sequence of integers from standard input, uses Boost.Lambda to multiply
each number by three, and writes them to standard output (taken from
http://www.boost.org/doc/libs/1_58_0/more/getting_started/unix-variants.html):
#include <boost/lambda/lambda.hpp>
#include <iostream>
#include <iterator>
#include <algorithm>

int main()
{
using namespace boost::lambda;
typedef std::istream_iterator<int> in;

std::for_each(
in(std::cin), in(), std::cout << (_1 * 3) << " " );
}

Copy this into a file called example1.cpp and compile with


g++ example1.cpp -o example

Provided you loaded the correct modules given above, the program should compile without error.
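
A quick way to check the resulting binary (this mirrors the test suggested in the Boost getting-started guide):

echo 1 2 3 4 5 | ./example

which should print 3 6 9 12 15.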

Linking to a Boost library The following program is taken from the official Boost documentation
http://www.boost.org/doc/libs/1_58_0/more/getting_started/unix-variants.html
#include <boost/regex.hpp>
#include <iostream>
#include <string>

int main()
{
std::string line;
boost::regex pat( "^Subject: (Re: |Aw: )*(.*)" );

while (std::cin)
{
std::getline(std::cin, line);
boost::smatch matches;
if (boost::regex_match(line, matches, pat))
std::cout << matches[2] << std::endl;
}
}

This program makes use of the Boost.Regex library, which has a separately-compiled binary component we need to
link to. Assuming that the above program is called example2.cpp, compile with the following command
g++ example2.cpp -o example2 -lboost_regex


If you get an error message that looks like this:


example2.cpp:1:27: error: boost/regex.hpp: No such file or directory
the most likely cause is that you forgot to load the correct modules as detailed above.
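
To try the compiled program out (this example input comes from the same Boost getting-started page):

echo "Subject: Re: Will Success Spoil Rock Hunter?" | ./example2

which should print Will Success Spoil Rock Hunter?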

Installation Notes This section is primarily for administrators of the system


version 1.59: Compiled with gcc 5.2 and icu version 55
module load compilers/gcc/5.2
module load libs/gcc/4.8.2/libunistring/0.9.5
module load libs/gcc/4.8.2/icu/55

mkdir -p /usr/local/packages6/libs/gcc/5.2/boost/1.59.0/
tar -xvzf ./boost_1_59_0.tar.gz
cd boost_1_59_0
./bootstrap.sh --prefix=/usr/local/packages6/libs/gcc/5.2/boost/1.59.0/

It complained that it could not find the icu library but when I ran
./b2 install --prefix=/usr/local/packages6/libs/gcc/5.2/boost/1.59.0/

It said that it had detected the icu library and was compiling it in.
Version 1.58: Compiled with gcc 4.8.2 and icu version 55
module load compilers/gcc/4.8.2
module load libs/gcc/4.8.2/libunistring/0.9.5
module load libs/gcc/4.8.2/icu/55
tar -xvzf ./boost_1_58_0.tar.gz
cd boost_1_58_0
./bootstrap.sh --prefix=/usr/local/packages6/libs/gcc/4.8.2/boost/1.58.0/

It complained that it could not find the icu library but when I ran
./b2 install --prefix=/usr/local/packages6/libs/gcc/4.8.2/boost/1.58.0

It said that it had detected the icu library and was compiling it in.
Version 1.41: This build of boost was built with gcc 4.4.7 and ICU version 42
module load libs/gcc/4.4.7/icu/42
tar -xvzf ./boost_1_41_0.tar.gz
cd boost_1_41_0
./bootstrap.sh --prefix=/usr/local/packages6/libs/gcc/4.4.7/boost/1.41
./bjam -sICU_PATH=/usr/local/packages6/libs/gcc/4.4.7/icu/42 install

Testing The two examples above were compiled and run.

Module Files Version 1.59


Module file location: /usr/local/modulefiles/libs/gcc/5.2/boost/1.59
#%Module1.0#####################################################################
##
## boost 1.59 module file
##


## Module file logging


source /usr/local/etc/module_logging.tcl
##

module load libs/gcc/4.8.2/libunistring/0.9.5


module load libs/gcc/4.8.2/icu/55

proc ModulesHelp { } {
puts stderr "Makes the Boost 1.59 library available"
}

set BOOST_DIR /usr/local/packages6/libs/gcc/5.2/boost/1.59.0

module-whatis "Makes the Boost 1.59 library available"

prepend-path LD_LIBRARY_PATH $BOOST_DIR/lib


prepend-path CPLUS_INCLUDE_PATH $BOOST_DIR/include
prepend-path LIBRARY_PATH $BOOST_DIR/lib

Version 1.58
Module file location: /usr/local/modulefiles/libs/gcc/4.8.2/boost/1.58
#%Module1.0#####################################################################
##
## boost 1.58 module file
##

## Module file logging


source /usr/local/etc/module_logging.tcl
##

module load libs/gcc/4.8.2/libunistring/0.9.5


module load libs/gcc/4.8.2/icu/55

proc ModulesHelp { } {
puts stderr "Makes the Boost 1.58 library available"
}

set BOOST_DIR /usr/local/packages6/libs/gcc/4.8.2/boost/1.58.0

module-whatis "Makes the Boost 1.58 library available"

prepend-path LD_LIBRARY_PATH $BOOST_DIR/lib


prepend-path CPLUS_INCLUDE_PATH $BOOST_DIR/include
prepend-path LIBRARY_PATH $BOOST_DIR/lib

Version 1.41
The module file is on the system at /usr/local/modulefiles/libs/gcc/4.4.7/boost/1.41
#%Module1.0#####################################################################
##
## Boost 1.41 module file
##

## Module file logging


source /usr/local/etc/module_logging.tcl
##


module load libs/gcc/4.4.7/icu/42

proc ModulesHelp { } {
puts stderr "Makes the Boost 1.41 library available"
}

set BOOST_DIR /usr/local/packages6/libs/gcc/4.4.7/boost/1.41

module-whatis "Makes the Boost 1.41 library available"

prepend-path LD_LIBRARY_PATH $BOOST_DIR/lib


prepend-path CPLUS_INCLUDE_PATH $BOOST_DIR/include
prepend-path LIBRARY_PATH $BOOST_DIR/lib

bzip2

bzip2 is a freely available, patent free (see below), high-quality data compressor. It typically compresses files to within
10% to 15% of the best available techniques (the PPM family of statistical compressors), whilst being around twice as
fast at compression and six times faster at decompression.

bzip2
Version 1.0.6
URL http://www.bzip.org/
Location /usr/local/packages6/libs/gcc/4.4.7/bzip2/1.0.6

Usage The default version of bzip2 on the system is version 1.0.5. If you need a newer version run the following
module command
module load libs/gcc/4.4.7/bzip2/1.0.6

Check the version number that’s available using


bzip2 --version
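
As an illustrative sketch (myfile.txt is a placeholder):

bzip2 myfile.txt           # compress; replaces myfile.txt with myfile.txt.bz2
bunzip2 myfile.txt.bz2     # decompress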

Documentation Standard man pages are available


man bzip2

Installation notes This section is primarily for administrators of the system.


It was built with gcc 4.4.7
wget http://www.bzip.org/1.0.6/bzip2-1.0.6.tar.gz
mkdir -p /usr/local/packages6/libs/gcc/4.4.7/bzip2/1.0.6
tar -xvzf ./bzip2-1.0.6.tar.gz
cd bzip2-1.0.6

make -f Makefile-libbz2_so
make
make install PREFIX=/usr/local/packages6/libs/gcc/4.4.7/bzip2/1.0.6
mv *.so* /usr/local/packages6/libs/gcc/4.4.7/bzip2/1.0.6/lib/


Testing The library is automatically tested when you do a make. The results were
Doing 6 tests (3 compress, 3 uncompress) ...
If there's a problem, things might stop at this point.

./bzip2 -1 < sample1.ref > sample1.rb2


./bzip2 -2 < sample2.ref > sample2.rb2
./bzip2 -3 < sample3.ref > sample3.rb2
./bzip2 -d < sample1.bz2 > sample1.tst
./bzip2 -d < sample2.bz2 > sample2.tst
./bzip2 -ds < sample3.bz2 > sample3.tst
cmp sample1.bz2 sample1.rb2
cmp sample2.bz2 sample2.rb2
cmp sample3.bz2 sample3.rb2
cmp sample1.tst sample1.ref
cmp sample2.tst sample2.ref
cmp sample3.tst sample3.ref

If you got this far and the 'cmp's didn't complain, it looks
like you're in business.

Module File Module location is /usr/local/modulefiles/libs/gcc/4.4.7/bzip2/1.0.6. Module contents


#%Module1.0#####################################################################
##
## bzip2 1.0.6 module file
##

## Module file logging


source /usr/local/etc/module_logging.tcl
##

proc ModulesHelp { } {
puts stderr "Makes the bzip 1.0.6 library available"
}

module-whatis "Makes the bzip 1.0.6 library available"

set BZIP_DIR /usr/local/packages6/libs/gcc/4.4.7/bzip2/1.0.6

prepend-path LD_LIBRARY_PATH $BZIP_DIR/lib


prepend-path CPATH $BZIP_DIR/include
prepend-path MANPATH $BZIP_DIR/man
prepend-path PATH $BZIP_DIR/bin

cfitsio

cfitsio
Latest version 3.380
URL http://heasarc.gsfc.nasa.gov/fitsio/fitsio.html

CFITSIO is a library of C and Fortran subroutines for reading and writing data files in FITS (Flexible Image Transport
System) data format. CFITSIO provides simple high-level routines for reading and writing FITS files that insulate the
programmer from the internal complexities of the FITS format.


Usage To make this library available, run the following module command
module load libs/gcc/5.2/cfitsio

The modulefile creates a variable $CFITSIO_INCLUDE_PATH which is the path to the include directory.

Installation notes This section is primarily for administrators of the system. CFITSIO 3.380 was compiled with gcc
5.2. The compilation used this script and it is loaded with this modulefile.

cgns

cgns
Version 3.2.1
Support Level Bronze
Dependencies libs/hdf5/gcc/openmpi/1.8.14
URL http://cgns.github.io/WhatIsCGNS.html
Location /usr/local/packages6/libs/gcc/4.4.7/cgnslib

The CFD General Notation System (CGNS) provides a general, portable, and extensible standard for the storage and
retrieval of computational fluid dynamics (CFD) analysis data.

Usage To make this library available, run the following module command
module load libs/gcc/4.4.7/cgns/3.2.1

This will also load the module files for the prerequisite libraries, Open MPI 1.8.3 and HDF5 1.8.14 with parallel
support.

Installing This section is primarily for administrators of the system.


• This is a prerequisite for Code Saturne version 4.0.
• It was built with gcc 4.4.7, openmpi 1.8.3 and hdf 1.8.14
module load libs/hdf5/gcc/openmpi/1.8.14
tar -xvzf cgnslib_3.2.1.tar.gz
mkdir /usr/local/packages6/libs/gcc/4.4.7/cgnslib
cd /usr/local/packages6/libs/gcc/4.4.7/cgnslib
mkdir 3.2.1
cd 3.2.1
cmake ~/cgnslib_3.2.1/
ccmake .

Configured the following using ccmake


CGNS_ENABLE_PARALLEL ON
MPIEXEC /usr/local/mpi/gcc/openmpi/1.8.3/bin/mpiexec
MPI_COMPILER /usr/local/mpi/gcc/openmpi/1.8.3/bin/mpic++
MPI_EXTRA_LIBRARY /usr/local/mpi/gcc/openmpi/1.8.3/lib/libmpi.s
MPI_INCLUDE_PATH /usr/local/mpi/gcc/openmpi/1.8.3/include
MPI_LIBRARY /usr/local/mpi/gcc/openmpi/1.8.3/lib/libmpi_c
ZLIB_LIBRARY /usr/lib64/libz.so

FORTRAN_NAMING LOWERCASE_


HDF5_INCLUDE_PATH /usr/local/packages6/hdf5/gcc-4.4.7/openmpi-1.8.3/hdf5-1.8.14/includ
HDF5_LIBRARY /usr/local/packages6/hdf5/gcc-4.4.7/openmpi-1.8.3/hdf5-1.8.14/lib/li
HDF5_NEED_MPI ON
HDF5_NEED_SZIP OFF
HDF5_NEED_ZLIB ON
CGNS_BUILD_CGNSTOOLS OFF
CGNS_BUILD_SHARED ON
CGNS_ENABLE_64BIT ON
CGNS_ENABLE_FORTRAN ON
CGNS_ENABLE_HDF5 ON
CGNS_ENABLE_SCOPING OFF
CGNS_ENABLE_TESTS ON
CGNS_USE_SHARED ON
CMAKE_BUILD_TYPE Release
CMAKE_INSTALL_PREFIX /usr/local/packages6/libs/gcc/4.4.7/cgnslib/3.2.1

Once the configuration was complete, I did


make
make install

Module File Module File Location: /usr/local/modulefiles/libs/gcc/4.4.7/cgns/3.2.1


#%Module1.0#####################################################################
##
## cgns 3.2.1 module file
##

## Module file logging


source /usr/local/etc/module_logging.tcl
##

proc ModulesHelp { } {
puts stderr "Makes the cgns 3.2.1 library available"
}

module-whatis "Makes the cgns 3.2.1 library available"


module load libs/hdf5/gcc/openmpi/1.8.14

set CGNS_DIR /usr/local/packages6/libs/gcc/4.4.7/cgnslib/3.2.1

prepend-path LD_LIBRARY_PATH $CGNS_DIR/lib


prepend-path CPATH $CGNS_DIR/include

CUDA

CUDA, which stands for Compute Unified Device Architecture, is a parallel computing platform and application
programming interface (API) model created by NVIDIA. It allows software developers to use a CUDA-enabled
graphics processing unit (GPU) for general purpose processing – an approach known as GPGPU.

Usage There are several versions of the CUDA library available. As with many libraries installed on the system,
CUDA libraries are made available via module commands which are only available once you have started a qrsh or
qsh session.
The latest version of CUDA is loaded with the command


module load libs/cuda

Alternatively, you can load a specific version with one of the following
module load libs/cuda/7.5.18
module load libs/cuda/6.5.14
module load libs/cuda/4.0.17
module load libs/cuda/3.2.16

Compiling the sample programs You do not need to be using a GPU-enabled node to compile the sample programs
but you do need a GPU to run them.
In a qrsh session
#Load modules
module load libs/cuda/7.5.18
module load compilers/gcc/4.9.2

#Copy CUDA samples to a local directory


#It will create a directory called NVIDIA_CUDA-7.5_Samples/
mkdir cuda_samples
cd cuda_samples
cp -r $CUDA_SDK .

#Compile (This will take a while)


cd NVIDIA_CUDA-7.5_Samples/
make

A basic test is to run one of the resulting binaries, deviceQuery.
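
For example (the path below is where the CUDA 7.5 samples place their binaries; it may differ slightly for other sample releases, and you need to be on a GPU-enabled node for the run itself):

cd bin/x86_64/linux/release
./deviceQuery

deviceQuery prints the properties of the GPUs it can see and finishes with Result = PASS on a healthy node.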

Documentation
• CUDA Toolkit Documentation
• The power of C++11 in CUDA 7

Determining the NVIDIA Driver version Run the command


cat /proc/driver/nvidia/version

Example output is
NVRM version: NVIDIA UNIX x86_64 Kernel Module 340.32 Tue Aug 5 20:58:26 PDT 2014
GCC version: gcc version 4.4.7 20120313 (Red Hat 4.4.7-16) (GCC

Installation notes These are primarily for system administrators


CUDA 7.5.18
The device drivers were updated separately by one of the sysadmins.
A binary install was performed using a .run file
mkdir -p /usr/local/packages6/libs/binlibs/CUDA/7.5.18/

chmod +x ./cuda_7.5.18_linux.run
./cuda_7.5.18_linux.run --toolkit --toolkitpath=/usr/local/packages6/libs/binlibs/CUDA/7.5.18/cuda --


Previous version
No install notes are available

Module Files
• The module file is on the system at /usr/local/modulefiles/libs/cuda/7.5.18
• The module file is on github.

curl

curl
Latest version 7.47.1
URL https://curl.haxx.se/

curl is an open source command line tool and library for transferring data with URL syntax, supporting DICT, FILE,
FTP, FTPS, Gopher, HTTP, HTTPS, IMAP, IMAPS, LDAP, LDAPS, POP3, POP3S, RTMP, RTSP, SCP, SFTP, SMB,
SMTP, SMTPS, Telnet and TFTP. curl supports SSL certificates, HTTP POST, HTTP PUT, FTP uploading, HTTP
form based upload, proxies, HTTP/2, cookies, user+password authentication (Basic, Plain, Digest, CRAM-MD5,
NTLM, Negotiate and Kerberos), file transfer resume, proxy tunneling and more.

Usage There is a default version of curl available on the system but it is rather old
curl-config --version

gives the result


libcurl 7.19.7

Version 7.19.7 was released in November 2009!


A newer version of the library is available via the module system. To make it available
module load libs/gcc/4.4.7/curl/7.47.1

The curl-config command will now report the newer version


curl-config --version

Should result in
libcurl 7.47.1
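
If you are building C code against this libcurl, the curl-config script can supply the right compiler and linker flags. A minimal sketch (my_prog.c is a placeholder):

gcc my_prog.c $(curl-config --cflags --libs) -o my_prog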

Documentation Standard man pages are available


man curl

Installation notes This section is primarily for administrators of the system.


Curl 7.47.1 was compiled with gcc 4.4.7
• Install script: install_curl_7.47.1.sh
• Module file: 7.47.1


fftw

fftw
Latest version 3.3.4
URL http://www.fftw.org/
Location /usr/local/packages6/libs/gcc/5.2/fftw/3.3.4

FFTW is a C subroutine library for computing the discrete Fourier transform (DFT) in one or more dimensions, of
arbitrary input size, and of both real and complex data (as well as of even/odd data, i.e. the discrete cosine/sine
transforms or DCT/DST).

Usage To make this library available, run the following module command
module load libs/gcc/5.2/fftw/3.3.4
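
A minimal sketch of compiling your own C code against this build is given below. The paths are the documented install location for this module; note that the module file exports CPLUS_INCLUDE_PATH rather than C_INCLUDE_PATH, so a plain C compile is shown with explicit paths:

FFTW_DIR=/usr/local/packages6/libs/gcc/5.2/fftw/3.3.4
gcc my_fft.c -I$FFTW_DIR/include -L$FFTW_DIR/lib -lfftw3 -lm -o my_fft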

Installation notes This section is primarily for administrators of the system. FFTW 3.3.4 was compiled with gcc
5.2
module load compilers/gcc/5.2
mkdir -p /usr/local/packages6/libs/gcc/5.2/fftw/3.3.4
tar -xvzf fftw-3.3.4.tar.gz
cd fftw-3.3.4
./configure --prefix=/usr/local/packages6/libs/gcc/5.2/fftw/3.3.4 --enable-threads --enable-openmp --
make
make check

Result was lots of numerical output and


--------------------------------------------------------------
FFTW transforms passed basic tests!
--------------------------------------------------------------

--------------------------------------------------------------
FFTW threaded transforms passed basic tests!
--------------------------------------------------------------

Installed with
make install

Module file Modulefile is on the system at /usr/local/modulefiles/libs/gcc/5.2/fftw/3.3.4


#%Module1.0#####################################################################
##
## fftw 3.3.4 module file
##

## Module file logging


source /usr/local/etc/module_logging.tcl
##

module load compilers/gcc/5.2

proc ModulesHelp { } {


puts stderr "Makes the FFTW 3.3.4 library available"


}

set FFTW_DIR /usr/local/packages6/libs/gcc/5.2/fftw/3.3.4

module-whatis "Makes the FFTW 3.3.4 library available"

prepend-path LD_LIBRARY_PATH $FFTW_DIR/lib


prepend-path CPLUS_INCLUDE_PATH $FFTW_DIR/include
prepend-path LIBRARY_PATH $FFTW_DIR/lib

FLTK

FLTK
Version 1.3.3
URL http://www.fltk.org/index.php
Location /usr/local/packages6/libs/gcc/5.2/fltk/1.3.3

FLTK (pronounced “fulltick”) is a cross-platform C++ GUI toolkit for UNIX®/Linux® (X11), Microsoft® Windows®,
and MacOS® X. FLTK provides modern GUI functionality without the bloat and supports 3D graphics via
OpenGL® and its built-in GLUT emulation.

Usage
module load libs/gcc/5.2/fltk/1.3.3

Installation notes This section is primarily for administrators of the system.


• This is a pre-requisite for GNU Octave version 4.0
• It was built with gcc 5.2
module load compilers/gcc/5.2
mkdir -p /usr/local/packages6/libs/gcc/5.2/fltk/1.3.3
#tar -xvzf ./fltk-1.3.3-source.tar.gz
cd fltk-1.3.3

#Fixes error relating to undefined _ZN18Fl_XFont_On_Demand5valueEv


#Source https://groups.google.com/forum/#!topic/fltkgeneral/GT6i2KGCb3A
sed -i 's/class Fl_XFont_On_Demand/class FL_EXPORT Fl_XFont_On_Demand/' FL/x.H

./configure --prefix=/usr/local/packages6/libs/gcc/5.2/fltk/1.3.3 --enable-shared --enable-xft


make
make install

Module File Modulefile at /usr/local/modulefiles/libs/gcc/5.2/fltk/1.3.3


#%Module1.0#####################################################################
##
## fltk 1.3.3 module file
##


## Module file logging


source /usr/local/etc/module_logging.tcl
##

module load compilers/gcc/5.2

proc ModulesHelp { } {
puts stderr "Makes the FLTK 1.3.3 library available"
}

set FLTK_DIR /usr/local/packages6/libs/gcc/5.2/fltk/1.3.3

module-whatis "Makes the FLTK 1.3.3 library available"

prepend-path LD_LIBRARY_PATH $FLTK_DIR/lib


prepend-path CPLUS_INCLUDE_PATH $FLTK_DIR/include
prepend-path LIBRARY_PATH $FLTK_DIR/lib
prepend-path PATH $FLTK_DIR/bin

geos

geos
Version 3.4.2
Support Level Bronze
Dependencies compilers/gcc/4.8.2
URL http://trac.osgeo.org/geos/
Location /usr/local/packages6/libs/gcc/4.8.2/geos/3.4.2

GEOS - Geometry Engine, Open Source

Usage To make this library available, run the following module commands
module load compilers/gcc/4.8.2
module load libs/gcc/4.8.2/geos/3.4.2

We load version 4.8.2 of gcc since gcc 4.8.2 was used to build this version of geos.

The rgeos interface in R rgeos is a CRAN package that provides an R interface to geos. It is not installed in R by
default so you need to install a version in your home directory.
After connecting to iceberg (see Connect to iceberg), start an interactive session with the qrsh or qsh command.
Run the following module commands
module load apps/R/3.2.0
module load compilers/gcc/4.8.2
module load libs/gcc/4.8.2/geos/3.4.2

Launch R and run the command


install.packages('rgeos')

If you’ve never installed an R package before on the system, it will ask you if you want to install to a personal library.
Answer y to any questions you are asked.


The library will be installed to a sub-directory called R in your home directory and you should only need to perform
the above procedure once.
Once you have performed the installation, you will only need to run the module commands above to make the geos
library available to the system. Then, you use rgeos as you would any other library in R
library('rgeos')

Installation notes This section is primarily for administrators of the system.


qrsh
tar -xvjf ./geos-3.4.2.tar.bz2
cd geos-3.4.2
mkdir -p /usr/local/packages6/libs/gcc/4.8.2/geos/3.4.2
module load compilers/gcc/4.8.2
./configure prefix=/usr/local/packages6/libs/gcc/4.8.2/geos/3.4.2

Potentially useful output at the end of the configure run


Swig: false
Python bindings: false
Ruby bindings: false
PHP bindings: false

Once the configuration was complete, I did


make
make install

Testing Compile and run the test-suite with


make check

All tests passed.

Module File Module File Location: /usr/local/modulefiles/libs/gcc/4.8.2/geos/3.4.2


more /usr/local/modulefiles/libs/gcc/4.8.2/geos/3.4.2
#%Module1.0#####################################################################
##
## geos 3.4.2 module file
##

## Module file logging


source /usr/local/etc/module_logging.tcl
##

proc ModulesHelp { } {
puts stderr "Makes the geos 3.4.2 library available"
}

set GEOS_DIR /usr/local/packages6/libs/gcc/4.8.2/geos/3.4.2

module-whatis "Makes the geos 3.4.2 library available"

prepend-path LD_LIBRARY_PATH $GEOS_DIR/lib


prepend-path PATH $GEOS_DIR/bin


HDF5

HDF5
Version 1.8.16, 1.8.15-patch1, 1.8.14 and 1.8.13
Dependencies gcc or pgi compiler, openmpi (optional)
URL http://www.hdfgroup.org/HDF5/
Documentation http://www.hdfgroup.org/HDF5/doc/

HDF5 is a data model, library, and file format for storing and managing data. It supports an unlimited variety of
datatypes, and is designed for flexible and efficient I/O and for high volume and complex data. HDF5 is portable and
is extensible, allowing applications to evolve in their use of HDF5. The HDF5 Technology suite includes tools and
applications for managing, manipulating, viewing, and analyzing data in the HDF5 format.
Two primary versions of this library are provided, MPI parallel enabled versions and serial versions.

Usage - Serial The serial versions were built with gcc version 4.8.2. As such, if you are going to build anything
against these versions of HDF5, we recommend that you use gcc 4.8.2 which can be enabled with the following module
command
module load compilers/gcc/4.8.2

To enable the serial version of HDF5, use one of the following module commands depending on which version of the
library you require:
module load libs/hdf5/gcc/1.8.14
module load libs/hdf5/gcc/1.8.13
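
A minimal sketch of building a serial C program against HDF5 is shown below; it assumes this build provides the standard h5cc compiler wrapper on your PATH (if it does not, pass the include and library paths to gcc explicitly):

h5cc my_h5_prog.c -o my_h5_prog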

Usage – Parallel There are multiple versions of parallel HDF5 installed with different openmpi and compiler ver-
sions.
Two versions of HDF5 were built using gcc version 4.4.7 and OpenMPI version 1.8.3. Version 4.4.7 of gcc is the
default compiler on the system so no module command is required for this
module load libs/hdf5/gcc/openmpi/1.8.14
module load libs/hdf5/gcc/openmpi/1.8.13

One version was built with the PGI compiler version 15.7 and openmpi version 1.8.8
module load libs/hdf5/pgi/1.8.15-patch1

The above module also loads the relevant modules for OpenMPI and PGI Compiler. To see which modules have been
loaded, use the command module list
Finally, another version was built with GCC 4.4.7 and OpenMPI 1.10.1; this version is also linked against ZLIB and
SZIP:
module load libs/gcc/4.4.7/openmpi/1.10.1/hdf5/1.8.16

Installation notes This section is primarily for administrators of the system.


Version 1.8.16 built using GCC Compiler, with separate ZLIB and SZIP
• install_hdf5.sh Install script
• 1.8.16


Version 1.8.15-patch1 built using PGI Compiler


Here are the build details for the module libs/hdf5/pgi/1.8.15-patch1
Compiled using PGI 15.7 and OpenMPI 1.8.8
• install_pgi_hdf5_1.8.15-patch1.sh Install script
• Modulefile located on the system at /usr/local/modulefiles/libs/hdf5/pgi/1.8.15-patch1
gcc versions
This package is built from the source code distribution from the HDF Group website.
Two primary versions of this library are provided, a MPI parallel enabled version and a serial version. The serial
version has the following configuration flags enabled:
--enable-fortran --enable-fortran2003 --enable-cxx --enable-shared

The parallel version has the following flags:


--enable-fortran --enable-fortran2003 --enable-shared --enable-parallel

The parallel library does not support C++, hence it being disabled for the parallel build.

icu - International Components for Unicode

icu
Latest Version 55
URL http://site.icu-project.org/

ICU is the premier library for software internationalization, used by a wide array of companies and organizations

Usage Version 55 of the icu library requires gcc version 4.8.2. To make the compiler and library available, run the
following module commands
module load compilers/gcc/4.8.2
module load libs/gcc/4.8.2/icu/55

Version 42 of the icu library uses gcc version 4.4.7 which is the default on Iceberg. To make the library available, run
the following command
module load libs/gcc/4.4.7/icu/42
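
Once one of the modules is loaded, C++ code can be linked against ICU. A minimal sketch (the library names are the standard ICU components; link only the ones you actually use):

g++ my_unicode_prog.cpp -licuuc -licui18n -o my_unicode_prog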

Installation Notes This section is primarily for administrators of the system.


Version 55
Icu 55 is a pre-requisite for the version of boost required for an experimental R module used by one of our users.
module load compilers/gcc/4.8.2
tar -xvzf icu4c-55_1-src.tgz
cd icu/source
./runConfigureICU Linux/gcc --prefix=/usr/local/packages6/libs/gcc/4.8.2/icu/55/
make


Version 42
Icu version 42 was originally installed as a system RPM. This install moved icu to a module-based install
tar -xvzf icu4c-4_2_1-src.tgz
cd icu/source/
./runConfigureICU Linux/gcc --prefix=/usr/local/packages6/libs/gcc/4.4.7/icu/42
make
make install

Testing Version 55
make check
Last few lines of output were
[All tests passed successfully...]
Elapsed Time: 00:00:00.086
make[2]: Leaving directory `/home/fe1mpc/icu/icu/source/test/letest'
---------------
ALL TESTS SUMMARY:
All tests OK: testdata intltest iotest cintltst letest
make[1]: Leaving directory `/home/fe1mpc/icu/icu/source/test'
make[1]: Entering directory `/home/fe1mpc/icu/icu/source'
verifying that icu-config --selfcheck can operate
verifying that make -f Makefile.inc selfcheck can operate
PASS: config selfcheck OK
rm -rf test-local.xml

Version 42
make check
Last few lines of output were
All tests passed successfully...]
Elapsed Time: 00:00:12.000
make[2]: Leaving directory `/home/fe1mpc/icu/source/test/cintltst'
---------------
ALL TESTS SUMMARY:
ok: testdata iotest cintltst
===== ERRS: intltest
make[1]: *** [check-recursive] Error 1
make[1]: Leaving directory `/home/fe1mpc/icu/source/test'
make: *** [check-recursive] Error 2

The error can be ignored since it is a bug in the test itself:


• http://sourceforge.net/p/icu/mailman/message/32443311/

Module Files Version 55 Module File Location: /usr/local/modulefiles/libs/gcc/4.8.2/icu/55


#%Module1.0#####################################################################
##
## icu 55 module file
##

## Module file logging


source /usr/local/etc/module_logging.tcl
##


proc ModulesHelp { } {
puts stderr "Makes the icu library available"
}

set ICU_DIR /usr/local/packages6/libs/gcc/4.8.2/icu/55

module-whatis "Makes the icu library available"

prepend-path LD_LIBRARY_PATH $ICU_DIR/lib


prepend-path LIBRARY_PATH $ICU_DIR/lib
prepend-path CPLUS_INCLUDE_PATH $ICU_DIR/include

Version 42 Module File Location: /usr/local/modulefiles/libs/gcc/4.4.7/icu/42


#%Module1.0#####################################################################
##
## icu 42 module file
##

## Module file logging


source /usr/local/etc/module_logging.tcl
##

proc ModulesHelp { } {
puts stderr "Makes the icu library available"
}

set ICU_DIR /usr/local/packages6/libs/gcc/4.4.7/icu/42

module-whatis "Makes the icu library available"

prepend-path LD_LIBRARY_PATH $ICU_DIR/lib


prepend-path LIBRARY_PATH $ICU_DIR/lib
prepend-path CPLUS_INCLUDE_PATH $ICU_DIR/include

libunistring

libunistring
Latest version 0.9.5
URL http://www.gnu.org/software/libunistring/#TOCdownloading/

Text files are nowadays usually encoded in Unicode, and may consist of very different scripts – from Latin letters
to Chinese Hanzi –, with many kinds of special characters – accents, right-to-left writing marks, hyphens, Roman
numbers, and much more. But the POSIX platform APIs for text do not contain adequate functions for dealing with
particular properties of many Unicode characters. In fact, the POSIX APIs for text have several assumptions at their
base which don’t hold for Unicode text.
This library provides functions for manipulating Unicode strings and for manipulating C strings according to the
Unicode standard.

Usage To make this library available, run the following module command


module load libs/gcc/4.8.2/libunistring/0.9.5

This correctly populates the environment variables LD_LIBRARY_PATH, LIBRARY_PATH and
CPLUS_INCLUDE_PATH.

Installation notes This section is primarily for administrators of the system. libunistring 0.9.5 was compiled with
gcc 4.8.2
module load compilers/gcc/4.8.2
tar -xvzf ./libunistring-0.9.5.tar.gz
cd ./libunistring-0.9.5
./configure --prefix=/usr/local/packages6/libs/gcc/4.8.2/libunistring/0.9.5

#build
make
#Testing
make check
#Install
make install

Testing Run make check after make and before make install. This runs the test suite.
Results were
============================================================================
Testsuite summary for
============================================================================
# TOTAL: 492
# PASS: 482
# SKIP: 10
# XFAIL: 0
# FAIL: 0
# XPASS: 0
# ERROR: 0
============================================================================

Module file The module file is on the system at /usr/local/modulefiles/libs/gcc/4.8.2/libunistring/0.9.5


#%Module1.0#####################################################################
##
## libunistring 0.9.5 module file
##

## Module file logging


source /usr/local/etc/module_logging.tcl
##

proc ModulesHelp { } {
puts stderr "Makes the libunistring library available"
}

set LIBUNISTRING_DIR /usr/local/packages6/libs/gcc/4.8.2/libunistring/0.9.5

module-whatis "Makes the libunistring library available"


prepend-path LD_LIBRARY_PATH $LIBUNISTRING_DIR/lib/


prepend-path LIBRARY_PATH $LIBUNISTRING_DIR/lib
prepend-path CPLUS_INCLUDE_PATH $LIBUNISTRING_DIR/include

MED

MED
Version 3.0.8
Support Level Bronze
URL http://www.salome-platform.org/downloads/current-version
Location /usr/local/packages6/libs/gcc/4.4.7/med/3.0.8

The purpose of the MED module is to provide a standard for storing and recovering computer data associated to
numerical meshes and fields, and to facilitate the exchange between codes and solvers.

Usage To make this library available, run the following module command
module load libs/gcc/4.4.7/med/3.0.8

Installation notes This section is primarily for administrators of the system.


• This is a pre-requisite for Code Saturne version 4.0.
• It was built with gcc 4.4.7, openmpi 1.8.3 and hdf5 1.8.14
module load mpi/gcc/openmpi/1.8.3
tar -xvzf med-3.0.8.tar.gz
cd med-3.0.8
mkdir -p /usr/local/packages6/libs/gcc/4.4.7/med/3.0.8
./configure --prefix=/usr/local/packages6/libs/gcc/4.4.7/med/3.0.8 --disable-fortran --with-hdf5=/usr
make
make install

Fortran was disabled because otherwise the build failed with compilation errors. It’s not needed for Code Saturne 4.0.
Python was disabled because it didn’t have MPI support.

Testing The following was submitted as an SGE job from the med-3.0.8 build directory
#!/bin/bash

#$ -pe openmpi-ib 8
#$ -l mem=6G

module load mpi/gcc/openmpi/1.8.3


make check

All tests passed

Module File


#%Module1.0#####################################################################
##
## MED 3.0.8 module file
##

## Module file logging


source /usr/local/etc/module_logging.tcl
##

proc ModulesHelp { } {
puts stderr "Makes the MED 3.0.8 library available"
}

module-whatis "Makes the MED 3.0.8 library available"

set MED_DIR /usr/local/packages6/libs/gcc/4.4.7/med/3.0.8

prepend-path LD_LIBRARY_PATH $MED_DIR/lib64


prepend-path CPATH $MED_DIR/include

NAG Fortran Library (Serial)

Produced by experts for use in a variety of applications, the NAG Fortran Library has a global reputation for its
excellence and, with hundreds of fully documented and tested routines, is the largest collection of mathematical and
statistical algorithms available.
This is the serial (1 CPU core) version of the NAG Fortran Library. For many routines, you may find it beneficial to
use the parallel version of the library.

Usage There are several versions of the NAG Fortran Library available. The one you choose depends on which
compiler you are using. As with many libraries installed on the system, NAG libraries are made available via module
commands which are only available once you have started a qrsh or qsh session.
In addition to loading a module for the library, you will usually need to load a module for the compiler you are using.
NAG for Intel Fortran
Use the following command to make Mark 25 of the serial (1 CPU core) version of the NAG Fortran Library for Intel
compilers available
module load libs/intel/15/NAG/fll6i25dcl

Once you have ensured that you have loaded the module for the Intel Compilers you can compile your NAG program
using
ifort your_code.f90 -lnag_mkl -o your_code.exe

which links to a version of the NAG library that’s linked against the high performance Intel MKL (which provides
high-performance versions of the BLAS and LAPACK libraries). Alternatively, you can compile using
ifort your_code.f90 -lnag_nag -o your_code.exe

Which is linked against a reference version of BLAS and LAPACK. If you are in any doubt as to which to choose, we
suggest that you use -lnag_mkl


Running NAG’s example programs Most of NAG’s routines come with example programs that show how to use
them. When you use the module command to load a version of the NAG library, the script nag_example becomes
available. Providing this script with the name of the NAG routine you are interested in will copy the example program
for that routine into your current working directory, then compile and run it.
For example, here is an example output for the NAG routine a00aaf which identifies the version of the NAG library
you are using. If you try this yourself, the output you get will vary according to which version of the NAG library you
are using
nag_example a00aaf

If you have loaded the module for fll6i25dcl this will give the following output
Copying a00aafe.f90 to current directory
cp /usr/local/packages6/libs/intel/15/NAG/fll6i25dcl/examples/source/a00aafe.f90 .

Compiling and linking a00aafe.f90 to produce executable a00aafe.exe


ifort -I/usr/local/packages6/libs/intel/15/NAG/fll6i25dcl/nag_interface_blocks a00aafe.f90 /usr/local

Running a00aafe.exe
./a00aafe.exe > a00aafe.r
A00AAF Example Program Results

*** Start of NAG Library implementation details ***

Implementation title: Linux 64 (Intel 64 / AMD64), Intel Fortran, Double Precision (32-bit integers)
Precision: FORTRAN double precision
Product Code: FLL6I25DCL
Mark: 25.1.20150610 (self-contained)

*** End of NAG Library implementation details ***

Functionality The key numerical and statistical capabilities of the Fortran Library are shown below.
• Click here for a complete list of the contents of the Library.
• Click here to see what’s new in Mark 25 of the library.
Numerical Facilities
• Optimization, both Local and Global
• Linear, quadratic, integer and nonlinear programming and least squares problems
• Ordinary and partial differential equations, and mesh generation
• Solution of dense, banded and sparse linear equations and eigenvalue problems
• Solution of linear and nonlinear least squares problems
• Curve and surface fitting and interpolation
• Special functions
• Numerical integration and integral equations
• Roots of nonlinear equations (including polynomials)
• Option Pricing Formulae
• Wavelet Transforms
Statistical Facilities


• Random number generation


• Simple calculations on statistical data
• Correlation and regression analysis
• Multivariate methods
• Analysis of variance and contingency table analysis
• Time series analysis
• Nonparametric statistics

Documentation
• The NAG Fortran Library Manual (Link to NAG’s website)

Installation notes fll6i25dcl


These are primarily for system administrators
tar -xvzf ./fll6i25dcl.tgz
./install.sh

The installer is interactive. Answer the installer questions as follows


Do you wish to install NAG Mark 25 Library? (yes/no):
yes

License file gets shown


[accept/decline]? :
accept

Where do you want to install the NAG Fortran Library Mark 25?
Press return for default location (/opt/NAG)
or enter an alternative path.
The directory will be created if it does not already exist.
>
/usr/local/packages6/libs/intel/15/NAG/

Module Files fll6i25dcl


• The module file is on the system at /usr/local/modulefiles/libs/intel/15/NAG/fll6i25dcl
• The module file is on github.

NetCDF (fortran)

NetCDF
Latest version 4.4.3
Dependencies mpi/gcc/openmpi/1.10.1 libs/gcc/4.4.7/openmpi/1.10.1/hdf5/1.8.16
URL http://www.unidata.ucar.edu/software/netcdf/
Documentation http://www.unidata.ucar.edu/software/netcdf/docs/


NetCDF is a set of software libraries and self-describing, machine-independent data formats that support the creation,
access, and sharing of array-oriented scientific data. NetCDF was developed and is maintained at Unidata. Unidata
provides data and software tools for use in geoscience education and research. Unidata is part of the University
Corporation for Atmospheric Research (UCAR) Community Programs (UCP). Unidata is funded primarily by the
National Science Foundation.

Usage This version of NetCDF was compiled using version 4.4.7 of the gcc compiler, openmpi 1.10.1 and HDF5
1.8.16. To make it available, run the following module command after starting a qsh or qrsh session
module load libs/gcc/4.4.7/openmpi/1.10.1/netcdf-fortran/4.4.3

This module also loads the relevant OpenMPI and HDF5 modules. To see which modules have been loaded, use the
command module list
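As a rough sketch of compiling against this build (my_netcdf_prog.f90 is a placeholder, and this assumes the nf-config helper shipped with netcdf-fortran is on your PATH once the module is loaded; if it is not, pass the include and library paths explicitly):

# Inside a qrsh/qsh session
module load libs/gcc/4.4.7/openmpi/1.10.1/netcdf-fortran/4.4.3

# nf-config (assumed to be provided by this install) reports the
# compile and link flags for the Fortran interface
mpif90 my_netcdf_prog.f90 $(nf-config --fflags) $(nf-config --flibs) -o my_netcdf_prog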

Installation notes This section is primarily for administrators of the system.


Version 4.4.3
• install_gcc_netcdf-fortran_4.4.3.sh Install script
• Modulefile.

NetCDF (PGI build)

NetCDF
Latest version 4.3.3.1
Dependencies PGI Compiler 15.7, PGI OpenMPI 1.8.8, PGI HDF5 1.8.15-patch1
URL http://www.unidata.ucar.edu/software/netcdf/
Documentation http://www.unidata.ucar.edu/software/netcdf/docs/

NetCDF is a set of software libraries and self-describing, machine-independent data formats that support the creation,
access, and sharing of array-oriented scientific data. NetCDF was developed and is maintained at Unidata. Unidata
provides data and software tools for use in geoscience education and research. Unidata is part of the University
Corporation for Atmospheric Research (UCAR) Community Programs (UCP). Unidata is funded primarily by the
National Science Foundation.

Usage This version of NetCDF was compiled using version 15.7 of the PGI Compiler, OpenMPI 1.8.8 and HDF5
1.8.15-patch1. To make it available, run the following module command after starting a qsh or qrsh session
module load libs/pgi/netcdf/4.3.3.1

This module also loads the relevant modules for OpenMPI, PGI Compiler and HDF5. To see which modules have
been loaded, use the command module list

Installation notes This section is primarily for administrators of the system.


Version 4.3.3.1
Compiled using PGI 15.7, OpenMPI 1.8.8 and HDF5 1.8.15-patch1
• install_pgi_netcdf_4.3.3.1.sh Install script
• Install logs are on the system at /usr/local/packages6/libs/pgi/netcdf/4.3.3.1/install_logs


• Modulefile located on the system at /usr/local/modulefiles/libs/pgi/netcdf/4.3.3.1

pcre

pcre
Version 8.37
Support Level Bronze
Dependencies None
URL http://www.pcre.org/
Location /usr/local/packages6/libs/gcc/4.4.7/pcre/8.37

The PCRE library is a set of functions that implement regular expression pattern matching using the same syntax and
semantics as Perl 5. PCRE has its own native API, as well as a set of wrapper functions that correspond to the POSIX
regular expression API. The PCRE library is free, even for building proprietary software.

Usage To make this library available, run the following module command after starting a qsh or qrsh session.
module load libs/gcc/4.4.7/pcre/8.37

This also makes the updated pcregrep command available and will replace the system version. Check the version
you are using with pcregrep -V
pcregrep -V

pcregrep version 8.37 2015-04-28
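As a short sketch, a C program using PCRE could be compiled as follows (my_regex.c is a placeholder; pcre-config is part of the PCRE installation and should be on your PATH once the module is loaded):

# Inside a qrsh/qsh session
module load libs/gcc/4.4.7/pcre/8.37

# pcre-config reports the include and library flags for this install
gcc my_regex.c $(pcre-config --cflags) $(pcre-config --libs) -o my_regex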

Installation notes This section is primarily for administrators of the system.


qrsh
tar -xvzf pcre-8.37.tar.gz
cd pcre-8.37
mkdir -p /usr/local/packages6/libs/gcc/4.4.7/pcre/8.37
./configure --prefix=/usr/local/packages6/libs/gcc/4.4.7/pcre/8.37

The configuration details were


pcre-8.37 configuration summary:

Install prefix .................. : /usr/local/packages6/libs/gcc/4.4.7/pcre/8.37


C preprocessor .................. : gcc -E
C compiler ...................... : gcc
C++ preprocessor ................ : g++ -E
C++ compiler .................... : g++
Linker .......................... : /usr/bin/ld -m elf_x86_64
C preprocessor flags ............ :
C compiler flags ................ : -g -O2 -fvisibility=hidden
C++ compiler flags .............. : -O2 -fvisibility=hidden -fvisibility-inlines-hidden
Linker flags .................... :
Extra libraries ................. :

Build 8 bit pcre library ........ : yes


Build 16 bit pcre library ....... : no
Build 32 bit pcre library ....... : no


Build C++ library ............... : yes


Enable JIT compiling support .... : no
Enable UTF-8/16/32 support ...... : no
Unicode properties .............. : no
Newline char/sequence ........... : lf
\R matches only ANYCRLF ......... : no
EBCDIC coding ................... : no
EBCDIC code for NL .............. : n/a
Rebuild char tables ............. : no
Use stack recursion ............. : yes
POSIX mem threshold ............. : 10
Internal link size .............. : 2
Nested parentheses limit ........ : 250
Match limit ..................... : 10000000
Match limit recursion ........... : MATCH_LIMIT
Build shared libs ............... : yes
Build static libs ............... : yes
Use JIT in pcregrep ............. : no
Buffer size for pcregrep ........ : 20480
Link pcregrep with libz ......... : no
Link pcregrep with libbz2 ....... : no
Link pcretest with libedit ...... : no
Link pcretest with libreadline .. : no
Valgrind support ................ : no
Code coverage ................... : no

Once the configuration was complete, I did


make
make install

Testing was performed and all tests passed


make check

============================================================================
Testsuite summary for PCRE 8.37
============================================================================
# TOTAL: 5
# PASS: 5
# SKIP: 0
# XFAIL: 0
# FAIL: 0
# XPASS: 0
# ERROR: 0
============================================================================

Module File Module File Location: /usr/local/modulefiles/libs/gcc/4.4.7/pcre/8.37


#%Module1.0#####################################################################
##
## pcre 8.37 module file
##

## Module file logging


source /usr/local/etc/module_logging.tcl
##


proc ModulesHelp { } {
puts stderr "Makes the pcre 8.37 library available"
}

module-whatis "Makes the pcre 8.37 library available"

set PCRE_DIR /usr/local/packages6/libs/gcc/4.4.7/pcre/8.37

prepend-path LD_LIBRARY_PATH $PCRE_DIR/lib


prepend-path CPATH $PCRE_DIR/include
prepend-path PATH $PCRE_DIR/bin

SCOTCH

SCOTCH
Latest version 6.0.4
URL http://www.labri.fr/perso/pelegrin/scotch/
Location /usr/local/packages6/libs/gcc/5.2/scotch/6.0.4

Software package and libraries for sequential and parallel graph partitioning, static mapping and clustering, sequential
mesh and hypergraph partitioning, and sequential and parallel sparse matrix block ordering.

Usage To make this library available, run the following module command
module load libs/gcc/5.2/scotch/6.0.4

Installation notes This section is primarily for administrators of the system. SCOTCH 6.0.4 was compiled with gcc
5.2 using the bash file install_scotch.sh

SimBody

SimBody
Version 3.5.3
Support Level Bronze
Dependencies compilers/gcc/4.8.2
URL https://simtk.org/home/simbody
Location /usr/local/packages6/libs/gcc/4.8.2/simbody/3.5.3

Usage To make this library available, run the following module command
module load libs/gcc/4.8.2/simbody

Installing This section is primarily for administrators of the system.


Installed using the following procedure:


module load compilers/gcc/4.8.2

cmake ../simbody-Simbody-3.5.3/ -DCMAKE_INSTALL_PREFIX=/usr/local/packages6/libs/gcc/4.8.2/simbody/3.

make -j 8

zlib

zlib
Version 1.2.8
URL http://zlib.net/
Location /usr/local/packages6/libs/gcc/4.4.7/zlib/1.2.8

A Massively Spiffy Yet Delicately Unobtrusive Compression Library.

Usage To make this library available, run the following module command
module load libs/gcc/4.4.7/zlib/1.2.8
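For example, a minimal sketch of compiling a C program against this build (my_prog.c is a placeholder; the explicit -I/-L flags point at the install location given above):

# Inside a qrsh/qsh session
module load libs/gcc/4.4.7/zlib/1.2.8

ZLIB_DIR=/usr/local/packages6/libs/gcc/4.4.7/zlib/1.2.8
gcc my_prog.c -I$ZLIB_DIR/include -L$ZLIB_DIR/lib -lz -o my_prog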

Installation notes This section is primarily for administrators of the system.


It was built with gcc 4.4.7
#!/bin/bash

install_dir=/usr/local/packages6/libs/gcc/4.4.7/zlib/1.2.8

wget http://zlib.net/zlib-1.2.8.tar.gz
tar -xvzf ./zlib-1.2.8.tar.gz
cd zlib-1.2.8

mkdir -p $install_dir

./configure --prefix=$install_dir
make
make install

Testing The library was tested with make check. The results were
make check

hello world
zlib version 1.2.8 = 0x1280, compile flags = 0xa9
uncompress(): hello, hello!
gzread(): hello, hello!
gzgets() after gzseek: hello!
inflate(): hello, hello!
large_inflate(): OK
after inflateSync(): hello, hello!
inflate with dictionary: hello, hello!
*** zlib test OK ***
hello world


zlib version 1.2.8 = 0x1280, compile flags = 0xa9


uncompress(): hello, hello!
gzread(): hello, hello!
gzgets() after gzseek: hello!
inflate(): hello, hello!
large_inflate(): OK
after inflateSync(): hello, hello!
inflate with dictionary: hello, hello!
*** zlib shared test OK ***
hello world
zlib version 1.2.8 = 0x1280, compile flags = 0xa9
uncompress(): hello, hello!
gzread(): hello, hello!
gzgets() after gzseek: hello!
inflate(): hello, hello!
large_inflate(): OK
after inflateSync(): hello, hello!
inflate with dictionary: hello, hello!
*** zlib 64-bit test OK ***

Module File Module location is /usr/local/modulefiles/libs/gcc/4.4.7/zlib/1.2.8. Module contents


#%Module1.0#####################################################################
##
## zlib 1.2.8 module file
##

## Module file logging


source /usr/local/etc/module_logging.tcl
##

proc ModulesHelp { } {
puts stderr "Makes the zlib 1.2.8 library available"
}

module-whatis "Makes the zlib 1.2.8 library available"

set ZLIB_DIR /usr/local/packages6/libs/gcc/4.4.7/zlib/1.2.8

prepend-path LD_LIBRARY_PATH $ZLIB_DIR/lib


prepend-path CPATH $ZLIB_DIR/include
prepend-path MANPATH $ZLIB_DIR/share/man

Development Tools

CMake

CMake is a build tool commonly used when compiling other libraries.


CMake is installed in /usr/local/packages6/cmake.

Usage CMake can be loaded with:


module load compilers/cmake
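A typical out-of-source build with the loaded CMake might look like the following sketch (~/my_project is a placeholder for a source tree containing a CMakeLists.txt):

module load compilers/cmake

# Build in a separate directory so the source tree stays clean
mkdir -p ~/my_project/build
cd ~/my_project/build
cmake ..
make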


Installation Run the following commands:


module load apps/python/2.7

./bootstrap --prefix=/usr/local/packages6/cmake/3.3.0/ \
    --mandir=/usr/local/packages6/cmake/3.3.0/man --sphinx-man

gmake -j8

gmake install

GNU Compiler Collection (gcc)

The GNU Compiler Collection (gcc) is a widely used, free collection of compilers including C (gcc), C++ (g++) and
Fortran (gfortran). The default version of gcc on the system is 4.4.7
gcc -v

Using built-in specs.


Target: x86_64-redhat-linux
Configured with: ../configure --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --with-
Thread model: posix
gcc version 4.4.7 20120313 (Red Hat 4.4.7-11) (GCC)

It is possible to switch to other versions of the gcc compiler suite using modules. After connecting to iceberg (see
Connect to iceberg), start an interactive session with the qrsh or qsh command. Choose the version of the compiler
you wish to use using one of the following commands
module load compilers/gcc/6.2
module load compilers/gcc/5.4
module load compilers/gcc/5.3
module load compilers/gcc/5.2
module load compilers/gcc/4.9.2
module load compilers/gcc/4.8.2
module load compilers/gcc/4.5.3

Alternatively load the most recent available version using


module load compilers/gcc

Confirm that you’ve loaded the version of gcc you wanted using gcc -v.
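For example, a quick compile-and-run check with a newer gcc (hello.c being any small test program of your own):

# Inside a qrsh/qsh session
module load compilers/gcc/6.2
gcc -v                      # confirm the loaded version
gcc -O2 hello.c -o hello
./hello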

Documentation man pages are available on the system. Once you have loaded the required version of gcc, type
man gcc

• What’s new in the gcc version 6 series?


• What’s new in the gcc version 5 series?

Installation Notes These notes are primarily for system administrators


• gcc version 6.2 was installed using :
– install_gcc_6.2.sh
– gcc 6.2 modulefile located on the system at /usr/local/modulefiles/compilers/gcc/6.2
• gcc version 5.4 was installed using :


– install_gcc_5.4.sh
– gcc 5.4 modulefile located on the system at /usr/local/modulefiles/compilers/gcc/5.4
• Installation notes for version 5.3 are not available.
• gcc version 5.2 was installed using :
– install_gcc_5.2.sh
– gcc 5.2 modulefile located on the system at /usr/local/modulefiles/compilers/gcc/5.2
• gcc version 4.9.2 was installed using :
– install_gcc_4.9.2.sh
– gcc 4.9.2 modulefile located on the system at /usr/local/modulefiles/compilers/gcc/4.9.2
• Installation notes for versions 4.8.2 and below are not available.

git

git
Latest version 2.5
Dependencies gcc 5.2
URL https://git-scm.com/

Git is a free and open source distributed version control system designed to handle everything from small to very large
projects with speed and efficiency.

Usage An old version of git is installed as part of the operating system. As such, it is available everywhere,
including on the log-in nodes
$ git --version
git version 1.7.1

This was released in April 2010. We recommend that you load the most up to date version using modules - something
that can only be done after starting an interactive qrsh or qsh session
module load apps/gcc/5.2/git/2.5
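For example, after loading the module you can confirm that the newer version is picked up and clone a repository (the URL below is only a placeholder for your own project):

module load apps/gcc/5.2/git/2.5
git --version                # should now report version 2.5
git clone https://github.com/<user>/<repo>.git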

Installation notes Version 2.5 of git was installed using gcc 5.2 using the following install script and module file:
• install_git_2.5.sh
• git 2.5 modulefile located on the system at /usr/local/modulefiles/compilers/git/2.5

Intel Compilers

Intel Compilers help create C, C++ and Fortran applications that can take full advantage of the advanced hardware
capabilities available in Intel processors and co-processors. They also simplify that development by providing high
level parallel models and built-in features like explicit vectorization and optimization reports.


Making the Intel Compilers available After connecting to iceberg (see Connect to iceberg), start an interactive
session with the qsh or qrsh command. To make one of the versions of the Intel Compiler available, run one of the
following module commands
module load compilers/intel/15.0.3
module load compilers/intel/14.0
module load compilers/intel/12.1.15
module load compilers/intel/11.0

Compilation examples C
To compile the C hello world example into an executable called hello using the Intel C compiler
icc hello.c -o hello

C++
To compile the C++ hello world example into an executable called hello using the Intel C++ compiler
icpc hello.cpp -o hello

Fortran
To compile the Fortran hello world example into an executable called hello using the Intel Fortran compiler
ifort hello.f90 -o hello

Detailed Documentation Once you have loaded the module on Iceberg, man pages are available for Intel compiler
products
man ifort
man icc

The following links are to Intel’s website


• User and Reference Guide for the Intel® C++ Compiler 15.0
• User and Reference Guide for the Intel® Fortran Compiler 15.0
• Step by Step optimizing with Intel C++ Compiler

Related Software on the system Users of the Intel Compilers may also find the following useful:
• NAG Fortran Library (Serial) - A library of over 1800 numerical and statistical functions.

Installation Notes The following notes are primarily for system administrators.
• Version 15.0.3
– The install is located on the system at /usr/local/packages6/compilers/intel/2015/
– The license file is at /usr/local/packages6/compilers/intel/license.lic
– The environment variable INTEL_LICENSE_FILE is set by the environment module and points to the
license file location
– Download the files l_ccompxe_2015.3.187.tgz (C/C++) and l_fcompxe_2015.3.187.tgz
(Fortran) from Intel Portal.
– Put the above .tgz files in the same directory as install_intel15.sh and silent_master.cfg


– Run install_intel15.sh
– To find what was required in the module file, I did
env > base.env
source /usr/local/packages6/compilers/intel/2015/composer_xe_2015.3.187/bin/compilervars.sh
env > after_intel.env
diff base.env after_intel.env

– The module file is on iceberg at /usr/local/modulefiles/compilers/intel/15.0.3

Version 14 and below Installation notes are not available for these older versions of the Intel Compiler.

NAG Fortran Compiler

The NAG Fortran Compiler is robust, highly tested, and valued by developers all over the globe for its checking capa-
bilities and detailed error reporting. The Compiler is available on Unix platforms as well as for Microsoft Windows
and Apple Mac platforms. Release 6.0 has extensive support for both legacy and modern Fortran features, and also
supports parallel programming with OpenMP.

Making the NAG Compiler available After connecting to iceberg (see Connect to iceberg), start an interactive
session with the qsh or qrsh command. To make the NAG Fortran Compiler available, run the following module
command
module load compilers/NAG/6.0

Compilation examples To compile the Fortran hello world example into an executable called hello using the
NAG compiler
nagfor hello.f90 -o hello

Detailed Documentation Once you’ve run the NAG Compiler module command, man documentation is available
man nagfor

Online documentation:
• PDF version of the NAG Fortran Compiler Manual
• NAG Fortran Compiler Documentation Index (NAG’s Website)

Installation Notes The following notes are primarily for system administrators
mkdir -p /usr/local/packages6/compilers/NAG/6.0
tar -xvzf ./npl6a60na_amd64.tgz
cd NAG_Fortran-amd64/

Run the interactive install script


./INSTALL.sh

Accept the license and answer the questions as follows


Install compiler binaries to where [/usr/local/bin]?


/usr/local/packages6/compilers/NAG/6.0

Install compiler library files to where [/usr/local/lib/NAG_Fortran]?


/usr/local/packages6/compilers/NAG/6.0/lib/NAG_Fortran

Install compiler man page to which directory [/usr/local/man/man1]?


/usr/local/packages6/compilers/NAG/6.0/man/man1

Suffix for compiler man page [1] (i.e. nagfor.1)?


Press enter

Install module man pages to which directory [/usr/local/man/man3]?


/usr/local/packages6/compilers/NAG/6.0/man/man3

Suffix for module man pages [3] (i.e. f90_gc.3)?


Press Enter

Licensing Add the license key to /usr/local/packages5/nag/license.lic


The license key needs to be updated annually before 31st July.
Point to this license file using the environment variable $NAG_KUSARI_FILE
This is done in the environment module

Module File Module file location is /usr/local/modulefiles/compilers/NAG/6.0


#%Module1.0#####################################################################
##
## NAG Fortran Compiler 6.0 module file
##

## Module file logging


source /usr/local/etc/module_logging.tcl
##

proc ModulesHelp { } {
puts stderr "Makes the NAG Fortran Compiler v6.0 available"
}

module-whatis "Makes the NAG Fortran Compiler v6.0 available"

set NAGFOR_DIR /usr/local/packages6/compilers/NAG/6.0

prepend-path PATH $NAGFOR_DIR


prepend-path MANPATH $NAGFOR_DIR/man
setenv NAG_KUSARI_FILE /usr/local/packages5/nag/license.lic

PGI Compilers

The PGI Compiler suite offers C, C++ and Fortran compilers. For full details of the features of this compiler suite, see
PGI’s website at http://www.pgroup.com/products/pgiworkstation.htm


Making the PGI Compilers available After connecting to iceberg (see Connect to iceberg), start an interactive
session with the qsh or qrsh command. To make one of the versions of the PGI Compiler Suite available, run one
of the following module commands
module load compilers/pgi/15.10
module load compilers/pgi/15.7
module load compilers/pgi/14.4
module load compilers/pgi/13.1

Once you’ve loaded the module, you can check the version with
pgcc --version

Compilation examples C
To compile a C hello world example into an executable called hello using the PGI C compiler
pgcc hello.c -o hello

C++
To compile a C++ hello world example into an executable called hello using the PGI C++ compiler
pgc++ hello.cpp -o hello

Fortran
To compile a Fortran hello world example into an executable called hello using the PGI Fortran compiler
pgf90 hello.f90 -o hello

Additional resources
• Using the PGI Compiler with GPUs on Iceberg

Installation Notes Version 15.10


The interactive installer was slightly different to that of 15.7 (below) but the questions and answers were essentially
the same.
Version 15.7
The installer is interactive. Most of the questions are obvious. Here is how I answered the rest
Installation type
A network installation will save disk space by having only one copy of the
compilers and most of the libraries for all compilers on the network, and
the main installation needs to be done once for all systems on the network.

1 Single system install


2 Network install

Please choose install option: 1

Path


Please specify the directory path under which the software will be installed.
The default directory is /opt/pgi, but you may install anywhere you wish,
assuming you have permission to do so.

Installation directory? [/opt/pgi] /usr/local/packages6/compilers/pgi

CUDA and AMD components


Install CUDA Toolkit Components? (y/n) y
Install AMD software components? (y/n) y

ACML version
This PGI version links with ACML 5.3.0 by default. Also available:
(1) ACML 5.3.0
(2) ACML 5.3.0 using FMA4
Enter another value to override the default (1)
1

Other questions
Install JAVA JRE [yes] yes
Install OpenACC Unified Memory Evaluation package? (y/n) n
Do you wish to update/create links in the 2015 directory? (y/n) y
Do you wish to install MPICH? (y/n) y
Do you wish to generate license keys? (y/n) n
Do you want the files in the install directory to be read-only? (y/n) n

The license file is on the system at /usr/local/packages6/compilers/pgi/license.dat and is a 5 seat
network license. Licenses are only used at compile time.

Extra install steps Unlike gcc, the PGI Compilers do not recognise the environment variable LIBRARY_PATH
which is used by a lot of installers to specify the locations of libraries at compile time. This is fixed by creating
a siterc file at /usr/local/packages6/compilers/pgi/linux86-64/VER/bin/siterc with the
following contents
# get the value of the environment variable LIBRARY_PATH
variable LIBRARY_PATH is environment(LD_LIBRARY_PATH);

# split this value at colons, separate by -L, prepend 1st one by -L


variable library_path is
default($if($LIBRARY_PATH,-L$replace($LIBRARY_PATH,":", -L)));

# add the -L arguments to the link line


append LDLIBARGS=$library_path;

Where VER is the version number in question: 15.7, 15.10 etc


At the time of writing (August 2015), this is documented on PGI’s website at
https://www.pgroup.com/support/link.htm#lib_path_ldflags

Modulefile Version 15.10 The PGI compiler installer creates a suitable modulefile that’s configured to our system. It
puts it at /usr/local/packages6/compilers/pgi/modulefiles/pgi64/15.10 so all that is required
is to copy this to where we keep modules at /usr/local/modulefiles/compilers/pgi/15.10
Version 15.7


The PGI compiler installer creates a suitable modulefile that’s configured to our system. It puts it at
/usr/local/packages6/compilers/pgi/modulefiles/pgi64/15.7 so all that is required is to copy
this to where we keep modules at /usr/local/modulefiles/compilers/pgi/15.7

MPI

OpenMPI (gcc version)

OpenMPI (gcc version)


Latest Version 1.10.1
Dependencies gcc
URL http://www.open-mpi.org/

The Open MPI Project is an open source Message Passing Interface implementation that is developed and maintained
by a consortium of academic, research, and industry partners. Open MPI is therefore able to combine the expertise,
technologies, and resources from all across the High Performance Computing community in order to build the best
MPI library available. Open MPI offers advantages for system and software vendors, application developers and
computer science researchers.

Module files The latest version is made available using


module load mpi/gcc/openmpi

Alternatively, you can load a specific version using one of


module load mpi/gcc/openmpi/1.10.1
module load mpi/gcc/openmpi/1.10.0
module load mpi/gcc/openmpi/1.8.8
module load mpi/gcc/openmpi/1.8.3
module load mpi/gcc/openmpi/1.6.4
module load mpi/gcc/openmpi/1.4.4
module load mpi/gcc/openmpi/1.4.3

Installation notes These are primarily for administrators of the system.


Version 1.10.1
Compiled using gcc 4.4.7.
• install_openMPI_1.10.1.sh Downloads, compiles and installs OpenMPI 1.10.1 using the system gcc.
• Modulefile 1.10.1 located on the system at /usr/local/modulefiles/mpi/gcc/openmpi/1.10.1
Version 1.10.0
Compiled using gcc 4.4.7.
• install_openMPI_1.10.0.sh Downloads, compiles and installs OpenMPI 1.10.0 using the system gcc.
• Modulefile 1.10 located on the system at /usr/local/modulefiles/mpi/gcc/openmpi/1.10.0
Version 1.8.8
Compiled using gcc 4.4.7.
• install_openMPI.sh Downloads, compiles and installs OpenMPI 1.8.8 using the system gcc.


• Modulefile located on the system at /usr/local/modulefiles/mpi/gcc/openmpi/1.8.8

OpenMPI (PGI version)

OpenMPI (PGI version)


Latest Version 1.8.8
Support Level FULL
Dependencies PGI
URL http://www.open-mpi.org/

The Open MPI Project is an open source Message Passing Interface implementation that is developed and maintained
by a consortium of academic, research, and industry partners. Open MPI is therefore able to combine the expertise,
technologies, and resources from all across the High Performance Computing community in order to build the best
MPI library available. Open MPI offers advantages for system and software vendors, application developers and
computer science researchers.
These versions of OpenMPI make use of the PGI compiler suite.

Module files The latest version is made available using


module load mpi/pgi/openmpi

Alternatively, you can load a specific version using one of


module load mpi/pgi/openmpi/1.8.8
module load mpi/pgi/openmpi/1.6.4

Installation notes These are primarily for administrators of the system.


Version 1.8.8
Compiled using PGI 15.7.
• install_openMPI.sh Downloads, compiles and installs OpenMPI 1.8.8 using v15.7 of the PGI Compiler.
• Modulefile located on the system at /usr/local/modulefiles/mpi/pgi/openmpi/1.8.8
Installation notes for older versions are not available.
There are many versions of applications, libraries and compilers installed on the iceberg cluster. In order to avoid
conflict between these software items we employ a system called modules. Having logged onto a worker node, users
are advised to select the software (and the version of the software) they intend to use by utilising the module command.
• Modules on Iceberg
• Applications
• Development Tools (including compilers)
• Libraries
• Scheduler Commands for interacting with the scheduler
• MPI Parallel Programming


2.2.2 Accessing Iceberg

Note: If you have not connected to iceberg before the Getting Started documentation will guide you through the
process.

Below is a list of the various ways to connect to or use iceberg:

Terminal Access

Warning: This page is a stub and needs writing. Please send a PR.

JupyterHub

The Jupyter Notebook is a web application that allows you to create and share documents that contain live code,
equations, visualizations and explanatory text. It is an excellent interface to explore your data and share your research.
The JupyterHub is a web interface for running notebooks on the iceberg cluster. It allows you to login and have a
notebook server started up on the cluster, with access to both your data stores and the resources of iceberg.

Warning: This service is currently experimental; if you use this service and encounter a problem, please provide
feedback to research-it@sheffield.ac.uk.

Logging into the JupyterHub

To get started visit https://jupyter.shef.ac.uk and log in with your university account. A notebook session will be
submitted to the iceberg queue once you have logged in; this can take a minute or two, so the page may seem to be
loading for some time.

Note: There is currently no way of specifying any options to the queue when submitting the server job. So you can
not increase the amount of memory assigned or the queue the job runs in. There is an ongoing project to add this
functionality.

Using the Notebook on iceberg

To create a new notebook session, select the “New” menu on the right hand side of the top menu.


You will be presented with a menu showing all available conda environments that have Jupyter available. To learn
more about installing custom Python environments with conda see Python.
To learn how to use the Jupyter Notebook see the official documentation.

Using the Notebook Terminal In the “New” menu it is possible to start a terminal, this is a fully featured web
terminal, running on a worker node. You can use this terminal to perform any command-line-only operation on
iceberg, including configuring new Python environments and installing packages.

Troubleshooting

Below is a list of common problems:


1. If you modify the PYTHONPATH variable in your .bashrc file your Jupyter server may not start correctly,
and you may encounter a 503 error after logging into the hub. The solution to this is to remove these lines from
your .bashrc file.
2. If you have previously tried installing and running Jupyter yourself (i.e. not using this JupyterHub interface) then
you may get 503 errors when connecting to JupyterHub due to the old .jupyter profile in your home directory; if
you then find a jupyter log file in your home directory containing SSL WRONG_VERSION_NUMBER then try
deleting (or renaming) the .jupyter directory in your home directory.

Remote Visualisation

Warning: This page is a stub and needs writing. Please send a PR.


2.2.3 Filestore on Iceberg

Every user on the system has access to three different types of filestore. They differ in terms of the amount of space
available, the speed of the underlying storage system, frequency of backup and the time that the data can be left there.
Here are the current details of filestore available to each user.

Home directory

All users have a home directory in the location /home/username. The filestore quota is 10 GB per user.
Backup policy: /home has backup snapshots taken every 4 hours and we keep the 10 most recent. /home also has
daily snapshots taken each night, and we keep 28 days worth, mirrored onto a separate storage system.
The filesystem is NFS.

Data directory

Every user has access to a much larger data-storage area provided at the location /data/username.
The quota for this area is 100 GB per user.
Backup policy: /data has snapshots taken every 4 hours and we keep the 10 most recent. /data also has daily
snapshots taken each night, and we keep 7 days worth, but this is not mirrored.
The filesystem is NFS.

Fastdata directory

All users also have access to a large fast-access data storage area under /fastdata.
In order to avoid interference from other users’ files it is vitally important that you store your files in a directory
created and named the same as your username. e.g.
mkdir /fastdata/yourusername

By default the directory you create will have world-read access - if you want to restrict read access to just your account
then run
chmod 700 /fastdata/yourusername

after creating the directory. A more sophisticated sharing scheme would have private and public directories
mkdir /fastdata/yourusername
mkdir /fastdata/yourusername/public
mkdir /fastdata/yourusername/private

chmod 755 /fastdata/yourusername


chmod 755 /fastdata/yourusername/public
chmod 700 /fastdata/yourusername/private

The fastdata area provides 260 Terabytes of storage in total and takes advantage of the internal infiniband network for
fast access to data.
Although /fastdata is available on all the worker nodes, only by accessing it from the Intel-based nodes can you
benefit from these speed improvements.


There are no quota controls on the /fastdata area but files older than 3 months will be automatically deleted
without warning. We reserve the right to change this policy without warning in order to ensure efficient running of
the service.
You can use the lfs command to find out which files under /fastdata are older than a certain number of days and
hence approaching the time of deletion. For example, to find files 50 or more days old
lfs find -ctime +50 /fastdata/yourusername

/fastdata uses the Lustre filesystem. This does not support POSIX locking which can cause issues for some
applications.
fastdata is optimised for large file operations and does not handle lots of small files very well. An example of how
slow it can be for large numbers of small files is detailed at http://www.walkingrandomly.com/?p=6167

Shared directories

When you purchase an extra filestore from CiCS you should be informed of its name. Once you know this you can
access it
• as a Windows-style (SMB) file share on machines other than Iceberg using
\\uosfstore.shef.ac.uk\shared\
• as a subdirectory of /shared on Iceberg.
Note that this subdirectory will be mounted on demand on Iceberg: it will not be visible if you simply list
the contents of the /shared directory but will be accessible if you cd (change directory) into it e.g. cd
/shared/my_group_file_share1
A note regarding permissions: behind the scenes, the file server that provides this shared storage manages permissions
using Windows-style ACLs (which can be set by area owners via Group Management web interface). However, the
filesystem is mounted on a Linux cluster using NFSv4 so the file server therefore requires a means for mapping
Windows-style permissions to Linux ones. An effect of this is that the Linux mode bits as seen on Iceberg are not
always to be believed for files under /shared: the output of ls -l somefile.sh may indicate that a file is
readable/writable/executable when the ACLs are what really determine access permissions. Most applications have
robust ways of checking for properties such as executability but some applications can cause problems when accessing
files/directories on /shared by naively checking permissions just using Linux mode bits:
• which: a directory under /shared may be on your path and you may be able to run a contained executable
without prefixing it with an absolute/relative path, but which may still fail to find that executable.
• Perl: scripts that check for executability of files on /shared using -x may fail unless Perl is explicitly told to
test for file permissions in a more thorough way (see the mention of use filetest ’access’ here).

Determining your current filestore allocation

To find out your current filestore quota allocation and usage type quota.

If you exceed your file storage allocation

As soon as the quota is exceeded your account becomes frozen. In order to avoid this situation it is strongly
recommended that you
• Use the quota command to check your usage regularly.
• Copy files that do not need to be backed up to the /data/username area, or remove them from iceberg completely.


Efficiency considerations - The /scratch areas

For jobs requiring a lot of Input and Output (I/O), it may sometimes be necessary to store copies of the data on the
actual compute node on which your job is running. For this, you can create temporary areas of storage under the
directory /scratch. The /scratch area is local to each worker node and is not visible to the other worker
nodes or to the head-nodes. Therefore any data created by jobs should be transferred to either your /data or /home
area before the job finishes if you wish to keep them.
The next best I/O performance for the least amount of work is achieved by keeping your data in the /fastdata area
and running your jobs on the newer Intel nodes by specifying -l arch=intel in your job submission script.
These methods provide much faster access to data than the network attached storage on either /home or /data areas,
but you must remember to copy important data back onto your /home area.
If you decide to use the /scratch area we recommend that under /scratch you create a directory with the same
name as your username and work under that directory to avoid the possibility of clashing with other users.
Anything under /scratch is deleted periodically when the worker node is idle, whereas files in the /fastdata
area are deleted only when they are 3 months old.
/scratch uses the ext4 filesystem.
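As a sketch of staging data through /scratch in a batch job (the runtime, memory request, input file and program names below are placeholders to adapt to your own workflow):

#!/bin/bash
#$ -l h_rt=8:00:00
#$ -l rmem=4G
#$ -l mem=4G

# Stage data through the node-local /scratch area, keyed by username
# and job id so that concurrent jobs do not clash
SCRATCH_DIR=/scratch/$USER/$JOB_ID
mkdir -p $SCRATCH_DIR

cp ~/my_input.dat $SCRATCH_DIR/
cd $SCRATCH_DIR
~/my_prog my_input.dat > results.out

# Copy results back to /home before the job ends, then tidy up
cp results.out ~/
rm -rf $SCRATCH_DIR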

Recovering snapshots

We take regular back-ups of your /home and /data directories and it is possible to directly access a limited subset
of them.
There are 7 days worth of snapshots available in your /home and /data directories in a hidden directory called
.snapshot. You need to explicitly cd into this directory to get at the files:
cd /home/YOURUSERNAME/.snapshot

The files are read-only. This allows you to attempt to recover any files you might have accidentally deleted recently.
This does not apply for /fastdata for which we take no back-ups.

2.2.4 Iceberg specifications

• Total CPUs: 3440 cores


• Total GPUs: 16 units
• Total Memory: 31.8 TBytes
• Permanent Filestore: 45 TBytes
• Temporary Filestore: 260 TBytes
• Physical size: 8 Racks
• Maximum Power Consumption: 83.7 KWatts
• All nodes are connected via fast infiniband.
For reliability, there are two iceberg head-nodes ‘for logging in’ configured to take over from each other in case of
hardware failure.


Worker Nodes CPU Specifications

Intel Ivybridge based nodes


• 92 nodes, each with 16 cores and 64 GB of total memory (i.e. 4 GB per core).
• 4 nodes, each with 16 cores and 256 GB of total memory (i.e. 16GB per core).
• Each node uses 2 Intel E5-2650 v2 8-core processors (hence 2*8=16 cores).
• Scratch space on the local disk of each node is 400 GB
Intel Westmere based nodes
• 103 nodes, each with 12 cores and 24 GB of total memory (i.e. 2 GB per core)
• 4 nodes with 12 cores and 48 GB of total memory (i.e. 4 GB per core)
• Each node uses 2 Intel X5650 6-core processors (hence 2*6=12 cores)

GPU Units Specifications

8 Nvidia Tesla Kepler K40Ms GPU units


• Each GPU unit contains 2880 thread processor cores
• Each GPU unit has 12 GB of GDDR5 memory. Hence total GPU memory is 8*12=96 GB
• Each GPU unit is capable of about 4.3TFlop of single precision floating point performance, or 1.4TFlops at
double precision.
8 Nvidia Tesla Fermi M2070s GPU units
• Each GPU unit contains 448 thread processor cores
• Each GPU unit contains 6 GB of GDDR5 memory. Hence total GPU memory is 8*6=48 GB
• Each GPU unit is capable of about 1TFlop of single precision floating point performance, or 0.5TFlops at double
precision.

Software and Operating System

Users normally log into a head node and then use one (or more) of the worker nodes to run their jobs on. Scheduling
of users’ jobs on the worker nodes is managed by the ‘Sun Grid Engine’ software. Jobs can be run interactively (qsh)
or submitted as batch jobs (qsub).
• The operating system is 64-bit Scientific Linux (which is Redhat based) on all nodes
• The Sun Grid Engine for batch and interactive job scheduling
• Many Applications, Compilers, Libraries and Parallel Processing Tools. See Software on iceberg

2.3 ShARC

The University of Sheffield’s new HPC service, ShARC, is currently under development and is not yet ready for users
to access. Please use our current system, Iceberg, for now.


2.3.1 Software on ShARC

These pages list the software available on ShARC. If you notice an error or an omission, or wish to request new
software please create an issue on our github page

Applications on Sharc

Libraries on Sharc

Development Tools on Sharc

Parallel Systems

• Applications on Sharc
• Development Tools on Sharc
• Libraries on Sharc
• Parallel Systems

2.4 Parallel Computing

Modern computers contain more than one processor with a typical laptop usually containing either 2 or 4 (if you
think your laptop contains 8, it is because you have been fooled by Hyperthreading). Systems such as Iceberg or Sharc
contain many hundreds of processors and the key to making your research faster is to learn how to distribute your work
across them. If your program is designed to run on only one processor, running it on our systems without modification
will not make it any faster (it may even be slower!). Learning how to use parallelisation technologies is vital.
This section explains how to use the most common parallelisation technologies on our systems.
If you need advice on how to parallelise your workflow, please contact the Research Software Engineering Group

2.4.1 SGE Job Arrays

The simplest way of exploiting parallelism on the clusters is to use Job Arrays. A Job Array is a set of batch jobs run
from a single job script. For example
#!/bin/bash
#
#$ -t 1-100
#
echo "Task id is $SGE_TASK_ID"

./myprog.exe $SGE_TASK_ID > output.$SGE_TASK_ID

The above script will submit 100 tasks to the system at once. The difference between each of these 100 jobs is the
value of the environment variable $SGE_TASK_ID which will range from 1 to 100, determined by the line #$ -t 1-100.
In this example, the program myprog.exe will be run 100 times with differing input values of $SGE_TASK_ID. 100
output files will be created with filenames output.1, output.2 and so on.
Job arrays are particularly useful for Embarrassingly Parallel problems such as Monte Carlo simulations (where
$SGE_TASK_ID might correspond to a random number seed), or batch file processing (where $SGE_TASK_ID might
refer to a file in a list of files to be processed).
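For instance, a minimal sketch of the batch-file-processing pattern (filelist.txt and process.sh are placeholders for your own list of input files and processing script):

#!/bin/bash
#$ -t 1-100

# Pick the line of filelist.txt corresponding to this task id
INPUT=$(sed -n "${SGE_TASK_ID}p" filelist.txt)

./process.sh "$INPUT" > output.$SGE_TASK_ID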


Examples

• MATLAB SGE Job Array example

2.4.2 Shared Memory Parallelism (SMP)

Shared Memory Parallelism (SMP) is when multiple processors can operate independently but they have access to the
same memory space. On our systems, programs that use SMP can make use of only the resources available on a single
node. This limits you to 16 processor cores on our current system.
If you wish to scale to more processors, you should consider more complex parallelisation schemes such as MPI or
Hybrid OpenMP/MPI.

OpenMP

OpenMP (Open Multi-Processing) is one of the most common programming frameworks used to implement shared
memory parallelism. To submit an OpenMP job to our system, we make use of the OpenMP parallel environment
which is specified by adding the line
`#$ -pe openmp N`

to your job submission script, changing the value N to the number of cores you wish to use. Here’s an example for 4
cores
#!/bin/bash
# Request 4 cores from the scheduler
#$ -pe openmp 4
# Request 4 GB of memory per CPU core
#$ -l rmem=4G
#$ -l mem=4G

# Tell the OpenMP executable to use 4 cores


export OMP_NUM_THREADS=4
./myprogram

Note that you have to specify the number of cores twice: once to the scheduler (#$ -pe openmp 4) and once to the
OpenMP runtime environment (export OMP_NUM_THREADS=4).

Running other SMP schemes

Any system that uses shared memory parallelism, such as pthreads, Python’s multiprocessing module, R’s mclapply
and so on, can be submitted using the OpenMP parallel environment.
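For example, a sketch of a job script for a Python script that uses the multiprocessing module on 4 cores (my_script.py is a placeholder and should create no more worker processes than the number of cores requested here):

#!/bin/bash
# Request 4 cores via the openmp parallel environment
#$ -pe openmp 4
#$ -l rmem=4G
#$ -l mem=4G

module load apps/python/2.7

# The script itself decides how many processes to start,
# e.g. via multiprocessing.Pool(4)
python my_script.py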

2.4.3 Message Passing Interface (MPI)

The Message Passing Interface (MPI) Standard is a specification for a message passing library. MPI was originally
designed for distributed memory architectures and is used on systems ranging from a few interconnected Raspberry
Pi’s through to the UK’s national supercomputer, Archer.

MPI Implementations

Our systems have several MPI implementations installed. See the MPI section in software for details


Example MPI jobs

Some example MPI jobs are available in the HPC Examples repository of Sheffield’s Research Software Engineering
group

Batch MPI

The queue to submit to is openmpi-ib. Here is an example that requests 4 slots with 8Gb per slot using the gcc
implementation of OpenMPI
#!/bin/bash
#$ -l h_rt=1:00:00
# Change 4 to the number of slots you want
#$ -pe openmpi-ib 4
# 8Gb per slot
#$ -l rmem=8G
#$ -l mem=8G

module load mpi/gcc/openmpi


mpirun ./executable

Interactive MPI

Our general-access interactive queues currently don’t have any MPI-compatible parallel environments enabled. Thus,
it is not possible to run MPI jobs interactively.

MPI Training

Course notes from the national supercomputing centre are available here

2.4.4 Hybrid OpenMP / MPI

Support for Hybrid OpenMP/MPI is in the preliminary stages. Here is an example job submission for an executable
that combines OpenMP with MPI
#$ -pe openmpi-hybrid-4 8
#$ -l rmem=2G
#$ -l mem=6G
module load mpi/intel/openmpi
export OMP_NUM_THREADS=4
mpirun -bynode -np 2 -cpus-per-proc 4 [executable + options]

There would be 2 MPI processes running, each with 4x (6GB mem, 2GB rmem) shared between the 4 threads per MPI
process, for a total of 8 x (6GB mem, 2GB rmem) for the entire job. When we tried this, we got warnings saying that
the -cpus-per-proc was getting deprecated. A quick google suggests that
mpirun -np 2 --map-by node:pe=1 [executable + options]

would be the appropriate replacement.


2.4.5 GPU Computing

Graphics Processing Units (GPUs) were, as the name suggests, originally designed for the efficient processing of
graphics. Over time, they were developed into systems that were capable of performing general purpose computing
which is why some people refer to modern GPUs as GP-GPUs (General Purpose Graphical Processing Units).
Graphics processing typically involves relatively simple computations that need to be applied to millions of on-screen
pixels in parallel. As such, GPUs tend to be very quick and efficient at computing certain types of parallel workloads.
The GPUComputing@Sheffield website aims to facilitate the use of GPU computing within University of Sheffield
research by providing resources for training and purchasing of equipment as well as providing a network of GPU users
and research projects within the University.

Requesting access to GPU facilities

In order to ensure that the nodes hosting the GPUs run only GPU related tasks, we have defined a special project
group for accessing these nodes. If you wish to take advantage of the GPU processors, please contact
research-it@sheffield.ac.uk asking to join the GPU project group.
Any iceberg user can apply to join this group. However, because our GPU resources are limited we will need to discuss
the needs of the user and obtain consent from the project leader before allowing usage.

GPU Community and NVIDIA Research Centre

The University of Sheffield has been officially affiliated with NVIDIA since 2011 as an NVIDIA CUDA Research
Centre. As such NVIDIA offer us some benefits as a research institution including discounts on hardware, technical
liaisons, online training and priority seed hardware for new GPU architectures. For first access to hardware, training
and details of upcoming events, discussions and help please join the GPUComputing google group.

GPU Hardware on iceberg

Iceberg currently contains 16 GPU units:


• 8 Nvidia Tesla Kepler K40M GPU units. Each unit contains 2880 CUDA cores, 12GB of memory and is capable
of up to 1.43 Teraflops of double precision compute power.
• 8 Nvidia Tesla Fermi M2070 GPU units. Each unit contains 448 CUDA cores, 6GB of memory and is capable
of up to 515 Gigaflops of double precision compute power.

Compiling on the GPU using the PGI Compiler

The PGI Compilers are a set of commercial Fortran, C and C++ compilers from the Portland Group. To make use
of them, first start an interactive GPU session and run one of the following module commands, depending on which
version of the compilers you wish to use
module load compilers/pgi/13.1
module load compilers/pgi/14.4

The PGI compilers have several features that make them interesting to users of GPU hardware:-


OpenACC Directives

OpenACC is a relatively new way of programming GPUs that can be significantly simpler to use than low-level
language extensions such as CUDA or OpenCL. From the OpenACC website :
The OpenACC Application Program Interface describes a collection of compiler directives to specify
loops and regions of code in standard C, C++ and Fortran to be offloaded from a host CPU to an attached
accelerator. OpenACC is designed for portability across operating systems, host CPUs, and a wide range
of accelerators, including APUs, GPUs, and many-core coprocessors.
The directives and programming model defined in the OpenACC API document allow programmers to
create high-level host+accelerator programs without the need to explicitly initialize the accelerator, man-
age data or program transfers between the host and accelerator, or initiate accelerator startup and shut-
down.
For more details concerning OpenACC using the PGI compilers, see The PGI OpenACC website.
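As a brief sketch, a C source file containing OpenACC directives could be built with the PGI compiler roughly as follows (saxpy.c is a placeholder; -acc enables the directives and -Minfo=accel asks the compiler to report what it offloaded):

# Inside an interactive GPU session, with a PGI compiler module loaded
module load compilers/pgi/14.4
pgcc -acc -Minfo=accel saxpy.c -o saxpy
./saxpy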

CUDA Fortran

In mid 2009, PGI and NVIDIA cooperated to develop CUDA Fortran. CUDA Fortran includes a Fortran 2003 compiler
and tool chain for programming NVIDIA GPUs using Fortran.
• CUDA Fortran Programming Guide.

CUDA-x86

NVIDIA CUDA was developed to enable offloading computationally intensive kernels to massively parallel GPUs.
Through API function calls and language extensions, CUDA gives developers explicit control over the mapping of
general-purpose computational kernels to GPUs, as well as the placement and movement of data between an x86
processor and the GPU.
The PGI CUDA C/C++ compiler for x86 platforms allows developers using CUDA to compile and optimize their
CUDA applications to run on x86-based workstations, servers and clusters with or without an NVIDIA GPU accelera-
tor. When run on x86-based systems without a GPU, PGI CUDA C applications use multiple cores and the streaming
SIMD (Single Instruction Multiple Data) capabilities of Intel and AMD CPUs for parallel execution.
• PGI CUDA-x86 guide.

Deep Learning with Theano on GPU Nodes

Theano is a Python library that allows you to define, optimize, and evaluate mathematical expressions involving
multi-dimensional arrays efficiently. Theano is most commonly used to perform deep learning and has excellent GPU
support and integration through PyCUDA. The following steps can be used to setup and configure Theano on your
own profile.
Request an interactive session with a sufficient amount of memory:
qsh -l gpu=1 -P gpu -l gpu_arch=nvidia-k40m -l mem=13G

Load the relevant modules

module load apps/python/anaconda3-2.5.0
module load libs/cuda/7.5.18

Create a conda environment to load relevant modules on your local user account

conda create -n theano python=3.5 anaconda3-2.5.0
source activate theano

Install the other Python module dependencies which are required using pip (alternatively these could be installed with
conda if you prefer)

pip install theano
pip install nose
pip install nose-parameterized
pip install pycuda

For optimal Theano performance, enable the CUDA memory manager CNMeM. To do this, create the .theanorc file in
your HOME directory and set the fraction of GPU memory reserved by Theano. The exact fraction may have to be
hand-picked: if Theano asks for more memory than is currently available on the GPU, an error will be thrown during
import of the theano module. Create or edit the .theanorc file with nano

nano ~/.theanorc

Add the following lines and, if necessary, change the 0.8 number to whatever works for you:

[lib]
cnmem=0.8

Run python and verify that Theano is working correctly:

python -c "import theano; theano.test()"

Interactive use of the GPUs

Once you are included in the GPU project group you may start using the GPU enabled nodes interactively by typing
qsh -l gpu=1 -P gpu

the gpu= parameter determines how many GPUs you are requesting. Currently, the maximum number of GPUs
allowed per job is set to 4, i.e. you cannot exceed gpu=4. Most jobs will only make use of one GPU.
If your job requires selecting the type of GPU hardware, one of the following two optional parameters can be used to
make that choice
-l gpu_arch=nvidia-m2070
-l gpu_arch=nvidia-k40m

Interactive sessions provide you with 2 Gigabytes of CPU RAM by default which is significantly less than the amount
of GPU RAM available. This can lead to issues where your session has insufficient CPU RAM to transfer data to and
from the GPU. As such, it is recommended that you request enough CPU memory to communicate properly with the
GPU
-l gpu_arch=nvidia-m2070 -l mem=7G
-l gpu_arch=nvidia-k40m -l mem=13G

The above will give you 1Gb more CPU RAM than GPU RAM for each of the respective GPU architectures.

Submitting GPU jobs

Interactive use of the GPUs

You can access GPU enabled nodes interactively by typing


qsh -l gpu=1

the gpu= parameter determines how many GPUs you are requesting. Currently, the maximum number of GPUs
allowed per job is set to 4, i.e. you cannot exceed gpu=4. Most jobs will only make use of one GPU.
If your job requires selecting the type of GPU hardware, one of the following two optional parameters can be used to
make that choice


-l gpu_arch=nvidia-m2070
-l gpu_arch=nvidia-k40m

Interactive sessions provide you with 2 Gigabytes of CPU RAM by default which is significantly less than the amount
of GPU RAM available. This can lead to issues where your session has insufficient CPU RAM to transfer data to and
from the GPU. As such, it is recommended that you request enough CPU memory to communicate properly with the
GPU
-l gpu_arch=nvidia-m2070 -l mem=7G
-l gpu_arch=nvidia-k40m -l mem=13G

The above will give you 1Gb more CPU RAM than GPU RAM for each of the respective GPU architectures.

Submitting batch GPU jobs

To run batch jobs on GPU nodes, edit your job file to include a request for GPUs, e.g. for a single GPU:
#$ -l gpu=1

You can also use the gpu_arch option discussed above to target a specific GPU model:
#$ -l gpu_arch=nvidia-m2070
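Putting these options together, a complete GPU batch script might look like the following minimal sketch. The CUDA module version and the executable name my_gpu_program are illustrative assumptions, and the memory request follows the K40 recommendation given in the interactive section above:
#!/bin/bash
#$ -l gpu=1
#$ -l gpu_arch=nvidia-k40m
#$ -l mem=13G

module load libs/cuda/6.5.14

./my_gpu_program
Submit the script in the usual way with qsub, e.g. qsub my_gpu_job.sh (the script name is likewise hypothetical).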

GPU enabled Software

Ansys

See Issue #25 on github

Maple

TODO

MATLAB

TODO

NVIDIA Compiler

TODO - Link to relevant section

PGI Compiler

TODO - Link to relevant section


Compiling on the GPU using the NVIDIA Compiler

To compile GPU code using the NVIDIA compiler, nvcc, first start an interactive GPU session. Next, you need to set
up the compiler environment via one of the following module statements:
module load libs/cuda/3.2.16
module load libs/cuda/4.0.17
module load libs/cuda/6.5.14

depending on the version of CUDA you intend to use. This makes the nvcc CUDA compiler available. For example
nvcc filename.cu -arch sm_20

will compile the CUDA program contained in the file filename.cu. The -arch flag above signifies the compute
capability of the intended GPU hardware, which is 20 for our current GPU modules. If you are intending to generate
code for older architectures you may have to specify sm_10 or sm_13 for example.
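As a short worked example, assuming a hypothetical source file hello.cu, a typical compile-and-run sequence inside an interactive GPU session would be:
module load libs/cuda/6.5.14
nvcc hello.cu -arch sm_20 -o hello
./hello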

2.5 Troubleshooting

In this section, we’ll discuss some tips for solving problems with iceberg. It is suggested that you work through some
of the ideas here before contacting the service desk for assistance.

2.5.1 Frequently Asked Questions

I’m a new user and my password is not recognised

When you get a username on the system, the first thing you need to do is to synchronise your passwords, which will set
your password to be the same as your network password.

Strange things are happening with my terminal

Symptoms include many of the commands not working and just bash-4.1$ being displayed instead of your username
at the bash prompt.
This may be because you’ve deleted your .bashrc and .bash_profile files - these are ‘hidden’ files which live in your
home directory and are used to correctly set up your environment. If you hit this problem you can run the command
resetenv which will restore the default files.

I can no longer log onto iceberg

If you are confident that you have no password or remote-access-client related issues but you still cannot log onto
iceberg, you may be having problems due to exceeding your iceberg filestore quota. If you exceed your filestore quota
in your /home area, it is sometimes possible that crucial system files in your home directory get truncated, which affects
the subsequent login process.
If this happens, you should immediately email research-it@sheffield.ac.uk and ask to be unfrozen.

I can not log into iceberg via the applications portal

Most of the time such problems arise due to JAVA version issues. As JAVA updates are released regularly, these
problems are usually caused by changes to the JAVA plug-in for the browser. Follow the troubleshooting link from
the iceberg browser-access page to resolve these problems. There is also a link on that page to test the functionality
of your JAVA plug-in. It can also help to try a different browser to see if it makes any difference. Failing all that, you
may have to fall back to one of the non-browser access methods.

My batch job terminates without any messages or warnings

When a batch job is initiated using the qsub command or the runfluent, runansys or runabaqus commands, it gets
allocated a specific amount of virtual memory and run time. If a job exceeds either its memory or its time limit it
gets terminated immediately, usually without any warning messages.
It is therefore important to estimate the amount of memory and time that is needed to run your job to completion and
to specify these at the time of submitting the job to the batch queue.
Please refer to the section on hitting-limits and estimating-resources for information on how to avoid these problems.
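As a rough sketch of how such limits can be raised at submission time, a job expected to need 8 hours of run time and 16 Gigabytes of virtual memory might be submitted as shown below. Note that the h_rt option for run time is an assumption based on standard Sun Grid Engine resource names, and my_job.sh is a hypothetical script:
qsub -l h_rt=08:00:00 -l mem=16G -l rmem=8G my_job.sh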

Exceeding your disk space quota

Each user of the system has a fixed amount of disk space available in their home directory. If you exceed this quota,
various problems can emerge such as an inability to launch applications and run jobs. To see if you have exceeded
your disk space quota, run the quota command:
quota

Size   Used  Avail  Use%  Mounted on
10.1G  5.1G      0  100%  /home/foo11b
 100G     0   100G    0%  /data/foo11b

In the above, you can see that the quota was set to 10.1 gigabytes and all of this is in use. Any jobs submitted by this
user will likely result in an Eqw status. The recommended action is for the user to delete enough files to allow normal
work to continue.
Sometimes, it is not possible to log in to the system because of a full quota, in which case you need to contact
research-it@sheffield.ac.uk and ask to be unfrozen.
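To find out which directories are taking up the space, standard Linux tools can help. The following is a sketch that assumes the GNU versions of du and sort are available (as they are on most Linux systems):
du -h --max-depth=1 ~ | sort -h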

I am getting warning messages and warning emails from my batch jobs about insufficient memory

There are two types of memory resources that can be requested when submitting batch jobs using the qsub command.
These are virtual memory ( -l mem=nnn ) and real memory ( -l rmem=nnn ). The virtual memory limit specified should
always be greater than or equal to the real memory limit.
If a job exceeds its virtual memory resource it gets terminated. However, if a job exceeds its real memory resource it
does not get terminated; instead, an email message is sent to the user asking them to specify a larger rmem= parameter
next time, so that the job can run more efficiently.

What is rmem (real memory) and mem (virtual memory)?

Running a program always involves loading the program's instructions and also its data, i.e. all variables and arrays
that it uses, into the computer's RAM. A program's entire instructions and its entire data, along with any
dynamic link libraries it may use, define the VIRTUAL STORAGE requirements of that program. If we did not
have clever operating systems we would need as much physical memory (RAM) as the virtual storage requirements
of that program. However, operating systems are clever enough to deal with situations where there is insufficient
REAL MEMORY (i.e. RAM) to load all the program instructions and data. This
technique works because hardly any program needs to access all its instructions and its data simultaneously. Therefore
the operating system loads into RAM only those parts of the instructions and data that are needed by the program at
a given instant. This is called PAGING and it involves copying parts of the program's instructions and data to/from
hard disk to RAM as they are needed.
If the REAL MEMORY (i.e. RAM) allocated to a job is much smaller than the entire memory requirements of the job
(i.e. VIRTUAL MEMORY) then there will be an excessive need for 'paging', which will slow the execution of the
program considerably due to the relatively slow speed of transferring information between the disk and RAM.
On the other hand, if the Real Memory (RAM) allocated to a job is larger than the Virtual Memory requirement of that
job then it will result in a waste of RAM resources, which will sit idle for the duration of that job.
It is therefore crucial to strike a fine balance between the VIRTUAL MEMORY (i.e. mem) and the PHYSICAL
MEMORY (i.e. rmem) allocated to a job. The virtual memory limit, defined by the -l mem parameter, is the maximum
amount of virtual memory your job will be allowed to use; if your job's virtual memory requirements exceed this limit
during its execution, your job will be killed immediately. The real memory limit, defined by the -l rmem parameter, is
the amount of RAM that will be allocated to your job.
The way we have configured SGE, if your job starts paging excessively it is not killed, but you receive warning
messages asking you to increase the RAM allocated to your job next time by means of the rmem parameter.
It is important to make sure that your -l mem value is always greater than your -l rmem value so as not to waste
valuable RAM resources, as mentioned earlier.
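For example, a batch script expected to need around 8 Gigabytes of RAM could include request lines along the following (purely illustrative) lines, keeping the mem value above the rmem value as recommended:
#$ -l mem=16G
#$ -l rmem=8G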

Insufficient memory in an interactive session

By default, an interactive session provides you with 2 Gigabytes of RAM (sometimes called real memory) and 6
Gigabytes of Virtual Memory. You can request more than this when running your qsh or qrsh command
qsh -l mem=64G -l rmem=8G

This asks for 64 Gigabytes of Virtual Memory and 8 Gigabytes of RAM (real memory). Note that you should
• not specify more than 768 Gigabytes of virtual memory (mem)
• not specify more than 256 GB of RAM (real memory) (rmem)

Windows-style line endings

If you prepare text files such as your job submission script on a Windows machine, you may find that they do not work
as intended on the system. A very common example is when a job immediately goes into Eqw status after you have
submitted it.
The reason for this behaviour is that Windows and Unix machines have different conventions for specifying ‘end of
line’ in text files. Windows uses the control characters for ‘carriage return’ followed by ‘linefeed’, \r\n, whereas
Unix uses just ‘linefeed’ \n.
The practical upshot of this is that a script prepared in Windows using Notepad looking like this
#!/bin/bash
echo 'hello world'

will look like the following to programs on a Unix system


#!/bin/bash\r
echo 'hello world'\r

If you suspect that this is affecting your jobs, run the following command on the system
dos2unix your_files_filename
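If you are unsure whether a file has Windows-style line endings, one quick check, assuming the standard file utility is installed (as it is on most Linux systems), is:
file your_files_filename
which will report something like "ASCII text, with CRLF line terminators" for an affected file.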


error: no DISPLAY variable found with interactive job

If you receive the error message


error: no DISPLAY variable found with interactive job

the most likely cause is that you forgot the -X switch when you logged into iceberg. That is, you might have typed
ssh username@iceberg.sheffield.ac.uk

instead of
ssh -X username@iceberg.sheffield.ac.uk

Problems connecting with WinSCP

Some users have reported issues while connecting to the system using WinSCP, usually when working from home
with a poor connection and when accessing folders with large numbers of files.
In these instances, turning off Optimize Connection Buffer Size in WinSCP can help:
• In WinSCP, go to the settings for the site (i.e. from the menu Session -> Sites -> Site Manager)
• From the Site Manager dialog, click on the selected session and click the Edit button
• Click the Advanced button; the Advanced Site Settings dialog opens
• Click on Connection
• Untick the box which says Optimize Connection Buffer Size

Login Nodes RSA Fingerprint

The RSA key fingerprint for Iceberg’s login nodes is “de:72:72:e5:5b:fa:0f:96:03:d8:72:9f:02:d6:1d:fd”.
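Recent OpenSSH clients display a SHA256 fingerprint by default when you first connect. To compare against the MD5-style fingerprint quoted above, you can, assuming OpenSSH 6.8 or later, connect with:
ssh -o FingerprintHash=md5 username@iceberg.sheffield.ac.uk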

2.6 Glossary of Terms

The worlds of scientific and high performance computing are full of technical jargon that can seem daunting to
newcomers. This glossary attempts to explain some of the most common terms as quickly as possible.
HPC ‘High Performance Computing’. The exact definition of this term is sometimes a subject of great debate but
for the purposes of our services you can think of it as ‘Anything that requires more resources than a typical
high-spec desktop or laptop PC’.
GPU Acronym for Graphics Processing Unit.
SGE Acronym for Sun Grid Engine.
Sun Grid Engine The Sun Grid Engine is the software that controls and allocates resources on the system. There
are many variants of 'Sun Grid Engine' in use worldwide and the variant in use at Sheffield is the Son of Grid
Engine.
Wallclock time The actual time taken to complete a job as measured by a clock on the wall or your wristwatch.
