Gadi HPC system

Refer to the Gadi user guide, the Gadi quick reference, and the NCI documentation for more details.

Note

This page is a work in progress, and contains notes and observations that will be condensed into a coherent step-by-step guide.

Connecting to Gadi

ssh username@gadi.nci.org.au

Loading modules

Gadi uses environment modules, and the process is very similar to that on Spartan.

You can search for available modules with the following command:

module avail

You can load a particular module with module load:

module load intel-compiler-llvm/2023.2.0

We probably want to use the most recent intel-compiler-llvm and intel-mkl modules (version 2023.2.0 at the time of writing). Jobs should probably be submitted to the "normal" queue (see Gadi queues for further information).

Note

The Gadi user guide recommends trying your workflow in an interactive job before submitting a job script, by using qsub -I. This will be the best way to identify which modules we need to load in order to (a) compile MCAS; (b) run MCAS; and (c) run mcasopt.
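
As a minimal sketch of such an interactive session, the snippet below assembles a qsub -I command and prints it so that it can be checked before use. The project code a00 (borrowed from the example job script in the Gadi user guide) and the resource values are placeholders rather than recommendations, so adjust them for your own project.

```shell
# Placeholder values: replace the project code and resources with your own.
project="a00"
resources="walltime=00:30:00,ncpus=4,mem=16GB,storage=gdata/${project}+scratch/${project},wd"

# Build the interactive-job command and print it for inspection.
cmd="qsub -I -q normal -P ${project} -l ${resources}"
echo "$cmd"

# On a Gadi login node, run the command instead of only printing it:
# $cmd
```

Once the interactive shell starts on a compute node, you can try module load commands one at a time and see which combination allows each step to run.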

Running jobs

To submit a job defined in the shell script job.sh, use qsub, replacing project with your NCI project code:

qsub -P project job.sh

This should return the job identifier, which will have the form <sequence number>.<server name>.
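
Because qsub prints the job identifier, a script can capture it for later use with qstat. The sketch below splits an identifier into its two parts with shell parameter expansion. The value 12345678.gadi-pbs is a made-up example, not the output of a real submission.

```shell
# On Gadi you would capture the real identifier with:
#   job_id=$(qsub -P project job.sh)
job_id="12345678.gadi-pbs"   # made-up example value

# Everything before the first dot is the sequence number.
seq_num="${job_id%%.*}"
# Everything after the first dot is the server name.
server="${job_id#*.}"

echo "sequence number: $seq_num"
echo "server name: $server"
```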

To check on the status of a submitted job, use qstat (-w prints wide output and -x also shows jobs that have finished):

qstat -w -x JOB_ID

The Gadi user guide includes an example job script:

#!/bin/bash

#PBS -l ncpus=48
#PBS -l mem=190GB
#PBS -l jobfs=200GB
#PBS -q normal
#PBS -P a00
#PBS -l walltime=02:00:00
#PBS -l storage=gdata/a00+scratch/a00
#PBS -l wd

module load python3/3.7.4
python3 main.py $PBS_NCPUS > /g/data/a00/$USER/job_logs/$PBS_JOBID.log

The qsub man page indicates that we can use Python job scripts and include the PBS directives in a comment block. For example, we could create a Python script with the following contents:

#!/usr/bin/python
#PBS -l select=1:ncpus=3:mem=1gb
#PBS -N HelloJob
print("Hello")

To run this job under Linux, we need to define the path to the Python executable on the execution host by using the -S argument:

qsub -S $PBS_EXEC/bin/pbs_python <script name>

Python virtual environments

See the NCI Python documentation, which shows how to create and activate a Python virtual environment. NCI recommends giving the virtual environment access to the system site-packages directory, so that it can use the optimised versions of Python modules that are already installed. The example also shows how to unload default modules and load specific versions of required modules.
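
The following is a rough sketch of that approach, not a copy of the NCI example. It defines a helper function, make_venv (a name invented here), that creates a virtual environment with access to the system site-packages directory and then activates it. It assumes that a suitable python3 module has already been loaded.

```shell
# make_venv is a hypothetical helper, not part of any NCI tooling.
# Usage: make_venv /path/to/venv
make_venv() {
    venv_dir="$1"
    # --system-site-packages lets the environment see packages that are
    # already installed for the loaded Python module.
    python3 -m venv --system-site-packages "$venv_dir" || return 1
    # Activate the new environment in the current shell.
    . "$venv_dir/bin/activate"
}
```

For example, after module load python3/3.7.4 (the version used in the job script above), running make_venv "$HOME/venvs/mcas" would create and activate an environment in that directory. The directory name is only an illustration.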

Also see the Gadi environment modules documentation for an example of creating a user-defined module that loads other modules and activates an existing Python virtual environment. If you load the use.own module, you can then load your own modules with module load.