Machine-specific notes, tips, and submit scripts

Aitken

To use Aitken, follow the instructions for logging into Pleiades on the NAS website; the two machines share login nodes.

Aitken: Cascade Lake nodes

Compiling GIZMO

To compile GIZMO for Aitken Cascade Lake nodes, uncomment the Pleiades line in Makefile.systype and run the following (you can add these lines to your .profile so that they run automatically at login):

module load comp-intel mpi-hpe/mpt pkgsrc/2021Q2
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$PKGSRC_BASE/lib
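
If these lines go into .profile, a slightly more defensive variant may be useful, since the submit script sources the same file. The sketch below uses the same modules and paths as above; the guards are an added assumption and not part of the tested setup. It skips the module load when the module command is unavailable, and avoids a leading ':' in LD_LIBRARY_PATH when the variable starts out empty.

```shell
# Hypothetical ~/.profile fragment; module names are the ones listed above.
# The guards are an illustrative addition, not part of the tested setup.
if command -v module >/dev/null 2>&1; then
    module load comp-intel mpi-hpe/mpt pkgsrc/2021Q2
fi

# Append the pkgsrc library directory only if PKGSRC_BASE is defined,
# and do not emit a leading ':' when LD_LIBRARY_PATH is empty.
if [ -n "${PKGSRC_BASE:-}" ]; then
    export LD_LIBRARY_PATH="${LD_LIBRARY_PATH:+$LD_LIBRARY_PATH:}$PKGSRC_BASE/lib"
fi
```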

Optimal Parallelization

Aitken has two types of nodes: Intel Cascade Lake and AMD Rome. So far only the Cascade Lake nodes have been tested. Each has 2 CPUs with 20 physical cores each and 2 hyperthreads per core, for a total of 80 possible concurrent processes per node. However, GIZMO is quite often limited by cache/memory throughput rather than instruction throughput, so running the maximum number of processes is not always optimal.

In tests of a variety of MPI and hybrid MPI/OpenMP configurations on the slowest (most load-imbalanced) part of a typical mid-sized (2e7 gas cell) run, the optimal node configuration appeared to be 20 MPI ranks per node with 2 OpenMP threads per rank (i.e. compile with OPENMP=2). This places one thread on each of the 40 physical cores and leaves the hyperthreads unused.

Submit script

Example submit script for a STARFORGE run on 8 Aitken Cascade Lake nodes with 20 MPI ranks per node and 2 OpenMP threads per MPI rank:

#PBS -l select=8:ncpus=40:mpiprocs=20:model=cas_ait
#PBS -l walltime=120:00:00
#PBS -q long

source ~/.profile
export OMP_NUM_THREADS=2

export MPI_DSM_DISTRIBUTE=0
export KMP_AFFINITY=disabled

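# The trailing "2" after params.txt is GIZMO's restart flag (restart from a snapshot); omit it for a fresh start.
mpiexec -np 160 omplace -nt 2 ./GIZMO params.txt 2 1> gizmo.out 2> gizmo.err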
wait
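
The numbers in the script above are tied together: ncpus is the number of MPI ranks per node times the number of OpenMP threads per rank, and the -np argument is the number of nodes times the ranks per node. A minimal sketch of that arithmetic, useful when changing the node count, is given below. The variable names are illustrative and not part of GIZMO or PBS.

```shell
# Sketch: derive the PBS resource request and the mpiexec rank count
# from the per-node layout. Variable names are illustrative only.
NODES=8               # number of Cascade Lake nodes (example value)
RANKS_PER_NODE=20     # MPI ranks per node
THREADS_PER_RANK=2    # OpenMP threads per rank; should match OPENMP=2 at compile time

NCPUS=$((RANKS_PER_NODE * THREADS_PER_RANK))   # cores requested per node
TOTAL_RANKS=$((NODES * RANKS_PER_NODE))        # value passed to mpiexec -np

SELECT_LINE="#PBS -l select=${NODES}:ncpus=${NCPUS}:mpiprocs=${RANKS_PER_NODE}:model=cas_ait"
echo "$SELECT_LINE"
echo "mpiexec -np ${TOTAL_RANKS} omplace -nt ${THREADS_PER_RANK} ./GIZMO params.txt"
```

With the values shown, this reproduces the select line and the rank count used in the example script. OMP_NUM_THREADS and the omplace -nt argument should both equal THREADS_PER_RANK.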