Machine-specific notes, tips, and submit scripts
Aitken
To use Aitken, follow instructions for logging into Pleiades on the NAS website - the machines share login nodes.
Aitken: Cascade Lake nodes
Compiling GIZMO
To compile GIZMO for Aitken Cascade Lake nodes, uncomment the Pleiades line in Makefile.systype and load the following (can add this to your .profile to run automatically upon login):
module load comp-intel mpi-hpe/mpt pkgsrc/2021Q2
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$PKGSRC_BASE/lib
Optimal Parallelization
Aitken has 2 types of nodes: Intel Cascade Lake and AMD Rome. Currently only Cascade Lake nodes have been tested, which have 2 CPUs with 20 physical cores each, with 2 hyperthreads each for a total of 80 possible concurrent processes per node. However quite often GIZMO is limited by cache/memory throughput rather than instruction throughput, so running the maximum number of processes is not always optimal.
Testing a variety of MPI and hybrid MPI/OpenMP configurations on the slowest (most load-imbalanced) part of a typical mid-sized (2e7 gas cell) run, the optimal node configuration appeared to be 20 MPI ranks per node with 2 OpenMP threads per rank (i.e. compile with OPENMP=2).
Submit script
Example submit script for a STARFORGE run on 8 Aitken Cascade Lake node with 20 MPI ranks per node and 2 OpenMP threads per MPI rank:
#PBS -l select=8:ncpus=40:mpiprocs=20:model=cas_ait
#PBS -l walltime=120:00:00
#PBS -q long
source ~/.profile
export OMP_NUM_THREADS=2
export MPI_DSM_DISTRIBUTE=0
export KMP_AFFINITY=disabled
mpiexec -np 160 omplace -nt 2 ./GIZMO params.txt 2 1>gizmo.out 2> gizmo.err
wait