Machine-specific notes, tips, and submit scripts ------------------------------------------------ Aitken ~~~~~~ To use Aitken, follow instructions for logging into Pleiades on the NAS website - the machines share login nodes. Aitken: Cascade Lake nodes ^^^^^^^^^^^^^^^^^^^^^^^^^^ Compiling GIZMO ''''''''''''''' To compile GIZMO for Aitken Cascade Lake nodes, uncomment the ``Pleiades`` line in ``Makefile.systype`` and load the following (can add this to your ``.profile`` to run automatically upon login): .. code:: bash module load comp-intel mpi-hpe/mpt pkgsrc/2021Q2 export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$PKGSRC_BASE/lib | Optimal Parallelization ''''''''''''''''''''''' Aitken has 2 types of nodes: Intel Cascade Lake and AMD Rome. Currently only Cascade Lake nodes have been tested, which have 2 CPUs with 20 physical cores each, with 2 hyperthreads each for a total of 80 possible concurrent processes per node. **However** quite often GIZMO is limited by cache/memory throughput rather than instruction throughput, so running the maximum number of processes is not always optimal. Testing a variety of MPI and hybrid MPI/OpenMP configurations on the slowest (most load-imbalanced) part of a typical mid-sized (2e7 gas cell) run, the optimal node configuration appeared to be 20 MPI ranks per node with 2 OpenMP threads per rank (i.e. compile with ``OPENMP=2``). Submit script ''''''''''''' Example submit script for a STARFORGE run on 8 Aitken Cascade Lake node with 20 MPI ranks per node and 2 OpenMP threads per MPI rank: .. code:: bash #PBS -l select=8:ncpus=40:mpiprocs=20:model=cas_ait #PBS -l walltime=120:00:00 #PBS -q long source ~/.profile export OMP_NUM_THREADS=2 export MPI_DSM_DISTRIBUTE=0 export KMP_AFFINITY=disabled mpiexec -np 160 omplace -nt 2 ./GIZMO params.txt 2 1>gizmo.out 2> gizmo.err wait |