3 changes: 2 additions & 1 deletion toolchain/bootstrap/modules.sh
@@ -46,7 +46,8 @@ if [ -v $u_c ]; then
log "$BR""Brown$W: Oscar (o)"
log "$B""DoD$W: Carpenter Cray (cc) | Carpenter GNU (c) | Nautilus (n)"
log "$OR""Florida$W: HiPerGator (h)"
-log_n "($G""a$W/$G""f$W/$G""s$W/$G""w$W/$B""tuo$W/$C""b$W/$C""e$CR/$C""d/$C""dai$CR/$Y""p$CR/$R""r$CR/$B""cc$CR/$B""c$CR/$B""n$CR/$BR""o"$CR"/$OR""h"$CR"): "
+log "$C""WPI $W: Turing (t)"
+log_n "($G""a$W/$G""f$W/$G""s$W/$G""w$W/$B""tuo$W/$C""b$W/$C""e$CR/$C""d/$C""dai$CR/$Y""p$CR/$R""r$CR/$B""cc$CR/$B""c$CR/$B""n$CR/$BR""o"$CR"/$OR""h"$CR/$C""t""$CR"): "
read u_c
log
fi
8 changes: 8 additions & 0 deletions toolchain/modules
@@ -102,3 +102,11 @@ h-gpu CC=/apps/compilers/nvhpc/25.9/Linux_x86_64/25.9/comm_libs/mpi/bin/mpicc
h-gpu CXX=/apps/compilers/nvhpc/25.9/Linux_x86_64/25.9/comm_libs/mpi/bin/mpicxx
h-gpu FC=/apps/compilers/nvhpc/25.9/Linux_x86_64/25.9/comm_libs/mpi/bin/mpifort
h-gpu NVCOMPILER_COMM_LIBS_HOME=/apps/compilers/nvhpc/25.9/Linux_x86_64/25.9/comm_libs/12.9

t WPI Turing
t-all slurm
t-cpu gcc/12.1.0/i6yk33f openmpi/4.1.3/ebae7zc python/3.13.5/6anz4qy
t-gpu nvhpc/24.3/m4bujn7 python/3.13.5/6anz4qy
t-gpu CC=nvc CXX=nvc++ FC=nvfortran
t-gpu MFC_CUDA_CC=80,86
t-gpu MPI_HOME=/cm/shared/spack/opt/spack/linux-ubuntu20.04-x86_64/gcc-13.2.0/nvhpc-24.3-m4bujn7c3a4or53dloraad6pcqfxyqul/Linux_x86_64/24.3/comm_libs/openmpi/openmpi-3.1.5
66 changes: 66 additions & 0 deletions toolchain/templates/turing.mako
@@ -0,0 +1,66 @@
#!/usr/bin/env bash

<%namespace name="helpers" file="helpers.mako"/>

% if engine == 'batch':
#SBATCH --nodes=${nodes}
#SBATCH --ntasks-per-node=${tasks_per_node}
#SBATCH --cpus-per-task=1
#SBATCH --job-name="${name}"
#SBATCH --time=${walltime}
% if partition:
#SBATCH --partition=${partition}
% endif
% if account:
#SBATCH --account="${account}"
% endif
% if gpu_enabled:
#SBATCH --gres=gpu:1
#SBATCH --mem=208G
#SBATCH -C "A30|A100"
% endif
#SBATCH --output="${name}.out"
#SBATCH --error="${name}.err"
#SBATCH --export=ALL
% if email:
#SBATCH --mail-user=${email}
#SBATCH --mail-type="BEGIN, END, FAIL"
⚠️ Potential issue | 🟡 Minor

Minor: the mail-type value contains spaces, which may cause parsing issues.

The mail-type directive has spaces: "BEGIN, END, FAIL". Some Slurm versions expect comma-separated values without spaces.

🔧 Proposed fix
-#SBATCH --mail-type="BEGIN, END, FAIL"
+#SBATCH --mail-type=BEGIN,END,FAIL
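As an illustrative aside (not part of the suggestion itself), plain shell parameter expansion can normalize such a value by stripping the stray whitespace; the variable name below is hypothetical:

```shell
# Hypothetical sketch: strip whitespace from a Slurm mail-type list,
# since some Slurm versions reject "BEGIN, END, FAIL" with spaces.
mail_type="BEGIN, END, FAIL"
normalized="${mail_type//[[:space:]]/}"
echo "$normalized"
```

This prints `BEGIN,END,FAIL`, the space-free comma-separated form Slurm expects.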

% endif
% endif

${helpers.template_prologue()}

ok ":) Loading modules:\n"
cd "${MFC_ROOT_DIR}"
. ./mfc.sh load -c t -m ${'g' if gpu_enabled else 'c'}
cd - > /dev/null
echo

% if gpu_enabled:
export LD_LIBRARY_PATH=/cm/shared/spack/opt/spack/linux-ubuntu20.04-x86_64/gcc-13.2.0/cuda-12.3.0-vuydybqum6mloi2vvov7yn2juaurmtao/lib64:$LD_LIBRARY_PATH
% endif

% for target in targets:
${helpers.run_prologue(target)}

% if not mpi:
(set -x; ${profiler} "${target.get_install_binpath(case)}")
% else:
% if gpu_enabled:
(set -x; ${profiler} \
srun --gres=gpu:1 -C "A30|A100" \
$MPI_HOME/bin/mpirun --np ${nodes*tasks_per_node} \
"${target.get_install_binpath(case)}")
Comment on lines +50 to +53

⚠️ Potential issue | 🟠 Major

Fix inconsistency: GPU MPI path should use srun --mpi=pmi2 like the CPU path, not wrap mpirun.

The GPU execution (lines 50-53) unusually combines srun with mpirun, which can cause process management conflicts. The CPU path (lines 55-57) correctly uses srun --mpi=pmi2. For consistency and to avoid launcher conflicts, the GPU path should follow the same pattern:

Suggested fix
        % if gpu_enabled:
            (set -x; ${profiler}                                    \
-                srun --gres=gpu:1 -C "A30|A100"                     \
-                $MPI_HOME/bin/mpirun --np ${nodes*tasks_per_node}   \
+                srun --mpi=pmi2 --gres=gpu:1 -C "A30|A100"          \
+                --ntasks=${nodes*tasks_per_node}                    \
                "${target.get_install_binpath(case)}")
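For context, here is a minimal sketch of the launch line the pmi2 pattern expands to, using hypothetical values for `nodes` and `tasks_per_node` (the template substitutes the real values via Mako):

```shell
# Hypothetical example values; the Mako template fills these in for real jobs.
nodes=2
tasks_per_node=4
ntasks=$((nodes * tasks_per_node))
# Echo the resulting srun invocation rather than launching anything.
echo "srun --mpi=pmi2 --ntasks=${ntasks} ./simulation"
```

With these values the expanded command is `srun --mpi=pmi2 --ntasks=8 ./simulation`, i.e. a single launcher managing all ranks, with no nested mpirun.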

% else:
(set -x; ${profiler} \
srun --mpi=pmi2 --ntasks=${nodes*tasks_per_node} \
"${target.get_install_binpath(case)}")
% endif
% endif

${helpers.run_epilogue(target)}

echo
% endfor

${helpers.template_epilogue()}