As of February 2022, running jobs on the compute nodes of the Kraken cluster is only possible through the queuing system ([[https://slurm.schedmd.com/documentation.html|SLURM]]). Compute nodes are not directly accessible from the network. The special administrative node "kraken" is available for connecting to the cluster and for preparing and submitting jobs to the queuing system. The administrative node is not intended for running computations; use it primarily for data manipulation, submitting jobs to the queues, and compiling custom programs. Parallel jobs cannot be run outside the queues. The "NoCompute" queue is intended for computationally undemanding programs (e.g. Paraview); this queue runs only on the administrative node (whose RAM has been increased to 320 GB to allow processing of large data).

In addition to the [[computing:cluster:fronty:start#zakladni_prikazy|SLURM commands]], the ''freenodes'' command shows the current cluster load:

{{:computing:cluster:fronty:freenodes.png?direct&400|}}

**Users queue a job and let the queuing system run the computation. The job is queued according to the system's internal priorities and waits for execution. The queuing system runs the job as soon as the requested compute resources become available. Users do not need to monitor the availability of compute capacity themselves; they can log out of the cluster and wait for the computation to complete. A notification email can be sent upon completion of the job, for details see [[computing:cluster:fronty:start#parametry_prikazu_srun_a_sbatch|Command Overview]].**

A basic description of how to work with the queuing system (SLURM) follows below; specifics on starting individual applications are on separate pages:
  * [[computing:cluster:fronty:abaqus|Abaqus]]
  * [[computing:cluster:fronty:ansys|Ansys (Fluent)]]
  * [[computing:cluster:fronty:comsol|Comsol]]
  * [[computing:cluster:fronty:matlab|Matlab]]
  * [[computing:cluster:fronty:openfoam|OpenFOAM]]
  * [[computing:cluster:fronty:paraview|Paraview]]
  * [[computing:cluster:fronty:pmd|PMD]]

====== SLURM queuing system ======

The queuing system takes care of optimal cluster utilization and provides a number of tools for job submission, job control and parallelization. All tasks are performed after logging into the administrative node "kraken" (''ssh username@kraken''). Full documentation can be found at [[https://slurm.schedmd.com|slurm.schedmd.com]].

==== Basic commands ====

=== Running jobs ===

There are two commands for queuing a job, ''srun'' and ''sbatch'':

  srun
//Key command for queuing a job.// For parallel jobs it replaces the ''mpirun'' command (the MPI libraries provided as modules therefore do not offer the ''mpirun'' command either).\\
''srun'' used in this way requests resources according to the specified options and runs the program on them. If you are running a **non-**parallel job, leave the parameter ''-n 1'' (the default); if you choose a higher value, the non-parallel program will run n times!

  sbatch
//Submits the job to the queue according to a prepared script, see the examples below. The script for a parallel job usually contains a line with the ''srun'' command (commercial codes are usually run without ''srun'', see the individual application pages above).// The most common way to queue a job is ''sbatch'' + script.
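For illustration, a minimal ''sbatch'' startup script might look like the following sketch (the module name ''openmpi'' and the program name ''my_parallel_program'' are placeholders, adjust them to your own job):

<code bash>
#!/bin/bash
#SBATCH --job-name=my_first_job   # job name shown in the squeue output
#SBATCH --partition=Mshort        # queue (partition), see the table of queues below
#SBATCH --ntasks=8                # number of cores (tasks) requested
#SBATCH --output=out.txt          # file for standard output
#SBATCH --error=err.txt           # file for standard error

# load the MPI library from the modules (illustrative module name)
module load openmpi

# srun starts the program on the allocated resources (it replaces mpirun)
srun ./my_parallel_program
</code>

Saved e.g. as ''job.sh'', the script is submitted with ''sbatch job.sh''.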
=== Task management ===

  sinfo
//Lists the queues (partitions) and their current usage.//

  squeue
//Lists information about jobs currently in the queuing system.//\\
Meaning of the abbreviations in the //squeue// output ([[https://curc.readthedocs.io/en/latest/running-jobs/squeue-status-codes.html|complete list here]]):
  - In the "ST" (status) column: **R** - running, **PD** - pending (waiting for allocation of resources), **CG** - completing (some processes have finished, but some are still active), ...
  - In the "REASON" column: **Priority** - a task with higher priority is in the queue, **Dependency** - the task is waiting for the completion of the task it depends on and will be started afterwards, **Resources** - the task is waiting for the required resources to be released, ...

  scancel
//Cancels a queued or running task.//

=== User information ===

  sacct
//Lists information about the user's jobs (including history).//

{{:computing:cluster:fronty:slurm_command_summary.pdf|List of commands with parameters in a PDF document.}}

===== Running the tasks =====

Tasks can run on multiple nodes, but always within a single part of the cluster:
  * part **M** - machines kraken-m1 to m10 (all users)
  * part **L** - machines kraken-l1 to l4 (limited access)

Tasks can be started:
  * directly from the command line with the ''srun'' command
  * using a script with the ''sbatch'' command

==== Guidelines for running jobs ====

  * A job must always run under a queue (partition). If no queue is specified, ''Mexpress'' is used. A list of the defined queues is given below.
  * The chosen queue defines the run time limit.
  * Tasks in the express and short queues cannot be given a longer run time using ''-''''-time''. The default time in the long queues is set to 1 week, but they allow runs of up to 2 weeks, e.g. 9 days and 5 hours by specifying ''-p Llong'' ''-''''-time=9-05:00:0''.
  * When ordering pending tasks, Slurm gives higher priority to tasks and users that use the cluster less. It is therefore not advantageous to request a longer computation time than strictly necessary.

==== Predefined queues and time limits ====

There are 6 queues ("partitions") on the Kraken cluster, divided by job run length (express, short, long) and by cluster part ("Mxxx" and "Lxxx"). If the user does not specify a queue with the ''-''''-partition'' switch, the default value (Mexpress) is used:

^ cluster part ^ partition ^ nodes ^ time limit ^
| M (nodes kraken-m[1-10]) | **Mexpress** | kraken-m[1-10] | 6 hours |
| ::: | Mshort | kraken-m[1-10] | **2 days** |
| ::: | ::: | ::: | 3 days |
| ::: | Mlong | kraken-m[3-6], kraken-m8 | **1 week** |
| ::: | ::: | ::: | 2 weeks |
| L (nodes kraken-l[1-4]) | Lexpress | kraken-l[1-4] | 6 hours |
| ::: | Lshort | kraken-l[1-4] | 2 days |
| ::: | Llong | kraken-l[1-4] | **1 week** |
| ::: | ::: | ::: | 2 months (max) |
| admin node only | NoCompute | kraken | **1 hour** |
| ::: | ::: | ::: | 8 hours |
//bold = default time limit, the second value in a group is the maximum//

Details of the settings can also be viewed using the command:

  scontrol show partition [partition_name]

==== Parameters for the ''srun'' and ''sbatch'' commands ====

The program run is controlled by parameters. For the ''srun'' command they are entered directly on the command line; for the ''sbatch'' command they are written into the startup script, where each parameter is preceded by the identifier ''#SBATCH''. Options can be entered in two forms, either the full form ''-''''-ntasks=5'' (two hyphens and an equals sign) or the abbreviated form ''-n 5'' (one hyphen and a space).
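As an example, the following two equivalent requests ask for 4 cores in the ''Mshort'' queue, once directly on the command line with ''srun'' (abbreviated form) and once as ''#SBATCH'' lines in a startup script (full form); ''./my_program'' is only a placeholder:

<code bash>
# direct run from the command line, abbreviated option form (one hyphen, space)
srun -p Mshort -n 4 ./my_program

# the same request written in an sbatch script, full option form (two hyphens, equals sign)
#SBATCH --partition=Mshort
#SBATCH --ntasks=4
</code>

An overview of the most common options is given in the following table: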
^ option ^ description ^ example ^
| ``-J``, ``-````-job-name=`` | Job name, shown e.g. in the output of //squeue// | ``-J my_first_job`` |
| ``-p``, ``-````-partition=`` | Request a specific partition (queue) for the resource allocation | ``-p Mshort`` |
| ``-n``, ``-````-ntasks=`` | Number of resources (~cores) to be allocated for the task | ``-n 50`` |
| ``-N``, ``-````-nodes=`` | Number of nodes to be used | ``-N 3`` |
| ``-````-mem`` | Job memory request | ``-````-mem=1gb`` |
| ``-o``, ``-````-output=`` | File to which slurm writes the standard output | ``-o out.txt`` |
| ``-e``, ``-````-error=`` | File to which slurm writes the standard error | ``-e err.txt`` |
| ``-````-mail-user=`` | User to receive email notifications of state changes as defined by ``-````-mail-type`` | ``-````-mail-user=my@email`` |
| ``-````-mail-type=`` | Send email on BEGIN, END, FAIL, ALL, ... | ``-````-mail-type=BEGIN,END`` |
| ``-````-ntasks-per-node=`` | Request that this many tasks be invoked on each node | |
| ``-t``, ``-````-time=