[[computing:cluster:|{{:up1.png?direct|}}]]
[[computing:cluster:|Cluster KRAKEN]]
====== KRAKEN - Hardware ======
The KRAKEN cluster is composed of a **frontend node** (access, queue control, job preparation, ...)
**It is prohibited to run computational tasks on the frontend node!**
^ Processor:|[[https://www.amd.com/en/products/cpu/amd-epyc-7302p|AMD EPYC 7302P 16-Core Processor]] |16 cores, 3.0GHz, hyperthreading, 128MB cache|
^ Memory:|320GB|DDR4 3200 ECC|
^ Disk space:|2x 960GB| NVMe M.2 SSD|
^ Remote control:|IPMI |KVM-o-E |
The cluster also has two computing parts (only the "M" part is accessible to all users):
=== M - as MultiCore ===
Part M contains a total of **10 compute nodes** (576 cores in total, 3.33TB RAM) built on three processor architectures:
**1. Intel - Broadwell**, 6 nodes (kraken-m1, ..., kraken-m6):
^ Motherboard:|[[https://www.supermicro.nl/products/motherboard/Xeon/C600/X10DRW-ET.cfm|SUPERMICRO X10DRW-ET]] |2x Intel Xeon processor E5-2600 v4, max. 2TB RAM, 2x 10 Gbit Ethernet, Remote control |
^ Processors:|2x [[https://ark.intel.com/products/91766/Intel-Xeon-Processor-E5-2683-v4-40M-Cache-2_10-GHz|Intel(R) Xeon(R) CPU E5-2683 v4 @ 2.10-3.0GHz]] |**16 cores, hyperthreading, 40MB cache** |
^ Memory:|256GB per node |DDR4 2400MHz ECC reg. |
^ Disk storage:|4x 6TB SATA, 2x1TB SSD | [[https://toshiba.semicon-storage.com/us/product/storage-products/enterprise-hdd/mg04acaxxxx.html|TOSHIBA MG04ACA6]], [[ https://www.micron.com/parts/solid-state-storage/ssd/mtfddak960tcc-1ar16ab?pc={BD70429E-50BD-4D5C-A386-3E2D4143F5B3}|Micron_5100_MTFD]] |
^ Remote control:|IPMI |IPMI 2.0 with virtual media over LAN and KVM-over-LAN support |
**2. AMD - Zen 2**, 3 nodes (kraken-m7, ..., kraken-m9), in operation since 10/2021:
^ Processors:|2x [[https://www.amd.com/en/products/cpu/amd-epyc-7552|2nd Gen AMD EPYC(TM) 7552]] |**48 cores, 2.2-3.3GHz, 192MB cache** (96 cores per node) |
^ Memory:|512GB per node|DDR4 3200MHz ECC |
^ Disk storage:|960GB per node |NVMe M.2 SSD |
^ Remote control:|IPMI |KVM-o-E |
**3. AMD - Zen 4**, 1 node (kraken-m10), in operation since 11/2023:
^ Processors:|1x [[https://www.amd.com/en/products/cpu/amd-epyc-9654p|4th Gen AMD EPYC(TM) 9654P]] |**96 cores, 2.4-3.7GHz, 384MB cache** |
^ Memory:|256GB |DDR5 4800MHz ECC |
^ Disk storage:|960GB |NVMe M.2 SSD |
^ Remote control:|IPMI |KVM-o-E |
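The totals quoted for Part M (10 nodes, 576 cores, 3.33TB RAM) follow from the per-node figures in the three tables above. A minimal Python sketch of that arithmetic is given below. The dictionary layout and the names in it are illustrative only and are not part of any cluster tooling; 1 TB is taken as 1000 GB.

```python
# Per-node figures copied from the Part M tables; the structure is illustrative.
PART_M = {
    "broadwell": {"nodes": 6, "sockets": 2, "cores_per_socket": 16, "ram_gb": 256},
    "zen2": {"nodes": 3, "sockets": 2, "cores_per_socket": 48, "ram_gb": 512},
    "zen4": {"nodes": 1, "sockets": 1, "cores_per_socket": 96, "ram_gb": 256},
}


def totals(groups):
    """Return (node count, physical cores, RAM in GB) summed over all groups."""
    nodes = sum(g["nodes"] for g in groups.values())
    cores = sum(
        g["nodes"] * g["sockets"] * g["cores_per_socket"] for g in groups.values()
    )
    ram_gb = sum(g["nodes"] * g["ram_gb"] for g in groups.values())
    return nodes, cores, ram_gb


nodes, cores, ram_gb = totals(PART_M)
# prints: 10 nodes, 576 cores, 3.33 TB RAM
print(f"{nodes} nodes, {cores} cores, {ram_gb / 1000:.2f} TB RAM")
```

The core count refers to physical cores; logical CPUs exposed by hyperthreading are not included.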
=== L - as LowCore (available only to selected users) ===
Part L contains **4 nodes** (kraken-l1,...,kraken-l4):
^ Motherboard:|[[https://www.supermicro.nl/products/motherboard/Xeon/C600/X10DRW-ET.cfm|SUPERMICRO X10DRW-ET]] |2x Intel Xeon processor E5-2600 v4, max. 2TB RAM, 2x 10 Gbit Ethernet, Remote control |
^ Processors:|2x [[https://ark.intel.com/products/92983/Intel-Xeon-Processor-E5-2637-v4-15M-Cache-3_50-GHz|Intel(R) Xeon(R) CPU E5-2637 v4 @ 3.50GHz]] |**4 cores, hyperthreading, 15MB cache** |
^ Memory:|256GB per node |DDR4 2400 ECC reg. |
^ Disks:|4x 6TB SATA, 2x1TB SSD | [[https://toshiba.semicon-storage.com/us/product/storage-products/enterprise-hdd/mg04acaxxxx.html|TOSHIBA MG04ACA6]], [[ https://www.micron.com/parts/solid-state-storage/ssd/mtfddak960tcc-1ar16ab?pc={BD70429E-50BD-4D5C-A386-3E2D4143F5B3}|Micron_5100_MTFD]] |
^ Remote control:|IPMI |IPMI 2.0 with virtual media over LAN and KVM-over-LAN support |
== Server room temperature ==
{{ https://eye.it.cas.cz/g/?.png? |Temperature TR1- does not work}}
Cluster performance is limited according to the server room temperature:
  - 32°C - 34°C: no additional queued jobs are started (DRAIN mode)
  - 34°C - 36°C: machines are shut down (DOWN mode)
//DRAIN mode// is applied first to machines that are about to run out of jobs.
//DOWN mode// is applied first to machines running jobs with a lower "run time/declared run time" ratio.
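The temperature policy above can be sketched in Python as follows. This is only an illustration, not the cluster's actual control script. The function names, the Job record, and the "NORMAL" mode name are hypothetical, and the sketch assumes that DRAIN begins at 32°C, that DOWN begins at 34°C, and that each node runs a single job.

```python
from dataclasses import dataclass

DRAIN_FROM_C = 32.0  # assumed lower edge of the DRAIN band
DOWN_FROM_C = 34.0   # assumed lower edge of the DOWN band


def mode_for(temp_c):
    """Map a server room temperature to a mode name (names are illustrative)."""
    if temp_c >= DOWN_FROM_C:
        return "DOWN"
    if temp_c >= DRAIN_FROM_C:
        return "DRAIN"
    return "NORMAL"


@dataclass
class Job:
    node: str               # node the job runs on (one job per node assumed)
    run_time_h: float       # time the job has already run, in hours
    declared_time_h: float  # run time declared at submission, must be > 0


def down_order(jobs):
    """Return node names, lowest "run time/declared run time" ratio first."""
    ranked = sorted(jobs, key=lambda j: j.run_time_h / j.declared_time_h)
    return [j.node for j in ranked]


jobs = [
    Job("kraken-m1", 20.0, 24.0),
    Job("kraken-m2", 2.0, 24.0),
    Job("kraken-m3", 10.0, 24.0),
]
print(mode_for(33.0), down_order(jobs))
```

With these sample jobs, kraken-m2 comes first in the DOWN order because its job has completed the smallest share of its declared run time.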
During extended periods of high temperature, nodes are kept down in the following order:
  - m1-m6 nodes
  - l1-l4 nodes
  - m7-m10 nodes
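For scripting, the order above can be expanded into hostnames. The sketch below assumes the kraken-mN / kraken-lN naming used in the node lists on this page. The helper name is hypothetical, and the order of nodes within one group is arbitrary, since only the order of the groups is specified.

```python
# Groups in the order given above: (hostname prefix, first index, last index).
SHUTDOWN_GROUPS = [("kraken-m", 1, 6), ("kraken-l", 1, 4), ("kraken-m", 7, 10)]


def shutdown_sequence(groups=SHUTDOWN_GROUPS):
    """Expand the groups into a flat hostname list, first to go down first."""
    return [
        f"{prefix}{i}"
        for prefix, first, last in groups
        for i in range(first, last + 1)
    ]


print(shutdown_sequence())
```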