Aurora (Intel-HPE): Website

Aurora, Argonne’s first exascale computer, features Intel Data Center GPU Max Series GPUs and Intel Xeon CPU Max Series processors with high-bandwidth memory (HBM), a PCIe CPU-GPU interconnect, the HPE Slingshot fabric system interconnect, and an HPE Cray EX platform. Aurora features several technological innovations and includes a revolutionary I/O system, the Distributed Asynchronous Object Store (DAOS), to support new types of machine learning and data science workloads alongside traditional modeling and simulation workloads.

The Aurora software stack includes the HPE HPCM software stack, the Intel oneAPI software development kit, and data and learning frameworks. Supported programming models include MPI, Intel oneAPI, OpenMP, SYCL/DPC++, Kokkos, RAJA, and others. HIP is supported via chipStar (formerly CHIP-SPV).
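As a concrete illustration of the SYCL/DPC++ path in this stack, the following minimal sketch offloads a vector addition to whatever GPU the runtime selects. It is a generic SYCL 2020 example rather than Aurora-specific code, and it assumes a oneAPI compiler with SYCL support (for example, icpx -fsycl vadd.cpp); the file and variable names are illustrative only.

// vadd.cpp -- minimal SYCL vector addition (illustrative sketch only).
// Assumed build with oneAPI: icpx -fsycl vadd.cpp -o vadd
#include <sycl/sycl.hpp>
#include <iostream>
#include <vector>

int main() {
    constexpr size_t n = 1 << 20;
    std::vector<float> a(n, 1.0f), b(n, 2.0f), c(n, 0.0f);

    // Request a GPU device from the SYCL runtime.
    sycl::queue q{sycl::gpu_selector_v};
    std::cout << "Running on: "
              << q.get_device().get_info<sycl::info::device::name>() << "\n";

    {
        // Buffers manage host<->device data movement automatically.
        sycl::buffer<float> A(a.data(), sycl::range<1>(n));
        sycl::buffer<float> B(b.data(), sycl::range<1>(n));
        sycl::buffer<float> C(c.data(), sycl::range<1>(n));

        q.submit([&](sycl::handler& h) {
            sycl::accessor ra(A, h, sycl::read_only);
            sycl::accessor rb(B, h, sycl::read_only);
            sycl::accessor wc(C, h, sycl::write_only, sycl::no_init);
            h.parallel_for(sycl::range<1>(n),
                           [=](sycl::id<1> i) { wc[i] = ra[i] + rb[i]; });
        });
    }   // buffer destructors copy the result back into c

    std::cout << "c[0] = " << c[0] << "\n";   // expected: 3
    return 0;
}

The same source is intended to build against other SYCL implementations as well, which is the portability argument behind listing SYCL/DPC++ alongside OpenMP, Kokkos, and RAJA.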

Aurora System Configuration 
Architecture  Intel / HPE
Node  2x 4th Gen Intel Xeon Max Series CPUs with HBM + 6x Intel Data Center GPU Max Series GPUs
Node Count  10,624
GPU Architecture  Intel Data Center GPU Max Series; tile-based chiplets, HBM stack, Foveros 3D integration
Interconnect  HPE Slingshot 11; Dragonfly topology with adaptive routing; 25.6 TB/s per switch, from 64 ports of 200 Gb/s each (25 GB/s per direction)
File System  230 PB, 31 TB/s (DAOS)
Peak Performance  ≥ 2 EF DP peak
Aggregate System Memory  20 PB
Aggregate DDR5 Memory  10.6 PB
Aggregate CPU HBM  1.3 PB
Aggregate GPU HBM  8.1 PB
Node Memory Architecture  Unified memory architecture; RAMBO

The most recent information on Aurora can be found at https://docs.alcf.anl.gov/aurora/machine-overview/

Frontier (Cray-HPE): Website

Frontier is an HPE Cray EX supercomputer located at the Oak Ridge Leadership Computing Facility. With a theoretical peak double-precision performance of approximately 2 exaflops (2 quintillion calculations per second), it is the fastest system in the world for a wide range of traditional computational science applications. The system has 77 Olympus rack HPE cabinets, each with 128 AMD compute nodes, and a total of 9,858 AMD compute nodes.

Each Frontier compute node consists of a single 64-core AMD 3rd Gen EPYC CPU and 512 GB of DDR4 memory. Each node also contains 4 AMD MI250X GPUs, each with 2 Graphics Compute Dies (GCDs), for a total of 8 GCDs per node. Each GCD has 64 GB of high-bandwidth memory (HBM2E), for a total of 512 GB of HBM per node.
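Since each GCD is presented to software as its own GPU, codes on Frontier generally see eight devices per node. The short sketch below simply enumerates the HIP devices visible to a process and reports their memory; it is an illustrative example assuming the ROCm/HIP runtime (built with, for example, hipcc devices.cpp), not Frontier-specific code.

// devices.cpp -- enumerate visible HIP devices (illustrative sketch only).
// Assumed build with ROCm: hipcc devices.cpp -o devices
#include <hip/hip_runtime.h>
#include <cstdio>

int main() {
    int count = 0;
    if (hipGetDeviceCount(&count) != hipSuccess || count == 0) {
        std::fprintf(stderr, "No HIP devices visible to this process\n");
        return 1;
    }
    // On a Frontier compute node, a process that can see every GPU would
    // report 8 devices, one per MI250X Graphics Compute Die (GCD).
    std::printf("Visible GPU devices: %d\n", count);
    for (int i = 0; i < count; ++i) {
        hipDeviceProp_t prop;
        if (hipGetDeviceProperties(&prop, i) == hipSuccess) {
            std::printf("  device %d: %s, %.1f GiB memory\n", i, prop.name,
                        prop.totalGlobalMem / (1024.0 * 1024.0 * 1024.0));
        }
    }
    return 0;
}

In practice, job launchers are typically used to bind each rank to a subset of these devices, so an individual MPI rank may see fewer than eight.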

Frontier System Configuration 
Architecture  HPE / AMD
Node  1x 3rd Gen AMD EPYC CPU + 4x AMD Instinct MI250X GPUs
Node Count  9,858
GPU Link  AMD Infinity Fabric
Interconnect  4x HPE Slingshot NICs providing 100 GB/s network bandwidth
File System  700 PB Lustre center-wide file system + 11 PB flash
Peak Performance  ≥ 2 EF DP peak

Further details may be found at https://docs.olcf.ornl.gov/systems/frontier_user_guide.html

Polaris (HPE): Website

Developed in collaboration with Hewlett Packard Enterprise (HPE), the 44-petaflop system called Polaris is available to users through the INCITE 2028 program. Polaris was originally scheduled to leave INCITE after 2026 but will now remain available through 2028. Polaris is a hybrid CPU/GPU leading-edge testbed system that gives scientists and application developers a platform to test and optimize codes for Aurora, Argonne’s Intel-HPE exascale supercomputer.

The Polaris software environment is equipped with the HPE Cray programming environment, HPE Performance Cluster Manager (HPCM) system software, and the ability to test programming models, such as OpenMP and SYCL, that are available on Aurora and the next generation of DOE’s high performance computing systems. Polaris users also benefit from NVIDIA’s HPC software development kit, a suite of compilers, libraries, and tools for GPU code development.
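As an illustration of the OpenMP offload model mentioned above, the sketch below runs a SAXPY loop on the GPU using a target directive. It is a generic example rather than Polaris-specific code; the build line assumes the NVIDIA HPC SDK compiler (for example, nvc++ -mp=gpu saxpy.cpp), and the file and variable names are illustrative.

// saxpy.cpp -- OpenMP target-offload SAXPY (illustrative sketch only).
// Assumed build with the NVIDIA HPC SDK: nvc++ -mp=gpu saxpy.cpp -o saxpy
#include <cstdio>
#include <vector>

int main() {
    const int n = 1 << 20;
    const float alpha = 2.0f;
    std::vector<float> x(n, 1.0f), y(n, 1.0f);
    float* xp = x.data();
    float* yp = y.data();

    // Map x and y to the device, distribute the loop across GPU threads,
    // and copy y back when the target region ends.
    #pragma omp target teams distribute parallel for \
            map(to: xp[0:n]) map(tofrom: yp[0:n])
    for (int i = 0; i < n; ++i) {
        yp[i] = alpha * xp[i] + yp[i];
    }

    std::printf("y[0] = %f\n", y[0]);   // expected: 3.0
    return 0;
}

Because the directive-based loop contains no vendor-specific API calls, the same code can, in principle, be rebuilt with an OpenMP offload compiler targeting Aurora’s GPUs, which is the portability workflow Polaris is meant to exercise.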

Polaris System Configuration 
Platform:  HPE Apollo 6500 Gen10+ 
Compute Node:  1 AMD EPYC “Milan” processor + 4 NVIDIA A100 GPUs 
System Size:  560 nodes 
System Peak:  44 PF DP 
System Memory:  280 TB CPU DDR4 + 87.5 TB GPU HBM2 
Peak Power:  1.8 MW 
Node Performance:  78 TF DP 
Memory/node:  512 GB CPU DDR4 + 160 GB GPU HBM2 
Interconnect:  HPE Slingshot 
Node-to-Node Interconnect:  200 Gb/s 
Programming Models:  OpenMP 4.5/5, SYCL, Kokkos, RAJA, HIP 
Performance/Debugging:  GPU tools, PAPI, TAU, HPCToolkit, DDT 
Frameworks:  Python/Numba, TensorFlow, PyTorch