The System Integrates Inspur’s ClusterEngine, a Cluster Management Platform, and Teye, an HPC Application Performance Monitoring and Analysis System, to Address Management and Application Optimization Challenges for Massive HPC Clusters.
At SC19 in Denver, Inspur released an HPC system based on Intel Xeon Platinum 9200 processors, developed in collaboration with Intel. Each node supports up to 112 cores, 9.3 TFLOPS of FP64 performance and 24 memory channels. A 2U server holds up to four nodes and uses on-board liquid cooling. The system gives HPC users worldwide a cost-efficient, high-density and energy-efficient supercomputing platform.
As data volumes and computing loads surge, HPC applications need higher compute performance, memory bandwidth and data throughput. Massive HPC systems also raise new challenges for the density and cooling of computing devices. The HPC system released by Inspur answers the growing demand for compute performance from massively parallel HPC applications, helping users solve major scientific problems that involve ever larger-scale computation.
The system is built on Intel Xeon Platinum 9200 processors, which are designed for compute-intensive workloads. Each compute node supports up to two 56-core CPUs with built-in AI acceleration from Intel Deep Learning Boost, along with 24 channels of DDR4-2933 memory. Peak FP64 performance reaches 9.3 TFLOPS per node, and the Xeon Platinum 9200 is the highest-performing processor in Intel’s 2nd generation Xeon Scalable family. Interconnected through the high-speed Intel Omni-Path Architecture network, the system scales easily to clusters of thousands of nodes. It is particularly suited to HPC, big data, image and video processing, virtualized applications and other computation-intensive scenarios. Tests on a single-node Intel Xeon Platinum 9242 platform with 24 32 GB DDR4-2933 memory modules show memory bandwidth of up to 300 GB/s when running WRF, and as high as 450 GB/s when running VASP, with VASP performance reaching 4,000 GFLOPS.
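For readers who want to put the per-node figures in context, the following is a rough back-of-envelope sketch, not a vendor calculation. It estimates theoretical peak memory bandwidth and peak FP64 throughput from the configuration quoted above, then expresses the reported WRF and VASP bandwidths as a fraction of that peak. The 8 bytes per transfer (a standard 64-bit DDR4 channel), the 32 FP64 operations per core per cycle (two AVX-512 FMA units) and the 2.6 GHz clock are assumptions that do not appear in the announcement, and the function names are purely illustrative.

```python
# Back-of-envelope peak figures for one compute node.
# Inputs marked "assumed" are NOT stated in the announcement.

CHANNELS_PER_NODE = 24      # from the announcement
TRANSFER_RATE_MT_S = 2933   # DDR4-2933, from the announcement
BYTES_PER_TRANSFER = 8      # assumed: standard 64-bit DDR4 channel

CORES_PER_NODE = 112        # from the announcement (2 x 56 cores)
FP64_OPS_PER_CYCLE = 32     # assumed: 2 AVX-512 FMA units x 8 doubles x 2 ops
CLOCK_GHZ = 2.6             # assumed clock frequency

MEASURED_GB_S = {"WRF": 300.0, "VASP": 450.0}  # from the announcement


def peak_memory_bandwidth_gb_s(channels, mt_per_s, bytes_per_transfer):
    """Theoretical peak memory bandwidth in GB/s."""
    return channels * mt_per_s * 1e6 * bytes_per_transfer / 1e9


def peak_fp64_tflops(cores, ops_per_cycle, clock_ghz):
    """Theoretical peak FP64 throughput in TFLOPS."""
    return cores * ops_per_cycle * clock_ghz / 1e3


def fraction_of_peak(measured, peak):
    """Measured value as a fraction of the theoretical peak."""
    return measured / peak


if __name__ == "__main__":
    bw = peak_memory_bandwidth_gb_s(
        CHANNELS_PER_NODE, TRANSFER_RATE_MT_S, BYTES_PER_TRANSFER
    )
    flops = peak_fp64_tflops(CORES_PER_NODE, FP64_OPS_PER_CYCLE, CLOCK_GHZ)
    print(f"Peak memory bandwidth: {bw:.1f} GB/s")
    print(f"Peak FP64 throughput:  {flops:.1f} TFLOPS")
    for app, measured in MEASURED_GB_S.items():
        share = fraction_of_peak(measured, bw)
        print(f"{app}: {measured:.0f} GB/s is {share:.0%} of peak")
```

Under these assumptions the theoretical peak comes to roughly 563 GB/s of memory bandwidth per node, so the reported 300 GB/s and 450 GB/s correspond to about 53% and 80% of peak respectively, and the assumed clock reproduces the quoted 9.3 TFLOPS figure.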
In addition, the system integrates Inspur’s ClusterEngine, a cluster management platform, and Teye, an HPC application performance monitoring and analysis system, to address management and application optimization challenges for massive HPC clusters. ClusterEngine provides comprehensive management of HPC clusters, while Teye offers data-driven guidance for investigating application bottlenecks, improving application algorithms and raising parallel computing efficiency. Inspur’s HPC experts also have more than 10 years of experience in application optimization and maintain nearly 300 application databases across more than 10 industries. This enables Inspur to provide customized application optimization services that enhance computing efficiency.