A Performance Tool for Tomorrow’s Supercomputers and Applications

Minimal Metrics has partnered with Sandia National Laboratories to build a performance tool for tomorrow’s Exascale-class supercomputers, and to help inform their design.

The system has its roots in a prototype first developed by Philip and colleagues at the Parallelldatorcentrum (PDC) at the Royal Institute of Technology in Stockholm, Sweden. It is a software suite designed to optimize an institution’s entire HPC software and hardware ecosystem. It can analyze individual HPC applications and their threads of execution, as well as entire workloads, groups, users and multiple disjoint systems.

It accomplishes this by integrating best-of-breed dashboarding and visualization methodologies with state-of-the-art performance data collection. It provides detailed performance metrics from the underlying architecture, including memory bandwidth, memory hierarchy behavior and latencies, vectorization, hardware resource utilization, computational intensity and instruction mix. The system is able to identify on-node and off-node scaling issues, including message passing performance, load imbalance, false sharing, and coherency operations. Through its architecture-specific metrics, it is able to identify applications and code amenable to acceleration, whether on GPUs, FPGAs or many-core systems such as Intel’s Knights Landing.
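To make a few of these metrics concrete, here is a minimal sketch of how some of them are commonly derived from raw measurements. This is not the system's implementation. The function names and input values are hypothetical, and the formulas are the usual textbook definitions: computational intensity as floating-point operations per byte moved, memory bandwidth as bytes moved per second, and load imbalance as the slowest rank's time relative to the mean.

```python
from statistics import mean


def computational_intensity(flops: float, bytes_moved: float) -> float:
    """Floating-point operations per byte of memory traffic."""
    if bytes_moved <= 0:
        raise ValueError("bytes_moved must be positive")
    return flops / bytes_moved


def memory_bandwidth(bytes_moved: float, seconds: float) -> float:
    """Achieved memory bandwidth in bytes per second."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return bytes_moved / seconds


def load_imbalance(per_rank_seconds: list[float]) -> float:
    """Slowest rank's time divided by the mean, minus one.

    0.0 means perfectly balanced; 1.0 means the slowest rank
    took twice the average time.
    """
    if not per_rank_seconds:
        raise ValueError("need at least one rank")
    return max(per_rank_seconds) / mean(per_rank_seconds) - 1.0


# Hypothetical measurements for a single run (illustrative values only).
intensity = computational_intensity(flops=2e9, bytes_moved=8e9)
bandwidth = memory_bandwidth(bytes_moved=8e9, seconds=2.0)
imbalance = load_imbalance([1.0, 1.0, 2.0, 4.0])
```

A low intensity combined with bandwidth near the hardware peak suggests a memory-bound code, while a high imbalance points to scaling problems rather than single-node inefficiency.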

The system treats performance as a system-health issue, providing complete drill-down accountability from the entire site-wide workload down to individual application threads. It is completely transparent; it works on applications written in any language and does not require the user to modify those applications. The system is also highly efficient and daemon-free: it does not require any additional processes or threads to perform measurement. Advanced monitoring methods allow for near-zero overhead performance data collection, even for applications with hundreds of millions of threads.

We’re not the only people working on this problem, and many of the others are our friends. But a few are our competitors and, as such, we can’t say much more at this time. Plus, we don’t want to spoil the surprise. Watch this space for more!

If your organization is interested in using this system or collaborating on its development, please drop us an email.
