High-Quality Debugging Capabilities Help STFC Daresbury Laboratory Simplify and Shorten Software Development

The UK’s Science and Technology Facilities Council (STFC), one of seven UK research councils, provides scientists and researchers with access to supercomputing facilities around the world in support of technology and scientific development.

STFC’s Daresbury Laboratory is located on the Daresbury Science & Innovation Campus in Warrington, UK, which is one of two national Science and Innovation Campuses. The 550 employees at Daresbury Laboratory support the hosting and operation of strategic facilities and services for the UK’s scientific research community and a broad base of academic, government, commercial, and industrial customers.

The goals of STFC include advancing knowledge and technology, funding university researchers, and providing trained resources to contribute to the nation’s economic competitiveness and well-being. Proving the significant scientific, social, and economic impact of the research programs is vital to maintaining and growing the programs of the Laboratory, particularly in the financial environment of the past several years.

TotalView Helps STFC…

Develop across a wide range of scientific disciplines.

Create a technically advanced development environment.

Reduce software development time across development teams.

“TotalView … greatly simplifies the design-develop-test cycle, allowing our scientists to be more productive and to develop better software. This impacts STFC’s productivity, makes us more competitive, and increases opportunities for future projects.”

Devs Need a Comprehensive Debugging Tool

STFC Daresbury Laboratory provides the scientific and research community with access to high-end computers, including an IBM Blue Gene/P system, for very complex large-scale parallel applications. To establish the most efficient and technically advanced environment possible for the developers and scientists using the system, Daresbury invests in tools and applications that make the system as productive as possible for its users. This environment enables the highest quality of code to be produced during the software development cycle. The ability to resolve source problems, investigate code issues, and utilize computing resources optimally depends, to a large degree, upon reliable, state-of-the-art debugging and analysis tools.

The need for developers to be able to debug and understand data flows in parallel code compelled STFC Daresbury Laboratory to look for a comprehensive debugging tool. Without such a tool, development time for parallelizing serial applications would be significantly longer, as would the time required for understanding and optimizing existing parallel code. In some cases, without such a debugger, it would be virtually impossible to port code from one platform to another. A high-quality debugger allows for better code validation, especially where graphical views of data arrays can be used to identify numerical anomalies. A good debugger can even help avert potentially inferior performance due to non-optimal message passing strategies, memory utilization, or data placement.

The Best Features, Speed, and the Most Reliable Performance

After investigating the available debuggers, STFC Daresbury Laboratory selected the TotalView debugger. TotalView provided the best features, speed, and the most reliable performance at a competitive price. TotalView has been used to help scientists at Daresbury develop applications across a wide range of scientific disciplines: atomic and molecular physics, computational chemistry, molecular dynamics, materials, engineering, computational fluid dynamics (CFD), and ocean and atmospheric simulations. These applications require code to run efficiently on tens of thousands of cores. TotalView is helping the developers to meet that challenging requirement.

TotalView has reduced software development time by replacing compile-run-print cycles and command-line debugging with a faster, more comprehensive interactive debugging procedure. This matters because STFC Daresbury Laboratory’s software is constantly evolving. Interactive debugging has also improved software quality: resolving one specific problem in parallel code often leads to a better understanding of how data flows through and between the parallel processes as the computation evolves, and in turn to a better understanding of the overall organization and design of the software.

Mike Ashworth, Associate Director at STFC’s Computational Science and Engineering Department, reports that TotalView was also used to improve the numerical stability of one of the codes. The root causes of the instability were difficult to track, as the numerical problems emerged gradually at run time at non-deterministic points in a large multi-dimensional array. TotalView’s multi-dimensional data array graphical viewer was an invaluable aid in tracking down and resolving the issue, allowing the code developers to map how the problem was emerging, growing, and spreading as the computation proceeded.
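
To make the idea of mapping an anomaly as it emerges, grows, and spreads more concrete, here is a minimal Python sketch. It is not TotalView's array viewer and not STFC's code, only a programmatic analogue: the function names, the magnitude limit, and the tiny 3x3 field are hypothetical, chosen purely for illustration. For each time step it counts the cells that are non-finite or implausibly large and reports the bounding box that contains them.

```python
import math


def anomaly_footprint(grid, limit):
    """Return (count, bounding_box) for cells that are non-finite
    or whose magnitude exceeds `limit`. bounding_box is
    (min_row, min_col, max_row, max_col), or None if the grid is clean."""
    bad = [
        (i, j)
        for i, row in enumerate(grid)
        for j, value in enumerate(row)
        if not math.isfinite(value) or abs(value) > limit
    ]
    if not bad:
        return 0, None
    rows = [i for i, _ in bad]
    cols = [j for _, j in bad]
    return len(bad), (min(rows), min(cols), max(rows), max(cols))


def track_anomalies(snapshots, limit):
    """Summarise the anomalous region for each time step in order."""
    return [anomaly_footprint(grid, limit) for grid in snapshots]


# Hypothetical 3x3 field captured at three successive time steps.
snapshots = [
    [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0], [1.0, 1.0, 1.0]],
    [[1.0, 1.0, 1.0], [1.0, 1e9, 1.0], [1.0, 1.0, 1.0]],
    [[1.0, 1e9, 1.0], [1e9, float("nan"), 1e9], [1.0, 1.0, 1.0]],
]

history = track_anomalies(snapshots, limit=1e6)
for step, (count, box) in enumerate(history):
    print(f"step {step}: {count} anomalous cells, bounding box {box}")
```

In this toy data the anomaly starts in a single cell at step 1 and has spread to its neighbours by step 2. A graphical viewer shows the same progression visually; a scripted summary like this is one way to record it.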

Asked about the greatest value that TotalView brings to his organization, Mike summarized:

“TotalView has provided a critical part of an application development environment which greatly simplifies the design-develop-test cycle, allowing our scientists to be more productive and to develop better software. This impacts STFC’s productivity, makes us more competitive, and increases opportunities for future projects.”
Mike Ashworth, Associate Director, Computational Science & Engineering Department, STFC Daresbury Laboratory

Simplify and Shorten Your Software Development

TotalView helps organizations like STFC intuitively diagnose and understand their complex code. See for yourself how TotalView will help you do the same.
