|Title:||Language Extensibility and Configurability to Support Stencil Code Development||Language:||English||Author:||Jum'ah, Nabeeh||Keywords:||HPC; DSL; Supercomputer; Stencil-computation; Simulation||Date of issue:||2020||Date of oral examination:||2021-02-18||Abstract:||
Stencil computations are essential for large-scale scientific computing, e.g., in earth system modeling. Such computations are normally time consuming, so execution time is a key concern when developing code. For instance, a weather prediction model that takes a full day to run is impractical when predictions must be generated multiple times per day. To minimize execution time, enormous effort is dedicated to optimizing stencil codes to exploit the underlying hardware.
Scientific codes are usually developed in general-purpose languages, e.g., Fortran or C++. Because of their generality, however, such languages lack the semantics needed to exploit some optimization opportunities. To overcome this shortcoming, important optimizing code transformations are done manually. This puts the burden on scientists, who must spend more time on optimization and learn architectural details of computer systems.
Understanding architectural details brings related challenges. One is the pace of architectural evolution (which takes place frequently to support HPC applications) compared to the lifetime of models. This creates pressure to port code repeatedly to support newly introduced architectural features. Another challenge is the diversity of architectures that arises with heterogeneous computing on supercomputers, which complicates the situation even more.
Besides the architectural challenges, the wide range of algorithmic choices at the application level, such as the diversity of numerical methods and grid types, forms another source of complexity for model development. Limitations of some methods and grid types push towards new grids with different characteristics, e.g., icosahedral grids. Introducing new grids leads to different representations of the stencils used to apply numerical methods, e.g., triangular tessellations bring new forms of neighborhoods.
To overcome these challenges, this thesis raises the semantic level of modeling languages to a higher level of abstraction. We suggest reforming the software engineering of model development to maximize the use of application semantics to drive optimization by tools. Application requirements are analyzed to identify the grids and the stencils that make up an application. The spatial relationships among the points forming the different stencils within the application are analyzed. Those spatial relationships are then used to define new language extensions, in addition to a set of basic extensions that we suggest. Thus, our suggestion is to use an application-adaptable set of language extensions so that the semantics of the application can drive the optimization process.
In the suggested approach, we allow users to define language extensions and their role in the optimization process. Such details are provided through separate configuration files. This way, we keep the source code free of optimization details and remove the burden of optimization from scientists, who can now express the scientific problem in the source code at an abstraction closer to their scientific concepts. The configuration files are prepared by scientific programmers who master optimization for a given target architecture. This enables a clear separation of concerns in the software engineering of models.
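The separation described above can be illustrated with a minimal sketch. It is not the thesis's actual extension syntax or configuration format: the names `LAPLACIAN_5PT`, `CONFIG_TILED`, `CONFIG_NAIVE`, `apply_stencil` and the keys `tile_rows`/`tile_cols` are hypothetical, and plain Python stands in for the extended modeling language. The stencil is stated only through spatial relationships between points, while the traversal strategy lives in a separate configuration.

```python
# Hypothetical sketch: all names and config keys are illustrative assumptions,
# not the syntax or file format used in the thesis.

# "Science" side: the stencil is described only by neighbor offsets and weights.
LAPLACIAN_5PT = {(0, 0): -4.0, (-1, 0): 1.0, (1, 0): 1.0, (0, -1): 1.0, (0, 1): 1.0}

# "Optimization" side: separate configurations, as an expert might prepare
# one per target architecture.
CONFIG_TILED = {"tile_rows": 2, "tile_cols": 4}
CONFIG_NAIVE = {"tile_rows": 1, "tile_cols": 1}


def apply_stencil(field, stencil, config):
    """Apply `stencil` to the interior of a 2D list, visiting points tile by tile."""
    rows, cols = len(field), len(field[0])
    out = [[0.0] * cols for _ in range(rows)]
    tile_r, tile_c = config["tile_rows"], config["tile_cols"]
    # Outer loops walk over tiles; inner loops walk over points inside a tile.
    for r0 in range(1, rows - 1, tile_r):
        for c0 in range(1, cols - 1, tile_c):
            for r in range(r0, min(r0 + tile_r, rows - 1)):
                for c in range(c0, min(c0 + tile_c, cols - 1)):
                    out[r][c] = sum(
                        weight * field[r + dr][c + dc]
                        for (dr, dc), weight in stencil.items()
                    )
    return out


# The same stencil description runs unchanged under both configurations.
field = [[float(r * r + c * c) for c in range(8)] for r in range(6)]
tiled = apply_stencil(field, LAPLACIAN_5PT, CONFIG_TILED)
naive = apply_stencil(field, LAPLACIAN_5PT, CONFIG_NAIVE)
```

Changing the configuration changes only the order in which points are visited, so both calls produce the same values; the stencil description itself is never touched.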
Important points we investigate in this thesis are the way to exploit the mentioned application-adaptable language extensions to drive the optimization process and the scalability over multiple nodes to support modern supercomputers. We also evaluate the impact of the new techniques on the quality of code and on development costs.
The key contribution of this work is an integrated approach with techniques that maximize the use of semantics in optimizing and scaling stencil computations to support modern supercomputers. It limits the scientists' role to coding the scientific problem, conforms to the principle of separation of concerns, and improves code quality. The effectiveness of the approach is validated by experiments on various architectures: multi-core processors, GPUs, and vector engines. To show the versatility of the approach, we demonstrate both the achievable efficiency of the generated code and the productivity for scientists.
Analysis and experimental results show that we can achieve a high percentage of the achievable performance on each architecture. We can minimize the number of field loads from memory to caches and achieve about 80% of memory bandwidth, which is the limiting factor for the performance of memory-bound computations. These experimental results align with the theoretical expectations of achievable performance on the tested architectures. To evaluate performance portability, we use the same source code (without any per-architecture changes or special code) on the different architectures.
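Why field loads and bandwidth matter together can be seen from a simple back-of-the-envelope model for memory-bound kernels. The sketch below is an illustration only: the function `memory_bound_time`, the grid size, and the peak bandwidth are assumed values, not measurements or formulas from the thesis. It estimates run time as bytes moved divided by the bandwidth actually sustained.

```python
def memory_bound_time(n_points, fields_loaded, fields_stored,
                      bytes_per_value, peak_bandwidth, bandwidth_fraction):
    """Estimate seconds per sweep for a kernel limited by memory traffic.

    All parameters are illustrative assumptions:
    peak_bandwidth is in bytes per second, bandwidth_fraction in (0, 1].
    """
    bytes_moved = n_points * (fields_loaded + fields_stored) * bytes_per_value
    return bytes_moved / (peak_bandwidth * bandwidth_fraction)


# Hypothetical machine and problem: 10**6 grid points, 8-byte values,
# 100 GB/s peak bandwidth, 80% of which is sustained.
three_loads = memory_bound_time(10**6, 3, 1, 8, 100e9, 0.8)
one_load = memory_bound_time(10**6, 1, 1, 8, 100e9, 0.8)
speedup = three_loads / one_load
```

In this model, reading one field per point instead of three halves the memory traffic (two values moved per point instead of four), which is why reducing field loads pays off directly when bandwidth is the limit.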
Using the suggested language extensions reduces the code size to one third and the development costs to less than one half. A key conclusion of this work is that application-adaptable language extensions maximize code optimization through application-specific semantics while remaining tailorable to the needs of specific applications or domains.
|Appears in collections:||Elektronische Dissertationen und Habilitationen|
checked on 12.04.2021