|Title:||Analyzing Convergence Opportunities of HPC and Cloud for Data Intensive Science||Other titles:||Analyse der Konvergenzmöglichkeiten von HPC und Cloud für datenintensive Wissenschaft||Language:||English||Author:||Gadban, Frank||Keywords:||HPC; Cloud; Convergence; Cloud Storage; Performance Overhead; Data Storage||GND keywords:||Cloud Computing
|Date of publication:||2022-12-22||Date of oral examination:||2022-12-05||Abstract:||
With the advent of the exascale era, the exponential growth in data volumes, and the rapid development of networking and cloud technologies, the convergence of cloud and HPC has become a frequent topic of discussion within the scientific community.
In particular, HPC and cloud storage are built on different assumptions, which have led to different underlying storage architectures and optimization techniques that seem incompatible with one another from an abstract perspective. However, these differences must be overcome and the architectures brought together so that scientific workflows do not need to differentiate between HPC and cloud storage but can benefit from the advantages of both.
This thesis investigates the HPC-Cloud convergence in the broader sense and focuses on the usage of cloud storage infrastructure for HPC workloads to optimize the scalability, performance, and cost-efficiency of the underlying infrastructure and improve the productivity of scientists running complex compute workflows.
In this work, the following research questions are addressed:
Can we use HPC and cloud storage technologies concurrently? What workflows will benefit from such settings, and which I/O interfaces are suitable? How can we achieve optimal data sharing between HPC and cloud resources? Is moving resource-demanding applications from on-premise to the public cloud a cost-effective solution compared to a hybrid alternative?
For this purpose, the term convergence is precisely defined, and a full-featured convergence assessment model is introduced and used to compare possible convergence scenarios. The comparison shows that using cloud storage inside HPC is one of the most promising approaches to leveraging HPC-Cloud convergence.
The performance of this scenario depends on the overhead introduced by using REST as a storage protocol in an HPC environment.
A performance model for the relevant HTTP operations based on hardware counters is presented and experimentally validated.
The results reveal that a properly configured REST implementation can provide high performance and match the HPC-specific implementation of MPI in terms of throughput for most file sizes and in terms of latency for file sizes exceeding 1 MB.
Furthermore, the performance of the S3 interface offered by different object storage implementations on HPC and in the cloud is thoroughly investigated.
The results indicate that the tested S3 implementations are not yet ready to serve HPC workloads directly, mainly because of the drastic performance loss and the lack of scalability.
The approach taken to identify the cause of the performance loss, systematically replacing parts of the S3 stack, leads to the introduction of a new S3 access library, S3Embedded, which proves to be highly scalable and capable of leveraging the shared cluster file systems of HPC infrastructure to accommodate several S3 client applications.
Using S3Embedded as a lightweight drop-in replacement for LibS3 is an enabling factor for Cloud-HPC agnostic applications that can be executed seamlessly in the public cloud or on HPC systems, and a major step towards achieving HPC-Cloud convergence.
|Appears in collections:||Electronic Dissertations and Habilitations|
Files in this item:
|Dissertation Frank Gadban-20230127.pdf||Dissertation Frank Gadban- HPC and Cloud Convergence||28bcf0bb7746da7ffa23298936b144f0||8.03 MB||Adobe PDF||Open/View|
checked on 31.03.2023