Status: Bibliographic entry
Location: ---
Copies:
---
| Online resource |
Written by: | Neuwirth, Sarah [author]  |
| Frey, Dirk [author]  |
| Brüning, Ulrich [author]  |
Title: | Communication models for distributed Intel Xeon Phi coprocessors |
Statement of responsibility: | Sarah Neuwirth, Dirk Frey and Ulrich Bruening |
Extent: | 8 pp. |
Notes: | Published online: 18 January 2016 ; Accessed on 23 May 2018 |
Source title: | Contained in: IEEE International Conference on Parallel and Distributed Systems (21st : 2015 : Melbourne): 2015 IEEE 21st International Conference on Parallel and Distributed Systems (ICPADS) |
Source year: | 2015 |
Source volume/issue: | (2015), pp. 499-506 |
Source ISBN: | 978-0-7695-5785-4 |
Abstract: | The emergence of accelerator technology in current supercomputing systems is changing the landscape of supercomputing architectures. Accelerators like GPGPUs and coprocessors are optimized for parallel computation while being more energy efficient. Their computational power per watt plays a crucial role in developing exaflop systems. Today's accelerators come with some limitations. They require a local host to configure and operate them. In addition, the number of host CPUs and accelerators does not scale independently. Another problem is the unbalanced communication between distributed accelerators. New communication frameworks are developed to optimize the internode communication. In this paper, four communication models using the Intel Xeon Phi coprocessor technology are compared. The Intel Xeon Phi coprocessor is based on the Intel Many Integrated Cores technology. It is an attractive accelerator due to its embedded Linux operating system, up to 1 TFLOPS of performance on a single chip, and its x86_64 compatibility. DCFA-MPI, MVAPICH2-MIC, and HAM-Offload are compared against the communication architecture for network-attached accelerators (NAA). Each communication model optimizes a different layer of the MIC communication architecture. The NAA approach makes the accelerator device independent from a local host system. Furthermore, it enables the accelerator to source and sink network traffic. Workloads can be dynamically assigned during run-time in an N to M ratio between CPUs and accelerators. The latency, bandwidth, and performance of the MPI communication layer of a prototype implementation are evaluated. |
DOI: | doi:10.1109/ICPADS.2015.69 |
URL: | Please note: This is a bibliographic entry. Full-text access for members of the university is available here only if a subscription exists for the corresponding journal or collected volume, or if the title is open access.
Resolving system: http://dx.doi.org/10.1109/ICPADS.2015.69 |
| Publisher: https://ieeexplore.ieee.org/document/7384332/ |
| DOI: https://doi.org/10.1109/ICPADS.2015.69 |
Medium: | Online resource |
Language: | eng |
K10plus-PPN: | 1575433664 |
Links: | → Collected work |
Communication models for distributed Intel Xeon Phi coprocessors / Neuwirth, Sarah [author] (Online resource)