On ERI sorting for SIMD execution of large-scale Hartree-Fock SCF

Ramdas, Tirath, Egan, Gregory K., Abramson, David and Baldridge, Kim K. (2008) On ERI sorting for SIMD execution of large-scale Hartree-Fock SCF. Computer Physics Communications, 178(11): 817-834. doi:10.1016/j.cpc.2008.01.045


Author Ramdas, Tirath
Egan, Gregory K.
Abramson, David
Baldridge, Kim K.
Title On ERI sorting for SIMD execution of large-scale Hartree-Fock SCF
Journal name Computer Physics Communications
ISSN 0010-4655
Publication date 2008-06-01
Year available 2008
Sub-type Article (original research)
DOI 10.1016/j.cpc.2008.01.045
Open Access Status Not yet assessed
Volume 178
Issue 11
Start page 817
End page 834
Total pages 18
Place of publication Amsterdam, The Netherlands
Publisher Elsevier BV * North-Holland
Language eng
Subject 1706 Computer Science Applications
3100 Physics and Astronomy
Abstract Given the resurgent attractiveness of single-instruction-multiple-data (SIMD) processing, it is important for high-performance computing applications to be SIMD-capable. The Hartree-Fock SCF (HF-SCF) application, in its canonical form, cannot fully exploit SIMD processing. Prior attempts to implement Electron Repulsion Integral (ERI) sorting functionality to essentially "SIMD-ify" the HF-SCF application have been frustrated by the low throughput of the sorting functionality. With greater awareness of computer architecture, we discuss how the sorting functionality may be practically implemented to provide high performance. Overall system performance analysis, including memory locality analysis, is also conducted, and further emphasises that a system with ERI sorting is capable of very high throughput. We discuss two alternative implementation options, with one immediately accessible software-based option discussed in detail. The impact of workload characteristics on expected performance is also discussed, and it is found that, in general, as basis set size increases, the potential performance of the system also increases. Consideration is given to conventional CPUs, GPUs, FPGAs, and the Cell Broadband Engine architecture.
Keyword Electron repulsion integrals
Hartree Fock self consistent field
Single instruction-multiple data processing
Q-Index Code C1
Q-Index Status Provisional Code
Institutional Status Non-UQ

Document type: Journal Article
Collection: School of Information Technology and Electrical Engineering Publications
Citation counts: Cited 6 times in Thomson Reuters Web of Science
Cited 7 times in Scopus
Created: Fri, 20 Dec 2013, 02:10:57 EST by Ms Diana Cassidy on behalf of Research Computing Centre