Runtime verification of scientific computing: towards an extreme scale

Minh Ngoc Dinh, Jin, Chao, Abramson, David and Jeffery, Clinton L. (2016). Runtime verification of scientific computing: towards an extreme scale. In: Proceedings of ESPT 2016: 5th Workshop on Extreme-Scale Programming Tools. Workshop on Extreme-Scale Programming Tools (ESPT), Salt Lake City, UT, United States, (26-33). 13-18 November 2016. doi:10.1109/ESPT.2016.008


Author Minh Ngoc Dinh; Jin, Chao; Abramson, David; Jeffery, Clinton L.
Title of paper Runtime verification of scientific computing: towards an extreme scale
Conference name Workshop on Extreme-Scale Programming Tools (ESPT)
Conference location Salt Lake City, UT, United States
Conference dates 13-18 November 2016
Convener IEEE
Proceedings title Proceedings of ESPT 2016: 5th Workshop on Extreme-Scale Programming Tools
Journal name Proceedings of ESPT 2016: 5th Workshop on Extreme-Scale Programming Tools - Held in conjunction with SC 2016: The International Conference for High Performance Computing, Networking, Storage and Analysis
Place of Publication Piscataway, NJ, United States
Publisher IEEE
Publication Year 2016
Sub-type Fully published paper
DOI 10.1109/ESPT.2016.008
Open Access Status Not yet assessed
ISBN 9781509039180
Start page 26
End page 33
Total pages 8
Language eng
Abstract/Summary Relative debugging helps trace software errors by comparing two concurrent executions of a program: one a reference version and the other a faulty version. By locating where data diverges between the two runs, relative debugging is effective at finding coding errors when a program is scaled up to solve larger problems or migrated from one platform to another. In this work, we envision changes to our current relative debugging scheme to address exascale factors such as the increase in faults and nondeterministic outputs. First, we propose a statistics-based comparison scheme to support verifying stochastic results. Second, we leverage a scalable data reduction network to adapt to the complex network hierarchy of an exascale system, and extend our debugger to support the statistics-based comparison in an environment subject to failures.
Keyword Exascale computing; Stochastic online verification; Invariants
Q-Index Code E1
Q-Index Status Provisional Code
Institutional Status UQ

 