MICA: a fast short-read aligner that takes full advantage of Many Integrated Core Architecture (MIC)

Luo, Ruibang, Cheung, Jeanno, Wu, Edward, Wang, Heng, Chan, Sze-Hang, Law, Wai-Chun, He, Guangzhu, Yu, Chang, Liu, Chi-Man, Zhou, Dazong, Li, Yingrui, Li, Ruiqiang, Wang, Jun, Zhu, Xiaoqian, Peng, Shaoliang and Lam, Tak-Wah (2015) MICA: a fast short-read aligner that takes full advantage of Many Integrated Core Architecture (MIC). BMC Bioinformatics, 16 Supp. 7: 1-8. doi:10.1186/1471-2105-16-S7-S10

Author Luo, Ruibang
Cheung, Jeanno
Wu, Edward
Wang, Heng
Chan, Sze-Hang
Law, Wai-Chun
He, Guangzhu
Yu, Chang
Liu, Chi-Man
Zhou, Dazong
Li, Yingrui
Li, Ruiqiang
Wang, Jun
Zhu, Xiaoqian
Peng, Shaoliang
Lam, Tak-Wah
Title MICA: a fast short-read aligner that takes full advantage of Many Integrated Core Architecture (MIC)
Journal name BMC Bioinformatics
ISSN 1471-2105
Publication date 2015-04
Sub-type Article (original research)
DOI 10.1186/1471-2105-16-S7-S10
Open Access Status DOI
Volume 16
Issue Supp. 7
Start page 1
End page 8
Total pages 8
Place of publication London, United Kingdom
Publisher BioMed Central
Collection year 2016
Language eng
Formatted abstract
Background: Short-read aligners have recently gained considerable speed by exploiting the massive parallelism of GPUs.
A rising alternative to the GPU is Intel MIC; Tianhe-2, the supercomputer currently at the top of the TOP500 list, is built with
48,000 MIC boards to offer ~55 PFLOPS. The CPU-like architecture of MIC allows CPU-based software to be
parallelized easily; however, the performance is often inferior to GPU counterparts as an MIC card contains only
~60 cores (while a GPU card typically has over a thousand cores).

Results: To better utilize MIC-enabled computers for NGS data analysis, we developed a new short-read aligner
MICA that is optimized in view of MIC’s limitation and the extra parallelism inside each MIC core. By utilizing the
512-bit vector units in the MIC and implementing a new seeding strategy, experiments on aligning 150 bp paired-end
reads show that MICA using one MIC card is 4.9 times faster than BWA-MEM (using 6 cores of a top-end CPU),
and slightly faster than SOAP3-dp (using a GPU). Furthermore, MICA’s simplicity allows very efficient scale-up when
multiple MIC cards are used in a node (3 cards give a 14.1-fold speedup over BWA-MEM).

Summary: MICA can be readily used by MIC-enabled supercomputers for production purposes. We have tested
MICA on Tianhe-2 with 90 WGS samples (17.47 Tera-bases), which can be aligned in an hour using 400 nodes.
MICA has impressive performance even though MIC is only in its initial stage of development.

Availability and implementation: MICA’s source code is freely available at http://sourceforge.net/projects/micaaligner
under GPL v3.

Supplementary information: Supplementary information is available as “Additional File 1”. Datasets are available
at www.bio8.cs.hku.hk/dataset/mica.
Keyword Alignment
Q-Index Code C1
Q-Index Status Provisional Code
Institutional Status Non-UQ
Additional Notes Article # S10

Document type: Journal Article
Collections: Non HERDC
Institute for Molecular Bioscience - Publications
Citation counts: Cited 2 times in Thomson Reuters Web of Science
Cited 1 time in Scopus
Created: Tue, 01 Sep 2015, 12:30:55 EST by System User on behalf of Scholarly Communication and Digitisation Service