Garcia, Alexander Garcia, Thoraval, Samuel, Garcia, Leyla J. and Ragan, Mark A. (2005) Workflows in bioinformatics: Meta-analysis and prototype implementation of a workflow generator. BMC Bioinformatics, 6 87.1-87.10. doi:10.1186/1471-2105-6-87

Journal name BMC Bioinformatics
ISSN 1471-2105
Publication date 2005-04-01
Sub-type Article (original research)
DOI 10.1186/1471-2105-6-87
Volume 6
Start page 87.1
End page 87.10
Total pages 10
Place of publication London, United Kingdom
Publisher BioMed Central
Collection year 2005
Language eng
Subject 279999 Biological Sciences not elsewhere classified
280103 Information Storage, Retrieval and Management
289999 Other Information, Computing and Communication Sciences
230199 Mathematics not elsewhere classified
780105 Biological sciences
280102 Information Systems Management
Formatted abstract Background
Computational methods for problem solving need to interleave information access and algorithm execution in a problem-specific workflow. The structures of these workflows are defined by a scaffold of syntactic, semantic and algebraic objects capable of representing them. Despite the proliferation of GUIs (Graphic User Interfaces) in bioinformatics, only some of them provide workflow capabilities; surprisingly, no meta-analysis of workflow operators and components in bioinformatics has been reported.

Results
We present a set of syntactic components and algebraic operators capable of representing analytical workflows in bioinformatics. Iteration, recursion, the use of conditional statements, and management of suspend/resume tasks have traditionally been implemented on an ad hoc basis and hard-coded; by having these operators properly defined it is possible to use and parameterize them as generic re-usable components. To illustrate how these operations can be orchestrated, we present GPIPE, a prototype graphic pipeline generator for PISE that allows the definition of a pipeline, parameterization of its component methods, and storage of metadata in XML formats. This implementation goes beyond the macro capacities currently in PISE. As the entire analysis protocol is defined in XML, a complete bioinformatic experiment (linked sets of methods, parameters and results) can be reproduced or shared among users. Availability: (interactive), (download).

Conclusion
From our meta-analysis we have identified syntactic structures and algebraic operators common to many workflows in bioinformatics. The workflow components and algebraic operators can be assimilated into re-usable software components. GPIPE, a prototype implementation of this framework, provides a GUI builder to facilitate the generation of workflows and integration of heterogeneous analytical tools.
Keyword Workflow capabilities
Workflow operators
Syntactic components
Algebraic operators
Analytical workflows
Q-Index Code C1
Q-Index Status Provisional Code
Institutional Status UQ
Additional Notes Article number 87.

