This project presents a Materials Informatics Workbench that addresses the challenges materials scientists face in assimilating and disseminating materials science data. Its approach combines and extends Semantic Web technologies, the Web Services Business Process Execution Language (WSBPEL) and Open Archives Initiative Object Reuse and Exchange (OAI-ORE). These technologies underpin the novel user interfaces, algorithms and techniques behind the major components of the proposed workbench.
In recent years, materials scientists have struggled to deal with the ever-increasing volume of complex materials science data, both available from online sources and generated by high-throughput laboratory instruments and data-intensive software tools. Meanwhile, funding organizations have encouraged, and even mandated, sponsored researchers across many domains to make scientifically valuable data, together with traditional scholarly publications, available to the public. This open access requirement creates an opportunity for materials scientists who can exploit the available data to expedite the discovery of novel compound materials. However, it also poses challenges. Materials scientists report difficulty in precisely locating and processing diverse, but related, data from different sources and in effectively managing laboratory information and data. They also lack simple tools for data access and publication, and require measures for Intellectual Property (IP) protection and standards for data sharing, exchange and reuse. The following paragraphs describe how the major workbench components address these challenges.
First, the materials science ontology, represented in the Web Ontology Language (OWL), enables (1) the mapping between, and integration of, disparate materials science databases, (2) the modelling of experimental provenance information acquired in the physical and digital domains and (3) the inferencing and extraction of new knowledge within the materials science domain. Next, the federated search interface based on the materials science ontology enables materials scientists to search, retrieve, correlate and integrate diverse, but related, materials science data and information across disparate databases. Then, a workflow management system built on the WSBPEL engine not only manages a scientific investigation process that involves geographically distributed, multidisciplinary scientists and self-contained computational services, but also systematically captures the experimental data and information generated by that process. Finally, the provenance-aware scientific compound-object publishing system provides scientists with a view of a highly complex scientific workflow at multiple levels of granularity. They can thus easily comprehend the science of the workflow, access experimental information and keep confidential information from unauthorised viewers. The system also enables scientists to quickly and easily author and publish a scientific compound object that (1) incorporates not only internal experimental data, with provenance information, from the rendered view of a scientific experimental workflow, but also external digital objects with their metadata, for example, published scholarly papers discoverable via the World Wide Web (the Web), (2) is self-contained and explanatory, with IP protection, and (3) can be disseminated widely on the Web.
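The ontology-mediated federated search described above can be illustrated with a minimal sketch. It is not the workbench's implementation: the concept names, database names, field names and records below are all hypothetical, and a plain dictionary stands in for the OWL ontology mappings. The sketch only shows the general idea of translating one ontology-level query into each source's local schema and returning results in the shared vocabulary.

```python
# Hypothetical mapping from shared ontology concepts to each source's local field names.
FIELD_MAP = {
    "db_a": {"Material": "compound", "ProtonConductivity": "sigma_S_per_cm"},
    "db_b": {"Material": "formula", "ProtonConductivity": "conductivity"},
}

# Toy stand-ins for two disparate databases (invented records, not real data).
SOURCES = {
    "db_a": [
        {"compound": "X1", "sigma_S_per_cm": 0.10},
        {"compound": "X2", "sigma_S_per_cm": 0.02},
    ],
    "db_b": [
        {"formula": "X3", "conductivity": 0.08},
    ],
}


def federated_search(concept, minimum):
    """Query every source for records whose `concept` value is >= `minimum`.

    Matching records are rewritten into the shared ontology vocabulary
    and tagged with the name of the source they came from.
    """
    results = []
    for name, records in SOURCES.items():
        mapping = FIELD_MAP[name]
        local_field = mapping.get(concept)
        if local_field is None:
            continue  # this source has no field for the requested concept
        # Invert the mapping so local field names can be translated back.
        reverse = {local: onto for onto, local in mapping.items()}
        for record in records:
            if record[local_field] >= minimum:
                unified = {reverse[k]: v for k, v in record.items() if k in reverse}
                unified["source"] = name
                results.append(unified)
    return results


hits = federated_search("ProtonConductivity", 0.05)
```

Each entry in `hits` uses the ontology-level keys (`Material`, `ProtonConductivity`) regardless of which source supplied it, which is what lets results from different databases be correlated.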
Prototype systems for the major workbench components have been developed. The quality of the materials science ontology has been assessed against Gruber's principles for the design of ontologies used for knowledge sharing, while its applicability has been evaluated through two of the workbench components: the ontology-based federated search interface and the provenance-aware scientific compound-object publishing system.
The prototype systems have been deployed within a team of fuel cell scientists at the Australian Institute for Bioengineering and Nanotechnology (AIBN) at the University of Queensland. In the user evaluation, the overall feedback to date has been very positive. First, the scientists were impressed with the convenience of the ontology-based federated search interface, which gave them quick and easy access to the integrated databases and analytical tools. Next, they were relieved that the complex compound synthesis process could be managed and monitored through the WSBPEL workflow management system.
They were also excited that the system systematically captures the large amounts of complex experimental data produced by self-contained computational services, so that these data no longer have to be recorded manually in paper-based laboratory notebooks. Finally, the scientific compound-object publishing system inspired them to publish their data voluntarily, because it provides a scientist-friendly and intuitive interface that enables them to (1) access experimental data and information, (2) author self-contained and explanatory scientific compound objects that incorporate experimental data and information about research outcomes, together with published scholarly papers and peer-reviewed datasets that strengthen those outcomes, (3) enforce proper measures for IP protection, (4) make those objects compliant with Open Archives Initiative Object Reuse and Exchange (OAI-ORE) to maximize their dissemination over the Web and (5) ingest those objects into a Fedora-based digital library.
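As a rough illustration of what an OAI-ORE compound object looks like, the sketch below assembles a resource map as a JSON-LD-style dictionary. It uses the ORE vocabulary terms `ore:ResourceMap`, `ore:Aggregation`, `ore:describes` and `ore:aggregates` together with Dublin Core terms. Everything else is an assumption made for illustration: the function name, the `confidential` flag used to stand in for IP protection, and all `example.org` URIs are hypothetical and do not come from the workbench.

```python
import json

ORE = "http://www.openarchives.org/ore/terms/"
DCTERMS = "http://purl.org/dc/terms/"


def build_resource_map(rem_uri, aggregation_uri, resources, creator, rights):
    """Build a resource map describing one aggregation of resources.

    `resources` is a list of dicts with a "uri" key and an optional
    "confidential" flag (a hypothetical convention); confidential items
    are left out of the published aggregation.
    """
    public = [r["uri"] for r in resources if not r.get("confidential", False)]
    return {
        "@context": {"ore": ORE, "dcterms": DCTERMS},
        "@id": rem_uri,
        "@type": "ore:ResourceMap",
        "dcterms:creator": creator,
        "ore:describes": {
            "@id": aggregation_uri,
            "@type": "ore:Aggregation",
            "dcterms:rights": rights,
            "ore:aggregates": [{"@id": uri} for uri in public],
        },
    }


# Hypothetical compound object: one dataset, one paper, one withheld lab note.
resource_map = build_resource_map(
    rem_uri="http://example.org/rem/1",
    aggregation_uri="http://example.org/aggregation/1",
    resources=[
        {"uri": "http://example.org/data/run-1.csv"},
        {"uri": "http://example.org/papers/paper-1.pdf"},
        {"uri": "http://example.org/notes/draft.txt", "confidential": True},
    ],
    creator="A. Scientist",
    rights="All rights reserved",
)
serialized = json.dumps(resource_map, indent=2)
```

Filtering confidential items before serialization is only one possible design; a real system could equally attach access-control metadata to each aggregated resource.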