A Roadmap for Nanoinformatics
A Decade-Long Vision
This Roadmap proposes a decade of growth, beginning with initial activities focused primarily on workshops and pilot projects. The Roadmap is intended to be dynamic and will be updated periodically. Workshops review recent progress in nanoinformatics and identify critical needs and emerging opportunities. Pilots mobilize fast action on a small number of priority topics. The workshops, pilots, and broader nanoinformatics R&D activity all feed the continued development of the Roadmap. The Nanoinformatics 2010 Workshop demonstrated current projects, opened a discussion of potential solutions to nanoinformatics problems and drivers, and established new pilot projects to solve those problems. The workshop will be followed by a period of work by distinct groups on specific problems (the pilot projects); these groups will come together in a follow-up workshop in late 2011 to report on progress and update the nanoinformatics roadmap for 2012.
The first few years of activity are critical because they will demonstrate the nanoinformatics community's commitment to moving forward. The Roadmap will identify cross-cutting issues that affect the long-term vibrancy of nanoinformatics and propose a path forward. The pilot projects outlined are foundational and will serve to produce more activity in the following years.
Through year five, foundational projects and advocacy will make nanoinformatics an essential component of the nanotechnology research and development enterprise. It is further expected that during this time additional areas of pilot development will be established as new themes evolve through the ongoing workshops and discussion among participants. Critical to these emerging themes will be input from, and adaptation by, industry, so that the key needs of this essential group of nanotechnology stakeholders are better addressed. Looking toward the ten-year horizon, the Roadmap will identify objectives that lead to a robust, collaborative nanoinformatics system with a demonstrated impact on the scientific and societal aspects of nanotechnology.
Pilot projects are intended to demonstrate the feasibility and showcase the impact of nanoinformatics on specific, tightly-focused topics. Cooperative efforts can demonstrate successful implementation of projects with low investment and significant results, while laying the groundwork for later, more extensive efforts.
To the maximum extent possible, these pilots will leverage cooperative activities already funded and underway. In some cases, new resources will be needed to realize pilot activity. In no case is a proposed pilot intended to duplicate a similar effort that may be of interest to a particular funding agency. Rather, the intent is that each proposed pilot be adopted or absorbed by one or more funding sources and that, through affiliation with this Nanoinformatics Roadmap community, each pilot be provided with a more comprehensive set of expert resources and a platform from which to achieve and showcase progress.
Eleven pilots proposed during the workshop have been consolidated into seven complementary, one-year pilot projects, described below. Two of them are concerned with engagement; two are focused on metadata and standards; and three are geared toward tools development and deployment. As “one-year” pilots, they are expected to make some definitive progress within the first year of activity. This does not preclude ongoing work or future activities. The “one-year” designation is a helpful mechanism to spur focused activity within each group.
Engagement Pilots
1. Consortium for Coordinating Nanomaterials Research Data
Given the breadth of complementary nanomaterials research efforts underway, coordination of activities is necessary to ensure functional integration and sharing of data/information, to improve the efficiency of information transfer from data to knowledge, and to reduce the incidence of duplicative studies. Currently, there is no program or agency liaison to coordinate between various organizations and enforce the standardization of data set information.
This pilot seeks to establish a consortium, for example a Nanotechnology Research Coordination Network, to coordinate among the various organizations on such activities as issuing data set quality factors, establishing ILS calibrations, and ensuring that the necessary requirements and information exist for follow-on risk assessment studies. In addition, such a consortium would coordinate interdisciplinary, collaborative research efforts; communicate networking efforts and educational outreach opportunities; and provide expertise on nanomaterials to government, academia, and industry.
The outcome of this pilot would be, minimally, a proposal for funding to establish a dedicated nanomaterials informatics consortium. Impact would be community-wide and foundational, potentially reaching all sectors engaged with nanomaterials for study or commercial application.
2. Workshops for Focused Nanomaterials Development Using Nanoinformatics
This pilot will run workshops targeting two specific nanomaterials of high potential impact, to be used as scientific drivers and as areas of proof-of-concept assessment for the application of nanoinformatics methods. The two topics will be a specific area within the field of nanocomposites and a specific area of nanomedicine. Topics will be chosen that already have a substantial base of literature and data, and for which some informatics tools already exist. The workshops will start as virtual meetings and be followed by an in-person meeting that includes participants from the industrial and research sectors.
Each workshop will frame its outcome in terms of a materials challenge, for example, materials by design or use of the web for materials development. It is a priority to work with industry and trade organizations. The workshops will determine the types of data and information needed, with emphasis on physico-chemical properties, environmental, health, and safety (EHS) data, and other desired data. The workshops will engage suppliers and users of nanomaterials, modelers, experimentalists, and informatics specialists. The objective is to use nanoinformatics tools to identify scientific information gaps and to inform funding agencies of high-priority topics as potential areas for support.
Metadata and Standards Pilots
3. Meta-ontology for Cross-discipline, Cross-sector Information Exchange
The diversity of domain-specific ontologies and taxonomies for nanotechnology R&D is an impediment to broad-based and effective information and knowledge sharing. Creating an upper-level ontology and demonstrating its applicability across multiple domains would provide a common vocabulary for the nanotechnology community and present a facile mechanism for the sharing of data among complementary but distinct research programs.
The Meta-ontology Pilot will focus on two activities: integrating and rationalizing standards already in use, and defining interactions between concepts as validated and reusable methods that deliver value to the stakeholders. This pilot is designed to model existing knowledge in a way that it can be correctly delivered. The two activities are clearly complementary and in concert would deliver more value to the user. The first activity will be addressed by creating an abstract core ontology that would eventually allow stakeholder groups to map their taxonomies and semantic web ontologies to the core, although initially that would be done by the project team. Multiple and overlapping terms can coexist and be managed by contextual relevance and equivalence maps. The second activity will be addressed by defining each concept in the core ontology as a set of formalized quantitative and/or qualitative scenarios, including (but not limited to) rules, formulas, fuzzy logic, standards, measures, methods for validation and sensitivity, as well as contextual parameters. As a result, rather than aiming for a static definition of each concept, additional scenarios can be proposed as science and technology evolve, and vetted scenarios can be added to the core. The evolving model can become an engine for some aspects of validation and development of new theories.
The primary output of this pilot would be a core nano ontology. Two of our critical success indicators would be 1) usefulness and usability across a range of stakeholders that is likely to grow, and 2) scientific validity or verifiability that is consistent across diverse stakeholder groups, which could include regulatory agencies, discovery researchers, and product engineers spanning multiple application domains. All nanotechnology R&D stakeholders (e.g., researchers, product designers, and regulators from diverse application areas) would potentially be impacted by such an overarching approach.
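As an illustration only, the idea of managing overlapping terms through contextual equivalence maps can be sketched as a small data structure. This is a minimal sketch under stated assumptions, not the pilot's design: the class `CoreOntology`, its methods, and the example terms, contexts, and concept name are all hypothetical placeholders, and the mappings shown have not been vetted by any stakeholder group.

```python
from dataclasses import dataclass, field


@dataclass
class CoreOntology:
    """Hypothetical core ontology with context-aware equivalence maps."""

    concepts: set = field(default_factory=set)
    # Maps (stakeholder term, context) -> core concept.
    equivalence: dict = field(default_factory=dict)

    def add_concept(self, concept):
        self.concepts.add(concept)

    def map_term(self, term, context, concept):
        """Register a stakeholder term, used in a given context, against the core."""
        if concept not in self.concepts:
            raise KeyError(f"unknown core concept: {concept}")
        self.equivalence[(term.lower(), context)] = concept

    def resolve(self, term, context):
        """Return the core concept for a term in a context, or None if unmapped."""
        return self.equivalence.get((term.lower(), context))

    def equivalents(self, concept):
        """List every (term, context) pair currently mapped to a core concept."""
        return sorted(key for key, target in self.equivalence.items() if target == concept)


# Illustrative placeholder mappings (assumptions, not vetted equivalences).
core = CoreOntology()
core.add_concept("particle_size")
core.map_term("Particle size", "toxicology", "particle_size")
core.map_term("Primary particle diameter", "materials_synthesis", "particle_size")

print(core.resolve("primary particle diameter", "materials_synthesis"))  # particle_size
print(core.resolve("primary particle diameter", "toxicology"))  # None
```

Keying the map on both the term and its context lets the same word resolve differently in different domains, which is one simple way to let overlapping vocabularies coexist without forcing a single definition.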
4. Minimum Information Requirements for Data Sharing (Completeness and Quality)
Present materials characterization yields data sets that may span a range of analytical techniques, each providing specific properties of the nanomaterial being studied. This information is then made accessible through a platform database where it is archived. As databases are further developed, or a given data set is expanded, no specification of information requirements is provided, either in terms of data quality or completeness.
This pilot will determine the minimal information required for nanomaterials data sets, both in terms of completeness and quality. Activities will determine the necessary information requirements for data sets to enable sharing and/or incorporating data within pre-existing databases in such a way that a quality factor can be associated with each data set and that all data sets contributed to a given database have some standards for further sharing and use. This includes specifications of analytical techniques used to characterize a given nanomaterial, and further provides the basis for standardization of these techniques, along with how the data is actually processed and archived.
The outcome of the pilot would be a list of standard materials information and characterization techniques, as well as the minimum data set necessary to obtain a specified quality factor for inclusion in a database. Any researcher wishing to access and use the data sets archived within a database would have some assurance regarding the types of analysis, characterization, and integrity of the information. This would potentially impact the entire community utilizing nanomaterials.
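To make the idea of a minimum-information check concrete, the following is a minimal sketch, assuming a data set is described by a flat record of named fields. The field list in `REQUIRED_FIELDS` is a hypothetical placeholder (the actual minimum information set is what the pilot is meant to determine), and the completeness fraction is only a simple stand-in for a quality factor, not a proposed definition.

```python
# Hypothetical minimum-information fields; the real list would come from the pilot.
REQUIRED_FIELDS = (
    "material_name",
    "composition",
    "particle_size_nm",
    "characterization_technique",
    "data_processing_method",
)


def missing_fields(record, required=REQUIRED_FIELDS):
    """Return the required fields that are absent or empty in a record."""
    return [name for name in required if record.get(name) in (None, "")]


def completeness(record, required=REQUIRED_FIELDS):
    """Fraction of required fields that are present (0.0 to 1.0)."""
    return 1 - len(missing_fields(record, required)) / len(required)


def meets_minimum(record, threshold=1.0, required=REQUIRED_FIELDS):
    """True if the record's completeness reaches the chosen threshold."""
    return completeness(record, required) >= threshold


# Illustrative record with made-up values.
record = {
    "material_name": "example nanoparticle",
    "composition": "TiO2",
    "particle_size_nm": 25.0,
    "characterization_technique": "TEM",
}

print(missing_fields(record))  # ['data_processing_method']
print(meets_minimum(record))   # False
```

A database could run such a check at submission time and store the resulting score alongside the data set, so that later users can filter on it.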
Tools Development and Deployment Pilots
5. Meta-crawler for Mining Nanotechnology Repositories and Open Access Sources
Conventional search engines crawl the Web broadly, not deeply. Obtaining all available information on a selected material from each existing nanotechnology/nanomaterial database, and conducting gap analysis on the resulting collection, requires a tool that can deeply and intelligently explore the known nanotechnology databases.
This pilot will create and implement a custom meta-crawler to mine the known nanotechnology databases as well as open access nanotechnology resources. Such a project could be used with the goal of answering a specific research question and could demonstrate data gaps and help to articulate subsequent calls for action.
In addition, this project will recommend a guideline of minimum suggested content for literature abstracts to facilitate meta-crawler search and discovery. Although research abstracts and author keywords are required metadata for publication, such information lacks uniformity across the various STEM and society publishers, making the search, retrieval, and mining of such data a challenge. A meta-crawler with requirements for abstract content would not only generate richer and more meaningful search results but also facilitate the systematic mining of such literature for large-scale exploration of literature sets.
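The gap-analysis step described above can be sketched independently of any crawling logic. The following is a minimal sketch, assuming the meta-crawler has already harvested records into an in-memory dictionary keyed by repository; the function name `gap_analysis`, the repository labels, and the property names are hypothetical and do not refer to any real database or API.

```python
def gap_analysis(harvested, material, properties):
    """Report which repositories cover each property of a material.

    harvested: dict mapping repository name -> list of record dicts.
    Returns (coverage, gaps), where coverage maps each property to the
    repositories reporting it and gaps lists properties found nowhere.
    """
    coverage = {prop: [] for prop in properties}
    for repo, records in harvested.items():
        for rec in records:
            if rec.get("material") != material:
                continue
            for prop in properties:
                if rec.get(prop) is not None and repo not in coverage[prop]:
                    coverage[prop].append(repo)
    gaps = [prop for prop, repos in coverage.items() if not repos]
    return coverage, gaps


# Made-up harvested records from two hypothetical repositories.
harvested = {
    "repo_a": [{"material": "TiO2", "size_nm": 25.0, "zeta_potential_mV": None}],
    "repo_b": [{"material": "TiO2", "size_nm": 30.0},
               {"material": "ZnO", "solubility": 1.2}],
}

coverage, gaps = gap_analysis(
    harvested, "TiO2", ["size_nm", "zeta_potential_mV", "solubility"]
)
print(coverage["size_nm"])  # ['repo_a', 'repo_b']
print(gaps)                 # ['zeta_potential_mV', 'solubility']
```

Properties listed in `gaps` are those for which no harvested source reports a value for the selected material, which is the kind of finding that could feed a subsequent call for action.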
6. nano-SAR Education and Dissemination
This project will demonstrate the ability to combine structural data and modeling to develop nanoscale structure-activity relationships (nano-SARs) that can be disseminated as an educational tool; efforts of this pilot will coordinate with materials activities in other pilots and existing projects.
Initial work will focus on assessing and consolidating existing knowledge of nano-SARs from the current literature. Also, the group will define the minimal standards required for performing valid nano-SARs as compared to more general minimal information standards required for the characterization of nanomaterials.
Subsequent work for this pilot will be to create an educational model for structural data that clearly illustrates nanoscale structure-activity relationships. This model will be widely disseminated via the nanoHUB to engender consistent understanding of what is a fundamental element for simulating nanomaterial-biological interactions.
The outcome of this pilot will be a nanoHUB module on nano-SARs with supporting reference and pedagogical material. This pilot's most immediate impact would be within the nanotechnology education community, but it could have broader impact as a “textbook” equivalent for structure-activity relationships.
7. Simulation Resources and Simulation Challenge
The medical community has highly accurate reference calculations to aid comparisons. Validation and verification are integral to the accuracy of such research tools. For software and model development, a simulation challenge (such as a blind prediction challenge) targeting the properties of a specific nanomaterial could produce such standard reference systems for comparing and validating data emerging from such calculational tools. Such a challenge would need to have well-defined goals, identify data gaps, and include mechanisms of sensitivity analysis.
For simulation and modeling, a target material must be selected. The choice of material depends on a pressing end goal, a specific property, or a combination of properties. For toxicity studies, both a toxic material and nontoxic analogs are needed. A nanomedicine target is another interesting choice. These pilot activities could be held as a satellite event to an allied conference. The limitations of the physical aspects of the models also need to be considered. Materials that undergo large structural changes, or particles with surfaces that change over time in response to their environment, may exceed current calculational capabilities. Standard nanostructures that are well characterized should be used to compare and validate calculation tools (analogous to the use of G20 for Gaussian).
The desired inputs and outputs need to be addressed. The data and the tools used to associate that data need to be clearly defined. Additional property measurements needed to make the challenge robust should also be identified. The desired tool types (first-principles simulation, empirical models, data analysis, data visualization, and data exchange tools) should be clearly indicated, as driven by research or development needs.
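One simple way a blind prediction challenge could compare submissions against reference measurements is sketched below. This is a minimal sketch under stated assumptions: the choice of mean absolute error as the metric, the function names, and all numerical values are illustrative and are not specified by the pilot.

```python
def mean_absolute_error(reference, prediction):
    """Mean absolute error over the properties present in the reference set."""
    missing = [name for name in reference if name not in prediction]
    if missing:
        raise ValueError(f"prediction lacks required properties: {missing}")
    errors = [abs(prediction[name] - reference[name]) for name in reference]
    return sum(errors) / len(errors)


def rank_submissions(reference, submissions):
    """Return (team, error) pairs sorted from lowest to highest error."""
    scored = [(team, mean_absolute_error(reference, pred))
              for team, pred in submissions.items()]
    return sorted(scored, key=lambda pair: pair[1])


# Made-up reference values and submissions for a hypothetical target material.
reference = {"property_a": 1.0, "property_b": 2.0}
submissions = {
    "team_x": {"property_a": 1.5, "property_b": 1.5},
    "team_y": {"property_a": 1.0, "property_b": 2.25},
}

print(rank_submissions(reference, submissions))
# [('team_y', 0.125), ('team_x', 0.5)]
```

In practice, properties with different units would need normalization before being averaged together, and a sensitivity analysis would look at how the ranking changes when individual reference values are perturbed within their measurement uncertainty.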
Communication and Assessment Recommendations
As we move forward with the Nanoinformatics 2020 Roadmap, and given the geographically distributed and diverse nature of the community, we recommend that existing initiatives and pilot projects build education and communication components into their day-to-day work. These components should use networked communication tools efficiently and enable productive knowledge transfer among community members. For example:
- Employ metadata standards, such as the NanoParticle Ontology, in current work projects and integrate them into routine workflows and project documentation and procedures;
- Create APIs useful for federation and deployment of web services for nanotechnology research and development;
- Use analytics as metrics for evaluating the impact of data and web services;
- Engage in workshops and bring new mechanisms for nanotechnology research and development to the community, where they can be tested and used;
- Use existing tools for dissemination and sharing of data and information;
- Consider using author addenda when submitting manuscripts to closed-access journals, publishing in Open Access journals, or self-archiving manuscripts in institutional or domain-appropriate repositories so that research becomes more widely discoverable.
These are just a few examples of ways that day-to-day research activities, whether in academic, government, or industry labs, can make data more readily available to the nanoinformatics community.