This is not the current EPA website. This website is historical material reflecting the EPA website as it existed on January 19, 2021. It is no longer updated, and links to external websites and some internal pages may not work.

Computational Toxicology Communities of Practice: Informatics-Based Approaches for Managing and Curating Exposure Data

Date and Time

Thursday 09/24/2020 11:00AM to 12:00PM EDT

Please feel free to forward to others who may be interested in joining!

You are invited to the EPA CompTox Communities of Practice.

Topic: Informatics-Based Approaches for Managing and Curating Exposure Data

Who: Dr. Kathie Dionisio, Environmental Health Scientist in the Center for Computational Toxicology and Exposure, and Dr. Kristin Isaacs, Research Physical Scientist in the Center for Computational Toxicology and Exposure

When: September 24, 2020, from 11:00 AM to 12:00 PM EDT

Where: Register for webinar using Eventbrite and check the confirmation email for webinar information.

Topic overview:

Exposure data, including chemical use and consumer product information, are required to inform chemical prioritization workflows and other assessments. As such, the demand for transparent, curated, and high-quality exposure data is increasing. Under EPA’s ExpoCast project, new informatics-based methods and tools are being developed to facilitate collection, curation, and management of exposure-relevant information. We present here the ChemExpoDB/Factotum suite. ChemExpoDB is an integrated family of exposure databases linking data across multiple exposure domains. A web-based software application (called Factotum) has been developed to facilitate manual and automated management and annotation of data included in ChemExpoDB.

Currently, ChemExpoDB includes over 500,000 primary source documents linked to >3.9 million chemical records, each containing product composition (consumer, industrial, and occupational products), functional use, or general chemical use information. Reported chemical identifiers were curated to unique chemical structures (DTXSIDs) using automated and manual techniques. The Factotum application allows for tracking of data extraction processes, documentation of randomized QA checks, and tracking of the original chemical records mapped to over 27,000 DTXSIDs. Machine learning-based natural language classifiers are being used to assign documents associated with consumer products to standard categories for linking with exposure models. In addition, elastic search algorithms have been implemented for rapidly identifying documents relevant to evaluation of individual chemicals.
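As a rough illustration of the curation and QA steps described above, the following minimal Python sketch maps reported chemical identifiers to a curated key and draws a reproducible random sample for manual review. It is not Factotum's implementation: the lookup table, the record fields (`doc_id`, `reported`, `curated_id`), the function names, and the placeholder keys are all assumptions made for this example, and the keys are not real DTXSIDs.

```python
import random

# Placeholder lookup: normalized name or CAS number -> curated key.
# These keys are invented for illustration; they are NOT real DTXSIDs.
LOOKUP = {
    "water": "DTXSID-EXAMPLE-1",
    "7732-18-5": "DTXSID-EXAMPLE-1",
    "ethanol": "DTXSID-EXAMPLE-2",
    "64-17-5": "DTXSID-EXAMPLE-2",
}


def normalize(identifier: str) -> str:
    """Lowercase and collapse whitespace so trivially different spellings match."""
    return " ".join(identifier.lower().split())


def curate(records):
    """Attach a curated key (or None if unmapped) to each reported record."""
    return [
        {**rec, "curated_id": LOOKUP.get(normalize(rec["reported"]))}
        for rec in records
    ]


def sample_for_qa(curated, fraction, seed):
    """Pick a reproducible random subset of records for manual QA review."""
    if not curated:
        return []
    rng = random.Random(seed)  # fixed seed so the QA draw can be documented
    k = max(1, round(len(curated) * fraction))
    return rng.sample(curated, k)


# Hypothetical reported records, as they might appear in source documents.
records = [
    {"doc_id": 1, "reported": "Water"},
    {"doc_id": 1, "reported": " 64-17-5 "},
    {"doc_id": 2, "reported": "ETHANOL"},
    {"doc_id": 3, "reported": "fragrance blend"},
]

curated = curate(records)
unmapped = [r for r in curated if r["curated_id"] is None]
qa_batch = sample_for_qa(curated, fraction=0.5, seed=0)
```

In this sketch, records that the automated lookup cannot resolve (here, "fragrance blend") are left unmapped so they can be routed to manual curation, while the original reported text is kept alongside the curated key.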

The ChemExpoDB/Factotum framework is being expanded to incorporate additional exposure data domains, including multimedia monitoring measurements and other exposure factor data (e.g., consumer product use patterns). Further, much of the data is now available through the EPA’s CompTox Chemicals Dashboard. These tools can increase the volume, scope, and quality of chemical information available for use in Agency decision-making.

This abstract does not necessarily reflect U.S. EPA policy.

For more information, visit the EPA's Computational Toxicology Communities of Practice webpage.