Downloadable Computational Toxicology Data
EPA’s computational toxicology research evaluates the potential health effects of thousands of chemicals. These evaluations generate data on the potential harm, or hazard, of a chemical; the degree of exposure to it; and its unique chemical characteristics.
As part of EPA’s commitment to share data, all of the computational toxicology data are publicly available for anyone to access and use. EPA's computational toxicology data are considered "open data", so all of the data below are free of all copyright restrictions and fully and freely available for both non-commercial and commercial use.
High-throughput Screening Data
EPA researchers use rapid chemical screening (called high-throughput screening assays) to limit the number of laboratory animal tests while quickly and efficiently testing thousands of chemicals for potential health effects.
- ToxCast Data: High-throughput screening data on thousands of chemicals.
Rapid Exposure and Dose Data
EPA researchers develop and use rapid exposure estimates to predict potential exposure for thousands of chemicals.
- High-throughput toxicokinetics data: Toxicokinetics links the external dose of a chemical to an internal blood or tissue concentration. EPA researchers measure the critical factors that determine the distribution and metabolic clearance for hundreds of chemicals and incorporate these data into computer models. The high-throughput toxicokinetic data can be paired with the high-throughput screening data to estimate real-world exposures.
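To make the idea of linking an external dose to an internal concentration concrete, the sketch below assumes a textbook one-compartment model with first-order clearance at steady state. It is an illustration only, not EPA's toxicokinetic model; the function names, parameters, and numeric values are hypothetical.

```python
def steady_state_concentration(dose_rate_mg_per_kg_day,
                               clearance_l_per_kg_day,
                               fraction_absorbed=1.0):
    """Steady-state plasma concentration (mg/L) for a one-compartment model.

    Assumes constant daily dosing and first-order clearance, so that
    concentration = absorbed dose rate / clearance.
    """
    if clearance_l_per_kg_day <= 0:
        raise ValueError("clearance must be positive")
    return fraction_absorbed * dose_rate_mg_per_kg_day / clearance_l_per_kg_day


def dose_rate_for_concentration(target_mg_per_l,
                                clearance_l_per_kg_day,
                                fraction_absorbed=1.0):
    """Invert the relation above: external dose rate (mg/kg/day) that
    would produce the target steady-state concentration."""
    if fraction_absorbed <= 0:
        raise ValueError("fraction_absorbed must be positive")
    return target_mg_per_l * clearance_l_per_kg_day / fraction_absorbed


# Illustrative values only (not measured data).
css = steady_state_concentration(1.0, 2.0)        # external dose -> internal concentration
dose = dose_rate_for_concentration(css, 2.0)      # internal concentration -> external dose
```

Under these assumptions the relationship is linear, so the same clearance value can be used in either direction: from a dose to a blood concentration, or from a concentration back to the dose that would produce it.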
Sustainable Chemistry Data
EPA researchers use chemistry data such as chemical structures and physicochemical property information to evaluate thousands of chemicals for potential health effects.
- Collaborative Estrogen Receptor Activity Prediction Project Data: Data and supplemental files from CERAPP, a large-scale modeling project. CERAPP combined multiple models developed in collaboration with 17 groups in the United States and Europe to predict the estrogen receptor (ER) activity of a common set of 32,464 chemical structures. Quantitative structure-activity relationship models and docking approaches were used to build a total of 40 categorical and 8 continuous models for ER binding, agonist, and antagonist activity.
- Chemicals Dashboard Data: Data from the Chemicals Dashboard, including the mappings between DTXSIDs and InChI strings and InChIKeys, SDF files containing all chemical structures and relevant information, and a file listing CAS Number, Preferred Chemical Name, and DTXSID.
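A downloaded identifier mapping can be loaded into a lookup table keyed by DTXSID. The sketch below assumes a tab-separated file with a header row; the column names (`DTXSID`, `INCHIKEY`, `INCHI_STRING`) and the sample rows are hypothetical placeholders, so check the actual download for its real layout before relying on this.

```python
import csv
import io


def load_identifier_mapping(handle, key_column="DTXSID"):
    """Read a tab-separated mapping file into a dict keyed by key_column.

    The column names are assumptions for illustration; the real file
    may use different headers or a different delimiter.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    if reader.fieldnames is None or key_column not in reader.fieldnames:
        raise ValueError(f"expected a header row containing {key_column!r}")
    mapping = {}
    for row in reader:
        mapping[row[key_column]] = row
    return mapping


# Placeholder content standing in for a downloaded file (not real identifiers).
SAMPLE = (
    "DTXSID\tINCHIKEY\tINCHI_STRING\n"
    "DTXSID_PLACEHOLDER_1\tKEY_PLACEHOLDER_1\tInChI=PLACEHOLDER_1\n"
    "DTXSID_PLACEHOLDER_2\tKEY_PLACEHOLDER_2\tInChI=PLACEHOLDER_2\n"
)

mapping = load_identifier_mapping(io.StringIO(SAMPLE))
record = mapping["DTXSID_PLACEHOLDER_1"]
```

For a file on disk, pass `open(path, newline="", encoding="utf-8")` in place of the `io.StringIO` object. Very large files may be better processed row by row than held in a single dictionary.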
Virtual Tissues Data
EPA researchers develop virtual tissue computer models to simulate how chemicals may affect human development. Virtual tissue models are some of the most advanced methods being developed today. The models will help reduce dependence on animal study data and provide much faster chemical risk assessments.
- Tipping Point Data: EPA researchers develop mathematical models to predict perturbation of biological systems and determine when cellular systems are no longer able to recover. EPA researchers use these models to determine the “Tipping Point”, the point when biological systems are unable to recover from or adapt to chemical exposure. When cellular systems are unable to recover, chemical exposures could lead to adverse outcomes such as cancer.
The Abstract Sifter is a Microsoft Excel-based tool that enhances literature searching in PubMed. The tool implements a novel “sifter” function for relevance ranking, giving researchers a way to find articles of interest quickly. The Sifter helps researchers triage results and keep track of articles of interest. The tool also gives researchers a view of the literature landscape for a set of entities, such as chemicals or genes, and makes it easy to dive deeper into areas of interest.
- Abstract Sifter: Contains the Abstract Sifter tool and database, along with a user guide.