
This is not the current EPA website. This website is historical material reflecting the EPA website as it existed on January 19, 2021. This website is no longer updated and links to external websites and some internal pages may not work.

Data Entry for Science Hub


On January 21, 2009, Barack Obama signed a Memorandum [1] on Transparency and Open Government “to ensure the public trust and establish a system of transparency, public participation and collaboration”. On February 22, 2013, John P. Holdren, Director of the Office of Science and Technology Policy, issued a memorandum [2], “Increasing Access to the Results of Federally Funded Scientific Research”, which directed Federal agencies with more than $100M in R&D expenditures “to develop plans to make the results of federally funded research freely available to the public, generally within one year of publication.”

ScienceHub is EPA’s vehicle for meeting the above open data obligation and our opportunity to provide information generated from our research to colleagues and the public. Every EPA product cleared through the Science and Technology Information Clearance System (STICS) is expected to have a corresponding data entry in ScienceHub. There are, however, a few exceptions, and the type of information entered in ScienceHub can vary. The following is general guidance for making data available, or at least discoverable, in ScienceHub. Deviations for particular products can be considered but should ultimately maintain the spirit and intent of a transparent and open government. See Table 1 for additional examples.

1. Products requiring complete data entry (metadata plus primary/secondary data) are those ‘owned’ by EPA either through in-house or EPA-funded efforts:

  • Primary data from field or laboratory experiments used to inform/develop the product (e.g., surveys, citizen science/crowdsourcing, or computationally generated data)
  • Primary data supporting model development
  • Products containing EPA-generated secondary data (adaptations or additions to primary data) to inform/develop the product (at a minimum, the location of primary data should be cited in a metadata entry)

2. Products requiring metadata entry only are those where the data used in the study were not generated or funded by EPA, are already available in the public domain, or are not available to EPA for public dissemination. Metadata are intended to identify, at a minimum, the format, content, and point-of-contact or location where the data can be found. The metadata should have a meaningful title so that the information can be located by a search engine.

  • Data from external co-authors used to inform/develop the product
  • Publicly available data used to inform/develop the product
  • Cases where EPA-generated data contain sensitive information (human subjects research, PII, CBI, CUI, DURC or other homeland security risks)

3. Products requiring no metadata or data entry are those products excepted from data entry requirements. Note that this category, which requires a ‘No’ response in STICS line 14a, must be clearly justified in line 14b. Justifications must identify the reason for exclusion (e.g., product is an editorial with no data or model generated or presented). This category requires Branch Chief approval and the explanation in 14b will likely be reviewed.

  • Review papers where no new data or models are generated or developed
  • Editorials or opinions
  • Instances where the EPA author conducted the work before joining EPA (e.g., a recent recruit), or where an EPA employee mentors a student, and the data are stored elsewhere. In these cases the data belong to the student or university, or the author lists a prior employer or organization, not EPA, as their affiliation in the article. Such products may not require a STICS or ScienceHub entry.
  • EPA Reports and Assessments do not require a ScienceHub entry
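The three categories above amount to a small decision procedure. The following is a minimal sketch of that logic in Python, not an official EPA tool: the class, field, and function names are all hypothetical, and the rules are a simplified paraphrase of the guidance (for example, a product that mixes EPA data with a collaborator's data would need both a data entry and metadata pointing to the collaborator's data, which this sketch does not model).

```python
from dataclasses import dataclass
from enum import Enum


class EntryLevel(Enum):
    """ScienceHub entry levels, paraphrased from categories 1 to 3."""
    COMPLETE = "metadata plus primary/secondary data"
    METADATA_ONLY = "metadata only"
    NONE = "no entry ('No' in STICS line 14a, justified in line 14b)"


@dataclass
class Product:
    # All field names are hypothetical and only illustrate the guidance.
    is_review_or_editorial: bool = False         # no new data or models
    predates_epa_or_student_owned: bool = False  # data stored elsewhere
    is_epa_report_or_assessment: bool = False
    epa_generated_or_funded_data: bool = False   # data 'owned' by EPA
    contains_sensitive_data: bool = False        # PII, CBI, CUI, DURC, ...


def entry_level(p: Product) -> EntryLevel:
    # Category 3: products excepted from data entry requirements.
    if (p.is_review_or_editorial
            or p.predates_epa_or_student_owned
            or p.is_epa_report_or_assessment):
        return EntryLevel.NONE
    # Category 1: EPA-owned data that can be released.
    if p.epa_generated_or_funded_data and not p.contains_sensitive_data:
        return EntryLevel.COMPLETE
    # Category 2: external, already public, or sensitive data.
    return EntryLevel.METADATA_ONLY


if __name__ == "__main__":
    in_house = Product(epa_generated_or_funded_data=True)
    print(entry_level(in_house).value)
```

Checking the exceptions first mirrors the guidance, where a category 3 product needs a justification rather than an entry, regardless of who generated any underlying data.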

Table 1. ScienceHub Journal Article Dataset Entry Requirements






| Scenario | Description | Example | Associated Data? Answer | ScienceHub required elements |
| --- | --- | --- | --- | --- |
| Article used EPA primary data | Primary data is data you generated directly from your research. | Results for samples analyzed by EPA labs or for which EPA paid for analysis. | | Research Effort; Scientific Data Management Plan (SDMP); Metadata; Dataset |
| Article uses EPA secondary data | Secondary data is data generated by EPA from data previously published. | Statistical analysis of 15 previously published studies (EPA or Non-EPA) to determine an association between the presence of two pollutants. | | Research Effort; SDMP; Metadata; Dataset |
| Model use | Journal article describes the use of an existing model. No data was collected or analyzed. | Journal article that describes how the SWIM model can be applied to differing situations. Uses sample data to demonstrate the model but not to draw conclusions. | | |
| Model development/refinement | Journal article describes the development or refinement of a model. Provide data that was used to update the model. May be primary or secondary data. | An improvement to the SWIM model was made based upon new data. | | Research Effort; SDMP; Metadata; Dataset |
| Sensitive Data – PII, CUI, DURC* or other homeland security risks | Data was used as part of the research, but the data contains confidential, proprietary business or personal identification information. Data, if published, would need to be redacted to remove what can’t be shared. An explanation and instructions for how to contact an EPA expert for more information should be provided. | Through a CRADA, a private company agrees to share confidential business information and data. This data cannot be released because it would reveal a proprietary secret. | | Research Effort; SDMP; Metadata; Dataset – if any remains after redaction |
| Literature review | No scientific data. Just a review of existing literature. | Article reviewing the state of knowledge on a drinking water treatment process. | | |
| Data collected for the research does not belong to EPA | Data was not collected in EPA labs or paid for by EPA. Note: If EPA collected any data or generated secondary data in addition to the collaborator’s data, then you must provide EPA’s data (Scenario 1 or 2) and provide information on how to access the collaborator’s data. | A collaborator at another agency or a university asks you to be an author, but they collect the data. You are listed as an author but did not contribute to the data. | Yes | Research Effort; SDMP; Metadata; Data – any portion collected by EPA. See guidance below. |
| Article is based upon data you or someone else has already made publicly available | Data is already available to the public. | A second article is published based on the same data released for a prior article. | Yes | Research Effort; SDMP; Metadata. See guidance below. |

Note: ScienceHub metadata includes the Collaboration, Description of Data, Keywords, Date of Last Update, and Data Dictionary entries. Additional metadata/descriptors are necessary in the Data tab for articles with data not owned/collected by EPA.
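As an illustration of how Table 1 could be used as a checklist, the sketch below encodes the required elements per scenario as a lookup and reports what is still missing from an entry. The scenario keys and the function name are hypothetical, and the two scenarios with empty lists (model use and literature review) simply reflect that Table 1 lists no required elements for them.

```python
# Required ScienceHub elements per Table 1 scenario.
# The dictionary keys are hypothetical labels, not ScienceHub identifiers.
REQUIRED_ELEMENTS = {
    "epa_primary_data": ["Research Effort", "SDMP", "Metadata", "Dataset"],
    "epa_secondary_data": ["Research Effort", "SDMP", "Metadata", "Dataset"],
    "model_use": [],  # no elements listed in Table 1
    "model_development": ["Research Effort", "SDMP", "Metadata", "Dataset"],
    # Dataset applies only if any data remains after redaction.
    "sensitive_data": ["Research Effort", "SDMP", "Metadata", "Dataset"],
    "literature_review": [],  # no elements listed in Table 1
    # "Data" here means any portion collected by EPA.
    "non_epa_data": ["Research Effort", "SDMP", "Metadata", "Data"],
    "already_public_data": ["Research Effort", "SDMP", "Metadata"],
}


def missing_elements(scenario: str, provided: set[str]) -> list[str]:
    """Return the Table 1 elements for `scenario` that are not in `provided`."""
    return [e for e in REQUIRED_ELEMENTS[scenario] if e not in provided]


if __name__ == "__main__":
    # An entry for a second article on already-public data, metadata only so far.
    print(missing_elements("already_public_data", {"Metadata"}))
```

A plain dictionary keeps the mapping easy to compare against the table by eye, which matters more here than any cleverness in the lookup.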


[1] Federal Register, Vol. 74, No. 15, January 26, 2009