
This is not the current EPA website. To navigate to the current EPA website, please go to www.epa.gov. This website is historical material reflecting the EPA website as it existed on January 19, 2021. This website is no longer updated and links to external websites and some internal pages may not work.

CADDIS Volume 4

Loading Data into R

How to Load Data into R

R can load data in many formats. The sample data supplied in this module (see the Download Scripts and Sample Data page) are formatted as tab-delimited text and can be loaded with the following command, where filename is the path to the data file given as a character string:

data.set <- read.delim(filename)  # Loads a tab-delimited data file
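As a minimal sketch of what read.delim() does, the example below writes a small tab-delimited file to a temporary location and reads it back. The file contents (site IDs, taxon names, counts) are made up for illustration and are not part of the sample data.

```r
# Hypothetical toy file: the taxon names and counts below are invented.
tmp <- tempfile(fileext = ".txt")
writeLines(c("SITE.ID\tBAETIS\tCHIRONOMUS",
             "S1\t12\t0",
             "S2\t3\t7"),
           tmp)

# read.delim() assumes a header row and tab separators by default.
toy <- read.delim(tmp)

# The same file read with read.table(), spelling out the defaults
# that read.delim() applies.
toy2 <- read.table(tmp, header = TRUE, sep = "\t", quote = "\"",
                   dec = ".", fill = TRUE, comment.char = "")

str(toy)     # column names and types as R interpreted them
unlink(tmp)  # remove the temporary file
```

If a file uses a different separator, such as commas, read.csv() or read.table() with an explicit sep argument would be the usual alternative.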

After downloading the sample data files to your R working directory, load them into R as follows:

site.species <- read.delim("site.species.txt")
site.species.or <- read.delim("site.species.or.txt")
env.data <- read.delim("env.data.txt")
env.data.or <- read.delim("env.data.or.txt")

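Examine the data frames you loaded and review how they are formatted. Note that fix() opens each data frame in an editor and saves any changes back to the object, so close the editor without altering any values: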

fix(site.species)
fix(env.data)

The site-species data have one row per site: a column with a site identifier, followed by one column for each taxon. Each numerical entry is the number of individuals of a given taxon observed at a given site.

The environmental data has a column with a site identifier and a column for each environmental variable (stream temperature, "temp"; and log-transformed percent sand and fines, "sed").
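The layout described above can also be checked without opening an editor. The sketch below uses read-only inspection functions on two small stand-in data frames. The objects site.species.toy and env.data.toy and all of their values are hypothetical and only mimic the structure of the sample data.

```r
# Hypothetical stand-ins for site.species and env.data; all values are invented.
site.species.toy <- data.frame(SITE.ID    = c("S1", "S2", "S3"),
                               BAETIS     = c(12L, 3L, 0L),
                               CHIRONOMUS = c(0L, 7L, 25L))
env.data.toy <- data.frame(SITE.ID = c("S1", "S2", "S3"),
                           temp    = c(14.2, 17.8, 21.5),
                           sed     = c(0.9, 1.3, 1.7))

dim(site.species.toy)    # number of sites (rows) and columns
head(site.species.toy)   # first few rows
str(env.data.toy)        # column names and types
summary(env.data.toy[, c("temp", "sed")])  # range of each environmental variable

# Each site should appear only once; anyDuplicated() returns 0 when that holds.
n.dup <- anyDuplicated(site.species.toy$SITE.ID)
```

Unlike fix(), none of these calls can modify the data, which makes them a safer choice inside scripts.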

Next, merge the data frames on the shared site identifier so that each site's environmental data are matched with its biological data.

dfmerge <- merge(site.species, env.data, by = "SITE.ID")
dfmerge.or <- merge(site.species.or, env.data.or, by = "SITE.ID")
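By default, merge() keeps only the rows whose key appears in both data frames, so sites missing from either file are dropped silently. The sketch below illustrates one way to check for this, using hypothetical toy data frames (site.species.toy, env.data.toy) with invented values rather than the sample data.

```r
# Hypothetical toy data: S1 has no environmental record, S4 has no biological record.
site.species.toy <- data.frame(SITE.ID = c("S1", "S2", "S3"),
                               BAETIS  = c(12L, 3L, 0L))
env.data.toy <- data.frame(SITE.ID = c("S2", "S3", "S4"),
                           temp    = c(17.8, 21.5, 19.0))

# Default merge: only sites present in both data frames are kept.
merged.toy <- merge(site.species.toy, env.data.toy, by = "SITE.ID")

# Sites that the merge dropped from each input.
dropped.bio <- setdiff(site.species.toy$SITE.ID, env.data.toy$SITE.ID)
dropped.env <- setdiff(env.data.toy$SITE.ID, site.species.toy$SITE.ID)

# all = TRUE keeps every site and fills the missing values with NA.
merged.all <- merge(site.species.toy, env.data.toy, by = "SITE.ID", all = TRUE)
```

Comparing nrow() of the merged result with nrow() of each input is a quick way to spot unmatched sites before going on to analysis.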
