This is not the current EPA website. To navigate to the current EPA website, please go to www.epa.gov. This website is historical material reflecting the EPA website as it existed on January 19, 2021. This website is no longer updated and links to external websites and some internal pages may not work.

CADDIS Volume 4

Using R to Estimate Central Tendencies

How to Estimate Central Tendencies

In R, the basic formula for computing a central tendency for a taxon, here an abundance-weighted average, can be written in general form as follows (see also Equation 1 on the Central Tendencies page, linked in the Helpful Links box):

WA <- sum(Y * x) / sum(Y)

where Y is a vector containing the abundance of the taxon of interest in each sample (Y can also contain presence/absence data, coded as 1 for present and 0 for absent), x is a vector containing the value of the environmental variable of interest for each sample, and sum computes the sum of the values of a numeric vector. Y and x must be the same length and list the samples in the same order.
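As a quick check of the formula, the following minimal sketch applies it to four made-up samples. The vectors Y.demo and x.demo are illustrative values, not part of the CADDIS sample data. It also shows the presence/absence variant and compares the result against the base R function weighted.mean.

```r
# Made-up abundances of one taxon in four samples (illustrative only)
Y.demo <- c(0, 2, 5, 3)
# Made-up values of an environmental variable in the same four samples
x.demo <- c(10, 12, 15, 20)

# Abundance-weighted average of the environmental variable
WA.abund <- sum(Y.demo * x.demo) / sum(Y.demo)

# Presence/absence version: recode abundances as 1 (present) or 0 (absent)
Y.pa <- as.numeric(Y.demo > 0)
WA.pa <- sum(Y.pa * x.demo) / sum(Y.pa)

# weighted.mean() from base R computes the same quantity
WA.check <- weighted.mean(x.demo, w = Y.demo)

print(c(abundance = WA.abund, presence = WA.pa, check = WA.check))
```

With presence/absence weights, the result is simply the mean of the environmental variable over the samples where the taxon occurs.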

A for loop can be used to compute weighted averages for many different taxa from previously loaded data. To run this script, first make sure that you have loaded the sample biological and environmental data (see Download Scripts and Sample Data in Quick Links) and merged them into a single data frame called dfmerge.
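For readers who have not yet built the merged data frame, the sketch below shows one way the merge step might look, using two small made-up data frames. The names bio.demo, env.demo, dfmerge.demo, and the key column SITE.ID are assumptions for illustration; the actual sample data may use a different identifier column.

```r
# Made-up biological data: one row per sample, one abundance column per taxon
bio.demo <- data.frame(
    SITE.ID    = c("S1", "S2", "S3", "S4"),
    ACENTRELLA = c(0, 2, 5, 3),
    AMELETUS   = c(4, 1, 0, 0)
)

# Made-up environmental data: one row per sample
env.demo <- data.frame(
    SITE.ID = c("S1", "S2", "S3", "S5"),
    temp    = c(10, 12, 15, 20)
)

# merge() keeps only samples present in both data frames by default,
# so S4 (no environmental data) and S5 (no biological data) are dropped
dfmerge.demo <- merge(bio.demo, env.demo, by = "SITE.ID")
print(dfmerge.demo)
```

Because unmatched samples are dropped silently, it can be worth comparing nrow(dfmerge.demo) against the row counts of the inputs before computing any weighted averages.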

Designate the taxa for which you want to compute tolerance values.

taxa.names <- c("ACENTRELLA", "DIPHETOR", "AMELETUS")

Then, run a for loop that repeats the weighted average computation for each of the selected taxa.

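# Define WA as a vector with one element per taxon of interest
WA <- rep(NA, times = length(taxa.names))

# For each taxon, weight the environmental variable (here, dfmerge$temp)
# in each sample by the taxon's abundance in that sample.
# seq_along() also behaves correctly if taxa.names is empty.
for (i in seq_along(taxa.names)) {
    WA[i] <- sum(dfmerge[, taxa.names[i]] * dfmerge$temp) /
                 sum(dfmerge[, taxa.names[i]])
}
# Name each element of the vector by its taxon name
names(WA) <- taxa.names
print(WA)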
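The same computation can also be written without an explicit loop. The sketch below is one possible alternative, not the method used in the CADDIS scripts. It assumes a small made-up data frame df.demo and a hypothetical helper wa.demo that drops samples where either the abundance or the environmental value is missing, since a single NA would otherwise make the whole sum NA.

```r
# Made-up merged data (illustrative only); one abundance value is missing
df.demo <- data.frame(
    ACENTRELLA = c(0, 2, 5, 3),
    DIPHETOR   = c(1, 1, 0, 0),
    AMELETUS   = c(4, 1, 0, NA),
    temp       = c(10, 12, 15, 20)
)
taxa.demo <- c("ACENTRELLA", "DIPHETOR", "AMELETUS")

# Hypothetical helper: weighted average that uses only the samples
# where both the abundance (Y) and the environmental value (x) are known
wa.demo <- function(Y, x) {
    keep <- !is.na(Y) & !is.na(x)
    sum(Y[keep] * x[keep]) / sum(Y[keep])
}

# sapply() over a character vector names the result by taxon automatically
WA.demo <- sapply(taxa.demo, function(tx) wa.demo(df.demo[[tx]], df.demo$temp))
print(WA.demo)
```

A taxon that is absent from every sample has a total abundance of zero, so its weighted average is NaN in both the loop and this version; such taxa are best removed from the list beforehand.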
