This is not the current EPA website. This page is historical material reflecting the EPA website as it existed on January 19, 2021. It is no longer updated, and links to external websites and some internal pages may not work.

CADDIS Volume 4

Using R to Estimate Taxon-Environment Relationships

How to Estimate Taxon-Environment Relationships Using taxon.env()

The simple script provided on the R Scripts for Parametric Regressions page estimates relationships between taxon occurrences and a single environmental variable. That script can be modified to model multiple variables, but an easier approach is to use the function taxon.env, which is included in the library bio.infer. An added advantage of taxon.env is that its results are automatically formatted for use with the maximum likelihood solver (mlsolve) that computes biological inferences (see the Compute Biological Inferences page in the Helpful Links box for more detail).

If you have not yet tried Using Existing Taxon-Environment Relationships (see the Helpful Links box) to compute inferences at your sites, please do so. The reasons for working through methods for Using Existing Taxon-Environment Relationships are twofold. First, many of the scripts introduced in this process are also used to estimate taxon-environment relationships from your local data. Second, inferences computed from existing taxon-environment relationships can often provide a useful point of comparison when developing your own models.

The remainder of this page provides a step-by-step guide for estimating taxon-environment relationships, using Oregon data as an example.

  1. Load inference library.
    Set up a workspace and install the R library bio.infer (see Download Files and Set Up R in the Helpful Links box).

    Load the biological inference library by typing at the R prompt:
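
    The load command is a single call to library(). A minimal sketch, assuming bio.infer has already been installed as described above; the requireNamespace() guard and the variable name bio.infer.available are illustrative additions that report a clear message when the library is missing:

    ```r
    # Check whether bio.infer is installed before attaching it.
    # The guard is an illustrative addition, not part of the original steps.
    bio.infer.available <- requireNamespace("bio.infer", quietly = TRUE)

    if (bio.infer.available) {
      library(bio.infer)  # attach the biological inference library
    } else {
      message("bio.infer is not installed; see Download Files and Set Up R.")
    }
    ```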

  2. Load local benthic macroinvertebrate count and environmental data.

    Load sample data into R with the following commands.
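
    The later steps refer to the benthic counts as bcnt.OR and the environmental data as envdata.OR. A minimal sketch, assuming both are available as example data sets in bio.infer; the guard and the flag data.loaded are illustrative additions:

    ```r
    # Load the Oregon example data, assuming bio.infer provides both data sets.
    data.loaded <- FALSE  # illustrative flag, not part of the original steps

    if (requireNamespace("bio.infer", quietly = TRUE)) {
      library(bio.infer)
      data(bcnt.OR)     # benthic macroinvertebrate count data
      data(envdata.OR)  # environmental data for the same sites
      data.loaded <- exists("bcnt.OR") && exists("envdata.OR")

      # Inspect the field names; step 4 needs the site identifier and
      # abundance fields from these tables.
      print(names(bcnt.OR))
      print(names(envdata.OR))
    }
    ```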

  3. Standardize taxonomy of benthic count data (for more information, see Standardize Taxonomy in the Helpful Links box).

    Type at the R prompt:

    data(itis.ttable)  # Load ITIS taxonomic information
    bcnt.tax.OR <- get.taxonomic(bcnt.OR, itis.ttable)  # Assign standardized taxonomy to the counts

  4. Estimate taxon-environment relationships.

    Type at the R prompt:

    coef.local <- taxon.env(~sed + sed^2, bcnt.tax.OR, envdata.OR,
                            bcnt.siteid = "SVN", bcnt.abndid = "CountValue",
                            env.siteid = "STRM.ID")

    The calling statement above first specifies that we wish to model the occurrence of each taxon as a function of the variable sed (the percent sand and fines in the substrate) and the square of sed. Next, it supplies the names of the benthic count data (bcnt.tax.OR) and the environmental data (envdata.OR). Finally, it supplies the names of the fields that hold the sample identifiers in each data set (SVN in the count data, STRM.ID in the environmental data) and the abundance values in the count data (CountValue).
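
    To make the model form concrete, the sketch below fits the same kind of relationship for one simulated taxon in base R. It assumes a logistic regression of presence/absence on sed and its square, which is a common parametric choice and not necessarily what taxon.env does internally. All data and coefficient values here are simulated for illustration and are not Oregon results.

    ```r
    # Simulate presence/absence of one hypothetical taxon along a sediment gradient.
    set.seed(1)
    n <- 200
    sed <- runif(n, 0, 100)                   # simulated percent sand and fines
    eta <- -2 + 0.12 * sed - 0.002 * sed^2    # assumed unimodal response (logit scale)
    present <- rbinom(n, 1, plogis(eta))      # 1 = taxon observed, 0 = not observed

    # In a plain glm() formula the square must be wrapped in I();
    # a bare sed^2 would be read as formula syntax, not arithmetic.
    fit <- glm(present ~ sed + I(sed^2), family = binomial)
    print(coef(fit))

    # Predicted probability of occurrence at a few sediment levels.
    newd <- data.frame(sed = c(10, 30, 60, 90))
    p <- predict(fit, newdata = newd, type = "response")
    print(cbind(newd, p))
    ```

    A negative coefficient on the squared term gives a unimodal curve, so the taxon is most likely to occur at an intermediate value of sed. Without the squared term, the predicted probability could only rise or fall monotonically along the gradient.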

  5. The results from this process (coef.local) can now be used to compute inferences, using the same procedure as described on the page Using Existing Taxon-Environment Relationships (see the Helpful Links box).
