This is not the current EPA website. This website is historical material reflecting the EPA website as it existed on January 19, 2021. This website is no longer updated, and links to external websites and some internal pages may not work.

Environmental Economics Seminar: Machine Learning for Causal Inference: An Application to Air Quality Impacts on House Prices

Date and Time

Thursday 11/16/2017 3:00PM to 4:30PM EST


Room 1426, William Jefferson Clinton West Building
1301 Constitution Ave., NW
Washington, DC 20001


Contact: Carl Pasurka, 202-566-2275

Presenter: Jennifer Ho (Economic Analysis Group, Antitrust Division, U.S. Department of Justice)

Description: Hedonic models are commonly used to recover the implicit prices of house attributes and local nonmarket public goods such as environmental quality. Yet they are plagued by omitted variable bias when variables that are correlated with the attribute in question are unobservable. Typically, researchers have relied on fixed effects, instrumental variables, or quasi-randomness to control for this bias. However, these methods require strong underlying assumptions that are often a priori implausible. The increased availability of big data and of unstructured data in the form of text and images allows a more extensive set of variables that are relevant to consumers to be included in hedonic models. Unstructured data are high-dimensional and require machine learning methods that are robust to multicollinearity and irrelevant variables. I collect a rich and comprehensive dataset of property listings from and extract features from house descriptions and curbside view images using natural language and computer vision tools. I apply machine learning techniques to estimate the effects of air pollution on house prices in Pennsylvania. Coupled with the inclusion of more data, this approach nests previous methods to further reduce bias. My results show that omitting important variables can understate the negative effects of air pollution on home prices.
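
To illustrate the omitted variable bias described above, the following is a minimal simulation sketch; it is not the presenter's method or data. All names and parameter values (simulate_hedonic, amenity, rho, the coefficients) are hypothetical assumptions chosen for illustration. It assumes a single unobserved amenity that raises prices and is positively correlated with pollution, and compares a regression that omits the amenity with one that includes it.

```python
import numpy as np


def simulate_hedonic(n=20000, beta_pollution=-0.5, gamma_amenity=1.0,
                     rho=0.3, seed=0):
    """Return the pollution coefficient from a short and a long OLS regression.

    All parameter values are illustrative assumptions, not estimates
    from the seminar.
    """
    rng = np.random.default_rng(seed)

    # Hypothetical unobserved amenity (e.g. neighborhood desirability).
    amenity = rng.normal(size=n)
    # Pollution is constructed to have unit variance and correlation rho
    # with the amenity.
    pollution = rho * amenity + np.sqrt(1.0 - rho**2) * rng.normal(size=n)
    noise = rng.normal(scale=0.5, size=n)

    # Assumed true hedonic price function (log price).
    log_price = 12.0 + beta_pollution * pollution + gamma_amenity * amenity + noise

    ones = np.ones(n)
    x_short = np.column_stack([ones, pollution])           # amenity omitted
    x_long = np.column_stack([ones, pollution, amenity])   # amenity included

    b_short, *_ = np.linalg.lstsq(x_short, log_price, rcond=None)
    b_long, *_ = np.linalg.lstsq(x_long, log_price, rcond=None)
    return b_short[1], b_long[1]


short_coef, long_coef = simulate_hedonic()
print(f"pollution coefficient, amenity omitted:  {short_coef:.3f}")
print(f"pollution coefficient, amenity included: {long_coef:.3f}")
```

Under these assumptions, the short regression's pollution coefficient converges to beta_pollution + gamma_amenity * rho, which is closer to zero than the true effect, so omitting the amenity understates the negative effect of pollution on price.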