Multiple Linear Model Selection: A Data Study on the Ecological Controls on Methylmercury Production
Abstract
Data on methylmercury (MeHg) concentrations and other environmental parameters were collected between 2006 and 2012 from wetlands in the prairie pothole region of Saskatchewan by members of the Department of Biology, University of Regina. My research goal was to use backward elimination model selection to develop the best-fitting linear regression model for testing the impact of seven environmental variables on methylmercury concentration. Before the analysis, I cleaned the data set, removed all missing values, and transformed some variables, such as methylmercury, sulfate, and conductivity. I compared a linear regression model with two-way interactions against a linear model with no interactions. After applying the model selection, I checked the model using several methods, including residuals vs. fitted values plots, normal Q-Q plots, and AIC comparison; I then checked for outliers in the combined data set under the final model. The results of this study indicate that temperature (Temp), dissolved organic carbon (DOC), specific ultraviolet absorbance (SUVA), and the logarithms of conductivity (Cond) and sulfate (SO_4) affected methylmercury concentrations in our model. The best-fitting model I selected was log(MeHg) = β_0 + β_1 Temp + β_2 log(Cond) + β_3 log(SO_4) + β_4 DOC + β_5 SUVA + ε.
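To make the selection procedure concrete, the following is a minimal sketch of backward elimination, not the code used in the study. It assumes AIC is the criterion for dropping a term (the abstract mentions comparing AIC but does not state the elimination criterion), and it runs on synthetic data: the coefficients, the sample size, and the two extra predictors `noise_1` and `noise_2` are hypothetical placeholders, since the abstract does not name all seven variables. Only the Python standard library is used.

```python
import math
import random

def ols_rss(X, y):
    """Least-squares fit via the normal equations; returns the residual sum of squares."""
    n, p = len(X), len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(X[i][a] * X[i][b] for i in range(n)) for b in range(p)]
         + [sum(X[i][a] * y[i] for i in range(n))] for a in range(p)]
    # Gauss-Jordan elimination with partial pivoting
    for c in range(p):
        piv = max(range(c, p), key=lambda r: abs(A[r][c]))
        A[c], A[piv] = A[piv], A[c]
        for r in range(p):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [v - f * w for v, w in zip(A[r], A[c])]
    beta = [A[j][p] / A[j][j] for j in range(p)]
    return sum((y[i] - sum(b * x for b, x in zip(beta, X[i]))) ** 2 for i in range(n))

def aic(data, y, names):
    """Gaussian AIC up to an additive constant: n*ln(RSS/n) + 2*(number of coefficients)."""
    n = len(y)
    X = [[1.0] + [data[v][i] for v in names] for i in range(n)]  # intercept + predictors
    return n * math.log(ols_rss(X, y) / n) + 2 * (len(names) + 1)

def backward_eliminate(data, y, names):
    """Start from the full model; repeatedly drop the term whose removal lowers AIC most."""
    current = list(names)
    best = aic(data, y, current)
    while current:
        trials = [(aic(data, y, [v for v in current if v != d]), d) for d in current]
        cand_aic, drop = min(trials)
        if cand_aic >= best:  # no single removal improves AIC: stop
            break
        current.remove(drop)
        best = cand_aic
    return current, best

# Synthetic illustration only: all values below are invented, not study data.
random.seed(42)
n = 200
names = ["Temp", "log_Cond", "log_SO4", "DOC", "SUVA", "noise_1", "noise_2"]
data = {v: [random.gauss(0, 1) for _ in range(n)] for v in names}
log_mehg = [0.5 + 0.8 * data["Temp"][i] - 0.6 * data["log_Cond"][i]
            + 0.7 * data["log_SO4"][i] + 0.9 * data["DOC"][i]
            - 0.5 * data["SUVA"][i] + random.gauss(0, 0.5) for i in range(n)]

full_aic = aic(data, log_mehg, names)
selected, final_aic = backward_eliminate(data, log_mehg, names)
print("selected terms:", selected)
print("full AIC: %.2f, final AIC: %.2f" % (full_aic, final_aic))
```

The five predictors that drive the synthetic response are always retained here because removing any of them raises AIC sharply; an irrelevant predictor can still survive by chance, which is one reason the residual and Q-Q checks described above remain necessary after selection.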