S-ID.7 Used Subaru Foresters II


Alignments to Content Standards: S-ID.C.7

Task

Jane wants to sell her Subaru Forester and does research online to find other cars for sale in her area. She checks on craigslist.com and finds 22 Subaru Foresters recently listed, along with their mileage (in miles), age (in years), and listed price (in dollars). (Collected on June 6th, 2012 for the San Francisco Bay Area.)

She examines the scatterplot of price versus age and determines that a linear model is appropriate. She finds the equation of the least squares regression line:

$$ \text{predicted price} = 24{,}247.56 - 1482.06 \cdot \text{age}. $$

  1. What variable is the explanatory (independent) variable and what are the units it is measured in? What variable is the response (dependent) variable and what are the units it is measured in?
  2. What is the slope of the least squares regression line and what are its units?
  3. Interpret the slope of the least squares regression line in the context of the problem, discussing what the slope tells you about how price and age are related. Use appropriate units in your answer.
  4. What is the $y$-intercept of the least squares regression line? Interpret the $y$-intercept in the context of the problem, including appropriate units.

IM Commentary

This problem could be used as a lesson or an assessment. It is important to emphasize that regression lines always have context in statistics and that we can understand the meaning of the slope by thinking about “rise over run” using units from the context. It is also a good place to remind students that while the slope always has an important meaning in a regression model, the $y$-intercept may not, though in this example it does. The US News and World Report website lists the average national price of a 2012 ($\text{age} = 0$) Subaru Forester as ranging from \$20,505 to \$29,411, depending on the features ( http://usnews.rankingsandreviews.com/cars-trucks/Subaru_Forester/prices/ ).

Solution

  1. In this case, age is the explanatory (independent) variable measured in years. The response (dependent) variable is price measured in dollars.
  2. The slope of the least squares regression line is -1482.06. Its units are \$/year. It helps to sketch a graph of the least squares regression line and draw in a representative "rise over run". That way we can see the units of the response variable over the units of the explanatory variable.
  3. The slope tells us that for each additional year of age, the predicted price drops by \$1482.06. Alternatively, a student might conclude that Foresters are expected to lose about \$1500 in value each year. The key idea is to see that the slope tells us about the change in the response variable with respect to changes in the explanatory variable.
  4. The $y$-intercept is \$24,247.56. It is the predicted price of a car that is brand new ($\text{age} = 0$). When 0 is far outside the range of explanatory variable values in the data set, it is usually not appropriate to interpret the intercept in context. However, in this case the value of the intercept is right in line with the national average cost of a new (2012) Subaru Forester.
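
For readers who want to check the interpretations numerically, the following is a minimal Python sketch that evaluates the fitted line from the task. The coefficients come from the regression equation above; the function name `predicted_price` and the sample ages are illustrative assumptions, not part of Jane's data set, so predictions at ages outside the range of her data should be treated with caution.

```python
# Coefficients taken from the fitted line in the task statement.
INTERCEPT = 24247.56  # dollars: predicted price when age = 0
SLOPE = -1482.06      # dollars per year


def predicted_price(age_years: float) -> float:
    """Return the model's predicted listing price (dollars) for an age in years."""
    return INTERCEPT + SLOPE * age_years


# Hypothetical ages chosen only for illustration (not from the original data).
for age in (0, 1, 5, 10):
    print(f"age {age:>2} years -> predicted price ${predicted_price(age):,.2f}")

# "Rise over run" between two ages one year apart recovers the slope,
# with units of dollars (rise) per year (run).
rise = predicted_price(6) - predicted_price(5)
run = 6 - 5
print(f"rise/run = {rise / run:.2f} dollars per year")
```

Evaluating the function at age 0 returns the intercept, and the difference between predictions one year apart equals the slope, which mirrors the reasoning in parts 3 and 4.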