Coffee and Crime


Alignments to Content Standards: S-ID.B.6, S-ID.C.7, S-ID.C.8, S-ID.C.9

Task

Many counties in the United States are governed by a county council. At public county council meetings, county residents are usually allowed to bring up issues of concern. At a recent public county council meeting, one resident expressed concern that 3 new coffee shops from a popular coffee shop chain were planning to open in the county, and the resident believed that this would create an increase in property crimes in the county. (Property crimes include burglary, larceny-theft, motor vehicle theft, and arson. Source: http://www.fbi.gov/about-us/cjis/ucr/crime-in-the-u.s/2010/crime-in-the-u.s.-2010/property-crime, accessed on December 5, 2012.)

To support this claim, the resident presented the following data and scatterplot (with the least-squares line shown) for 8 counties in the state:

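| County | Shops | Crimes |
|--------|-------|--------|
| A      | 9     | 4000   |
| B      | 1     | 2700   |
| C      | 0     | 500    |
| D      | 6     | 4200   |
| E      | 15    | 6800   |
| F      | 50    | 20800  |
| G      | 5     | 2800   |
| H      | 24    | 15400  |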

[Figure: scatterplot of Crimes versus Shops for the 8 counties, with the least-squares line shown]

The scatterplot shows a positive linear relationship between "Shops" (the number of coffee shops of this coffee shop chain in the county) and "Crimes" (the number of annual property crimes for the county). In other words, counties with more of these coffee shops tend to have more property crimes annually.

  1. Does the relationship between Shops and Crimes appear to be linear? Would you consider the relationship between Shops and Crimes to be strong, moderate, or weak?
  2. Compute the correlation coefficient. Does the value of the correlation coefficient support your choice in part 1? Explain.
  3. The equation of the least-squares line for these data is:

    $$ \text{Predicted Crimes} = 1434 + 415.7 \text{(Shops)} $$

    Based on this line, what is the estimated number of additional annual property crimes for a given county that has 3 more coffee shops than another county?

  4. Do these data support the claim that building 3 additional coffee shops will necessarily cause an increase in property crimes? What other variables might explain the positive relationship between the number of coffee shops for this coffee shop chain and the number of annual property crimes for these counties?

  5. If the following two counties were added to the data set, would you still consider using a line to model the relationship? If not, what other types (forms) of model would you consider?

    | County | Shops | Crimes |
    |--------|-------|--------|
    | I      | 25    | 36900  |
    | J      | 27    | 24100  |

IM Commentary

Note: The data in this task are roughly based on actual values but have been modified to facilitate the task and to disguise the counties in question.

This task addresses many standards regarding the description and analysis of bivariate quantitative data, including regression and correlation. Students should recognize that the pattern shown is one of a strong, positive, linear association, and thus a correlation coefficient value near +1 is plausible. Students should also be able to interpret the slope of the least-squares line as an estimated increase in $y$ per unit increase in $x$ (and thus, for a 3-unit increase in $x$, students should expect an estimated increase in $y$ equal to 3 times the model's slope value).
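
One way to see why the multiplier is exactly 3 is to write the least-squares line as $\hat{y} = a + bx$ (the symbols $a$, $b$, and $\hat{y}$ are notation assumed here for illustration, not taken from the task) and compare the predictions at $x$ and $x + 3$:

$$ \hat{y}(x+3) - \hat{y}(x) = \bigl(a + b(x+3)\bigr) - (a + bx) = 3b $$

The intercept cancels, so the estimated difference depends only on the slope and not on how many shops either county has.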

From a perspective of context, students should consider other variables that may explain the association (e.g., counties with higher populations or higher population density may have both more coffee shops and more property crimes). This would also reinforce the fact that correlation (even strong correlation) does not by itself imply causation. Depending upon student knowledge of experiments and observational studies, a discussion can occur reinforcing the risk associated with implying causation based on data from an observational study. Lastly, students should consider how a trend observed in a small sample of bivariate observations may change drastically with the addition of just a few observations.
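
The sensitivity of a small sample to a few extra observations can be checked numerically. The following is a minimal Python sketch, not part of the original task, that computes the correlation coefficient for the 8 original counties and again after adding counties I and J from the task. The helper `pearson_r` and the names `ORIGINAL` and `ADDED` are hypothetical, chosen only for illustration.

```python
from math import sqrt

# County data from the task: (shops, annual property crimes) for counties A-H.
ORIGINAL = [(9, 4000), (1, 2700), (0, 500), (6, 4200),
            (15, 6800), (50, 20800), (5, 2800), (24, 15400)]
# Counties I and J, the two observations added later in the task.
ADDED = [(25, 36900), (27, 24100)]


def pearson_r(points):
    """Pearson correlation coefficient of a list of (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    syy = sum((y - mean_y) ** 2 for _, y in points)
    return sxy / sqrt(sxx * syy)


r_original = pearson_r(ORIGINAL)
r_extended = pearson_r(ORIGINAL + ADDED)
print(f"r for 8 counties:  {r_original:.3f}")
print(f"r for 10 counties: {r_extended:.3f}")
```

With the task's data, this should give a value of about 0.968 for the 8 counties and about 0.74 once the two counties are added, so two observations are enough to weaken the linear association noticeably.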

Solution

  1. The relationship does appear to be linear. The relationship would be considered strong and positive, given how closely the points adhere to a line with positive slope.
  2. $r = 0.968$. Since the pattern shown is one of very strong, positive, linear association, a correlation coefficient value near +1 is plausible.
  3. $415.7 \cdot 3 = 1247.1$. According to the model, the predicted increase in the number of annual property crimes for a county with 3 additional coffee shops would be 1247.
  4. Association (no matter how strong) does not necessarily imply causation. It is unlikely that building a new coffee shop would cause crime rates to increase, for such logic would imply that coffee drinkers engage in more criminal behavior than non-coffee drinkers, the coffee shop attracts criminals to the county, etc. From a perspective of context, students should consider other variables that may be responsible for the association (e.g., counties with higher populations or higher population density may have both more coffee shops and more property crimes). As stated in the "commentary" above, depending upon student knowledge of experiments and observational studies, a discussion can occur reinforcing the risk associated with stating/implying causation based on data from an observational study.
  5. With the addition of the two observations, the scatterplot now displays a curved relationship with one outlier at (50, 20800). The scatterplot still shows a positive relationship between "Shops" (the number of coffee shops of this coffee shop chain in the county) and "Crimes" (the number of annual property crimes for the county in the previous year) – but the relationship no longer appears to be linear (or does not appear as linear as before). When only a few observations are used to assess a trend, sometimes just adding one or two points can change the appearance significantly. The new plot is shown below. This relationship might be modeled using a quadratic or an exponential curve.

    [Figure: scatterplot of Crimes versus Shops for all 10 counties, including counties I and J]
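
The least-squares line quoted in the task and the estimate in part 3 can be reproduced from the 8-county table. This is a minimal Python sketch under the assumption that the line is fit by ordinary least squares on the table as given; the function `least_squares_line` and the name `COUNTIES` are hypothetical and not part of the task.

```python
# County data from the task: (shops, annual property crimes) for counties A-H.
COUNTIES = [(9, 4000), (1, 2700), (0, 500), (6, 4200),
            (15, 6800), (50, 20800), (5, 2800), (24, 15400)]


def least_squares_line(points):
    """Return (intercept, slope) of the line y = intercept + slope * x."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


intercept, slope = least_squares_line(COUNTIES)
extra_crimes = 3 * slope  # predicted difference for a county with 3 more shops
print(f"Predicted Crimes = {intercept:.0f} + {slope:.1f} (Shops)")
print(f"Predicted increase for 3 more shops: {extra_crimes:.0f}")
```

The unrounded slope gives an increase of about 1247.2 rather than the 1247.1 obtained from the rounded slope 415.7; both round to 1247 crimes.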

Isao says:

over 1 year

My students and I discovered that, on part "e", when we ran the regression line, the correlation coefficient became r=0.74, which, according to the reference table in Lesson 19 of M2 of EngageNY Algebra 1, would still be a strong positive linear relationship. We also graphed the residual plot and saw no strong indication of a particular pattern. Thus, we are confused, because we believe that, even after adding the two counties, the relationship is still strong. Any advice on this?

Tim says:

almost 2 years

Is there a way to bookmark tasks still? Thanks.

Heidi says:

over 2 years

https://www.dropbox.com/s/alqinsohid0jbqs/Illustrative-Coffee%20and%20Crime.tns?dl=0

Here is my TI-nspire document that can be used with this activity.

Lori Edwards says:

over 4 years

The content code S-ID.B.6 includes using residuals to informally assess the fit of the function. Since this concept is not included in the task, it seems like either S-ID.6a or S-ID.6c would be more appropriate. What is the difference between S-ID.6a and S-ID.6c?

Thank you.

Cam says:

over 4 years

Thanks for your questions! Basically, there's a slight difference between illustrating a standard and illustrating some specified subset of its parts. This task is aligned to the standard S-ID.B.6 (Representing data on two quantitative variables on a scatter plot...), and not specifically to any of its subparts S-ID.6.(a,b,c). Frequently we align to the whole standard when a task addresses multiple things crossing over various parts, as you note that this task does, but we do also have tasks that align to specific parts.

For the second question, note that S-ID.6c is specifically about linear models, whereas S-ID.6a includes quadratic and exponential models, and includes material on using that model to solve problems. Hope that helps!