Who is the better batter?


Alignments to Content Standards: 7.EE.B.3

Task

Below is a table showing the number of hits and the number of times at bat for two Major League Baseball players during two different seasons:

Season    Derek Jeter                David Justice
1995      12 hits in 48 at bats      104 hits in 411 at bats
1996      183 hits in 582 at bats    45 hits in 140 at bats

A player's batting average is the fraction of times at bat when the player gets a hit.

  1. For each season, find the players' batting averages. Who has the better batting average?
  2. For the combined 1995 and 1996 seasons, find the players' batting averages. Who has the better batting average?
  3. Are the answers to parts 1 and 2 consistent? Explain.

IM Commentary

The purpose of this task is to give students a real-world context for comparing fractions where it is natural to convert the fractions to decimals or describe the situation in terms of percents. While it is possible to compare the batting averages in fraction form (as shown in the second solution), many people find it helpful to convert to decimals or to use the language of percents. This is because writing a fraction in decimal form amounts to finding a common denominator; in the first solution shown, we are actually finding approximately equivalent fractions with a common denominator of 1,000. Converting to decimals also helps students easily get a handle on the relative magnitude of the fractions, and students need to be able to translate between these forms fluently, especially when they are solving complex modeling problems.
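As a brief illustration of the common-denominator remark, here are the two 1995 averages from the task written as thousandths (the second is approximate, since $\frac{104}{411}$ is not an exact number of thousandths):

$$ \frac{12}{48} = \frac{250}{1000} = 0.250, \qquad \frac{104}{411} \approx \frac{253}{1000} = 0.253. $$

Comparing 0.250 with 0.253 is then the same as comparing 250 thousandths with 253 thousandths.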

This task also gives a concrete example, with real data, of what is called Simpson's paradox. In this case, David Justice has a higher batting average than Derek Jeter both in 1995 and in 1996. Paradoxically, Derek Jeter has the higher batting average when the two sets of data are combined. The source of the paradox lies in the fact that the number of at bats varies from season to season and from player to player: If, for example, each player had 100 at bats in both seasons, then it could not happen that one player has higher batting averages in each season considered separately but a lower batting average when the seasons are combined. Similarly, if the batting averages for each player were the same in the two seasons, then this paradox could not occur.
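One way to make this precise is to write the combined average as a weighted average of the season averages. The symbols here are introduced only for this sketch and do not appear in the task: $h_1, h_2$ are one player's hits in the two seasons and $n_1, n_2$ are the corresponding at bats.

$$ \frac{h_1 + h_2}{n_1 + n_2} = \frac{n_1}{n_1 + n_2} \cdot \frac{h_1}{n_1} + \frac{n_2}{n_1 + n_2} \cdot \frac{h_2}{n_2}. $$

For Derek Jeter the weights are $\frac{48}{630}$ and $\frac{582}{630}$, so his combined average sits very close to his 1996 average. For David Justice the weights are $\frac{411}{551}$ and $\frac{140}{551}$, so his combined average sits closer to his 1995 average. If both players had the same number of at bats in each season, say $n_1 = n_2$, then both combined averages would be plain means of the season averages, $\frac{1}{2}\left(\frac{h_1}{n_1} + \frac{h_2}{n_2}\right)$, and a player who trails in each season would also trail overall.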

The data and idea for this task come from the Batting Averages section of the Wikipedia entry on Simpson's paradox:

https://en.wikipedia.org/wiki/Simpson's_paradox

Much more information and further examples of the paradox can be found at this website.

The main standard for mathematical practice relevant for this problem is MP2, Reason Abstractly and Quantitatively, as the main work of the task involves constructing fractions from the context, reasoning about the fractions, and then making a conclusion about the batting averages. The task may also provide an opportunity to work on MP3, Construct Viable Arguments and Critique the Reasoning of Others, particularly if students are initially surprised by the results of their calculations.

This task was designed for an NSF-supported summer program for teachers and undergraduate students held at the University of New Mexico from July 29 through August 2, 2013 (http://www.math.unm.edu/mctp/).

Solutions

Solution 1: Converting to decimals

  1. We are given that the batting average is the fraction of at bats in which the player got a hit. So, for example, in 1995 Derek Jeter's batting average was $$ \frac{12}{48} = \frac{1}{4}. $$ It is common practice in listing batting averages to convert these fractions to decimals, rounding off to the nearest thousandth. The table below presents the batting averages for each player in the two seasons, both as fractions and as decimals rounded to the nearest thousandth:

    Season    Derek Jeter                        David Justice
    1995      $\frac{12}{48} = 0.250$            $\frac{104}{411} \approx 0.253$
    1996      $\frac{183}{582} \approx 0.314$    $\frac{45}{140} \approx 0.321$

    The table shows that David Justice had a slightly higher batting average than Derek Jeter in both 1995 (25.3% vs. 25.0%) and 1996 (32.1% vs. 31.4%).

  2. If we combine the data for the two seasons, then Derek Jeter had 12 + 183 = 195 total hits in 48 + 582 = 630 total at bats. So his overall batting average for the two seasons was $$ \frac{195}{630} \approx 0.310. $$ David Justice had 104 + 45 = 149 hits in 411 + 140 = 551 at bats for an overall batting average of $$ \frac{149}{551} \approx 0.270. $$ So Derek Jeter had a higher batting average of 31.0% for the 1995 and 1996 seasons combined compared with David Justice's batting average of 27.0%.
  3. Even though David Justice had a higher batting average than Derek Jeter in 1995 and in 1996, Derek Jeter's average was higher for the two years combined. This is because the two seasons do not carry the same weight when the data are combined. Derek Jeter had a very high batting average in 1996, when he had most of his at bats. David Justice, on the other hand, had most of his at bats in 1995, and his average in that season is much lower than Derek Jeter's average in 1996. The batting average from the season in which a player had the majority of his at bats has the biggest influence on his overall batting average, and this is why Derek Jeter comes out ahead.
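The arithmetic in this solution can be checked with a short script. The following is a minimal sketch, not part of the original task: the function names and the `data` dictionary are made up for illustration, and the only inputs are the hits and at bats from the table in the task.

```python
# Hits and at bats from the task's table, stored as (hits, at_bats).
data = {
    "Derek Jeter": {1995: (12, 48), 1996: (183, 582)},
    "David Justice": {1995: (104, 411), 1996: (45, 140)},
}


def batting_average(hits, at_bats):
    """Fraction of at bats with a hit, rounded to the nearest thousandth."""
    return round(hits / at_bats, 3)


def season_averages(player):
    """Batting average for each season taken separately."""
    return {
        season: batting_average(hits, at_bats)
        for season, (hits, at_bats) in data[player].items()
    }


def combined_average(player):
    """Batting average over all seasons: total hits over total at bats."""
    total_hits = sum(hits for hits, _ in data[player].values())
    total_at_bats = sum(at_bats for _, at_bats in data[player].values())
    return batting_average(total_hits, total_at_bats)


for player in data:
    print(player, season_averages(player), combined_average(player))
```

Note that the combined average is computed from the totals, not by averaging the two season averages. That difference is exactly what allows the ranking to reverse.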

Solution 2: Working with fractions

  1. We can compare the fractional batting averages for Derek Jeter and David Justice without converting these fractions to decimals. For the 1995 season, Derek Jeter's batting average was $\frac{12}{48}$ while David Justice's batting average was $\frac{104}{411}$. One way to compare these (without finding a common denominator) is given here:

    \begin{align} \frac{12}{48} &= \frac{1}{4} \\ &= \frac{1 \times 104}{4 \times 104} \\ &= \frac{104}{416}\\ &\lt \frac{104}{411}. \end{align}

    The last step works because 411ths are larger than 416ths (because there are fewer 411ths in a whole). So David Justice has a higher batting average than Derek Jeter in 1995.

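    To compare the batting averages in 1996 we can reason in a similar way, this time rewriting both fractions with a common numerator of 549:

    \begin{align} \frac{45}{140} &= \frac{9}{28} \\ &= \frac{9 \times 61}{28 \times 61} \\ &= \frac{549}{1708} \\ &\gt \frac{549}{1746} \\ &= \frac{183 \times 3}{582 \times 3} \\ &= \frac{183}{582}. \end{align}

    The inequality holds because 1708ths are larger than 1746ths.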

    So in 1996 David Justice again had a higher batting average than Derek Jeter.

  2. If we combine the data for the two seasons, then Derek Jeter had 12 + 183 = 195 total hits in 48 + 582 = 630 total at bats. David Justice had 104 + 45 = 149 hits in 411 + 140 = 551 at bats. The fractions $\frac{195}{630}$ and $\frac{149}{551}$ can be compared by long division, or we might reason as follows, using the methods of part 1:

    \begin{align} \frac{149}{551} &\lt \frac{150}{550} \\ &= \frac{3}{11} \\ &= \frac{195}{715}\\ &\lt \frac{195}{630}. \end{align}

    So Derek Jeter's batting average is higher than David Justice's when the 1995 and 1996 data are combined.

  3. Even though David Justice had a higher batting average than Derek Jeter in 1995 and in 1996, Derek Jeter's average was higher for the two years combined. This is because the two seasons do not carry the same weight when the data are combined. Derek Jeter had a very high batting average in 1996, when he had most of his at bats. David Justice, on the other hand, had most of his at bats in 1995, and his average in that season is much lower than Derek Jeter's average in 1996. The batting average from the season in which a player had the majority of his at bats has the biggest influence on his overall batting average, and this is why Derek Jeter comes out ahead.
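The fraction comparisons in this solution can also be checked exactly, with no rounding, using Python's standard `fractions` module. This is a sketch under stated assumptions: the helper `common_numerator_form` is a hypothetical name introduced here, and it mechanizes the "same numerator, compare denominators" idea rather than reproducing the particular benchmark fractions chosen above.

```python
from fractions import Fraction
from math import gcd


def common_numerator_form(a, b):
    """Rewrite two positive fractions so that they share a numerator.

    Returns (numerator, denominator_of_a, denominator_of_b). With equal
    numerators, the fraction with the smaller denominator is the larger one.
    """
    # Least common multiple of the two reduced numerators.
    n = a.numerator * b.numerator // gcd(a.numerator, b.numerator)
    return (
        n,
        a.denominator * (n // a.numerator),
        b.denominator * (n // b.numerator),
    )


# Season averages as exact fractions (Fraction reduces them automatically).
jeter_1995, justice_1995 = Fraction(12, 48), Fraction(104, 411)
jeter_1996, justice_1996 = Fraction(183, 582), Fraction(45, 140)

# Combined averages: add hits and at bats separately, then divide.
jeter_combined = Fraction(12 + 183, 48 + 582)
justice_combined = Fraction(104 + 45, 411 + 140)

print(common_numerator_form(jeter_1995, justice_1995))
print(common_numerator_form(justice_1996, jeter_1996))
print(justice_1995 > jeter_1995, justice_1996 > jeter_1996)
print(jeter_combined > justice_combined)
```

Because `Fraction` stores exact rational numbers, the comparisons do not depend on how many decimal places are kept.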

Michael Nakamaye says:

almost 3 years

I thought that it was nice to leave it open to support a number of different approaches and the teacher can encourage students to use percents if this is desired. Students familiar with baseball statistics will likely take this approach but it would be interesting to see if other approaches arise. If you use this in the classroom, please let us know what happens and what we can do to make it work better.

Dorothy says:

almost 3 years

Hi! Why not write this so it could be used as a problem specifically for using percents?