Section 5.1 Data Interpretation

By the end of this section, you will be able to:
  1. Define and distinguish between different types of data (primary, secondary, and tertiary).
  2. Identify appropriate data collection methods, including surveys, experiments, and observations.
  3. Organize and present data using frequency distribution tables for both grouped and ungrouped data.
  4. Compute and interpret measures of central tendency (mean, median, and mode) to summarize datasets.
  5. Visualize data effectively through histograms and frequency polygons.
  6. Apply statistical methods to analyze real-world problems and draw meaningful conclusions.

Subsection 5.1.1 Determining Appropriate Class Width For Grouping Data

Data can be classified mainly into two types on the basis of how it is organized: Grouped Data and Ungrouped Data. Ungrouped data is raw data, which consists of a simple list of values where each value corresponds to a distinct observation or measurement.
For example, the marks of students in a classroom (95, 94, 96, 92, 98, 99, ...) form ungrouped data. This kind of representation is used when we have a smaller dataset and need to deal with individual data points.
Grouped data, on the other hand, is a collection of data that has been organized into groups or intervals. This is done to simplify the analysis and interpretation of large datasets. Grouping data helps in identifying patterns, trends, and distributions within the data.
When we have a larger dataset, it is preferable to group similar data values into intervals and assign a value to each interval corresponding to the frequency of data points in that range. The intervals should have a uniform length defined by their upper and lower limits.

Activity 5.1.1. Work in Groups:- Measuring Arm Span.

  1. You will need a tape measure.
  2. In pairs, measure each other’s arm span (from fingertip to fingertip with arms stretched out horizontally) in centimetres. Record the measurements in your exercise books.
  3. Identify the shortest and the longest arm span in your group.
  4. Calculate the difference between the longest and the shortest arm span.
  5. Share your findings with the class—were the tallest students also the ones with the longest arm spans?

Activity 5.1.2. Work in Groups:- Temperature.

You are provided with the daily afternoon temperatures (in °C) recorded in your area over the last 22 days:
29, 31, 26, 33, 30, 27, 25, 35, 28, 24, 32, 26, 30, 34, 33, 29, 31, 27, 36, 25, 28, 32
  1. Identify the lowest and highest temperatures in the data
  2. Work out the difference between them.
  3. Choose a suitable class width that allows you to divide the temperatures into 5 equal intervals.
  4. Write down the 5 class intervals.
  5. Create a frequency table showing how many days fall into each interval.
  6. Share your findings with your classmates.
\(\textbf{Class width}\) is the size of each interval when grouping data. It is determined by the range of the data and the number of intervals you want to create.
When the data is widely spread out, it is often useful to group the data into intervals. This makes it easier to analyze and interpret the data. The choice of class width is crucial for effective grouping.
The difference between the highest and lowest values in a dataset is called the \(\textbf{range}\text{.}\) The range gives you an idea of how spread out the data is.
The lowest number in a class interval is called the \(\textbf{lower class limit}\text{,}\) and the highest number is called the \(\textbf{upper class limit}\text{.}\) The \(\textbf{class boundaries}\) lie halfway between the upper limit of one class and the lower limit of the next class. The class width is the difference between the upper and lower class boundaries.
For example, in the class \(12-15\text{,}\) the lower limit is \(12\) and the upper limit is \(15\text{,}\) while the lower class boundary is \(11.5\) and the upper class boundary is \(15.5\text{.}\) The class width is therefore \(15.5 - 11.5 = 4\text{.}\)
To determine an appropriate class width for grouping data, divide the range by the number of classes expected:
\begin{equation*} \text{Class width} = \frac{\text{Range}}{\text{Number of classes expected}} \end{equation*}
If the result is not a whole number, round it up to the next whole number.

Example 5.1.1. Organizing Smartphone Battery Life.

The battery life (in hours) of 30 different smartphone models was recorded as follows:
9, 12, 11, 15, 18, 10, 14, 17, 13, 16, 19, 15, 12, 14, 17, 10, 13, 11, 20, 19, 9, 13, 16, 18, 17, 12, 10, 11, 15, 16
  1. What is the shortest and longest battery life recorded?
  2. Find the range of the data.
  3. Choose a class width that allows you to sort the battery life data into 6 equal intervals.
  4. Write out the class intervals.
Solution.
  1. The smallest battery life is \(9\) hours and the longest is \(20\) hours.
  2. The range of the data is the difference between the longest and shortest battery life:
    \begin{equation*} \text{Range} = 20 - 9 \end{equation*}
    \begin{equation*} = 11 \text{ hours} \end{equation*}
  3. \begin{equation*} \text{Class width} = \frac{\text{Range}}{\text{Number of classes expected}} \end{equation*}
    \begin{equation*} = \frac{11}{6} \end{equation*}
    \begin{equation*} \approx 1.83 \end{equation*}
    Rounding up to the next whole number gives a class width of \(2\) hours.
  4. Therefore, the class intervals are:
    Time in Hours: 9 - 11, 11 - 13, 13 - 15, 15 - 17, 17 - 19, 19 - 21
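The steps in this example can be checked with a short Python sketch. It assumes the rounding rule stated earlier (always round up) and builds each interval by starting at the smallest value and adding the class width. The function names `class_width` and `class_intervals` are illustrative, not part of the original text.

```python
import math

# Battery life data (hours) from Example 5.1.1.
battery_life = [9, 12, 11, 15, 18, 10, 14, 17, 13, 16, 19, 15, 12, 14, 17,
                10, 13, 11, 20, 19, 9, 13, 16, 18, 17, 12, 10, 11, 15, 16]


def class_width(values, num_classes):
    """Range divided by the number of classes, rounded up to a whole number."""
    data_range = max(values) - min(values)
    return math.ceil(data_range / num_classes)


def class_intervals(values, num_classes):
    """Illustrative helper: (lower, upper) pairs starting at the smallest value."""
    width = class_width(values, num_classes)
    start = min(values)
    return [(start + k * width, start + (k + 1) * width)
            for k in range(num_classes)]


width = class_width(battery_life, 6)
intervals = class_intervals(battery_life, 6)
print("Class width:", width)
print("Intervals:", intervals)
```

Using `math.ceil` rather than `round` keeps the sketch in line with the rule of rounding up, so that the classes are always wide enough to cover the largest value.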

Example 5.1.2. Organising Heights of Students.

The heights of 60 students in a class were measured in centimetres and recorded as follows:
150, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191, 192, 193, 194, 195, 196, 197, 198, 199, 200, 201, 202, 203, 204, 205, 206, 207, 208, 209, 210
  1. What are the shortest and tallest heights recorded?
  2. Find the range of the data.
  3. Choose a class width that allows you to sort the heights into 5 equal intervals.
  4. Write out the class intervals.
Solution.
  1. The shortest height is \(150\) cm and the highest is \(210\) cm.
  2. \begin{equation*} \text{Range} = 210 - 150 = 60 \text{ cm} \end{equation*}
  3. \begin{equation*} \text{Class width} = \frac{\text{Range}}{\text{Number of classes expected}} \end{equation*}
    \begin{equation*} = \frac{60}{5} \end{equation*}
    \begin{equation*} = 12 \text{ cm} \end{equation*}
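  4. The class intervals are:
    Height (cm): 150 - 161, 162 - 173, 174 - 185, 186 - 197, 198 - 210
    Each class spans 12 whole-number values, except the last, which is extended by 1 cm so that it includes the tallest height of \(210\) cm.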

Subsection 5.1.2 Drawing Frequency Distribution Tables For Grouped Data

Activity 5.1.3. Work in Groups.

  1. The numbers of house units in 40 blocks of flats in a certain area are given below:
    150, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190
  2. Make a frequency distribution table for the data above, using class intervals of \(8\) units.
    House Units    Tally    Number of Blocks (f)
    150-157    \(\cancel{||||} \, ||\)    7
    158-165
    166-173
    174-181
    182-189
    190-197
  3. Share your frequency distribution table with other learners in the class.
A frequency distribution table is a way of organizing data into classes or groups, so as to make the data more meaningful and easier to analyze.
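As a rough illustration of how such a table is built, the Python sketch below counts how many of the heights from Example 5.1.2 fall into each of the class intervals found there. It assumes the intervals are inclusive at both ends, and the function name `frequency_table` is illustrative.

```python
# Heights (cm) from Example 5.1.2: 150, then every whole number from 152 to 210.
heights = [150] + list(range(152, 211))

# Class intervals from Example 5.1.2, written as inclusive (lower, upper) limits.
classes = [(150, 161), (162, 173), (174, 185), (186, 197), (198, 210)]


def frequency_table(values, class_limits):
    """Count how many values fall inside each inclusive class interval."""
    table = {}
    for lower, upper in class_limits:
        table[(lower, upper)] = sum(1 for v in values if lower <= v <= upper)
    return table


table = frequency_table(heights, classes)
for (lower, upper), frequency in table.items():
    print(f"{lower} - {upper}: {frequency}")
```

The frequencies should add up to the number of observations, which is a quick way to check that no value was missed or counted twice.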

Subsection 5.1.3 Identifying Modal Class of Grouped Data

Activity 5.1.4. Work in Groups.

  1. Below is a frequency distribution table showing the marks, out of 100, of learners in a class.
    Marks Frequency
    0-10 0
    11-19 0
    20-29 1
    30-39 3
    40-49 1
    50-59 3
    60-69 23
    70-79 15
    80-89 15
    90-100 9
Key Takeaway
The highest frequency in a distribution is called the modal frequency. The class that has the highest frequency is referred to as the modal class.

Example 5.1.6. Sunshine records.

  1. Daily hours of sunshine at 50 weather stations in a country were recorded and the results are shown in the frequency table below.
    Numbers of Hours 1-3 4-6 7-9 10-12 13-15
    Frequency 10 8 14 12 6
  2. State the modal class of the data.
Solution.
The modal class is \(7 - 9\) hours of sunshine, because it has the highest frequency of \(14\text{.}\)

Example 5.1.7. Masses of men.

  1. The table below gives the masses of \(75\) men.
    Mass (kg) 55 - 58 59 - 62 63 - 66 67 - 70 71 - 74 75 - 78 79 - 81
    Frequency 7 10 14 17 13 9 5
  2. What is the modal class of the data?
  3. State the modal frequency.
Solution.
  1. The modal class is \(67 - 70\) kg, because it has the highest frequency of \(17\text{.}\)
  2. The modal frequency is \(17\text{.}\)
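Finding the modal class amounts to looking up the largest frequency in the table. A minimal Python sketch, using the table from Example 5.1.7 and an illustrative function name `modal_class`, might look like this:

```python
# Frequency table from Example 5.1.7 (masses of 75 men, in kg).
mass_frequencies = {
    "55 - 58": 7,
    "59 - 62": 10,
    "63 - 66": 14,
    "67 - 70": 17,
    "71 - 74": 13,
    "75 - 78": 9,
    "79 - 81": 5,
}


def modal_class(frequencies):
    """Return the class with the highest frequency and that frequency."""
    modal = max(frequencies, key=frequencies.get)
    return modal, frequencies[modal]


label, frequency = modal_class(mass_frequencies)
print("Modal class:", label)
print("Modal frequency:", frequency)
```

If two classes share the highest frequency, `max` returns only the first one it meets, so this sketch would need extending to report every modal class in that case.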

Subsection 5.1.4 Calculating Mean of Grouped Data

Activity 5.1.6. Work in Groups.

  1. The table below shows the marks scored by \(81\) learners in an assessment.
    Marks Mid point (\(x\)) Number of Learners (\(f\)) \(fx\)
    10 - 22 \(\frac{10 + 22}{2} = 16\) 5 80
    23 - 35 \(\frac{23 + 35}{2} = 29\) 10 290
    36 - 48 20
    49 - 61 30
    62 - 74 10
    75 - 87 5
    88 - 100 1
    Total = 81 Total =
    1. Complete the table.
    2. Use the table to determine the mean mark.
  2. Share your work with other learners in the class.
Key Takeaway.
When data is organized into class intervals, each interval is represented by its mid-point, which is the average of the lower and upper limits of that class.
The mid-point, denoted by \(x\text{,}\) is given by:
\begin{equation*} x = \frac{\text{upper limit + lower limit}}{2} \end{equation*}
The mean is the sum of all values divided by the total frequency:
\begin{equation*} \text{Mean} = \frac{\text{sum of all values}}{\text{total frequency}} \end{equation*}
The mean is represented by the symbol \(\bar{X}\text{,}\) read as \(X\) bar. For grouped data, the sum of all values is estimated by multiplying each mid-point \(x\) by its class frequency \(f\) and adding the products.
Thus,
\begin{equation*} \bar{X} = \frac{\sum{fx}}{\sum{f}} \end{equation*}
The symbol \(\sum\) (sigma) means "sum of": it tells you to add up all the values of the expression that follows it.

Example 5.1.8. Masses of men.

The table below shows the masses of \(72\) men working in a shop. Calculate the mean mass.
Mass (kg) Frequency
55 - 58 6
59 - 62 8
63 - 66 12
67 - 70 18
71 - 74 6
75 - 78 10
79 - 82 12
Solution.
Mass (kg) Mid point Frequency (f) fx
55 - 58 56.5 6 339
59 - 62 60.5 8 484
63 - 66 64.5 12 774
67 - 70 68.5 18 1233
71 - 74 72.5 6 435
75 - 78 76.5 10 765
79 - 82 80.5 12 966
Total (f) = \(72\) Total (fx) = \(4996\)
The total frequency is \(72\) and the total of \(fx\) is \(4996\text{.}\)
The mean mass is calculated as follows:
\begin{equation*} \bar{X} = \frac{\sum{fx}}{\sum{f}} \end{equation*}
\begin{equation*} \bar{X} = \frac{4996}{72} \end{equation*}
\begin{equation*} \approx 69.39\, \text{kg} \end{equation*}

Example 5.1.9. Points scored in a tournament.

The table below shows the points scored by various teams in a tournament. Calculate the mean points scored.
Points Scored Frequency
1-10 10
11-20 14
21-30 19
31-40 8
41-50 6
51-60 3
Solution.
To find the mean points scored, we need to calculate the total points and the total number of teams.
Points Scored Midpoint Frequency (f) fx
1-10 \(\frac{1+10}{2} = 5.5\) 10 55
11-20 \(\frac{11+20}{2} = 15.5\) 14 217
21-30 \(\frac{21+30}{2} = 25.5\) 19 484.5
31-40 \(\frac{31+40}{2} = 35.5\) 8 284
41-50 \(\frac{41+50}{2} = 45.5\) 6 273
51-60 \(\frac{51+60}{2} = 55.5\) 3 166.5
Total = \(60\) Total = \(1480\)
The total frequency is \(60\) and the total of \(fx\) is \(1480\text{.}\)
The mean points scored is calculated as follows:
\begin{equation*} \bar{X} = \frac{\sum{fx}}{\sum{f}} \end{equation*}
\begin{equation*} \bar{X} = \frac{1480}{60} \end{equation*}
\begin{equation*} \approx 24.67\, \text{points} \end{equation*}
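The calculation in this example can be reproduced with a short Python sketch. It assumes each class is stored as a pair of limits together with its frequency, and the function name `grouped_mean` is illustrative.

```python
# Points table from Example 5.1.9: ((lower limit, upper limit), frequency).
points_table = [
    ((1, 10), 10),
    ((11, 20), 14),
    ((21, 30), 19),
    ((31, 40), 8),
    ((41, 50), 6),
    ((51, 60), 3),
]


def grouped_mean(table):
    """Estimate the mean of grouped data as sum(f * x) / sum(f)."""
    total_fx = 0
    total_f = 0
    for (lower, upper), f in table:
        midpoint = (lower + upper) / 2  # x, the mid-point of the class
        total_fx += f * midpoint
        total_f += f
    return total_fx / total_f


mean_points = grouped_mean(points_table)
print(round(mean_points, 2))
```

Because every value in a class is replaced by the class mid-point, the result is an estimate of the mean rather than the exact mean of the original scores.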

Subsection 5.1.5 Determining the Median of Grouped Data

The median is the value of the middle data point in a dataset when the values are arranged in ascending order. It indicates the centre of the dataset, and comparing it with the mean gives an idea of how the values are distributed.
To find the median of ungrouped data, sort the data points in ascending order. For an odd number of observations, the median is the middle value. For an even number of observations, the median is the mean of the two middle values. Grouped data requires a different method, described in this subsection.
Because grouped data is given in intervals (classes), we first identify the median class, which is the class that contains the middle data point. The median lies between the lower and upper boundaries of this class.
To locate the median class, we need the cumulative frequency of each class, which is the running total of the frequencies of that class and all the classes before it. The median is then found using the formula given below.

Activity 5.1.7. Work in Groups.

  1. Consider the data in the frequency distribution table below.
    Mass Class Boundary Number of People (f) Cumulative Frequency (cf) Class Interval (i) \(\frac{i}{f}\)
    21 - 30 20.5 - 30.5 7 7 10 \(\frac{10}{7}\)
Median of Grouped Data Formula
We can use the following formula to calculate median of grouped data:
\begin{equation*} \text{Median} = L_m + \frac{i_m}{f_m} \left(\frac{N}{2} - C_p \right) \end{equation*}
Where:
  • \(L_m\) = Lower class boundary of the median class
  • \(i_m\) = Width of the median class interval
  • \(f_m\) = Frequency of the median class
  • \(N\) = Total frequency
  • \(C_p\) = Cumulative frequency of the class preceding the median class

Example 5.1.10. Sample data.

Calculate the value of the median for the following data distribution:
Class Interval 0 - 10 10 - 20 20 - 30 30 - 40 40 - 50
Frequency 5 7 12 10 6
Solution.
To find the median of given data, we build a table containing cumulative frequencies for each class interval along with the frequencies.
Class Interval Frequency (f) Cumulative Frequency (cf)
0 - 10 5 0 + 5 = 5
10 - 20 7 5 + 7 = 12
20 - 30 12 12 + 12 = 24
30 - 40 10 24 + 10 = 34
40 - 50 6 34 + 6 = 40
The total frequency \(N\) is 40, so \(\frac{N}{2} = \frac{40}{2} = 20\text{.}\) The median class is the first class whose cumulative frequency is greater than or equal to \(\frac{N}{2}\text{.}\) In this case, the median class is \(20 - 30\text{,}\) with a cumulative frequency of 24, and the cumulative frequency of the preceding class is \(C_p = 12\text{.}\)
Now, we can apply the median formula:
\begin{equation*} \text{Median} = L_m + \frac{i_m}{f_m} \left(\frac{N}{2} - C_p \right) \end{equation*}
Substituting the values into the formula:
\begin{equation*} \text{Median} = 20 + \frac{10}{12} \left(\frac{40}{2} - 12 \right) \end{equation*}
Evaluating the expression:
\begin{equation*} \text{Median} = 20 + \frac{10}{12} \left(20 - 12 \right) \end{equation*}
\begin{equation*} = 20 + \frac{10}{12} \times 8 \end{equation*}
\begin{equation*} = 20 + \frac{80}{12} \end{equation*}
\begin{equation*} \approx 20 + 6.67 \end{equation*}
\begin{equation*} = 26.67 \end{equation*}
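The cumulative frequencies and the median class in this example can be checked with a small Python sketch. It uses `itertools.accumulate` from the standard library for the running totals. The variable names are illustrative.

```python
from itertools import accumulate

# Class intervals and frequencies from Example 5.1.10.
class_intervals = [(0, 10), (10, 20), (20, 30), (30, 40), (40, 50)]
frequencies = [5, 7, 12, 10, 6]

# Running totals of the frequencies, taken in class order.
cumulative = list(accumulate(frequencies))  # [5, 12, 24, 34, 40]

# Half of the total frequency N.
half_total = cumulative[-1] / 2

# The median class is the first class whose cumulative frequency reaches N/2.
median_index = next(i for i, cf in enumerate(cumulative) if cf >= half_total)
median_class = class_intervals[median_index]

print("Cumulative frequencies:", cumulative)
print("Median class:", median_class)
```

Taking the first class that reaches \(\frac{N}{2}\) matters: every later class also has a cumulative frequency above \(\frac{N}{2}\text{,}\) but only the first one contains the middle data point.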

Example 5.1.11.

The table below shows the mass in kilograms of \(60\) patients in a hospital ward.
Mass in Kilograms Frequency (f)
11 - 20 9
21 - 30 14
31 - 40 20
41 - 50 11
51 - 60 6
Calculate the median mass of the patients.
Solution.
Draw a cumulative frequency table for the data.
Mass in Kilograms Class boundary Frequency (f) Cumulative frequency (cf) \(\frac{i_m}{f_m}\)
11 - 20 10.5 - 20.5 9 0 + 9 = 9 \(\frac{10}{9}\)
21 - 30 20.5 - 30.5 14 9 + 14 = 23 \(\frac{10}{14}\)
31 - 40 30.5 - 40.5 20 23 + 20 = 43 \(\frac{10}{20}\)
41 - 50 40.5 - 50.5 11 43 + 11 = 54 \(\frac{10}{11}\)
51 - 60 50.5 - 60.5 6 54 + 6 = 60 \(\frac{10}{6}\)
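The total frequency is \(N = 60\text{,}\) so \(\frac{N}{2} = 30\text{.}\) The first class whose cumulative frequency is at least \(30\) is \(31 - 40\text{,}\) with a cumulative frequency of \(43\text{.}\) Therefore, the median lies in the class \(31 - 40\text{.}\)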
To find the median, we use the median formula, rearranged as:
\begin{equation*} \text{Median} = L_m + \frac{\frac{N}{2} - C_p}{f_m} \times i_m \end{equation*}
Where:
  • \(L_m\) = lower class boundary of the median class = \(30.5\)
  • \(N\) = total frequency = \(60\)
  • \(C_p\) = cumulative frequency of the class preceding the median class = \(23\)
  • \(f_m\) = frequency of the median class = \(20\)
  • \(i_m\) = width of the median class interval = \(10\)
Substituting these values into the formula gives:
\begin{equation*} \text{Median} = 30.5 + \frac{30 - 23}{20} \times 10 \end{equation*}
\begin{equation*} = 30.5 + \frac{7}{20} \times 10 \end{equation*}
\begin{equation*} = 30.5 + 3.5 \end{equation*}
\begin{equation*} = 34\, \text{kg} \end{equation*}
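The whole procedure can be sketched as one Python function. It assumes the classes are given by whole-number limits, so each class boundary lies half a unit beyond the limit. The function name `grouped_median` and the `gap` parameter (the distance between the upper limit of one class and the lower limit of the next) are illustrative, not part of the original text.

```python
def grouped_median(class_limits, frequencies, gap=1):
    """Estimate the median of grouped data.

    class_limits: list of (lower limit, upper limit) pairs in ascending order.
    frequencies:  frequency of each class, in the same order.
    gap:          distance between adjacent class limits (1 for 11-20, 21-30;
                  0 for continuous classes such as 0-10, 10-20).
    """
    total = sum(frequencies)
    if total <= 0:
        raise ValueError("total frequency must be positive")
    half = total / 2
    cumulative_before = 0  # cumulative frequency of the preceding classes
    for (lower, upper), f in zip(class_limits, frequencies):
        if cumulative_before + f >= half:
            lower_boundary = lower - gap / 2
            upper_boundary = upper + gap / 2
            width = upper_boundary - lower_boundary
            return lower_boundary + (half - cumulative_before) * width / f
        cumulative_before += f


# Data from Example 5.1.11 (mass in kilograms of 60 patients).
limits = [(11, 20), (21, 30), (31, 40), (41, 50), (51, 60)]
patients = [9, 14, 20, 11, 6]
print(grouped_median(limits, patients))
```

Calling the same function with `gap=0` handles tables whose classes share endpoints, such as the one in Example 5.1.10, where the limits and the boundaries coincide.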