What are measures of central tendency?
Measures of central tendency describe the centre of a data set using its average or middle values. The measures of central tendency that we will be looking at are the mean, mode, and median.
Mean
The mean, also called the mathematical average of a given data set, can be found by adding all values in the data set, and dividing by the number of values. We can use a mathematical formula to describe this: \(\mu = \frac{\Sigma x}{n}\) where \(\mu\) is used to represent the mean.
We have the scores of a quiz taken by mathematics students in the 6th grade. They are 76, 89, 45, 50, 88, 67, 75, 83. What is the mean score?
Answer:
\(\mu = \frac{\Sigma x}{n}\)
The formula above means we will add all the scores and then divide the sum by the number of scores available.
\(\Sigma x = 76 + 89 + 45 + 50 + 88 + 67 + 75 + 83 = 573\), n = 8.
Since there are 8 scores available, we will divide our sum by 8.
\(\mu = \frac{573}{8} = 71.625\)
The mean score is 71.625.
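The same calculation can be sketched in a few lines of Python. This is only an illustration of the formula \(\mu = \frac{\Sigma x}{n}\) applied to the quiz scores above; the variable names are chosen here for readability and are not part of the original example.

```python
# Quiz scores from the worked example above.
scores = [76, 89, 45, 50, 88, 67, 75, 83]

total = sum(scores)   # Sigma x: the sum of all values
n = len(scores)       # n: the number of values
mean = total / n      # mu = Sigma x / n

print(total, n, mean)  # 573 8 71.625
```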
Mode
The mode is the most frequently occurring value in a data set. Sometimes two or more values are tied for the highest frequency. In that case, all of them are considered modes.
Find the mode for the given data set 6, 9, 3, 6, 6, 5, 2, 3.
Answer:
Arranging these values in ascending order will help you to identify which one occurs the most.
2, 3, 3, 5, 6, 6, 6, 9
It is evident that 6 is the most frequently occurring number, therefore the mode is 6.
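As a rough sketch, the mode can be found in Python by counting how often each value occurs and keeping every value that reaches the highest count. The helper name `modes` is an assumption made for this illustration; it returns a list so that tied values are all reported. The second data set is made up here to show a tie.

```python
from collections import Counter

def modes(values):
    """Return every value that attains the highest frequency, in sorted order."""
    counts = Counter(values)
    highest = max(counts.values())
    return sorted(v for v, c in counts.items() if c == highest)

data = [6, 9, 3, 6, 6, 5, 2, 3]
print(modes(data))             # [6]

# Illustrative data set (not from the text) with two modes.
print(modes([1, 1, 2, 2, 3]))  # [1, 2]
```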
Median
The median is the middle value of an ordered data set. To find it, first arrange the values in ascending order. If the number of data points \(n\) is odd, the median is the value in position \(\frac{n+1}{2}\). If \(n\) is even, there are two middle values, in positions \(\frac{n}{2}\) and \(\frac{n+2}{2}\), and the median is the average of those two values.
The ages of 12 students in grade 11 were collected, and the values are as follows: 15, 21, 19, 19, 20, 18, 17, 16, 17, 18, 19, 18. Find the median age.
Answer:
Arrange these values in ascending order:
15, 16, 17, 17, 18, 18, 18, 19, 19, 19, 20, 21
Since the number of data points is even, we will have two middle numbers, which are both 18. So the median is 18.
The scores of an exam taken by 7 students are given below. Find the median score.
87, 56, 78, 66, 73, 71, 79
Answer:
Rearrange the numbers from lowest to highest.
56, 66, 71, 73, 78, 79, 87
The number of data points is odd, so the middle value, in position \(\frac{7+1}{2} = 4\), is the median score.
Median = 73
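Both median examples can be checked with a short Python sketch. The function below is a minimal illustration of the odd and even cases described above, not a library routine; note that Python lists are indexed from 0, so position \(\frac{n+1}{2}\) corresponds to index `n // 2` when n is odd.

```python
def median(values):
    """Middle value of the sorted data; average of the two middle values if n is even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

ages = [15, 21, 19, 19, 20, 18, 17, 16, 17, 18, 19, 18]
scores = [87, 56, 78, 66, 73, 71, 79]

print(median(ages))    # 18.0
print(median(scores))  # 73
```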
What are measures of spread?
Measures of spread are statistical measures that describe how similar or varied the values in a data set are. Relying on measures of central tendency alone to summarise a data set can be misleading, since they do not show how far the values lie from the centre and do not account for extreme values. Measures of spread, including the range, variance, and standard deviation, fill this gap.
Range
The range is the difference between a given data set's highest and lowest values. It helps you to know how wide the data is. To find the range, the lowest value in the data is subtracted from the highest value.
Find the range of the ages of 12 students in a class. Here's your data: 15, 21, 19, 19, 20, 18, 17, 16, 17, 18, 19, 18.
Answer:
Highest value = 21
Lowest value = 15
Range = highest value - lowest value
Range = 21-15
Range = 6
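In Python, the range calculation above reduces to one subtraction using the built-in `max` and `min` functions. This is a minimal sketch; the name `value_range` is chosen here to avoid shadowing Python's built-in `range`.

```python
ages = [15, 21, 19, 19, 20, 18, 17, 16, 17, 18, 19, 18]

highest = max(ages)
lowest = min(ages)
value_range = highest - lowest  # range = highest value - lowest value

print(highest, lowest, value_range)  # 21 15 6
```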
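However, the range has limitations: it depends only on the two most extreme values, so a single unusually high or low value can change it greatly, and it says nothing about how the values in between are distributed.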
Quartiles and the interquartile range
A quartile is a type of quantile that divides an ordered data set into four parts (quarters). A quartile is not the group of numbers that have been divided. It is the cut-off point in the division.
The interquartile range is the difference between the upper quartile and the lower quartile value.
To find the quartile of a given data set you can proceed as follows:
Order the values in ascending order.
Find the median. This is always labelled as the second quartile (Q2).
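Now find the median of each half of the data set. The median of the lower half is labelled Q1, and the median of the upper half is labelled Q3. When the number of values is odd, leave the overall median out of both halves.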
Find the interquartile range (IQR) by subtracting Q1 from Q3.
Find the interquartile range for the data given 6, 9, 3, 6, 6, 5, 2, 3, 8.
Answer:
Reorder the values from lowest to highest.
2, 3, 3, 5, 6, 6, 6, 8, 9
Find the median
The median is 6.
Q2 = 6
Find the median of the two halves, which are: 2, 3, 3, 5 | 6, 6, 8, 9
For the first half, both middle values are 3, so the median is 3. Q1 = 3
For the second half, we have to sum both middle values and divide the result by 2.
\(\frac{6 +8}{2} = 7\)
Q3 = 7
Find the interquartile range.
\(\begin{align}IQR &= Q_3 - Q_1 \\ &= 7 -3 \\ &= 4 \end{align}\)
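The steps above can be sketched in Python as follows. This follows the median-of-halves method used in this example, where the overall median is left out of both halves when the number of values is odd. Other quartile conventions exist (statistical software often interpolates between values), so library functions may return slightly different quartiles for some data sets. The function names are assumptions made for this illustration.

```python
def median(ordered):
    """Median of an already sorted list."""
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves method."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    # For odd n, skip the middle element so it belongs to neither half.
    upper = ordered[half + 1:] if n % 2 == 1 else ordered[half:]
    return median(lower), median(ordered), median(upper)

data = [6, 9, 3, 6, 6, 5, 2, 3, 8]
q1, q2, q3 = quartiles(data)
iqr = q3 - q1

print(q1, q2, q3, iqr)  # 3.0 6 7.0 4.0
```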
Variance and standard deviation
Variance and standard deviation are both measures of variability. The variance measures how far the data points lie from the mean, on average, in squared units. The standard deviation is the square root of the variance, so it is derived from the variance and is expressed in the same units as the data.
Variance is denoted by \(\sigma^2\).
Standard deviation is denoted by \(\sigma\).
Variance formula
The population variance formula is \(\sigma^2 = \frac{\Sigma(x_i - \mu )^2}{N}\)
Where \(\sigma^2\) = population variance
N = size of the population
\(x_i\) = each value from the population
\(\mu\) = the population mean.
The sample variance formula is \(s^2 = \frac {\Sigma(x_i - \bar{x})^2}{n-1}\)
Where \(s^2\) = sample variance
n = size of sample
\(x_i\) = each value from the sample
\(\bar{x}\) = the sample mean.
Standard deviation formula
The population standard deviation formula is given by \(\sigma = \sqrt{\frac{\Sigma(x_i - \mu)^2}{N}}\)
Where \(\sigma\) = population standard deviation.
N = size of the population.
\(x_i\) = each value from the population.
\(\mu\) = the population mean.
The sample standard deviation formula is given by \(s = \sqrt {\frac{\Sigma(x_i - \bar {x})^2}{n-1}}\)
Where s = sample standard deviation.
n = size of sample.
\(x_i\) = each value from the sample.
\(\bar{x}\) = the sample mean.
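To make the difference between the population and sample formulas concrete, here is a minimal Python sketch that implements both directly from the definitions above. The function names and the small data set are made up for this illustration. The only difference between the two functions is the divisor: N for a population and n - 1 for a sample. Python's standard `statistics` module provides `pvariance`, `variance`, `pstdev`, and `stdev` for the same quantities, and is used here only as a cross-check.

```python
import math
import statistics

def population_variance(values):
    """sigma^2 = sum((x_i - mu)^2) / N"""
    mu = sum(values) / len(values)
    return sum((x - mu) ** 2 for x in values) / len(values)

def sample_variance(values):
    """s^2 = sum((x_i - x_bar)^2) / (n - 1)"""
    x_bar = sum(values) / len(values)
    return sum((x - x_bar) ** 2 for x in values) / (len(values) - 1)

# Illustrative values only (not taken from the text).
data = [2, 4, 4, 4, 5, 5, 7, 9]

print(population_variance(data))             # 4.0
print(math.sqrt(population_variance(data)))  # 2.0
print(sample_variance(data))                 # 32 / 7, slightly larger than 4.0

# Cross-check against the standard library.
print(math.isclose(sample_variance(data), statistics.variance(data)))  # True
```

The standard deviation in each case is the square root of the corresponding variance.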
Calculate the standard deviation for the following scores on a Maths exam taken by 6th-grade students: 82, 93, 98, 89, 88.
Answer:
The first thing you need to do is to find the mean of the sample:
\(\bar{x} = \frac{\Sigma x}{n}\)
\(\bar{x} = \frac {82+93+98+89+88}{5} = \frac{450}{5}\)
\(\bar{x} = 90\)
The formula that we are going to use here is \(s = \sqrt{\frac{\Sigma(x_i-\bar{x})^2}{n-1}}\), since the available scores are only a sample of the whole population of students that took the exam.
We can construct a table to break down the formula and work it out appropriately.
\(x_i\) | \(x_i - \bar{x}\) | \((x_i - \bar{x})^2\) |
82 | -8 | 64 |
93 | 3 | 9 |
98 | 8 | 64 |
89 | -1 | 1 |
88 | -2 | 4 |
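According to the formula, we have to sum \((x_i - \bar{x})^2\), which is the last column of our table: \(\Sigma(x_i - \bar{x})^2 = 64 + 9 + 64 + 1 + 4 = 142\).
Dividing by \(n - 1 = 4\) and taking the square root gives \(s = \sqrt{\frac{142}{4}} = \sqrt{35.5} \approx 5.96\).
The sample standard deviation of the scores is approximately 5.96.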
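The table and the final result can be reproduced with a short Python sketch. It assumes, as the worked example does, that the five scores are a sample, so the sum of squared deviations is divided by n - 1. The variable names are chosen here for illustration.

```python
import math

scores = [82, 93, 98, 89, 88]
n = len(scores)
x_bar = sum(scores) / n  # sample mean

# One row per score: value, deviation from the mean, squared deviation.
rows = [(x, x - x_bar, (x - x_bar) ** 2) for x in scores]
for x, deviation, squared in rows:
    print(f"{x} | {deviation:g} | {squared:g}")

sum_of_squares = sum(squared for _, _, squared in rows)
s = math.sqrt(sum_of_squares / (n - 1))  # sample standard deviation

print(x_bar)           # 90.0
print(sum_of_squares)  # 142.0
print(round(s, 2))     # 5.96
```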
Statistical Measures - Key takeaways
- Statistical measures are techniques of descriptive analysis used to summarise the characteristics of a data set.
- Measures of central tendency describe some key characteristics of the data set based on the average or middle values, as they describe the centre of the data.
- The three main measures of central tendency are mean, mode, and median.
- Mean is the most common measure of central tendency and its formula is \(\mu = \frac{\Sigma x}{n}\).
- Measures of spread are statistical measures that describe the similarity and variety of values of given datasets.
- Standard deviation is a measure of the amount of variation or dispersion of a set of values.
- Standard deviation is the square root of the variance.