This process is called taking a sample mean. In this article you will find its definition, how to calculate it, its standard deviation and variance, its sampling distribution, and worked examples.
Definition of Sample Means
The mean of a set of numbers is just the average, that is, the sum of all the elements in the set divided by the number of elements in the set.
The sample mean is the average of the values obtained in the sample.
It is easy to see that if two sets are different, they will most likely also have different means.
Calculation of Sample Means
The sample mean is denoted by \(\overline{x}\), and is calculated by adding up all the values obtained from the sample and dividing by the total sample size \(n\). The process is the same as averaging a data set. Therefore, the formula is \[\overline{x}=\frac{x_1+\ldots+x_n}{n},\]
where \(\overline{x}\) is the sample mean, \(x_i\) is each element in the sample and \(n\) is the sample size.
Let's go back to the San Francisco example. Suppose you asked \(5\) of your acquaintances how much they spend on public transport per week, and they said \(\$20\), \(\$25\), \(\$27\), \(\$43\), and \(\$50\). So, the sample mean is calculated by:
\[\overline{x}=\frac{20+25+27+43+50}{5}=\frac{165}{5}=33.\]
Therefore, for this sample, the average amount spent on public transportation in a week is \(\$33\).
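The same arithmetic can be sketched in a few lines of Python. The function name `sample_mean` and the variable `spending` are illustrative choices, not part of the original example.

```python
# Weekly public-transport spending (in dollars) reported by the 5 acquaintances.
spending = [20, 25, 27, 43, 50]


def sample_mean(values):
    """Return the sum of the values divided by how many values there are."""
    return sum(values) / len(values)


x_bar = sample_mean(spending)
print(x_bar)  # 33.0
```

Python's standard library offers `statistics.mean` for the same purpose; the explicit version is shown here only to mirror the formula.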
Standard Deviation and Variance of the Sample Mean
The variance is the square of the standard deviation, so once you know one you know the other. To calculate the standard deviation of the sample mean, two cases must be considered:
1. You know the population standard deviation.
2. You do not know the population standard deviation.
The following section shows how to calculate this value for each case.
The Mean and Standard Deviation Formula for Sample Means
The mean of the sample mean, denoted by \(\mu_\overline{x}\), equals the population mean. That is, if \(\mu\) is the population mean, then \[\mu_\overline{x}=\mu.\]
To calculate the standard deviation of the sample mean (also called the standard error of the mean (SEM)), denoted by \(\sigma_\overline{x}\), the two previous cases must be considered. Let's explore them in turn.
Calculating the Sample Mean Standard Deviation using the Population Standard Deviation
If the sample of size \(n\) is drawn from a population whose standard deviation \(\sigma\) is known, then the standard deviation of the sample mean will be given by \[\sigma_\overline{x}=\frac{\sigma}{\sqrt{n}}.\]
A sample of \(81\) people was taken from a population with standard deviation \(45\). What is the standard deviation of the sample mean?
Solution:
Using the formula stated before, the standard deviation of the sample mean is \[\sigma_\overline{x}=\frac{45}{\sqrt{81}}=\frac{45}{9}=5.\]
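As a quick check of this calculation, here is a minimal Python sketch. The helper name `sem_known_sigma` is hypothetical and simply wraps the formula \(\sigma/\sqrt{n}\).

```python
import math


def sem_known_sigma(sigma, n):
    """Standard deviation of the sample mean when the population sigma is known."""
    return sigma / math.sqrt(n)


# Population standard deviation 45, sample size 81 (values from the example).
print(sem_known_sigma(45, 81))  # 5.0
```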
Note that to calculate this, you do not need to know anything about the sample besides its size.
Calculating the Sample Mean Standard Deviation without using the Population Standard Deviation
Sometimes, when you want to estimate the mean of a population, the only information you have is the data from the sample you took. Fortunately, if the sample is large enough (a common rule of thumb is a sample size greater than \(30\)), the standard deviation of the sample mean can be approximated using the sample standard deviation. Thus, for a sample of size \(n\), the standard deviation of the sample mean is \[\sigma_\overline{x}\approx\frac{s}{\sqrt{n}},\] where \(s\) is the sample standard deviation (see the article Standard Deviation for more information) calculated by:
\[s=\sqrt{\frac{(x_1-\overline{x})^2+\ldots+(x_n-\overline{x})^2}{n-1}},\]
where \(x_i\) is each element in the sample and \(\overline{x}\) is the sample mean.
❗❗ The sample standard deviation measures the dispersion of data within the sample, while the sample mean standard deviation measures the dispersion between the means from different samples.
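The two formulas above can be sketched in Python as follows. To avoid inventing data, the sketch reuses the five spending values from the earlier example; a sample of \(5\) is far too small for the approximation to be trusted, so this only illustrates the arithmetic. The function names are hypothetical.

```python
import math
import statistics

# The five weekly spending values from the earlier example (illustration only:
# n = 5 is well below the "greater than 30" guideline).
sample = [20, 25, 27, 43, 50]


def sample_std(values):
    """Sample standard deviation with the n - 1 denominator."""
    n = len(values)
    x_bar = sum(values) / n
    squared_deviations = sum((x - x_bar) ** 2 for x in values)
    return math.sqrt(squared_deviations / (n - 1))


def estimated_sem(values):
    """Approximate standard deviation of the sample mean: s / sqrt(n)."""
    return sample_std(values) / math.sqrt(len(values))


s = sample_std(sample)
sem = estimated_sem(sample)
print(s, sem)

# statistics.stdev uses the same n - 1 denominator, so the two should agree.
print(math.isclose(s, statistics.stdev(sample)))
```

Note how `s` describes the spread of the individual values, while `sem` describes how much the mean itself would vary from sample to sample.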
Sampling Distribution of the Mean
Recall the sampling distribution definition.
The distribution of the sample mean (or sampling distribution of the mean) is the distribution obtained by considering all the means that can be obtained from fixed-size samples in a population.
If \(\overline{x}\) is the sample mean of a sample of size \(n\) from a population with mean \(\mu\) and standard deviation \(\sigma\), then the sampling distribution of \(\overline{x}\) has mean and standard deviation given by \[\mu_\overline{x}=\mu\,\text{ and }\,\sigma_\overline{x}=\frac{\sigma}{\sqrt{n}}.\]
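Furthermore, if the distribution of the population is normal, then the sampling distribution of \(\overline{x}\) is also normal. If the population is not normal but the sample size is large enough, the Central Limit Theorem says that the sampling distribution of \(\overline{x}\) is approximately normal; \(n\geq 30\) is a common rule of thumb for "large enough".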
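The formulas for \(\mu_\overline{x}\) and \(\sigma_\overline{x}\) can be checked empirically with a small simulation. The sketch below is an illustration under stated assumptions: the population is taken to be uniform on \([0,1]\) (so \(\mu=0.5\) and \(\sigma=\sqrt{1/12}\)), and the sample size, number of samples, and seed are arbitrary choices, none of which come from the article.

```python
import math
import random
import statistics

rng = random.Random(0)  # fixed seed so the run is repeatable

n = 10                # size of each sample (arbitrary choice)
num_samples = 20_000  # number of samples drawn (arbitrary choice)

# Assumed population: uniform on [0, 1], so mu = 0.5 and sigma = sqrt(1/12).
mu = 0.5
sigma = math.sqrt(1 / 12)

# Draw many samples of size n and record the mean of each one.
sample_means = [
    statistics.fmean(rng.random() for _ in range(n))
    for _ in range(num_samples)
]

mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.pstdev(sample_means)

print("mean of sample means:", mean_of_means, "theory:", mu)
print("sd of sample means:  ", sd_of_means, "theory:", sigma / math.sqrt(n))
```

The simulated values should land close to the theoretical ones, and the agreement tightens as `num_samples` grows. The sketch checks only the mean and the standard deviation, not the shape of the distribution.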
When the sampling distribution is normal, you can calculate probabilities using the standard normal distribution table. To do this, convert the sample mean \(\overline{x}\) into a \(z\)-score using the following formula:
\[z=\frac{\overline{x}-\mu_\overline{x}}{\sigma_\overline{x}}=\frac{\overline{x}-\mu}{\frac{\sigma}{\sqrt{n}}}.\]
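The conversion can be sketched in Python, with the standard library's `statistics.NormalDist` standing in for the printed table. The helper names and the numbers in the usage lines are hypothetical and chosen only to make the arithmetic easy to follow.

```python
import math
from statistics import NormalDist


def z_score(x_bar, mu, sigma, n):
    """Convert a sample mean into a z-score using sigma / sqrt(n)."""
    return (x_bar - mu) / (sigma / math.sqrt(n))


def prob_mean_below(x_bar, mu, sigma, n):
    """P(sample mean < x_bar), assuming the sampling distribution is normal."""
    return NormalDist().cdf(z_score(x_bar, mu, sigma, n))


# Hypothetical numbers: mu = 50, sigma = 10, n = 25, so sigma / sqrt(n) = 2.
print(z_score(52, 50, 10, 25))          # 1.0
print(prob_mean_below(50, 50, 10, 25))  # 0.5
```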
You may be wondering, what happens when the population distribution is not normal and the sample size is small? Unfortunately, for those cases, there is no general way to obtain the shape of the sampling distribution.
Let's see an example of a graph of a sampling distribution of the mean.
Going back to the example of public transportation in San Francisco, let's suppose you had managed to survey thousands of people, grouped the people into groups of size \(10\), averaged them in each group and obtained the following graph.
This graph approximates the graph of the sampling distribution of the mean. Based on the graph, you can deduce that an average of \(\$37\) is spent on public transportation in San Francisco.
Examples of Sample Means
Let's see an example of how to calculate probabilities.
Assume that the human body temperature distribution has a mean of \(98.6\, °F\) and a standard deviation of \(2\, °F\). If a sample of \(49\) people is taken at random, calculate the following probabilities:
(a) the average temperature of the sample is less than \(98\), that is, \(P(\overline{x}<98)\).
(b) the average temperature of the sample is greater than \(99\), that is, \(P(\overline{x}>99)\).
(c) the average temperature is between \(98\) and \(99\), that is, \(P(98<\overline{x}<99)\).
Solution:
1. Since the sample size is \(n=49>30\), you can assume the sampling distribution is normal.
2. Calculate the mean and the standard deviation of the sample mean. Using the formulas stated before, \(\mu_\overline{x}=98.6\) and \(\sigma_\overline{x}=2/\sqrt{49}=2/7\).
3. Convert the values into \(z\)-scores and use the standard normal table (see the article Standard Normal Distribution for more information). For (a) you'll have:
\[\begin{align} P(\overline{x}<98) &=P\left(z<\frac{98-98.6}{\frac{2}{7}}\right) \\ &= P(z<-2.1) \\ &=0.0179. \end{align}\]
For (b) you'll have:
\[\begin{align} P(\overline{x}>99) &=P\left(z>\frac{99-98.6}{\frac{2}{7}}\right) \\ &= P(z>1.4) \\ &=1-P(z<1.4) \\ &=1-0.9192 \\ &= 0.0808. \end{align}\]
Finally, for (c):
\[\begin{align} P(98<\overline{x}<99) &=P(\overline{x}<99)-P(\overline{x}<98) \\ &= P(z<1.4)-P(z<-2.1) \\ &= 0.9192-0.0179 \\ &=0.9013. \end{align}\]
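These three probabilities can be cross-checked numerically. The sketch below uses `statistics.NormalDist` from the Python standard library instead of a printed table and assumes, as in step 1, that the sampling distribution is normal. The variable names are illustrative.

```python
import math
from statistics import NormalDist

mu, sigma, n = 98.6, 2, 49

# Sampling distribution of the mean: normal with mean mu and sd sigma / sqrt(n).
sampling_dist = NormalDist(mu=mu, sigma=sigma / math.sqrt(n))

p_a = sampling_dist.cdf(98)                          # P(x_bar < 98)
p_b = 1 - sampling_dist.cdf(99)                      # P(x_bar > 99)
p_c = sampling_dist.cdf(99) - sampling_dist.cdf(98)  # P(98 < x_bar < 99)

print(round(p_a, 4), round(p_b, 4), round(p_c, 4))
```

The results should match the table-based answers to about three decimal places. The last digit can differ slightly, for example in (c), because the table values are rounded before they are subtracted.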
Sample Mean - Key takeaways
- The sample mean allows you to estimate the population mean.
- The sample mean \(\overline{x}\) is calculated as an average, that is, \[\overline{x}=\frac{x_1+\ldots+x_n}{n},\] where \(x_i\) is each element in the sample and \(n\) is the sample size.
- The sampling distribution of the mean \(\overline{x}\) has mean and standard deviation given by \[\mu_\overline{x}=\mu\,\text{ and }\,\sigma_\overline{x}=\frac{\sigma}{\sqrt{n}}.\]
- When the sample size is large enough (a common rule of thumb is \(n\geq 30\)), the Central Limit Theorem says that the sampling distribution of the mean is approximately normal.
Frequently Asked Questions about Sample Mean
What is sample mean?
The sample mean is the average of the values obtained in the sample.
How do you find sample mean?
By adding up all the values obtained from a sample and dividing by the number of values in the sample.
What is the formula for sample mean?
The formula for calculating the sample mean is \(\overline{x}=\frac{x_1+\ldots+x_n}{n}\), where \(x_i\) is each element in the sample and \(n\) is the sample size.
What is the importance of using sample mean?
The most obvious benefit of computing the sample mean is that it provides an estimate that can be applied to the larger group or population. This is significant because it allows for statistical analysis without the impractical task of polling every member of the population.
What are the disadvantages of using sample mean?
The main disadvantage is that the sample mean hides extreme values, whether very high or very low, because averaging collapses the data into a single central value. Another disadvantage is that it is sometimes difficult to select good samples, so the resulting estimate may be biased.