Despite these challenges, environmental scientists must strive to achieve the best possible sample size to ensure that their research findings are as robust and reliable as possible. This is especially important when tackling pressing environmental issues, such as climate change, pollution and deforestation, as the findings from these studies can have significant implications for the future of our planet and all its inhabitants.
What is the meaning of sample size?
In a research study, the sample size refers to the number of participants or observations included in the study. Those participants or observations are drawn from the population that the study aims to characterise.
The sample size is important because it determines the power and precision of the study, which can affect the ability to detect significant differences or correlations. A larger sample size generally provides more reliable results, but it can also increase the cost and complexity of the study. The sample size should be determined based on the research question, the population being studied, and the resources available for the study.
Why is Sample Size Important?
Making sure that your study has an appropriate sample size is important. Otherwise, a range of problems could occur.
If your sample size is too small...
- It may not be representative of the target population.
- It may not be possible to detect any differences between study groups, because the study won't have enough power to show that the differences observed are significant.
- The investigation may be considered a waste of time, money or resources.
If your sample size is too big...
- It could lead to ethical problems.
- It may use more time, money and resources than is necessary.
- Statistical testing may be affected.
If a sample size is too large, it could be unethical as more participants will be exposed to potential risks than necessary. For example, when testing a new drug, exposing more people than statistically necessary to understand possible side effects would be unethical.
How do you estimate sample size?
The sample size calculated before a study is itself an estimate: it tells us the minimum number of data points needed for the study to have a good chance of detecting a statistically significant result.
Calculating an accurate sample size is often difficult because the calculation depends on the effect size, and a reasonably accurate value of the effect size is hard to provide in advance. The true effect size is usually unknown and can only be estimated from the study after the analysis is completed. Hence, discrepancies are commonly expected, and researchers will usually either overestimate or underestimate the effect size.
What Affects Sample Size?
When working out the sample size for a study, we need to take into account the three criteria that will affect sample size: level of precision, confidence level and degree of variability.
Level of Precision
The level of precision (sometimes called the sampling error) is the range in which the true value of a population is estimated to be. This is typically expressed using percentage points.
An ecologist investigated 100 school playing fields around the UK. They found that 65% of playing fields contain daisies, with a precision level of ±4%. This means that the ecologist can conclude that between 61% and 69% of school playing fields contain daisies:
65 - 4 = 61 and 65 + 4 = 69
Confidence Level
The confidence level of a study describes how often the range produced by the sampling method would contain the true value of a parameter if the study were repeated many times. It is typically expressed as a percentage. In simpler terms, the higher your confidence level is, the more sure you can be that the true value lies within your stated range.
The ecologist used a confidence level of 95%, which is typical in science. This means that if the investigation was repeated many times, about 95% of the ranges calculated would contain the true percentage of playing fields with daisies.
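The meaning of a 95% confidence level can be checked with a small simulation. The sketch below is illustrative only: the true proportion of 0.65, the sample of 500 fields per survey and the 2,000 repeated surveys are all assumed values, and the range is calculated with the common normal-approximation method.

```python
import random
from statistics import NormalDist


def proportion_interval(successes, n, confidence=0.95):
    """Normal-approximation confidence interval for a proportion."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    p_hat = successes / n
    margin = z * (p_hat * (1 - p_hat) / n) ** 0.5
    return p_hat - margin, p_hat + margin


def coverage(true_p, n, repeats, confidence=0.95, seed=0):
    """Fraction of repeated surveys whose interval contains true_p."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(repeats):
        # Each surveyed field contains daisies with probability true_p.
        successes = sum(rng.random() < true_p for _ in range(n))
        low, high = proportion_interval(successes, n, confidence)
        hits += low <= true_p <= high
    return hits / repeats


# Assumed values: true proportion 0.65, 500 fields per survey, 2000 surveys.
estimated_coverage = coverage(true_p=0.65, n=500, repeats=2000)
print(f"Share of intervals containing the true value: {estimated_coverage:.3f}")
```

The printed share should be close to 0.95, which is what the confidence level promises.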
Accuracy is how close a measurement is to the true value.
Three children were asked to count the number of trees in a park. They all had different answers.
Child 1: 45 trees
Child 2: 37 trees
Child 3: 49 trees
There were 38 trees in the park. Which child had the most accurate result?
Child 2, of course!
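The same comparison can be written as a short calculation. This is a minimal sketch that measures accuracy as the absolute difference between each count and the true value; the variable names are chosen for illustration.

```python
# Counts reported by each child, and the true number of trees in the park.
counts = {"Child 1": 45, "Child 2": 37, "Child 3": 49}
true_count = 38

# Absolute error: how far each count is from the true value.
errors = {child: abs(count - true_count) for child, count in counts.items()}

# The most accurate result is the one with the smallest error.
most_accurate = min(errors, key=errors.get)

print(errors)
print(most_accurate)
```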
Degree of Variability
The degree of variability is how different the studied population is within itself. The more heterogeneous (more varied) a population is, the larger the sample size needs to be in order to obtain a given level of precision. If the population is more homogeneous (less varied), the sample size doesn't need to be as big.
Precision is how consistent the results are when the measurements are repeated.
Two groups of six harvest mice were collected and their weight was measured.
The weights of Group 1: 4.5 g, 4.2 g, 4.3 g, 4.4 g, 4.3 g, 4.4 g
The weights of Group 2: 6.1 g, 4.2 g, 4.8 g, 3.9 g, 4.4 g, 5.2 g
Which group had more precise results?
Group 1 has more precise results because the weights of the different mice are not so far apart from each other as in Group 2.
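One common way to put a number on this spread is the sample standard deviation. The sketch below applies it to the two groups of weights; choosing the standard deviation as the measure of precision is an assumption made for illustration.

```python
from statistics import mean, stdev

# Weights in grams, taken from the two groups above.
group_1 = [4.5, 4.2, 4.3, 4.4, 4.3, 4.4]
group_2 = [6.1, 4.2, 4.8, 3.9, 4.4, 5.2]

# A smaller standard deviation means the values sit closer together.
spread_1 = stdev(group_1)
spread_2 = stdev(group_2)

for name, weights in (("Group 1", group_1), ("Group 2", group_2)):
    print(f"{name}: mean = {mean(weights):.2f} g, "
          f"standard deviation = {stdev(weights):.2f} g")
```

Group 1 has the smaller standard deviation, which matches the conclusion that its results are more precise.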
Sample Size Determination
There are a few different methods that can be used to calculate a suitable sample size for a population, including power analysis, pilot studies and using existing data.
Using Power Analysis to Estimate Sample Size
Power analysis is a statistical method used to estimate the sample size required to detect a significant difference or relationship between variables in a research study. Power analysis takes into account several factors including the level of significance, the effect size, and the desired power of the study.
To use power analysis to estimate sample size, the following steps can be followed:
- Define the research question and the null and alternative hypotheses.
- Determine the level of significance (alpha) and the desired power (1 - beta, where beta is the probability of missing a real effect) for the study. Common choices are an alpha of 0.05 and a beta of 0.20, which gives a power of 0.80. These are conventional values that can be adjusted to specific needs.
- Estimate the effect size, which is a measure of the difference between the means or proportions of the groups being compared.
- Use a power analysis calculator or software to calculate the sample size required to detect the estimated effect size at the chosen level of significance and power.
- Review the results, and adjust the sample size or other parameters if necessary.
It's important to note that the sample size determined by power analysis is only an estimate, and other factors such as the cost of the study, the feasibility of recruiting participants, and the resources available should be considered when finalizing the sample size.
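The steps above can be sketched for one common case: comparing the means of two equally sized groups with a two-sided test. The function below uses the standard normal-approximation formula, n per group = 2 × ((z for alpha + z for power) / d)², where d is the standardised effect size. The function name and the example effect sizes are assumptions for illustration, and exact t-based calculations in dedicated software usually give slightly larger values.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate n per group for a two-sided, two-sample comparison of means.

    effect_size is the standardised difference between the group means
    (difference divided by the standard deviation).
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for alpha
    z_power = NormalDist().inv_cdf(power)          # value for power = 1 - beta
    return ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


# Smaller effects need much larger samples to be detected.
for d in (0.2, 0.5, 0.8):
    print(f"effect size {d}: {sample_size_per_group(d)} per group")
```

With an alpha of 0.05 and a power of 0.80, halving the effect size roughly quadruples the sample size needed.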
Using Pilot Studies to Determine Sample Size
A pilot study is a smaller version of the main study. Pilot studies are used to identify and understand issues that might arise with the population being studied or with the experimental methods that will be used during the main study. They can also indicate how many individuals or observations will be needed to counterbalance the detected issues.
With data from the pilot study, researchers can calculate the sample size needed for the main study using statistical formulas. Even though the pilot study will be smaller than the main study, it should be big enough to give a good estimate of the challenges of the main study.
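As one illustration of such a formula, the sketch below estimates how many observations are needed to measure a mean to within a chosen margin, n = (z × s / E)², where s is the standard deviation seen in the pilot. The pilot weights and the 0.2 g margin are hypothetical values chosen for the example, not results from a real pilot study.

```python
from math import ceil
from statistics import NormalDist, stdev


def sample_size_for_mean(pilot_values, margin_of_error, confidence=0.95):
    """Approximate n needed to estimate a mean to within margin_of_error."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    s = stdev(pilot_values)  # variability observed in the pilot study
    return ceil((z * s / margin_of_error) ** 2)


# Hypothetical pilot data: weights in grams of six harvest mice.
pilot_weights = [6.1, 4.2, 4.8, 3.9, 4.4, 5.2]

# Assumed target: estimate the mean weight to within 0.2 g.
n_main = sample_size_for_mean(pilot_weights, margin_of_error=0.2)
print(n_main)
```

Because a small pilot gives only a rough estimate of the variability, the result is best treated as a starting point.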
Using Existing Data to Determine Sample Size
Many research studies have already been conducted and are available online or in scientific journals. Research articles are expected to report their methods and variables in full, which can help other researchers define the sample size for similar studies.
Using a Census for Small Populations
This approach is most suitable for populations below 200 individuals, so that every individual can be sampled. This eliminates sampling errors and provides data on every individual. However, virtually every member of the population needs to be sampled to ensure a high level of precision.
Using Sample Sizes from Similar Studies
This approach runs the risk of repeating any mistakes made in previous studies, but reviewing multiple reports can help you find out what the appropriate sample size is.
Using Published Tables
Published tables present suitable sample sizes depending on the overall population size. Data needs to be normally distributed for these tables to be appropriate.
A normal distribution has:
- The same median, mean and mode.
- A line of symmetry around the centre.
- 50% of values less than the mean and 50% greater than the mean.
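These three properties can be checked with Python's standard library. The mean of 4.35 and standard deviation of 0.1 in this sketch are arbitrary values chosen for illustration.

```python
from statistics import NormalDist

# A normal distribution with an arbitrary (assumed) mean and spread.
dist = NormalDist(mu=4.35, sigma=0.1)

# 1. The mean, median and mode are the same.
print(dist.mean, dist.median, dist.mode)

# 2. The curve is symmetric: points equally far from the centre
#    have the same density.
offset = 0.15
is_symmetric = abs(dist.pdf(dist.mean - offset) - dist.pdf(dist.mean + offset)) < 1e-9

# 3. Half of the values lie below the mean.
share_below_mean = dist.cdf(dist.mean)

print(is_symmetric, share_below_mean)
```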
What is the formula to determine sample size?
Sometimes, precision, confidence and variability can differ within your population. If this is the case, formulas are the best way to determine sample size. These formulas are pretty complex, but if you take them step-by-step you'll be able to implement them to help you in your research and understanding.
The usual formula to determine sample size for a simple random sample is:
n = \frac{z^{2} \times p \times (1-p)} {E^{2}}
where:
- n = sample size
- z = the standard normal deviate (e.g., 1.96 for a 95% confidence level)
- p = the proportion of individuals in the population with the characteristic of interest
- E = the margin of error (i.e., the maximum acceptable difference between the sample estimate and the true population value)
Z-scores (Z) are standardised values that indicate how many standard deviations a value lies from the mean of the population. To learn more about the statistics behind Z-scores, you can check out our Standard Normal Distribution article.
It is important to note that these formulas are only approximations and the actual sample size may need to be adjusted based on the specific details of the study and the population being studied. Additionally, these formulas are for simple random sampling. If you are doing any other type of sampling such as stratified, cluster, systematic or multi-stage sampling, the formulas will change accordingly.
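The formula can be turned into a short function. In this sketch the function name is an assumption, and the example uses p = 0.5 (the value that gives the largest, most cautious sample size) with a margin of error of 5 percentage points. The result is rounded up because a sample must contain a whole number of individuals.

```python
from math import ceil
from statistics import NormalDist


def sample_size_for_proportion(p, margin_of_error, confidence=0.95):
    """n = z^2 * p * (1 - p) / E^2 for a simple random sample."""
    # z is the standard normal deviate for the chosen confidence level,
    # roughly 1.96 for 95%.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil(z ** 2 * p * (1 - p) / margin_of_error ** 2)


# Higher confidence levels require larger samples.
for level in (0.90, 0.95, 0.99):
    n = sample_size_for_proportion(p=0.5, margin_of_error=0.05, confidence=level)
    print(f"{level:.0%} confidence: n = {n}")
```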
Examples of sample size calculation
Let's put this into practice and make determining a sample size easier to understand.
Molly works for a tea manufacturer. Her boss asked her to create a survey about the tea-drinking habits of people living in the UK. Molly has one month to complete this project. Her sample size needs to be large enough to represent the population, but not so large that she would run out of time to complete the project. What factors should she consider when working out her sample size?
- Degree of Variability: there are many people living around the UK, with differing backgrounds, jobs, and income levels. The population is very heterogeneous. As a result, Molly needs to use a large sample size.
- Census: there are millions of people living in the UK, so Molly would not be able to use a census.
- Similar Studies: Molly could look at similar surveys to work out how many people she needs to send the survey to.
- Published Tables: Molly could look at published tables to work out how many people she needs to send the survey to.
- Using a Formula: Molly could use a formula depending on the level of precision, confidence level and degree of variability that she wants the survey to have.
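If Molly chose the formula, she could compare a few options before committing her month. The sketch below is a rough illustration: it assumes a 95% confidence level and p = 0.5 because she does not yet know the true proportion, and the three margins of error are hypothetical choices.

```python
from math import ceil
from statistics import NormalDist


def survey_sample_size(margin_of_error, p=0.5, confidence=0.95):
    """Simple-random-sample size: n = z^2 * p * (1 - p) / E^2."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil(z ** 2 * p * (1 - p) / margin_of_error ** 2)


# Hypothetical margins of error Molly might consider.
options = {margin: survey_sample_size(margin) for margin in (0.10, 0.05, 0.03)}

for margin, n in options.items():
    print(f"margin of error ±{margin:.0%}: about {n} responses")
```

A tighter margin of error quickly increases the number of responses needed, so Molly has to balance precision against the time she has available.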
I hope that this article explained the importance of sample size for you. Remember that lots of different factors affect sample size, such as the variability and size of the population.
Sample size - Key takeaways
- Sample size is the number of participants or observations in a study. When planning an experiment, it is important to make sure that your sample size is not too big or too small.
- The criteria that affect sample size are the level of precision, the confidence level, and the degree of variability.
- There are a few ways to determine the sample size. These methods include using censuses, using sample sizes from similar studies, referring to published tables or using formulas.
- The usual formula to determine sample size for a simple random sample is: n = \frac{z^{2} \times p \times (1-p)} {E^{2}}
Frequently Asked Questions about Sample Size
What is sample size?
Sample size is the number of subjects or elements included in a research study. It is frequently used in market research and statistics, but it applies to any field of study that uses populations of any kind to answer its questions.
How do you determine a sample size?
The formula to determine sample size for a simple random sample is: n = \frac{z^{2} \times p \times (1-p)} {E^{2}}
where:
- n = sample size
- z = the standard normal deviate (e.g., 1.96 for a 95% confidence level)
- p = the proportion of individuals in the population with the characteristic of interest
- E = the margin of error (i.e., the maximum acceptable difference between the sample estimate and the true population value)
What is a good sample size?
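A sample size of 30 is a commonly quoted rule of thumb. It is manageable and gives more reliable data than smaller samples, but the right sample size for a particular study depends on the level of precision, confidence level and degree of variability required.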
Where do we draw sample size from?
Samples are drawn from the population that they are intended to represent. The sample size needed can then be calculated with the formula for a simple random sample given earlier in this article.