Hypothesis Test of Two Population Proportions

Suppose you surveyed employees at corporations in your country. Out of \(1300\) full-time employees and \(290\) part-time employees, you found that \(40\%\) of the full-time employees and \(38\%\) of the part-time employees were putting aside at least twelve percent of their earnings as savings. Could you draw any conclusions about the difference in savings habits between full-time and part-time employees? Hypothesis testing to the rescue! This is an example of comparing two population proportions, and here you will see how to do a hypothesis test and draw conclusions from this kind of sampling.

Fact Checked Content
Last Updated: 05.01.2023

Hypothesis Test for the Difference of Two Population Proportions

Let's start by listing what you know from the example at the start of this article.

Population	Population Proportion	Sample Size	Sample Proportion
Full-time employees of corporations in your country.	\(p_1 = \) proportion of all full-time employees who put aside at least twelve percent of their earnings in savings.	\(n_1 = 1300\)	\(\hat{p}_1 = 0.40\)
Part-time employees of corporations in your country.	\(p_2 = \) proportion of all part-time employees who put aside at least twelve percent of their earnings in savings.	\(n_2 = 290\)	\(\hat{p}_2 = 0.38\)

It is clear looking at the table that the sample sizes are very different, and their sample proportions are different as well. However, it will be very rare for you to find an example where the sample proportions are the same. Why might the sample proportions be different, even if you might eventually be able to conclude that the proportion of people who put aside at least twelve percent of their earnings is the same between part-time and full-time employees?

Differences that occur between two samples just by chance are called sampling variability.

One of the main questions that a hypothesis test for two population proportions tries to answer is whether the difference in your sample proportions happens because of sampling variability or because of an actual difference in the populations.
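To get a feel for sampling variability, here is a minimal simulation sketch. It assumes, purely for illustration, that both populations share the same true proportion of \(0.40\); the function names and the seed are hypothetical choices, not part of the original example.

```python
import random

def sample_proportion(p, n, rng):
    """Draw n independent yes/no responses with success probability p
    and return the observed proportion of successes."""
    successes = sum(1 for _ in range(n) if rng.random() < p)
    return successes / n

def simulate_differences(p, n1, n2, repetitions, seed=0):
    """Simulate p1_hat - p2_hat when both populations share the same p."""
    rng = random.Random(seed)
    return [
        sample_proportion(p, n1, rng) - sample_proportion(p, n2, rng)
        for _ in range(repetitions)
    ]

# Assumption: both groups truly have proportion 0.40 (no real difference).
diffs = simulate_differences(p=0.40, n1=1300, n2=290, repetitions=200)

# Share of simulated differences at least as large as the observed 0.02.
share = sum(1 for d in diffs if abs(d) >= 0.02) / len(diffs)
print(f"largest |difference|: {max(abs(d) for d in diffs):.3f}")
print(f"share with |difference| >= 0.02: {share:.2f}")
```

Even though the two simulated populations are identical, the sample proportions rarely match exactly, which is what sampling variability means.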

Comparing Two Population Proportions with Dependent Samples

One of the assumptions you will need is that your samples are independent.

Two samples are independent if picking members for one sample doesn't influence how members of the second sample are picked.

In the example involving employees, picking a person who is a full-time employee doesn't influence who you pick as a part-time employee, so they are independent. That is very different from dependent samples.

Two samples are dependent if picking members for one sample automatically determines the members of the second sample.

If you were doing a study on twins then picking a twin for one sample would automatically put the other twin in the second sample. Twins are a common example of dependent samples. This is called matched-pair data, and it requires a different form of hypothesis testing than you will see here.

Forming Your Hypothesis

There are many ways that \(p_1\) can be different from \(p_2\). It might be that \(p_1 < p_2\), or that \(p_1>p_2\). Rather than try and list all of the ways they are different and do a hypothesis test for each, you can look at the difference between the two population proportions. In fact, a hypothesis test for two population proportions is often called a hypothesis test for the difference between two population proportions for this very reason!

In this kind of hypothesis test, your null hypothesis will almost always be that the two population proportions are the same. If you state that in terms of their difference you get:

\[ H_0:\; p_1 - p_2 = 0.\]

Then there are three varieties of alternative hypotheses outlined in the next table.

Question	Alternative hypothesis	Test Type
Is \(p_1\) different from \(p_2\)?	\(H_a:\; p_1 - p_2 \ne 0\)	Two-tailed test.
Is \(p_1\) smaller than \(p_2\)?	\(H_a:\; p_1 - p_2 < 0\)	Left-tailed test.
Is \(p_1\) larger than \(p_2\)?	\(H_a:\; p_1 - p_2 > 0\)	Right-tailed test.

Let's go back to the example from the start of this article.

Your goal here is to figure out if full-time employees and part-time employees have different saving habits, so the hypotheses would be:

\[ \begin{align} &H_0:\; p_1 -p_2 = 0 \\ & H_a: \; p_1-p_2 \ne 0, \end{align} \]

and it would be a two-tailed test.

Next, let's look at the test statistic for this type of hypothesis test.

Significance Test Statistic for Two Population Proportions

It is important that your samples are independent, or the test statistic will be different from the one shown here. Since you are using independent samples, remember that

\[ \mu_{\hat{p}_1 - \hat{p}_2} = p_1 - p_2.\]

For a reminder on why this is true, see the articles Transforming Random Variables and Combining Random Variables.

For the standard deviation,

\[ \sigma_{\hat{p}_1 - \hat{p}_2} = \sqrt{ \frac{p_1(1-p_1)}{n_1} + \frac{p_2(1-p_2)}{n_2} }.\]

For the savings example, you have that \(n_1 = 1300\), \(n_2 = 290\), \(\hat{p}_1 = 0.40\), and \(\hat{p}_2 = 0.38\). The population proportions \(p_1\) and \(p_2\) are unknown, so the sample proportions are used as estimates. Estimating the mean of the sampling distribution of \(\hat{p}_1 - \hat{p}_2 \) gives you:

\[\begin{align} \mu_{\hat{p}_1 - \hat{p}_2} &= p_1 - p_2 \\ &\approx 0.40 - 0.38 \\ &= 0.02 \end{align}\]

The estimated standard deviation of \(\hat{p}_1 - \hat{p}_2 \) is:

\[ \begin{align} \sigma_{\hat{p}_1 - \hat{p}_2} &= \sqrt{ \frac{p_1(1-p_1)}{n_1} + \frac{p_2(1-p_2)}{n_2} } \\ &\approx \sqrt{ \frac{0.40(1-0.40)}{1300} + \frac{0.38(1-0.38)}{290} } \\ &= \sqrt{\frac{0.24}{1300} + \frac{0.2356}{290} } \\ &\approx 0.03158 \end{align} \]
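The same arithmetic can be sketched in a few lines of Python. This assumes the sample proportions stand in for the unknown population proportions; the function name `sd_of_difference` is a hypothetical helper, not something from the article.

```python
import math

def sd_of_difference(p1, n1, p2, n2):
    """Standard deviation of p1_hat - p2_hat for independent samples."""
    return math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)

# Savings example: sample proportions used as estimates of p1 and p2.
n1, p1_hat = 1300, 0.40
n2, p2_hat = 290, 0.38

mean_estimate = p1_hat - p2_hat
sd_estimate = sd_of_difference(p1_hat, n1, p2_hat, n2)
print(f"estimated mean: {mean_estimate:.2f}")
print(f"estimated standard deviation: {sd_estimate:.5f}")
```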

So far you have only assumed that the samples are independent. For the next part, you will need to assume that the sample sizes are large enough. If they are, you can use the Central Limit Theorem to get that your sampling distribution \(\hat{p}_1 - \hat{p}_2 \) is approximately normal.

How do you know if your samples are large enough? If all four of the following conditions are satisfied, then your samples are large enough for the sampling distribution \(\hat{p}_1 - \hat{p}_2 \) to be approximately normal:

\(n_1\hat{p}_1 \ge 10\),
\(n_2\hat{p}_2 \ge 10\),
\(n_1(1-\hat{p}_1) \ge 10\), and
\(n_2(1-\hat{p}_2) \ge 10\).

It isn't too hard to check that the sample sizes in the savings example are large enough for the sampling distribution to be approximately normal.
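One way to carry out that check is sketched below. The helper names and the `minimum` parameter are hypothetical; the threshold of \(10\) is the one stated in the conditions above.

```python
def large_sample_counts(n, p_hat):
    """Return the observed (successes, failures) counts for one sample."""
    return n * p_hat, n * (1 - p_hat)

def samples_large_enough(n1, p1_hat, n2, p2_hat, minimum=10):
    """True when every success and failure count is at least `minimum`."""
    counts = large_sample_counts(n1, p1_hat) + large_sample_counts(n2, p2_hat)
    return all(count >= minimum for count in counts)

# Savings example: all four counts are well above 10.
print(large_sample_counts(1300, 0.40))
print(large_sample_counts(290, 0.38))
print(samples_large_enough(1300, 0.40, 290, 0.38))
```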

The last condition to use this type of hypothesis test is that each sample is less than \(10\%\) of the population it is drawn from. In this case, the samples of \(1300\) full-time and \(290\) part-time employees are almost certainly less than \(10\%\) of all such employees in your country, so this condition is satisfied as well.

Z-test for Difference in Population Proportions

When doing a hypothesis test for the difference in population proportions, a \(z\)-test is used. To do this, you will need to calculate the test statistic, which uses the difference in the two proportions. To make calculations a little easier, it is helpful to find:

\[ \begin{align}\hat{p}_c &= \frac{\text{number of successes in the two samples} }{\text{total of the two sample sizes}} \\ &= \frac{n_1\hat{p_1} + n_2\hat{p_2} }{n_1 + n_2} \end{align}\]

Combining counts to get an overall proportion is called pooling, and \(\hat{p}_c\) is called the pooled (or combined) proportion.

Going back again to the savings example, \(n_1 = 1300\), \(n_2 = 290\), \(\hat{p}_1 = 0.40\), and \(\hat{p}_2 = 0.38\), which means that:

\[ \begin{align}\hat{p}_c &= \frac{n_1\hat{p_1} + n_2\hat{p_2} }{n_1 + n_2} \\ &= \frac{1300(0.40)+ 290(0.38) }{1300+ 290} \\ &= \frac{630.2}{1590} \\ & \approx 0.3964 \end{align}\]
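The pooled proportion is simple to compute directly. The sketch below uses a hypothetical function name, `pooled_proportion`, and the savings example's numbers.

```python
def pooled_proportion(n1, p1_hat, n2, p2_hat):
    """Total successes in both samples divided by the total sample size."""
    successes = n1 * p1_hat + n2 * p2_hat
    return successes / (n1 + n2)

# Savings example: 520 + 110.2 successes out of 1590 employees.
p_c = pooled_proportion(1300, 0.40, 290, 0.38)
print(f"pooled proportion: {p_c:.4f}")
```

Because the larger sample contributes more counts, the pooled value sits closer to \(\hat{p}_1 = 0.40\) than to \(\hat{p}_2 = 0.38\).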

As long as your null hypothesis is \(H_0:\; p_1 -p_2 = 0 \), the test statistic can be calculated using the formula:

\[ z = \frac{\hat{p_1} - \hat{p_2} }{\sqrt{ \dfrac{\hat{p}_c (1-\hat{p}_c) }{n_1} +\dfrac{\hat{p}_c (1-\hat{p}_c) }{n_2} } }\]
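Since both terms under the square root share the factor \(\hat{p}_c (1-\hat{p}_c)\), the denominator can be factored, which some people find easier to enter into a calculator. This is only algebra and adds no new assumptions:

\[ \sqrt{ \dfrac{\hat{p}_c (1-\hat{p}_c) }{n_1} +\dfrac{\hat{p}_c (1-\hat{p}_c) }{n_2} } = \sqrt{ \hat{p}_c (1-\hat{p}_c) \left( \dfrac{1}{n_1} + \dfrac{1}{n_2} \right) } \]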

Calculating the test statistic for the savings example:

\[ \begin{align} z &= \frac{\hat{p_1} - \hat{p_2} }{\sqrt{ \dfrac{\hat{p}_c (1-\hat{p}_c) }{n_1} +\dfrac{\hat{p}_c (1-\hat{p}_c) }{n_2} } } \\ &= \frac{0.40 - 0.38 }{\sqrt{ \dfrac{0.3964 (1-0.3964 ) }{1300} +\dfrac{0.3964 (1-0.3964 ) }{290} } } \\ & \approx 0.63 \end{align} \]

rounded to \(2\) decimal places.
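As a check on the arithmetic, here is a minimal Python sketch of the test statistic under the null hypothesis \(H_0:\; p_1 - p_2 = 0\). The function name `two_proportion_z` is hypothetical, and the code keeps the pooled proportion unrounded, so intermediate digits can differ slightly from the hand calculation.

```python
import math

def two_proportion_z(n1, p1_hat, n2, p2_hat):
    """z statistic for H0: p1 - p2 = 0, using the pooled proportion."""
    p_c = (n1 * p1_hat + n2 * p2_hat) / (n1 + n2)
    standard_error = math.sqrt(p_c * (1 - p_c) * (1 / n1 + 1 / n2))
    return (p1_hat - p2_hat) / standard_error

# Savings example.
z = two_proportion_z(1300, 0.40, 290, 0.38)
print(f"z = {z:.2f}")
```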

Let's finish up the hypothesis test for the savings example. No significance level was given, so you will need to consider the Type I and Type II error consequences. See Errors in Hypothesis Testing for more information and examples. In this example, a Type I error would be deciding that the savings proportions are not the same for the two groups when in fact they are the same.

A Type II error would be failing to detect a difference in the population proportions between the two groups when in fact they are not the same. Neither error has serious consequences here (unlike in a medical trial, where the consequences of an error can matter much more), so choosing a significance level of \(\alpha = 0.05\) would be fine.

Remember that this is a two-tailed test! So the \(P\)-value is twice the area under the \(z\)-curve to the right of \(|z|\), which here is just the \(z\)-value because it is positive. In other words:

\[ \begin{align} P\text{-value} &= 2(\text{area under curve to the right of }0.63) \\ &= 2\cdot P(z>0.63) \\ &= 2(0.2643) \\ &\approx 0.529 \end{align} \]

The \(P\)-value is greater than the significance level of \(\alpha = 0.05\), so you will fail to reject the null hypothesis.
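If you don't have a standard normal table handy, the tail area can be computed with the complementary error function from Python's standard library. This is a sketch; `upper_tail` is a hypothetical helper name.

```python
import math

def upper_tail(z):
    """P(Z > z) for a standard normal random variable Z."""
    return 0.5 * math.erfc(z / math.sqrt(2))

z = 0.63
alpha = 0.05

# Two-tailed test: double the area beyond |z|.
p_value = 2 * upper_tail(abs(z))
print(f"P-value = {p_value:.3f}")
print("reject H0" if p_value < alpha else "fail to reject H0")
```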

Remember that you never say things like "the null hypothesis is true". For a reminder on why, see the article Hypothesis Testing.

Communicating your conclusion can be the most challenging part of doing a hypothesis test. What does it mean to fail to reject the null hypothesis?

Solution:

The original goal was to find out if there is a difference in savings habits between full-time and part-time employees at corporations in your country. The null hypothesis is that there is no difference in the savings habits between the two groups. In failing to reject the null hypothesis, what you are saying is that there is no convincing evidence that there is a difference in savings habits between full-time and part-time employees.

Why was there a difference in the sample proportions then? It might have been from sampling variability. All you can say from the sample proportions is that you are not convinced there is a difference between the two population proportions.

Hypothesis Testing of Two Population Proportions Example

Let's look at another example of hypothesis testing for the difference in two population proportions.

Many bulldog owners report that their pet snores, and in fact, their bulldog snores more frequently as it gets older.

Image: sleeping bulldog puppy.

You have decided to do a test to see if this is actually true or maybe just a matter of perception. So you break down bulldogs into two groups, those under three years of age and those over three years of age, and choose a random sample of \(700\) bulldog owners to ask them about their dog's snoring. From the survey responses (not everyone responds to surveys), you create the following table:

Population	Population Proportion	Sample Size	Sample Proportion
Bulldogs under the age of \(3\).	\(p_1 = \) proportion of bulldogs under the age of \(3\) who snore more than five times a week.	\(n_1 = 300\)	\(\hat{p}_1 = 0.26\)
Bulldogs over the age of \(3\).	\(p_2 = \) proportion of bulldogs over the age of \(3\) who snore more than five times a week.	\(n_2 = 291\)	\(\hat{p}_2 = 0.392\)

Before going any further, let's check to make sure that the conditions for doing a hypothesis test for two population proportions are satisfied. First, the samples are independent since a bulldog can't be both under \(3\) years old and over \(3\) years old at the same time. In addition, there are certainly far more than \(591\) people worldwide that own bulldogs, so the number of bulldog owners sampled is less than \(10\%\) of the overall population of people who own bulldogs. Also,

\(n_1\hat{p}_1 = 300(0.26)=78 \ge 10\),
\(n_2\hat{p}_2 = 291(0.392) \approx 114.1 \ge 10\),
\(n_1(1-\hat{p}_1) = 300(1-0.26) = 222 \ge 10\), and
\(n_2(1-\hat{p}_2) = 291(1-0.392) \approx 176.9 \ge 10\),

so all of the conditions for applying the test are met.

The next step is deciding on the null and alternative hypotheses. The null hypothesis would be:

\[ H_0: \; p_2-p_1 = 0\]

or in other words that there is no difference in snoring between the two groups. The alternative hypothesis would be that there is a difference in the snoring rates of the two groups, so:

\[H_a:\; p_2-p_1 \ne 0\]

Calculating the pooled success rate (sometimes called the combined success rate):

\[ \begin{align}\hat{p}_c &= \frac{n_1\hat{p_1} + n_2\hat{p_2} }{n_1 + n_2} \\ &= \frac{300(0.26)+291(0.392)}{300+291} \\ &\approx 0.325 . \end{align}\]

Then the test statistic is:

\[\begin{align} z &= \frac{\hat{p_2} - \hat{p_1} }{\sqrt{ \dfrac{\hat{p}_c (1-\hat{p}_c) }{n_1} +\dfrac{\hat{p}_c (1-\hat{p}_c) }{n_2} } } \\ &= \frac{ 0.392 - 0.26 }{\sqrt{ \dfrac{0.325 (1-0.325) }{300} +\dfrac{0.325 (1-0.325) }{291} } } \\ &\approx 3.425 \end{align}\]

Notice that here the null hypothesis is written in terms of \(p_2-p_1\) simply for the convenience of having \(\hat{p}_2 - \hat{p}_1 \) be positive. It doesn't matter which order you choose, as long as you are consistent throughout your work and you make sure your \(z\) calculation matches.

Remember that this is a two-tailed test! So the \(P\)-value is twice the area under the \(z\)-curve to the right of the (positive) \(z\)-value. In other words:

\[ \begin{align} P\text{-value} &= 2(\text{area under curve to the right of }3.425) \\ &= 2\cdot P(z>3.425) \\ &\approx 2(0.0003) \\ &= 0.0006, \end{align} \]

where the value of \(P(z>3.425)\) can be found using a standard normal table or calculator.

The \(P\)-value of about \(0.0006\) is less than \(0.05\), so at the \(\alpha = 0.05\) significance level you can reject the null hypothesis and conclude that there is a difference in bulldog snoring based on age.
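The whole bulldog calculation can be sketched end to end in Python. The function name `two_proportion_z_test` is hypothetical, and the difference is taken as \(\hat{p}_2 - \hat{p}_1\) to match the hypotheses above; the unrounded pooled proportion means the digits may differ slightly from the hand calculation.

```python
import math

def two_proportion_z_test(n1, p1_hat, n2, p2_hat):
    """Two-tailed z-test of H0: p2 - p1 = 0 with a pooled proportion.

    Returns the pooled proportion, the z statistic, and the P-value.
    """
    p_c = (n1 * p1_hat + n2 * p2_hat) / (n1 + n2)
    standard_error = math.sqrt(p_c * (1 - p_c) * (1 / n1 + 1 / n2))
    z = (p2_hat - p1_hat) / standard_error
    # erfc(|z| / sqrt(2)) equals 2 * P(Z > |z|) for a standard normal Z.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return p_c, z, p_value

# Bulldog example: under 3 years (sample 1) versus over 3 years (sample 2).
p_c, z, p_value = two_proportion_z_test(300, 0.26, 291, 0.392)
alpha = 0.05
print(f"pooled = {p_c:.3f}, z = {z:.3f}, P-value = {p_value:.4f}")
print("reject H0" if p_value < alpha else "fail to reject H0")
```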

Would your conclusion have been any different if the alternative hypothesis had been:

\[H_a:\; p_2-p_1 > 0?\]

Solution:

The main change would have been in calculating the \(P\)-value. Since it would be a one-tailed test, in this case, the calculation would be:

\[ \begin{align} P\text{-value} &= \text{area under curve to the right of }3.425 \\ &= P(z>3.425) \\ &\approx 0.0003 \end{align} \]

At the \(\alpha = 0.05\) significance level, you would still reject the null hypothesis and conclude that bulldogs over the age of \(3\) do snore more than bulldogs under the age of \(3\).
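The only thing the choice of alternative hypothesis changes is which tail area you use. The sketch below makes that explicit; the labels `"two-sided"`, `"greater"`, and `"less"` are hypothetical names chosen for this illustration.

```python
import math

def upper_tail(z):
    """P(Z > z) for a standard normal random variable Z."""
    return 0.5 * math.erfc(z / math.sqrt(2))

def p_value(z, alternative):
    """P-value of a z statistic for the given alternative hypothesis."""
    if alternative == "two-sided":   # H_a: difference != 0
        return 2 * upper_tail(abs(z))
    if alternative == "greater":     # H_a: difference > 0 (right-tailed)
        return upper_tail(z)
    if alternative == "less":        # H_a: difference < 0 (left-tailed)
        return upper_tail(-z)
    raise ValueError(f"unknown alternative: {alternative!r}")

z = 3.425  # bulldog example, using p2_hat - p1_hat
for alternative in ("two-sided", "greater", "less"):
    print(f"{alternative}: {p_value(z, alternative):.4f}")
```

For a positive \(z\) like this one, the right-tailed \(P\)-value is half the two-tailed one, while the left-tailed \(P\)-value is close to \(1\), so the direction of the alternative matters.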

Hypothesis Test of Two Population Proportions - Key takeaways

Two samples are independent if picking members for one sample doesn't influence how members of the second sample are picked.
Two samples are dependent if picking members for one sample automatically determines the members of the second sample.
For a hypothesis test for two population proportions, the null hypothesis will almost always be that the two population proportions are the same.
The conditions for applying a hypothesis test for the difference of two population proportions are:
- The samples are independent.
- Each sample is less than \(10\%\) of the population it is drawn from.
- \(n_1\hat{p}_1 \ge 10\), \(n_2\hat{p}_2 \ge 10\), \(n_1(1-\hat{p}_1) \ge 10\), and \(n_2(1-\hat{p}_2) \ge 10\) where \(n_1\) is the size of the first sample, \(n_2\) is the size of the second sample, \(\hat{p}_1\) is the proportion of successes in the first sample, and \(\hat{p}_2\) is the proportion of successes in the second sample.
The pooled proportion formula is \[ \begin{align}\hat{p}_c &= \frac{\text{number of successes in the two samples} }{\text{total of the two sample sizes}} \\ &= \frac{n_1\hat{p_1} + n_2\hat{p_2} }{n_1 + n_2}. \end{align}\]
The formula for the test statistic is \[ z = \frac{\hat{p_1} - \hat{p_2} }{\sqrt{ \dfrac{\hat{p}_c (1-\hat{p}_c) }{n_1} +\dfrac{\hat{p}_c (1-\hat{p}_c) }{n_2} } }\]

Flashcards in Hypothesis Test of Two Population Proportions


When you are looking at population proportions, and choosing a member for one sample automatically chooses a member for the second sample, the samples are called ____.

Dependent.

When you are doing a hypothesis test for two population proportions, \(p_1\) and \(p_2\), what is your null hypothesis usually going to be?

\(H_0:\; p_1 - p_2 = 0 \).

Which of the following is a condition to do a hypothesis test for two population proportions?

Your sample is less than \(10\%\) of the overall population.

In a hypothesis test for two population proportions, which of these might be a correct conclusion?

You fail to reject the null hypothesis.

In a hypothesis test for two population proportions, which of the following alternative hypotheses would give a two-tailed test?

\(H_a:\; p_1-p_2 \ne 0\).

In a hypothesis test for two population proportions, which of the following alternative hypotheses would give a one-tailed test?

\(H_a:\; p_1-p_2 > 0\).


Frequently Asked Questions about Hypothesis Test of Two Population Proportions

How to find p value for difference in proportions?

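First find the pooled success proportion and use it to calculate the test statistic \(z\). The \(P\)-value is then the corresponding area under the standard normal curve: one tail for a one-tailed test, or twice the tail area for a two-tailed test.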

How to compare percentages statistically?

You would use a hypothesis test for two population proportions, also called a hypothesis test for the difference of population proportions.

When to use a two proportion z test?

When you have two independent samples, each less than 10% of its population, and there are at least 10 successes and at least 10 failures in each of the two samples.

What is proportion test?

It is a statistical inference test. It can be done with a single population proportion, or as a difference of two population proportions.

How to tell if proportions are equal?

Perform a hypothesis test for the difference of two population proportions.

How we ensure our content is accurate and trustworthy?

At StudySmarter, we have created a learning platform that serves millions of students. Meet the people who work hard to deliver fact based content as well as making sure it is verified.

Content Creation Process:

Lily Hulatt is a Digital Content Specialist with over three years of experience in content strategy and curriculum design. She gained her PhD in English Literature from Durham University in 2022, taught in Durham University’s English Studies Department, and has contributed to a number of publications. Lily specialises in English Literature, English Language, History, and Philosophy.

Content Quality Monitored by:

Gabriel Freitas is an AI Engineer with a solid experience in software development, machine learning algorithms, and generative AI, including large language models’ (LLMs) applications. Graduated in Electrical Engineering at the University of São Paulo, he is currently pursuing an MSc in Computer Engineering at the University of Campinas, specializing in machine learning topics. Gabriel has a strong background in software engineering and has worked on projects involving computer vision, embedded AI, and LLM applications.

StudySmarter Editorial Team

Team Math Teachers