Combining and Transforming Random Variables
As you have already seen, many people inspect things like cars before they are sold. Each individual inspector has a mean inspection time, and a variance associated with their inspection time. The random variable in this case is the inspection time, and what you are looking for is the sum of the inspectors' expected inspection times.
If multiple random events occur which are associated with an outcome, you may want to add them to form a new distribution. The new distribution in the car example would be the total inspection time for the car.
Combining random variables means transforming two or more random variables into one.
On the other hand, transforming random variables involves scaling and shifting them. This would happen if you were playing a game multiple times and trying to figure out how much your total wins and losses might be. See the article Transforming Random Variables for more details and examples on that.
One very important thing to check before you combine random variables is that they are independent, or at least that it is reasonable for you to assume that they are independent.
Suppose two inspectors each do their own, separate inspection of a cell phone, one after the other. Is it reasonable to assume their inspection times are independent?
Answer:
Because no inspector ever interacts with a cell phone at the same time as another inspector, it is reasonable to assume that their inspections do not affect each other. That would mean that their inspection times are independent, and you can combine the random variables.
What about in the next example?
Suppose that your first random variable is how many hours a randomly chosen person slept yesterday, and your second random variable is how many hours that same person was awake. Can you combine those random variables?
Answer:
No. How many hours a person is awake is dependent on how many hours they were asleep, so these are not independent random variables, and they cannot be combined.
The notation \(T = X + Y\) can be confusing. Are you really just adding things together? Let's take a look at an example.
Let's think about two people inspecting a cell phone, and they do separate inspections. The company keeps track of how long each person takes to do an inspection. Then you can set up:
- \(X\) is the set of times for the first person to inspect a phone; and
- \(Y\) is the set of times for the second person to inspect a phone.
Rather than looking at each person inspecting a phone individually, the company wants to get an idea of the total time it takes to inspect a phone. So in this example, combining the random variables \(X\) and \(Y\) means making a random variable \(T\) with \(T = X + Y\) where you are actually adding the times in \(X\) to the times in \(Y\) to get a total time.
It can help to look at the range of times of \(T\). If the range of times in \(X\) is \(6\) minutes to \(8\) minutes, and the range of times in \(Y\) is \(4\) minutes to \(5\) minutes, then the range of \(T = X + Y\) is \(6+4 =10\) minutes to \(8+5=13\) minutes.
Suppose the company took \(20\) measurements of each inspector, and graphed them in the histograms below.
The mean for Inspector #1 is \(7.1\) minutes, and the mean for Inspector #2 is \(4.6\) minutes. Their times are then combined into a new random variable, \(T\), and the histogram for that data is shown above.
Notice that the range in times of the histogram goes between \(10\) and \(13\) minutes. The mean for the combined histogram is \(11.7\) minutes, which is about what you would expect given the means for the individual inspections.
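If you want to reproduce this kind of combination yourself, here is a minimal simulation sketch in Python. Drawing the times uniformly from the stated ranges is an assumption made purely for illustration; the article's histograms come from actual measurements.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Hypothetical inspection times, drawn uniformly from the stated ranges
# (an assumption for illustration only).
x = rng.uniform(6, 8, size=20)  # Inspector #1: 6 to 8 minutes
y = rng.uniform(4, 5, size=20)  # Inspector #2: 4 to 5 minutes

t = x + y  # total inspection time for each phone

print(f"Range of T: {t.min():.1f} to {t.max():.1f} minutes")  # stays within 10 to 13
print(f"Mean of T:  {t.mean():.1f} minutes")                  # near mean of X plus mean of Y
```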
How does combining random variables affect the mean?
Combining Random Variables, the Mean
While you can combine more than two random variables as long as they are independent, for simplicity's sake the rest of this article concentrates on combining just two of them.
Suppose \(X\) and \(Y\) are two random variables that are independent. For the mean of \(X\) write \(\mu_X\), and for the mean of \(Y\) write \(\mu_Y\). How do you combine their means?
The mean of the sum of two random variables is the sum of their means. In other words, if \(T = X + Y\) then\[ \mu_T = \mu_X + \mu_Y.\]
If you take the difference of two random variables, then the mean of the difference is the difference of their means. So if \(T = X - Y\), then\[ \mu_T = \mu_X - \mu_Y.\]
Just like in regular subtraction, the order makes a difference. Let's look at a couple of examples.
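Before working through the examples, you can check these two rules numerically if you like. The sketch below draws from a normal and an exponential distribution; those choices are arbitrary, since the rules hold for any independent random variables.

```python
import numpy as np

rng = np.random.default_rng(seed=1)

# Two independent random variables with arbitrarily chosen distributions.
x = rng.normal(loc=5, scale=2, size=100_000)  # mean 5
y = rng.exponential(scale=3, size=100_000)    # mean 3

print(np.mean(x + y))  # close to 5 + 3 = 8
print(np.mean(x - y))  # close to 5 - 3 = 2
```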
Jake and Anna work in the same store, but in different departments. Jake expects to sell an average of \(5\) shirts per day and Anna expects to sell an average of \(3\). What is the total expected average number of shirts sold in the store per day?
Answer:
Let \(X\) be the random variable representing how many shirts Jake sells, and \(Y\) be the random variable representing Anna's sales. You would hope these are independent random variables! Call \(T\) the random variable of the total sales in the store, so \(T = X + Y\).
From the problem statement,
\[ \mu_X = 5 \text{ and } \mu_Y = 3.\]
Therefore, they can expect to sell \[ \begin{align} \mu_T &= \mu_X + \mu_Y \\ &= 5 + 3 \\ &= 8, \end{align}\]or in other words a total of \(8\) shirts.
What if you are asked about how many more shirts Jake would expect to sell?
Jake and Anna work in the same store, but in different departments. Jake expects to sell an average of \(5\) shirts per day and Anna expects to sell an average of \(3\). How many more shirts can Jake expect to sell per day?
Solution:
Just like before, let \(X\) be the random variable representing how many shirts Jake sells, and \(Y\) be the random variable representing Anna's sales, where you are reasonably assuming they are independent. Call \(T\) the random variable of the difference between Jake and Anna's sales in the store. Then since \(T = X - Y\),
\[ \begin{align} \mu_T &= \mu_X - \mu_Y \\ &= 5 - 3 \\ &= 2. \end{align}\]
So Jake can expect to sell \(2\) more shirts than Anna.
What if you had looked at the difference between Anna's and Jake's sales instead? Then you would have found a mean of \(-2\)! That can happen, and you need to look at the actual combined distribution to figure out what it implies in real life. A negative mean for the difference in sales just means that, in general, Anna sells fewer shirts than Jake does.
Combining Random Variables, Standard Deviation
Just like with the mean, combining the variance of two independent random variables is a matter of addition. Suppose \(X\) and \(Y\) are two random variables that are independent. For the standard deviation of \(X\) write \(\sigma_X\), and for the standard deviation of \(Y\) write \(\sigma_Y\). Then:
The variance of the sum of two random variables is the sum of their variances. In other words, if \(T = X + Y\) then\[ \sigma^2_T = \sigma^2_X + \sigma^2_Y.\]
If you take the difference of two random variables, then the variance of the difference is the sum of their variances. So if \(T = X - Y\), then\[ \sigma^2_T = \sigma^2_X + \sigma^2_Y.\]
Wait a minute, that second part doesn't look right! Why is it that when you subtract two distributions you aren't subtracting their variances? It is because the variance is a measure of how spread out the distribution is. Whether you add or subtract two distributions, the new one is going to have a larger spread than either of the two original ones.
Does this imply that you can combine the standard deviation of two independent random variables with addition as well? Absolutely not! Remember that the standard deviation is the square root of the variance, and that
\[ \sqrt{a + b} \ne \sqrt{a} + \sqrt{b}.\]
So the standard deviations cannot be added in the same way that the variance can be.
Let's look at an example to show how it works.
Jake and Anna work in the same store, but in different departments. Jake expects to sell an average of \(5\) shirts per day and Anna expects to sell an average of \(3\). However, Jake has a standard deviation in his sales of \(1\) shirt, while Anna has a standard deviation of \(4\) shirts. Is the standard deviation of their combined shirt totals the same as the sum of the standard deviation of their individual totals?
Solution:
Setting up some variables:
- \(X\) is the random variable of the number of shirts Jake sells;
- \(Y\) is the random variable of the number of shirts Anna sells; and
- \(T\) is the random variable of the number of shirts they sell combined.
As you have already seen, \(\mu_T = 8\). What about the variance and standard deviation? From the statement of the problem, their individual standard deviations are
\[ \sigma_X = 1 \mbox{ and } \sigma_Y = 4.\]
Then for the variance,
\[ \begin{align} \sigma^2_T &= \sigma^2_X + \sigma^2_Y \\ &= 1^2 + 4^2 \\ &= 17, \end{align} \]
but
\[ \sigma_T = \sqrt{17} \approx 4.1\]
which is not the same as
\[ \sigma_X + \sigma_Y = 1 + 4 = 5.\]
In fact,
\[ \sigma_T < \sigma_X + \sigma_Y.\]
So while the average number of shirts they sell per day is simply the sum of their individual averages, the standard deviation of their combined sales is smaller than the sum of their individual standard deviations.
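A quick simulation makes the contrast concrete. The sketch below assumes, purely for illustration, that daily sales are normally distributed with the means and standard deviations from the example (real sales counts are discrete, so this is only an approximation of the setup).

```python
import numpy as np

rng = np.random.default_rng(seed=2)

# Hypothetical daily sales (normality is an assumption for illustration).
jake = rng.normal(loc=5, scale=1, size=100_000)
anna = rng.normal(loc=3, scale=4, size=100_000)
total = jake + anna

print(np.var(total))  # close to 1**2 + 4**2 = 17
print(np.std(total))  # close to sqrt(17), about 4.1, not 1 + 4 = 5
```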
Combining Normal Random Variables
In the examples you have looked at so far, it didn't make a difference whether the random variables followed a normal distribution. The only thing that mattered was that they were independent random variables.
When you have two independent continuous random variables, both of which follow a normal distribution, so does their sum or difference.
Let's look at an example to illustrate this.
Suppose you have a business making and delivering pizzas, where both the time to make a pizza and the time to deliver it follow normal distributions:
- making the pizza takes an average of \(18\) minutes, with a standard deviation of \(1.5\) minutes; and
- delivering the pizza takes an average of \(25\) minutes, with a standard deviation of \(8\) minutes.
(a) What is the probability that making and delivering a pizza takes more than an hour?
(b) What percentage of the pizzas take longer to make than to deliver?
Solution:
(a) In this part of the question you are looking for the total time, in other words, the sum of two normally distributed independent random variables. First, let's define the random variables:
- \(X\) is the random variable for the time it takes to make a pizza;
- \(Y\) is the random variable for the time it takes to deliver a pizza; and
- \(T\) is the random variable for the total time to make and deliver a pizza.
You are told that both of the random variables are normal, and you would expect that making the pizza and delivering the pizza are independent of each other. So \(T\) is also normally distributed, with \(T = X + Y\).
The average time to make and deliver a pizza would be
\[ \begin{align} \mu_T &= \mu_X + \mu_Y \\ &= 18 + 25 \\ &= 43 \, min. \end{align}\]
Since the times are independent,
\[ \begin{align} \sigma^2_T &= \sigma^2_X + \sigma^2_Y \\ &= 1.5^2 + 8^2 \\ &= 66.25,\end{align} \]
so
\[ \sigma_T = \sqrt{66.25} \approx 8.1 \, min.\]
In other words, \(T\) is a normal distribution with mean \(43\) and standard deviation \(8.1\).
You want to know the probability that making and delivering a pizza takes more than an hour. The graph below shows the normal distribution for the total time, and the shaded region represents the time over \(60\) minutes.
Then the \(z\)-score associated with \(60\) minutes is
\[ z = \frac{60-43}{8.1} = 2.099\]
which, using a standard normal table, gives the probability of taking more than \(60\) minutes:
\[ P(T>60) = P(z>2.099) = 0.0179.\]
In other words, there is only a \(1.79\%\) chance that a pizza will take longer than an hour to make and deliver!
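If you prefer software to a standard normal table, the same probability can be found with scipy, for example; the sketch below just repeats the calculation above.

```python
import math
from scipy.stats import norm

mu_t = 18 + 25                      # mean of T = X + Y
sigma_t = math.sqrt(1.5**2 + 8**2)  # sqrt(66.25), about 8.14

# P(T > 60): survival function of the combined normal distribution.
p = norm.sf(60, loc=mu_t, scale=sigma_t)
print(p)  # about 0.018; the table value 0.0179 reflects the rounded z-score
```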
(b) Next you want to know the percentage of pizzas that take longer to make than to deliver. This time you want to know about the difference between \(X\) and \(Y\), so you need a new random variable, call it \(D\), to represent this. In other words, \(D = X - Y\). It is still true that both \(X\) and \(Y\) are independent random variables that follow a normal distribution.
The average time difference between making and delivering a pizza would be
\[ \begin{align} \mu_D &= \mu_X - \mu_Y \\ &= 18 - 25 \\ &= -7 \, min. \end{align}\]
Since the times are independent,
\[ \begin{align} \sigma^2_D &= \sigma^2_X + \sigma^2_Y \\ &= 1.5^2 + 8^2 \\ &= 66.25,\end{align} \]
so
\[ \sigma_D = \sqrt{66.25} \approx 8.1 \, min.\]
In other words, \(D\) is a normal distribution with mean \(-7\) and standard deviation \(8.1\). If a pizza takes longer to make than to deliver, what you want to find is \(P(D>0)\). In the graph below, the shaded region represents when the pizza takes longer to make than to deliver.
Then the \(z\)-score associated with \(0\) minutes is
\[ z = \frac{0-(-7)}{8.1} = 0.864\]
which, using a standard normal table, gives the probability that a pizza takes longer to make than to deliver:
\[ P(D>0) = P(z>0.864) \approx 0.194.\]
In other words, about \(19\%\) of the time, the pizza will take longer to make than to deliver.
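The same check works for the difference, again repeating the hand calculation with scipy.

```python
import math
from scipy.stats import norm

mu_d = 18 - 25                      # mean of D = X - Y, which is -7
sigma_d = math.sqrt(1.5**2 + 8**2)  # variances still add for a difference

# P(D > 0): the chance a pizza takes longer to make than to deliver.
p = norm.sf(0, loc=mu_d, scale=sigma_d)
print(p)  # about 0.195, close to the hand calculation of 0.194
```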
More examples are always good!
Examples of Combining Random Variables
Let's take a look at some more examples.
Suppose you have two inspectors working for you. If either of them inspects an item, it takes an average of \(5.8\) minutes to do the inspection, with a standard deviation of \(8\) minutes. However, if both of them work together to inspect the same item, it takes an average of \(11.6\) minutes with a standard deviation of \(17\) minutes. Is it better for you to have the inspectors working separately or together?
Solution:
First, let's give the variables some names:
- \(X\) is the variable for inspector A;
- \(Y\) is the variable for inspector B; and
- \(T\) is the variable for their combined times.
Then \(T = X + Y\), so
\[ \begin{align} \mu_T &= \mu_X + \mu_Y \\ &= 5.8 + 5.8 \\ &= 11.6 \, min. \end{align}\]
That means it doesn't matter if they work together or separately, in either case, their average time is going to be \(11.6\) minutes.
In order for you to look at their combined variances, you need to know that they are independent variables. So for the rest of this example, you will need to assume that two people can inspect an item at the same time without interfering with each other, making them independent variables. Then the variance is
\[\begin{align} \sigma^2_T &= \sigma^2_X + \sigma^2_Y \\ &= 8^2 + 8^2 \\ &= 128, \end{align} \]
and the standard deviation is
\[ \begin{align} \sigma_T &= \sqrt{ \sigma^2_T} \\ & = \sqrt{128} \\ &\approx 11.3 \, min. \end{align} \]
So when the two inspectors work separately, they have a much smaller variation in their inspection time.
What does that mean in terms of you having them work together or separately? Given that their mean inspection time is the same either way, it pays you to choose the option that gives you the least variation in inspection times. That means you want the two inspectors working separately since when they work together their standard deviation is \(17\) minutes rather than \(11.3\) minutes when they work separately.
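If you want to double-check the comparison in code, it comes down to two numbers.

```python
import math

# Working separately: variances add for independent inspectors.
sigma_separate = math.sqrt(8**2 + 8**2)  # about 11.3 minutes

# Working together: the standard deviation given in the problem.
sigma_together = 17.0

print(sigma_separate < sigma_together)  # True: separate work varies less
```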
Let's look at one involving toys.
A local shop sells toy cars. The probability of selling between \(0\) and \(5\) toy cars is given in the table below.
| Number of Cars | Probability |
| --- | --- |
| \(0\) | \(0.03\) |
| \(1\) | \(0.16\) |
| \(2\) | \(0.30\) |
| \(3\) | \(0.23\) |
| \(4\) | \(0.17\) |
| \(5\) | \(0.11\) |

Table 1. Probability of selling a given number of toy cars in a day.
Assume that daily sales of toy cars are independent.

(a) Find the mean and standard deviation for the number of toy cars the shop sells in a day.

(b) If the shop is open \(5\) days a week, how many toy cars can the shop expect to sell, and what is the standard deviation?

Solution:

(a) First let's set up some variables. Here, \(X\) is the random variable representing the number of toy cars the shop sells in a day, with \(x_i\) being the number of cars sold with probability \(p_i\). So

\[ \begin{align} \mu_X &= \sum\limits_{i=0}^5 x_i p_i \\ &= 0(0.03) + 1(0.16) + 2(0.30) + 3(0.23) + 4(0.17) + 5(0.11) \\ &= 2.68. \end{align}\]

For a discrete random variable, each squared deviation from the mean is weighted by its probability, so the variance is given by

\[ \begin{align} \sigma^2_X &= \sum\limits_{i=0}^5 (x_i - \mu_X)^2 p_i \\ &= (0-2.68)^2(0.03)+(1-2.68)^2(0.16)+(2-2.68)^2(0.30) \\ &\quad +(3-2.68)^2(0.23)+(4-2.68)^2(0.17)+(5-2.68)^2(0.11) \\ & \approx 1.72, \end{align} \]

and the standard deviation is given by

\[ \begin{align} \sigma_X &= \sqrt{1.72} \\ & \approx 1.31. \end{align} \]

(b) Hopefully, sales of toy cars on one day don't affect sales on another day, so you can assume that the daily numbers of toy cars sold are independent. In addition, the number of toy cars the shop expects to sell is the same on every day. Then for a week, the shop can expect:

\[\begin{align} \text{weekly expected car sales} &= 5(\text{daily expected car sales}) \\ &= 5(2.68) \\ &= 13.4, \end{align} \]

so about \(13\) toy cars sold in a week.

Remember that you can't just add to get the standard deviation! Instead, you must find the variance for the week, then take the square root. The variance for the weekly toy car sales is additive, so
\[ \begin{align} \text{variance for weekly car sales} &= 5(1.72) \\ &= 8.6, \end{align}\]
which gives you
\[ \begin{align} \text{standard deviation for weekly car sales} &= \sqrt{8.6} \\ & \approx 2.93 . \end{align}\]
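Here is a short sketch that reproduces the whole toy car calculation, using the probability-weighted formulas for the mean and variance of a discrete random variable.

```python
import numpy as np

# Probability table for daily toy car sales (Table 1).
cars = np.array([0, 1, 2, 3, 4, 5])
probs = np.array([0.03, 0.16, 0.30, 0.23, 0.17, 0.11])

mu = np.sum(cars * probs)               # 2.68
var = np.sum((cars - mu) ** 2 * probs)  # about 1.72
sigma = np.sqrt(var)                    # about 1.31

# Over 5 independent days, means and variances both scale by 5;
# standard deviations do not.
mu_week = 5 * mu               # 13.4
sigma_week = np.sqrt(5 * var)  # about 2.93

print(mu, sigma, mu_week, sigma_week)
```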
Combining Random Variables - Key takeaways
- Combining random variables means transforming two or more random variables into one.
- Only combine random variables that are independent!
- The mean of the sum of two random variables is the sum of their means. In other words, if \(T = X + Y\) then\[ \mu_T = \mu_X + \mu_Y.\]
- If you take the difference of two random variables, then the mean of the difference is the difference of their means. So if \(T = X - Y\), then\[ \mu_T = \mu_X - \mu_Y.\]
- The variance of the sum of two random variables is the sum of their variances. In other words, if \(T = X + Y\) then\[ \sigma^2_T = \sigma^2_X + \sigma^2_Y.\]
- If you take the difference of two random variables, then the variance of the difference is the sum of their variances. So if \(T = X - Y\), then\[ \sigma^2_T = \sigma^2_X + \sigma^2_Y.\]
- The sum and difference formulas do not work for the standard deviation!
Frequently Asked Questions about Combining Random Variables
What happens if two independent normal random variables are combined?
It means that the sum (or difference) of two independent, normally distributed random variables is also normal, with mean equal to the sum (or difference) of the two means and variance equal to the sum of the two variances (the squares of the standard deviations). Thus, you are able to estimate the overall result of two experiments whose variables are normally distributed.
Why do we combine random variables?
Combining random variables allows us to create new distributions. We can find the mean and standard deviation of the new distribution if we have the mean and standard deviation of the source distributions.
How do you combine random variables?
To combine normal random variables, the following steps should be followed:
Step 1: Give the random variables meaningful names, such as \(X\) and \(Y\).
Step 2: Identify their means, \(\mu_X\) and \(\mu_Y\), and their standard deviations, \(\sigma_X\) and \(\sigma_Y\).
Step 3: Find the mean of the sum or difference by adding or subtracting the means.
Step 4: Square the standard deviations to get the variances, add the variances, then take the square root of the result to obtain the combined standard deviation.
What does combining random variables mean?
In simple terms, combining random variables means transforming two or more random variables into one. The random variables being combined are assumed to be independent.