Find study content
Learning Materials

Discover learning materials by subject, university or textbook.

Explanations
All Subjects

Anthropology

Archaeology

Architecture

Art and Design

Bengali

Biology

Business Studies

Chemistry

Chinese

Combined Science

Computer Science

Economics

Engineering

English

English Literature

Environmental Science

French

Geography

German

Greek

History

Hospitality and Tourism

Human Geography

Japanese

Italian

Law

Macroeconomics

Marketing

Math

Media Studies

Medicine

Microeconomics

Music

Nursing

Nutrition and Food Science

Physics

Politics

Polish

Psychology

Religious Studies

Sociology

Spanish

Sports Sciences

Translation
Features
Features

Discover all of these amazing features with a free account.

Flashcards

StudySmarter AI

Notes

Study Plans

Study Sets

Exams
What’s new?

Flashcards
Study your flashcards with three learning modes.

Study Sets
All of your learning materials stored in one place.

Notes
Create and edit notes or documents.

Study Plans
Organise your studies and prepare for exams.
Resources
Discover

All the hacks around your studies and career - in one place.

Find a job

Student Deals

Magazine

Mobile App
Featured

Magazine
Trusted advice for anyone who wants to ace their studies & career.

Job Board
The largest student job board with the most exciting opportunities.

StudySmarter Deals
Verified student deals from top brands.

Our App
Discover our mobile app to take your studies anywhere.

Learning Materials

Features

Discover

Residuals

You have seen errors occurring in math problems, on some website pages, or in many other places in your life. But what about graphs in statistics? Do they have some kind of error in them? If there are, then are they actually an error? Check out this article on residuals and find out answers to these questions.

Get started

+ Add tag
Immunology
Cell Biology
Mo

Is residual also considered as error in regression?

Show Answer

+ Add tag
Immunology
Cell Biology
Mo

Do statistics include the study of residuals?

Show Answer

+ Add tag
Immunology
Cell Biology
Mo

The desirable residual plot is the one which shows no pattern.

Show Answer

+ Add tag
Immunology
Cell Biology
Mo

The mean value of all residuals for a model should equate to 0.

Show Answer

+ Add tag
Immunology
Cell Biology
Mo

In the graph, the residual between the data point and trendline is shown as ____.

Show Answer

+ Add tag
Immunology
Cell Biology
Mo

Good fit for linear regression cannot be checked by residual.

Show Answer

+ Add tag
Immunology
Cell Biology
Mo

A _____ residual value results in a trendline that better fits the data points.

Show Answer

+ Add tag
Immunology
Cell Biology
Mo

Is residual also considered as error in regression?

Show Answer

+ Add tag
Immunology
Cell Biology
Mo

Do statistics include the study of residuals?

Show Answer

+ Add tag
Immunology
Cell Biology
Mo

The desirable residual plot is the one which shows no pattern.

Show Answer

+ Add tag
Immunology
Cell Biology
Mo

The mean value of all residuals for a model should equate to 0.

Show Answer

+ Add tag
Immunology
Cell Biology
Mo

In the graph, the residual between the data point and trendline is shown as ____.

Show Answer

+ Add tag
Immunology
Cell Biology
Mo

Good fit for linear regression cannot be checked by residual.

Show Answer

+ Add tag
Immunology
Cell Biology
Mo

A _____ residual value results in a trendline that better fits the data points.

Show Answer

Fact Checked Content
Last Updated: 08.01.2023
12 min reading time

Content creation process designed by
Content cross-checked by
Content quality checked by

You show in a regression analysis if other variables impact a certain variable (dependent) though it is made known that certain specific variables (explanatory) may have a relationship or explains it. This is explained by a concept called residuals. Let’s take a look at residuals in this lesson.

Residuals in Math

For instance, assuming you want to find out how climate changes affect yield from a farm. You may specify climate variables in the model such as rainfall and temperature. However, other factors such as land size cultivated, and fertilizer use, among others, also affect farm yield. Hence, the question becomes, “is the model accurately predicting the level of yield considering climate changes as an explanatory variable?”. So how do you measure how much impact a given factor has? Let's look at a short and informal definition of a residual.

For any observation, the residual of that observation is the difference between the predicted value and the observed value.

You can lean on the size of the residual to inform you about how good your prediction model is. That means you consider the value of the residual to explain why the prediction is not precisely as the actual.

In mathematics, residual value is usually used in terms of assets and in statistics (basically, in regression analysis as discussed in previous sections). The worth of an asset after a specified use-time explains the residual value of the asset.

For instance, the residual value for renting out a factory machine for $10$ years, is how much the machine will be worth after $10$ years. This can be referred to as the salvage value or scrap value of the asset. Thus, how much an asset is worth after its lease term or productive/useful lifespan.

So, formally you can define residuals as below.

Definition of Residual

The residual is the vertical distance between the observed point and the predicted point in a linear regression model. A residual is termed as the error term in a regression model, though it is not an error, but the difference in the value. Here is the more formal definition of a residual in terms of a regression line.

The difference between the actual value of a dependent variable and its associated predicted value from a regression line (trendline) is called residual. A residual is termed as the error term in a regression model. It measures the accuracy with which the model was estimated with the explanatory variables.

Mathematically, you can estimate the residual by deducting the estimated values of the dependent variable $(\hat{y})$ from the actual values given in a dataset $(y)$.

For a reminder about regression lines and how to use them, see the articles Linear Correlation, Linear Regression and Least-Squares Regression

The residual is represented by $\varepsilon $. That will mean

\[\varepsilon =y-\hat{y}.\]

The predicted value $(\hat{y})$ is obtained by substituting $x$ values in the least-square regression line.

Residuals, residuals for data points, StudySmarter Residuals for data points

In the above graph, the vertical gap between a data point and the trendline is referred to as residual. The spot the data point is pinned determines whether the residual will be positive or negative. All points above the trendline show a positive residual and points below the trendline indicate a negative residual.

Residual in Linear Regression

For simplicity's sake let's look at residuals for bivariate data. In linear regression, you include the residual term to estimate the margin of error in predicting the regression line which passes through the two sets of data. In simple terms, residual explains or takes care of all other factors that may influence the dependent variable in a model other than what the model states.

Residuals are one way to check the regression coefficients or other values in linear regression. If the residual plot some unwanted patterns, then some values in the linear coefficients cannot be trusted.

You should make the following assumptions about the residuals for any regression model:

Assumptions of Residuals

They have to be independent – no one residual at a point influences the next point’s residual value.
Constant variance is assumed for all residuals.
The mean value of all residuals for a model should equate to $0$.
Residuals should be normally distributed/follow a normal distribution – plotting them will give a straight line if they are normally distributed.

Residual Equation in Math

Given the linear regression model which includes the residual for estimation, you can write:

\[y=a+bx+\varepsilon ,\]

where $y$ is the response variable (independent variable), $a$ is the intercept, $b$ is the slope of the line, $x$ is

the explanatory variable (dependent variable) and $\varepsilon$ is the residual.

Hence, the predicted value of $y$ will be:

\[\hat{y} = a+bx .\]

Then using the definition, the residual equation for the linear regression model is

\[\varepsilon =y-\hat{y}\]

where $\varepsilon$ represents residual, $y$ is the actual value and $\hat{y}$ is the predicted value of y.

For $n$ observations of data, you can represent predicted values as,

\[ \begin{align}\hat{y}_1&=a+bx_1 \\ \hat{y}_2&=a+bx_2 \\ &\vdots \\ \hat{y}_n&=a+bx_n\\\end{align}\]

And with these $n$ predicted quantities residuals can be written as,

\[ \begin{align}\varepsilon _1&=y_1-\hat{y}_1 \\ \varepsilon _2&=y_2-\hat{y}_2 \\ &\vdots \\ \varepsilon _n&=y_n-\hat{y}_n \\ \end{align}\]

This equation for residuals will be helpful in finding residuals from any given data. Note that, the order of subtraction is important when finding residuals. It is always the predicted value taken from the actual value. That is

residual = actual value – predicted value.

How to Find Residuals in Math

As you have seen, residuals are errors. Thus, you want to find out how accurate your prediction is from the actual figures considering the trendline. To find the residual of a data point:

First, know the actual values of the variable under consideration. They may be presented in a table format.
Secondly, identify the regression model to be estimated. Find the trendline.
Next, using the trendline equation and the value of the explanatory variable, find the predicted value of the dependent variable.
Finally, subtract the estimated value from the actual given.

This means if you have more than one data point; for example, $10$ observations for two variables, you will be estimating the residual for all $10$ observations. That is $10$ residuals.

The linear regression model is considered to be a good predictor when all the residuals add up to $0$.

You can understand it more clearly by taking a look at an example.

A production plant produces varying numbers of pencils per hour. Total output is given by

\[y=50+0.6x ,\]

where $x$ is the input used to produce pencils and $y$ is the total output level.

Find the residuals of the equation for the following number of pencils produced per hour:

$x$	$500$	$550$	$455$	$520$	$535$
$y$	$400$	$390$	$350$	$355$	$371$

Table 1. Residuals of the example.

Solution:

Given the values in the table and the equation $y=50+0.6x$, you can proceed to find the estimated values by substituting the $x$ values into the equation to find the corresponding estimated value of $y$.

$X$	$Y$	$y=50+0.6x$	$\varepsilon =y-\hat{y}$
$500$	$400$	$350$	$50$
$550$	$390$	$380$	$10$
$455$	$350$	$323$	$27$
$520$	$355$	$362$	$-7$
$535$	$365$	$365$	$0$

Table 2. Estimated values.

The results for $\varepsilon =y-\hat{y}$ shows you the trendline under-predicted the $y$ values for $3$ observations (positive values), and over-predict for one observation (negative value). However, one observation was accurately predicted (residual = $0$). Hence, that point will lie on the trendline.

You can see below how to plot the residuals in the graph.

Residual Plot

The residual plot measures the distance data points have from the trendline in the form of a scatter plot. This is obtained by plotting the computed residual values against the independent variables. The plot assists you to visualize how perfectly the trendline conforms to the given data set.

Residuals, residuals without any pattern, StudySmarter Fig. 1. Residuals without any pattern.

The desirable residual plot is the one which shows no pattern and the points are scattered at random. You can see from the above graph, that there is no specific pattern between points, and all the data points are scattered.

A small residual value results in a trendline that better fits the data points and vice versa. So larger values of the residuals suggest the line is not the best for the data points. When the residual is $0$ for an observed value, it means that the data point is precisely on the line of best fit.

A residual plot can at times be good to identify potential problems in the regression model. It can much easier to show the relationship between two variables. The points far above or below the horizontal lines in residual plots show the error or unusual behavior in the data. And some of these points are called outliers regarding the linear regression lines.

Note that the regression line might not be valid for a wider range of $x$ as sometimes it might give poor predictions.

Considering the same example used above, you can plot the residual values below.

Using the results in the production of pencils example for the residual plot, you can tell that the vertical distance of the residuals from the line of best fit is close. Hence, you can visualize that, line $y=50+0.6x$ is a good fit for the data.

Residuals, residuals plot for data points, StudySmarter Fig. 2. Residual plot.

From below, you can see how to work out the residual problem for different scenarios.

Residual Examples in Math

You can understand how to calculate residuals more clearly by following the residual examples here.

A shop attendant earns $\$800.00$ per month. Assuming the consumption function for this shop attendant is given by $y=275+0.2x$, where $y$ is consumption and $x$ is income. Assuming further, that the shop attendant spends $\$650$ monthly, determine the residual.

Solution:

First, you have to find the estimated or predicted value of $y$ using the model $y=275+0.2x$.

Hence, \[\hat{y}=275+0.2(800) =\$435.\]

Given $\varepsilon =y-\hat{y}$, you can compute the residual as:

\[\varepsilon =\$650-\$435 =\$215 .\]

Therefore, the residual equals $\$215$. This means you predicted the shop attendant spends lesser (that is, $\$435$) than they actually spend (that is, $\$650$).

Consider another example to find the predicted values and residuals for the given data

A production function for a factory follows the function $y=275+0.75x$. Where $y$ is the output level and $x$ is the material used in kilograms. Assuming the firm uses $1000\, kg$ of input, find the residual of the production function.

Solution:

The firm uses $1000kg$ of input, so it will also be the actual value $y$. You want to find the estimated output level. So

\[ \begin{align}\hat{y}&=275+0.75x \\ &=275+0.75(1000) \\ &=1025 . \\ \end{align}\]

Then you can estimate the residual or error of prediction:

\[ \begin{align}\varepsilon &=y-\hat{y} \\ &=1000-1025 \\ &=(-)25\, kg .\\ \end{align}\]

Therefore, the predicted output level is larger than the actual level of $1000kg$ by $25kg$.

The following example will show the plotting of residuals in the graph.

Sam collected data on the time taken to study, and the scores obtained after the given test from the class. Find the residuals for the linear regression model $y=58.6+8.7x$. Also, plot the residuals in the graph.

Study time $(x)$	$0.5$	$1$	$1.5$	$2$	$2.5$	$3$	$3.5$
Test scores $(y)$	$63$	$67$	$72$	$76$	$80$	$85$	$89$

Table 3. Study time example.

Solution:

You can create a table with the above data and calculate predicted values by using $y=58.6+8.7x$.

Study time $(x)$	Test scores $(y)$	Predicted values ($\hat{y}=58.6+8.7x$)	Residuals ($\varepsilon =y-\hat{y}$)
$0.5$	$63$	$62.95$	$0.05$
$1$	$67$	$67.3$	$-0.3$
$1.5$	$72$	$71.65$	$0.35$
$2$	$76$	$76$	$0$
$2.5$	$80$	$80.35$	$-0.35$
$3$	$85$	$84.7$	$0.3$
$3.5$	$89$	$89.05$	$-0.05$

Table 4. Example with study time, test scores, predicted values and residuals data.

Using all the residuals and $x$ values, you can make the following residual plot.

Residuals, residuals plot for data points, StudySmarter Fig. 3. Residual plot for the given data

Residuals - Key takeaways

The difference between the actual value of a dependent variable and its associated predicted value from a regression line (trendline) is called residual.
All points above the trendline shows a positive residual and points below the trendline indicate a negative residual.
Residuals are one way to check the regression coefficients or other values in linear regression.
Then the residual equation is, $\varepsilon =y-\hat{y}$.
The predicted value of $y$ will be $\hat{y} = a+bx$ for linear regression $y=a+bx+\varepsilon $.
A residual plot can at times be good to identify potential problems in the regression model.

Flashcards in Residuals

Start learning

Is residual also considered as error in regression?

Yes.

Do statistics include the study of residuals?

Yes.

The desirable residual plot is the one which shows no pattern.

False.

The mean value of all residuals for a model should equate to 0.

True.

In the graph, the residual between the data point and trendline is shown as ____.

Vertical gap.

Good fit for linear regression cannot be checked by residual.

True.

Already have an account? Log in

Frequently Asked Questions about Residuals

What does residual mean?

The difference between the actual value of a dependent variable and its associated predicted value from a regression line (trendline) is called residual.

How to find a residual in math?

Do the following to find the residual of a data point:

Know the actual values of the variable under consideration. This may be presented in a table format.
Secondly, identify the regression model to be estimated. Thus, the trendline.
Next, using the trendline equation and the value of the explanatory variable, find the predicted value of the dependent variable.
Finally, subtract the estimated value from the actuals given.

What does residual plot mean in maths?

Residual plot measures the distance data points have from the trendline. This is obtained by plotting the computed residual values against the independent variables. The plot assists you to visualize how perfectly the trendline conforms to the given data set.

What is residual value in math?

In mathematics, residual value is usually used in terms of assets and in statistics (basically, in regression analysis as discussed in previous sections).

The worth of an asset after a specified use-time explains the residual value of the asset.

What are some examples of residuals?

Suppose y = 2, y hat = 2.6. Then 2-2.6 = -0.6 is the residual.

Save Article

Test your knowledge with multiple choice flashcards

Score

Access over 700 million learning materials

Study more efficiently with flashcards

Get better grades with AI

Already have an account? Log in

How we ensure our content is accurate and trustworthy?

At StudySmarter, we have created a learning platform that serves millions of students. Meet the people who work hard to deliver fact based content as well as making sure it is verified.

Content Creation Process:

Lily Hulatt is a Digital Content Specialist with over three years of experience in content strategy and curriculum design. She gained her PhD in English Literature from Durham University in 2022, taught in Durham University’s English Studies Department, and has contributed to a number of publications. Lily specialises in English Literature, English Language, History, and Philosophy.

Get to know Lily

Content Quality Monitored by:

Gabriel Freitas is an AI Engineer with a solid experience in software development, machine learning algorithms, and generative AI, including large language models’ (LLMs) applications. Graduated in Electrical Engineering at the University of São Paulo, he is currently pursuing an MSc in Computer Engineering at the University of Campinas, specializing in machine learning topics. Gabriel has a strong background in software engineering and has worked on projects involving computer vision, embedded AI, and LLM applications.

Get to know Gabriel

Discover learning materials with the free StudySmarter app

About StudySmarter

StudySmarter is a globally recognized educational technology company, offering a holistic learning platform designed for students of all ages and educational levels. Our platform provides learning support for a wide range of subjects, including STEM, Social Sciences, and Languages and also helps students to successfully master various tests and exams worldwide, such as GCSE, A Level, SAT, ACT, Abitur, and more. We offer an extensive library of learning materials, including interactive flashcards, comprehensive textbook solutions, and detailed explanations. The cutting-edge technology and tools we provide help students create their own learning materials. StudySmarter’s content is not only expert-verified but also regularly updated to ensure accuracy and relevance.

Learn more

StudySmarter Editorial Team

Team Math Teachers

12 minutes reading time
Checked by StudySmarter Editorial Team

Save Explanation

Study anywhere. Anytime.Across all devices.

Sign-up for free

Get Started Free

Explore our app and discover over 50 million learning materials for free.

94% of StudySmarter users achieve better grades with our free platform.

\(x\)	\(500\)	\(550\)	\(455\)	\(520\)	\(535\)
\(y\)	\(400\)	\(390\)	\(350\)	\(355\)	\(371\)

\(X\)	\(Y\)	\(y=50+0.6x\)	\(\varepsilon =y-\hat{y}\)
\(500\)	\(400\)	\(350\)	\(50\)
\(550\)	\(390\)	\(380\)	\(10\)
\(455\)	\(350\)	\(323\)	\(27\)
\(520\)	\(355\)	\(362\)	\(-7\)
\(535\)	\(365\)	\(365\)	\(0\)

Study time \((x)\)	\(0.5\)	\(1\)	\(1.5\)	\(2\)	\(2.5\)	\(3\)	\(3.5\)
Test scores \((y)\)	\(63\)	\(67\)	\(72\)	\(76\)	\(80\)	\(85\)	\(89\)

Study time \((x)\)	Test scores \((y)\)	Predicted values (\(\hat{y}=58.6+8.7x\))	Residuals (\(\varepsilon =y-\hat{y}\))
\(0.5\)	\(63\)	\(62.95\)	\(0.05\)
\(1\)	\(67\)	\(67.3\)	\(-0.3\)
\(1.5\)	\(72\)	\(71.65\)	\(0.35\)
\(2\)	\(76\)	\(76\)	\(0\)
\(2.5\)	\(80\)	\(80.35\)	\(-0.35\)
\(3\)	\(85\)	\(84.7\)	\(0.3\)
\(3.5\)	\(89\)	\(89.05\)	\(-0.05\)

Residuals

Residuals in Math

Definition of Residual

Residual in Linear Regression

Assumptions of Residuals

Residual Equation in Math

How to Find Residuals in Math

Residual Plot

Residual Examples in Math

Residuals - Key takeaways

Similar topics in Math

Related topics to Statistics

Flashcards in Residuals

Learn faster with the 7 flashcards about Residuals

Frequently Asked Questions about Residuals

How we ensure our content is accurate and trustworthy?

About StudySmarter