Residual Sum of Squares

Suppose you thought that a dog's height could be predicted by its weight. How could you tell if there was really a relationship there?  One way would be to choose a random sample of dogs, collect their weights and heights, and then graph your data. Surely if there is a relationship it would show up on a graph, right?  But even if it looks like there is a linear relationship, how could you be sure?  The principle of least-squares regression, also known as the residual sum of squares, can help you tell just how good a dog's weight is at predicting its height.

StudySmarter Editorial Team

  • 11 minutes reading time
  • Checked by StudySmarter Editorial Team

    Residual sum of squares linear regression

    Let's continue with the example of trying to use a dog's adult weight to predict its height. You have used random sampling and done your best to make sure your sample is representative of the overall adult dog population. The information you gathered is in the table below, where the weight is in pounds and the height is in inches.

    Table 1 - Dog Weights (in pounds) and Heights (in inches)

    | Weight | Height | Weight | Height | Weight | Height |
    |--------|--------|--------|--------|--------|--------|
    | \(10\)  | \(10\) | \(75\) | \(23\) | \(12\) | \(12\) |
    | \(63\)  | \(25\) | \(80\) | \(25\) | \(45\) | \(22\) |
    | \(60\)  | \(23\) | \(20\) | \(15\) | \(50\) | \(18\) |
    | \(100\) | \(26\) | \(46\) | \(24\) | \(36\) | \(17\) |
    | \(6\)   | \(12\) | \(62\) | \(23\) | \(95\) | \(27\) |
    | \(48\)  | \(20\) | \(45\) | \(18\) | \(34\) | \(24\) |
    | \(40\)  | \(19\) | \(32\) | \(17\) | \(57\) | \(21\) |
    | \(50\)  | \(21\) | \(19\) | \(10\) | \(37\) | \(23\) |

    The first thing to do is make a scatter plot.

    Fig. 1 - Scatter plot of the data in the table of dog weights and heights.

    Next, you would check for any unusual points in the data.

    Unusual Data Points

    Let's take a look at the kinds of unusual points you might see that would affect your linear regression analysis.

    Outliers

    Remember that an outlier is a data point that is an abnormal distance from the other points in the sample. In other words, its response variable (in this case the height of the dog) does not follow the general trend of the rest of the data. Who gets to decide which points are outliers? The person looking at the data, of course! In the scatter plot above you can see that there don't appear to be any real outliers in the data.

    High Leverage Points

    What makes a data point of your sample a high leverage point?

    A high leverage point is one whose explanatory-variable value (here, the weight) is an unusually large distance from the mean of the explanatory variable, \(\bar{x}\).

    A high leverage point can either be above or below the mean. Points like this can have a large effect on linear regression.

    Influential Points

    Influence is a way to measure how much impact an outlier or a high leverage point has on your regression model.

    A point is considered to be influential if it unduly influences any part of your regression analysis, like the line of best fit.

    While outliers and high leverage points could be influential points, they are not always influential points. In order to say if an outlier or a high leverage point is actually influential, you would need to remove it from the data set, recalculate the linear regression, and then see how much it changed. The best way to check is to see if the \(R^2\) value has changed.

    For a reminder about the \(R^2\) value, see the articles Linear Regression and Residuals.

    Residual sum of squares geometric interpretation

    Once you have made a scatter plot of the data, you can check to see if it looks linear. In this case, it might be, but the question is how to draw the line. As you can see in the picture below, any of the three lines drawn look like they might fit the data pretty well.

    Fig. 2 - Scatter plot showing three potential lines through the data.

    So what makes a line the "best" line? You want a line that is as close to as many data points in the sample as possible. For that, you need to look at the deviation, also called the residual. The residual of a data point is simply how far away the data point is from the potential line of best fit.

    Fig. 3 - Scatter plot showing the deviation of two of the data points.

    A negative residual means the point is below the line, and a positive residual means the point is above the line. If a point lies exactly on the line the residual would be zero. Because the residual could be positive or negative, it is common to look at the square of the residual so things don't get accidentally cancelled out.

    Residual sum of squares definition

    Let's look at the actual definition of the residual sum of squares. You will notice that it can be defined for any line \(y=a+bx\), not just for the line of best fit.

    For \(n\) data points,

    \[(x_1, y_1), (x_2, y_2), \dots (x_n, y_n),\]

    one way to measure the fit of a line \(y=a+bx\) to bivariate data is the sum of squared residuals using the formula

    \[\sum\limits_{i=1}^n (y_i - (a+bx_i))^2.\]

    The goal is to make the sum of squared residuals as small as possible.
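    As a quick sketch of the formula above (the function name and sample data are just illustrative), the residual sum of squares of any line \(y = a + bx\) can be computed directly:

```python
def residual_sum_of_squares(points, a, b):
    """Sum of squared residuals of the line y = a + b*x over the data points."""
    return sum((y - (a + b * x)) ** 2 for x, y in points)

# Points on the line contribute 0; points off the line contribute the
# square of their vertical distance (the residual).
data = [(0, 1), (1, 3), (2, 6)]
print(residual_sum_of_squares(data, 1, 2))  # residuals are 0, 0, 1 -> prints 1
```

    A smaller result means the line fits the sample more closely.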

    For an explanation of why the residual sum of squares is the best way to go about things, see the article Minimising the Sum of Squares Residual.

    You might see the residual at point \((x_i,y_i)\) written as \(\epsilon_i\).

    Formula for residual sum of squares

    Now you can define the line of best fit, also known as the least-squares regression line.

    The least-squares regression line is the line that minimises the sum of squared deviations to the sample data.

    You still need a way to find the least-squares regression line! Thankfully other people have done all the math to find the slope and intercept of the line. The notation in the formulas is:

    • \(n\) number of sample points;

    • \(\bar{x}\) the average of the \(x_i\) values; and

    • \(\bar{y}\) the average of the \(y_i\) values.

    The slope of the least-squares regression line is

    \[ b = \frac{\sum\limits_{i=1}^n(x_i - \bar{x})(y_i - \bar{y})}{ \sum\limits_{i=1}^n(x_i - \bar{x})^2 } = \frac{S_{xy}}{S_{xx}} ,\]

    the \(y\)-intercept is

    \[ a = \bar{y} - b\bar{x},\]

    and the equation of the least-squares regression line is

    \[ \hat{y} = a+bx,\]

    where \(\hat{y}\) is the predicted value that results from substituting a given \(x\) into the equation.

    \(S_{xx}\) and \(S_{xy}\) are called summary statistics, and their formulas may show up depending on what learning tools you are using.
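    A minimal sketch of these formulas in Python (the function name is illustrative): it computes \(b = S_{xy}/S_{xx}\) and \(a = \bar{y} - b\bar{x}\) directly from the data.

```python
def least_squares_line(points):
    """Return (a, b) for the least-squares regression line y-hat = a + b*x."""
    n = len(points)
    x_bar = sum(x for x, _ in points) / n   # mean of the x_i
    y_bar = sum(y for _, y in points) / n   # mean of the y_i
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in points)
    s_xx = sum((x - x_bar) ** 2 for x, _ in points)
    b = s_xy / s_xx          # slope
    a = y_bar - b * x_bar    # y-intercept
    return a, b

# Data lying exactly on y = 1 + 2x is recovered exactly.
print(least_squares_line([(0, 1), (1, 3), (2, 5)]))  # (1.0, 2.0)
```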

    Let's look at an example.

    Going back to the table with the dog weights and heights, the dependent variable is the height (these would be the \(y_i\) values), and the independent variable is the weight (these would be the \(x_i\) values). There are \(24\) data points in the table, so \(n=24\). You can calculate

    • \( \bar{x} = 46.75\) and
    • \(\bar{y} = 19.79\),

    rounded to two decimal places. Generally, you will use a spreadsheet or calculator to find the values of \(b\) and \(a\), especially when there are lots of data points! Here

    • \( a =11.69\) and
    • \(b = 0.17\),

    where both have been rounded to two decimal places. So the equation of the least-squares regression line is

    \[ \hat{y} = 11.69 + 0.17x.\]

    Fig. 4 - Scatter plot with the line of best fit, also known as the least-squares regression line.

    Now that you have a formula for the line, you can find the residual sum of squares for this line. Using the formula,

    \[\sum\limits_{i=1}^{24} (y_i - (a+bx_i))^2 \approx 160.58.\]

    In fact, the \(R^2\) value, also known as the coefficient of determination, is about \(R^2 = 0.73\), or \(73\%\).
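    These figures can be reproduced directly from the data in Table 1; the sketch below (variable names are illustrative) computes the slope, intercept, residual sum of squares, and \(R^2\):

```python
# (weight, height) pairs from Table 1.
dogs = [(10, 10), (75, 23), (12, 12), (63, 25), (80, 25), (45, 22),
        (60, 23), (20, 15), (50, 18), (100, 26), (46, 24), (36, 17),
        (6, 12), (62, 23), (95, 27), (48, 20), (45, 18), (34, 24),
        (40, 19), (32, 17), (57, 21), (50, 21), (19, 10), (37, 23)]

n = len(dogs)                                   # 24 data points
x_bar = sum(x for x, _ in dogs) / n             # 46.75
y_bar = sum(y for _, y in dogs) / n             # about 19.79
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in dogs)
s_xx = sum((x - x_bar) ** 2 for x, _ in dogs)
s_yy = sum((y - y_bar) ** 2 for _, y in dogs)

b = s_xy / s_xx                                 # slope
a = y_bar - b * x_bar                           # y-intercept
rss = sum((y - (a + b * x)) ** 2 for x, y in dogs)
r_squared = 1 - rss / s_yy                      # coefficient of determination

print(round(a, 2), round(b, 2))                 # 11.69 0.17
print(round(rss, 2))                            # 160.58
print(round(r_squared, 2))                      # 0.73
```

    Here \(R^2\) is computed via the identity \(R^2 = 1 - \text{RSS}/S_{yy}\), where \(S_{yy}\) measures the total variability of the heights about their mean.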

    Now let's look for influential points.

    Going back to the table of data, if you look at the squared residual for each point in the sample, one of them contributes quite a bit more than the others to the residual sum of squares. That data point is \( (37, 23)\), with a squared residual of almost \(24\). That is considerably more than any of the other sample points, with the next highest being less than \(12\). Its weight of \(37\) pounds is close to the mean weight, so it isn't a high leverage point; instead it is an outlier, and you still need to show whether or not it is an influential point.

    It might be the case that \( (37, 23)\) is an influential point. If you remove that point from the sample and then calculate the new \(R^2\) value, you get about \(0.77\), or \(77\%\), with a least-squares regression line of

    \[\hat{y} = 11.31 + 0.18x,\] and a residual sum of squares of \(135.36\).

    Remember that the coefficient of determination, \(R^2\), is a measure of the variability in \(y\) that can be explained by a linear relationship between \(x\) and \(y\). The closer to \(1\) that \(R^2\) is, the closer to linear your sample data is. So by removing one point from the data set, you have changed the \(R^2\) value from \(73\%\) to \(77\%\), which is a big change! That means the data point \( (37, 23)\) is in fact an influential point.
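    The check described above — refit the line without the suspect point and compare the \(R^2\) values — can be sketched as follows (the helper name is illustrative):

```python
def fit(points):
    """Least-squares fit; return (a, b, r_squared)."""
    n = len(points)
    x_bar = sum(x for x, _ in points) / n
    y_bar = sum(y for _, y in points) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in points)
    s_xx = sum((x - x_bar) ** 2 for x, _ in points)
    s_yy = sum((y - y_bar) ** 2 for _, y in points)
    b = s_xy / s_xx
    a = y_bar - b * x_bar
    rss = sum((y - (a + b * x)) ** 2 for x, y in points)
    return a, b, 1 - rss / s_yy

# (weight, height) pairs from Table 1.
dogs = [(10, 10), (75, 23), (12, 12), (63, 25), (80, 25), (45, 22),
        (60, 23), (20, 15), (50, 18), (100, 26), (46, 24), (36, 17),
        (6, 12), (62, 23), (95, 27), (48, 20), (45, 18), (34, 24),
        (40, 19), (32, 17), (57, 21), (50, 21), (19, 10), (37, 23)]

_, _, r2_all = fit(dogs)
_, _, r2_without = fit([p for p in dogs if p != (37, 23)])

# Removing the single point changes R^2 from about 0.73 to about 0.77.
print(round(r2_all, 2), round(r2_without, 2))   # 0.73 0.77
```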

    Remember that variability can be decreased by increasing the sample size. See Unbiased Point Estimates for more information.

    Once you have the least-squares regression line, what can you do with it?

    Examples of residual sum of squares

    There are a couple of important things to consider when using the least-squares regression line to make a prediction.

    • The least-squares regression line is a predictor of the population, not an individual.

    • Using the least-squares regression line to make a prediction for a value outside the range of the collected data might not work very well.

    Let's look at an example of the kinds of problems that can occur when these considerations are ignored.

    Fig. 5 - Bulldogs are an example of why you can't necessarily make a prediction about an individual from a least-squares regression line.

    Going back to the dog weight/height information, and using the least-squares regression line

    \[\hat{y} = 11.31 + 0.18x,\]

    what can you predict about the height of a bulldog that weighs \(65\) pounds?

    Answer:

    Simply plugging in the weight of the bulldog, you get

    \[\hat{y} = 11.31 + 0.18(65) = 23.01,\]

    so the least-squares regression line predicts that the bulldog would be \(23.01\) inches tall. However, a bulldog of this weight will actually be about \(15\) inches tall, which is quite a difference! This is an example of why you can use the least-squares regression line to make a prediction about dogs in general (i.e. the population of dogs) and not about specific dogs.

    What about a dog that has a weight of more than \(100\) pounds?

    Fig. 6 - Bull mastiff dogs are definitely one to a kid-sized wading pool!

    A male bull mastiff dog can easily weigh \(130\) pounds. This is outside the range of the data collected in the table. When you use the least-squares regression line to make a prediction, you find that a bull mastiff dog should be

    \[\hat{y} = 11.31 + 0.18(130) = 34.71\, \text{in},\]

    tall. However in general this dog won't be more than \(27\) inches tall, which is considerably less than what the least-squares regression line predicts! That is because the weight of the dog is quite far outside of the data collected, so the least-squares regression line isn't a very good predictor.
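    Both predictions can be sketched in a few lines, using the regression line from the example (the `predict_height` helper is hypothetical):

```python
def predict_height(weight):
    """Predicted height in inches from weight in pounds: y-hat = 11.31 + 0.18x."""
    return 11.31 + 0.18 * weight

# Inside the observed weight range (6-100 lb): a reasonable population-level estimate.
print(round(predict_height(65), 2))   # 23.01
# Well outside the observed range: extrapolation, so the prediction overshoots badly.
print(round(predict_height(130), 2))  # 34.71
```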

    Residual Sum of Squares - Key takeaways

    • The residual of a data point is how far away the data point is from the potential line of best fit. Residuals can be positive or negative.
    • For \(n\) data points,

      \[(x_1, y_1), (x_2, y_2), \dots (x_n, y_n),\]

      one way to measure the fit of a line \(y=a+bx\) to bivariate data is the residual sum of squares using the formula

      \[\sum\limits_{i=1}^n (y_i - (a+bx_i))^2.\]

    • The least-squares regression line is the line that minimises the residual sum of squares.
    • The slope of the least-squares regression line is

      \[ \begin{align} b &=\frac{S_{xy}}{S_{xx}} \\ & = \frac{\sum\limits_{i=1}^n(x_i - \bar{x})(y_i - \bar{y})}{ \sum\limits_{i=1}^n(x_i - \bar{x})^2 }, \end{align}\]

      the \(y\)-intercept is

      \[ a = \bar{y} - b\bar{x},\]

      and the equation of the least-squares regression line is

      \[ \hat{y} = a+bx,\]

      where \(\hat{y}\) is the predicted value that results from substituting a given \(x\) into the equation.

    Frequently Asked Questions about Residual Sum of Squares

    How to calculate residual sum of squares? 

    Find the residual for each observation, and then square it.  Add all of those together and you get the residual sum of squares.

    What is the residual sum of squares? 

    It is a way to measure how far your line of best fit deviates from the observations.

    What is ESS and RSS? 

    RSS = residual sum of squares.

    ESS = explained sum of squares.

    What does the Residual Sum of Squares measure? 

    It measures the level of variance in the residuals of a regression model.

    What is an example of Residual Sum of Squares? 

    An example of using the residual sum of squares is checking to see how well the observations in a data set fit the least squares regression line.  This can help you locate influential points.
