In this article, you will learn about confidence intervals for the slope of a regression model: what they mean, the conditions needed to construct them, the formula, and how to calculate them. For information on drawing conclusions about a population from the confidence interval, see the article Justifying Claims Based on the Confidence Interval for the Slope of a Regression Model.
Meaning of Confidence Interval for Slope of Regression Line
By now you know that when there is a linear relationship between a variable \(x\) and a variable \(y\) – the linear correlation coefficient \(r\) is non-zero – you can model it with a linear regression. This regression consists of:
\[\hat{y}=\beta_0+\beta_1x\]
where:
\(\beta_0\) is the y-intercept;
\(\beta_1\) is the slope of the regression;
\(x\) is the independent variable; and
\(\hat{y}\) is the predicted value of the dependent variable.
For a better reminder of this topic, see our article Least-Squares Regression. Remember that the correlation coefficient \(r\) tells you how much of a correlation there is between the two variables. If \(r\) is close to zero, then there is little to no correlation between the variables, while \(r\) values close to \(-1\) or \(1\) indicate that there is a strong correlation between the two variables.
On the other hand, the slope \(\beta_1\) tells you how much \(\hat{y}\) changes as \(x\) changes: for each one-unit increase in \(x\), \(\hat{y}\) changes by \(\beta_1\) units.
Suppose you suspect that an increase in book price means that fewer books will be sold. You collect data, and find the line of best fit to be:
\[\hat{y}=3500-10x\]
where \(x\) is the price of the book and \(\hat{y}\) is the predicted number of books sold. What does a \(\$1\) increase in \(x\) mean for the number of books you predict will sell?
Solution:
From the equation given you can see that \(\beta_0 = 3500\) and \(\beta_1 = -10\). Notice that the slope of the regression model is negative. That means an increase of \(\$1\) in the book price corresponds to a predicted increase of \(-10\) books sold, or in other words you can predict that 10 fewer books will be sold for every dollar increase in book price.
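The same reading of the slope can be checked numerically. This is a minimal sketch; the function name `predicted_sales` and the two test prices are illustrative choices, not part of the original example.

```python
def predicted_sales(price):
    """Predicted number of books sold, using the line y-hat = 3500 - 10x."""
    return 3500 - 10 * price


# Compare predictions at two prices that differ by exactly $1.
change = predicted_sales(21) - predicted_sales(20)
print(change)  # -10, i.e. 10 fewer books per extra dollar
```

Any two prices one dollar apart give the same difference, because the slope of a line is constant.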
By calculating a confidence interval with a high confidence level, say \(c\%\), for the slope \(\beta_1\), you get two values that define the limits of a range of values in which you can find the slope. You can say with \(c\%\) confidence that the value of the slope will be between those two values.
Furthermore, you can say that the method used to construct the interval is successful in capturing the actual slope of the linear regression model about \(c\%\) of the time.
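One way to see what "about \(c\%\) of the time" means is to simulate many samples from a known line and count how often the interval contains the true slope. The sketch below is only an illustration under stated assumptions: the true intercept, true slope, noise level, and \(x\)-values are made up, the noise is normally distributed, and \(t=3.182\) is the \(95\%\) critical value for \(3\) degrees of freedom.

```python
import math
import random


def slope_interval(x, y, t_crit):
    """Return (lower, upper) for the slope using estimate +/- t * SE."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx                      # least-squares slope
    b0 = y_bar - b1 * x_bar             # least-squares intercept
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    se = math.sqrt(sse / (n - 2)) / math.sqrt(sxx)
    return b1 - t_crit * se, b1 + t_crit * se


def coverage(true_b0, true_b1, sigma, x, t_crit, trials, seed=0):
    """Fraction of simulated samples whose interval contains the true slope."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        y = [true_b0 + true_b1 * xi + rng.gauss(0, sigma) for xi in x]
        lower, upper = slope_interval(x, y, t_crit)
        if lower <= true_b1 <= upper:
            hits += 1
    return hits / trials


# Hypothetical setup: true line y = 2 + 1.5x, noise standard deviation 1.
rate = coverage(2.0, 1.5, 1.0, [1, 2, 2, 3, 5], t_crit=3.182, trials=4000)
print(f"Intervals containing the true slope: {rate:.1%}")
```

With these assumptions the printed rate should land close to \(95\%\). Any single interval either contains the true slope or does not; the confidence level describes the long-run success rate of the method.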
Conditions for Confidence Interval for the Slope of a Regression Line
The conditions for constructing a confidence interval for the slope of a linear regression are the same as for constructing a linear regression. These conditions are:
Quantitative variable condition: Correlation only applies if both variables are quantitative.
Straight enough condition: Look at the scatter plot and make sure your data has an approximately linear relationship, since correlation only measures the strength of a linear association. The correlation coefficient of the data can support this check.
Independence: Data should be collected randomly, and if you sample without replacement, the sample size should be no more than \(10\%\) of the total population.
Normal: For any fixed value of \(x\), the \(y\)-values (and so the residuals) are approximately normally distributed.
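Two of these checks lend themselves to a quick calculation. The sketch below is a rough aid, not a formal test: it checks the \(10\%\) rule and lists residuals, which are commonly plotted against \(x\) to look for curvature or unusual points. The function names, the data, and the line are hypothetical.

```python
def within_ten_percent(sample_size, population_size):
    """True when the sample is at most 10% of the population."""
    return 10 * sample_size <= population_size


def residuals(x, y, intercept, slope):
    """Observed y minus predicted y for each data point."""
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


# Hypothetical data and fitted line, for illustration only.
x = [1, 2, 3, 4]
y = [2.1, 3.9, 6.2, 7.8]
res = residuals(x, y, intercept=0.0, slope=2.0)

print(within_ten_percent(sample_size=50, population_size=1000))
print([round(r, 2) for r in res])
```

Residuals scattered around zero with no visible pattern are consistent with the straight enough condition; a clear curve in them suggests a linear model is not appropriate.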
Formula of Confidence Interval for Slope of Regression Line
Like any confidence interval you have studied so far, a confidence interval for the slope \(\beta_1\) of the least squares regression line has the following structure:
sample statistic – margin of error \(\le \beta_1\le\) sample statistic + margin of error,
where margin of error = critical value \(\times\) standard error.
Now, you just have to understand what each of those three elements is for the slope \(\beta_1\):
The sample statistic will be \(\hat{\beta}_1\), the point estimator of the slope \(\beta_1\);
For the margin of error:
this time the critical value will be of a \(t\)-distribution with \(n-2\) degrees of freedom, i.e., \(t\) with \(df=n-2\);
the standard error for the slope, written \(SE_{\beta_1}\), will be:\[SE_{\beta_1}=\frac{s}{\sqrt{\sum_{i=1}^{n}(x_i-\bar{x})^2}}\]where \(s\) is the standard deviation of the residuals (referred to in this article as the sample standard deviation), calculated as:\[s=\sqrt{\frac{\sum_{i=1}^{n}(y_i-\hat{y}_i)^2}{n-2}}\]
Thus, the formula for a confidence interval for the slope \(\beta_1\) is:
\[\hat{\beta}_1- t\cdot SE_{\beta_1}\le \beta_1\le \hat{\beta}_1+ t\cdot SE_{\beta_1}\]
or an even shorter version:
\[\hat{\beta}_1\pm t\cdot SE_{\beta_1}\]
This confidence interval is for any confidence level, but confidence levels that you will see most often are \(90\%\), \(95\%\), and \(99\%\). These are the values you should consider when calculating the critical value \(t\).
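The formula translates directly into a short function. This is a minimal sketch using only the Python standard library; the names `SlopeInterval` and `slope_confidence_interval` are illustrative, and the critical value \(t\) is assumed to be supplied by the caller (for example, from a \(t\) table).

```python
import math
from typing import NamedTuple, Sequence


class SlopeInterval(NamedTuple):
    estimate: float        # point estimate of the slope
    standard_error: float  # SE of the slope
    margin: float          # t * SE
    lower: float
    upper: float


def slope_confidence_interval(
    x: Sequence[float], y: Sequence[float], t_crit: float
) -> SlopeInterval:
    """Confidence interval for the slope: estimate +/- t * SE."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need at least 3 paired observations")

    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

    slope = sxy / sxx
    intercept = y_bar - slope * x_bar

    # s: standard deviation of the residuals, with n - 2 degrees of freedom.
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(sse / (n - 2))

    se = s / math.sqrt(sxx)
    margin = t_crit * se
    return SlopeInterval(slope, se, margin, slope - margin, slope + margin)
```

Returning the intermediate quantities alongside the limits makes it easier to compare each piece with a hand calculation.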
Calculations for Confidence Interval for Slope of Regression Line
From what you have read so far, the formula for a confidence interval for the slope suggests a set of steps you should follow when you want to find it.
Step 1: Find the sample statistic \(\hat{\beta}_1\).
You get the value of the point estimator \(\hat{\beta}_1\) by constructing the regression line for the data set you are working with.
Step 2: Select a confidence level \(c\%\).
The confidence level describes the uncertainty of a sampling method. You will most often be asked for a confidence level of \(90\%\), \(95\%\), or \(99\%\).
The purpose of knowing the confidence level is to be able to find the critical value \(t\), by consulting a \(t\) table, with two bits of information:
the degrees of freedom, given by:\[ \text{sample size } -2 = n-2\]where \(n\) is the sample size; and
the confidence level adjusted for the table you are using.
Depending on the table you consult, the confidence level may have to be adjusted to \(1-\tfrac{\alpha}{2}\) or to \(\tfrac{\alpha}{2} \).
For example, for a confidence level of \(99\%\), you know that \(c=100(1-\alpha)\%\) and so:
\[\begin{align} 99\%&=100\%(1-\alpha) \\ 0.99&=1-\alpha \\ \alpha&=0.01 .\end{align}\]
Now, depending on the table you consult, you'll do:
\[1-\frac{\alpha}{2}=1-\frac{0.01}{2}=0.995\]
or
\[\frac{\alpha}{2} = \frac{0.01}{2}=0.005\]
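If no table is at hand, the critical value can be approximated numerically. The sketch below is one possible approach using only the standard library, not the method a table or statistics package necessarily uses: it integrates the \(t\) density with Simpson's rule and then searches by bisection for the value whose cumulative area equals \(1-\tfrac{\alpha}{2}\). All function names are illustrative.

```python
import math


def table_inputs(confidence):
    """Return (alpha, 1 - alpha/2, alpha/2) for a confidence level like 0.99."""
    alpha = 1 - confidence
    return alpha, 1 - alpha / 2, alpha / 2


def t_cdf(t, df, steps=2000):
    """P(T <= t) for t >= 0, via Simpson's rule on the t density over [0, t]."""
    coeff = math.gamma((df + 1) / 2) / (
        math.sqrt(df * math.pi) * math.gamma(df / 2)
    )

    def density(u):
        return coeff * (1 + u * u / df) ** (-(df + 1) / 2)

    h = t / steps  # steps is even, as Simpson's rule requires
    total = density(0.0) + density(t)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * density(k * h)
    return 0.5 + total * h / 3  # area left of 0 is 0.5 by symmetry


def t_critical(confidence, df):
    """Bisection search for t with P(T <= t) = 1 - alpha/2."""
    _, target, _ = table_inputs(confidence)
    low, high = 0.0, 200.0  # bracket wide enough for the 90%, 95%, 99% levels
    for _ in range(60):
        mid = (low + high) / 2
        if t_cdf(mid, df) < target:
            low = mid
        else:
            high = mid
    return (low + high) / 2


print(table_inputs(0.99))
print(round(t_critical(0.99, 11), 3))
```

In practice you would usually read the value from a \(t\) table or use a statistics library; the point here is only to show where the table entries come from.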
Step 3: Find the margin of error \(t\cdot SE_{\beta_1}\).
As you already know, the margin of error is the product of the critical value \(t\) with the value of the standard error. The formula for the standard error is:
\[SE_{\beta_1}=\frac{s}{\sqrt{\sum_{i=1}^{n}(x_i-\bar{x})^2}}\]
where \(s\) is the standard deviation of the residuals, \(s=\sqrt{\sum_{i=1}^{n}(y_i-\hat{y}_i)^2/(n-2)}\).
Step 4: Find the confidence interval.
Here you just have to replace the values you got in the previous step in the formula:
\[\hat{\beta}_1\pm t\cdot SE_{\beta_1}\]
Let's look at an example where you can apply the steps by hand.
Given the data set in the table below,
x | y |
1 | 3 |
2 | 4 |
2 | 7 |
3 | 8 |
5 | 9 |
Table 1. Example data.
find a \(95\%\) confidence interval for the slope, knowing that the least squares regression line of this data is:
\[\hat{y}=2.41+1.46x\]
the sample variance is \(s^2=2.39\), and the critical value is \(t=3.182\).
Solution:
Step 1: Find the sample statistic \(\hat{\beta}_1\)
You were given the equation of the regression line, so you know that \(\hat{\beta}_1=1.46\).
Step 2: Select a confidence level \(c\%\)
The confidence level is given: \(c=95\%\). You’re also given the critical value \(t=3.182\).
If you had to consult a \(t\) table, you would first see that \(df=5-2=3\), second that \(95\%=100\%(1-\alpha)\) if and only if \(0.95=1-\alpha\) if and only if \(\alpha=0.05\), and then that \(1-\alpha/2=1-0.05/2=0.975\).
Step 3: Find the margin of error \(t\cdot SE_{\beta_1}\).
You know that:
\[SE_{\beta_1}=\frac{s}{\sqrt{\sum_{i=1}^{n}(x_i-\bar{x})^2}}\]
You know \(s^2=2.39\), so the sample standard deviation is \(s=1.55\).
For the sum in the denominator, you first need the sample mean of the \(x-\)values.
\[\bar{x}=\frac{1+2+2+3+5}{5}=2.6\]
Now the sum:
\[\begin{align} \sum_{i=1}^{n}(x_i-\bar{x})^2&=(1-2.6)^2+(2-2.6)^2+(2-2.6)^2\\&\quad+(3-2.6)^2+(5-2.6)^2 \\ &=9.2 \end{align}\]
Finally, for the margin of error:
\[\begin{align} t\cdot SE_{\beta_1}&=3.182\left( \frac{1.55}{\sqrt{9.2}}\right)\\ &=3.182(0.51)\\ &=1.62282. \end{align} \]
Step 4: Find the confidence interval
Now just substitute the values you determined in the previous steps into the formula:
\[\hat{\beta}_1\pm t\cdot SE_{\beta_1}= 1.46\pm 1.62282\]
which gives you
\[ -0.16282\le \beta_1 \le 3.08282 \]
If you have satisfied the conditions for doing a confidence interval for the slope of a regression model, you can say with \(95\%\) confidence that the true value of the slope \(\beta_1\) is between \(-0.16282\) and \(3.08282\).
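The four steps of this example can be mirrored in a few lines of code. This sketch plugs in the values given in the problem (\(\hat{\beta}_1=1.46\), \(s^2=2.39\), \(t=3.182\)) rather than refitting the line; the variable names are illustrative.

```python
import math

x = [1, 2, 2, 3, 5]       # x-values from Table 1
slope_estimate = 1.46     # Step 1: slope of the given regression line
t_crit = 3.182            # Step 2: given critical value (95%, df = 3)
sample_variance = 2.39    # given s^2

# Step 3: margin of error = t * SE, with SE = s / sqrt(sum of (x_i - x_bar)^2)
x_bar = sum(x) / len(x)
sxx = sum((xi - x_bar) ** 2 for xi in x)
standard_error = math.sqrt(sample_variance) / math.sqrt(sxx)
margin = t_crit * standard_error

# Step 4: the interval
lower = slope_estimate - margin
upper = slope_estimate + margin
print(f"{lower:.3f} <= beta_1 <= {upper:.3f}")
```

Carrying unrounded intermediate values gives a margin of about \(1.622\) instead of \(1.62282\). The small difference comes only from rounding the standard error to \(0.51\) in the hand calculation.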
Example of Confidence Interval for Slope of Regression Line
Let's look at an example of doing the calculations necessary for finding the confidence interval for the slope of a regression line.
Between \(2010\) and \(2022\), data was collected on the average cost of college textbooks required for a semester that year. That data is in the table below. Find the confidence interval for the slope of the regression line at a \(99\%\) confidence level.
Year | Average Book Cost (in \($\)) | Year | Average Book Cost (in \($\)) |
\(2010\) | \(660\) | \(2017\) | \(1125\) |
\(2011\) | \(678\) | \(2018\) | \(1100\) |
\(2012\) | \(596\) | \(2019\) | \(1300\) |
\(2013\) | \(550\) | \(2020\) | \(1320\) |
\(2014\) | \(770\) | \(2021\) | \(1369\) |
\(2015\) | \(790\) | \(2022\) | \(1400\) |
\(2016\) | \(860\) |
Table 2. Data sample.
Solution:
First, draw a scatter plot of the data.
It certainly looks reasonable to consider a linear regression model, and there are no obvious outliers. Assume year \(2010\) corresponds to \(x=1\). You can find the correlation coefficient \(r = 0.96\) and the line of best fit \(\hat{y} = 79.9x+ 458.1\). With the correlation coefficient being close to \(1\) you can see there is a strong linear relationship between the year and the average book cost.
For a reminder of how to find the correlation coefficient and the line of best fit, see Linear Regression and Least-Squares Regression.
In fact if you graph the line of best fit you can see immediately that there is a strong linear relationship.
Now let's follow the steps to find the confidence interval for the slope of the regression line.
Step 1: Find the sample statistic \(\hat{\beta}_1\).
The line of best fit is \(\hat{y} = 79.9x + 458.1\), so \(\hat{\beta}_1 = 79.9\). This is the point estimate of the slope.
Step 2: Select a confidence level \(c\%\).
The confidence level for this problem is \(99\%\). There are \(13\) data points, so there are \(13-2=11\) degrees of freedom. Consulting a \(t\)-table then gives the critical value \(t = 3.11\).
Step 3: Find the margin of error \(t\cdot SE_{\beta_1}\).
To do this you first need to calculate \(s\), which is built from the residuals. Given the equation for the line:
\[ y_i-\hat{y}_i = y_i - (79.9x_i + 458.1 ) \]
To make the calculations for \(s\) a little easier to follow it can help to make a table.
\(x_i\) | \(y_i\) | \(\hat{y}_i\) | \((y_i-\hat{y}_i )^2 \) |
1 | 660 | 538 | 14884 |
2 | 678 | 617.9 | 3612.01 |
3 | 596 | 697.8 | 10363.24 |
4 | 550 | 777.7 | 51847.29 |
5 | 770 | 857.6 | 7673.76 |
6 | 790 | 937.5 | 21756.25 |
7 | 860 | 1017.4 | 24774.76 |
8 | 1125 | 1097.3 | 767.29 |
9 | 1100 | 1177.2 | 5959.84 |
10 | 1300 | 1257.1 | 1840.41 |
11 | 1320 | 1337 | 289 |
12 | 1369 | 1416.9 | 2294.41 |
13 | 1400 | 1496.8 | 9370.24 |
Table 3. Predicted values and squared residuals.
Using the formula and the information in the table above:
\[\begin{align} s &=\sqrt{\frac{\sum_{i=1}^{n}(y_i-\hat{y}_i)^2}{n-2}} \\ &= \sqrt{\frac{\sum_{i=1}^{13}(y_i-\hat{y}_i)^2}{11}} \\ &= \sqrt{\frac{155432.5 }{11}} \\ &\approx 118.9 \end{align}\]
For the denominator of the standard error, the \(x\)-values \(1, 2, \dots, 13\) have mean \(\bar{x}=7\), so \(\sum_{i=1}^{13}(x_i-7)^2=182\).
Then you have:
\[\begin{align} SE_{\beta_1}&=\frac{s}{\sqrt{\sum_{i=1}^{n}(x_i-\bar{x})^2}} \\ &= \frac{118.9}{\sqrt{182}} \\ &\approx 8.81 \end{align} \]
You have already found the critical value \(t = 3.11\), so:
\[ \begin{align} \text{margin of error} &= t\cdot SE_{\beta_1} \\ &= (3.11)(8.81 ) \\ &\approx 27.40 \end{align}\]
Step 4: Find the confidence interval
Substituting the values you found in the previous steps into the formula:
\[\hat{\beta}_1\pm t\cdot SE_{\beta_1}= 79.9\pm 27.40\]
which gives you a confidence interval of \( (52.50, 107.30) \).
If you have satisfied the conditions for doing a confidence interval for the slope of a regression model, you can say with \(99\%\) confidence that the true value of the slope \(\beta_1\) is between \(52.50 \) and \(107.30 \).
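Hand calculations with this many rows are easy to get wrong, so it can help to recompute the residual table and the interval with a short script. The sketch below takes the data from Table 2 and treats the quoted line of best fit and the critical value \(t=3.11\) as given rather than refitting; the variable names are illustrative.

```python
import math

# Average book cost for 2010..2022 (Table 2); 2010 corresponds to x = 1.
costs = [660, 678, 596, 550, 770, 790, 860,
         1125, 1100, 1300, 1320, 1369, 1400]
x = list(range(1, len(costs) + 1))

intercept, slope = 458.1, 79.9   # quoted line of best fit, taken as given
t_crit = 3.11                    # quoted critical value (99%, df = 11)

# Predicted values and squared residuals, one per data point.
fitted = [intercept + slope * xi for xi in x]
squared_residuals = [(yi - fi) ** 2 for yi, fi in zip(costs, fitted)]

n = len(x)
s = math.sqrt(sum(squared_residuals) / (n - 2))

x_bar = sum(x) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)

standard_error = s / math.sqrt(sxx)   # note the square root in the denominator
margin = t_crit * standard_error

for xi, yi, fi, r2 in zip(x, costs, fitted, squared_residuals):
    print(f"{xi:2d} | {yi:4d} | {fi:7.1f} | {r2:9.2f}")
print(f"s = {s:.1f}, SE = {standard_error:.2f}, margin = {margin:.2f}")
print(f"interval: ({slope - margin:.2f}, {slope + margin:.2f})")
```

Printing each row next to its squared residual makes it straightforward to compare the script's output with a table built by hand.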
Confidence Intervals for the Slope of a Regression Model – Key takeaways
- By calculating a confidence interval with a high confidence level, say \(c\%\), for the slope \(\beta_1\), you get two values that define the limits of a range of values in which you can find the slope. You can say with \(c\%\) confidence that the value of the slope will be between those two values.
- You can say that the method used to construct the interval is successful in capturing the actual slope of the linear regression model about \(c\%\) of the time.
- The formula for the confidence interval for the slope of a regression model is \[\hat{\beta}_1\pm t\cdot SE_{\beta_1}\, ,\] where
- \(\hat{\beta}_1\) is the estimate of the slope \(\beta_1\)
- \(t\cdot SE_{\beta_1}\) is the margin of error
- \(t\) is the critical value from the \(t-\)distribution with parameter \(df=n-2\) (\(n-2\) degrees of freedom)
- \(SE_{\beta_1}\) is the standard error for the slope
Frequently Asked Questions about Confidence Interval for Slope of Regression Line
How to interpret confidence interval for slope of regression line?
If you repeated the sampling many times, about c% of the intervals constructed this way would contain the true value of the slope β1 that you’re estimating.
What is the confidence interval for the slope of a regression line?
It is a range of values, built around the slope estimate β1*, in which you have c% confidence that the true value of the slope β1 lies.
What is an example of a confidence interval for the slope of a regression line?
For a small data set like
x 1 2 2 3 5
y 3 4 7 8 9
the confidence interval for the slope is
-0.16282 ≤ β1 ≤ 3.08282
How to calculate the confidence interval for the slope of a regression line?
To calculate the confidence interval for the slope, follow these steps:
Step 1: Find the slope estimate, β1*
Step 2: Select a confidence level c%
Step 3: Find the margin of error t×SEβ1
Step 4: Find the confidence interval
What is the formula for the confidence interval for the slope of a regression line?
The formula is β1* ± t×SEβ1, where β1* is the slope estimate, t is the critical value, and SEβ1 is the standard error of the slope.