Statistical tests come with constraints. The Chi-Squared tests use degrees of freedom to describe how free a test is, given the constraints placed on it. Read on to figure out how free the Chi-Squared test really is!
Degrees of freedom meaning
Many tests use degrees of freedom, but here you will see degrees of freedom as it relates to Chi-Squared tests. In general, the degrees of freedom is a way to measure how many values remain free to vary once you have calculated statistics from the data. The more statistics you have calculated from your sample, the less freedom you have to make choices with your data. Of course, there is a more formal way to describe these constraints as well.
A constraint, also called a restriction, is a requirement placed on the data by the model for the data.
Let's look at an example to see what that means in practice.
Suppose you are doing an experiment where you roll a four-sided die \(200\) times. Then the sample size is \(n=200\). One constraint is that the observed frequencies of the four sides must add up to the sample size, \(200\).
The number of constraints will also depend on the number of parameters you need to describe a distribution, and whether or not you know what these parameters are.
Next, let's look at how the constraints relate to degrees of freedom.
Degrees of freedom formula
For most cases, the formula
degrees of freedom = number of observed frequencies - number of constraints
can be used. If you go back to the example with the four-sided die above, there was one constraint. The number of observed frequencies is \(4\) (the number of sides on the die). So the degrees of freedom would be \(4-1 = 3\).
There is a more general formula for the degrees of freedom:
degrees of freedom = number of cells (after combining) - number of constraints.
You are probably wondering what a cell is and why you might combine it. Let's look at an example.
You send out a survey to \(200\) people asking how many pets people have. You get back the following table of responses.
Table 1. Responses from pet ownership survey.
Pets | \(0\) | \(1\) | \(2\) | \(3\) | \(4\) | \(>4\) |
Expected | \(60\) | \(72\) | \(31\) | \(20\) | \(7\) | \(10\) |
However, suppose the model you are using is only a good approximation if none of the expected values falls below \(15\) (a stricter requirement than the usual rule of thumb of \(5\) for Chi-Squared tests). Then you could combine the last two columns of data (known as cells) to get the table below.
Table 2. Responses from pet ownership survey with combined cells.
Pets | \(0\) | \(1\) | \(2\) | \(3\) | \(>3\) |
Expected | \(60\) | \(72\) | \(31\) | \(20\) | \(17\) |
Then there are \(5\) cells, and one constraint (that the total of the expected values is \(200\)). So the degrees of freedom is \(5 - 1= 4\).
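The combining step and the resulting degrees of freedom can be sketched in Python. This is only an illustration under stated assumptions: the function names `combine_small_cells` and `degrees_of_freedom` are made up for this example, and the merging rule (fold a too-small cell into the neighbour on its left, or on its right for the first cell) is one reasonable choice rather than a fixed standard.

```python
def combine_small_cells(expected, minimum):
    """Merge adjoining cells until every expected value is at least `minimum`.

    Hypothetical helper: walks from the right-hand end of the table and
    folds any too-small cell into the neighbour on its left.
    """
    cells = []
    for value in reversed(expected):
        if cells and cells[-1] < minimum:
            cells[-1] += value      # previous (right-hand) cell was too small
        else:
            cells.append(value)
    cells.reverse()
    # Only the first cell can still be too small; fold it into its right neighbour.
    if len(cells) > 1 and cells[0] < minimum:
        cells[1] += cells[0]
        del cells[0]
    return cells


def degrees_of_freedom(cells, constraints=1):
    """degrees of freedom = number of cells (after combining) - number of constraints."""
    return len(cells) - constraints


expected = [60, 72, 31, 20, 7, 10]           # Table 1
combined = combine_small_cells(expected, minimum=15)
print(combined)                               # [60, 72, 31, 20, 17], as in Table 2
print(degrees_of_freedom(combined))           # 4
```

Note that the total of the expected values is unchanged by combining, so the single constraint (the total is \(200\)) still applies.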
You will usually only combine adjoining cells in your tables of data. Next, let's look at the official definition of degrees of freedom with the Chi-Squared distribution.
Degrees of freedom definition
If you have observed and expected frequencies and want to approximate the distribution of the test statistic \(X^2\), you would use the \(\chi^2\) family of distributions. This is written as
\[\begin{align} X^2 &= \sum \frac{(O_t - E_t)^2}{E_t} \\ &= \sum \frac{O_t ^2}{E_t} -N \\ & \sim \chi^2, \end{align}\]
where \(O_t\) is the observed frequency, \(E_t\) is the expected frequency, and \(N\) is the total number of observations. Remember that the Chi-Squared tests are only a good approximation if none of the expected frequencies is below \(5\).
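The step from the first form of \(X^2\) to the second comes from expanding the square. It relies on the assumption that the expected frequencies add up to the same total \(N\) as the observed frequencies, so that \(\sum O_t = \sum E_t = N\):

\[\begin{align} \sum \frac{(O_t - E_t)^2}{E_t} &= \sum \frac{O_t^2 - 2 O_t E_t + E_t^2}{E_t} \\ &= \sum \frac{O_t^2}{E_t} - 2\sum O_t + \sum E_t \\ &= \sum \frac{O_t^2}{E_t} - 2N + N \\ &= \sum \frac{O_t^2}{E_t} - N. \end{align}\]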
For a reminder of this test and how to use it, see Chi Squared Tests.
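As a quick check that the two forms of \(X^2\) agree, here is a minimal Python sketch. The observed counts are invented purely for illustration (they are not data from this article), the expected counts assume a fair four-sided die rolled \(200\) times, and the function names are hypothetical.

```python
def chi_squared_statistic(observed, expected):
    """X^2 as the sum of (O - E)^2 / E over all cells."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def chi_squared_shortcut(observed, expected):
    """X^2 as the sum of O^2 / E minus N.

    Only valid when the expected frequencies add up to the same
    total N as the observed frequencies.
    """
    n = sum(observed)
    return sum(o ** 2 / e for o, e in zip(observed, expected)) - n


observed = [44, 53, 58, 45]      # made-up counts for sides 1 to 4
expected = [50, 50, 50, 50]      # fair die: 200 rolls / 4 sides

print(round(chi_squared_statistic(observed, expected), 2))  # 2.68
print(round(chi_squared_shortcut(observed, expected), 2))   # 2.68
```

Both functions give the same value here because the observed and expected totals are both \(200\).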
The \(\chi^2\) distributions are actually a family of distributions that depend on the degrees of freedom. The degrees of freedom for this kind of distribution are written using the variable \(\nu\). Since you may need to combine cells when using \(\chi^2\) distributions, you would use the definition below.
For the \(\chi^2\) distribution, the number of degrees of freedom, \(\nu\), is given by
\[ \nu = \text{number of cells after combining}-1.\]
There will be cases where cells won't be combined, and in that case, you can simplify things a bit. If you go back to the four-sided die example, there are \(4\) possibilities that could come up on the die, and these are the cells. So for this example \(\nu = 4 - 1 = 3\), which matches the value found with the earlier formula.
To make the number of degrees of freedom explicit, it is written as a subscript on the Chi-Squared distribution: \(\chi^2_\nu \).
Degrees of freedom table
Once you know that you are using a Chi-Squared distribution with \(\nu\) degrees of freedom, you will need to use a degrees of freedom table so that you can do hypothesis tests. Here is a section out of a Chi-Squared table.
Table 3. Chi-Squared table.
degrees of freedom | \(0.99\) | \(0.95\) | \(0.9\) | \(0.1\) | \(0.05\) | \(0.01\) |
\(2\) | \(0.020\) | \(0.103\) | \(0.211\) | \(4.605\) | \(5.991\) | \(9.210\) |
\(3\) | \(0.115\) | \(0.352\) | \(0.584\) | \(6.251\) | \(7.815\) | \(11.345\) |
\(4\) | \(0.297\) | \(0.711\) | \(1.064\) | \(7.779\) | \(9.488\) | \(13.277\) |
The first column of the table contains the degrees of freedom, and the first row of the table gives the area to the right of the critical value, that is, the probability that the distribution exceeds it.
The notation for a critical value of \(\chi^2_\nu\) which is exceeded with probability \(a\%\) is \(\chi^2_\nu(a\%)\) or \(\chi^2_\nu(a/100)\).
Let's take an example using the Chi-Squared table.
Find the critical value for \(\chi^2_3(0.01)\).
Solution:
The notation for \(\chi^2_3(0.01)\) tells you that there are \(3\) degrees of freedom and you are interested in the \(0.01\) column of the table. Looking at the intersection of the row and column in the table above, you get \(11.345\). So
\[\chi^2_3(0.01) = 11.345 . \]
There is a second use for the table, as demonstrated in the next example.
Find the smallest value of \(y\) such that \(P(\chi^2_3 > y) = 0.95\).
Solution:
Remember that the significance level is the probability that the distribution exceeds the critical value. So asking for the smallest value \(y\) where \(P(\chi^2_3 > y) = 0.95\) is the same as asking what \(\chi^2_3(0.95)\) is. Using the Chi-Squared table you can see that \(\chi^2_3(0.95) =0.352 \), so \(y=0.352\).
Of course, a table can't list all of the possible values. If you need a value which is not in the table, there are many different statistics packages or calculators that can give you Chi-Squared table values.
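To give a rough idea of what such a calculator does, here is a sketch that uses only Python's standard library. It is an illustration, not a replacement for a tested statistics package: the function names `chi2_cdf` and `chi2_critical_value` are made up for this example, the cumulative probability is computed from the standard series for the regularized lower incomplete gamma function, and the critical value is found by bisection.

```python
import math


def chi2_cdf(x, df):
    """P(chi^2_df <= x), via the series for the regularized lower incomplete gamma."""
    if x <= 0:
        return 0.0
    a, z = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    n = 1
    # Sum z^n / (a (a+1) ... (a+n)) until the terms become negligible.
    while term > 1e-15 * total:
        term *= z / (a + n)
        total += term
        n += 1
    return total * math.exp(a * math.log(z) - z - math.lgamma(a))


def chi2_critical_value(df, right_tail_prob):
    """Value y with P(chi^2_df > y) = right_tail_prob, found by bisection."""
    lo, hi = 0.0, 1.0
    while 1.0 - chi2_cdf(hi, df) > right_tail_prob:
        hi *= 2.0                      # widen the bracket until it contains y
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if 1.0 - chi2_cdf(mid, df) > right_tail_prob:
            lo = mid                   # right-tail area still too large: move right
        else:
            hi = mid
    return (lo + hi) / 2.0


print(round(chi2_critical_value(3, 0.01), 3))   # 11.345
print(round(chi2_critical_value(3, 0.95), 3))   # 0.352
```

The two printed values reproduce the worked examples above, and the same function can be called with degrees of freedom or probabilities that the table does not list.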
Degrees of freedom t-test
The way the degrees of freedom in a \(t\)-test is calculated depends on whether or not you are using paired samples. For more information on these topics, see the articles T-distribution and Paired t-test.
Degrees of Freedom - Key takeaways
- A constraint, also called a restriction, is a requirement placed on the data by the model for the data.
- In most cases, degrees of freedom = number of observed frequencies - number of constraints.
- A more general formula for degrees of freedom is: degrees of freedom = number of cells (after combining) - number of constraints.
- For the \(\chi^2\) distribution, the number of degrees of freedom, \(\nu\), is given by
\[ \nu = \text{number of cells after combining}-1.\]
Frequently Asked Questions about Degrees of Freedom
How do you determine the degrees of freedom?
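It depends on the kind of test you are doing. For a Chi-Squared test it is the number of cells (after combining) minus the number of constraints. For other tests it is sometimes the sample size minus 1, and sometimes the sample size minus 2.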
What is a degree of freedom, with an example?
The degree of freedom is related to the sample size and the kind of test you are doing. For example in a paired t-test the degree of freedom is the sample size minus 1.
What is DF in a t-test?
It is the number of degrees of freedom.
What is the role of degree of freedom?
It tells you how many independent values can vary without breaking any constraints in the problem.
What do you mean by degrees of freedom?
In statistics, the degrees of freedom tells you how many independent values can vary without breaking any constraints in the problem.