Grouping, or blocking, is the main idea behind the randomized block design. This article defines the concept and compares it with both completely randomized designs and matched pairs designs. Start blocking, and be organized.
The Definition of Randomized Block Design
When data is grouped based on measurable and known unwanted variables, you say the data has been blocked. This is carried out to prevent undesirable factors from reducing the accuracy of an experiment.
The randomized block design is described as the process of grouping (or stratifying) before randomly picking samples for an experiment.
When carrying out an experiment or survey, you should try to reduce errors that may be contributed by various factors. A factor may be known and controllable, so you block (group) the samples based on this factor in a bid to reduce variability caused by this factor. The end goal of this process is to minimize the differences between components in a blocked group as compared to the differences between components of the entire sample. This would help you get more accurate estimates from each block, since the variability of members of each group is low.
Note that reduced variability makes comparisons more accurate: the members of a block share similar characteristics, so differences observed within a block are more likely to be due to the treatment.
For instance, suppose Femi wants to clean the house and plans to determine which of three brushes would clean the whole house fastest. Rather than carrying out an experiment in which each brush cleans the entire house, he decides to divide the house into three portions: the bedroom, the sitting room, and the kitchen.
This is a reasonable thing to do if Femi assumes that the floor texture differs from room to room. This way, each floor type is confined to its own block, and the variability due to different floor types is reduced.
In the above example, Femi identified that the floor texture can make a difference. But Femi is interested in which brush is better, so he decided to make three blocks for his experiment: the kitchen, the bedroom, and the sitting room. The factor that led Femi to the decision of making blocks is often regarded as a nuisance factor.
A nuisance factor, also known as a nuisance variable, is a variable that affects the outcomes of the experiment, but it is not of particular interest for the experiment.
Nuisance factors are not the same thing as lurking variables.
Lurking variables are ones that either hide a relationship that may exist between variables, or lead to a correlation that isn't actually true.
A lurking variable that needs to be accounted for in medical trials is the placebo effect, where people believe the medicine will have an effect so they experience an effect, even if what they are actually getting is a sugar pill instead of real medical treatment.
Let's look at two illustrations of a randomized block design to help clarify how a randomized block design would be constructed.
From the above figure, you can see how Femi has grouped the experiment into three sections. This is an important idea about the randomized block design.
From the above figure, after blocking into groups, Femi randomly samples each group for the test. After this stage, the analysis of variance is carried out.
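The two stages described above, blocking first and randomizing second, can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `randomize_within_blocks`, the seed, and the choice to randomize the order in which the brushes are used inside each room are assumptions, not details taken from Femi's study.

```python
import random

rooms = ["sitting room", "bedroom", "kitchen"]  # blocks
brushes = ["brush 1", "brush 2", "brush 3"]     # treatments


def randomize_within_blocks(blocks, treatments, seed=None):
    """Return a plan in which every treatment appears once per block,
    in an independently shuffled order for each block."""
    rng = random.Random(seed)  # seed is optional and only makes the plan repeatable
    plan = {}
    for block in blocks:
        order = list(treatments)  # copy so the original list is left untouched
        rng.shuffle(order)        # randomization happens inside the block
        plan[block] = order
    return plan


plan = randomize_within_blocks(rooms, brushes, seed=1)
for room, order in plan.items():
    print(room, "->", order)
```

Every room receives all three brushes, so the floor type cannot favor one brush over another; only the order within each room is left to chance.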
Randomized Block Design vs Completely Randomized Design
A completely randomized design is a process of randomly picking samples for an experiment so that all randomly selected items are treated without segregation (grouping). This method is susceptible to chance imbalance, because shared characteristics that could have been used to form groups, and so reduce variability, are not considered at the outset. The randomized block design reduces this variability through grouping, so that a balance is forced between study groups.
You can better understand the difference between a randomized block design versus a completely randomized design with an example.
Suppose you want to test a viral recipe of home-made ice cream. The recipe has pretty good directions, except that it does not specify the amount of sugar you need to use. Since you intend to serve this at a family dinner next week, you ask your neighbors if they could help you by tasting different batches of ice cream made with different amounts of sugar.
Here, the experiment is performed by varying the amount of sugar of each batch.
The first and most important ingredient is raw milk, so you go to your closest farmer's market just to find that they only have half a gallon left. You need at least \(2\) gallons to make enough batches of ice cream, so your neighbors can taste them.
After looking for a while, you find another farmer's market \(15\) minutes down the highway, where you buy the remaining \(1.5\) gallons of raw milk you required.
Here, the different types of milk are the nuisance variable.
As you make the ice cream, you note that the ice cream made with the milk from one place tastes slightly different from the ice cream made from the milk of the other place! You consider that you might be biased because you used milk that was not from your trustworthy farmer's market. It is time for experimentation!
A completely randomized design would be to let your neighbors taste random batches of ice cream, just organized by the sugar amount used in the recipe.
A randomized block design would be to first segregate the batches made from the different milks, and then let your neighbors taste random batches of ice cream while keeping note of which milk was used in each observation.
It is completely possible that the milk does have an influence on the result when making the ice cream. This could introduce an error in your experiment. Because of this, you should use the same kind of milk for the experiment, and for the family dinner as well.
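The contrast between the two designs can be sketched in Python. Everything specific here is an assumption made for illustration: the three sugar levels, the labels "market A" and "market B", and the idea that each milk yields exactly three batches are hypothetical, since the recipe above does not fix any of them.

```python
import random
from collections import Counter

sugar_levels = ["low", "medium", "high"]             # hypothetical treatments
milk_of_batch = ["market A"] * 3 + ["market B"] * 3  # hypothetical: six batches


def completely_randomized(milks, treatments, rng):
    """Assign sugar levels to all batches at once, ignoring the milk."""
    labels = treatments * (len(milks) // len(treatments))
    rng.shuffle(labels)
    return list(zip(milks, labels))


def randomized_block(milks, treatments, rng):
    """Within each milk (block), use every sugar level once, in random order.
    Assumes each block has exactly len(treatments) batches."""
    allocation = []
    for milk in sorted(set(milks)):
        order = list(treatments)
        rng.shuffle(order)
        allocation.extend((milk, sugar) for sugar in order)
    return allocation


rng = random.Random(0)
print(Counter(completely_randomized(milk_of_batch, sugar_levels, rng)))
print(Counter(randomized_block(milk_of_batch, sugar_levels, rng)))
```

In the completely randomized allocation, a sugar level can land twice on the same milk by chance. In the blocked allocation, every combination of milk and sugar level appears exactly once, so the milk cannot be mistaken for a sugar effect.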
So which is better, blocking or randomization?
Is Blocking Better Than Randomization or Not?
The randomized block design is more beneficial than complete randomization because it reduces error by creating groups that contain items that are much more similar in comparison to the entire samples.
However, blocking would be preferred only when the sample size is not too large and when the nuisance factor(s) are not too many. When you deal with large samples, there is a higher tendency of numerous nuisance factors, which would require you to increase the grouping as well. The principle is that the more grouping you do, the smaller the sample size in each group. Therefore, when large sample sizes are involved or there are many nuisance factors, then you should approach such cases with a completely randomized design.
Furthermore, when the blocking variable is unknown, you should rely on a completely randomized design.
Randomized Block Design vs Matched Pairs Design
A matched pairs design groups samples in twos (pairs) based on confounding characteristics (such as age, gender, status, etc.), and the members of each pair are randomly assigned treatment conditions. Randomized block designs differ from matched pairs designs since a block can contain more than two members. However, when each block in a randomized block design contains just two members, the design is similar to a matched pairs design.
Moreover, both the randomized block and matched pair designs are best applied to only small sample sizes.
In the ice cream example, you would make a matched pairs design by asking your neighbors to taste two scoops of ice cream at each observation, both with the same amount of sugar but with milk from different places.
So what are the advantages of a randomized block design?
What are the Advantages of a Randomized Block Design?
A primary benefit of the randomized block design is the creation of groups which increases similarities between members in the block as compared to the wide variation which may occur when each member is compared with the entire data set. This attribute is very advantageous because:
It reduces error.
It increases the statistical reliability of a study.
It remains a better approach to analyzing smaller sample sizes.
Let's look closer at the model for a randomized block design.
The Statistical Model for a Randomized Block Design
The statistical model for a randomized block design for one blocked nuisance factor is given by:
\[y_{ij}=\mu+T_j+B_i+E_{ij}\]
where:
\(y_{ij}\) is the observed value for treatment \(j\) in block \(i\);
\(μ\) is the grand mean;
\(T_j\) is the \(j\)th treatment effect;
\(B_i\) is the \(i\)th blocking effect; and
\(E_{ij}\) is the random error.
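To see how the terms of the model combine, here is a minimal Python sketch that generates a table of observations from it. The function name `simulate_rbd`, the numeric effects in the usage lines, and the choice of normally distributed errors are all assumptions for illustration; the model above only says that \(E_{ij}\) is random.

```python
import random


def simulate_rbd(grand_mean, treatment_effects, block_effects, error_sd, seed=None):
    """Return table[i][j] = grand_mean + T_j + B_i + E_ij.

    Rows are blocks (index i) and columns are treatments (index j).
    The errors E_ij are drawn from a normal distribution with mean 0
    and standard deviation error_sd (an assumption of this sketch).
    """
    rng = random.Random(seed)
    return [
        [grand_mean + t + b + rng.gauss(0.0, error_sd) for t in treatment_effects]
        for b in block_effects
    ]


# Hypothetical effects: three treatments and two blocks.
table = simulate_rbd(10.0, [-1.0, 0.0, 1.0], [2.0, -2.0], error_sd=0.5, seed=42)
for row in table:
    print([round(y, 2) for y in row])
```

Setting `error_sd=0` removes the noise, which makes the additive structure easy to check by hand: each cell is just the grand mean plus one treatment effect plus one block effect.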
This is the same additive model used in the analysis of variance (ANOVA), so the total variation can be partitioned as:
\[SS_T=SS_t+SS_b+SS_e\]
where:
\(SS_T\) is the total sum of squares;
\(SS_t\) is the sum of squares from treatments;
\(SS_b\) is the sum of squares from blocking; and
\(SS_e\) is the sum of squares from the error.
The total sum of squares is calculated using:
\[SS_T=\sum_{i=1}^{\beta} \sum_{j=1}^{\alpha}(y_{ij}-\mu)^2\]
The sum of squares from treatments is calculated using:
\[SS_t=\beta \sum_{j=1}^{\alpha}(\bar{y}_{.j}-\mu)^2\]
The sum of squares from blocking is calculated using:
\[SS_b=\alpha \sum_{i=1}^{\beta}(\bar{y}_{i.}-\mu)^2\]
where:
\(\alpha\) is the number of treatments;
\(\beta\) is the number of blocks;
\(\bar{y}_{.j}\) is the mean of the \(j\)th treatment;
\(\bar{y}_{i.}\) is the mean of the \(i\)th block; and
the total sample size is a product of the number of treatments and blocks, which is \(\alpha \beta\).
The sum of squares of error can be calculated using:
\[SS_e=SS_T-SS_t-SS_b\]
Note that:
\[SS_T=SS_t+SS_b+SS_e\]
This becomes:
\[SS_e=\sum_{i=1}^{\beta} \sum_{j=1}^{\alpha}(y_{ij}-\mu)^2- \beta \sum_{j=1}^{\alpha}(\bar{y}_{.j}-\mu)^2 -\alpha \sum_{i=1}^{\beta}(\bar{y}_{i.}-\mu)^2\]
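The four sums of squares can be computed directly from a table of observations. The sketch below is one way to write this in Python, assuming the table is stored as a list of rows with one row per block and one column per treatment; the function name `sums_of_squares` is made up for this illustration.

```python
def sums_of_squares(table):
    """Return (SS_T, SS_t, SS_b, SS_e) for table[i][j], where i indexes
    blocks (rows) and j indexes treatments (columns)."""
    n_blocks = len(table)         # beta
    n_treatments = len(table[0])  # alpha
    grand_mean = sum(sum(row) for row in table) / (n_blocks * n_treatments)

    # Each treatment mean averages over the blocks; each block mean over the treatments.
    treatment_means = [sum(row[j] for row in table) / n_blocks
                       for j in range(n_treatments)]
    block_means = [sum(row) / n_treatments for row in table]

    ss_total = sum((y - grand_mean) ** 2 for row in table for y in row)
    ss_treat = n_blocks * sum((m - grand_mean) ** 2 for m in treatment_means)
    ss_block = n_treatments * sum((m - grand_mean) ** 2 for m in block_means)
    ss_error = ss_total - ss_treat - ss_block
    return ss_total, ss_treat, ss_block, ss_error


# Tiny hypothetical table: two blocks (rows) and two treatments (columns).
print(sums_of_squares([[1, 2], [3, 4]]))
```

For this small table the grand mean is 2.5, the treatment means are 2 and 3, and the block means are 1.5 and 3.5, so the sums of squares are 5, 1, 4, and 0 respectively. The error term is zero because the table is perfectly additive.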
The value of the test statistic is obtained by dividing the mean square of the treatments by the mean square of the error. This is mathematically expressed as:
\[F=\frac{M_t}{M_e}\]
where:
\(F\) is the test statistic value;
\(M_t\) is the mean square of the treatments, which is the sum of squares from treatments divided by its degrees of freedom: \[M_t=\frac{SS_t}{\alpha -1}\]
\(M_e\) is the mean square of the error, which is the sum of squares of error divided by its degrees of freedom: \[M_e=\frac{SS_e}{(\alpha -1)(\beta -1)}\]
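As a small sketch of the last two formulas, the function below turns the sums of squares into mean squares and then into the test statistic. The name `f_statistic` and the numbers in the usage line are hypothetical and chosen only so the arithmetic is easy to follow.

```python
def f_statistic(ss_treat, ss_error, n_treatments, n_blocks):
    """Return (F, df_treatment, df_error) for a randomized block design."""
    df_treat = n_treatments - 1                     # alpha - 1
    df_error = (n_treatments - 1) * (n_blocks - 1)  # (alpha - 1)(beta - 1)
    ms_treat = ss_treat / df_treat                  # M_t
    ms_error = ss_error / df_error                  # M_e
    return ms_treat / ms_error, df_treat, df_error


# Hypothetical values: SS_t = 10 and SS_e = 6 with 3 treatments and 4 blocks.
print(f_statistic(10.0, 6.0, n_treatments=3, n_blocks=4))
```

With these inputs the treatment mean square is 10 / 2 = 5 and the error mean square is 6 / 6 = 1, so the test statistic is 5 on 2 and 6 degrees of freedom.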
The next section looks at an example to explain the application of these formulas.
Examples of Randomized Block Design
The illustration below applies the randomized block design to a concrete case.
Nonso asks Femi to assess the efficiency of three types of brushes in cleaning his whole house. Femi's study produced the following efficiency rates.
Room | Brush 1 | Brush 2 | Brush 3 |
Sitting room | \(65\) | \(63\) | \(71\) |
Bedroom | \(67\) | \(66\) | \(72\) |
Kitchen | \(68\) | \(70\) | \(75\) |
Bathroom | \(62\) | \(57\) | \(69\) |
Table 1. Example of Randomized block design.
Would Femi's conclusion indicate variability in the efficiency between the brushes?
Solution:
Note that Femi carried out blocking by dividing his assessment of the whole house into four blocks: the bedroom, kitchen, sitting room, and bathroom.
First step: Make your hypotheses.
\[ \begin{align} &H_0: \; \text{There is no variability in the efficiency of the brushes.} \\ &H_a: \; \text{There is variability in the efficiency of the brushes.} \end{align} \]
Don't forget that \(H_0\) denotes the null hypothesis, and \(H_a\) denotes the alternative hypothesis.
Second step: Find the means for the treatments (columns), blocks (rows), and the grand mean.
The mean of Treatment 1 is:
\[\bar{y}_{.1}=\frac{262}{4}=65.5\]
The mean of Treatment 2 is:
\[\bar{y}_{.2}=\frac{256}{4}=64\]
The mean of Treatment 3 is:
\[\bar{y}_{.3}=\frac{287}{4}=71.75\]
The mean of Block 1 is:
\[\bar{y}_{1.}=\frac{199}{3}=66.33\]
The mean of Block 2 is:
\[\bar{y}_{2.}=\frac{205}{3}=68.33\]
The mean of Block 3 is:
\[\bar{y}_{3.}=\frac{213}{3}=71\]
The mean of Block 4 is:
\[\bar{y}_{4.}=\frac{188}{3}=62.67\]
The grand mean is:
\[\mu=\frac{805}{12}=67.08\]
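These means can be double-checked with a short Python sketch that uses the values from Table 1. The variable names are chosen for this illustration only; rows hold the rooms (blocks) and columns hold the brushes (treatments).

```python
from statistics import mean

rooms = ["Sitting room", "Bedroom", "Kitchen", "Bathroom"]
table = [
    [65, 63, 71],  # Sitting room
    [67, 66, 72],  # Bedroom
    [68, 70, 75],  # Kitchen
    [62, 57, 69],  # Bathroom
]

treatment_means = [mean(column) for column in zip(*table)]  # one per brush
block_means = [mean(row) for row in table]                  # one per room
grand_mean = mean(y for row in table for y in row)

print("Treatment means:", treatment_means)
for room, m in zip(rooms, block_means):
    print(room, round(m, 2))
print("Grand mean:", round(grand_mean, 2))
```

`zip(*table)` transposes the rows into columns, which is what turns the room-by-room layout into one list of values per brush.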
Update your table as follows:
Block | Brush 1 (Treatment 1) | Brush 2 (Treatment 2) | Brush 3 (Treatment 3) | Block total (row sum) | Block mean |
Sitting room (1st block) | \(65\) | \(63\) | \(71\) | \(199\) | \(66.33\) |
Bedroom (2nd block) | \(67\) | \(66\) | \(72\) | \(205\) | \(68.33\) |
Kitchen (3rd block) | \(68\) | \(70\) | \(75\) | \(213\) | \(71\) |
Bathroom (4th block) | \(62\) | \(57\) | \(69\) | \(188\) | \(62.67\) |
Treatment total (column sum) | \(262\) | \(256\) | \(287\) | \(805\) | \(67.08\) |
Treatment mean | \(65.5\) | \(64\) | \(71.75\) | | |
Table 2. Example of Randomized block design.
Third step: Find the sum of squares for total, treatment, blocking, and error.
There are three brushes and four rooms, so the number of treatments is \(\alpha=3\) and the number of blocks is \(\beta=4\). The results below are computed with the unrounded means; the means shown inside the brackets are rounded to two decimal places.
The total sum of squares, \(SS_T\), is:
Recall that:
\[SS_T=\sum_{i=1}^{\beta} \sum_{j=1}^{\alpha}(y_{ij}-\mu)^2\]
\[\begin{align} SS_T& =(65-67.08)^2+(63-67.08)^2 \\ & \quad + \dots+(57-67.08)^2+(69-67.08)^2 \\ &\approx 264.92 \end{align}\]
The sum of squares from treatments, \(SS_t\), is:
Recall that:
\[SS_t=\beta \sum_{j=1}^{\alpha}(\bar{y}_{.j}-\mu)^2\]
and \(\beta\) is \(4\), because each treatment mean is an average over the four blocks.
\[\begin{align} SS_t &=4((65.5-67.08)^2+(64-67.08)^2+(71.75-67.08)^2)\\ &\approx 135.17 \end{align}\]
The sum of squares from blocking, \(SS_b\), is:
Recall that:
\[SS_b=\alpha \sum_{i=1}^{\beta}(\bar{y}_{i.}-\mu)^2\]
and \(\alpha\) is \(3\), because each block mean is an average over the three treatments.
\[\begin{align} SS_b &=3((66.33-67.08)^2+(68.33-67.08)^2+(71-67.08)^2+(62.67-67.08)^2)\\ &\approx 110.92 \end{align}\]
Therefore, you can find the sum of squares of error:
Recall that:
\[SS_e=SS_T-SS_t-SS_b\]
\[\begin{align} SS_e&=264.92-135.17-110.92 \\ &=18.83 \end{align}\]
Fourth step: Find the mean square values for treatment and error.
The mean square value for treatment, \(M_t\), is:
Recall that:
\[M_t=\frac{SS_t}{\alpha -1}\]
\[M_t=\frac{135.17}{3-1} \approx 67.58\]
Recall that \(\alpha\) is the number of treatments, which is \(3\) in this case.
The mean square value for error, \(M_e\), is:
Recall that:
\[M_e=\frac{SS_e}{(\alpha -1)(\beta -1)}\]
\[M_e=\frac{18.83}{(3-1)(4-1)} \approx 3.14\]
Fifth step: Find the value of the test statistic.
The test statistic value, \(F\), is:
Recall that:
\[F=\frac{M_t}{M_e}\]
\[F=\frac{67.58}{3.14} \approx 21.5\]
Sixth step: Use statistical tables to determine the conclusion.
Here, you have to take some care. You need your numerator degrees of freedom, \(df_n\), and your denominator degrees of freedom, \(df_d\).
Note that:
\[df_n=\alpha -1\]
and
\[df_d=(\alpha-1)(\beta-1)\]
Hence,
\[df_n=3-1=2\]
and
\[df_d=(3-1)(4-1)=6\]
You could use a level of significance \(a=0.05\) to carry out your hypothesis test. The critical \(F\)-value at this significance level (\(a=0.05\)) with a \(df_n\) of \(2\) and a \(df_d\) of \(6\) is \(5.14\). The computed \(F\) value is far above this; it lies between the critical values for \(a=0.005\), which is \(14.54\), and \(a=0.001\), which is \(27.00\).
You can refer to a table of "Percentiles of the F Distribution" to conduct your analysis, or use statistical software to determine the exact \(P\)-value.
Final step: Communicate your finding.
The \(F\)-value determined from the experiment, \(21.5\), lies between \(F_{0.005}=14.54\) and \(F_{0.001}=27.00\), and using statistical software the exact \(P\)-value is approximately \(0.0018\). Since the experiment's \(P\)-value (\(0.0018\)) is less than the chosen level of significance \(a=0.05\), you can reject the null hypothesis, \(H_0\): There is no variability in the efficiency of the brushes.
This means that Femi's conclusion indicates variability in the brushes.
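As a cross-check, the whole analysis can be run on Table 1 in a few lines of Python, taking \(\alpha=3\) treatments (brushes) and \(\beta=4\) blocks (rooms) as defined in the model section. This is a sketch under one stated assumption: the \(P\)-value line uses a closed-form expression for the upper tail of the \(F\) distribution that holds only when the numerator degrees of freedom equal \(2\), which is the case with three treatments. For other designs you would need statistical software instead.

```python
# Rows are blocks (rooms); columns are treatments (brushes 1 to 3).
table = [
    [65, 63, 71],  # sitting room
    [67, 66, 72],  # bedroom
    [68, 70, 75],  # kitchen
    [62, 57, 69],  # bathroom
]

n_blocks = len(table)         # beta = 4
n_treatments = len(table[0])  # alpha = 3
grand_mean = sum(map(sum, table)) / (n_blocks * n_treatments)
treatment_means = [sum(col) / n_blocks for col in zip(*table)]
block_means = [sum(row) / n_treatments for row in table]

ss_total = sum((y - grand_mean) ** 2 for row in table for y in row)
ss_treat = n_blocks * sum((m - grand_mean) ** 2 for m in treatment_means)
ss_block = n_treatments * sum((m - grand_mean) ** 2 for m in block_means)
ss_error = ss_total - ss_treat - ss_block

df_n = n_treatments - 1
df_d = (n_treatments - 1) * (n_blocks - 1)
f_value = (ss_treat / df_n) / (ss_error / df_d)

# Upper-tail probability P(F > f_value); this closed form is valid only for df_n == 2.
assert df_n == 2
p_value = (1 + 2 * f_value / df_d) ** (-df_d / 2)

print(f"SS_T={ss_total:.2f} SS_t={ss_treat:.2f} SS_b={ss_block:.2f} SS_e={ss_error:.2f}")
print(f"F={f_value:.2f} on ({df_n}, {df_d}) degrees of freedom, P-value={p_value:.4f}")
```

Run as written, the script reports an \(F\) value of about \(21.5\) on \(2\) and \(6\) degrees of freedom and a \(P\)-value of about \(0.0018\), well below the \(0.05\) level of significance.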
Well, I guess that supported my excuse as to why I got tired of cleaning since some brushes weren't that efficient.
Try out more examples on your own, keeping in mind that randomized blocking essentially removes the influence of nuisance factors through blocking (grouping) before randomization. The goal is to create groups whose members are similar, with less variability than in the whole sample. Moreover, if a lot of variability remains within blocks, this indicates that the blocking was not done properly or that the nuisance factor is not a good variable to block on. Hoping you'll start blocking afterwards!
Randomized Block Design - Key takeaways
- The randomized block design is described as the process of grouping (or stratifying) before randomly picking samples for an experiment.
- The randomized block design is more beneficial than complete randomization because it reduces error by creating groups that contain items that are much more similar in comparison to the entire sample.
- The randomized block and matched pair designs are best applied to only small sample sizes.
- The randomized block design is beneficial with smaller sample sizes because it reduces the error term.
- The statistical model for a randomized block design for one blocked nuisance factor is given by:
\[y_{ij}=\mu+T_j+B_i+E_{ij}\]
Frequently Asked Questions about Randomized Block Design
What is an example of a randomized block design?
A randomized block design is when you divide the population into groups before proceeding to take random samples. For example, rather than picking random students from a high school, you first divide them into classrooms, and then you pick random students from each classroom.
How do you create a randomized block design?
To create a randomized block design, you first need to divide the population into groups, a step which is also known as stratification. Then, you pick random samples from each group.
What is the difference between a completely randomized design and a randomized block design?
In the completely randomized design, you make a sample by picking random individuals from the whole population with no particular criteria. In a randomized block design, you first divide the population into groups, and then pick random individuals from each group.
What is the primary benefit of a randomized block design?
Using a randomized block design can help you account for factors that would otherwise have led to errors in the experiment. A factor may be known and controllable, so you divide the samples based on this factor to reduce variability.
What are the advantages of randomized block design?
Variability is reduced by creating groups of members that share characteristics. This means that a randomized block design can help you:
- Reduce error.
- Increase the statistical reliability of a study.
- Focus on smaller sample sizes.