Error estimation can also use the mean value of all the measurements if there is no expected value or standard value.
The mean value
To calculate the mean, we need to add all measured values of x and divide the sum by the number of values we took. The formula to calculate the mean is:
\[\text{mean} = \frac{x_1 + x_2 + x_3 + x_4 + ...+x_n}{n}\]
Let’s say we have five measurements, with the values 3.4, 3.3, 3.342, 3.56, and 3.28. If we add all these values and divide by the number of measurements (five), we get 3.3764.
As most of our measurements are given to no more than two decimal places, we can round this to 3.38.
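The calculation above can be checked with a few lines of Python. This is only a minimal sketch: the helper name `mean` is chosen here for illustration, and the values are the five measurements from the example.

```python
# The five measurements from the example above.
measurements = [3.4, 3.3, 3.342, 3.56, 3.28]


def mean(values):
    """Return the arithmetic mean: the sum of the values divided by their count."""
    return sum(values) / len(values)


average = mean(measurements)

# Show the unrounded mean (to four decimal places) and the value rounded to two.
print(round(average, 4))
print(round(average, 2))
```

The first printed value is the mean before rounding and the second is the rounded result quoted in the text.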
Estimation of errors
Here, we are going to distinguish between estimating the absolute error, the relative error, and the percentage error.
Estimating the absolute error
To estimate the absolute error, we need to calculate the difference between the measured value \(x_0\) and the expected or standard value \(x_{ref}\):
\[\text{Absolute error} = |x_0 - x_{ref}|\]
Imagine you measure the length of a piece of wood. You know its length is 2.0 m with a very high precision of ± 0.00001 m. The precision is so high that the length is taken as exactly 2.0 m. If your instrument reads 2.003 m, your absolute error is |2.003 m - 2.0 m|, or 0.003 m.
Estimating the relative error
To estimate the relative error, we need to calculate the difference between the measured value \(x_0\) and the standard value \(x_{ref}\) and divide it by the magnitude of the standard value \(x_{ref}\):
\[\text{Relative error} = \frac{|x_0 - x_{ref}|}{|x_{ref}|}\]
Using the figures from the previous example, the relative error in the measurements is | 2.003m-2.0m | / | 2.0m | or 0.0015. As you can see, the relative error is very small and has no units.
Estimating the percentage error
To estimate the percentage error, we need to calculate the relative error and multiply it by one hundred. The percentage error is expressed as ‘error value’%. It tells us the size of the deviation as a percentage of the standard value.
\[\text{Percentage error} = \frac{|x_0 - x_{ref}|}{|x_{ref}|} \cdot 100 \%\]
Using the figures from the previous example, the percentage error is 0.15%.
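The three error estimates can be sketched in Python as follows. The function names are illustrative choices, not part of any library, and the numbers are those of the piece-of-wood example (a reading of 2.003 m against a reference of 2.0 m).

```python
def absolute_error(measured, reference):
    """Absolute error: |x0 - xref|, in the same units as the measurement."""
    return abs(measured - reference)


def relative_error(measured, reference):
    """Relative error: absolute error divided by |xref| (dimensionless)."""
    return absolute_error(measured, reference) / abs(reference)


def percentage_error(measured, reference):
    """Percentage error: relative error multiplied by one hundred."""
    return relative_error(measured, reference) * 100


measured = 2.003   # instrument reading in metres
reference = 2.0    # reference length in metres

# Results are rounded to hide floating-point noise in the subtraction.
print(round(absolute_error(measured, reference), 6))    # metres
print(round(relative_error(measured, reference), 6))    # no units
print(round(percentage_error(measured, reference), 4))  # percent
```

Each function builds on the previous one, which mirrors how the three definitions are related: the relative error is the absolute error scaled by the reference, and the percentage error is the relative error scaled by one hundred.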
What is the line of best fit?
The line of best fit is used when plotting data where one variable depends on another one. By its nature, a variable changes value, and we can measure the changes by plotting them on a graph against another variable such as time. The relationship between two variables will often be linear. The line of best fit is the line that is closest to all the plotted values.
Some values might be far away from the line of best fit. These are called outliers. The line of best fit is not a useful method for all data, so we need to know how and when to use it.
Obtaining the line of best fit
To obtain the line of best fit, we need to plot the points as in the example below:
Fig. 1 - Data plotted from several measurements showing variation on the y-axis
Here, many of our points are dispersed. However, despite this data dispersion, they appear to follow a linear progression. The line that is closest to all those points is the line of best fit.
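The text describes the line of best fit as the line closest to all the points, without fixing a method for finding it. One common way to make "closest" precise is ordinary least squares, which minimises the sum of the squared vertical distances between the points and the line. The sketch below assumes that method. The function name `least_squares_line` and the data points are hypothetical and are not read from the figure.

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept) of the ordinary least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from the means, and sum of squared x deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical, slightly scattered data that roughly follows y = 2x.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept = least_squares_line(xs, ys)
print(round(slope, 2), round(intercept, 2))
```

Drawing the line by eye, as the article does, will usually give a similar slope when the trend is clear. The least-squares version simply makes the choice of line reproducible.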
When to use the line of best fit
To be able to use the line of best fit, the data need to follow some patterns:
- The relationship between the two plotted variables must be linear.
- The dispersion of the values can be large, but the trend must be clear.
- The line must pass close to all values.
Data outliers
Sometimes in a plot, there are values outside the normal range. These are called outliers. If the outliers are fewer in number than the data points following the line, the outliers can be ignored. Outliers are often linked to errors in the measurements. In the image below, the pink point is an outlier.
Fig. 2 - Data plotted from several measurements showing variation on the y-axis in green and an outlier in pink
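One simple way to flag candidate outliers is to measure how far each point lies vertically from the trend line and compare that distance with a chosen threshold. The sketch below assumes this approach. The article does not define a numerical criterion, so the threshold, the function name `find_outliers`, and the data are all hypothetical.

```python
def find_outliers(xs, ys, slope, intercept, threshold):
    """Return the (x, y) points whose vertical distance from the line exceeds threshold."""
    outliers = []
    for x, y in zip(xs, ys):
        predicted = slope * x + intercept
        if abs(y - predicted) > threshold:
            outliers.append((x, y))
    return outliers


# Hypothetical data that follows y = 2x closely, except for the third point.
xs = [1, 2, 3, 4, 5]
ys = [2.0, 4.1, 9.5, 8.0, 9.9]

# Assumed trend line y = 2x and an assumed threshold of 1.0 in the units of y.
print(find_outliers(xs, ys, slope=2.0, intercept=0.0, threshold=1.0))
```

A flagged point is only a candidate: whether it can be ignored still depends on checking the measurement, as described above.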
Drawing the line of best fit
To draw the line of best fit, we need to draw a line passing through the points of our measurements. The point where the line crosses the y-axis gives the value of y when x is zero. If the slope is positive and our measurements start at x = 0, this is the minimum value of y on the line.
The inclination or slope of the line is the direct relationship between x and y, and the larger the slope, the more vertical it will be. A large slope means that the data changes very fast as x increases. A gentle slope indicates a very slow change of the data.
Fig. 3 - The line of best fit is shown in pink, with the slope being shown in light green
Calculating uncertainty in a plot
In a plot or a graph with error bars, there can be many lines passing between the bars. We can calculate the uncertainty of the data using the error bars and the lines passing between them. See the following example of three lines passing between values with error bars:
Fig. 4 - Plot showing uncertainty bars and three lines passing between them. The blue and purple lines begin at the extreme values of the uncertainty bars
How to calculate the uncertainty in a plot
To calculate the uncertainty in a plot, we need to know the uncertainty values in the plot.
- Calculate two lines of best fit.
- The first line (the green one in the image above) goes from the highest value of the first error bar to the lowest value of the last error bar.
- The second line (red) goes from the lowest value of the first error bar to the highest value of the last error bar.
- Calculate the slope m of each line using the formula below, then take half the difference between the two slopes as the uncertainty.
\[m = \frac{y_2 - y_1}{x_2-x_1}\]
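The steps above can be sketched in Python. This is an illustration under stated assumptions: each data point is represented as a hypothetical `(x, y, dy)` tuple, where `dy` is the half-length of the error bar, and the function names and sample values are invented for the example.

```python
def slope(x1, y1, x2, y2):
    """Slope of the straight line through (x1, y1) and (x2, y2)."""
    return (y2 - y1) / (x2 - x1)


def extreme_slopes(first, last):
    """Return (shallowest, steepest) slopes between the first and last error bars.

    Each point is an (x, y, dy) tuple, where dy is the half-length of its error bar.
    """
    x1, y1, dy1 = first
    x2, y2, dy2 = last
    # Top of the first bar to the bottom of the last bar: shallowest line.
    shallowest = slope(x1, y1 + dy1, x2, y2 - dy2)
    # Bottom of the first bar to the top of the last bar: steepest line.
    steepest = slope(x1, y1 - dy1, x2, y2 + dy2)
    return shallowest, steepest


# Hypothetical first and last data points with error bars of ±0.5.
first = (0.0, 10.0, 0.5)
last = (10.0, 20.0, 0.5)

shallowest, steepest = extreme_slopes(first, last)
print(round(shallowest, 3), round(steepest, 3))
```

The assignment of "shallowest" and "steepest" assumes data that increases from the first point to the last, as in the figures.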
Let’s look at an example of this, using temperature vs time data.
Calculate the uncertainty of the data in the plot below.
Figure 6. Plot showing uncertainty bars and three lines passing between them. The red and green lines begin at the extreme values of the uncertainty bars. The plot is used to approximate the uncertainty. Source: Manuel R. Camacho, StudySmarter.
Time (s) | 20 | 40 | 60 | 80 |
Temperature in Celsius | 84.5 ± 1 | 87 ± 0.9 | 90.1 ± 0.7 | 94.9 ± 1 |
To calculate the uncertainty, you need to draw the line with the highest slope (in red) and the line with the lowest slope (in green).
In order to do this, you need to consider the steeper and the less steep slopes of a line that passes between the points, taking into account the error bars. This method will give you just an approximate result depending on the lines you choose.
You calculate the slope of the red line as below, taking the points from t=80 and t=60.
\(\frac{(94.9+1)^\circ C - (90.1 + 0.7)^\circ C}{(80-60) \, s} = 0.255 \, ^\circ C/s\)
You now calculate the slope of the green line, taking the points from t=80 and t=20.
\(\frac{(94.9 - 1)^\circ C - (84.5 + 1)^\circ C}{(80-20) \, s} = 0.14 \, ^\circ C/s\)
Now you subtract the slope of the green one (m2) from the slope of the red one (m1) and divide by 2.
\(\text{Uncertainty} = \frac{0.255 \, ^\circ C/s - 0.14 \, ^\circ C/s}{2} = 0.0575 \, ^\circ C/s\)
As our temperature measurements are given to only one decimal place, we round the result to one significant figure: the uncertainty in the slope is about 0.06 °C/s.
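The arithmetic of this example can be reproduced with a short Python sketch. The dictionary layout and variable names are illustrative assumptions, while the numbers come from the table and the choice of points follows the worked example. Because each slope is a temperature difference divided by a time difference, the results are in °C per second.

```python
# Table data: time in seconds -> (temperature in °C, uncertainty in °C).
data = {
    20: (84.5, 1.0),
    40: (87.0, 0.9),
    60: (90.1, 0.7),
    80: (94.9, 1.0),
}


def slope(x1, y1, x2, y2):
    """Slope of the straight line through (x1, y1) and (x2, y2)."""
    return (y2 - y1) / (x2 - x1)


# Red line as chosen in the example: top of the bar at t=60 to top of the bar at t=80.
m_red = slope(60, data[60][0] + data[60][1], 80, data[80][0] + data[80][1])

# Green line as chosen in the example: top of the bar at t=20 to bottom of the bar at t=80.
m_green = slope(20, data[20][0] + data[20][1], 80, data[80][0] - data[80][1])

# Half the difference between the two slopes approximates the uncertainty.
uncertainty = (m_red - m_green) / 2

print(round(m_red, 3), round(m_green, 3), round(uncertainty, 4))
```

Choosing different lines between the error bars would give a somewhat different result, which is why the method is described as approximate.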
Estimation of Errors - Key takeaways
- You can estimate the errors of a measured value by comparing it to a standard value or reference value.
- The error can be estimated as an absolute error, a percentage error, or a relative error.
- The absolute error measures the total difference between the obtained value (X0) and the value you expect from a measurement (Xref), equal to the absolute value of their difference: Abs = |X0 - Xref|.
- The relative and percentage errors express the difference between the measured value and the expected value as a fraction of the expected value. The relative error is the absolute error divided by the expected value, \(\text{rel} = \frac{Abs}{|X_{ref}|}\), and the percentage error is the relative error expressed as a percentage, \(\text{per} = \Big(\frac{Abs}{|X_{ref}|} \Big) \cdot 100\). You must add the percentage symbol for percentage errors.
- You can approximate the relationship between your measured values using a linear function. This approximation can be made simply by drawing a line, which must be the line that passes closest to all values (the line of best fit).