The table shows
known values relating variables x
and y.
What will be the best prediction of
y when x
= 10?
We will find
the line of best fit by finding the least square errors in three
ways: numerically, algebraically and using technology software.
x
y
0
2
2
8
4
14
6
16
So y'
= 2x
+ 3 better fits the data than y'
= 3x
+ 2. But we have to check if another line does not give an
even better fit!
Check which of these lines gives a smaller square
error and is therefore an even better fit, and therefore
gives a better estimate of y'(10): y'
= 2x
+ 3; y'
= 2x
+ 4; y'
= 2x
+ 5; y'
= 3x;
y'
= 3x
+ 1; y'
= 3x
+ 2; y'
= 2,5x
+ 1; y'
= 2,5x
+ 2; y'
= 2,5x
+ 3
To find
the least square error numerically is very computation-intensive!
So let’s use technology tools to do the calculations
for us! Open and use the Best
line Excel worksheet:
Is there
always a minimum square error? Why? This is an important theoretical
and practical question – you do not want to search for
a minimum if a unique minimum does not exist!
Let’s
investigate a special case to begin with. Let’s fix
b at 3, and ask: How does S =
å(y–y')2
change
as a changes? For which a is y'
= ax
+ 3 the best fit for the above data? Investigate
the problem numerically, by using this Excel worksheet:
Now investigate
the problem algebraically: Show that S = å(y–y')2 can
be written as a quadratic function of a, and deduce the minimum
value of S and the corresponding value of a.
Check your answer here:
When both a
and b can vary, we will find that the sum of square errors is a quadratic
function in a and b, and therefore S always has a minimum value. We can
find the minimum value of S and the corresponding values of a and b, and
therefore the line of best fit in several ways.
Let’s
find the line of best fit through numerical trial-and
improvement. The computational power of Excel can help
us to easily find the line of best fit:
Now investigate
the problem algebraically. Show that S = å(y'
–y)2 can
be written as a quadratic function in a and b and deduce the
minimum value of S and the corresponding values of a and b.
Check your answer here:
In
the introduction we asked the question about the algebraic
relationship between x
and y
in the table (Hooke’s Law) and the value of y(30).
Now
use the computational power of Excel to find the least
square error and hence the line of best fit: