5.
The Excel Trendline tool
Although
it is important to understand the criterion of least square errors
and the norm of the regression coefficient at the conceptual
level, it is not important that we be able to calculate these by
hand. Luckily modern technology tools can easily find the line of best fit and the regression data for us. Therefore the Revised National Curriculum Statement requires learners to:
- Use available technology to calculate the regression function which best fits a given set of bivariate numerical data.
- Use available technology to calculate the correlation co-efficient of a set of bivariate numerical data to make relevant deductions.
If you have access to a graphing calculator, you can read this TI-82, 83 Data Fitting Tutorial. In this section we investigate the use of Excel’s Trendline tool.
The
table shows the price of a one-day adult admission to
a local zoo since 1999.
Fit a regression line
to the data and predict the ticket price in 2005 and
2010.
To
learn how to use the Excel Trendline and to solve the
problem, open the Excel ticket prices tool:

|
Years |
x:
yrs since 1999 |
y:
Price (R) |
1999 |
0 |
12 |
2000 |
1 |
15 |
2001 |
2 |
18 |
2002 |
3 |
21 |
2003 |
4 |
22,5 |
2004 |
5 |
23,85 |
|
|
|
Let’s now solve
two previous problems again, to show how useful Excel’s Trendline
is!
- We
have solved this problem of the line of best fit numerically
and algebraically. Now use the Best Excel worksheet
to let Excel “do it for you”!
|
|
- In
the introduction we asked the question about the algebraic
relationship between x
and y
in the table (Hooke’s Law) and the value of y(30).
Now solve the problem with this Excel Spring tool:

|
 |
Mass
(x) |
Length
(y) |
0 |
5,05 |
3 |
6,72 |
6 |
8,40 |
9 |
9,15 |
12 |
10,50 |
15 |
12,85 |
18 |
13,65 |
30 |
? |
|
|
|
A Grade
8 mathematics class collected cylindrical objects like coffee
cans, juice cans, etc. They measured the circumference and
the diameter and recorded their results in this table.
Object |
Diameter
(cm) |
Circumference
(cm) |
Orange
juice can |
3 |
10 |
Coffee
can |
5 |
16,2 |
Coffee
cup |
7,65 |
23,9 |
Tomato
juice can |
10,8 |
32,3 |
Ashtray |
13 |
39,7 |
Milk
container |
10 |
32,25 |
Saucer |
14 |
43,98 |
Ashtray |
12,5 |
40,15 |
Soup
can |
6,8 |
20,8 |
Candle |
4,5 |
17,2 |
|
How
can they deduce the relationship between the circumference
and diameter of a circle (the definition of
p)?
Find the most appropriate model for the situation.
Predict the circumference of a circular object with
a diameter of 20 cm.
Open
the Pi Excel tool below and use Excel’s built-in
Trendline facility to find the best model fitting
the experimental data. There is also a discussion.

|
|
|
Why does
a balloon rise when inflated with warm air? From our everyday
experiences we realize that air expands on heating. So,
the density of the air should decrease as it is heated (remember
density = mass/volume).
An
amateur balloonist in France by the name of Jacques
Charles (1746-1823), made a series of careful investigations
of the effect of changing temperature on the volume
of a sample of air. Specifically, he investigated the
change of volume of (a fixed amount of) air as a function
of temperature at a fixed pressure. This is now known
as Charles’ Law.
Use
the given sample data and the Hot balloon Excel tool
below to find:
-
An algebraic model of the relationship between the
Volume and Temperature of a gas.
-
The volume at a temperature of 0 °C.
-
The temperature when the volume is 0 cm3.
|
Temperature
(°C) |
Volume
(cm3) |
10,0 |
141,4 |
20,0 |
144,8 |
25,0 |
147,8 |
30,0 |
149,7 |
35,0 |
153,1 |
50,0 |
160,5 |
80,0 |
175,4 |

|
|
|
If an
object is dropped from a height, its downward speed theoretically
increases linearly over time because the object is subject
to the steady pull of gravity (see unit 11). But because it
is not easy to measure speed after a fixed time, a plot of
speed vs. time will usually not lie exactly on a straight
line. Here are observational data on the speed of a ball dropped
from a certain height.
Find
a model approximating the speed of the ball
after a certain time. Would you be willing to use the
model to predict the speed after 10 seconds? If so,
what speed would you predict after 10 seconds?
You
can use this Speed Excel tool:

|
Time
(x
seconds) |
Speed
(y
m/s) |
0 |
0 |
0,2 |
1,92 |
0,4 |
3,58 |
0,6 |
6,01 |
0,8 |
7,88 |
|
|
|
Polynomial
regression
Not
all data sets are linear. Some data may
be better fitted by a quadratic, cubic, or even a
quartic relationship.
This
table gives the age and average number of births per
1000 women at that age. Fit a quadratic, cubic, quartic,
… function to this set of data and decide which
function best fits the data.
Polynomial
regression is easy to perform in Excel. Click on the
button below for a demonstration and solution:

|
Age |
Births/1000
women |
16 |
34 |
18,5 |
86,5 |
22 |
111,1 |
27 |
113,9 |
32 |
84,5 |
37 |
35,4 |
42 |
6,8 |
|
|
|
The technology gives
us undreamed computational power – it automatically does
all the calculations to find the least square error and to calculate the
equation of the function of best fit for that model, and it can instantaneously
do it for several different models we may choose (e.g. linear, quadratic,
cubic).
The danger exists
that we will simply use this technology computational power to quickly
and painlessly try different models, and then choose the one with the
highest numerical correlation (i.e. R2) value. This will
be misuse of the technology and will often lead to ridiculous results!
The technology can generate the one model after the other at the
wink of an eye, but the technology cannot decide which is the
most appropriate model for the situation! The model with the highest correlation
is not necessarily the most appropriate model for the situation!
The onus for interpreting
the technology results is on us! We must choose
the appropriate model – the technology cannot do it for us!!
The table
below lists winners and winning times (in seconds) for the men's and women's
200 meter run at the Olympic Games.
Year |
Male
Winner |
Time |
Female
Winner |
Time |
1948 |
Mel
Patton, USA |
21,1 |
F.
Blanker-Koen, Neth |
24,4 |
1952 |
Andrew
Stanfield, USA |
20,7 |
Marjorie
Jackson, Australia |
23,7 |
1956 |
Bobby
Morrow, USA |
20,6 |
Betty
Cuthbert, Australia |
23,1 |
1960 |
Livio
Berruti, Italy |
20,5 |
Wilma
Rudolph, USA |
24,0 |
1964 |
Harry
Carr, USA |
20,3 |
Edith
McGuire, USA |
23,0 |
1968 |
Tommie
Smith, USA |
19,83 |
Irena
Szewinska, Poland |
22,5 |
1972 |
Valeri
Borzov, USSR |
20,00 |
Renate
Stacher, E. Ger. |
22,4 |
1976 |
Donald
Quarrie, Jam. |
20,23 |
Barbel
Eckert, E. Ger. |
22,37 |
1980 |
Pietro
Mennes, Italy |
20,19 |
Barbel
Wockel, E. Ger. |
22,03 |
1984 |
Carl
Lewis, USA |
19,80 |
Valerie
Brisco-Hooks, USA |
21,81 |
1988 |
Joe
Deloach, USA |
19,75 |
Flo
Griffith-Joyner, USA |
21,34 |
1992 |
Mike
Marsh, USA |
20,01 |
Gwen
Torrance, USA |
21,81 |
1996 |
Michael
Johnson, USA |
19,32 |
Marie-Jose
Perec, France |
22,12 |
2000 |
Konstantinos
Kenteris, Greece |
20,09 |
Marion
Jones, USA |
21,64 |
The
main question for investigation: Do you think the time
might come when the winning times for men and women
are the same? If so, when? Motivate your answer.
You
can use this Olympic Excel tool to help you investigate:
First
analyse some underlying assumptions: In the Excel worksheet,
fit a linear, a quadratic and a cubic Trendline to the data for men and for women.
Which of these models is the best fit? Motivate.
|
 |
What do you predict for the men’s and the women’s
times for the 200 m run at the 2008 and the 2060 Olympics?
Return
to the original question: Do you think the time might come
when the winning times for men and women are the same? If
so, when? Motivate your answer.
|
|
|