Frames:

5. The Excel Trendline tool

Although it is important to understand the criterion of least square errors and the norm of the regression coefficient at the conceptual level, it is not important that we be able to calculate these by hand. Luckily modern technology tools can easily find the line of best fit and the regression data for us. Therefore the Revised National Curriculum Statement requires learners to:

  • Use available technology to calculate the regression function which best fits a given set of bivariate numerical data.
  • Use available technology to calculate the correlation co-efficient of a set of bivariate numerical data to make relevant deductions.
If you have access to a graphing calculator, you can read this TI-82, 83 Data Fitting Tutorial. In this section we investigate the use of Excel’s Trendline tool.


The table shows the price of a one-day adult admission to a local zoo since 1999.

Fit a regression line to the data and predict the ticket price in 2005 and 2010.

To learn how to use the Excel Trendline and to solve the problem, open the Excel ticket prices tool:

Years
x: yrs since 1999
y: Price (R)
1999
0
12
2000
1
15
2001
2
18
2002
3
21
2003
4
22,5
2004
5
23,85

Let’s now solve two previous problems again, to show how useful Excel’s Trendline is!


  1. We have solved this problem of the line of best fit numerically and algebraically. Now use the Best Excel worksheet to let Excel “do it for you”!

x
y
0
2
2
8
4
14
6
16
  1. In the introduction we asked the question about the algebraic relationship between x and y in the table (Hooke’s Law) and the value of y(30).

    Now solve the problem with this Excel Spring tool:


Mass (x)
Length (y)
0
5,05
3
6,72
6
8,40
9
9,15
12
10,50
15
12,85
18
13,65
30
?

A Grade 8 mathematics class collected cylindrical objects like coffee cans, juice cans, etc. They measured the circumference and the diameter and recorded their results in this table.

Object
Diameter (cm)
Circumference (cm)
Orange juice can
3
10
Coffee can
5
16,2
Coffee cup
7,65
23,9
Tomato juice can
10,8
32,3
Ashtray
13
39,7
Milk container
10
32,25
Saucer
14
43,98
Ashtray
12,5
40,15
Soup can
6,8
20,8
Candle
4,5
17,2

How can they deduce the relationship between the circumference and diameter of a circle (the definition of p)?

Find the most appropriate model for the situation. Predict the circumference of a circular object with a diameter of 20 cm.

Open the Pi Excel tool below and use Excel’s built-in Trendline facility to find the best model fitting the experimental data. There is also a discussion.


Why does a balloon rise when inflated with warm air? From our everyday experiences we realize that air expands on heating. So, the density of the air should decrease as it is heated (remember density = mass/volume).

An amateur balloonist in France by the name of Jacques Charles (1746-1823), made a series of careful investigations of the effect of changing temperature on the volume of a sample of air. Specifically, he investigated the change of volume of (a fixed amount of) air as a function of temperature at a fixed pressure. This is now known as Charles’ Law.

Use the given sample data and the Hot balloon Excel tool below to find:

  1. An algebraic model of the relationship between the Volume and Temperature of a gas.
  2. The volume at a temperature of 0 °C.
  3. The temperature when the volume is 0 cm3.
Temperature (°C)
Volume (cm3)
10,0
141,4
20,0
144,8
25,0
147,8
30,0
149,7
35,0
153,1
50,0
160,5
80,0
175,4


If an object is dropped from a height, its downward speed theoretically increases linearly over time because the object is subject to the steady pull of gravity (see unit 11). But because it is not easy to measure speed after a fixed time, a plot of speed vs. time will usually not lie exactly on a straight line. Here are observational data on the speed of a ball dropped from a certain height.

Find a model approximating the speed of the ball after a certain time. Would you be willing to use the model to predict the speed after 10 seconds? If so, what speed would you predict after 10 seconds?

You can use this Speed Excel tool:

Time (x seconds)
Speed (y m/s)
0
0
0,2
1,92
0,4
3,58
0,6
6,01
0,8
7,88

Polynomial regression


Not all data sets are linear. Some data may be better fitted by a quadratic, cubic, or even a quartic relationship.

This table gives the age and average number of births per 1000 women at that age. Fit a quadratic, cubic, quartic, … function to this set of data and decide which function best fits the data.

Polynomial regression is easy to perform in Excel. Click on the button below for a demonstration and solution:

Age
Births/1000 women
16
34
18,5
86,5
22
111,1
27
113,9
32
84,5
37
35,4
42
6,8

The technology gives us undreamed computational power – it automatically does all the calculations to find the least square error and to calculate the equation of the function of best fit for that model, and it can instantaneously do it for several different models we may choose (e.g. linear, quadratic, cubic).

The danger exists that we will simply use this technology computational power to quickly and painlessly try different models, and then choose the one with the highest numerical correlation (i.e. R2) value. This will be misuse of the technology and will often lead to ridiculous results! The technology can generate the one model after the other at the wink of an eye, but the technology cannot decide which is the most appropriate model for the situation! The model with the highest correlation is not necessarily the most appropriate model for the situation!

The onus for interpreting the technology results is on us! We must choose the appropriate model – the technology cannot do it for us!!

The table below lists winners and winning times (in seconds) for the men's and women's 200 meter run at the Olympic Games.

Year
Male Winner
Time
Female Winner
Time
1948
Mel Patton, USA
21,1
F. Blanker-Koen, Neth
24,4
1952
Andrew Stanfield, USA
20,7
Marjorie Jackson, Australia
23,7
1956
Bobby Morrow, USA
20,6
Betty Cuthbert, Australia
23,1
1960
Livio Berruti, Italy
20,5
Wilma Rudolph, USA
24,0
1964
Harry Carr, USA
20,3
Edith McGuire, USA
23,0
1968
Tommie Smith, USA
19,83
Irena Szewinska, Poland
22,5
1972
Valeri Borzov, USSR
20,00
Renate Stacher, E. Ger.
22,4
1976
Donald Quarrie, Jam.
20,23
Barbel Eckert, E. Ger.
22,37
1980
Pietro Mennes, Italy
20,19
Barbel Wockel, E. Ger.
22,03
1984
Carl Lewis, USA
19,80
Valerie Brisco-Hooks, USA
21,81
1988
Joe Deloach, USA
19,75
Flo Griffith-Joyner, USA
21,34
1992
Mike Marsh, USA
20,01
Gwen Torrance, USA
21,81
1996
Michael Johnson, USA
19,32
Marie-Jose Perec, France
22,12
2000
Konstantinos Kenteris, Greece
20,09
Marion Jones, USA
21,64

The main question for investigation: Do you think the time might come when the winning times for men and women are the same? If so, when? Motivate your answer.

You can use this Olympic Excel tool to help you investigate:

First analyse some underlying assumptions: In the Excel worksheet, fit a linear, a quadratic and a cubic Trendline to the data for men and for women. Which of these models is the best fit? Motivate.

What do you predict for the men’s and the women’s times for the 200 m run at the 2008 and the 2060 Olympics?

Return to the original question: Do you think the time might come when the winning times for men and women are the same? If so, when? Motivate your answer.