
Prediction

INTERVAL DATA -- Bivariate Distribution

A. A review of predictive techniques:
                   Income (Y)
Education (X)      Hi      Lo
Hi                 14       2
Lo                  6      11

Here one can easily see a positive relationship:

hi education--hi income

low education--low income

It is easy to predict using this table: if a person has a low income, one would predict a low level of education.
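The table lookup described here can be sketched in Python. This is only an illustration, not part of the original lesson: the names `counts` and `predict_education` are hypothetical, and the rule used (pick the education level with the largest count in the given income column) is an assumed way of formalizing "predicting from the table."

```python
# Cell counts from the table, keyed by (education, income).
counts = {
    ("hi", "hi"): 14,
    ("hi", "lo"): 2,
    ("lo", "hi"): 6,
    ("lo", "lo"): 11,
}

def predict_education(income):
    """Return the education level with the most cases at this income level.

    Assumed rule: choose the larger cell within the income column.
    """
    return max(("hi", "lo"), key=lambda education: counts[(education, income)])

print(predict_education("lo"))  # lo  (11 cases versus 2)
print(predict_education("hi"))  # hi  (14 cases versus 6)
```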

Data in different form:
Person     Education X     Income Y
A               10             9
B                8             8
C                7             6
PREDICT          6             5
D                5             4
E                2             2

Using this information: If a person has education 8, predict an income of 8. If a person has income 5, although there is no income of 5, still predict that the person's education would be about 6.

A scatter diagram can be made to represent the relationship between the two variables: X = Education, Y = Income.


A straight line approximately describes the relationship between X and Y. One can use it to predict: if a person has an income of 5, one would predict an education of about 6.

B. Equation of a Line: Y = a + bX

Where: X and Y are the variables used for prediction

a = the Y-intercept (the value of Y where the line crosses the Y axis, at X = 0)

b = the slope of the line


To get the slope, take any point not on the line and measure its distance to the line in two directions. The vertical distance is "P" and the horizontal distance is "Q." The slope equals P / Q. In this example,

a = 2 and b = 2 / 2 = 1, so the equation of this line is:

Y = 2 + 1X
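As a small sketch of the line equation in Python: the two points below, (0, 2) and (2, 4), are assumed for illustration because they lie on Y = 2 + 1X; they are not read from the lesson's diagram. The function names `slope` and `predict_y` are likewise hypothetical.

```python
def slope(p1, p2):
    """Slope between two points: vertical change P over horizontal change Q."""
    (x1, y1), (x2, y2) = p1, p2
    return (y2 - y1) / (x2 - x1)

def predict_y(x, a, b):
    """Evaluate the line Y = a + bX at a given X."""
    return a + b * x

a = 2.0                            # Y-intercept from the example
b = slope((0.0, 2.0), (2.0, 4.0))  # assumed points: P = 2, Q = 2
print(b)                     # 1.0
print(predict_y(3.0, a, b))  # 5.0
```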

C. Predicting Y from a knowledge of X: $Y = a_{yx} + b_{yx}X$

Where:

$$b_{yx} = \frac{N\sum XY - (\sum X)(\sum Y)}{N\sum X^2 - (\sum X)^2}$$

$$a_{yx} = \frac{\sum Y - b_{yx}\sum X}{N}$$
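The two formulas for predicting Y from X can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function name `fit_y_on_x` is hypothetical, and the data are the five education/income pairs (persons A to E) from the table earlier in the lesson, with the PREDICT row left out because it is not an observation.

```python
def fit_y_on_x(xs, ys):
    """Return (a_yx, b_yx) for the line Y = a_yx + b_yx * X."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    # Slope: [N * sum(XY) - sum(X) * sum(Y)] / [N * sum(X^2) - (sum(X))^2]
    b_yx = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    # Intercept: [sum(Y) - b_yx * sum(X)] / N
    a_yx = (sum_y - b_yx * sum_x) / n
    return a_yx, b_yx

education = [10, 8, 7, 5, 2]  # X for persons A-E
income = [9, 8, 6, 4, 2]      # Y for persons A-E

a_yx, b_yx = fit_y_on_x(education, income)
predicted_income = a_yx + b_yx * 8
print(round(b_yx, 2), round(a_yx, 2), round(predicted_income, 2))
```

For an education of 8, the fitted line predicts an income a little above 7, close to but not identical with the value of 8 read directly from the table.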

D. Predicting X from a knowledge of Y: $X = a_{xy} + b_{xy}Y$

Where:

$$b_{xy} = \frac{N\sum XY - (\sum X)(\sum Y)}{N\sum Y^2 - (\sum Y)^2}$$

$$a_{xy} = \frac{\sum X - b_{xy}\sum Y}{N}$$
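Predicting X from Y uses the same numerator for the slope, but the denominator is built from the Y values. A minimal Python sketch, assuming the hypothetical function name `fit_x_on_y` and the five education/income pairs (persons A to E) from the earlier table:

```python
def fit_x_on_y(xs, ys):
    """Return (a_xy, b_xy) for the line X = a_xy + b_xy * Y."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_y2 = sum(y * y for y in ys)
    # Slope: [N * sum(XY) - sum(X) * sum(Y)] / [N * sum(Y^2) - (sum(Y))^2]
    b_xy = (n * sum_xy - sum_x * sum_y) / (n * sum_y2 - sum_y ** 2)
    # Intercept: [sum(X) - b_xy * sum(Y)] / N
    a_xy = (sum_x - b_xy * sum_y) / n
    return a_xy, b_xy

education = [10, 8, 7, 5, 2]  # X for persons A-E
income = [9, 8, 6, 4, 2]      # Y for persons A-E

a_xy, b_xy = fit_x_on_y(education, income)
predicted_education = a_xy + b_xy * 5
print(round(b_xy, 2), round(a_xy, 2), round(predicted_education, 2))
```

For an income of 5, this gives an education of roughly 5.6, in line with the lesson's by-eye prediction of "about 6."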

Assumptions:

interval data

linear relationship

homoscedasticity

There is a linear relationship as long as the scatter of the data has a roughly oblong shape. A circle, an S shape, etc. is not linear. Homoscedasticity means similar variance in the columns and rows of the scatter. In practice, data that fit a linear relationship are usually taken to be homoscedastic as well.

Example: Q: Predict the income of a person who is 9 feet tall.

Before using the formulas, try predicting by observation; one would expect an answer of about $9.50.
Person     Height X     Income Y
A             10           10
B              8            9
C              6            7
D              3            2

A: X = height, Y = income

Person     Height X     X^2     Income Y     XY
A             10        100        10        100
B              8         64         9         72
C              6         36         7         42
D              3          9         2          6
N = 4         27        209        28        220

$$b_{yx} = \frac{4(220) - 27(28)}{4(209) - (27)^2} = \frac{124}{107} \approx 1.16$$

$$a_{yx} = \frac{28 - 1.16(27)}{4} = -.83$$

Y = -.83 + 1.16X

This is the straight line which describes the data. Now use the equation to predict Y for X = 9:
Y = -.83 + 1.16(9)
Y = -.83 + 10.44
Y = $9.61
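The arithmetic of this example can be checked with a short Python sketch. The variable names are hypothetical; the slope and intercept are rounded to two decimals at each step, as in the hand calculation, which is an assumption about how the figures 1.16 and -.83 were obtained.

```python
heights = [10, 8, 6, 3]  # X for persons A-D
incomes = [10, 9, 7, 2]  # Y for persons A-D

n = len(heights)
sum_x = sum(heights)                                  # 27
sum_y = sum(incomes)                                  # 28
sum_x2 = sum(x * x for x in heights)                  # 209
sum_xy = sum(x * y for x, y in zip(heights, incomes)) # 220

# Slope and intercept, rounded to two decimals as in the hand calculation.
b_yx = round((n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2), 2)
a_yx = round((sum_y - b_yx * sum_x) / n, 2)

# Predicted income for a height of 9.
y_hat = round(a_yx + b_yx * 9, 2)
print(b_yx, a_yx, y_hat)  # 1.16 -0.83 9.61
```

The result, 9.61, is close to the by-eye estimate of about $9.50 made before using the formulas.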
