Kruskal-Wallis H

INFERENCE: one nominal and one ordinal variable

A. Kruskal-Wallis analysis of variance (H):
1. Rank order all ordinal scores in all nominal categories.
2. Calculate the sum of ranks for each nominal category.
3. Substitute into the formula:

H = 12/[N(N + 1)] × Σ (Rj² / nj) - 3(N + 1)

Where: N = total in sample
       nj = number in nominal category j
       k = number of nominal categories
       Rj = sum of ranks for nominal category j
       Σ = sum over all k nominal categories

4. Look up the p-value of H in the Chi Square table of significant probability values, using the degrees of freedom formula: d.f. = k - 1

Assumptions:

Example: Data without ties on the ordinal variable

Political Interest Scores
South   North   East   West
35      36      40     41
34      31      30     28
22      23      26     27
21      20      17     14
 8       9      11     13

Q: Can this be generalized to the whole population?
A: Variables: region of residence (nominal) and political interest scores (ordinal).
Check to be sure assumptions are met before proceeding. (Okay in this example)

South  Rank    North  Rank    East  Rank    West  Rank
35     17      36     18      40    19      41    20
34     16      31     15      30    14      28    13
22      9      23     10      26    11      27    12
21      8      20      7      17     6      14     5
 8      1       9      2      11     3      13     4
       51=RS          52=RN         53=RE         54=RW

*Note: When ranking the scores, ranking can either be from high to low or low to high.
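
Steps 1 and 2 for this example can be sketched in Python. This is only an illustration, not part of the original lesson: the names `scores`, `rank_of`, and `rank_sums` are made up for the sketch, and it assumes no two scores are tied.

```python
# Political interest scores by region, copied from the example table.
scores = {
    "South": [35, 34, 22, 21, 8],
    "North": [36, 31, 23, 20, 9],
    "East": [40, 30, 26, 17, 11],
    "West": [41, 28, 27, 14, 13],
}

# Step 1: pool every score and rank from low (rank 1) to high (rank N).
# Assumption: no ties, so each score maps to exactly one rank.
pooled = sorted(s for group in scores.values() for s in group)
rank_of = {score: position for position, score in enumerate(pooled, start=1)}

# Step 2: sum the ranks within each nominal category.
rank_sums = {
    region: sum(rank_of[s] for s in group) for region, group in scores.items()
}

print(rank_sums)  # {'South': 51, 'North': 52, 'East': 53, 'West': 54}
```

The four sums add up to 210 = 20(21)/2, which is a quick check that every rank from 1 to 20 was used exactly once.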

RS = sum of ranks for South = 51; nS = number in South = 5
RN = sum of ranks for North = 52; nN = number in North = 5
RE = sum of ranks for East = 53; nE = number in East = 5
RW = sum of ranks for West = 54; nW = number in West = 5
N = total in sample = 20

H = 12/[20(20 + 1)] × [(51)²/5 + (52)²/5 + (53)²/5 + (54)²/5] - 3(20 + 1)

H = .0285714 × [2206] - 63

**Note: Keep as much accuracy as possible for this statistic.

H = 63.028508 - 63

H = .03 (At the end, round answer to 2 decimal places.)
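
The substitution into the formula can be checked with a short Python sketch. The function name `kruskal_wallis_h` and its arguments are invented for this illustration, and it applies the formula exactly as given in this lesson, with no correction for ties.

```python
def kruskal_wallis_h(rank_sums, sizes):
    """H = 12/[N(N + 1)] * sum(Rj^2 / nj) - 3(N + 1), as in the lesson."""
    n_total = sum(sizes)  # N = total in sample
    weighted = sum(r ** 2 / n for r, n in zip(rank_sums, sizes))
    return 12 / (n_total * (n_total + 1)) * weighted - 3 * (n_total + 1)


# Rank sums and group sizes for South, North, East, West.
h = kruskal_wallis_h([51, 52, 53, 54], [5, 5, 5, 5])
print(round(h, 2))  # 0.03
```

Carrying full precision gives H of about .02857 rather than .028508, because the hand calculation uses the rounded multiplier .0285714; both round to .03.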

Look this up on Chi Square table: d.f. = k - 1 = 4 - 1 = 3

In this case, H = .03 is well below 7.815, the Chi Square value needed for p = .05 at d.f. = 3, so the p value is > .05 and one cannot generalize the difference to the whole population.
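
As a cross-check on the table lookup, the p-value can also be computed directly. The sketch below is not the lesson's method: it swaps the printed table for the closed-form upper-tail probability of a Chi Square variable, which holds for d.f. = 3 only, and the function name `chi_square_p_df3` is made up for the illustration.

```python
import math


def chi_square_p_df3(h):
    """Upper-tail probability of a Chi Square variable with exactly 3 d.f."""
    return math.erfc(math.sqrt(h / 2)) + math.sqrt(2 * h / math.pi) * math.exp(-h / 2)


# H for the political interest example, before rounding to .03.
h = 12 / (20 * 21) * 2206 - 63
p = chi_square_p_df3(h)

print(p > 0.05)  # True
```

The same function returns a value very close to .05 at H = 7.815, which matches the usual table entry for 3 degrees of freedom.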

Example: Data with ties on the ordinal variable:

Abortion Attitude Scores
                         Hi   MedHi   MedLo   Lo   Total
Experimental group #1     5     1       0      0      6
Experimental group #2     0     1       1      5      7
Control group             0     3       4      0      7
Column totals             5     5       5      5     20

Q: Can this be generalized to the whole population?
A: Put data in a table with nominal variable on left side and ordinal variable on top. Use same formula and same procedure.

Dealing with ranks in data with ties, using scores from an exam:

Score   Rank
99      1
97      2.5
97      2.5
93      4
92      5
90      7
90      7
90      7
89      9

The two scores of 97: If these were ranked #2 and #3 it wouldn't be fair since they have the same score, so split the difference and give 2.5 to each.
The three scores of 90: Same here. Instead of ranks 6, 7 and 8, split the difference and give 7 to each.
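
The "split the difference" rule can be written as a small Python sketch. The function name `average_ranks` is invented for this illustration; it ranks from high to low, as in the exam table, and gives each group of tied scores the mean of the rank positions they occupy.

```python
def average_ranks(scores):
    """Rank from high (rank 1) to low; tied scores share the mean of their positions."""
    ordered = sorted(scores, reverse=True)
    ranks = {}
    position = 1
    while position <= len(ordered):
        value = ordered[position - 1]
        count = ordered.count(value)
        # Tied scores occupy positions position .. position + count - 1,
        # and each one receives the mean of those positions.
        ranks[value] = (2 * position + count - 1) / 2
        position += count
    return [ranks[s] for s in scores]


exam = [99, 97, 97, 93, 92, 90, 90, 90, 89]
print(average_ranks(exam))
# [1.0, 2.5, 2.5, 4.0, 5.0, 7.0, 7.0, 7.0, 9.0]
```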

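In the abortion attitude example above, everyone in the same column has the same score, so each column is one group of ties. Take the column totals and make up a table: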

Abortion Attitude Scale
                 Hi     MedHi    MedLo    Lo
Column totals    5      5        5        5
Rank range       1-5    6-10     11-15    16-20
Median rank      3      8        13       18

R exp #1 = 5(3) + 1(8) = 23; n exp #1 = 6

R exp #2 = 1(8) + 1(13) + 5(18) = 111; n exp #2 = 7

R control = 3(8) + 4(13) = 76; n control = 7

H = 12/[20(20 + 1)] × [(23)²/6 + (111)²/7 + (76)²/7] - 3(20 + 1)

H = .0285714 × [2673.45] - 63

H = 76.384209 - 63 = 13.38

Look this value up on the Chi Square table (d.f. = 3 - 1 = 2): H = 13.38 is greater than 9.21, the value needed for p = .01, so p < .01 and one can generalize to the entire population.
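
The whole tied-data example can be reproduced from the frequency table with a Python sketch. The variable names are invented for the illustration, and it follows this lesson's procedure (median rank per column, same H formula, no tie correction). The final p-value uses the closed form exp(-H/2), which is valid for d.f. = 2 only and stands in for the table lookup.

```python
import math

# Frequency table from the example: rows are groups, columns are
# Hi, MedHi, MedLo, Lo on the attitude scale.
counts = {
    "Experimental group #1": [5, 1, 0, 0],
    "Experimental group #2": [0, 1, 1, 5],
    "Control group": [0, 3, 4, 0],
}

# Each column is one block of tied scores; every case in the block
# gets the middle rank of the block's rank range.
column_totals = [sum(col) for col in zip(*counts.values())]
mid_ranks = []
first = 1
for total in column_totals:
    last = first + total - 1
    mid_ranks.append((first + last) / 2)
    first = last + 1

# Rank sum and size for each nominal category.
rank_sums = {g: sum(c * r for c, r in zip(row, mid_ranks)) for g, row in counts.items()}
sizes = {g: sum(row) for g, row in counts.items()}
n_total = sum(sizes.values())

h = 12 / (n_total * (n_total + 1)) * sum(
    rank_sums[g] ** 2 / sizes[g] for g in counts
) - 3 * (n_total + 1)

# Closed form for the Chi Square upper tail, valid for 2 d.f. only.
p = math.exp(-h / 2)

print(mid_ranks)         # [3.0, 8.0, 13.0, 18.0]
print(round(h, 2))       # 13.38
print(p < 0.01)          # True
```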
