Unit 2: MANOVA

MULTIVARIATE NORMAL DISTRIBUTION
SNU Psychometrics Lab
2014-09-15
Normal Distribution
$$ f(X) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left\{ -\frac{1}{2} \cdot \frac{(X-\mu)^2}{\sigma^2} \right\} $$
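Data Structure and Distribution
$$ \text{data}\quad \underset{n \times p}{X} = \begin{bmatrix} x_{11} & x_{12} & \cdots & x_{1p} \\ x_{21} & x_{22} & \cdots & x_{2p} \\ \vdots & \vdots & \ddots & \vdots \\ x_{n1} & x_{n2} & \cdots & x_{np} \end{bmatrix} $$
$$ X \sim N(\mu, I \otimes \Sigma) $$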
Data Structure and Distribution
$$ \mu = \begin{bmatrix} \mu_1 \\ \mu_2 \\ \vdots \\ \mu_p \end{bmatrix} $$
$$ X \sim N(\mu, I \otimes \Sigma) $$
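Covariance
$$ \underset{p \times p}{\Sigma} = \begin{bmatrix} \sigma_{11} & \sigma_{12} & \cdots & \sigma_{1p} \\ \sigma_{21} & \sigma_{22} & \cdots & \sigma_{2p} \\ \vdots & \vdots & \ddots & \vdots \\ \sigma_{p1} & \sigma_{p2} & \cdots & \sigma_{pp} \end{bmatrix} = \begin{bmatrix} \sigma_1^2 & \rho_{12}\sigma_1\sigma_2 & \cdots & \rho_{1p}\sigma_1\sigma_p \\ \rho_{21}\sigma_2\sigma_1 & \sigma_2^2 & \cdots & \rho_{2p}\sigma_2\sigma_p \\ \vdots & \vdots & \ddots & \vdots \\ \rho_{p1}\sigma_p\sigma_1 & \rho_{p2}\sigma_p\sigma_2 & \cdots & \sigma_p^2 \end{bmatrix} $$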
Bivariate Normal Distribution
$$ f(X_1, X_2) = \frac{1}{2\pi\sigma_1\sigma_2\sqrt{1-\rho^2}} \exp\left\{ \frac{-1}{2(1-\rho^2)} \left[ \frac{(X_1-\mu_1)^2}{\sigma_1^2} + \frac{(X_2-\mu_2)^2}{\sigma_2^2} - 2\rho\,\frac{(X_1-\mu_1)(X_2-\mu_2)}{\sigma_1\sigma_2} \right] \right\} $$
$$ f(X_1, X_2) = \frac{1}{2\pi\sigma_1\sigma_2\sqrt{1-\rho^2}} \exp\left\{ \frac{-1}{2(1-\rho^2)}\, C \right\} $$
$$ C = \frac{(X_1-\mu_1)^2}{\sigma_1^2} + \frac{(X_2-\mu_2)^2}{\sigma_2^2} - 2\rho\,\frac{(X_1-\mu_1)(X_2-\mu_2)}{\sigma_1\sigma_2} $$
Density Function
[Figure: bivariate normal density surface for μ1 = 0, σ1 = 1, μ2 = 0, σ2 = 2, ρ = 0]
Isodensity Contour
[Figure: isodensity contours for μ1 = 0, σ1 = 1, μ2 = 0, σ2 = 2, ρ = 0]
Density Function
[Figure: bivariate normal density surface for μ1 = 0, σ1 = 1, μ2 = 0, σ2 = 2, ρ = 0.7]
Isodensity Contour
[Figure: isodensity contours for μ1 = 0, σ1 = 1, μ2 = 0, σ2 = 2, ρ = 0.7]
Marginal Distribution
[Figure: marginal distributions]
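The two parameter settings used for the density and contour figures can be checked numerically. The sketch below is only an illustration: the function name `bivariate_normal_pdf` is a hypothetical helper, not part of the lecture. It evaluates the bivariate normal density at the mean for ρ = 0 and ρ = 0.7.

```python
import math

def bivariate_normal_pdf(x1, x2, mu1, mu2, sigma1, sigma2, rho):
    """Bivariate normal density, written term by term as in the slide formula."""
    z1 = (x1 - mu1) / sigma1
    z2 = (x2 - mu2) / sigma2
    # C is the quadratic form inside the exponent.
    c = z1 * z1 + z2 * z2 - 2.0 * rho * z1 * z2
    norm = 2.0 * math.pi * sigma1 * sigma2 * math.sqrt(1.0 - rho * rho)
    return math.exp(-c / (2.0 * (1.0 - rho * rho))) / norm

# Density at the mean for the two settings shown in the figures.
peak_uncorrelated = bivariate_normal_pdf(0, 0, 0, 0, 1, 2, 0.0)
peak_correlated = bivariate_normal_pdf(0, 0, 0, 0, 1, 2, 0.7)
print(peak_uncorrelated, peak_correlated)
```

With the same variances, the correlated case has a higher peak because the factor sqrt(1 - ρ²) in the denominator shrinks as |ρ| grows.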
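Generalization to p variables
$$ \Sigma = \begin{bmatrix} \sigma_1^2 & \rho\sigma_1\sigma_2 \\ \rho\sigma_1\sigma_2 & \sigma_2^2 \end{bmatrix} $$
$$ |\Sigma| = \sigma_1^2\sigma_2^2(1-\rho^2) $$
$$ \Sigma^{-1} = \frac{1}{\sigma_1^2\sigma_2^2(1-\rho^2)} \begin{bmatrix} \sigma_2^2 & -\rho\sigma_1\sigma_2 \\ -\rho\sigma_1\sigma_2 & \sigma_1^2 \end{bmatrix} = \frac{1}{1-\rho^2} \begin{bmatrix} \dfrac{1}{\sigma_1^2} & \dfrac{-\rho}{\sigma_1\sigma_2} \\ \dfrac{-\rho}{\sigma_1\sigma_2} & \dfrac{1}{\sigma_2^2} \end{bmatrix} $$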
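$$ \chi^2 = [X_1-\mu_1,\; X_2-\mu_2]\; \Sigma^{-1} \begin{bmatrix} X_1-\mu_1 \\ X_2-\mu_2 \end{bmatrix} $$
$$ x = \begin{bmatrix} X_1-\mu_1 \\ X_2-\mu_2 \end{bmatrix} $$
$$ \chi^2 = x'\Sigma^{-1}x $$
$$ f(X_1, X_2) = (2\pi)^{-1}\, |\Sigma|^{-1/2} \exp\left(-\frac{\chi^2}{2}\right) $$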
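Multivariate Normal
$$ \Sigma = \begin{bmatrix} \sigma_1^2 & \rho_{12}\sigma_1\sigma_2 & \cdots & \rho_{1p}\sigma_1\sigma_p \\ \rho_{21}\sigma_2\sigma_1 & \sigma_2^2 & \cdots & \rho_{2p}\sigma_2\sigma_p \\ \vdots & \vdots & \ddots & \vdots \\ \rho_{p1}\sigma_p\sigma_1 & \rho_{p2}\sigma_p\sigma_2 & \cdots & \sigma_p^2 \end{bmatrix} $$
$$ \chi^2 = x'\Sigma^{-1}x $$
$$ x' = [X_1-\mu_1,\; X_2-\mu_2,\; \dots,\; X_p-\mu_p] $$
$$ f(X_1, X_2, \dots, X_p) = (2\pi)^{-p/2}\, |\Sigma|^{-1/2} \exp\left(-\frac{\chi^2}{2}\right) $$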
Uncorrelated p variables
$$ \Sigma = \begin{bmatrix} \sigma_1^2 & 0 & \cdots & 0 \\ 0 & \sigma_2^2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \sigma_p^2 \end{bmatrix} $$
$$ x'\Sigma^{-1}x = \sum_{i=1}^{p} \frac{(X_i-\mu_i)^2}{\sigma_i^2} \sim \chi^2(df = p) $$
Ellipse
• $\dfrac{x^2}{a^2} + \dfrac{y^2}{b^2} = 1$ (equation of an ellipse: standard form)
• $\dfrac{(x-h)^2}{a^2} + \dfrac{(y-k)^2}{b^2} = 1$ (translation)
• $Ax^2 + 2Bxy + Cy^2 = 1$ (rotation)
Multivariate Analysis
Lecture 2: MANOVA, Discriminant Analysis, Canonical Correlation
김청택
Department of Psychology, Seoul National University
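Research Example
• Hypothesis: the internet-addicted group will show higher withdrawal and tolerance tendencies toward the internet than the normal group.
• IV: internet addiction status (self-diagnosed)
• DV: withdrawal score, tolerance score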
Data
[Figure: scatter plot of tolerance score (vertical axis) against withdrawal score (horizontal axis); both axes run from 30 to 90]
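Null Hypothesis of MANOVA
[Figure: group centroids μ(g1), μ(g2), μ(g3) plotted on the withdrawal and tolerance axes]
H0:
$$ \mu_{\text{tolerance},1} = \mu_{\text{tolerance},2} = \mu_{\text{tolerance},3} $$
$$ \mu_{\text{withdrawal},1} = \mu_{\text{withdrawal},2} = \mu_{\text{withdrawal},3} $$
$$ \boldsymbol{\mu}_1 = \boldsymbol{\mu}_2 = \boldsymbol{\mu}_3, \qquad \boldsymbol{\mu} = \begin{bmatrix} \mu_{\text{tolerance}} \\ \mu_{\text{withdrawal}} \end{bmatrix} $$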
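Definitions of Terms
• $Y_{ij}$: the observed value of the j-th person in the i-th group
• Error term of the full model:
– First dependent variable: $e_{1(ij)} = Y_{1ij} - \bar{Y}_{1i.}$
– Second dependent variable: $e_{2(ij)} = Y_{2ij} - \bar{Y}_{2i.}$
• Error term of the reduced model:
– First dependent variable: $e_{1(ij)} = Y_{1ij} - \bar{Y}_{1..}$
– Second dependent variable: $e_{2(ij)} = Y_{2ij} - \bar{Y}_{2..}$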
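$$ E = e'e(F) = \begin{bmatrix} \sum_i\sum_j e_1^2(F) & \sum_i\sum_j e_1e_2(F) \\ \sum_i\sum_j e_1e_2(F) & \sum_i\sum_j e_2^2(F) \end{bmatrix} $$
$$ T = e'e(R) = \begin{bmatrix} \sum_i\sum_j e_1^2(R) & \sum_i\sum_j e_1e_2(R) \\ \sum_i\sum_j e_1e_2(R) & \sum_i\sum_j e_2^2(R) \end{bmatrix} $$
$$ H = T - E = \begin{bmatrix} \sum_i\sum_j (\bar{Y}_{1i.} - \bar{Y}_{1..})^2 & \sum_i\sum_j (\bar{Y}_{1i.} - \bar{Y}_{1..})(\bar{Y}_{2i.} - \bar{Y}_{2..}) \\ \sum_i\sum_j (\bar{Y}_{1i.} - \bar{Y}_{1..})(\bar{Y}_{2i.} - \bar{Y}_{2..}) & \sum_i\sum_j (\bar{Y}_{2i.} - \bar{Y}_{2..})^2 \end{bmatrix} $$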
Test Statistics
$E^{-1}H$ and $T^{-1}H$ can be used as test statistics for hypothesis testing.
In the univariate case, H = SSB and E = SSE, so
$$ F = \frac{MSB}{MSE} = \frac{SSB/df_B}{SSE/df_E} = \frac{SSB}{SSE} \cdot \frac{df_E}{df_B} $$
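As a small numeric check of the univariate identity F = (SSB/SSE)·(df_E/df_B), the sketch below plugs in the sums of squares and degrees of freedom that the SPSS output in this lecture reports for f4 and f7. The helper name `f_ratio` is only illustrative.

```python
def f_ratio(ssb, sse, df_b, df_e):
    """Univariate F written as (SSB / SSE) * (df_E / df_B)."""
    return (ssb / sse) * (df_e / df_b)

# Values taken from the SPSS "Tests of Between-Subjects Effects" table:
# diag (between) and Error (within) rows, df = 2 and 997.
f_f4 = f_ratio(22327.9, 98824.4, 2, 997)   # withdrawal (f4)
f_f7 = f_ratio(27121.5, 86073.6, 2, 997)   # tolerance (f7)
print(round(f_f4, 1), round(f_f7, 1))
```

The results should agree with the F values in that table (112.6 and 157.1) up to rounding of the inputs.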
Data
• Variable f4: withdrawal
• Variable f7: tolerance
• Diag: self-diagnosed as internet-addicted / self-diagnosed as non-addicted
Data
[Figure: scatter plot of the data; both axes run from 30 to 90]
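$$ E = \begin{bmatrix} 98824 & 55800 \\ 55800 & 86074 \end{bmatrix} \qquad T = \begin{bmatrix} 121152 & 80395 \\ 80395 & 113195 \end{bmatrix} $$
$$ H = T - E = \begin{bmatrix} 22328 & 24595 \\ 24595 & 27121 \end{bmatrix} $$
$$ E^{-1}H = \begin{bmatrix} 0.10189 & 0.11193 \\ 0.21969 & 0.24253 \end{bmatrix} $$
$$ \lambda\ (\text{eigenvalues}) = \begin{bmatrix} 0.344069 & 0 \\ 0 & 0.000354 \end{bmatrix} $$
$$ v\ (\text{eigenvectors}) = \begin{bmatrix} -0.419540 & -0.740661 \\ -0.907737 & 0.671879 \end{bmatrix} $$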
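The matrix arithmetic on this slide can be reproduced with a few lines of plain Python. This is a minimal sketch that assumes the rounded E and H entries shown above; the helper names are hypothetical, and the 2×2 eigenvalues come from the characteristic polynomial rather than a linear-algebra library.

```python
import math

# Error and hypothesis SSCP matrices as printed on the slide (rounded).
E = [[98824.0, 55800.0], [55800.0, 86074.0]]
H = [[22328.0, 24595.0], [24595.0, 27121.0]]

def inverse_2x2(m):
    """Closed-form inverse of a 2x2 matrix."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def matmul_2x2(x, y):
    """Product of two 2x2 matrices."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def eigenvalues_2x2(m):
    """Roots of lambda^2 - trace*lambda + det = 0, largest first."""
    (a, b), (c, d) = m
    trace, det = a + d, a * d - b * c
    root = math.sqrt(trace * trace / 4.0 - det)
    return trace / 2.0 + root, trace / 2.0 - root

A = matmul_2x2(inverse_2x2(E), H)      # E^{-1} H
lambda1, lambda2 = eigenvalues_2x2(A)
print(A, lambda1, lambda2)
```

Because the inputs are rounded to integers, the smaller eigenvalue agrees with the slide only to about two significant digits.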
Canonical Variable
• When there are p variables, there exist p mutually orthogonal (independent) canonical variables.
• Call these variables V1, V2, V3, …, Vp.
Eigenvalues and Eigenvectors
• Let the eigenvalues of $E^{-1}H$ be λ1, λ2, λ3, λ4, …, λp, where λ1 > λ2 > λ3 > λ4 > … > λp.
• Each eigenvalue λi of $E^{-1}H$ equals SSB/SSE for the corresponding Vi.
• The eigenvector corresponding to each eigenvalue gives the canonical coefficients (c1, c2, …, cp).
New Variate
$$ V_1 = -0.419540\,X_1 - 0.907737\,X_2 $$
$$ V_2 = -0.740661\,X_1 + 0.671879\,X_2 $$
[Figure: scatter plot of the data on the new variates; one axis runs from -110 to -40, the other from -30 to 20]
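New Variate
$$ E = \begin{bmatrix} 130819 & 0 \\ 0 & 37532 \end{bmatrix} \qquad T = \begin{bmatrix} 175830 & 0 \\ 0 & 37545 \end{bmatrix} $$
$$ H = T - E = \begin{bmatrix} 45011 & 0 \\ 0 & 13 \end{bmatrix} $$
$$ E^{-1}H = \begin{bmatrix} 0.344070 & 0 \\ 0 & 0.000354 \end{bmatrix} $$
$$ \frac{1}{1+\lambda_1} = \frac{1}{1+0.344070} = 0.74401 \qquad \frac{SSE_1}{SST_1} = \frac{130819}{175830} = 0.74401 $$
$$ \frac{1}{1+\lambda_2} = \frac{1}{1+0.000354} = 0.99965 \qquad \frac{SSE_2}{SST_2} = \frac{37532}{37545} = 0.99965 $$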
Test Statistics
• The test statistic for each Vi can then be expressed as follows.
$$ \frac{SSE}{SST} = \frac{SSE}{SSE + SSB} = \frac{SSE/SSE}{SSE/SSE + SSB/SSE} = \frac{1}{1+\lambda_i} $$
$$ \frac{SSB}{SST} = \frac{SSB}{SSE + SSB} = \frac{SSB/SSE}{SSE/SSE + SSB/SSE} = \frac{\lambda_i}{1+\lambda_i} $$
Test Statistics
• Wilks' Lambda
$$ \Lambda = \frac{|E|}{|H+E|} = \prod_{i=1}^{p} \frac{1}{1+\lambda_i} $$
• Pillai-Bartlett Trace
$$ V = \mathrm{tr}\left[(H+E)^{-1}H\right] = \sum_{i=1}^{p} \frac{\lambda_i}{1+\lambda_i} $$
Test Statistics
• Roy's Greatest Characteristic Root
$$ GCR = \frac{\lambda_1}{1+\lambda_1} $$
• Hotelling-Lawley Trace
$$ T^2 = \mathrm{tr}\left[E^{-1}H\right] = \sum_{i=1}^{p} \lambda_i $$
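All four multivariate statistics are functions of the eigenvalues of E⁻¹H. As an illustrative sketch (variable names are my own), the code below evaluates them for the two eigenvalues reported earlier in this lecture, 0.344069 and 0.000354.

```python
# Eigenvalues of E^{-1}H from the worked example (largest first).
eigenvalues = [0.344069, 0.000354]

# Wilks' Lambda: product of 1 / (1 + lambda_i).
wilks = 1.0
for lam in eigenvalues:
    wilks *= 1.0 / (1.0 + lam)

# Pillai-Bartlett trace: sum of lambda_i / (1 + lambda_i).
pillai = sum(lam / (1.0 + lam) for lam in eigenvalues)

# Hotelling-Lawley trace: sum of lambda_i.
hotelling = sum(eigenvalues)

# Roy's greatest characteristic root in the lambda_1 / (1 + lambda_1) form.
roy_gcr = max(eigenvalues) / (1.0 + max(eigenvalues))

print(round(wilks, 3), round(pillai, 3), round(hotelling, 3), round(roy_gcr, 3))
```

The first three values line up with the diag rows of the SPSS Multivariate Tests table (0.744, 0.256, 0.344). The "Roy's Largest Root" row in that table shows 0.344, which matches λ1 itself rather than λ1/(1+λ1).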
Multivariate Tests (c)
Effect     Statistic           Value   F          Hypothesis df  Error df  Sig.
Intercept  Pillai's Trace      0.966   14350.011  2              996       0.0000
           Wilks' Lambda       0.034   14350.011  2              996       0.0000
           Hotelling's Trace   28.815  14350.011  2              996       0.0000
           Roy's Largest Root  28.815  14350.011  2              996       0.0000
diag       Pillai's Trace      0.256   73.287     4              1994      0.0000
           Wilks' Lambda       0.744   79.454     4              1992      0.0000
           Hotelling's Trace   0.344   85.675     4              1990      0.0000
           Roy's Largest Root  0.344   171.519    2              997       0.0000
a. Exact statistic
b. The statistic is an upper bound on F that yields a lower bound on the significance level.
c. Design: Intercept + diag
Tests of Between-Subjects Effects
Source           Dependent Variable  Type III Sum of Squares  df    Mean Square  F        Sig.
Corrected Model  f4                  22327.9                  2     11164.0      112.6    0.00000
                 f7                  27121.5                  2     13560.7      157.1    0.00000
Intercept        f4                  2151047.2                1     2151047.2    21701.1  0.00000
                 f7                  2097617.0                1     2097617.0    24296.9  0.00000
diag             f4                  22327.9                  2     11164.0      112.6    0.00000
                 f7                  27121.5                  2     13560.7      157.1    0.00000
Error            f4                  98824.4                  997   99.1
                 f7                  86073.6                  997   86.3
Total            f4                  2867095.7                1000
                 f7                  2811146.7                1000
Corrected Total  f4                  121152.3                 999
                 f7                  113195.0                 999
a. R Squared = .184 (Adjusted R Squared = .183)
Between-Subjects SSCP Matrix
                            f4         f7
Hypothesis  Intercept  f4   2151047.2  2124164.1
                       f7   2124164.1  2097617.0
            diag       f4   22327.9    24594.9
                       f7   24594.9    27121.5
Error                  f4   98824.4    55800.1
                       f7   55800.1    86073.6
Based on Type III Sum of Squares

Residual SSCP Matrix
                                        f4       f7
Sum-of-Squares and Cross-Products  f4   98824.4  55800.1
                                   f7   55800.1  86073.6
Covariance                         f4   99.1     56.0
                                   f7   56.0     86.3
Correlation                        f4   1.000    0.605
                                   f7   0.605    1.000
Based on Type III Sum of Squares
SAS
proc glm data="manova";
  class group;                                       /* grouping factor */
  model useful difficulty importance = group / ss3;  /* three DVs, Type III SS */
  manova h=group;                                    /* multivariate test of the group effect */
run;
• Data: A researcher randomly assigns 33 subjects to one of three groups. The first group receives technical dietary information interactively from an on-line website. Group 2 receives the same information from a nurse practitioner, while group 3 receives the information from a video tape made by the same nurse practitioner. The researcher looks at three different ratings of the presentation (difficulty, usefulness, and importance) to determine whether the modes of presentation differ. In particular, the researcher is interested in whether the interactive website is superior, because that is the most cost-effective way of delivering the information.
Output
Dependent Variable: USEFUL
Source           DF  Sum of Squares  Mean Square  F Value  Pr > F
Model            2   52.9242378      26.4621189   2.70     0.0835
Error            30  293.9654425     9.7988481
Corrected Total  32  346.8896803

R-Square  Coeff Var  Root MSE  USEFUL Mean
0.152568  19.16873   3.130311  16.33030

Source  DF  Type III SS  Mean Square  F Value  Pr > F
GROUP   2   52.92423783  26.46211891  2.70     0.0835

Dependent Variable: DIFFICULTY
Source           DF  Sum of Squares  Mean Square  F Value  Pr > F
Model            2   3.9751512       1.9875756    0.47     0.6282
Error            30  126.2872767     4.2095759
Corrected Total  32  130.2624279

R-Square  Coeff Var  Root MSE  DIFFICULTY Mean
0.030516  35.89975   2.051725  5.715152

Source  DF  Type III SS  Mean Square  F Value  Pr > F
GROUP   2   3.97515121   1.98757560   0.47     0.6282

Dependent Variable: IMPORTANCE
Source           DF  Sum of Squares  Mean Square  F Value  Pr > F
Model            2   81.8296936      40.9148468   2.88     0.0718
Error            30  426.3708962     14.2123632
Corrected Total  32  508.2005898

R-Square  Coeff Var  Root MSE  IMPORTANCE Mean
0.161018  58.21603   3.769929  6.475758

Source  DF  Type III SS  Mean Square  F Value  Pr > F
GROUP   2   81.82969356  40.91484678  2.88     0.0718

The SAS System, 14:05 Monday, September 22, 2008
Characteristic Roots and Vectors of: E Inverse * H, where
H = Type III SSCP Matrix for GROUP
E = Error SSCP Matrix

Characteristic           Characteristic Vector V'EV=1
Root        Percent      USEFUL       DIFFICULTY   IMPORTANCE
0.89198790  99.42        0.06410227   -0.00186162  0.05375069
0.00524207  0.58         0.01442655   0.06888878   -0.02620577
0.00000000  0.00         -0.03149580  0.05943387   0.01270798

MANOVA Test Criteria and F Approximations for the Hypothesis of No Overall GROUP Effect
H = Type III SSCP Matrix for GROUP
E = Error SSCP Matrix
S=2  M=0  N=13

Statistic               Value       F Value  Num DF  Den DF  Pr > F
Wilks' Lambda           0.52578838  3.54     6       56      0.0049
Pillai's Trace          0.47667013  3.02     6       58      0.0122
Hotelling-Lawley Trace  0.89722998  4.12     6       35.61   0.0031
Roy's Greatest Root     0.89198790  8.62     3       29      0.0003
NOTE: F Statistic for Roy's Greatest Root is an upper bound.
NOTE: F Statistic for Wilks' Lambda is exact.
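DISCRIMINANT ANALYSIS
Discriminant Analysis
• Goal
– Predict a single categorical variable from p independent variables
– Examples:
• A model that predicts whether a person is white-collar or blue-collar from independent variables such as income, education, and age
• A model that predicts whether a person will buy a particular product from individual-difference variables (income, education, gender, age)
• A model that diagnoses the presence of a disease from several diagnostic results (test 1, test 2)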
New Variate
• A single variate (Y) is formed from the p independent variables X1, …, Xp.
Y = v1X1 + v2X2 + … + vpXp
• The variance of the new variate Y is as follows.
– Assuming that the K groups have different mean vectors,
SSw(Y) = SS1(Y) + SS2(Y) + … + SSK(Y)
= v'E1v + v'E2v + … + v'EKv
= v'(E1 + E2 + … + EK)v = v'Ev
where SSk(Y) is the SS matrix of the group belonging to the k-th category and Ek is the E matrix of the k-th group.
– Assuming that the K groups do not have different means,
SST(Y) = v'Tv
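Variance
• Variance explained by between-group differences, SS_H(Y):
$$ SS_H(Y) = (\bar{X}_g v - \bar{X} v)'(\bar{X}_g v - \bar{X} v) = v'(\bar{X}_g - \bar{X})'(\bar{X}_g - \bar{X})\,v = v'Hv $$
where $H = (\bar{X}_g - \bar{X})'(\bar{X}_g - \bar{X})$, $\bar{X}_g$ holds each case's group means, and $\bar{X}$ holds the grand means.
• Find the v that makes the between-group difference as large as possible relative to the within-group difference, that is, the v that maximizes
$$ f = \frac{SS_H(Y)}{SS_E(Y)} = \frac{v'Hv}{v'Ev} $$
• The maximum of f is the largest eigenvalue of $E^{-1}H$, and the maximizing v is the eigenvector of $E^{-1}H$ corresponding to that eigenvalue.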
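The claim that v'Hv / v'Ev is maximized at the leading eigenvector of E⁻¹H can be illustrated with the MANOVA example from earlier in this lecture. The sketch below assumes the rounded E and H matrices and the first eigenvector printed there; `quad_form` and `ratio` are hypothetical helper names.

```python
# E and H from the internet-addiction example (rounded, as printed).
E = [[98824.0, 55800.0], [55800.0, 86074.0]]
H = [[22328.0, 24595.0], [24595.0, 27121.0]]

def quad_form(v, m):
    """Quadratic form v' M v for a 2-vector and a 2x2 matrix."""
    return sum(v[i] * m[i][j] * v[j] for i in range(2) for j in range(2))

def ratio(v):
    """Between-to-within ratio f = v'Hv / v'Ev for weights v."""
    return quad_form(v, H) / quad_form(v, E)

v1 = (-0.419540, -0.907737)          # first eigenvector reported on the slide
ratio_v1 = ratio(v1)                 # should be close to lambda_1 = 0.344
ratio_x1_only = ratio((1.0, 0.0))    # using X1 alone
ratio_x2_only = ratio((0.0, 1.0))    # using X2 alone
print(ratio_v1, ratio_x1_only, ratio_x2_only)
```

Either original variable alone gives a smaller ratio than the eigenvector combination, which is the sense in which the discriminant function separates the groups best.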
Classification by Minimum Distance
• Euclidean Distance
$$ D_i^2 = (y - \bar{y}_i)'(y - \bar{y}_i) $$
• Mahalanobis Distance
$$ D_i^2 = (y - \bar{y}_i)'S^{-1}(y - \bar{y}_i) $$
• Assign y to population i if
$$ D_i^2 = \min\{D_1^2, D_2^2, \dots, D_k^2\} $$
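A minimal sketch of minimum-distance classification follows. The covariance matrix is the residual covariance reported for f4 and f7 in the SPSS output of this lecture; the three centroids and the observation `y` are hypothetical values chosen only to make the example run.

```python
# Pooled within-group covariance (residual covariance of f4 and f7).
S = [[99.1, 56.0], [56.0, 86.3]]

# Hypothetical group centroids and a hypothetical new observation.
centroids = [(40.0, 40.0), (55.0, 55.0), (70.0, 70.0)]
y = (46.0, 52.0)

def euclidean_sq(obs, centroid):
    """Squared Euclidean distance (y - ybar)'(y - ybar)."""
    return sum((a - b) ** 2 for a, b in zip(obs, centroid))

def mahalanobis_sq(obs, centroid, cov):
    """Squared Mahalanobis distance for 2 variables, using the 2x2 inverse."""
    d1, d2 = obs[0] - centroid[0], obs[1] - centroid[1]
    a, b, d = cov[0][0], cov[0][1], cov[1][1]
    det = a * d - b * b
    return (d * d1 * d1 - 2.0 * b * d1 * d2 + a * d2 * d2) / det

def nearest(distances):
    """Index of the group with the smallest distance."""
    return min(range(len(distances)), key=lambda i: distances[i])

group_euclid = nearest([euclidean_sq(y, c) for c in centroids])
group_mahal = nearest([mahalanobis_sq(y, c, S) for c in centroids])
print(group_euclid, group_mahal)
```

The Mahalanobis version rescales by S, so deviations along the direction in which the variables covary count less than deviations against it.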
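Classification by Probability of Group Membership
• Given the data, compute the probability of belonging to each category, then classify the case into the category with the highest probability.
• Computed using Bayes' theorem:
– Assume the probability of observing the data given the group centroid, P(y|Gk)
– Assume the base rate of each group, P(Gk)
– From these two pieces of information, compute the probability that the data belong to each group, P(Gk|y)
$$ P(G_i \mid y) = \frac{P(y \mid G_i)\,p(G_i)}{P(y \mid G_1)\,p(G_1) + P(y \mid G_2)\,p(G_2) + \dots + P(y \mid G_k)\,p(G_k)} $$
→ Classify into the group with the highest probability.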
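The Bayes step can be sketched in a few lines. The likelihoods and priors below are made-up numbers for three hypothetical groups; only the arithmetic follows the formula above.

```python
def posterior(likelihoods, priors):
    """P(G_i | y) from P(y | G_i) and p(G_i) via Bayes' theorem."""
    joint = [lik * prior for lik, prior in zip(likelihoods, priors)]
    total = sum(joint)                     # denominator: sum over all groups
    return [j / total for j in joint]

# Hypothetical values: density of y under each group, and group base rates.
likelihoods = [0.02, 0.05, 0.01]
priors = [0.5, 0.3, 0.2]

post = posterior(likelihoods, priors)
predicted_group = post.index(max(post))    # classify into the most probable group
print(post, predicted_group)
```

Note that the group with the largest prior is not necessarily chosen: here the second group wins because its likelihood outweighs its smaller base rate.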
DISCRIMINANT GROUPS=varname(min,max)
 /VARIABLES=varlist
 [/SELECT=varname(value)]
 [/ANALYSIS=varlist[(level)] [varlist...]]
 [/OUTFILE MODEL('file')]
 [/METHOD={DIRECT**}] [/TOLERANCE={0.001}]
          {WILKS   }              {n    }
          {MAHAL   }
          {MAXMINF }
          {MINRESID}
          {RAO     }
 [/MAXSTEPS={n}]
 [/FIN={3.84**}] [/FOUT={2.71**}] [/PIN={n}]
       {n     }         {n     }
 [/POUT={n}] [/VIN={0**}]
                   {n  }
 [/FUNCTIONS={g-1,100.0,1.0**}] [/PRIORS={EQUAL**   }]
             {n1 , n2 , n3   }           {SIZE      }
                                         {value list}
 [/SAVE=[CLASS[=varname]] [PROBS[=rootname]]
        [SCORES[=rootname]]]
 [/ANALYSIS=...]
 [/MISSING={EXCLUDE**}]
           {INCLUDE  }
 [/MATRIX=[OUT({*                  })] [IN({*                  })]]
               {'savfile'|'dataset'}       {'savfile'|'dataset'}
 [/HISTORY={STEP**} ]
           {NONE  }
 [/ROTATE={NONE**   }]
          {COEFF    }
          {STRUCTURE}
 [/CLASSIFY={NONMISSING  } {POOLED  } [MEANSUB]]
            {UNSELECTED  } {SEPARATE}
            {UNCLASSIFIED}
 [/STATISTICS=[MEAN] [COV ] [FPAIR] [RAW ] [STDDEV]
              [GCOV] [UNIVF] [COEFF] [CORR] [TCOV ]
              [BOXM] [TABLE] [CROSSVALID]
              [ALL]]
 [/PLOT=[MAP] [SEPARATE] [COMBINED] [CASES[(n)]] [ALL]]
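Test Statistics
• Wilks' lambda was introduced in the MANOVA section.
$$ \Lambda = \frac{|E|}{|H+E|} = \prod_{i=1}^{r} \frac{1}{1+\lambda_i} $$
• The distribution of Λ can be approximated by a chi-square distribution through Bartlett's V.
$$ V = -[N - 1 - (p + K)/2]\,\ln\Lambda $$
– That is, V approaches a χ² distribution with p(K-1) degrees of freedom.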
• V can in turn be rewritten as a sum of the Vm.
$$ V = -[N - 1 - (p + K)/2]\,\ln\Lambda $$
$$ = [N - 1 - (p + K)/2]\,\ln[(1+\lambda_1)(1+\lambda_2)\cdots(1+\lambda_r)] $$
$$ = [N - 1 - (p + K)/2] \sum_{m=1}^{r} \ln(1+\lambda_m) = \sum_{m} V_m $$
where
$$ V_m = [N - 1 - (p + K)/2]\,\ln(1+\lambda_m) $$
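Hypothesis Testing
• Null hypothesis 0: $H_0^0: \lambda_1^* = \lambda_2^* = \dots = \lambda_r^* = 0$
– Use V as the test statistic.
• Null hypothesis 1: $H_0^1: \lambda_1^* \ne 0 \;\&\; \lambda_2^* = \dots = \lambda_r^* = 0$
– Use $V - V_1 = \sum_{m=2}^{r} V_m$ as the test statistic.
• Null hypothesis q: $H_0^q: \lambda_1^* \ne 0 \;\&\; \dots \;\&\; \lambda_q^* \ne 0 \;\&\; \lambda_{q+1}^* = \dots = \lambda_r^* = 0$
– Use $V - V_1 - V_2 - \dots - V_q = \sum_{m=q+1}^{r} V_m$ as the test statistic.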
SPSS Output
Eigenvalues
Function  Eigenvalue  % of Variance  Cumulative %  Canonical Correlation
1         1.632 (a)   100.0          100.0         .787
a. First 1 canonical discriminant functions were used in the analysis.

Wilks' Lambda
Test of Function(s)  Wilks' Lambda  Chi-square  df  Sig.
1                    .380           91.467      7   .000
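The chi-square in the Wilks' Lambda table can be approximately reproduced from Bartlett's V. The sketch below assumes N = 100 cases, p = 7 discriminating variables, and K = 2 groups, which is how I read the case counts and variable list in this SPSS output; the function name is hypothetical.

```python
import math

def bartlett_v(eigenvalues, n_cases, n_vars, n_groups):
    """Bartlett's V and its per-function parts V_m = c * ln(1 + lambda_m)."""
    factor = n_cases - 1 - (n_vars + n_groups) / 2.0
    parts = [factor * math.log(1.0 + lam) for lam in eigenvalues]
    return sum(parts), parts

# Single discriminant function with eigenvalue 1.632 (from the table above).
v_total, v_parts = bartlett_v([1.632], n_cases=100, n_vars=7, n_groups=2)
df = 7 * (2 - 1)                                    # p(K - 1)

# Related quantities implied by the same eigenvalue.
wilks_lambda = 1.0 / (1.0 + 1.632)                  # product of 1 / (1 + lambda)
canonical_corr = math.sqrt(1.632 / (1.0 + 1.632))   # sqrt(lambda / (1 + lambda))
print(round(v_total, 2), df, round(wilks_lambda, 3), round(canonical_corr, 3))
```

The small gap between this V and the printed 91.467 is expected, since the eigenvalue is rounded to three decimals in the table.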
Standardized Canonical Discriminant Function Coefficients
        Function 1
CDI     -.428
BDI     .377
BHOP    .177
SSI     .543
RFL     .643
FIS     .028
DIS     .069

Structure Matrix
        Function 1
RFL     .796
SSI     .669
BDI     .502
BHOP    .469
CDI     .368
DIS     .145
FIS     .019
Pooled within-groups correlations between discriminating variables and standardized canonical discriminant functions
Variables ordered by absolute size of correlation within function.
Functions at Group Centroids
GENDER  Function 1
1.00    -1.191
2.00    1.343
Unstandardized canonical discriminant functions evaluated at group means

Prior Probabilities for Groups
                Cases Used in Analysis
GENDER  Prior   Unweighted  Weighted
1.00    .500    53          53.000
2.00    .500    47          47.000
Total   1.000   100         100.000

Classification Results (b, c)
                                     Predicted Group Membership
                             GENDER  1.00   2.00   Total
Original             Count   1.00    51     2      53
                             2.00    3      44     47
                     %       1.00    96.2   3.8    100.0
                             2.00    6.4    93.6   100.0
Cross-validated (a)  Count   1.00    48     5      53
                             2.00    6      41     47
                     %       1.00    90.6   9.4    100.0
                             2.00    12.8   87.2   100.0
a. Cross validation is done only for those cases in the analysis. In cross validation, each case is classified by the functions derived from all cases other than that case.
b. 95.0% of original grouped cases correctly classified.
c. 89.0% of cross-validated grouped cases correctly classified.
[Figure: histograms of Canonical Discriminant Function 1 scores, one panel per group. GENDER = 1: Mean = -1.19, N = 53.00. GENDER = 2: Mean = 1.34, N = 47.00. Std. Dev = .94 and 1.06 as printed on the panels.]
Another Example
proc discrim canonical crossvalidate data=discrim;
  class gender;
  var cdi bdi bhop ssi rfl fis dis;
  priors proportional;
run;
CANONICAL CORRELATION
Canonical Correlation
• Canonical correlation finds the coefficients (u1, u2, u3, v1, v2) that maximize the correlation between
u1(CDI) + u2(BDI) + u3(BHOP) and v1(SSI) + v2(RFL),
together with the value of that correlation.
• Z = u1X1 + u2X2 + … + upXp
• W = v1Y1 + v2Y2 + … + vqYq
• Correlation between Z and W
$$ r_{zw} = \frac{u'\Sigma_{xy}v}{\sqrt{(u'\Sigma_{xx}u)(v'\Sigma_{yy}v)}} $$
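• Maximize $u'\Sigma_{xy}v$ under the constraints
$$ u'\Sigma_{xx}u = v'\Sigma_{yy}v = 1 $$
• The maximum $r_{zw}$ is the square root of the largest eigenvalue λ of
$$ \Sigma_{xx}^{-1}\Sigma_{xy}\Sigma_{yy}^{-1}\Sigma_{yx} $$
• The corresponding weights u (for Z) are the eigenvector of $\Sigma_{xx}^{-1}\Sigma_{xy}\Sigma_{yy}^{-1}\Sigma_{yx}$ associated with λ.
• The weights for W are
$$ v = \left(1/\sqrt{\lambda}\right)\Sigma_{yy}^{-1}\Sigma_{yx}u $$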
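The eigenvalue route to canonical correlations can be sketched in plain Python using the correlation matrices that the CANCORR output in these slides reports. This is an illustration under assumptions: X is taken to be Set-1 (SSI, RFL) so that the product matrix is 2×2, correlations stand in for covariances, and all helper functions are hypothetical. The square roots of the eigenvalues are the canonical correlations.

```python
import math

def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def inverse(a):
    """Gauss-Jordan inverse with partial pivoting (fine for small matrices)."""
    n = len(a)
    aug = [[float(x) for x in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [x / scale for x in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

# Correlations copied from the CANCORR output in these slides.
R_xx = [[1.000, 0.582], [0.582, 1.000]]      # Set-1: SSI, RFL
R_yy = [[1.000, 0.796, 0.440],
        [0.796, 1.000, 0.466],
        [0.440, 0.466, 1.000]]                # Set-2: CDI, BDI, BHOP
R_xy = [[0.532, 0.560, 0.378],                # SSI with CDI, BDI, BHOP
        [0.556, 0.581, 0.620]]                # RFL with CDI, BDI, BHOP

# M = Rxx^{-1} Rxy Ryy^{-1} Ryx, a 2x2 matrix.
M = matmul(matmul(inverse(R_xx), R_xy), matmul(inverse(R_yy), transpose(R_xy)))

# Eigenvalues of M from its characteristic polynomial.
trace = M[0][0] + M[1][1]
det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
root = math.sqrt(trace * trace / 4.0 - det)
eigenvalues = [trace / 2.0 + root, trace / 2.0 - root]

canonical_correlations = [math.sqrt(lam) for lam in eigenvalues]
print([round(r, 3) for r in canonical_correlations])
```

The two values should be close to the canonical correlations listed in the CANCORR output (.733 and .251), with small differences from the three-decimal rounding of the input correlations.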
Canonical Redundancy Coefficients
• Canonical functions do not reflect the consistency within a single variable set, so redundancy coefficients are used to address this problem.
• Redundancy of set X given set Y: $R^2_{x.y}$
$$ R^2_{x.y} = \sum_{j=1}^{r} P_{xj}\,\mu_j^2 \quad\text{where}\quad P_{xj} = \frac{1}{p}\left(\sum_{i=1}^{p} r^2_{x_i z_j}\right) $$
$$ R^2_{y.x} = \sum_{j=1}^{r} P_{yj}\,\mu_j^2 \quad\text{where}\quad P_{yj} = \frac{1}{q}\left(\sum_{i=1}^{q} r^2_{y_i w_j}\right) $$
• Zj is the j-th canonical variate of the X variable set, Wj is the j-th canonical variate of the Y variable set, and μj is the j-th canonical correlation.
• The redundancy of Y given X and the redundancy of X given Y are generally not equal.
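The redundancy formula can be checked against the CANCORR output in these slides. The sketch below takes the canonical loadings and canonical correlations reported there as inputs; the function name `redundancy_parts` is hypothetical.

```python
def redundancy_parts(loadings_by_variate, canonical_correlations):
    """Per-variate redundancy: mean squared loading times squared canonical correlation."""
    parts = []
    for loadings, mu in zip(loadings_by_variate, canonical_correlations):
        proportion = sum(r * r for r in loadings) / len(loadings)   # P_xj
        parts.append(proportion * mu * mu)                          # P_xj * mu_j^2
    return parts

mu = [0.733, 0.251]   # canonical correlations from the output

# Canonical loadings per variate, as listed in the output.
set1_loadings = [(-0.774, -0.965), (-0.633, 0.262)]                  # SSI, RFL
set2_loadings = [(-0.824, -0.863, -0.824), (-0.406, -0.444, 0.566)]  # CDI, BDI, BHOP

set1_parts = redundancy_parts(set1_loadings, mu)
set2_parts = redundancy_parts(set2_loadings, mu)
print(sum(set1_parts), sum(set2_parts))
```

The per-variate values should match the "Explained by Opposite Can. Var." entries (.411, .015 for Set-1 and .376, .014 for Set-2), and the two totals differ, which illustrates the asymmetry noted above.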
CANCORR
1. First, run the canonical correlation program located in the SPSS directory.
2. Then run the following program.
cancorr set1= ssi rfl /
        set2= cdi bdi bhop.
Output
Correlations for Set-1
      SSI     RFL
SSI   1.0000  .5820
RFL   .5820   1.0000

Correlations for Set-2
      CDI     BDI     BHOP
CDI   1.0000  .7960   .4400
BDI   .7960   1.0000  .4660
BHOP  .4400   .4660   1.0000

Correlations Between Set-1 and Set-2
      SSI    RFL
CDI   .5320  .5560
BDI   .5600  .5810
BHOP  .3780  .6200

Canonical Correlations
1  .733
2  .251

Test that remaining correlations are zero:
   Wilk's  Chi-SQ  DF     Sig.
1  .434    80.099  6.000  .000
2  .937    6.235   2.000  .044

Raw Canonical Coefficients for Set-1
      1      2
SSI   -.320  -1.181
RFL   -.774  .948

Standardized Canonical Coefficients for Set-1
      1      2
SSI   -.322  -1.187
RFL   -.778  .952

Raw Canonical Coefficients for Set-2
      1      2
CDI   -.276  -.335
BDI   -.400  -.647
BHOP  -.512  1.011

Standardized Canonical Coefficients for Set-2
      1      2
CDI   -.277  -.336
BDI   -.402  -.650
BHOP  -.515  1.016
Output
Canonical Loadings for Set-1
      1      2
SSI   -.774  -.633
RFL   -.965  .262

Cross Loadings for Set-1
      1      2
SSI   -.567  -.159
RFL   -.707  .066

Canonical Loadings for Set-2
      1      2
CDI   -.824  -.406
BDI   -.863  -.444
BHOP  -.824  .566

Cross Loadings for Set-2
      1      2
CDI   -.604  -.102
BDI   -.632  -.111
BHOP  -.604  .142

Redundancy Analysis:
Proportion of Variance of Set-1 Explained by Its Own Can. Var.
       Prop Var
CV1-1  .766
CV1-2  .234

Proportion of Variance of Set-1 Explained by Opposite Can. Var.
       Prop Var
CV2-1  .411
CV2-2  .015

Proportion of Variance of Set-2 Explained by Its Own Can. Var.
       Prop Var
CV2-1  .701
CV2-2  .227

Proportion of Variance of Set-2 Explained by Opposite Can. Var.
       Prop Var
CV1-1  .376
CV1-2  .014
proc cancorr corr data=discrim;
var ssi rfl;
with cdi bdi bhop;
run;