University of Copenhagen
Statistics for Biochemists
Faculty of Science, Department of Mathematical Sciences
December 2007

Analysis of Variance in SAS

One-way analysis of variance. Bartlett's test. Tukey's test. PROC NPAR1WAY.

This note is based on the overheads for the lectures in autumn 2007.

One-way analysis of variance and Kruskal-Wallis test in SAS

Example 1.1: This is Example 12.1-4 in Biostatistics, computed with SAS. Compare the output with Tables 12.1, 12.3 and 12.4. (Normally distributed data. Use of PROC GLM. ANOVA, Bartlett's test, Tukey's test.)

PROGRAM:

    /* Data and SAS program */
    DATA biost12_1;
      INPUT group $ age @@;   /* @@ reads several observations per line */
      CARDS;
    surgery 32 surgery 28 surgery 22 surgery 25 surgery 20
    surgery 20 surgery 28 surgery 28 surgery 20 surgery 29
    surgery 22 surgery 37 surgery 18 surgery 29 surgery 22
    surgery 32 surgery 21 surgery 34 surgery 19 surgery 23
    surgery 23 surgery 26 surgery 41 surgery 20 surgery 33
    contr1 32 contr1 26 contr1 31 contr1 39 contr1 34
    contr1 33 contr1 29 contr1 41 contr1 35 contr1 33
    contr1 33 contr1 43 contr1 25 contr1 39 contr1 36
    contr1 37 contr1 28 contr1 34 contr1 27 contr1 45
    contr1 22 contr1 29 contr1 51 contr1 28 contr1 35
    contr2 31 contr2 35 contr2 26 contr2 28 contr2 22 contr2 29
    contr2 27 contr2 21 contr2 22 contr2 27 contr2 24 contr2 44
    contr2 21 contr2 25 contr2 27 contr2 18 contr2 27 contr2 36
    ;
    RUN;

    PROC GLM DATA=biost12_1;
      CLASS group;
      MODEL age=group / SOLUTION;
      MEANS group / HOVTEST=BARTLETT TUKEY;   /* Bartlett's and Tukey's tests */
    RUN;
    QUIT;

OUTPUT:

    Class    Levels   Values
    group    3        contr1 contr2 surgery

    Number of Observations Read   68
    Number of Observations Used   68

    Dependent Variable: age   (one-way ANOVA)

    Source             DF   Sum of Squares   Mean Square   F Value   Pr > F
    Model               2       842.740065    421.370033     10.29   0.0001
    Error              65      2660.951111     40.937709
    Corrected Total    67      3503.691176

    R-Square   Coeff Var   Root MSE   age Mean
    0.240529   21.89640    6.398258   29.22059

    Source   DF   Type I SS     Mean Square   F Value   Pr > F
    group     2   842.7400654   421.3700327     10.29   0.0001

    Parameter        Estimate         Standard Error   t Value   Pr > |t|
    Intercept        26.08000000 B    1.27965166         20.38   <.0001
    group contr1      7.72000000 B    1.80970074          4.27   <.0001
    group contr2      1.14222222 B    1.97783355          0.58   0.5656
    group surgery     0.00000000 B    .                   .      .

    Bartlett's Test for Homogeneity of age Variance

    Source   DF   Chi-Square   Pr > ChiSq
    group     2       0.1871       0.9107

    Tukey's Studentized Range (HSD) Test for age
    NOTE: This test controls the Type I experimentwise error rate.

    Alpha                                    0.05
    Error Degrees of Freedom                   65
    Error Mean Square                    40.93771
    Critical Value of Studentized Range   3.39207

    Comparisons significant at the 0.05 level are indicated by ***.

    group                Difference        Simultaneous 95%
    Comparison           Between Means     Confidence Limits
    contr1  - contr2          6.578          1.834    11.322   ***
    contr1  - surgery         7.720          3.379    12.061   ***
    contr2  - contr1         -6.578        -11.322    -1.834   ***
    contr2  - surgery         1.142         -3.602     5.886
    surgery - contr1         -7.720        -12.061    -3.379   ***
    surgery - contr2         -1.142         -5.886     3.602

Example 1.2: These are the data from Example 1.1, analysed with a nonparametric test. (Use of PROC NPAR1WAY. Kruskal-Wallis test.)

PROGRAM:

    PROC NPAR1WAY WILCOXON DATA=biost12_1;
      CLASS group;
      VAR age;
    RUN;

OUTPUT:

    The NPAR1WAY Procedure

    Wilcoxon Scores (Rank Sums) for Variable age
    Classified by Variable group

                     Sum of    Expected    Std Dev      Mean
    group      N     Scores    Under H0    Under H0     Score
    surgery   25      648.0      862.50    78.500037    25.920000
    contr1    25     1182.0      862.50    78.500037    47.280000
    contr2    18      516.0      621.00    71.826861    28.666667

    Average scores were used for ties.

    Kruskal-Wallis Test
    Chi-Square         16.7679
    DF                       2
    Pr > Chi-Square     0.0002

Example 1.3: The data are low-glycosylated PrP from Collinge et al. First the three control groups are compared. Then type 4 PrP is compared with the pooled control group. (Normally distributed data. Use of PROC GLM.)

PROGRAM:

    /* Read the data from Collinge et al. */
    DATA cd4_lav;
      INPUT type lav_glyc @@;
      CARDS;
    1 47.2 1 48.8 1 38.0 1 37.4 1 35.6 1 48.0
    2 46.0 2 51.0 2 46.4 2 49.5 2 42.7 2 43.1
    2 46.9 2 42.6 2 40.5 2 51.1 2 47.0 2 44.7
    2 42.4 2 45.1 2 48.4 2 42.7 2 47.1 2 45.5
    3 40.1 3 45.9 3 44.3 3 45.7
    4 35.1 4 40.7 4 38.2 4 36.4 4 43.3 4 33.1 4 35.0 4 37.2 4 43.6
    ;
    RUN;

    /* Compare the three control groups (types 1-3). */
    DATA kontrol;
      SET cd4_lav;
      IF type LE 3;   /* keep only the control groups */
    RUN;

    PROC GLM DATA=kontrol;
      CLASS type;
      MODEL lav_glyc=type;
      MEANS type / HOVTEST=BARTLETT;
    RUN;
    QUIT;
    /* The three groups can be assumed to be equal. See Output 1. */

    /* Compare type 4 with the pooled control group. */
    DATA nycd4_lav;
      SET cd4_lav;
      IF type LE 3 THEN nytype=0;   /* pooled controls */
      ELSE nytype=1;                /* type 4 */
    RUN;

    PROC GLM DATA=nycd4_lav;
      CLASS nytype;
      MODEL lav_glyc=nytype / SOLUTION;
      MEANS nytype / HOVTEST=BARTLETT;
    RUN;
    QUIT;
    /* Conclusion: type 4 can be assumed to have the same variance as the
       control group, but the means differ. See Output 2. */

    /* One may discuss whether all four variances should have been compared
       at the outset. */
    /* The normality assumption could have been checked with PROC UNIVARIATE. */
    /* Tukey's test is not performed for the three control groups, since the
       hypothesis of equal means is accepted. */
    /* The nonparametric test (PROC NPAR1WAY) is not performed, since the
       normality assumption is accepted. */
    /* The outputs have been edited to save space. */
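
The two checks that the program comments mention but do not carry out, comparing all four variances at the outset and checking normality with PROC UNIVARIATE, could be sketched roughly as follows. This is not part of the original lecture program. The output data set name resid_cd4 and the variable name res are hypothetical, and testing the residuals from the four-group model is only one assumed way of doing the normality check.

```sas
/* Sketch only: not part of the original lecture program. */

/* Bartlett's test across all four types at once. */
PROC GLM DATA=cd4_lav;
  CLASS type;
  MODEL lav_glyc=type;
  MEANS type / HOVTEST=BARTLETT;
  /* Save the residuals; resid_cd4 and res are hypothetical names. */
  OUTPUT OUT=resid_cd4 R=res;
RUN;
QUIT;

/* Normality tests and a normal Q-Q plot for the residuals. */
PROC UNIVARIATE DATA=resid_cd4 NORMAL;
  VAR res;
  QQPLOT res / NORMAL(MU=EST SIGMA=EST);
RUN;
```

The NORMAL option requests the tests for normality (including Shapiro-Wilk); with groups as small as these, the Q-Q plot is probably the more informative of the two.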

OUTPUT 1 (Comparison of the three control groups.)

    Class   Levels   Values
    type    3        1 2 3

    Number of observations   28

    Dependent Variable: lav_glyc

    Source             DF   Sum of Squares   Mean Square   F Value   Pr > F
    Model               2       49.0430556    24.5215278      1.67   0.2077
    Error              25      366.0494444    14.6419778
    Corrected Total    27      415.0925000

    R-Square   Coeff Var   Root MSE   lav_glyc Mean
    0.118150   8.546027    3.826484   44.77500

    Bartlett's Test for Homogeneity of lav_glyc Variance

    Source   DF   Chi-Square   Pr > ChiSq
    type      2       4.6991       0.0954

OUTPUT 2 (Comparison of type 4 with the pooled control group.)

    Dependent Variable: lav_glyc

    Class    Levels   Values
    nytype   2        0 1

    Number of observations   37

    Source             DF   Sum of Squares   Mean Square   F Value   Pr > F
    Model               1      306.4983108   306.4983108     20.38   <.0001
    Error              35      526.4525000    15.0415000
    Corrected Total    36      832.9508108

    R-Square   Coeff Var   Root MSE   lav_glyc Mean
    0.367967   8.989443    3.878337   43.14324

    Parameter     Estimate         Standard Error   t Value   Pr > |t|
    Intercept     38.06666667 B    1.29277909         29.45   <.0001
    nytype 0       6.70833333 B    1.48609361          4.51   <.0001
    nytype 1       0.00000000 B    .                   .      .

    Bartlett's Test for Homogeneity of lav_glyc Variance

    Source   DF   Chi-Square   Pr > ChiSq
    nytype    1       0.0286       0.8656
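
With only two groups, the F test in the second output above (type 4 versus the pooled controls) is equivalent to a pooled two-sample t-test: the t value 4.51 for nytype 0, squared, gives the F value 20.38 up to rounding. A minimal sketch of the same comparison with PROC TTEST follows, assuming the data set nycd4_lav created in the program for this example. PROC TTEST is not used in the original notes.

```sas
/* Sketch only: PROC TTEST is not used in the original notes. */
/* Two-sample comparison of lav_glyc between pooled controls  */
/* (nytype=0) and type 4 (nytype=1).                          */
PROC TTEST DATA=nycd4_lav;
  CLASS nytype;
  VAR lav_glyc;
RUN;
```

The "Pooled" line of the PROC TTEST output corresponds to the equal-variance analysis done by PROC GLM. The procedure also reports a folded F test for equality of the two variances, which is a different test from Bartlett's and need not give exactly the same p-value.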