# AN INVESTIGATION OF BEST COVARIANCE STRUCTURE FOR EXPERIMENTAL DESIGN WITH REPEATED MEASURE


ABSTRACT

One of the major problems in conducting repeated measures analysis is the sphericity assumption. Applying univariate or multivariate repeated measures methods when this assumption is violated leads to the use of an inappropriate covariance structure. In the univariate case, degrees-of-freedom adjustment methods are used, while in the multivariate case profile analysis within the general linear mixed model is usually applied. The six covariance structures used in the study are: Unstructured (UN), Compound symmetry (CS), Huynh-Feldt (HF), First-order autoregressive (AR(1)), Heterogeneous first-order autoregressive (ARH(1)) and Heterogeneous compound symmetry (CSH). The goodness-of-fit criteria used to evaluate the performance of the covariance structures are: Akaike information criterion (AIC), Burnham-Anderson criterion (AICC) and Schwarz’s Bayesian criterion (SBC). The data comprise forty-two albino rats that were randomly assigned to six groups, each group receiving a different type of diet. The weight of each rat was measured weekly, seven times during the experimental period (weeks 0, 1, 2, 3, 4, 5 and 6). The data violate the assumption of sphericity. According to the AIC and AICC criteria, Unstructured (UN) was found to be the best covariance structure for the data set. The linear mixed model approach is suggested as the best method of analyzing repeated measures data, as it allows any covariance structure to be specified when sphericity is violated.

Keywords: Repeated measure, Sphericity test, covariance structures, Information criteria, univariate analysis and multivariate analysis.

INTRODUCTION

The repeated measures design (RMD) is one of the most frequently studied and applied designs in a variety of applied fields. A design in which the same experimental unit is repeatedly observed under multiple treatments is called a repeated measures design (Algina et al., 2000). This is a broad concept, and in practice a repeated measures design is laid out in a variety of ways, from a very simple one-way repeated measures set-up to a very complex framework of longitudinal data or some other mixed model set-up (Algina et al., 2000). Because of its wide application in applied research, a repeated measures design can be conceived and planned in a number of ways. It is basically a set-up that can be used to plan any standard experiment: almost any design with a single observation per unit can also be planned with repeated observations (Crowder and Hand, 1990). A comprehensive introduction to the analysis and different plans of RMD is given in Crowder and Hand (1990), and an application to several real-life, mostly medical, experiments is given by Hand and Taylor (1987). The most attractive feature of a repeated measures design is the potential advantages it offers. For instance, it offers maximum error control (Winter, 1991). In addition, Davis (2002) wrote extensively on the advantages of the design, such as economy of subjects, the ability to study the patterned behavior of individuals over different treatment conditions or time points, and data that are more reliable than in a cross-sectional study. It is regrettable that more researchers have not opted to use this type of design. One reason for the underutilization of this approach is disagreement regarding the appropriate analytic technique to use to evaluate results, as either a univariate or a multivariate approach can be invoked (Algina & Keselman, 1997; Girden, 1992; Keselman et al., 1995; Keselman et al., 1996; Maxwell & Delaney, 1990).
As noted by some authors (Algina et al., 1994; Girden, 1992; Maxwell & Delaney, 2004), each analysis has distinct advantages and disadvantages, and each type of analysis will provide a more powerful result under certain conditions and when certain statistical assumptions are satisfied. In a repeated measures experimental design it remains possible to use relatively straightforward analysis of variance procedures to analyze the data if three particular assumptions about the observations are valid, namely:

1. Normality: the data arise from populations with normal distributions (i.e. the measurement errors are independent and identically normally distributed with mean 0 and the same variance)

2. Homogeneity of variance: the variances of the assumed normal distributions are equal.

3. Sphericity: the variances of the differences between all pairs of the repeated measurements are equal. This condition is satisfied, in particular, when the variances of the repeated measures are equal and the correlations between all pairs of repeated measures are also equal, the so-called compound symmetry pattern (i.e. the covariance between observations at any two different factor levels is the same).

Compound symmetry is a special case of more general property of sphericity. If compound symmetry exists, then sphericity also exists, but it is possible for sphericity to exist when compound symmetry does not.
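As a minimal numerical sketch of this distinction (the two matrices and the helper name `difference_variances` are illustrative assumptions, not part of the study), the variance of the difference between two repeated measures is $s_{ii} + s_{jj} - 2 s_{ij}$, and sphericity requires it to be the same for every pair:

```python
from itertools import combinations

def difference_variances(cov):
    """Return Var(Y_i - Y_j) = s_ii + s_jj - 2*s_ij for every pair i < j."""
    k = len(cov)
    return [cov[i][i] + cov[j][j] - 2 * cov[i][j]
            for i, j in combinations(range(k), 2)]

# Hypothetical compound-symmetric matrix: equal variances, equal covariances.
cs = [[4.0, 1.5, 1.5],
      [1.5, 4.0, 1.5],
      [1.5, 1.5, 4.0]]

# Hypothetical matrix of the form s_ij = a_i + a_j + lam * (i == j).
# Its variances and covariances differ, so it is not compound symmetric,
# yet every pairwise difference has variance 2 * lam.
a, lam = [1.0, 2.0, 3.0], 2.0
hf = [[a[i] + a[j] + (lam if i == j else 0.0) for j in range(3)]
      for i in range(3)]

print(difference_variances(cs))  # identical values: sphericity holds
print(difference_variances(hf))  # identical values, although hf is not CS
```

The second matrix illustrates why sphericity is the weaker requirement: only the variances of the differences need to agree, not the individual variances and covariances.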

Alternative analytic techniques are available when the validity of these assumptions is doubtful. These include epsilon (ε) adjustment procedures based on the Geisser-Greenhouse epsilon (G-G) and the Huynh-Feldt epsilon (H-F), and multivariate approaches such as profile analysis (repeated measures MANOVA) in the linear mixed model, while Mauchly’s test can be used to check the sphericity assumption. Mixed model methodology enables statisticians to specify different covariance structures in repeated measures designs where both random and fixed effects are included in the model. Therefore, mixed model methodology is potentially the most powerful tool. Although reports on the application of mixed model methodology in animal science without specifying various covariance structures are available (Pancarci et al., 2007; 2009), reports on the joint use of univariate ANOVA, profile analysis and mixed model approaches (with different covariance structures) in repeated measures designs in animal science are few. Two practical difficulties remain. First, the structure of the covariance matrix can be extremely complicated, and the issue needs careful treatment and a good amount of knowledge. Second, the data may be incomplete for one reason or another.

MATERIALS AND METHODS

The data used for this study are secondary data, obtained from an experiment conducted for M.Sc. research by Nura Lawal titled “Serum Glucose, Lipid Profile and Oxidative Stress Markers of Salt-Induced Metabolic Syndrome Rats”, Department of Biochemistry, Usmanu Danfodiyo University, Sokoto. The data comprise forty-two rats that were randomly assigned to six groups, each group receiving a different type of ration (food). The weight of each rat, in grams, was measured weekly, seven times during the experimental period (weeks 0, 1, 2, 3, 4, 5 and 6).

The six covariance structures used are: Unstructured (UN), Compound symmetry (CS), Huynh-Feldt (HF), First-order autoregressive (AR(1)), Heterogeneous first-order autoregressive (ARH(1)) and Heterogeneous compound symmetry (CSH).
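To make the structures concrete, the following sketch builds each patterned matrix for t measurement occasions and counts its free parameters. The function names and parameterizations follow the usual textbook definitions and are assumptions made for illustration; statistical packages may parameterize the same structures differently.

```python
def cs(t, var, cov):
    """Compound symmetry: one common variance and one common covariance."""
    return [[var if i == j else cov for j in range(t)] for i in range(t)]

def csh(sds, rho):
    """Heterogeneous CS: one correlation, a separate SD per occasion."""
    t = len(sds)
    return [[sds[i] * sds[j] * (1.0 if i == j else rho) for j in range(t)]
            for i in range(t)]

def ar1(t, var, rho):
    """AR(1): common variance, correlation rho ** |i - j|."""
    return [[var * rho ** abs(i - j) for j in range(t)] for i in range(t)]

def arh1(sds, rho):
    """Heterogeneous AR(1): AR(1) correlations, a separate SD per occasion."""
    t = len(sds)
    return [[sds[i] * sds[j] * rho ** abs(i - j) for j in range(t)]
            for i in range(t)]

def hf(variances, lam):
    """Huynh-Feldt: off-diagonal s_ij = (s_i^2 + s_j^2) / 2 - lam."""
    t = len(variances)
    return [[variances[i] if i == j
             else (variances[i] + variances[j]) / 2 - lam
             for j in range(t)] for i in range(t)]

# Number of free covariance parameters for t occasions.
# UN has one parameter per distinct variance or covariance.
N_PARAMS = {
    "UN": lambda t: t * (t + 1) // 2,
    "CS": lambda t: 2,
    "HF": lambda t: t + 1,
    "AR(1)": lambda t: 2,
    "ARH(1)": lambda t: t + 1,
    "CSH": lambda t: t + 1,
}

counts = {name: f(7) for name, f in N_PARAMS.items()}
print(counts)  # parameter counts for the seven weekly measurements
```

With seven occasions the unstructured matrix needs far more parameters than any patterned alternative, which is why penalized fit criteria are needed to compare them.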

The goodness-of-fit criteria used to evaluate the performance of the covariance structures are the Akaike information criterion (AIC), the Burnham-Anderson criterion (AICC) and Schwarz’s Bayesian criterion (SBC).
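The paper does not state the formulas, so the sketch below uses the commonly quoted "smaller is better" forms of the three criteria. The function name and the inputs are hypothetical; `n` stands for the effective number of observations, which software defines differently under ML and REML estimation.

```python
import math

def information_criteria(neg2_loglik, d, n):
    """Return (AIC, AICC, SBC) in 'smaller is better' form.

    neg2_loglik: -2 times the maximized (restricted) log-likelihood
    d:           number of estimated covariance parameters
    n:           effective number of observations (software-dependent)
    """
    aic = neg2_loglik + 2 * d
    aicc = neg2_loglik + 2 * d * n / (n - d - 1)  # small-sample correction
    sbc = neg2_loglik + d * math.log(n)           # heavier penalty once n > e^2
    return aic, aicc, sbc

# Hypothetical inputs, purely to show the call.
aic, aicc, sbc = information_criteria(neg2_loglik=100.0, d=2, n=50)
print(aic, aicc, sbc)
```

Because the SBC penalty grows with log(n), it tends to favour structures with fewer parameters than AIC or AICC do.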

The Shapiro-Wilk test was used to test the hypothesis that the data arise from populations with normal distributions; Pearson's correlation coefficient was used to assess the correlation between the repeated measure observations; Levene’s test for homogeneity of variances was used to test the hypothesis that the variance of the dependent variable is equal across groups; and Mauchly’s sphericity test (Mauchly’s W) was used to test the hypothesis that the variances of the differences between levels are equal. The SPSS statistical package was used in this research, at the 5% level of significance throughout.

For the univariate ANOVA in this research, there are two factors: Group, the between-subject factor (A), and Time, the within-subject factor (B).

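The statistical model is:

$$Y_{ijk} = \mu + \alpha_i + \pi_{j(i)} + \beta_k + (\alpha\beta)_{ik} + e_{ijk} \qquad (1)$$

( i = 1, 2, …, n ; j = 1, 2, …, p ; k = 1, 2, …, q )

Where $Y_{ijk}$ is the weight of the jth rat in the ith group at the kth time, $\mu$ is the grand mean, $\alpha_i$ is the effect of the ith fixed level of the group (treatment) factor, $\pi_{j(i)}$ is the random effect of the jth rat (experimental unit) fed with the ith ration, $\beta_k$ is the kth fixed time effect, $(\alpha\beta)_{ik}$ is the ration (group) by time interaction effect, and $e_{ijk}$ is the random error term.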

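The total variation is separated into two parts, the between-subject variation ($SS_{BS}$) and the within-subject variation ($SS_{WS}$).

Therefore,

$$SS_{BS} = SS_A + SS_{S(A)} \qquad (2)$$

$$SS_{WS} = SS_B + SS_{AB} + SS_{B \times S(A)} \qquad (3)$$

where $SS_A$ is the sum of squares for group, $SS_{S(A)}$ for subjects within groups, $SS_B$ for time, $SS_{AB}$ for the group by time interaction, and $SS_{B \times S(A)}$ for the within-subject error.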
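The partition of the total variation into between-subject and within-subject parts can be checked numerically. The sketch below assumes a balanced layout and uses a small made-up data set; the function name `ss_partition` and the dictionary keys are illustrative labels, with A denoting group and B denoting time.

```python
def mean(xs):
    xs = list(xs)
    return sum(xs) / len(xs)

def ss_partition(y):
    """Sums of squares for balanced data y[i][j][k]:
    group i (factor A), subject j within group i, time k (factor B)."""
    a, n, q = len(y), len(y[0]), len(y[0][0])
    grand = mean(v for g in y for s in g for v in s)
    grp = [mean(v for s in g for v in s) for g in y]
    subj = [[mean(s) for s in g] for g in y]
    time = [mean(y[i][j][k] for i in range(a) for j in range(n))
            for k in range(q)]
    cell = [[mean(y[i][j][k] for j in range(n)) for k in range(q)]
            for i in range(a)]
    # Summing over every observation applies the usual multipliers implicitly.
    idx = [(i, j, k) for i in range(a) for j in range(n) for k in range(q)]
    return {
        "total": sum((y[i][j][k] - grand) ** 2 for i, j, k in idx),
        "A": sum((grp[i] - grand) ** 2 for i, j, k in idx),
        "S(A)": sum((subj[i][j] - grp[i]) ** 2 for i, j, k in idx),
        "B": sum((time[k] - grand) ** 2 for i, j, k in idx),
        "AB": sum((cell[i][k] - grp[i] - time[k] + grand) ** 2
                  for i, j, k in idx),
        "BxS(A)": sum((y[i][j][k] - subj[i][j] - cell[i][k] + grp[i]) ** 2
                      for i, j, k in idx),
    }

# Hypothetical weights: 2 groups x 2 rats x 3 weeks.
y = [[[150, 160, 172], [148, 155, 170]],
     [[152, 170, 190], [149, 166, 181]]]
ss = ss_partition(y)
between = ss["A"] + ss["S(A)"]
within = ss["B"] + ss["AB"] + ss["BxS(A)"]
print(ss["total"], between + within)  # the two totals agree
```

The between-subject part is tested against the subjects-within-groups term, while the time effect and the interaction are tested against the within-subject error.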

When sphericity is violated in the univariate method, two adjustment (correction) methods are available. The correction is made by adjusting the df downward when determining the critical F value. The two corrections commonly used are the Greenhouse-Geisser correction and the Huynh-Feldt correction. It has been suggested that Huynh-Feldt be used with smaller departures from sphericity, while Greenhouse-Geisser be used when the departures are very large (Mauchly 1940). The degree to which the variance-covariance matrix departs from sphericity is measured by the epsilon (ε) parameter (Winter et al., 1991). When sphericity is met, ε equals one. The further ε is from one, the more severely sphericity is violated. An estimate of ε can be determined from the sample variance-covariance matrix and is termed the Greenhouse-Geisser epsilon (ε̂).

The df for the F test of the factor can then be adjusted downwards based on the value of ε̂ (both the df for the effect and the df for the error term are multiplied by ε̂).
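A minimal sketch of this adjustment is given below, assuming the standard Greenhouse-Geisser formula applied to a k x k covariance matrix of the repeated measures. The helper names and the AR(1)-type example matrix are hypothetical and are not taken from the study data.

```python
def gg_epsilon(cov):
    """Greenhouse-Geisser epsilon from a symmetric k x k covariance matrix."""
    k = len(cov)
    row = [sum(r) / k for r in cov]
    col = [sum(cov[i][j] for i in range(k)) / k for j in range(k)]
    grand = sum(row) / k
    # Double-centre the matrix; this matches using orthonormal contrasts.
    d = [[cov[i][j] - row[i] - col[j] + grand for j in range(k)]
         for i in range(k)]
    trace = sum(d[i][i] for i in range(k))
    sum_sq = sum(v * v for r in d for v in r)  # trace of d squared (d symmetric)
    return trace ** 2 / ((k - 1) * sum_sq)

def adjusted_df(eps, df_effect, df_error):
    """Multiply both the effect df and the error df by epsilon."""
    return eps * df_effect, eps * df_error

k = 7
# Hypothetical compound-symmetric matrix: sphericity holds, epsilon is 1.
cs = [[4.0 if i == j else 1.5 for j in range(k)] for i in range(k)]
# Hypothetical AR(1)-type matrix: correlation decays with lag, epsilon < 1.
ar1 = [[4.0 * 0.9 ** abs(i - j) for j in range(k)] for i in range(k)]

eps_cs = gg_epsilon(cs)
eps_ar1 = gg_epsilon(ar1)
lower_bound = 1 / (k - 1)  # smallest value epsilon can take
print(eps_cs, eps_ar1, lower_bound)
print(adjusted_df(eps_ar1, df_effect=k - 1, df_error=100))
```

The F statistic itself is unchanged by the correction; only the reference distribution used to obtain the p-value changes.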

Moreover, an alternative in the analysis of repeated measures data when the sphericity assumption is judged to be inappropriate is to use a multivariate approach. The advantage is that no assumptions are made about the pattern of correlations between the repeated measurements. A disadvantage of using the linear mixed model for repeated measures is often stated to be the technique’s relatively low power when the assumption of compound symmetry is actually valid.

Therefore, the matrix notation of the model is given by:

$$Y = X\beta + Zu + e \qquad (4)$$

Where $\beta$ is the vector of fixed effects (ration, week and ration by week interaction), $u$ is the vector of random effects (individual within ration), $X$ is the design matrix for the fixed effects, $Z$ is the design matrix for the random effects, and $e$ is the random error term.

The assumptions of the model are:

(1) $u$ is multivariate normal (0, G). (2) $e$ is also multivariate normal (0, R) and independent of $u$,

and V(Y) = ZGZ’ + R.
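As an illustration of V(Y) = ZGZ’ + R, the sketch below builds the marginal covariance matrix for a single subject with a random intercept and independent errors. The variance values and helper names are hypothetical; the point is that this simplest choice of G and R reproduces the compound symmetry pattern.

```python
def matmul(a, b):
    """Plain matrix product of two nested-list matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def transpose(a):
    return [list(r) for r in zip(*a)]

def marginal_cov(z, g, r):
    """V = Z G Z' + R for one subject."""
    zgz = matmul(matmul(z, g), transpose(z))
    t = len(r)
    return [[zgz[i][j] + r[i][j] for j in range(t)] for i in range(t)]

t = 4
z = [[1.0] for _ in range(t)]   # random intercept: a column of ones
g = [[3.0]]                     # hypothetical between-subject variance
r = [[2.0 if i == j else 0.0 for j in range(t)]
     for i in range(t)]         # hypothetical independent errors

v = marginal_cov(z, g, r)
for row in v:
    print(row)  # equal variances on the diagonal, equal covariances elsewhere
```

Replacing R with one of the patterned matrices (AR(1), ARH(1), CSH, UN and so on) is how the mixed model accommodates other covariance structures.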

RESULTS AND DISCUSSION

The Shapiro-Wilk test indicates that the data are normally distributed. Table 1 shows that the repeated measurements are positively correlated: all coefficients are positive, ranging from .204 to .721.

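TABLE 1: Pearson’s Correlation Coefficients Between Live Weights

| Correlation between the Values | 0week | 1week | 2week | 3week | 4week | 5week | 6week |
|---|---|---|---|---|---|---|---|
| 0week | 1.000 | .721* | .544 | .331 | .400 | .432 | .355 |
| 1week | | 1.000 | .326 | .538 | .418 | .465 | .348 |
| 2week | | | 1.000 | .239 | .204 | .366 | .540 |
| 3week | | | | 1.000 | .682* | .462 | .380 |
| 4week | | | | | 1.000 | .428 | .355 |
| 5week | | | | | | 1.000 | .720* |
| 6week | | | | | | | 1.000 |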

Table 2 shows Levene's test of equality of variances, which indicates that the variances are homogeneous across groups at every level of the repeated-measures variable (P > 0.05).

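TABLE 2: Levene's Test of Equality of Error Variances

| Week | F | df1 | df2 | Sig. |
|---|---|---|---|---|
| 0week | 6.093 | 5 | 36 | .100 |
| 1week | 1.925 | 5 | 36 | .114 |
| 2week | .850 | 5 | 36 | .524 |
| 3week | 3.265 | 5 | 36 | .066 |
| 4week | 1.166 | 5 | 36 | .345 |
| 5week | 1.389 | 5 | 36 | .252 |
| 6week | 1.077 | 5 | 36 | .389 |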

Table 3 shows that the assumption of sphericity has not been met: the Sig. value is reported as .000 (p < 0.001), which is less than 0.05, so the null hypothesis that the variances of the differences between levels are equal was rejected.

TABLE 3: Mauchly's Test of Sphericity

| Within Subjects Effect | Mauchly's W | Approx. Chi-Square | df | Sig. | Epsilon: Greenhouse-Geisser | Epsilon: Huynh-Feldt | Epsilon: Lower-bound |
|---|---|---|---|---|---|---|---|
| Week | .157 | 62.588 | 20 | .000 | .655 | .848 | .167 |
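The df and the approximate chi-square of Mauchly's test can be related to W with the usual Box-type approximation. The sketch below rests on two assumptions that the paper does not state: the formula itself, and an error df of ν = 36 (42 rats minus 6 groups). Because W is reported to only three decimals, the result can be expected to match the tabulated chi-square only approximately.

```python
import math

def mauchly_chi_square(w, k, nu):
    """Approximate chi-square and df for Mauchly's W.

    w:  Mauchly's W
    k:  number of repeated measurements
    nu: df of the covariance estimate (assumed: subjects minus groups)
    """
    p = k - 1
    df = p * (p + 1) // 2 - 1
    correction = (2 * p * p + p + 2) / (6 * p)
    chi2 = -(nu - correction) * math.log(w)
    return chi2, df

chi2, df = mauchly_chi_square(w=0.157, k=7, nu=36)
print(round(chi2, 3), df)
```

The df depends only on the number of repeated measurements, so seven weekly weights always give 20 df regardless of the sample size.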

TABLE 4: Tests of Within-Subjects Effects

| Source | Correction | Type III Sum of Squares | df | Mean Square | F | Sig. | Partial Eta Squared |
|---|---|---|---|---|---|---|---|
| Week | Sphericity Assumed | 57041.409 | 6 | 9506.901 | 40.099 | .000 | .598 |
| Week | Greenhouse-Geisser | 57041.409 | 3.168 | 18002.815 | 40.099 | .000 | .598 |
| Week | Huynh-Feldt | 57041.409 | 4.304 | 13254.580 | 40.099 | .000 | .598 |
| Week | Lower-bound | 57041.409 | 1.000 | 57041.409 | 40.099 | .000 | .598 |
| Week * Group | Sphericity Assumed | 21180.102 | 30 | 706.003 | 2.978 | .000 | .355 |
| Week * Group | Greenhouse-Geisser | 21180.102 | 15.842 | 1336.929 | 2.978 | .001 | .355 |
| Week * Group | Huynh-Feldt | 21180.102 | 21.518 | 984.314 | 2.978 | .000 | .355 |
| Week * Group | Lower-bound | 21180.102 | 5.000 | 4236.020 | 2.978 | .029 | .355 |
| Error(Week) | Sphericity Assumed | 38407.993 | 162 | 237.086 | | | |
| Error(Week) | Greenhouse-Geisser | 38407.993 | 85.549 | 448.960 | | | |
| Error(Week) | Huynh-Feldt | 38407.993 | 116.195 | 330.547 | | | |
| Error(Week) | Lower-bound | 38407.993 | 27.000 | 1422.518 | | | |

Table 5 shows the ANOVA test of the main effect of the between-subject factor Group (Ration), which is significant, since the significance value (reported as .000) is less than the 0.05 level of significance.

TABLE 5: Tests of Between-Subjects Effects

| Source | Type III Sum of Squares | df | Mean Square | F | Sig. |
|---|---|---|---|---|---|
| Intercept | 1.540E7 | 1 | 1.540E7 | 18120.768 | .000 |
| Ration | 59035.500 | 5 | 11807.100 | 13.893 | .000 |
| Error | 30594.327 | 36 | 849.842 | | |

Table 6 shows the information criteria for the different covariance structures in the mixed model approach. According to the AIC and AICC fitting criteria, Unstructured (UN) was the best covariance structure, since it gives the smallest value on both, followed by heterogeneous autoregressive ARH(1), while the worst was compound symmetry (CS). Under SBC, which penalizes the number of covariance parameters more heavily, AR(1) had the smallest value (2555.031), compared with 2603.102 for UN. The UN structure captures the growth and development pattern and the week-to-week variation in the weights of the rats over the trial period.

TABLE 6: Fitting Criteria Results for Comparing Covariance Structures in Multivariate

| Information Criteria | UN | CS | HF | AR(1) | ARH(1) | CSH |
|---|---|---|---|---|---|---|
| AIC | 2501.730 | 2626.250 | 2622.067 | 2547.790 | 2543.246 | 2569.173 |
| AICC | 2508.305 | 2626.294 | 2622.606 | 2547.834 | 2543.785 | 2569.712 |
| SBC | 2603.102 | 2633.491 | 2651.030 | 2555.031 | 2572.209 | 2598.135 |
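The comparison in Table 6 can be read off mechanically by taking, for each criterion, the structure with the smallest value. The snippet below only re-enters the tabulated values; the variable names are illustrative.

```python
structures = ["UN", "CS", "HF", "AR(1)", "ARH(1)", "CSH"]
table6 = {
    "AIC":  [2501.730, 2626.250, 2622.067, 2547.790, 2543.246, 2569.173],
    "AICC": [2508.305, 2626.294, 2622.606, 2547.834, 2543.785, 2569.712],
    "SBC":  [2603.102, 2633.491, 2651.030, 2555.031, 2572.209, 2598.135],
}

# Smaller is better: pick the structure with the minimum value per criterion.
best = {}
ranking = {}
for criterion, values in table6.items():
    by_name = dict(zip(structures, values))
    ranking[criterion] = sorted(by_name, key=by_name.get)
    best[criterion] = ranking[criterion][0]

print(best)
print(ranking["AIC"])
```

AIC and AICC both select UN, with ARH(1) second, whereas SBC selects AR(1), reflecting its heavier penalty on the 28 parameters of the unstructured matrix.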

Using the Unstructured (UN) covariance structure, Table 7 shows that there are significant effects of the repeated measure (week) and of the group by week interaction, since the Sig. values in the table are less than α = 0.05.

TABLE 7: Multivariate Analysis of Variance (MANOVA)

| Effect | Statistic | Value | F | Hypothesis df | Error df | Sig. | Partial Eta Squared |
|---|---|---|---|---|---|---|---|
| Week | Pillai's Trace | .874 | 25.49 | 6.000 | 22.000 | .000 | .874 |
| Week | Wilks' Lambda | .126 | 25.49 | 6.000 | 22.000 | .000 | .874 |
| Week | Hotelling's Trace | 6.951 | 25.49 | 6.000 | 22.000 | .000 | .874 |
| Week | Roy's Largest Root | 6.951 | 25.49 | 6.000 | 22.000 | .000 | .874 |
| Week * Group | Pillai's Trace | 1.611 | 2.060 | 30.000 | 130.0 | .003 | .322 |
| Week * Group | Wilks' Lambda | .073 | 2.781 | 30.000 | 90.000 | .000 | .408 |
| Week * Group | Hotelling's Trace | 5.421 | 3.686 | 30.000 | 102.0 | .000 | .520 |
| Week * Group | Roy's Largest Root | 4.204 | 18.22 | 6.000 | 26.000 | .000 | .808 |

DISCUSSION

In this study, the data were found to be normally distributed, since the significance value of the Shapiro-Wilk test is greater than 0.05 (p > 0.05). The correlations between the weekly weights were all positive, with the highest correlations between week 0 and week 1, week 5 and week 6, and week 3 and week 4. Levene’s test indicates that the variances are homogeneous at all levels of the repeated-measures variable (all Sig. values are greater than 0.05).

The sphericity assumption was violated according to Mauchly’s test (P < 0.05), so evaluating the fixed effects from the unadjusted repeated measures ANOVA may lead to faulty interpretations. Therefore, the Greenhouse-Geisser and Huynh-Feldt degrees-of-freedom adjustments were used to assess the significance of the corresponding F tests.

The epsilon (ε) values used to adjust the degrees of freedom are 0.655 for Greenhouse-Geisser and 0.848 for Huynh-Feldt; the lower bound, i.e. the smallest value that ε can take, is 0.167. The within-subject effect (week) and the group by week interaction were found to be significant, since the Sig. values are less than α = 0.05. The between-subject effect was also found to be highly significant (p < 0.05).

According to the information criteria AIC and AICC, Unstructured (UN) was the best covariance structure, since it gives the smallest value on both, followed by heterogeneous autoregressive ARH(1), while the worst was compound symmetry (CS). Under SBC, AR(1) had the smallest value. The UN structure captures the growth and development pattern and the week-to-week variation in the weights of the rats over the trial period.

Similarly, using the unstructured (UN) covariance structure, the multivariate analysis of variance (MANOVA) shows significant effects of the repeated measure (week) and of the group by week interaction, since the Sig. values are less than α = 0.05.

From the above results of the analysis, the following recommendations were made:

1. Before conducting any repeated measures analysis, sphericity should be tested, so that an appropriate covariance structure can be chosen for the data and misinterpretation avoided.

2. The covariance structure that best fits the data should be used for further analysis.

3. The mixed model approach is recommended for repeated measures analysis, since it allows different covariance structures to be specified.

4. For any data set with missing observations, the mixed model approach is recommended as the more appropriate method.

5. When sphericity is met, repeated measures ANOVA is recommended.

REFERENCES

Algina, J., Wilcox, R.R. and Kowalchuk, R.K. (2000). The analysis of repeated measure. A quantitative research synthesis. British Journal of Mathematical and Statistical Psychology, 34: 1735-1748.

Algina, J. and Oshima, T.C. (1994). Type I error rates for Huynh’s General Approximation and Improved General Approximation tests. British Journal of Mathematical and Statistical Psychology, 47: 151-165.

Algina, J. and Keselman, H.J. (1997). Detecting repeated measures effects with univariate and multivariate statistics. Psychological Methods, 2: 208-218.

Crowder, Y. and Hand, D. (1990). A repeated measure design and formulation. Journal of Animal and Plant Science, 23(4): 46-63.

Davis, D. (2002). Repeated measures: psychology research. Pers Soc Psychol Bull, 32: 45-58.

Girden, E.R. (1992). ANOVA: Repeated measures. Newbury Park, CA: Sage.

Hand, D.J. and Taylor, K. (1987). Experimental Designs including repeated measurements in Real-life. CA: Wadsworth.

Keselman, H.J., Keselman, J.C. and Lix, L.M. (1995). The analysis of repeated measurements: Univariate tests, multivariate tests, or both? British Journal of Mathematical and Statistical Psychology, 48: 319-338.

Keselman, J.C., Lix, L.M. and Keselman, H.J. (1996). The Multivariate analysis of repeated Measurements in SAS: A quantitative research synthesis. British Journal of Mathematical and Statistical Psychology, 49: 275-298.

Maxwell, S.E. and Delaney, H.D. (1990). Designing experiments and analyzing data: A model comparison perspective. Belmont, CA: Wadsworth.

Maxwell, S.E. and Delaney, H.D. (2004). Designing experiments and analyzing data: A model comparison perspective (2nd ed.). Mahwah, NJ: Lawrence Erlbaum Associates.

Pancarci, S. M., Gurbulak, K. H., Oral, M., Karapehlivan, R., Tunca and Çolak A. (2009). Effect of Immunomodulatory Treatment with Levamisole on Uterine Inflammation and Involution, Serum Sialic Acid Levels and Ovarian Function in Cows. Kafkas University Journal Veterinary Faculty, 15(1): 25-33.

Pancarci S. M., C. Kacar, M., Ogun, O. Gungor, K., Gurbulak, H., Oral, M., Karapehlivan and Citil M. (2007). Effect of L-Carnitine Administration on Energy Metabolism During Periparturient Period in Ewes. Kafkas University Journal Veterinary Faculty, 13(2): 149-154.

Winter, W.R.D. (1991). Conducting repeated measures analyses using regression: The general linear model lives. Paper presented at the annual meeting of the Mid-South Educational Research Association, New Orleans, Louisiana.
