The research problem lies in assessing the type
and degree of influence that the organized camping
experience has on the self constructs of youth.
In order to determine this influence, a random
effects model of meta-analysis was employed. Use
of this meta-analytical technique allows for the
combination and comparison of research results
providing information that can be generalized
to the population. This chapter addresses the
treatment of data, the reliability of that treatment,
the data analysis, and the results. The guiding
work for the procedures described in this chapter
was The Handbook of Research Synthesis
(Cooper & Hedges, 1994).
Treatment of the Data
The Sample
A total sample of sixty-one (61) studies was evaluated
for their ability to supply relevant data with
regard to the research question. Of this sample,
twenty-two (22) studies measured a construct
of self and provided sufficient data from which
to calculate an effect size. Sufficient data means,
at a minimum, the mean and standard deviation
of the pre- and post-treatment measures and an
N value. These 22 studies provided
sufficient data to identify a sample of thirty-seven
(37) effect sizes. These effect sizes represent
37 independent measures, taken at thirty-six (36)
different camps. The 37 cases were then subject
to the methodology described in Chapter 3. Appendix
B contains a list of the entire sample evaluated
for inclusion in the meta-analysis and the reason
for any study's exclusion from the analysis.
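The minimum data described above (pre and post means, standard deviations, and N) are enough to compute a standardized mean difference. The following is a minimal sketch, not the procedure documented in Chapter 3: it assumes both measures share the same N and are pooled like two groups, and it also returns the common small-sample bias-corrected value. The function name and the numbers are hypothetical.

```python
import math

def standardized_mean_difference(m_pre, sd_pre, m_post, sd_post, n):
    """Return (g, g_adjusted) for pre/post means and SDs, each based on n subjects.

    Assumption: both measures share the same n and are pooled like two groups.
    """
    pooled_sd = math.sqrt((sd_pre ** 2 + sd_post ** 2) / 2)
    g = (m_post - m_pre) / pooled_sd
    df = 2 * (n - 1)
    correction = 1 - 3 / (4 * df - 1)  # small-sample bias correction
    return g, g * correction

# Hypothetical study: values are illustrative, not taken from the sample.
g, g_adj = standardized_mean_difference(m_pre=50.0, sd_pre=10.0,
                                        m_post=53.0, sd_post=10.0, n=30)
print(round(g, 3), round(g_adj, 3))
```

The corrected value is always slightly smaller in magnitude than the uncorrected one, and the gap shrinks as N grows.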
Coding Procedures
The coding protocol was developed by the researcher,
using the research questions as a foundation for
the data extracted. The coding sheet was then
reviewed by a panel of experts. The coding sheet
was amended and the data re-coded. A panel of
coders was then employed to verify the coding
process. Suggestions from this experience were
used to amend the coding key a second time. The
data was re-coded. The final data coding sheet
and key were then reviewed and confirmed by the
panel of experts, and the data re-coded for the
final time (Cooper & Hedges, 1994; Electric...,
1987; Sacks, et al., 1994). The primary focus
of the coding was on the quality of the research
method employed and the accurate extraction of
data used to identify the potential moderators
to the effect. A weighting factor was assigned
to each study based on the coder's estimation of
quality derived from the coding process (Cooper
& Hedges, 1994; Rosenthal, 1984; Wolfe, 1986).
Reliability of the Coding Procedure
The Effective Reliability in Table 4.1 expresses
a measure of the level of repeatability of the
quality rating aspect of the coding process that
could be expected if the effort were to be duplicated.
In this case (r = .9982) the coding key
provides sufficient information for a high level
of repeatability of the coding used to estimate
the quality rating of each study.
The quality rating was used to compute a weighting
factor and combined with the inverse of each study's
variance for the evaluation of the quality weighted
effect sizes (Cooper & Hedges, 1994). The
use of weightings for calculating effect size
is discussed later in this chapter in the section
on Results. Reliability of the coding of the data
extracted from the sample can be established through
the initial coding and the subsequent re-codings
during the development and verification of the
coding key.
Table 4.1 - Effective Reliability of the Mean of Judge's Ratings for Quality
| Relationship of Coders | r |
| r - Primary Coder / Verification Coder J = | 0.9960 |
| r - Primary Coder / Verification Coder M = | 0.9919 |
| r - Verification Coder J / Verification Coder M = | 0.9960 |
| mean r = | 0.9946 |
| r of the Effective Reliability of the Mean of Judge's Ratings = | 0.9982 |
| Rosenthal, (1984). | N=35 |
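The chapter cites Rosenthal (1984) for the effective reliability but does not print the formula. The sketch below assumes the Spearman-Brown form for the mean of three judges' ratings; under that assumption it reproduces the mean r and effective reliability reported in Table 4.1. The function name is hypothetical.

```python
from statistics import mean

def effective_reliability(pairwise_r, n_judges):
    """Spearman-Brown reliability of the mean of n_judges ratings (assumed form)."""
    r_bar = mean(pairwise_r)
    return n_judges * r_bar / (1 + (n_judges - 1) * r_bar)

pairwise = [0.9960, 0.9919, 0.9960]  # inter-coder correlations from Table 4.1
r_bar = mean(pairwise)
R = effective_reliability(pairwise, n_judges=3)
print(round(r_bar, 4), round(R, 4))
```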
Data Analysis
The Sample of Studies
Studies included in the meta-analysis were selected
based on the following criteria:
- The study was experimental, pre-experimental,
or quasi-experimental in design.
- Age was initially a criterion for including
studies in the meta-analysis. This criterion
was eliminated as a parameter for the decision
to include or not include a study in the final
analysis. The decision to eliminate the age
criterion approximately doubled the number of
effect sizes included in the study. The final
range of ages included in the analysis is from
six to twenty years old. Thus, use of this approach
enhanced generalizability by expanding the population
of studies to reflect the range of ages of subjects
that attend camps. Furthermore, there is some
difficulty in associating development stage
with chronological age. The elimination of the
age criterion was thought to better represent
the spectrum of youth that experience the organized
camping environment.
- The study measured a construct of self, as
defined in Chapter 2, primarily either self-concept
or self-esteem.
- Individual studies reflected an acceptable
level of validity, which was evaluated based
on criteria outlined in the Coding Key (Appendix
G). Statistical methods employed depended on
information provided in the study (Cooper &
Hedges, 1994; Hunt, 1997; Rosenthal, 1984; Wolfe,
1986). No studies were eliminated from the sample
for failure to meet this criterion.
- The study provided adequate statistical information
or data to be useful in the greater meta-analysis:
either a reported effect size or the data to
calculate an effect size.
- Studies which exhibited a fundamental flaw,
defined as not meeting criteria one through
five above, were not included. Notation about
a study's exclusion was recorded (Cooper &
Hedges, 1994; Hedges & Olkin, 1985; Hunt,
1997; Hunter, Schmidt & Jackson, 1982; Light
& Pillemer, 1984; Wachter & Straf, 1990)
and can be found in Appendix B.
- Unpublished studies and journals that are
not refereed were screened for appropriate compliance
with human subjects procedural protocols that
were in effect at the time the study was conducted.
A study's mention of parental permission and
a subject's freedom to discontinue participation
at any time was taken as evidence of compliance.
Given the refereeing process for journals and
dissertations, refereed studies were assumed to
be in compliance.
The Random Effects Model
The random effects model of meta-analysis presumes
that the sample of studies analyzed is based on
a greater population of studies that differ from
the studies in the sample in two ways. The population
and the sample of studies differ based on characteristics
and effect size parameters. In the fixed effects
model, studies are in groups with similar characteristics
and effect size parameters, and the greater population
is composed of these groups. The second difference
between population and sample in the random effects
model is that the characteristics of the subjects
in the sample studies differ from the greater
population as a result of the randomization process.
The random effects model mathematically reduces
to a fixed effects model if the variance across
the effect sizes is homogeneous. An identified
random effect that is significant is, by definition,
generalizable to the population (Cooper &
Hedges, 1994).
Precedent, set by Andrews, Guitar, & Howie
(1980) for including pre-experimental studies,
recognizes the accomplishment of randomization
through the spectrum of subjects that result from
the combination of numerous studies. The spectrum
of subjects in the various studies included in
this meta-analysis supports this notion of randomization.
An overview of the studies in this research supports
the arguments for randomization and generalizability.
The thirty-seven effect size cases combined in
this research were generated from samples with
a cumulative total of 1139 subjects. The profiles
available covered a broad spectrum of cultural,
socio-economic, gender, and ability variables
across the sample analyzed in this research.
The data was analyzed according to the procedures
described in the Chapter 3 section titled Statistical
Analysis and Interpretation. In the majority of
cases the means and standard deviations of the
pre- and post-treatment, or control and experimental
groups, were used to generate a Hedges' g
for each study. Appendix J contains a matrix of
data that was used to calculate the mean effect
sizes for the sample. The combination of effect
sizes was analyzed for homogeneity of variance
using the Q statistic. The Q was then compared
to the critical χ² value (p < .05, df = k - 1,
with k the number of cases);
rejection of the hypothesis of homogeneity was
based on the Q value exceeding the critical χ² (Cooper
& Hedges, 1994).
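The homogeneity test described above can be sketched as follows. This is an illustration under stated assumptions, not the thesis's own computation: Q is taken as the inverse-variance weighted sum of squared deviations from the weighted mean, the toy data are hypothetical, and the critical values are hard-coded (5.991 for df = 2, and the chapter's 50.9985 for df = 36) rather than computed.

```python
def q_statistic(effects, variances):
    """Return (Q, weighted mean) using inverse-variance weights."""
    weights = [1.0 / v for v in variances]
    mean_fixed = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    q = sum(w * (e - mean_fixed) ** 2 for w, e in zip(weights, effects))
    return q, mean_fixed

# Hypothetical three-case example (not data from the sample).
effects = [0.10, 0.30, 0.50]
variances = [0.04, 0.04, 0.04]
q, mean_fixed = q_statistic(effects, variances)
reject_toy = q > 5.991  # chi-square critical value, p < .05, df = 2

# Values reported in this chapter for Pearson r (k = 37, df = 36).
q_reported = 127.2930
chi2_critical_reported = 50.9985
reject_reported = q_reported > chi2_critical_reported

print(round(q, 4), reject_toy, reject_reported)
```

With the reported values, Q exceeds the critical χ², so homogeneity is rejected; the toy data would not reject it.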
Rejection of the hypothesis of homogeneity indicates
a variance component of the mean effect size that
significantly exceeds zero: the variance is due
to more than chance alone. The unconditional
variance component was then recalculated according
to Cooper & Hedges (1994), and the conditional
variance component isolated. Based on the conditional
variance, a 95% confidence interval was used to
establish significance of the random variance
and the generalizability of that variance to the
population. An interval that excludes zero indicates statistical
significance and generalizability of the random
variance (Cooper & Hedges, 1994).
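To make the variance-component step concrete, here is a minimal sketch using a common method-of-moments estimator of the between-study variance and a 95% interval around the random-effects mean. Whether this matches the exact recalculation taken from Cooper & Hedges (1994) is an assumption; the function name and data are hypothetical.

```python
import math

def random_effects_summary(effects, variances):
    """Return (tau2, mean, (ci_low, ci_high)) for a random-effects combination."""
    k = len(effects)
    w = [1.0 / v for v in variances]
    fixed_mean = sum(wi * e for wi, e in zip(w, effects)) / sum(w)
    q = sum(wi * (e - fixed_mean) ** 2 for wi, e in zip(w, effects))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c)  # between-study variance component
    w_star = [1.0 / (v + tau2) for v in variances]
    mean = sum(wi * e for wi, e in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return tau2, mean, (mean - 1.96 * se, mean + 1.96 * se)

# Hypothetical effect sizes and sampling variances.
tau2, mean_re, (ci_low, ci_high) = random_effects_summary(
    [0.0, 0.2, 0.6], [0.01, 0.01, 0.01])
excludes_zero = ci_low > 0 or ci_high < 0
print(round(tau2, 4), round(mean_re, 4), excludes_zero)
```

In this toy case the interval spans zero, so the combined effect would not be judged significant; the intervals in Table 4.2 all lie above zero.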
In order to identify moderator variables that
are correlated with the random effect, explaining
some of the associated random variance, a triangulation
approach to analysis was used. The first angle
of analysis used in the identification of moderator
variables was the use of data-point line plots
to identify relationships (Electric..., 1994;
Cooper & Hedges, 1994; Rosenthal, 1984). Moderator
variables were identified through visible relationships
to the distribution of effect sizes from the 37
cases taken from the sample of studies. Potential
moderators were then included in a regression
analysis. In order for this to be meaningful,
the coded data was arranged in some order so that
identified variances could be explained and graphic
representations could be visually interpreted.
The ordering of the data for each of the moderators
is presented in the Variable Key in Appendix K.
A step-wise linear regression was
performed on the Hedges' g, Pearson r,
and Fisher Zr effect sizes, as the
second angle of analysis (Cooper & Hedges,
1994). In the regression, moderators that contributed
to the random variance were isolated for analysis
in order to add to the explanation of the random
variance (Appendix L). SPSS 8.0 for Windows was
used to calculate the step-wise regression.
The final angle of analysis used
to identify moderators was the comparison of various
combinations of the effect sizes from the 37 cases
(Cooper & Hedges, 1994; Hunt, 1997; Wachter
& Straf, 1990). These combinations were based
on the moderator variables identified in the plot
and regression analysis described above. Effect
sizes were combined based on an individual study's
relationship to these moderator variables and
then compared to other combinations in order to
identify the largest effect.
As an example of effect size sensitivity
analysis, all the studies from day camps could
be combined to generate a random effect size for
day camps. A similar combined random effect could
be generated for resident camps. Comparing the
two would give an idea about the influence of
the day versus resident camp environment on the
magnitude of the associated random effect. There
was an insufficient amount of data on day camps
in this study to actually make this comparison.
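The day-versus-resident comparison described above could be sketched as below. The cases are hypothetical, and for brevity the subgroups are combined with simple inverse-variance weights rather than the full random-effects weighting used in this chapter.

```python
def combined_effect(cases):
    """Inverse-variance weighted mean of (effect, variance) pairs."""
    weights = [1.0 / var for _, var in cases]
    return sum(w * es for w, (es, _) in zip(weights, cases)) / sum(weights)

def subset(cases, camp_type):
    """Keep the (effect, variance) pairs for one camp type."""
    return [(es, var) for kind, es, var in cases if kind == camp_type]

# Hypothetical cases: (camp type, effect size r, variance).
cases = [
    ("resident", 0.20, 0.01),
    ("resident", 0.10, 0.01),
    ("day", 0.05, 0.02),
    ("day", 0.15, 0.02),
]

resident_r = combined_effect(subset(cases, "resident"))
day_r = combined_effect(subset(cases, "day"))
print(round(resident_r, 4), round(day_r, 4))
```

The same pattern applies to any coded moderator: filter the cases by category, recombine, and compare the resulting effects.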
Results
Interpretation of the effect size
requires some caution, as different metrics have
different scales. Hedges' g was used to
calculate the initial effect size for most of
the cases. Because of the limitations of the data
included in some studies, calculating Pearson's
r was necessary to establish an initial
effect in these cases. Hedges' g, a dichotomous
measure, is most appropriate for these data
because it is based on the difference between
pre and post means and an associated
estimate of standard deviation (Cooper & Hedges,
1994). A problem exists with dichotomous, d-index,
estimators, as they tend to overestimate the population
effect for small sample sizes (Hunt, 1997). Each
initial g or r was transformed
into the other measure so that an r and
a g were available for each case.
The Pearson r is applicable to correlational
data as well as dichotomous data. There are different
equations for calculating r for each of
the d and r indices. Also, Pearson's
r is readily interpretable (Cooper &
Hedges, 1994; Hunt, 1997). As a comparative measure,
Fisher's Z transformation of r was also
calculated. The Fisher Zr is a z
score transformation that corrects r for
the skew that arises in the r distribution
as the population correlation gets farther
from zero (Cooper & Hedges, 1994).
For smaller sample sizes and an aggregated r
value that is less than .25, calculating Zr
provides no additional meaningful information
(Hunt, 1997).
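The transformations among the three metrics can be sketched as follows. The g-to-r conversion shown assumes equal group sizes, which may differ from the equations actually applied to each case; the function names are hypothetical. The Fisher transformation is the inverse hyperbolic tangent of r.

```python
import math

def r_from_g(g):
    """Convert a standardized mean difference to r (assumes equal group sizes)."""
    return g / math.sqrt(g ** 2 + 4)

def g_from_r(r):
    """Inverse of r_from_g under the same equal-group-size assumption."""
    return 2 * r / math.sqrt(1 - r ** 2)

def fisher_z(r):
    """Fisher's Z transformation: 0.5 * ln((1 + r) / (1 - r))."""
    return math.atanh(r)

r = r_from_g(0.5)            # illustrative g, not a value from the sample
print(round(r, 4), round(g_from_r(r), 4))
print(round(fisher_z(0.1), 4), round(fisher_z(0.9), 4))
```

For small correlations the transformed value stays very close to r itself, which is consistent with the point that Zr adds little when the aggregated r is below .25.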
Appendix M presents a graph of the g,
r, and Zr effect sizes, arranged
from highest to lowest value. This graph provides
a good visual interpretation of the relationship
between the three effect size calculations. The
Pearson r can be seen to provide the tightest
distribution around the zero axis, a function
of the limits of the r scale, -1 to 1 (Hunt,
1997). Because of the Pearson r distribution's
applicability to d-index data, the ease
of interpretation of r, the lack of need
for the use of the Fisher's Zr transformation,
and some small sample sizes among the studies
that would overstate the dichotomous Hedges' g
values, the Pearson r will be used for
the interpretation of the random effect. The distribution
of r's for each case is also presented
in Appendix M.
All three effect size estimators (r, g,
and Zr) were used as dependent variables
in three separate regression analyses. This was
done in order to identify any potential differences
in the estimators and to enhance identification
of moderators. The final weighting of effect sizes
was based on the inverse of the effect's variance
(Cooper & Hedges, 1994). Recall from earlier
discussion that the inverse of the variance provides
more weight to those effect sizes that are the
result of studies done with greater precision.
Use of the quality weighting factor generated
during the coding process did not produce significantly
different results, and in light of Cooper &
Hedges' (1994) preference for the inverse of the
variance, quality weighting was dropped from the
analysis. Dropping quality weighting also spares
the reader from having to interpret an additional
set of results.
Only sensitivity analyses that identified a moderator
variable as contributing to the variance of the
effect are discussed in the data interpretation.
Effect sizes could not be compared based on gender,
socio-economic status or cultural background variables
because this information was not available in
enough detail across the cases to provide a meaningful
interpretation of the influences of these potential
moderators. The variables of the timing of the
measure, length of the camp session, sample size,
research design, instrument used to measure the
treatment, alpha level or Type I error probability,
and study quality weighting were found not to
be moderating variables in this analysis.
Interpretation of the Random Effect
Table 4.2 presents the mean effect
size estimates, Q statistics, and confidence intervals
for the sample of 37 effect sizes. These results
indicate a significant positive random effect
that is generalizable to the population (Cooper
& Hedges, 1994). The magnitude of the positive
effect (r = .1023) can be interpreted as
small (Cohen, 1969; Cooper &
Hedges, 1994).
Table 4.2 - Mean Random Effect Comparison for g, r, and Zr.
| | Hedges' g | Pearson r | Fisher Zr |
| Mean Random Effect = | .2517 | .1023 | .4308 |
| 95% Confidence = | .1168 - .3866 | .0457 - .1606 | .3748 - .4858 |
| N=37 | | | |
| Q statistic = | 165.9668 | 127.2930 | 89.0058 |
| χ² (p<.05, 36) = | 50.9985 | 50.9985 | 50.9985 |
An intuitive and easily understood tool for interpreting
the effect size is the Binomial Effect Size Display
(BESD), a common way of interpreting an effect
size (Cooper & Hedges, 1994; Rosenthal, 1984).
The BESD answers the question, What is the effect
on, or change in, the success rate as a result
of the treatment? Because the difference in success
rates is identical to r, interpretation
is simplified (Rosenthal, 1984). In this case
r = .1023, thus 10.23% more of the population
experiencing camp achieve significant positive
increases in a construct of self than the rest
of the population. According to the prescribed
calculations for the BESD, .5 ± r/2
(Cooper & Hedges, 1994), this effect can be
restated as a change in the proportion of people affected
from 45% to 55%. This restatement provides an
easily understood interpretation: while the
r represents a small effect, an effect
on 10% of the population is significant.
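The BESD calculation above is simple enough to check directly. The sketch below applies .5 ± r/2 to the reported r = .1023; the function name is hypothetical.

```python
def besd(r):
    """Return the (lower, upper) success rates implied by .5 - r/2 and .5 + r/2."""
    return 0.5 - r / 2, 0.5 + r / 2

low, high = besd(0.1023)  # r reported in Table 4.2
print(round(low, 4), round(high, 4))
```

The two rates round to 45% and 55%, and their difference equals r by construction.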
Given the nature of a meta-analysis' reliance
on locating research results, an estimate should
be calculated of how many non-significant results
would be required to reverse the findings
of a study (Cooper & Hedges, 1994; Hunt, 1997).
This file-drawer effect, as it is known,
is calculated using a formula that sums the effect
size from each case divided by its associated
standard error. The sum is then divided by 1.96
and squared, and then added to the negative value
of the number of studies in the meta-analysis,
37 in this case. The value 1.96, a constant, represents
the z score for a two-tailed significance
test at the 95 percent confidence level (Cooper
& Hedges, 1994). The result is the number
of studies required to reverse the findings for
the study. In the case of this meta-analysis the
file-drawer effect is 296 studies. Thus, it would
take 296 studies with an average result of non-significance
in order to negate the findings of this meta-analysis.
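The verbal formula above translates directly into code. This sketch follows the description as written (sum of effect size over standard error, divided by 1.96, squared, minus the number of cases); the function name and the three-case input are hypothetical and are not the data behind the reported value of 296.

```python
def file_drawer_n(effects, std_errors, z_crit=1.96):
    """Number of null results needed to overturn the finding, per the text's formula."""
    k = len(effects)
    z_sum = sum(es / se for es, se in zip(effects, std_errors))
    return (z_sum / z_crit) ** 2 - k

# Hypothetical cases for illustration only.
n_fs = file_drawer_n([0.2, 0.3, 0.1], [0.1, 0.1, 0.1])
print(round(n_fs, 2))
```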
Identifying Moderator Variables
The step-wise regression analysis
of the g, r, and Zr effect
sizes indicates that the philosophy of the camping
program and, to a lesser extent, age were explanatory,
or moderator, variables for the variance of the
random effect. Table 4.3 presents the results
of the regression. The moderators of camp philosophy
and age account for 33% of the variance (R² =
.330, for r = .1023). The graphs in Appendix
N provide a corresponding visual interpretation
of the age and camp philosophy moderators and
their relationship to the random effect, r.
Table 4.3 - Results of the Step-Wise Regression Analysis.
| | Predictors | |
| Dependent Variables | Camp Philosophy | Age |
| Hedges' g - | 2.667 | - |
| Significance level | .012 | .035 |
| Pearson r - | 3.054 | - |
| Significance level | .004 | .028 |
| Fisher Zr - | 2.919 | - |
| Significance level | .006 | .031 |
Identifying the meaningful moderator variables
as the camp philosophy and the age of the subjects
permitted an effect size sensitivity analysis
as described in the above section on the Random
Effects Model. Sensitivity to the camp philosophy
was explored by recognizing a positive correlation
with r and sequentially eliminating the lowest
ranked category for the camp philosophy. For the
purpose of this study, coded camp philosophies
were divided into three categories termed campgoal
2, campgoal 3 and campgoal 4:
- Campgoal 2) A structured environment focused
on competence or knowledge development. Enhancement
of a self construct is not part of the camp
philosophy.
- Campgoal 3) An experience designed for the
development of personal leisure skills or an
environment in which to have fun. Enhancement
of a self construct is not a stated part of
the philosophy. The focus of the camp is also
not based on developing a competence. This category
is operationally termed general camp.
- Campgoal 4) The development of some construct
of self was a focus of the program and camp
philosophy. Effort, recognized in the primary
study, was made to create an environment that
was conducive to enhancing a self construct.
Table 4.4 - Effect Size Sensitivity Analysis for the Campgoal Moderator Variable.
| | Campgoal Categories | | |
| | 2 - competence, 3 - general camp, 4 - self focus | 3 - general camp, 4 - self focus | 4 - self focus |
| Random effect - r | .1023 | .1585 | .2006 |
Table 4.4 presents the change in the random effect
for the sensitivity analysis of the camp philosophy
moderator variable. Eliminating the campgoal 2
camps, those focused on competence development,
made a significant increase in the r value.
Similarly, when the camps with a focus on self enhancement
were taken alone, the effect increased again.
The implication of this is that programs designed
with the intent of enhancing a self construct
have a greater effect on the evaluation of self.
Appendix O provides a graph of the relationship
between r and age, sorted by the campgoal categories.
Evaluation of the age variable in the sensitivity
analysis showed that eliminating the 13 to
18 year old category produced the largest random
effect of any single age-group elimination (Table
4.5). This table highlights the need to consider
the data in the context of other moderators.
Removing age category 2 would increase the overall
effect to r = .1208, but eight to twelve
year olds would still be included in the identified
random effect. The information in this table should
be viewed in the context of the plot in Appendix
O, as the outcomes from sensitivity are likely
related to the camp philosophy. This is particularly
true in this context because the breakdown of
the age category does not provide for easy interpretation.
Examination of the graph in Appendix P also shows
that the youngest age group had the majority of
the highest scores. The negative regression correlation
between r and age (t = -2.292, p
< .031) also indicates that there is a
greater influence on the self at younger ages.
The results indicate that the camp philosophy
has an influence on the magnitude of the random
effect. This correlation suggests that a camp
philosophy that is oriented toward enhancing a
camper's self constructs has a more significant
effect. To a lesser extent, age is negatively
correlated to the random effect, indicating that
the effect is greater for younger campers than
for older campers.
Table 4.5 - Effect Size Sensitivity Analysis for the Age Moderator Variable.
| | Age Category Removed | | | | | | |
| | full study | 1 - age 6 - 10 | 2 - age 8 - 12 | 3 - age 10 - 15 | 4 - age 13 - 18 | 5 - age 7 - 15 | 6 - age 8 - 20 |
| Resulting Random Effect - r | .1023 | .0655 | .1208 | .0802 | .1257 | .1129 | .1107 |
Summary
A sufficient sample of studies
was gathered to produce effect sizes and perform
a meta-analysis. Reliability of the coding process
was established through expert and verification
coder review. An effect size was generated and
evaluated as being a significant positive random
effect. By definition this positive effect is
generalizable to the population. The Pearson r
was identified as the most useful effect size
estimator because of its generality and ease of
interpretation. The data was then analyzed using
data-point line plots, regression analysis and
an effect size sensitivity analysis, in order
to identify moderator variables. The explanatory
variables were identified as being those of a
camp philosophy related to enhancing self, and
the age of the camper.