Research Design & Stats

Threats to Validity

It is extremely difficult, if not impossible, to control for all of the possible complications and events that may affect subjects participating in a research project. However, responsible, ethical methodology dictates that all threats to the validity of a study be considered and accounted for throughout the design process. While a researcher is expected to account for all of the possible influences that may have altered the results, some are reluctant to do so, fearing that such acknowledgment may discredit their work. Life care planners must be aware of these threats so that we may independently evaluate the efficacy of research studies and draw alternative conclusions from the data where appropriate.

The validity of a research design is evaluated in two ways: the internal validity of the study and the external validity of the study.

Internal Validity

Internal validity evaluates the extent to which extraneous factors, rather than the treatment, may have produced the outcomes of the study. When a researcher designs the methodology to be employed throughout the project, careful consideration is given to all factors that may exert an unintended effect and cause subjects to respond differently than they would have otherwise. Internal validity seeks to answer the question:

Was the treatment responsible for the results of the study or was it something else?

1. Sample Selection: Consider the fundamental differences between the control group and the treatment group, or between the subjects who are being compared. There may have been significant differences between these groups from the conception of the project.

How were these groups selected? If subjects were randomly selected and randomly assigned to groups, the threat is decreased. A researcher may administer a pre-test to all subjects then compare the responses of the groups to ensure that they are similar before introducing the treatment or intervention.
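
The random assignment and pre-test comparison described above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed procedure: the function names, the seed values, and the simulated pre-test scores are all assumptions made for the example.

```python
import random


def randomly_assign(subject_ids, seed=None):
    """Shuffle the subjects and split them into two equal-sized groups."""
    rng = random.Random(seed)
    shuffled = list(subject_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def mean(values):
    return sum(values) / len(values)


# Hypothetical pre-test scores keyed by subject ID (simulated, illustrative only).
score_rng = random.Random(0)
pretest = {subject_id: score_rng.gauss(50, 10) for subject_id in range(100)}

treatment, control = randomly_assign(pretest.keys(), seed=1)

# Compare the groups on the pre-test before any treatment is introduced.
gap = mean([pretest[s] for s in treatment]) - mean([pretest[s] for s in control])
print(f"Pre-test mean difference between groups: {gap:.2f}")
```

A small pre-test gap suggests the groups were comparable before the treatment; a large one would be a reason to question the selection process.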

2. History: Events may occur during the course of a study that impact the responses of the subjects. For example, a national news story which is closely related to the topic of the study, or a natural disaster occurring in an area where many of the subjects reside may influence their responses to the treatment or intervention.

What did the subjects experience during the course of the study? A researcher may ask all participants to complete the study in an isolated setting or to keep a diary detailing the events of their lives throughout the project. If the subjects were randomly assigned to groups, theoretically, extraneous factors will influence the groups equally.

3. Mortality: Some of the subjects may drop out of the project, move, or be unreachable for follow-up evaluation. This may present difficulty for the researcher if mortality affects groups at different rates. For example, in a two-group study of 50 participants each, if 15 drop out of Group A and only two drop out of Group B the groups may no longer be suitable for comparison. A researcher must try to identify the cause for attrition.

How many subjects dropped out of the study? A researcher may attempt to re-establish contact with subjects or rely upon statistical procedures designed to account for missing data (Campbell & Stanley, 1966).
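
The two-group example above (15 of 50 dropping out of Group A, 2 of 50 out of Group B) can be checked numerically. The sketch below computes each group's attrition rate and, as one possible check that is not taken from the source, a pooled two-proportion z-test. The function names are hypothetical, and with counts this small an exact test may be preferable to the normal approximation used here.

```python
import math


def attrition_rate(dropped, enrolled):
    """Proportion of enrolled subjects who left the study."""
    return dropped / enrolled


def two_proportion_z(dropped_a, n_a, dropped_b, n_b):
    """Pooled two-proportion z statistic and its two-sided p-value."""
    p_a = dropped_a / n_a
    p_b = dropped_b / n_b
    pooled = (dropped_a + dropped_b) / (n_a + n_b)
    std_error = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / std_error
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


rate_a = attrition_rate(15, 50)
rate_b = attrition_rate(2, 50)
z, p_value = two_proportion_z(15, 50, 2, 50)

print(f"Group A attrition: {rate_a:.0%}, Group B attrition: {rate_b:.0%}")
print(f"z = {z:.2f}, two-sided p = {p_value:.4f}")
```

Attrition rates of 30% and 4% differ by far more than chance alone would typically produce, which is why the remaining groups may no longer be comparable.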

4. Location: Consider the location and circumstances under which the first set of data was gathered as compared to the location and circumstances under which the second set of data was gathered. If the situations were different, the setting or circumstances under which the data were collected (rather than the treatment or intervention) may have influenced the responses of subjects.

What were the circumstances under which all sets of data were collected? A researcher should consider the quality of testing environments, similarity among sites, and describe the general testing circumstance.

5. Instrumentation: Changes in the calibration of the testing instruments or equipment, or changes in observers or scorers may affect the data. Life care planners should be aware of the conditions under which measurement instruments were normed, how they should be administered, and the purpose for which they were developed. There is also the possibility of scorer bias, whether conscious or unconscious.

Were the measurement instruments correctly used? A researcher may randomly assign scorers to participant groups, or employ blind or double-blind data collection techniques. A researcher may train and then pre-test scorers so that all are clear as to what is/is not to be tabulated and how the scores should/should not be derived. These pretests can be analyzed for inter-rater reliability and intra-rater reliability before scorers are given the responsibility of data collection.
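
Inter-rater reliability on scorer pretests can be quantified in several ways. The sketch below uses Cohen's kappa, one common agreement statistic for two raters assigning categorical codes; the choice of kappa, the function name, and the sample ratings are assumptions for illustration rather than a method named in the source.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: agreement between two raters, corrected for chance."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must score the same non-empty set of items.")
    n = len(rater_a)
    # Proportion of items on which the raters gave the same code.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1:
        return 1.0  # Both raters used a single identical code throughout.
    return (observed - expected) / (1 - expected)


# Hypothetical pretest: two scorers code the same ten observations.
scorer_1 = ["yes", "yes", "no", "no", "yes", "no", "yes", "no", "yes", "yes"]
scorer_2 = ["yes", "no", "no", "no", "yes", "no", "yes", "yes", "yes", "yes"]

kappa = cohens_kappa(scorer_1, scorer_2)
print(f"Cohen's kappa: {kappa:.2f}")
```

Here the scorers agree on 8 of 10 items, but because half of that agreement could arise by chance, kappa is noticeably lower than the raw 80% agreement.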

6. Testing: The “practice effect” of pretesting may influence the outcome of posttests, particularly when the contents of these assessments are closely related. In addition, the contents of a pretest may make subjects more sensitive or responsive to the treatment or intervention.

What effect might the pretest have exerted upon the results of the posttest? A researcher may choose not to administer a pretest. Theoretically, and if the sample size is large enough, this threat should equally affect all groups if the subjects were randomly assigned.

7. Maturation: Particularly in a longitudinal study, changes over the course of the study may be attributable to the effects of time, rather than the intervention or treatment. For example, first graders may respond to project assessments much differently at the end of the school year simply due to maturation effects, rather than the intervention applied over several months. In another example, improvements in cognitive functioning may be a result of natural, biological processes rather than the rehabilitation program instituted by therapists.

Were the effects due to the intervention or to maturation? A researcher may select subjects who are relatively mature or who exhibit stability on measures of interest. Also, the duration of the experiment may be limited to control for the effects of maturation, fatigue, or physical changes. Theoretically, if subjects were randomly assigned to groups, this threat should affect all groups equally.

8. Attitude of Subjects: The approach and mindset of study participants can affect the outcome of the project. For example, subjects may put forth exceptional effort because they know their performance will be evaluated. Or, subjects may feel insulted based upon how they perceive the group to which they were assigned, particularly if the groups are being treated differently beyond the administration of the independent variable. When evaluating research, life care planners should consider whether results were affected by the experience of subjects in the experimental condition or whether results reflect only the influence of the treatment or intervention.

Do the results reflect the subjects’ reaction to the experimental condition or the treatment? A researcher should make a conscious effort to treat all groups the same, aside from the administration of the treatment. Unobtrusive measures may be selected so that scorers are able to observe subjects’ behavior without disrupting the natural circumstances of the environment or calling attention to their task.

9. Implementation: This threat occurs when implementers of the treatment or intervention use different methods in instructing or implementing the independent variable. An implementer may like one intervention better than the others and do a better job of implementing it. For example, if a study was designed to examine the effects of a new teaching method, an implementer who preferred the traditional method may not teach the experimental method as well.

Could the implementer have influenced the results of the study? A researcher may randomly assign implementers to groups (when possible), monitor the administration of the trials, or use the same implementer for all groups.

10. Regression: Groups selected because of unusually high or low scores on pretests (or similar measures) will tend to score closer to the mean on subsequent assessments (Ary, Jacobs, & Razavieh, 1996). This threat occurs when groups are selected on the basis of scores that are not representative of their true performance. For example, a researcher tests all patients in a rehabilitation facility with the same level and type of injury on measures of psychological adjustment. The lowest scorers (i.e., those who show the most significant psychological difficulties in adjusting to their disability) are selected to participate in a six-week intervention program. At the end of the program, all subjects are re-tested, the scores are compared, and the scores of the experimental group are found to have improved.

Actually, two extraneous variables may have influenced the results of this study. First, most patients will experience greater ease in psychological adjustment over time, particularly if counseling support is available in a rehabilitation setting such as the one referenced in this example. Second, there is a tendency for extreme scores to move closer to the mean on subsequent measures.

Is movement in scores over time due to the effects of the intervention or to regression to the mean? A researcher may attempt to control for this threat by excluding subjects with extreme scores from the study or by randomly assigning individuals to groups (theoretically, regression to the mean should occur equally in both groups). By examining the raw data for aberrant scores that shift dramatically between assessments, a researcher may conclude that such movement is not a typical treatment effect but a result of measurement error.
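
Regression to the mean can be demonstrated with a short simulation in which no intervention is applied at all. The sketch below is illustrative only: the score distributions, the amount of measurement error, and the 10% selection cutoff are assumed values, not figures from any study.

```python
import random

rng = random.Random(42)

# Each subject has a stable "true" score; every test adds independent error.
true_scores = [rng.gauss(50, 10) for _ in range(10_000)]
pretest = [t + rng.gauss(0, 10) for t in true_scores]
posttest = [t + rng.gauss(0, 10) for t in true_scores]  # no treatment applied

# Select the lowest-scoring 10% on the pretest, as in the example above.
cutoff = sorted(pretest)[len(pretest) // 10]
selected = [i for i, score in enumerate(pretest) if score < cutoff]

pre_mean = sum(pretest[i] for i in selected) / len(selected)
post_mean = sum(posttest[i] for i in selected) / len(selected)

print(f"Selected group pretest mean:  {pre_mean:.1f}")
print(f"Selected group posttest mean: {post_mean:.1f}")
```

The selected group's posttest mean moves back toward the overall mean of 50 even though nothing was done to the subjects, which is the pattern an evaluator must rule out before crediting an intervention.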

11. Statistical Conclusion Validity: This threat occurs when analytical errors produce invalid results. Numerous problems can undermine the analysis, such as unreliable measurement instruments, violations of the assumptions of the statistical tests used, or selection of the wrong statistic for data analysis. Sample size is important to consider, particularly if very few or a large number of subjects were used in the study. Statistical analysis may produce misleading results by being over-sensitive (if the sample size is large) or under-sensitive (if the sample size is small) to differences attributed to the treatment. In other words, when sample sizes are very large, statistical analysis may flag differences as significant even though they are too small to be meaningful. When sample sizes are very small, statistical analysis may not be sensitive enough to detect the differences that exist; so, even though the treatment did have an effect, it is not recognized (Ary, Jacobs, & Razavieh, 1996).

Are the results based on what truly occurred throughout the study, or are they due to statistical errors? A researcher often consults with statisticians during the course of the study to ensure that analytical errors are avoided. Life care planners should be familiar enough with basic statistical analysis to determine whether the conclusions reached by the researcher are plausible.
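
The sensitivity of statistical tests to sample size can be illustrated by holding a small group difference fixed and changing only the number of subjects. The sketch below uses a two-sample z-test with a known standard deviation, which is a simplifying assumption; the function name, the half-point difference, and the group sizes are hypothetical.

```python
import math


def two_sample_z_p_value(mean_difference, std_dev, n_per_group):
    """Two-sided p-value for a difference in means, equal group sizes, known SD."""
    std_error = std_dev * math.sqrt(2 / n_per_group)
    z = mean_difference / std_error
    return math.erfc(abs(z) / math.sqrt(2))


# The same half-point difference on a scale with a standard deviation of 10.
p_small = two_sample_z_p_value(0.5, 10, n_per_group=20)
p_large = two_sample_z_p_value(0.5, 10, n_per_group=20_000)

print(f"n = 20 per group:     p = {p_small:.3f}")
print(f"n = 20,000 per group: p = {p_large:.2e}")
```

The identical difference is nowhere near significant with 20 subjects per group and highly significant with 20,000, so a reader should weigh the size of an effect alongside its p-value.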
