Internal Validity
Internal validity refers to the degree of confidence that the causal relationship being tested exists and is trustworthy.
Internal validity centers on the strength of the causal relationship between the independent and dependent variables.
It addresses whether the observed effects can be confidently attributed to the intervention or experimental manipulation rather than to extraneous factors or confounding variables.
Internal validity demands rigorous control over extraneous variables to isolate the effects of the independent variable.
This often involves conducting research in laboratory settings with standardized procedures, random assignment, and control groups.
Such tight control helps ensure that the observed effects are truly due to the manipulation or intervention and not to confounding factors.
Researchers must identify and minimize threats to internal validity, such as history, maturation, testing, instrumentation, statistical regression, selection, and mortality (attrition).
Example
As an example of a study with high internal validity, suppose you want to run an experiment to see whether a particular weight-loss pill helps people lose weight.
To test this hypothesis, you would randomly assign a sample of participants to one of two groups: those who will take the weight-loss pill and those who will take a placebo pill.
If participants drop out of the study, their characteristics are examined to ensure there is no systematic bias regarding who left.
It is important to have a well-thought-out research procedure to mitigate the threats to internal validity.
External Validity
External validity focuses on the extent to which the findings of a study can be generalized beyond the specific sample, setting, and time in which the research was conducted.
This is important because if external validity is established, the study’s findings can be generalized to a larger population rather than only to the relatively few subjects who participated in the study.
To generalize findings, researchers need to ensure that their samples, settings, and procedures reflect the real-world conditions to which they wish to apply their results.
Unlike internal validity, external validity doesn’t assess causality or rule out confounders.
There are two types of external validity: ecological validity and population validity.
As an example of a study with high external validity, suppose you hypothesize that practicing mindfulness two times per week will improve the mental health of those diagnosed with depression.
You recruit people who have been diagnosed with depression for at least a year and are between 18–29 years old. Recruiting a sample that represents a clearly defined population of interest helps ensure external validity.
You give participants a pre-test and a post-test measuring how often they experienced symptoms of depression in the past week.
During the study, all participants are given individual mindfulness training and asked to practice mindfulness daily for 15 minutes as part of their morning routine.
You can also replicate the study’s results using different methods of mindfulness or different samples of participants.
Trade-off Between Internal and External Validity
This trade-off arises from the fundamental goals of research: to establish causal relationships within a controlled setting (internal validity) and to generalize those findings to broader populations and contexts (external validity).
Internal validity is a prerequisite for external validity.
Without a strong basis for claiming a causal relationship within the study itself, generalizing the findings to other contexts becomes meaningless.
Transition experiments can bridge the gap between controlled and field settings.
They incorporate elements of both environments, gradually increasing ecological validity while maintaining some control.
This gradual shift can provide valuable insights into the robustness of the findings across different contexts.
Conducting two separate experiments requires more time, funding, and personnel. Researchers must weigh the potential benefits of the approach against these practical limitations.
Threats to Internal Validity
Attrition
Attrition refers to the loss of study participants over time. Participants might drop out or leave the study, which means that the results may be based on a biased sample made up only of the people who chose to stay.
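One way to look for biased attrition is to compare dropout rates across groups and to compare the baseline characteristics of those who left with those who stayed. The sketch below is illustrative only: the records, the field names (`group`, `baseline_kg`, `completed`), and both helper functions are hypothetical and do not come from any real study.

```python
from statistics import mean

# Hypothetical records: study group, baseline weight in kg, and completion status.
records = [
    {"group": "pill", "baseline_kg": 95, "completed": True},
    {"group": "pill", "baseline_kg": 102, "completed": False},
    {"group": "pill", "baseline_kg": 88, "completed": True},
    {"group": "pill", "baseline_kg": 110, "completed": False},
    {"group": "placebo", "baseline_kg": 97, "completed": True},
    {"group": "placebo", "baseline_kg": 91, "completed": True},
    {"group": "placebo", "baseline_kg": 105, "completed": False},
    {"group": "placebo", "baseline_kg": 89, "completed": True},
]


def dropout_rate_by_group(rows):
    """Return the share of participants in each group who did not complete."""
    rates = {}
    for group in sorted({r["group"] for r in rows}):
        members = [r for r in rows if r["group"] == group]
        dropped = [r for r in members if not r["completed"]]
        rates[group] = len(dropped) / len(members)
    return rates


def baseline_gap(rows, key="baseline_kg"):
    """Mean baseline value of dropouts minus mean baseline value of completers."""
    dropouts = [r[key] for r in rows if not r["completed"]]
    completers = [r[key] for r in rows if r["completed"]]
    return mean(dropouts) - mean(completers)


rates = dropout_rate_by_group(records)
gap = baseline_gap(records)
print(rates)
print(round(gap, 1))
```

Unequal dropout rates between groups, or a large baseline gap between dropouts and completers, would suggest that attrition was not random and that the remaining sample may be biased.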
Confounders
Confounders are threats to internal validity because you can’t tell whether the predicted independent variable causes the outcome or if the confounding variable causes it.
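A small simulation can make this concrete. In the sketch below, which uses invented parameters, a confounder `z` makes the exposure `x` more likely and also raises the outcome `y`, while `x` itself has no effect at all. The function names and all numbers are assumptions chosen for illustration.

```python
import random
from statistics import mean


def simulate_confounding(n=20000, seed=1):
    """Generate (z, x, y) rows where y depends on z only, never on x."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        z = rng.random() < 0.5                      # confounder present or absent
        x = rng.random() < (0.8 if z else 0.2)      # exposure is likelier when z is present
        y = (5.0 if z else 0.0) + rng.gauss(0, 1)   # outcome is driven by z alone
        rows.append((z, x, y))
    return rows


def mean_difference(rows):
    """Mean outcome of exposed rows minus mean outcome of unexposed rows."""
    exposed = [y for _, x, y in rows if x]
    unexposed = [y for _, x, y in rows if not x]
    return mean(exposed) - mean(unexposed)


rows = simulate_confounding()

# Naive comparison that ignores the confounder.
naive = mean_difference(rows)

# The same comparison made separately within each level of the confounder.
stratified = [mean_difference([r for r in rows if r[0] == z]) for z in (True, False)]

print(round(naive, 2))
print([round(d, 2) for d in stratified])
```

The naive comparison shows a sizeable difference between exposed and unexposed rows, yet the difference shrinks to roughly zero once the comparison is made within each level of `z`. In this toy setup the apparent effect of `x` is produced entirely by the confounder.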
Participant Selection Bias
This is a bias that may result from the selection or assignment of study groups in such a way that proper randomization is not achieved.
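If participants are not randomly assigned to groups, the groups may differ systematically before the treatment begins. Likewise, if participants are not randomly selected, the sample obtained might not be representative of the population intended to be studied.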
For example, some members of a population might be less likely to be included than others due to motivation, willingness to take part in the study, or demographics.
Experimenter Bias
Experimenter bias occurs when an experimenter behaves in a different way with different groups in a study, impacting the results and threatening internal validity. This can be reduced through blinding.
Social Interaction (Diffusion)
Diffusion refers to when the treatment in research spreads within or between treatment and control groups. This can happen when there is interaction or observation among the groups.
Diffusion poses a threat to internal validity because it can lead to resentful demoralization. This is when members of the control group become less motivated because they resent the group they have been placed in.
Historical Events
Historical events might influence the outcome of studies that occur over longer periods of time.
For example, changes in political leadership, natural disasters, or other unanticipated events might change the conditions of the study and influence the outcomes.
Instrumentation
Instrumentation refers to any change in the dependent variable in a study that arises from changes in the measuring instrument used. This happens when different measures are used in the pre-test and post-test phases.
Maturation
Maturation refers to the impact of time on a study.
If the outcomes of the study vary as a natural result of time, it might not be possible to determine whether the effects seen in the study were due to the study treatment or simply due to the impact of time.
Statistical Regression
Regression to the mean refers to the fact that if one sample of a random variable is extreme, the next sampling of the same random variable is likely going to be closer to its mean.
This is a threat to internal validity because participants selected for their extreme scores will tend to score closer to the average when measured again, even without any intervention, so the change can be mistaken for a treatment effect.
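Regression to the mean can be seen in a short simulation. The sketch below assumes each observed score is a stable "true" level plus independent random noise; the function name and every parameter value are invented for illustration. No intervention takes place between the two measurements.

```python
import random
from statistics import mean


def simulate_regression_to_mean(n=10000, seed=0):
    """Return mean first and second scores for the top 10% of first-test scorers."""
    rng = random.Random(seed)
    true_level = [rng.gauss(50, 10) for _ in range(n)]
    # Each test is the true level plus independent measurement noise.
    first = [t + rng.gauss(0, 10) for t in true_level]
    second = [t + rng.gauss(0, 10) for t in true_level]

    # Select participants with extreme (top 10%) scores on the first test.
    cutoff = sorted(first)[int(0.9 * n)]
    selected = [i for i in range(n) if first[i] >= cutoff]

    mean_first = mean(first[i] for i in selected)
    mean_second = mean(second[i] for i in selected)
    return mean_first, mean_second


mean_first, mean_second = simulate_regression_to_mean()
print(round(mean_first, 1), round(mean_second, 1))
```

The selected group's second average is lower than its first but still above the overall mean of 50. A study that enrolled only these extreme scorers and then applied a treatment could mistake this drop for a treatment effect, which is one reason a randomized control group is valuable.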
Repeated Testing
Testing your research participants repeatedly with the same measures can influence your research findings because participants become more accustomed to the testing.
Due to familiarity, or awareness of the study’s purpose, many participants might achieve better results over time.
Threats to External Validity
Sample Features
If some feature(s) of the sample used were responsible for the effect, this could lead to limited generalizability of the findings.
Situational Factors
Factors such as the setting, time of day, location, researchers’ characteristics, noise, or the number of measures might affect the generalizability of the findings.
Aptitude-Treatment Interaction
Aptitude-treatment interaction refers to the concept that some treatments are more or less effective for particular individuals depending upon their specific abilities or characteristics.
Hawthorne Effect
The Hawthorne Effect refers to the tendency for participants to change their behaviors simply because they know they are being studied.
Experimenter Effect
The experimenter effect occurs when an experimenter behaves in a different way with different groups in a study, impacting the results and threatening external validity.
John Henry Effect
The John Henry Effect refers to the tendency for participants in a control group to actively work harder because they know they are in an experiment and want to overcome the “disadvantage” of being in the control group.
Factors that Improve Internal Validity
Blinding
Blinding refers to a practice where the participants (and sometimes the researchers) are unaware of what intervention they are receiving.
This reduces the influence of extraneous factors and minimizes bias, as any differences in outcome can thus be linked to the intervention and not to the participant’s knowledge of whether they were receiving a new treatment or not.
Random Sampling
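Using random sampling to obtain a sample that represents the population that you wish to study reduces selection bias in who enters the study. Its main benefit is to external validity, because it supports generalizing the results to that population.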
Random Assignment
Using random assignment to assign participants to control and treatment groups ensures that there is no systematic bias among the research groups.
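Random assignment is simple to carry out in practice. The sketch below shuffles a list of participant IDs and splits it in half; the function name `randomly_assign`, the ID range, and the seed are all hypothetical, and real trials often use more elaborate schemes such as blocked or stratified randomization.

```python
import random


def randomly_assign(participant_ids, seed=None):
    """Shuffle participants and split them into two groups of (nearly) equal size."""
    rng = random.Random(seed)          # a fixed seed makes the allocation reproducible
    shuffled = list(participant_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"treatment": shuffled[:half], "control": shuffled[half:]}


groups = randomly_assign(range(1, 21), seed=42)
print(groups["treatment"])
print(groups["control"])
```

Because chance alone decides who lands in each group, participant characteristics such as age or motivation are expected to balance out across the groups on average, although any single allocation can still be imbalanced, especially with small samples.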
Strict Study Protocol
Highly controlled experiments tend to improve internal validity.
Experiments that occur in lab settings tend to have higher internal validity as this reduces variability from sources other than the treatment.
Experimental Manipulation
Manipulating an independent variable in a study as opposed to just observing an association without conducting an intervention improves internal validity.
Factors that Improve External Validity
Replication
Conducting a study more than once with a different sample or in a different setting to see if the results will replicate can help improve external validity.
If multiple studies have been conducted on the same topic, a meta-analysis can be used to determine if the effect of an independent variable can be replicated, thus making it more reliable.
Field Experiments
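Conducting a study outside the laboratory, in a natural, real-world setting, will improve external validity, although it tends to weaken internal validity.
Probability Sampling
Using probability sampling, in which every member of the population has a known, non-zero chance of being selected, counters selection bias and makes the sample more representative of the population.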
Recalibration
Recalibration is the use of statistical methods to maintain accuracy, standardization, and repeatability in measurements to assure reliable results.
Reweighting groups, if a study had uneven groups for a particular characteristic (such as age), is an example of recalibration.
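Reweighting can be sketched with post-stratification weights, one common approach in which each group's weight is its share of the population divided by its share of the sample. The example below is a minimal illustration under that assumption; the age groups, population shares, scores, and function names are all invented.

```python
from collections import Counter


def poststratification_weights(sample_groups, population_shares):
    """Weight for each group = population share / sample share."""
    counts = Counter(sample_groups)
    n = len(sample_groups)
    return {g: population_shares[g] / (counts[g] / n) for g in counts}


def weighted_mean(values, groups, weights):
    """Mean of values where each observation carries its group's weight."""
    total_weight = sum(weights[g] for g in groups)
    return sum(v * weights[g] for v, g in zip(values, groups)) / total_weight


# Hypothetical sample: 8 younger and 2 older participants,
# drawn from a population assumed to be split 50/50.
groups = ["young"] * 8 + ["older"] * 2
scores = [10] * 8 + [20] * 2
weights = poststratification_weights(groups, {"young": 0.5, "older": 0.5})

unweighted = sum(scores) / len(scores)
reweighted = weighted_mean(scores, groups, weights)
print(unweighted, round(reweighted, 2))
```

Here the unweighted average is pulled toward the over-represented younger group, while the reweighted average treats the two age groups as they are assumed to occur in the population. Weights correct only for the characteristics they are built on, so the result still depends on the population shares being accurate.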
Inclusion and Exclusion Criteria
Setting criteria as to who can and cannot be involved in the research helps ensure that the population being studied is clearly defined and that the sample is representative of that population.
Psychological Realism
Psychological realism refers to making sure participants perceive the experimental manipulations as real events. This keeps the purpose of the study from being revealed, so that participants do not behave differently than they would in real life because they know the study’s goal.
Saul McLeod, PhD
BSc (Hons) Psychology, MRes, PhD, University of Manchester
Saul McLeod, PhD., is a qualified psychology teacher with over 18 years of experience in further and higher education. He has been published in peer-reviewed journals, including the Journal of Clinical Psychology.
Olivia Guy-Evans, MSc
BSc (Hons) Psychology, MSc Psychology of Education
Olivia Guy-Evans is a writer and associate editor for Simply Psychology. She has previously worked in healthcare and educational sectors.
Julia Simkus
BA (Hons) Psychology, Princeton University
Julia Simkus is a graduate of Princeton University with a Bachelor of Arts in Psychology. She is currently studying for a Master’s Degree in Counseling for Mental Health and Wellness in September 2023. Julia’s research has been published in peer reviewed journals.