Master the scientific methods used to study behavior and mental processes, from experimental design to statistical analysis.
Psychological research employs the scientific method to investigate behavior and mental processes systematically. Understanding research methods is essential for evaluating the quality of evidence, designing rigorous studies, and becoming a critical consumer of psychological literature. This article covers the fundamental principles, designs, and statistical tools used in psychological research.
The scientific method provides a systematic approach to knowledge acquisition:
1. Observation: Identifying a phenomenon of interest through careful observation of behavior or mental processes.
2. Hypothesis Formation: Developing testable predictions based on theory or prior research. A good hypothesis is specific, falsifiable, and operationally defined.
3. Research Design: Planning systematic data collection that will test the hypothesis while controlling for alternative explanations.
4. Data Collection: Gathering empirical observations using reliable and valid measurement procedures.
5. Statistical Analysis: Using statistical techniques to evaluate whether results support or refute the hypothesis.
6. Conclusion and Replication: Interpreting results, acknowledging limitations, and encouraging replication by other researchers.
Key Principles:
- Empiricism: Knowledge based on systematic observation rather than intuition or authority
- Objectivity: Minimizing bias in observation, measurement, and interpretation
- Skepticism: Questioning claims until supported by evidence
- Replicability: Findings should be reproducible by other researchers
Experimental research involves the systematic manipulation of variables to establish cause-and-effect relationships.
Key Components:
- Independent Variable (IV): The factor manipulated by the researcher
- Dependent Variable (DV): The outcome measured to assess the effect of manipulation
- Random Assignment: Participants randomly assigned to conditions, ensuring groups are equivalent on average
- Control Group: Comparison condition without the manipulation
- Experimental Group: Condition receiving the manipulation
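Random assignment, listed among the components above, can be sketched in a few lines. This is only an illustration: the function name `randomly_assign`, the participant IDs, and the fixed seed are assumptions made for the example, not part of any standard procedure.

```python
import random

def randomly_assign(participant_ids, conditions=("control", "experimental"), seed=None):
    """Shuffle participants, then deal them into conditions in round-robin order.

    Dealing after a shuffle keeps group sizes as equal as possible while
    leaving group membership to chance.
    """
    rng = random.Random(seed)          # seed only so the example is repeatable
    shuffled = list(participant_ids)
    rng.shuffle(shuffled)
    groups = {condition: [] for condition in conditions}
    for index, pid in enumerate(shuffled):
        groups[conditions[index % len(conditions)]].append(pid)
    return groups

# Hypothetical study with 20 participants numbered 1..20
assignment = randomly_assign(range(1, 21), seed=42)
for condition, members in assignment.items():
    print(condition, sorted(members))
```

Because membership is decided by the shuffle rather than by the researcher or the participant, pre-existing differences are spread across groups on average, which is what licenses the causal comparison.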
Types of Experimental Designs:
Between-Subjects Design: Different participants in each condition. Requires random assignment to equate groups.
Within-Subjects Design: Same participants experience all conditions. Controls for individual differences but introduces order effects (addressed through counterbalancing).
Factorial Design: Examines effects of two or more independent variables simultaneously, allowing assessment of main effects and interactions.
Strengths: Allows causal inference when properly controlled. Limitations: May lack ecological validity; ethical constraints on manipulation.
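Counterbalancing, mentioned above as the remedy for order effects in within-subjects designs, can be illustrated with full counterbalancing, where every possible order of conditions is used equally often. The sketch below assumes three hypothetical conditions labeled A, B, and C and twelve participants; the helper name `full_counterbalance` is invented for the example.

```python
from itertools import permutations

def full_counterbalance(conditions):
    """Return every possible presentation order of the conditions."""
    return [list(order) for order in permutations(conditions)]

orders = full_counterbalance(["A", "B", "C"])   # 3! = 6 distinct orders

# Cycle through the orders so each one is used equally often
# (12 hypothetical participants, so each order is used twice).
schedule = {pid: orders[pid % len(orders)] for pid in range(12)}

for pid, order in schedule.items():
    print(pid, " -> ".join(order))
```

With many conditions the number of orders grows factorially, so researchers often fall back on partial schemes such as a Latin square; that variant is not shown here.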
Correlational Research: Examines relationships between variables without manipulation. Measures the degree to which two variables covary.
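The degree to which two variables covary is commonly summarized with the Pearson correlation coefficient. A minimal sketch follows, using made-up data (hours of sleep and a mood rating for five hypothetical participants); the variable names are assumptions for illustration only.

```python
import math

def pearson_r(x, y):
    """Pearson correlation: covariation of x and y scaled by their variability."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must be the same length, with at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products of deviations (numerator of the covariance)
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / math.sqrt(ss_x * ss_y)

# Invented data for illustration
hours_sleep = [5, 6, 7, 8, 9]
mood_rating = [2, 4, 5, 4, 5]

r = pearson_r(hours_sleep, mood_rating)
print(f"r = {r:.2f}")   # r = 0.77
```

A positive r here says only that the two measures rise together in this sample. Without manipulation and random assignment, it does not show that sleep causes better mood.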
Descriptive Methods:
Case Studies: In-depth examination of individuals or small groups. Valuable for rare conditions or generating hypotheses, but limited generalizability.
Naturalistic Observation: Observing behavior in natural settings without intervention. High ecological validity but potential observer bias.
Surveys and Questionnaires: Self-report measures for attitudes, beliefs, or behaviors. Efficient for large samples but vulnerable to response biases.
Archival Research: Analysis of existing records (medical files, census data, historical documents).
Longitudinal vs. Cross-Sectional Designs:
- Longitudinal: The same participants are studied over time. Tracks individual change, but is time-consuming and vulnerable to attrition.
- Cross-Sectional: Different age groups are compared at one time point. Efficient, but confounds age with cohort.
Reliability: Consistency of measurement
- Test-retest reliability: Consistency across time
- Inter-rater reliability: Agreement between observers
- Internal consistency: Consistency among items (Cronbach's alpha)
- Split-half reliability: Consistency between halves of a test
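Internal consistency is usually reported as Cronbach's alpha, which compares the summed variance of the individual items with the variance of the total score. The sketch below applies the standard formula to an invented three-item scale answered by five hypothetical respondents; the data and the function name are illustrative assumptions.

```python
from statistics import variance

def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total scores))
    """
    k = len(responses[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    items = list(zip(*responses))                      # one tuple per item
    sum_item_variances = sum(variance(item) for item in items)
    total_variance = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum_item_variances / total_variance)

# Invented responses: rows are respondents, columns are items on a 1-5 scale
responses = [
    [4, 5, 4],
    [3, 3, 2],
    [5, 5, 5],
    [2, 3, 3],
    [4, 4, 5],
]

alpha = cronbach_alpha(responses)
print(f"Cronbach's alpha = {alpha:.2f}")
```

When items rise and fall together across respondents, the total-score variance is large relative to the item variances and alpha approaches 1.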
Validity: Whether a measure assesses what it claims to measure
- Content validity: Measure samples the full domain of interest
- Criterion validity: Measure predicts relevant outcomes (concurrent or predictive)
- Construct validity: Measure relates to other variables as theoretically expected (convergent and discriminant validity)
Types of Variables:
- Nominal: Categories without order (e.g., gender, diagnosis)
- Ordinal: Ordered categories without equal intervals (e.g., rankings)
- Interval: Equal intervals without true zero (e.g., temperature in Celsius)
- Ratio: Equal intervals with true zero (e.g., reaction time, age)
Operationalization: Defining abstract constructs in terms of specific, measurable procedures.
Population: The entire group to which findings are intended to apply.
Sample: The subset of the population actually studied.
Sampling Methods:
- Random sampling: Every population member has equal chance of selection (ideal for generalizability)
- Stratified sampling: Population divided into subgroups, random sample from each
- Convenience sampling: Participants selected based on availability (common but limits generalizability)
- Purposive sampling: Participants selected based on specific characteristics
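Stratified sampling, as described in the list above, can be sketched as follows. The roster of 100 students, the `level` field, and the 10% sampling fraction are all invented for the example; proportional allocation is one common choice, not the only one.

```python
import random
from collections import defaultdict

def stratified_sample(population, stratum_of, fraction, seed=None):
    """Draw the same fraction at random from every stratum (proportional allocation)."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for member in population:
        strata[stratum_of(member)].append(member)
    sample = []
    for members in strata.values():
        n = max(1, round(len(members) * fraction))   # at least one per stratum
        sample.extend(rng.sample(members, n))
    return sample

# Hypothetical roster: 60 undergraduates and 40 graduate students
roster = (
    [{"id": i, "level": "undergraduate"} for i in range(60)]
    + [{"id": 60 + i, "level": "graduate"} for i in range(40)]
)

sample = stratified_sample(roster, lambda person: person["level"], fraction=0.10, seed=3)
counts = defaultdict(int)
for person in sample:
    counts[person["level"]] += 1
print(dict(counts))   # {'undergraduate': 6, 'graduate': 4}
```

The sample mirrors the 60/40 split of the roster by construction, which simple random sampling only achieves on average.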
Threats to External Validity:
- WEIRD samples (Western, Educated, Industrialized, Rich, Democratic)
- Volunteer bias
- Laboratory vs. real-world settings
Sample Size and Power: Larger samples yield more precise estimates and greater statistical power to detect true effects. An a priori power analysis, conducted before data collection, determines the sample size needed to detect an expected effect size at a chosen alpha level.
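As a rough illustration of a power analysis, the sketch below estimates the per-group sample size for a two-group comparison of means. It uses the normal approximation n = 2((z for 1 - α/2) + (z for power))² / d², which is an assumption of this example; dedicated power software based on the t distribution gives slightly larger values. The function name `n_per_group` is hypothetical.

```python
import math
from statistics import NormalDist

def n_per_group(effect_size_d, alpha=0.05, power=0.80):
    """Approximate participants needed per group for a two-sided, two-group test.

    Normal approximation only; exact t-based calculations run slightly higher.
    """
    if effect_size_d <= 0:
        raise ValueError("effect size must be positive")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # critical value for the test
    z_power = NormalDist().inv_cdf(power)           # quantile for the desired power
    return math.ceil(2 * ((z_alpha + z_power) / effect_size_d) ** 2)

for d in (0.2, 0.5, 0.8):   # Cohen's conventional small, medium, large effects
    print(f"d = {d}: about {n_per_group(d)} participants per group")
```

The output makes the trade-off concrete: halving the expected effect size roughly quadruples the required sample.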
Descriptive statistics summarize and describe data:
Measures of Central Tendency:
- Mean: Arithmetic average; sensitive to outliers
- Median: Middle value; robust to outliers
- Mode: Most frequent value
Measures of Variability:
- Range: Difference between highest and lowest values
- Variance: Average squared deviation from the mean (when estimated from a sample, the sum of squared deviations is divided by n − 1 rather than n)
- Standard Deviation (SD): Square root of variance; interpretable in original units
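The measures of central tendency and variability above can be computed with Python's standard `statistics` module. The scores below are invented and include one deliberate outlier to show how differently the mean and median respond to it.

```python
from statistics import mean, median, mode, pvariance, variance, stdev

scores = [2, 3, 3, 4, 5, 6, 40]      # invented data; 40 is an outlier

summary = {
    "mean": mean(scores),                     # pulled upward by the outlier
    "median": median(scores),                 # unaffected by how extreme 40 is
    "mode": mode(scores),
    "range": max(scores) - min(scores),
    "population variance": pvariance(scores), # divides by n
    "sample variance": variance(scores),      # divides by n - 1
    "sample SD": stdev(scores),               # square root of sample variance
}

for name, value in summary.items():
    print(f"{name}: {value:.2f}")

# Dropping the outlier moves the mean a lot and the median only slightly
trimmed = scores[:-1]
print(f"without outlier: mean = {mean(trimmed):.2f}, median = {median(trimmed)}")
```

Here the mean is 9 while the median is 4, so the median is the better description of a typical score when a distribution contains extreme values.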
Distribution Shape:
- Normal distribution: Symmetric, bell-shaped curve
- Skewness: Asymmetry (positive = tail to right; negative = tail to left)
- Kurtosis: Heaviness of the tails relative to a normal distribution (often loosely described as peakedness)
Inferential statistics allow conclusions about populations based on sample data.
Hypothesis Testing:
- Null hypothesis (H₀): No effect or relationship exists
- Alternative hypothesis (H₁): Effect or relationship exists
- p-value: Probability of obtaining results at least as extreme as observed, assuming H₀ is true
- Alpha level (α): Threshold for statistical significance (conventionally .05)
- Type I error: Rejecting H₀ when it's true (false positive)
- Type II error: Failing to reject H₀ when it's false (false negative)
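The definition of the p-value can be made concrete with an exact permutation test, which is used here as an illustration in place of the more common t-test. If H₀ is true, the group labels are arbitrary, so the test relabels the data in every possible way and counts how often the difference in means is at least as extreme as the one observed. The data and the function name are invented for the example.

```python
from itertools import combinations
from statistics import mean

def permutation_p_value(group_a, group_b):
    """Two-sided exact permutation p-value for a difference in means."""
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    observed = abs(mean(group_a) - mean(group_b))
    indices = range(len(pooled))
    extreme = total = 0
    for chosen in combinations(indices, n_a):      # every possible relabeling
        chosen_set = set(chosen)
        relabeled_a = [pooled[i] for i in chosen]
        relabeled_b = [pooled[i] for i in indices if i not in chosen_set]
        diff = abs(mean(relabeled_a) - mean(relabeled_b))
        if diff >= observed - 1e-12:               # tolerance for float rounding
            extreme += 1
        total += 1
    return extreme / total

# Invented scores for two small groups
treatment = [12, 15, 14, 16, 18]
control = [10, 11, 9, 13, 12]

p = permutation_p_value(treatment, control)
alpha = 0.05
print(f"p = {p:.4f}; reject H0 at alpha = {alpha}: {p < alpha}")
```

Enumerating every relabeling is only practical for small samples like this one (252 splits); larger studies would sample relabelings at random or use a parametric test.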
Common Statistical Tests:
- t-test: Compares means of two groups
- ANOVA: Compares means of three or more groups
- Correlation: Measures association between continuous variables
- Regression: Predicts one variable from others
- Chi-square: Tests associations between categorical variables
Effect Size: Magnitude of an effect independent of sample size
- Cohen's d: Standardized difference between means
- r²: Proportion of variance explained
- Odds ratio: Ratio of odds between groups
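Cohen's d can be computed by dividing the difference between two group means by their pooled standard deviation. The sketch below uses the pooled-SD version for two independent groups, with invented scores; other variants of d exist and are not shown.

```python
import math
from statistics import mean, variance

def cohens_d(group_a, group_b):
    """Standardized mean difference using the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_variance = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_variance)

# Invented scores for two small groups
treatment = [12, 15, 14, 16, 18]
control = [10, 11, 9, 13, 12]

d = cohens_d(treatment, control)
print(f"Cohen's d = {d:.2f}")   # Cohen's d = 2.07
```

Unlike a p-value, d does not shrink or grow simply because the sample gets larger, which is why it is reported alongside significance tests.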
Confidence Intervals: Range of plausible values for the population parameter. A 95% CI is produced by a procedure that captures the true parameter in 95% of repeated samples.
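A short simulation can show what the confidence level refers to. The sketch below repeatedly samples from a population with a known mean, builds a 95% interval each time, and counts how often the interval contains the true value. It assumes a normal population (mean 100, SD 15) and uses a z-based interval for simplicity; with small samples a t-based interval would be the usual choice.

```python
import random
from statistics import NormalDist, mean, stdev

def mean_ci(sample, confidence=0.95):
    """z-based confidence interval for a mean (a simplification; see lead-in)."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    centre = mean(sample)
    standard_error = stdev(sample) / len(sample) ** 0.5
    return centre - z * standard_error, centre + z * standard_error

def coverage(true_mean=100.0, true_sd=15.0, n=50, replications=2000, seed=1):
    """Fraction of simulated intervals that contain the true population mean."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(replications):
        sample = [rng.gauss(true_mean, true_sd) for _ in range(n)]
        low, high = mean_ci(sample)
        hits += low <= true_mean <= high
    return hits / replications

observed_coverage = coverage()
print(f"Intervals containing the true mean: {observed_coverage:.1%}")
```

The proportion lands close to 95%, which is a statement about the long-run behavior of the procedure rather than about any single interval.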
Internal Validity Threats (alternative explanations for results):
- Confounding variables: Uncontrolled variables that vary with the IV
- Selection bias: Non-equivalent groups
- History: External events affecting outcomes
- Maturation: Natural changes over time
- Testing effects: Prior testing influences later performance
- Instrumentation: Measurement changes over time
- Regression to the mean: Extreme scores tend toward average on retest
External Validity Threats (limits to generalizability):
- Sample characteristics not representative
- Artificial laboratory conditions
- Demand characteristics and experimenter effects
Construct Validity Threats:
- Inadequate operationalization
- Mono-operation and mono-method bias
- Hypothesis guessing by participants
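Regression to the mean, listed among the internal validity threats, is easy to underestimate, so a small simulation may help. The sketch assumes each observed score is a stable true score plus independent measurement noise, selects the top 10% of scorers on a first test, and retests them with no intervention at all. All parameters and names are invented for illustration.

```python
import random
from statistics import mean

def simulate_retest(n=5000, noise_sd=10.0, seed=7):
    """Return (first-test mean, retest mean) for the top 10% on the first test."""
    rng = random.Random(seed)
    true_scores = [rng.gauss(100, 10) for _ in range(n)]
    # Each testing occasion adds fresh, independent measurement error
    test1 = [t + rng.gauss(0, noise_sd) for t in true_scores]
    test2 = [t + rng.gauss(0, noise_sd) for t in true_scores]
    cutoff = sorted(test1)[int(0.9 * n)]
    top = [i for i in range(n) if test1[i] >= cutoff]
    return mean(test1[i] for i in top), mean(test2[i] for i in top)

first_mean, retest_mean = simulate_retest()
print(f"Top scorers, first test: {first_mean:.1f}")
print(f"Same people, retest:     {retest_mean:.1f}")
```

The selected group scores lower on retest even though nothing changed, because part of their extreme first score was luck. A study that selects participants for extreme scores and has no control group can mistake this drift for a treatment effect.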