Alternatives to Probabilistic Psychological Measurement
This lecture explores historical and contemporary critiques of probabilistic psychometrics and surveys alternative mathematical and conceptual frameworks proposed for psychological measurement.
Philosophical framing
Psychological measurement rests on a fundamental question: do psychological attributes possess a quantitative structure that can be represented numerically? This is not merely a technical issue; it concerns the nature of psychological reality itself. The prevailing probabilistic psychometrics, with its reliance on latent variables and statistical inference, implicitly assumes such a quantitative structure, often without rigorous empirical justification. This lecture explores traditions that challenge this assumption. Drawing on philosophical critiques of measurement itself, they seek to understand psychological phenomena without presupposing that those phenomena are commensurable with standard numerical scales.
Introduction
Psychological measurement has long been dominated by probabilistic approaches, primarily rooted in Classical Test Theory (CTT) and Item Response Theory (IRT). These frameworks conceptualize psychological attributes as latent variables that are measurable with some degree of error, and they rely heavily on statistical inference to quantify these constructs [McNeish & Wolf, 2020]. However, this probabilistic paradigm has faced persistent, fundamental critiques that question both its foundations and its suitability for capturing the complexity of psychological phenomena.
One of the most potent criticisms centers on the assumption that psychological attributes are inherently quantitative. Michell argues that psychometrics often assigns numbers to attributes "without ever considering whether those attributes can sustain the operations represented within the empirical numeric relation system so imposed" [Barrett, 2003, p. 75]. This "measurement imperative," stemming from a Pythagorean philosophy that equates science with quantification, has led to a "pathology of science" in which numerical manipulation precedes any empirical validation of quantitative structure [Barrett, 2003]. This lecture surveys the intellectual alternatives that have emerged from such critiques, exploring frameworks that seek to measure psychological phenomena without relying on probability theory, or at least without its foundational assumptions.
Critique and limitations
One significant limitation of current psychometric practices, particularly those relying on sum scores, is the implicit assumption of a quantitative structure that may not empirically exist for many psychological attributes. As Michell and Barrett forcefully argue, psychometrics often assigns numbers without first verifying whether the underlying attribute can sustain the operations implied by those numbers. If a psychological construct, say "anxiety," does not have interval or ratio scale properties, then treating sum scores as if it did in statistical analyses (e.g., calculating means and standard deviations, or using parametric tests) can lead to misleading conclusions about the size of differences between individuals. Without this foundational empirical validation, any subsequent statistical inference, however sophisticated, operates on potentially flawed premises, and it becomes difficult to say what the numbers represent beyond ordinal rankings.
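The point about ordinal rankings can be made concrete with a small sketch. The responses and the two category codings below are hypothetical and chosen only for illustration. Both codings respect the order of the response categories, yet the comparison of group means depends on which coding is used, which is what one would expect if the data carry only ordinal information.

```python
from statistics import mean

# Hypothetical responses of two groups on a single 1-5 ordinal item.
group_a = [1, 1, 5, 5]
group_b = [3, 3, 4, 4]

# Two hypothetical numerical codings of the same five categories.
# Both keep the categories in the same order; only the spacing differs.
equal_spacing = {1: 1, 2: 2, 3: 3, 4: 4, 5: 5}
stretched_top = {1: 1, 2: 2, 3: 3, 4: 4, 5: 10}


def is_order_preserving(coding):
    """Return True if higher categories always receive higher numbers."""
    values = [coding[category] for category in sorted(coding)]
    return all(low < high for low, high in zip(values, values[1:]))


def mean_under(coding, responses):
    """Mean of the responses after mapping categories to numbers."""
    return mean(coding[r] for r in responses)


for name, coding in [("equal spacing", equal_spacing),
                     ("stretched top", stretched_top)]:
    a = mean_under(coding, group_a)
    b = mean_under(coding, group_b)
    higher = "A" if a > b else "B"
    print(f"{name}: mean A = {a}, mean B = {b}, higher group = {higher}")
```

Under equal spacing group B has the higher mean, while under the stretched coding group A does. Nothing in the ordinal data alone decides between the two codings, so the direction of the mean difference is not a fact about the attribute unless interval structure has been established independently.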
Another critical weakness lies in the often-unexamined assumptions embedded within statistical models, which can severely compromise the generalizability and replicability of findings. Greenland et al. [2016] highlight that "every method of statistical inference depends on a complex web of assumptions," many of which are "unrealistic or at best unjustified" in practice. For instance, the assumptions of random sampling and independence, crucial for many probabilistic models, are frequently violated in psychological research, especially with convenience samples. If these assumptions are not met, the statistical tests and confidence intervals derived from them become unreliable, leading to inflated Type I error rates or inaccurate effect size estimates. Even a statistically significant result may therefore fail to generalize to other populations or contexts, and a replication attempt may fail not because the effect is absent, but because the model's assumptions do not hold in the new context.
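How a violated independence assumption can inflate the Type I error rate can be sketched with a small simulation. Everything in it is an assumption made for illustration: the cluster structure (5 clusters of 10 people per group), the effect sizes, and the use of a naive two-sample z-test with a critical value of 1.96. There is no true group difference in either scenario, so every rejection is a false positive.

```python
import random
from statistics import mean, variance


def simulate_group(n_clusters, cluster_size, cluster_sd, rng):
    """Draw scores for one group; members of a cluster share a random shift."""
    values = []
    for _ in range(n_clusters):
        shared = rng.gauss(0.0, cluster_sd)  # common to the whole cluster
        values.extend(shared + rng.gauss(0.0, 1.0) for _ in range(cluster_size))
    return values


def naive_z_rejects(x, y, critical=1.96):
    """Two-sample z-test that treats every observation as independent."""
    se = (variance(x) / len(x) + variance(y) / len(y)) ** 0.5
    return abs(mean(x) - mean(y)) / se > critical


def false_positive_rate(cluster_sd, n_sims=1000, seed=1):
    """Share of simulated null datasets in which the naive test rejects."""
    rng = random.Random(seed)
    rejections = sum(
        naive_z_rejects(
            simulate_group(5, 10, cluster_sd, rng),
            simulate_group(5, 10, cluster_sd, rng),
        )
        for _ in range(n_sims)
    )
    return rejections / n_sims


# cluster_sd = 0.0 makes observations independent; 1.0 induces dependence.
rate_independent = false_positive_rate(cluster_sd=0.0)
rate_clustered = false_positive_rate(cluster_sd=1.0)
print(f"independent observations: {rate_independent:.3f}")
print(f"clustered observations:   {rate_clustered:.3f}")
```

With independent observations the rejection rate should stay close to the nominal 5%, whereas with clustered observations it should be several times higher, because the naive standard error ignores the correlation within clusters. The sketch concerns only this one assumption; it does not address the other assumptions in the "web" that Greenland et al. describe.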
Finally, the widespread reliance on traditional probabilistic models often overlooks the dynamic, contextual, and emergent nature of many psychological phenomena. Jana Uher critiques the application of propensity probabilities to individuals, arguing that "the individual’s empirical frequencies cannot converge (i.e., stabilise) in the long run as may be true for physical systems." Psychological attributes such as personality are assumed to develop and change over the lifespan, which means that the probabilities themselves are not static. This transience and context-dependency challenge models that seek to measure fixed, latent traits. If psychological states are better understood as emergent properties of complex, interacting systems, as network approaches suggest [Borsboom et al., 2021], then static probabilistic models that treat attributes as stable, independent entities will misrepresent the phenomenon. The difficulty lies in developing mathematical frameworks that capture this dynamism and context-sensitivity without sacrificing the rigor required for scientific inquiry.
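The contrast between stable and changing propensities can be illustrated with a toy simulation. It is only a sketch under strong assumptions: behaviour is reduced to a binary response, and the "developing" individual is modelled by a hypothetical response probability that drifts linearly from 0.2 to 0.8. It is not Uher's own formalization.

```python
import random


def simulate_responses(probabilities, rng):
    """One binary response per occasion, each with its own probability."""
    return [1 if rng.random() < p else 0 for p in probabilities]


def half_frequencies(responses):
    """Relative frequency of 1s in the first and in the second half."""
    mid = len(responses) // 2
    early, late = responses[:mid], responses[mid:]
    return sum(early) / len(early), sum(late) / len(late)


n = 4000
rng = random.Random(7)

# Hypothetical "physical system": the propensity never changes.
stable_p = [0.5] * n
# Hypothetical "developing individual": the propensity drifts over time.
drifting_p = [0.2 + 0.6 * t / (n - 1) for t in range(n)]

stable_early, stable_late = half_frequencies(simulate_responses(stable_p, rng))
drift_early, drift_late = half_frequencies(simulate_responses(drifting_p, rng))

print(f"stable:   early = {stable_early:.3f}, late = {stable_late:.3f}")
print(f"drifting: early = {drift_early:.3f}, late = {drift_late:.3f}")
```

In the stable case the early and late frequencies should be close to each other, so observing longer simply sharpens the estimate of one value. In the drifting case the two periods yield clearly different frequencies, and a single number summarizing the whole record describes neither period well.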
Sources
- Joseph F. Hair; G. Tomas M. Hult; Christian M. Ringle; Marko Sarstedt; Nicholas P. Danks; Soumya Ray. Partial Least Squares Structural Equation Modeling (PLS-SEM) Using R (2021)
- Nelson Cowan. The magical number 4 in short-term memory: A reconsideration of mental storage capacity (2001)
- Dale H. Schunk. Self-Efficacy and Academic Motivation (1991)
- Nikolaus Kriegeskorte. Representational similarity analysis – connecting the branches of systems neuroscience (2008)
- Sander Greenland; Stephen Senn; Kenneth J. Rothman; John B. Carlin; Charles Poole; Steven N. Goodman; Douglas G. Altman. Statistical tests, P values, confidence intervals, and power: a guide to misinterpretations (2016)
- Jens Hainmueller; Daniel J. Hopkins; Teppei Yamamoto. Causal Inference in Conjoint Analysis: Understanding Multidimensional Choices via Stated Preference Experiments (2013)
- T. Bedirhan Üstün; Somnath Chatterji; Nenad Kostanjsek; Jürgen Rehm; Cille Kennedy; JoAnne E. Epping‐Jordan; Shekhar Saxena; Michael Von Korff; Charles B. Pull. Developing the World Health Organization Disability Assessment Schedule 2.0 (2010)
- Robert M. Sellers; Mia Smith Bynum; J. Nicole Shelton; Stephanie J. Rowley; Tabbye M. Chavous. Multidimensional Model of Racial Identity: A Reconceptualization of African American Racial Identity (1998)
- Jose Benitez; Jörg Henseler; Ana Castillo; Florian Schuberth. How to perform and report an impactful analysis using partial least squares: Guidelines for confirmatory and explanatory IS research (2019)
- Monica Eriksson; Bengt Lindström. Antonovsky’s sense of coherence scale and the relation with health: a systematic review (2006)
- R. Duncan Luce; John W. Tukey. Simultaneous conjoint measurement: A new type of fundamental measurement (1964)
- Sarah Stewart‐Brown; Alan Tennant; Ruth Tennant; Stephen Platt; Jane Parkinson; Scott Weich. Internal construct validity of the Warwick-Edinburgh Mental Well-being Scale (WEMWBS): a Rasch analysis using data from the Scottish Health Education Population Survey (2009)
- Denny Borsboom; Marie K. Deserno; Mijke Rhemtulla; Sacha Epskamp; Eiko I. Fried; Richard J. McNally; Donald J. Robinaugh; Marco Perugini; Jonas Dalege; Giulio Costantini; Adela‐Maria Isvoranu; Anna Wysocki; Claudia D. van Borkulo; Riet van Bork; Lourens Waldorp. Network analysis of multivariate data in psychological science (2021)
- David Meunier; Renaud Lambiotte; Edward T. Bullmore. Modular and Hierarchically Modular Organization of Brain Networks (2010)
- Sacha Epskamp; Lourens Waldorp; René Mõttus; Denny Borsboom. The Gaussian Graphical Model in Cross-Sectional and Time-Series Data (2018)
- Jane C. Weeks; Paul J. Catalano; Angel M. Cronin; Matthew Finkelman; Jennifer W. Mack; Nancy L. Keating; Deborah Schrag. Patients' Expectations about Effects of Chemotherapy for Advanced Cancer (2012)
- JC Hobart. The Multiple Sclerosis Impact Scale (MSIS-29): A new patient-based outcome measure (2001)
- Kari Sentz; Scott Ferson. Combination of Evidence in Dempster-Shafer Theory (2002)
- Lex Borghans; Angela Duckworth; James J. Heckman; Bas ter Weel. The Economics and Psychology of Personality Traits (2008)
- Fanni Bányai; Ágnes Zsila; Orsolya Király; Anikó Maráz; Zsuzsanna Elekes; Mark D. Griffiths; Cecilie Schou Andreassen; Zsolt Demetrovics. Problematic Social Media Use: Results from a Large-Scale Nationally Representative Adolescent Sample (2017)
- Hans Strasburger; Ingo Rentschler; Martin Jüttner. Peripheral vision and pattern recognition: A review (2011)
- Gerd Gigerenzer. Why Heuristics Work (2008)
- A. E. Ades; G. Lu; Julian P. T. Higgins. The Interpretation of Random-Effects Meta-Analysis in Decision Models (2005)
- Jens B. Asendorpf; Mark Conner; Filip De Fruyt; Jan De Houwer; Jaap J. A. Denissen; Klaus Fiedler; Susann Fiedler; David C. Funder; Reinhold Kliegl; Brian A. Nosek; Marco Perugini; Brent W. Roberts; Manfred Schmitt; Marcel A. G. van Aken; Hannelore Weber; Jelte M. Wicherts. Recommendations for Increasing Replicability in Psychology (2013)
- Louis Guttman. A Basis for Scaling Qualitative Data (1944)
- Daniel Kardefelt‐Winther; Alexandre Heeren; Adriano Schimmenti; Antonius J. van Rooij; Pierre Maurage; Michelle Colder Carras; Johan Edman; Alex Blaszczynski; Yasser Khazaal; Joël Billieux. How can we conceptualize behavioural addiction without pathologizing common behaviours? (2017)
- Wolf Mehling; Michael Acree; Anita L. Stewart; Jonathan Silas; Alexander Jones. The Multidimensional Assessment of Interoceptive Awareness, Version 2 (MAIA-2) (2018)
- Åsa Lundgren‐Nilsson; Ingibjörg H. Jónsdóttir; Julie Pallant; Gunnar Ahlborg. Internal construct validity of the Shirom-Melamed Burnout Questionnaire (SMBQ) (2012)
- Marco Fabbri; Alessia Beracci; Monica Martoni; Debora Meneo; Lorenzo Tonetti; Vincenzo Natale. Measuring Subjective Sleep Quality: A Review (2021)
- Daniel McNeish; Melissa Gordon Wolf. Thinking twice about sum scores (2020)
- Lotfi A. Zadeh. Fuzzy Sets (1996)
- Patrick Mair; Reinhold Hatzinger. Extended Rasch Modeling: The eRm Package for the Application of IRT Models in R (2007)
- Joel Michell. Measurement in Psychology (1999)
- Matt N Williams; Carlos Alberto Gomez Grajales; Dason Kurkiewicz. Assumptions of Multiple Regression: Correcting Two Misconceptions (2020)
- Richard Perline; Benjamin D. Wright; Howard Wainer. The Rasch Model as Additive Conjoint Measurement (1979)
- Fabian Dablander; Max Hinne. Node centrality measures are a poor substitute for causal inference (2019)
- E.H. Mamdani. Fuzzy sets and applications: selected papers by L A Zadeh (1988)
- Jana Uher. Personality Psychology: Lexical Approaches, Assessment Methods, and Trait Concepts Reveal Only Half of the Story—Why it is Time for a Paradigm Shift (2013)
- Joseph A. Goguen. Review of L. A. Zadeh, Fuzzy sets, Information and Control, vol. 8 (1965), pp. 338–353, and L. A. Zadeh, Similarity relations and fuzzy orderings, Information Sciences, vol. 3 (1971), pp. 177–200 (1973)
- David H Krantz. Conjoint measurement: The Luce-Tukey axiomatization and some extensions (1964)
- Wenhua Zhao; Wei Jiang; Huilin Wang; Jianbo He; Cuiyun Su; Qitao Yu. Impact of Smoking History on Response to Immunotherapy in Non-Small-Cell Lung Cancer: A Systematic Review and Meta-Analysis (2021)
- Paul Barrett. Beyond psychometrics (2003)
- Probabilistic Models for Some Intelligence and Attainment Tests (2010)
- Clark. Commerce, Culture, and Liberty: Readings on Capitalism Before Adam Smith (1776)
- Clark. The Distribution of Wealth: A Theory of Wages, Interest and Profits (1908)
- L. S. Vygotsky. Mind in Society: The Development of Higher Psychological Processes (1978)
- Alice Ambrose; Ludwig Wittgenstein; G. E. M. Anscombe. Philosophical Investigations (1954)
- Max van Manen. Researching Lived Experience: Human Science for an Action Sensitive Pedagogy (1990)
- Jean Piaget. The origins of intelligence in children (1952)
- Francisco J. Varela; Eleanor Rosch; Evan Thompson. The Embodied Mind: Cognitive Science and Human Experience (2017)
- R. L. French; Kurt Lewin; Dorwin Cartwright. Field Theory in Social Science (1953)
- Bärbel Inhelder; Jean Piaget. The growth of logical thinking: From childhood to adolescence (1958)
- Jean Piaget. The Construction of Reality in the Child (2013)
- Jean Piaget. The construction of reality in the child (1954)
- Jean Piaget. The Moral Judgment of the Child (2013)
- Kurt Lewin. Principles of topological psychology (1936)