Estimating the Reproducibility of Psychological Science

6 results


Science Fictions: How Fraud, Bias, Negligence, and Hype Undermine the Search for Truth by Stuart Ritchie

Albert Einstein, anesthesia awareness, Bayesian statistics, Carmen Reinhart, Cass Sunstein, citation needed, Climatic Research Unit, cognitive dissonance, complexity theory, coronavirus, correlation does not imply causation, COVID-19, Covid-19, crowdsourcing, deindustrialization, Donald Trump, double helix, en.wikipedia.org, epigenetics, Estimating the Reproducibility of Psychological Science, Growth in a Time of Debt, Kenneth Rogoff, l'esprit de l'escalier, meta analysis, meta-analysis, microbiome, Milgram experiment, mouse model, New Journalism, p-value, phenotype, placebo effect, profit motive, publication bias, publish or perish, race to the bottom, randomized controlled trial, recommendation engine, rent-seeking, replication crisis, Richard Thaler, risk tolerance, Ronald Reagan, Scientific racism, selection bias, Silicon Valley, Silicon Valley startup, Stanford prison experiment, statistical model, stem cell, Steven Pinker, Thomas Bayes, twin studies, University of East Anglia

For example, see Philip Zimbardo, ‘Philip Zimbardo’s Response to Recent Criticisms of the Stanford Prison Experiment’, 23 June 2018; https://static1.squarespace.com/static/557a07d5e4b05fe7bf112c19/t/5dee52149d16d153cba11712/1575899668862/Zimbardo2018-06-23.pdf. See also Le Texier’s reply to a more recent (at the time of writing unpublished) version: Thibault Le Texier, ‘The SPE Remains Debunked: A Reply to Zimbardo and Haney (2020)’, Preprint, PsyArXiv (24 Jan. 2020); https://doi.org/10.31234/osf.io/9a2er 26. Open Science Collaboration, ‘Estimating the Reproducibility of Psychological Science’, Science 349, no. 6251 (28 Aug. 2015): aac4716; https://doi.org/10.1126/science.aac4716 27. 77 per cent: Colin F. Camerer et al., ‘Evaluating the Replicability of Social Science Experiments in Nature and Science between 2010 and 2015’, Nature Human Behaviour 2, no. 9 (Sept. 2018): pp. 637–44; https://doi.org/10.1038/s41562-018-0399-z 28. This number is derived from six out of sixteen studies showing successful replications.

I’ve been stressing the importance of robust results, but in making the case that there’s a replication crisis, I’m relying on multi-study replication attempts that weren’t representative samples of all the scientific literature. The conclusion of ‘only about half of published results replicate’ might not generalise to all science. This was a point made in a critique of one of the replication survey studies: D. T. Gilbert et al., ‘Comment on “Estimating the Reproducibility of Psychological Science”’, Science 351, no. 6277 (4 Mar. 2016): p. 1037; https://doi.org/10.1126/science.aad7243. Whereas I disagree with many of the arguments made in this rejoinder (for some reasons to be sceptical of it, see Daniël Lakens, ‘The Statistical Conclusions in Gilbert et al (2016) Are Completely Invalid’, The 20% Statistician, 6 March 2016; https://daniellakens.blogspot.com/2016/03/the-statistical-conclusions-in-gilbert.html), the criticism about representativeness was fair.


pages: 288 words: 81,253

Thinking in Bets by Annie Duke

banking crisis, Bernie Madoff, Cass Sunstein, cognitive bias, cognitive dissonance, Daniel Kahneman / Amos Tversky, delayed gratification, Donald Trump, en.wikipedia.org, endowment effect, Estimating the Reproducibility of Psychological Science, Filter Bubble, hindsight bias, Jean Tirole, John Nash: game theory, John von Neumann, loss aversion, market design, mutually assured destruction, Nate Silver, p-value, phenotype, prediction markets, Richard Feynman, ride hailing / ride sharing, Stanford marshmallow experiment, Stephen Hawking, Steven Pinker, the scientific method, The Signal and the Noise by Nate Silver, urban planning, Walter Mischel, Yogi Berra, zero-sum game

Oettingen, Gabriele. Rethinking Positive Thinking: Inside the New Science of Motivation. New York: Current, 2014. Oettingen, Gabriele, and Peter Gollwitzer. “Strategies of Setting and Implementing Goals.” In Social Psychological Foundations of Clinical Psychology, edited by James Maddox and June Price Tangney, 114–35. New York: Guilford Press, 2010. Open Science Collaboration. “Estimating the Reproducibility of Psychological Science.” Science 349, no. 6251 (August 28, 2015): 943 and aac4716-1–8. Oswald, Dan. “Learn Important Lessons from Lombardi’s Eight-Hour Session.” HR Hero (blog), March 10, 2014. http://blogs.hrhero.com/oswaldletters/2014/03/10/learn-important-lessons-from-lombardis-eight-hour-session. Oyserman, Daphna, Deborah Bybee, Kathy Terry, and Tamara Hart-Johnson. “Possible Selves as Roadmaps.”


pages: 442 words: 94,734

The Art of Statistics: Learning From Data by David Spiegelhalter

Antoine Gombaud: Chevalier de Méré, Bayesian statistics, Carmen Reinhart, complexity theory, computer vision, correlation coefficient, correlation does not imply causation, dark matter, Edmond Halley, Estimating the Reproducibility of Psychological Science, Hans Rosling, Kenneth Rogoff, meta analysis, meta-analysis, Nate Silver, Netflix Prize, p-value, placebo effect, probability theory / Blaise Pascal / Pierre de Fermat, publication bias, randomized controlled trial, recommendation engine, replication crisis, self-driving car, speech recognition, statistical model, The Design of Experiments, The Signal and the Noise by Nate Silver, The Wisdom of Crowds, Thomas Bayes, Thomas Malthus

Scott, AIQ: How Artificial Intelligence Works and How We Can Harness Its Power for a Better World (Penguin, 2018), p. 000. 8. R. E. Kass and A. E. Raftery, ‘Bayes Factors’, Journal of the American Statistical Association 90 (1995), 773–95. 9. J. Cornfield, ‘Sequential Trials, Sequential Analysis and the Likelihood Principle’, American Statistician 20 (1966), 18–23. CHAPTER 12: HOW THINGS GO WRONG 1. Open Science Collaboration, ‘Estimating the Reproducibility of Psychological Science’, Science 349:6251 (28 August 2015), aac4716. 2. A. Gelman and H. Stern, ‘The Difference Between “Significant” and “Not Significant” Is Not Itself Statistically Significant’, American Statistician 60:4 (November 2006), 328–31. 3. Ronald Fisher, Presidential Address to the first Indian Statistical Congress, 1938, Sankhyā 4 (1938), 14–17. 4. See ‘The Reinhart and Rogoff Controversy: A Summing Up’, New Yorker, 26 April 2013. 5.


pages: 340 words: 94,464

Randomistas: How Radical Researchers Changed Our World by Andrew Leigh

Albert Einstein, Amazon Mechanical Turk, Anton Chekhov, Atul Gawande, basic income, Black Swan, correlation does not imply causation, crowdsourcing, David Brooks, Donald Trump, ending welfare as we know it, Estimating the Reproducibility of Psychological Science, experimental economics, Flynn Effect, germ theory of disease, Ignaz Semmelweis: hand washing, Indoor air pollution, Isaac Newton, Kickstarter, longitudinal study, loss aversion, Lyft, Marshall McLuhan, meta analysis, meta-analysis, microcredit, Netflix Prize, nudge unit, offshore financial centre, p-value, placebo effect, price mechanism, publication bias, RAND corporation, randomized controlled trial, recommendation engine, Richard Feynman, ride hailing / ride sharing, Robert Metcalfe, Ronald Reagan, statistical model, Steven Pinker, uber lyft, universal basic income, War on Poverty

Masicampo & Daniel R. Lalande, ‘A peculiar prevalence of p values just below .05’, Quarterly Journal of Experimental Psychology, vol. 65, no. 11, 2012, pp. 2271–9; Kewei Hou, Chen Xue & Lu Zhang, ‘Replicating anomalies’, NBER Working Paper 23394, Cambridge, MA: National Bureau of Economic Research, 2017. 46 Alexander A. Aarts, Joanna E. Anderson, Christopher J. Anderson, et al., ‘Estimating the reproducibility of psychological science’, Science, vol. 349, no. 6251, 2015. 47 This represented two out of eighteen papers: John P.A. Ioannidis, David B. Allison, Catherine A. Ball, et al., ‘Repeatability of published microarray gene expression analyses’, Nature Genetics, vol. 41, no. 2, 2009, pp. 149–55. 48 This represented six out of fifty-three papers: C. Glenn Begley & Lee M. Ellis, ‘Drug development: Raise standards for preclinical cancer research’, Nature, vol. 483, no. 7391, 2012, pp. 531–3. 49 This represented twenty-nine out of fifty-nine papers: Andrew C.


pages: 434 words: 117,327

Can It Happen Here?: Authoritarianism in America by Cass R. Sunstein

active measures, affirmative action, Affordable Care Act / Obamacare, airline deregulation, anti-communist, anti-globalists, availability heuristic, business cycle, Cass Sunstein, David Brooks, Donald Trump, Edward Snowden, Estimating the Reproducibility of Psychological Science, failed state, Filter Bubble, Francis Fukuyama: the end of history, ghettoisation, illegal immigration, immigration reform, Isaac Newton, job automation, Joseph Schumpeter, Long Term Capital Management, Nate Silver, Network effects, New Journalism, night-watchman state, obamacare, Potemkin village, random walk, Richard Thaler, road to serfdom, Ronald Reagan, the scientific method, War on Poverty, WikiLeaks, World Values Survey

The End of Theory: Financial Crises, the Failure of Economics, and the Sweep of Human Interaction. Princeton, NJ: Princeton University Press, 2017. Camerer, Colin F., and Eric J. Johnson. “The Process-Performance Paradox in Expert Judgment: How Can Experts Know So Much and Predict So Badly?” Research on Judgment and Decision Making: Currents, Connections, and Controversies 342 (1997). Collaboration, Open Science. “Estimating the Reproducibility of Psychological Science.” Science 349, no. 6251 (2015): 10.1126/science.aac4716. DiPrete, Thomas A., and Gregory M. Eirich. “Cumulative Advantage as a Mechanism for Inequality: A Review of Theoretical and Empirical Developments.” Annual Review of Sociology 32, no. 1 (2006): 271–97. Dunning, Thad. Natural Experiments in the Social Sciences: A Design-based Approach. Cambridge, UK: Cambridge University Press, 2012.


pages: 543 words: 153,550

Model Thinker: What You Need to Know to Make Data Work for You by Scott E. Page

"Robert Solow", Airbnb, Albert Einstein, Alfred Russel Wallace, algorithmic trading, Alvin Roth, assortative mating, Bernie Madoff, bitcoin, Black Swan, blockchain, business cycle, Capital in the Twenty-First Century by Thomas Piketty, Checklist Manifesto, computer age, corporate governance, correlation does not imply causation, cuban missile crisis, deliberate practice, discrete time, distributed ledger, en.wikipedia.org, Estimating the Reproducibility of Psychological Science, Everything should be made as simple as possible, experimental economics, first-price auction, Flash crash, Geoffrey West, Santa Fe Institute, germ theory of disease, Gini coefficient, High speed trading, impulse control, income inequality, Isaac Newton, John von Neumann, Kenneth Rogoff, knowledge economy, knowledge worker, Long Term Capital Management, loss aversion, low skilled workers, Mark Zuckerberg, market design, meta analysis, meta-analysis, money market fund, Nash equilibrium, natural language processing, Network effects, p-value, Pareto efficiency, pattern recognition, Paul Erdős, Paul Samuelson, phenotype, pre–internet, prisoner's dilemma, race to the bottom, random walk, randomized controlled trial, Richard Feynman, Richard Thaler, school choice, sealed-bid auction, second-price auction, selection bias, six sigma, social graph, spectrum auction, statistical model, Stephen Hawking, Supply of New York City Cabdrivers, The Bell Curve by Richard Herrnstein and Charles Murray, The Great Moderation, The Rise and Fall of American Growth, the rule of 72, the scientific method, The Spirit Level, The Wisdom of Crowds, Thomas Malthus, Thorstein Veblen, urban sprawl, value at risk, web application, winner-take-all economy, zero-sum game

“Evolution of Indirect Reciprocity by Image Scoring.” Nature 393: 573–577. Olson, Mancur. 1965. The Logic of Collective Action: Public Goods and the Theory of Groups. Cambridge, MA: Harvard University Press. O’Neil, Cathy. 2016. Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy. New York, NY: Crown. Open Science Collaboration. 2015. “Estimating the Reproducibility of Psychological Science.” Science 349: 6251. Organization for Economic Co-operation and Development. 1996. The Knowledge Based Economy. Paris: OECD. Ormerod, Paul. 2012. Positive Linking: How Networks Can Revolutionise the World. London: Faber and Faber. Ostrom, Elinor. 2004. Understanding Institutional Diversity. Princeton, NJ: Princeton University Press. Ostrom, Elinor. 2010. “Beyond Markets and States: Polycentric Governance of Complex Economic Systems.”