replication crisis

9 results


Science Fictions: How Fraud, Bias, Negligence, and Hype Undermine the Search for Truth by Stuart Ritchie

Albert Einstein, anesthesia awareness, Bayesian statistics, Carmen Reinhart, Cass Sunstein, citation needed, Climatic Research Unit, cognitive dissonance, complexity theory, coronavirus, correlation does not imply causation, COVID-19, Covid-19, crowdsourcing, deindustrialization, Donald Trump, double helix, en.wikipedia.org, epigenetics, Estimating the Reproducibility of Psychological Science, Growth in a Time of Debt, Kenneth Rogoff, l'esprit de l'escalier, meta analysis, meta-analysis, microbiome, Milgram experiment, mouse model, New Journalism, p-value, phenotype, placebo effect, profit motive, publication bias, publish or perish, race to the bottom, randomized controlled trial, recommendation engine, rent-seeking, replication crisis, Richard Thaler, risk tolerance, Ronald Reagan, Scientific racism, selection bias, Silicon Valley, Silicon Valley startup, Stanford prison experiment, statistical model, stem cell, Steven Pinker, Thomas Bayes, twin studies, University of East Anglia

But even if you think the talk of a ‘crisis’ is grandiose or exaggerated, there’s one final argument in my quiver.122 It’s this: the reforms we’ve discussed in this chapter would all be beneficial for science even if there weren’t a replication crisis. This brings to mind the following classic cartoon about climate change, from the Lexington Herald-Leader’s Joel Pett: With apologies to Pett, let me rewrite his cartoon to respond to a different kind of doubter: Openness. Transparency. Improved statistics. Pre-registration. Automated error-checking. Clever ways to catch fraudsters. Preprints. Better hiring practices. A new culture of humility. Etc. Etc. What if the replication crisis is a big hoax and we create a better science for nothing? Epilogue O, while you live, tell truth, and shame the Devil! William Shakespeare, Henry IV, 1.3.1.59 As I write this, astrophysicists have recently taken the first photograph of a black hole.1 Medical geneticists have announced that seven children with severe immune deficiencies, who’d been forced to live in isolation lest they catch a common – but for them, deadly – infection, might have been cured by gene therapy; and that gene-based therapies for cystic fibrosis have shown results that imply they could work for 90 per cent of people with the condition.2 Public health researchers have shown that, for HIV-positive gay men who are taking the latest antiretroviral drugs, the chance of transmitting the virus to a sexual partner is ‘effectively zero’.3 Engineers have teleported information within a diamond using quantum entanglement.4 Scientists have injected nanoparticles into the eyes of mice, giving them infrared vision.5 These are all marvels and to see them amid a stream of scientific and medical progress is a regular reminder that science is one of humanity’s proudest achievements.

The policy, which was intended to protect the ‘integrity’ of Scotland’s ‘clean and green … brand’ (whatever that means), was derided as ‘cheap populism’ by a political commentator and described as ‘extremely concern[ing]’ in an open letter signed by twenty-eight scientific societies.24 What all this tells us is that, regardless of our discussion of the replication crisis and its associated failings, politicians will still trample all over science if they think it’ll lead them towards votes. The worry that the arguments in this book might be misappropriated to make selective, insincere attacks on research shouldn’t stop us from publicly discussing the replication crisis and its associated problems. We mustn’t make science suck in its stomach whenever a member of the public or a politician is watching. In fact, a frank admission of science’s weaknesses is the best way to pre-empt attacks by science’s critics and to be honest more generally about how the uncertainty-filled process of science really works.

But the very fact that we don’t know – along with the fact that so many high-profile, puffed-up findings have fallen apart upon closer inspection – is, I’d argue, cause for enough concern. For responses to other criticisms of the idea that there’s a crisis, see Harold Pashler & Christine R. Harris, ‘Is the Replicability Crisis Overblown? Three Arguments Examined’, Perspectives on Psychological Science 7, no. 6 (Nov. 2012): pp. 531–36; https://doi.org/10.1177/1745691612463401 30.  Alexander Bird, ‘Understanding the Replication Crisis as a Base Rate Fallacy’, British Journal for the Philosophy of Science, 13 Aug. 2018; https://doi.org/10.1093/bjps/axy051 31.  Of course, the argument of the original authors (those whose findings failed to replicate) has often been that the modifications aren’t, in fact, slight, and break the experiment in important ways.


pages: 290 words: 82,871

The Hidden Half: How the World Conceals Its Secrets by Michael Blastland

air freight, Alfred Russel Wallace, banking crisis, Bayesian statistics, Berlin Wall, central bank independence, cognitive bias, complexity theory, Deng Xiaoping, Diane Coyle, Donald Trump, epigenetics, experimental subject, full employment, George Santayana, hindsight bias, income inequality, manufacturing employment, mass incarceration, meta analysis, meta-analysis, minimum wage unemployment, nudge unit, oil shock, p-value, personalized medicine, phenotype, Ralph Waldo Emerson, random walk, randomized controlled trial, replication crisis, Richard Thaler, selection bias, the map is not the territory, the scientific method, The Wisdom of Crowds, twin studies

Index abstract formulas 141 Academy of Medical Sciences 133 adoption studies 41 aid, economic development 141 aid-effectiveness craze, the 153 alcohol consumption 180 AllTrials campaign 114–5 Altman, Doug 129–30 Amano, Yukiya 185 ambiguity 209–10 Amgen 111–2 Analysis (radio programme) 102 analytic validity 158, 263n18 anarchy 224 aphorisms 68–9, 149 apprenticeships 205–6 argument, beliefs and habits of 186 asthma 135 Attanasio, Orazio 225–9, 230 Autho, David 219–23 average knowledge 173 background influences 23–34 background norms, rejecting 24–5 bacon 161–3, 162–3 Banerjee, Abhijit 150–4, 157 Bangladesh 80–2, 82, 101–2, 158, 261n6 Bank of England 103, 216 Bank of Japan 103 Basbøll, Thomas 244–5 baseline data 165 base-rate neglect 176–7 basic laws 140 Bateson, William 245 BBC 88, 98 Beatles, the 52–3, 259n33 Begley, Glenn 111–7 behaviour context-specific 42–3 environmental cues 65–7 behavioural economics 157 Behavioural Insight Team 155, 156, 232 beliefs 60 contradictory 63–4 inconsistency of 60–6 justification 60–1, 63 manipulation 62–3 power of information on 66–8 self-contradiction 61–2 Berlin, Isaiah 199 betting, on knowledge 236–7 big causes, power of 35 big events causal intricacy 193–6 complexity 185–7 difficulty determining causality 188–96 power of circumstance 196–9 big picture, the 215–6 Bijani, Ladan 40–1 Bijani, Laleh 40–1 biographies 49 biological randomness 43–4 biomedical science, research standards 129–36 Bolsover 217–8 Boorstin, Daniel 17, 136, 138, 264n24 Booth, Charles 146–7 BP 211 brain, the 64 plasticity 56 self-justifying 83 breast cancer 45–6, 46 Brexit referendum 18–9, 20, 90, 214–8, 223–4, 241 Bunnings 77 Burckhardt, Jacob 255n20 Burke, Edmund 269n1 Burns, Terry 102–3 business decisions, failures 210–1 cancer 45–8 breast 45–6, 46 lung 174–5 risk 162–3, 166, 174–5 screening 132–3 Cancer Research UK 133 canned laughter 154–5 capitalism 118 Carillion 211 Carp, Joshua 123–4 Cartwright, Nancy 79, 79–82, 82, 193–4, 195, 202–3, 203–4, 263n18 
causal instincts 123 causal interactions, complexity 239 causal intricacy 193–4 causal models 242–4, 243, 269–70n3 causal theorizing 212–4 causality assumption of 212–4 difficulty determining 188–96 existence of 276–7n12 hard 225–9 importance of 212 mechanical models 242–4, 243 in one person 48 cause and effect dependable 203–4 patterns of 23, 25–6, 26 supposed 248 unreliable 204 causes and causal influences 90, 94 competing 248 criminals 29 interaction 193–6 and luck 178 secret life of 8–11 simple 184–5 cells, biographical stories 47–8 certainty, desire for 235 Chadwick, Edwin 146–7 chance 14, 37–8, 247, 281n1 chaos theory 56–7, 276n10 Chater, Nick 59, 60, 63, 64–5, 66–7 Chernobyl disaster 185 child and adolescent development 23–6, 41–2 child mental health 206–7 childhood influences 23–5 delinquent boys 26–34 China, rise of 218–23, 279n19 choice, situated 31–3, 34 choice blindness 62 choices 60 Cialdini, Robert 154–5 Cifu, Adam 131–2 circumstances 70 power of 196–9 claims inflation 130 climate change 238–9 Clinton, Hillary 222 Cochrane Collaboration, the 189–90 cognition 64 cognitive biases 14 cognitive limitations 14, 214 Comaroff, John 107–8 common sense 69–70 comparative cost analysis 173 competence 236–7 complacency 237 complexity adding 244 big events 185–7 facing 15 hidden 184–201 of reality 245 complexity theory 276n10 complexity-avoidance 187 complications, hidden 187 Conan Doyle, Arthur 108 confidence 72 consistency 68–75, 202–4, 260n6, 260n8 constructive realism 17 consumer behaviour 77 context 41–2, 72, 101 context-specific behaviour 72 context-specific learning 42–3 control alternative to 248–9 elusiveness of 85–6 powers of 195 conviction 104 coping strategies 16–7, 225–46 adapting 230–3 betting 236–7 communicate uncertainty 237–9 embracing uncertainty 234–6 exceptions 244–5 experiment 230–3 governing for uncertainty 239–41 managing for uncertainty 241–2 metaphors 242–4 negative capability 234 relax 246 triangulation 233–4 use of probability 242 
Corbyn, Jeremy 20 corporate power 241 cost/benefit analysis, cows 117–22 cows, cost/benefit analysis 117–22 Coyle, Diane 216, 262n12 Crabbe, John 85–7 credibility 238–9 credibility crisis 18 crime causes of 142–4 heroes and villains view 142 opportunist 144–5 reduced opportunity 144–5 theory of 142–6, 143 victims and survivors view 142–3 criminals causal influences 29 childhood influences 26–34 desisters 30 high rate chronics 30 life-course persistent offenders 28–9 life-courses 28, 236 variables 31 critical factors 83–5 crowds, wisdom of 149 cultural difference 79–82, 79–85 Daniels, Denise 43–4, 57 Darwin, Charles 50–1 data granularity 216–7 interpretation 98–100 Dawid, Philip 276–7n12 De Rond, Mark 198, 201 de Vries, Ymkje Anna 114 deadweight cost 205–6 debate 98 decision making 58–60 influences 32–3 situated choice 31–3 deep preferences 65 deeper rationale, construction of 60 Deepwater Horizon 211 defining characteristics 43 degrees of freedom 122–9 delinquent boys 26–34 dementia 176–7, 274n16 democracy 20 Deng Xiaoping 219 Denrell, Jerker 199, 201 desires 59 details importance of 49–54 neglecting 151–2 problem of 229 selective 26 determinism 28 development economics 150–3 developmental difference, sources of variation 9–11 developmental noise 10 difference 15 pockets of 214–24 Dilnot, Andrew 237, 275n3 disciplined pluralism 231 disorder 45 forces of 11–3 doubt 238 Down’s syndrome 166 drugs comparative cost analysis 173 impact 171–2 medical effect 167–9, 169, 170–4 non-responders 172 Numbers-Needed-to-Treat (NNTs) 168, 169, 170, 173–4 predictive weakness 170–3 duelling certainties 235 Duflo, Esther 83, 84, 141, 150–3, 157–8, 158–9, 230–1 ecological validity 263n18 economic development, aid 141 economic forecasting 92, 102–7 economic recovery 217–8 economics 233 economy, the 87–100, 91, 93, 94, 95 education 151–2, 206–7, 275– 6n7 Einstein, Albert 140–1 Emerson, Ralph Waldo 68 enigmatic variation 13–6, 48 environment context 72 non-shared 37 shared 35 
environmental influences 43–4 epidemiology 181 epigenetics 6–7 erratic influences 60 essential you, the 59–60 estimates 89–91, 96 European Central Bank 103 evidence 21 balance of 114 conclusive 186, 187 the Janus effect 121, 122–9 limitations of 117–22 statistical significance 137 strength of 137 evidence-based medicine 133–4 exceptions 214–24, 244–5 expectations 35 big 196 frustration of 15 of regularity 47, 202–4 unrealistic 182 experience, influence of 33, 34, 55–7 experiment 230–3 expertise, crisis of 18–9 experts, credibility crisis 18–9 external validity 101, 158, 263n18, 264n19 extreme performance 199 failure 204–11 fairness 66–7 false negatives 113–4 false positives 113–4, 122 falsification 245 family, changes of 41 farmer and a chicken, the 202–4 fate 30 fears, exaggerated 46 Financial Times 77 First World War 108 Fitzroy, Robert 50 flat mind, the 60, 60–8 Flaubert, Gustave 139 forecasting 109 former Yugoslavia 108 foxes 199 France 186–7 Freedman, Sir Lawrence 108, 109 freedom 236 Fukushima nuclear power station meltdown 185–7 fundamentals 141 identifying 153 further education 208–9 Galbraith, John Kenneth 110 Gartner, Klaus 87 Gash, Tom 142–3 Gates, Bill 199 GDP data 262n12 growth estimation 88–100, 91, 93, 94, 95, 262–3n14 local 214–5, 216, 218 Gelman, Andrew 124–5, 244 gene–environment interaction 6–7 general principles 140 generalities 174 generalization 76–8, 146, 152, 263n18 genes and genetics influence of 34–7, 39–41, 44, 45–7 overclaiming 134–5 power of 33, 45 genetic risk 45–7 genius, dangerous 212–4 genotype 8 Germany 185, 186, 188 Gillam, John 77 global financial crisis, 2008–9 104, 106, 210, 235 globalization 213 Gove, Michael 18–9 granularity 216–7 ground truth 217 groupthink 149 guarantees, lack of 160 Guardian 207 Gupta, Rajeev 117, 118 Haldane, Andy 216–7, 218 Harford, Tim 156–7, 237 Harris, Judith Rich 40–2, 72 Hayek, Friedrich 105–6 health screening 177 heart disease 163–6 hedgehogs 199 Henry (ex-delinquent) 32 Hensall, Abigail 39–40, 41 
Hensall, Brittany 39–40, 41 herd mentality 154–5 hidden causes 35–8 hidden half, the coping strategies 225–46 ignoring 202–24 mystery of 35 power of 44–5 hidden trivia 8–9 hindsight 78 hindsight bias 83 history 107–8 lessons of 109 Homebase 76–7 Honda, US motorcycle market penetration 196–9 hubris 77 human sameness irregularity 45–9 limits of 34–45 human understanding, fundamentals 213 Human Zoo, The (radio programme) 60–6 humility 224, 248–9 IBM 199 ibuprofen 163–5 ideological divide 240 ideologies 9–10 idiosyncratic influence 53–4 ignorance 21, 107 disguising 242 the shock of 7 imagination 138 impulsive judgement, value of 149 incarceration rates, United States of America 222, 240, 280n10 incidentals, effect of 51–2 incoherency problem, the 149 inconsistency beliefs 60–6 justifiable 70–1 incredible certitude 209 Indian Express 117 individual differences 56 individuality conjoined twins 39–42 neurological foundation of 56 industrial policy 208 inflation 102–7 influences background 23–34 childhood 26–34 criminals 26–34 decision making 32–3 environmental 43–4 erratic 60 hidden 204 microenvironmental 8–9, 253–4n12 information power of 66–8 selective 66–7 Institute for Fiscal Studies 205–6 Institute for Government 208–9 intangible differences 253n11 intangible variation 10, 229 interaction, problems of 193–6 internal validity 101–2, 158 International Journal of Epidemiology 43 intuition 54, 204 Ioannidis, John 121, 133–6 irrationality, human 14 irregularity 94 disruptive power of 224 frustration of 15 human 45–9 influence 12 problem of 229 underestimating 214–24 Islamic State 108 it’s-all-because problem 91, 96 James, Henry 29, 56 James, William 141 Janus effect, the 121, 122–9 Johansen, Petter 62 Johnson, Samuel 214 Johnson, Wendy 71–2 Jones, Susannah Mushatt 162–3, 165 journalism 237–8 Juno (film) 193 Kaelin, William 130 Kawashima, Kihachiro 197 Kay, John 16, 68, 197, 231, 232 Keats, John 138–9, 234 Kempermann, Gerd 56, 57 Keynes, John Maynard 107, 271n9 
Keynesianism 103 King, Mervyn 103, 104, 106, 110 Kinnell, Galway 28 Knausgaard, Karl Ove 86–7 Knight, Frank 107 Knightian uncertainty 107 knowledge 12–3, 170 advance of 20–1 average 173 betting on 236–7 credibility crisis 18 critical factors 83–5 failures of 19, 76–8, 79–82 fallibility of 248 generalizable 234 generalization 76–8 illusion of 136, 138 lessons of the past 102–7, 107–10 in medicine 182 negative capability 138–9 as obstacle to progress 17 obvious 82 paths to 136–9 plausibility mistaken for 132 practical 30–1 pretence of 105–6 probabilistic 160, 161, 163–4, 172–3 and probability 180 problem of scale 177–80 provenance 116 relevant 82–5 replication crisis 111–7 subverting 76–110 and time variations 87–100, 91, 93, 94, 95 transfer 37, 76–8, 83, 101–2 unknowns 85–7 validity 100–2 validity across time 107–10 weakest-link principle 79–82 Krugman, Paul 210 Lancet 225–6 Langley, Winnie 51, 165, 178 Laub, John 26–34, 42 law-like effects, claims about 21 learning styles 207 Leicester City Football Club 199–201 Leon (ex-delinquent) 31–2 Leyser, Ottoline 114 life, mechanics of 51 life-course persistent offenders 28–9 limits and limitations 16–7, 44, 75 base-rate neglect 176–7 of cleverness 278n14 individual level 174–6, 178–9, 181–3 lack of guarantees 160 marginal probabilistic outcomes 176–7 medical effect 167–9, 169, 170–4 on prediction 165–6 on probability 160–83 problem of scale 161–6, 174– 6, 177–80, 181–3 Liskov Substitution Principle 261n3 Little Britain (TV comedy) 192 Liu, Chengwei 198, 201 lives, understanding 29 location shift 264n20 Loken, Erik 124–5 long-acting reversible contraceptives (LARCS) 190 luck 37–8, 48, 178, 198 lung cancer 174–5 Lyko, Frank 1, 2 machine mode thinking 151–2 Macron, Emmanuel 20 Manski, Charles 209, 235 Mao Zedong 218 marginal probabilistic outcomes 176–7 marmorkrebs 1–9, 4, 10, 12, 12–3, 22, 35, 81, 182, 252n2 Marteau, Theresa 65 Martin, George 52 May, Theresa 208 Mayne, Stephen 77 measurement 99–100 mechanical relationships 
212, 242, 244 mechanical thinking 242–4, 243 media stigma 192–3 medical effect, drugs 167–9, 169, 170–4 medical reversal 131–3 medicine comparative cost analysis 173 knowledge in 182 non-responders 172 Numbers-Needed-to-Treat (NNTs) 168, 169, 170, 173–4 personalized 181–3 predictive weakness 170–3 probability and 167–9, 169, 170–4 memory 56, 102–7 Mendelian randomization 233 Menon, Anand 214–5 mental shortcuts 14–5 mere facts 202–3 meta-science 19, 20 methodological revisions 97–8, 120 mice 55 microenvironmental influences 8–9, 253–4n12 micro-irregularity 35–7 micro-particulars 128 Microsoft 147–50, 199 Miller, Helen 66–7, 67 mind, the flat 59–60, 60–8 shape 59 models and modelling 140, 242–4, 243, 269–70n3 moment when, the 52 morality, changing 108 More or Less (radio programme) 237 Munafò, Marcus 234 Nadella, Satya 147–8 National Survey of Family Growth 192 National Surveys of Sexual Attitudes and Lifestyles 191–2 nationalism 108 Nature 2, 112, 136, 168, 174 nature/nurture debate 3, 5–6, 9–10 negative capability 138–9, 234 neurology 58 New England Journal of Medicine 131–2 Newcastle upon Tyne 214 Newton, Isaac 140–1 noise 14 definition 10 developmental 10 as intellectual dross 11 re-appraisal of 11–3 non-shared environment 37 Nosek, Brian 129 noses 49–51 Nottingham 217 Numbers-Needed-to-Treat (NNTs) 168, 169 nurture, influence of 44 O’Connor, Sarah 217–8 Office for National Statistics 89, 92, 98, 99–100, 216 O’Neill, Onora 238 opinions 21, 59 order 11–2, 13 organ donation campaign 155–6 outside influence 44 overclaiming 134–5 overconfidence 21 overseas business expansion 76–8 Oxfam, sexual abuse scandal 210 Paphides, Pete 52–3 parental behaviour 41 parents, impact of 41 Parris, Matthew 63 parthenogenesis 1–2 particularism 271–2n15 particularity problem, the 93 past, the, lessons of 102–7, 107–10 pattern-making instinct 21 patterns 13 pendulums 57 perceptual systems 64 performance 72–5 personalized medicine 181–3 Peto, Richard 47–8 phenotypes 8 physiognomy, and 
character 50 plausibility 132 Plomin, Robert 43–4, 49, 57 pluralism 231–2 polarization 235 policy making 231–2 appraisal 277n4 chances of success 208 failures 204–9 governing for uncertainty 239–41 and probability 178–9 secret of 209 seminar 207–8 sequential changes 208 political assumptions, fall of 20 political beliefs 60–6 population validity 263n18 populism, rise of 20 poverty 240–1 Prasad, Vinayak 131–2 precision 183 predictability 28 predictive weakness 165–6, 170–3 preferences 59, 62 deep 65 priming 126–8 probabilistic knowledge 160, 161, 163–4, 170, 172–3 probability 54, 70, 107, 258n25, 272n2 advantages 177–80 base-rate neglect 176–7 difference in 30 fear of low probabilities 166 individual level 174–6, 178–9, 181–3 limits and limitations 160–83 marginal 176–7 medical effect 167–9, 169, 170–4 paradox 170 and policy making 178–9 predictive weakness 165–6 problem of scale 161–6, 174– 6, 177–80, 181–3 recognizing significance 161 risk evaluation 161–6 suggestion of knowledge 180 use of 242 usefulness 161 problems, conceptualizing 17 productivity growth 209–10 progress, knowledge as obstacle to 17 psychoanalysis 58 psychology 58 Pullinger, John 278n14 Pullman, Philip 37 quantification, risk and risk-taking 162–5 racism 125–6 radical uncertainty 106, 107 Radio, Andrew 102 rage to conclude, the 139 randomized controlled trials, value of 280n6 randomness, pure 9 Ranieri, Claudio 200–1 rationality 68, 260n6, 260n8 reality 230, 245, 254n14 reciprocity 155 reflection 65–6 regularity 73, 160 assumption of 212–4 expectations of 47, 202–4 search for 212, 230 statistical 240–1 replication crisis 18, 111–7, 117– 22, 129, 136, 138 Replication Project 129 research 111–39 balance of evidence 114 breadth 130 claims inflation 130 confidence in 115–6 credibility crisis 18 decision rules 136–9 depth 130 evidence-based medicine 133–4 false negatives 113–4 false positives 113–4, 122 fragility 128–9 freedom 122–9 half wrong 113, 115–6 the Janus effect 121, 122–9 limitations of 
117–22 micro-particulars 128 multiple analyses 125–6 multiple conclusions 117–22 overclaiming 134–5 priming 126–8 redemption 20 replication crisis 111–7, 117– 22, 129, 136, 138 rigour 19 scepticism 115–6 standards 129–36 statistical significance 122 triangulation 138 validity 101–2 research-credibility crisis 18 rigour 19, 246 risk and risk-taking 70–1, 107, 186 alcohol consumption 180 cancer 162–3, 166, 174–5 communication of 133 evaluation 161–6 heart disease 163–6 quantification 162–5, 166 quantified 187 risk-perception 71 Rockhill, Beverly 181 Rolling Stone magazine 23 Rose, Geoffrey 175–6 Rowntree Joseph 146–7 Royal Bank of Scotland 211 Russell, Bertrand 202, 202–3 samples, validity 100–2 Sampson, Robert 26–34, 42, 236 sanitation 225–9 Santayana, George 109 scale, problem of 161–6, 174–6, 177–80, 181–3 scepticism 105, 115–6, 128, 206 schizophrenia 34–6, 256n10 Science 56 Scientific American 55 Scotland, Triple-P parenting programme 206 screening 132–3, 177 searing memory, doctrine of the 102–7 selection bias 244 self-understanding 67 Sense about 115 serendipitous events 43, 52–3 sex education, role of 189–90 short-term gene–environment interaction 7 significance, recognizing 161 Silberzahn, Raphael 125–6 Simmons, Joseph 122–3 situated choice 31–3, 34, 42 situations, appraisal of 72 sliding-doors moments 50 small differences, power of 56–7 small effects, influence of 49–54 small experiences, influence of 35–7 smartphones 97, 191 Smith, George Davey 50, 51, 234, 281n1 social contexts 31, 195 social media 191 social mobility 240–1 social policy 195 social proof 154–6 social reformers 146–7 social science, utility of 146–50 special theory of relativity 140–1 Spiegelhalter, David 180, 244–5 spontaneous interaction 9 stagflation 103 statins 171 statistical regularities 240–1 statistical significance 122, 137 stents, use of 131 stories and storytelling 25–6, 53–4, 244–5, 247, 248, 258n25, 258n27 structural forces 54 Sun, the 51 support factors 194 Surfers Against 
Sewage 70–1 surgeons, skills 73–4 system 1 thinking 149 systematic forces 54 systems-level thinking 153 Tamil Nadu 79–82, 101–2 Tangiers, Morocco 84 technology, changing 108 Teen Mom (TV show) 193 teenage pregnancy rate decline in 184, 188–96 estimates 275n3 terrible simplifiers 255n20 Tesco 77, 211 Thaler, Richard 157 theories 140–59 analytic validity 158 arguments about 150–4 of crime 142–6, 143 development economics 150–3 fitness 157 implementation 152 limitations 157 and practice 141 refining 156–7 relevance 157–8 social science 146–50 tension in 154–9 using 156–7 ‘thick’ description 86 time, validity across 107–10 Time magazine 193 time variations, and knowledge 87–100, 91, 93, 94, 95 The Times 63 toilets 225–9 Toshiba 211 trade-offs 190–1 trends 54 trials 156 triangulation 138, 233–4 Triple-P parenting programme 206–7 trivia, importance of 84–5 true uncertainty 107 Trump, Donald 20, 218, 222, 223–4 trust 238 trust deficit 218 trustworthiness 238 Tufte, Edward 139 turning points, variety 49–54 TV crime shows 143, 143 twins and twin studies conjoined 39–42 identical 34–7, 39, 256n10 Tyson, Mike 23, 23–6 Tyson, Rodney 24–5, 255n3 Uhlmann, Eric 125–6 uncertainty 89–90, 100, 209– 12, 254n14 admitting 238 communicating 237–9 data 89–91 embracing 234–6 erratic 93 governing for 239–41 Knightian 107 language of 238 managing for 241–2 in medicine 167–9, 169, 170–4 perpetual 230 radical 106, 107 true 107 uncertainty laundering 268n33 understanding hidden half of 13 limiting effects on 14 limits of 54 unemployment 221–2, 263n17 unintended consequences 105, 229 United States of America China trade 220–3 incarceration rates 222, 240, 280n10 labour market 221 minimum wage 266–7n10 unemployment 221–2 universal gravitational attraction, theory of 140–1 unknowns 85–7, 206 unusual, the 195 upbringing 23–5 Uyeno, Lori 47 validity across time 107–10 analytic 158, 263n18 ecological 263n18 external 101, 158, 263n18, 264n19 internal 101–2, 158 knowledge 100–2, 107–10 population 
263n18 research 101–2 samples 100–2 values 59, 232 variation, sources of 5–8 Volkswagen, diesel emissions scandal 211 Wall Street Journal 219 Wallace, Alfred Russel 259n33 Walmart 77 Watts, Duncan 68, 69, 147–50 weakest-link principle 79–82 Wedgwood, Josiah 50–1 Wellington, Duke of 51 Wesfarmers 76–7 West Germany, motorcycle thefts 142–4 Western, Bruce 54 Wilson, Harold 99 World Bank Independent Evaluation Group 79 World Health Organization 162 world picture 63–4 Wright, Sewall 253n11

More striking than what I think is how many others say something similar: that we have somehow over-reached, and now wake up to the fact that life is not the shining edifice of robust understanding that our mass of research findings suggests. These findings are prolific, to be sure. But they have started to fall over at an alarming rate, failing to replicate as scientists seek to repeat each other’s work. You might have heard talk of a replication crisis, even a crisis of expertise, or research-credibility crisis. Take a moment to absorb that phrase: ‘research’ faces a ‘credibility crisis’. We’re not sure what to believe even from people whose purpose is finding out what to believe. If the knowledge factory, of all places, can’t be relied on for knowledge, we know we’re in trouble. By one estimate, most published research is false.15 We have become, said one genuinely respected researcher, prone to building mansions of straw, rather than sturdy houses of brick.16 The crisis narrative can be overdone, partly because we have little idea whether the problem is actually any worse than it used to be.

He was more careful than that, and actually said: ‘People have had enough of experts from organisations with acronyms saying that they know what is best and getting it consistently wrong.’ Which, if that is what these experts did, would be reasonable enough. He didn’t say whether people had also had enough of politicians, and I suspect at the time he had little idea of the research credibility or replication crisis, which is more serious and remains under-reported. This book is written against that more serious background – of anxiety about failures of knowledge in the social and human sciences, coupled with a movement to raise the game. It has various names, this movement – ‘meta-science’ is one you might come across if you haven’t already. Some people say it can be a force for good, in that we are at last becoming aware of the problem, which means it can be addressed, which in turn promises better science.


Super Thinking: The Big Book of Mental Models by Gabriel Weinberg, Lauren McCann

affirmative action, Affordable Care Act / Obamacare, Airbnb, Albert Einstein, anti-pattern, Anton Chekhov, autonomous vehicles, bank run, barriers to entry, Bayesian statistics, Bernie Madoff, Bernie Sanders, Black Swan, Broken windows theory, business process, butterfly effect, Cal Newport, Clayton Christensen, cognitive dissonance, commoditize, correlation does not imply causation, crowdsourcing, Daniel Kahneman / Amos Tversky, David Attenborough, delayed gratification, deliberate practice, discounted cash flows, disruptive innovation, Donald Trump, Douglas Hofstadter, Edward Lorenz: Chaos theory, Edward Snowden, effective altruism, Elon Musk, en.wikipedia.org, experimental subject, fear of failure, feminist movement, Filter Bubble, framing effect, friendly fire, fundamental attribution error, Gödel, Escher, Bach, hindsight bias, housing crisis, Ignaz Semmelweis: hand washing, illegal immigration, income inequality, information asymmetry, Isaac Newton, Jeff Bezos, John Nash: game theory, lateral thinking, loss aversion, Louis Pasteur, Lyft, mail merge, Mark Zuckerberg, meta analysis, meta-analysis, Metcalfe’s law, Milgram experiment, minimum viable product, moral hazard, mutually assured destruction, Nash equilibrium, Network effects, nuclear winter, offshore financial centre, p-value, Parkinson's law, Paul Graham, peak oil, Peter Thiel, phenotype, Pierre-Simon Laplace, placebo effect, Potemkin village, prediction markets, premature optimization, price anchoring, principal–agent problem, publication bias, recommendation engine, remote working, replication crisis, Richard Feynman, Richard Feynman: Challenger O-ring, Richard Thaler, ride hailing / ride sharing, Robert Metcalfe, Ronald Coase, Ronald Reagan, school choice, Schrödinger's Cat, selection bias, Shai Danziger, side project, Silicon Valley, Silicon Valley startup, speech recognition, statistical model, Steve Jobs, Steve Wozniak, Steven Pinker, survivorship bias, The Present Situation in Quantum Mechanics, 
the scientific method, The Wisdom of Crowds, Thomas Kuhn: the structure of scientific revolutions, transaction costs, uber lyft, ultimatum game, uranium enrichment, urban planning, Vilfredo Pareto, wikimedia commons

By now you should know that some experimental results are just flukes. In order to be sure a study result isn’t a fluke, it needs to be replicated. Interestingly, in some fields, such as psychology, there has been a concerted effort to replicate positive results, but those efforts have found that fewer than 50 percent of positive results can be replicated. That rate is low, and this problem is aptly named the replication crisis. This final section offers some models to explain how this happens, and how you can nevertheless gain more confidence in a research area. Replication efforts are an attempt to distinguish between false positive and true positive results. Consider the chances of replication in each of these two groups. A false positive is expected to replicate—that is, a second false positive is expected to occur in a repetition of the study—only 5 percent of the time.

Using those numbers, a replication rate of 50 percent requires about 60 percent of the studies to have been true positives and 40 percent of them to have been false positives. To see this, consider 100 studies: If 60 were true positives, we would expect 48 of those to replicate (80 percent of 60). Of the remaining 40 false positives, 2 would replicate (5 percent of 40) for a total of 50. The replication rate would then be 50 per 100 studies, or 50 percent.

[Figure: Replication Crisis. Re-test 100 Studies]

So, under this scenario, about a fourth of the failed replications (12 of 50) are explained by a lack of power in the replication efforts. These are real results that would likely be replicated successfully either if an additional replication study were done or if the original replication study had a higher sample size. The rest of the results that failed to replicate should have never been positive results in the first place.
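The arithmetic in the excerpt above can be checked with a minimal sketch. It assumes the two rates the excerpt uses (true positives replicate 80 percent of the time, false positives 5 percent of the time); the function names are hypothetical and chosen only for illustration.

```python
def expected_replication_rate(true_positive_share, power=0.80, alpha=0.05):
    """Expected share of studies that replicate.

    true_positive_share: fraction of original positive results that are real.
    power: chance a real effect replicates (80 percent in the excerpt).
    alpha: chance a false positive "replicates" by luck (5 percent).
    """
    false_positive_share = 1.0 - true_positive_share
    return true_positive_share * power + false_positive_share * alpha


def true_positive_share_for(rate, power=0.80, alpha=0.05):
    """Invert the formula: which true-positive share yields a given rate?"""
    return (rate - alpha) / (power - alpha)


def underpowered_share_of_failures(true_positive_share, power=0.80, alpha=0.05):
    """Fraction of failed replications that were real effects missed for lack of power."""
    missed_true = true_positive_share * (1.0 - power)
    failed_false = (1.0 - true_positive_share) * (1.0 - alpha)
    return missed_true / (missed_true + failed_false)


if __name__ == "__main__":
    print(expected_replication_rate(0.60))       # 48 + 2 replications per 100 studies
    print(true_positive_share_for(0.50))         # share needed for a 50 percent rate
    print(underpowered_share_of_failures(0.60))  # 12 of 50 failed replications
```

With these inputs, 60 true positives give 48 replications and 40 false positives give 2, for 50 in total; of the 50 failures, 12 are real effects that the replication missed.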

After twenty-one rolls (matching the twenty-one jelly bean colors in the comic), there is about a two-thirds chance that a one is rolled at least once, i.e., that there was at least one erroneous result. If this type of data dredging happens routinely enough, then you can see why a large number of studies in the set to be replicated might have been originally false positives. In other words, in this set of one hundred studies, the base rate of false positives is likely much larger than 5 percent, and so another large part of the replication crisis can likely be explained as a base rate fallacy. Unfortunately, studies are much, much more likely to be published if they show statistically significant results, which causes publication bias. Studies that fail to find statistically significant results are still scientifically meaningful, but both researchers and publications have a bias against them for a variety of reasons. For example, there are only so many pages in a publication, and given the choice, publications would rather publish studies with significant findings over ones with none.
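The "about two-thirds" figure at the start of this excerpt follows from a one-line calculation. The sketch below assumes each roll (or each statistical test) is independent and has a 5 percent chance of producing an erroneous result, which is how a one on a twenty-sided die would behave; the function name is hypothetical.

```python
def chance_of_any_false_positive(n_tests, alpha=0.05):
    """Probability that at least one of n independent tests errs by chance.

    The chance that no test errs is (1 - alpha) ** n_tests, so the chance
    of at least one erroneous result is the complement of that.
    """
    return 1.0 - (1.0 - alpha) ** n_tests


if __name__ == "__main__":
    # Twenty-one tests, one per jelly bean color.
    print(f"{chance_of_any_false_positive(21):.3f}")
```

With twenty-one tests the result is roughly 0.66, matching the excerpt's two-thirds.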


The Ethical Algorithm: The Science of Socially Aware Algorithm Design by Michael Kearns, Aaron Roth

23andMe, affirmative action, algorithmic trading, Alvin Roth, Bayesian statistics, bitcoin, cloud computing, computer vision, crowdsourcing, Edward Snowden, Elon Musk, Filter Bubble, general-purpose programming language, Google Chrome, ImageNet competition, Lyft, medical residency, Nash equilibrium, Netflix Prize, p-value, Pareto efficiency, performance metric, personalized medicine, pre–internet, profit motive, quantitative trading / quantitative finance, RAND corporation, recommendation engine, replication crisis, ride hailing / ride sharing, Robert Bork, Ronald Coase, self-driving car, short selling, sorting algorithm, speech recognition, statistical model, Stephen Hawking, superintelligent machines, telemarketer, Turing machine, two-sided market, Vilfredo Pareto

This is why making decisions based on the data has classically been viewed with skepticism, going by various derogatory names including “data snooping,” “data dredging,” and—as we have seen—“p-hacking.” The methodological dangers presented by the combination of algorithmic and human p-hacking have generated acrimonious controversies and hand-wringing over scientific findings that don’t reflect reality. These play a central role in what is broadly referred to as the “reproducibility crisis” in science, which has its own Wikipedia page that begins: The replication crisis (or replicability crisis or reproducibility crisis) is an ongoing (2019) methodological crisis in science in which scholars have found that the results of many scientific studies are difficult or impossible to replicate or reproduce on subsequent investigation, either by independent researchers or by the original researchers themselves. The crisis has long-standing roots; the phrase was coined in the early 2010s as part of a growing awareness of the problem.

See also precise specification goal racial data and bias and algorithmic violations of fairness and privacy, 96 and college admissions models, 77 and dating preferences, 94–97 and “fairness gerrymandering,” 86–89 and fairness issues in machine learning, 65–66 and forbidden inputs, 66–67 and Google search, 14–15 and lending decisions, 191 and scope of topics covered, 19 and unique challenges of algorithms, 7 RAND Corporation, 100 randomization and differential privacy, 36–37, 40–44, 47 random lending, 69–71 random sampling, 18–19, 40 and self-play in machine learning, 131–32 and trust in data administrators, 45–47 rare events, 144 regulation of data and algorithms. See laws and regulations reidentification of anonymous data, 22–31, 33–34, 38 relationship status data, 51–52 religious affiliation data, 51–52 reproducibility (replication) crisis, 19–20, 156–60 residency hiring, 126–30 resume evaluation, 60–61 Rock-Paper-Scissors, 99–100, 102–3 Roth, Alvin, 130 RuleFit algorithms, 173 runs on banks, 95–96 sabotage, 99–100 Sandel, Michael, 177–78 SAT tests and fairness vs. accuracy of models, 65, 74–80 and predictive modeling, 8 and word analogy problems, 57 and word embedding models, 59–60 scale issues, 139, 143–45, 192.


pages: 297 words: 83,651

The Twittering Machine by Richard Seymour

4chan, anti-communist, augmented reality, Bernie Sanders, Cal Newport, Cass Sunstein, Chelsea Manning, citizen journalism, colonial rule, correlation does not imply causation, credit crunch, crowdsourcing, don't be evil, Donald Trump, Elon Musk, Erik Brynjolfsson, Filter Bubble, Google Chrome, Google Earth, hive mind, informal economy, Internet of things, invention of movable type, invention of writing, Jaron Lanier, Jony Ive, Kevin Kelly, knowledge economy, late capitalism, liberal capitalism, Mark Zuckerberg, Marshall McLuhan, meta analysis, meta-analysis, Mohammed Bouazizi, moral panic, move fast and break things, Network effects, new economy, packet switching, patent troll, Philip Mirowski, post scarcity, post-industrial society, RAND corporation, Rat Park, rent-seeking, replication crisis, sentiment analysis, Shoshana Zuboff, Silicon Valley, Silicon Valley ideology, smart cities, Snapchat, Steve Jobs, Stewart Brand, Stuxnet, TaskRabbit, technoutopianism, the scientific method, Tim Cook: Apple, undersea cable, upwardly mobile, white flight, Whole Earth Catalog, WikiLeaks

But it has done so largely by sharpening tendencies that were already in play in the old media. The complaints about ‘fake news’ indicate that the embattled political establishment has not yet mastered the new media. But the problem goes even deeper than that and, in a strange way, the myth of a ‘post-truth’ society is a bungled attempt to diagnose the rot. In the sciences, there is an ongoing ‘replication crisis’ afflicting medicine, economics, psychology and evolutionary biology. The crisis consists of the fact that the results of many scientific studies are impossible to replicate in subsequent tests. In a survey of 1,500 scientists in the journal Nature, 70 per cent of the respondents failed to replicate the findings of another scientist’s experiment.50 Half of them couldn’t even replicate their own findings.

Jane and Chris Fleming, Modern Conspiracy: The Importance of Being Paranoid, Bloomsbury: New York and London, 2014 pp. 4–5. Devorah Baum notices a similar pattern. Feeling Jewish: (A Book for Just About Anyone), Yale University Press: New Haven and London, 2017, pp. 53–5. 50. In a survey of 1,500 scientists . . . Monya Baker, ‘1,500 scientists lift the lid on reproducibility’, Nature, 25 May 2016. On the replication crisis, see Nature’s special online report, ‘Challenges in irreproducible research’, Nature, 18 October 2018. 51. According to the historian of ideas Philip Mirowski . . . Philip Mirowski, Science-Mart: Privatizing American Science, Harvard University Press: Cambridge, MA, 2011. 52. Among the worst examples of this degradation . . . C. G. Begley and L. M. Ellis, ‘Raise standards for preclinical cancer research’, Nature, 483, 2012, pp. 531–3; C.


pages: 442 words: 94,734

The Art of Statistics: Learning From Data by David Spiegelhalter

Antoine Gombaud: Chevalier de Méré, Bayesian statistics, Carmen Reinhart, complexity theory, computer vision, correlation coefficient, correlation does not imply causation, dark matter, Edmond Halley, Estimating the Reproducibility of Psychological Science, Hans Rosling, Kenneth Rogoff, meta analysis, meta-analysis, Nate Silver, Netflix Prize, p-value, placebo effect, probability theory / Blaise Pascal / Pierre de Fermat, publication bias, randomized controlled trial, recommendation engine, replication crisis, self-driving car, speech recognition, statistical model, The Design of Experiments, The Signal and the Noise by Nate Silver, The Wisdom of Crowds, Thomas Bayes, Thomas Malthus

This has led to doubts about the reliability of parts of the scientific literature, with claims that many ‘discoveries’ cannot be reproduced by other researchers – such as the continuing dispute over whether adopting an assertive posture popularly known as a ‘power pose’ can induce hormonal and other changes.8 The inappropriate use of standard statistical methods has received a fair share of the blame for what has become known as the reproducibility or replication crisis in science. With the growing availability of massive data sets and user-friendly analysis software, it might be thought that there is less need for training in statistical methods. This would be naïve in the extreme. Far from freeing us from the need for statistical skills, bigger data and the rise in the number and complexity of scientific studies make it even more difficult to draw appropriate conclusions.

Locators in italics refer to figures and tables A A/B tests 107 absolute risk 31–2, 36–7, 383 adjustment 110, 133, 135, 383 adjuvant therapy 181–5, 183–4 agricultural experiments 105–6 AI (artificial intelligence) 144–5, 185–6, 383 alcohol consumption 112–13, 299–300 aleatory uncertainty 240, 306, 383 algorithms – accuracy 163–7 – biases 179 – for classification 143–4, 148 – complex 174–7 – contests 148, 156, 175, 277–8 see also Titanic challenge – meaning of 383 – parameters 171 – performance assessment 156–63, 176, 177 – for prediction 144, 148 – robustness 178 – sensitivity 157 – specificity 157 – and statistical variability 178–9 – transparency 179–81 allocation bias 85 analysis 6–12, 15 apophenia 97, 257 Arbuthnot, John 253–5 Archbishop of Canterbury 322–3 arm-crossing behaviour 259–62, 260, 263, 268–70, 269 artificial intelligence (AI) 144–5, 185–6, 383 ascertainment bias 96, 383 assessment of statistical claims 368–71 associations 109–14, 138 autism 113 averages 46–8, 383 B bacon sandwiches 31–4 bar charts 28, 30 Bayes, Thomas 305 Bayes factors 331–2, 333, 384 Bayes’ Theorem 307, 313, 315–16, 384 Bayesian hypothesis testing 219, 305–38 Bayesian learning 331 Bayesian smoothing 330 Bayesian statistical inference 323–34, 325, 384 beauty 179 bell-shaped curves 85–91, 87 Bem, Daryl 341, 358–9 Bernoulli distribution 237, 384 best-fit lines 125, 393 biases 85, 179 bias/variance trade-off 169–70, 384 big data 145–6, 384 binary data 22, 385 binary variables 27 binomial distribution 230–6, 232, 235, 385 birth weight 85–91 blinding 101, 385 BMI (body mass index) 28 body mass index (BMI) 28 Bonferroni correction 280, 290–1, 385 boosting 172 bootstrapping 195–203, 196, 198, 200, 202, 208, 229–30, 386 bowel cancer 233–6, 235 Box, George 139 box-and-whisker plots 42, 43, 44, 45 Bradford-Hill, Austin 114 Bradford-Hill criteria 114–17 brain tumours 95–6, 135, 301–3 breast cancer screening 214–16, 215 breast cancer surgery 181–5, 183–4 Brier score 164–7, 386 Bristol Royal 
Infirmary 19–21, 56–8 C Cairo, Alberto 25, 65 calibration 161–3, 162, 386 Cambridge University 110, 111 cancer – breast 181–5, 183–4, 214–16, 215 – lung 98, 114, 266 – ovarian 361 – risk of 31–6 carbonated soft drinks 113 Cardiac Surgical Registry (CSR) 20–1 case-control studies 109, 386 categorical variables 27–8, 386 causation 96–9, 114–17, 128 reverse causation 112–15, 404 Central Limit Theorem 199, 238–9, 386–7 chance 218, 226 child heart surgery see heart surgery chi-squared goodness-of-fittest 271, 272, 387 chi-squared test of association 268–70, 387 chocolate 348 classical probability 217 classification 143–4, 148–54 classification trees 154–6, 155, 168, 174, 387 cleromancy 81 clinical trials 82–3, 99–107, 131, 280, 347 clustering 147 cohort studies 109, 387 coins 308, 309 communication 66–9, 353, 354, 364–5 complex algorithms 138–9 complexity parameters 171 computer simulation 205–7, 208 conclusions 15, 22, 347 conditional probability 214–16 confidence intervals 241–4, 243, 248–51, 250, 271–3, 335–6, 387–8 confirmatory studies 350–1, 388 confounders 110, 135, 388 confusion matrixes 157 continuous variables 46, 388 control groups 100, 389 control limits 234, 389 correlation 96–7, 113 count variables 44–6, 389 counterfactuals 97–8, 389 crime 83–5, 321–2 see also homicides Crime Survey for England and Wales 83–5 cross-sectional studies 108–9 cross-validation 170–1, 389 CSR(Cardiac Surgical Registry) 20–1 D Data 7–12, 15, 22 data collection 345 data distribution see sample distribution data ethics 371 data literacy 12, 389 data science 11, 145–6, 389 data summaries 40 data visualization 22, 25, 65–6, 69 data-dredging 12 death 9 see also mortality; murder; survival rates deduction 76 deep learning 147, 389 dependent events 214, 389 dependent variables 60, 125–6, 389 deterministic models 128–9, 138 dice 205–7, 206, 213 differences between groups of numbers 51–6 distribution 43 DNA evidence 216 dogs 179 Doll, Richard 114 doping 310–13, 311–12, 314, 315–16 
dot-diagrams 42, 43, 44, 45 dynamic graphics 71 E Ears 108–9 education 95–6, 106–7, 131, 135, 178–9 election result predictions 372–6, 375 see also opinion polls empirical distribution 197, 404 enumerative probability 217–18 epidemiology 95, 117, 389 epistemic uncertainty 240, 306, 308, 309, 390 error matrixes 157, 158, 390 errors in coding 345–6 ESP (extra-sensory perception) 341, 358–9 ethics 371 eugenics 39 expectation 231, 390 expected frequencies 32, 209–13, 211, 214–16, 215, 390 explanatory variables 126, 132–5 exploratory studies 350, 390 exposures 114, 390 external validity 82–3, 390 extra-sensory perception (ESP) 341, 358–9 F False discovery rate 280, 390 false-positives 278–80, 390 feature engineering 147, 390 Fermat, Pierre de 207 final odds 316 financial crisis of 2007–2008 139–40 financial models 139–40 Fisher, Ronald 258, 265–6, 336, 345 five-sigma results 281–2 forensic epidemiology 117, 391 forensic statistics 6 framing 391 – of numbers 24–5 – of questions 79–80 fraud 347–50 funnel plots 234, 391 G Gallup, George 81 Galton, Francis 39–40, 58, 121–2, 238–9 gambler’s fallacy 237 gambling 205–7, 206, 213 garden of forking paths 350 Gaussian distribution see normal distribution GDP (Gross Domestic Product) 8–9 gender discrimination 110, 111 Gini index 49 Gombaud, Antoine 205–7 Gross Domestic Product (GDP) 8–9 Groucho principle 358 H Happiness 9 HARKing 351–2 hazard ratios 357, 391 health 169–70 heart attacks 99–104 Heart Protection Study (HPS) 100–2, 103, 273–5, 274, 282–7 heart surgery 19–21, 22–4, 23, 56–8, 57, 93, 136–8, 137 heights 122–5, 123, 124, 127, 134, 201, 202, 243, 275–8, 276 hernia surgery 106 HES (Hospital Episode Statistics) 20–1 hierarchical modelling 328, 391 Higgs bosons 281–2 histograms 42, 43, 44, 45 homicides 1–6, 222–6, 225, 248, 270–1, 272, 287–94 Hospital Episode Statistics (HES) 20–1 hospitals 19–21, 25–7, 26, 56–61, 138 house prices 48, 112–14 HPS (Heart Protection Study) 100–2, 103, 273–5, 274, 282–7 hypergeometric 
distribution 264, 391 hypotheses 256–7 hypothesis testing 253–303, 336, 392 see also Neyman-Pearson Theory; null hypothesis significance testing; P-values I IARC (International Agency for Research in Cancer) 31 icon arrays 32–4, 33, 392 income 47–8 independent events 214, 392 independent variables 60, 126, 392 induction 76–7, 392 inductive behaviour 283 inductive inference 76–83, 78, 239, 392 infographics 69, 70 insurance 180 ‘intention to treat’ principle 100–1, 392 interactions 172, 392 internal validity 80–1, 392 International Agency for Research in Cancer (IARC) 31 inter-quartile range (IQR) 51, 89, 392 IQ 349 IQR (inter-quartile range) 49, 51, 89, 392 J Jelly beans in a jar 40–6, 48, 49, 50 K Kaggle contests 148, 156, 175, 277–8 see also Titanic challenge k-nearest neighbors algorithm 175 L LASSO 172–4 Law of Large Numbers 237, 393 law of the transposed conditional 216, 313 league tables 25, 130–1 see also tables least-squares regression lines 124, 125, 393 left-handedness 113–14, 229–33, 232 legal cases 313, 321, 331–2 likelihood 327, 336, 394 likelihood ratios 314–23, 319–20, 332, 394 line graphs 4, 5 linear models 132, 138 literal populations 91–2 logarithmic scale 44, 45, 394 logistic regression 136, 172, 173, 394 London Underground 24 loneliness 80 long-run frequency probability 218 look elsewhere effect 282 lung cancer 98, 114, 266 lurking factors 113, 135, 394–5 M Machine learning 139, 144–5, 395 mammography 214–16, 215 margins of error 189, 199, 200, 244–8, 395 mean average 46–8 mean squared error (MSE) 163–4, 165, 395 measurement 77–9 meat 31–4 media 356–8 median average 46, 47–8, 51, 89, 395 Méré, Chevalier de 205–7, 213 meta-analysis 102, 104, 395 metaphorical populations 92–3 mode 46, 48, 395 mortality 47, 113–14 MRP (multilevel regression and post-stratification) 329, 396 MSE (mean squared error) 163–4, 165, 395 mu 190 multilevel regression and post-stratification (MRP) 329, 396 multiple linear regression 132–3, 134 multiple regression 135, 136, 
396 multiple testing 278–80, 290, 396 murders 1–6, 222–6, 225, 248, 270–1, 287–94 N Names, popularity of 66, 67 National Sexual Attitudes and Lifestyle Survey (Natsal) 52, 69, 70, 73–5 natural variability 226 neural networks 174 Neyman, Jerzy 242, 283, 335–6 Neyman-Pearson Theory 282–7, 336–7 NHST (null hypothesis significance testing) 266–71, 294–7, 296 non-significant results 299, 346–7, 370 normal distribution 85–91, 87, 226, 237–9, 396–7 null hypotheses 257–65, 336, 397 null hypothesis significance testing (NHST) 266–71, 294–7, 296 O Objective priors 327 observational data 108, 114–17, 128 odds 34, 314, 316 odds ratios 34–6 one-sided tests 264, 397–8 one-tailed P-values 264, 398 opinion polls 82, 245–7, 246, 328–9 see also election result predictions ovarian cancer 361 over-fitting 167–71, 168 P P-hacking 351 P-values 264–5, 283, 285, 294–303, 336, 401 parameters 88, 240, 398 Pascal, Blaise 207 patterns 146–7 Pearson, Egon 242, 283, 336 Pearson, Karl 58 Pearson correlation coefficient 58, 59, 96–7, 126, 398 percentiles 48, 89, 398–9 performance assessment of algorithms 156–67, 176, 177 permutation tests 261–4, 263, 399 personal probability 218–19 pie charts 28, 29 placebo effect 131 placebos 100, 101, 399 planning 13–15, 344–5 Poisson distribution 223–4, 225, 270–1, 399 poker 322–3 policing 107 popes 114 population distribution 86–91, 195, 399 population growth 61–6, 62–4 population mean 190–1, 395 see also expectation populations 74–5, 80–93, 399 posterior distributions 327, 400 power of a test 285–6, 400 PPDAC (Problem, Plan, Data, Analysis, Conclusion) problem-solving cycle 13–15, 14, 108–9, 148–54, 344–8, 372–6, 400 practical significance 302, 400 prayer 107 precognition 341, 358–9 Predict 2.1 182 prediction 144, 148–54 predictive analytics 144, 400 predictor variables 392 pre-election polls see opinion polls presentation 22–7 press offices 355–6 priming 80 prior distributions 327, 400 prior odds 316 probabilistic forecasts 161, 400 probabilities, accuracy 
163–7 probability 10 meaning of 216–22, 400–1 rules of 210–13 and uncertainty 306–7 probability distribution 90, 401 probability theory 205–27, 268–71 probability trees 210–13, 212 probation decisions 180 Problem, Plan, Data, Analysis, Conclusion (PPDAC) problem-solving cycle 13–15, 14, 108–9, 148–54, 344–8, 372–6, 400 problems 13 processed meat 31–4 propensity 218 proportions, comparisons 28–37, 33, 35 prosecutor’s fallacy 216, 313 prospective cohort studies 109, 401 pseudo-random-number generators 219 publication bias 367–8 publication of findings 355 Q QRPs (questionable research practices) 350–3 quartiles 89, 402 questionable research practices (QRPs) 350–3 Quetelet, Adolphe 226 R Race 179 random forests 174 random match probability 321, 402 random observations 219 random sampling 81–2, 208, 220–2 random variables 221, 229, 402 randomization 108, 266 randomization tests 261–4, 263, 399 randomized controlled trials (RCTs) 100–2, 105–7, 114, 135, 402 randomizing devices 219, 220–1 range 49, 402 rate ratios 357, 402 Receiver Operating Characteristic (ROC) curves 157–60, 160, 402 recidivism algorithms 179–80 regression 121–40 regression analysis 125–8, 127 regression coefficients 126, 133, 403 regression modelling strategies 138–40 regression models 171–4 regression to the mean 125, 129–32, 403 regularization 170 relative risk 31, 403 reliability of data 77–9 replication crisis in science 11–12 representative sampling 82 reproducibility crisis 11–12, 297, 342–7, 403 researcher degrees of freedom 350–1 residual errors 129, 403 residuals 122–5, 403 response variables 126, 135–8 retrospective cohort studies 109, 403 reverse causation 112–15, 404 Richard III 316–21 risk, expression of 34 robust measures 51 ROC (Receiver Operating Characteristic) curves 157–60, 160, 402 Rosling, Hans 71 Royal Statistical Society 68, 79 rules for effective statistical practice 379–80 Ryanair 79 S Salmon 279 sample distribution 43 sample mean 190–1, 395 sample size 191, 192–5, 193–4, 
283–7 sampling 81–2, 93 sampling distributions 197, 404 scatter-plots 2–4, 3 scientific research 11–12 selective reporting 12, 347 sensitivity 157–60, 404 sentencing 180 Sequential Probability Ratio Test (SPRT) 292, 293 sequential testing 291–2, 404 sex ratio 253–5, 254, 261, 265 sexual partners 47, 51–6, 53, 55, 73–5, 191–201, 193–4, 196, 198, 200 Shipman, Harold 1–6, 287–94, 289, 293 shoe sizes 49 shrinkage 327, 404 sigma 190, 281–2 signal and the noise 129, 404 significance testing see null hypothesis significance testing Silver, Nate 27 Simonsohn, Uri 349–52, 366 Simpson’s Paradox 111, 112, 405 size of a test 285–6, 405 skewed distribution 43, 405 smoking 98, 114, 266 social acceptability bias 74 social physics 226 Somerton, Francis see Titanic challenge sortilege 81 sortition 81 Spearman’s rank correlation 58–60, 405 specificity 157–9, 405 speed cameras 130, 131–2 speed of light 247 sports doping 310–13, 311–12, 314, 315–16 sports teams 130–1 spread 49–51 SPRT (Sequential Probability Ratio Test) 292, 293 standard deviation 49, 88, 126, 405 standard error 231, 405–6 statins 36–7, 99–104, 273–5, 274, 282–7 statistical analysis 6–12, 15 statistical inference 208, 219, 229–51, 305–38, 323–8, 335, 404 statistical methods 12, 346–7, 379 statistical models 121, 128–9, 404 statistical practice 365–7 statistical science 2, 7, 404 statistical significance 255, 265–8, 270–82, 404 Statistical Society 68 statistics – assessment of claims 368–71 – as a discipline 10–11 – ideology 334–8 – improvements 362–4 – meaning of 404 – publications 16 – rules for effective practice 379–80 – teaching of 13–15 STEP (Study of the Therapeutic Effects of Intercessory Prayer) 107 storytelling 69–71 stratification 110, 383 Streptomycin clinical trial 105, 114 strip-charts 42, 43, 44, 45 strokes 99–104 Student’s t-statistic 275–7 Study of the Therapeutic Effects of Intercessory Prayer (STEP) 107 subjective probability 218–19 summaries 40, 49, 50, 51 supermarkets 112–14 supervised learning 
143–4, 404 support-vector machines 174 surgery – breast cancer surgery 181–5, 183–4 – heart surgery 19–21, 22–4, 23, 56–8, 57, 93, 136–8, 137 – hernia surgery 106 survival rates 25–7, 26, 56–61, 57, 60–1 systematic reviews 102–4 T T-statistic 275–7, 404 tables 22–7, 23 tail-area 231 tea tasting 266 teachers 178–9 teaching of statistics 13–15 technology 1 telephone polls 82 Titanic challenge 148–56, 150, 152–3, 155, 162, 166–7, 172, 173, 175, 176, 177, 277 transposed conditionals, law of 216, 313 trees 7–8 trends 61–6, 62–4, 67 two-sided tests 265, 397–8 two-tailed P-values 265, 398 Type I errors 283–5, 404 Type II errors 283–5, 407 U Uncertainty 208, 240, 306–7, 383, 390 uncertainty intervals 199, 200, 241, 335 unemployment 8–9, 189–91, 271–3 university education 95–6, 135, 301–3 see also Cambridge University unsupervised learning 147, 407 US Presidents 167–9 V Vaccination 113 validity of data 79–83 variability 10, 49–51, 178–9, 407 variables 27, 56–61 variance 49, 407 Vietnam War draft lottery 81–2 violence 113 virtual populations 92 volunteer bias 85 voting age 79–80 W Waitrose 112–14 weather forecasts 161, 164, 165 weight loss 348 ‘When I’m Sixty-Four’ 351–2 wisdom of crowds 39–40, 48, 51, 407 Z Z-scores 89, 407


pages: 281 words: 79,464

Against Empathy: The Case for Rational Compassion by Paul Bloom

affirmative action, Albert Einstein, Asperger Syndrome, Atul Gawande, Columbine, David Brooks, Donald Trump, effective altruism, Ferguson, Missouri, impulse control, meta analysis, meta-analysis, Paul Erdős, period drama, Peter Singer: altruism, publication bias, Ralph Waldo Emerson, replication crisis, Ronald Reagan, social intelligence, Stanford marshmallow experiment, Steven Pinker, theory of mind, Walter Mischel, Yogi Berra

Gazzaniga, The Ethical Brain: The Science of Our Moral Dilemmas (New York: Dana Press, 2005). 221 countless demonstrations For a good review of these experiments and others, see Adam Alter, Drunk Tank Pink: And Other Unexpected Forces That Shape How We Think, Feel, and Behave (New York: Penguin Books, 2013). 222 Dick Finder Example from John M. Doris, Talking to Our Selves: Reflection, Ignorance, and Agency (Oxford: Oxford University Press, 2015). 223 Jonathan Haidt captures Jonathan Haidt, “The Emotional Dog and Its Rational Tail: A Social Intuitionist Approach to Moral Judgment,” Psychological Review 108 (2001): 814–34. The issue in “repligate” For discussion, see Paul Bloom, “Psychology’s Replication Crisis Has a Silver Lining,” The Atlantic, February 19, 2016, http://www.theatlantic.com/science/archive/2016/02/psychology-studies-replicate/468537. 224 eventually published this failure Brian D. Earp et al., “Out, Damned Spot: Can the ‘Macbeth Effect’ Be Replicated?” Basic and Applied Social Psychology 36 (2014): 91–98. Your impression of a résumé Joshua M. Ackerman, Christopher C. Nocera, and John A.


Blueprint: The Evolutionary Origins of a Good Society by Nicholas A. Christakis

agricultural Revolution, Alfred Russel Wallace, Amazon Mechanical Turk, assortative mating, Cass Sunstein, crowdsourcing, David Attenborough, different worldview, disruptive innovation, double helix, epigenetics, experimental economics, experimental subject, invention of agriculture, invention of gunpowder, invention of writing, iterative process, job satisfaction, Joi Ito, joint-stock company, land tenure, Laplace demon, longitudinal study, Mahatma Gandhi, Marc Andreessen, means of production, mental accounting, meta analysis, meta-analysis, microbiome, out of africa, phenotype, Pierre-Simon Laplace, placebo effect, race to the bottom, Ralph Waldo Emerson, replication crisis, Rubik’s Cube, Silicon Valley, social intelligence, social web, stem cell, Steven Pinker, the scientific method, theory of mind, twin studies, ultimatum game, zero-sum game

Heisenberg also argued that positivists could actually undermine their own program and illustrated this with an example from the history of science in which claims of meteorites in the eighteenth century were “dismissed as rank superstition,” whereas, of course, they do exist. 37. D. Kevles, In the Name of Eugenics: Genetics and the Uses of Human Heredity (New York: Knopf, 1985); R. Merton, The Sociology of Science: Theoretical and Empirical Investigations (Chicago: University of Chicago Press, 1973). We should also be cognizant of the “replication crisis” afflicting so many branches of science in the 2010s, including psychology, economics, physics, biology, epidemiology, and oncology. 38. P. W. Anderson, “More Is Different,” Science 177 (1972): 393–396. 39. Interestingly, humans are natural-born essentialists. From an early age, we categorize objects according to fundamental commonalities, discriminate between these categories, and assign each category a basic essence.


pages: 741 words: 199,502

Human Diversity: The Biology of Gender, Race, and Class by Charles Murray

23andMe, affirmative action, Albert Einstein, Alfred Russel Wallace, Asperger Syndrome, assortative mating, basic income, bioinformatics, Cass Sunstein, correlation coefficient, Daniel Kahneman / Amos Tversky, double helix, Drosophila, epigenetics, equal pay for equal work, European colonialism, feminist movement, glass ceiling, Gunnar Myrdal, income inequality, Kenneth Arrow, labor-force participation, longitudinal study, meta analysis, meta-analysis, out of africa, p-value, phenotype, publication bias, quantitative hedge fund, randomized controlled trial, replication crisis, Richard Thaler, risk tolerance, school vouchers, Scientific racism, selective serotonin reuptake inhibitor (SSRI), Silicon Valley, social intelligence, statistical model, Steven Pinker, The Bell Curve by Richard Herrnstein and Charles Murray, the scientific method, The Wealth of Nations by Adam Smith, theory of mind, Thomas Kuhn: the structure of scientific revolutions, twin studies, universal basic income, working-age population

You can square that figure and point out that IQ explains only 25 percent of the variance in job performance. If you’re an employer, however, and are told that a standard deviation increase in IQ is associated with half a standard deviation increase in overall job performance, a predictive validity of +.50 is a big deal.18 Since we live in an age when the social sciences are suffering from a replication crisis, I emphasize again that the generalizations I have made about the relationships of g to educational attainment and job productivity are drawn from hundreds of studies. Psychometric g Versus Other Personal Traits The popular suspicion of IQ’s relationship to success has been tenacious, but for an understandable reason. Anyone who has reached adulthood is aware of all the things besides intelligence that matter in achieving success.
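The two readings of a +.50 correlation contrasted in this excerpt can be put side by side in a small sketch. It assumes a simple linear relationship between standardized variables, in which case the predicted change in the outcome, in standard deviations, equals the correlation times the change in the predictor; the function names are illustrative and not taken from the book.

```python
def variance_explained(r):
    """Share of outcome variance accounted for by the predictor (r squared)."""
    return r ** 2


def predicted_sd_change(r, predictor_sd_change=1.0):
    """Predicted change in the outcome, in SD units, under a simple linear
    model with both variables standardized (the slope then equals r)."""
    return r * predictor_sd_change


if __name__ == "__main__":
    r = 0.50  # the predictive validity quoted in the excerpt
    print(f"variance explained: {variance_explained(r):.0%}")
    print(f"SD change in outcome per 1 SD of predictor: {predicted_sd_change(r):.2f}")
```

The same correlation therefore reads as 25 percent of variance explained or as half a standard deviation of change per standard deviation of the predictor, depending on which summary is chosen.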