butter production in bangladesh

15 results


pages: 402 words: 110,972

Nerds on Wall Street: Math, Machines and Wired Markets by David J. Leinweber

AI winter, algorithmic trading, asset allocation, banking crisis, barriers to entry, Big bang: deregulation of the City of London, business cycle, butter production in bangladesh, butterfly effect, buttonwood tree, buy and hold, buy low sell high, capital asset pricing model, citizen journalism, collateralized debt obligation, corporate governance, Craig Reynolds: boids flock, creative destruction, credit crunch, Credit Default Swap, credit default swaps / collateralized debt obligations, Danny Hillis, demand response, disintermediation, distributed generation, diversification, diversified portfolio, Emanuel Derman, en.wikipedia.org, experimental economics, financial innovation, fixed income, Gordon Gekko, implied volatility, index arbitrage, index fund, information retrieval, intangible asset, Internet Archive, John Nash: game theory, Kenneth Arrow, load shedding, Long Term Capital Management, Machine translation of "The spirit is willing, but the flesh is weak." to Russian and back, market fragmentation, market microstructure, Mars Rover, Metcalfe’s law, moral hazard, mutually assured destruction, Myron Scholes, natural language processing, negative equity, Network effects, optical character recognition, paper trading, passive investing, pez dispenser, phenotype, prediction markets, quantitative hedge fund, quantitative trading / quantitative finance, QWERTY keyboard, RAND corporation, random walk, Ray Kurzweil, Renaissance Technologies, risk tolerance, risk-adjusted returns, risk/return, Robert Metcalfe, Ronald Reagan, Rubik’s Cube, semantic web, Sharpe ratio, short selling, Silicon Valley, Small Order Execution System, smart grid, smart meter, social web, South Sea Bubble, statistical arbitrage, statistical model, Steve Jobs, Steven Levy, Tacoma Narrows Bridge, the scientific method, The Wisdom of Crowds, time value of money, too big to fail, transaction costs, Turing machine, Upton Sinclair, value at risk, Vernor Vinge, yield curve, Yogi Berra, your tax dollars at work

A central theme for anyone doing this kind of forecasting is that it is remarkably easy to fool yourself. Once, as a demonstration, we set our machinery loose to find the best predictor of the year-end close for the S&P 500. We avoided any financial indicators, but used only data the UN compiled profiling 145 member nations. There were thousands of annual time series for each country. Which of all these series had the strongest correlation with U.S. stocks? Butter production in Bangladesh, with a correlation of 75 percent! Getting into the spirit, we tossed in cheese, and brought it up to 95 percent. Using only dairy products is an undiversified approach, so we added sheep population to the mix and took it up to 99 percent, in sample, over 10 years. Adding random data to a regression does that. The out-of-sample predictions are less than worthless, often negative. This business with the butter, cheese, and sheep has been widely cited.
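The claim that "adding random data to a regression does that" can be checked with a small simulation. The sketch below is not Leinweber's procedure or data: it assumes a 10-point target of pure Gaussian noise and nine equally random regressors, and the helper names `dot` and `r_squared_path` are invented for illustration. It fits ordinary least squares with an intercept by Gram-Schmidt orthogonalization and records the in-sample R-squared after each regressor is added.

```python
import random


def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def r_squared_path(y, regressors):
    """In-sample R^2 of OLS (with intercept) after adding each regressor in turn."""
    n = len(y)
    mean_y = sum(y) / n
    resid = [v - mean_y for v in y]      # residual after fitting the intercept alone
    ss_tot = dot(resid, resid)
    basis = [[1.0] * n]                  # orthogonal basis of the columns fitted so far
    path = []
    for x in regressors:
        q = list(x)
        for b in basis:                  # Gram-Schmidt: remove what earlier columns explain
            coef = dot(q, b) / dot(b, b)
            q = [qi - coef * bi for qi, bi in zip(q, b)]
        qq = dot(q, q)
        if qq > 1e-12:                   # ignore numerically collinear columns
            coef = dot(resid, q) / qq
            resid = [ri - coef * qi for ri, qi in zip(resid, q)]
            basis.append(q)
        path.append(1.0 - dot(resid, resid) / ss_tot)
    return path


if __name__ == "__main__":
    rng = random.Random(1)               # arbitrary seed, chosen only for repeatability
    target = [rng.gauss(0, 1) for _ in range(10)]                       # 10 "years" of noise
    noise = [[rng.gauss(0, 1) for _ in range(10)] for _ in range(9)]    # 9 irrelevant series
    for k, r2 in enumerate(r_squared_path(target, noise), start=1):
        print(f"{k} random regressors: in-sample R^2 = {r2:.3f}")
```

Each added column can only lower the in-sample residual, so R-squared never falls. With nine regressors plus an intercept there are as many parameters as observations, and the fit becomes essentially exact even though every input is noise.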

The Wikipedia patrol does not have much more of a sense of humor than Don Rice did, and it keeps getting edited out and replaced by dry bibliographic material and biographical details. Fortunately, they can’t erase a book. Never underestimate the ability of people not to get the gag. Chapter 6, “Stupid Data Miner Tricks,” is very much in the spirit of “The Tumescent Threat,” but I still get calls asking about current butter production in Bangladesh. 6. It is now complete, and is utterly awesome. See the video at www.deltawerken.com/The-Oosterschelde-storm-surge-barrier/324.html. This is one of the premier flood control projects in the world, and particularly instructive when compared with the misplaced concrete slabs in New Orleans. 7. RAND had some distinguished financial alumni, foremost among them Harry Markowitz and Bill Sharpe. The ideas of operations research and optimization of risk and reward under constraints in military problems generalized, as we have seen, to a wide swath of finance. 8.

The example in this paper is intended as a blatant instance of totally bogus application of data mining in finance. My quant equity research group first did this several years ago to make the point about the need to be aware of the risks of data mining in quantitative investing. In total disregard of common sense, we showed the strong statistical association between the annual changes in the S&P 500 index and butter production in Bangladesh, along with other farm products. Reporters picked up on it, and it has found its way into the curriculum at the Stanford Business School and elsewhere. We never published it, since it was supposed to be a joke. With all the requests for the nonexistent publication, and the graying out of many generations of copies of copies of the charts, it seemed to be time to write it up for real.


pages: 295 words: 66,824

A Mathematician Plays the Stock Market by John Allen Paulos

Benoit Mandelbrot, Black-Scholes formula, Brownian motion, business climate, business cycle, butter production in bangladesh, butterfly effect, capital asset pricing model, correlation coefficient, correlation does not imply causation, Daniel Kahneman / Amos Tversky, diversified portfolio, dogs of the Dow, Donald Trump, double entry bookkeeping, Elliott wave, endowment effect, Erdős number, Eugene Fama: efficient market hypothesis, four colour theorem, George Gilder, global village, greed is good, index fund, intangible asset, invisible hand, Isaac Newton, John Nash: game theory, Long Term Capital Management, loss aversion, Louis Bachelier, mandelbrot fractal, margin call, mental accounting, Myron Scholes, Nash equilibrium, Network effects, passive investing, Paul Erdős, Paul Samuelson, Ponzi scheme, price anchoring, Ralph Nelson Elliott, random walk, Richard Thaler, Robert Shiller, Robert Shiller, short selling, six sigma, Stephen Hawking, stocks for the long run, survivorship bias, transaction costs, ultimatum game, Vanguard fund, Yogi Berra

People commonly pore over price and trade data attempting to discover investment schemes that have worked in the past. In a reductio ad absurdum of such unfocused fishing for associations, David Leinweber in the mid-90s exhaustively searched the economic data on a United Nations CD-ROM and found that the best predictor of the value of the S&P 500 stock index was—a drum roll here—butter production in Bangladesh. Needless to say, butter production in Bangladesh has probably not remained the best predictor of the S&P 500. Whatever rules and regularities are discovered within a sample must be applied to new data if they’re to be accorded any limited credibility. You can always arbitrarily define a class of stocks that in retrospect does extraordinarily well, but will it continue to do so? I’m reminded of a well-known paradox devised (for a different purpose) by the philosopher Nelson Goodman.
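The point about applying a discovered rule to new data can be illustrated with a short simulation. This is a sketch under stated assumptions, not a reconstruction of the UN search. The target and all 5,000 candidate series are independent random walks, the first 10 points are the "sample", the next 10 are the holdout, and `pearson`, `random_walk`, and `best_spurious_fit` are hypothetical helper names.

```python
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def random_walk(rng, n):
    """Cumulative sum of n standard normal steps."""
    level, path = 0.0, []
    for _ in range(n):
        level += rng.gauss(0, 1)
        path.append(level)
    return path


def best_spurious_fit(n_series=5000, n_fit=10, n_holdout=10, seed=0):
    """Pick the candidate with the largest |correlation| on the fit window.

    Returns (in-sample r, holdout r) for that single winning candidate.
    """
    rng = random.Random(seed)
    target = random_walk(rng, n_fit + n_holdout)
    best = None
    for _ in range(n_series):
        cand = random_walk(rng, n_fit + n_holdout)
        r_in = pearson(cand[:n_fit], target[:n_fit])
        if best is None or abs(r_in) > abs(best[0]):
            r_out = pearson(cand[n_fit:], target[n_fit:])
            best = (r_in, r_out)
    return best


if __name__ == "__main__":
    r_in, r_out = best_spurious_fit()
    print(f"best in-sample correlation: {r_in:+.2f}")
    print(f"same series on holdout:     {r_out:+.2f}")
```

The winner is selected only on the first window, so its high in-sample correlation reflects the size of the search rather than any relationship. The holdout figure is the honest test, and rerunning with different seeds shows it scattering with no consistent sign.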


pages: 263 words: 75,455

Quantitative Value: A Practitioner's Guide to Automating Intelligent Investment and Eliminating Behavioral Errors by Wesley R. Gray, Tobias E. Carlisle

activist fund / activist shareholder / activist investor, Albert Einstein, Andrei Shleifer, asset allocation, Atul Gawande, backtesting, beat the dealer, Black Swan, business cycle, butter production in bangladesh, buy and hold, capital asset pricing model, Checklist Manifesto, cognitive bias, compound rate of return, corporate governance, correlation coefficient, credit crunch, Daniel Kahneman / Amos Tversky, discounted cash flows, Edward Thorp, Eugene Fama: efficient market hypothesis, forensic accounting, hindsight bias, intangible asset, Louis Bachelier, p-value, passive investing, performance metric, quantitative hedge fund, random walk, Richard Thaler, risk-adjusted returns, Robert Shiller, Robert Shiller, shareholder value, Sharpe ratio, short selling, statistical model, survivorship bias, systematic trading, The Myth of the Rational Market, time value of money, transaction costs

A regression analysis of, for example, height and weight in humans shows a strong, positive relationship of approximately 0.7 or 70 percent. The taller you are, the heavier you are likely to be. After running a regression analysis of the UN's international data series for all 140 member countries, Leinweber made a stunning discovery. A simple dairy product from an unlikely country explained 75 percent of the variation in the S&P 500. What was it? Butter production in Bangladesh. Leinweber knew he was on to something. Maybe he could do better by including global data on a broader selection of dairy products. What about including cheese and U.S. production? Leinweber consulted the data. Amazingly, the R-squared vaulted to 95 percent accuracy. But what was driving these returns? By including a third variable— sheep population—Leinweber found that he could explain 99 percent of the movement in the S&P 500 for the period 1983 to 1993.

Close to a perfect fit. Leinweber didn't immediately publish his findings. They seemed too good to be true. Reporters picked up on Leinweber's study, and the research finding found its way into the curriculum at the Stanford Graduate School of Business and elsewhere. Leinweber started getting calls from investors about the status of butter production in Bangladesh. With the charts fading from being copied time and time again, he decided to write up the study and publish it. Leinweber's study was of course meant as a joke to illustrate the dangers of data mining. Data mining is the practice of analyzing huge amounts of data to find relationships between data series that are merely coincidental over the period analyzed. Bangladeshi butter production, for example, is useless as a predictor of the S&P 500 before 1983 or after 1993.


pages: 502 words: 107,657

Predictive Analytics: The Power to Predict Who Will Click, Buy, Lie, or Die by Eric Siegel

Albert Einstein, algorithmic trading, Amazon Mechanical Turk, Apple's 1984 Super Bowl advert, backtesting, Black Swan, book scanning, bounce rate, business intelligence, business process, butter production in bangladesh, call centre, Charles Lindbergh, commoditize, computer age, conceptual framework, correlation does not imply causation, crowdsourcing, dark matter, data is the new oil, en.wikipedia.org, Erik Brynjolfsson, Everything should be made as simple as possible, experimental subject, Google Glasses, happiness index / gross national happiness, job satisfaction, Johann Wolfgang von Goethe, lifelogging, Machine translation of "The spirit is willing, but the flesh is weak." to Russian and back, mass immigration, Moneyball by Michael Lewis explains big data, Nate Silver, natural language processing, Netflix Prize, Network effects, Norbert Wiener, personalized medicine, placebo effect, prediction markets, Ray Kurzweil, recommendation engine, risk-adjusted returns, Ronald Coase, Search for Extraterrestrial Intelligence, self-driving car, sentiment analysis, Shai Danziger, software as a service, speech recognition, statistical model, Steven Levy, text mining, the scientific method, The Signal and the Noise by Nate Silver, The Wisdom of Crowds, Thomas Bayes, Thomas Davenport, Turing test, Watson beat the top human players on Jeopardy!, X Prize, Yogi Berra, zero-sum game

—British Prime Minister Benjamin Disraeli (quote popularized by Mark Twain) An unlimited amount of computational resources is like dynamite: If used properly, it can move mountains. Used improperly, it can blow up your garage or your portfolio. —David Leinweber, Nerds on Wall Street A few years ago, Berkeley Professor David Leinweber made waves with his discovery that the annual closing price of the S&P 500 stock market index could have been predicted from 1983 to 1993 by the rate of butter production in Bangladesh. Bangladesh’s butter production mathematically explains 75 percent of the index’s variation over that time. Urgent calls were placed to the Credibility Police, since it certainly cannot be believed that Bangladesh’s butter is closely tied to the U.S. stock market. If its butter production boomed or went bust in any given year, how could it be reasonable to assume that U.S. stocks would follow suit?

If, instead of looking at how one factor simply shadows another, you apply the dynamics of machine learning to create models that combine factors, the match can appear even more perfect. It’s a catchphrase favored by naysayers: “Hey, throw in something irrelevant like the daily temperature as another factor, and a regression model gets better—what does that say about this kind of analysis?” Leinweber got as far as 99 percent accuracy predicting the S&P 500 by allowing a regression model to work with not only Bangladesh’s butter production, but Bangladesh’s sheep population, U.S. butter production, and U.S. cheese production. As a lactose-intolerant data scientist, I protest! Leinweber attracted the attention he sought, but his lesson didn’t seem to sink in. “I got calls for years asking me what the current butter business in Bangladesh was looking like and I kept saying, ‘Ya know, it was a joke, it was a joke!’ It’s scary how few people actually get that.”


pages: 267 words: 71,941

How to Predict the Unpredictable by William Poundstone

accounting loophole / creative accounting, Albert Einstein, Bernie Madoff, Brownian motion, business cycle, butter production in bangladesh, buy and hold, buy low sell high, call centre, centre right, Claude Shannon: information theory, computer age, crowdsourcing, Daniel Kahneman / Amos Tversky, Edward Thorp, Firefox, fixed income, forensic accounting, high net worth, index card, index fund, John von Neumann, market bubble, money market fund, pattern recognition, Paul Samuelson, Ponzi scheme, prediction markets, random walk, Richard Thaler, risk-adjusted returns, Robert Shiller, Robert Shiller, Rubik’s Cube, statistical model, Steven Pinker, transaction costs

The familiar caveat is that the future could be different from the past, perhaps in ways we can’t envision. This applies to any trading system, including having no system at all. There is also a subtler issue. A trading system can be “over-fitted” to the data. Economist and money manager David J. Leinweber supplied a classic example. He searched UN statistics to determine that the best predictor of S&P 500 performance was … butter production in Bangladesh. The connection was, of course, just a coincidence. Leinweber’s point was that not all that correlates is gold. While no one would be so daft as to use butter production as a buy signal for stocks, it’s not always easy to tell what’s a useful predictor. It’s been seriously or semiseriously proposed that hemlines, sunspots, and the political party in the White House predict stock market returns.


pages: 256 words: 60,620

Think Twice: Harnessing the Power of Counterintuition by Michael J. Mauboussin

affirmative action, asset allocation, Atul Gawande, availability heuristic, Benoit Mandelbrot, Bernie Madoff, Black Swan, butter production in bangladesh, Cass Sunstein, choice architecture, Clayton Christensen, cognitive dissonance, collateralized debt obligation, Daniel Kahneman / Amos Tversky, deliberate practice, disruptive innovation, Edward Thorp, experimental economics, financial innovation, framing effect, fundamental attribution error, Geoffrey West, Santa Fe Institute, George Akerlof, hindsight bias, hiring and firing, information asymmetry, libertarian paternalism, Long Term Capital Management, loose coupling, loss aversion, mandelbrot fractal, Menlo Park, meta analysis, meta-analysis, money market fund, Murray Gell-Mann, Netflix Prize, pattern recognition, Philip Mirowski, placebo effect, Ponzi scheme, prediction markets, presumed consent, Richard Thaler, Robert Shiller, Robert Shiller, statistical model, Steven Pinker, The Wisdom of Crowds, ultimatum game

One favorite is the Super Bowl Indicator, invariably trotted out after the football season’s championship game. The indicator is simple: the stock market goes up when a National Football Conference team wins and goes down when an American Football Conference team wins. The Super Bowl winner has correctly predicted the stock market’s direction nearly 80 percent of the time from 1967 to 2008. Another is David Leinweber’s analysis that shows a 75 percent correlation between butter production in Bangladesh and the level of the Standard & Poor’s 500 Stock Index (1981–1993). Leinweber mined a wide range of international data series and was pleased to find that “a simple dairy product” explained so much.15 Leinweber used a silly example to make a serious point: the failure to distinguish between correlation and causality. This problem arises when researchers observe a correlation between two variables and assume that one caused the other.


pages: 407 words: 114,478

The Four Pillars of Investing: Lessons for Building a Winning Portfolio by William J. Bernstein

asset allocation, Bretton Woods, British Empire, business cycle, butter production in bangladesh, buy and hold, buy low sell high, carried interest, corporate governance, cuban missile crisis, Daniel Kahneman / Amos Tversky, Dava Sobel, diversification, diversified portfolio, Edmond Halley, equity premium, estate planning, Eugene Fama: efficient market hypothesis, financial independence, financial innovation, fixed income, George Santayana, German hyperinflation, high net worth, hindsight bias, Hyman Minsky, index fund, invention of the telegraph, Isaac Newton, John Harrison: Longitude, Long Term Capital Management, loss aversion, market bubble, mental accounting, money market fund, mortgage debt, new economy, pattern recognition, Paul Samuelson, quantitative easing, railway mania, random walk, Richard Thaler, risk tolerance, risk/return, Robert Shiller, Robert Shiller, South Sea Bubble, stocks for the long run, stocks for the long term, survivorship bias, The inhabitant of London could order by telephone, sipping his morning tea in bed, the various products of the whole earth, the rule of 72, transaction costs, Vanguard fund, yield curve, zero-sum game

The classic, if somewhat hackneyed, example of this is the “Super Bowl Indicator”: when a team from the old NFL wins, the market does well, and when a team from the old AFL wins, it does poorly. In fact, if one analyzes a lot of random data, it is not too difficult to find some things that seem to correlate closely with market returns. For example, on a lark, David Leinweber of First Quadrant sifted through a United Nations database and discovered that movements in the stock market were almost perfectly correlated with butter production in Bangladesh. This is not one I’d want to test going forward with my own money. Fama’s timing, though, was perfect. He came to the University of Chicago for graduate work not long after Merrill Lynch had funded the Center for Research in Security Prices (CRSP) in Chicago. This remarkable organization, with the availability of the electronic computer, made possible the storage and analysis of a mass and quality of stock data that Cowles could only dream of.


pages: 348 words: 83,490

More Than You Know: Finding Financial Wisdom in Unconventional Places (Updated and Expanded) by Michael J. Mauboussin

Albert Einstein, Andrei Shleifer, Atul Gawande, availability heuristic, beat the dealer, Benoit Mandelbrot, Black Swan, Brownian motion, butter production in bangladesh, buy and hold, capital asset pricing model, Clayton Christensen, clockwork universe, complexity theory, corporate governance, creative destruction, Daniel Kahneman / Amos Tversky, deliberate practice, demographic transition, discounted cash flows, disruptive innovation, diversification, diversified portfolio, dogs of the Dow, Drosophila, Edward Thorp, en.wikipedia.org, equity premium, Eugene Fama: efficient market hypothesis, fixed income, framing effect, functional fixedness, hindsight bias, hiring and firing, Howard Rheingold, index fund, information asymmetry, intangible asset, invisible hand, Isaac Newton, Jeff Bezos, Kenneth Arrow, Laplace demon, Long Term Capital Management, loss aversion, mandelbrot fractal, margin call, market bubble, Menlo Park, mental accounting, Milgram experiment, Murray Gell-Mann, Nash equilibrium, new economy, Paul Samuelson, Pierre-Simon Laplace, quantitative trading / quantitative finance, random walk, Richard Florida, Richard Thaler, Robert Shiller, Robert Shiller, shareholder value, statistical model, Steven Pinker, stocks for the long run, survivorship bias, The Wisdom of Crowds, transaction costs, traveling salesman, value at risk, wealth creators, women in the workforce, zero-sum game

Investor Risks As this discussion illustrates, investors should be wary of explanations for market activity. Investors that actively seek explanations for the market’s moves risk one of two pitfalls. The first pitfall is confusing correlation for causality. Certain events may be correlated to the market’s moves but may not be at all causal. In one extreme example, Cal Tech’s David Leinweber found that the single best predictor of the S&P 500 Index’s performance was butter production in Bangladesh.7 While no thoughtful investor would use butter production for predicting or explaining the market, factors that are economically closer to home may also suggest faulty causation. The second pitfall is anchoring. Substantial evidence suggests that people anchor on the first number or piece of evidence they hear to explain or describe an event. In one example, researchers asked participants to estimate the percentage of African countries in the United Nations.


pages: 407 words: 104,622

The Man Who Solved the Market: How Jim Simons Launched the Quant Revolution by Gregory Zuckerman

affirmative action, Affordable Care Act / Obamacare, Albert Einstein, Andrew Wiles, automated trading system, backtesting, Bayesian statistics, beat the dealer, Benoit Mandelbrot, Berlin Wall, Bernie Madoff, blockchain, Brownian motion, butter production in bangladesh, buy and hold, buy low sell high, Claude Shannon: information theory, computer age, computerized trading, Credit Default Swap, Daniel Kahneman / Amos Tversky, diversified portfolio, Donald Trump, Edward Thorp, Elon Musk, Emanuel Derman, endowment effect, Flash crash, George Gilder, Gordon Gekko, illegal immigration, index card, index fund, Isaac Newton, John Meriwether, John Nash: game theory, John von Neumann, Loma Prieta earthquake, Long Term Capital Management, loss aversion, Louis Bachelier, mandelbrot fractal, margin call, Mark Zuckerberg, More Guns, Less Crime, Myron Scholes, Naomi Klein, natural language processing, obamacare, p-value, pattern recognition, Peter Thiel, Ponzi scheme, prediction markets, quantitative hedge fund, quantitative trading / quantitative finance, random walk, Renaissance Technologies, Richard Thaler, Robert Mercer, Ronald Reagan, self-driving car, Sharpe ratio, Silicon Valley, sovereign wealth fund, speech recognition, statistical arbitrage, statistical model, Steve Jobs, stochastic process, the scientific method, Thomas Bayes, transaction costs, Turing machine

If one spends enough time sorting data, it’s not hard to identify trades that seem to generate stellar returns but are produced by happenstance. Quants call this flawed approach data overfitting. To highlight the folly of relying on signals with little logic behind them, quant investor David Leinweber later would determine that US stock returns can be predicted with 99 percent accuracy by combining data for the annual butter production in Bangladesh, US cheese production, and the population of sheep in Bangladesh and the US.4 Often, the Renaissance researchers’ solution was to place such head-scratching signals in their trading system, but to limit the money allocated to them, at least at first, as they worked to develop an understanding of why the anomalies appeared. Over time, they frequently discovered reasonable explanations, giving Medallion a leg up on firms that had dismissed the phenomena.


pages: 416 words: 118,592

A Random Walk Down Wall Street: The Time-Tested Strategy for Successful Investing by Burton G. Malkiel

accounting loophole / creative accounting, Albert Einstein, asset allocation, asset-backed security, backtesting, beat the dealer, Bernie Madoff, BRICs, butter production in bangladesh, buy and hold, capital asset pricing model, compound rate of return, correlation coefficient, Credit Default Swap, Daniel Kahneman / Amos Tversky, diversification, diversified portfolio, dogs of the Dow, Edward Thorp, Elliott wave, Eugene Fama: efficient market hypothesis, experimental subject, feminist movement, financial innovation, fixed income, framing effect, hindsight bias, Home mortgage interest deduction, index fund, invisible hand, Isaac Newton, Long Term Capital Management, loss aversion, margin call, market bubble, money market fund, mortgage tax deduction, new economy, Own Your Own Home, passive investing, Paul Samuelson, pets.com, Ponzi scheme, price stability, profit maximization, publish or perish, purchasing power parity, RAND corporation, random walk, Richard Thaler, risk tolerance, risk-adjusted returns, risk/return, Robert Shiller, Robert Shiller, short selling, Silicon Valley, South Sea Bubble, stocks for the long run, survivorship bias, The Myth of the Rational Market, the rule of 72, The Wisdom of Crowds, transaction costs, Vanguard fund, zero-coupon bond

While the indicator sometimes fails, it has been correct far more often than it has been wrong. Naturally, it makes no sense. The results of the Super Bowl indicator simply illustrate nothing more than the fact that it’s sometimes possible to correlate two completely unrelated events. Indeed, Mark Hulbert reports that the stock-market researcher David Leinweber found that the indicator most closely correlated with the S&P 500 Index is the volume of butter production in Bangladesh. The Odd-Lot Theory The odd-lot theory holds that except for the investor who is always right, no one can contribute more to a successful investment strategy than an investor who is invariably wrong. The “odd-lotter,” according to popular superstition, is that kind of person. Thus, success is assured by buying when the odd-lotter sells and selling when the odd-lotter buys. Odd-lotters are the people who trade stocks in less than 100-share lots (called round lots).


pages: 482 words: 121,672

A Random Walk Down Wall Street: The Time-Tested Strategy for Successful Investing (Eleventh Edition) by Burton G. Malkiel

accounting loophole / creative accounting, Albert Einstein, asset allocation, asset-backed security, beat the dealer, Bernie Madoff, bitcoin, butter production in bangladesh, buttonwood tree, buy and hold, capital asset pricing model, compound rate of return, correlation coefficient, Credit Default Swap, Daniel Kahneman / Amos Tversky, Detroit bankruptcy, diversification, diversified portfolio, dogs of the Dow, Edward Thorp, Elliott wave, Eugene Fama: efficient market hypothesis, experimental subject, feminist movement, financial innovation, financial repression, fixed income, framing effect, George Santayana, hindsight bias, Home mortgage interest deduction, index fund, invisible hand, Isaac Newton, Long Term Capital Management, loss aversion, margin call, market bubble, money market fund, mortgage tax deduction, new economy, Own Your Own Home, passive investing, Paul Samuelson, pets.com, Ponzi scheme, price stability, profit maximization, publish or perish, purchasing power parity, RAND corporation, random walk, Richard Thaler, risk tolerance, risk-adjusted returns, risk/return, Robert Shiller, Robert Shiller, short selling, Silicon Valley, South Sea Bubble, stocks for the long run, survivorship bias, the rule of 72, The Wisdom of Crowds, transaction costs, Vanguard fund, zero-coupon bond, zero-sum game

Although the indicator sometimes fails, it has been correct far more often than it has been wrong. Naturally, it makes no sense. The results of the Super Bowl indicator simply illustrate nothing more than the fact that it’s sometimes possible to correlate two completely unrelated events. Indeed, Mark Hulbert reports that the stock-market researcher David Leinweber found that the indicator most closely correlated with the S&P 500 Index is the volume of butter production in Bangladesh. The Odd-Lot Theory The odd-lot theory holds that except for the investor who is always right, no one can contribute more to a successful investment strategy than an investor who is invariably wrong. The “odd-lotter,” according to popular superstition, is that kind of person. Thus, success is assured by buying when the odd-lotter sells and selling when the odd-lotter buys. Odd-lotters are the people who trade stocks in less than 100-share lots (called round lots).


How I Became a Quant: Insights From 25 of Wall Street's Elite by Richard R. Lindsey, Barry Schachter

Albert Einstein, algorithmic trading, Andrew Wiles, Antoine Gombaud: Chevalier de Méré, asset allocation, asset-backed security, backtesting, bank run, banking crisis, Black-Scholes formula, Bonfire of the Vanities, Bretton Woods, Brownian motion, business cycle, business process, butter production in bangladesh, buy and hold, buy low sell high, capital asset pricing model, centre right, collateralized debt obligation, commoditize, computerized markets, corporate governance, correlation coefficient, creative destruction, Credit Default Swap, credit default swaps / collateralized debt obligations, currency manipulation / currency intervention, discounted cash flows, disintermediation, diversification, Donald Knuth, Edward Thorp, Emanuel Derman, en.wikipedia.org, Eugene Fama: efficient market hypothesis, financial innovation, fixed income, full employment, George Akerlof, Gordon Gekko, hiring and firing, implied volatility, index fund, interest rate derivative, interest rate swap, John von Neumann, linear programming, Loma Prieta earthquake, Long Term Capital Management, margin call, market friction, market microstructure, martingale, merger arbitrage, Myron Scholes, Nick Leeson, P = NP, pattern recognition, Paul Samuelson, pensions crisis, performance metric, prediction markets, profit maximization, purchasing power parity, quantitative trading / quantitative finance, QWERTY keyboard, RAND corporation, random walk, Ray Kurzweil, Richard Feynman, Richard Stallman, risk-adjusted returns, risk/return, shareholder value, Sharpe ratio, short selling, Silicon Valley, six sigma, sorting algorithm, statistical arbitrage, statistical model, stem cell, Steven Levy, stochastic process, systematic trading, technology bubble, The Great Moderation, the scientific method, too big to fail, trade route, transaction costs, transfer pricing, value at risk, volatility smile, Wiener process, yield curve, young professional

A central theme for anyone doing this kind of forecasting is that it is remarkably easy to fool yourself. Once as a demonstration, we set our machinery loose to find the best predictor of the year-end close for the S&P 500. We avoided any financial indicators, but used only data the UN compiled profiling 145 member nations. There were thousands of annual time series for each country. Which of all these series had the strongest correlation with U.S. stocks? Butter production in Bangladesh, with a correlation of 75 percent! Getting into the spirit, we tossed in cheese, and brought it up to 95 percent. Using only dairy products is an undiversified approach, so add sheep population to the mix and take it up to 99 percent, in sample, over 10 years. Adding random data to a regression does that. The out-of-sample predictions are less than worthless, often negative. This business with the butter, cheese, and sheep has been widely cited.


Evidence-Based Technical Analysis: Applying the Scientific Method and Statistical Inference to Trading Signals by David Aronson

Albert Einstein, Andrew Wiles, asset allocation, availability heuristic, backtesting, Black Swan, butter production in bangladesh, buy and hold, capital asset pricing model, cognitive dissonance, compound rate of return, computerized trading, Daniel Kahneman / Amos Tversky, distributed generation, Elliott wave, en.wikipedia.org, feminist movement, hindsight bias, index fund, invention of the telescope, invisible hand, Long Term Capital Management, mental accounting, meta analysis, meta-analysis, p-value, pattern recognition, Paul Samuelson, Ponzi scheme, price anchoring, price stability, quantitative trading / quantitative finance, Ralph Nelson Elliott, random walk, retrograde motion, revision control, risk tolerance, risk-adjusted returns, riskless arbitrage, Robert Shiller, Sharpe ratio, short selling, source of truth, statistical model, stocks for the long run, systematic trading, the scientific method, transfer pricing, unbiased observer, yield curve, Yogi Berra

Leinweber, on the faculty of California Institute of Technology and formerly a managing partner at First Quadrant, a quantitative pension management company, has warned financial market researchers about the data-mining bias. To illustrate the pitfalls of excessive searching, he tested several hundred economic time series in a UN database to find the one with the highest predictive correlation to the S&P 500. It turned out to be the level of butter production in Bangladesh, with a correlation of about 0.70, an unusually high correlation in the domain of economic forecasting. Intuition alone would tell us a high correlation between Bangladesh butter and the S&P 500 is specious, but now imagine if the time series with the highest correlation had a plausible connection to the S&P 500. Intuition would not warn us. As Leinweber points out, when the total number of time series examined is taken into account, the correlation between Bangladesh butter and the S&P 500 Index is not statistically significant.
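
The point about the number of series examined can be illustrated with a small simulation. This is a minimal sketch under stated assumptions: the series are synthetic Gaussian noise, not the UN data; the count of 500 candidates and the ten observations are arbitrary choices; and `pearson` and `best_of_many` are hypothetical helper names. It compares the correlation of one random series with a target against the best correlation found after searching many random series.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def best_of_many(target, candidates):
    """Largest absolute correlation between the target and any candidate."""
    return max(abs(pearson(target, c)) for c in candidates)


rng = random.Random(1)  # fixed seed so the run is repeatable
n_obs = 10              # assumption: a short annual history
n_series = 500          # assumption: "several hundred" candidate series

# Every series here is independent noise, so any correlation is spurious.
target = [rng.gauss(0, 1) for _ in range(n_obs)]
candidates = [[rng.gauss(0, 1) for _ in range(n_obs)] for _ in range(n_series)]

first = abs(pearson(target, candidates[0]))
best = best_of_many(target, candidates)
print(f"one series, examined alone: |r| = {first:.2f}")
print(f"best of {n_series} series:  |r| = {best:.2f}")
```

Because the best of many candidates is by construction at least as large as any single one, a correlation that looks striking in isolation has to be judged against the distribution of the maximum over everything that was searched, not against the distribution for a single pre-chosen series.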


pages: 721 words: 197,134

Data Mining: Concepts, Models, Methods, and Algorithms by Mehmed Kantardzić

Albert Einstein, bioinformatics, business cycle, business intelligence, business process, butter production in bangladesh, combinatorial explosion, computer vision, conceptual framework, correlation coefficient, correlation does not imply causation, data acquisition, discrete time, El Camino Real, fault tolerance, finite state, Gini coefficient, information retrieval, Internet Archive, inventory management, iterative process, knowledge worker, linked data, loose coupling, Menlo Park, natural language processing, Netflix Prize, NP-complete, PageRank, pattern recognition, peer-to-peer, phenotype, random walk, RFID, semantic web, speech recognition, statistical model, Telecommunications Act of 1996, telemarketer, text mining, traveling salesman, web application

When used improperly, data mining can generate lots of “garbage.” As one professor from MIT pointed out: “Given enough time, enough attempts, and enough imagination, almost any set of data can be teased out of any conclusion.” David J. Leinweber, managing director of First Quadrant Corp. in Pasadena, California, gives an example of the pitfalls of data mining. Working with a United Nations data set, he found that historically, butter production in Bangladesh is the single best predictor of the Standard & Poor’s 500-stock index. This example is similar to another absurd correlation that is heard yearly around Super Bowl time: a win by the NFC team implies a rise in stock prices. Peter Coy, Business Week’s associate economics editor, warns of four pitfalls in data mining: 1. It is tempting to develop a theory to fit an oddity found in the data. 2.


pages: 322 words: 84,752