data science

347 results


pages: 523 words: 112,185

Doing Data Science: Straight Talk From the Frontline by Cathy O'Neil, Rachel Schutt

Amazon Mechanical Turk, augmented reality, Augustin-Louis Cauchy, barriers to entry, Bayesian statistics, bike sharing, bioinformatics, computer vision, confounding variable, correlation does not imply causation, crowdsourcing, data science, distributed generation, Dunning–Kruger effect, Edward Snowden, Emanuel Derman, fault tolerance, Filter Bubble, finite state, Firefox, game design, Google Glasses, index card, information retrieval, iterative process, John Harrison: Longitude, Khan Academy, Kickstarter, machine translation, Mars Rover, Nate Silver, natural language processing, Netflix Prize, p-value, pattern recognition, performance metric, personalized medicine, pull request, recommendation engine, rent-seeking, selection bias, Silicon Valley, speech recognition, statistical model, stochastic process, tacit knowledge, text mining, the scientific method, The Wisdom of Crowds, Watson beat the top human players on Jeopardy!, X Prize

datafication, Datafication employment opportunities in, Data Science Jobs Facebook and, The Current Landscape (with a Little History) Harvard Business Review, The Current Landscape (with a Little History) history of, The Current Landscape (with a Little History)–Data Science Jobs industry vs. academia in, Getting Past the Hype LinkedIn and, The Current Landscape (with a Little History) meta-definition thought experiment, Thought Experiment: Meta-Definition privacy and, Privacy process of, The Data Science Process–A Data Scientist’s Role in This Process RealDirect case study, Case Study: RealDirect–Sample R code scientific method and, A Data Scientist’s Role in This Process scientists, A Data Science Profile–Thought Experiment: Meta-Definition sociology and, Gabriel Tarde teams, A Data Science Profile Venn diagram of, The Current Landscape (with a Little History) data science competitions, Background: Data Science Competitions Kaggle and, A Single Contestant data scientists, A Data Science Profile–Thought Experiment: Meta-Definition as problem solvers, Being Problem Solvers chief, The Life of a Chief Data Scientist defining, A Data Science Profile ethics of, Being an Ethical Data Scientist–Being an Ethical Data Scientist female, On Being a Female Data Scientist hubris and, Being an Ethical Data Scientist–Being an Ethical Data Scientist in academia, In Academia in industry, In Industry next generation of, What Are Next-Gen Data Scientists?–Being Question Askers questioning as, Being Question Askers role of, in data science process, A Data Scientist’s Role in This Process soft skills of, Cultivating Soft Skills data visualization, Data Visualization and Fraud Detection–Data Visualization Exercise at Square, Data Visualization at Square–Data Visualization at Square Before Us is the Salesman’s House (Thorp/Hansen), eBay Transactions and Books–eBay Transactions and Books Cronkite Plaza (Thorp/Rubin/Hansen), Cronkite Plaza distant reading, Franco Moretti fraud and, The 
Risk Challenge–Detecting suspicious activity using machine learning Hansen, Mark, Data Visualization and Fraud Detection–Goals of These Exhibits history of, Data Visualization History Lives on a Screen (Thorp/Hansen), Project Cascade: Lives on a Screen machine learning and, Data Science and Risk Moveable Type (Rubin/Hansen), New York Times Lobby: Moveable Type–New York Times Lobby: Moveable Type personal data collection thought experiment, Mark’s Thought Experiment Processing programming language, Processing risk and, Data Science and Risk–Ian’s Thought Experiment samples of, A Sample of Data Visualization Projects–Mark’s Data Visualization Projects Shakespeare Machine (Rubin/Hansen), Public Theater Shakespeare Machine sociology and, Gabriel Tarde tutorials for, Data Visualization for the Rest of Us–Data Visualization for the Rest of Us data visualization exercise, Data Visualization Exercise data-generating processes, Statistical Inference DataEDGE, Challenges in features and learning datafication, Why Now?

Social scientists also do tend to be good question askers and have other good investigative qualities, so a social scientist who also has the quantitative and programming chops makes a great data scientist. But it’s almost a “historical” (historical is in quotes because 2008 isn’t that long ago) artifact to limit your conception of a data scientist to someone who works only with online user behavior data. There’s another emerging field out there called computational social sciences, which could be thought of as a subset of data science. But we can go back even further. In 2001, William Cleveland wrote a position paper about data science called “Data Science: An action plan to expand the field of statistics.” So data science existed before data scientists? Is this semantics, or does it make sense?


pages: 660 words: 141,595

Data Science for Business: What You Need to Know About Data Mining and Data-Analytic Thinking by Foster Provost, Tom Fawcett

Albert Einstein, Amazon Mechanical Turk, Apollo 13, big data - Walmart - Pop Tarts, bioinformatics, business process, call centre, chief data officer, Claude Shannon: information theory, computer vision, conceptual framework, correlation does not imply causation, crowdsourcing, data acquisition, data science, David Brooks, en.wikipedia.org, Erik Brynjolfsson, Gini coefficient, Helicobacter pylori, independent contractor, information retrieval, intangible asset, iterative process, Johann Wolfgang von Goethe, Louis Pasteur, Menlo Park, Nate Silver, Netflix Prize, new economy, p-value, pattern recognition, placebo effect, price discrimination, recommendation engine, Ronald Coase, selection bias, Silicon Valley, Skype, SoftBank, speech recognition, Steve Jobs, supply-chain management, systems thinking, Teledyne, text mining, the long tail, The Signal and the Noise by Nate Silver, Thomas Bayes, transaction costs, WikiLeaks

., A Firm’s Data Science Maturity structure, Machine Learning and Data Mining techniques, Data Science, Engineering, and Data-Driven Decision Making technology vs. theory of, Chemistry Is Not About Test Tubes: Data Science Versus the Work of the Data Scientist–Chemistry Is Not About Test Tubes: Data Science Versus the Work of the Data Scientist understanding, The Ubiquity of Data Opportunities, Data Processing and “Big Data” data science maturity, of firms, A Firm’s Data Science Maturity–A Firm’s Data Science Maturity data scientists academic, Attracting and Nurturing Data Scientists and Their Teams as scientific advisors, Attracting and Nurturing Data Scientists and Their Teams attracting/nurturing, Attracting and Nurturing Data Scientists and Their Teams–Attracting and Nurturing Data Scientists and Their Teams evaluating, Superior Data Scientists–Superior Data Scientists managing, Superior Data Science Management–Superior Data Science Management Data Scientists, LLC, Attracting and Nurturing Data Scientists and Their Teams data sources, Evaluation, Baseline Performance, and Implications for Investments in Data data understanding, Data Understanding–Data Understanding expected value decomposition and, From an Expected Value Decomposition to a Data Science Solution–From an Expected Value Decomposition to a Data Science Solution expected value framework and, The Expected Value Framework: Structuring a More Complicated Business Problem–The Expected Value Framework: Structuring a More Complicated Business Problem data warehousing, Data Warehousing data-analytic thinking, Data-Analytic Thinking–Data-Analytic Thinking and unbalanced classes, Problems with Unbalanced Classes for business strategies, Thinking Data-Analytically, Redux–Thinking Data-Analytically, Redux data-driven business data science vs., Data Processing and “Big Data” understanding, Data Processing and “Big Data” data-driven causal explanations, Data-Driven Causal Explanation and a Viral Marketing 
Example–Data-Driven Causal Explanation and a Viral Marketing Example data-driven decision-making, Data Science, Engineering, and Data-Driven Decision Making–Data Science, Engineering, and Data-Driven Decision Making benefits, Data Science, Engineering, and Data-Driven Decision Making discoveries, Data Science, Engineering, and Data-Driven Decision Making repetition, Data Science, Engineering, and Data-Driven Decision Making database queries, as analytic technique, Database Querying–Database Querying database tables, Models, Induction, and Prediction dataset entropy, Example: Attribute Selection with Information Gain datasets, Models, Induction, and Prediction analyzing, Introduction to Predictive Modeling: From Correlation to Supervised Segmentation attributes of, Overfitting in Mathematical Functions cross-validation and, From Holdout Evaluation to Cross-Validation limited, From Holdout Evaluation to Cross-Validation Davis, Miles, Example: Jazz Musicians, Example: Jazz Musicians Deanston single malt scotch, Understanding the Results of Clustering decision boundaries, Visualizing Segmentations, Classification via Mathematical Functions decision lines, Visualizing Segmentations decision nodes, Supervised Segmentation with Tree-Structured Models decision stumps, Evaluation, Baseline Performance, and Implications for Investments in Data decision surfaces, Visualizing Segmentations decision trees, Supervised Segmentation with Tree-Structured Models decision-making, automatic, Data Science, Engineering, and Data-Driven Decision Making deduction, induction vs., Models, Induction, and Prediction Dell, Data preparation, Achieving Competitive Advantage with Data Science demand, local, Example: Hurricane Frances dendrograms, Hierarchical Clustering, Hierarchical Clustering dependent variables, Models, Induction, and Prediction descriptive attributes, Data Mining and Data Science, Revisited descriptive modeling, Models, Induction, and Prediction Dictionary of Distances (Deza 
& Deza), * Other Distance Functions differential descriptions, * Using Supervised Learning to Generate Cluster Descriptions Digital 100 companies, Data-Analytic Thinking Dillman, Linda, Data Science, Engineering, and Data-Driven Decision Making dimensionality, of nearest-neighbor reasoning, Dimensionality and domain knowledge–Dimensionality and domain knowledge directed marketing example, Targeting the Best Prospects for a Charity Mailing–A Brief Digression on Selection Bias discoveries, Data Science, Engineering, and Data-Driven Decision Making discrete (binary) classifiers, ROC Graphs and Curves discrete classifiers, ROC Graphs and Curves discretized numeric variables, Selecting Informative Attributes discriminants, linear, Linear Discriminant Functions discriminative modeling methods, generative vs., Summary disorder, measuring, Selecting Informative Attributes display advertising, Example: Targeting Online Consumers With Advertisements distance functions, for nearest-neighbor reasoning, * Other Distance Functions–* Other Distance Functions distance, measuring, Similarity and Distance distribution Gaussian, Regression via Mathematical Functions Normal, Regression via Mathematical Functions distribution of properties, Selecting Informative Attributes Doctor Who (television show), Example: Evidence Lifts from Facebook “Likes” document (term), Representation domain knowledge data mining processes and, Dimensionality and domain knowledge nearest-neighbor reasoning and, Dimensionality and domain knowledge–Dimensionality and domain knowledge domain knowledge validation, Associations Among Facebook Likes domains, in association discovery, Associations Among Facebook Likes Dotcom Boom, Results, Formidable Historical Advantage double counting, Costs and benefits draws, statistical, * Logistic Regression: Some Technical Details E edit distance, * Other Distance Functions, * Other Distance Functions Einstein, Albert, Conclusion Elder Research, Attracting and Nurturing Data 
Scientists and Their Teams Ellington, Duke, Example: Jazz Musicians, Example: Jazz Musicians email, Why Text Is Important engineering, Chemistry Is Not About Test Tubes: Data Science Versus the Work of the Data Scientist, Business Understanding engineering problems, business problems vs., Other Data Science Tasks and Techniques ensemble method, Bias, Variance, and Ensemble Methods–Bias, Variance, and Ensemble Methods entropy, Selecting Informative Attributes–Selecting Informative Attributes, Selecting Informative Attributes, Example: Attribute Selection with Information Gain, Summary and Inverse Document Frequency, * The Relationship of IDF to Entropy change in, Selecting Informative Attributes equation for, Selecting Informative Attributes graphs, Example: Attribute Selection with Information Gain equations cosine distance, * Other Distance Functions entropy, Selecting Informative Attributes Euclidean distance, Similarity and Distance general linear model, Linear Discriminant Functions information gain (IG), Selecting Informative Attributes Jaccard distance, * Other Distance Functions L2 norm, * Other Distance Functions log-odds linear function, * Logistic Regression: Some Technical Details logistic function, * Logistic Regression: Some Technical Details majority scoring function, * Combining Functions: Calculating Scores from Neighbors majority vote classification, * Combining Functions: Calculating Scores from Neighbors Manhattan distance, * Other Distance Functions similarity-moderated classification, * Combining Functions: Calculating Scores from Neighbors similarity-moderated regression, * Combining Functions: Calculating Scores from Neighbors similarity-moderated scoring, * Combining Functions: Calculating Scores from Neighbors error costs, ROC Graphs and Curves error rates, Plain Accuracy and Its Problems, Error rates errors absolute, Regression via Mathematical Functions computing, Regression via Mathematical Functions false negative vs. 
false positive, Evaluating Classifiers squared, Regression via Mathematical Functions estimating generalization performance, From Holdout Evaluation to Cross-Validation estimation, frequency based, Probability Estimation ethics of data mining, Privacy, Ethics, and Mining Data About Individuals–Privacy, Ethics, and Mining Data About Individuals Euclid, Similarity and Distance Euclidean distance, Similarity and Distance evaluating models, Decision Analytic Thinking I: What Is a Good Model?

., The Fundamental Concepts of Data Science unique context of, What Data Can’t Do: Humans in the Loop, Revisited using expected values to provide framework for, The Expected Value Framework: Decomposing the Business Problem and Recomposing the Solution Pieces–The Expected Value Framework: Decomposing the Business Problem and Recomposing the Solution Pieces business strategy, Data Science and Business Strategy–A Firm’s Data Science Maturity accepting creative ideas, Be Ready to Accept Creative Ideas from Any Source case studies, examining, Examine Data Science Case Studies competitive advantages, Achieving Competitive Advantage with Data Science–Achieving Competitive Advantage with Data Science, Sustaining Competitive Advantage with Data Science–Superior Data Science Management data scientists, evaluating, Superior Data Scientists–Superior Data Scientists evaluating proposals, Be Ready to Evaluate Proposals for Data Science Projects–Flaws in the Big Red Proposal historical advantages and, Formidable Historical Advantage intangible collateral assets and, Unique Intangible Collateral Assets intellectual property and, Unique Intellectual Property managing data scientists effectively, Superior Data Science Management–Superior Data Science Management maturity of the data science, A Firm’s Data Science Maturity–A Firm’s Data Science Maturity thinking data-analytically for, Thinking Data-Analytically, Redux–Thinking Data-Analytically, Redux C Caesars Entertainment, Data and Data Science Capability as a Strategic Asset call center example, Profiling: Finding Typical Behavior–Profiling: Finding Typical Behavior Capability Maturity Model, A Firm’s Data Science Maturity Capital One, Data and Data Science Capability as a Strategic Asset, From an Expected Value Decomposition to a Data Science Solution Case-Based Reasoning, How Many Neighbors and How Much Influence?

This is facilitated tremendously by strong and deep professional contacts. Data scientists call on each other to help in steering them to the right solutions. The better a professional network is, the better will be the solution. And, the best data scientists have the best connections. Superior Data Science Management Possibly even more critical to success for data science in business is having good management of the data science team. Good data science managers are especially hard to find. They need to understand the fundamentals of data science well, possibly even being competent data scientists themselves. Good data science managers also must possess a set of other abilities that are rare in a single individual: They need to truly understand and appreciate the needs of the business.


Big Data at Work: Dispelling the Myths, Uncovering the Opportunities by Thomas H. Davenport

Automated Insights, autonomous vehicles, bioinformatics, business intelligence, business process, call centre, chief data officer, cloud computing, commoditize, data acquisition, data science, disruptive innovation, Edward Snowden, Erik Brynjolfsson, intermodal, Internet of things, Jeff Bezos, knowledge worker, lifelogging, Mark Zuckerberg, move fast and break things, Narrative Science, natural language processing, Netflix Prize, New Journalism, recommendation engine, RFID, self-driving car, sentiment analysis, Silicon Valley, smart grid, smart meter, social graph, sorting algorithm, statistical model, Tesla Model S, text mining, Thomas Davenport, three-martini lunch

But if you’re interviewing one from another industry, make sure that the candidate shows interest and demonstrated business problem-solving ability in the industry from which he or she comes. Horizontal versus Vertical Data Scientists There are, of course, many types of data scientists. One way to characterize an important set of differences between types has been coined by Vincent Granville, who operates Data Science Central, a social network for data scientists like himself. In a blog post (with some analytical jargon you can toss around at cocktail parties), he described the difference between vertical and horizontal data scientists: • Vertical data scientists have very deep knowledge in some narrow field.

Where Do You Get Data Scientists? Data Scientists from Universities I noted earlier that data scientists today often have advanced degrees in science, but that won’t always be the most efficient way to procure the necessary skills. How soon will there be more direct educational paths to data science? Well, as I write I believe there is no university that has yet issued a degree in data science. But there are: (a) a growing number of courses in the field and (b) a growing number of institutions that are planning data science degree programs.

And as I hinted in chapter 3, firms such as Accenture, Deloitte, and IBM have begun to hire and train data scientists in larger numbers. Predominantly offshore firms such as Mu Sigma, a “math factory” with thousands of quants as employees, are also hiring data scientists in considerable numbers. One data scientist has come up with a creative approach to training new data scientists. The Insight Data Science Fellows Program, started by Jake Klamka (whose academic background is high-energy physics), takes scientists for six weeks and teaches them the skills to be a data scientist. The program includes mentoring by local companies with big data challenges (e.g., Facebook, Twitter, Google, LinkedIn).


Succeeding With AI: How to Make AI Work for Your Business by Veljko Krunic

AI winter, Albert Einstein, algorithmic trading, AlphaGo, Amazon Web Services, anti-fragile, anti-pattern, artificial general intelligence, autonomous vehicles, Bayesian statistics, bioinformatics, Black Swan, Boeing 737 MAX, business process, cloud computing, commoditize, computer vision, correlation coefficient, data is the new oil, data science, deep learning, DeepMind, en.wikipedia.org, fail fast, Gini coefficient, high net worth, information retrieval, Internet of things, iterative process, job automation, Lean Startup, license plate recognition, minimum viable product, natural language processing, recommendation engine, self-driving car, sentiment analysis, Silicon Valley, six sigma, smart cities, speech recognition, statistical model, strong AI, tail risk, The Design of Experiments, the scientific method, web application, zero-sum game

Closely related fields that are sometimes considered part of data science include bioinformatics and quantitative analysis. While AI and data science closely overlap, they aren’t identical, because AI includes fields such as robotics, which are traditionally not considered part of data science. Harris, Murphy, and Vaisman’s book [66] provides a good summary of the state of data science before the advancement of deep learning. Data scientist—A practitioner of the field of data science. Many sources (including this book) classify AI practitioners as data scientists. Database administrator (DBA)—A professional responsible for the maintenance of a database. Most commonly, a DBA would be responsible for maintaining an RDBMS-based database.

As a manager, you should look for two things when hiring data scientists for your team. You should look for a candidate who has skills in the core domain that your initial AI project is likely to use, but you also need them to have a demonstrated ability to learn new skills. Chances are good that, along the way, your data scientist will need to learn many new methods. When hiring senior data science team members, don’t just look for a strong background in one set of AI methods. Senior data scientists should have a history of solving concrete problems using a diverse set of methods. Data science is a team sport. To completely cover all of the knowledge that’s part of data science, you need a whole team, so you must assemble a team with complementary skillsets.

They were the best decisions you could have made based on what you knew then. Types of data scientists to hire Leading experts focused on a narrow class of algorithms are worth their weight in gold if they know how to improve the performance of an algorithm by 0.1% that’s bringing $1 billion per year to your organization. However, if your question is, “What can AI do to help my business?” you’re probably better off with a data scientist who has a command of the wide range of data science methods. A data scientist with that profile has the best chance of finding a use case in which the profit margin would be large and you don’t need to get that last 0.1% improvement for the use case to be viable.
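The arithmetic behind the specialist example above can be sketched in a few lines. This is only an illustration: it assumes the value an algorithm brings in scales linearly with its performance, and the function name `annual_gain_usd` is hypothetical.

```python
def annual_gain_usd(annual_value_usd: int, improvement_basis_points: int) -> int:
    """Extra yearly value from a relative improvement.

    Assumes (for illustration only) that value scales linearly with the
    improvement. One basis point is 0.01%, so 0.1% is 10 basis points.
    Integer arithmetic avoids floating-point rounding.
    """
    return annual_value_usd * improvement_basis_points // 10_000


# A 0.1% improvement on an algorithm bringing in $1 billion per year.
print(annual_gain_usd(1_000_000_000, 10))  # 1000000
```

Under that linear assumption, the last 0.1% is worth about $1 million per year at this scale, which is why it only justifies a narrow expert when the base value is very large.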


pages: 579 words: 76,657

Data Science from Scratch: First Principles with Python by Joel Grus

backpropagation, confounding variable, correlation does not imply causation, data science, deep learning, Hacker News, higher-order functions, natural language processing, Netflix Prize, p-value, Paul Graham, recommendation engine, SpamAssassin, statistical model

Preface Data Science Data scientist has been called “the sexiest job of the 21st century,” presumably by someone who has never visited a fire station. Nonetheless, data science is a hot and growing field, and it doesn’t take a great deal of sleuthing to find analysts breathlessly prognosticating that over the next 10 years, we’ll need billions and billions more data scientists than we currently have. But what is data science? After all, we can’t produce data scientists if we don’t know what data science is. According to a Venn diagram that is somewhat famous in the industry, data science lies at the intersection of: Hacking skills Math and statistics knowledge Substantive expertise Although I originally intended to write a book covering all three, I quickly realized that a thorough treatment of “substantive expertise” would require tens of thousands of pages.

From Scratch There are lots and lots of data science libraries, frameworks, modules, and toolkits that efficiently implement the most common (as well as the least common) data science algorithms and techniques. If you become a data scientist, you will become intimately familiar with NumPy, with scikit-learn, with pandas, and with a panoply of other libraries. They are great for doing data science. But they are also a good way to start doing data science without actually understanding data science. In this book, we will be approaching data science from scratch. That means we’ll be building tools and implementing algorithms by hand in order to better understand them.
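As a small illustration of what implementing things by hand can look like, the sketch below computes a mean, a sample variance, and a Pearson correlation with plain Python lists instead of a library. The function names are chosen for this example and are not quoted from the book.

```python
import math
from typing import List


def mean(xs: List[float]) -> float:
    """Arithmetic mean of a non-empty list."""
    return sum(xs) / len(xs)


def variance(xs: List[float]) -> float:
    """Sample variance (divides by n - 1); needs at least two values."""
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)


def standard_deviation(xs: List[float]) -> float:
    """Square root of the sample variance."""
    return math.sqrt(variance(xs))


def correlation(xs: List[float], ys: List[float]) -> float:
    """Pearson correlation of two equal-length lists.

    Returns 0.0 when either list has no variation, since the
    correlation is undefined in that case.
    """
    mx, my = mean(xs), mean(ys)
    covariance = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)
    sx, sy = standard_deviation(xs), standard_deviation(ys)
    if sx == 0 or sy == 0:
        return 0.0
    return covariance / (sx * sy)
```

Writing these out makes the definitions explicit, for instance the choice to divide by n - 1 rather than n, which a library call would hide behind a default.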

Now, before you start feeling too jaded: some data scientists also occasionally use their skills for good — using data to make government more effective, to help the homeless, and to improve public health. But it certainly won’t hurt your career if you like figuring out the best way to get people to click on advertisements. Motivating Hypothetical: DataSciencester Congratulations! You’ve just been hired to lead the data science efforts at DataSciencester, the social network for data scientists. Despite being for data scientists, DataSciencester has never actually invested in building its own data science practice. (In fairness, DataSciencester has never really invested in building its product either.)


pages: 296 words: 66,815

The AI-First Company by Ash Fontana

23andMe, Amazon Mechanical Turk, Amazon Web Services, autonomous vehicles, barriers to entry, blockchain, business intelligence, business process, business process outsourcing, call centre, Charles Babbage, chief data officer, Clayton Christensen, cloud computing, combinatorial explosion, computer vision, crowdsourcing, data acquisition, data science, deep learning, DevOps, en.wikipedia.org, Geoffrey Hinton, independent contractor, industrial robot, inventory management, John Conway, knowledge economy, Kubernetes, Lean Startup, machine readable, minimum viable product, natural language processing, Network effects, optical character recognition, Pareto efficiency, performance metric, price discrimination, recommendation engine, Ronald Coase, Salesforce, single source of truth, software as a service, source of truth, speech recognition, the scientific method, transaction costs, vertical integration, yield management

Before creating a bunch of chatter between equations, have a single conversation with one equation to see if it answers customers’ questions. We’re not here to build a data science consulting firm, but DLEs start with data science. Most AI models are based on statistical methods. Starting with statistics allows for a smooth transition into AI when there’s enough time and money from customers to build it. Starting Small: Data Science Starting with a data scientist solving a well-defined problem saves time and money when compared to starting with a team big enough to solve an amorphous problem with machine learning. Dedicate a data scientist to serve as a consultant to customers and provide personalized, data-driven answers to a single question in order to demonstrate return on investment (ROI).

Linking databases, cleaning data, creating data pipelines, building features, and designing interfaces is a lot of work, but avoidable work. There’s also evidence that starting with data science works on Kaggle, where the largest community of data scientists and ML engineers compete to win prizes for solving problems. The summary is that data science methods get to the Pareto optimal solution (achieving 80 percent of the optimal solution for 20 percent of the work). Often, only the last 20 percent requires data science. Specifically, data science methods such as ensembles of decision trees—whether random forest or gradient boosted—combined with manual feature engineering win most of the competitions on structured data, and neural networks win most of the competitions on unstructured data.
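To make "ensembles of decision trees" concrete, here is a minimal pure-Python sketch of the general idea: many small trees, each fit on a bootstrap resample, combined by majority vote. It is a simplification and not the method used by any particular competition winner: the trees are one-split "stumps", there is no per-split feature subsampling as in a real random forest, and no boosting. All names are hypothetical.

```python
import random
from collections import Counter
from typing import List, Tuple

Row = Tuple[List[float], int]          # (feature values, class label)
Stump = Tuple[int, float, int, int]    # (feature, threshold, left label, right label)


def fit_stump(rows: List[Row]) -> Stump:
    """Pick the single (feature, threshold) split with the fewest training errors."""
    majority = Counter(label for _, label in rows).most_common(1)[0][0]
    # Fallback: an infinite threshold sends every row left, i.e. always predict the majority.
    best: Stump = (0, float("inf"), majority, majority)
    best_errors = sum(label != majority for _, label in rows)
    for f in range(len(rows[0][0])):
        for threshold in sorted({x[f] for x, _ in rows}):
            left = [label for x, label in rows if x[f] <= threshold]
            right = [label for x, label in rows if x[f] > threshold]
            if not right:
                continue  # not a real split
            left_label = Counter(left).most_common(1)[0][0]
            right_label = Counter(right).most_common(1)[0][0]
            errors = (sum(l != left_label for l in left)
                      + sum(l != right_label for l in right))
            if errors < best_errors:
                best, best_errors = (f, threshold, left_label, right_label), errors
    return best


def predict_stump(stump: Stump, x: List[float]) -> int:
    f, threshold, left_label, right_label = stump
    return left_label if x[f] <= threshold else right_label


def fit_ensemble(rows: List[Row], n_trees: int = 25, seed: int = 0) -> List[Stump]:
    """Fit each stump on a bootstrap resample (sampling rows with replacement)."""
    rng = random.Random(seed)
    return [fit_stump([rng.choice(rows) for _ in rows]) for _ in range(n_trees)]


def predict_ensemble(stumps: List[Stump], x: List[float]) -> int:
    """Majority vote across all stumps."""
    return Counter(predict_stump(s, x) for s in stumps).most_common(1)[0][0]
```

In practice the manual feature engineering the passage mentions happens before this step: the columns passed in as `rows` would be hand-built features rather than raw fields.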

hiring sequence
Data analyst | Business | Low | Yes | First
Data scientist | Statistics | Low | Partially | Second
Data engineer | Databases | Medium | Yes | Third
Machine learning engineer | Computer science | Medium | No | Fourth
Data product manager | Product management | Medium | No | Fifth
Data infrastructure engineer | Distributed systems | High | Partially | Sixth
Machine learning researcher | Machine learning | High | Maybe | Seventh

WHERE TO FIND THEM

Starting with statistics means hiring analysts and data scientists before engineers and ML researchers. Essentially, by decoupling data science and software engineering, hiring can focus on data scientists without software engineering experience, thus broadening the pool of candidates to include every discipline in which manipulating data is part of the research process. One can find analysts and data scientists in the fields of economics, econometrics, accounting, actuarial science, biology, biostatistics, geology, geostatistics, epidemiology, demographics, engineering, and physics because these areas require high levels of mathematics and statistics.


pages: 239 words: 70,206

Data-Ism: The Revolution Transforming Decision Making, Consumer Behavior, and Almost Everything Else by Steve Lohr

"World Economic Forum" Davos, 23andMe, Abraham Maslow, Affordable Care Act / Obamacare, Albert Einstein, Alvin Toffler, Bear Stearns, behavioural economics, big data - Walmart - Pop Tarts, bioinformatics, business cycle, business intelligence, call centre, Carl Icahn, classic study, cloud computing, computer age, conceptual framework, Credit Default Swap, crowdsourcing, Daniel Kahneman / Amos Tversky, Danny Hillis, data is the new oil, data science, David Brooks, driverless car, East Village, Edward Snowden, Emanuel Derman, Erik Brynjolfsson, everywhere but in the productivity statistics, financial engineering, Frederick Winslow Taylor, Future Shock, Google Glasses, Ida Tarbell, impulse control, income inequality, indoor plumbing, industrial robot, informal economy, Internet of things, invention of writing, Johannes Kepler, John Markoff, John von Neumann, lifelogging, machine translation, Mark Zuckerberg, market bubble, meta-analysis, money market fund, natural language processing, obamacare, pattern recognition, payday loans, personalized medicine, planned obsolescence, precision agriculture, pre–internet, Productivity paradox, RAND corporation, rising living standards, Robert Gordon, Robert Solow, Salesforce, scientific management, Second Machine Age, self-driving car, Silicon Valley, Silicon Valley startup, SimCity, six sigma, skunkworks, speech recognition, statistical model, Steve Jobs, Steven Levy, The Design of Experiments, the scientific method, Thomas Kuhn: the structure of scientific revolutions, Tony Fadell, unbanked and underbanked, underbanked, Von Neumann architecture, Watson beat the top human players on Jeopardy!, yottabyte

So, Hammerbacher says, “We decided to mush those two titles together and call them data scientists.” At first, a few PhDs resisted, viewing the change as a loss of title prestige. “But ultimately everyone embraced it, and it took on a life of its own,” he observes. And to him, it seemed natural. “Data science is what we did.” The origins of data science reach back half a century or more. Hammerbacher’s choice of terms wasn’t mere happenstance. As soon as he accepted the job at Facebook, Hammerbacher began poring through technical papers and books that provided clues to the evolution of data science. In the spring of 2012, he taught a course in data science at the University of California at Berkeley.

Alex Pentland, a computational social scientist at the Massachusetts Institute of Technology Media Lab, sees the promise of “a transition on a par with the invention of writing or the Internet.” The ranks of data scientists—people who wield their math and computing smarts to make sense of data—are modest compared to the workforce as a whole, but they loom large. Data science is hailed as the field of the future. Universities are rushing to establish data science centers, institutes, and courses, and companies are scrambling to hire data scientists. There is a trend-chasing side to the current data frenzy that invites ridicule. But it is hard to argue the direction. Jeffrey Hammerbacher was always a numbers kind of guy.

To explain, I think of a conversation with Claudia Perlich, the chief scientist of Dstillery, a data-science start-up in New York that specializes in ad targeting. Perlich is a former research scientist at IBM, a winner of prestigious data science contests, and a lecturer at New York University’s Stern School of Business. When I ask why she is using her skills to deliver ads, Perlich replies that digital marketing is a large, real-world testing ground where practitioners in a young field can safely learn valuable lessons. The online advertising marketplace, she says, is “a wonderful place for data scientists to experiment now. What happens if my algorithm is wrong?


pages: 398 words: 86,855

Bad Data Handbook by Q. Ethan McCallum

Amazon Mechanical Turk, asset allocation, barriers to entry, Benoit Mandelbrot, business intelligence, cellular automata, chief data officer, Chuck Templeton: OpenTable:, cloud computing, cognitive dissonance, combinatorial explosion, commoditize, conceptual framework, data science, database schema, DevOps, en.wikipedia.org, Firefox, Flash crash, functional programming, Gini coefficient, hype cycle, illegal immigration, iterative process, labor-force participation, loose coupling, machine readable, natural language processing, Netflix Prize, One Laptop per Child (OLPC), power law, quantitative trading / quantitative finance, recommendation engine, selection bias, sentiment analysis, SQL injection, statistical model, supply-chain management, survivorship bias, text mining, too big to fail, web application

While they are willing and able to work on many tasks across the data science process, from munging and modeling to visualizing and presenting, it’s quite rare to find talent with extensive experience in all aspects of data science. Organizations and managers would do well to adjust their expectations accordingly. A successful data science function is made up not by one person, but at a minimum two or three individuals whose broad skills have much overlap while their unique expertise does not. Where Do Data Scientists Live Within the Organization? Finding a place for data scientists can be a bit tricky. Sometimes you’ll find them living within an engineering organization, sometimes within a product organization, sometimes within a research organization, and other times they live under some other umbrella or on their own.

His skills as a programmer began while assisting with the development of the Sahana Disaster Management System and were refined helping Sugar Labs, the software which runs the One Laptop Per Child XO. Tim has recently moved into the escience field, where he works to support the research community’s uptake of technology. Marck Vaisman is a data scientist and claims he’s been one before the term was en vogue. He is also a consultant, entrepreneur, master munger, and hacker. Marck is the principal data scientist at DataXtract, LLC where he helps clients ranging from startups to Fortune 500 firms with all kinds of data science projects. His professional experience spans the management consulting, telecommunications, Internet, and technology industries. He is the co-founder of Data Community DC, an organization focused on building the Washington DC area data community and promoting data and statistical sciences by running Meetup events (including Data Science DC and R Users DC) and other initiatives.

Spillover can happen for any number of reasons, including inadequate accuracy in the partitioning scheme. While you may not be able to completely eliminate spillover, you can at least be aware of it. Don’t expect that the data is partitioned perfectly. Thou Shalt Provide Your Data Scientists with a Single Tool for All Tasks There is no single tool that allows you to perform all of your data science tasks. Many different tools exist, and each tool has a specific purpose. In order to be successful, data scientists should have access to the tools they need and also the ability to configure these tools as needed—at least in a research and development (R&D) environment—without having to jump through hoops to do their work.


pages: 337 words: 86,320

Everybody Lies: Big Data, New Data, and What the Internet Can Tell Us About Who We Really Are by Seth Stephens-Davidowitz

affirmative action, AltaVista, Amazon Mechanical Turk, Asian financial crisis, Bernie Sanders, big data - Walmart - Pop Tarts, Black Lives Matter, Cass Sunstein, computer vision, content marketing, correlation does not imply causation, crowdsourcing, Daniel Kahneman / Amos Tversky, data science, desegregation, Donald Trump, Edward Glaeser, Filter Bubble, game design, happiness index / gross national happiness, income inequality, Jeff Bezos, Jeff Seder, John Snow's cholera map, longitudinal study, Mark Zuckerberg, Nate Silver, Nick Bostrom, peer-to-peer lending, Peter Thiel, price discrimination, quantitative hedge fund, Ronald Reagan, Rosa Parks, sentiment analysis, Silicon Valley, statistical model, Steve Jobs, Steven Levy, Steven Pinker, TaskRabbit, The Signal and the Noise by Nate Silver, working poor

In other words, she spotted patterns and predicted how one variable will affect another. Grandma is a data scientist. You are a data scientist, too. When you were a kid, you noticed that when you cried, your mom gave you attention. That is data science. When you reached adulthood, you noticed that if you complain too much, people want to hang out with you less. That is data science, too. When people hang out with you less, you noticed, you are less happy. When you are less happy, you are less friendly. When you are less friendly, people want to hang out with you even less. Data science. Data science. Data science. Because data science is so natural, the best Big Data studies, I have found, can be understood by just about any smart person.

In other words, the Columbia and Microsoft researchers wrote a groundbreaking study by utilizing the natural, obvious methodology that everybody uses to make health diagnoses. But wait. Let’s slow down here. If the methodology of the best data science is frequently natural and intuitive, as I claim, this raises a fundamental question about the value of Big Data. If humans are naturally data scientists, if data science is intuitive, why do we need computers and statistical software? Why do we need the Kolmogorov-Smirnov test? Can’t we just use our gut? Can’t we do it like Grandma does, like nurses and doctors do? This gets to an argument intensified after the release of Malcolm Gladwell’s bestselling book Blink, which extols the magic of people’s gut instincts.

We are often wrong, in other words, about how the world works when we rely just on what we hear or personally experience. While the methodology of good data science is often intuitive, the results are frequently counterintuitive. Data science takes a natural and intuitive human process—spotting patterns and making sense of them—and injects it with steroids, potentially showing us that the world works in a completely different way from how we thought it did. That’s what happened when I studied the predictors of basketball success. When I was a little boy, I had one dream and one dream only: I wanted to grow up to be an economist and data scientist. No. I’m just kidding. I wanted desperately to be a professional basketball player, to follow in the footsteps of my hero, Patrick Ewing, all-star center for the New York Knicks.


pages: 301 words: 85,126

AIQ: How People and Machines Are Smarter Together by Nick Polson, James Scott

Abraham Wald, Air France Flight 447, Albert Einstein, algorithmic bias, Amazon Web Services, Atul Gawande, autonomous vehicles, availability heuristic, basic income, Bayesian statistics, Big Tech, Black Lives Matter, Bletchley Park, business cycle, Cepheid variable, Checklist Manifesto, cloud computing, combinatorial explosion, computer age, computer vision, Daniel Kahneman / Amos Tversky, data science, deep learning, DeepMind, Donald Trump, Douglas Hofstadter, Edward Charles Pickering, Elon Musk, epigenetics, fake news, Flash crash, Grace Hopper, Gödel, Escher, Bach, Hans Moravec, Harvard Computers: women astronomers, Higgs boson, index fund, information security, Isaac Newton, John von Neumann, late fees, low earth orbit, Lyft, machine translation, Magellanic Cloud, mass incarceration, Moneyball by Michael Lewis explains big data, Moravec's paradox, more computing power than Apollo, natural language processing, Netflix Prize, North Sea oil, Northpointe / Correctional Offender Management Profiling for Alternative Sanctions, p-value, pattern recognition, Pierre-Simon Laplace, ransomware, recommendation engine, Ronald Reagan, Salesforce, self-driving car, sentiment analysis, side project, Silicon Valley, Skype, smart cities, speech recognition, statistical model, survivorship bias, systems thinking, the scientific method, Thomas Bayes, Uber for X, uber lyft, universal basic income, Watson beat the top human players on Jeopardy!, young professional

It turns out that when she wasn’t caring for soldiers, Nightingale was also a skilled data scientist who successfully convinced hospitals that they could improve health care using statistics. In fact, no other data scientist in history can claim to have saved so many lives as Florence Nightingale. In 1859, in honor of these achievements, she became the first woman ever elected to the U.K.’s Royal Statistical Society. Nightingale’s path to unlocking the power of health-care data offers three distinct lessons for today. First, it illustrates the kind of institutional commitment necessary for a data-science revolution to take hold in a given field.

Her ideas formed a clear model for the international system of disease classification used today, which serves as the bedrock for all of modern epidemiology and medical data science.37 Preventable Mischiefs in the Age of AI Nightingale’s three legacies all have clear parallels today. They also raise some sharp questions. She spoke of the “foul air and preventable mischiefs” that killed the soldiers of the Crimea, and while the air in modern hospitals may be less foul, there are still mischiefs aplenty. One big question is how to staff and train a modern health-care team. After Nightingale, no hospital could function without nurses. When will the same be true of data scientists and experts in artificial intelligence, who now play almost no day-to-day role in health care?

Data Sharing That brings us to another big question: Will data-science teams get access to the data they’ll need to improve the existing AI systems and build new ones? If you work for a single hospital, you might have access to thousands of patient records. But wouldn’t millions of records from lots of hospitals be much better? After all, a big reason that tech firms like Google and Facebook have such good AI is the sheer scale of their data sets. There are surely millions of clinical histories of kidney disease scattered across the medical databases of the world. In principle, these could be brought together, and teams of data scientists could be hired to analyze them using cutting-edge AI tools, in a way that still ensured patient privacy.


pages: 161 words: 39,526

Applied Artificial Intelligence: A Handbook for Business Leaders by Mariya Yao, Adelyn Zhou, Marlene Jia

Airbnb, algorithmic bias, AlphaGo, Amazon Web Services, artificial general intelligence, autonomous vehicles, backpropagation, business intelligence, business process, call centre, chief data officer, cognitive load, computer vision, conceptual framework, data science, deep learning, DeepMind, en.wikipedia.org, fake news, future of work, Geoffrey Hinton, industrial robot, information security, Internet of things, iterative process, Jeff Bezos, job automation, machine translation, Marc Andreessen, natural language processing, new economy, OpenAI, pattern recognition, performance metric, price discrimination, randomized controlled trial, recommendation engine, robotic process automation, Salesforce, self-driving car, sentiment analysis, Silicon Valley, single source of truth, skunkworks, software is eating the world, source of truth, sparse data, speech recognition, statistical model, strong AI, subscription business, technological singularity, The future is already here

These specialized engineers deploy models, manage infrastructure, and run operations related to machine learning projects. They are assisted by data scientists and data engineers to manage databases and build the data infrastructure necessary to support the products and services used by their customers. Data Scientists Data scientists typically work in an offline setting and do not deal directly with the production experience, which is what the end user would see. Data scientists collect data, spend most of their time cleaning it, and the rest of their time looking for patterns in the data and building predictive models. They often have degrees in statistics, data science, or a related discipline. Alternately, many have programming backgrounds and hold degrees in computer science, math, or physics.

Recruit from Specialized Training Programs To meet the rising demand for machine learning talent, education programs have emerged to train junior talent and help them find job placements. Abhi Jha, Director of Advanced Analytics at McKesson, initially hired data science students from Galvanize, a technical skills training provider. “We’ve had a lot of success hiring from career fairs that Galvanize organizes, where we present the unique challenges our company tackles in healthcare,” he adds.(57) Experienced Scientists and Researchers Hiring experienced data scientists and machine learning researchers requires a different approach. For these positions, employers typically look for a doctorate or extensive experience in machine learning, statistical modeling, or related fields.

Understand Different Job Titles Many companies struggle just to understand what “artificial intelligence” is, much less the myriad of titles, roles, skills, and technologies used to describe a prospective hire. Titles and descriptions vary from company to company, and terms are not well-standardized in the industry. However, most of the roles you encounter will resemble the following: Data Science Team Manager A data science team manager understands how best to deploy the expertise of his team in order to maximize their productivity on a project. This manager should have sufficient technical knowledge to understand what his team members are doing and how best to support them; at the same time, this manager must also have good communications skills in order to liaise with the leadership or non-technical units.


Thinking with Data by Max Shron

business intelligence, Carmen Reinhart, confounding variable, correlation does not imply causation, data science, Growth in a Time of Debt, iterative process, Kenneth Rogoff, randomized controlled trial, Richard Feynman, statistical model, The Design of Experiments, the scientific method

Thinking with Data Max Shron Praise for Thinking with Data “Thinking with Data gets to the essence of the process, and guides data scientists in answering that most important question—what’s the problem we’re really trying to solve?” — Hilary Mason Data Scientist in Residence at Accel Partners; co-founder of the DataGotham Conference “Thinking with Data does a wonderful job of reminding data scientists to look past technical issues and to focus on making an impact on the broad business objectives of their employers and clients. It’s a useful supplement to a data science curriculum that is largely focused on the technical machinery of statistics and computer science

Statistics as a whole is concerned with generalizing from old to new data; causal analysis is concerned with generalizing from old to new scenarios where we have deliberately altered something. Generally speaking, because data science as a field is primarily concerned with generalizing knowledge only in highly specific domains (such as for one company, one service, or one type of product), it is able to sidestep many of the issues that snarl causal analysis in more scientific domains. As of today, data scientists do little work building theories intended to capture causal relationships in entirely new scenarios. For those that do, especially if their subject matter concerns human behavior, a more thorough grounding in topics such as construct validity and quasi-experimental design is highly recommended.[10] Defining Causality Different schools of thought have defined causality differently, but a particularly simple interpretation, suitable to many of the problems that are solvable with data, is the alternate universe perspective.

The fields of design, argument studies, critical thinking, national intelligence, problem-solving heuristics, education theory, program evaluation, various parts of the humanities—each of them has insights that data science can learn from. Data science is already a field of bricolage. Swaths of engineering, statistics, machine learning, and graphic communication are already fundamental parts of the data science canon. They are necessary, but they are not sufficient. If we look further afield and incorporate ideas from the “softer” intellectual disciplines, we can make data science successful and help it be more than just this decade’s fad. A focus on why rather than how already pervades the work of the best data professionals.


pages: 391 words: 123,597

Targeted: The Cambridge Analytica Whistleblower's Inside Story of How Big Data, Trump, and Facebook Broke Democracy and How It Can Happen Again by Brittany Kaiser

"World Economic Forum" Davos, Albert Einstein, Amazon Mechanical Turk, Asian financial crisis, Bernie Sanders, Big Tech, bitcoin, blockchain, Boris Johnson, Brexit referendum, Burning Man, call centre, Cambridge Analytica, Carl Icahn, centre right, Chelsea Manning, clean water, cognitive dissonance, crony capitalism, dark pattern, data science, disinformation, Dominic Cummings, Donald Trump, Edward Snowden, Etonian, fake news, haute couture, illegal immigration, Julian Assange, Mark Zuckerberg, Menlo Park, Nelson Mandela, off grid, open borders, public intellectual, Renaissance Technologies, Robert Mercer, rolodex, Russian election interference, sentiment analysis, Sheryl Sandberg, Silicon Valley, Silicon Valley startup, Skype, Snapchat, statistical model, Steve Bannon, subprime mortgage crisis, TED Talk, the High Line, the scientific method, WeWork, WikiLeaks, you are the product, young professional

The Mercers’ generosity to conservative causes is well known, but Bob saw in Alexander a marriage between his love for data science and his political motivations. Alexander recalled Bob’s reaction as something like “How much do you want, and where should I send it?” From there, Steve and Bekah and Bob formed the triumvirate that was the board of directors of the new company known as Cambridge Analytica, with Alexander Nix at the helm. Alexander had already hired some data scientists by then, but he went on to hire more, and he began to instruct SCL Group employees to split their time between international work and building the U.S. business. The data scientists began to purchase as much data as they could get their hands on, and within months, Cambridge Analytica had taken off.

That’s because San Antonio was home to Brad Parscale, of Giles-Parscale. Parscale had been a longtime website designer for Trump, and Trump had picked him to run his digital operations. The problem was that Parscale had no data science or data-driven communications experience, so Bekah knew that Trump needed Cambridge. When the early Cambridge Analytica team (which consisted of Matt Oczkowski, Molly Schweickert, and a handful of data scientists) arrived on the scene in San Antonio in June, they found Brad and the Trump campaign’s digital operations in an alarming state of disarray. Oczkowski—“Oz” for short—wrote to me on June 17, when I asked him a question about a commercial client, saying he had no time to help me, as he would need all his energy for working with Brad and getting their analytics up and running.

And for years, the SCL Group, Cambridge Analytica’s parent company, had been identifying and sorting people using the most sophisticated method in behavioral psychology, which gave it the capability of turning what was otherwise just a mountain of information about the American populace into a gold mine. Nix told us about his in-house army of data scientists and psychologists who had learned precisely how to know whom they wanted to message, what messaging to send them, and exactly where to reach them. He had hired the most brilliant data scientists in the world, people who could laser in on individuals wherever they were to be found (on their cell phones, computers, tablets, on television) and through any kind of medium you could imagine (from audio to social media), using “microtargeting.”


pages: 502 words: 107,657

Predictive Analytics: The Power to Predict Who Will Click, Buy, Lie, or Die by Eric Siegel

Alan Greenspan, Albert Einstein, algorithmic trading, Amazon Mechanical Turk, Apollo 11, Apple's 1984 Super Bowl advert, backtesting, Black Swan, book scanning, bounce rate, business intelligence, business process, butter production in bangladesh, call centre, Charles Lindbergh, commoditize, computer age, conceptual framework, correlation does not imply causation, crowdsourcing, dark matter, data is the new oil, data science, driverless car, en.wikipedia.org, Erik Brynjolfsson, Everything should be made as simple as possible, experimental subject, Google Glasses, happiness index / gross national happiness, information security, job satisfaction, Johann Wolfgang von Goethe, lifelogging, machine readable, Machine translation of "The spirit is willing, but the flesh is weak." to Russian and back, mass immigration, Moneyball by Michael Lewis explains big data, Nate Silver, natural language processing, Netflix Prize, Network effects, Norbert Wiener, personalized medicine, placebo effect, prediction markets, Ray Kurzweil, recommendation engine, risk-adjusted returns, Ronald Coase, Search for Extraterrestrial Intelligence, self-driving car, sentiment analysis, Shai Danziger, software as a service, SpaceShipOne, speech recognition, statistical model, Steven Levy, supply chain finance, text mining, the scientific method, The Signal and the Noise by Nate Silver, The Wisdom of Crowds, Thomas Bayes, Thomas Davenport, Turing test, Watson beat the top human players on Jeopardy!, X Prize, Yogi Berra, zero-sum game

The machine learning process is designed to accomplish this task, to mechanically develop new capabilities from data. This automation is the means by which PA builds its predictive power. The hunter returns back to the tribe, proudly displaying his kill. So, too, a data scientist posts her model on the bulletin board near the company ping-pong table. The hunter hands over the kill to the cook, and the data scientist cooks up her model, translates it to a standard computer language, and e-mails it to an engineer for integration. A well-fed tribe shows the love; a psyched executive issues a bonus. The tribe munches and the scientist crunches.

Special Forces: Dean Abbott, “Hiring and Selecting Key Personnel Using Predictive Analytics,” Predictive Analytics World San Francisco Conference, March 4, 2012, San Francisco, CA. www.predictiveanalyticsworld.com/sanfrancisco/2012/agenda.php#day1–1040a. LinkedIn: Manu Sharma, LinkedIn, “Data Science at LinkedIn: Iterative, Big Data Analytics and You,” Predictive Analytics World New York Conference, October 19, 2011, New York, NY. www.predictiveanalyticsworld.com/newyork/2011/agenda.php#track1-lnk. Scott Nicholson, LinkedIn, “Beyond Big Data: Better Living through Data Science,” Predictive Analytics World Boston Conference, October 1, 2012, Boston, MA. www.predictiveanalyticsworld.com/boston/2012/agenda.php#keynote-900. Scott Nicholson, LinkedIn, “Econometric Applications & Extracting Economic Insights from the LinkedIn Dataset,” Predictive Analytics World San Francisco Conference, March 5, 2012, San Francisco, CA. www.predictiveanalyticsworld.com/sanfrancisco/2012/agenda.php#day1–20a.

The stories range from inspiring to downright scary—read them and find out what we’ve been up to while you weren’t paying attention.” —Michael J. A. Berry, author of Data Mining Techniques, Third Edition “Eric Siegel is the Kevin Bacon of the predictive analytics world, organizing conferences where insiders trade knowledge and share recipes. Now, he has thrown the doors open for you. Step in and explore how data scientists are rewriting the rules of business.” —Kaiser Fung, VP, Vimeo; author of Numbers Rule Your World “Written in a lively language, full of great quotes, real-world examples, and case studies, it is a pleasure to read. The more technical audience will enjoy chapters on The Ensemble Effect and uplift modeling—both very hot trends.


pages: 287 words: 69,655

Don't Trust Your Gut: Using Data to Get What You Really Want in Life by Seth Stephens-Davidowitz

affirmative action, Airbnb, cognitive bias, commoditize, correlation does not imply causation, COVID-19, Daniel Kahneman / Amos Tversky, data science, deep learning, digital map, Donald Trump, en.wikipedia.org, Erik Brynjolfsson, General Magic, global pandemic, Mark Zuckerberg, meta-analysis, Moneyball by Michael Lewis explains big data, Paul Graham, peak-end rule, randomized controlled trial, Renaissance Technologies, Sam Altman, science of happiness, selection bias, side hustle, Silicon Valley, Steve Jobs, Steve Wozniak, systematic bias, Tony Fadell, twin studies, Tyler Cowen, urban planning, Y Combinator

The artists who make it big, in contrast, present to a far wider set of places, allowing themselves to stumble upon a big break. Many people have talked about the importance in your career of showing up. But data scientists have found it’s about showing up to a wide range of places. This book isn’t meant to give advice only for single people, new parents, or aspiring artists—though there will be more lessons here for all of them. My goal is to offer some lessons in new, big datasets that are useful for you, no matter what stage of life you are in. There will be lessons recently uncovered by data scientists in how to be happier, look better, advance your career, and much more. And the idea for the book all came to me one evening while . . .

In the past few years, other teams of researchers have mined online dating sites, combing through large, new datasets on the traits and swipes of tens of thousands of single people to determine what predicts romantic desirability. The findings from the research on romantic desirability, unlike the research on romantic happiness, have been definitive. While data scientists have found that it is surprisingly difficult to detect the qualities in romantic partners that lead to happiness, data scientists have found it strikingly easy to detect the qualities that are catnip in the dating scene. A recent study, in fact, found that not only is it possible to predict with great accuracy whether someone will swipe left or right on a particular person on an online dating site.

About the Author SETH STEPHENS-DAVIDOWITZ is a data scientist, author, and keynote speaker. His 2017 book, Everybody Lies, was a New York Times bestseller and an Economist Book of the Year. He has worked as a contributing op-ed writer for the New York Times, a lecturer at the Wharton School, and a Google data scientist. He received a BA in philosophy from Stanford, where he graduated Phi Beta Kappa, and a PhD in economics from Harvard. He lives in Brooklyn and is a passionate fan of the Mets, Knicks, Jets, and Leonard Cohen.


pages: 23 words: 5,264

Designing Great Data Products by Jeremy Howard, Mike Loukides, Margit Zwemer

AltaVista, data science, Filter Bubble, PageRank, pattern recognition, recommendation engine, self-driving car, sentiment analysis, Silicon Valley, text mining

Great predictive modeling is an important part of the solution, but it no longer stands on its own; as products become more sophisticated, it disappears into the plumbing. Someone using Google’s self-driving car is completely unaware of the hundreds (if not thousands) of models and the petabytes of data that make it work. But as data scientists build increasingly sophisticated products, they need a systematic design approach. We don’t claim that the Drivetrain Approach is the best or only method; our goal is to start a dialog within the data science and business communities to advance our collective vision. Objective-based data products We are entering the era of data as drivetrain, where we use data not just to generate more data (in the form of predictions), but use data to produce actionable outcomes.

In an emergency, a data product that just produces more data is of little use. Data scientists now have the predictive tools to build products that increase the common good, but they need to be aware that building the models is not enough if they do not also produce optimized, implementable outcomes. The future for data products We introduced the Drivetrain Approach to provide a framework for designing the next generation of great data products and described how it relies at its heart on optimization. In the future, we hope to see optimization taught in business schools as well as in statistics departments. We hope to see data scientists ship products that are designed to produce desirable business outcomes.

This is still the dawn of data science. We don’t know what design approaches will be developed in the future, but right now, there is a need for the data science community to coalesce around a shared vocabulary and product design process that can be used to educate others on how to derive value from their predictive models. If we do not do this, we will find that our models only use data to create more data, rather than using data to create actions, disrupt industries and transform lives. About the Authors Mike Loukides is an editor for O'Reilly & Associates.


pages: 252 words: 72,473

Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy by Cathy O'Neil

Affordable Care Act / Obamacare, Alan Greenspan, algorithmic bias, Bernie Madoff, big data - Walmart - Pop Tarts, call centre, Cambridge Analytica, carried interest, cloud computing, collateralized debt obligation, correlation does not imply causation, Credit Default Swap, credit default swaps / collateralized debt obligations, crowdsourcing, data science, disinformation, electronic logging device, Emanuel Derman, financial engineering, Financial Modelers Manifesto, Glass-Steagall Act, housing crisis, I will remember that I didn’t make the world, and it doesn’t satisfy my equations, Ida Tarbell, illegal immigration, Internet of things, late fees, low interest rates, machine readable, mass incarceration, medical bankruptcy, Moneyball by Michael Lewis explains big data, new economy, obamacare, Occupy movement, offshore financial centre, payday loans, peer-to-peer lending, Peter Thiel, Ponzi scheme, prediction markets, price discrimination, quantitative hedge fund, Ralph Nader, RAND corporation, real-name policy, recommendation engine, Rubik’s Cube, Salesforce, Sharpe ratio, statistical model, tech worker, Tim Cook: Apple, too big to fail, Unsafe at Any Speed, Upton Sinclair, Watson beat the top human players on Jeopardy!, working poor

,” Boston Globe, November 7, 2015, www.bostonglobe.com/2015/11/07/childwelfare-bostonglobe-com/AZ2kZ7ziiP8cBMOite2KKP/story.html. ABOUT THE AUTHOR Cathy O’Neil is a data scientist and the author of the blog mathbabe.org. She earned a PhD in mathematics from Harvard and taught at Barnard College before moving to the private sector, where she worked for the hedge fund D. E. Shaw. She then worked as a data scientist at various start-ups, building models that predict people’s purchases and clicks. O’Neil started the Lede Program in Data Journalism at Columbia and is the author of Doing Data Science. She appears weekly on the Slate Money podcast.

For diploma mills like the University of Phoenix, I think it’s safe to say, the goal is to recruit the greatest number of students who can land government loans to pay most of their tuition and fees. With that objective in mind, the data scientists have to figure out how best to manage their various communication channels so that together they generate the most bang for each buck. The data scientists start off with a Bayesian approach, which in statistics is pretty close to plain vanilla. The point of Bayesian analysis is to rank the variables with the most impact on the desired outcome. Search advertising, TV, billboards, and other promotions would each be measured as a function of their effectiveness per dollar.
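One way to picture the "effectiveness per dollar" ranking in this excerpt is the minimal sketch below. It assumes a simple Beta-Binomial model of each channel's conversion rate, which is only one of many possible Bayesian approaches and is not taken from the book; the channel names, counts, and costs are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Channel:
    name: str
    impressions: int             # hypothetical number of people reached
    enrollments: int             # hypothetical number who then enrolled
    cost_per_impression: float   # hypothetical dollars per person reached


def posterior_mean_rate(ch: Channel, prior_a: float = 1.0, prior_b: float = 1.0) -> float:
    """Posterior mean conversion rate under a Beta(prior_a, prior_b) prior."""
    return (ch.enrollments + prior_a) / (ch.impressions + prior_a + prior_b)


def enrollments_per_dollar(ch: Channel) -> float:
    """Expected enrollments bought by one dollar spent on this channel."""
    return posterior_mean_rate(ch) / ch.cost_per_impression


def rank_channels(channels):
    """Sort channels from most to least effective per dollar."""
    return sorted(channels, key=enrollments_per_dollar, reverse=True)


# Invented figures, used only to make the sketch runnable.
channels = [
    Channel("search", impressions=10_000, enrollments=120, cost_per_impression=0.50),
    Channel("tv", impressions=200_000, enrollments=400, cost_per_impression=0.02),
    Channel("billboard", impressions=50_000, enrollments=25, cost_per_impression=0.01),
]

for ch in rank_channels(channels):
    print(f"{ch.name}: {enrollments_per_dollar(ch):.4f} expected enrollments per dollar")
```

With these made-up numbers the cheapest channel per impression does not come out on top, because the ranking weighs the estimated conversion rate against the cost rather than looking at either alone.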

And if it wants to find out what drives shopping recidivism, it carries out research. Its data scientists don’t just study zip codes and education levels. They also inspect people’s experience within the Amazon ecosystem. They might start by looking at the patterns of all the people who shopped once or twice at Amazon and never returned. Did they have trouble at checkout? Did their packages arrive on time? Did a higher percentage of them post a bad review? The questions go on and on, because the future of the company hinges upon a system that learns continually, one that figures out what makes customers tick. If I had a chance to be a data scientist for the justice system, I would do my best to dig deeply to learn what goes on inside those prisons and what impact those experiences might have on prisoners’ behavior.


Data Action: Using Data for Public Good by Sarah Williams

affirmative action, Amazon Mechanical Turk, Andrei Shleifer, augmented reality, autonomous vehicles, Brexit referendum, Cambridge Analytica, Charles Babbage, City Beautiful movement, commoditize, coronavirus, COVID-19, crowdsourcing, data acquisition, data is the new oil, data philanthropy, data science, digital divide, digital twin, Donald Trump, driverless car, Edward Glaeser, fake news, four colour theorem, global village, Google Earth, informal economy, Internet of things, Jane Jacobs, John Snow's cholera map, Kibera, Lewis Mumford, Marshall McLuhan, mass immigration, mass incarceration, megacity, military-industrial complex, Minecraft, neoliberal agenda, New Urbanism, Norbert Wiener, nowcasting, oil shale / tar sands, openstreetmap, place-making, precautionary principle, RAND corporation, ride hailing / ride sharing, selection bias, self-driving car, sentiment analysis, Sidewalk Labs, smart cities, Smart Cities: Big Data, Civic Hackers, and the Quest for a New Utopia, Steven Levy, the built environment, The Chicago School, The Death and Life of Great American Cities, transatlantic slave trade, Uber for X, upwardly mobile, urban planning, urban renewal, W. E. B. Du Bois, Works Progress Administration

So it is essential we investigate potential bias created by the technology used to collect data. Analysis of data acquired through creative means must include subject experts and the community represented in the data. Working with policy experts helps data scientists ask the right questions. This type of multidisciplinary team can generate more accurate and ethical results from the data. There are numerous examples of data scientists who try to predict the dynamics of human life, successfully at first; but without continued inclusion of subject experts their data models are quickly outdated. This is the hubris of big data. Why do we often think the data analyst can find the right questions to ask without asking those who have in-depth knowledge of the topics we seek to understand?

Method Throughout Data Action, I argue that unlocking data for policy change works best when the process engages multidisciplinary teams that include policy experts, data scientists, and data visualizers, among others. In the book's conclusion, “It's How We Work with Data That Really Matters,” I stress that bringing together these experts allows the creative expression of data to truly blossom. Policy experts understand the issues, data scientists know how to develop algorithms, and graphic designers can share the results through compelling visuals. Working together, these specialists extend their findings beyond the walls of academia or city hall—and reach the hands and minds of the public.

Kubzansky, “A Framework for Examining Social Stress and Susceptibility to Air Pollution in Respiratory Health,” Environmental Health Perspectives 117, no. 9 (September 1, 2009): 1351–1358, https://doi.org/10.1289/ehp.0900612.
54 Interview with Iyad Kheirbek of [New York City Department of Health and Mental Hygiene], January 2015.
55 “2014 West Africa Ebola Response—OpenStreetMap Wiki,” accessed January 25, 2019, https://wiki.openstreetmap.org/wiki/2014_West_Africa_Ebola_Response.
56 “Ushahidi,” accessed January 25, 2019, https://www.ushahidi.com/.
57 Ida Norheim-Hagtun and Patrick Meier, “Crowdsourcing for Crisis Mapping in Haiti,” Innovations: Technology, Governance, Globalization 5, no. 4 (2010): 81–89.
58 Jessica Ramirez, “‘Ushahidi’ Technology Saves Lives in Haiti and Chile,” Newsweek, March 3, 2010, https://www.newsweek.com/ushahidi-technology-saves-lives-haiti-and-chile-210262.
59 Norheim-Hagtun and Meier, “Crowdsourcing for Crisis Mapping in Haiti.”
60 Patrick Meier, “Crowdsourcing the Evaluation of Post-Sandy Building Damage Using Aerial Imagery,” IRevolutions (blog), November 1, 2012, https://irevolutions.org/2012/11/01/crowdsourcing-sandy-building-damage/.
61 Introduction to Data Science (IDS), “About,” IDS webpage, https://www.idsucla.org/about.
62 Ron Eglash, Juan E. Gilbert, and Ellen Foster, “Toward Culturally Responsive Computing Education,” Commun. ACM 56, no. 7 (July 2013): 33–36, https://doi.org/10.1145/2483852.2483864; Amelia McNamara and Mark Hansen, “Teaching Data Science to Teenagers,” in Proceedings of the Ninth International Conference on Teaching Statistics, 2014.
63 Nicole Lazar and Christine Franklin, “The Big Picture: Preparing Students for a Data-Centric World,” Chance 28, no. 4 (2015): 43–45.


pages: 1,409 words: 205,237

Architecting Modern Data Platforms: A Guide to Enterprise Hadoop at Scale by Jan Kunigk, Ian Buss, Paul Wilkinson, Lars George

Amazon Web Services, barriers to entry, bitcoin, business intelligence, business logic, business process, cloud computing, commoditize, computer vision, continuous integration, create, read, update, delete, data science, database schema, Debian, deep learning, DevOps, domain-specific language, fault tolerance, Firefox, FOSDEM, functional programming, Google Chrome, Induced demand, information security, Infrastructure as a Service, Internet of things, job automation, Kickstarter, Kubernetes, level 1 cache, loose coupling, microservices, natural language processing, Network effects, platform as a service, single source of truth, source of truth, statistical model, vertical integration, web application

The architect is responsible for technical coordination and should have extensive experience working with big data and Hadoop. Data scientist Although the term “data scientist” has been around for some time, the current surge in its use in corporate IT reflects how much big data changes the complexity of data management in IT. It also reflects a shift in academia, where data science has evolved to become a fully qualified discipline at many renowned universities.1 Sometimes we see organizations that are largely indifferent to data science per se, or that simply try to rebrand all existing analyst staff as data scientists. The data scientist, however, actually does more: Statistics and classic BI The data scientist depends on classic tools to present and productize the result of his work, but before these tools can be used, a lot of exploration, cleansing, and modeling is likely to be required on the Hadoop layer.

End users commonly access the web interfaces of the YARN ResourceManager, MapReduce Job History Server, and Spark History Server. In addition, the Hue project offers a comprehensive user interface for many components in the stack, including HDFS, Hive, Impala, Oozie, and Solr. Additional user-oriented or specialized web UIs are also available, such as Jupyter Notebook, Apache Zeppelin, or Cloudera Data Science Workbench for data scientists, as shown in Table 11-1.

Table 11-1. A summary of access mechanisms
Project | Programmatic | Command line | Web UI
HDFS | Java, REST (WebHDFS/HttpFS) | hdfs | NameNode and DataNode
YARN | Java, REST (RM) | yarn | ResourceManager and NodeManager
ZooKeeper | Java/C++ | zookeeper-client | -
HBase | Java, HBase REST/Thrift server [a] | hbase shell | Master and RegionServer
Hive | Thrift, JDBC, ODBC | beeline | HiveServer2
Oozie | Java, REST | oozie | Server via extension
Spark | Java/Scala/Python, JDBC (via Thrift server) | spark-shell, spark-submit, pyspark | History Server
Impala | JDBC, ODBC | impala-shell | Statestore, catalog server, daemon
Solr | Java, REST | solrctl | Server
Kudu | Java/C++/Python | kudu admin utility | Master and tablet server
Hue | Python SDK | - | Hue Server
[a] Apache Phoenix provides a JDBC interface to Apache HBase.
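As a rough illustration of the REST (WebHDFS) access listed for HDFS, the sketch below builds a directory-listing URL and parses a listing response. The host name, port, and path are hypothetical, and the sample response body is hand-written in the LISTSTATUS shape so the code runs without a cluster; the HTTP port in particular depends on the Hadoop version and configuration.

```python
import json
from urllib.parse import quote, urlencode


def webhdfs_url(host: str, port: int, path: str, op: str, **params) -> str:
    """Build a WebHDFS REST URL: http://<host>:<port>/webhdfs/v1/<path>?op=<OP>."""
    query = urlencode({"op": op, **params})
    return f"http://{host}:{port}/webhdfs/v1{quote(path)}?{query}"


def list_names(liststatus_body: str) -> list:
    """Extract entry names from the JSON body of a LISTSTATUS response."""
    statuses = json.loads(liststatus_body)["FileStatuses"]["FileStatus"]
    return [entry["pathSuffix"] for entry in statuses]


# Hypothetical NameNode address and user directory.
url = webhdfs_url("namenode.example.com", 9870, "/user/alice", "LISTSTATUS")
print(url)

# Hand-written sample body standing in for a real cluster response.
sample = (
    '{"FileStatuses": {"FileStatus": ['
    '{"pathSuffix": "a.csv", "type": "FILE"}, '
    '{"pathSuffix": "logs", "type": "DIRECTORY"}]}}'
)
print(list_names(sample))
```

On a real cluster the URL would be fetched with an HTTP GET, usually with authentication that this sketch leaves out.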

Coding Whereas the typical analyst or statistician understands methods and models mathematically, a good data scientist also has a solid background in parallel algorithms to build large-scale distributed applications around such models. As we already mentioned, the data scientist is well versed in coding in third-generation and functional programming languages, such as Scala, Java, and Python, in addition to the domain-specific languages of the classic analytics world. In this function, the data scientist collaborates with development departments to build fully fledged distributed applications that can be productively deployed to Hadoop.


pages: 347 words: 97,721

Only Humans Need Apply: Winners and Losers in the Age of Smart Machines by Thomas H. Davenport, Julia Kirby

"World Economic Forum" Davos, AI winter, Amazon Robotics, Andy Kessler, Apollo Guidance Computer, artificial general intelligence, asset allocation, Automated Insights, autonomous vehicles, basic income, Baxter: Rethink Robotics, behavioural economics, business intelligence, business process, call centre, carbon-based life, Clayton Christensen, clockwork universe, commoditize, conceptual framework, content marketing, dark matter, data science, David Brooks, deep learning, deliberate practice, deskilling, digital map, disruptive innovation, Douglas Engelbart, driverless car, Edward Lloyd's coffeehouse, Elon Musk, Erik Brynjolfsson, estate planning, financial engineering, fixed income, flying shuttle, follow your passion, Frank Levy and Richard Murnane: The New Division of Labor, Freestyle chess, game design, general-purpose programming language, global pandemic, Google Glasses, Hans Lippershey, haute cuisine, income inequality, independent contractor, index fund, industrial robot, information retrieval, intermodal, Internet of things, inventory management, Isaac Newton, job automation, John Markoff, John Maynard Keynes: Economic Possibilities for our Grandchildren, John Maynard Keynes: technological unemployment, Joi Ito, Khan Academy, Kiva Systems, knowledge worker, labor-force participation, lifelogging, longitudinal study, loss aversion, machine translation, Mark Zuckerberg, Narrative Science, natural language processing, Nick Bostrom, Norbert Wiener, nuclear winter, off-the-grid, pattern recognition, performance metric, Peter Thiel, precariat, quantitative trading / quantitative finance, Ray Kurzweil, Richard Feynman, risk tolerance, Robert Shiller, robo advisor, robotic process automation, Rodney Brooks, Second Machine Age, self-driving car, Silicon Valley, six sigma, Skype, social intelligence, speech recognition, spinning jenny, statistical model, Stephen Hawking, Steve Jobs, Steve Wozniak, strong AI, superintelligent machines, supply-chain management, tacit knowledge, tech worker, TED Talk, the long tail, transaction costs, Tyler Cowen, Tyler Cowen: Great Stagnation, Watson beat the top human players on Jeopardy!, Works Progress Administration, Zipcar

If you have a background in one of these (slightly) ancillary aspects of software, you can probably find the same kind of job related to automated software. Data Scientists —A couple of years ago, Tom and D. J. Patil, now chief data scientist of the White House Office of Science and Technology Policy, wrote (with Julia’s editing help) an article suggesting that data scientists held the “sexiest job of the 21st century.”1 It’s not that the people themselves were necessarily sexy, but that the jobs were difficult and hard to fill. They still are, though the shortage may be easing a bit, with the introduction of a number of new master’s programs in data science at U.S. universities. Data scientists are likely to be highly valued when the data used by cognitive systems are highly unstructured (voice or text or human genome records, as opposed to rows and columns of numbers) or difficult to extract from their source.

He does that with his team for about 60 percent of his time. He also spends a lot of time with customers—roughly a couple of days a week. He hears what their needs for new capabilities are, and translates that into data science activity by his team. Whenever he can squeeze it in, he interviews and hires new data scientists, and meets with other DataXu executives. When he hires other data scientists, Catanzaro looks for three types of skills, only one of which is technical. The first is “data science smarts”—being good with big data technologies, statistics, and so forth. He’s not so much interested in knowledge of a particular set of tools, but rather the “raw horsepower” to be able to master new tools.

And they also are likely to have either quantitative modeling skills or natural language processing skills. What do data scientists do day to day in the development of automated decision systems? Automated systems typically use a lot of data, so the data scientist might be scouting around to figure out the next great external data source. After a promising source is identified, he or she might be determining how to get the data into the right format, or how to combine it with the data that the organization already has. The data scientist might also be working on an algorithm to extract insights from the data. Or, since data scientists tend to be good at computational skills, too, they might be architecting or helping to develop the new or modified system.


The Data Revolution: Big Data, Open Data, Data Infrastructures and Their Consequences by Rob Kitchin

Bayesian statistics, business intelligence, business process, cellular automata, Celtic Tiger, cloud computing, collateralized debt obligation, conceptual framework, congestion charging, corporate governance, correlation does not imply causation, crowdsourcing, data science, discrete time, disruptive innovation, George Gilder, Google Earth, hype cycle, Infrastructure as a Service, Internet Archive, Internet of things, invisible hand, knowledge economy, Large Hadron Collider, late capitalism, lifelogging, linked data, longitudinal study, machine readable, Masdar, means of production, Nate Silver, natural language processing, openstreetmap, pattern recognition, platform as a service, recommendation engine, RFID, semantic web, sentiment analysis, SimCity, slashdot, smart cities, Smart Cities: Big Data, Civic Hackers, and the Quest for a New Utopia, smart grid, smart meter, software as a service, statistical model, supply-chain management, technological solutionism, the scientific method, The Signal and the Noise by Nate Silver, transaction costs

The worry for many commentators is that the potential benefits of data-driven business and science will not be fully realised due to a shortage of human talent, in particular ‘data scientists, who combine the skills of software programmer, statistician and storyteller/artist to extract the nuggets of gold hidden under mountains of data’ (Cukier 2010), and managers who understand how to convert such nuggets into wise decisions. With respect to the latter, as Shah et al. (2012: 23) note, ‘[i]nvestments in analytics can be useless, even harmful, unless employees can incorporate that data into complex decision making’. Universities are now starting to create new data science programmes and research centres, and to adapt existing courses to include training in new skills sets, in an effort to ameliorate some skills gaps.

He argues that ‘shifting the analysis to digital data... opens up epistemic questions as to who is the most legitimate producer of knowledge – the museum collector (the clinician, or the molecular biologist) or the statistician analyzing the data’ or producing the simulation or model (2012: 87). Some data scientists are thus undoubtedly ignoring the observations of Porway (2013): Without subject matter experts available to articulate problems in advance, you get [poor] results... Subject matter experts are doubly needed to assess the results of the work, especially when you’re dealing with sensitive data about human behavior. As data scientists, we are well equipped to explain the ‘what’ of data, but rarely should we touch the question of ‘why’ on matters we are not experts in. As Porway notes, what is really needed is for data scientists and domain experts to work with each other to ensure that the data analytics used make sense and that the results from such analytics are sensibly and contextually interpreted.

Berry, D. (2011) ‘The computational turn: thinking about the digital humanities’, Culture Machine, 12, http://www.culturemachine.net/index.php/cm/article/view/440/470 (last accessed 3 December 2012). Bertolucci, J. (2013) ‘IBM, universities team up to build data scientists’, InformationWeek, 15 January. http://www.informationweek.com/big-data/big-data-analytics/ibm-universities-team-upto-build-data-scientists/ (last accessed 16 January 2014). Bettencourt, L.M.A., Lobo, J., Helbing, D., Kuhnert, C. and West, G.B. (2007) ‘Growth, innovation, scaling, and the pace of life in cities’, Proceedings of the National Academy of Sciences, 104(17): 7301–06.


pages: 241 words: 70,307

Leadership by Algorithm: Who Leads and Who Follows in the AI Era? by David de Cremer

"Friedman doctrine" OR "shareholder theory", algorithmic bias, algorithmic management, AlphaGo, bitcoin, blockchain, business climate, business process, Computing Machinery and Intelligence, corporate governance, data is not the new oil, data science, deep learning, DeepMind, Donald Trump, Elon Musk, fake news, future of work, job automation, Kevin Kelly, Mark Zuckerberg, meta-analysis, Norbert Wiener, pattern recognition, Peter Thiel, race to the bottom, robotic process automation, Salesforce, scientific management, shareholder value, Silicon Valley, Social Responsibility of Business Is to Increase Its Profits, Stephen Hawking, The Future of Employment, Turing test, work culture, workplace surveillance, zero-sum game

Second, some departments will employ algorithms more than others, thereby creating differences in work attitudes towards automation, which will make the process of digital transformation more difficult. Integrating teams of data scientists in the daily operations of the company With the work environment gradually being automated, organizations will increasingly hire more people with an engineering and data-science background. These new hires will have an expert understanding of the new technology and the ability to work with big data. A problem, however, is that those experts usually do not share the same mindset as the people who are not trained in the fields of engineering and data science. Organizations often fail to recognize this difference in mindset and put little effort into ensuring that data scientists are integrated into the organization.

In a way, leaders in the 21st century show competence by bringing the right teams of experts together to optimise the use of data to bring the value that is expected. For example, leaders connecting teams of data scientists with HR and finance teams in transparent and effective ways can help to increase the success rate of digital transformation strategies. The team of data scientists will help their colleagues to see what possibilities are available to digitalize information. Equally, the other teams can help data scientists understand their needs and thus provide input in designing a more user-friendly digital environment. Finally, because successes are rarely achieved immediately, it is important that leaders provide regular updates to the different parties involved on how the challenges are being approached.

Becoming an automated organization means that all operations will be affected. Teams of data scientists thus need to understand the goals of the finance department, human resource department, sales department and so forth. Likewise, organizations need to prepare all the other departments to be open and collaborative with the team of data scientists. It is only with an open-minded attitude that successful integration and implementation of algorithms within the context of each department can be achieved.²⁰² Promoting transparency in communication and exchange of data Today there is no longer any doubt that to succeed in the future, organizations will need to have the ability to deal with data and use algorithms.


pages: 374 words: 94,508

Infonomics: How to Monetize, Manage, and Measure Information as an Asset for Competitive Advantage by Douglas B. Laney

3D printing, Affordable Care Act / Obamacare, banking crisis, behavioural economics, blockchain, book value, business climate, business intelligence, business logic, business process, call centre, carbon credits, chief data officer, Claude Shannon: information theory, commoditize, conceptual framework, crowdsourcing, dark matter, data acquisition, data science, deep learning, digital rights, digital twin, discounted cash flows, disintermediation, diversification, en.wikipedia.org, endowment effect, Erik Brynjolfsson, full employment, hype cycle, informal economy, information security, intangible asset, Internet of things, it's over 9,000, linked data, Lyft, Nash equilibrium, Neil Armstrong, Network effects, new economy, obamacare, performance metric, profit motive, recommendation engine, RFID, Salesforce, semantic web, single source of truth, smart meter, Snapchat, software as a service, source of truth, supply-chain management, tacit knowledge, technological determinism, text mining, uber lyft, Y2K, yield curve

These types of algorithms can help to monetize almost any kind of information, be it granular IoT data or macro-level economic figures. Expect these kinds of algorithms to be a standard component in the vast majority of data scientist toolboxes and increasingly accepted by business leaders, despite their “black box” models. As a result, many data science tasks will become automated, increasing the productivity of data scientists and enabling a class of “citizen data scientists” to emerge. This will put significant pressure on information asset supply chains and information curation efforts, and engender a boom in information monetization ideas from all corners of the organization across all industries.

For example, the real estate aggregator Trulia discovered that 90 percent of its web traffic is from people clicking on photos of homes. But Trulia had no information about what was in the photos. The photos had no descriptions or tags. So Trulia’s data science team trained a one-billion-node neural network to learn what is depicted in them. Now, according to Todd Holloway, who started Trulia’s data science program, “The system can find you a home in the Hamptons with photos of wine cellars.”11 Helping a buyer find a home is one thing, but now Trulia can correlate sales data with what site users are looking at, and license this information and insight to realtors, homebuilders, appliance manufacturers, and any type of company within the periphery of the real estate market.

And value-focused CDOs, will deploy information assets to generate supplemental and significant revenue streams. Advanced Analytics, Data Science, and Artificial Intelligence Not just a global trend, but also a technology trend, advanced analytics solutions are becoming increasingly popular in driving business innovation and experimentation—and creating competitive advantage by monetizing available information assets inside and outside the organization. Over the foreseeable future, enterprises will be seeking to adopt advanced analytics and adapt their business models, establish specialist data science teams, and rethink their overall strategies to keep pace with the competition.


Machine Learning Design Patterns: Solutions to Common Challenges in Data Preparation, Model Building, and MLOps by Valliappa Lakshmanan, Sara Robinson, Michael Munn

A Pattern Language, Airbnb, algorithmic trading, automated trading system, business intelligence, business logic, business process, combinatorial explosion, computer vision, continuous integration, COVID-19, data science, deep learning, DevOps, discrete time, en.wikipedia.org, Hacker News, industrial research laboratory, iterative process, Kubernetes, machine translation, microservices, mobile money, natural language processing, Netflix Prize, optical character recognition, pattern recognition, performance metric, recommendation engine, ride hailing / ride sharing, selection bias, self-driving car, sentiment analysis, speech recognition, statistical model, the payments system, web application

Data engineers implement infrastructure and pipelines around data. Machine learning engineers do similar tasks to data engineers, but for ML models. They take models developed by data scientists, and manage the infrastructure and operations around training and deploying those models. ML engineers help build production systems to handle updating models, model versioning, and serving predictions to end users. The smaller the data science team at a company and the more agile the team is, the more likely it is that the same person plays multiple roles. If you are in such a situation, it is very likely that you read the above three descriptions and saw yourself partially in all three categories.

His team builds software solutions for business problems using Google Cloud’s data analytics and machine learning products. He founded Google’s Advanced Solutions Lab ML Immersion program. Before Google, Lak was a Director of Data Science at Climate Corporation and a Research Scientist at NOAA. Sara Robinson is a Developer Advocate on Google’s Cloud Platform team, focusing on machine learning. She inspires developers and data scientists to integrate ML into their applications through demos, online content, and events. Sara has a bachelor’s degree from Brandeis University. Before Google, she was a Developer Advocate on the Firebase team.

Roles Within an organization, there are many different job roles relating to data and machine learning. Below we’ll define a few common ones referenced frequently throughout the book. This book is targeted primarily at data scientists, data engineers, and ML engineers, so let’s start with those. A data scientist is someone focused on collecting, interpreting, and processing datasets. They run statistical and exploratory analysis on data. As it relates to machine learning, a data scientist may work on data collection, feature engineering, model building, and more. Data scientists often work in Python or R in a notebook environment, and are usually the first to build out an organization’s machine learning models.


pages: 366 words: 76,476

Dataclysm: Who We Are (When We Think No One's Looking) by Christian Rudder

4chan, Affordable Care Act / Obamacare, bitcoin, cloud computing, correlation does not imply causation, crowdsourcing, cuban missile crisis, data science, Donald Trump, Edward Snowden, en.wikipedia.org, fake it until you make it, Frank Gehry, Howard Zinn, Jaron Lanier, John Markoff, John Snow's cholera map, lifelogging, Mahatma Gandhi, Mikhail Gorbachev, Nate Silver, Nelson Mandela, new economy, obamacare, Occupy movement, p-value, power law, pre–internet, prosperity theology / prosperity gospel / gospel of success, race to the bottom, retail therapy, Salesforce, selection bias, Snapchat, social graph, Steve Jobs, the scientific method, the strength of weak ties, Twitter Arab Spring, two and twenty

I transported the data to the previous Voronoi partition in order to maintain consistency with the previous Craigslist map. Years ago, an enterprising hacker The hacker is Pete Warden, and his post is “How to Split Up the US,” which you can find here: petewarden.com/2010/02/06/how-to-split-up-the-us/. As Warden notes in a later post, “Why You Should Never Trust a Data Scientist,” his grouping of the United States into the seven new zones is arbitrary—the data science version of “for entertainment purposes only.” I reference them here in that spirit. Matthew Zook, a geographer Professor Zook and his team maintain a fantastic geography blog called Floating Sheep, and that blog was my primary source for his work: floatingsheep.org.

For more information and the full study, please refer to the Facebook Data Science post on Coordinated Migration: www.facebook.com/notes/facebook-data-science/coordinated-migration/10151930946453859. As you’ll see when you visit the link, in reproducing their work, I modified their original map by removing the labels and focusing on a smaller part of the region, to make the map more readable in print. Thank you to Mike Develin, also at Facebook, for helping facilitate permission for this reproduction. All Facebook Data Science work is done on anonymized and aggregated data. Chapter 13: Our Brand Could Be Your Life But what they don’t tell you See Clare Baker, “Behind the Red Triangle: The Bass Pale Ale Brand and Logo” Logoworks.com, November 8, 2013, logoworks.com/blog/bass-pale-ale-brand-and-logo/.

And it’s because these few letters are such a concise description that Shazam is so fast: instead of a guitar, Paul McCartney, and just the right amount of reverb, “Yesterday” starts with •DRUUUUUUDDR. That’s a lot easier to understand. Like an app straining for a song, data science is about finding patterns. Time after time, I—and the many other people doing work like me—have had to devise methods, structures, even shortcuts to find the signal amidst the noise. We’re all looking for our own Parsons code. Something so simple and yet so powerful is a once-in-a-lifetime discovery, but luckily there are a lot of lifetimes out there. And for any problem that data science might face, this book has been my way to say: I like our odds. 1 For more on the Kafkaesque implications of the CFAA, please see “Until Today, If You Were 17, It Could Have Been Illegal to Read Seventeen.com Under the CFAA” and “Are You a Teenager Who Reads News Online?


pages: 133 words: 42,254

Big Data Analytics: Turning Big Data Into Big Money by Frank J. Ohlhorst

algorithmic trading, bioinformatics, business intelligence, business logic, business process, call centre, cloud computing, create, read, update, delete, data acquisition, data science, DevOps, extractivism, fault tolerance, information security, Large Hadron Collider, linked data, machine readable, natural language processing, Network effects, pattern recognition, performance metric, personalized medicine, RFID, sentiment analysis, six sigma, smart meter, statistical model, supply-chain management, warehouse automation, Watson beat the top human players on Jeopardy!, web application

Determining those skills is one of the first steps in putting a team together. THE DATA SCIENTIST One of the first concepts to become acquainted with is the data scientist; a relatively new title, it is not readily recognized or accepted by many organizations, but it is here to stay. A data scientist is normally associated with an employee or a business intelligence (BI) consultant who excels at analyzing data, particularly large amounts of data, to help a business gain a competitive edge. The data scientist is usually the de facto team leader during a Big Data analytics project. The title data scientist is sometimes disparaged because it lacks specificity and can be perceived as an aggrandized synonym for data analyst.

Much like the data themselves, the team should not be static in nature and should be able to evolve and adapt to the needs of the business. CHALLENGES REMAIN Locating the right talent to analyze data is the biggest hurdle in building a team. Such talent is in high demand, and the need for data analysts and data scientists continues to grow at an almost exponential rate. Finding this talent means that organizations will have to focus on data science and hire statistical modelers and text data–mining professionals as well as people who specialize in sentiment analysis. Success with Big Data analytics requires solid data models, statistical predictive models, and test analytic models, since these will be the core applications needed to do Big Data.

Nevertheless, the position is gaining acceptance with large enterprises that are interested in deriving meaning from Big Data, the voluminous amount of structured, unstructured, and semistructured data that a large enterprise produces or has access to. A data scientist must possess a combination of analytic, machine learning, data mining, and statistical skills as well as experience with algorithms and coding. However, the most critical skill a data scientist should possess is the ability to translate the significance of data in a way that can be easily understood by others. THE TEAM CHALLENGE Finding and hiring talented workers with analytics skills is the first step in creating an effective data analytics team.


pages: 204 words: 58,565

Keeping Up With the Quants: Your Guide to Understanding and Using Analytics by Thomas H. Davenport, Jinho Kim

behavioural economics, Black-Scholes formula, business intelligence, business process, call centre, computer age, correlation coefficient, correlation does not imply causation, Credit Default Swap, data science, en.wikipedia.org, feminist movement, Florence Nightingale: pie chart, forensic accounting, global supply chain, Gregor Mendel, Hans Rosling, hypertext link, invention of the telescope, inventory management, Jeff Bezos, Johannes Kepler, longitudinal study, margin call, Moneyball by Michael Lewis explains big data, Myron Scholes, Netflix Prize, p-value, performance metric, publish or perish, quantitative hedge fund, random walk, Renaissance Technologies, Robert Shiller, self-driving car, sentiment analysis, six sigma, Skype, statistical model, supply-chain management, TED Talk, text mining, the scientific method, Thomas Davenport

For example, at Intuit, George Roumeliotis heads a data science group that analyzes and creates product features based on the vast amount of online data that Intuit collects. For every project in which his group engages with an internal customer, he recommends a methodology for doing and communicating about the analysis. Most of the steps have a strong business orientation: (1) my understanding of the business problem; (2) how I will measure the business impact; (3) what data is available; (4) the initial solution hypothesis; (5) the solution; (6) the business impact of the solution. Data scientists using this methodology are encouraged to create a wiki so that they can post the results of each step.

In the online information industry, companies have big data involving many petabytes of information. New information comes in at such volume and speed that it would be difficult for humans to comprehend it all. In this environment, the data scientists that work in such organizations (basically quantitative analysts with higher-than-normal IT skills) are often located in product development organizations. Their goal is to develop product prototypes and new product features, not reports or presentations. For example, the Data Science group at the business networking site LinkedIn is a part of the product organization, and has developed a variety of new product features and functions based on the relationships between social networks and jobs.

Patil eventually graduated and became a faculty member at Maryland, and worked on modeling the complexity of weather. He then worked for the US government on intelligence issues. Research funding was limited at the time, so he left to work for Skype, then owned by eBay. He next became the leader of data scientists at LinkedIn, where the people in that highly analytical position have been enormously influential in product development. Now Patil is a data scientist in residence (perhaps the first person with that title as well) at the venture capital firm Greylock Partners, helping the firm’s portfolio companies to think about data and analytics. He’s perhaps the world’s best example of latent math talent.


pages: 918 words: 257,605

The Age of Surveillance Capitalism by Shoshana Zuboff

"World Economic Forum" Davos, algorithmic bias, Amazon Web Services, Andrew Keen, augmented reality, autonomous vehicles, barriers to entry, Bartolomé de las Casas, behavioural economics, Berlin Wall, Big Tech, bitcoin, blockchain, blue-collar work, book scanning, Broken windows theory, California gold rush, call centre, Cambridge Analytica, Capital in the Twenty-First Century by Thomas Piketty, Cass Sunstein, choice architecture, citizen journalism, Citizen Lab, classic study, cloud computing, collective bargaining, Computer Numeric Control, computer vision, connected car, context collapse, corporate governance, corporate personhood, creative destruction, cryptocurrency, data science, deep learning, digital capitalism, disinformation, dogs of the Dow, don't be evil, Donald Trump, Dr. Strangelove, driverless car, Easter island, Edward Snowden, en.wikipedia.org, Erik Brynjolfsson, Evgeny Morozov, facts on the ground, fake news, Ford Model T, Ford paid five dollars a day, future of work, game design, gamification, Google Earth, Google Glasses, Google X / Alphabet X, Herman Kahn, hive mind, Ian Bogost, impulse control, income inequality, information security, Internet of things, invention of the printing press, invisible hand, Jean Tirole, job automation, Johann Wolfgang von Goethe, John Markoff, John Maynard Keynes: Economic Possibilities for our Grandchildren, John Maynard Keynes: technological unemployment, Joseph Schumpeter, Kevin Kelly, Kevin Roose, knowledge economy, Lewis Mumford, linked data, longitudinal study, low skilled workers, Mark Zuckerberg, market bubble, means of production, multi-sided market, Naomi Klein, natural language processing, Network effects, new economy, Occupy movement, off grid, off-the-grid, PageRank, Panopticon Jeremy Bentham, pattern recognition, Paul Buchheit, performance metric, Philip Mirowski, precision agriculture, price mechanism, profit maximization, profit motive, public intellectual, recommendation engine, refrigerator car, 
RFID, Richard Thaler, ride hailing / ride sharing, Robert Bork, Robert Mercer, Salesforce, Second Machine Age, self-driving car, sentiment analysis, shareholder value, Sheryl Sandberg, Shoshana Zuboff, Sidewalk Labs, Silicon Valley, Silicon Valley ideology, Silicon Valley startup, slashdot, smart cities, Snapchat, social contagion, social distancing, social graph, social web, software as a service, speech recognition, statistical model, Steve Bannon, Steve Jobs, Steven Levy, structural adjustment programs, surveillance capitalism, technological determinism, TED Talk, The Future of Employment, The Wealth of Nations by Adam Smith, Tim Cook: Apple, two-sided market, union organizing, vertical integration, Watson beat the top human players on Jeopardy!, winner-take-all economy, Wolfgang Streeck, work culture , Yochai Benkler, you are the product

.… Who knows what other research they’re doing.”14 In other words, Fiske recognized that the experiment was merely an extension of Facebook’s standard practices of behavioral modification, which already flourish without sanction. Facebook data scientist and principal researcher Adam Kramer was deluged with hundreds of media queries, leading him to write on his Facebook page that the corporation really does “care” about its emotional impact. One of his coauthors, Cornell’s Jeffrey Hancock, told the New York Times that he didn’t realize that manipulating the news feeds, even modestly, would make some people feel violated.15 The Wall Street Journal reported that the Facebook data science group had run more than 1,000 experiments since its inception in 2007 and operated with “few limits” and no internal review board.

In studying the surveillance capitalist practices of Google, Facebook, Microsoft, and other corporations, I have paid close attention to interviews, patents, earnings calls, speeches, conferences, videos, and company programs and policies. In addition, between 2012 and 2015 I interviewed 52 data scientists from 19 different companies with a combined 586 years of experience in high-technology corporations and startups, primarily in Silicon Valley. These interviews were conducted as I developed my “ground truth” understanding of surveillance capitalism and its material infrastructure. Early on I approached a small number of highly respected data scientists, senior software developers, and specialists in the “internet of things.” My interview sample grew as scientists introduced me to their colleagues.

Surveillance capitalists adapted many of the highly contestable assumptions of behavioral economists as one cover story with which to legitimate their practical commitment to a unilateral commercial program of behavior modification. The twist here is that nudges are intended to encourage choices that accrue to the architect, not to the individual. The result is data scientists trained on economies of action who regard it as perfectly normal to master the art and science of the “digital nudge” for the sake of their company’s commercial interests. For example, the chief data scientist for a national drugstore chain described how his company designs automatic digital nudges that subtly push people toward the specific behaviors favored by the company: “You can make people do things with this technology.


Mindf*ck: Cambridge Analytica and the Plot to Break America by Christopher Wylie

4chan, affirmative action, Affordable Care Act / Obamacare, air gap, availability heuristic, Berlin Wall, Bernie Sanders, Big Tech, big-box store, Boris Johnson, Brexit referendum, British Empire, call centre, Cambridge Analytica, Chelsea Manning, chief data officer, cognitive bias, cognitive dissonance, colonial rule, computer vision, conceptual framework, cryptocurrency, Daniel Kahneman / Amos Tversky, dark pattern, dark triade / dark tetrad, data science, deep learning, desegregation, disinformation, Dominic Cummings, Donald Trump, Downton Abbey, Edward Snowden, Elon Musk, emotional labour, Etonian, fake news, first-past-the-post, gamification, gentleman farmer, Google Earth, growth hacking, housing crisis, income inequality, indoor plumbing, information asymmetry, Internet of things, Julian Assange, Lyft, Marc Andreessen, Mark Zuckerberg, Menlo Park, move fast and break things, Network effects, new economy, obamacare, Peter Thiel, Potemkin village, recommendation engine, Renaissance Technologies, Robert Mercer, Ronald Reagan, Rosa Parks, Sand Hill Road, Scientific racism, Shoshana Zuboff, side project, Silicon Valley, Skype, Stephen Fry, Steve Bannon, surveillance capitalism, tech bro, uber lyft, unpaid internship, Valery Gerasimov, web application, WikiLeaks, zero-sum game

But that, of course, was not why Nix gave them full access to the private data of hundreds of millions of American citizens. Nix’s dream, as he had confided in our very first meeting, was to become the “Palantir of propaganda.” One lead data scientist from Palantir began making regular trips to the Cambridge Analytica office to work with the data science team on building profiling models. He was occasionally accompanied by colleagues, but the entire arrangement was kept secret from the rest of the CA teams—and perhaps Palantir itself. I can’t speculate about why, but the Palantir staff received Cambridge Analytica database logins and emails with fairly obvious pseudonyms like “Dr.

Fresh out of university, I had taken a job at a London firm called SCL Group, which was supplying the U.K. Ministry of Defence and NATO armies with expertise in information operations. As Western militaries were grappling with how to tackle radicalization online, the firm wanted me to help build a team of data scientists to create new tools to identify and combat extremism online. It was fascinating, challenging, and exciting all at once. We were about to break new ground for the cyber defenses of Britain, America, and their allies and confront bubbling insurgencies of radical extremism with data, algorithms, and targeted narratives online.

It seemed entirely hypocritical to, on the one hand, frustrate jihadist groups in places like Pakistan and then, on the other, assist an autocratic and Islamist-backed regime in Egypt in creating its own tyranny of people. But Nix didn’t care. Business was business; he just wanted to clinch the deal. The main challenge for me and the growing team of psychologists and data scientists at SCL was in the objective substance of extremism itself. What does it mean to be an extremist? What exactly is extremism, and how can you model it? These were subjective definitions, and clearly the Egyptian government had one idea, while we had another. But if you want to be able to quantify and predict a trait, you have to be able to create a definition of it.


pages: 245 words: 83,272

Artificial Unintelligence: How Computers Misunderstand the World by Meredith Broussard

"Susan Fowler" uber, 1960s counterculture, A Declaration of the Independence of Cyberspace, Ada Lovelace, AI winter, Airbnb, algorithmic bias, AlphaGo, Amazon Web Services, autonomous vehicles, availability heuristic, barriers to entry, Bernie Sanders, Big Tech, bitcoin, Buckminster Fuller, Charles Babbage, Chris Urmson, Clayton Christensen, cloud computing, cognitive bias, complexity theory, computer vision, Computing Machinery and Intelligence, crowdsourcing, Danny Hillis, DARPA: Urban Challenge, data science, deep learning, Dennis Ritchie, digital map, disruptive innovation, Donald Trump, Douglas Engelbart, driverless car, easy for humans, difficult for computers, Electric Kool-Aid Acid Test, Elon Musk, fake news, Firefox, gamification, gig economy, global supply chain, Google Glasses, Google X / Alphabet X, Greyball, Hacker Ethic, independent contractor, Jaron Lanier, Jeff Bezos, Jeremy Corbyn, John Perry Barlow, John von Neumann, Joi Ito, Joseph-Marie Jacquard, life extension, Lyft, machine translation, Mark Zuckerberg, mass incarceration, Minecraft, minimum viable product, Mother of all demos, move fast and break things, Nate Silver, natural language processing, Northpointe / Correctional Offender Management Profiling for Alternative Sanctions, One Laptop per Child (OLPC), opioid epidemic / opioid crisis, PageRank, Paradox of Choice, payday loans, paypal mafia, performance metric, Peter Thiel, price discrimination, Ray Kurzweil, ride hailing / ride sharing, Ross Ulbricht, Saturday Night Live, school choice, self-driving car, Silicon Valley, Silicon Valley billionaire, speech recognition, statistical model, Steve Jobs, Steven Levy, Stewart Brand, TechCrunch disrupt, Tesla Model S, the High Line, The Signal and the Noise by Nate Silver, theory of mind, traumatic brain injury, Travis Kalanick, trolley problem, Turing test, Uber for X, uber lyft, Watson beat the top human players on Jeopardy!, We are as Gods, Whole Earth Catalog, women in the workforce, work 
culture , yottabyte

I’m going to take you through a tutorial from a site called DataCamp, which was recommended as a first step for competing in data-science competitions by a different site, Kaggle.16 Kaggle, which is owned by Google’s parent company, Alphabet, is a site in which people compete to get the highest score for analyzing a dataset. Data scientists use it to compete in teams, sharpen their skills, or practice collaborating. It’s also useful for teaching students about data science or for finding datasets. We’re going to do a DataCamp Titanic tutorial using Python and a few popular Python libraries: pandas, scikit-learn, and numpy.

Pilhofer, “A Note to Users of DocumentCloud.” Acknowledgments Thank you to all the people who helped to bring this book to reality. I am grateful to my colleagues at New York University’s (NYU’s) Arthur L. Carter Journalism Institute, my colleagues at the Moore-Sloan Data Science Environment at NYU’s Center for Data Science, the faculty and staff at the Tow Center for Digital Journalism at Columbia Journalism School, and my former colleagues at Temple University and the University of Pennsylvania. For reading, consulting on, or otherwise midwifing this manuscript, I am eternally indebted to Elena Lahr-Vivaz, Rosalie Siegel, Jordan Ellenberg, Cathy O’Neil, Miriam Peskowitz, Samira Baird, Lori Tharps, Kira Baker-Doyle, Jane Dmochowski, Josephine Wolff, Solon Barocas, Hanna Wallach, Katy Boss, Janet Alteveer, Leslie Hunt, Elizabeth Hunt, Kay Kinsey, Karen Masse, Stevie Santangelo, Jay Kirk, Claire Wardle, Gita Manaktala, Melinda Rankin, Kathleen Caruso, Kyle Gipson, my writers’ group, and the talented team at the MIT Press.

There has never been, nor will there ever be, a technological innovation that moves us away from the essential problems of human nature. Why, then, do people persist in thinking there’s a sunny technological future just around the corner? I started thinking about technochauvinism one day when I was talking with a twenty-something friend who works as a data scientist. I mentioned something about Philadelphia schools that didn’t have enough books. “Why not just use laptops or iPads and get electronic textbooks?” asked my friend. “Doesn’t technology make everything faster, cheaper, and better?” He got an earful. (You’ll get one too in a later chapter.) However, his assumption stuck with me.


pages: 292 words: 85,151

Exponential Organizations: Why New Organizations Are Ten Times Better, Faster, and Cheaper Than Yours (And What to Do About It) by Salim Ismail, Yuri van Geest

23andMe, 3D printing, Airbnb, Amazon Mechanical Turk, Amazon Web Services, anti-fragile, augmented reality, autonomous vehicles, Baxter: Rethink Robotics, behavioural economics, Ben Horowitz, bike sharing, bioinformatics, bitcoin, Black Swan, blockchain, Blue Ocean Strategy, book value, Burning Man, business intelligence, business process, call centre, chief data officer, Chris Wanstrath, circular economy, Clayton Christensen, clean water, cloud computing, cognitive bias, collaborative consumption, collaborative economy, commoditize, corporate social responsibility, cross-subsidies, crowdsourcing, cryptocurrency, dark matter, data science, Dean Kamen, deep learning, DeepMind, dematerialisation, discounted cash flows, disruptive innovation, distributed ledger, driverless car, Edward Snowden, Elon Musk, en.wikipedia.org, Ethereum, ethereum blockchain, fail fast, game design, gamification, Google Glasses, Google Hangouts, Google X / Alphabet X, gravity well, hiring and firing, holacracy, Hyperloop, industrial robot, Innovator's Dilemma, intangible asset, Internet of things, Iridium satellite, Isaac Newton, Jeff Bezos, Joi Ito, Kevin Kelly, Kickstarter, knowledge worker, Kodak vs Instagram, Law of Accelerating Returns, Lean Startup, life extension, lifelogging, loose coupling, loss aversion, low earth orbit, Lyft, Marc Andreessen, Mark Zuckerberg, market design, Max Levchin, means of production, Michael Milken, minimum viable product, natural language processing, Netflix Prize, NetJets, Network effects, new economy, Oculus Rift, offshore financial centre, PageRank, pattern recognition, Paul Graham, paypal mafia, peer-to-peer, peer-to-peer model, Peter H. Diamandis: Planetary Resources, Peter Thiel, Planet Labs, prediction markets, profit motive, publish or perish, radical decentralization, Ray Kurzweil, recommendation engine, RFID, ride hailing / ride sharing, risk tolerance, Ronald Coase, Rutger Bregman, Salesforce, Second Machine Age, self-driving car, sharing economy, Silicon Valley, skunkworks, Skype, smart contracts, Snapchat, social software, software is eating the world, SpaceShipOne, speech recognition, stealth mode startup, Stephen Hawking, Steve Jobs, Steve Jurvetson, subscription business, supply-chain management, synthetic biology, TaskRabbit, TED Talk, telepresence, telepresence robot, the long tail, Tony Hsieh, transaction costs, Travis Kalanick, Tyler Cowen, Tyler Cowen: Great Stagnation, uber lyft, urban planning, Virgin Galactic, WikiLeaks, winner-take-all economy, X Prize, Y Combinator, zero-sum game

For talented workers, working on and getting paid for multiple projects is a particularly welcome opportunity. But there’s another angle as well: an increase in the diversity of ideas. The data science company Kaggle, for example, offers a platform that hosts private and public algorithm contests in which more than 185,000 data scientists around the world vie for prizes and recognition. In 2011, insurance giant Allstate, with forty of the best actuaries and data scientists money could buy, wanted to see if its claims algorithm could be improved upon, so it ran a contest on Kaggle. It turned out that the Allstate algorithm, which had been carefully optimized for over six decades, was bested within three days by 107 competing teams.

In fact, in every one of Kaggle’s 150 contests to date, external data scientists have beaten the internal algorithms, often by a wide margin. And in most cases outsiders (non-experts) have beaten the experts in a particular domain, which shows the power of fresh thinking and diverse perspectives. In years past, having a large workforce differentiated your enterprise and allowed it to accomplish more. Today, that same large workforce can become an anchor that reduces maneuverability and slows you down. Moreover, traditional industries have great difficulty attracting on-demand high-skill workers such as data scientists because the available positions are perceived as being low in terms of opportunity and high in terms of bureaucratic obstacles.

While growing exponentially as a company, Interfaces are critical if an organization is to scale seamlessly, especially on a global level. The same is true of other firms that coordinate data and oversee everything from prizes to personnel. Kaggle has its own unique mechanisms to manage its 200,000 data scientists. The X Prize Foundation has created mechanisms and dedicated teams for each of its prizes. TED has strict guidelines that help its many “franchised” TEDx events around the world deliver with consistency. And Uber has its own ways of handling its army of drivers. Most of these Interface processes are unique and proprietary to the organization that developed them, and as such comprise a unique type of intellectual property that can be of considerable market value.


pages: 50 words: 13,399

The Elements of Data Analytic Style by Jeff Leek

correlation does not imply causation, data science, Netflix Prize, p-value, pattern recognition, Ronald Coase, statistical model, TED Talk

A large number of these are summarized in Karl Broman’s excellent presentation on displaying data badly. 11. Presenting data Giving data science talks can help you: (1) meet people; (2) get people excited about your ideas/software/results; (3) help people understand your ideas/software/results. The importance of the first point can’t be overstated. The primary reason you are giving talks is for people to get to know you. Being well known and well regarded can make a huge range of parts of your job easier. So first and foremost make sure you don’t forget to talk to people before, after, and during your talk. Point 2 is more important than point 3. As a data scientist, it is hard to accept that the primary purpose of a talk is advertising, not data science.

As a data scientist, it is hard to accept that the primary purpose of a talk is advertising, not data science. See for example Hilary Mason’s great presentation Entertain, don’t teach. Here are reasons why entertainment is more important: That being said, be very careful to avoid giving a TED talk. If you are giving a data science presentation the goal is to communicate specific ideas. So while you are entertaining, don’t forget why you are entertaining. 11.1 Tailor your talk to your audience It depends on the event and the goals of the event. Here is a non-comprehensive list: Small group meeting: Goal: Update people you work with on what you are doing and get help.

Additional resources 15.1 Class lecture notes Johns Hopkins Data Science Specialization and Additional resources Data wrangling, exploration, and analysis with R Tools for Reproducible Research Data carpentry 15.2 Tutorials Git/github tutorial Make tutorial knitr in a knutshell Writing an R package from scratch 15.3 Leek group guides To data sharing To giving talks To developing R packages 15.4 Books An introduction to statistical learning Advanced data analysis from an elementary point of view Advanced R programming OpenIntro Statistics Statistical inference for data science


pages: 271 words: 62,538

The Best Interface Is No Interface: The Simple Path to Brilliant Technology (Voices That Matter) by Golden Krishna

Airbnb, Bear Stearns, computer vision, crossover SUV, data science, en.wikipedia.org, fear of failure, impulse control, Inbox Zero, Internet Archive, Internet of things, Jeff Bezos, Jony Ive, Kickstarter, lock screen, Mark Zuckerberg, microdosing, new economy, Oculus Rift, off-the-grid, Paradox of Choice, pattern recognition, QR code, RFID, self-driving car, Silicon Valley, skeuomorphism, Skype, Snapchat, Steve Jobs, tech worker, technoutopianism, TED Talk, Tim Cook: Apple, Y Combinator, Y2K

People like Gordon Bell—through his MyLifeBits project at Microsoft Research Silicon Valley Laboratory—started collecting information about their own lives to beat memory loss.11 Over time, the digital medium’s incredible ability to remember built up to what today we call “big data.” Enter the world of the data scientist to find relevant meaning in all that digital data, to power solutions that adjust to your wonderful uniqueness. To think of you. What do you want to eat tonight? What’s the best way home? Data science is one way we can find meaning in all that cheaply stored information—whether big data or even small, relevant, searchable sets—and develop real insights and accurate answers to valuable individual questions.

In 2006, Jonathan Goldman, a PhD graduate in physics from Stanford working at the professional social network LinkedIn, wondered if, by studying you, the service could suggest connections you might know. Could he create an interface of smart connections built around you and your friends rather than putting you in a box as part of a generic database? Even at LinkedIn, where data scientist celebrity DJ Patil, um, coined the term “data scientist” with Jeff Hammerbacher back in 2008, the concept of creating platforms that adapted to your wonderfully distinct qualities seemed foreign just two years earlier.17 According to Patil, outspoken stakeholders in the company were said to have been “openly dismissive” of Goldman’s concept.

Rosenwald, “For Tablet Computer Visionary Roger Fidler, a Lot of What-Ifs,” Washington Post, March 10, 2012. http://www.washingtonpost.com/business/for-tablet-computer-visionary-roger-fidler-a-lot-of-what-ifs/2012/02/28/gIQAM0kN1R_story.html 17 “Some colleagues were openly dismissive of Goldman’s ideas. Why would users need LinkedIn to figure out their networks for them?” Thomas H. Davenport and D.J. Patil, “Data Scientist: The Sexiest Job of the 21st Century,” Harvard Business Review, October 2012. http://hbr.org/2012/10/data-scientist-the-sexiest-job-of-the-21st-century/ar/1 Chapter 15 Proactive Computing 1 “During the second World War, the robot was stored in the basement of the Weeks’s family home in Ohio, where he became 8-year-old Jack’s playmate.” Noel Sharkey, “Sign In To Read: The Return of Elektro, the First Celebrity Robot,” New Scientist, December 25, 2008. http://www.newscientist.com/article/mg20026873.000-the-return-of-elektro-the-first-celebrity-robot.html?


pages: 276 words: 81,153

Outnumbered: From Facebook and Google to Fake News and Filter-Bubbles – the Algorithms That Control Our Lives by David Sumpter

affirmative action, algorithmic bias, AlphaGo, Bernie Sanders, Brexit referendum, Cambridge Analytica, classic study, cognitive load, Computing Machinery and Intelligence, correlation does not imply causation, crowdsourcing, data science, DeepMind, Demis Hassabis, disinformation, don't be evil, Donald Trump, Elon Musk, fake news, Filter Bubble, Geoffrey Hinton, Google Glasses, illegal immigration, James Webb Space Telescope, Jeff Bezos, job automation, Kenneth Arrow, Loebner Prize, Mark Zuckerberg, meta-analysis, Minecraft, Nate Silver, natural language processing, Nelson Mandela, Nick Bostrom, Northpointe / Correctional Offender Management Profiling for Alternative Sanctions, p-value, post-truth, power law, prediction markets, random walk, Ray Kurzweil, Robert Mercer, selection bias, self-driving car, Silicon Valley, Skype, Snapchat, social contagion, speech recognition, statistical model, Stephen Hawking, Steve Bannon, Steven Pinker, TED Talk, The Signal and the Noise by Nate Silver, traveling salesman, Turing test

He describes the Big Five personality traits; he outlines how surveys can be replaced by Facebook profiles; he claims that the results of one of his regression models reveal our conscientiousness and neuroticism; he talks about how political messages can be targeted to the individual and then he closes by claiming that ‘my model, given your Facebook likes, your age and your gender, can predict how agreeable you are just as well as your spouse’. One day, he says, we might fall in love with a computer that understands us better than our partner. I start to doubt whether the data scientist in the video truly believes what he is saying. I’m not sure he even expects his audience to believe it. His ‘research’ is the product of an eight-week training programme at ASI Data Science, a programme for aspiring data scientists. But even if this is just some sort of practice talk, I am deeply disturbed by what I see. This is a young man with the highest level of scientific training: a PhD in theoretical physics from Cambridge University.

The story of Cambridge Analytica took me deep into a web of blogs and privacy activists’ websites. Following these links, I found my way to a YouTube video of a young data scientist, who now works for Cambridge Analytica, presenting a research project he had carried out when working as an intern at the company. He starts his presentation with a reference to the film Her, in which the lead character Theodore, played by Joaquin Phoenix, falls in love with his operating system (OS). In the film, the computer forms a deep understanding of Theodore’s personality, and the human and the OS fall in love. The young data scientist uses this story to set up his own five-minute presentation: ‘Can a computer ever understand us better than a human?’

Glenn told me that the process of making recommendations is far from a pure science, ‘half of my job is trying to work out which computer-generated responses make sense’. When Glenn chose his job title, he asked to be called ‘data alchemist’ instead of ‘data scientist’. He sees his job not as searching for abstract truths about musical styles, but as providing classifications that make sense to people. This process requires humans and computers to work together. Given the vast scope of ‘Every Noise at Once’, Glenn’s modesty resonated strongly with me. Like many of the data scientists I had spoken to, he saw his job as navigating a very high-dimensional space. But he was the first person I had talked to who openly acknowledged the deeply personal and unknowable dimensions of our minds.


pages: 482 words: 121,173

Tools and Weapons: The Promise and the Peril of the Digital Age by Brad Smith, Carol Ann Browne

"World Economic Forum" Davos, Affordable Care Act / Obamacare, AI winter, air gap, airport security, Alan Greenspan, Albert Einstein, algorithmic bias, augmented reality, autonomous vehicles, barriers to entry, Berlin Wall, Big Tech, Bletchley Park, Blitzscaling, Boeing 737 MAX, business process, call centre, Cambridge Analytica, Celtic Tiger, Charlie Hebdo massacre, chief data officer, cloud computing, computer vision, corporate social responsibility, data science, deep learning, digital divide, disinformation, Donald Trump, Eben Moglen, Edward Snowden, en.wikipedia.org, Hacker News, immigration reform, income inequality, Internet of things, invention of movable type, invention of the telephone, Jeff Bezos, Kevin Roose, Laura Poitras, machine readable, Mark Zuckerberg, minimum viable product, national security letter, natural language processing, Network effects, new economy, Nick Bostrom, off-the-grid, operational security, opioid epidemic / opioid crisis, pattern recognition, precision agriculture, race to the bottom, ransomware, Ronald Reagan, Rubik’s Cube, Salesforce, school vouchers, self-driving car, Sheryl Sandberg, Shoshana Zuboff, Silicon Valley, Skype, speech recognition, Steve Ballmer, Steve Jobs, surveillance capitalism, tech worker, The Rise and Fall of American Growth, Tim Cook: Apple, Wargames Reagan, WikiLeaks, women in the workforce

More than in the past this will require that those who create technology come not only from disciplines such as computer and data science but also from the social and natural sciences and humanities. If we’re to ensure that artificial intelligence makes decisions based on the best that humanity has to offer, its development must result from a multidisciplinary process. And as we think about the future of higher education, we’ll need to make certain that every computer and data scientist is exposed to the liberal arts, just as everyone who majors in the liberal arts will need a dose of computer and data science. We’ll also need to see more focus on ethics in computer and data science courses themselves.

In 2018, we created a dedicated data science team to help us advance our work on key societal issues. We recruited one of Microsoft’s most experienced data scientists, John Kahan, to lead the team. He had led a large team that applied data analytics to track and analyze the company’s sales and product usage, and I had seen first-hand in weekly Senior Leadership Team meetings how this had improved our business performance. He also had a much broader set of interests, based in part on the work he and his team had pursued to use data science to better diagnose the causes of Sudden Infant Death Syndrome, or SIDS, to which John and his wife had lost their infant son, Aaron, more than a decade before.

He also had a much broader set of interests, based in part on the work he and his team had pursued to use data science to better diagnose the causes of Sudden Infant Death Syndrome, or SIDS, to which John and his wife had lost their infant son, Aaron, more than a decade before. Dina Bass, “Bereaved Father, Microsoft Data Scientists Crunch Numbers to Combat Infant Deaths,” Seattle Times, June 11, 2017, https://www.seattletimes.com/business/bereaved-father-microsoft-data-scientists-crunch-numbers-to-combat-infant-deaths/. One of the first projects we gave to the new team was to dig into the concerns we had developed regarding the FCC’s national data map on broadband availability. Within a few months, the team had used multiple data sets to analyze the broadband gap across the country, including data from the FCC and the Pew Research Center, as well as anonymized Microsoft data collected as part of ongoing work to improve the performance and security of our software and services.


pages: 208 words: 57,602

Futureproof: 9 Rules for Humans in the Age of Automation by Kevin Roose

"World Economic Forum" Davos, adjacent possible, Airbnb, Albert Einstein, algorithmic bias, algorithmic management, Alvin Toffler, Amazon Web Services, Atul Gawande, augmented reality, automated trading system, basic income, Bayesian statistics, Big Tech, big-box store, Black Lives Matter, business process, call centre, choice architecture, coronavirus, COVID-19, data science, deep learning, deepfake, DeepMind, disinformation, Elon Musk, Erik Brynjolfsson, factory automation, fake news, fault tolerance, Frederick Winslow Taylor, Freestyle chess, future of work, Future Shock, Geoffrey Hinton, George Floyd, gig economy, Google Hangouts, GPT-3, hiring and firing, hustle culture, hype cycle, income inequality, industrial robot, Jeff Bezos, job automation, John Markoff, Kevin Roose, knowledge worker, Kodak vs Instagram, labor-force participation, lockdown, Lyft, mandatory minimum, Marc Andreessen, Mark Zuckerberg, meta-analysis, Narrative Science, new economy, Norbert Wiener, Northpointe / Correctional Offender Management Profiling for Alternative Sanctions, off-the-grid, OpenAI, pattern recognition, planetary scale, plutocrats, Productivity paradox, QAnon, recommendation engine, remote working, risk tolerance, robotic process automation, scientific management, Second Machine Age, self-driving car, Shoshana Zuboff, Silicon Valley, Silicon Valley startup, social distancing, Steve Jobs, Stuart Kauffman, surveillance capitalism, tech worker, The Future of Employment, The Wealth of Nations by Adam Smith, TikTok, Travis Kalanick, Uber and Lyft, uber lyft, universal basic income, warehouse robotics, Watson beat the top human players on Jeopardy!, work culture

Just by tweaking its algorithms, Netflix can steer users to its original shows, Amazon can steer users to its house brands, and Apple can recommend its own apps in the App Store, even when other apps might be preferable. The power to change users’ preferences at scale has made some technologists uncomfortable. Rachel Schutt, a data scientist, said as much in a 2012 interview with the Times: “Models do not just predict,” she said, “but they can make things happen.” A former product manager at Facebook went even further, telling BuzzFeed News that Facebook’s recommendation algorithms amounted to an attempt to “reprogram humans.” “It’s hard to believe that you could get humans to override all of their values that they came in with,” the former Facebook employee said.

They’re discouraging us from building the kind of personal autonomy that will protect us in the age of AI and automation, by allowing us to think and act for ourselves. And they’re doing it under the guise of helping us. In a 2017 paper about the history of Amazon’s recommendation algorithms, Amazon engineer Brent Smith and Microsoft data scientist Greg Linden sketched out a vision of the AI-driven future that feels, to me, both deeply dystopian and very, very plausible. “Every interaction should reflect who you are and what you like, and help you find what other people like you have already discovered,” they wrote. “It should feel hollow and pathetic when you see something that’s obviously not you; do you not know me by now?”

Near the end of the talk, LeCun made an unexpected prediction about the effects all of this AI and machine learning technology would have on the job market. Despite being a technologist himself, he said that the people with the best chances of coming out ahead in the economy of the future were not programmers and data scientists, but artists and artisans. To illustrate his point, he projected a slide with two photos: one of a Blu-Ray DVD player, which was being sold on Amazon for $47, and another of a handmade ceramic bowl, which was selling for $750. The difference in complexity between the two objects, he said, was stark.


pages: 588 words: 131,025

The Patient Will See You Now: The Future of Medicine Is in Your Hands by Eric Topol

23andMe, 3D printing, Affordable Care Act / Obamacare, Anne Wojcicki, Atul Gawande, augmented reality, Big Tech, bioinformatics, call centre, Clayton Christensen, clean water, cloud computing, commoditize, computer vision, conceptual framework, connected car, correlation does not imply causation, creative destruction, crowdsourcing, dark matter, data acquisition, data science, deep learning, digital divide, disintermediation, disruptive innovation, don't be evil, driverless car, Edward Snowden, Elon Musk, en.wikipedia.org, Erik Brynjolfsson, Firefox, gamification, global village, Google Glasses, Google X / Alphabet X, Ignaz Semmelweis: hand washing, information asymmetry, interchangeable parts, Internet of things, Isaac Newton, it's over 9,000, job automation, Julian Assange, Kevin Kelly, license plate recognition, lifelogging, Lyft, Mark Zuckerberg, Marshall McLuhan, meta-analysis, microbiome, Nate Silver, natural language processing, Network effects, Nicholas Carr, obamacare, pattern recognition, personalized medicine, phenotype, placebo effect, quantum cryptography, RAND corporation, randomized controlled trial, Salesforce, Second Machine Age, self-driving car, Silicon Valley, Skype, smart cities, Smart Cities: Big Data, Civic Hackers, and the Quest for a New Utopia, Snapchat, social graph, speech recognition, stealth mode startup, Steve Jobs, synthetic biology, the scientific method, The Signal and the Noise by Nate Silver, The Wealth of Nations by Adam Smith, traumatic brain injury, Turing test, Uber for X, uber lyft, Watson beat the top human players on Jeopardy!, WikiLeaks, X Prize

While there are a relatively small number of such professionals in a world inundated with data challenges in every sector, health care has not been able to attract enough gifted individuals proportional to the size and importance of this field. The irony now is that data scientists are the ones being referred to as the “high priests.”75b
The iMedicine Galaxy
Patients, companies, employers, doctors, government, data scientists: these major forces are like stars gravitationally bound in a galaxy, the movement of any one of them affecting all the others. What I have tried to convey here is that we have a new galaxy in the making. Just as the printing press was the great object around which modern culture has orbited, the smartphone and iMedicine are forcing a comparable transformation.

That seems a pretty grim prognosis. But medicine is morphing into a data science, now that big data, unsupervised algorithms, predictive analytics, machine learning, augmented reality, and neuromorphic computing are coming in. There’s still an opportunity to change medicine for the better and at least a chance for prevention. That is, if there was a surefire signal before a disease had ever manifested itself in a person—and this information was highly actionable—the individual’s illness might be preempted. This dream isn’t simply one of better data science, however. It is inextricably linked to the democratization of medicine.

Madrigal, “I’m Being Followed: How Google—and 104 Other Companies—Are Tracking Me on the Web,” The Atlantic, February 2014, http://www.theatlantic.com/technology/print/2012/02/im-being-foll%C951-and-104-other-companies-151-are-tracking-me-on-the-web/253758/. 26. S. Wolfram, “Data Science of the Facebook World,” Stephen Wolfram Blog, April 24, 2014, http://blog.stephenwolfram.com/2013/04/data-science-of-the-facebook-world/. 27. D. Mann, “1984 in 2014,” EP Studios Software, April 21, 2014, http://www.epstudiossoftware.com/?p=1411. 28. H. Kelly, “After Boston: The Pros and Cons of Surveillance Cameras,” CNN, April 26, 2013, http://www.cnn.com/2013/04/26/tech/innovation/security-cameras-boston-bombings/. 29.


Scikit-Learn Cookbook by Trent Hauck

bioinformatics, book value, computer vision, data science, information retrieval, p-value

ISBN 978-1-78398-948-5 www.packtpub.com Credits Author Project Coordinator Trent Hauck Harshal Ved Reviewers Proofreaders Anoop Thomas Mathew Simran Bhogal Xingzhong Bridget Braund Amy Johnson Commissioning Editor Kunal Parikh Indexer Tejal Soni Acquisition Editor Owen Roberts Graphics Sheetal Aute Content Development Editor Dayan Hyames Technical Editors Mrunal M. Chavan Dennis John Copy Editors Janbal Dharmaraj Ronak Dhruv Abhinash Sahu Production Coordinator Manu Joseph Cover Work Manu Joseph Sayanee Mukherjee About the Author Trent Hauck is a data scientist living and working in the Seattle area. He grew up in Wichita, Kansas and received his undergraduate and graduate degrees from the University of Kansas. He is the author of the book Instant Data Intensive Apps with pandas How-to, Packt Publishing—a book that can get you up to speed quickly with pandas and other associated technologies.

It's a pretty simple function: $f(t) = \frac{1}{1 + e^{-t}}$
Visually, it looks like the following: [Figure: plot of the logistic function]
Let's use the make_classification method, create a dataset, and get to classifying:
>>> from sklearn.datasets import make_classification
>>> X, y = make_classification(n_samples=1000, n_features=4)
How to do it... The LogisticRegression object works in the same way as the other linear models:
>>> from sklearn.linear_model import LogisticRegression
>>> lr = LogisticRegression()
Since we're good data scientists, we will pull out the last 200 samples to test the trained model on. Since this is a random dataset, it's fine to hold out the last 200; if you're dealing with structured data, don't do this (for example, if you deal with time series data):
>>> X_train = X[:-200]
>>> X_test = X[-200:]
>>> y_train = y[:-200]
>>> y_test = y[-200:]
We'll discuss more on cross-validation later in the book.
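The logistic function and the last-200 hold-out split described above can be sketched without scikit-learn. This is a minimal illustration only: the helper names `sigmoid` and `holdout_last` and the stand-in data are assumptions introduced here, not part of the book or the library.

```python
import math


def sigmoid(t):
    """Logistic function 1 / (1 + e^(-t)), written to avoid overflow for large |t|."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    z = math.exp(t)  # t < 0, so z is in (0, 1) and cannot overflow
    return z / (1.0 + z)


def holdout_last(rows, n_test):
    """Split a sequence into (train, test), keeping the last n_test items for testing."""
    if not 0 < n_test < len(rows):
        raise ValueError("n_test must be between 1 and len(rows) - 1")
    return rows[:-n_test], rows[-n_test:]


# Hypothetical stand-in for the 1000 rows produced by make_classification.
samples = list(range(1000))
train, test = holdout_last(samples, 200)
print(sigmoid(0.0), len(train), len(test))
```

Slicing off the tail is only reasonable when row order carries no information, which is the caveat the excerpt raises for time series data.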

Okay, so now that we've looked at how we can classify points based on distribution, let's look at how we can do this in scikit-learn:
>>> from sklearn.mixture import GMM
>>> gmm = GMM(n_components=2)
>>> X = np.row_stack((class_A, class_B))
>>> y = np.hstack((np.ones(100), np.zeros(100)))
Since we're good little data scientists, we'll create a training set:
>>> train = np.random.choice([True, False], 200)
>>> gmm.fit(X[train])
GMM(covariance_type='diag', init_params='wmc', min_covar=0.001,
    n_components=2, n_init=1, n_iter=100, params='wmc',
    random_state=None, thresh=0.01)
Fitting and predicting is done in the same way as fitting is done for many of the other objects in scikit-learn:
>>> gmm.fit(X[train])
>>> gmm.predict(X[train])[:5]
array([0, 0, 0, 0, 0])
There are other methods worth looking at now that the model has been fit.
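Classifying points based on distribution can also be written out as a formula. The notation below is introduced here for illustration and is not taken from the book: assume $K$ Gaussian components with mixing weights $\pi_k$, means $\mu_k$, and covariances $\Sigma_k$. A fitted mixture model assigns each point to the component with the largest posterior probability:

```latex
% Mixture density over K Gaussian components
p(x) = \sum_{k=1}^{K} \pi_k \, \mathcal{N}(x \mid \mu_k, \Sigma_k),
\qquad \sum_{k=1}^{K} \pi_k = 1

% Posterior probability that x was generated by component k
\gamma_k(x) = \frac{\pi_k \, \mathcal{N}(x \mid \mu_k, \Sigma_k)}
                   {\sum_{j=1}^{K} \pi_j \, \mathcal{N}(x \mid \mu_j, \Sigma_j)}

% Predicted label: the most probable component
\hat{k}(x) = \arg\max_{k} \; \gamma_k(x)
```

With `n_components=2`, as in the excerpt, $K = 2$. The component indices are arbitrary, so they need not line up with the original class labels in `y`.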


pages: 125 words: 27,675

Applied Text Analysis With Python: Enabling Language-Aware Data Products With Machine Learning by Benjamin Bengfort, Rebecca Bilbro, Tony Ojeda

data science, full text search, natural language processing, quantitative easing, sentiment analysis, statistical model, the long tail

By specifically targeting text data about baseball or basketball, we reduce this ambiguity, but we also reduce the overall size of our corpus. This is a significant tradeoff, because we will need a very large corpus in order to provide sufficient training examples to our language models, thus we must find a balance between domain specificity and corpus size.
Data Ingestion of Text
As data scientists, we rely heavily on structure and patterns, not only in the content of our data, but in its history and provenance. In general, good data sources have a determinable structure, where different pieces of content are organized according to some schema and can be extracted systematically via the application of some logic to that schema.

Most modern web and social media services have APIs that developers can access, and they are typically accompanied by documentation with instructions on how to access and obtain the data. Note: As a web service evolves, both the API and the documentation are usually updated as well, and as developers and data scientists, we need to stay current on changes to the APIs we use in our data products. A RESTful API is a type of web service API that adheres to representational state transfer (REST) architectural constraints. REST is a simple way to organize interactions between independent systems, allowing for lightweight interaction with clients such as mobile phones and other websites.

The primary reason they do this is so that they can prevent abuse of their service. Many service providers allow for registration using OAuth, which is an open authentication standard that allows a user’s information to be communicated to a third party without exposing confidential information such as their password. APIs are popular data sources among data scientists because they provide us with a source of ingestion that is authorized, structured, and well-documented. The service provider is giving us permission and access to retrieve and use the data they have in a responsible manner. This isn’t true of crawling/scraping or RSS, and for this reason, obtaining data via API is preferable whenever it is an option.
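An authorized request to a RESTful API might be assembled as in the sketch below, which uses only the Python standard library. Everything specific is an assumption made for illustration: the endpoint `https://api.example.com/v1`, the `posts` resource, the query parameters, the placeholder token, and the helper `build_api_request` are hypothetical. The sketch also assumes the provider accepts an OAuth 2.0 bearer token in the `Authorization` header, which a real service's documentation would need to confirm. The request is built but deliberately not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_api_request(base_url, resource, params, access_token):
    """Build (but do not send) a GET request for a REST resource."""
    url = f"{base_url.rstrip('/')}/{resource.lstrip('/')}"
    query = urlencode(params)  # e.g. {"q": "baseball"} -> "q=baseball"
    if query:
        url = f"{url}?{query}"
    request = Request(url, method="GET")
    # Assumption: the service expects an OAuth 2.0 bearer token.
    request.add_header("Authorization", f"Bearer {access_token}")
    request.add_header("Accept", "application/json")
    return request


# Hypothetical endpoint, parameters, and token, for illustration only.
request = build_api_request(
    "https://api.example.com/v1",
    "posts",
    {"q": "baseball", "limit": 10},
    "HYPOTHETICAL_TOKEN",
)
print(request.get_method(), request.full_url)
```

Passing the result to `urllib.request.urlopen` would send it. Keeping construction separate from sending makes it easy to inspect the URL and headers first, which helps when a provider enforces rate limits.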


pages: 706 words: 202,591

Facebook: The Inside Story by Steven Levy

active measures, Airbnb, Airbus A320, Amazon Mechanical Turk, AOL-Time Warner, Apple's 1984 Super Bowl advert, augmented reality, Ben Horowitz, Benchmark Capital, Big Tech, Black Lives Matter, Blitzscaling, blockchain, Burning Man, business intelligence, Cambridge Analytica, cloud computing, company town, computer vision, crowdsourcing, cryptocurrency, data science, deep learning, disinformation, don't be evil, Donald Trump, Dunbar number, East Village, Edward Snowden, El Camino Real, Elon Musk, end-to-end encryption, fake news, Firefox, Frank Gehry, Geoffrey Hinton, glass ceiling, GPS: selective availability, growth hacking, imposter syndrome, indoor plumbing, information security, Jeff Bezos, John Markoff, Jony Ive, Kevin Kelly, Kickstarter, lock screen, Lyft, machine translation, Mahatma Gandhi, Marc Andreessen, Marc Benioff, Mark Zuckerberg, Max Levchin, Menlo Park, Metcalfe’s law, MITM: man-in-the-middle, move fast and break things, natural language processing, Network effects, Oculus Rift, operational security, PageRank, Paul Buchheit, paypal mafia, Peter Thiel, pets.com, post-work, Ray Kurzweil, recommendation engine, Robert Mercer, Robert Metcalfe, rolodex, Russian election interference, Salesforce, Sam Altman, Sand Hill Road, self-driving car, sexual politics, Sheryl Sandberg, Shoshana Zuboff, side project, Silicon Valley, Silicon Valley startup, skeuomorphism, slashdot, Snapchat, social contagion, social graph, social software, South of Market, San Francisco, Startup school, Steve Ballmer, Steve Bannon, Steve Jobs, Steven Levy, Steven Pinker, surveillance capitalism, tech billionaire, techlash, Tim Cook: Apple, Tragedy of the Commons, web application, WeWork, WikiLeaks, women in the workforce, Y Combinator, Y2K, you are the product

For the 2010 midterm election, Facebook expanded the program, with a prominent button proclaiming “I Voted” visible to users. But not all visitors. Facebook used the midterms to conduct an elaborate experiment. Two of Facebook’s top data scientists, working with researchers at the University of California at San Diego, decided to test whether the voter button actually affected voter turnout. If you saw that your friends voted, would it influence you to do the same? Cameron Marlow, who headed Data Science for Facebook at the time, says the experiment was an innocent exercise: “We had a product that had run in every single election and we were starting to run in other countries’ elections—the goal was to get people out to vote.”

The work still continued—after all, research led to Growth!—but the company did not want to be misunderstood again. “I don’t think that they stopped doing experiments—they just stopped publishing them,” says Cameron Marlow, who headed Data Science but left shortly before the emotion paper was published. “So is that a good thing for society? Probably not.” As I found by informal conversations at a Data Science conference on campus in 2019, though, most of its researchers stuck around. They feel that their work is important.
* * *
BEGINNING IN 2013, Kogan was visiting the Facebook campus regularly. He ate a lot of free lunches.

He became a co-founder of Cloudera, a company that stored data in the Internet cloud, and later became involved in trying to solve cancer with data analysis. Though he felt no animosity toward his former employer, at times he has expressed its motivations in phrases that speak volumes. In 2011, assessing the jobs of data scientists at Facebook and its peers, he made a remark to a BusinessWeek journalist that would reverberate for years: “The best minds of my generation are thinking about how to make people click ads,” he said. “That sucks.”
* * *
IN THE OLD television show Mission: Impossible, each episode began when the leader, Jim, flipped through the dossiers of spies, strongmen, and honeypots, putting rejects in one pile and tossing the photos of the talented ones who were perfect for the mission into a stack on his coffee table.


pages: 388 words: 111,099

Democracy for Sale: Dark Money and Dirty Politics by Peter Geoghegan

4chan, Adam Curtis, Affordable Care Act / Obamacare, American Legislative Exchange Council, anti-globalists, basic income, Berlin Wall, Big Tech, Black Lives Matter, Boris Johnson, Brexit referendum, British Empire, Cambridge Analytica, centre right, corporate raider, crony capitalism, data science, deepfake, deindustrialization, demographic winter, disinformation, Dominic Cummings, Donald Trump, East Village, Etonian, F. W. de Klerk, fake news, first-past-the-post, Francis Fukuyama: the end of history, Frank Gehry, Greta Thunberg, invisible hand, James Dyson, Jeremy Corbyn, John Bercow, Mark Zuckerberg, market fundamentalism, military-industrial complex, moral panic, Naomi Klein, Nelson Mandela, obamacare, offshore financial centre, open borders, Overton Window, Paris climate accords, plutocrats, post-truth, post-war consensus, pre–internet, private military company, Renaissance Technologies, Robert Mercer, Ronald Reagan, Silicon Valley, Snapchat, special economic zone, Steve Bannon, surveillance capitalism, tech billionaire, technoutopianism, Torches of Freedom, universal basic income, WikiLeaks, Yochai Benkler, éminence grise

“We could say, for example, we will target women between 35 and 45 who live in these particular geographical entities, who don’t have a degree,” Cummings later explained. He boasted of using physicists and experts in “quantum information” to crunch voter data. Vote Leave recorded spending over £70,000 with a firm called Advanced Skills Initiative.47 The company is better known as ASI Data Science, a tech start-up that marketed itself as a world leader in artificial intelligence, and which employed a number of data scientists that worked for Cambridge Analytica. Cummings also came up with clever ruses to find data on the “missing three million”. He ran an online competition during the 2016 European Football Championship. Win £50 million, the advertisement proclaimed, by successfully predicting the winner of all 51 games.

See also https://firstdraftnews.org/latest/thousands-of-misleading-conservative-ads-side-step-scrutiny-thanks-to-facebook-policy/; accessed 26 Jan. 2020. 44 Ella Hollowood and Matthew D’Ancona, ‘Big little lies’, Tortoise, December 2019. See also https://members.tortoisemedia.com/2019/12/11/lies-191211/content.html; accessed 26 Jan. 2020. 45 Dominic Cummings, ‘“Two hands are a lot” – we’re hiring data scientists, project managers, policy experts, assorted weirdos…’, Dominic Cummings’s Blog, January 2020. See also https://dominiccummings.com/2020/01/02/two-hands-are-a-lot-were-hiring-data-scientists-project-managers-policy-experts-assorted-weirdos/; accessed 26 Jan. 2020. 46 Rowland Manthorpe, ‘General election: WhatsApp messages urge British Hindus to vote against Labour’, Sky News, November 2019. 47 Jim Waterson, ‘What we learned about the media this election’, Guardian, December 2019. 48 James Cusick, ‘New evidence that LibDems sold voter data for £100,000 held back till after election’, openDemocracy, November 2019. 49 Rowland Manthorpe, ‘Data protection experts want watchdog to investigate Conservative and Labour parties’, Sky News, October 2019. 50 ‘General election 2019: Zac Goldsmith loses seat to Lib Dems again’, BBC, December 2019. 51 Kate Proctor, ‘Johnson accused of “rewarding racism” after Zac Goldsmith peerage’, Guardian, December 2019. 52 Isobel Thompson, ‘How Irish anti-abortion activists are drawing on Brexit and Trump campaigns to influence referendum’, openDemocracy, May 2018. 53 ‘Republicans Overseas UK: An Evening with GOP Strategist Matt Mackowiak’, Republicans Overseas UK.

Dominic Cummings Said It’s “TOP PRIORITY”’, Buzzfeed, September 2019. 3 William Norton, White Elephant: How the North East Said No (London, 2008), p. 200. 4 Dominic Cummings, ‘On the referendum #20: the campaign, physics and data science – Vote Leave’s ‘Voter Intention Collection System’ (VICS) now available for all’, Dominic Cummings’s Blog, October 2016. See also https://dominiccummings.com/2016/10/29/on-the-referendum-20-the-campaign-physics-and-data-science-vote-leaves-voter-intention-collection-system-vics-now-available-for-all/; accessed 19 Jan. 2020. 5 Alice Thomson and Rachel Sylvester, ‘Sir Nicholas Soames interview: “Johnson is nothing like Churchill and Jacob Rees-Mogg is an absolute fraud”’, The Times, September 2019. 6 Sam Knight, ‘The man who brought you Brexit’, Guardian, September 2016. 7 Tim Shipman, All Out War: The Full Story of Brexit (London, 2017), p. 27. 8 George Eaton, ‘Vote Leave head Matthew Elliott: “The Brexiteers won the battle but we could lose the war”’, New Statesman, September 2018. 9 Chloe Farand and Mat Hope, ‘Matthew and Sarah Elliott: How a UK Power Couple Links US Libertarians and Fossil Fuel Lobbyists to Brexit’, DeSmog UK, November 2018. 10 Robert Booth, ‘Who is behind the Taxpayers’ Alliance?’


pages: 344 words: 96,020

Hacking Growth: How Today's Fastest-Growing Companies Drive Breakout Success by Sean Ellis, Morgan Brown

Airbnb, Amazon Web Services, barriers to entry, behavioural economics, Ben Horowitz, bounce rate, business intelligence, business process, content marketing, correlation does not imply causation, crowdsourcing, dark pattern, data science, DevOps, disruptive innovation, Elon Musk, game design, gamification, Google Glasses, growth hacking, Internet of things, inventory management, iterative process, Jeff Bezos, Khan Academy, Kickstarter, Lean Startup, Lyft, Mark Zuckerberg, market design, minimum viable product, multi-armed bandit, Network effects, Paul Graham, Peter Thiel, Ponzi scheme, recommendation engine, ride hailing / ride sharing, Salesforce, Sheryl Sandberg, side project, Silicon Valley, Silicon Valley startup, Skype, Snapchat, software as a service, Steve Jobs, Steve Jurvetson, subscription business, TED Talk, Travis Kalanick, Uber and Lyft, Uber for X, uber lyft, working poor, Y Combinator, young professional

The growth hacking practices innovated by these early practitioners and others who have followed have been honed into a finely tuned business methodology—and spawned a powerful movement with hundreds of thousands of practitioners (and growing) across the globe. This vibrant community of growth hackers includes entrepreneurs, marketers, engineers, product managers, data scientists, and more, not just from the tech start-up world but from all walks of industry, from technology, retail, business-to-business, professional services, entertainment, and even the political arena. And while the details of how it is implemented vary somewhat from company to company, the core elements of the method are:
• the creation of a cross-functional team, or a set of teams that break down the traditional silos of marketing and product development and combine talents;
• the use of qualitative research and quantitative data analysis to gain deep insights into user behavior and preferences; and
• the rapid generation and testing of ideas, and the use of rigorous metrics to evaluate—and then act on—those results.

Recognizing that Walmart’s greatest asset is its data, Brian Monahan, the company’s former VP of marketing, pushed forward a unification of the company’s data platforms across all divisions, one that would allow all teams, from engineering, to merchandising, to marketing, and even external agencies and suppliers, to capitalize on the data generated and collected. Growth hacking cultivates the maximization of big data through collaboration and information sharing. Monahan highlighted the business need this approach solves: “You need marketers who can appreciate what it takes to actually write software and you need data scientists who can really appreciate consumer insights and understand business problems,” he explained.19
THE RISING COSTS AND DUBIOUS RETURNS OF TRADITIONAL MARKETING
The techniques of traditional marketing—both print and television advertising, and the newer online versions that have become essential parts of the traditional marketing toolkit—are in crisis, as markets are becoming more and more fragmented and ephemeral, while advertising is becoming both more expensive and less viewed.

Depending on the degree of sophistication of the experiments a team is running, it might be possible for the marketing or engineering team member to play this role, as in both of those fields, a certain level of data analytics aptitude has become important. At more technically advanced companies, analysts with expertise in reporting of experiments as well as data scientists, who are mining for deep insight, should both play a role. What is essential is that data analysis not be farmed out to the intern who knows how to use Google Analytics or to a digital agency, to cite extreme but all too frequent realities. As we will discuss in detail coming up in Chapter Three, too many companies do not place enough emphasis on data analysis, and rely too heavily on prepackaged programs, such as Google Analytics, with limited capacity for combining various pools of data, such as from sales and from customer service, and limited ability to delve into that data to make discoveries.


pages: 349 words: 98,868

Nervous States: Democracy and the Decline of Reason by William Davies

active measures, Affordable Care Act / Obamacare, Amazon Web Services, Anthropocene, bank run, banking crisis, basic income, Black Lives Matter, Brexit referendum, business cycle, Cambridge Analytica, Capital in the Twenty-First Century by Thomas Piketty, citizen journalism, Climategate, Climatic Research Unit, Colonization of Mars, continuation of politics by other means, creative destruction, credit crunch, data science, decarbonisation, deep learning, DeepMind, deindustrialization, digital divide, discovery of penicillin, Dominic Cummings, Donald Trump, drone strike, Elon Musk, failed state, fake news, Filter Bubble, first-past-the-post, Frank Gehry, gig economy, government statistician, housing crisis, income inequality, Isaac Newton, Jeff Bezos, Jeremy Corbyn, Johannes Kepler, Joseph Schumpeter, knowledge economy, loss aversion, low skilled workers, Mahatma Gandhi, Mark Zuckerberg, mass immigration, meta-analysis, Mont Pelerin Society, mutually assured destruction, Northern Rock, obamacare, Occupy movement, opioid epidemic / opioid crisis, Paris climate accords, pattern recognition, Peace of Westphalia, Peter Thiel, Philip Mirowski, planetary scale, post-industrial society, post-truth, quantitative easing, RAND corporation, Ray Kurzweil, Richard Florida, road to serfdom, Robert Mercer, Ronald Reagan, sentiment analysis, Silicon Valley, Silicon Valley billionaire, Silicon Valley startup, smart cities, Social Justice Warrior, statistical model, Steve Bannon, Steve Jobs, tacit knowledge, the scientific method, Turing machine, Uber for X, universal basic income, University of East Anglia, Valery Gerasimov, W. E. B. Du Bois, We are the 99%, WikiLeaks, women in the workforce, zero-sum game

Behind the scenes, this is gobbled up and mathematically processed. As the math has become more and more sophisticated, the user no longer even experiences it as mathematical. From science to data science A curiosity of big-data analytics is that its specialists are relatively uninterested in whether the data is generated by people, particles in the atmosphere, cars, financial prices, or bacteria. Data scientists are more often trained in mathematics or physics than in social science. They generate knowledge about our behavior, but they don’t profess any expertise about people, or shopping, or finance, or cities.

They don’t study nature or society, in the way that the archetypal expert does, but seek patterns in data that computers have already captured. As opposed to a scientist, a data scientist might better be compared to a librarian, someone who is skilled in navigating a vast collection of already-recorded information. The difference is that the data archive is growing at great speed, thanks to the mass of nonhuman sensors that gather it, and can only be sifted algorithmically. Take the example of psychology. Data science reveals a great deal that is of interest to psychologists, given the ability of algorithms to detect emotions, behaviors, and anxieties across populations.

The analyst’s value lies in pruning vast quantities of useless data, leaving only that which deserves our attention.16 But if they lack any intrinsic interest in the topic at hand (other than the mathematics), they also have no view of their own regarding what “something meaningful” means—they are therefore in the service of a client. Alternatively, their biases and assumptions creep in, without being consciously reflected on or criticized.17 The clients for data science are multiplying all the time. “Quants” can make big money working for Wall Street banks and hedge funds, building algorithms to analyze price movements. “Smart city” projects depend on data scientists to extract patterns of activity from the frenetic movements of urban populations, resources, and transport. Firms such as Peter Thiel’s Palantir help security services identify potential security threats, by isolating dangerous patterns of behavior.


Data and the City by Rob Kitchin,Tracey P. Lauriault,Gavin McArdle

A Declaration of the Independence of Cyberspace, algorithmic management, bike sharing, bitcoin, blockchain, Bretton Woods, Chelsea Manning, citizen journalism, Claude Shannon: information theory, clean water, cloud computing, complexity theory, conceptual framework, corporate governance, correlation does not imply causation, create, read, update, delete, crowdsourcing, cryptocurrency, data science, dematerialisation, digital divide, digital map, digital rights, distributed ledger, Evgeny Morozov, fault tolerance, fiat currency, Filter Bubble, floating exchange rates, folksonomy, functional programming, global value chain, Google Earth, Hacker News, hive mind, information security, Internet of things, Kickstarter, knowledge economy, Lewis Mumford, lifelogging, linked data, loose coupling, machine readable, new economy, New Urbanism, Nicholas Carr, nowcasting, open economy, openstreetmap, OSI model, packet switching, pattern recognition, performance metric, place-making, power law, quantum entanglement, RAND corporation, RFID, Richard Florida, ride hailing / ride sharing, semantic web, sentiment analysis, sharing economy, Silicon Valley, Skype, smart cities, Smart Cities: Big Data, Civic Hackers, and the Quest for a New Utopia, smart contracts, smart grid, smart meter, social graph, software studies, statistical model, tacit knowledge, TaskRabbit, technological determinism, technological solutionism, text mining, The Chicago School, The Death and Life of Great American Cities, the long tail, the market place, the medium is the message, the scientific method, Toyota Production System, urban planning, urban sprawl, web application

Since the architecture provides flexible techniques for data and analysis sharing, communications within organizations, between organizations and between citizens and organizations can be improved. This is an important requirement for cities and citizens. With the rise of the citizen data scientist and the corporate use of urban data, sharing data about cities using various bindings is advantageous. The architecture can also communicate efficiently with big data technologies. Current city dashboards are a useful tool for now-casting. With big data technologies and data science, cities need other systems that share data using polyglot bindings and that use predictive analytics to provide forward-looking (forecasting) indicators and metrics about the city.

British Academy (2012) ‘Society Counts – Quantitative Studies in the Social Sciences and Humanities’, A British Academy Position Statement available from: www.britac.ac.uk/policy/Society_Counts.cfm [accessed 9 December 2016]. Burkert, H. (1992) ‘The legal framework of public sector information: Recent legal policy developments in the EC’, Government Publications Review 19(5): 483–496. Cabinet Office (2015) ‘Open policy making toolkit: Data science’, available from: www.gov.uk/open-policy-making-toolkit-data-science [accessed 9 December 2016]. Clarke, R. (1988) ‘Information technology and dataveillance’, Communications of the ACM 31(5): 498–512. Crampton, J. and Krygier, J. (2005) ‘An introduction to critical cartography’, ACME: An International E-Journal for Critical Geographies 4(1), available from: http://ojs.unbc.ca/index.php/acme/article/view/723/585 [accessed 9 December 2016].

Moreover, further research is required to understand how data influence digital labour, investigating issues such as how institutional and organizational structures change with the introduction of new databased regimes, how data ecosystems change government and corporate work practices, and how the database managers and data scientists become more important within institutions with their knowledge and expertise becoming privileged over others. Epistemology As the chapters make clear, there are a diverse set of epistemologies being deployed to make sense of urban data, data-driven systems, and the relationship between data and the city.


Beautiful Data: The Stories Behind Elegant Data Solutions by Toby Segaran, Jeff Hammerbacher

23andMe, airport security, Amazon Mechanical Turk, bioinformatics, Black Swan, business intelligence, card file, cloud computing, computer vision, correlation coefficient, correlation does not imply causation, crowdsourcing, Daniel Kahneman / Amos Tversky, DARPA: Urban Challenge, data acquisition, data science, database schema, double helix, en.wikipedia.org, epigenetics, fault tolerance, Firefox, Gregor Mendel, Hans Rosling, housing crisis, information retrieval, lake wobegon effect, Large Hadron Collider, longitudinal study, machine readable, machine translation, Mars Rover, natural language processing, openstreetmap, Paradox of Choice, power law, prediction markets, profit motive, semantic web, sentiment analysis, Simon Singh, social bookmarking, social graph, SPARQL, sparse data, speech recognition, statistical model, supply-chain management, systematic bias, TED Talk, text mining, the long tail, Vernor Vinge, web application

The most critical human component for accelerating the learning process and making use of the Information Platform is taking the shape of a new role: the Data Scientist. The Data Scientist In a recent interview, Hal Varian, Google’s chief economist, highlighted the need for employees able to extract information from the Information Platforms described earlier. As Varian puts it, “find something where you provide a scarce, complementary service to something that is getting ubiquitous and cheap. So what’s getting ubiquitous and cheap? Data. And what is complementary to data? Analysis.” At Facebook, we felt that traditional titles such as Business Analyst, Statistician, Engineer, and Research Scientist didn’t quite capture what we were after for our team.

The workload for the role was diverse: on any given day, a team member could author a multistage processing pipeline in Python, design a hypothesis test, perform a regression analysis over data samples with R, design and implement an algorithm for some data-intensive product or service in Hadoop, or communicate the results of our analyses to other members of the organization in a clear and concise fashion. To capture the skill set required to perform this multitude of tasks, we created the role of “Data Scientist.” In the financial services domain, large data stores of past market activity are built to serve as the proving ground for complex new models developed by the Data Scientists of their domain, known as Quants. Outside of industry, I’ve found that grad students in many scientific domains are playing the role of the Data Scientist. One of our hires for the Facebook Data team came from a bioinformatics lab where he was building data pipelines and performing offline data analysis of a similar kind.

Recent books such as Davenport and Harris’s Competing on Analytics (Harvard Business School Press, 2007), Baker’s The Numerati (Houghton Mifflin Harcourt, 2008), and Ayres’s Super Crunchers (Bantam, 2008) have emphasized the critical role of the Data Scientist across industries in enabling an organization to improve over time based on the information it collects. In conjunction with the research community’s investigation of dataspaces, further definition for the role of the Data Scientist is needed over the coming years. By better articulating the role, we’ll be able to construct training curricula, formulate promotion hierarchies, organize conferences, write books, and fill in all of the other trappings of a recognized profession. In the process, the pool of available Data Scientists will expand to meet the growing need for expert pilots for the rapidly proliferating Information Platforms, further speeding the learning process across all organizations.


pages: 442 words: 94,734

The Art of Statistics: Learning From Data by David Spiegelhalter

Abraham Wald, algorithmic bias, Anthropocene, Antoine Gombaud: Chevalier de Méré, Bayesian statistics, Brexit referendum, Carmen Reinhart, Charles Babbage, complexity theory, computer vision, confounding variable, correlation coefficient, correlation does not imply causation, dark matter, data science, deep learning, DeepMind, Edmond Halley, Estimating the Reproducibility of Psychological Science, government statistician, Gregor Mendel, Hans Rosling, Higgs boson, Kenneth Rogoff, meta-analysis, Nate Silver, Netflix Prize, Northpointe / Correctional Offender Management Profiling for Alternative Sanctions, p-value, placebo effect, probability theory / Blaise Pascal / Pierre de Fermat, publication bias, randomized controlled trial, recommendation engine, replication crisis, self-driving car, seminal paper, sparse data, speech recognition, statistical model, sugar pill, systematic bias, TED Talk, The Design of Experiments, The Signal and the Noise by Nate Silver, The Wisdom of Crowds, Thomas Bayes, Thomas Malthus, Two Sigma

This common view of statistics as a basic ‘bag of tools’ is now facing major challenges. First, we are in an age of data science, in which large and complex data sets are collected from routine sources such as traffic monitors, social media posts and internet purchases, and used as a basis for technological innovations such as optimizing travel routes, targeted advertising or purchase recommendation systems – we shall look at algorithms based on ‘big data’ in Chapter 6. Statistical training is increasingly seen as just one necessary component of being a data scientist, together with skills in data management, programming and algorithm development, as well as proper knowledge of the subject matter.

cox regression: See hazard ratio. data literacy: the ability to understand the principles behind learning from data, carry out basic data analyses, and critique the quality of claims made on the basis of data. data science: the study and application of techniques for deriving insights from data, including constructing algorithms for prediction. Traditional statistical science forms part of data science, which also includes a strong element of coding and data management. deep learning: a machine-learning technique that extends standard artificial neural network models to many layers representing different levels of abstraction, say going from individual pixels of an image through to recognition of objects.

Any conclusions generally raise more questions, and so the cycle starts over again, as when we started looking at the time of day when Shipman’s patients died. Although in practice the PPDAC cycle laid out in Figure 0.3 may not be followed precisely, it underscores that formal techniques for statistical analysis play only one part in the work of a statistician or data scientist. Statistical science is a lot more than a branch of mathematics involving esoteric formulae with which generations of students have (often reluctantly) struggled. This Book When I was a student in Britain in the 1970s, there were just three TV channels, computers were the size of a double wardrobe, and the closest thing we had to Wikipedia was on the imaginary handheld device in Douglas Adams’ (remarkably prescient) Hitchhiker’s Guide to the Galaxy.


pages: 404 words: 92,713

The Art of Statistics: How to Learn From Data by David Spiegelhalter

Abraham Wald, algorithmic bias, Antoine Gombaud: Chevalier de Méré, Bayesian statistics, Brexit referendum, Carmen Reinhart, Charles Babbage, complexity theory, computer vision, confounding variable, correlation coefficient, correlation does not imply causation, dark matter, data science, deep learning, DeepMind, Edmond Halley, Estimating the Reproducibility of Psychological Science, government statistician, Gregor Mendel, Hans Rosling, Higgs boson, Kenneth Rogoff, meta-analysis, Nate Silver, Netflix Prize, Northpointe / Correctional Offender Management Profiling for Alternative Sanctions, p-value, placebo effect, probability theory / Blaise Pascal / Pierre de Fermat, publication bias, randomized controlled trial, recommendation engine, replication crisis, self-driving car, seminal paper, sparse data, speech recognition, statistical model, sugar pill, systematic bias, TED Talk, The Design of Experiments, The Signal and the Noise by Nate Silver, The Wisdom of Crowds, Thomas Bayes, Thomas Malthus, Two Sigma

This common view of statistics as a basic ‘bag of tools’ is now facing major challenges. First, we are in an age of data science, in which large and complex data sets are collected from routine sources such as traffic monitors, social media posts and internet purchases, and used as a basis for technological innovations such as optimizing travel routes, targeted advertising or purchase recommendation systems—we shall look at algorithms based on ‘big data’ in Chapter 6. Statistical training is increasingly seen as just one necessary component of being a data scientist, together with skills in data management, programming and algorithm development, as well as proper knowledge of the subject matter.

cox regression: See hazard ratio. data literacy: the ability to understand the principles behind learning from data, carry out basic data analyses, and critique the quality of claims made on the basis of data. data science: the study and application of techniques for deriving insights from data, including constructing algorithms for prediction. Traditional statistical science forms part of data science, which also includes a strong element of coding and data management. deep learning: a machine-learning technique that extends standard artificial neural network models to many layers representing different levels of abstraction, say going from individual pixels of an image through to recognition of objects.

Any conclusions generally raise more questions, and so the cycle starts over again, as when we started looking at the time of day when Shipman’s patients died. Although in practice the PPDAC cycle laid out in Figure 0.3 may not be followed precisely, it underscores that formal techniques for statistical analysis play only one part in the work of a statistician or data scientist. Statistical science is a lot more than a branch of mathematics involving esoteric formulae with which generations of students have (often reluctantly) struggled. This Book When I was a student in Britain in the 1970s, there were just three TV channels, computers were the size of a double wardrobe, and the closest thing we had to Wikipedia was on the imaginary handheld device in Douglas Adams’ (remarkably prescient) Hitchhiker’s Guide to the Galaxy.


Calling Bullshit: The Art of Scepticism in a Data-Driven World by Jevin D. West, Carl T. Bergstrom

airport security, algorithmic bias, AlphaGo, Amazon Mechanical Turk, Andrew Wiles, Anthropocene, autism spectrum disorder, bitcoin, Charles Babbage, cloud computing, computer vision, content marketing, correlation coefficient, correlation does not imply causation, crowdsourcing, cryptocurrency, data science, deep learning, deepfake, delayed gratification, disinformation, Dmitri Mendeleev, Donald Trump, Elon Musk, epigenetics, Estimating the Reproducibility of Psychological Science, experimental economics, fake news, Ford Model T, Goodhart's law, Helicobacter pylori, Higgs boson, invention of the printing press, John Markoff, Large Hadron Collider, longitudinal study, Lyft, machine translation, meta-analysis, new economy, nowcasting, opioid epidemic / opioid crisis, p-value, Pluto: dwarf planet, publication bias, RAND corporation, randomized controlled trial, replication crisis, ride hailing / ride sharing, Ronald Reagan, selection bias, self-driving car, Silicon Valley, Silicon Valley startup, social graph, Socratic dialogue, Stanford marshmallow experiment, statistical model, stem cell, superintelligent machines, systematic bias, tech bro, TED Talk, the long tail, the scientific method, theory of mind, Tim Cook: Apple, twin studies, Uber and Lyft, Uber for X, uber lyft, When a measure becomes a target

* * * — WE HAVE DEVOTED OUR careers to teaching students how to think logically and quantitatively about data. This book emerged from a course we teach at the University of Washington, also titled “Calling Bullshit.” We hope it will show you that you do not need to be a professional statistician or econometrician or data scientist to think critically about quantitative arguments, nor do you need extensive data sets and weeks of effort to see through bullshit. It is often sufficient to apply basic logical reasoning to a problem and, where needed, augment that with information readily discovered via search engine. We have civic motives for wanting to help people spot and refute bullshit.

If you do know some of these things, you still probably don’t remember all of the details. We, the authors, use statistics on a daily basis, but we still have to look up this sort of stuff all the time. As a result, you can’t unpack the black box; you can’t go into the details of the analysis in order to pick apart possible problems. Unless you’re a data scientist, and probably even then, you run into the same kind of problem you encounter when you read about a paper that uses the newest ResNet algorithm to reveal differences in the facial features of dog and cat owners. Whether or not this is intentional on the part of the author, this kind of black box shields the claim against scrutiny.

If we continue the time series across the intervening years since Vigen published this figure, the correlation completely falls apart. Vigen finds his spurious correlation examples by collecting a large number of data sets about how things change over time. Then he uses a computer program to compare each trend with every other trend. This is an extreme form of what data scientists call data dredging. With a mere one hundred data series, one can compare nearly ten thousand pairs. Some of these pairs are going to show very similar trends—and thus high correlations—just by chance. For example, check out the correlation between the numbers of deaths caused by anticoagulants and the number of sociology degrees awarded in the US: You look at these two trends and think, Wow—what are the chances they would line up that well?
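The data-dredging procedure described in this excerpt can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Vigen's actual program or data: the series are synthetic random walks, and the names `random_walk`, `pearson`, and `dredge`, the series length of 15 points, and the 0.9 cutoff are all hypothetical choices made for the example. Counting each unordered pair once, one hundred series give 4,950 comparisons; counting both orderings gives 9,900, which matches the "nearly ten thousand" figure.

```python
import random
from itertools import combinations
from math import sqrt


def random_walk(n_steps, rng):
    """Return a synthetic series that is unrelated to any other series."""
    level, series = 0.0, []
    for _ in range(n_steps):
        level += rng.gauss(0, 1)  # each step is independent noise
        series.append(level)
    return series


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def dredge(n_series=100, n_steps=15, threshold=0.9, seed=0):
    """Compare every series with every other and count strong correlations."""
    rng = random.Random(seed)
    data = [random_walk(n_steps, rng) for _ in range(n_series)]
    pairs = list(combinations(range(n_series), 2))  # each unordered pair once
    hits = [
        (i, j) for i, j in pairs
        if abs(pearson(data[i], data[j])) >= threshold
    ]
    return len(pairs), len(hits)


n_pairs, n_hits = dredge()
print(f"{n_pairs} distinct pairs compared, {n_hits} with |r| >= 0.9")
```

Because every series here is generated independently, any pair that clears the threshold is a chance match by construction. Short, trending series make such matches more likely, which is why the sketch uses random walks rather than uncorrelated noise.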


pages: 475 words: 134,707

The Hype Machine: How Social Media Disrupts Our Elections, Our Economy, and Our Health--And How We Must Adapt by Sinan Aral

Airbnb, Albert Einstein, algorithmic bias, AlphaGo, Any sufficiently advanced technology is indistinguishable from magic, AOL-Time Warner, augmented reality, behavioural economics, Bernie Sanders, Big Tech, bitcoin, Black Lives Matter, Cambridge Analytica, carbon footprint, Cass Sunstein, computer vision, contact tracing, coronavirus, correlation does not imply causation, COVID-19, crowdsourcing, cryptocurrency, data science, death of newspapers, deep learning, deepfake, digital divide, digital nomad, disinformation, disintermediation, Donald Trump, Drosophila, Edward Snowden, Elon Musk, en.wikipedia.org, end-to-end encryption, Erik Brynjolfsson, experimental subject, facts on the ground, fake news, Filter Bubble, George Floyd, global pandemic, hive mind, illegal immigration, income inequality, Kickstarter, knowledge worker, lockdown, longitudinal study, low skilled workers, Lyft, Mahatma Gandhi, Mark Zuckerberg, Menlo Park, meta-analysis, Metcalfe’s law, mobile money, move fast and break things, multi-sided market, Nate Silver, natural language processing, Neal Stephenson, Network effects, performance metric, phenotype, recommendation engine, Robert Bork, Robert Shiller, Russian election interference, Second Machine Age, seminal paper, sentiment analysis, shareholder value, Sheryl Sandberg, skunkworks, Snapchat, social contagion, social distancing, social graph, social intelligence, social software, social web, statistical model, stem cell, Stephen Hawking, Steve Bannon, Steve Jobs, Steve Jurvetson, surveillance capitalism, Susan Wojcicki, Telecommunications Act of 1996, The Chicago School, the strength of weak ties, The Wisdom of Crowds, theory of mind, TikTok, Tim Cook: Apple, Uber and Lyft, uber lyft, WikiLeaks, work culture, Yogi Berra

Whether you simply want to understand the how and why, or need to make business or policy decisions, this is a must-read!” —FOSTER PROVOST, NYU Stern School of Business, author of Data Science for Business “Sinan Aral is a scientist and entrepreneur, and his unique perspective makes him the perfect guide to the world we live in today. From ads to fake news, The Hype Machine is the best critical foundation for understanding the connected world and how we might navigate through it to a better future.” —HILARY MASON, founder and CEO of Fast Forward Labs, data scientist in residence at Accel, and former chief scientist at Bitly “If you want the truth about falsehoods, real information about misinformation, and rigorous analysis of hype, this is the book for you.

Moreover, while 92 percent of consumers read reviews, only 6 percent write reviews, which means a vocal minority is influencing the opinions of the overwhelming majority. The potential consequences of ratings herding are significant because the 6 percent have an outsize impact on how the rest of us shop. Sean Taylor, my PhD student at the time, who’s now a senior data scientist at Lyft and the former head of the statistics team in Facebook’s Core Data Science group, overheard my conversation with Lev and wandered across the hall. “Hey, what are you guys talking about?” This is how social science starts—it’s sparked by everyday puzzles that evolve into investigations of how and why things happen. Lev, Sean, and I took this nagging question about herd behavior and embarked on a research project to uncover the truth about population-scale opinion dynamics in the crowd.

He’s the David Austin Professor of Management, Marketing, IT, and Data Science at MIT, director of the MIT Initiative on the Digital Economy, and head of MIT’s Social Analytics Lab. He was the chief scientist of Social Amp and Humin before co-founding Manifest Capital, a VC fund that grows startups into the Hype Machine. Aral has worked closely with Facebook, Yahoo!, Twitter, LinkedIn, Snapchat, WeChat, and The New York Times, among others, and currently serves on the advisory boards of the Alan Turing Institute, the British national institute for data science, in London; the Centre for Responsible Media Technology and Innovation in Norway; and C6 Bank, one of the first all-digital banks of Brazil.


pages: 396 words: 117,149

The Master Algorithm: How the Quest for the Ultimate Learning Machine Will Remake Our World by Pedro Domingos

Albert Einstein, Amazon Mechanical Turk, Arthur Eddington, backpropagation, basic income, Bayesian statistics, Benoit Mandelbrot, bioinformatics, Black Swan, Brownian motion, cellular automata, Charles Babbage, Claude Shannon: information theory, combinatorial explosion, computer vision, constrained optimization, correlation does not imply causation, creative destruction, crowdsourcing, Danny Hillis, data is not the new oil, data is the new oil, data science, deep learning, DeepMind, double helix, Douglas Hofstadter, driverless car, Erik Brynjolfsson, experimental subject, Filter Bubble, future of work, Geoffrey Hinton, global village, Google Glasses, Gödel, Escher, Bach, Hans Moravec, incognito mode, information retrieval, Jeff Hawkins, job automation, John Markoff, John Snow's cholera map, John von Neumann, Joseph Schumpeter, Kevin Kelly, large language model, lone genius, machine translation, mandelbrot fractal, Mark Zuckerberg, Moneyball by Michael Lewis explains big data, Narrative Science, Nate Silver, natural language processing, Netflix Prize, Network effects, Nick Bostrom, NP-complete, off grid, P = NP, PageRank, pattern recognition, phenotype, planetary scale, power law, pre–internet, random walk, Ray Kurzweil, recommendation engine, Richard Feynman, scientific worldview, Second Machine Age, self-driving car, Silicon Valley, social intelligence, speech recognition, Stanford marshmallow experiment, statistical model, Stephen Hawking, Steven Levy, Steven Pinker, superintelligent machines, the long tail, the scientific method, The Signal and the Noise by Nate Silver, theory of mind, Thomas Bayes, transaction costs, Turing machine, Turing test, Vernor Vinge, Watson beat the top human players on Jeopardy!, white flight, yottabyte, zero-sum game

., 91, 94–95 Connectomics, 118–119 Consciousness, 96 Consilience (Wilson), 31 Constrained optimization, 193–195, 241, 242 Constraints, support vector machines and, 193–195 Convolutional neural networks, 117–119, 303 Cope, David, 199, 307 Cornell University, Creative Machines Lab, 121–122 Cortex, 118, 138 unity of, 26–28, 299–300 Counterexamples, 67 Cover, Tom, 185 Crawlers, 8–9 Creative Machines Lab, 121–122 Credit-assignment problem, 102, 104, 107, 127 Crick, Francis, 122, 236 Crossover, 124–125, 134–136, 241, 243 Curse of dimensionality, 186–190, 196, 201, 307 Cyber Command, 19 Cyberwar, 19–21, 279–282, 299, 310 Cyc project, 35, 300 DARPA, 21, 37, 113, 121, 255 Darwin, Charles, 28, 30, 131, 235 algorithm, 122–128 analogy and, 178 Hume and, 58 on lack of mathematical ability, 127 on selective breeding, 123–124 variation and, 124 Data accuracy of held-out, 75–76 Bayes’ theorem and, 31–32 control of, 45 first principal component of the, 214 human intuition and, 39 learning from finite, 24–25 Master Algorithm and, 25–26 patterns in, 70–75 sciences and complex, 14 as strategic asset for business, 13 theory and, 46 See also Big data; Overfitting; Personal data Database engine, 49–50 Databases, 8, 9 Data mining, 8, 73, 232–233, 298, 306. See also Machine learning Data science, 8. 
See also Machine learning Data scientist, 9 Data sharing, 270–276 Data unions, 274–275 Dawkins, Richard, 284 Decision making, artificial intelligence and, 282–286 Decision theory, 165 Decision tree induction, 85–89 Decision tree learners, 24, 301 Decision trees, 24, 85–90, 181–182, 188, 237–238 Deduction induction as inverse of, 80–83, 301 Turing machine and, 34 Deductive reasoning, 80–81 Deep learning, 104, 115–118, 172, 195, 241, 302 DeepMind, 222 Democracy, machine learning and, 18–19 Dempster, Arthur, 209 Dendrites, 95 Descartes, René, 58, 64 Descriptive theories, normative theories vs., 141–142, 304 Determinism, Laplace and, 145 Developmental psychology, 203–204, 308 DiCaprio, Leonardo, 177 Diderot, Denis, 63 Diffusion equation, 30 Dimensionality, curse of, 186–190, 307 Dimensionality reduction, 189–190, 211–215, 255 nonlinear, 215–217 Dirty Harry (film), 65 Disney animators, S curves and, 106 “Divide and conquer” algorithm, 77–78, 80, 81, 87 DNA sequencers, 84 Downweighting attributes, 189 Driverless cars, 8, 113, 166, 172, 306 Drones, 21, 281 Drugs, 15, 41–42, 83.

If you’re curious what all the hubbub surrounding big data and machine learning is about and suspect that there’s something deeper going on than what you see in the papers, you’re right! This book is your guide to the revolution. If your main interest is in the business uses of machine learning, this book can help you in at least six ways: to become a savvier consumer of analytics; to make the most of your data scientists; to avoid the pitfalls that kill so many data-mining projects; to discover what you can automate without the expense of hand-coded software; to reduce the rigidity of your information systems; and to anticipate some of the new technology that’s coming your way. I’ve seen too much time and money wasted trying to solve a problem with the wrong learning algorithm, or misinterpreting what the algorithm said.

At the end of the day, a browser is just a standard piece of software, but a search engine requires a different mind-set. The other reason machine learners are the über-geeks is that the world has far fewer of them than it needs, even by the already dire standards of computer science. According to tech guru Tim O’Reilly, “data scientist” is the hottest job title in Silicon Valley. The McKinsey Global Institute estimates that by 2018 the United States alone will need 140,000 to 190,000 more machine-learning experts than will be available, and 1.5 million more data-savvy managers. Machine learning’s applications have exploded too suddenly for education to keep up, and it has a reputation for being a difficult subject.


pages: 742 words: 137,937

The Future of the Professions: How Technology Will Transform the Work of Human Experts by Richard Susskind, Daniel Susskind

23andMe, 3D printing, Abraham Maslow, additive manufacturing, AI winter, Albert Einstein, Amazon Mechanical Turk, Amazon Robotics, Amazon Web Services, Andrew Keen, Atul Gawande, Automated Insights, autonomous vehicles, Big bang: deregulation of the City of London, big data - Walmart - Pop Tarts, Bill Joy: nanobots, Blue Ocean Strategy, business process, business process outsourcing, Cass Sunstein, Checklist Manifesto, Clapham omnibus, Clayton Christensen, clean water, cloud computing, commoditize, computer age, Computer Numeric Control, computer vision, Computing Machinery and Intelligence, conceptual framework, corporate governance, creative destruction, crowdsourcing, Daniel Kahneman / Amos Tversky, data science, death of newspapers, disintermediation, Douglas Hofstadter, driverless car, en.wikipedia.org, Erik Brynjolfsson, Evgeny Morozov, Filter Bubble, full employment, future of work, Garrett Hardin, Google Glasses, Google X / Alphabet X, Hacker Ethic, industrial robot, informal economy, information retrieval, interchangeable parts, Internet of things, Isaac Newton, James Hargreaves, John Maynard Keynes: Economic Possibilities for our Grandchildren, John Maynard Keynes: technological unemployment, Joseph Schumpeter, Khan Academy, knowledge economy, Large Hadron Collider, lifelogging, lump of labour, machine translation, Marshall McLuhan, Metcalfe’s law, Narrative Science, natural language processing, Network effects, Nick Bostrom, optical character recognition, Paul Samuelson, personalized medicine, planned obsolescence, pre–internet, Ray Kurzweil, Richard Feynman, Second Machine Age, self-driving car, semantic web, Shoshana Zuboff, Skype, social web, speech recognition, spinning jenny, strong AI, supply-chain management, Susan Wojcicki, tacit knowledge, TED Talk, telepresence, The Future of Employment, the market place, The Wealth of Nations by Adam Smith, The Wisdom of Crowds, Tragedy of the Commons, transaction costs, Turing test, Two Sigma, warehouse robotics, Watson beat the top human players on Jeopardy!, WikiLeaks, world market for maybe five computers, Yochai Benkler, young professional

When this term was first used, it was confined to techniques for the handling of vast bodies of data—for example, the masses of data recorded by the Large Hadron Collider. Now, Big Data is also used to refer to the use of technology to analyse much smaller bodies of information. Some speak instead of ‘data analytics’, ‘data science’, and ‘predictive analytics’, all of which seem to mean roughly the same thing.36 Specialists in the area, whatever label is preferred, are often called ‘data scientists’. There has been no shortage of hype about Big Data. There are commentators who argue, with some justification, that its claims are too extravagant and that its methodology is underdeveloped.37 What is hard to deny is the volume of data that are swilling around.

There is less need for the ‘sage on the stage’ and more of a job for the ‘guide on the side’—those who help students navigate through alternative sources of expertise. There are new roles and new disciplines, like education software designers who build the ‘adaptive’ learning systems, the content curators who compile and manage online content, and the data scientists who collect large data sets and develop ‘learning analytics’ to interpret them. It is not surprising, therefore, that Larry Summers, former chair of the White House Council of Economic Advisers and past President of Harvard, has said that ‘the next quarter century will see more change in higher education than the last three combined’,93 and that Sir Michael Barber, a former Downing Street adviser, anticipates transformations in education in his aptly named report, An Avalanche is Coming.94

But the competition is also advancing from outside the traditional boundaries of the professions—from new people and different institutions. In Chapter 2 we see a recurring need to draw on people with very different skills, talents, and ways of working. Practising doctors, priests, teachers, and auditors did not, for example, develop the software that supports the systems that we describe. Stepping forward instead are data scientists, process analysts, knowledge engineers, systems engineers, and many more (see Chapter 6). Today, professionals still provide much of the content, but in time they may find themselves down-staged by these new specialists. We also see a diverse set of institutions entering the fray—business process outsourcers, retail brands, Internet companies, major software and service vendors, to name a few.


pages: 251 words: 80,831

Super Founders: What Data Reveals About Billion-Dollar Startups by Ali Tamaseb

"World Economic Forum" Davos, 23andMe, additive manufacturing, Affordable Care Act / Obamacare, Airbnb, Anne Wojcicki, asset light, barriers to entry, Ben Horowitz, Benchmark Capital, bitcoin, business intelligence, buy and hold, Chris Wanstrath, clean water, cloud computing, coronavirus, corporate governance, correlation does not imply causation, COVID-19, cryptocurrency, data science, discounted cash flows, diversified portfolio, Elon Musk, Fairchild Semiconductor, game design, General Magic, gig economy, high net worth, hiring and firing, index fund, Internet Archive, Jeff Bezos, John Zimmer (Lyft cofounder), Kickstarter, late fees, lockdown, Lyft, Marc Andreessen, Marc Benioff, Mark Zuckerberg, Max Levchin, Mitch Kapor, natural language processing, Network effects, nuclear winter, PageRank, PalmPilot, Parker Conrad, Paul Buchheit, Paul Graham, peer-to-peer lending, Peter Thiel, Planet Labs, power law, QR code, Recombinant DNA, remote working, ride hailing / ride sharing, robotic process automation, rolodex, Ruby on Rails, Salesforce, Sam Altman, Sand Hill Road, self-driving car, shareholder value, sharing economy, side hustle, side project, Silicon Valley, Silicon Valley startup, Skype, Snapchat, SoftBank, software as a service, software is eating the world, sovereign wealth fund, Startup school, Steve Jobs, Steve Wozniak, survivorship bias, TaskRabbit, telepresence, the payments system, TikTok, Tony Fadell, Tony Hsieh, Travis Kalanick, Uber and Lyft, Uber for X, uber lyft, ubercab, web application, WeWork, work culture, Y Combinator

As a result, Lake had to learn how to be incredibly efficient with the money she had. Stitch Fix sold clothes before paying its vendors, and it turned over inventory as quickly as possible. The company changed its cash cycle so that it didn’t have to sit on unsold products. By hiring very talented and senior data scientists, she also turned Stitch Fix into a technology powerhouse and a talent magnet for the best data scientists. Stitch Fix was able to better understand a customer’s style through a series of sophisticated algorithms, which could augment the workforce of human stylists and significantly lower costs. “If I had been handed $100 million, I don’t know that I would have understood the business as well as I did,” says Lake.

A business like Stitch Fix—which involves buying and holding inventory, storage, shipping, and a lot of physical labor, from stylists to warehouse workers—would be considered capital intensive and inefficient at first sight. Lake managed to become more efficient by renegotiating contracts with vendors and hiring data scientists to augment stylists to scale the business. It’s true that many capital-efficient companies are capital light. (They have low capital expenditures, abbreviated as “low CapEx.”) But in practice, companies with high capital needs can also be successful in both raising funding and reaching multibillion-dollar outcomes.

David Vélez was a partner at Sequoia Capital looking into Latin American investment opportunities before venturing out to start Nubank, the Brazilian online bank unicorn, and Andy Rachleff co-founded Benchmark Capital before starting Wealthfront, a consumer financial advisory company. Abe Othman, head of data science at AngelList—a website enabling angel investors to invest in early-stage startups and to diversify through index investments—told me, “One of the most significant characteristics we’ve observed among successful startups is when a founder had experience as an investor or angel investor.” This could be because former investors have an easier time raising money, and because they’re better at filtering their ideas and betting on the right one to spend time on.


pages: 424 words: 114,905

Deep Medicine: How Artificial Intelligence Can Make Healthcare Human Again by Eric Topol

"World Economic Forum" Davos, 23andMe, Affordable Care Act / Obamacare, AI winter, Alan Turing: On Computable Numbers, with an Application to the Entscheidungsproblem, algorithmic bias, AlphaGo, Apollo 11, artificial general intelligence, augmented reality, autism spectrum disorder, autonomous vehicles, backpropagation, Big Tech, bioinformatics, blockchain, Cambridge Analytica, cloud computing, cognitive bias, Colonization of Mars, computer age, computer vision, Computing Machinery and Intelligence, conceptual framework, creative destruction, CRISPR, crowdsourcing, Daniel Kahneman / Amos Tversky, dark matter, data science, David Brooks, deep learning, DeepMind, Demis Hassabis, digital twin, driverless car, Elon Musk, en.wikipedia.org, epigenetics, Erik Brynjolfsson, fake news, fault tolerance, gamification, general purpose technology, Geoffrey Hinton, George Santayana, Google Glasses, ImageNet competition, Jeff Bezos, job automation, job satisfaction, Joi Ito, machine translation, Mark Zuckerberg, medical residency, meta-analysis, microbiome, move 37, natural language processing, new economy, Nicholas Carr, Nick Bostrom, nudge unit, OpenAI, opioid epidemic / opioid crisis, pattern recognition, performance metric, personalized medicine, phenotype, placebo effect, post-truth, randomized controlled trial, recommendation engine, Rubik’s Cube, Sam Altman, self-driving car, Silicon Valley, Skinner box, speech recognition, Stephen Hawking, techlash, TED Talk, text mining, the scientific method, Tim Cook: Apple, traumatic brain injury, trolley problem, War on Poverty, Watson beat the top human players on Jeopardy!, working-age population

Among today’s medical specialties, it will be the radiologist who, having a deep understanding of the nuances of such image-based diagnostic algorithms, is best positioned to communicate results to patients and provide guidance for how to respond to them. Nevertheless, although some have asserted that “radiologists of the future will be essential data scientists of medicine,” I don’t think that’s necessarily the direction we’re headed.44 Instead, they likely will be connecting far more with patients, acting as real doctors. FIGURE 6.2: Predicting longevity from a deep neural network of CT scans. Source: Adapted from L. Oakden-Rayner et al., “Precision Radiology: Predicting Longevity Using Feature Engineering and Deep Learning Methods in a Radiomics Framework,” Sci Rep (2017): 7(1), 1648.

With a heightened awareness of the opportunity to prevent such tragedies, in 2017, CEO Mark Zuckerberg announced new algorithms that look for patterns of posts and words for rapid review by dedicated Facebook employees: “In the future, AI will be able to understand more of the subtle nuances of language and will be able to identify different issues beyond suicide as well, including quickly spotting more kinds of bullying and hate.” Unfortunately, and critically, Facebook has refused to disclose the algorithmic details, but the company claims to have interceded with more than a hundred people intending to commit self-harm.42 Data scientists are now using machine learning on the Crisis Text Line’s 75 million texts43 to try to unravel text or emoji risk factors.44 Overall, even these very early attempts at using AI to detect depression and suicidal risk show some promising signs that we can do far better than the traditional subjective and clinical risk factors.

., “Assessing Suicide Risk and Emotional Distress in Chinese Social Media: A Text Mining and Machine Learning Study.” J Med Internet Res, 2017. 19(7): p. e243. 42. McConnon, “AI Helps Identify Those at Risk for Suicide.” 43. “Crisis Trends,” July 19, 2018. https://crisistrends.org/#visualizations. 44. Resnick, B., “How Data Scientists Are Using AI for Suicide Prevention,” Vox. 2018. 45. Anthes, E., “Depression: A Change of Mind.” Nature, 2014. 515(7526): pp. 185–187. 46. Firth, J., et al., “The Efficacy of Smartphone-Based Mental Health Interventions for Depressive Symptoms: A Meta-Analysis of Randomized Controlled Trials.”


pages: 523 words: 61,179

Human + Machine: Reimagining Work in the Age of AI by Paul R. Daugherty, H. James Wilson

3D printing, AI winter, algorithmic management, algorithmic trading, AlphaGo, Amazon Mechanical Turk, Amazon Robotics, augmented reality, autonomous vehicles, blockchain, business process, call centre, carbon footprint, circular economy, cloud computing, computer vision, correlation does not imply causation, crowdsourcing, data science, deep learning, DeepMind, digital twin, disintermediation, Douglas Hofstadter, driverless car, en.wikipedia.org, Erik Brynjolfsson, fail fast, friendly AI, fulfillment center, future of work, Geoffrey Hinton, Hans Moravec, industrial robot, Internet of things, inventory management, iterative process, Jeff Bezos, job automation, job satisfaction, knowledge worker, Lyft, machine translation, Marc Benioff, natural language processing, Neal Stephenson, personalized medicine, precision agriculture, Ray Kurzweil, recommendation engine, RFID, ride hailing / ride sharing, risk tolerance, robotic process automation, Rodney Brooks, Salesforce, Second Machine Age, self-driving car, sensor fusion, sentiment analysis, Shoshana Zuboff, Silicon Valley, Snow Crash, software as a service, speech recognition, tacit knowledge, telepresence, telepresence robot, text mining, the scientific method, uber lyft, warehouse automation, warehouse robotics

This way, customer satisfaction doesn’t take a hit, and the company saves on energy costs.22 Enable Discovery What kind of conversations are you having with your data? Are only analysts and data scientists benefiting from analysis tools? Your goal should be to extract insights in such a way that anyone, especially less-technical business users, can take advantage of the story that the data is trying to tell. Ayasdi is democratizing discovery, providing software that’s useful to data scientists and non-technical business leaders alike. One of its customers, Texas Medical Center (TMC), focuses on the analysis of high-volume, high-dimensional data sets such as data from breast-cancer patients.

When you start typing search terms, Google not only considers the most generally popular associations for its autocomplete feature, but also considers your geographic location, previous search terms, and other factors. It can feel as if the software is reading your thoughts. Leveraging AI to Find a Job Bot-based empowerment skills also come in handy for job searches. If there’s one guarantee for workers in these AI days, it’s that the job landscape is quickly changing. Positions such as data scientist, which barely existed five years ago, are now all the rage. And positions that focus on rote tasks like data entry are quickly fading from job listings. How can people forge new career paths, find new training opportunities, or boost their online presence or personal brand on social media? The answer is bot-based empowerment.

Cade Metz, “DARPA Goes Full Tron with Its Brand Battle of the Hack Bots,” Wired, July 5, 2016, https://www.wired.com/2016/07/_trashed-19/. Various security companies have their own approaches to the problem. SparkCognition, for instance, offers a product called Deep Armor, which uses a combination of AI techniques including neural networks, heuristics, data science, and natural-language processing to detect threats never seen before and remove malicious files. Another company called Darktrace offers a product called Antigena, which is modeled on the human immune system, identifying and neutralizing bugs as they’re encountered.7 Behavioral analysis of network traffic is key to another company called Vectra.


pages: 304 words: 82,395

Big Data: A Revolution That Will Transform How We Live, Work, and Think by Viktor Mayer-Schonberger, Kenneth Cukier

23andMe, Affordable Care Act / Obamacare, airport security, Apollo 11, barriers to entry, Berlin Wall, big data - Walmart - Pop Tarts, Black Swan, book scanning, book value, business intelligence, business process, call centre, cloud computing, computer age, correlation does not imply causation, dark matter, data science, double entry bookkeeping, Eratosthenes, Erik Brynjolfsson, game design, hype cycle, IBM and the Holocaust, index card, informal economy, intangible asset, Internet of things, invention of the printing press, Jeff Bezos, Joi Ito, lifelogging, Louis Pasteur, machine readable, machine translation, Marc Benioff, Mark Zuckerberg, Max Levchin, Menlo Park, Moneyball by Michael Lewis explains big data, Nate Silver, natural language processing, Netflix Prize, Network effects, obamacare, optical character recognition, PageRank, paypal mafia, performance metric, Peter Thiel, Plato's cave, post-materialism, random walk, recommendation engine, Salesforce, self-driving car, sentiment analysis, Silicon Valley, Silicon Valley startup, smart grid, smart meter, social graph, sparse data, speech recognition, Steve Jobs, Steven Levy, systematic bias, the scientific method, The Signal and the Noise by Nate Silver, The Wealth of Nations by Adam Smith, Thomas Davenport, Turing test, vertical integration, Watson beat the top human players on Jeopardy!

So far, the first two of these elements get the most attention: the skills, which today are scarce, and the data, which seems abundant. A new profession has emerged in recent years, the “data scientist,” which combines the skills of the statistician, software programmer, infographics designer, and storyteller. Instead of squinting into a microscope to unlock a mystery of the universe, the data scientist peers into databases to make a discovery. The McKinsey Global Institute proffers dire predictions about the dearth of data scientists now and in the future (which today’s data scientists like to cite to feel special and to pump up their salaries). Hal Varian, Google’s chief economist, famously calls statistician the “sexiest” job around.

The value in data’s reuse is good news for organizations that collect or control large datasets but currently make little use of them, such as conventional businesses that mostly operate offline. They may sit on untapped informational geysers. Some companies may have collected data, used it once (if at all), and just kept it around because of low storage cost—in “data tombs,” as data scientists call the places where such old info resides. Internet and technology companies are on the front lines of harnessing the data deluge, since they collect so much information just by being online and are ahead of the rest of industry in analyzing it. But all firms stand to gain. The consultants at McKinsey & Company point to a logistics company, whose name they keep anonymous, which noticed that in the process of delivering goods, it was amassing reams of information on product shipments around the globe.

Data exhaust is the mechanism behind many services like voice recognition, spam filters, language translation, and much more. When users indicate to a voice-recognition program that it has misunderstood what they said, they in effect “train” the system to get better. Many businesses are starting to engineer their systems to collect and use information in this way. In Facebook’s early days, its first “data scientist,” Jeff Hammerbacher (and among the people credited with coining the term), examined its rich trove of data exhaust. He and the team found that a big predictor that people would take an action (post content, click an icon, and so on) was whether they had seen their friends do the same thing. So Facebook redesigned its system to put greater emphasis on making friends’ activities more visible, which sparked a virtuous circle of new contributions to the site.


pages: 317 words: 87,566

The Happiness Industry: How the Government and Big Business Sold Us Well-Being by William Davies

"Friedman doctrine" OR "shareholder theory", "World Economic Forum" Davos, 1960s counterculture, Abraham Maslow, Airbnb, behavioural economics, business intelligence, business logic, corporate governance, data science, dematerialisation, experimental subject, Exxon Valdez, Frederick Winslow Taylor, Gini coefficient, income inequality, intangible asset, invisible hand, joint-stock company, Leo Hollis, lifelogging, market bubble, mental accounting, military-industrial complex, nudge unit, Panopticon Jeremy Bentham, Philip Mirowski, power law, profit maximization, randomized controlled trial, Richard Thaler, road to serfdom, Ronald Coase, Ronald Reagan, science of happiness, scientific management, selective serotonin reuptake inhibitor (SSRI), sentiment analysis, sharing economy, Slavoj Žižek, smart cities, Smart Cities: Big Data, Civic Hackers, and the Quest for a New Utopia, social contagion, social intelligence, Social Responsibility of Business Is to Increase Its Profits, Steve Jobs, TED Talk, The Chicago School, The Spirit Level, theory of mind, urban planning, Vilfredo Pareto, W. E. B. Du Bois, you are the product

But possibilities for psychological and behavioural data are heavily shaped by the power structures which facilitate them. The current explosion in happiness and well-being data is really an effect of new technologies and practices of surveillance. In turn, these depend on pre-existing power inequalities. Building the new laboratory In 2012, Harvard Business Review declared that ‘data scientist’ would be the ‘sexiest job of the twenty-first century’.4 We live during a time of tremendous optimism regarding the possibilities for data collection and analysis that is refuelling the behaviourist and utilitarian ambition to manage society purely through careful scientific observation of mind, body and brain.

But as that science becomes ever more advanced, eventually the subjective element of it starts to drop out of the picture altogether. Bentham’s presumption, that pleasure and pain are the only real dimensions of psychology, is now leading squarely towards the philosophical riddle whereby a neuroscientist or data scientist can tell me that I am objectively wrong about my own mood. We are reaching the point where our bodies are more trusted communicators than our words. If one way of ‘seeing’ happiness as a physiological event is via the face, the other way is to get even closer to its supposed locus: the brain.

, theatlantic.com, 2 April 2012. 25 Jeremy Gilbert, ‘Capitalism, Creativity and the Crisis in the Music Industry’, opendemocracy.net, 14 September 2012. 7 Living in the Lab 1 Jennifer Scanlon, ‘Mediators in the International Marketplace: US Advertising in Latin America in the Early Twentieth Century’, The Business History Review 77: 3, 2003. 2 Jeff Merron, ‘Putting Foreign Consumers on the Map: J. Walter Thompson’s Struggle with General Motors’ International Advertising Account in the 1920s’, The Business History Review 73: 3, 1999. 3 Ibid. 4 Thomas Davenport and D. J. Patil, ‘Data Scientist: The Sexiest Job of the 21st Century’, Harvard Business Review, October 2012. 5 Viktor Mayer-Schönberger and Kenneth Cukier, Big Data: A Revolution That Will Transform How We Live, Work and Think, London: John Murray, 2013. 6 Anthony Townsend, Smart Cities: Big Data, Civic Hackers, and the Quest for a New Utopia, New York: W.


pages: 292 words: 94,660

The Loop: How Technology Is Creating a World Without Choices and How to Fight Back by Jacob Ward

2021 United States Capitol attack, 4chan, Abraham Wald, AI winter, Albert Einstein, Albert Michelson, Amazon Mechanical Turk, assortative mating, autonomous vehicles, availability heuristic, barriers to entry, Bayesian statistics, Benoit Mandelbrot, Big Tech, bitcoin, Black Lives Matter, Black Swan, blockchain, Broken windows theory, call centre, Cass Sunstein, cloud computing, contact tracing, coronavirus, COVID-19, crowdsourcing, cuban missile crisis, Daniel Kahneman / Amos Tversky, dark matter, data science, deep learning, Donald Trump, drone strike, endowment effect, George Akerlof, George Floyd, hindsight bias, invisible hand, Isaac Newton, Jeffrey Epstein, license plate recognition, lockdown, longitudinal study, Lyft, mandelbrot fractal, Mark Zuckerberg, meta-analysis, natural language processing, non-fungible token, nudge unit, OpenAI, opioid epidemic / opioid crisis, pattern recognition, QAnon, RAND corporation, Richard Thaler, Robert Shiller, selection bias, self-driving car, seminal paper, shareholder value, smart cities, social contagion, social distancing, Steven Levy, survivorship bias, TikTok, Turing test

With enough time to refine its process, the system will, eventually, astound us with its gift for telling us that this panting poodle is a dog, and that this heifer nosing the curtains is a cow. But how did the system get there? It matters because, as we’ll see, this technology goes about answering stupid questions and world-changing questions the same way. A data scientist handed this task would want to know more about the goal of the project in order to employ the most suitable flavor of machine learning to accomplish it. If you wanted the system to identify the cows and dogs in a photograph of the stage, for instance, you’d probably feed it into a convolutional neural network, a popular means of recognizing objects in a photograph these days.

We knew we probably would not win the competition that way, but there was a bigger point that we needed to make.2 And they were right: they didn’t win. The organizers of the competition would not allow the judges to play with and evaluate the Duke team’s visualization tool, and a trio from IBM Research won the competition for a system based on Boolean rules that would help a human data scientist inspect the black box. In its paper describing the winning system, the IBM team rightly pointed out the need for explainability “as machine learning pushes further into domains such as medicine, criminal justice, and business, where such models complement human decision-makers and decisions can have major consequences on human lives.”

Wald is credited with helping to identify something that statisticians now work hard to fight off: survivorship bias, the tendency to optimize our behavior for the future based on what may in fact be the rare instances in which someone or something survives extremely long odds and only as a result comes to our attention. Forecasting probability based on limited data is something statisticians now know not to do, thanks to Wald—data scientists often post his report’s famous diagram of a shot-up plane as an online meme—but Anna Todd is the spear-tip of an industry that forecasts future success by studying survivors. When Todd and I met, roughly four million writers were posting more than twenty-four hours of reading material to WattPad every sixty seconds of every day, according to the company.


pages: 589 words: 69,193

Mastering Pandas by Femi Anthony

Amazon Web Services, Bayesian statistics, correlation coefficient, correlation does not imply causation, data science, Debian, en.wikipedia.org, Internet of things, Large Hadron Collider, natural language processing, p-value, power law, random walk, side project, sparse data, statistical model, Thomas Bayes

He enjoys analyzing data and solving complex business problems using SAS, R, EViews/Gretl, Minitab, SQL, and Python. Opeyemi is also an adjunct at Northwood University where he designs and teaches undergraduate courses in microeconomics and macroeconomics. Louis Hénault is a data scientist at OgilvyOne Paris. He loves combining mathematics and computer science to solve real-world problems in an innovative way. After getting a master's degree in engineering with a major in data sciences and another degree in applied mathematics in France, he entered into the French start-up ecosystem, working on several projects. Louis has gained experience in various industries, including geophysics, application performance management, online music platforms, e-commerce, and digital advertising.

Note: For more information, refer to the Wikipedia page on Python at http://en.wikipedia.org/wiki/Python_%28programming_language%29. Among the characteristics that make Python popular for data science are its very user-friendly (human-readable) syntax, the fact that it is interpreted rather than compiled (leading to faster development time), and its very comprehensive library for parsing and analyzing data, as well as its capacity for doing numerical and statistical computations. Python has libraries that provide a complete toolkit for data science and analysis. The major ones are as follows: NumPy: general-purpose array functionality with an emphasis on numeric computation; SciPy: numerical computing; Matplotlib: graphics; pandas: Series and data frames (1D and 2D array-like types); scikit-learn: machine learning; NLTK: natural language processing; statsmodels: statistical analysis. For this book, we will be focusing on the fourth library in the preceding list, pandas.
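To make the "1D and 2D array-like types" concrete, here is a minimal sketch of a pandas Series and a DataFrame. The labels and numbers are invented for illustration and do not come from the book; only standard pandas calls are used.

```python
import pandas as pd

# A Series is a labeled 1D array. The index labels are made up.
temps = pd.Series([21.5, 23.0, 19.8], index=["mon", "tue", "wed"])

# A DataFrame is a labeled 2D table; each column behaves like a Series.
frame = pd.DataFrame({"item": ["a", "b", "c"], "price": [1.0, 2.5, 4.0]})

# Aggregate the 1D structure.
mean_temp = temps.mean()

# Filter the 2D structure with a boolean mask on one column.
pricey = frame[frame["price"] > 2.0]

print(mean_temp)
print(pricey)
```

The same boolean-mask pattern scales from this toy table to much larger data frames, which is one reason pandas is a common first step before modeling.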

Application of machine learning – Kaggle Titanic competition In order to illustrate how we can use pandas to assist us at the start of our machine learning journey, we will apply it to a classic problem, which is hosted on the Kaggle website (http://www.kaggle.com). Kaggle is a competition platform for machine learning problems. The idea behind Kaggle is to enable companies that are interested in solving predictive analytics problems with their data to post their data on Kaggle and invite data scientists to come up with the proposed solutions to their problems. The competition can be ongoing over a period of time, and the rankings of the competitors are posted on a leader board. At the end of the competition, the top-ranked competitors receive cash prizes. The classic problem that we will study in order to illustrate the use of pandas for machine learning with scikit-learn is the Titanic: machine learning from disaster problem hosted on Kaggle as their classic introductory machine learning problem.
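As a rough illustration of the kind of preparation pandas can do before a learner sees the data, the sketch below builds a tiny table by hand. The rows are invented and are not the Kaggle data; the column names (Survived, Pclass, Sex, Age) follow the public Titanic dataset, and the derived IsFemale column is a hypothetical name chosen for this example.

```python
import pandas as pd

# Invented rows that mimic a few Titanic columns; NOT the real Kaggle data.
toy = pd.DataFrame(
    {
        "Survived": [0, 1, 1, 0],
        "Pclass": [3, 1, 3, 2],
        "Sex": ["male", "female", "female", "male"],
        "Age": [22.0, 38.0, None, 35.0],
    }
)

# Fill the missing age with the median of the known ages.
toy["Age"] = toy["Age"].fillna(toy["Age"].median())

# Encode the text column as 0/1 so a numeric learner can consume it.
toy["IsFemale"] = (toy["Sex"] == "female").astype(int)

# Survival rate per group, a typical first exploratory step.
rate_by_sex = toy.groupby("Sex")["Survived"].mean()

print(toy)
print(rate_by_sex)
```

Median imputation and 0/1 encoding are only one possible choice; the book's own treatment of the problem may differ.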


pages: 688 words: 147,571

Robot Rules: Regulating Artificial Intelligence by Jacob Turner

"World Economic Forum" Davos, Ada Lovelace, Affordable Care Act / Obamacare, AI winter, algorithmic bias, algorithmic trading, AlphaGo, artificial general intelligence, Asilomar, Asilomar Conference on Recombinant DNA, autonomous vehicles, backpropagation, Basel III, bitcoin, Black Monday: stock market crash in 1987, blockchain, brain emulation, Brexit referendum, Cambridge Analytica, Charles Babbage, Clapham omnibus, cognitive dissonance, Computing Machinery and Intelligence, corporate governance, corporate social responsibility, correlation does not imply causation, crowdsourcing, data science, deep learning, DeepMind, Demis Hassabis, distributed ledger, don't be evil, Donald Trump, driverless car, easy for humans, difficult for computers, effective altruism, Elon Musk, financial exclusion, financial innovation, friendly fire, future of work, hallucination problem, hive mind, Internet of things, iterative process, job automation, John Markoff, John von Neumann, Loebner Prize, machine readable, machine translation, medical malpractice, Nate Silver, natural language processing, Nick Bostrom, Northpointe / Correctional Offender Management Profiling for Alternative Sanctions, nudge unit, obamacare, off grid, OpenAI, paperclip maximiser, pattern recognition, Peace of Westphalia, Philippa Foot, race to the bottom, Ray Kurzweil, Recombinant DNA, Rodney Brooks, self-driving car, Silicon Valley, Stanislav Petrov, Stephen Hawking, Steve Wozniak, strong AI, technological singularity, Tesla Model S, The Coming Technological Singularity, The Future of Employment, The Signal and the Noise by Nate Silver, trolley problem, Turing test, Vernor Vinge

An AI system learns to play the game independently of human training, but when its actions are matched to the human descriptions, a narrative can be generated by sewing together these descriptors. Taking a different route to Riedl et al., data scientist Daniel Whitenack identifies three general capabilities required for transparency in AI: data provenance (knowing the source of all data); reproducibility (the ability to recreate a given result); and data versioning (saving snapshot copies of the AI in particular states with a view to recording which input led to which output). Whitenack suggests that in order to make these three desiderata “standards within data science, we need proper tools to integrate these characteristics into workflows”. He says that ideally, AI transparency tools will be: Language agnostic—The language wars in data science between python, R, scala, and others will continue on forever.

See also James Vincent, “Tencent Says There Are Only 300,000 AI Engineers Worldwide, but Millions Are Needed”, The Verge, 5 December 2017, https://www.theverge.com/2017/12/5/16737224/global-ai-talent-shortfall-tencent-report, accessed 1 June 2018. By contrast, PWC estimate that in the USA alone, there will be 2.9 m people with data science and analytics skills by 2018. Not all will be AI professionals per se, but many of their skills will overlap. “What’s Next for the 2017 Data Science and Analytics Job Market?”, PWC Website, https://www.pwc.com/us/en/library/data-science-and-analytics.html, accessed 1 June 2018. 144 Katja Grace, “The Asilomar Conference: A Case Study in Risk Mitigation”, MIRI Research Institute, Technical Report, 2015–9 (Berkeley, CA: MIRI, 15 July 2015), 15. 145 A constantly-updated database of tech ethics curricula is available at: https://docs.google.com/spreadsheets/d/1jWIrA8jHz5fYAW4h9CkUD8gKS5V98PDJDymRf8d9vKI/edit#gid=0, accessed 1 June 2018. 146 Microsoft, The Future Computed: Artificial Intelligence and Its Role in Society (Redmond, WA: Microsoft Corporation, U.S.A., 2018), 55, https://msblob.blob.core.windows.net/ncmedia/2018/01/The-Future_Computed_1.26.18.pdf, accessed 1 June 2018. 147 See, for example, s. 1 of the UK Road Traffic Act 1988, or s. 249(1)(a) of the Canadian Criminal Code. 148 “About TensorFlow”, Website of TensorFlow, https://www.tensorflow.org/, accessed 1 June 2018. 149 See, for example, the UK Government’s “Guidance: Wine Duty”, 9 November 2009, https://www.gov.uk/guidance/wine-duty, accessed 1 June 2018. 150 See, for example, Max Weber, “Politics as a Vocation”, in From Max Weber: Essays in Sociology, translated by H.H.

He says that ideally, AI transparency tools will be: Language agnostic—The language wars in data science between python, R, scala, and others will continue on forever. We will always need a mix of languages and frameworks to enable advancements in a field as broad as data science. However, if tools enabling data versioning/provenance are language specific, they are unlikely to be integrated as standard practice. Infrastructure Agnostic—The tools should be able to be deployed on your existing infrastructure—locally, in the cloud, or on-prem. Scalable/distributed—It would be impractical to implement changes to a workflow if they were not able to scale up to production requirements.


pages: 237 words: 65,794

Mining Social Media: Finding Stories in Internet Data by Lam Thuy Vo

barriers to entry, correlation does not imply causation, data science, Donald Trump, en.wikipedia.org, Filter Bubble, Firefox, Google Chrome, Internet Archive, natural language processing, social web, web application

To that end, what follows are a few helpful resources on writing clean code with Python and pandas, as well as producing reproducible data analysis in Jupyter Notebook. While by no means a comprehensive guide, they’re a good starting point:
The general Python style guide (https://docs.python-guide.org/writing/style/) and a style guide for data scientists (http://columbia-applied-data-science.github.io/pages/lowclass-python-style-guide.html)
Think Python, 2nd Edition, a book by Allen B. Downey (O’Reilly, 2015), available for free under the Creative Commons license on the author’s site (https://greenteapress.com/wp/think-python-2e/)
“A Beginner’s Guide to Optimizing Pandas Code for Speed,” an article by Sofia Heisler (https://engineering.upside.com/a-beginners-guide-to-optimizing-pandas-code-for-speed-c09ef2c6a4d6/)
“What We’ve Learned About Sharing Our Data Analysis,” an article by Jeremy Singer-Vine (https://source.opennews.org/articles/what-weve-learned-about-sharing-our-data-analysis/)
The libraries and tools we’ve used in this book have stood the test of time among Python users, but new libraries pop up all the time and may do certain things better than what is already available.

The Jupyter Notebook web app, which evolved out of the web app IPython Notebooks, was created to accommodate three programming languages—Julia, Python, and R (Ju-Pyt-R)—but has since evolved to support many other coding languages. Jupyter Notebook is also used by many data scientists in a diverse range of fields, including people crunching numbers to improve website performance, sociologists studying demographic information, and journalists searching for trends and anomalies in data obtained through Freedom of Information Act requests. One huge benefit of this is that many of these data scientists and researchers put their notebooks—often featuring detailed and annotated analyses—online on code-sharing platforms like GitHub, making it easier for beginning learners like you to replicate their studies.

But the findings in summary data come from rather messy, wildly varying replies to surveys or other databases of raw data—that is, data that has not been processed yet. Data tables provided by organizations like the US Census Bureau often have been cleaned, processed, and aggregated from thousands—if not millions—of raw data entries, many of which may contain several inconsistencies that data scientists worked to resolve. For example, in a simple table listing people’s occupations, these organizations may have resolved different but essentially equivalent responses like “attorney” and “lawyer.” Likewise, the raw data we look at in this book—data from the social web—can be quite irregular and challenging to process because it’s produced by real people, each with unique quirks and posting habits.
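The "attorney" versus "lawyer" example can be sketched in pandas. This is only an illustration under stated assumptions: the sample responses and the `canonical` mapping are hypothetical and do not come from the book or from any Census Bureau procedure.

```python
import pandas as pd

# Hypothetical raw survey responses (invented for illustration).
raw = pd.Series(["Attorney", "lawyer", "teacher", " Lawyer ", "attorney"])

# Normalize whitespace and case before comparing responses.
cleaned = raw.str.strip().str.lower()

# Hypothetical mapping of equivalent answers onto one canonical label.
canonical = {"lawyer": "attorney"}
resolved = cleaned.replace(canonical)

# Aggregate the resolved responses into a summary table.
counts = resolved.value_counts()
print(counts)
```

Real cleaning pipelines usually need a much larger mapping and some manual review, so this should be read as a starting point rather than a complete method.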


Pandas for Everyone: Python Data Analysis by Unknown

data science

Numpy 3 http://pandas.pydata.org/pandas-docs/stable/basics.html#descriptivestatistics
print(ages.mean())
49.0
print(ages.min())
37
print(ages.max())
61
print(ages.std())
16.97056274847714
The mean, min, max, and std are also methods in the numpy.ndarray.
Series methods and their descriptions:
append: Concatenates 2 or more Series
corr: Calculate a correlation with another Series*
cov: Calculate a covariance with another Series*
describe: Calculate summary statistics*
drop_duplicates: Returns a Series without duplicates
equals: Sees if a Series has the same elements
get_values: Get values of the Series, same as the values attribute
hist: Draw a histogram
min: Return the minimum value
max: Returns the maximum value
mean: Returns the arithmetic mean
median: Returns the median
mode: Returns the mode(s)
quantile: Returns the value at a given quantile
replace: Replaces values in the Series with a specified value
sample: Returns a random sample of values from the Series
sort_values: Sort values
to_frame: Converts Series to DataFrame
transpose: Return the transpose
unique: Returns a numpy.ndarray of unique values
* indicates missing values will be automatically dropped
2.5.2 Boolean subsetting Series
Chapter 1 showed how we can use specific indices to subset our data. However, it is rare that we know the exact row or column index to subset the data. Typically you are looking for values that meet (or don’t meet) a particular calculation or observation. 
First, let’s use a larger dataset:
scientists = pd.read_csv('../data/scientists.csv')
We just saw how we can calculate basic descriptive metrics of vectors. 4 http://docs.scipy.org/doc/numpy/reference/arrays.ndarray.html
ages = scientists['Age']
print(ages)
0    37
1    61
2    90
3    66
4    56
5    45
6    41
7    77
Name: Age, dtype: int64
print(ages.mean())
59.125
print(ages.describe())
count     8.000000
mean     59.125000
std      18.325918
min      37.000000
25%      44.000000
50%      58.500000
75%      68.750000
max      90.000000
Name: Age, dtype: float64
What if we wanted to subset our ages by those above the mean?
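One way to answer that question is with a boolean mask. The sketch below rebuilds the `ages` Series from the values printed in the excerpt rather than reading `scientists.csv`, which is assumed to be unavailable here. The names `above_mean_mask` and `above_mean` are illustrative.

```python
import pandas as pd

# Ages copied from the printed output in the excerpt above.
ages = pd.Series([37, 61, 90, 66, 56, 45, 41, 77], name="Age")

# Comparing a Series to a scalar yields a boolean Series of the same length.
above_mean_mask = ages > ages.mean()

# Indexing with the boolean mask keeps only the rows where the mask is True.
above_mean = ages[above_mean_mask]
print(above_mean)
```

With a mean of 59.125, the mask selects the rows at index 1, 2, 3, and 7, and the original index labels are preserved in the result.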

print(taxi_loop_concat_comp.equals(taxi_loop_concat)) True 6.7 Summary Here I showed you how we can reshape data to a format that is conducive for data analysis, visualization, and collection. We followed Hadley Wickham's Tidy Data paper to show the various functions and methods to reshape our data. This is an important skill since various functions will need data in a certain shape, tidy or not, in order to work. Knowing how to reshape your data will be an important skill as a data scientist and analyst.
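The `equals` check at the start of this excerpt compares two DataFrames as a whole. A minimal sketch of how it behaves follows. The `left` and `right` frames are small hypothetical stand-ins, not the taxi data from the book.

```python
import pandas as pd

# Two hypothetical frames built separately but holding the same values,
# including a missing value in the same position.
left = pd.DataFrame({"fare": [5.0, 7.5], "tip": [1.0, None]})
right = pd.DataFrame({"fare": [5.0, 7.5], "tip": [1.0, None]})

# DataFrame.equals treats NaNs in matching positions as equal.
same = left.equals(right)

# An element-wise comparison does not, because NaN != NaN.
elementwise = bool((left == right).all().all())

print(same, elementwise)
```

This is why `equals` is the safer choice when checking that two construction methods, such as a loop and a comprehension, produce the same result on data with missing values.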

You can see TODO USING FUNCTIONS if you need more information on using function parameters 7 http://pandas.pydata.org/pandasdocs/stable/generated/pandas.Series.to_csv.html 8 http://pandas.pydata.org/pandasdocs/stable/generated/pandas.DataFrame.to_csv.html 9 http://pandas.pydata.org/pandasdocs/stable/generated/pandas.DataFrame.to_csv.html 10 http://pandas.pydata.org/pandasdocs/stable/generated/pandas.read_csv.html 2.8.3 Excel Excel, probably the most common data type (or second most common, next to CSVs). Excel has a bad reputation within the data science community. I discussed some of the reasons why in Chapter 1.1. The goal of this book isn’t to bash Excel, but to teach you a reasonable alternative tool for data analytics. In short, the more you can do your work in a scripting language, the easier it will be to scale up to larger projects, catch and fix mistakes, and collaborate.


pages: 156 words: 15,746

Personal Finance with Python by Max Humber

asset allocation, backtesting, bitcoin, cryptocurrency, data science, Dogecoin, en.wikipedia.org, Ethereum, passive income, web application

The second edition of this hands-on guide—updated for Python 3.5 and pandas 1.0—is packed with practical case studies that show you how to effectively solve a broad set of data analysis problems, using Python libraries such as NumPy, pandas, Matplotlib, and IPython. Python Data Science Handbook: Tools and Techniques for Developers by Jake VanderPlas For many researchers, Python is a first-class tool mainly because of its libraries for storing, manipulating, and gaining insight from data. Several resources exist for individual pieces of this data science stack, but only with this book do you get them all—IPython, NumPy, pandas, Matplotlib, scikit-learn, and other related tools. Finally, please e-mail me if you have any concerns, questions, or comments about this book.

Conclusion Chapter 3: Convert openexchangerates.org Secrets Documentation Encapsulate show_alternative .apply Conclusion Chapter 4: Amortize Banks Amortization Payment Loop A Loop B Functionize Evaluate Conclusion Chapter 5: Budget Dates datetime Timestamp .normalize Horizon Flows Totals Visualization Updating Vacation I English get_dates Fun YAML Functionize Vacation II Loading YAML Conclusion Chapter 6: Invest Trade-Offs Instantiate Prices Orders Deposit Simulate Quotes get_price get_historical Portfolio Rebalance Conclusion Chapter 7: Spend Prophet Purchases Forecast Visualize Conclusion Appendix: Next Index About the Author and About the Technical Reviewer About the Author Max Humber is a Data Engineer interested in improving finance with technology. He works for Wealthsimple and previously served as the first data scientist for the online lending platform Borrowell. He has spoken at Pycon, ODSC, PyData, useR, and BigDataX in Colombia, London, Berlin, Brussels, and Toronto. About the Technical Reviewer Michael Thomas has worked in software development for more than 20 years as an individual contributor, team lead, program manager, and vice president of engineering.


pages: 25 words: 5,789

Data for the Public Good by Alex Howard

"World Economic Forum" Davos, 23andMe, Atul Gawande, Cass Sunstein, cloud computing, crowdsourcing, data science, Hernando de Soto, Internet of things, Kickstarter, lifelogging, machine readable, Network effects, openstreetmap, Silicon Valley, slashdot, social intelligence, social software, social web, web application

If you want a deep look at what the work of digitizing data really looks like, read Carl Malamud’s interview with Slashdot on opening government data. Data for the public good, however, goes far beyond government’s own actions. In many cases, it will happen despite government action — or, often, inaction — as civic developers, data scientists and clinicians pioneer better analysis, visualization and feedback loops. For every civic startup or regulation, there’s a backstory that often involves a broad number of stakeholders. Governments have to commit to open up themselves but will, in many cases, need external expertise or even funding to do so.

To create public good from public goods — the public sector data that governments collect, the private sector data that is being collected and the social data that we generate ourselves — we will need to collectively forge new compacts that honor existing laws and visionary agreements that enable the new data science to put the data to work. About the Author Alexander B. Howard is the Government 2.0 Correspondent for O’Reilly Media, where he reports on technology, open government and online civics. Before joining O’Reilly, Howard was the associate editor of SearchCompliance.com at TechTarget. His work there focused on how regulations affect IT operations, including issues of data protection, privacy, security and enterprise IT strategy.


pages: 296 words: 78,112

Devil's Bargain: Steve Bannon, Donald Trump, and the Storming of the Presidency by Joshua Green

4chan, Affordable Care Act / Obamacare, Ayatollah Khomeini, Bernie Sanders, Biosphere 2, Black Lives Matter, business climate, Cambridge Analytica, Carl Icahn, centre right, Charles Lindbergh, coherent worldview, collateralized debt obligation, conceptual framework, corporate raider, crony capitalism, currency manipulation / currency intervention, data science, Donald Trump, Dr. Strangelove, fake news, Fractional reserve banking, Glass-Steagall Act, Goldman Sachs: Vampire Squid, Gordon Gekko, guest worker program, hype cycle, illegal immigration, immigration reform, Jim Simons, junk bonds, liberation theology, low skilled workers, machine translation, Michael Milken, Nate Silver, Nelson Mandela, nuclear winter, obamacare, open immigration, Peace of Westphalia, Peter Thiel, quantitative hedge fund, Renaissance Technologies, Robert Mercer, Ronald Reagan, Silicon Valley, social intelligence, speech recognition, Steve Bannon, urban planning, vertical integration

Trump’s campaign rejected the attack and criticized the ADL for involving itself in partisan politics. “Darkness is good,” Bannon counseled Trump. “Don’t let up.” By this point, the campaign had curtailed most of its polling. But it wasn’t quite flying blind. A few days earlier, Trump’s team of data scientists, squirreled away in an office down in San Antonio, had delivered a report titled “Predictions: Five Days Out,” which contained stunning news that contradicted the widespread assumption that Clinton would win easily. It was suddenly clear that Comey’s FBI investigation was roiling the electorate.

Bannon was fixated on Michigan, constantly urging Stepien to click back over and zoom in on bellwethers like Macomb County—and with 30 percent, 40 percent, then 50 percent of precincts reporting, Trump’s lead was holding steady. Or growing. As the night wore on, he even led in Wisconsin, a scenario none of Trump’s data scientists had ever imagined. Later on, several of those crammed into the room would recall a moment when Stepien’s manic patter flagged for just a second and the room fell quiet. Then somebody—no one could remember who—muttered, “Holy shit. This is happening.” Drudge was right. The corporate media blew it.

The unmarked entrance is framed by palmetto trees and sits beneath a large, second-story veranda with sweeping overhead fans, where the (mostly male) staff gathers in the afternoons to smoke cigars and brainstorm. Established in 2012 to study crony capitalism and governmental malfeasance, GAI is staffed with lawyers, data scientists, and forensic investigators and has collaborated with such mainstream news outlets as Newsweek, ABC News, and CBS’s 60 Minutes on stories ranging from insider trading in Congress to credit-card fraud among presidential campaigns. It’s a mining operation for political scoops that, for two years, had trained its investigative firepower on the Clintons.


pages: 197 words: 35,256

NumPy Cookbook by Ivan Idris

business intelligence, cloud computing, computer vision, data science, Debian, en.wikipedia.org, Eratosthenes, mandelbrot fractal, p-value, power law, sorting algorithm, statistical model, transaction costs, web application

By night, he cultivates his academic interests in mathematics and computer science, and plays with mathematical and scientific software. Ryan R. Rosario is a Doctoral Candidate at the University of California, Los Angeles. He works at Riot Games as a Data Scientist, and he enjoys turning large quantities of massive, messy data into gold. He is heavily involved in the open source community, particularly with R, Python, Hadoop, and Machine Learning, and has also contributed code to various Python and R projects. He maintains a blog dedicated to Data Science and related topics at http://www.bytemining.com. He has also served as a technical reviewer for NumPy 1.5 Beginner's Guide. www.PacktPub.com Support files, eBooks, discount offers and more You might want to visit www.PacktPub.com for support files and downloads related to your book.

., 0], 2, 0.3, 0.2) axis('off') imshow(edges) show() The code produces an image of the edges within the original picture, as shown in the following screenshot: Installing Pandas Pandas is a Python library for data analysis. It has some similarities with the R programming language, which are not coincidental. R is a specialized programming language popular with data scientists. For instance, the core DataFrame object is inspired by R. How to do it... On PyPI, the project is called pandas. So, for instance, run either of the following two commands: sudo easy_install -U pandas pip install pandas If you are using a Linux package manager, you will need to install the python-pandas project.


pages: 586 words: 186,548

Architects of Intelligence by Martin Ford

3D printing, agricultural Revolution, AI winter, algorithmic bias, Alignment Problem, AlphaGo, Apple II, artificial general intelligence, Asilomar, augmented reality, autonomous vehicles, backpropagation, barriers to entry, basic income, Baxter: Rethink Robotics, Bayesian statistics, Big Tech, bitcoin, Boeing 747, Boston Dynamics, business intelligence, business process, call centre, Cambridge Analytica, cloud computing, cognitive bias, Colonization of Mars, computer vision, Computing Machinery and Intelligence, correlation does not imply causation, CRISPR, crowdsourcing, DARPA: Urban Challenge, data science, deep learning, DeepMind, Demis Hassabis, deskilling, disruptive innovation, Donald Trump, Douglas Hofstadter, driverless car, Elon Musk, Erik Brynjolfsson, Ernest Rutherford, fake news, Fellow of the Royal Society, Flash crash, future of work, general purpose technology, Geoffrey Hinton, gig economy, Google X / Alphabet X, Gödel, Escher, Bach, Hans Moravec, Hans Rosling, hype cycle, ImageNet competition, income inequality, industrial research laboratory, industrial robot, information retrieval, job automation, John von Neumann, Large Hadron Collider, Law of Accelerating Returns, life extension, Loebner Prize, machine translation, Mark Zuckerberg, Mars Rover, means of production, Mitch Kapor, Mustafa Suleyman, natural language processing, new economy, Nick Bostrom, OpenAI, opioid epidemic / opioid crisis, optical character recognition, paperclip maximiser, pattern recognition, phenotype, Productivity paradox, radical life extension, Ray Kurzweil, recommendation engine, Robert Gordon, Rodney Brooks, Sam Altman, self-driving car, seminal paper, sensor fusion, sentiment analysis, Silicon Valley, smart cities, social intelligence, sparse data, speech recognition, statistical model, stealth mode startup, stem cell, Stephen Hawking, Steve Jobs, Steve Wozniak, Steven Pinker, strong AI, superintelligent machines, synthetic biology, systems thinking, Ted Kaczynski, TED 
Talk, The Rise and Fall of American Growth, theory of mind, Thomas Bayes, Travis Kalanick, Turing test, universal basic income, Wall-E, Watson beat the top human players on Jeopardy!, women in the workforce, working-age population, workplace surveillance, zero-sum game, Zipcar

So, it’s not surprising that when we have real robots, they’re going to be able to do those jobs. I also think that the current mindset among governments is: “Oh, well then. I guess we really need to start training people to be data scientists, because that’s the job of the future—or robot engineers.” This clearly isn’t the solution because we don’t need a billion data scientists and robot engineers: we just need a few million. This might be a strategy for a small country like Singapore; or where I am currently, in Dubai, it might also be a viable strategy. But it’s not a viable strategy for any major country because there is simply not going to be enough jobs in those areas.

These are not often done within a single company. Does that integration pose new challenges? DAPHNE KOLLER: Absolutely. I think the biggest challenge is actually cultural, in getting scientists and data scientists to work together as equal partners. In many companies, one group sets the direction, and the other takes a back seat. At insitro, we really need to build a culture in which scientists, engineers, and data scientists work closely together to define problems, design experiments, analyze data, and derive insights that will lead us to new therapeutics. We believe that building this team and this culture well is as important to the success of our mission as the quality of the science or the machine learning that these different groups will create.

At the end of the day, we’re designing these systems, and we get to say how they are deployed, we can turn the switch off. CEO & CO-FOUNDER OF AFFECTIVA Rana el Kaliouby is the co-founder and CEO of Affectiva, a startup company that specializes in AI systems that sense and understand human emotions. Affectiva is developing cutting-edge AI technologies that apply machine learning, deep learning, and data science to bring new levels of emotional intelligence to AI. Rana is an active participant in international forums that focus on ethical issues and the regulation of AI to help ensure the technology has a positive impact on society. She was selected as a Young Global Leader by the World Economic Forum in 2017.


pages: 561 words: 157,589

WTF?: What's the Future and Why It's Up to Us by Tim O'Reilly

"Friedman doctrine" OR "shareholder theory", 4chan, Affordable Care Act / Obamacare, Airbnb, AlphaGo, Alvin Roth, Amazon Mechanical Turk, Amazon Robotics, Amazon Web Services, AOL-Time Warner, artificial general intelligence, augmented reality, autonomous vehicles, barriers to entry, basic income, behavioural economics, benefit corporation, Bernie Madoff, Bernie Sanders, Bill Joy: nanobots, bitcoin, Blitzscaling, blockchain, book value, Bretton Woods, Brewster Kahle, British Empire, business process, call centre, Capital in the Twenty-First Century by Thomas Piketty, Captain Sullenberger Hudson, carbon tax, Carl Icahn, Chuck Templeton: OpenTable:, Clayton Christensen, clean water, cloud computing, cognitive dissonance, collateralized debt obligation, commoditize, computer vision, congestion pricing, corporate governance, corporate raider, creative destruction, CRISPR, crowdsourcing, Danny Hillis, data acquisition, data science, deep learning, DeepMind, Demis Hassabis, Dennis Ritchie, deskilling, DevOps, Didi Chuxing, digital capitalism, disinformation, do well by doing good, Donald Davies, Donald Trump, Elon Musk, en.wikipedia.org, Erik Brynjolfsson, fake news, Filter Bubble, Firefox, Flash crash, Free Software Foundation, fulfillment center, full employment, future of work, George Akerlof, gig economy, glass ceiling, Glass-Steagall Act, Goodhart's law, Google Glasses, Gordon Gekko, gravity well, greed is good, Greyball, Guido van Rossum, High speed trading, hiring and firing, Home mortgage interest deduction, Hyperloop, income inequality, independent contractor, index fund, informal economy, information asymmetry, Internet Archive, Internet of things, invention of movable type, invisible hand, iterative process, Jaron Lanier, Jeff Bezos, jitney, job automation, job satisfaction, John Bogle, John Maynard Keynes: Economic Possibilities for our Grandchildren, John Maynard Keynes: technological unemployment, John Zimmer (Lyft cofounder), Kaizen: continuous 
improvement, Ken Thompson, Kevin Kelly, Khan Academy, Kickstarter, Kim Stanley Robinson, knowledge worker, Kodak vs Instagram, Lao Tzu, Larry Ellison, Larry Wall, Lean Startup, Leonard Kleinrock, Lyft, machine readable, machine translation, Marc Andreessen, Mark Zuckerberg, market fundamentalism, Marshall McLuhan, McMansion, microbiome, microservices, minimum viable product, mortgage tax deduction, move fast and break things, Network effects, new economy, Nicholas Carr, Nick Bostrom, obamacare, Oculus Rift, OpenAI, OSI model, Overton Window, packet switching, PageRank, pattern recognition, Paul Buchheit, peer-to-peer, peer-to-peer model, Ponzi scheme, post-truth, race to the bottom, Ralph Nader, randomized controlled trial, RFC: Request For Comment, Richard Feynman, Richard Stallman, ride hailing / ride sharing, Robert Gordon, Robert Metcalfe, Ronald Coase, Rutger Bregman, Salesforce, Sam Altman, school choice, Second Machine Age, secular stagnation, self-driving car, SETI@home, shareholder value, Silicon Valley, Silicon Valley startup, skunkworks, Skype, smart contracts, Snapchat, Social Responsibility of Business Is to Increase Its Profits, social web, software as a service, software patent, spectrum auction, speech recognition, Stephen Hawking, Steve Ballmer, Steve Jobs, Steven Levy, Stewart Brand, stock buybacks, strong AI, synthetic biology, TaskRabbit, telepresence, the built environment, the Cathedral and the Bazaar, The future is already here, The Future of Employment, the map is not the territory, The Nature of the Firm, The Rise and Fall of American Growth, The Wealth of Nations by Adam Smith, Thomas Davenport, Tony Fadell, Tragedy of the Commons, transaction costs, transcontinental railway, transportation-network company, Travis Kalanick, trickle-down economics, two-pizza team, Uber and Lyft, Uber for X, uber lyft, ubercab, universal basic income, US Airways Flight 1549, VA Linux, warehouse automation, warehouse robotics, Watson beat the top human 
players on Jeopardy!, We are the 99%, web application, Whole Earth Catalog, winner-take-all economy, women in the workforce, Y Combinator, yellow journalism, zero-sum game, Zipcar

CHAPTER 8: MANAGING A WORKFORCE OF DJINNS 155 breakthroughs and business processes: Steve Lohr, “The Origins of ‘Big Data’: An Etymological Detective Story,” New York Times, February 1, 2013, https://bits.blogs.nytimes.com/2013/02/01/the-origins-of-big-data-an-etymological-detective-story/. 155 speech recognition and machine translation: Alon Halevy, Peter Norvig, and Fernando Pereira, “The Unreasonable Effectiveness of Data,” IEEE Intelligent Systems, 1541–1672/09, retrieved March 31, 2017, https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/35179.pdf. 156 “the sexiest job of the 21st century”: Thomas Davenport and D. J. Patil, “Data Scientist: The Sexiest Job of the 21st Century,” Harvard Business Review, October 2012, https://hbr.org/2012/10/data-scientist-the-sexiest-job-of-the-21st-century. Hal Varian had used this same phrase about statistics in 2009. See “Hal Varian on How the Web Challenges Managers,” McKinsey & Company, January 2009, http://www.mckinsey.com/industries/high-tech/our-insights/hal-varian-on-how-the-web-challenges-managers. 157 “the right values for these parameters is something of a black art”: Sergey Brin and Larry Page, “The Anatomy of a Large-Scale Hypertextual Web Search Engine,” Stanford University, retrieved March 31, 2017, http://infolab.stanford.edu/~backrub/google.html. 158 as many as 50,000 subsignals: Danny Sullivan, “FAQ: All About the Google RankBrain Algorithm,” Search Engine Land, June 23, 2016, http://searchengineland.com/faq-all-about-the-new-google-rankbrain-algorithm-234440. 158 “new synapses for the global brain”: Tim O’Reilly, “Freebase Will Prove Addictive,” O’Reilly Radar, March 8, 2007, http://radar.oreilly.com/2007/03/freebase-will-prove-addictive.html. 
158 “10 experiments for every successful launch”: Matt McGee, “BusinessWeek Dives Deep into Google’s Search Quality,” Search Engine Land, October 6, 2009, http://searchengineland.com/businessweek-dives-deep-into-googles-search-quality-27317. 159 the manual that they provide: Search Quality Evaluator Guide, Google, March 14, 2017, http://static.googleusercontent.com/media/www.google.com/en//insidesearch/howsearchworks/assets/searchqualityevaluatorguidelines.pdf. 160 “Another big difference”: Brin and Page, “The Anatomy of a Large-Scale Hypertextual Web Search Engine,” Section 3.2.

Its insight, that “simple models and a lot of data trump more elaborate models based on less data,” has been fundamental to progress in field after field, and is at the heart of many Silicon Valley companies. It is even more central to the latest breakthroughs in artificial intelligence. In 2008, D. J. Patil at LinkedIn and Jeff Hammerbacher at Facebook coined the term data science to describe their jobs, naming a field that a few years later was dubbed by Harvard Business Review as “the sexiest job of the 21st century.” Understanding the data science mindset and approach and how it differs from older methods of programming is critical for anyone who is grappling with the challenges of the twenty-first century. How Google deals with search quality provides important lessons.

Tellingly, Jeff Hammerbacher, who worked on Wall Street before leading the data team at Facebook, once said, “The best minds of my generation are thinking about how to make people click ads. That sucks.” Jeff left Facebook and now plays a dual role as chief scientist and cofounder at big data company Cloudera and faculty member of the Icahn School of Medicine at Mount Sinai, in New York, where he runs the Hammer Lab, a team of software developers and data scientists trying to understand how the immune system battles cancer. The choice of the problems to which we apply the superpowers of our new digital workforce is ultimately up to us. We are creating a race of djinns, eager to do our bidding. What shall we ask them to do? 9 “A HOT TEMPER LEAPS O’ER A COLD DECREE” I SPOKE IN EARLY 2017 AT A GATHERING OF MINISTERS FROM the Organisation for Economic Co-operation and Development (OECD) and G20 nations to discuss the digital future.


pages: 788 words: 223,004

Merchants of Truth: The Business of News and the Fight for Facts by Jill Abramson

"World Economic Forum" Davos, 23andMe, 4chan, Affordable Care Act / Obamacare, Alexander Shulgin, Apple's 1984 Super Bowl advert, barriers to entry, Bernie Madoff, Bernie Sanders, Big Tech, Black Lives Matter, Cambridge Analytica, Charles Lindbergh, Charlie Hebdo massacre, Chelsea Manning, citizen journalism, cloud computing, commoditize, content marketing, corporate governance, creative destruction, crowdsourcing, data science, death of newspapers, digital twin, diversified portfolio, Donald Trump, East Village, Edward Snowden, fake news, Ferguson, Missouri, Filter Bubble, future of journalism, glass ceiling, Google Glasses, haute couture, hive mind, income inequality, information asymmetry, invisible hand, Jeff Bezos, Joseph Schumpeter, Khyber Pass, late capitalism, Laura Poitras, Marc Andreessen, Mark Zuckerberg, move fast and break things, Nate Silver, new economy, obamacare, Occupy movement, Paris climate accords, performance metric, Peter Thiel, phenotype, pre–internet, race to the bottom, recommendation engine, Robert Mercer, Ronald Reagan, Saturday Night Live, self-driving car, sentiment analysis, Sheryl Sandberg, Silicon Valley, Silicon Valley ideology, Silicon Valley startup, skunkworks, Snapchat, social contagion, social intelligence, social web, SoftBank, Steve Bannon, Steve Jobs, Steven Levy, tech billionaire, technoutopianism, telemarketer, the scientific method, The Wisdom of Crowds, Tim Cook: Apple, too big to fail, vertical integration, WeWork, WikiLeaks, work culture , Yochai Benkler, you are the product

Her team created: Mathew Ingram, “BuzzFeed Opens Up Access to Its Viral Dashboard,” Gigaom, September 2, 2010, https://gigaom.com/2010/09/02/buzzfeed-opens-up-access-to-its-viral-dashboard/. This was called Social Lift: Dao Nguyen and Ky Harlin, “How BuzzFeed Thinks about Data Science,” BuzzFeed, September 24, 2014, https://www.buzzfeed.com/daozers/how-buzzfeed-thinks-about-data-science?utm_term=.cuPg946z1#.uqLpMYjOL. The dashboard offered more than: Felix Oberholzer-Gee, “BuzzFeed—The Promise of Native Advertising,” Harvard Business School Case 714-512, June 2014 (revised August 2014), 539; Christine Lagorio-Chafkin, “Meet BuzzFeed’s Secret Weapon,” Inc., September 2, 2014, https://www.inc.com/christine-lagorio/buzzfeed-secret-growth-weapon.html.

When Zuckerberg shared a photo of himself giving out full-size candy bars for Halloween, Peretti made known that he “liked” it and shared a link to BuzzFeed’s coverage of a trick-or-treat stunt, led by Matt Stopera’s fawning confession, “I personally really admire him for this.” He kept in touch with Cameron Marlow, his grad-school friend who had since become Facebook’s chief data scientist, by posting public messages back and forth on each other’s “walls.” He made “Facebook-official” friendships with the company’s heads of product, media partnerships, and global creative strategy, one of its cofounders, and a board member. Facebook was growing fast. By 2008 the number of monthly users had nearly tripled, and over the next two years it would more than quadruple to over 600 million.

“News might not be as big a business as entertainment, but news is the best way to have a big impact on the world.” He disliked wishy-washy barometers like “impact,” but for the time being the term would have to do. The practical value of journalism was too nebulous for his quantitatively inclined mind. Then he received a gift from Nguyen and her team of data scientists. It was a new tool built into the dashboard, which they called a Heat Map. The software noted how far down the page a reader scrolled, then collated those data in one simple visual with whether the story lost readers’ attention and where. It was just what BuzzFeed needed to apply its analytical rigor to stories that were longer and larger in scope.


pages: 467 words: 149,632

If Then: How Simulmatics Corporation Invented the Future by Jill Lepore

A Declaration of the Independence of Cyberspace, Alvin Toffler, anti-communist, Apollo 11, Buckminster Fuller, Cambridge Analytica, company town, computer age, coronavirus, cuban missile crisis, data science, desegregation, don't be evil, Donald Trump, Dr. Strangelove, Elon Musk, fake news, game design, George Gilder, Grace Hopper, Hacker Ethic, Howard Zinn, index card, information retrieval, Jaron Lanier, Jeff Bezos, Jeffrey Epstein, job automation, John Perry Barlow, land reform, linear programming, Mahatma Gandhi, Marc Andreessen, Mark Zuckerberg, mass incarceration, Maui Hawaii, Menlo Park, military-industrial complex, New Journalism, New Urbanism, Norbert Wiener, Norman Mailer, packet switching, Peter Thiel, profit motive, punch-card reader, RAND corporation, Robert Bork, Ronald Reagan, Rosa Parks, self-driving car, Silicon Valley, SimCity, smart cities, social distancing, South China Sea, Stewart Brand, technoutopianism, Ted Sorensen, Telecommunications Act of 1996, urban renewal, War on Poverty, white flight, Whole Earth Catalog

Although, notably, Google abandoned “Don’t be evil” in 2015, and Alphabet Inc. adopted a code of conduct that urged employees to “do the right thing,” http://blogs.wsj.com/digits/2015/10/02/as-google-becomes-alphabet-dont-be-evil-vanishes/. Mathematicians-turned-businessmen Jeff Hammerbacher and D. J. Patil claim to have coined the term “data scientist” in 2008, and not long after, data science was also embraced both within and outside the academy as a new scientific method, a “fourth paradigm,” following the earlier paradigms of empirical, theoretical, and computational analysis. Thomas H. Davenport and D. J. Patil, “Data Scientist: The Sexiest Job of the 21st Century,” Harvard Business Review, October 2012. Tony Hey, Stewart Tansley, and Kristin Michele Tolle, The Fourth Paradigm: Data-Intensive Scientific Discovery (Redmond, WA: Microsoft Research, 2009).

“Don’t be evil,” the motto of Google, marked the limit of a swaggering, devil-may-care ethical ambition; doing good did not come into it.7 Incubated decades before, beneath a honeycombed, geodesic dome in Wading River, this work found a place, too, in universities. In the 2010s, a flood of money into universities attempted to make the study of data a science, with data science initiatives, data science programs, data science degrees, data science centers.8 Much academic research that fell under the label “data science” produced excellent and invaluable work, across many fields of inquiry, findings that would not have been possible without computational discovery.9 And no field should be judged by its worst practitioners. Still, the shadiest data science, like the shadiest behavioral science, grew in influence by way of self-mystification, exaggerated claims, and all-around chicanery, including fast-changing, razzle-dazzle buzzwords, from “big data” to “data analytics.”

Much of university life had by the 2010s followed the model of the Media Lab, collapsing the boundaries between corporate commissions, academic inquiry, and hucksterism; at its worst, behavioral data science’s self-mystification was meant to boggle the mind, daunt critics, and entice corporate sponsors and venture capitalists. “Data science is in its infancy,” wrote one MIT computer scientist in 2015. “Few individuals or organizations understand the potential of and the paradigm shift associated with Data science, let alone understand it conceptually.”13 The more mystification, the wealthier the donors. The number of data science programs stretched into the hundreds, even though little consensus had been reached on the meaning or purpose of “data science.” New disciplines and methods take time to find their way; that’s all to the good.


pages: 320 words: 87,853

The Black Box Society: The Secret Algorithms That Control Money and Information by Frank Pasquale

Adam Curtis, Affordable Care Act / Obamacare, Alan Greenspan, algorithmic trading, Amazon Mechanical Turk, American Legislative Exchange Council, asset-backed security, Atul Gawande, bank run, barriers to entry, basic income, Bear Stearns, Berlin Wall, Bernie Madoff, Black Swan, bonus culture, Brian Krebs, business cycle, business logic, call centre, Capital in the Twenty-First Century by Thomas Piketty, Chelsea Manning, Chuck Templeton: OpenTable:, cloud computing, collateralized debt obligation, computerized markets, corporate governance, Credit Default Swap, credit default swaps / collateralized debt obligations, crowdsourcing, cryptocurrency, data science, Debian, digital rights, don't be evil, drone strike, Edward Snowden, en.wikipedia.org, Evgeny Morozov, Fall of the Berlin Wall, Filter Bubble, financial engineering, financial innovation, financial thriller, fixed income, Flash crash, folksonomy, full employment, Gabriella Coleman, Goldman Sachs: Vampire Squid, Google Earth, Hernando de Soto, High speed trading, hiring and firing, housing crisis, Ian Bogost, informal economy, information asymmetry, information retrieval, information security, interest rate swap, Internet of things, invisible hand, Jaron Lanier, Jeff Bezos, job automation, John Bogle, Julian Assange, Kevin Kelly, Kevin Roose, knowledge worker, Kodak vs Instagram, kremlinology, late fees, London Interbank Offered Rate, London Whale, machine readable, Marc Andreessen, Mark Zuckerberg, Michael Milken, mobile money, moral hazard, new economy, Nicholas Carr, offshore financial centre, PageRank, pattern recognition, Philip Mirowski, precariat, profit maximization, profit motive, public intellectual, quantitative easing, race to the bottom, reality distortion field, recommendation engine, regulatory arbitrage, risk-adjusted returns, Satyajit Das, Savings and loan crisis, search engine result page, shareholder value, Silicon Valley, Snapchat, social intelligence, Spread Networks laid a new fibre optics cable between New York and Chicago, statistical arbitrage, statistical model, Steven Levy, technological solutionism, the scientific method, too big to fail, transaction costs, two-sided market, universal basic income, Upton Sinclair, value at risk, vertical integration, WikiLeaks, Yochai Benkler, zero-sum game

To have been prominent at a critical point in Internet development was a similar piece of luck. Google or Facebook were once in the right place at the right time. It’s not clear whether they are still better than anyone else at online data science, or whether their prominence is such that they’ve become the permanent “default.” We also have to ask whether data science is still key here, or just the data itself. When intermediaries like Google and Facebook leverage their enormous databases of personalized information to target advertising, how much value do they add in the process? This is a matter of some dispute.

Yet choosing a car, or even a restaurant, is not as straightforward as optimizing an engine or routing a drive. Does the recommendation engine take into account, say, whether the restaurant or car company gives its workers health benefits or maternity leave? Could we prompt it to do so? In their race for the most profitable methods of mapping social reality, the data scientists of Silicon Valley and Wall Street tend to treat recommendations as purely technical problems. The values and prerogatives that the encoded rules enact are hidden within black boxes.23 The most obvious question is: Are these algorithmic applications fair? Why, for instance, does YouTube (owned by Google) so consistently beat out other video sites in Google’s video search results?

“Competition is one click away,” chant the Silicon Valley antitrust lawyers when someone calls out a behemoth firm for unfair or misleading business practices.149 It’s not so. Alternatives are demonstrably worse, and likely to remain so as long as the dominant firms’ self-reinforcing data advantage grows. Search and Compensation At the 2013 Governing Algorithms conference at New York University, a data scientist gave a dazzling presentation of how her company maximized ad revenue for its clients. She mapped out information exchanges among networks, advertisers, publishers, and the other stars of the Internet universe, emphasizing how computers are taught by skilled programmers like herself to find unexpected correlations in click-through activity.


pages: 205 words: 71,872

Whistleblower: My Journey to Silicon Valley and Fight for Justice at Uber by Susan Fowler

"Susan Fowler" uber, Airbnb, Albert Einstein, Big Tech, Burning Man, cloud computing, data science, deep learning, DevOps, Donald Trump, Elon Musk, end-to-end encryption, fault tolerance, Grace Hopper, Higgs boson, Large Hadron Collider, Lyft, Maui Hawaii, messenger bag, microservices, Mitch Kapor, Richard Feynman, ride hailing / ride sharing, self-driving car, Silicon Valley, TechCrunch disrupt, Travis Kalanick, Uber for X, uber lyft, work culture

It was that infrastructure—the servers, the operating systems, the networks, and all of the code that connected the applications together—that I would be working on, that I would need to make better, more reliable, and more fault-tolerant. After Engucation came more specialized training to prepare new hires for their particular roles within the company. New data scientists spent time with their data science teams, front-end developers learned how to work with the front-end code, and I would embed with one of the site reliability engineering (SRE) teams and learn the basics before I could join my permanent team. Eamon assigned me to one of his SRE teams for this phase of training.

The entire time I’d been a student at Penn, I’d watched as my undergraduate classmates in both physics and philosophy interviewed with companies like Facebook and Google, did software engineering internships in San Francisco over the summers, and boasted about their high-paying job offers from companies like Palantir and Microsoft. Almost all of the graduate students in my physics lab left academia after they completed their PhDs or finished their postdocs and moved to New York or San Francisco to work as data scientists, software engineers, and product managers. Jumping from physics into the technology industry wasn’t a very big leap: physics was a technical field, and physics students were technically inclined, comfortable with mathematics and computer science. I suspected that the leap from physics into software engineering wouldn’t be that difficult for me.

Everyone went out for drinks that night to celebrate the two new employees: me and the new office manager, Heidi. We were the only women in the office; as I later learned, they had us start on the same day so that we wouldn’t feel “alone.” I was the only woman in the office in a technical role (the only other technical woman, a data scientist, lived and worked in Boston), but that didn’t seem strange or unusual to me; at Penn, I had been the only woman on the floor where my lab was located in the David Rittenhouse Laboratory, and there were only men’s bathrooms on the floor where I worked. As the happy hour came to an end, we all stood outside the bar while each person called an Uber ride home.


pages: 290 words: 90,057

Billion Dollar Brand Club: How Dollar Shave Club, Warby Parker, and Other Disruptors Are Remaking What We Buy by Lawrence Ingrassia

air freight, Airbnb, airport security, Amazon Robotics, augmented reality, barriers to entry, call centre, commoditize, computer vision, data science, fake news, fulfillment center, global supply chain, Hacker News, industrial robot, Jeff Bezos, Kickstarter, Kiva Systems, Lyft, Mark Zuckerberg, minimum viable product, natural language processing, Netflix Prize, rolodex, San Francisco homelessness, side project, Silicon Valley, Silicon Valley startup, Snapchat, Steve Jobs, supply-chain management, Uber and Lyft, uber lyft, warehouse automation, warehouse robotics, WeWork

A “decision tree” algorithm, as its name implies, correlates a cascading series of dozens or hundreds or thousands of data points (like branches on a tree) to winnow out items you probably won’t like and add things you probably will. “Random forest” algorithms are ensembles of dozens or even hundreds of different simpler algorithms that work together and correct for possible errors. Stitch Fix, which employs a chief algorithms officer and more than a hundred data scientists, probably is the only consumer product company ever to post on its website an “Algorithms Tour.” An elaborate and lengthy online graphic, it explains how Stitch Fix uses data (some not having an obvious correlation or connection) to act as a clothing matchmaker. “Each attribute that describes a piece of merchandise can be represented as data and reconciled to each client’s unique preferences,” Eric Colson, Stitch Fix’s chief algorithms officer, explains on the company’s MultiThreaded blog.
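The ensemble idea in the excerpt above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Stitch Fix's method: it bags one-split trees ("stumps") over bootstrap samples and takes a majority vote, which is a simplified relative of a random forest (a full random forest also grows deeper trees and samples features at each split). The function names, the feature layout, and the "like"/"dislike" labels are all hypothetical.

```python
import random
from collections import Counter


def train_stump(rows):
    """Fit a one-split tree: keep the (feature, threshold) with the fewest training errors."""
    best = None
    n_features = len(rows[0][0])
    for f in range(n_features):
        for x, _ in rows:
            t = x[f]  # candidate threshold taken from the data itself
            left = [y for xi, y in rows if xi[f] <= t]
            right = [y for xi, y in rows if xi[f] > t]
            left_label = Counter(left).most_common(1)[0][0]
            right_label = Counter(right).most_common(1)[0][0] if right else left_label
            errors = sum(y != left_label for y in left) + sum(y != right_label for y in right)
            if best is None or errors < best[0]:
                best = (errors, f, t, left_label, right_label)
    _, f, t, left_label, right_label = best
    return lambda x: left_label if x[f] <= t else right_label


def train_forest(rows, n_trees=25, seed=0):
    """Train each stump on a bootstrap sample (rows drawn with replacement)."""
    rng = random.Random(seed)
    return [train_stump([rng.choice(rows) for _ in rows]) for _ in range(n_trees)]


def predict(forest, x):
    """Majority vote: an occasional badly trained stump is outvoted by the rest."""
    votes = Counter(tree(x) for tree in forest)
    return votes.most_common(1)[0][0]


# Hypothetical training data: (price_tier, formality_score) -> client reaction.
ROWS = [
    ((1, 5), "dislike"), ((2, 3), "dislike"), ((3, 8), "dislike"), ((4, 1), "dislike"),
    ((7, 4), "like"), ((8, 9), "like"), ((9, 2), "like"), ((10, 6), "like"),
]
FOREST = train_forest(ROWS)
```

Calling `predict(FOREST, (11, 5))` returns the ensemble's vote for an unseen item. Averaging many weak, differently trained trees is what lets the ensemble "correct for possible errors" made by any single one.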

As Stitch Fix has developed more sophisticated algorithms, it has incorporated the use of computer vision to help select clothing. “We have our machines look at photos of clothing that customers like (e.g., from Pinterest), and look for visually similar items,” the website explains. And while the company initially sold apparel and accessories made by others, its data scientists in 2017 started designing “Stitch Fix exclusive brand” items by combining different style characteristics from popular clothing. In-house designers create these “Hybrid Designs” by taking ideas generated by artificial intelligence about the kinds of clothing its customers might like. Predictive analysis is now employed by a wide variety of digitally native brands.

Then the women were asked if they liked the color better than the results they got with a do-it-yourself kit at a drugstore. “We recruited people who were willing to put this product on their hair, with no idea who we really are. The results were positive enough for us to say we have something. Now what?” he recalls. The next step: tapping their knowledge of tech and data science to figure out how to replicate color customization for sale to fifty thousand women, not just fifty. “We wanted to do something that was really innovative, not bullshit innovative,” Mourad says. “Really different and value added.” Omar Mourad, Tamim’s younger brother and another of PriceGrabber’s cofounders, read everything he could about dyeing hair.


pages: 248 words: 73,689

Age of the City: Why Our Future Will Be Won or Lost Together by Ian Goldin, Tom Lee-Devlin

15-minute city, 1960s counterculture, agricultural Revolution, Alvin Toffler, Anthropocene, anti-globalists, Berlin Wall, Bonfire of the Vanities, Brixton riot, call centre, car-free, carbon footprint, Cass Sunstein, charter city, Chuck Templeton: OpenTable:, clean water, cloud computing, congestion charging, contact tracing, coronavirus, COVID-19, CRISPR, data science, David Brooks, David Ricardo: comparative advantage, decarbonisation, deindustrialization, Deng Xiaoping, desegregation, Edward Glaeser, Edward Jenner, Enrique Peñalosa, fake news, Fall of the Berlin Wall, financial engineering, financial independence, future of work, General Motors Futurama, gentrification, germ theory of disease, global pandemic, global supply chain, global village, Haight Ashbury, Hernando de Soto, high-speed rail, household responsibility system, housing crisis, Howard Rheingold, income per capita, Induced demand, industrial robot, informal economy, invention of the printing press, invention of the wheel, Jane Jacobs, Jeff Bezos, job automation, John Perry Barlow, John Snow's cholera map, Kickstarter, knowledge economy, knowledge worker, labour mobility, Lewis Mumford, lockdown, Louis Pasteur, low interest rates, low skilled workers, manufacturing employment, Marshall McLuhan, mass immigration, megacity, Neal Stephenson, Network effects, New Urbanism, offshore financial centre, open borders, open economy, Pearl River Delta, race to the bottom, Ray Oldenburg, remote working, rent control, Republic of Letters, Richard Florida, ride hailing / ride sharing, rising living standards, Salesforce, Shenzhen special economic zone, smart cities, smart meter, Snow Crash, social distancing, special economic zone, spinning jenny, Steve Jobs, Stewart Brand, superstar cities, the built environment, The Death and Life of Great American Cities, The Great Good Place, The Wealth of Nations by Adam Smith, trade liberalization, trade route, Upton Sinclair, uranium enrichment, urban decay, urban planning, urban sprawl, Victor Gruen, white flight, working poor, working-age population, zero-sum game, zoonotic diseases

The combination of globalization and rapid technological progress has led to the disappearance of many manufacturing and clerical jobs – long anchors of the middle class – as work was either eliminated entirely or sent overseas to be performed more cheaply. As these jobs have disappeared, a barbell-shaped economy has emerged, characterized by high-paid knowledge jobs, such as management consultants and data scientists, on the one hand and low-paid service jobs, like baristas and warehouse workers, on the other. In the words of economists Maarten Goos and Alan Manning, we have seen the workforce split into ‘lousy’ and ‘lovely’ jobs.23 Manufacturing jobs in particular have long been the life force behind smaller cities and also many rural towns.

Company value chains often cross industry boundaries. Proximity to financial and professional services, for example, is an important drawcard for many large cities. The presence of top universities is another factor. The same universities that produce investment bankers also produce lawyers, data scientists and medics. Lastly, university-educated professionals typically choose to live in places where other university-educated professionals live. Partly this is about humans’ natural tendency to gravitate towards people with similar outlooks to themselves, what sociologists call ‘homophily’. Increasingly, it is also about romance.

As automation makes it increasingly challenging for poor countries to follow the traditional path of industrial development, a significant expansion of investment in education will be required. While we are sceptical that the future of knowledge work is entirely remote, we do see scope for certain activities to be shifted to well-educated workers in poor countries in cases where the gains to companies from lower wages overseas outweigh the disadvantages of remote collaboration. Data science, financial analysis and doubtless other areas of knowledge work could probably be performed from abroad. The experience of Bangalore suggests that a hub for offshoring knowledge work can also mature into a global centre of innovation in its own right. It is vital that, as they become more prosperous, the cities of the developing world do not adopt the car-based sprawl of many rich world cities.


pages: 389 words: 87,758

No Ordinary Disruption: The Four Global Forces Breaking All the Trends by Richard Dobbs, James Manyika

2013 Report for America's Infrastructure - American Society of Civil Engineers - 19 March 2013, access to a mobile phone, additive manufacturing, Airbnb, Amazon Mechanical Turk, American Society of Civil Engineers: Report Card, asset light, autonomous vehicles, Bakken shale, barriers to entry, business cycle, business intelligence, carbon tax, Carmen Reinhart, central bank independence, circular economy, cloud computing, corporate governance, creative destruction, crowdsourcing, data science, demographic dividend, deskilling, digital capitalism, disintermediation, disruptive innovation, distributed generation, driverless car, Erik Brynjolfsson, financial innovation, first square of the chessboard, first square of the chessboard / second half of the chessboard, Gini coefficient, global supply chain, global village, high-speed rail, hydraulic fracturing, illegal immigration, income inequality, index fund, industrial robot, intangible asset, Intergovernmental Panel on Climate Change (IPCC), Internet of things, inventory management, job automation, Just-in-time delivery, Kenneth Rogoff, Kickstarter, knowledge worker, labor-force participation, low interest rates, low skilled workers, Lyft, M-Pesa, machine readable, mass immigration, megacity, megaproject, mobile money, Mohammed Bouazizi, Network effects, new economy, New Urbanism, ocean acidification, oil shale / tar sands, oil shock, old age dependency ratio, openstreetmap, peer-to-peer lending, pension reform, pension time bomb, private sector deleveraging, purchasing power parity, quantitative easing, recommendation engine, Report Card for America’s Infrastructure, RFID, ride hailing / ride sharing, Salesforce, Second Machine Age, self-driving car, sharing economy, Silicon Valley, Silicon Valley startup, Skype, smart cities, Snapchat, sovereign wealth fund, spinning jenny, stem cell, Steve Jobs, subscription business, supply-chain management, synthetic biology, TaskRabbit, The Great Moderation, trade route, transaction costs, Travis Kalanick, uber lyft, urban sprawl, Watson beat the top human players on Jeopardy!, working-age population, Zipcar

As big data emerged as the next big opportunity in sectors ranging from finance to government, both the talent supply and employers’ understanding of the skills they need struggled to keep up. “There aren’t enough data scientists, not even close,” said Sandy Pentland, a computer scientist and management thinker at MIT. “We tend to teach people that everything that matters happens between your ears when in fact it actually mostly happens between people.” Pentland argues that the lack of data scientists makes it more difficult to fully apply the technology.5 More than two-thirds of companies are struggling against limited or no capabilities in data analytics techniques.6 The story is not restricted to data analytics positions.

Richard Dobbs, Anu Madgavkar, Dominic Barton, Eric Labaye, James Manyika, Charles Roxburgh, Susan Lund, and Siddarth Madhav, “The world at work: Jobs, pay, and skills for 3.5 billion people,” June 2012, McKinsey & Company, www.mckinsey.com/insights/employment_and_growth/the_world_at_work. 5. Danny Palmer, “Not enough data scientists, MIT expert tells Computing,” Computing, September 4, 2013, www.computing.co.uk/ctg/news/2292485/not-enough-data-scientists-mit-expert-tells-computing. 6. Thomas Wailgum, “Monday metric: 68% of companies struggle with big data analytics,” ASUG News, March 18, 2013, www.asugnews.com/article/monday-metric-68-of-companies-struggle-with-big-data-analytics. 7.

Upgrading to premium membership—monthly prices start at $59.99 per month for the Business Plus account—affords the user greater insight into who has been looking at his or her profile, the ability to send more messages to potential leads, and the use of more advanced search filters.60 A third model is monetization of big data, either through innovative business-to-business offerings (for example, crowd-sourcing business intelligence or outsourced data science services) or through developing more relevant products, services, or content for which consumers are willing to pay. LinkedIn, for example, makes 20 percent of its revenue from subscriptions, 30 percent from marketing, and 50 percent from talent solutions, a core part of which is selling targeted talent intelligence and tools to recruiters.61 You will have to keep experimenting in order to capture more consumer surplus for your business.


pages: 345 words: 75,660

Prediction Machines: The Simple Economics of Artificial Intelligence by Ajay Agrawal, Joshua Gans, Avi Goldfarb

Abraham Wald, Ada Lovelace, AI winter, Air France Flight 447, Airbus A320, algorithmic bias, AlphaGo, Amazon Picking Challenge, artificial general intelligence, autonomous vehicles, backpropagation, basic income, Bayesian statistics, Black Swan, blockchain, call centre, Capital in the Twenty-First Century by Thomas Piketty, Captain Sullenberger Hudson, carbon tax, Charles Babbage, classic study, collateralized debt obligation, computer age, creative destruction, Daniel Kahneman / Amos Tversky, data acquisition, data is the new oil, data science, deep learning, DeepMind, deskilling, disruptive innovation, driverless car, Elon Musk, en.wikipedia.org, Erik Brynjolfsson, everywhere but in the productivity statistics, financial engineering, fulfillment center, general purpose technology, Geoffrey Hinton, Google Glasses, high net worth, ImageNet competition, income inequality, information retrieval, inventory management, invisible hand, Jeff Hawkins, job automation, John Markoff, Joseph Schumpeter, Kevin Kelly, Lyft, Minecraft, Mitch Kapor, Moneyball by Michael Lewis explains big data, Nate Silver, new economy, Nick Bostrom, On the Economy of Machinery and Manufactures, OpenAI, paperclip maximiser, pattern recognition, performance metric, profit maximization, QWERTY keyboard, race to the bottom, randomized controlled trial, Ray Kurzweil, ride hailing / ride sharing, Robert Solow, Salesforce, Second Machine Age, self-driving car, shareholder value, Silicon Valley, statistical model, Stephen Hawking, Steve Jobs, Steve Jurvetson, Steven Levy, strong AI, The Future of Employment, the long tail, The Signal and the Noise by Nate Silver, Tim Cook: Apple, trolley problem, Turing test, Uber and Lyft, uber lyft, US Airways Flight 1549, Vernor Vinge, vertical integration, warehouse automation, warehouse robotics, Watson beat the top human players on Jeopardy!, William Langewiesche, Y Combinator, zero-sum game

Cardiogram, in its preliminary study, used six thousand people, including just two hundred with an irregular heart rhythm. Over time, one way to collect further data is through feedback on whether the app’s users have or develop irregular heart rhythms. Where did the six thousand come from? Data scientists have excellent tools for assessing the amount of data required given the expected reliability of the prediction and the need for accuracy. These tools are called “power calculations” and tell you how many units you need to analyze to generate a useful prediction.5 The salient management point is that you must make a trade-off: more accurate predictions require more units to study, and acquiring these additional units can be costly.
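The trade-off described above, where more reliable conclusions require more units, can be made concrete with a textbook power calculation. The sketch below is a minimal illustration, assuming a two-group comparison of means with a standardized effect size (Cohen's d) and the usual normal approximation; it is not the calculation Cardiogram used, and the function name and defaults are hypothetical.

```python
import math
from statistics import NormalDist


def sample_size_per_group(effect_size, alpha=0.05, power=0.80):
    """Units needed in each of two groups to detect `effect_size` (Cohen's d).

    Uses the normal-approximation formula n = 2 * ((z_{1-alpha/2} + z_{power}) / d)^2
    for a two-sided test at significance level `alpha`.
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)  # critical value of the test
    z_power = standard_normal.inv_cdf(power)          # quantile for the desired power
    return math.ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


# Halving the effect you want to detect roughly quadruples the data you need.
medium_effect_n = sample_size_per_group(0.5)
small_effect_n = sample_size_per_group(0.25)
```

Because the required sample grows with the inverse square of the effect size, detecting subtler signals becomes expensive quickly, which is the management trade-off the passage points to.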

A sabermetric analyst develops measures for the rewards that the team would receive from signing different players. Sabermetric analysts are baseball’s reward function engineers. Now, most teams have at least one such analyst, and the role has appeared, under different names, in other sports. Better prediction created a new high-level position on the org chart. The research scientists, data scientists, and vice presidents of analytics are listed as key roles in the online front office directories. The Houston Astros even have a separate decision sciences unit headed by former NASA engineer Sig Mejdal. The strategic change also means a switch in who the team employs to pick the players. These analytics experts have mathematical skills, but the finest of them understand best what to tell the prediction machine to do.

Joshua holds a PhD in economics from Stanford University and, in 2008, was awarded the Economic Society of Australia’s Young Economist Award (the Australian equivalent of the John Bates Clark medal). AVI GOLDFARB is the Ellison Professor of Marketing at the Rotman School of Management, University of Toronto. Avi is also chief data scientist at the Creative Destruction Lab, senior editor at Marketing Science, and a research associate at the National Bureau of Economic Research. Avi’s research focuses on the opportunities and challenges of the digital economy, with funding from Google, Industry Canada, Bell Canada, AIMIA, SSHRC, the National Science Foundation, the Sloan Foundation, and others.


pages: 392 words: 114,189

The Ransomware Hunting Team: A Band of Misfits' Improbable Crusade to Save the World From Cybercrime by Renee Dudley, Daniel Golden

2021 United States Capitol attack, Amazon Web Services, Bellingcat, Berlin Wall, bitcoin, Black Lives Matter, blockchain, Brian Krebs, call centre, centralized clearinghouse, company town, coronavirus, corporate governance, COVID-19, cryptocurrency, data science, disinformation, Donald Trump, fake it until you make it, Hacker News, heat death of the universe, information security, late fees, lockdown, Menlo Park, Minecraft, moral hazard, offshore financial centre, Oklahoma City bombing, operational security, opioid epidemic / opioid crisis, Picturephone, pirate software, publish or perish, ransomware, Richard Feynman, Ross Ulbricht, seminal paper, smart meter, social distancing, strikebreaker, subprime mortgage crisis, tech worker, Timothy McVeigh, union organizing, War on Poverty, Y2K, zero day

Concerned about whether the HTCU could provide extra attention that people with autism might need, even broad-minded Marijn had his doubts about Peter’s pitch. But an HTCU program coordinator named Yvonne Horst was a believer. At her urging, the HTCU agreed to give Mark a six-month trial internship. Mark started as a junior data science intern at the HTCU at the same time as three professional data scientists hired by the unit. His new colleagues had university degrees and career experience at major companies such as accounting giant KPMG. Mark’s inexperience starkly contrasted with their advanced skills. “My six-month introductory course at ITvitae is not comparable to four years of university studies,” he conceded.

However, the Dutch National Police also sought candidates to work in its ten regional cyber squads, which handled more routine cases. Over time, ITvitae sent about two dozen additional students to work on those regional squads. Some handled tedious yet essential tasks such as reviewing crime-scene footage from security cameras, and were able to home in on small details that others might have missed. Others worked as data scientists or digital investigators. “If Mark did not advance the cold case, then Tom wouldn’t have fulfilled his dream to work at the police, and twenty-five others probably wouldn’t have had the opportunity,” Peter said, his voice quaking with emotion. “They are misfits. But they are very, very important to have at your organization

After crippling the DCH Regional Medical Center in Tuscaloosa, Alabama, and other hospitals in 2019, it doubled down on healthcare attacks in October 2020, sowing anxiety and confusion among patients and providers across the United States. The timing suggests that Ryuk was avenging one of the biggest and most damaging actions taken against ransomware. Since 2018, Microsoft’s Digital Crimes Unit—consisting of more than forty full-time investigators, analysts, data scientists, engineers, and attorneys—had been investigating TrickBot, the Russian malware that delivered Ryuk into victims’ computers. Concerns that the Putin regime might use TrickBot to disrupt the 2020 U.S. presidential election added urgency to the task, though the fears would prove unfounded. Microsoft investigators analyzed 61,000 samples of TrickBot malware, as well as the infrastructure underpinning the network of infected computers.


pages: 370 words: 112,809

The Equality Machine: Harnessing Digital Technology for a Brighter, More Inclusive Future by Orly Lobel

2021 United States Capitol attack, 23andMe, Ada Lovelace, affirmative action, Airbnb, airport security, Albert Einstein, algorithmic bias, Amazon Mechanical Turk, augmented reality, barriers to entry, basic income, Big Tech, bioinformatics, Black Lives Matter, Boston Dynamics, Charles Babbage, choice architecture, computer vision, Computing Machinery and Intelligence, contact tracing, coronavirus, corporate social responsibility, correlation does not imply causation, COVID-19, crowdsourcing, data science, David Attenborough, David Heinemeier Hansson, deep learning, deepfake, digital divide, digital map, Elon Musk, emotional labour, equal pay for equal work, feminist movement, Filter Bubble, game design, gender pay gap, George Floyd, gig economy, glass ceiling, global pandemic, Google Chrome, Grace Hopper, income inequality, index fund, information asymmetry, Internet of things, invisible hand, it's over 9,000, iterative process, job automation, Lao Tzu, large language model, lockdown, machine readable, machine translation, Mark Zuckerberg, market bubble, microaggression, Moneyball by Michael Lewis explains big data, natural language processing, Netflix Prize, Network effects, Northpointe / Correctional Offender Management Profiling for Alternative Sanctions, occupational segregation, old-boy network, OpenAI, openstreetmap, paperclip maximiser, pattern recognition, performance metric, personalized medicine, price discrimination, publish or perish, QR code, randomized controlled trial, remote working, risk tolerance, robot derives from the Czech word robota Czech, meaning slave, Ronald Coase, Salesforce, self-driving car, sharing economy, Sheryl Sandberg, Silicon Valley, social distancing, social intelligence, speech recognition, statistical model, stem cell, Stephen Hawking, Steve Jobs, Steve Wozniak, surveillance capitalism, tech worker, TechCrunch disrupt, The Future of Employment, TikTok, Turing test, universal basic income, Wall-E, warehouse automation, women in the workforce, work culture, you are the product

For example, the Bread for the World Institute recently issued a report showing that 92 percent of gender-specific economic data is missing from Africa. We are not measuring the struggles of those who are most in need. Millions remain in the shadows. But we can tackle the problem today thanks to technology, and the institute is working with volunteer coders, data scientists, statisticians, and graphic designers to begin to systematically collect what matters—to materialize the missing data and bring problems like these to light. Supreme Court Justice Louis Brandeis famously said that sunlight is the best of disinfectants, and electric light the most efficient policeman.

An equality machine mindset actively charts the course of the future, anticipating the many ways in which that future is unknown. This means designing governance systems and infrastructure that will continue to channel technological advancement down a progressive path. Inside Out, Outside In In late 2021, Frances Haugen, a thirty-seven-year-old data scientist, became one of the most famous whistleblowers in recent times. Testifying before both American and European legislatures, Haugen revealed that Facebook, her former employer, was time and again choosing profit over the safety and well-being of its users. Haugen asserted that Facebook consistently valued profit over safety by allowing algorithms to favor hateful content in order to bring users back to the social media platform for more traffic.

As we move beyond traditional litigation frameworks, government agencies also become research and development arms that incentivize, test, approve, and monitor proactive prevention programs. The immense challenge of harnessing technology for equality is one that must involve people from all disciplines and sectors. Social scientists, for example, must work with data scientists to provide context and ask the critical questions about definitions, the sources of data, and the interpretation of patterns. There are accelerated, heated debates and numerous legislative proposals to tighten the regulation of digital technology, including to amend Section 230 of the U.S. Communications Decency Act of 1996 in ways that would limit digital platform immunity and require platforms to moderate illegal content.


pages: 571 words: 105,054

Advances in Financial Machine Learning by Marcos Lopez de Prado

algorithmic trading, Amazon Web Services, asset allocation, backtesting, behavioural economics, bioinformatics, Brownian motion, business process, Claude Shannon: information theory, cloud computing, complexity theory, correlation coefficient, correlation does not imply causation, data science, diversification, diversified portfolio, en.wikipedia.org, financial engineering, fixed income, Flash crash, G4S, Higgs boson, implied volatility, information asymmetry, latency arbitrage, margin call, market fragmentation, market microstructure, martingale, NP-complete, P = NP, p-value, paper trading, pattern recognition, performance metric, profit maximization, quantitative trading / quantitative finance, RAND corporation, random walk, risk free rate, risk-adjusted returns, risk/return, selection bias, Sharpe ratio, short selling, Silicon Valley, smart cities, smart meter, statistical arbitrage, statistical model, stochastic process, survivorship bias, transaction costs, traveling salesman

These features were discovered by different analysts studying a wide range of instruments and asset classes. The goal of the strategist is to make sense of all these observations and to formulate a general theory that explains them. Therefore, the strategy is merely the experiment designed to test the validity of this theory. Team members are data scientists with a deep knowledge of financial markets and the economy. Remember, the theory needs to explain a large collection of important features. In particular, a theory must identify the economic mechanism that causes an agent to lose money to us. Is it a behavioral bias? Asymmetric information? Regulatory constraints?

One of the scenarios of interest is how the strategy would perform if history repeated itself. However, the historical path is merely one of the possible outcomes of a stochastic process, and not necessarily the most likely going forward. Alternative scenarios must be evaluated, consistent with the knowledge of the weaknesses and strengths of a proposed strategy. Team members are data scientists with a deep understanding of empirical and experimental techniques. A good backtester incorporates in his analysis meta-information regarding how the strategy came about. In particular, his analysis must evaluate the probability of backtest overfitting by taking into account the number of trials it took to distill the strategy.

I have listed a few of them in the references section. The core audience of this book is investment professionals with a strong ML background. My goals are that you monetize what you learn in this book, help us modernize finance, and deliver actual value for investors. This book also targets data scientists who have successfully implemented ML algorithms in a variety of fields outside finance. If you have worked at Google and have applied deep neural networks to face recognition, but things do not seem to work so well when you run your algorithms on financial data, this book will help you. Sometimes you may not understand the financial rationale behind some structures (e.g., meta-labeling, the triple-barrier method, fracdiff), but bear with me: Once you have managed an investment portfolio long enough, the rules of the game will become clearer to you, along with the meaning of these chapters. 1.5 Requisites Investment management is one of the most multi-disciplinary areas of research, and this book reflects that fact.


pages: 602 words: 177,874

Thank You for Being Late: An Optimist's Guide to Thriving in the Age of Accelerations by Thomas L. Friedman

3D printing, additive manufacturing, affirmative action, Airbnb, AltaVista, Amazon Web Services, Anthropocene, Apple Newton, autonomous vehicles, Ayatollah Khomeini, barriers to entry, Berlin Wall, Bernie Sanders, Big Tech, biodiversity loss, bitcoin, blockchain, Bob Noyce, business cycle, business process, call centre, carbon tax, centre right, Chris Wanstrath, Clayton Christensen, clean tech, clean water, cloud computing, cognitive load, corporate social responsibility, creative destruction, CRISPR, crowdsourcing, data science, David Brooks, deep learning, demand response, demographic dividend, demographic transition, Deng Xiaoping, digital divide, disinformation, Donald Trump, dual-use technology, end-to-end encryption, Erik Brynjolfsson, fail fast, failed state, Fairchild Semiconductor, Fall of the Berlin Wall, Ferguson, Missouri, first square of the chessboard / second half of the chessboard, Flash crash, fulfillment center, game design, gig economy, global pandemic, global supply chain, Great Leap Forward, illegal immigration, immigration reform, income inequality, indoor plumbing, intangible asset, Intergovernmental Panel on Climate Change (IPCC), Internet of things, invention of the steam engine, inventory management, Irwin Jacobs: Qualcomm, Jeff Bezos, job automation, John Markoff, John von Neumann, Khan Academy, Kickstarter, knowledge economy, knowledge worker, land tenure, linear programming, Live Aid, low interest rates, low skilled workers, Lyft, Marc Andreessen, Mark Zuckerberg, mass immigration, Maui Hawaii, Menlo Park, Mikhail Gorbachev, mutually assured destruction, Neil Armstrong, Nelson Mandela, ocean acidification, PalmPilot, pattern recognition, planetary scale, power law, pull request, Ralph Waldo Emerson, ransomware, Ray Kurzweil, Richard Florida, ride hailing / ride sharing, Robert Gordon, Ronald Reagan, Salesforce, Second Machine Age, self-driving car, shareholder value, sharing economy, Silicon Valley, Skype, smart cities, Solyndra, South China Sea, Steve Jobs, subscription business, supercomputer in your pocket, synthetic biology, systems thinking, TaskRabbit, tech worker, TED Talk, The Rise and Fall of American Growth, Thomas L Friedman, Tony Fadell, transaction costs, Transnistria, uber lyft, undersea cable, urban decay, urban planning, Watson beat the top human players on Jeopardy!, WikiLeaks, women in the workforce, Y2K, Yogi Berra, zero-sum game

This approach has driven a lot of innovation in education, most notably the partnership between Udacity, AT&T, and Georgia Tech to create an online master’s degree in computer science for $6,600 for the entire course—as compared with the $45,000 it would cost for two years on campus at Georgia Tech. Coursera has partnered with Johns Hopkins and Rice to create a similar certificate in data science. This is driving down the cost of education for everyone. The education “pie just got bigger,” said Blase. “We can now assist you to get the job of your dreams.” That’s intelligent assistance. “We do $250 million of training a year,” said Blase. A lot is teaching people to climb poles, install services, and run retail stores, but now a lot more is in data science, software-defined networks, Web development, introduction to programming, machine learning, and the Internet of Things.

It can’t just be the advocacy of abstract principles. When you put your value set together with your analysis of how the Machine works and your understanding of how it is affecting people and culture in different contexts, you have a worldview that you can then apply to all kinds of situations to produce your opinions. Just as a data scientist needs an algorithm to cut through all the unstructured data and all the noise to see the relevant patterns, an opinion writer needs a worldview to create heat and light. But to keep that worldview fresh and relevant, I suggested to Bojia, you have to be constantly reporting and learning—more so today than ever.

“The idea,” he explained, “was that if you know exactly how the gas turbine and combustion engine work, you can use the laws of physics and say: ‘This is how it is going to work and when it is going to break.’ There was not a belief in the traditional engineering community that the data had much to offer. They used the data to verify their physics models and then act upon them. The new breed of data scientists here say: ‘You don’t need to understand the physics to look for and find the patterns.’ There are patterns that a human mind could not find, because the signals are so weak early on that you won’t see them. But now that we have all this processing power, those weak signals just pop out at you.


pages: 170 words: 49,193

The People vs Tech: How the Internet Is Killing Democracy (And How We Save It) by Jamie Bartlett

Ada Lovelace, Airbnb, AlphaGo, Amazon Mechanical Turk, Andrew Keen, autonomous vehicles, barriers to entry, basic income, Bernie Sanders, Big Tech, bitcoin, Black Lives Matter, blockchain, Boris Johnson, Californian Ideology, Cambridge Analytica, central bank independence, Chelsea Manning, cloud computing, computer vision, creative destruction, cryptocurrency, Daniel Kahneman / Amos Tversky, data science, deep learning, DeepMind, disinformation, Dominic Cummings, Donald Trump, driverless car, Edward Snowden, Elon Musk, Evgeny Morozov, fake news, Filter Bubble, future of work, general purpose technology, gig economy, global village, Google bus, Hans Moravec, hive mind, Howard Rheingold, information retrieval, initial coin offering, Internet of things, Jeff Bezos, Jeremy Corbyn, job automation, John Gilmore, John Maynard Keynes: technological unemployment, John Perry Barlow, Julian Assange, manufacturing employment, Mark Zuckerberg, Marshall McLuhan, Menlo Park, meta-analysis, mittelstand, move fast and break things, Network effects, Nicholas Carr, Nick Bostrom, off grid, Panopticon Jeremy Bentham, payday loans, Peter Thiel, post-truth, prediction markets, QR code, ransomware, Ray Kurzweil, recommendation engine, Renaissance Technologies, ride hailing / ride sharing, Robert Mercer, Ross Ulbricht, Sam Altman, Satoshi Nakamoto, Second Machine Age, sharing economy, Silicon Valley, Silicon Valley billionaire, Silicon Valley ideology, Silicon Valley startup, smart cities, smart contracts, smart meter, Snapchat, Stanford prison experiment, Steve Bannon, Steve Jobs, Steven Levy, strong AI, surveillance capitalism, TaskRabbit, tech worker, technological singularity, technoutopianism, Ted Kaczynski, TED Talk, the long tail, the medium is the message, the scientific method, The Spirit Level, The Wealth of Nations by Adam Smith, The Wisdom of Crowds, theory of mind, too big to fail, ultimatum game, universal basic income, WikiLeaks, World Values Survey, Y Combinator, you are the product

As I walked in to the ordinary looking office in central London – all offices are normal looking, except those of tech firms – I spotted a framed poster with a picture of Trump and a quote from famed US pollster Frank Luntz: ‘There are no longer any experts except Cambridge Analytica. They were Trump’s digital team who figured out how to win.’ Rows of employees were sitting staring at screens: project managers, IT specialists and data scientists.29 On a shelf in Nix’s glass office were copies of The Bad Boys of Brexit, the book written by UKIP donor Arron Banks, and Stealing Elections by John Fund. He seemed perfectly happy with these techniques, and said that micro-targeting was just getting started and represented the future of campaigning.

Although he rejects any single ‘why’, it’s clear that he thinks data was instrumental: One of our central ideas was that the campaign had to do things in the field of data that have never been done before. This included a) integrating data from social media, online advertising, websites, apps, canvassing, direct mail, polls, online fundraising, activist feedback and some new things we tried such as a new way to do polling . . . and b) having experts in physics and machine learning do proper data science in the way only they can – i.e. far beyond the normal skills applied in political campaigns. We were the first campaign in the UK to put almost all our money into digital communication then have it partly controlled by people whose normal work was subjects like quantum information . . . If you want to make big improvements in communication, my advice is – hire physicists, not communications people from normal companies

This did not seem to bother investors, since this 11-person business, with ambitions to revolutionise the entire trucking industry by building a self-driving fleet, managed to raise millions of dollars of funding from venture capitalists. ‘Everyone thought I was mad,’ 27-year-old Stefan told me when I visited Starsky’s Florida headquarters, a large rented property in a gated community, a few months ago. These days however, like many other industries, trucking is being disrupted by data science, artificial intelligence and venture capital. Stefan had agreed to let me drive in Starsky’s newest and shiniest truck with their resident driver Tony Hughes, a diminutive and friendly man with 20 years’ experience, who is perhaps better described as part-driver, part-machine supervisor. Tony is in his fifties, with a high school diploma in general studies from Shawnee Mission Northwest (Kansas) and a ‘solid track record of achieving efficient, cost-effective transportation operations’, but now finds himself training the machines that might eventually put him out of a job.


pages: 241 words: 43,252

Modern Vim: Craft Your Development Environment With Vim 8 and Neovim by Drew Neil

bash_history, Bram Moolenaar, data science, Debian, DevOps, en.wikipedia.org, functional programming, microservices, pull request, remote working, text mining

Venkat Subramaniam (280 pages) ISBN: 9781934356760 $35 Data Science Essentials in Python Go from messy, unstructured artifacts stored in SQL and NoSQL databases to a neat, well-organized dataset with this quick reference for the busy data scientist. Understand text mining, machine learning, and network analysis; process numeric data with the NumPy and Pandas modules; describe and analyze data using statistical and network-theoretical methods; and see actual examples of data analysis at work. This one-stop solution covers the essential data science you need in Python. Dmitry Zinoviev (224 pages) ISBN: 9781680501841 $29 Practical Programming, Third Edition Classroom-tested by tens of thousands of students, this new edition of the best-selling intro to programming book is for anyone who wants to understand computer science.


pages: 458 words: 116,832

The Costs of Connection: How Data Is Colonizing Human Life and Appropriating It for Capitalism by Nick Couldry, Ulises A. Mejias

"World Economic Forum" Davos, 23andMe, Airbnb, Amazon Mechanical Turk, Amazon Web Services, behavioural economics, Big Tech, British Empire, call centre, Cambridge Analytica, Cass Sunstein, choice architecture, cloud computing, colonial rule, computer vision, corporate governance, dark matter, data acquisition, data is the new oil, data science, deep learning, different worldview, digital capitalism, digital divide, discovery of the americas, disinformation, diversification, driverless car, Edward Snowden, emotional labour, en.wikipedia.org, European colonialism, Evgeny Morozov, extractivism, fake news, Gabriella Coleman, gamification, gig economy, global supply chain, Google Chrome, Google Earth, hiring and firing, income inequality, independent contractor, information asymmetry, Infrastructure as a Service, intangible asset, Internet of things, Jaron Lanier, job automation, Kevin Kelly, late capitalism, lifelogging, linked data, machine readable, Marc Andreessen, Mark Zuckerberg, means of production, military-industrial complex, move fast and break things, multi-sided market, Naomi Klein, Network effects, new economy, New Urbanism, PageRank, pattern recognition, payday loans, Philip Mirowski, profit maximization, Ray Kurzweil, RFID, Richard Stallman, Richard Thaler, Salesforce, scientific management, Scientific racism, Second Machine Age, sharing economy, Shoshana Zuboff, side hustle, Sidewalk Labs, Silicon Valley, Slavoj Žižek, smart cities, Snapchat, social graph, social intelligence, software studies, sovereign wealth fund, surveillance capitalism, techlash, The Future of Employment, the scientific method, Thomas Davenport, Tim Cook: Apple, trade liberalization, trade route, undersea cable, urban planning, W. E. B. Du Bois, wages for housework, work culture , workplace surveillance

Take any aspect of the social world, even ones for which we currently lack a causal model, and simply generate a proxy for it; there is no limit to what might work as such a proxy, and indeed the vagueness as to what is a “proxy variable” is a problem in legal proceedings that increasingly rely on them.64 So data scientists may ask: Could visual cues in Google Street View scenes be proxies of the likelihood of nearby crime? Could patterns in the distribution of more-expensive car models in Google Earth pictures be proxy demographic variables (income levels, relative poverty/wealth)? The temptation to pursue such proxy hunts is considerable, especially when public census data is costly and only intermittently collected.65 The scope for social experimentation that data relations provide to parts of the social quantification sector is huge but has sometimes proved controversial.66 Privacy concerns may act as a constraint, but, if so, China’s assumed lower sensitivity to privacy concerns works as a market advantage for its AI industry.67 Collect Everything If Big Data reasoning relies on the predictive power that comes from repetitive processing of unstructured data, this data can be generated, directly or indirectly, from whole populations.

This claim might shock those who see in “datafication . . . an essential enrichment in human comprehension” or “a great infrastructural project that rivals” the Enlightenment’s Encyclopédie.134 But let’s get beyond the hype and look at what is actually going on in the social sciences today. A good example is the work of celebrated data scientist Alex Pentland at the MIT Media Lab. Pentland contributed to the World Economic Forum’s Global Information Technology Reports in 2008, 2009, and 2014;135 his research team won the Defense Advanced Research Projects Agency’s (DARPA) prize to commemorate the internet’s fortieth anniversary.136 In his book Social Physics, Pentland reaches back to the origins of sociology.

There is, however, one bright note: the rise of critical information science that seeks to systematically establish the distortions woven into social caching’s presentations of the world. There is now an emerging movement for algorithmic justice, which has raised awareness of many specific issues. The critical movement within data science concerned with “Fairness, Accountability and Transparency” has regular conferences, and a number of universities have focused programs for investigating how algorithms cover the social domain.160 More generally, an important intersection between critical information science, legal theory, and social theory is opening up the question of how the social qualification sector presents the social world for action by powerful institutions.161 US civil society has generated some effective campaigning.


pages: 1,172 words: 114,305

New Laws of Robotics: Defending Human Expertise in the Age of AI by Frank Pasquale

affirmative action, Affordable Care Act / Obamacare, Airbnb, algorithmic bias, Amazon Mechanical Turk, Anthropocene, augmented reality, Automated Insights, autonomous vehicles, basic income, battle of ideas, Bernie Sanders, Big Tech, Bill Joy: nanobots, bitcoin, blockchain, Brexit referendum, call centre, Cambridge Analytica, carbon tax, citizen journalism, Clayton Christensen, collective bargaining, commoditize, computer vision, conceptual framework, contact tracing, coronavirus, corporate social responsibility, correlation does not imply causation, COVID-19, critical race theory, cryptocurrency, data is the new oil, data science, decarbonisation, deep learning, deepfake, deskilling, digital divide, digital twin, disinformation, disruptive innovation, don't be evil, Donald Trump, Douglas Engelbart, driverless car, effective altruism, Elon Musk, en.wikipedia.org, Erik Brynjolfsson, Evgeny Morozov, fake news, Filter Bubble, finite state, Flash crash, future of work, gamification, general purpose technology, Google Chrome, Google Glasses, Great Leap Forward, green new deal, guns versus butter model, Hans Moravec, high net worth, hiring and firing, holacracy, Ian Bogost, independent contractor, informal economy, information asymmetry, information retrieval, interchangeable parts, invisible hand, James Bridle, Jaron Lanier, job automation, John Markoff, Joi Ito, Khan Academy, knowledge economy, late capitalism, lockdown, machine readable, Marc Andreessen, Mark Zuckerberg, means of production, medical malpractice, megaproject, meta-analysis, military-industrial complex, Modern Monetary Theory, Money creation, move fast and break things, mutually assured destruction, natural language processing, new economy, Nicholas Carr, Nick Bostrom, Norbert Wiener, nuclear winter, obamacare, One Laptop per Child (OLPC), open immigration, OpenAI, opioid epidemic / opioid crisis, paperclip maximiser, paradox of thrift, pattern recognition, payday loans, personalized medicine, Peter Singer: altruism, Philip Mirowski, pink-collar, plutocrats, post-truth, pre–internet, profit motive, public intellectual, QR code, quantitative easing, race to the bottom, RAND corporation, Ray Kurzweil, recommendation engine, regulatory arbitrage, Robert Shiller, Rodney Brooks, Ronald Reagan, self-driving car, sentiment analysis, Shoshana Zuboff, Silicon Valley, Singularitarianism, smart cities, smart contracts, software is eating the world, South China Sea, Steve Bannon, Strategic Defense Initiative, surveillance capitalism, Susan Wojcicki, tacit knowledge, TaskRabbit, technological solutionism, technoutopianism, TED Talk, telepresence, telerobotics, The Future of Employment, The Turner Diaries, Therac-25, Thorstein Veblen, too big to fail, Turing test, universal basic income, unorthodox policies, wage slave, Watson beat the top human players on Jeopardy!, working poor, workplace surveillance, Works Progress Administration, zero day

At its beginning, credit scoring was confined to one area of life (determining eligibility for loans) and based on a limited set of data (repayment history). Over decades, credit scores and similar measures have come to inform other determinations, including insurance rates and employment opportunities. More recently, data scientists have proposed more data sources for credit scoring, ranging from the way people type, to their political affiliation, to the types of websites they visit online. The Chinese government has also expanded the stakes of surveillance, proposing that “social credit scores” play a role in determining what trains or planes a citizen can board, what hotels a person can stay in, and what schools a family’s children can attend.

Malpractice law is designed to give patients reassurance that if their physician falls below a standard of care, a penalty will be imposed and some portion of it dedicated to their recovery.22 If providers fail to use sufficiently representative datasets to develop their medical AI, lawsuits should help hold them accountable, to ensure that everyone benefits from AI in medicine (not just those lucky enough to belong to the most studied groups). Data scientists sometimes joke that AI is simply a better-marketed form of statistics. Certainly narrow AI, designed to make specific predictions, is based on quantifying probability.23 It is but one of many steps taken over the past two decades to modernize medicine with a more extensive evidence base.24 Medical researchers have seized on predictive analytics, big data, artificial intelligence, machine learning, and deep learning as master metaphors for optimizing system performance.

Once a single perfect way of doing some task was found, there was little rationale for a human to continue doing it. Taylorism dovetailed with the psychological school of behaviorism, which sought to develop for human beings a blend of punishments and reinforcements reminiscent of animal training. The rise of data-driven predictive analytics has given behaviorism new purchase. The chief data scientist of a Silicon Valley e-learning firm once stated, “The goal of everything we do is to change people’s actual behavior at scale. When people use our app, we can capture their behaviors, identify good and bad behaviors, and develop ways to reward the good and punish the bad. We can test how actionable our cues are for them and how profitable for us.”72 A crowded field of edtech innovators promises to drastically reduce the cost of primary, secondary, and tertiary education with roughly similar methods: broadcast courses, intricate labyrinths of computerized assessment tools, and 360-degree surveillance tools to guarantee that students are not cheating.


The Data Journalism Handbook by Jonathan Gray, Lucy Chambers, Liliana Bounegru

Amazon Web Services, barriers to entry, bioinformatics, business intelligence, carbon footprint, citizen journalism, correlation does not imply causation, crowdsourcing, data science, David Heinemeier Hansson, eurozone crisis, fail fast, Firefox, Florence Nightingale: pie chart, game design, Google Earth, Hans Rosling, high-speed rail, information asymmetry, Internet Archive, John Snow's cholera map, Julian Assange, linked data, machine readable, moral hazard, MVC pattern, New Journalism, openstreetmap, Ronald Reagan, Ruby on Rails, Silicon Valley, social graph, Solyndra, SPARQL, text mining, Wayback Machine, web application, WikiLeaks

Less guessing, less looking for quotes; instead, a journalist can build a strong position supported by data, and this can affect the role of journalism greatly. Additionally, getting into data journalism offers a future perspective. Today, when newsrooms downsize, most journalists hope to switch to public relations. Data journalists or data scientists, though, are already a sought-after group of employees, not only in the media. Companies and institutions around the world are looking for “sensemakers” and professionals who know how to dig through data and transform it into something tangible. There is a promise in data, and this is what excites newsrooms, making them look for a new type of reporter.

Far from it. In the information age, journalists are needed more than ever to curate, verify, analyze, and synthesize the wash of data. In that context, data journalism has profound importance for society. Today, making sense of big data, particularly unstructured data, will be a central goal for data scientists around the world, whether they work in newsrooms, Wall Street, or Silicon Valley. Notably, that goal will be substantially enabled by a growing set of common tools, whether they’re employed by government technologists opening Chicago, healthcare technologists, or newsroom developers. — Alex Howard, O’Reilly Media Our Lives Are Data Good data journalism is hard, because good journalism is hard.

INSEAD Working Paper, 2010 Business Models for Data Journalism Amidst all the interest and hope regarding data-driven journalism, there is one question that newsrooms are always curious about: what are the business models? While we must be careful about making predictions, a look at the recent history and current state of the media industry can give us some insight. Today there are many news organizations who have gained by adopting new approaches. Terms like “data journalism” and the newest buzzword, “data science,” may sound like they describe something new, but this is not strictly true. Instead these new labels are just ways of characterizing a shift that has been gaining strength over decades. Many journalists seem to be unaware of the size of the revenue that is already generated through data collection, data analytics, and visualization.


pages: 234 words: 68,798

The Science of Storytelling: Why Stories Make Us Human, and How to Tell Them Better by Will Storr

data science, David Brooks, Demis Hassabis, Gordon Gekko, heat death of the universe, meta-analysis, Steven Pinker, TED Talk, theory of mind, Wall-E

Jockers (Allen Lane, 2016) p. 163. 4.1 Researchers downloaded 1,327: ‘The emotional arcs of stories are dominated by six basic shapes’, Andrew J. Reagan, Lewis Mitchell, Dilan Kiley, Christopher M. Danforth, Peter Sheridan Dodds, EPJ Data Science, 5:31, 4 November 2016. 4.2 For the neuroscientist Professor Beau Lotto: Deviate, Beau Lotto (W&N, 2017) Kindle location 685. When the data scientist David Robinson: Examining the arc of 100,000 stories: a tidy analysis by David Robinson, http://varianceexplained.org/r/tidytext-plots, 26 April 2017. The psychologist and story analyst Professor Jordan Peterson: Maps of Meaning video lectures.

It’s only by being active, and having the courage to take on the external world with all its challenges and provocations, that these core mechanisms can ever be broken down and rebuilt. For the neuroscientist Professor Beau Lotto it’s ‘not just important to be active, it is neurologically necessary’. It’s the only way we grow. When the data scientist David Robinson analysed an enormous tranche of 112,000 plots including books, movies, television episodes and video games, his algorithm found one common story shape. Robinson described this as, ‘Things get worse and worse until, at the last minute, they get better.’ The pattern he detected reveals that many stories have a point, just prior to their resolution, in which the hero endures some deeply significant test.


Work in the Future: The Automation Revolution (Palgrave Macmillan, 2019) by Robert Skidelsky, Nan Craig

3D printing, Airbnb, algorithmic trading, AlphaGo, Alvin Toffler, Amazon Web Services, anti-work, antiwork, artificial general intelligence, asset light, autonomous vehicles, basic income, behavioural economics, business cycle, cloud computing, collective bargaining, Computing Machinery and Intelligence, correlation does not imply causation, creative destruction, data is the new oil, data science, David Graeber, David Ricardo: comparative advantage, deep learning, DeepMind, deindustrialization, Demis Hassabis, deskilling, disintermediation, do what you love, Donald Trump, driverless car, Erik Brynjolfsson, fake news, feminist movement, Ford Model T, Frederick Winslow Taylor, future of work, Future Shock, general purpose technology, gig economy, global supply chain, income inequality, independent contractor, informal economy, Internet of things, Jarndyce and Jarndyce, job automation, job polarisation, John Maynard Keynes: Economic Possibilities for our Grandchildren, John Maynard Keynes: technological unemployment, John von Neumann, Joseph Schumpeter, knowledge economy, Loebner Prize, low skilled workers, Lyft, Mark Zuckerberg, means of production, moral panic, Network effects, new economy, Nick Bostrom, off grid, pattern recognition, post-work, Ronald Coase, scientific management, Second Machine Age, self-driving car, sharing economy, SoftBank, Steve Jobs, strong AI, tacit knowledge, technological determinism, technoutopianism, TED Talk, The Chicago School, The Future of Employment, the market place, The Nature of the Firm, The Wealth of Nations by Adam Smith, Thorstein Veblen, Turing test, Uber for X, uber lyft, universal basic income, wealth creators, working poor

xii Notes on Contributors She then switched over to the private sector, working as a quant for the hedge fund D.E. Shaw in the middle of the credit crisis, and then for RiskMetrics, a risk software company that assesses risk for the holdings of hedge funds and banks. She left finance in 2011 and started working as a data scientist in the New York start-up scene, building models that predicted people’s purchases and clicks. She wrote Doing Data Science in 2013 and launched the Lede Program in Data Journalism at Columbia in 2014. She is a regular contributor to Bloomberg View and wrote the book Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy. She recently founded ORCAA, an algorithmic auditing company.

and next you scan your memories, as well as your closet, for outfits that can optimize to that definition of success. In the case of a formal algorithm, the definition of success does not waver; it’s codified in computer code, and that precise concept of “success,” as well as the associated concept of the cost of failure, are embedded in a mathematical object called the objective function. Once the data scientist decides on the objective function, and the historical training data, the ensuing algorithm is largely determined. Sounds simple, and it sometimes is. But when the output of the algorithm (the prediction itself) is used in a powerful way, a feedback loop is created: the algorithm doesn’t just predict the future, it causes the future.
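The point that the objective function plus the training data largely fixes the result can be illustrated with a toy sketch. Everything here is an assumption made for illustration and not taken from the book: the data values, the candidate predictions, and the function names `squared_loss`, `absolute_loss`, and `best_constant`. The same data yields two different "best" predictions depending only on which objective is chosen.

```python
# Toy illustration (hypothetical names and data): the "model" is a single
# constant prediction, and the objective function decides which constant wins.

def squared_loss(prediction, data):
    """Total squared error; penalises large misses heavily."""
    return sum((x - prediction) ** 2 for x in data)

def absolute_loss(prediction, data):
    """Total absolute error; treats all misses proportionally."""
    return sum(abs(x - prediction) for x in data)

def best_constant(loss, data, candidates):
    """Return the candidate prediction that minimises the given objective."""
    return min(candidates, key=lambda c: loss(c, data))

data = [1, 2, 3, 4, 100]       # made-up "historical" observations
candidates = range(0, 101)     # made-up search space of predictions

print(best_constant(squared_loss, data, candidates))   # lands on the mean
print(best_constant(absolute_loss, data, candidates))  # lands on the median
```

With squared error the outlier drags the chosen prediction toward it, while absolute error mostly ignores it. The data is identical in both runs, so the difference comes entirely from the definition of success.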


pages: 326 words: 103,170

The Seventh Sense: Power, Fortune, and Survival in the Age of Networks by Joshua Cooper Ramo

air gap, Airbnb, Alan Greenspan, Albert Einstein, algorithmic trading, barriers to entry, Berlin Wall, bitcoin, Bletchley Park, British Empire, cloud computing, Computing Machinery and Intelligence, crowdsourcing, Danny Hillis, data science, deep learning, defense in depth, Deng Xiaoping, drone strike, Edward Snowden, Fairchild Semiconductor, Fall of the Berlin Wall, financial engineering, Firefox, Google Chrome, growth hacking, Herman Kahn, income inequality, information security, Isaac Newton, Jeff Bezos, job automation, Joi Ito, Laura Poitras, machine translation, market bubble, Menlo Park, Metcalfe’s law, Mitch Kapor, Morris worm, natural language processing, Neal Stephenson, Network effects, Nick Bostrom, Norbert Wiener, Oculus Rift, off-the-grid, packet switching, paperclip maximiser, Paul Graham, power law, price stability, quantitative easing, RAND corporation, reality distortion field, Recombinant DNA, recommendation engine, Republic of Letters, Richard Feynman, road to serfdom, Robert Metcalfe, Sand Hill Road, secular stagnation, self-driving car, Silicon Valley, Skype, Snapchat, Snow Crash, social web, sovereign wealth fund, Steve Jobs, Steve Wozniak, Stewart Brand, Stuxnet, superintelligent machines, systems thinking, technological singularity, The Coming Technological Singularity, The Wealth of Nations by Adam Smith, too big to fail, Vernor Vinge, zero day

These systems run faster and better and more profitably because they are a shared system. They are gated by technology standards and by common connection. When we say that networks crave gates, this is the sort of gate we mean. If you had to look for your friends one by one on Facebook, Friendster, MySpace, and Google Plus, you’d exhaust yourself. So one winner emerges. Data scientists attribute the success of these winning nodes to preferential attachment—the idea that if Brian Arthur is using Microsoft Word, and I’m using it, you are likely to do so too. But there’s another secret: More widespread adoption makes the whole system faster. Think of five mechanics trying to fix a broken engine.
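Preferential attachment, as described in the excerpt, can be sketched as a simple simulation in which each newcomer picks a platform with probability proportional to its current user count. This is only a minimal sketch under stated assumptions: the platform names reuse those in the excerpt, but the starting counts, the number of newcomers, and the function name `simulate_adoption` are hypothetical.

```python
import random

def simulate_adoption(initial_users, newcomers, seed=0):
    """Each newcomer joins a platform with probability proportional to
    that platform's current number of users (preferential attachment)."""
    rng = random.Random(seed)          # seeded for repeatable runs
    counts = dict(initial_users)
    for _ in range(newcomers):
        names = list(counts)
        weights = [counts[name] for name in names]
        chosen = rng.choices(names, weights=weights, k=1)[0]
        counts[chosen] += 1            # the chosen platform becomes more attractive
    return counts

# Hypothetical starting point: four platforms with nearly equal user bases.
start = {"Facebook": 12, "Friendster": 10, "MySpace": 10, "Google Plus": 10}
result = simulate_adoption(start, newcomers=10_000)
print(result)
```

Early leads tend to be reinforced under this rule, although a pure proportional rule does not guarantee a single winner. The excerpt's further point, that wider adoption makes the whole system faster, is not modelled here.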

“seven friends in ten days”: Chamath Palihapitiya, “How We Put Facebook on the Path to 1 Billion Users” (lecture for the Udemy course “Growth Hacking: An Introduction,” published January 9, 2013, and available at https://www.youtube.com/watch?v=raIUQP71SBU). Pretty soon: Eman Yasser Daraghmi and Shyan-Ming Yuan, “We Are So Close, Less Than 4 Degrees Separating You and Me!,” Computers in Human Behavior 30 (January 2014), 273–85. Data scientists: Laurent Hébert-Dufresne et al. “Complex Networks as an Emerging Property of Hierarchical Preferential Attachment,” Physical Review E 92, 062809 (2015). But there’s another secret: Albert-László Barabási, “Network Science,” Philosophical Transactions of the Royal Society A: Mathematical, Physical, and Engineering Sciences 371, no. 1987 (March 2013).

Their appeal was both the potential of the new and the chance to get away from the rotting smell of old politics. This is one reason it is wrong to look at the world and consider it filled merely with random events, with so-called black swans. In fact, patterns appear everywhere. They can be searched and mapped and studied with the tools of data science, but, of course, they can also be felt. They may surprise you if you don’t know how to look for them. But they are there. There is more to human history than earthquakes alone. Even if it can’t be predicted, complexity in any system, whether it is an Indonesian coral reef or a Russian computer network, can at least be measured.


pages: 337 words: 103,522

The Creativity Code: How AI Is Learning to Write, Paint and Think by Marcus Du Sautoy

3D printing, Ada Lovelace, Albert Einstein, algorithmic bias, AlphaGo, Alvin Roth, Andrew Wiles, Automated Insights, Benoit Mandelbrot, Bletchley Park, Cambridge Analytica, Charles Babbage, Claude Shannon: information theory, computer vision, Computing Machinery and Intelligence, correlation does not imply causation, crowdsourcing, data is the new oil, data science, deep learning, DeepMind, Demis Hassabis, Donald Trump, double helix, Douglas Hofstadter, driverless car, Elon Musk, Erik Brynjolfsson, Fellow of the Royal Society, Flash crash, Gödel, Escher, Bach, Henri Poincaré, Jacquard loom, John Conway, Kickstarter, Loebner Prize, machine translation, mandelbrot fractal, Minecraft, move 37, music of the spheres, Mustafa Suleyman, Narrative Science, natural language processing, Netflix Prize, PageRank, pattern recognition, Paul Erdős, Peter Thiel, random walk, Ray Kurzweil, recommendation engine, Rubik’s Cube, Second Machine Age, Silicon Valley, speech recognition, stable marriage problem, Turing test, Watson beat the top human players on Jeopardy!, wikimedia commons

Was Rembrandt’s considerable output sufficient for an algorithm to be able to learn how to create a new portrait that would be recognisably his? The internet contains millions of images of cats, but Shakespeare wrote thirty-seven plays and Beethoven nine symphonies. Will creative genius be protected from machine learning by a lack of data? Data scientists at Microsoft and Delft University of Technology were of the view that there was enough data for an algorithm to learn how to paint like Rembrandt. Ron Augustus from Microsoft, who worked on the project, believed the old master himself would approve of their project: ‘We are using technology and data like Rembrandt uses his paints and brushes to create something new.’

By layering the negatives on top of each other and exposing the resulting image Galton was rather shocked to see the array of distorted and ugly faces he had used transform into a handsome composite. It seems that when you smooth out the asymmetries, you end up with something quite attractive. The data scientists would have to devise a more clever plan if they were going to produce a painting that might be taken for a Rembrandt. Their algorithm would have to create new eyes, a new nose and a new mouth, as if it could see the world through Rembrandt’s eyes. Having created these features, they then investigated the proportions Rembrandt used to place these features on the faces he painted.

The current drive by humans to create algorithmic creativity is in the most part not one fuelled by the desire for extending artistic creation but rather enlarging a company bank balance. There is a huge amount of hype about AI. There are too many initiatives that are branded as AI but which are little more than statistics or data science. Just as any company wishing to make it at the turn of the millennium would put .com on the end of its name, today it is the addition of the tag AI or Deep which is what companies are using to jump on the bandwagon. Companies would love to be able to convince an audience that this AI is so great that it can write articles on its own, that it can compose music, paint Rembrandts.


pages: 420 words: 100,811

We Are Data: Algorithms and the Making of Our Digital Selves by John Cheney-Lippold

algorithmic bias, bioinformatics, business logic, Cass Sunstein, centre right, computer vision, critical race theory, dark matter, data science, digital capitalism, drone strike, Edward Snowden, Evgeny Morozov, Filter Bubble, Google Chrome, Google Earth, Hans Moravec, Ian Bogost, informal economy, iterative process, James Bridle, Jaron Lanier, Julian Assange, Kevin Kelly, late capitalism, Laura Poitras, lifelogging, Lyft, machine readable, machine translation, Mark Zuckerberg, Marshall McLuhan, mass incarceration, Mercator projection, meta-analysis, Nick Bostrom, Norbert Wiener, offshore financial centre, pattern recognition, price discrimination, RAND corporation, Ray Kurzweil, Richard Thaler, ride hailing / ride sharing, Rosa Parks, Silicon Valley, Silicon Valley startup, Skype, Snapchat, software studies, statistical model, Steven Levy, technological singularity, technoutopianism, the scientific method, Thomas Bayes, Toyota Production System, Turing machine, uber lyft, web application, WikiLeaks, Zimmermann PGP

Big data needs to be analyzed at a distance, because as Franco Moretti claims, “distance, let me repeat it, is a condition of knowledge: it allows you to focus on units that are much smaller or much larger than the [singular] text.”66 Big-data practitioners like Viktor Mayer-Schönberger and Kenneth Cukier aren’t really concerned with analytical rigor: “moving to a large scale changes not only the expectations of precision but the practical ability to achieve exactitude.”67 Big data is instead about understanding a new version of the world, one previously unavailable to us because in an antiquated era before digital computers and ubiquitous surveillance, we didn’t have the breadth and memory to store it all. Moretti, though, is no data scientist. He’s a humanist, best known for coining the concept of “distant reading.” Unlike the disciplinary English practice of “close reading,” in which a section of a certain text is painstakingly parsed for its unique semantic and grammatological usage, distant reading finds utility in the inability to do a close reading of large quantities of texts.68 One human cannot read and remember every last word of Hemingway, and she would be far less able to do the same for other U.S.

Nidhi Makhija-Chimnani, “People’s Insights Volume 1, Issue 52: Vicks Mobile Ad Campaign,” MSLGroup, 2013, asia.mslgroup.com. 59. David Lazer, Ryan Kennedy, Gary King, and Alessandro Vespignani, “The Parable of Google Flu: Traps in Big Data Analysis,” Science 343 (March 14, 2014): 1203–1205. 60. Steve Lohr, “For Big-Data Scientists, ‘Janitor Work’ Is Key Hurdle to Insights,” New York Times, August 17, 2014. 61. Lazer, Kennedy, and Vespignani, “Parable of Google Flu.” 62. Foucault, “Society Must Be Defended,” 249. 63. Mauricio Santillana, D. Wendong Zhang, Benjamin Althouse, and John Ayers, “What Can Digital Disease Detection Learn from (an External Revision to) Google Flu Trends?

It suggests, “We can stop looking for models. We can analyze the data without hypotheses about what it might show.”119 Of course, as discussed in the preceding pages, data is never “just” data. But this antihypothetical interpretation, with all its methodological baggage, has set the trajectory for big-data science itself. Data-mining and machine-learning research is at a fever pitch. The data culled from the surveillant assemblages of our networked society has been dubbed “oil of the 21st century,” while algorithmic analytics are “the combustion engine.”120 Even with powerful critiques of this hyperpositivism coming from scholars like David Ribes, Steven J.


pages: 1,082 words: 87,792

Python for Algorithmic Trading: From Idea to Cloud Deployment by Yves Hilpisch

algorithmic trading, Amazon Web Services, automated trading system, backtesting, barriers to entry, bitcoin, Brownian motion, cloud computing, coronavirus, cryptocurrency, data science, deep learning, Edward Thorp, fiat currency, global macro, Gordon Gekko, Guido van Rossum, implied volatility, information retrieval, margin call, market microstructure, Myron Scholes, natural language processing, paper trading, passive investing, popular electronics, prediction markets, quantitative trading / quantitative finance, random walk, risk free rate, risk/return, Rubik’s Cube, seminal paper, Sharpe ratio, short selling, sorting algorithm, systematic trading, transaction costs, value at risk

The book by Hilpisch (2020) focuses exclusively on the application of algorithms for machine and deep learning to the problem of identifying statistical inefficiencies and exploiting economic inefficiencies through algorithmic trading: Guido, Sarah, and Andreas Müller. 2016. Introduction to Machine Learning with Python: A Guide for Data Scientists. Sebastopol: O’Reilly. Hilpisch, Yves. 2020. Artificial Intelligence in Finance: A Python-Based Guide. Sebastopol: O’Reilly. VanderPlas, Jake. 2016. Python Data Science Handbook: Essential Tools for Working with Data. Sebastopol: O’Reilly. The books by Hastie et al. (2008) and James et al. (2013) provide a thorough, mathematical overview of popular machine learning techniques and algorithms: Hastie, Trevor, Robert Tibshirani, and Jerome Friedman. 2008.

Background information about Python as applied to finance, financial data science, and artificial intelligence can be found in the following books: Hilpisch, Yves. 2018. Python for Finance: Mastering Data-Driven Finance. 2nd ed. Sebastopol: O’Reilly. ⸻. 2020. Artificial Intelligence in Finance: A Python-Based Guide. Sebastopol: O’Reilly. McKinney, Wes. 2017. Python for Data Analysis: Data Wrangling with Pandas, NumPy, and IPython. 2nd ed. Sebastopol: O’Reilly. Ramalho, Luciano. 2021. Fluent Python: Clear, Concise, and Effective Programming. 2nd ed. Sebastopol: O’Reilly. VanderPlas, Jake. 2016. Python Data Science Handbook: Essential Tools for Working with Data.

The reader can post questions and comments in the user forum on the Quant Platform at any time (accounts are free). Online/video training (paid subscription) The Python Quants offer comprehensive online training programs that make use of the contents presented in the book and that add additional content, covering important topics such as financial data science, artificial intelligence in finance, Python for Excel and databases, and additional Python tools and skills. Contents and Structure Here’s a quick overview of the topics and contents presented in each chapter. Chapter 1, Python and Algorithmic Trading The first chapter is an introduction to the topic of algorithmic trading—that is, the automated trading of financial instruments based on computer algorithms.


pages: 340 words: 97,723

The Big Nine: How the Tech Titans and Their Thinking Machines Could Warp Humanity by Amy Webb

"Friedman doctrine" OR "shareholder theory", Ada Lovelace, AI winter, air gap, Airbnb, airport security, Alan Turing: On Computable Numbers, with an Application to the Entscheidungsproblem, algorithmic bias, AlphaGo, Andy Rubin, artificial general intelligence, Asilomar, autonomous vehicles, backpropagation, Bayesian statistics, behavioural economics, Bernie Sanders, Big Tech, bioinformatics, Black Lives Matter, blockchain, Bretton Woods, business intelligence, Cambridge Analytica, Cass Sunstein, Charles Babbage, Claude Shannon: information theory, cloud computing, cognitive bias, complexity theory, computer vision, Computing Machinery and Intelligence, CRISPR, cross-border payments, crowdsourcing, cryptocurrency, Daniel Kahneman / Amos Tversky, data science, deep learning, DeepMind, Demis Hassabis, Deng Xiaoping, disinformation, distributed ledger, don't be evil, Donald Trump, Elon Musk, fail fast, fake news, Filter Bubble, Flynn Effect, Geoffrey Hinton, gig economy, Google Glasses, Grace Hopper, Gödel, Escher, Bach, Herman Kahn, high-speed rail, Inbox Zero, Internet of things, Jacques de Vaucanson, Jeff Bezos, Joan Didion, job automation, John von Neumann, knowledge worker, Lyft, machine translation, Mark Zuckerberg, Menlo Park, move fast and break things, Mustafa Suleyman, natural language processing, New Urbanism, Nick Bostrom, one-China policy, optical character recognition, packet switching, paperclip maximiser, pattern recognition, personalized medicine, RAND corporation, Ray Kurzweil, Recombinant DNA, ride hailing / ride sharing, Rodney Brooks, Rubik’s Cube, Salesforce, Sand Hill Road, Second Machine Age, self-driving car, seminal paper, SETI@home, side project, Silicon Valley, Silicon Valley startup, skunkworks, Skype, smart cities, South China Sea, sovereign wealth fund, speech recognition, Stephen Hawking, strong AI, superintelligent machines, surveillance capitalism, technological singularity, The Coming Technological Singularity, the long tail, theory 
of mind, Tim Cook: Apple, trade route, Turing machine, Turing test, uber lyft, Von Neumann architecture, Watson beat the top human players on Jeopardy!, zero day

There is another reason we should be concerned about China’s plans, and that brings us back to that place where AI’s tribes form: education. China is actively draining professors and researchers away from AI’s hubs in Canada and the United States, offering them attractive repatriation packages. There’s already a shortage of trained data scientists and machine-learning specialists. Siphoning off people will soon create a talent vacuum in the West. By far, this is China’s smartest long-term play—because it deprives the West of its ability to compete in the future. China’s talent pipeline is draining researchers back into the mainland as part of its Thousand Talents Plan.

My second-grader will be 30, and by then she may be reading a New York Times bestseller written entirely by a machine. My dad will be in his late 90s, and all of his medical specialists (cardiologists, nephrologists, radiologists) will be AGIs, directed and managed by a highly trained general practitioner, who is both an MD and a data scientist. The advent of ASI could follow soon or much longer after, between the 2040s and 2060s. It doesn’t mean that by 2070 superintelligent AIs will have crushed all life on Earth under the weight of quintillions of paperclips. But it doesn’t mean they won’t have either. The Stories We Must Tell Ourselves Planning for the futures of AI requires us to build new narratives using data from the real world.

In the United States, the G-MAFIA can commit to recalibrating its own hiring processes, which at present prioritize a prospective hire’s skills and whether they will fit into company culture. What this process unintentionally overlooks is someone’s personal understanding of ethics. Hilary Mason, a highly respected data scientist and the founder of Fast Forward Labs, explained a simple process for ethics screening during interviews. She recommends asking pointed questions and listening intently to a candidate’s answers. Questions like: “You’re working on a model for consumer access to a financial service. Race is a significant feature in your model, but you can’t use race.


pages: 562 words: 153,825

Dark Mirror: Edward Snowden and the Surveillance State by Barton Gellman

4chan, A Declaration of the Independence of Cyberspace, Aaron Swartz, active measures, air gap, Anton Chekhov, Big Tech, bitcoin, Cass Sunstein, Citizen Lab, cloud computing, corporate governance, crowdsourcing, data acquisition, data science, Debian, desegregation, Donald Trump, Edward Snowden, end-to-end encryption, evil maid attack, financial independence, Firefox, GnuPG, Google Hangouts, housing justice, informal economy, information security, Jacob Appelbaum, job automation, John Perry Barlow, Julian Assange, Ken Thompson, Laura Poitras, MITM: man-in-the-middle, national security letter, off-the-grid, operational security, planetary scale, private military company, ransomware, Reflections on Trusting Trust, Robert Gordon, Robert Hanssen: Double agent, rolodex, Ronald Reagan, Saturday Night Live, seminal paper, Seymour Hersh, Silicon Valley, Skype, social graph, standardized shipping container, Steven Levy, TED Talk, telepresence, the long tail, undersea cable, Wayback Machine, web of trust, WikiLeaks, zero day, Zimmermann PGP

Whatever that number, dozens or hundreds, you multiply it by itself to measure the growth at each hop. The NSA’s deputy director, John C. Inglis, had testified in Congress just the day before Negroponte and Blair joined me onstage. Inglis said NSA analysts typically “go out two or three hops” when they chain through the call database. For context, data scientists estimated decades ago that it would take no more than six hops to trace a path between any two people on Earth. Their finding made its way into popular culture in Six Degrees of Separation, the play and subsequent film by John Guare. Three students at Albright College refashioned the film as a parlor game, “Six Degrees of Kevin Bacon.”
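The arithmetic of contact chaining described above can be sketched in a few lines. The function name `max_reach` and the sample figures are illustrative assumptions, and the result is an upper bound because real contact lists overlap.

```python
def max_reach(contacts_per_person: int, hops: int) -> int:
    """Upper bound on the number of people reached after a number of hops,
    assuming everyone has the same number of contacts and none overlap."""
    return contacts_per_person ** hops

# Hypothetical figures: 40 or 100 contacts per person, one to three hops.
for contacts in (40, 100):
    for hops in (1, 2, 3):
        print(contacts, "contacts,", hops, "hops:", max_reach(contacts, hops))
```

Under these assumptions, 100 contacts per person gives up to a million people at three hops, which is why the number of hops matters so much in the testimony quoted above.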

MAINWAY’s analytic engine traced hidden paths across the map, looking for relationships that human analysts could not detect. MAINWAY had to produce that map on demand, under pressure of time, whenever its operators asked for a new contact chain. No one could predict the name or telephone number of the next Tsarnaev. From a data scientist’s point of view, the logical remedy was clear. If anyone could become an intelligence target, MAINWAY should try to get a head start on everyone. “You have to establish all those relationships, tag them, so that when you do launch the query you can quickly get them,” Rick Ledgett, the former NSA deputy director, told me years later.

He gave a valedictory interview to National Public Radio on January 10, 2014, archived at https://archive.is/5j5Yg. “go out two or three hops”: Testimony of John C. Inglis, “The Administration’s Use of FISA Authorities,” House Committee on the Judiciary, July 17, 2013, https://fas.org/irp/congress/2013_hr/fisa.pdf. data scientists estimated: Among the early works in this field was Michael Gurevitch, whose 1991 doctoral dissertation, “The Social Structure of Acquaintanceship Networks,” may be found at https://dspace.mit.edu/handle/1721.1/11312. Six Degrees of Separation: The play, which opened in previews on October 30, 1990, won the New York Drama Critics’ Circle Best Play of 1990.


Traffic: Genius, Rivalry, and Delusion in the Billion-Dollar Race to Go Viral by Ben Smith

2021 United States Capitol attack, 4chan, Affordable Care Act / Obamacare, AOL-Time Warner, behavioural economics, Bernie Sanders, Big Tech, blockchain, Cambridge Analytica, citizen journalism, COVID-19, cryptocurrency, data science, David Brooks, deplatforming, Donald Trump, drone strike, fake news, Filter Bubble, Frank Gehry, full stack developer, future of journalism, hype cycle, Jeff Bezos, Kevin Roose, Larry Ellison, late capitalism, lolcat, Marc Andreessen, Mark Zuckerberg, Menlo Park, moral panic, obamacare, paypal mafia, Peter Thiel, post-work, public intellectual, reality distortion field, Robert Mercer, Sand Hill Road, Saturday Night Live, sentiment analysis, side hustle, Silicon Valley, Silicon Valley billionaire, skunkworks, slashdot, Snapchat, social web, Socratic dialogue, SoftBank, Steve Bannon, Steven Levy, subscription business, tech worker, TikTok, traveling salesman, WeWork, WikiLeaks, young professional, Zenefits

Jonah proposed moving to Palo Alto to oversee Facebook’s News Feed even as he continued to try his ideas out on BuzzFeed. He’d bring in Duncan Watts, the six degrees of separation expert, as Facebook’s “chief sociologist.” They’d work together with Jonah’s old MIT friend Cameron Marlow, who was already in the process of creating Facebook’s data science team, and “take the best from data science and machine learning, sociology and behavioral economics, and build enhanced publisher relationships” to make News Feed—the ever-changing column of content that still rules most people’s experience of Facebook—“live up to its name.” Jonah had other ideas too. He thought Facebook could track incoming and outgoing traffic to the rest of the web more deeply.

It turned out that Mark Wilkie, the developer, had created a domain called BuzzFed, with a single e, and was using it to host embedded content. The tricky-looking similar domain triggered Google’s new screens for malware and knocked the site out of the search engine. Five days later, all the Google traffic came roaring back. September 14 was “the biggest traffic day ever, by all metrics!” one of Jonah’s new hires, a data scientist, wrote him triumphantly. Suddenly the site was setting new records every few weeks. BuzzFeed’s best traffic day yet came on December 5, 2011. The post, one of Stopera’s, was a simple list titled “The 45 Most Powerful Images of 2011,” bringing readers back through a series of emotional public moments: Riots in London.

Elsewhere on the internet, the Los Angeles BuzzFeed employee Baked Alaska had revived the style of dumb stunt that had made him a star on Vine, and by 2018 was roving the streets of Los Angeles, recording a video livestream while speakers broadcast comments from viewers. Within minutes, the speakers were blasting out the N-word. Facebook employees knew what they’d done. “Our approach has had unhealthy side effects on important slices of public content, such as politics and news,” a team of data scientists wrote in a memo that a whistleblower, Frances Haugen, provided to The Wall Street Journal. But top executives couldn’t countenance the loss of engagement that might come with actually tamping down on the divisive speech that seemed to attract, in the late 2010s, most of Americans’ attention. The company did figure out how to dial back the toxicity of its platform, but it would reserve that for extreme situations—the run-up to elections around the world, for instance—and usually left the tap of anger open, keeping its users glued to the site.


Data Wrangling With Python: Tips and Tools to Make Your Life Easier by Jacqueline Kazil

Amazon Web Services, bash_history, business logic, cloud computing, correlation coefficient, crowdsourcing, data acquisition, data science, database schema, Debian, en.wikipedia.org, Fairphone, Firefox, Global Witness, Google Chrome, Hacker News, job automation, machine readable, Nate Silver, natural language processing, pull request, Ronald Reagan, Ruby on Rails, selection bias, social web, statistical model, web application, WikiLeaks

In her career, she has worked in technology focusing in finance, government, and journalism. Most notably, she is a former Presidential Innovation Fellow and cofounded a technology organization in government called 18F. Her career has consisted of many data science and wrangling projects including Geoq, an open source mapping workflow tool; a Congress.gov remake; and Top Secret America. She is active in Python and data communities—Python Software Foundation, PyLadies, Women Data Science DC, and more. She teaches Python in Washington, D.C. at meetups, conferences, and mini bootcamps. She often pairs programs with her sidekick, Ellie (@ellie_the_brave). You can find her on Twitter @jackiekazil or follow her blog, The coderSnorts.

Data Wrangling with Python TIPS AND TOOLS TO MAKE YOUR LIFE EASIER Jacqueline Kazil & Katharine Jarmul Praise for Data Wrangling with Python “This should be required reading for any new data scientist, data engineer or other technical data professional. This hands-on, step-by-step guide is exactly what the field needs and what I wish I had when I first starting manipulating data in Python. If you are a data geek that likes to get their hands dirty and that needs a good definitive source, this is your book.” —Dr. Tyrone Grandison, CEO, Proficiency Labs Intl. “There’s a lot more to data wrangling than just writing code, and this well-written book tells you everything you need to know.

She would like to thank all four of her parents for their patience with endless book updates and dong bells. She would also like to thank Frau Hoffmann for her endless patience during countless conversations in German about this book. CHAPTER 1 Introduction to Python Whether you are a journalist, an analyst, or a budding data scientist, you likely picked up this book because you want to learn how to analyze data programmatically, summarize your findings, and clearly communicate those findings to others. You might show your findings in a report, a graphic, or summarized statistics. Essentially, you are trying to tell a story.


Digital Transformation at Scale: Why the Strategy Is Delivery by Andrew Greenway,Ben Terrett,Mike Bracken,Tom Loosemore

Airbnb, behavioural economics, bitcoin, blockchain, butterfly effect, call centre, chief data officer, choice architecture, cognitive dissonance, cryptocurrency, data science, Diane Coyle, en.wikipedia.org, fail fast, G4S, hype cycle, Internet of things, Kevin Kelly, Kickstarter, loose coupling, M-Pesa, machine readable, megaproject, minimum viable product, nudge unit, performance metric, ransomware, robotic process automation, Silicon Valley, social web, The future is already here, the long tail, the market place, The Wisdom of Crowds, work culture

Open-minded, multidisciplinary teams can deliver a lot more than just elegant websites. The internet era has arguably created some genuinely new roles, or at least redefined existing roles to the extent that they will be taken by different people applying a new attitude. Given the chance, statisticians will say data scientists are just chancers with good PR blagging the same job they’ve been unfashionably plugging away with for years. That’s an argument for another book. Many of the skills needed for digital transformation are not new. The UK government has achieved some proud moments in design, for example (Henry Beck’s famous Underground map and Margaret Calvert and Jock Kinneir’s work on road signs in the 1960s37 were both emulated worldwide), and couldn’t have done that without employing people who understood its value.

Be as iterative with your approach to communicating as you are with the products you build. The GDS began with one blog for the whole organisation and made that part of government communications infrastructure. From there, the team created many more tightly focused blogs, each with discrete and defined audiences, covering a huge variety of topics from user research to data science and HR. These created bounded spaces for experts to write to an audience they knew was interested, starting a conversation rather than a broadcast. They opened up networks, and left a legacy of knowledge that is still available for anyone to draw on. In many large organisations, hoarding information in emails and memos is a common form of controlling power.

If the culture, people and working practices of the institution are still grounded in principles that were set down in the age of the telegraph, the chances of responding with the requisite flexibility and agility to machine learning are slim. How can you be sure you’re not buying snake oil? Which roles and professions should be part of the conversation? Which start looking obsolete? Can you buy into the business models that AI or data science services will use? If you’ve failed to get through the first digital transformation of your organisation, you will also fail to make the best of the second. Retrospective: paperless driving For all the cold water we are pouring on them in this chapter, AI and machine learning represent a new frontier.


pages: 444 words: 118,393

The Nature of Software Development: Keep It Simple, Make It Valuable, Build It Piece by Piece by Ron Jeffries

Amazon Web Services, anti-pattern, bitcoin, business cycle, business intelligence, business logic, business process, c2.com, call centre, cloud computing, continuous integration, Conway's law, creative destruction, dark matter, data science, database schema, deep learning, DevOps, disinformation, duck typing, en.wikipedia.org, fail fast, fault tolerance, Firefox, Hacker News, industrial robot, information security, Infrastructure as a Service, Internet of things, Jeff Bezos, Kanban, Kubernetes, load shedding, loose coupling, machine readable, Mars Rover, microservices, Minecraft, minimum viable product, MITM: man-in-the-middle, Morris worm, move fast and break things, OSI model, peer-to-peer lending, platform as a service, power law, ransomware, revision control, Ruby on Rails, Schrödinger's Cat, Silicon Valley, six sigma, software is eating the world, source of truth, SQL injection, systems thinking, text mining, time value of money, transaction costs, Turing machine, two-pizza team, web application, zero day

Michael Keeling (358 pages) ISBN: 9781680502091 $41.95 Data Science Essentials in Python Go from messy, unstructured artifacts stored in SQL and NoSQL databases to a neat, well-organized dataset with this quick reference for the busy data scientist. Understand text mining, machine learning, and network analysis; process numeric data with the NumPy and Pandas modules; describe and analyze data using statistical and network-theoretical methods; and see actual examples of data analysis at work. This one-stop solution covers the essential data science you need in Python. Dmitry Zinoviev (224 pages) ISBN: 9781680501841 $29 A Common-Sense Guide to Data Structures and Algorithms If you last saw algorithms in a university course or at a job interview, you’re missing out on what they can do for your code.

If the system can determine in advance that it will fail at an operation, it’s always better to fail fast. That way, the caller doesn’t have to tie up any of its capacity waiting and can get on with other work. How can the system tell whether it will fail? Do we need Deep Learning? Don’t worry, you won’t need to hire a cadre of data scientists. It’s actually much more mundane than that. There’s a large class of “resource unavailable” failures. For example, when a load balancer gets a connection request but not one of the servers in its service pool is functioning, it should immediately refuse the connection. Some configurations have the load balancer queue the connection request for a while in the hopes that a server will become available in a short period of time.
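The "resource unavailable" case described in this excerpt can be sketched in a few lines. This is a minimal illustration, not the book's code: the `LoadBalancer` class, the `NoHealthyServerError` exception, and the health map are all hypothetical names, and the round-robin choice is an assumption made only to keep the example complete.

```python
class NoHealthyServerError(Exception):
    """Raised immediately when no server in the pool can take the request."""


class LoadBalancer:
    """Toy load balancer (hypothetical) that fails fast on an empty pool."""

    def __init__(self, servers):
        # servers: mapping of server name -> True if the server is healthy
        self.servers = dict(servers)
        self._next = 0

    def route(self, request):
        healthy = sorted(name for name, up in self.servers.items() if up)
        if not healthy:
            # Fail fast: refuse right away instead of queueing the request
            # in the hope that a server comes back soon.
            raise NoHealthyServerError("no functioning server in the pool")
        # Simple round-robin over the healthy servers (an assumption).
        chosen = healthy[self._next % len(healthy)]
        self._next += 1
        return chosen, request
```

A caller that catches `NoHealthyServerError` learns of the failure at once and can move on to other work, which is the point the excerpt makes about not tying up the caller's capacity.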


pages: 254 words: 61,387

This Could Be Our Future: A Manifesto for a More Generous World by Yancey Strickler

"Friedman doctrine" OR "shareholder theory", "World Economic Forum" Davos, Abraham Maslow, accelerated depreciation, Adam Curtis, basic income, benefit corporation, Big Tech, big-box store, business logic, Capital in the Twenty-First Century by Thomas Piketty, Cass Sunstein, cognitive dissonance, corporate governance, Daniel Kahneman / Amos Tversky, data science, David Graeber, Donald Trump, Doomsday Clock, Dutch auction, effective altruism, Elon Musk, financial independence, gender pay gap, gentrification, global supply chain, Hacker News, housing crisis, Ignaz Semmelweis: hand washing, invention of the printing press, invisible hand, Jeff Bezos, job automation, John Maynard Keynes: Economic Possibilities for our Grandchildren, John Nash: game theory, Joi Ito, Joseph Schumpeter, Kickstarter, Kōnosuke Matsushita, Larry Ellison, Louis Pasteur, Mark Zuckerberg, medical bankruptcy, Mr. Money Mustache, new economy, Oculus Rift, off grid, offshore financial centre, Parker Conrad, Ralph Nader, RAND corporation, Richard Thaler, Ronald Reagan, Rutger Bregman, self-driving car, shareholder value, Silicon Valley, Simon Kuznets, Snapchat, Social Responsibility of Business Is to Increase Its Profits, Solyndra, stem cell, Steve Jobs, stock buybacks, TechCrunch disrupt, TED Talk, The Wealth of Nations by Adam Smith, Thomas Kuhn: the structure of scientific revolutions, Travis Kalanick, Tyler Cowen, universal basic income, white flight, Zenefits

Using your skills to maximize financial value seems like a waste when a whole new frontier of value awaits. Led by some of the best and brightest of the Millennial and Z generations, these people become the pioneers of the new Values Maximizing Class. Accountants, carpenters, community organizers, construction workers, data scientists, designers, ecologists, economists, engineers, entrepreneurs, financial analysts, journalists, lawyers, line cooks, meteorologists, politicians, social scientists, teachers, truck drivers, venture capitalists, waitresses, students, and retirees dedicate themselves to the mission of identifying, measuring, and growing rational, nonfinancial values.

Coaches and TV announcers condemned it as selfish. It wasn’t how the game was played. But in the first decade of the 2000s, the way people thought about sports began to change. In the wake of Moneyball, the 2003 Michael Lewis book about an underdog baseball team using data analysis to outperform better-resourced competitors, data science became a new focus in sports. Including basketball. Trailblazing analysts started to ask new questions. Things like: where are the most efficient places on the court to shoot? This was a new kind of question. To know the most efficient shot, new forms of measurement were needed. To get the necessary data, new kinds of technology were required.

., 54 Chick-fil-A, 165–66, 169, 175, 264 Chile, 198–99 China, ix, xii, 58–59, 71 Chouinard, Yvon, 172 Clear Channel Communications, 39–40 climate change, 144, 191–92 Cold War, 27–28, 31, 105 communitarianism, 236–37 community, xv, 48, 135, 243 companies contribute to, 51, 213, 216–17 as governing value, 142, 145 highly valued, xi–xiii, 45, 48 in pursuit of, 162–63 companies on hypergrowth path, 95–97, 100, 236 and public service, 60, 62, 101–2 purpose-oriented, 100–101 secular missions of, 212–13, 217–18 share-holder centric, 82–85, 169–70 values-minded, 165–66, 210–18 See also specific names; specific topics competition, 33, 39, 53–54, 83, 98–99, 104, 153, 172–74, 196 Compleat Strategyst, The (Williams), 29–31 compound interest, xiv, 191, 194 Confederacy of Dunces, A (Toole), 11 Conrad, Parker, 95–96 consumerism, xii, 51, 120, 168, 187, 217 cooperation/collaboration, 32–34, 102, 198–99, 213 Creative Independent, The, 170–71, 270 creativity and creating value, 12, 171, 175 highly valued, 45, 48 investment in, 5, 7, 10–13, 88, 170–71 and producing profits, 43–44, 134, 170 credit cards, 65–66, 74 crowdfunding, 4–13, 15, 247–48. 
See also specific companies cultural heritage, 180–81 Curtis, Adam, 270–71 data/data science, xv, 97, 123, 150, 159–62, 215 decision making, 28, 96–97 affects other people, 127–28, 144 and Bentoism, 130, 134, 138–40, 206–11 best-case outcome for, 126, 152, 243 guided by defaults, 22–23, 34–35 and making money, x–xi, 23, 135 of Maximizing Class, 62–63 rational, xiii, 134–35, 137, 139 values-driven, xiii, 132, 138–39, 152–53, 174–76, 209–10, 223 See also autonomy defaults (hidden) and bento box, 129–30 explanation of, 19–23 and financial maximization, x, 22–26, 64, 83 and game theory, 32, 34–35 and maximizing here/now, 136–37 set what’s normal, 34–35 and values, 214–15, 223 defaults (visible), 22 Defense Advanced Research Projects Agency (DARPA), 78–79 deregulation, 77–78, 83, 257 disrupting, in business, 87–88, 95–99, 103 downtown, demise of, 48, 51–52, 54 Drive (Pink), 117–19 drugs, xii, 23, 81, 249 Dublin, Ireland, 9–10 e-commerce, 47, 162 “Economic Possibilities for Our Grandchildren” (Keynes), 193–94 economy, 261 downturn in, 61–62, 71, 120 and financial maximization, x, xiii, 70, 72, 116 growth of, 120, 151, 193–95, 267 “Mullet,” 66–74, 77, 84, 110, 163 shareholder-centric, 60–61, 67–73, 82–85 See also gross domestic product (GDP); stock buybacks education, 24–26, 74–75, 110, 170–71, 197, 216, 259 electric cars, 173–75, 183 Ellison, Larry, 109–10 emotions, 22–23, 103, 113–15, 195, 260 Enron, 78, 210 Entrepreneurial State, The (Mazzucato), 78 entrepreneurship, 52–54, 75, 78–81, 196, 241 environmental issues, 14–15, 77, 172–74, 201, 212 Etsy, 212 “evergreen” model, 217 exercise, xiv, 177–78, 184–87, 189–90, 265, 267 Facebook, 53–54, 98, 109 fairness, xii–xiii, xv, 102, 142, 145, 158, 163, 195, 202, 216 family, the, xiii, xv, 26–27, 90–91, 111, 127, 138, 142, 222 Federal Reserve, 49 Federal Reserve Bank of Dallas, 72–73 financial crises, 77–78 debt, 65–66, 74–75 growth, xv–xvi instability, 110, 112 security, xi, 109–14, 116, 141, 201, 205 Financial Independence 
Retire Early (FIRE), 166–69 financial maximization, 236 becomes mainstream, 59–61, 63–65, 91, 180 case against, 110–11, 115–16, 119 dominance of, x–xiii, xvi, 23–25, 27, 37, 73, 97, 104–5, 133–35 downsides to, xi–xiii, 45, 196–97, 199, 243 ending its reign, xiii, 13–14, 225 four phases of, 83–85 growth of, xi, 92, 123, 196, 265 moving beyond it, 163, 168–69, 206, 209, 212 normalization of, 91, 183 not the goal, 9–10, 43–44 origins of, x, xvi, 26–32, 255 prioritizing it, 14, 63, 82–83 Financial Times, 70–71 First Amendment, 39 Fonda, Jane, 187 Food and Drug Administration, 188 Fortune, 68 Fox News, xii Friedman, Milton, 59–61, 63, 82, 90–91, 105, 180, 255, 261 Future Me examples of, 138, 167–69, 202–6, 211 explanation of, 132, 143, 206 and values helix, 218–23 Future Us, 196 examples of, 138, 169, 171–75, 201–6, 211 explanation of, 132, 144–45, 206 and values helix, 218–19 game theory, 237, 250 and collaboration, 32–34 and Community Game, 33–35, 98 its notion of rationality, 29–35, 97 and Prisoner’s Dilemma, 28–34, 130–33, 198–99 and Stag Hunt, 32–34 and Wall Street Game, 33–35, 98 Garfield, James, 146–47, 149, 179 Gates, Bill, 109–10 generational change, xiv, 180–84, 187, 191–92, 266–67 influence, xi, 152, 218–24, 271 generosity, xii, 7, 118, 134, 175 Gomory, Ralph, 82 Google, 53–54, 110, 123 Great Depression, 71, 77, 120, 193 Greatest Generation, 192 grit, 135, 143–45 gross domestic product (GDP), xii, 23, 83, 120–24, 196, 235 gross domestic value (GDV), 217–18 Groupon, 96–97, 100 gyms, xiv, 20, 186–87 Hanchett, Thomas, 49 happiness.


pages: 533

Future Politics: Living Together in a World Transformed by Tech by Jamie Susskind

3D printing, additive manufacturing, affirmative action, agricultural Revolution, Airbnb, airport security, algorithmic bias, AlphaGo, Amazon Robotics, Andrew Keen, Apollo Guidance Computer, artificial general intelligence, augmented reality, automated trading system, autonomous vehicles, basic income, Bertrand Russell: In Praise of Idleness, Big Tech, bitcoin, Bletchley Park, blockchain, Boeing 747, brain emulation, Brexit referendum, British Empire, business process, Cambridge Analytica, Capital in the Twenty-First Century by Thomas Piketty, cashless society, Cass Sunstein, cellular automata, Citizen Lab, cloud computing, commons-based peer production, computer age, computer vision, continuation of politics by other means, correlation does not imply causation, CRISPR, crowdsourcing, cryptocurrency, data science, deep learning, DeepMind, digital divide, digital map, disinformation, distributed ledger, Donald Trump, driverless car, easy for humans, difficult for computers, Edward Snowden, Elon Musk, en.wikipedia.org, end-to-end encryption, Erik Brynjolfsson, Ethereum, ethereum blockchain, Evgeny Morozov, fake news, Filter Bubble, future of work, Future Shock, Gabriella Coleman, Google bus, Google X / Alphabet X, Googley, industrial robot, informal economy, intangible asset, Internet of things, invention of the printing press, invention of writing, Isaac Newton, Jaron Lanier, John Markoff, Joseph Schumpeter, Kevin Kelly, knowledge economy, Large Hadron Collider, Lewis Mumford, lifelogging, machine translation, Metcalfe’s law, mittelstand, more computing power than Apollo, move fast and break things, natural language processing, Neil Armstrong, Network effects, new economy, Nick Bostrom, night-watchman state, Oculus Rift, Panopticon Jeremy Bentham, pattern recognition, payday loans, Philippa Foot, post-truth, power law, price discrimination, price mechanism, RAND corporation, ransomware, Ray Kurzweil, Richard Stallman, ride hailing / ride sharing, road to serfdom, 
Robert Mercer, Satoshi Nakamoto, Second Machine Age, selection bias, self-driving car, sexual politics, sharing economy, Silicon Valley, Silicon Valley startup, Skype, smart cities, Smart Cities: Big Data, Civic Hackers, and the Quest for a New Utopia, smart contracts, Snapchat, speech recognition, Steve Bannon, Steve Jobs, Steve Wozniak, Steven Levy, tech bro, technological determinism, technological singularity, technological solutionism, the built environment, the Cathedral and the Bazaar, The Structural Transformation of the Public Sphere, The Wisdom of Crowds, Thomas L Friedman, Tragedy of the Commons, trolley problem, universal basic income, urban planning, Watson beat the top human players on Jeopardy!, work culture , working-age population, Yochai Benkler

Mid-range cars already contain multiple microprocessors and sensors, allowing them to upload performance data to carmakers when the vehicle is serviced.24 The proportion of the world’s data drawn from machine sensors was 11 per cent in 2005; it is predicted to increase to 42 per cent in 2020.25 Data scientists have always wrestled with the challenge of turning raw data into information (by cleaning, processing, and organizing it), then into knowledge (by analysing and interpreting it).26 The arrival of big data has required some methodological innovation. As Mayer-Schönberger and Cukier explain, the benefit of analysing vast amounts of data about a topic rather than using a small representative sample has depended upon data scientists’ willingness to accept ‘data’s real-world messiness’ rather than seeking precision.27 In the 1990s IBM launched Candide, its effort to automate language translation using ten years’ worth of high-quality transcripts from the Canadian parliament.

Cathy O’Neil, Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy (New York: Crown, 2016), 114. 21. O’Neil, Weapons, 120. 22. Laurence Mills, ‘Numbers, Data and Algorithms: Why HR Professionals and Employment Lawyers Should Take Data Science and Analytics Seriously’, Future of Work Hub, 4 April 2017 <http://www.futureofworkhub.info/comment/2017/4/4/numbers-data-and-algorithms-why-hr-professionals-and-employment-lawyers-should-take-data-science-seriously> (accessed 1 December 2017); Ifeoma Ajunwa, Kate Crawford, and Jason Schultz, ‘Limitless Worker Surveillance’, California Law Review 105, no. 3, 13 March 2016 <https://papers.ssrn.com/sol3/papers.cfm?

Miller, David and Larry Siedentop, eds. The Nature of Political Theory. Oxford: Oxford University Press, 1983. Mills, Laurence. ‘Numbers, Data and Algorithms: Why HR Professionals and Employment Lawyers Should Take Data Science and Analytics Seriously’. Future of Work Hub, 4 Apr. 2017 <http://www.futureofworkhub.info/comment/2017/4/4/numbers-data-and-algorithms-why-hr-professionals-and-employment-lawyers-should-take-data-science-seriously> (accessed 1 Dec. 2017). Millward, David. ‘How Ford Will create a new generation of driverless cars’. Telegraph, 27 Feb. 2017 <http://www.telegraph.co.uk/business/2017/02/27/ford-seeks-pioneer-new-generation-driverless-cars/> (accessed 28 Nov. 2017).


pages: 625 words: 167,349

The Alignment Problem: Machine Learning and Human Values by Brian Christian

Albert Einstein, algorithmic bias, Alignment Problem, AlphaGo, Amazon Mechanical Turk, artificial general intelligence, augmented reality, autonomous vehicles, backpropagation, butterfly effect, Cambridge Analytica, Cass Sunstein, Claude Shannon: information theory, computer vision, Computing Machinery and Intelligence, data science, deep learning, DeepMind, Donald Knuth, Douglas Hofstadter, effective altruism, Elaine Herzberg, Elon Musk, Frances Oldham Kelsey, game design, gamification, Geoffrey Hinton, Goodhart's law, Google Chrome, Google Glasses, Google X / Alphabet X, Gödel, Escher, Bach, Hans Moravec, hedonic treadmill, ImageNet competition, industrial robot, Internet Archive, John von Neumann, Joi Ito, Kenneth Arrow, language acquisition, longitudinal study, machine translation, mandatory minimum, mass incarceration, multi-armed bandit, natural language processing, Nick Bostrom, Norbert Wiener, Northpointe / Correctional Offender Management Profiling for Alternative Sanctions, OpenAI, Panopticon Jeremy Bentham, pattern recognition, Peter Singer: altruism, Peter Thiel, precautionary principle, premature optimization, RAND corporation, recommendation engine, Richard Feynman, Rodney Brooks, Saturday Night Live, selection bias, self-driving car, seminal paper, side project, Silicon Valley, Skinner box, sparse data, speech recognition, Stanislav Petrov, statistical model, Steve Jobs, strong AI, the map is not the territory, theory of mind, Tim Cook: Apple, W. E. B. Du Bois, Wayback Machine, zero-sum game

Rudin, Cynthia, and Joanna Radin. “Why Are We Using Black Box Models in AI When We Don’t Need To? A Lesson From An Explainable AI Competition.” Harvard Data Science Review, 2019. Rudin, Cynthia, and Berk Ustun. “Optimized Scoring Systems: Toward Trust in Machine Learning for Healthcare and Criminal Justice.” Interfaces 48, no. 5 (2018): 449–66. Rudin, Cynthia, Caroline Wang, and Beau Coker. “The Age of Secrecy and Unfairness in Recidivism Prediction.” Harvard Data Science Review 2, no. 1 (2020). Rumelhart, D. E., G. E. Hinton, and R. J. Williams. “Learning Internal Representations by Error Propagation.” In Parallel Distributed Processing: Explorations in the Microstructure of Cognition, 1:318–62.

So how risky did those defendants turn out to be? There was only one way to find out. “I had the sad realization,” Angwin recounts, “that we had to look up the criminal records of every one of those eighteen thousand people. Which we did. And it sucked.”26 To link the set of COMPAS scores to the set of criminal records—what data scientists call a “join”—would take Angwin, and her team, and the county staff, almost an entire additional year of work. “We used, obviously, a lot of automated scraping of the criminal records,” she explains. “And then we had to match them on name and date of birth, which is the most terrible thing you could possibly ever imagine.
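The "join" mentioned in this excerpt, linking two sets of records on name and date of birth, can be sketched as follows. This is an illustrative sketch only and not the method Angwin's team used: the field names (`name`, `dob`), the `normalize_name` helper, and the sample rows in the test are hypothetical, and real record linkage has to cope with misspellings, aliases, and data-entry errors that exact matching misses.

```python
def normalize_name(name):
    """Lowercase a name, drop punctuation, and collapse extra whitespace."""
    cleaned = "".join(ch for ch in name.lower() if ch.isalpha() or ch.isspace())
    return " ".join(cleaned.split())


def join_on_name_and_dob(scores, records):
    """Pair each score row with every record sharing its (name, dob) key.

    Both inputs are lists of dicts with hypothetical 'name' and 'dob' fields.
    Returns (matched_pairs, unmatched_scores).
    """
    # Build a lookup table from the join key to the matching records.
    index = {}
    for rec in records:
        key = (normalize_name(rec["name"]), rec["dob"])
        index.setdefault(key, []).append(rec)

    matched, unmatched = [], []
    for score in scores:
        key = (normalize_name(score["name"]), score["dob"])
        hits = index.get(key, [])
        if hits:
            matched.extend((score, rec) for rec in hits)
        else:
            unmatched.append(score)
    return matched, unmatched
```

Even in this toy form, the unmatched list shows why the step is painful: every row that fails to match on an exact key has to be resolved some other way, often by hand.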

In 2014, United States Defense Advanced Research Projects Agency (DARPA) program manager Dave Gunning was talking to Dan Kaufman, director of DARPA’s Information Innovation Office. “We were just trying to kick around different ideas on what to do in AI,” Gunning tells me.15 “They had had a whole effort where they had sent a whole group of data scientists to Afghanistan to analyze data, try to find patterns that would be useful to the war fighters. And they were already beginning to see that these machine-learning techniques were learning interesting patterns, but the users often didn’t get an explanation for why.” A rapidly evolving set of tools was able to take in financial records, movement records, cell phone logs, and more to determine whether some group of people might be planning to strike.


pages: 510 words: 120,048

Who Owns the Future? by Jaron Lanier

3D printing, 4chan, Abraham Maslow, Affordable Care Act / Obamacare, Airbnb, augmented reality, automated trading system, barriers to entry, bitcoin, Black Monday: stock market crash in 1987, book scanning, book value, Burning Man, call centre, carbon credits, carbon footprint, cloud computing, commoditize, company town, computer age, Computer Lib, crowdsourcing, data science, David Brooks, David Graeber, delayed gratification, digital capitalism, digital Maoism, digital rights, Douglas Engelbart, en.wikipedia.org, Everything should be made as simple as possible, facts on the ground, Filter Bubble, financial deregulation, Fractional reserve banking, Francis Fukuyama: the end of history, Garrett Hardin, George Akerlof, global supply chain, global village, Haight Ashbury, hive mind, if you build it, they will come, income inequality, informal economy, information asymmetry, invisible hand, Ivan Sutherland, Jaron Lanier, Jeff Bezos, job automation, John Markoff, John Perry Barlow, Kevin Kelly, Khan Academy, Kickstarter, Kodak vs Instagram, life extension, Long Term Capital Management, machine translation, Marc Andreessen, Mark Zuckerberg, meta-analysis, Metcalfe’s law, moral hazard, mutually assured destruction, Neal Stephenson, Network effects, new economy, Norbert Wiener, obamacare, off-the-grid, packet switching, Panopticon Jeremy Bentham, Peter Thiel, place-making, plutocrats, Ponzi scheme, post-oil, pre–internet, Project Xanadu, race to the bottom, Ray Kurzweil, rent-seeking, reversible computing, Richard Feynman, Ronald Reagan, scientific worldview, self-driving car, side project, Silicon Valley, Silicon Valley ideology, Silicon Valley startup, Skype, smart meter, stem cell, Steve Jobs, Steve Wozniak, Stewart Brand, synthetic biology, tech billionaire, technological determinism, Ted Nelson, The Market for Lemons, Thomas Malthus, too big to fail, Tragedy of the Commons, trickle-down economics, Turing test, Vannevar Bush, WikiLeaks, zero-sum game

The code in some standardized form or framework that makes it reusable and tweakable? • Must analysis be performed in a way that anticipates standard practices of meta-analysis? • What documentation of the chain of custody of data must be standardized? • Must there be new practices established, analogous to double-blind tests or placebos, that help prevent big data scientists from fooling themselves? Should there be multiple groups developing code to analyze big data that remain completely insulated from each other in order to arrive at independent results? Before long, all these questions will be answered, but for now, practices are still in flux. Though the details need to mature, the core commitment to testing hypotheses unites all scientists whether their data is big or small.

So I am arguing both from the perspective of a big-time macher and from the perspective of a more typical person, because any solution has to be a solution from both perspectives. Big human data, that vase-shaped gap, is the arbiter of influence and power in our times. Finance is no longer about the case-by-case judgment of financiers, but about how good they are at locking in the best big-data scientists and technologists into exclusive contracts. Politicians target voters using similar algorithms to those that evaluate people for access to credit or insurance. The list goes on and on. As technology advances, Siren Servers will be ever more the objects of the struggle for wealth and power, because they are the only links in the chain that will not be commoditized.

We have become used to treating big business data as legitimate, even though it might really only seem so because of its special position in a network. Such data is valid by dint of tautology to an unknowable degree. Science demands a different approach to big data, but we don’t know as much about that approach as we will soon. Scientific method for big data is not yet entirely codified. Once practices are established for big data science, there will be uncontroversial answers to questions like: • What standard would have to be met to allow for the publication of replication of a result? To what degree must replication require the gathering of different, but similar big data, and not just the reuse of the same data with different algorithms?


pages: 499 words: 144,278

Coders: The Making of a New Tribe and the Remaking of the World by Clive Thompson

"Margaret Hamilton" Apollo, "Susan Fowler" uber, 2013 Report for America's Infrastructure - American Society of Civil Engineers - 19 March 2013, 4chan, 8-hour work day, Aaron Swartz, Ada Lovelace, AI winter, air gap, Airbnb, algorithmic bias, AlphaGo, Amazon Web Services, Andy Rubin, Asperger Syndrome, augmented reality, Ayatollah Khomeini, backpropagation, barriers to entry, basic income, behavioural economics, Bernie Sanders, Big Tech, bitcoin, Bletchley Park, blockchain, blue-collar work, Brewster Kahle, Brian Krebs, Broken windows theory, call centre, Cambridge Analytica, cellular automata, Charles Babbage, Chelsea Manning, Citizen Lab, clean water, cloud computing, cognitive dissonance, computer vision, Conway's Game of Life, crisis actor, crowdsourcing, cryptocurrency, Danny Hillis, data science, David Heinemeier Hansson, deep learning, DeepMind, Demis Hassabis, disinformation, don't be evil, don't repeat yourself, Donald Trump, driverless car, dumpster diving, Edward Snowden, Elon Musk, Erik Brynjolfsson, Ernest Rutherford, Ethereum, ethereum blockchain, fake news, false flag, Firefox, Frederick Winslow Taylor, Free Software Foundation, Gabriella Coleman, game design, Geoffrey Hinton, glass ceiling, Golden Gate Park, Google Hangouts, Google X / Alphabet X, Grace Hopper, growth hacking, Guido van Rossum, Hacker Ethic, hockey-stick growth, HyperCard, Ian Bogost, illegal immigration, ImageNet competition, information security, Internet Archive, Internet of things, Jane Jacobs, John Markoff, Jony Ive, Julian Assange, Ken Thompson, Kickstarter, Larry Wall, lone genius, Lyft, Marc Andreessen, Mark Shuttleworth, Mark Zuckerberg, Max Levchin, Menlo Park, meritocracy, microdosing, microservices, Minecraft, move 37, move fast and break things, Nate Silver, Network effects, neurotypical, Nicholas Carr, Nick Bostrom, no silver bullet, Northpointe / Correctional Offender Management Profiling for Alternative Sanctions, Oculus Rift, off-the-grid, OpenAI, operational 
security, opioid epidemic / opioid crisis, PageRank, PalmPilot, paperclip maximiser, pattern recognition, Paul Graham, paypal mafia, Peter Thiel, pink-collar, planetary scale, profit motive, ransomware, recommendation engine, Richard Stallman, ride hailing / ride sharing, Rubik’s Cube, Ruby on Rails, Sam Altman, Satoshi Nakamoto, Saturday Night Live, scientific management, self-driving car, side project, Silicon Valley, Silicon Valley ideology, Silicon Valley startup, single-payer health, Skype, smart contracts, Snapchat, social software, software is eating the world, sorting algorithm, South of Market, San Francisco, speech recognition, Steve Wozniak, Steven Levy, systems thinking, TaskRabbit, tech worker, techlash, TED Talk, the High Line, Travis Kalanick, Uber and Lyft, Uber for X, uber lyft, universal basic income, urban planning, Wall-E, Watson beat the top human players on Jeopardy!, WeWork, WikiLeaks, women in the workforce, Y Combinator, Zimmermann PGP, éminence grise

You have to deal with uncertainty, weirdness. You guide the system toward doing what it’s supposed to do, like herding the cats of cognition. Maybe you’ll get them where you want; maybe you won’t. It’s like a point my friend Hilary Mason, a top data and machine-learning scientist, made about data science in the Harvard Business Review: “At the outset of a data science project, you don’t know if it’s going to work. At the outset of a software engineering project, you know it’s going to work.” On top of that, there’s the black-box problem. Once a neural net has been trained and you’re recognizing those cat photos, great! But if you ask the coder who built it, “How is this thing working?”

She thinks the reputation comes partly from their being comfortable and fluent with machines that intimidate and mystify most of the rest of the population. “If I had to characterize the programmers I know, I’d say there’s a certain confidence that comes with being infused with technology. It’s that confidence in actually understanding what this device in our hands is doing.”

Mason is a pioneering data scientist and as committed a nerd as they come; when I first met her years earlier, she enthusiastically told me how she had “replaced myself with a bunch of small shell scripts”—she’d written dozens of short little programs to reply to dull, rote emails (students of hers asking “Will this be on the exam?”) because she’d rather save her time for more important stuff. But she’s also a connector who’s founded or helped start all manner of organizations designed to help bootstrap newbies to tech, including a Brooklyn “hackerspace” and hackNY, which runs hackathons for students. In a very data-scientist fashion, she rebels at the idea that a single archetype can hold true across an ever-larger cohort of coders worldwide. The population has grown so huge that you can’t generalize across the entire field anymore when it comes to personality. She’s right that the population of programmers has exploded.


The Deep Learning Revolution (The MIT Press) by Terrence J. Sejnowski

AI winter, Albert Einstein, algorithmic bias, algorithmic trading, AlphaGo, Amazon Web Services, Any sufficiently advanced technology is indistinguishable from magic, augmented reality, autonomous vehicles, backpropagation, Baxter: Rethink Robotics, behavioural economics, bioinformatics, cellular automata, Claude Shannon: information theory, cloud computing, complexity theory, computer vision, conceptual framework, constrained optimization, Conway's Game of Life, correlation does not imply causation, crowdsourcing, Danny Hillis, data science, deep learning, DeepMind, delayed gratification, Demis Hassabis, Dennis Ritchie, discovery of DNA, Donald Trump, Douglas Engelbart, driverless car, Drosophila, Elon Musk, en.wikipedia.org, epigenetics, Flynn Effect, Frank Gehry, future of work, Geoffrey Hinton, Google Glasses, Google X / Alphabet X, Guggenheim Bilbao, Gödel, Escher, Bach, haute couture, Henri Poincaré, I think there is a world market for maybe five computers, industrial robot, informal economy, Internet of things, Isaac Newton, Jim Simons, John Conway, John Markoff, John von Neumann, language acquisition, Large Hadron Collider, machine readable, Mark Zuckerberg, Minecraft, natural language processing, Neil Armstrong, Netflix Prize, Norbert Wiener, OpenAI, orbital mechanics / astrodynamics, PageRank, pattern recognition, pneumatic tube, prediction markets, randomized controlled trial, Recombinant DNA, recommendation engine, Renaissance Technologies, Rodney Brooks, self-driving car, Silicon Valley, Silicon Valley startup, Socratic dialogue, speech recognition, statistical model, Stephen Hawking, Stuart Kauffman, theory of mind, Thomas Bayes, Thomas Kuhn: the structure of scientific revolutions, traveling salesman, Turing machine, Von Neumann architecture, Watson beat the top human players on Jeopardy!, world market for maybe five computers, X Prize, Yogi Berra

But the terabyte-scale data sets collected by the Sloan Digital Sky Survey will themselves be outstripped a thousandfold by the petabyte-scale data sets to be collected by the Large Synoptic Sky Survey Telescope (https://www.lsst.org/) under construction. When Yann LeCun founded the Center for Data Science at New York University in 2013, faculty from every department came knocking on his door with data in hand. In 2018 UCSD dedicated a new Halıcıoğlu Data Science Institute. Master’s in Data Science degrees (MDSs) are becoming as popular as MBAs.
Deep Learning at the Gaming Table
Deep learning came of age at the 2012 NIPS Conference at Lake Tahoe (figure 11.3).

George Orwell, Nineteen Eighty-Four (London: Secker & Warburg, 1949). This book has recently taken on new meaning.
6. Founded in 2006, Women in Machine Learning has been creating opportunities for women in machine learning to present and promote their research. See http://wimlworkshop.org.
Chapter 12
1. The Kaggle website has a million data scientists who vie with each other to win the prize with the best performance. Cade Metz, “Uncle Sam Wants Your Deep Neural Networks,” New York Times, June 22, 2017, https://www.nytimes.com/2017/06/22/technology/homeland-security-artificial-intelligence-neural-network.html.
2. For a video of my lecture “Cognitive Computing: Past and Present,” see https://www.youtube.com/watch?

In the 1980s, there was hostility from faculty in their department toward neural networks, which was common at many institutions, but this did not deter either Ben or Andreas. Indeed, Andreas would go on to become a full professor at Hopkins and to cofound the Johns Hopkins University Center for Language and Speech Processing. Ben has a consulting group on data science for political and corporate clients.
Learning to Recognize Handwritten Zip Codes
More recently, Geoffrey Hinton and his students at the University of Toronto trained a Boltzmann machine with three layers of hidden units to classify handwritten zip codes with high accuracy (figure 7.6).20 Because the Boltzmann network had feedback as well as feedforward connections, it was possible to run the network in reverse, clamping one of the output units and generating input patterns that corresponded to the clamped output unit (figure 7.7).


pages: 232 words: 71,237

Kill It With Fire: Manage Aging Computer Systems by Marianne Bellotti

anti-pattern, barriers to entry, business logic, cloud computing, cognitive bias, computer age, continuous integration, create, read, update, delete, Daniel Kahneman / Amos Tversky, data science, database schema, Dennis Ritchie, DevOps, fault tolerance, fear of failure, Google Chrome, Hans Moravec, iterative process, Ken Thompson, loose coupling, microservices, minimum viable product, Multics, no silver bullet, off-by-one error, platform as a service, pull request, QWERTY keyboard, Richard Stallman, risk tolerance, Schrödinger's Cat, side project, software as a service, Steven Levy, systems thinking, web application, Y Combinator, Y2K

The team maintaining the complete system has about 11 people on it. Four people are on operations, maintaining the servers and building tooling to help enforce standards. Four people are on the data science team, designing models and writing the code to implement them, and the remaining three people build the web services. That three-person team maintains Service B but also another service elsewhere in the system. The data science team maintains Service A, but also two other services. Both of those teams are a bit overloaded for their staffing levels, but the usage of the system is low, so the pressure isn’t too great.

Organizations tend to have responsibility gaps in the following areas:
So-called 20 percent projects, or tools and services built (usually by a single engineer) as a side project.
Interfaces. Not so much visual design but common components that were built to standardize experience or style before the organization was large enough to run a team to maintain them.
New specializations. Is the role of a data engineer closer to a database administrator or a data scientist?
Product engineering versus whatever the product runs on. Dev-Ops/site reliability engineering (SRE) didn’t solve that problem; this just moved it under more abstraction layers. If you’ve automated your infrastructure configuration, great—who maintains the automation tools?
When there’s a responsibility gap, the organization has a blind spot.


pages: 339 words: 94,769

Possible Minds: Twenty-Five Ways of Looking at AI by John Brockman

AI winter, airport security, Alan Turing: On Computable Numbers, with an Application to the Entscheidungsproblem, Alignment Problem, AlphaGo, artificial general intelligence, Asilomar, autonomous vehicles, basic income, Benoit Mandelbrot, Bill Joy: nanobots, Bletchley Park, Buckminster Fuller, cellular automata, Claude Shannon: information theory, Computing Machinery and Intelligence, CRISPR, Daniel Kahneman / Amos Tversky, Danny Hillis, data science, David Graeber, deep learning, DeepMind, Demis Hassabis, easy for humans, difficult for computers, Elon Musk, Eratosthenes, Ernest Rutherford, fake news, finite state, friendly AI, future of work, Geoffrey Hinton, Geoffrey West, Santa Fe Institute, gig economy, Hans Moravec, heat death of the universe, hype cycle, income inequality, industrial robot, information retrieval, invention of writing, it is difficult to get a man to understand something, when his salary depends on his not understanding it, James Watt: steam engine, Jeff Hawkins, Johannes Kepler, John Maynard Keynes: Economic Possibilities for our Grandchildren, John Maynard Keynes: technological unemployment, John von Neumann, Kevin Kelly, Kickstarter, Laplace demon, Large Hadron Collider, Loebner Prize, machine translation, market fundamentalism, Marshall McLuhan, Menlo Park, military-industrial complex, mirror neurons, Nick Bostrom, Norbert Wiener, OpenAI, optical character recognition, paperclip maximiser, pattern recognition, personalized medicine, Picturephone, profit maximization, profit motive, public intellectual, quantum cryptography, RAND corporation, random walk, Ray Kurzweil, Recombinant DNA, Richard Feynman, Rodney Brooks, self-driving car, sexual politics, Silicon Valley, Skype, social graph, speech recognition, statistical model, Stephen Hawking, Steven Pinker, Stewart Brand, strong AI, superintelligent machines, supervolcano, synthetic biology, systems thinking, technological determinism, technological singularity, technoutopianism, TED Talk, 
telemarketer, telerobotics, The future is already here, the long tail, the scientific method, theory of mind, trolley problem, Turing machine, Turing test, universal basic income, Upton Sinclair, Von Neumann architecture, Whole Earth Catalog, Y2K, you are the product, zero-sum game

That goal was abandoned as they evolved into mathematical abstractions unrelated to how neurons actually function. But now there’s a kind of convergence that can be thought of as forward- rather than reverse-engineering biology, as the results of deep learning echo brain layers and regions. One of the most difficult research projects I’ve managed paired what we’d now call data scientists with AI pioneers. It was a miserable experience in moving goalposts. As the former progressed in solving long-standing problems posed by the latter, this was deemed to not count because it wasn’t accompanied by corresponding leaps in understanding the solutions. What’s the value of a chess-playing computer if you can’t explain how it plays chess?

By imposing statistical prediction, she continues, law enforcement in Camden during her tenure was able to reduce murders by 41 percent, saving thirty-seven lives, while dropping the total crime rate by 26 percent. After joining the Arnold Foundation as its vice president for criminal justice, she established a team of data scientists and statisticians to create a risk-assessment tool; fundamentally, she construed the team’s mission as deciding how to put “dangerous people” in jail while releasing the nondangerous. “The reason for this,” Milgram contended, “is the way we make decisions. Judges have the best intentions when they make these decisions about risk, but they’re making them subjectively.

Such an experiment would never have occurred to a Babylonian data fitter. Model-blind approaches impose intrinsic limitations on the cognitive tasks that Strong AI can perform. My general conclusion is that human-level AI cannot emerge solely from model-blind learning machines; it requires the symbiotic collaboration of data and models. Data science is a science only to the extent that it facilitates the interpretation of data—a two-body problem, connecting data to reality. Data alone are hardly a science, no matter how “big” they get and how skillfully they are manipulated. Opaque learning systems may get us to Babylon, but not to Athens.


The Book of Why: The New Science of Cause and Effect by Judea Pearl, Dana Mackenzie

affirmative action, Albert Einstein, AlphaGo, Asilomar, Bayesian statistics, computer age, computer vision, Computing Machinery and Intelligence, confounding variable, correlation coefficient, correlation does not imply causation, Daniel Kahneman / Amos Tversky, data science, deep learning, DeepMind, driverless car, Edmond Halley, Elon Musk, en.wikipedia.org, experimental subject, Great Leap Forward, Gregor Mendel, Isaac Newton, iterative process, John Snow's cholera map, Loebner Prize, loose coupling, Louis Pasteur, Menlo Park, Monty Hall problem, pattern recognition, Paul Erdős, personalized medicine, Pierre-Simon Laplace, placebo effect, Plato's cave, prisoner's dilemma, probability theory / Blaise Pascal / Pierre de Fermat, randomized controlled trial, Recombinant DNA, selection bias, self-driving car, seminal paper, Silicon Valley, speech recognition, statistical model, Stephen Hawking, Steve Jobs, strong AI, The Design of Experiments, the scientific method, Thomas Bayes, Turing test

The rest of statistics, including the many disciplines that looked to it for guidance, remained in the Prohibition era, falsely believing that the answers to all scientific questions reside in the data, to be unveiled through clever data-mining tricks. Much of this data-centric history still haunts us today. We live in an era that presumes Big Data to be the solution to all our problems. Courses in “data science” are proliferating in our universities, and jobs for “data scientists” are lucrative in the companies that participate in the “data economy.” But I hope with this book to convince you that data are profoundly dumb. Data can tell you that the people who took a medicine recovered faster than those who did not take it, but they can’t tell you why.

Chapter 1 assembles the three steps of observation, intervention, and counterfactuals into the Ladder of Causation, the central metaphor of this book. It will also expose you to the basics of reasoning with causal diagrams, our main modeling tool, and set you well on your way to becoming a proficient causal reasoner—in fact, you will be far ahead of generations of data scientists who attempted to interpret data through a model-blind lens, oblivious to the distinctions that the Ladder of Causation illuminates. Chapter 2 tells the bizarre story of how the discipline of statistics inflicted causal blindness on itself, with far-reaching effects for all sciences that depend on data.

I have watched its progress take shape in students’ cubicles and research laboratories, and I have heard its breakthroughs resonate in somber scientific conferences, far from the limelight of public attention. Now, as we enter the era of strong artificial intelligence (AI) and many tout the endless possibilities of Big Data and deep learning, I find it timely and exciting to present to the reader some of the most adventurous paths that the new science is taking, how it impacts data science, and the many ways in which it will change our lives in the twenty-first century. When you hear me describe these achievements as a “new science,” you may be skeptical. You may even ask, Why wasn’t this done a long time ago? Say when Virgil first proclaimed, “Lucky is he who has been able to understand the causes of things” (29 BC).


Science Fictions: How Fraud, Bias, Negligence, and Hype Undermine the Search for Truth by Stuart Ritchie

Albert Einstein, anesthesia awareness, autism spectrum disorder, Bayesian statistics, Black Lives Matter, Carmen Reinhart, Cass Sunstein, Charles Babbage, citation needed, Climatic Research Unit, cognitive dissonance, complexity theory, coronavirus, correlation does not imply causation, COVID-19, crowdsourcing, data science, deindustrialization, Donald Trump, double helix, en.wikipedia.org, epigenetics, Estimating the Reproducibility of Psychological Science, fake news, Goodhart's law, Growth in a Time of Debt, Helicobacter pylori, Higgs boson, hype cycle, Kenneth Rogoff, l'esprit de l'escalier, Large Hadron Collider, meta-analysis, microbiome, Milgram experiment, mouse model, New Journalism, ocean acidification, p-value, phenotype, placebo effect, profit motive, publication bias, publish or perish, quantum entanglement, race to the bottom, randomized controlled trial, recommendation engine, rent-seeking, replication crisis, Richard Thaler, risk tolerance, Ronald Reagan, Scientific racism, selection bias, Silicon Valley, Silicon Valley startup, social distancing, Stanford prison experiment, statistical model, stem cell, Steven Pinker, TED Talk, Thomas Bayes, twin studies, Tyler Cowen, University of East Anglia, Wayback Machine

Meta-science experiments in which multiple research groups are tasked with analysing the same dataset or designing their own study from scratch to test the same hypothesis, have found a high degree of variation in method and results.70 Endless choices offer endless opportunities for scientists who begin their analysis without a clear idea of what they’re looking for. But as should now be clear, more analyses mean more chances for false-positive results. As the data scientists Tal Yarkoni and Jake Westfall explain, ‘The more flexible a[n] … investigator is willing to be – that is, the wider the range of patterns they are willing to ‘see’ in the data – the greater the risk of hallucinating a pattern that is not there at all.’71 It gets worse. So far, I’ve made it sound as though all p-hacking is done explicitly – running lots of analyses and publishing only those that give p-values lower than 0.05.
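The point that more analyses mean more chances for false positives can be made concrete with a little arithmetic. The sketch below is only an illustration under a simplifying assumption that is not in the excerpt: every null hypothesis is true and the analyses are statistically independent. Analyses run on a single dataset are usually correlated, so the exact figures will differ in practice. The function name `familywise_error_rate` is chosen here for illustration.

```python
# Chance of at least one false positive across k analyses, assuming every
# null hypothesis is true and the analyses are independent (an illustrative
# simplification; analyses of one dataset are usually correlated).

def familywise_error_rate(alpha: float, k: int) -> float:
    """Probability that at least one of k independent tests gives p < alpha by chance."""
    return 1.0 - (1.0 - alpha) ** k


if __name__ == "__main__":
    for k in (1, 5, 20, 100):
        rate = familywise_error_rate(0.05, k)
        print(f"{k:>3} analyses at alpha = 0.05: {rate:.3f}")
```

Under these assumptions, twenty analyses at the 0.05 threshold give roughly a 64 percent chance of at least one spurious "significant" result.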

But the advocates of 0.005 are making the case that the problem of false positives, which their method would likely reduce, is a more pressing concern than that of false negatives. Here’s another way to deal with statistical bias and p-hacking: take the analysis completely out of the researchers’ hands. In this scenario, upon collecting their data, scientists would hand them over for analysis to independent statisticians or other experts, who would presumably be mostly free of the specific biases and desires of those who designed and performed the experiment.33 Such a system would be tricky to run and one can imagine it leading to conflict when scientists disagree with the analysis or interpretation that their assigned statistician has imposed on their precious data.34 As with some of the radical ideas for reforms that we’ll see later in the chapter, it could still be worth trying at small scale.

The authors wryly concluded that by ‘extrapolating the upward trend of positive words over the past forty years to the future, we predict that the word ‘novel’ will appear in every [abstract] by the year 2123’.57 It seems doubtful that scientific innovation has genuinely accelerated alongside the dramatic upsurge in hyperbolic language.58 A more likely explanation is that scientists are using this kind of language more frequently because it’s a great way to make their results appeal to readers and, perhaps more importantly, to the reviewers and editors of big-name journals. The most glamorous journals state on their websites that they want papers that have ‘great potential impact’ (Nature); that are ‘most influential in their fields’ and ‘present novel and broadly important data’ (Science); and that are of ‘unusual significance’ (Cell) or ‘exceptional importance’ (Proceedings of the National Academy of Sciences).59 Conspicuous by their absence from this list are any words about rigour or replicability – though hats off to the New England Journal of Medicine, the world’s top medical journal, for stating that it’s looking for ‘scientific accuracy, novelty, and importance’, in that order.60 The steep rise in positive-sounding phrases in scientific journals tells us that hype isn’t just restricted to press releases and popular-science books: it has seeped into the way scientists write their papers.


pages: 1,331 words: 163,200

Hands-On Machine Learning With Scikit-Learn and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems by Aurélien Géron

AlphaGo, Amazon Mechanical Turk, Anton Chekhov, backpropagation, combinatorial explosion, computer vision, constrained optimization, correlation coefficient, crowdsourcing, data science, deep learning, DeepMind, don't repeat yourself, duck typing, Elon Musk, en.wikipedia.org, friendly AI, Geoffrey Hinton, ImageNet competition, information retrieval, iterative process, John von Neumann, Kickstarter, machine translation, natural language processing, Netflix Prize, NP-complete, OpenAI, optical character recognition, P = NP, p-value, pattern recognition, pull request, recommendation engine, self-driving car, sentiment analysis, SpamAssassin, speech recognition, stochastic process

Then, before we set out to explore the Machine Learning continent, we will take a look at the map and learn about the main regions and the most notable landmarks: supervised versus unsupervised learning, online versus batch learning, instance-based versus model-based learning. Then we will look at the workflow of a typical ML project, discuss the main challenges you may face, and cover how to evaluate and fine-tune a Machine Learning system. This chapter introduces a lot of fundamental concepts (and jargon) that every data scientist should know by heart. It will be a high-level overview (the only chapter without much code), all rather simple, but you should make sure everything is crystal-clear to you before continuing to the rest of the book. So grab a coffee and let’s get started! Tip If you already know all the Machine Learning basics, you may want to skip directly to Chapter 2.

Poor-Quality Data
Obviously, if your training data is full of errors, outliers, and noise (e.g., due to poor-quality measurements), it will make it harder for the system to detect the underlying patterns, so your system is less likely to perform well. It is often well worth the effort to spend time cleaning up your training data. The truth is, most data scientists spend a significant part of their time doing just that. For example:
If some instances are clearly outliers, it may help to simply discard them or try to fix the errors manually.
If some instances are missing a few features (e.g., 5% of your customers did not specify their age), you must decide whether you want to ignore this attribute altogether, ignore these instances, fill in the missing values (e.g., with the median age), or train one model with the feature and one model without it, and so on.
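The first three options for missing features listed above can be sketched in a few lines of plain Python. This is a minimal illustration, not the book's code: the `customers` records and the helper names `drop_attribute`, `drop_incomplete`, and `fill_with_median` are hypothetical, and a real project would more likely use a data-frame library.

```python
from statistics import median

# Hypothetical customer records; None marks a missing age.
customers = [
    {"id": 1, "age": 34},
    {"id": 2, "age": None},
    {"id": 3, "age": 52},
    {"id": 4, "age": 41},
    {"id": 5, "age": None},
]


def drop_attribute(rows, key):
    """Option 1: ignore the attribute altogether."""
    return [{k: v for k, v in row.items() if k != key} for row in rows]


def drop_incomplete(rows, key):
    """Option 2: ignore the instances where the attribute is missing."""
    return [row for row in rows if row[key] is not None]


def fill_with_median(rows, key):
    """Option 3: fill missing values with the median of the known values."""
    known = [row[key] for row in rows if row[key] is not None]
    fill_value = median(known)
    return [
        {**row, key: fill_value if row[key] is None else row[key]}
        for row in rows
    ]


if __name__ == "__main__":
    print(drop_attribute(customers, "age"))
    print(drop_incomplete(customers, "age"))
    print(fill_with_median(customers, "age"))
```

Each helper returns a new list and leaves the input untouched, which makes it easy to compare the options side by side before committing to one.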

In practice it often creates a few clusters per person, and sometimes mixes up two people who look alike, so you need to provide a few labels per person and manually clean up some clusters.
5 By convention, the Greek letter θ (theta) is frequently used to represent model parameters.
6 The code assumes that prepare_country_stats() is already defined: it merges the GDP and life satisfaction data into a single Pandas dataframe.
7 It’s okay if you don’t understand all the code yet; we will present Scikit-Learn in the following chapters.
8 For example, knowing whether to write “to,” “two,” or “too” depending on the context.
9 Figure reproduced with permission from Banko and Brill (2001), “Learning Curves for Confusion Set Disambiguation.”
10 “The Unreasonable Effectiveness of Data,” Peter Norvig et al. (2009).
11 “The Lack of A Priori Distinctions Between Learning Algorithms,” D. Wolpert (1996).
Chapter 2. End-to-End Machine Learning Project
In this chapter, you will go through an example project end to end, pretending to be a recently hired data scientist in a real estate company.1 Here are the main steps you will go through: Look at the big picture. Get the data. Discover and visualize the data to gain insights. Prepare the data for Machine Learning algorithms. Select a model and train it. Fine-tune your model. Present your solution. Launch, monitor, and maintain your system.


pages: 411 words: 98,128

Bezonomics: How Amazon Is Changing Our Lives and What the World's Best Companies Are Learning From It by Brian Dumaine

activist fund / activist shareholder / activist investor, AI winter, Airbnb, Amazon Robotics, Amazon Web Services, Atul Gawande, autonomous vehicles, basic income, Bernie Sanders, Big Tech, Black Swan, call centre, Cambridge Analytica, carbon tax, Carl Icahn, Chris Urmson, cloud computing, corporate raider, creative destruction, Danny Hillis, data science, deep learning, Donald Trump, Elon Musk, Erik Brynjolfsson, Fairchild Semiconductor, fake news, fulfillment center, future of work, gig economy, Glass-Steagall Act, Google Glasses, Google X / Alphabet X, income inequality, independent contractor, industrial robot, Internet of things, Jeff Bezos, job automation, Joseph Schumpeter, Kevin Kelly, Kevin Roose, Lyft, Marc Andreessen, Mark Zuckerberg, military-industrial complex, money market fund, natural language processing, no-fly zone, Ocado, pets.com, plutocrats, race to the bottom, ride hailing / ride sharing, Salesforce, Sand Hill Road, self-driving car, shareholder value, Sheryl Sandberg, Silicon Valley, Silicon Valley startup, Snapchat, speech recognition, Steve Jobs, Stewart Brand, supply-chain management, TED Talk, Tim Cook: Apple, too big to fail, Travis Kalanick, two-pizza team, Uber and Lyft, uber lyft, universal basic income, warehouse automation, warehouse robotics, wealth creators, web application, Whole Earth Catalog, work culture

The company has gotten so good at applying computer technology that it has started to learn and get smarter on its own. No corporation has ever done this as successfully as Amazon. A lot of CEOs pay lip service to AI and hire a handful of data scientists in an effort to tack this technology onto their business model. At Amazon, technology is the key driver to everything it does. Consider that for the development and upgrading of its magic genie, Alexa, which runs on AI voice software, the company as of 2019 had deployed some ten thousand workers, the lion’s share of which were data scientists, engineers, and programmers. From day one, Amazon has been a technology company that just happens to sell books. Since those early days, Bezos has made big data and AI the heart of the company.

It comes as little surprise: “IDC Survey Finds Artificial Intelligence to Be a Priority for Organizations, but Few Have Implemented an Enterprise-Wide Strategy,” Business Wire, July 8, 2019. Today an estimated 35 percent: Stephen Cohn and Matthew W. Granade, “Models Will Run the World,” Wall Street Journal, August 19, 2018. This is why a computer science graduate in the U.S.: Entry Level Data Scientist Salaries, Glassdoor, https://www.glassdoor.com/Salaries/entry-level-data-scientist-salary-SRCH_KO0,26.htm. Facebook’s algorithms keep getting better: “Number of Monthly Active Facebook Users Worldwide as of 2nd Quarter 2019 (in Millions),” Statista, 2019, https://www.statista.com/statistics/264810/number-of-monthly-active-facebook-users-worldwide/.

Smart algorithms every day, every hour, every second learn how to please Amazon’s customers by figuring out ways to lower prices or speed up a delivery or suggest the appropriate songs or movies or have Alexa answer a question correctly in a few milliseconds. Think of this new iteration as the AI flywheel. The tens of thousands of engineers, data scientists, and programmers whom Bezos has hired have made the AI flywheel a learning machine, a cyber contraption with its own intelligence that takes all the data that Amazon collects on its 300 million customers and then analyzes it in minute detail. The machine makes decisions about what items to purchase, how much to charge for them, and where in the world to stock them.


Seeking SRE: Conversations About Running Production Systems at Scale by David N. Blank-Edelman

Affordable Care Act / Obamacare, algorithmic trading, AlphaGo, Amazon Web Services, backpropagation, Black Lives Matter, Bletchley Park, bounce rate, business continuity plan, business logic, business process, cloud computing, cognitive bias, cognitive dissonance, cognitive load, commoditize, continuous integration, Conway's law, crowdsourcing, dark matter, data science, database schema, Debian, deep learning, DeepMind, defense in depth, DevOps, digital rights, domain-specific language, emotional labour, en.wikipedia.org, exponential backoff, fail fast, fallacies of distributed computing, fault tolerance, fear of failure, friendly fire, game design, Grace Hopper, imposter syndrome, information retrieval, Infrastructure as a Service, Internet of things, invisible hand, iterative process, Kaizen: continuous improvement, Kanban, Kubernetes, loose coupling, Lyft, machine readable, Marc Andreessen, Maslow's hierarchy, microaggression, microservices, minimum viable product, MVC pattern, performance metric, platform as a service, pull request, RAND corporation, remote working, Richard Feynman, risk tolerance, Ruby on Rails, Salesforce, scientific management, search engine result page, self-driving car, sentiment analysis, Silicon Valley, single page application, Snapchat, software as a service, software is eating the world, source of truth, systems thinking, the long tail, the scientific method, Toyota Production System, traumatic brain injury, value engineering, vertical integration, web application, WebSocket, zero day

I strongly believe that we are now, in 2018, very close to 2001’s incredible imagined achievements. Data science and machine learning software have been available for some time now. The languages they use make them more accessible to engineers who want to explore new ways to apply machine learning. In Figure 18-1, we see Python and the R language being the clear winners in this field. Notably, Anaconda, TensorFlow, and scikit-learn all use Python for interfacing with the user. For the hands-on section later in this chapter, we use TensorFlow, Python, and some Keras.
Figure 18-1. Different software for machine learning (source: https://www.kdnuggets.com/2017/05/poll-analytics-data-science-machine-learning-software-leaders.html)
What Is Machine Learning?

First, we need to share a little context about our engineering culture: at Spotify, we organize into small, autonomous teams. The idea is that every team owns a certain feature or user experience from front to back. In practice, this means a single engineering team consists of a cross-functional set of developers — from designer to backend developer to data scientist — working together on the various Spotify clients, backend services, and data pipelines. To support our feature teams, we created groups centered around infrastructure. These infrastructure teams in turn also became small, cross-functional, and autonomous to provide self-service infrastructure products.

You can start small by ensuring that service level indicators are inclusively measuring the experiences of all users, not just able-bodied users with fast, low-latency internet connections. Expanding your reach further, you can advocate for equity in products you contribute to. And your SRE skills come in useful if you choose to participate in social movements. Contributor Bio Emily Gorcenski is a data scientist and anti-racist activist from Charlottesville, Virginia, who now lives in Berlin. Her passion is the intersection of technology, regulation, and society, and she is a tireless advocate of transgender rights. Liz Fong-Jones is a developer advocate, activist, and site reliability engineer (SRE) with 14+ years of experience based out of Brooklyn, New York, and San Francisco, California.


Industry 4.0: The Industrial Internet of Things by Alasdair Gilchrist

3D printing, additive manufacturing, air gap, AlphaGo, Amazon Web Services, augmented reality, autonomous vehicles, barriers to entry, business intelligence, business logic, business process, chief data officer, cloud computing, connected car, cyber-physical system, data science, deep learning, DeepMind, deindustrialization, DevOps, digital twin, fault tolerance, fulfillment center, global value chain, Google Glasses, hiring and firing, industrial robot, inflight wifi, Infrastructure as a Service, Internet of things, inventory management, job automation, low cost airline, low skilled workers, microservices, millennium bug, OSI model, pattern recognition, peer-to-peer, platform as a service, pre–internet, race to the bottom, RFID, Salesforce, Skype, smart cities, smart grid, smart meter, smart transportation, software as a service, stealth mode startup, supply-chain management, The future is already here, trade route, undersea cable, vertical integration, warehouse robotics, web application, WebRTC, Y2K

Adequately Skilled and Trained Staff This is imperative if you expect to benefit from serious analytics work as you will certainly need skilled data scientists, process engineers, and electromechanical engineers. Securing talent with the correct skills is proving to be a daunting task as colleges and universities seem to be behind the curve and are still pushing school leavers into careers as programmers rather than data scientists. This doesn’t seem to be changing anytime soon. This is despite the huge demand for data scientists and electro-mechanical engineers predicted over the next decade. The harsh financial reality is that the better the data analytical skills, the more likely the company can produce the algorithms required to distil information from their vast data lakes.

The whole point of intelligent devices in the Industrial Internet context is to harvest raw data and then manage the data flow, from device to the data store, to the analytic systems, to the data scientists, to the process, and then back to the device. This is the data flow cycle, where data flows from intelligent devices, through the gathering and analytical apparatus before perhaps returning as control feedback into the device. It is within this cycle where data scientists can extract prime value from the information. Key Opportunities and Benefits Not unexpectedly, when asked which key benefits most IIoT adopters want from the Industrial Internet, they say increased profits, increased revenue flows, and lower operational expenditures, in that order.

The answer is that currently they cannot, yet they can collect data in huge quantities and store it in distributed data storage facilities such as the cloud, and even take advantage of advanced analytical software to try to determine trends and correlation. However, we are not able to actually achieve this feat now, as we do not know the right questions to ask of the data. What we will require are data scientists, people skilled in understanding and trolling through vast quantities of unstructured data in search of sense and order, to distinguish patterns that ultimately will deliver value. Data scientists can use their skills in data analysis to determine patterns in the data, which is the core of M2M communication and understanding, while at the same time asking the relevant questions that derive true value from the data that will empower business strategy.


pages: 296 words: 78,631

Hello World: Being Human in the Age of Algorithms by Hannah Fry

23andMe, 3D printing, Air France Flight 447, Airbnb, airport security, algorithmic bias, algorithmic management, augmented reality, autonomous vehicles, backpropagation, Brixton riot, Cambridge Analytica, chief data officer, computer vision, crowdsourcing, DARPA: Urban Challenge, data science, deep learning, DeepMind, Douglas Hofstadter, driverless car, Elon Musk, fake news, Firefox, Geoffrey Hinton, Google Chrome, Gödel, Escher, Bach, Ignaz Semmelweis: hand washing, John Markoff, Mark Zuckerberg, meta-analysis, Northpointe / Correctional Offender Management Profiling for Alternative Sanctions, pattern recognition, Peter Thiel, RAND corporation, ransomware, recommendation engine, ride hailing / ride sharing, selection bias, self-driving car, Shai Danziger, Silicon Valley, Silicon Valley startup, Snapchat, sparse data, speech recognition, Stanislav Petrov, statistical model, Stephen Hawking, Steven Levy, systematic bias, TED Talk, Tesla Model S, The Wisdom of Crowds, Thomas Bayes, trolley problem, Watson beat the top human players on Jeopardy!, web of trust, William Langewiesche, you are the product

Those rules had previously been approved in October 2016 by the Federal Communications Commission; but, after the change in government at the end of that year, they were opposed by the FCC’s new Republican majority and Republicans in Congress.15 So what does all this mean for your privacy? Well, let me tell you about an investigation led by German journalist Svea Eckert and data scientist Andreas Dewes that should give you a clear idea.16 Eckert and her team set up a fake data broker and used it to buy the anonymous browsing data of 3 million German citizens. (Getting hold of people’s internet histories was easy. Plenty of companies had an abundance of that kind of data for sale on British or US customers – the only challenge was finding data focused on Germany.)

As Rogier Creemers, an academic specializing in Chinese law and governance at the Van Vollenhoven Institute at Leiden University, puts it: ‘The best way to understand it is as a sort of bastard love child of a loyalty scheme.’29 I don’t have much comfort to offer in the case of Sesame Credit, but I don’t want to fill you completely with doom and gloom, either. There are glimmers of hope elsewhere. However grim the journey ahead appears, there are signs that the tide is slowly turning. Many in the data science community have known about and objected to the exploitation of people’s information for profit for quite some time. But until the furore over Cambridge Analytica these issues hadn’t drawn sustained, international front-page attention. When that scandal broke in early 2018 the general public saw for the first time how algorithms are silently harvesting their data, and acknowledged that, without oversight or regulation, it could have dramatic repercussions.

For more on this, see James Surowiecki, The Wisdom of Crowds: Why the Many Are Smarter than the Few (New York: Doubleday, 2004), p. 4. 22. Netflix Technology Blog, https://medium.com/netflix-techblog/netflix-recommendations-beyond-the-5-stars-part-2-d9b96aa399f5. 23. Shih-ho Cheng, ‘Unboxing the random forest classifier: the threshold distributions’, Airbnb Engineering and Data Science, https://medium.com/airbnb-engineering/unboxing-the-random-forest-classifier-the-threshold-distributions-22ea2bb58ea6. 24. Jon Kleinberg, Himabindu Lakkaraju, Jure Leskovec, Jens Ludwig and Sendhil Mullainathan, Human Decisions and Machine Predictions, NBER Working Paper no. 23180 (Cambridge, MA: National Bureau of Economic Research, Feb. 2017), http://www.nber.org/papers/w23180.


pages: 277 words: 70,506

We Are Bellingcat: Global Crime, Online Sleuths, and the Bold Future of News by Eliot Higgins

4chan, active measures, Andy Carvin, anti-communist, anti-globalists, barriers to entry, belling the cat, Bellingcat, bitcoin, blockchain, citizen journalism, Columbine, coronavirus, COVID-19, crowdsourcing, cryptocurrency, data science, deepfake, disinformation, Donald Trump, driverless car, Elon Musk, en.wikipedia.org, failed state, fake news, false flag, gamification, George Floyd, Google Earth, hive mind, Julian Assange, Kickstarter, lateral thinking, off-the-grid, OpenAI, pattern recognition, post-truth, rolodex, Seymour Hersh, Silicon Valley, Skype, Tactical Technology Collective, the scientific method, WikiLeaks

But beyond sensible precautions, I find no reason to panic. Our opponents could try to harm me. Yet Bellingcat has become far more than a single person. In the year after our first Skripal investigations, Bellingcat opened our first office, a proper workspace in The Hague added to the mailing address in Leicester. We hired a business director, a data scientist and administrative experts, too, nearing twenty staffers – still nimble and innovative but with the heft of an established enterprise. While it’s true that I could do little to stop an attack, our opponents could do nothing to stop what we are becoming. 5 Next Steps The future of justice and the power of AI Eighteen men in orange jumpsuits knelt on the ground, hands tied behind their backs, hoods over their heads.

While the security services had sixty staffers struggling to advance their investigation, he watched Bellingcat swiftly pull together the route of the Buk launcher from open sources alone. When he pointed this out to colleagues, some were receptive. But the police force was conservative and preferred traditional methods of investigation. Soon thereafter, he completed a Ph.D. on anticipating criminal behaviour; with ambitions to use data science for better policing, he quit and founded his own company, Pandora Intelligence. One of its notable projects revamps the emergency-dispatch call using OSINT. In the traditional scenario, a dispatcher answers, notes down what is deemed significant, then forwards a briefing to an emergency service unit.

Yet at the executive level, many organisations – in the news media, human-rights activism, humanitarian law and beyond – still do not realise what is possible. In order to seed these powers among the younger generation, Bellingcat has launched a pilot training programme for university students in the Netherlands, attempting to build a grassroots movement among those studying journalism, data science and visualisation. With Dutch government funding, we conducted our first project in Utrecht, instructing a score of university students aged eighteen to twenty-five. The programme proved so successful that we expanded to five bootcamps. Each involved five days of training, spread over several weeks, allowing students to consolidate and practise the Bellingcat method: geolocation, chronolocation, social-media trawling, and so on.


pages: 290 words: 87,549

The Airbnb Story: How Three Ordinary Guys Disrupted an Industry, Made Billions...and Created Plenty of Controversy by Leigh Gallagher

Abraham Maslow, Airbnb, Amazon Web Services, barriers to entry, Ben Horowitz, Bernie Sanders, Blitzscaling, cloud computing, crowdsourcing, data science, don't be evil, Donald Trump, East Village, Elon Musk, fixed-gear, gentrification, geopolitical risk, growth hacking, Hacker News, hockey-stick growth, housing crisis, iterative process, Jeff Bezos, John Zimmer (Lyft cofounder), Jony Ive, Justin.tv, Lyft, Marc Andreessen, Marc Benioff, Mark Zuckerberg, medical residency, Menlo Park, Network effects, Paul Buchheit, Paul Graham, performance metric, Peter Thiel, RFID, Salesforce, Sam Altman, Sand Hill Road, Saturday Night Live, sharing economy, Sheryl Sandberg, side project, Silicon Valley, Silicon Valley startup, South of Market, San Francisco, Startup school, Steve Jobs, TaskRabbit, TED Talk, the payments system, Tony Hsieh, traumatic brain injury, Travis Kalanick, uber lyft, Y Combinator, yield management

Within that framework, a community-defense team performs proactive work to try to identify suspicious activity in advance, conducting spot checks on reservations and looking for signs that might suggest fraud or bad actors, while a community-response team handles incoming issues. The product team includes data scientists, who create behavioral models to help identify whether a reservation has a higher likelihood of, say, resulting in someone throwing a party or committing a crime (reservations are assigned a credibility score similar to a credit score), and engineers, who use machine learning to develop tools that analyze reservations to help detect risk.

Guesty, a professional management service for hosts, started by Israeli twin brothers, is one of the largest: hosts give Guesty access to their Airbnb accounts, and it handles booking management, all guest communication, calendar updating, and scheduling and coordinating with cleaners and other local service providers, for a fee of 3 percent of the booking charge. San Francisco–based Pillow creates a listing, hires cleaners, handles keys, and employs an algorithm to determine best pricing options. HonorTab brings a minifridge concept to Airbnb. Everbooked was founded by a self-described yield-management geek with expertise in data science who saw the need for dynamic pricing tools for Airbnb hosts. One of the biggest chores that hosts often need help with, for example, is turning over keys to guests. It can be hard to always arrange to be home when the guest arrives, especially if the host has a full-time job, or is out of town, or when travelers’ flights are delayed.

When the entire executive team took the Myers-Briggs Type Indicator personality test at an off-site one year, Blecharczyk registered as an ISTJ personality type, which correlated with the “inspector” role in the related Keirsey Temperament Sorter personality questionnaire. The characterization made the executive team laugh in recognition. (“That’s how they know me,” he says, “as someone who probes the details.”) Over time, Blecharczyk developed an interest in strategy, especially when, as CTO, he began to see more of the insights that were coming out of the data-science department, which reported directly to him. In the summer of 2014, after the executive team was beginning to realize the company wasn’t fully aligned on its many initiatives and goals, Blecharczyk started an “activity map” to document every project being worked on throughout the company. He identified 110 of them, but they were extremely fragmented, with different executives overseeing multiple projects in the same area.


pages: 286 words: 92,521

How Medicine Works and When It Doesn't: Learning Who to Trust to Get and Stay Healthy by F. Perry Wilson

Affordable Care Act / Obamacare, barriers to entry, Barry Marshall: ulcers, cognitive bias, Comet Ping Pong, confounding variable, coronavirus, correlation does not imply causation, COVID-19, data science, Donald Trump, fake news, Helicobacter pylori, Ignaz Semmelweis: hand washing, Louis Pasteur, medical malpractice, meta-analysis, multilevel marketing, opioid epidemic / opioid crisis, p-value, personalized medicine, profit motive, randomized controlled trial, risk tolerance, selection bias, statistical model, stem cell, sugar pill, the scientific method, Thomas Bayes

But we are not here to gawk; we are here to make money. Because tonight, for once, the odds are in our favor. You see, one of my data science interns, in an effort to please his cantankerous boss, has hacked into the servers controlling the slot machines and adjusted their payouts to our benefit. Two machines were successfully hacked. We just need to decide which one to play. Hacked slot machine #1 costs $1 per pull on the lever. The payout for a jackpot is $200. And thanks to my data science intern, we will hit that jackpot, on average, one out of every one hundred times. These are great odds, as you can see—chances are that by spending $100, we will win $200.

I spend most of my days at Yale’s Clinical and Translational Research Accelerator with members of my work family, who have heard about this book for the past two years in one form or another. I am thankful for their indulgence as I tried to get this project over the finish line. The team—researchers, statisticians, data scientists, technicians, students, coordinators, administrators—are all consummate scientific professionals, but they are also wonderful, compassionate individuals. They are the very engine of progress. Thanks especially to Deb Kearns, who somehow managed to keep my schedule intact during this period. Butterfly to the moon.

These are great odds, as you can see—chances are that by spending $100, we will win $200. It’s not guaranteed, of course, but it’s a hell of a lot better than the normal house odds. Hacked slot machine #2 is in the high-roller area. It costs $1,000 per pull on the lever. Like slot machine #1, the data science intern has hacked it to pay out, on average, one out of every one hundred plays. But this machine has an even better jackpot: $300,000. That’s right. We have a one in one hundred chance for a jackpot on both machines, but the jackpot for machine #1 is two hundred times the price to play, whereas it is three hundred times the price to play for machine #2.
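The comparison between the two hacked machines comes down to expected value. The following is a minimal sketch of that arithmetic, using only the prices, jackpots, and one-in-one-hundred odds quoted above. The function name `expected_net_per_pull` is illustrative and not from the book, and the calculation looks only at the average outcome; it says nothing about how likely a player is to run out of money before hitting a jackpot.

```python
def expected_net_per_pull(cost, jackpot, one_in):
    """Average net result of one pull: expected payout minus the price to play.

    one_in is the average number of pulls per jackpot (100 means 1 in 100).
    """
    return jackpot / one_in - cost


# Figures quoted in the passage above.
machine_1 = expected_net_per_pull(cost=1, jackpot=200, one_in=100)
machine_2 = expected_net_per_pull(cost=1_000, jackpot=300_000, one_in=100)

print(f"Machine #1: ${machine_1:,.2f} average net per $1 pull")
print(f"Machine #2: ${machine_2:,.2f} average net per $1,000 pull")
```

On average, machine #1 pays back $2 for every $1 played and machine #2 pays back $3,000 for every $1,000 played, which matches the two-hundred-times and three-hundred-times jackpot ratios in the text.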


Mastering Machine Learning With Scikit-Learn by Gavin Hackeling

backpropagation, computer vision, constrained optimization, correlation coefficient, data science, Debian, deep learning, distributed generation, iterative process, natural language processing, Occam's razor, optical character recognition, performance metric, recommendation engine

His work has appeared at top dependability conferences—DSN, ISSRE, ICAC, Middleware, and SRDS—and he has been awarded grants to attend DSN, ICAC, and ICNP. Fahad has also been an active contributor to security research while working as a cybersecurity engineer at NEEScomm IT. He has recently taken on a position as a systems engineer in the industry. Sarah Guido is a data scientist at Reonomy, where she's helping build disruptive technology in the commercial real estate industry. She loves Python, machine learning, and the startup world. She is an accomplished conference speaker and an O'Reilly Media author, and is very involved in the Python community. Prior to joining Reonomy, Sarah earned a Master's degree from the University of Michigan School of Information.

Packed with several machine learning libraries available in the Clojure ecosystem. Machine Learning with R ISBN: 978-1-78216-214-8 Paperback: 396 pages Learn how to use R to apply powerful machine learning methods and gain an insight into real-world applications 1. Harness the power of R for statistical computing and data science. 2. Use R to apply common machine learning algorithms with real-world applications. 3. Prepare, examine, and visualize data for analysis. 4. Understand how to choose between machine learning models. Please check www.PacktPub.com for information on our titles www.it-ebooks.info


Text Analytics With Python: A Practical Real-World Approach to Gaining Actionable Insights From Your Data by Dipanjan Sarkar

bioinformatics, business intelligence, business logic, computer vision, continuous integration, data science, deep learning, Dr. Strangelove, en.wikipedia.org, functional programming, general-purpose programming language, Guido van Rossum, information retrieval, Internet of things, invention of the printing press, iterative process, language acquisition, machine readable, machine translation, natural language processing, out of africa, performance metric, premature optimization, recommendation engine, self-driving car, semantic web, sentiment analysis, speech recognition, statistical model, text mining, Turing test, web application

Automated Text Classification Text Classification Blueprint Text Normalization Feature Extraction Bag of Words Model TF-IDF Model Advanced Word Vectorization Models Classification Algorithms Multinomial Naïve Bayes Support Vector Machines Evaluating Classification Models Building a Multi-Class Classification System Applications and Uses Summary
Chapter 5: Text Summarization Text Summarization and Information Extraction Important Concepts Documents Text Normalization Feature Extraction Feature Matrix Singular Value Decomposition Text Normalization Feature Extraction Keyphrase Extraction Collocations Weighted Tag–Based Phrase Extraction Topic Modeling Latent Semantic Indexing Latent Dirichlet Allocation Non-negative Matrix Factorization Extracting Topics from Product Reviews Automated Document Summarization Latent Semantic Analysis TextRank Summarizing a Product Description Summary
Chapter 6: Text Similarity and Clustering Important Concepts Information Retrieval (IR) Feature Engineering Similarity Measures Unsupervised Machine Learning Algorithms Text Normalization Feature Extraction Text Similarity Analyzing Term Similarity Hamming Distance Manhattan Distance Euclidean Distance Levenshtein Edit Distance Cosine Distance and Similarity Analyzing Document Similarity Cosine Similarity Hellinger-Bhattacharya Distance Okapi BM25 Ranking Document Clustering Clustering Greatest Movies of All Time K-means Clustering Affinity Propagation Ward’s Agglomerative Hierarchical Clustering Summary
Chapter 7: Semantic and Sentiment Analysis Semantic Analysis Exploring WordNet Understanding Synsets Analyzing Lexical Semantic Relations Word Sense Disambiguation Named Entity Recognition Analyzing Semantic Representations Propositional Logic First Order Logic Sentiment Analysis Sentiment Analysis of IMDb Movie Reviews Setting Up Dependencies Preparing Datasets Supervised Machine Learning Technique Unsupervised Lexicon-based Techniques Comparing Model Performances Summary
Index
Contents at a Glance About the Author About the Technical Reviewer Acknowledgments Introduction Chapter 1: Natural Language Basics Chapter 2: Python Refresher Chapter 3: Processing and Understanding Text Chapter 4: Text Classification Chapter 5: Text Summarization Chapter 6: Text Similarity and Clustering Chapter 7: Semantic and Sentiment Analysis Index
About the Author and About the Technical Reviewer
About the Author
Dipanjan Sarkar is a data scientist at Intel, the world’s largest silicon company, which is on a mission to make the world more connected and productive. He primarily works on analytics, business intelligence, application development, and building large-scale intelligent systems. He received his master’s degree in information technology from the International Institute of Information Technology, Bangalore, with a focus on data science and software engineering. He is also an avid supporter of self-learning, especially through massive open online courses, and holds a data science specialization from Johns Hopkins University on Coursera.

SSBM Finance Inc is a Delaware corporation. This book is dedicated to my parents, partner, well-wishers, and especially to all the developers, practitioners, and organizations who have created a wonderful and thriving ecosystem around analytics and data science. Introduction I have been into mathematics and statistics since high school, when numbers began to really interest me. Analytics, data science, and more recently text analytics came much later, perhaps around four or five years ago when the hype about Big Data and Analytics was getting bigger and crazier. Personally I think a lot of it is over-hyped, but a lot of it is also exciting and presents huge possibilities with regard to new jobs, new discoveries, and solving problems that were previously deemed impossible to solve.

Sarkar has been an analytics practitioner for over four years, specializing in statistical, predictive, and text analytics. He has also authored a couple of books on R and machine learning, reviews technical books, and acts as a course beta tester for Coursera. Dipanjan’s interests include learning about new technology, financial markets, disruptive startups, data science, and more recently, artificial intelligence and deep learning. In his spare time he loves reading, gaming, and watching popular sitcoms and football. About the Technical Reviewer Shanky Sharma Currently leading the AI team at Nextremer India, Shanky Sharma’s work entails implementing various AI and machine learning–related projects and working on deep learning for speech recognition in Indic languages.


pages: 567 words: 122,311

Lean Analytics: Use Data to Build a Better Startup Faster by Alistair Croll, Benjamin Yoskovitz

Airbnb, Amazon Mechanical Turk, Amazon Web Services, Any sufficiently advanced technology is indistinguishable from magic, barriers to entry, Bay Area Rapid Transit, Ben Horowitz, bounce rate, business intelligence, call centre, cloud computing, cognitive bias, commoditize, constrained optimization, data science, digital rights, en.wikipedia.org, Firefox, Frederick Winslow Taylor, frictionless, frictionless market, game design, gamification, Google X / Alphabet X, growth hacking, hockey-stick growth, Infrastructure as a Service, Internet of things, inventory management, Kickstarter, lateral thinking, Lean Startup, lifelogging, longitudinal study, Marshall McLuhan, minimum viable product, Network effects, PalmPilot, pattern recognition, Paul Graham, performance metric, place-making, platform as a service, power law, price elasticity of demand, reality distortion field, recommendation engine, ride hailing / ride sharing, rolodex, Salesforce, sentiment analysis, skunkworks, Skype, social graph, social software, software as a service, Steve Jobs, subscription business, telemarketer, the long tail, transaction costs, two-sided market, Uber for X, web application, Y Combinator

They’re uneasy with their companies being optimized without a soul, and see the need to look at the bigger picture of the market, the problem they’re solving, and their fundamental business models. Ultimately, quantitative data is great for testing hypotheses, but it’s lousy for generating new ones unless combined with human introspection. How to Think Like a Data Scientist Monica Rogati, a data scientist at LinkedIn, gave us the following 10 common pitfalls that entrepreneurs should avoid as they dig into the data their startups capture. Assuming the data is clean. Cleaning the data you capture is often most of the work, and the simple act of cleaning it up can often reveal important patterns.

Who This Book Is For This book is for the entrepreneur trying to build something innovative. We’ll walk you through the analytical process, from idea generation to achieving product/market fit and beyond, so this book is both for those starting their entrepreneurial journey and for those in the middle of it. Web analysts and data scientists may also find this book useful, because it shows how to move beyond traditional “funnel visualizations” and connect their work to more meaningful business discussions. Similarly, business professionals involved in product development, product management, marketing, public relations, and investing will find much of the content relevant, as it will help them understand and assess startups.

Remember, however, that you may still be able to invite them back to the service later if you have significant feature upgrades—as Path did when it redesigned its application—or if you’ve found a way to reach them with daily content, as Memolane did when it sent users memories from past years. As Shopify data scientist Steven H. Noble[32] explains in a detailed blog post,[33] the simple formula for churn is:

Table 9-1 shows a simple example of a freemium SaaS company’s churn calculations.

Table 9-1. Example of churn calculations

                      Jan     Feb     Mar     Apr     May     Jun
Users
  Starting with    50,000  53,000  56,300  59,930  63,923  68,315
  Newly acquired    3,000   3,600   4,320   5,184   6,221   7,465
  Total            53,000  56,600  60,920  66,104  72,325  79,790
Active users
  Starting with    14,151  15,000  15,900  16,980  18,276  19,831
  Newly active        849     900   1,080   1,296   1,555   1,866
  Total            15,000  15,900  16,980  18,276  19,831  21,697
Paying users
  Starting with     1,000   1,035   1,035   1,049   1,079   1,128
  Newly acquired       60      72      86     104     124     149
  Lost               (25)    (26)    (27)    (29)    (30)    (33)
  Total             1,035   1,081   1,140   1,216   1,310   1,426

Table 9-1 shows users, active users, and paying users.
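The churn formula itself did not survive in the excerpt above. The following is a minimal sketch that assumes the common definition, customers lost during a period divided by customers at the start of that period, and applies it to the paying-user rows of Table 9-1 exactly as printed. The function and variable names are illustrative, and the definition is an assumption rather than a quotation from the book.

```python
def churn_rate(lost, starting):
    """Fraction of customers lost in a period, relative to the period's start."""
    if starting <= 0:
        raise ValueError("starting customer count must be positive")
    return lost / starting


# Paying-user rows from Table 9-1, copied as printed.
months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun"]
paying_start = [1_000, 1_035, 1_035, 1_049, 1_079, 1_128]
paying_lost = [25, 26, 27, 29, 30, 33]

# Monthly churn under the assumed definition: lost / starting.
monthly_churn = {
    month: churn_rate(lost, start)
    for month, start, lost in zip(months, paying_start, paying_lost)
}

for month, rate in monthly_churn.items():
    print(f"{month}: {rate:.2%}")
```

Under this assumed definition, January works out to 25 / 1,000, or 2.5 percent of paying users lost during the month.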


Artificial Whiteness by Yarden Katz

affirmative action, AI winter, algorithmic bias, AlphaGo, Amazon Mechanical Turk, autonomous vehicles, benefit corporation, Black Lives Matter, blue-collar work, Californian Ideology, Cambridge Analytica, cellular automata, Charles Babbage, cloud computing, colonial rule, computer vision, conceptual framework, Danny Hillis, data science, David Graeber, deep learning, DeepMind, desegregation, Donald Trump, Dr. Strangelove, driverless car, Edward Snowden, Elon Musk, Erik Brynjolfsson, European colonialism, fake news, Ferguson, Missouri, general purpose technology, gentrification, Hans Moravec, housing crisis, income inequality, information retrieval, invisible hand, Jeff Bezos, Kevin Kelly, knowledge worker, machine readable, Mark Zuckerberg, mass incarceration, Menlo Park, military-industrial complex, Nate Silver, natural language processing, Nick Bostrom, Norbert Wiener, pattern recognition, phenotype, Philip Mirowski, RAND corporation, recommendation engine, rent control, Rodney Brooks, Ronald Reagan, Salesforce, Seymour Hersh, Shoshana Zuboff, Silicon Valley, Silicon Valley billionaire, Silicon Valley ideology, Skype, speech recognition, statistical model, Stephen Hawking, Stewart Brand, Strategic Defense Initiative, surveillance capitalism, talking drums, telemarketer, The Signal and the Noise by Nate Silver, W. E. B. Du Bois, Whole Earth Catalog, WikiLeaks

This position is echoed by other not-for-profits in the sphere of critical AI commentary, such as the American Civil Liberties Union (ACLU). At an AI policy conference hosted at MIT, which included participants from the American intelligence community and the White House, the ACLU’s executive director stated that “AI has tremendous promise, but it really depends if the data scientists and law enforcement work together.”24 These recommendations highlight the expansionist dimension of carceral-positive logic. Reforms such as data “curation,” “maintenance,” or “auditing” all entail allocating more resources for computing systems used in surveillance and policing and the infrastructure around them, and hence to those profiting from the prison-industrial complex.

AI’s rebranding was an opportunity for companies to privatize more scientific research. Google and Microsoft have used the frenzy around AI to file patents on commonly used algorithmic techniques.23 The label’s nebulosity makes for broad patents: a MasterCard–owned company filed for a patent on a “Method for Providing Data Science, Artificial Intelligence and Machine Learning As-A-Service.” AI made for hazy patents in the past, too: a patent in 1985, simply titled “Artificial Intelligence System,” claimed “an artificial intelligence system for accepting a statement, understanding the statement and making a response to the statement based upon at least a partial understanding of the statement.”

The ideological nucleus of AI discourse stays intact.   15.   As computer scientist and statistician Michael I. Jordan has observed, “Such [re]labeling may come as a surprise to optimization or statistics researchers, who find themselves suddenly called AI researchers.” Jordan, “Artificial Intelligence—The Revolution Hasn’t Happened Yet,” Harvard Data Science Review, June 23, 2019. But while Jordan recognizes a rebranding of sorts, he does not question the possibility and coherence of AI nor the political forces capitalizing on its rise. Rather, he argues that the real workhorses behind the recent advances were his own fields of statistics and optimization, which produced “ideas hidden behind the scenes” that “have powered companies such as Google, Netflix, Facebook, and Amazon.”


pages: 291 words: 90,771

Upscale: What It Takes to Scale a Startup. By the People Who've Done It. by James Silver

Airbnb, augmented reality, Ben Horowitz, Big Tech, blockchain, business process, call centre, credit crunch, crowdsourcing, data science, DeepMind, DevOps, family office, flag carrier, fulfillment center, future of work, Google Hangouts, growth hacking, high net worth, hiring and firing, imposter syndrome, Jeff Bezos, Kickstarter, Lean Startup, Lyft, Mark Zuckerberg, minimum viable product, Network effects, pattern recognition, reality distortion field, ride hailing / ride sharing, Salesforce, Silicon Valley, Skype, Snapchat, software as a service, Uber and Lyft, uber lyft, WeWork, women in the workforce, Y Combinator

‘If you don’t have good people taking control of those areas, who are able to step up and be accountable for key issues like hiring, product and growth, then you won’t really know where to turn as a founder.’ ‘Communication and ongoing dialogue are critical.’ As a founder, you need to make sure that your teams - right down to specialists, if you’re a software company, such as your back-end and front-end developers, data scientists and DevOps person - are talking to one another, and that people are pointed in the same direction. Obviously the more people you have in the organisation, the more challenging that becomes to manage and orchestrate. ‘That’s where things like strategy, culture, goals and objectives become really important because the larger, the more complex the organisation becomes, the more you need things that really bind people together.’

It even got to the point where we were considering putting trackers in the boxes, but it was a time of heightened security concerns and we thought radio transmitters were not a good idea.’ He smiles. ‘Then someone asked: “Can’t you just pay to have these things scanned?” It turned out you could, at seven different places. You just chunked up the money and did it once and did a data science piece [based on] posting them on different days, different strategies, and posting them in different places.’ As a UK-headquartered startup on a limited budget exploring logistics across a vast and complex country, the team ultimately took an imaginative, entrepreneurial approach: namely, one that involved cardboard rabbits.

She explains: ‘When you’re using machine learning, every time you get an answer right, that’s good. However, every time you get an answer wrong, it’s called a “false positive”. There were very few people in the world at that stage that we met that even understood the terminology. So finding the guys that were fighting the fraud battles initially was brilliant, because we spoke the same data-science language, that was the first thing. ‘Then they were able to tell us about the problems they were having with their existing suppliers. We replied: “If we were able to build you something that actually worked better, would that be of interest to you?” And the answer to that was: “Yes, please.”’ King and her team were thus in the enviable position of designing and building a product with continuous input from the customer.


pages: 237 words: 74,109

Uncanny Valley: A Memoir by Anna Wiener

autonomous vehicles, back-to-the-land, basic income, behavioural economics, Blitzscaling, blockchain, blood diamond, Burning Man, call centre, charter city, cloud computing, cognitive bias, cognitive dissonance, commoditize, crowdsourcing, cryptocurrency, dark triade / dark tetrad, data science, digital divide, digital nomad, digital rights, end-to-end encryption, Extropian, functional programming, future of work, gentrification, Golden Gate Park, growth hacking, guns versus butter model, housing crisis, Jane Jacobs, job automation, knowledge worker, Lean Startup, means of production, medical residency, microaggression, microapartment, microdosing, new economy, New Urbanism, Overton Window, passive income, Plato's cave, pull request, rent control, ride hailing / ride sharing, San Francisco homelessness, Sand Hill Road, self-driving car, sharing economy, Shenzhen special economic zone, side project, Silicon Valley, Silicon Valley startup, Social Justice Warrior, social web, South of Market, San Francisco, special economic zone, subprime mortgage crisis, systems thinking, tech bro, tech worker, technoutopianism, telepresence, telepresence robot, union organizing, universal basic income, unpaid internship, urban planning, urban renewal, warehouse robotics, women in the workforce, work culture, Y2K, young professional

* * * I moved into an apartment in the Castro, joining a man and a woman in their late twenties, roommates who had wiggled their way onto a hand-me-down lease. They were tech workers, too. The woman worked as a midlevel product manager at the social network everyone hated; the man as a data scientist at a struggling solar-energy startup. They were both endurance runners; the data scientist kept a road bike in his bedroom. They had no body fat. They had no art in the apartment, either. On the refrigerator was an impressive collection of novelty magnets arranged in a perfect grid. The apartment was gigantic, a duplex with two living rooms and a view of the bay.

It could be integrated into online boutiques, digital megamalls, banks, social networks, streaming and gaming websites. It gathered data for platforms that enabled people to book flights or hotels or restaurant reservations or wedding venues; platforms for buying a house or finding a house cleaner, ordering takeout or arranging a date. Engineers and data scientists and product managers would inject snippets of our code into their own codebases, specify which behaviors they wanted to track, and begin collecting data immediately. Anything an app or website’s users did—tap a button, take a photograph, send a payment, swipe right, enter text—could be recorded in real time, stored, aggregated, and analyzed in those beautiful dashboards.

A pay-as-you-wish yoga studio shared a creaky walk-up with the headquarters of an encrypted-communications platform. A bodega selling loosies sat below an anarchistic hacker space. The older office buildings, regal and unkempt with marble floors and peeling paint, housed orthodontists and rare-book dealers alongside four-person companies trying to gamify human resources or commoditize meditation. Data scientists smoked weed in Dolores Park with Hula-Hoopers and blissed-out suburban teenagers. The independent movie theaters played ads for networked appliances and B2B software before projecting seventies cult classics. Even racks at the dry cleaner suggested a city in transition: starched police uniforms and synthetic neon furs, sheathed in plastic, hung beside custom-made suits and machine-washable pullovers.


pages: 157 words: 53,125

The Fifth Risk by Michael Lewis

Albert Einstein, behavioural economics, Bernie Sanders, Biosphere 2, chief data officer, cloud computing, data science, Donald Trump, fake news, Ferguson, Missouri, low interest rates, machine readable, opioid epidemic / opioid crisis, Silicon Valley, Solyndra, Steve Bannon, tail risk, the new new thing, uranium enrichment

The company was about to go public, and they wanted to clean up the organization chart. To that end DJ sat down with his counterpart at Facebook, who was dealing with the same problem. What could they call all these data people? “Data scientist,” his Facebook friend suggested. “We weren’t trying to create a new field or anything, just trying to get HR off our backs,” said DJ. He replaced the job titles for some openings with “data scientist.” To his surprise, the number of applicants for the jobs skyrocketed. “Data scientists” were what people wanted to be. In the fall of 2014 someone from the White House called him. Obama was coming to San Francisco and wanted to meet with him.

“How do we know if any of this will be of any use?” she asked. “If your husband is as good as everyone says he is, he’ll figure it out,” said Obama. Which of course made it even harder for DJ to refuse. DJ went to Washington. His assignment was to figure out how to make better use of the data created by the U.S. government. His title: Chief Data Scientist of the United States. He’d be the first person to hold the job. He made his first call at the Department of Commerce, to meet with Penny Pritzker, the commerce secretary, and Kathy Sullivan, the head of the National Oceanic and Atmospheric Administration. They were pleased to see him but also a bit taken aback that he had come.

“We’re going to open all the data and go to every economics department and say, ‘Hey, you want a PhD?’ In every agency there were questions to be answered. Most of the answers we have gotten have not come from government. They’ve come from the broad American public who has access to the data.” The opioid crisis was a case in point. The data scientists in the Department of Health and Human Services had opened up the Medicaid and Medicare data, which held information about prescription drugs. Journalists at ProPublica had combed through it and discovered odd concentrations of opioid prescriptions. “We would never have figured out that there was an opioid crisis without the data,” said DJ.


pages: 541 words: 109,698

Mining the Social Web: Finding Needles in the Social Haystack by Matthew A. Russell

Andy Rubin, business logic, Climategate, cloud computing, crowdsourcing, data science, en.wikipedia.org, fault tolerance, Firefox, folksonomy, full text search, Georg Cantor, Google Earth, information retrieval, machine readable, Mark Zuckerberg, natural language processing, NP-complete, power law, Saturday Night Live, semantic web, Silicon Valley, slashdot, social graph, social web, sparse data, statistical model, Steve Jobs, supply-chain management, text mining, traveling salesman, Turing test, web application

Further analysis of the graph is left as a voluntary exercise for the reader, as the primary objective of this chapter was to get your development environment squared away and whet your appetite for more interesting topics. Graphviz appears elsewhere in this book, and if you consider yourself to be a data scientist (or are aspiring to be one), it is a tool that you’ll want to master. That said, we’ll also look at many other useful approaches to visualizing graphs. In the chapters to come, we’ll cover additional outlets of social web data and techniques for analysis. Synthesis: Visualizing Retweets with Protovis A turn-key example script that synthesizes much of the content from this chapter and adds a visualization is how we’ll wrap up this chapter.

The approach introduced in this section is to use graph-like structures, where a link between documents encodes a measure of the similarity between them. This situation presents an excellent opportunity to introduce more visualizations from Protovis, a cutting-edge HTML5-based visualization toolkit under development by the Stanford Visualization Group. Protovis is specifically designed with the interests of data scientists in mind, offers a familiar declarative syntax, and achieves a nice middle ground between high-level and low-level interfaces. A minimal (uninteresting) adaptation to Example 7-7 is all that’s needed to emit a collection of nodes and edges that can be used to produce visualizations of our data examples gallery.

Levitt is the co-author of Freakonomics: A Rogue Economist Explores the Hidden Side of Everything (Harper), a book that systematically uses data to answer seemingly radical questions such as, “What do school teachers and sumo wrestlers have in common?” [34] This question was partly inspired by the interesting Radar post, “Data science democratized”, which mentions a presentation that investigated the same question. [35] A “long tail” or “heavy tail” refers to a feature of statistical distributions in which a significant portion (usually 50 percent or more) of the area under the curve exists within its tail. This concept is revisited as part of a brief overview of Zipf’s law in Data Hacking with NLTK.


pages: 100 words: 15,500

Getting Started with D3 by Mike Dewar

data science, Firefox, Google Chrome, linked data

The data for this book has been gathered and made publicly available by the New York Metropolitan Transit Authority (MTA) and details various aspects of New York’s transit system, comprising historical tables, live data streams, and geographical information. By the end of the book, we will have visited some of the core aspects of D3, and will be properly equipped to build basic, interactive data visualizations on the Web. Who This Book Is For This is a little book aimed at the data scientist: someone who has data to visualize and who wants to use the power of the modern web browser to give his visualizations additional impact. This might be an academic who wants to escape the confines of the printed article, a statistician who needs to share her impressive results with the rest of her company, or the designer who wants to get his info-viz out far and wide on the Internet.

Note Time spent forming clean, well-structured JSON can save you a lot of heartache down the road. Make sure any JSON you use satisfies http://jsonlint.com at the very least. Performing cleaning or data analysis in the browser is not only a frustrating programming task, but can also make your visualization less responsive. Micha’s Golden Rule Micha Gorelick, a data scientist in NYC, coined the following rule: Do not store data in the keys of a JSON blob. This is Micha’s Golden Rule; it should always be followed when forming JSON for use in D3, and will save you many confusing hours. This means that one should never form JSON like the following: { "bob": 20, "alice": 23, "drew": 30 } Here we are storing data in both the key (name) and the value (age).
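One possible way to bring such a blob in line with the rule is sketched below in Python, as a preprocessing step before the JSON reaches the browser. The field names `name` and `age` follow the parenthetical in the excerpt; the helper `to_records` is a hypothetical name introduced only for illustration, not something taken from the book.

```python
import json

# The blob from the excerpt: data (the names) is stored in the keys.
keyed = json.loads('{"bob": 20, "alice": 23, "drew": 30}')


def to_records(blob, key_name="name", value_name="age"):
    """Turn {key: value} pairs into a list of uniform objects.

    Hypothetical helper: every record gets the same two fields, so no
    data lives in the keys any more.
    """
    return [{key_name: k, value_name: v} for k, v in blob.items()]


records = to_records(keyed)
print(json.dumps(records, indent=2))
```

Each element of `records` now has the same shape, which is generally the form that is easiest to bind to visual elements one item at a time.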

This is an essential resource, both for reference and inspiration. Finally, the community around D3 is very active and friendly, and growing fast. The d3-js user group is a great resource for conversation and the d3.js tag on Stack Overflow should be used for specific questions. About the Author Mike Dewar is a data-scientist at Bitly, a New York tech company that makes long URLs shorter. He has a PhD in modelling dynamic systems from data from the University of Sheffield in the UK, and has worked as a Machine Learning post-doc in The University of Edinburgh and Columbia University. He has been drawing graphs regularly since he was in High School, and is starting to get the hang of it.


pages: 350 words: 90,898

A World Without Email: Reimagining Work in an Age of Communication Overload by Cal Newport

Cal Newport, call centre, Claude Shannon: information theory, cognitive dissonance, collaborative editing, Compatible Time-Sharing System, computer age, COVID-19, creative destruction, data science, David Heinemeier Hansson, fault tolerance, Ford Model T, Frederick Winslow Taylor, future of work, Garrett Hardin, hive mind, Inbox Zero, interchangeable parts, it's over 9,000, James Watt: steam engine, Jaron Lanier, John Markoff, John Nash: game theory, Joseph Schumpeter, Kanban, Kickstarter, knowledge worker, Marshall McLuhan, Nash equilibrium, passive income, Paul Graham, place-making, pneumatic tube, remote work: asynchronous communication, remote working, Richard Feynman, rolodex, Salesforce, Saturday Night Live, scientific management, Silicon Valley, Silicon Valley startup, Skype, social graph, stealth mode startup, Steve Jobs, supply-chain management, technological determinism, the medium is the message, the scientific method, Tragedy of the Commons, web application, work culture, Y Combinator

To help understand the true scarcity of uninterrupted time, the RescueTime data scientists also calculated the longest interval that each user worked with no inbox checks or instant messaging. For half the users studied, this longest uninterrupted interval was no more than forty minutes, with the most common length clocking in at a meager twenty minutes. More than two thirds of the users never experienced an hour or more of uninterrupted time during the period studied. To make these observations more concrete, Madison Lukaczyk, one of the data scientists involved in this report, published a chart capturing one full week of her own communication tool usage data.

But the studies cited provide only small snapshots of our current predicament, with the typical experiment observing at most a couple dozen employees for just a handful of days. For a more comprehensive picture of what’s going on in the standard networked office, we’ll turn to a small productivity software firm called RescueTime, which in recent years, with the help of a pair of dedicated data scientists, has been quietly producing a remarkable data set that allows an unprecedented look into the details of the communication habits of contemporary knowledge workers. * * * — The core product of RescueTime is its eponymous time-tracking tool, which runs in the background on your devices and records how much time you spend using various applications and websites.

Because the tool is a web application, however, all this data is stored in central servers, which makes it possible to aggregate and analyze the time use habits of tens of thousands of users. After a few false starts, RescueTime got serious about getting these analyses right. In 2016 they hired a pair of full-time data scientists, who transformed the data into the right format to study trends and properly protect privacy, and then got to work trying to understand how these modern, productivity-minded knowledge workers were actually spending their time. The results were staggering. A report from the summer of 2018 analyzed anonymized behavior data from over fifty thousand active users of the tracking software.9 It reveals that half these users were checking communication applications like email and Slack every six minutes or less.


pages: 252 words: 71,176

Strength in Numbers: How Polls Work and Why We Need Them by G. Elliott Morris

affirmative action, call centre, Cambridge Analytica, commoditize, coronavirus, COVID-19, critical race theory, data science, Donald Trump, Francisco Pizarro, green new deal, lockdown, Moneyball by Michael Lewis explains big data, Nate Silver, random walk, Ronald Reagan, selection bias, Silicon Valley, Socratic dialogue, statistical model, Works Progress Administration

Major media outlets, such as the Washington Post, now regularly work with online pollsters, such as Ipsos. Public opinion polling, at long last, has entered the twenty-first century. STIRRING THE POT In the Obama campaign’s data cave on Election Day 2012, things were not looking so good. “It was 10:30am . . . and my numbers were telling me that President Obama might lose Ohio,” Yair Ghitza, a data scientist for a voter-file vendor called Catalist, recounts in his PhD dissertation. Data on turnout had come in, and his modeling showed that young people and minorities were turning out at lower rates than they had expected: I remember one senior analyst’s ominous interpretation: “it looked like this could be real.”

According to Nolan McCaskill, a reporter for Politico, even Trump did not envision a victory.3 TWO DAYS AFTER THE 2016 ELECTION, the New York Times published a story titled “How Data Failed Us in Calling an Election.” In it, technology journalists Steve Lohr and Natasha Singer chastised election forecasters for getting the contest “wrong.” They decried the media’s growing reliance on data to handicap the horse race. “Data science is a technology advance with trade-offs,” they wrote. “It can see things as never before, but also can be a blunt instrument, missing context and nuance.” Lohr and Singer charged election forecasters and handicappers with taking their eyes off the ball, focusing on the data without considering other sources for prognostication.4 Election forecasters, in turn, blamed the pollsters.


pages: 294 words: 77,356

Automating Inequality by Virginia Eubanks

autonomous vehicles, basic income, Black Lives Matter, business process, call centre, cognitive dissonance, collective bargaining, correlation does not imply causation, data science, deindustrialization, digital divide, disruptive innovation, Donald Trump, driverless car, Elon Musk, ending welfare as we know it, experimental subject, fake news, gentrification, housing crisis, Housing First, IBM and the Holocaust, income inequality, job automation, mandatory minimum, Mark Zuckerberg, mass incarceration, minimum wage unemployment, mortgage tax deduction, new economy, New Urbanism, payday loans, performance metric, Ronald Reagan, San Francisco homelessness, self-driving car, sparse data, statistical model, strikebreaker, underbanked, universal basic income, urban renewal, W. E. B. Du Bois, War on Poverty, warehouse automation, working poor, Works Progress Administration, young professional, zero-sum game

The movement manufactures and circulates misleading stories about the poor: that they are an undeserving, fraudulent, dependent, and immoral minority. Conservative critics of the welfare state continue to run a very effective propaganda campaign to convince Americans that the working class and the poor must battle each other in a zero-sum game over limited resources. More quietly, program administrators and data scientists push high-tech tools that promise to help more people, more humanely, while promoting efficiency, identifying fraud, and containing costs. The digital poorhouse is framed as a way to rationalize and streamline benefits, but the real goal is what it has always been: to profile, police, and punish the poor. 2 AUTOMATING ELIGIBILITY IN THE HEARTLAND A little white donkey is chewing on a fencepost where we turn toward the Stipes house on a narrow utility road paralleling the train tracks in Tipton, Indiana.

A 14-year-old living in a cold and dirty house gets a risk score almost three times as high as a 6-year-old whose mother suspects he may have been abused and who may now be homeless. In these cases, the model does not seem to meet a commonsense standard for providing information useful enough to guide call screeners’ decision-making. Why might that be? Data scientist Cathy O’Neil has written that “models are opinions embedded in mathematics.”8 Models are useful because they let us strip out extraneous information and focus only on what is most critical to the outcomes we are trying to predict. But they are also abstractions. Choices about what goes into them reflect the priorities and preoccupations of their creators.

But its outwardly neutral classifications mask discriminatory outcomes that rob whole communities of wealth, compounding cumulative disadvantage. The digital poorhouse replaces the sometimes-biased decision-making of frontline social workers with the rational discrimination of high-tech tools. Administrators and data scientists focus public attention on the bias that enters decision-making systems through caseworkers, property managers, service providers, and intake center workers. They obliquely accuse their subordinates, often working-class people, of being the primary source of racist and classist outcomes in their organizations.


Hands-On Machine Learning With Scikit-Learn, Keras, and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems by Aurelien Geron

AlphaGo, Amazon Mechanical Turk, Bayesian statistics, centre right, combinatorial explosion, constrained optimization, correlation coefficient, crowdsourcing, data science, deep learning, DeepMind, duck typing, en.wikipedia.org, Geoffrey Hinton, iterative process, Netflix Prize, NP-complete, optical character recognition, P = NP, p-value, pattern recognition, performance metric, recommendation engine, self-driving car, SpamAssassin, speech recognition, statistical model

Then, before we set out to explore the Machine Learning continent, we will take a look at the map and learn about the main regions and the most notable landmarks: supervised versus unsupervised learning, online versus batch learning, instance-based versus model-based learning. Then we will look at the workflow of a typical ML project, discuss the main challenges you may face, and cover how to evaluate and fine-tune a Machine Learning system. This chapter introduces a lot of fundamental concepts (and jargon) that every data scientist should know by heart. It will be a high-level overview (the only chapter without much code), all rather simple, but you should make sure everything is crystal-clear to you before continuing to the rest of the book. So grab a coffee and let’s get started! Tip If you already know all the Machine Learning basics, you may want to skip directly to Chapter 2.

Poor-Quality Data Obviously, if your training data is full of errors, outliers, and noise (e.g., due to poor-quality measurements), it will make it harder for the system to detect the underlying patterns, so your system is less likely to perform well. It is often well worth the effort to spend time cleaning up your training data. The truth is, most data scientists spend a significant part of their time doing just that. For example: If some instances are clearly outliers, it may help to simply discard them or try to fix the errors manually. If some instances are missing a few features (e.g., 5% of your customers did not specify their age), you must decide whether you want to ignore this attribute altogether, ignore these instances, fill in the missing values (e.g., with the median age), or train one model with the feature and one model without it, and so on.
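The options listed for missing features can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the customer records, the field names, and the use of `None` to mark a missing age are all made up here, and the standard library is used in place of any particular data-science package.

```python
from statistics import median

# Hypothetical customer records; None marks a missing age.
customers = [
    {"id": 1, "age": 34, "spend": 120.0},
    {"id": 2, "age": None, "spend": 80.0},
    {"id": 3, "age": 52, "spend": 200.0},
    {"id": 4, "age": 41, "spend": 95.0},
]

# Option 1: ignore the attribute altogether.
without_age = [{k: v for k, v in c.items() if k != "age"} for c in customers]

# Option 2: ignore the instances that are missing the attribute.
complete_only = [c for c in customers if c["age"] is not None]

# Option 3: fill in the missing values with the median of the known ages.
median_age = median(c["age"] for c in customers if c["age"] is not None)
filled = [
    {**c, "age": median_age if c["age"] is None else c["age"]}
    for c in customers
]

print(len(without_age), len(complete_only), median_age)
```

The fourth option in the text, training one model with the feature and one without, would amount to fitting on `filled` (or `complete_only`) and on `without_age` and comparing the results.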

It’s just boring Pandas code that joins the life satisfaction data from the OECD with the GDP per capita data from the IMF. 7 It’s okay if you don’t understand all the code yet; we will present Scikit-Learn in the following chapters. 8 For example, knowing whether to write “to,” “two,” or “too” depending on the context. 9 Figure reproduced with permission from Banko and Brill (2001), “Learning Curves for Confusion Set Disambiguation.” 10 “The Unreasonable Effectiveness of Data,” Peter Norvig et al. (2009). 11 “The Lack of A Priori Distinctions Between Learning Algorithms,” D. Wolpert (1996). Chapter 2. End-to-End Machine Learning Project In this chapter, you will go through an example project end to end, pretending to be a recently hired data scientist in a real estate company.1 Here are the main steps you will go through: Look at the big picture. Get the data. Discover and visualize the data to gain insights. Prepare the data for Machine Learning algorithms. Select a model and train it. Fine-tune your model. Present your solution.


pages: 267 words: 72,552

Reinventing Capitalism in the Age of Big Data by Viktor Mayer-Schönberger, Thomas Ramge

accounting loophole / creative accounting, Air France Flight 447, Airbnb, Alvin Roth, Apollo 11, Atul Gawande, augmented reality, banking crisis, basic income, Bayesian statistics, Bear Stearns, behavioural economics, bitcoin, blockchain, book value, Capital in the Twenty-First Century by Thomas Piketty, carbon footprint, Cass Sunstein, centralized clearinghouse, Checklist Manifesto, cloud computing, cognitive bias, cognitive load, conceptual framework, creative destruction, Daniel Kahneman / Amos Tversky, data science, Didi Chuxing, disruptive innovation, Donald Trump, double entry bookkeeping, Elon Musk, en.wikipedia.org, Erik Brynjolfsson, Evgeny Morozov, flying shuttle, Ford Model T, Ford paid five dollars a day, Frederick Winslow Taylor, fundamental attribution error, George Akerlof, gig economy, Google Glasses, Higgs boson, information asymmetry, interchangeable parts, invention of the telegraph, inventory management, invisible hand, James Watt: steam engine, Jeff Bezos, job automation, job satisfaction, joint-stock company, Joseph Schumpeter, Kickstarter, knowledge worker, labor-force participation, land reform, Large Hadron Collider, lone genius, low cost airline, low interest rates, Marc Andreessen, market bubble, market design, market fundamentalism, means of production, meta-analysis, Moneyball by Michael Lewis explains big data, multi-sided market, natural language processing, Neil Armstrong, Network effects, Nick Bostrom, Norbert Wiener, offshore financial centre, Parag Khanna, payday loans, peer-to-peer lending, Peter Thiel, Ponzi scheme, prediction markets, price anchoring, price mechanism, purchasing power parity, radical decentralization, random walk, recommendation engine, Richard Thaler, ride hailing / ride sharing, Robinhood: mobile stock trading app, Sam Altman, scientific management, Second Machine Age, self-driving car, Silicon Valley, Silicon Valley startup, six sigma, smart grid, smart meter, Snapchat, statistical model, Steve Jobs, 
subprime mortgage crisis, Suez canal 1869, tacit knowledge, technoutopianism, The Future of Employment, The Market for Lemons, The Nature of the Firm, transaction costs, universal basic income, vertical integration, William Langewiesche, Y Combinator

This wouldn’t work in every market—getting a good deal on a car can save a person thousands of dollars and may well be worth the effort. But for clothes, as Stitch Fix shows, it’s a model that can be successful. To achieve service at scale, Stitch Fix analyzes rich and comprehensive data streams. By 2016, the company employed more than seventy data scientists on a team headed by chief algorithm officer Eric Colson. Colson ran data science at Netflix, one of the rich-data pioneers. As it turns out, though, picking the right clothes is much harder than suggesting films to watch. Stitch Fix employs vastly more sophisticated data analytics than the standard social-filtering recommendation engines (“people who liked this movie also enjoyed this one”).


pages: 255 words: 78,207

Web Scraping With Python: Collecting Data From the Modern Web by Ryan Mitchell

AltaVista, Amazon Web Services, Apollo 13, cloud computing, Computing Machinery and Intelligence, data science, en.wikipedia.org, Firefox, Guido van Rossum, information security, machine readable, meta-analysis, natural language processing, optical character recognition, random walk, self-driving car, Turing test, web application

All tables in MySQL must have at least one primary key (the key column that MySQL sorts on), so that MySQL knows how to order it, and it can often be difficult to choose these keys intelligently. The debate over whether to use an artificially created id column for this key or some unique attribute such as username has raged among data scientists and software engineers for years, although I tend to lean on the side of creating id columns. The reasons for doing it one way or the other are complicated but for nonenterprise systems, you should always be using an id column as an autoincremented primary key. Second, use intelligent indexing.
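The id-column approach can be illustrated with a minimal sketch. To keep it runnable without a database server, it uses Python's built-in `sqlite3` module rather than MySQL, so the DDL is SQLite's; the table name `users` and its columns are hypothetical. A MySQL declaration for the same column would typically read `id INT NOT NULL AUTO_INCREMENT PRIMARY KEY`.

```python
import sqlite3

# In-memory SQLite database, used here as a stand-in for MySQL.
conn = sqlite3.connect(":memory:")

# Artificial id column as the autoincremented primary key; the natural
# candidate key (username) is kept unique but is not the primary key.
conn.execute(
    """
    CREATE TABLE users (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        username TEXT NOT NULL UNIQUE
    )
    """
)

# The id is never supplied by the caller; the database assigns it.
conn.executemany(
    "INSERT INTO users (username) VALUES (?)",
    [("alice",), ("bob",)],
)

rows = conn.execute("SELECT id, username FROM users ORDER BY id").fetchall()
print(rows)
conn.close()
```

Keeping `username` under a `UNIQUE` constraint preserves the uniqueness guarantee of the natural key while other tables reference the compact integer id.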

Appendix C includes case studies, as well as a breakdown of key issues that might affect how you can legally run scrapers in the United States and use the data that they produce. Technical books are often able to focus on a single language or technology, but web scraping is a relatively disparate subject, with practices that require the use of databases, web servers, HTTP, HTML, Internet security, image processing, data science, and other tools. This book attempts to cover all of these to an extent for the purpose of gathering data from remote sources across the Internet. Part I covers the subject of web scraping and web crawling in depth, with a strong focus on a small handful of libraries used throughout the book. Part I can easily be used as a comprehensive reference for these libraries and techniques (with certain exceptions, where additional references will be provided).


pages: 318 words: 73,713

The Shame Machine: Who Profits in the New Age of Humiliation by Cathy O'Neil

2021 United States Capitol attack, Affordable Care Act / Obamacare, basic income, big-box store, Black Lives Matter, British Empire, call centre, cognitive dissonance, colonial rule, coronavirus, COVID-19, crack epidemic, crowdsourcing, data science, delayed gratification, desegregation, don't be evil, Edward Jenner, fake news, George Floyd, Greta Thunberg, Jon Ronson, Kickstarter, linked data, Mahatma Gandhi, mass incarceration, microbiome, microdosing, Nelson Mandela, opioid epidemic / opioid crisis, pre–internet, profit motive, QAnon, Ronald Reagan, selection bias, Silicon Valley, social distancing, Stanford marshmallow experiment, Streisand effect, TikTok, Walter Mischel, War on Poverty, working poor

In their massive research labs, mathematicians work closely with psychologists and anthropologists, using our behavioral data to train their machines. Their objective is to spur customer participation and to mine advertising gold. When it comes to this type of intense engagement, shame is one of the most potent motivators. It’s right up there with sex. So even if the data scientists and their bosses in the executive suites might not map out a strategy based on shaming, their automatic algorithms zero in on it. It spurs traffic and boosts revenue. You could argue that the people mocking Joanna McCabe didn’t intend to hurt her. They were just having a laugh. The photo of her tumble at Walmart provided an opportunity to preen on social media and to drive up reputations, gaining likes and followers.

See also labor and employment; work requirements for government benefits, 60–61, 66–67, 76; work therapy rehab programs, 48, 54–56; Zaki, Jamil, 213. By Cathy O’Neil: Doing Data Science (with Rachel Schutt); On Being a Data Skeptic; Weapons of Math Destruction; The Shame Machine. About the Author: Cathy O’Neil is the author of the bestselling Weapons of Math Destruction, which won the Euler Book Prize and was longlisted for the National Book Award. She received her PhD in mathematics from Harvard and has worked in finance, tech, and academia.


pages: 250 words: 79,360

Escape From Model Land: How Mathematical Models Can Lead Us Astray and What We Can Do About It by Erica Thompson

Alan Greenspan, Bayesian statistics, behavioural economics, Big Tech, Black Swan, butterfly effect, carbon tax, coronavirus, correlation does not imply causation, COVID-19, data is the new oil, data science, decarbonisation, DeepMind, Donald Trump, Drosophila, Emanuel Derman, Financial Modelers Manifesto, fudge factor, germ theory of disease, global pandemic, hindcast, I will remember that I didn’t make the world, and it doesn’t satisfy my equations, implied volatility, Intergovernmental Panel on Climate Change (IPCC), John von Neumann, junk bonds, Kim Stanley Robinson, lockdown, Long Term Capital Management, moral hazard, mouse model, Myron Scholes, Nate Silver, Neal Stephenson, negative emissions, paperclip maximiser, precautionary principle, RAND corporation, random walk, risk tolerance, selection bias, self-driving car, social distancing, Stanford marshmallow experiment, statistical model, systematic bias, tacit knowledge, tail risk, TED Talk, The Great Moderation, The Great Resignation, the scientific method, too big to fail, trolley problem, value at risk, volatility smile, Y2K

If modellers, because of who they are, are able to work from home on a laptop while other people maintain the supply chains that bring them their lunchtime sandwich, they simply may not have access to all of the possible harms of the kinds of actions that are being proposed. This is not a conspiracy theory. Instead, this is what data scientists Catherine D’Ignazio and Lauren Klein term ‘privilege hazard’. It reflects no malice aforethought on the part of the modellers, who I am sure were doing their level best at a time of national crisis. Nor does it necessarily mean that the wrong information was given: after all, the modellers provided what the politicians said they wanted, and in a democracy it is the job of the political representative (not the scientist) to combine the available information with the values and interests of citizens to come to a decision.

The Dissemination of Reliable Knowledge, Cambridge University Press, 2010 Mansnerus, Erika, Modelling in Public Health Research: How Mathematical Techniques Keep us Healthy, Springer, 2014 Neustadt, Richard, and Harvey Fineberg, The Swine Flu Affair: Decision-making on a Slippery Disease, US Dept of Health, Education and Welfare, 1978 Rhodes, Tim, Kari Lancaster and Marsha Rosengarten, ‘A Model Society: Maths, Models and Expertise in Viral Outbreaks’, Critical Public Health, 30(3), 2020 ——, and Kari Lancaster, ‘Mathematical Models as Public Troubles in COVID-19 Infection Control: Following the Numbers’, Health Sociology Review, 29(2), 2020 Richardson, Eugene, ‘Pandemicity, COVID-19 and the Limits of Public Health Science’, BMJ Global Health, 5(4), 2020 Spinney, Laura, Pale Rider: The Spanish Flu of 1918 and How it Changed the World, Vintage, 2018 Chapter 10: Escaping from Model Land Dhami, Mandeep, ‘Towards an Evidence-Based Approach to Communicating Uncertainty in Intelligence Analysis’, Intelligence and National Security, 33(2), 2018 Harding, Sandra, Objectivity and Diversity: Another Logic of Scientific Research, University of Chicago Press, 2015 Marchau, Vincent, Warren Walker, Pieter Bloemen and Steven Popper (eds), Decision Making under Deep Uncertainty, Springer, 2019 Scoones, Ian, and Andy Stirling, The Politics of Uncertainty: Challenges of Transformation, Taylor & Francis, 2020 Chris Vernon Erica Thompson is a senior policy fellow at the London School of Economics’ Data Science Institute and a fellow of the London Mathematical Laboratory. With a PhD from Imperial College, she has recently worked on the limitations of models of COVID-19 spread, humanitarian crises, and climate change. She lives in West Wales.


Likewar: The Weaponization of Social Media by Peter Warren Singer, Emerson T. Brooking

4chan, active measures, Airbnb, augmented reality, barriers to entry, battle of ideas, Bellingcat, Bernie Sanders, Black Lives Matter, British Empire, Cambridge Analytica, Cass Sunstein, citizen journalism, Citizen Lab, Comet Ping Pong, content marketing, crony capitalism, crowdsourcing, data science, deep learning, digital rights, disinformation, disintermediation, Donald Trump, drone strike, Edward Snowden, en.wikipedia.org, Erik Brynjolfsson, Evgeny Morozov, fake news, false flag, Filter Bubble, global reserve currency, Google Glasses, Hacker Conference 1984, Hacker News, illegal immigration, information security, Internet Archive, Internet of things, invention of movable type, it is difficult to get a man to understand something, when his salary depends on his not understanding it, Jacob Silverman, John Gilmore, John Markoff, Kevin Roose, Kickstarter, lateral thinking, lolcat, Mark Zuckerberg, megacity, Menlo Park, meta-analysis, MITM: man-in-the-middle, Mohammed Bouazizi, Moneyball by Michael Lewis explains big data, moral panic, new economy, offshore financial centre, packet switching, Panopticon Jeremy Bentham, Parag Khanna, pattern recognition, Plato's cave, post-materialism, Potemkin village, power law, pre–internet, profit motive, RAND corporation, reserve currency, sentiment analysis, side project, Silicon Valley, Silicon Valley startup, Snapchat, social web, South China Sea, Steve Bannon, Steve Jobs, Steven Levy, Stewart Brand, systems thinking, too big to fail, trade route, Twitter Arab Spring, UNCLOS, Upton Sinclair, Valery Gerasimov, We are Anonymous. We are Legion, We are as Gods, Whole Earth Catalog, WikiLeaks, Y Combinator, yellow journalism, Yochai Benkler

Russian sockpuppets ran rampant on services like Instagram, an image-sharing platform with over 800 million users (larger than Twitter and Snapchat combined) and more popular among youth than its Facebook corporate parent. Here, the pictorial nature of Instagram made the disinformation even more readily shareable and reproducible. In 2017, data scientist Jonathan Albright conducted a study of just twenty-eight accounts identified as having been operated by the Russian government. He found that this handful of accounts had drawn an astounding 145 million “likes,” comments, and plays of their embedded videos. They’d also provided the visual ammunition subsequently used by other trolls who stalked Facebook and Twitter.

One was labeled a “pornographer,” and another was accused of harassment. Such attacks can be doubly effective, not only silencing the direct targets but also discouraging others from doing the sort of work that earned such abuse. While the sockpuppets were extremely active in the 2016 election, it was far from their only campaign. In 2017, data scientists searched for patterns in accounts that were pushing the theme of #UniteTheRight, the far-right protests that culminated in the killing of a young woman in Charlottesville, Virginia, by a neo-Nazi. The researchers discovered that one key account in spreading the messages of hate came to life each day at 8:00 A.M.

As psychologist Sander van der Linden has written, belief in online conspiracy theories makes one more supportive of “extremism, racist attitudes against minority groups (e.g., anti-Semitism) and even political violence.” Modest lies and grand conspiracy theories have been weapons in the political arsenal for millennia. But social media has made them more powerful and more pervasive than ever before. In the most comprehensive study of its kind, MIT data scientists charted the life cycles of 126,000 Twitter “rumor cascades”—the first hints of stories before they could be verified as true or false. The researchers found that the fake stories spread about six times faster than the real ones. “Falsehood diffused significantly farther, faster, deeper, and more broadly than the truth in all categories of information,” they wrote.


pages: 416 words: 112,268

Human Compatible: Artificial Intelligence and the Problem of Control by Stuart Russell

3D printing, Ada Lovelace, AI winter, Alan Turing: On Computable Numbers, with an Application to the Entscheidungsproblem, Alfred Russel Wallace, algorithmic bias, AlphaGo, Andrew Wiles, artificial general intelligence, Asilomar, Asilomar Conference on Recombinant DNA, augmented reality, autonomous vehicles, basic income, behavioural economics, Bletchley Park, blockchain, Boston Dynamics, brain emulation, Cass Sunstein, Charles Babbage, Claude Shannon: information theory, complexity theory, computer vision, Computing Machinery and Intelligence, connected car, CRISPR, crowdsourcing, Daniel Kahneman / Amos Tversky, data science, deep learning, deepfake, DeepMind, delayed gratification, Demis Hassabis, Elon Musk, en.wikipedia.org, Erik Brynjolfsson, Ernest Rutherford, fake news, Flash crash, full employment, future of work, Garrett Hardin, Geoffrey Hinton, Gerolamo Cardano, Goodhart's law, Hans Moravec, ImageNet competition, Intergovernmental Panel on Climate Change (IPCC), Internet of things, invention of the wheel, job automation, John Maynard Keynes: Economic Possibilities for our Grandchildren, John Maynard Keynes: technological unemployment, John Nash: game theory, John von Neumann, Kenneth Arrow, Kevin Kelly, Law of Accelerating Returns, luminiferous ether, machine readable, machine translation, Mark Zuckerberg, multi-armed bandit, Nash equilibrium, Nick Bostrom, Norbert Wiener, NP-complete, OpenAI, openstreetmap, P = NP, paperclip maximiser, Pareto efficiency, Paul Samuelson, Pierre-Simon Laplace, positional goods, probability theory / Blaise Pascal / Pierre de Fermat, profit maximization, RAND corporation, random walk, Ray Kurzweil, Recombinant DNA, recommendation engine, RFID, Richard Thaler, ride hailing / ride sharing, Robert Shiller, robotic process automation, Rodney Brooks, Second Machine Age, self-driving car, Shoshana Zuboff, Silicon Valley, smart cities, smart contracts, social intelligence, speech recognition, Stephen Hawking, Steven Pinker, 
superintelligent machines, surveillance capitalism, Thales of Miletus, The Future of Employment, The Theory of the Leisure Class by Thorstein Veblen, Thomas Bayes, Thorstein Veblen, Tragedy of the Commons, transport as a service, trolley problem, Turing machine, Turing test, universal basic income, uranium enrichment, vertical integration, Von Neumann architecture, Wall-E, warehouse robotics, Watson beat the top human players on Jeopardy!, web application, zero-sum game

Faced with the socioeconomic equivalent of becoming pet food, humans will be rather unhappy with their governments. Faced with potentially unhappy humans, governments around the world are beginning to devote some attention to the issue. Most have already discovered that the idea of retraining everyone as a data scientist or robot engineer is a nonstarter—the world might need five or ten million of these, but nowhere close to the billion or so jobs that are at risk. Data science is a very tiny lifeboat for a giant cruise ship.27 Some are working on “transition plans”—but transition to what? We need a plausible destination in order to plan a transition—that is, we need a plausible picture of a desirable future economy where most of what we currently call work is done by machines.

The progress of automation in legal analytics, describing the results of a contest: Jason Tashea, “AI software is more accurate, faster than attorneys when assessing NDAs,” ABA Journal, February 26, 2018. 26. A commentary by a distinguished economist, with a title explicitly evoking Keynes’s 1930 article: Lawrence Summers, “Economic possibilities for our children,” NBER Reporter (2013). 27. The analogy between data science employment and a small lifeboat for a giant cruise ship comes from a discussion with Yong Ying-I, head of Singapore’s Public Service Division. She conceded that it was correct on the global scale, but noted that “Singapore is small enough to fit in the lifeboat.” 28. Support for UBI from a conservative viewpoint: Sam Bowman, “The ideal welfare system is a basic income,” Adam Smith Institute, November 25, 2013. 29.


pages: 245 words: 71,886

Spike: The Virus vs The People - The Inside Story by Jeremy Farrar, Anjana Ahuja

"World Economic Forum" Davos, bioinformatics, Black Monday: stock market crash in 1987, Boris Johnson, Brexit referendum, contact tracing, coronavirus, COVID-19, crowdsourcing, dark matter, data science, DeepMind, Demis Hassabis, disinformation, Dominic Cummings, Donald Trump, double helix, dual-use technology, Future Shock, game design, global pandemic, Kickstarter, lab leak, lockdown, machine translation, nudge unit, open economy, pattern recognition, precautionary principle, side project, social distancing, the scientific method, Tim Cook: Apple, zoonotic diseases

In meetings at Number 10 and elsewhere, he was always curious, asked the right probing questions of the science and had the capacity to spot the difference between good-quality evidence and bogus science. He listened intently and seemed to be one of the few people who could make things happen at speed across government. Ben Warner is a data scientist who trained at University College London and worked with Cummings at Vote Leave. During February and March 2020, Warner frequently attended SAGE meetings. Cummings also called upon the expertise of Marc Warner, Ben’s brother and also a data scientist, who founded a company called Faculty. Marc had also been involved with Vote Leave and had recently been drafted in to work with NHS Digital. Cummings offers his own account of what happened in this important period, which echoes but also adds to the seven hours of evidence he gave to UK MPs on 26 May 2021.

By that time, Cummings said, there was unanimity among Patrick, Chris, Ben Warner and John Edmunds that intervention was needed. The Prime Minister did not want to act. Cummings claims that, in order to try to change Johnson’s mind, he organised a meeting on Tuesday 22 September 2020, at which Johnson was presented with the case numbers and infection rates by Catherine Cutts, a data scientist newly recruited to Number 10. She showed Johnson the current data and then fast-forwarded a month, to role-play the scenario of infections and deaths projected for October. Cummings says: ‘We presented it all as if we were about six weeks in the future. This was my best attempt to get people to actually see sense and realise that it would be better for the economy as well as for health to get on top of it fast.

He has been praised for his loyal support for other SAGE members and enthusiasm for initiatives like COG-UK. Jonathan Van-Tam An expert on respiratory viruses, Van-Tam is (with Jenny Harries) deputy chief medical officer for England. He commented on Cummings’s Durham excursion that the rules applied to everyone: public adherence might depend on this principle. Ben Warner A data scientist with a doctorate from University College London, Warner worked with Cummings at Vote Leave. He attended SAGE meetings as a Number 10 observer and raised early concerns about the UK’s coronavirus response. Chris Whitty The UK government’s chief medical adviser, Whitty trained in infectious diseases and did a period of study in Vietnam.


pages: 372 words: 100,947

An Ugly Truth: Inside Facebook's Battle for Domination by Sheera Frenkel, Cecilia Kang

"World Economic Forum" Davos, 2021 United States Capitol attack, affirmative action, augmented reality, autonomous vehicles, Ben Horowitz, Bernie Sanders, Big Tech, Black Lives Matter, blockchain, Cambridge Analytica, clean water, coronavirus, COVID-19, data science, disinformation, don't be evil, Donald Trump, Edward Snowden, end-to-end encryption, fake news, George Floyd, global pandemic, green new deal, hockey-stick growth, Ian Bogost, illegal immigration, immigration reform, independent contractor, information security, Jeff Bezos, Kevin Roose, Marc Andreessen, Marc Benioff, Mark Zuckerberg, Menlo Park, natural language processing, offshore financial centre, Parler "social media", Peter Thiel, QAnon, RAND corporation, ride hailing / ride sharing, Robert Mercer, Russian election interference, Salesforce, Sam Altman, Saturday Night Live, Sheryl Sandberg, Shoshana Zuboff, Silicon Valley, Snapchat, social web, Steve Bannon, Steve Jobs, Steven Levy, subscription business, surveillance capitalism, TechCrunch disrupt, TikTok, Travis Kalanick, WikiLeaks

For the past year, the company’s data scientists had been quietly running experiments that tested how Facebook users responded when shown content that fell into one of two categories: good for the world or bad for the world. The experiments, which were posted on Facebook under the subject line “P (Bad for the world),” had reduced the visibility of posts that people considered “bad for the world.” But while they had successfully demoted them in the News Feed, therefore prompting users to see more posts that were “good for the world” when they logged into Facebook, the data scientists found that users opened Facebook far less after the changes were made.

The experiment laid bare both Facebook’s power to reach deep into the psyche of its users and its willingness to test the boundaries of that power without users’ knowledge. “Emotional states can be transferred to others via emotional contagion, leading people to experience the same emotions without their awareness,” Facebook data scientists wrote in a research paper published in the Proceedings of the National Academy of Sciences. They described how, over the course of one week in 2012, they had tampered with what almost 700,000 Facebook users saw when they logged on to the platform. In the experiment, some Facebook users were shown content that was overwhelmingly “happy,” while others were shown content that was overwhelmingly “sad.”

“We were announcing all these changes to long-standing policy but treating them as ad hoc, isolated decisions.” Other data were equally alarming. Internal reports also showed a steady rise in extremist groups and conspiracy movements. Facebook’s security team reported incidents of real-world violence, as well as frightening comments made in private groups. Facebook’s data scientists and security officials noted a 300 percent increase, from June through August 2020, in content related to the conspiracy theory QAnon. QAnon believers perpetuated a false theory that liberal elites and celebrities like Bill Gates, Tom Hanks, and George Soros ran a global child trafficking ring.


pages: 575 words: 140,384

It's Not TV: The Spectacular Rise, Revolution, and Future of HBO by Felix Gillette, John Koblin

activist fund / activist shareholder / activist investor, Airbnb, Amazon Web Services, AOL-Time Warner, Apollo 13, Big Tech, bike sharing, Black Lives Matter, Burning Man, business cycle, call centre, cloud computing, coronavirus, corporate governance, COVID-19, data science, disruptive innovation, Dissolution of the Soviet Union, Donald Trump, Elon Musk, Erlich Bachman, Exxon Valdez, fake news, George Floyd, Jeff Bezos, Keith Raniere, lockdown, Menlo Park, multilevel marketing, Nelson Mandela, Netflix Prize, out of africa, payday loans, peak TV, period drama, recommendation engine, Richard Hendricks, ride hailing / ride sharing, risk tolerance, Robert Durst, Ronald Reagan, Saturday Night Live, self-driving car, shareholder value, Sheryl Sandberg, side hustle, Silicon Valley, Silicon Valley startup, Stephen Hawking, Steve Jobs, subscription business, tech billionaire, TechCrunch disrupt, TikTok, Tim Cook: Apple, traveling salesman, unpaid internship, upwardly mobile, urban decay, WeWork

If HBO failed to keep pace with streaming technology it would soon be rendered obsolete. While Otto Berkes lobbied internally for more resources, his team opened up a new HBO office in Seattle, not far from his old stomping grounds at Microsoft. There, he began assembling a new team of engineers, product designers, and data scientists. Everybody who came on board knew the ultimate mission. Beat Netflix. “At the end of the day, I think my story, and the team that I built there, is really about modern tech meets old media,” Berkes says. “We think differently. We’re data driven. It’s not about relationships—couldn’t give a shit.

“We weren’t trying to copy HBO,” says Jonathan Friedland, the former Netflix communications executive. “We were trying to do it better than them. We were trying to do it with a different approach, a data-driven approach, as opposed to pure touch. That’s the fundamental thing to understand about Netflix: It’s a data science and technology company. Every step we took was guided by data. We aspired to the quality level of execution but based on a totally different set of tools.” The data analysis would prove to be spot on. House of Cards would go on to air for six seasons, becoming a defining series for the streaming service, proof that Netflix could master the same “It’s Not TV” game played by HBO.

Now when she talks to screenwriters trying to figure out the ideal home for their projects, the answer, almost surprisingly, remains the same. “HBO still is the one they want to go to,” Strauss says. “There is still a way that they look at things and a process that they go through, which is a cut above.” As the cable era drew to a close, Bloys recognized that the streamniks were right about one thing: data science was immensely useful for certain tasks like marketing, customer retention, and optimization of budgets on a broad scale. But great television, he also knew, would never come from listening to the customers or mining their preferences on the internet. “People don’t know they need the Roys until they meet the Roys,” he says of Succession.


The Smartphone Society by Nicole Aschoff

"Susan Fowler" uber, 4chan, A Declaration of the Independence of Cyberspace, Airbnb, algorithmic bias, algorithmic management, Amazon Web Services, artificial general intelligence, autonomous vehicles, barriers to entry, Bay Area Rapid Transit, Bernie Sanders, Big Tech, Black Lives Matter, blockchain, carbon footprint, Carl Icahn, Cass Sunstein, citizen journalism, cloud computing, correlation does not imply causation, crony capitalism, crowdsourcing, cryptocurrency, data science, deep learning, DeepMind, degrowth, Demis Hassabis, deplatforming, deskilling, digital capitalism, digital divide, do what you love, don't be evil, Donald Trump, Downton Abbey, Edward Snowden, Elon Musk, Evgeny Morozov, fake news, feminist movement, Ferguson, Missouri, Filter Bubble, financial independence, future of work, gamification, gig economy, global value chain, Google Chrome, Google Earth, Googley, green new deal, housing crisis, income inequality, independent contractor, Jaron Lanier, Jeff Bezos, Jessica Bruder, job automation, John Perry Barlow, knowledge economy, late capitalism, low interest rates, Lyft, M-Pesa, Mark Zuckerberg, minimum wage unemployment, mobile money, moral panic, move fast and break things, Naomi Klein, Network effects, new economy, Nicholas Carr, Nomadland, occupational segregation, Occupy movement, off-the-grid, offshore financial centre, opioid epidemic / opioid crisis, PageRank, Patri Friedman, peer-to-peer, Peter Thiel, pets.com, planned obsolescence, quantitative easing, Ralph Waldo Emerson, RAND corporation, Ray Kurzweil, RFID, Richard Stallman, ride hailing / ride sharing, Rodney Brooks, Ronald Reagan, Salesforce, Second Machine Age, self-driving car, shareholder value, sharing economy, Sheryl Sandberg, Shoshana Zuboff, Sidewalk Labs, Silicon Valley, single-payer health, Skype, Snapchat, SoftBank, statistical model, Steve Bannon, Steve Jobs, surveillance capitalism, TaskRabbit, tech worker, technological determinism, TED Talk, the scientific method, 
The Structural Transformation of the Public Sphere, TikTok, transcontinental railway, transportation-network company, Travis Kalanick, Uber and Lyft, Uber for X, uber lyft, upwardly mobile, Vision Fund, W. E. B. Du Bois, wages for housework, warehouse robotics, WikiLeaks, women in the workforce, yottabyte

Stories of self-teaching algorithms, autonomous cars, expert reports predicting the disappearance of at least half the world’s jobs in the next couple of decades due to automation and robots abound. It’s a wonder we don’t stay in bed, scrolling through our feeds, awaiting the Singularity in dignified repose. Indeed, a chorus of warnings from tech naysayers and handwringers suggests we should be terrified of what a Silicon Valley future holds. Data scientists paint a picture of a future in which algorithms determine everything, and in some accounts this future is now. Companies and state and federal agencies have used algorithms to determine whether someone will get parole, get a job, get a loan, get a raise, get public assistance, be accepted to a school, get fired, or get a promotion.

Atkinson, president of the Information Technology and Innovation Foundation, says, “No matter how many times a purported expert claims we are facing an epochal technology revolution that will destroy tens of millions of jobs and leave large swathes of human workers permanently unemployed, it still isn’t true.”39 Data scientists are also speaking up to assert that algorithms don’t take people out of the equation and they aren’t unbiased or neutral. Algorithms are designed by people (who have their own biases) and are trained on datasets that themselves often reflect bias and discrimination in their collection and design.

The GDPR’s creation originated in complaints by a former law student, Max Schrems, about Silicon Valley firms’ violations of European privacy laws. Other advocates for digital justice point to how tech companies do more than just invade our privacy—they develop and deploy algorithms that can cause substantial harm to individuals and communities. A growing number of data scientists advocate shining a light into the “black-box algorithms” that are being rapidly integrated into decision-making processes in myriad spheres of life. They call for “algorithmic accountability,” which the nonprofit research institute Data & Society defines as “the assignment of responsibility for how an algorithm is created and its impact on society; if harm occurs, accountable systems include a mechanism for redress.”20 Nicholas Diakopoulos, director of the Computational Journalism Lab at Northwestern University, and Sorelle Friedler, a computer science professor at Haverford College, suggest five dimensions of algorithmic accountability: responsibility, explainability, accuracy, auditability, and fairness.


We Are the Nerds: The Birth and Tumultuous Life of Reddit, the Internet's Culture Laboratory by Christine Lagorio-Chafkin

"Friedman doctrine" OR "shareholder theory", 4chan, Aaron Swartz, Airbnb, Amazon Web Services, Bernie Sanders, big-box store, bitcoin, blockchain, Brewster Kahle, Burning Man, compensation consultant, crowdsourcing, cryptocurrency, data science, David Heinemeier Hansson, digital rights, disinformation, Donald Trump, East Village, eternal september, fake news, game design, Golden Gate Park, growth hacking, Hacker News, hiring and firing, independent contractor, Internet Archive, Jacob Appelbaum, Jeff Bezos, jimmy wales, Joi Ito, Justin.tv, Kickstarter, Large Hadron Collider, Lean Startup, lolcat, Lyft, Marc Andreessen, Mark Zuckerberg, medical residency, minimum viable product, natural language processing, Palm Treo, Paul Buchheit, Paul Graham, paypal mafia, Peter Thiel, plutocrats, QR code, r/findbostonbombers, recommendation engine, RFID, rolodex, Ruby on Rails, Sam Altman, Sand Hill Road, Saturday Night Live, self-driving car, semantic web, Sheryl Sandberg, side project, Silicon Valley, Silicon Valley ideology, Silicon Valley startup, slashdot, Snapchat, Social Justice Warrior, social web, South of Market, San Francisco, Startup school, Stephen Hawking, Steve Bannon, Steve Jobs, Steve Wozniak, Streisand effect, technoutopianism, uber lyft, Wayback Machine, web application, WeWork, WikiLeaks, Y Combinator

Slowe was given a Reddit hoodie, black with subtle orange-red accents, and began working on managing the effort to update the site’s very infrastructure, update the homepage algorithm, modernize anticheating measures, and keep users’ data safe. It wasn’t long before a handful of other Hipmunk engineers also joined Reddit. Some rejoined, including early Reddit employees David King and Jason Harvey. Later, once Reddit started furiously hiring programmers, more than a half dozen former Hipmunk data scientists and engineers would join their ranks. (Hires of his trusted team weren’t limited to engineering; Huffman later hired Hipmunk’s marketing executive, Roxy Young, as VP of marketing.) Ricky Ramirez and Neil Williams had never left Reddit. Combining the forces from essentially all eras of Huffman’s career made it feel like he’d created a reunion show of star employees.

A Reddit representative later said no advertisements connected to the Russian Internet Research Agency were detected. In the midst of all this, Reddit’s legal team proactively reached out to Warner’s office to introduce themselves. As part of the internal investigation, Reddit dug into an isolated conspiracy theory, Pizzagate. Its data scientists found no evidence at the time of suspicious Russian domains orchestrating the dissemination or analysis of Pizzagate on the subreddit. That was all regular Redditors. The spread of political disinformation by regular users wasn’t what Congress was investigating—it was primarily looking into easier-to-track and -grasp advertising—but a representative from Warner’s office said at the time, “No one denies that Reddit has been a hub of anti-Semitic and white nationalist expression.

As new engineers were hired, more were handed over to Slowe to build robust antispam systems. As Slowe’s team grew, he proved an adept manager and was handed an even larger team—eighteen engineers—leading the group in charge of maintaining and developing the full Reddit site’s architecture. Before long, data science was spun out as its own team and also placed under Slowe’s purview. By September 2016, he had four teams of engineers reporting to him. They’d put in motion a longer-term customization of Reddit’s homepage, busted spam by 90 percent, and hired like mad—giving Reddit the future capability not just to maintain the status quo, but rather to build well-functioning systems on top of the site’s old code, rendering the last version Slowe had touched so many years earlier, at long last, mostly useless.


System Error by Rob Reich

"Friedman doctrine" OR "shareholder theory", "World Economic Forum" Davos, 2021 United States Capitol attack, A Declaration of the Independence of Cyberspace, Aaron Swartz, AI winter, Airbnb, airport security, Alan Greenspan, Albert Einstein, algorithmic bias, AlphaGo, AltaVista, artificial general intelligence, Automated Insights, autonomous vehicles, basic income, Ben Horowitz, Berlin Wall, Bernie Madoff, Big Tech, bitcoin, Blitzscaling, Cambridge Analytica, Cass Sunstein, clean water, cloud computing, computer vision, contact tracing, contact tracing app, coronavirus, corporate governance, COVID-19, creative destruction, CRISPR, crowdsourcing, data is the new oil, data science, decentralized internet, deep learning, deepfake, DeepMind, deplatforming, digital rights, disinformation, disruptive innovation, Donald Knuth, Donald Trump, driverless car, dual-use technology, Edward Snowden, Elon Musk, en.wikipedia.org, end-to-end encryption, Fairchild Semiconductor, fake news, Fall of the Berlin Wall, Filter Bubble, financial engineering, financial innovation, fulfillment center, future of work, gentrification, Geoffrey Hinton, George Floyd, gig economy, Goodhart's law, GPT-3, Hacker News, hockey-stick growth, income inequality, independent contractor, informal economy, information security, Jaron Lanier, Jeff Bezos, Jim Simons, jimmy wales, job automation, John Maynard Keynes: Economic Possibilities for our Grandchildren, John Maynard Keynes: technological unemployment, John Perry Barlow, Lean Startup, linear programming, Lyft, Marc Andreessen, Mark Zuckerberg, meta-analysis, minimum wage unemployment, Monkeys Reject Unequal Pay, move fast and break things, Myron Scholes, Network effects, Nick Bostrom, Northpointe / Correctional Offender Management Profiling for Alternative Sanctions, NP-complete, Oculus Rift, OpenAI, Panopticon Jeremy Bentham, Parler "social media", pattern recognition, personalized medicine, Peter Thiel, Philippa Foot, premature optimization, profit 
motive, quantitative hedge fund, race to the bottom, randomized controlled trial, recommendation engine, Renaissance Technologies, Richard Thaler, ride hailing / ride sharing, Ronald Reagan, Sam Altman, Sand Hill Road, scientific management, self-driving car, shareholder value, Sheryl Sandberg, Shoshana Zuboff, side project, Silicon Valley, Snapchat, social distancing, Social Responsibility of Business Is to Increase Its Profits, software is eating the world, spectrum auction, speech recognition, stem cell, Steve Jobs, Steven Levy, strong AI, superintelligent machines, surveillance capitalism, Susan Wojcicki, tech billionaire, tech worker, techlash, technoutopianism, Telecommunications Act of 1996, telemarketer, The Future of Employment, TikTok, Tim Cook: Apple, traveling salesman, Triangle Shirtwaist Factory, trolley problem, Turing test, two-sided market, Uber and Lyft, uber lyft, ultimatum game, union organizing, universal basic income, washing machines reduced drudgery, Watson beat the top human players on Jeopardy!, When a measure becomes a target, winner-take-all economy, Y Combinator, you are the product

Indeed, most of us will not face algorithmic decision-making in the criminal justice system, though we will grapple with algorithms in many other aspects of our daily lives. But those who will face it are among society’s most vulnerable—and often the victims of historical injustice and systemic inequalities. Cathy O’Neil, a former mathematics professor turned data scientist, emphasizes this in her seminal book Weapons of Math Destruction, writing that algorithmic decision-making models “tended to punish the poor and the oppressed in our society, while making the rich richer.” She recounts this phenomenon in the criminal justice system as well as many other domains such as credit scoring, college admissions, and employment decisions.

There was even a time when computer science departments struggled to attract students. But over the past thirty years, the programmer Davids have defeated the industrial Goliaths and become the new masters of the universe. Enrollments in computer science classes are booming almost everywhere. The reason is obvious: programming and data science are hugely valuable, and students want a chance to contribute to the digital revolution that is profoundly reshaping our world, changing individual human experience, social connection, community, and politics at a national and global level. Of course, the salary premium and chance of amassing start-up riches don’t hurt, either.

Big respect for the team for their great work,” Twitter, February 15, 2019, https://twitter.com/yoavgo/status/1096471273050382337. “humans find GPT-2 outputs convincing”: Irene Solaiman, Jack Clark, and Miles Brundage, “GPT-2: 1.5B Release,” OpenAI, November 5, 2019, https://openai.com/blog/gpt-2-1-5b-release/. “extremist groups can use”: Ibid. the training data: Stephen Ornes, “Explainer: Understanding the Size of Data,” Science News for Students, December 13, 2013, https://www.sciencenewsforstudents.org/article/explainer-understanding-size-data. “Kanye West Exclusive”: Arram Sabeti, “GPT-3,” Arram Sabeti (blog), July 9, 2020, https://arr.am/2020/07/09/gpt-3-an-ai-thats-eerily-good-at-writing-almost-anything/. “Why deep learning will never”: Gwern Branwen, “GPT-3 Creative Fiction,” gwern.net, June 19, 2020, https://www.gwern.net/GPT-3#why-deep-learning-will-never-truly-x; Kelsey Piper, “GPT-3, Explained: This New Language AI Is Uncanny, Funny—and a Big Deal,” Vox, August 13, 2020, https://www.vox.com/future-perfect/21355768/gpt-3-ai-openai-turing-test-language.


pages: 222 words: 53,317

Overcomplicated: Technology at the Limits of Comprehension by Samuel Arbesman

algorithmic trading, Anthropocene, Anton Chekhov, Apple II, Benoit Mandelbrot, Boeing 747, Chekhov's gun, citation needed, combinatorial explosion, Computing Machinery and Intelligence, Danny Hillis, data science, David Brooks, digital map, discovery of the americas, driverless car, en.wikipedia.org, Erik Brynjolfsson, Flash crash, friendly AI, game design, Google X / Alphabet X, Googley, Hans Moravec, HyperCard, Ian Bogost, Inbox Zero, Isaac Newton, iterative process, Kevin Kelly, machine translation, Machine translation of "The spirit is willing, but the flesh is weak." to Russian and back, mandelbrot fractal, Minecraft, Neal Stephenson, Netflix Prize, Nicholas Carr, Nick Bostrom, Parkinson's law, power law, Ray Kurzweil, recommendation engine, Richard Feynman, Richard Feynman: Challenger O-ring, Second Machine Age, self-driving car, SimCity, software studies, statistical model, Steve Jobs, Steve Wozniak, Steven Pinker, Stewart Brand, superintelligent machines, synthetic biology, systems thinking, the long tail, Therac-25, Tyler Cowen, Tyler Cowen: Great Stagnation, urban planning, Watson beat the top human players on Jeopardy!, Whole Earth Catalog, Y2K

Creating generalists who are able to serve this function well in our society first involves the construction of what have become known as T-shaped individuals, a term that appears to have first originated in computing education. T-shaped individuals have deep expertise in one area—the stem of the T shape—but breadth of knowledge as well: the bar of the T. What do these types of people look like? One example is the data scientist, who uses the tools of computer science and statistics to find meaning in large datasets, no matter what the discipline. Data scientists have to know a lot about many different areas in order to do their job successfully. We see something similar in applied mathematicians, who use quantitative tools to cut across disciplines and find commonalities, acting as generalists.

., 33–34 construction, cost of, 48–50 Cope, David, 168–69, 229–30 corpus, in linguistics, 55–56 counting: cognitive limits on, 75 human vs. computer, 69–70, 97, 209 Cowen, Tyler, 84 Cryptonomicon (Stephenson), 128–29 “Crystalline Structure of Legal Thought, The” (Balkin), 60–61 Curiosity (Ball), 87–88 Dabbler badge, 144–45 dark code, 21–22 Darwin, Charles, 115, 221, 227 Daston, Lorraine, 140–41 data scientists, 143 datasets, massive, 81–82, 104–5, 143 debugging, 103–4 Deep Blue, 84 diffusion-limited aggregation (DLA), 134–35 digital mapping systems, 5, 49, 51 Dijkstra, Edsger, 3, 50–51, 155 “Divers Instances of Peculiarities of Nature, Both in Men and Brutes” (Fairfax), 111–12 diversity, 113–14, 115 see also complexity, complex systems DNA, see genomes Doyle, John, 222 Dreyfus, Hubert, 173 dwarfism, 120 Dyson, Freeman, on unity vs. diversity, 114 Dyson, George, 110 Economist, 41 edge cases, 53–62, 65, 116, 128, 141, 201, 205, 207 unexpected behavior and, 99–100 see also outliers Einstein, Albert, 114 Eisen, Michael, 61 email, evolution of, 32–33 emergence, in complex systems, 27 encryption software, bugs in, 97–98 Enlightenment, 23 Entanglement, Age of, 23–29, 71, 92, 96, 97, 165, 173, 175, 176 symptoms of, 100–102 Environmental Protection Agency, 41 evolution: aesthetics and, 119 of biological systems, 117–20, 122 of genomes, 118, 156 of technological complexity, 127, 137–38 evolutionary computation, 82–84, 213 exceptions, see edge cases; outliers Facebook, 98, 189 failure, cost of, 48–50 Fairfax, Nathanael, 111–12, 113, 140 fear, as response to technological complexity, 5, 7, 154–55, 156, 165 Federal Aviation Administration (FAA), Y2K bug and, 37 feedback, 14–15, 79, 135 Felsenstein, Lee, 21 Fermi, Enrico, 109 Feynman, Richard, 9, 11 field biologists, 122 for complex technologies, 123, 126, 127, 132 financial sector: interaction in, 126 interconnectivity of, 62, 64 see also stock market systems Firthian linguistics, 206 Flash Crash (2010), 25 Fleming, Alexander, 
124 Flood, Mark, 61, 85 Foote, Brian, 201 Fortran, 39 fractals, 60, 61, 136 Frederick the Great, king of Prussia, 89 fruit flies, 109–10 “Funes the Memorious” (Borges), 76–77, 131 Galaga, bug in, 95–96, 97, 216–17 Gall, John, 157–58, 167, 227 game theory, 210 garden path sentences, 74–75 generalists, 93 combination of physics and biological thinking in, 142–43, 146 education of, 144, 145 explosion of knowledge and, 142–49 specialists and, 146 as T-shaped individuals, 143–44, 146 see also Renaissance man generalization, in biological thinking, 131–32 genomes, 109, 128 accretion in, 156 evolution of, 118, 156 legacy code (junk) in, 118, 119–20, 222 mutations in, 120 RNAi and, 123–24 Gibson, William, 176 Gingold, Chaim, 162–63 Girl Scouts, 144–45 glitches, see unexpected behavior Gmail, crash of, 103 Gödel, Kurt, 175 “good enough,” 27, 42, 118, 119 Goodenough, Oliver, 61, 85 Google, 32, 59, 98, 104–5 data centers of, 81–82, 103, 189 Google Docs, 32 Google Maps, 205 Google Translate, 57 GOTO command, 44–45, 81 grammar, 54, 57–58 gravitation, Newton’s law of, 113 greeblies, 130–31 Greek philosophy, 138–40, 151 Gresham College, 89 Guide of the Perplexed, The (Maimonides), 151 Haldane, J.


pages: 339 words: 88,732

The Second Machine Age: Work, Progress, and Prosperity in a Time of Brilliant Technologies by Erik Brynjolfsson, Andrew McAfee

2013 Report for America's Infrastructure - American Society of Civil Engineers - 19 March 2013, 3D printing, access to a mobile phone, additive manufacturing, Airbnb, Alan Greenspan, Albert Einstein, Amazon Mechanical Turk, Amazon Web Services, American Society of Civil Engineers: Report Card, Any sufficiently advanced technology is indistinguishable from magic, autonomous vehicles, barriers to entry, basic income, Baxter: Rethink Robotics, Boston Dynamics, British Empire, business cycle, business intelligence, business process, call centre, carbon tax, Charles Lindbergh, Chuck Templeton: OpenTable:, clean water, combinatorial explosion, computer age, computer vision, congestion charging, congestion pricing, corporate governance, cotton gin, creative destruction, crowdsourcing, data science, David Ricardo: comparative advantage, digital map, driverless car, employer provided health coverage, en.wikipedia.org, Erik Brynjolfsson, factory automation, Fairchild Semiconductor, falling living standards, Filter Bubble, first square of the chessboard / second half of the chessboard, Frank Levy and Richard Murnane: The New Division of Labor, Freestyle chess, full employment, G4S, game design, general purpose technology, global village, GPS: selective availability, Hans Moravec, happiness index / gross national happiness, illegal immigration, immigration reform, income inequality, income per capita, indoor plumbing, industrial robot, informal economy, intangible asset, inventory management, James Watt: steam engine, Jeff Bezos, Jevons paradox, jimmy wales, job automation, John Markoff, John Maynard Keynes: Economic Possibilities for our Grandchildren, John Maynard Keynes: technological unemployment, Joseph Schumpeter, Kevin Kelly, Khan Academy, Kiva Systems, knowledge worker, Kodak vs Instagram, law of one price, low skilled workers, Lyft, Mahatma Gandhi, manufacturing employment, Marc Andreessen, Mark Zuckerberg, Mars Rover, mass immigration, means of production, Narrative 
Science, Nate Silver, natural language processing, Network effects, new economy, New Urbanism, Nicholas Carr, Occupy movement, oil shale / tar sands, oil shock, One Laptop per Child (OLPC), pattern recognition, Paul Samuelson, payday loans, post-work, power law, price stability, Productivity paradox, profit maximization, Ralph Nader, Ray Kurzweil, recommendation engine, Report Card for America’s Infrastructure, Robert Gordon, Robert Solow, Rodney Brooks, Ronald Reagan, search costs, Second Machine Age, self-driving car, sharing economy, Silicon Valley, Simon Kuznets, six sigma, Skype, software patent, sovereign wealth fund, speech recognition, statistical model, Steve Jobs, Steven Pinker, Stuxnet, supply-chain management, TaskRabbit, technological singularity, telepresence, The Bell Curve by Richard Herrnstein and Charles Murray, the Cathedral and the Bazaar, the long tail, The Signal and the Noise by Nate Silver, The Wealth of Nations by Adam Smith, total factor productivity, transaction costs, Tyler Cowen, Tyler Cowen: Great Stagnation, Vernor Vinge, warehouse robotics, Watson beat the top human players on Jeopardy!, winner-take-all economy, Y2K

Another interesting fact is that the majority of Kaggle contests are won by people who are marginal to the domain of the challenge—who, for example, made the best prediction about hospital readmission rates despite having no experience in health care—and so would not have been consulted as part of any traditional search for solutions. In many cases, these demonstrably capable and successful data scientists acquired their expertise in new and decidedly digital ways. Between February and September of 2012 Kaggle hosted two competitions about computer grading of student essays, which were sponsored by the Hewlett Foundation.* Kaggle and Hewlett worked with multiple education experts to set up the competitions, and as they were preparing to launch many of these people were worried.

The first contest was to consist of two rounds. Eleven established educational testing companies would compete against one another in the first round, with members of Kaggle’s community of data scientists invited to join in, individually or in teams, in the second. The experts were worried that the Kaggle crowd would simply not be competitive in the second round. After all, each of the testing companies had been working on automatic grading for some time and had devoted substantial resources to the problem. Their hundreds of person-years of accumulated experience and expertise seemed like an insurmountable advantage over a bunch of novices.

Economics reporter Catherine Rampell points out that college graduates are the only group that has seen employment growth since the start of the Great Recession in 2007, and in October of 2011 the unemployment rate for bachelor’s degree holders, at 5.8 percent, was only about half that of those with associate’s degrees (10.6 percent) and a third that of those who stopped after high school (16.2 percent).18 The college premium exists in part because so many types of raw data are getting dramatically cheaper, and as data get cheaper, the bottleneck increasingly is the ability to interpret and use data. This reflects the career advice that Google chief economist Hal Varian frequently gives: seek to be an indispensable complement to something that’s getting cheap and plentiful. Examples include data scientists, writers of mobile phone apps, and genetic counselors, who have come into demand as more people have their genes sequenced. Bill Gates has said that he chose to go into software when he saw how cheap and ubiquitous computers, especially microcomputers, were becoming. Jeff Bezos systematically analyzed the bottlenecks and opportunities created by low-cost online commerce, particularly the ability to index large numbers of products, before he set up Amazon.


Demystifying Smart Cities by Anders Lisdorf

3D printing, artificial general intelligence, autonomous vehicles, backpropagation, behavioural economics, Big Tech, bike sharing, bitcoin, business intelligence, business logic, business process, chief data officer, circular economy, clean tech, clean water, cloud computing, computer vision, Computing Machinery and Intelligence, congestion pricing, continuous integration, crowdsourcing, data is the new oil, data science, deep learning, digital rights, digital twin, distributed ledger, don't be evil, Elon Musk, en.wikipedia.org, facts on the ground, Google Glasses, hydroponic farming, income inequality, information security, Infrastructure as a Service, Internet of things, Large Hadron Collider, Masdar, microservices, Minecraft, OSI model, platform as a service, pneumatic tube, ransomware, RFID, ride hailing / ride sharing, risk tolerance, Salesforce, self-driving car, smart cities, smart meter, software as a service, speech recognition, Stephen Hawking, Steve Jobs, Steve Wozniak, Stuxnet, Thomas Bayes, Turing test, urban sprawl, zero-sum game

Here a data asset should exist only in one version sanctioned by the data owner. The discovery zone is specific to a unit or organization, and users can bring in their own data or create their own versions of data sets. This is not meant for general consumption but is more like a sandbox for data scientists where they can prepare new data sets and make ad hoc experiments. It is the only zone where users can create and upload data. The operational zone is like a traditional operational data store and is in essence a read replica of an operational database. It is used so that queries do not unnecessarily load the operational, transactional database.

At a workshop hosted by 100 Resilient Cities and the city of New York, the topic was how we could use data to improve the resilience of our cities. The brainstorming session produced multiple suggestions, but the one with most votes was the data catalog. Making it possible to find the data you need is crucial for realizing the value of data. Promote data lineage transparency – Data scientists around the world spend an inordinate amount of time and worry about where their data comes from and try to track down all the steps it goes through before it ends up in their data source. This is with good reason since this is crucial to the quality and nature of the data. Promoting data lineage transparency can be solved with tools, but they typically cover only the particular vendor stack.

A good example of a famous scientist is Stephen Hawking, who investigated black holes that are of little direct practical relevance. The same can be said of Charles Darwin who also worked on problems that at the time had little practical relevance. This type is fairly rare but can be found in an R&D division of a product-oriented company. However, it has recently resurfaced in the form of the data scientist. While they often work on more applied science type of problems, they also occasionally supply real scientific results based on their own research agenda. This can be seen in larger tech companies. For cities it is rare to have scientists employed, but they can frequently be supplied by a close-by university system.


pages: 335 words: 97,468

Uncharted: How to Map the Future by Margaret Heffernan

"World Economic Forum" Davos, 23andMe, Affordable Care Act / Obamacare, Airbnb, Alan Greenspan, Anne Wojcicki, anti-communist, Atul Gawande, autonomous vehicles, banking crisis, Berlin Wall, Boris Johnson, Brexit referendum, chief data officer, Chris Urmson, clean water, complexity theory, conceptual framework, cosmic microwave background, creative destruction, CRISPR, crowdsourcing, data science, David Attenborough, discovery of penicillin, driverless car, epigenetics, Fall of the Berlin Wall, fear of failure, George Santayana, gig economy, Google Glasses, Greta Thunberg, Higgs boson, index card, Internet of things, Jaron Lanier, job automation, Kickstarter, Large Hadron Collider, late capitalism, lateral thinking, Law of Accelerating Returns, liberation theology, mass immigration, mass incarceration, megaproject, Murray Gell-Mann, Nate Silver, obamacare, oil shale / tar sands, passive investing, pattern recognition, Peter Thiel, prediction markets, RAND corporation, Ray Kurzweil, Rosa Parks, Sam Altman, scientific management, Shoshana Zuboff, Silicon Valley, smart meter, Stephen Hawking, Steve Ballmer, Steve Jobs, surveillance capitalism, TED Talk, The Signal and the Noise by Nate Silver, Tim Cook: Apple, twin studies, University of East Anglia

A simplistic commercial view of the future is being forced onto a world as though there are no alternative possibilities, when in fact there are many. The same sleight of hand is intrinsic to almost all discussions of artificial intelligence. Wild promises are made about the capacity of AI to predict disease, crime, recidivism, career trajectories, lifespan. Inevitablism discourages practical questions. Data scientists know that, with a large enough dataset, projecting trends with gross accuracy is easy, but it’s near impossible to reduce from that to pinpoint accuracy for an individual. Philosophical questions abound – and not only about who owns the data and for what purposes. The rhetoric flowing from Silicon Valley casually assumes that a person is simply an aggregation of data.

It is, as one commentator noticed, one more step towards cutting out human agency altogether.44 Pervasive monitoring devices – smartphones, wearables, voice-enabled speakers and smart meters – allow companies to track and manage consumer behaviour. The Harvard business scholar Shoshana Zuboff quotes an unnamed chief data scientist who explains: ‘The goal of everything we do is to change people’s actual behavior at scale . . . we can capture their behaviours and identify good and bad [ones]. Then we develop “treatments” or “data pellets” that select good behaviours.’45 MIT’s Alex Pentland seems more interested in enhancing machines than human understanding.

One scenario planner, Angela Wilkinson, compared computer models in scenario planning to ‘a heavy axe in the hand of a fireman – even if it hinders his escape from a fire, a fireman is reluctant to drop his axe. The axe is a help most of the time, but a dangerous burden in extreme events.’ If you entrust scenario planning to data scientists, there is a real risk that strategy and data bifurcate so severely that the plastic, creative exercise of integrating them fails. And while it is clear that scenario planning demands a diverse range of people who are open-minded with deep intellectual curiosity, expertise and a capacity to think freely, those people can be hard to find.


pages: 244 words: 66,977

Subscribed: Why the Subscription Model Will Be Your Company's Future - and What to Do About It by Tien Tzuo, Gabe Weisert

3D printing, Airbnb, airport security, Amazon Web Services, augmented reality, autonomous vehicles, Big Tech, bike sharing, blockchain, Brexit referendum, Build a better mousetrap, business cycle, business intelligence, business process, call centre, cloud computing, cognitive dissonance, connected car, data science, death of newspapers, digital nomad, digital rights, digital twin, double entry bookkeeping, Elon Musk, factory automation, fake news, fiat currency, Ford Model T, fulfillment center, growth hacking, hockey-stick growth, Internet of things, inventory management, iterative process, Jeff Bezos, John Zimmer (Lyft cofounder), Kevin Kelly, Lean Startup, Lyft, manufacturing employment, Marc Benioff, Mary Meeker, megaproject, minimum viable product, natural language processing, Network effects, Nicholas Carr, nuclear winter, pets.com, planned obsolescence, pneumatic tube, profit maximization, race to the bottom, ride hailing / ride sharing, Salesforce, Sand Hill Road, shareholder value, Silicon Valley, skunkworks, smart meter, social graph, software as a service, spice trade, Steve Ballmer, Steve Jobs, subscription business, systems thinking, tech worker, TED Talk, Tim Cook: Apple, transport as a service, Uber and Lyft, uber lyft, WeWork, Y2K, Zipcar

International Data Corporation (IDC) predicts that by 2020, 50 percent of the world’s largest enterprises will see the majority of their business depend on their ability to create digitally enhanced products, services, and experiences. Focusing on services over products is also a sound business strategy. Zuora’s Subscription Economy Index, which you’ll find at the end of this book, shows that subscription-based companies are growing eight times faster than the S&P 500 and five times faster than US retail sales. Our chief data scientist, Carl Gold, put this report together using anonymized, aggregated, system-generated activity on our platform. I urge you to read it—it’s a fascinating document, based on billions of dollars of revenue and millions of financial transactions, that has all sorts of industry benchmarks and insights.

At the same time, we’re also hearing a lot more about how all these companies have in-house teams of “growth hackers,” which on a surface level sounds a lot like, well, marketing. They’re trying to come up with smarter ways to drive sales. But these folks tend to reject that label. Stitch Fix has more than ninety data scientists on its payroll. These people aren’t thinking of snappier punch lines for billboards; they’re looking for ways to optimize growth within the service itself. It’s almost as if the engineers have taken over the marketing shop: building freemium models, creating upgrade incentives, offering in-app purchases.

Over a period of just under six years (January 1, 2012, to September 30, 2017), the SEI grew at an average annual rate of 17.6%. S&P 500 sales grew at an average annual rate of 2.2%, while US retail sales grew at an average annual rate of 3.6%. This study was conducted by Zuora chief data scientist Carl Gold. Subscription business sales have grown substantially faster than two key public benchmarks: S&P 500 sales and US retail sales. Overall, the SEI reveals that subscription businesses grew revenues about eight times faster than S&P 500 company revenues (17.6 percent versus 2.2 percent) and about five times faster than US retail sales (17.6 percent versus 3.6 percent) from January 1, 2012, to September 30, 2017.
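The "eight times" and "five times" figures in the excerpt above follow directly from dividing the quoted growth rates. A minimal sketch of that arithmetic is below; the constant and function names are illustrative assumptions, and only the three percentages come from the excerpt.

```python
# Average annual growth rates quoted in the excerpt, in percent.
SEI_GROWTH = 17.6
SP500_SALES_GROWTH = 2.2
US_RETAIL_SALES_GROWTH = 3.6


def growth_multiple(rate: float, benchmark: float) -> float:
    """Return how many times larger `rate` is than `benchmark`."""
    return rate / benchmark


vs_sp500 = growth_multiple(SEI_GROWTH, SP500_SALES_GROWTH)
vs_retail = growth_multiple(SEI_GROWTH, US_RETAIL_SALES_GROWTH)

print(f"SEI vs S&P 500 sales: {vs_sp500:.1f}x")     # prints 8.0x
print(f"SEI vs US retail sales: {vs_retail:.1f}x")  # prints 4.9x
```

The second ratio comes out slightly under five, which is consistent with the excerpt's "about five times faster".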


Four Battlegrounds by Paul Scharre

2021 United States Capitol attack, 3D printing, active measures, activist lawyer, AI winter, AlphaGo, amateurs talk tactics, professionals talk logistics, artificial general intelligence, ASML, augmented reality, Automated Insights, autonomous vehicles, barriers to entry, Berlin Wall, Big Tech, bitcoin, Black Lives Matter, Boeing 737 MAX, Boris Johnson, Brexit referendum, business continuity plan, business process, carbon footprint, chief data officer, Citizen Lab, clean water, cloud computing, commoditize, computer vision, coronavirus, COVID-19, crisis actor, crowdsourcing, DALL-E, data is not the new oil, data is the new oil, data science, deep learning, deepfake, DeepMind, Demis Hassabis, Deng Xiaoping, digital map, digital rights, disinformation, Donald Trump, drone strike, dual-use technology, Elon Musk, en.wikipedia.org, endowment effect, fake news, Francis Fukuyama: the end of history, future of journalism, future of work, game design, general purpose technology, Geoffrey Hinton, geopolitical risk, George Floyd, global supply chain, GPT-3, Great Leap Forward, hive mind, hustle culture, ImageNet competition, immigration reform, income per capita, interchangeable parts, Internet Archive, Internet of things, iterative process, Jeff Bezos, job automation, Kevin Kelly, Kevin Roose, large language model, lockdown, Mark Zuckerberg, military-industrial complex, move fast and break things, Nate Silver, natural language processing, new economy, Nick Bostrom, one-China policy, Open Library, OpenAI, PalmPilot, Parler "social media", pattern recognition, phenotype, post-truth, purchasing power parity, QAnon, QR code, race to the bottom, RAND corporation, recommendation engine, reshoring, ride hailing / ride sharing, robotic process automation, Rodney Brooks, Rubik’s Cube, self-driving car, Shoshana Zuboff, side project, Silicon Valley, slashdot, smart cities, smart meter, Snapchat, social software, sorting algorithm, South China Sea, sparse data, speech recognition, 
Steve Bannon, Steven Levy, Stuxnet, supply-chain attack, surveillance capitalism, systems thinking, tech worker, techlash, telemarketer, The Brussels Effect, The Signal and the Noise by Nate Silver, TikTok, trade route, TSMC

., Recommendations for Leveraging Cloud Computing Resources for Federally Funded Artificial Intelligence Research and Development (Select Committee on Artificial Intelligence, National Science & Technology Council, November 17, 2020), https://www.nitrd.gov/pubs/Recommendations-Cloud-AI-RD-Nov2020.pdf; Hatef Monajemi et al., “Ambitious Data Science Can Be Painless,” Harvard Data Science Review, no. 1.1 (July 1, 2019), https://doi.org/10.1162/99608f92.02ffc552. 32National Artificial Intelligence Research Resource: Division E, “National Artificial Intelligence Research Resource,” in William M. (Mac) Thornberry National Defense Authorization Act for Fiscal Year 2021, H.R. 6395, 116th Cong. (2019), https://www.congress.gov/bill/116th-congress/house-bill/6395/text; The White House, “The Biden Administration Launches the National Artificial Intelligence Research Resource Task Force,” news release, June 10, 2021, https://www.whitehouse.gov/ostp/news-updates/2021/06/10/the-biden-administration-launches-the-national-artificial-intelligence-research-resource-task-force/; Interim NAIRR Task Force, Envisioning a National Artificial Intelligence Research Resource (NAIRR): Preliminary Findings and Recommendations, May 2022, https://www.ai.gov/wp-content/uploads/2022/05/NAIRR-TF-Interim-Report-2022.pdf; Lynne Parker, “Bridging the Resource Divide for Artificial Intelligence Research,” OSTP blog, May 22, 2022, https://www.whitehouse.gov/ostp/news-updates/2022/05/25/bridging-the-resource-divide-for-artificial-intelligence-research/. 32China’s Thousand Talents Plan: Threats to the U.S.

Under the old method, the top academically performing cadets got their first pick, and then it would go down the list of cadets, in academic rank, until each job position filled up. Easley explained that the Army has known for a while that it wasn’t optimally aligning people to jobs that might be the best fit for them. “They just didn’t have a better way of implementing it,” he said. AI is changing that. Lieutenant Colonel Isaac Faber, chief data scientist for the Army AI Task Force, outlined how they are in the process of building an AI model that uses five years’ worth of officer performance data—“tens of thousands of data points”—to predict how well West Point cadets are likely to do in a given career field. In 2020, “for the first time,” Faber said, “a machine learning algorithm will be part of the fabric that makes up the branching recommendations for cadets at West Point.”
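The "old method" described in the excerpt above, where cadets pick in order of academic rank until each position fills up, can be sketched as a simple rank-ordered assignment. This is only an illustration under stated assumptions: the function name, cadet labels, branch names, and capacities are all hypothetical, and it does not represent the Army's actual process or the machine learning model mentioned in the excerpt.

```python
from typing import Dict, List, Optional


def assign_by_rank(
    ranked_cadets: List[str],
    preferences: Dict[str, List[str]],
    capacity: Dict[str, int],
) -> Dict[str, Optional[str]]:
    """Give each cadet, in rank order, their most preferred branch with an open slot."""
    remaining = dict(capacity)  # copy so the caller's capacities are not mutated
    assignment: Dict[str, Optional[str]] = {}
    for cadet in ranked_cadets:  # highest academic rank first
        assignment[cadet] = None  # stays None if every preferred branch is full
        for branch in preferences.get(cadet, []):
            if remaining.get(branch, 0) > 0:
                remaining[branch] -= 1
                assignment[cadet] = branch
                break
    return assignment


# Hypothetical data: three cadets, three branches with one slot each.
example = assign_by_rank(
    ranked_cadets=["cadet_a", "cadet_b", "cadet_c"],
    preferences={
        "cadet_a": ["aviation", "infantry"],
        "cadet_b": ["aviation", "cyber"],
        "cadet_c": ["aviation", "infantry"],
    },
    capacity={"aviation": 1, "cyber": 1, "infantry": 1},
)
print(example)
```

In this toy run all three cadets list the same first choice, so only the top-ranked cadet gets it and the others fall through to later preferences. Rank alone decides the outcome, with no measure of fit between a cadet and a branch.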

., https://www.wired.com/insights/2014/07/data-new-oil-digital-economy/; Kiran Bhageshpur, “Data Is the New Oil—and That’s a Good Thing,” Forbes, November 15, 2019, https://www.forbes.com/sites/forbestechcouncil/2019/11/15/data-is-the-new-oil-and-thats-a-good-thing/?sh=10eefed73045; Adeola Adesina, “Data Is the New Oil,” Medium, November 13, 2018, https://medium.com/@adeolaadesina/data-is-the-new-oil-2947ed8804f6; Will Murphy, “Data Is the New Oil,” Towards Data Science, May 7, 2017, https://towardsdatascience.com/data-is-the-new-oil-f11440e80dd0; Giuliano Giacaglia, “Data Is the New Oil,” Hackernoon, February 9, 2019, https://hackernoon.com/data-is-the-new-oil-1227197762b2. 20data-is-not-the-new-oil articles: Antonio Garcia Martinez, “No, Data Is Not the New Oil,” Wired, February 26, 2019, https://www.wired.com/story/no-data-is-not-the-new-oil/; Bernard Marr, “Here’s Why Data Is Not the New Oil,” Forbes, March 5, 2018, https://www.forbes.com/sites/bernardmarr/2018/03/05/heres-why-data-is-not-the-new-oil/?


pages: 343 words: 91,080

Uberland: How Algorithms Are Rewriting the Rules of Work by Alex Rosenblat

"Susan Fowler" uber, Affordable Care Act / Obamacare, Airbnb, algorithmic management, Amazon Mechanical Turk, autonomous vehicles, barriers to entry, basic income, big-box store, bike sharing, Black Lives Matter, business logic, call centre, cashless society, Cass Sunstein, choice architecture, cognitive load, collaborative economy, collective bargaining, creative destruction, crowdsourcing, data science, death from overwork, digital divide, disinformation, disruptive innovation, don't be evil, Donald Trump, driverless car, emotional labour, en.wikipedia.org, fake news, future of work, gender pay gap, gig economy, Google Chrome, Greyball, income inequality, independent contractor, information asymmetry, information security, Jaron Lanier, Jessica Bruder, job automation, job satisfaction, Lyft, marginal employment, Mark Zuckerberg, move fast and break things, Network effects, new economy, obamacare, performance metric, Peter Thiel, price discrimination, proprietary trading, Ralph Waldo Emerson, regulatory arbitrage, ride hailing / ride sharing, Salesforce, self-driving car, sharing economy, side hustle, Silicon Valley, Silicon Valley ideology, Skype, social software, SoftBank, stealth mode startup, Steve Jobs, strikebreaker, TaskRabbit, technological determinism, Tim Cook: Apple, transportation-network company, Travis Kalanick, Uber and Lyft, Uber for X, uber lyft, union organizing, universal basic income, urban planning, Wolfgang Streeck, work culture , workplace surveillance , Yochai Benkler, Zipcar

Its results, published in a 2018 report, state that on the national level in the United States, 93 percent of its drivers drive fewer than twenty hours per week, and 93 percent are employed, seeking employment, full-time students, or retired.9 In February 2018, Uber published a blog post stating that “nearly 60% of U.S. drivers use Uber less than 10 hours a week.”10 Uber confirmed in an email to me that the latter statistic accounted for drivers who drove fewer than ten hours a week in a typical workweek over the previous three months, according to data scientists on Uber’s policy team.11 However, the Uber and Lyft reports on how much drivers work for either company tell only part of the story: a typical driver I met in New York City worked full time for multiple apps (often two to three), such as some combination of Uber, Lyft, Juno, Via, and Gett. Indeed, a 2016 report by the Office of the Mayor in New York City states that, of taxi and for-hire drivers (which includes ridehail drivers), “about three-quarters of all drivers say that driving a taxi or other for-hire vehicle is their full-time job.”12 Lyft’s 2018 report also offers a city-by-city breakdown of driver statistics, and it states that in New York City, 91 percent of drivers work fewer than twenty hours per week13—but that may simply reflect the fact that drivers who work full time are giving some of their hours to local competitors, like Uber, Juno, or Via.

The app is known to display information about local happenings like sporting events to drivers so they can anticipate demand. While nudges are not necessarily manipulative and do inherently provide nudge-recipients with a sense of choice or agency,50 they are nevertheless highly influential in setting expectations. The recommendations that individual drivers receive from Uber may be the product of mathematical data science, but it’s not clear that Uber is an honest broker of that data. Moreover, when drivers use that data to their own advantage, such as by rejecting a nonsurge dispatch when they are located in a surge-pricing zone, in order to wait for a surge-priced dispatch, they risk being fired. In other words, drivers are not merely consumers of free data-driven analysis, like users of GPS navigation services.


How to Work Without Losing Your Mind by Cate Sevilla

Big Tech, BIPOC, Black Lives Matter, coronavirus, COVID-19, data science, Desert Island Discs, Donald Trump, emotional labour, gender pay gap, Girl Boss, global pandemic, Google Hangouts, imposter syndrome, job satisfaction, lockdown, microaggression, period drama, Phoebe Waller-Bridge, remote working, Sheryl Sandberg, side project, Skype, tech bro, TED Talk, women in the workforce, work culture

She acknowledges that she was comfortable in her corporate job, and would never have considered a change of career otherwise. She says that she’s trying to see her redundancy as a chance to try something new, and as a tremendous opportunity: I’m thinking about training as a data scientist. I’ve done a course in my spare time to try to keep my skills going, so I might spend the money consolidating what I’ve taught myself and then try and change career into data science. It wasn’t even around as a job when I graduated. But because I’ve got this opportunity of being redundant, it’s given me a bit of a kick to something else. If you’re going to get a bit of money, why not do it now?


pages: 379 words: 109,223

Frenemies: The Epic Disruption of the Ad Business by Ken Auletta

"World Economic Forum" Davos, Airbnb, Alvin Toffler, AOL-Time Warner, barriers to entry, Bernie Sanders, bike sharing, Boris Johnson, Build a better mousetrap, Burning Man, call centre, Cambridge Analytica, capitalist realism, carbon footprint, cloud computing, commoditize, connected car, content marketing, corporate raider, crossover SUV, data science, digital rights, disintermediation, Donald Trump, driverless car, Elon Musk, fake news, financial engineering, forensic accounting, Future Shock, Google Glasses, Internet of things, Jeff Bezos, Kevin Roose, Khan Academy, Lyft, Mark Zuckerberg, market design, Mary Meeker, Max Levchin, Menlo Park, move fast and break things, Naomi Klein, NetJets, Network effects, pattern recognition, pets.com, race to the bottom, Richard Feynman, ride hailing / ride sharing, Salesforce, Saturday Night Live, self-driving car, sharing economy, Sheryl Sandberg, Shoshana Zuboff, Silicon Valley, Snapchat, Steve Ballmer, Steve Jobs, surveillance capitalism, Susan Wojcicki, The Theory of the Leisure Class by Thorstein Veblen, three-martini lunch, Tim Cook: Apple, transaction costs, Uber and Lyft, uber lyft, Upton Sinclair, éminence grise

“We really have to get into these walled gardens to really understand what people are doing and how they’re behaving,” he says. Mobile phones pose obstacles as well; marketing on mobile phones is complex. Salama cautions, “How do we test ads on mobile?” And if the mobile phone lacks a flash drive, they can’t show the ad. Talent may be big data’s most crucial impediment, Salama thinks. “Everyone wants to hire data scientists and engineers.” The supply is limited, the competition intense. * * * In the data arms race, Irwin Gotlieb set out to build a state-of-the-art data weapon, known internally as the Secret Sauce project. Gotlieb was intent on building GroupM’s own proprietary data system because, as he would publicly complain in 2015, Facebook and Google had their own ad tech vehicles to target ads and were, in effect, muscling agencies and clients by warning: “If you want to buy ads on our properties, you have to use our ad tech tools.”

Third, I need to match that to an advertiser that’s trying to reach you. Fourth, I need to value [price] that. And fifth, I need the systems to be able to get you the right ad within two hundred milliseconds.” Agencies have to change, he says, and will have to recruit more “software developers and data scientists and analysts. . . . Tech people are going to take over.” Like colleagues at other agencies, Xaxis executives obsess about Facebook. On any given day, a Xaxis vice president confides, they see 130 billion Internet pages that carry an ad. Facebook and Google see trillions. With its walled garden, he says Facebook tries to “curtail people from understanding what’s happening in their environments.

The reason that progress in AI has seemed so pronounced in the past few years is that technological advances in all three areas have accelerated.”* The race to dominate AI reveals companies with deep pockets—Google, Facebook, Amazon, IBM, Oracle, Apple, Salesforce.com, among others—vying to hire engineers and data scientists. “The bulk of Fox advertising will be sold by machines,” predicts James Murdoch, who goes on to say this will threaten the existence of the advertising holding companies. “The bulk of their business, the buying of media and the analysis of how to generate reach at a low incremental cost, it’s hard to see what their role is twenty years from now.


pages: 428 words: 103,544

The Data Detective: Ten Easy Rules to Make Sense of Statistics by Tim Harford

Abraham Wald, access to a mobile phone, Ada Lovelace, affirmative action, algorithmic bias, Automated Insights, banking crisis, basic income, behavioural economics, Black Lives Matter, Black Swan, Bretton Woods, British Empire, business cycle, Cambridge Analytica, Capital in the Twenty-First Century by Thomas Piketty, Cass Sunstein, Charles Babbage, clean water, collapse of Lehman Brothers, contact tracing, coronavirus, correlation does not imply causation, COVID-19, cuban missile crisis, Daniel Kahneman / Amos Tversky, data science, David Attenborough, Diane Coyle, disinformation, Donald Trump, Estimating the Reproducibility of Psychological Science, experimental subject, fake news, financial innovation, Florence Nightingale: pie chart, Gini coefficient, Great Leap Forward, Hans Rosling, high-speed rail, income inequality, Isaac Newton, Jeremy Corbyn, job automation, Kickstarter, life extension, meta-analysis, microcredit, Milgram experiment, moral panic, Netflix Prize, Northpointe / Correctional Offender Management Profiling for Alternative Sanctions, opioid epidemic / opioid crisis, Paul Samuelson, Phillips curve, publication bias, publish or perish, random walk, randomized controlled trial, recommendation engine, replication crisis, Richard Feynman, Richard Thaler, rolodex, Ronald Reagan, selection bias, sentiment analysis, Silicon Valley, sorting algorithm, sparse data, statistical model, stem cell, Stephen Hawking, Steve Bannon, Steven Pinker, survivorship bias, systematic bias, TED Talk, universal basic income, W. E. B. Du Bois, When a measure becomes a target

Clearly this was a ponderous method, although it is unlikely that it introduced enough errors to explain the huge disparity between my experience and the official occupancy figures. In any case, in the age of contactless payments it’s much easier to estimate passenger numbers. The vast majority of bus journeys are made by people tapping an identifiable contactless chip on a bank card, a TfL Oyster card, or a smartphone. The data scientists at TfL can see where and when these devices are being used. They still have to make an educated guess as to when you get off the bus, but this is often possible—for example, they might see you make the return journey from the same area later. Or they might see that you had used your card on a connecting service: whenever I tapped into the tube network at Bethnal Green, one minute after the bus I’d been riding on arrived in the area, TfL could conclude with confidence that I’d been on the bus until the stop at Bethnal Green, but no farther.

I’ve discussed this story with many people and I’m struck by a disparity. Most people are wide-eyed with astonishment. But two of the groups I hang out with a lot take a rather different view. Journalists are often cynical; some suspect Duhigg of inventing, exaggerating, or passing on an urban myth. (I suspect them of professional jealousy.) Data scientists and statisticians, on the other hand, yawn. They regard the tale as both unsurprising and uninformative. And I think the statisticians have it right. First, let’s think for a moment about just how amazing it might be to predict that someone is pregnant based on their shopping habits: not very.

We’re awestruck by the algorithm in part because we don’t appreciate the mundanity of what’s happening underneath the magician’s silk handkerchief. And there’s another way in which Duhigg’s story of Target’s algorithm invites us to overestimate the capabilities of data-fueled computer analytics. “There’s a huge false positive issue,” says Kaiser Fung, a data scientist who has spent years developing similar approaches for retailers and advertisers. What Fung means is that we don’t get to hear stories about women who receive coupons for baby wear but who aren’t pregnant. Hearing the anecdote, it’s easy to assume that Target’s algorithms are infallible—that everybody receiving coupons for onesies and wet wipes is pregnant.
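Fung's point about false positives can be made concrete with a little arithmetic. The sketch below is illustrative only: the function name and all three input rates are hypothetical, chosen to show the effect, and are not figures from Target or from the book.

```python
def share_of_flagged_who_are_pregnant(prevalence, sensitivity, false_positive_rate):
    """Fraction of flagged shoppers who really are pregnant (the precision).

    All three inputs are hypothetical rates between 0 and 1:
      prevalence          - share of shoppers who are pregnant
      sensitivity         - share of pregnant shoppers the algorithm flags
      false_positive_rate - share of non-pregnant shoppers it flags anyway
    """
    true_positives = prevalence * sensitivity
    false_positives = (1 - prevalence) * false_positive_rate
    flagged = true_positives + false_positives
    if flagged == 0:
        return 0.0  # nobody was flagged, so there is nothing to measure
    return true_positives / flagged


# Made-up example: 2% of shoppers are pregnant, the algorithm catches 80%
# of them, and it wrongly flags 5% of everyone else.
print(share_of_flagged_who_are_pregnant(0.02, 0.80, 0.05))
```

Under these assumed rates, only about a quarter of the shoppers receiving baby coupons would actually be pregnant, even though the algorithm looks accurate case by case. The anecdote reports one of the hits and none of the misses.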


pages: 484 words: 114,613

No Filter: The Inside Story of Instagram by Sarah Frier

Airbnb, Amazon Web Services, Benchmark Capital, blockchain, Blue Bottle Coffee, Cambridge Analytica, Clayton Christensen, cloud computing, cryptocurrency, data science, disinformation, Donald Trump, Elon Musk, end-to-end encryption, fake news, Frank Gehry, growth hacking, Jeff Bezos, Marc Andreessen, Mark Zuckerberg, Menlo Park, Minecraft, move fast and break things, Network effects, new economy, Oculus Rift, Peter Thiel, ride hailing / ride sharing, Sheryl Sandberg, side project, Silicon Valley, Silicon Valley startup, Snapchat, Steve Jobs, TaskRabbit, TikTok, Tony Hsieh, Travis Kalanick, ubercab, Zipcar

When Instagram launched new features, she tried to make sure they demonstrated them with teen digital-first influencers. The data showed that these kinds of stars, who had become famous on Vine, YouTube, or Instagram, were much more popular than anyone in the office expected them to be. She made a list of 500 of them, then asked Facebook data scientists for help understanding their impact. They found that about a third of Instagram’s user base followed at least one of the people on her list. Perle, like Porch, thought that Instagram should have a role in creating future mainstream celebrities—and that it would be important to build relationships with the ones who hadn’t quite become stars yet but had high interest from their audiences.

In the case of Myspace, the disruptor was Facebook. Paranoia over obsolescence festered at Facebook’s very core, and was the reason they’d bought Instagram and attempted to buy Snapchat in the first place. The anecdotal evidence from Third Thursday Teens was backed up by the data. When Nayak first heard about finstas, she asked Instagram’s data scientists to look into how many people had multiple accounts. After weeks of pestering, she got the numbers back. Between 15 and 20 percent of users had multiple accounts, and among teens, that proportion was much higher. She wrote out a report to explain the phenomenon for the Instagram team, since she couldn’t find anything about it in a Google search.

But Zuckerberg gave the same dismissive opinion at a question-and-answer session with employees the next day. He also told his workers that there was another, more positive way of looking at it. If people were blaming Facebook for the election’s outcome, it showed how important the social network was to their everyday lives. Not long after Zuckerberg’s talk, a data scientist posted a study internally on the difference between Trump’s campaign and Clinton’s. That was when employees realized there was another, maybe even bigger way their company had helped ensure the election outcome. In their attempt to be impartial, Facebook had given much more advertising strategy help to Trump.


pages: 409 words: 112,055

The Fifth Domain: Defending Our Country, Our Companies, and Ourselves in the Age of Cyber Threats by Richard A. Clarke, Robert K. Knake

"World Economic Forum" Davos, A Declaration of the Independence of Cyberspace, Affordable Care Act / Obamacare, air gap, Airbnb, Albert Einstein, Amazon Web Services, autonomous vehicles, barriers to entry, bitcoin, Black Lives Matter, Black Swan, blockchain, Boeing 737 MAX, borderless world, Boston Dynamics, business cycle, business intelligence, call centre, Cass Sunstein, cloud computing, cognitive bias, commoditize, computer vision, corporate governance, cryptocurrency, data acquisition, data science, deep learning, DevOps, disinformation, don't be evil, Donald Trump, Dr. Strangelove, driverless car, Edward Snowden, Exxon Valdez, false flag, geopolitical risk, global village, immigration reform, information security, Infrastructure as a Service, Internet of things, Jeff Bezos, John Perry Barlow, Julian Assange, Kubernetes, machine readable, Marc Benioff, Mark Zuckerberg, Metcalfe’s law, MITM: man-in-the-middle, Morris worm, move fast and break things, Network effects, open borders, platform as a service, Ponzi scheme, quantum cryptography, ransomware, Richard Thaler, Salesforce, Sand Hill Road, Schrödinger's Cat, self-driving car, shareholder value, Silicon Valley, Silicon Valley startup, Skype, smart cities, Snapchat, software as a service, Steven Levy, Stuxnet, technoutopianism, The future is already here, Tim Cook: Apple, undersea cable, unit 8200, WikiLeaks, Y2K, zero day

By formally defining and verifying modular components of code, these pieces of code may be used as trusted building blocks, providing strong footing for less secure software running on top of them. Meanwhile, as we wait for our AI overlords to start writing better code, Zatko and his wife, the data scientist Sarah Zatko, have started rating today’s software for how well it is constructed. At the request of the White House in 2015, they set up the Cyber Independent Testing Lab to automate the process of rating software quality to, in his words, “quantify the resilience of software against future exploitation.”

Evan Wolff is one of the leading cybersecurity attorneys in Washington. What that means is that by day he helps his clients respond to cyber incidents, including directing investigations and advising on notifications under state, federal, and international requirements. By night, as a former MITRE data scientist and Global Fellow at the Wilson Center, he thinks and writes about how his clients can mitigate the threat of cyber incidents in the first place, including what can be done to build an effective collective defense. From experience, Wolff recognizes that only rarely would the teams behind security incidents stop at nothing to reach their targets.

For most AI/ML programs to work well, that data all needs to be swimming in the same place, in what big corporations call their data lake. It seldom is. It’s scattered. Or sometimes it is not even collected or stored, or not stored for very long, or not stored in the right format. Then it has to be converted into a usable format through what data scientists politely call “manicuring” (and behind the scenes call “data mangling”) for the AI/ML engine to perform its work. Capturing all the data and storing it for six weeks or more to catch the “low-and-slow” attacks (ones that take each step in the attack days or weeks apart so as not to be noticed) would be a very expensive proposition for any company.


pages: 1,034 words: 241,773

Enlightenment Now: The Case for Reason, Science, Humanism, and Progress by Steven Pinker

3D printing, Abraham Maslow, access to a mobile phone, affirmative action, Affordable Care Act / Obamacare, agricultural Revolution, Albert Einstein, Alfred Russel Wallace, Alignment Problem, An Inconvenient Truth, anti-communist, Anton Chekhov, Arthur Eddington, artificial general intelligence, availability heuristic, Ayatollah Khomeini, basic income, Berlin Wall, Bernie Sanders, biodiversity loss, Black Swan, Bonfire of the Vanities, Brexit referendum, business cycle, capital controls, Capital in the Twenty-First Century by Thomas Piketty, carbon footprint, carbon tax, Charlie Hebdo massacre, classic study, clean water, clockwork universe, cognitive bias, cognitive dissonance, Columbine, conceptual framework, confounding variable, correlation does not imply causation, creative destruction, CRISPR, crowdsourcing, cuban missile crisis, Daniel Kahneman / Amos Tversky, dark matter, data science, decarbonisation, degrowth, deindustrialization, dematerialisation, demographic transition, Deng Xiaoping, distributed generation, diversified portfolio, Donald Trump, Doomsday Clock, double helix, Eddington experiment, Edward Jenner, effective altruism, Elon Musk, en.wikipedia.org, end world poverty, endogenous growth, energy transition, European colonialism, experimental subject, Exxon Valdez, facts on the ground, fake news, Fall of the Berlin Wall, first-past-the-post, Flynn Effect, food miles, Francis Fukuyama: the end of history, frictionless, frictionless market, Garrett Hardin, germ theory of disease, Gini coefficient, Great Leap Forward, Hacker Conference 1984, Hans Rosling, hedonic treadmill, helicopter parent, Herbert Marcuse, Herman Kahn, Hobbesian trap, humanitarian revolution, Ignaz Semmelweis: hand washing, income inequality, income per capita, Indoor air pollution, Intergovernmental Panel on Climate Change (IPCC), invention of writing, Jaron Lanier, Joan Didion, job automation, Johannes Kepler, John Snow's cholera map, Kevin Kelly, Khan Academy, knowledge economy, l'esprit de l'escalier, Laplace demon, launch on warning, life extension, long peace, longitudinal study, Louis Pasteur, Mahbub ul Haq, Martin Wolf, mass incarceration, meta-analysis, Michael Shellenberger, microaggression, Mikhail Gorbachev, minimum wage unemployment, moral hazard, mutually assured destruction, Naomi Klein, Nate Silver, Nathan Meyer Rothschild: antibiotics, negative emissions, Nelson Mandela, New Journalism, Norman Mailer, nuclear taboo, nuclear winter, obamacare, ocean acidification, Oklahoma City bombing, open economy, opioid epidemic / opioid crisis, paperclip maximiser, Paris climate accords, Paul Graham, peak oil, Peter Singer: altruism, Peter Thiel, post-truth, power law, precautionary principle, precision agriculture, prediction markets, public intellectual, purchasing power parity, radical life extension, Ralph Nader, randomized controlled trial, Ray Kurzweil, rent control, Republic of Letters, Richard Feynman, road to serfdom, Robert Gordon, Rodney Brooks, rolodex, Ronald Reagan, Rory Sutherland, Saturday Night Live, science of happiness, Scientific racism, Second Machine Age, secular stagnation, self-driving car, sharing economy, Silicon Valley, Silicon Valley ideology, Simon Kuznets, Skype, smart grid, Social Justice Warrior, sovereign wealth fund, sparse data, stem cell, Stephen Hawking, Steve Bannon, Steven Pinker, Stewart Brand, Stuxnet, supervolcano, synthetic biology, tech billionaire, technological determinism, technological singularity, Ted Kaczynski, Ted Nordhaus, TED Talk, The Rise and Fall of American Growth, the scientific method, The Signal and the Noise by Nate Silver, The Spirit Level, The Wealth of Nations by Adam Smith, The Wisdom of Crowds, Thomas Kuhn: the structure of scientific revolutions, Thomas Malthus, total factor productivity, Tragedy of the Commons, union organizing, universal basic income, University of East Anglia, Unsafe at Any Speed, Upton Sinclair, uranium enrichment, urban renewal, W. E. B. Du Bois, War on Poverty, We wanted flying cars, instead we got 140 characters, women in the workforce, working poor, World Values Survey, Y2K

I am grateful as well to Marian Tupy of HumanProgress and to Ola Rosling and Hans Rosling of Gapminder, two other invaluable resources for understanding the state of humanity. Hans was an inspiration, and his death in 2017 a tragedy for those who are committed to reason, science, humanism, and progress. My gratitude goes as well to the other data scientists I pestered and to the institutions that collect and maintain their data: Karlyn Bowman, Daniel Cox (PRRI), Tamar Epner (Social Progress Index), Christopher Fariss, Chelsea Follett (HumanProgress), Andrew Gelman, Yair Ghitza, April Ingram (Science Heroes), Jill Janocha (Bureau of Labor Statistics), Gayle Kelch (US Fire Administration/FEMA), Alaina Kolosh (National Safety Council), Kalev Leetaru (Global Database of Events, Language, and Tone), Monty Marshall (Polity Project), Bruce Meyer, Branko Milanović (World Bank), Robert Muggah (Homicide Monitor), Pippa Norris (World Values Survey), Thomas Olshanski (US Fire Administration/FEMA), Amy Pearce (Science Heroes), Mark Perry, Therese Pettersson (Uppsala Conflict Data Program), Leandro Prados de la Escosura, Stephen Radelet, Auke Rijpma (OECD Clio Infra), Hannah Ritchie (Our World in Data), Seth Stephens-Davidowitz (Google Trends), James X.

One consequence is that many Americans today have difficulty imagining, valuing or even believing in the promise of incremental system change, which leads to a greater appetite for revolutionary, smash-the-machine change.30 Bornstein and Rosenberg don’t blame the usual culprits (cable TV, social media, late-night comedians) but instead trace it to the shift during the Vietnam and Watergate eras from glorifying leaders to checking their power—with an overshoot toward indiscriminate cynicism, in which everything about America’s civic actors invites an aggressive takedown. If the roots of progressophobia lie in human nature, is my suggestion that it is on the rise itself an illusion of the Availability bias? Anticipating the methods I will use in the rest of the book, let’s look at an objective measure. The data scientist Kalev Leetaru applied a technique called sentiment mining to every article published in the New York Times between 1945 and 2005, and to an archive of translated articles and broadcasts from 130 countries between 1979 and 2010. Sentiment mining assesses the emotional tone of a text by tallying the number and contexts of words with positive and negative connotations, like good, nice, terrible, and horrific.
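The word-tallying idea behind sentiment mining can be sketched in a few lines. This is a minimal illustration, not Leetaru's actual method: the function name and scoring formula are assumptions, the two word lists contain only the four example words from the passage, and the sketch counts words without looking at their contexts, which the technique described above also takes into account.

```python
import re

# Tiny illustrative lexicons, limited to the words named in the passage.
# A real sentiment lexicon would contain thousands of entries.
POSITIVE_WORDS = {"good", "nice"}
NEGATIVE_WORDS = {"terrible", "horrific"}


def sentiment_score(text):
    """Return (positive count - negative count) / total words.

    The score lies between -1.0 and 1.0; empty text scores 0.0.
    """
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    positive = sum(word in POSITIVE_WORDS for word in words)
    negative = sum(word in NEGATIVE_WORDS for word in words)
    return (positive - negative) / len(words)


print(sentiment_score("A good, nice day"))       # two positive words out of four
print(sentiment_score("Terrible and horrific"))  # two negative words out of three
```

Dividing by the word count keeps long and short articles comparable, so scores averaged by year can be tracked over time.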

The technological advances that have propelled this progress should only gather speed. Stein’s Law continues to obey Davies’s Corollary (Things that can’t go on forever can go on much longer than you think), and genomics, synthetic biology, neuroscience, artificial intelligence, materials science, data science, and evidence-based policy analysis are flourishing. We know that infectious diseases can be extinguished, and many are slated for the past tense. Chronic and degenerative diseases are more recalcitrant, but incremental progress in many (such as cancer) has been accelerating, and breakthroughs in others (such as Alzheimer’s) are likely.


pages: 393 words: 91,257

The Coming of Neo-Feudalism: A Warning to the Global Middle Class by Joel Kotkin

"RICO laws" OR "Racketeer Influenced and Corrupt Organizations", "World Economic Forum" Davos, Admiral Zheng, Alvin Toffler, Andy Kessler, autonomous vehicles, basic income, Bernie Sanders, Big Tech, bread and circuses, Brexit referendum, call centre, Capital in the Twenty-First Century by Thomas Piketty, carbon credits, carbon footprint, Cass Sunstein, clean water, company town, content marketing, Cornelius Vanderbilt, creative destruction, data science, deindustrialization, demographic transition, deplatforming, don't be evil, Donald Trump, driverless car, edge city, Elon Musk, European colonialism, Evgeny Morozov, financial independence, Francis Fukuyama: the end of history, Future Shock, gentrification, gig economy, Gini coefficient, Google bus, Great Leap Forward, green new deal, guest worker program, Hans Rosling, Herbert Marcuse, housing crisis, income inequality, informal economy, Jane Jacobs, Jaron Lanier, Jeff Bezos, Jeremy Corbyn, job automation, job polarisation, job satisfaction, Joseph Schumpeter, land reform, liberal capitalism, life extension, low skilled workers, Lyft, Marc Benioff, Mark Zuckerberg, market fundamentalism, Martin Wolf, mass immigration, megacity, Michael Shellenberger, Nate Silver, new economy, New Urbanism, Northpointe / Correctional Offender Management Profiling for Alternative Sanctions, Occupy movement, Parag Khanna, Peter Thiel, plutocrats, post-industrial society, post-work, postindustrial economy, postnationalism / post nation state, precariat, profit motive, public intellectual, RAND corporation, Ray Kurzweil, rent control, Richard Florida, road to serfdom, Robert Gordon, Salesforce, Sam Altman, San Francisco homelessness, Satyajit Das, sharing economy, Sidewalk Labs, Silicon Valley, smart cities, Social Justice Warrior, Steve Jobs, Stewart Brand, superstar cities, technological determinism, Ted Nordhaus, The Death and Life of Great American Cities, The future is already here, The Future of Employment, The Rise and Fall of 
American Growth, Thomas L Friedman, too big to fail, trade route, Travis Kalanick, Uber and Lyft, uber lyft, universal basic income, unpaid internship, upwardly mobile, Virgin Galactic, We are the 99%, Wolfgang Streeck, women in the workforce, work culture, working-age population, Y Combinator

Stanford graduates had already founded Hewlett-Packard in 1939, and an engineering professor who became provost of the university, Frederick Terman, nurtured tech companies in the area.6 In the ensuing decades, the Bay Area, including San Francisco, became the world’s leading technology hub. This rapid technological growth resulted in a consolidation of wealth and power in a handful of companies. A relatively small cadre of engineers, data scientists, and marketers—a tiny sliver of humanity—began reshaping the world’s economy, and its culture as well.7 In the Middle Ages, the power of the nobility rested on the control of land and the right to bear arms; the power of today’s ascendant tech aristocracy comes mainly from exploiting “natural monopolies” in web-based business.

They will thereby create new godlings, who might be as different from us Sapiens as we are different from Homo erectus.38 Clearly the tech elites’ search for immortality does not address issues that affect those still living within nature’s limits. Someone needing assistance in a disaster is more likely to look toward a church member than a data scientist for help. Organized faiths at their best serve as powerful instruments of social improvement, with particular concern for the needy. The secular social justice warriors may be passionately committed to their causes, but often it is groups like the Baptists or the Church of Jesus Christ of Latter-Day Saints who come to the rescue faster and more effectively in a crisis.39 Religious institutions have long brought together people of disparate backgrounds and economic status, building social bonds between them and serving as unifying transmitters of tradition and cultural identity.

“By 2022, it’s possible that your personal device will know more about your emotional state than your own family,” said Annette Zimmermann, research vice president at the consulting company Gartner.13 This emotional reliance on technology provides more opportunity for the oligarchy and the clerisy to gain access to our inner feelings and profit from them.14 No matter how strongly a public relations staffer at Facebook or Google contends otherwise, the algorithms that govern social media are not neutral or objective, but reflect the assumptions of those who create the programs. “Algorithms are opinions embedded in code,” writes Cathy O’Neil, a data scientist.15 The most concerning effects of the new intrusive technology can be seen in younger people. Research published in 2017 by Jean Twenge, a psychologist at San Diego State University, indicates that more screen time and social media activity correlate with a higher rate of depression and elevated suicide risk among American adolescents.


pages: 661 words: 156,009

Your Computer Is on Fire by Thomas S. Mullaney, Benjamin Peters, Mar Hicks, Kavita Philip

"Susan Fowler" uber, 2013 Report for America's Infrastructure - American Society of Civil Engineers - 19 March 2013, A Declaration of the Independence of Cyberspace, affirmative action, Airbnb, algorithmic bias, AlphaGo, AltaVista, Amazon Mechanical Turk, Amazon Web Services, American Society of Civil Engineers: Report Card, An Inconvenient Truth, Asilomar, autonomous vehicles, Big Tech, bitcoin, Bletchley Park, blockchain, Boeing 737 MAX, book value, British Empire, business cycle, business process, Californian Ideology, call centre, Cambridge Analytica, carbon footprint, Charles Babbage, cloud computing, collective bargaining, computer age, computer vision, connected car, corporate governance, corporate social responsibility, COVID-19, creative destruction, cryptocurrency, dark matter, data science, Dennis Ritchie, deskilling, digital divide, digital map, don't be evil, Donald Davies, Donald Trump, Edward Snowden, en.wikipedia.org, European colonialism, fake news, financial innovation, Ford Model T, fulfillment center, game design, gentrification, George Floyd, glass ceiling, global pandemic, global supply chain, Grace Hopper, hiring and firing, IBM and the Holocaust, industrial robot, informal economy, Internet Archive, Internet of things, Jeff Bezos, job automation, John Perry Barlow, Julian Assange, Ken Thompson, Kevin Kelly, Kickstarter, knowledge economy, Landlord’s Game, Lewis Mumford, low-wage service sector, M-Pesa, Mark Zuckerberg, mass incarceration, Menlo Park, meta-analysis, mobile money, moral panic, move fast and break things, Multics, mutually assured destruction, natural language processing, Neal Stephenson, new economy, Norbert Wiener, off-the-grid, old-boy network, On the Economy of Machinery and Manufactures, One Laptop per Child (OLPC), packet switching, pattern recognition, Paul Graham, pink-collar, pneumatic tube, postindustrial economy, profit motive, public intellectual, QWERTY keyboard, Ray Kurzweil, Reflections on Trusting Trust, Report 
Card for America’s Infrastructure, Salesforce, sentiment analysis, Sheryl Sandberg, Silicon Valley, Silicon Valley ideology, smart cities, Snapchat, speech recognition, SQL injection, statistical model, Steve Jobs, Stewart Brand, tacit knowledge, tech worker, techlash, technoutopianism, telepresence, the built environment, the map is not the territory, Thomas L Friedman, TikTok, Triangle Shirtwaist Factory, undersea cable, union organizing, vertical integration, warehouse robotics, WikiLeaks, wikimedia commons, women in the workforce, Y2K

Technology analysts Luke Stark and Anna Hoffmann, in a mid-2019 opinion piece published while ethical debates raged in data science, suggested ways in which metaphors matter. Data-driven work is already complicit, they argue, “in perpetuating racist, sexist, and other oppressive harms.”4 Stark and Hoffmann argued, however, that a solution was hidden in the very articulation of this problem: “The language we use to describe data can also help us fix its problems.” They drew on a skill commonly believed appropriate only in literature departments: the analysis of metaphor, imagery, and other narrative tropes. This route to data ethics, they suggested, would help data scientists to be “explicit about the power dynamics and historical oppressions that shape our world.”

Often, the fact that data—which is the output on some level of human activity and thought—is not typically seen as a social construct by a whole host of data makers makes intervening upon the dirty or flawed data even more difficult.14 In fact, I would say this is a major point of contention when humanists and social scientists come together with colleagues from other domains to try to make sense of the output of these products and processes. Even social scientists currently engaging in data science projects, like anthropologists developing predictive policing technologies, are using logics and frameworks that are widely disputed as discriminatory.15 The concepts of the purity and neutrality of data are so deeply embedded in the training and discourses about what data is that there is great difficulty moving away from the reductionist argument that “math can’t discriminate because it’s math,” which patently avoids the issue of application of predictive mathematical modeling to the social dimensions of human experience.


pages: 484 words: 104,873

Rise of the Robots: Technology and the Threat of a Jobless Future by Martin Ford

3D printing, additive manufacturing, Affordable Care Act / Obamacare, AI winter, algorithmic management, algorithmic trading, Amazon Mechanical Turk, artificial general intelligence, assortative mating, autonomous vehicles, banking crisis, basic income, Baxter: Rethink Robotics, Bernie Madoff, Bill Joy: nanobots, bond market vigilante, business cycle, call centre, Capital in the Twenty-First Century by Thomas Piketty, carbon tax, Charles Babbage, Chris Urmson, Clayton Christensen, clean water, cloud computing, collateralized debt obligation, commoditize, computer age, creative destruction, data science, debt deflation, deep learning, deskilling, digital divide, disruptive innovation, diversified portfolio, driverless car, Erik Brynjolfsson, factory automation, financial innovation, Flash crash, Ford Model T, Fractional reserve banking, Freestyle chess, full employment, general purpose technology, Geoffrey Hinton, Goldman Sachs: Vampire Squid, Gunnar Myrdal, High speed trading, income inequality, indoor plumbing, industrial robot, informal economy, iterative process, Jaron Lanier, job automation, John Markoff, John Maynard Keynes: technological unemployment, John von Neumann, Kenneth Arrow, Khan Academy, Kiva Systems, knowledge worker, labor-force participation, large language model, liquidity trap, low interest rates, low skilled workers, low-wage service sector, Lyft, machine readable, machine translation, manufacturing employment, Marc Andreessen, McJob, moral hazard, Narrative Science, Network effects, new economy, Nicholas Carr, Norbert Wiener, obamacare, optical character recognition, passive income, Paul Samuelson, performance metric, Peter Thiel, plutocrats, post scarcity, precision agriculture, price mechanism, public intellectual, Ray Kurzweil, rent control, rent-seeking, reshoring, RFID, Richard Feynman, Robert Solow, Rodney Brooks, Salesforce, Sam Peltzman, secular stagnation, self-driving car, Silicon Valley, Silicon Valley billionaire, Silicon Valley 
startup, single-payer health, software is eating the world, sovereign wealth fund, speech recognition, Spread Networks laid a new fibre optics cable between New York and Chicago, stealth mode startup, stem cell, Stephen Hawking, Steve Jobs, Steven Levy, Steven Pinker, strong AI, Stuxnet, technological singularity, telepresence, telepresence robot, The Bell Curve by Richard Herrnstein and Charles Murray, The Coming Technological Singularity, The Future of Employment, the long tail, Thomas L Friedman, too big to fail, Tragedy of the Commons, Tyler Cowen, Tyler Cowen: Great Stagnation, uber lyft, union organizing, Vernor Vinge, very high income, warehouse automation, warehouse robotics, Watson beat the top human players on Jeopardy!, women in the workforce

Tools that provide new ways to visualize data collected from social media interactions as well as sensors built into doors, turnstiles, and escalators offer urban planners and city managers graphic representations of the way people move, work, and interact in urban environments, a development that may lead directly to more efficient and livable cities. There is a potential dark side, however. Target, Inc., provided a far more controversial example of the ways in which vast quantities of extraordinarily detailed customer data can be leveraged. A data scientist working for the company found a complex set of correlations involving the purchase of about twenty-five different health and cosmetic products that were a powerful early predictor of pregnancy. The company’s analysis could even estimate a woman’s due date with a high degree of accuracy. Target began bombarding women with offers for pregnancy-related products at such an early stage that, in some cases, the women had often not yet shared the news with their immediate families.

Quentin Hardy, “Active in Cloud, Amazon Reshapes Computing,” New York Times, August 27, 2012, http://www.nytimes.com/2012/08/28/technology/active-in-cloud-amazon-reshapes-computing.html. 31. Mark Stevenson, An Optimist’s Tour of the Future: One Curious Man Sets Out to Answer “What’s Next?” (New York: Penguin Group, 2011), p. 101. 32. Michael Schmidt and Hod Lipson, “Distilling Free-Form Natural Laws from Experimental Data,” Science 324 (April 3, 2009), http://creativemachines.cornell.edu/sites/default/files/Science09_Schmidt.pdf. 33. Stevenson, An Optimist’s Tour of the Future, p. 104. 34. National Science Foundation Press Release: “Maybe Robots Dream of Electric Sheep, But Can They Do Science?,” April 2, 2009, http://www.nsf.gov/mobile/news/news_summ.jsp?


pages: 364 words: 99,897

The Industries of the Future by Alec Ross

"World Economic Forum" Davos, 23andMe, 3D printing, Airbnb, Alan Greenspan, algorithmic bias, algorithmic trading, AltaVista, Anne Wojcicki, autonomous vehicles, banking crisis, barriers to entry, Bernie Madoff, bioinformatics, bitcoin, Black Lives Matter, blockchain, Boston Dynamics, Brian Krebs, British Empire, business intelligence, call centre, carbon footprint, clean tech, cloud computing, collaborative consumption, connected car, corporate governance, Credit Default Swap, cryptocurrency, data science, David Brooks, DeepMind, Demis Hassabis, disintermediation, Dissolution of the Soviet Union, distributed ledger, driverless car, Edward Glaeser, Edward Snowden, en.wikipedia.org, Erik Brynjolfsson, Evgeny Morozov, fiat currency, future of work, General Motors Futurama, global supply chain, Google X / Alphabet X, Gregor Mendel, industrial robot, information security, Internet of things, invention of the printing press, Jaron Lanier, Jeff Bezos, job automation, John Markoff, Joi Ito, Kevin Roose, Kickstarter, knowledge economy, knowledge worker, lifelogging, litecoin, low interest rates, M-Pesa, machine translation, Marc Andreessen, Mark Zuckerberg, Max Levchin, Mikhail Gorbachev, military-industrial complex, mobile money, money: store of value / unit of account / medium of exchange, Nelson Mandela, new economy, off-the-grid, offshore financial centre, open economy, Parag Khanna, paypal mafia, peer-to-peer, peer-to-peer lending, personalized medicine, Peter Thiel, precision agriculture, pre–internet, RAND corporation, Ray Kurzweil, recommendation engine, ride hailing / ride sharing, Rubik’s Cube, Satoshi Nakamoto, selective serotonin reuptake inhibitor (SSRI), self-driving car, sharing economy, Silicon Valley, Silicon Valley startup, Skype, smart cities, social graph, software as a service, special economic zone, supply-chain management, supply-chain management software, technoutopianism, TED Talk, The Future of Employment, Travis Kalanick, underbanked, unit 8200, 
Vernor Vinge, Watson beat the top human players on Jeopardy!, women in the workforce, work culture, Y Combinator, young professional

In very competitive elections, the Obama campaign used big data to gain insights into how to raise money, where to campaign, and how to advertise, which none of its opponents could rival. From fund-raising to field operations to the analytics in its polling operation, a group of several hundred digital operatives and data scientists crushed their Republican opponents. In 2012, the Obama campaign’s voter targeting and turnout programs performed brilliantly, while the Romney campaign’s crashed. Over the course of the 2012 campaign, Obama’s 18-person email team tested over 10,000 versions of email messages. In one instance, the campaign ran 18 variations of a single email, all with different subject lines, to determine which would be most effective.

Google chairman Eric Schmidt recruited an Israeli entrepreneur, Dror Berman, to move to Silicon Valley and head up Innovation Endeavors, a large venture firm that invests Schmidt’s money. Israel is home to many of the 20th century’s great innovations in farming. Berman brought the intellectual curiosity about agriculture with him to Silicon Valley and developed Farm2050, a partnership that aspires to combine data science and robotics to improve farming with a group of partners as diverse as Google, DuPont, and 3D Robotics. Dror recognized that Silicon Valley can be a little too navel-gazing, and told me that 90 percent of the region’s entrepreneurs focus on 10 percent of the world’s problems. With Farm2050, he is trying to bring Silicon Valley’s A game to agriculture.


pages: 688 words: 107,867

Python Data Analytics: With Pandas, NumPy, and Matplotlib by Fabio Nelli

Amazon Web Services, backpropagation, centre right, computer vision, data science, Debian, deep learning, DevOps, functional programming, Google Earth, Guido van Rossum, Internet of things, optical character recognition, pattern recognition, sentiment analysis, speech recognition, statistical model, web application

Population in 2014 Conclusions Chapter 12: Recognizing Handwritten Digits Handwriting Recognition Recognizing Handwritten Digits with scikit-learn The Digits Dataset Learning and Predicting Recognizing Handwritten Digits with TensorFlow Learning and Predicting Conclusions Chapter 13: Textual Data Analysis with NLTK Text Analysis Techniques The Natural Language Toolkit (NLTK) Import the NLTK Library and the NLTK Downloader Tool Search for a Word with NLTK Analyze the Frequency of Words Selection of Words from Text Bigrams and Collocations Use Text on the Network Extract the Text from the HTML Pages Sentimental Analysis Conclusions Chapter 14: Image Analysis and Computer Vision with OpenCV Image Analysis and Computer Vision OpenCV and Python OpenCV and Deep Learning Installing OpenCV First Approaches to Image Processing and Analysis Before Starting Load and Display an Image Working with Images Save the New Image Elementary Operations on Images Image Blending Image Analysis Edge Detection and Image Gradient Analysis Edge Detection The Image Gradient Theory A Practical Example of Edge Detection with the Image Gradient Analysis A Deep Learning Example: The Face Detection Conclusions Appendix A: Writing Mathematical Expressions with LaTeX With matplotlib With IPython Notebook in a Markdown Cell With IPython Notebook in a Python 2 Cell Subscripts and Superscripts Fractions, Binomials, and Stacked Numbers Radicals Fonts Accents Appendix B: Open Data Sources Political and Government Data Health Data Social Data Miscellaneous and Public Data Sets Financial Data Climatic Data Sports Data Publications, Newspapers, and Books Musical Data Index About the Author and About the Technical Reviewer About the Author Fabio Nelli is a data scientist and Python consultant, designing and developing Python applications for data analysis and visualization. 
He has experience with the scientific world, having performed various data analysis roles in pharmaceutical chemistry for private research companies and universities. He has been a computer consultant for many years at IBM, EDS, and Hewlett-Packard, along with several banks and insurance companies.

About the Technical Reviewer Raul Samayoa is a senior software developer and machine learning specialist with many years of experience in the financial industry. An MSc graduate from the Georgia Institute of Technology, he’s never met a neural network or dataset he did not like. He’s fond of evangelizing the use of DevOps tools for data science and software development. Raul enjoys the energy of his hometown of Toronto, Canada, where he runs marathons, volunteers as a technology instructor with the University of Toronto coders, and likes to work with data in Python and R. © Fabio Nelli 2018 Fabio Nelli, Python Data Analytics, https://doi.org/10.1007/978-1-4842-3913-1_1 1.


The Data Warehouse Toolkit: The Definitive Guide to Dimensional Modeling by Ralph Kimball, Margy Ross

active measures, Albert Einstein, book value, business intelligence, business process, call centre, cloud computing, data acquisition, data science, discrete time, false flag, inventory management, iterative process, job automation, knowledge worker, performance metric, platform as a service, side project, zero-sum game

Here are a couple of examples to illustrate this recommendation: The production environment for custom analytic programming might be MatLab within PostgreSQL or SAS within a Teradata RDBMS, but the data scientists might be building their proofs of concept in a wide variety of their own preferred languages and architectures. The key insight here: IT must be uncharacteristically tolerant of the range of technologies the data scientists use and be prepared in many cases to re-implement the data scientists' work in a standard set of technologies that can be supported over the long haul. The sandbox development environment might be custom R code directly accessing Hadoop, but controlled by a metadata-driven ETL tool. Then when the data scientist is ready to hand over the proof of concept, much of the logic could immediately be redeployed under the ETL tool to run in a grid computing environment that is scalable, highly available, and secure.

Build From Sandbox Results Consider embracing sandbox silos and building a practice of productionizing sandbox results. Allow data scientists to construct their data experiments and prototypes using their preferred languages and programming environments. Then, after proof of concept, systematically reprogram these implementations with an IT turnover team. Here are a couple of examples to illustrate this recommendation: The production environment for custom analytic programming might be MatLab within PostgreSQL or SAS within a Teradata RDBMS, but the data scientists might be building their proofs of concept in a wide variety of their own preferred languages and architectures.


pages: 223 words: 60,909

Technically Wrong: Sexist Apps, Biased Algorithms, and Other Threats of Toxic Tech by Sara Wachter-Boettcher

"Susan Fowler" uber, Abraham Maslow, Airbnb, airport security, algorithmic bias, AltaVista, big data - Walmart - Pop Tarts, Big Tech, Black Lives Matter, data science, deep learning, Donald Trump, fake news, false flag, Ferguson, Missouri, Firefox, Grace Hopper, Greyball, Hacker News, hockey-stick growth, independent contractor, job automation, Kickstarter, lifelogging, lolcat, Marc Benioff, Mark Zuckerberg, Max Levchin, Menlo Park, meritocracy, microaggression, move fast and break things, natural language processing, Northpointe / Correctional Offender Management Profiling for Alternative Sanctions, off-the-grid, pattern recognition, Peter Thiel, real-name policy, recommendation engine, ride hailing / ride sharing, Salesforce, self-driving car, Sheryl Sandberg, Silicon Valley, Silicon Valley startup, Snapchat, Steve Jobs, Tactical Technology Collective, TED Talk, Tim Cook: Apple, Travis Kalanick, upwardly mobile, Wayback Machine, women in the workforce, work culture , zero-sum game

After all, if women interested in technology don’t exist, how could employers hire them? This is theoretical, sure: I don’t know how often Google got gender wrong back then, and I don’t know how much that affected the way the tech industry continued to be perceived. But that’s the problem: neither does Google. Proxies are naturally inexact, writes data scientist Cathy O’Neil in Weapons of Math Destruction. Even worse, they’re self-perpetuating: they “define their own reality and use it to justify their results.” 12 Now, Google doesn’t think I’m a man anymore. Sometime in the last five years, it sorted that out (not surprising, since Google now knows a lot more about me, including how often I shop for dresses and search for haircut ideas).

Mostly, the algorithm is somewhere in the middle: it finds just what you want a lot of the time, but sends you somewhere mediocre some of the time too. Yelp is also able to tune its model and improve results over time, by looking at things like how often users search, don’t like the results, and then search again. The algorithm is the core of Yelp’s product—it’s what connects users to businesses—so you can bet that data scientists are tweaking and refining this model all the time. A product like COMPAS, the criminal recidivism software, doesn’t just affect whether you opt for tacos or try a new ramen place tonight, though. It affects people’s lives: whether they can get bail, how long they will spend in prison, whether they’ll be eligible for parole.


pages: 205 words: 61,903

Survival of the Richest: Escape Fantasies of the Tech Billionaires by Douglas Rushkoff

"World Economic Forum" Davos, 4chan, A Declaration of the Independence of Cyberspace, agricultural Revolution, Airbnb, Alan Greenspan, Amazon Mechanical Turk, Amazon Web Services, Andrew Keen, AOL-Time Warner, artificial general intelligence, augmented reality, autonomous vehicles, basic income, behavioural economics, Big Tech, biodiversity loss, Biosphere 2, bitcoin, blockchain, Boston Dynamics, Burning Man, buy low sell high, Californian Ideology, carbon credits, carbon footprint, circular economy, clean water, cognitive dissonance, Colonization of Mars, coronavirus, COVID-19, creative destruction, Credit Default Swap, CRISPR, data science, David Graeber, DeepMind, degrowth, Demis Hassabis, deplatforming, digital capitalism, digital map, disinformation, Donald Trump, Elon Musk, en.wikipedia.org, energy transition, Ethereum, ethereum blockchain, European colonialism, Evgeny Morozov, Extinction Rebellion, Fairphone, fake news, Filter Bubble, game design, gamification, gig economy, Gini coefficient, global pandemic, Google bus, green new deal, Greta Thunberg, Haight Ashbury, hockey-stick growth, Howard Rheingold, if you build it, they will come, impact investing, income inequality, independent contractor, Jane Jacobs, Jeff Bezos, Jeffrey Epstein, job automation, John Nash: game theory, John Perry Barlow, Joseph Schumpeter, Just-in-time delivery, liberal capitalism, Mark Zuckerberg, Marshall McLuhan, mass immigration, megaproject, meme stock, mental accounting, Michael Milken, microplastics / micro fibres, military-industrial complex, Minecraft, mirror neurons, move fast and break things, Naomi Klein, New Urbanism, Norbert Wiener, Oculus Rift, One Laptop per Child (OLPC), operational security, Patri Friedman, pattern recognition, Peter Thiel, planetary scale, Plato's cave, Ponzi scheme, profit motive, QAnon, RAND corporation, Ray Kurzweil, rent-seeking, Richard Thaler, ride hailing / ride sharing, Robinhood: mobile stock trading app, Sam Altman, Shoshana Zuboff, 
Silicon Valley, Silicon Valley billionaire, SimCity, Singularitarianism, Skinner box, Snapchat, sovereign wealth fund, Stephen Hawking, Steve Bannon, Steve Jobs, Steven Levy, Steven Pinker, Stewart Brand, surveillance capitalism, tech billionaire, tech bro, technological solutionism, technoutopianism, Ted Nelson, TED Talk, the medium is the message, theory of mind, TikTok, Torches of Freedom, Tragedy of the Commons, universal basic income, urban renewal, warehouse robotics, We are as Gods, WeWork, Whole Earth Catalog, work culture, working poor

We would liberate ourselves from obsolete notions of God, and all become part of the same “supreme cybernetic system” he called Mind. Throughout the 1950s and 1960s, government and corporate leaders alike hoped that computers would offer new ways of measuring public opinion and then developing appropriate “mass communications” strategies for controlling all these people. Data scientists at companies from RAND to Simulmatics sought and failed to predict and steer the behavior of consumers and voters. It wasn’t until the first intentionally “sticky” websites in the mid-nineties—websites designed to keep users from surfing away—that digital technology provided the sort of controlled environment and live feedback mechanisms required to do operant conditioning en masse.

Skinner, Science and Human Behavior (New York: Macmillan, 1953). 103 “a servosystem coupled”: Fred Turner, The Democratic Surround: Multimedia and American Liberalism from World War II to the Psychedelic Sixties (Chicago: University of Chicago Press, 2013), 123. 103 “How would we rig”: Gregory Bateson, quoted in Mark Stahlman, “The Inner Senses and Human Engineering,” Dianoetikon 1 (2020): 1–26. 104 didn’t come off as nefarious: For the rich history of Bateson and Mead’s efforts in this regard, see Fred Turner’s terrific The Democratic Surround. 104 Data scientists: Jill Lepore, If Then: How the Simulmatics Corporation Invented the Future (New York: Liveright, 2020). 106 “the purpose of Behavior Design”: Stanford University, “Welcome | Behavior Design Lab,” https://captology.stanford.edu/, accessed June 18, 2018. 106 Chatbots engage: “Smartphone App to Change Your Personality,” Das Fachportal für Biotechnologie, Pharma und Life Sciences, February 15, 2021, https://www.bionity.com/en/news/1169863/smartphone-app-to-change-your-personality.html. 107 Amazon incentivizes productivity: Nick Statt, “Amazon Expands Gamification Program That Encourages Warehouse Employees to Work Harder,” Verge, March 15, 2021, https://www.theverge.com/2021/3/15/22331502/amazon-warehouse-gamification-program-expand-fc-games. 107 promote environmentally friendly behavior: Markus Brauer and Benjamin D.


pages: 425 words: 112,220

The Messy Middle: Finding Your Way Through the Hardest and Most Crucial Part of Any Bold Venture by Scott Belsky

23andMe, 3D printing, Airbnb, Albert Einstein, Anne Wojcicki, augmented reality, autonomous vehicles, behavioural economics, Ben Horowitz, bitcoin, blockchain, Chuck Templeton: OpenTable:, commoditize, correlation does not imply causation, cryptocurrency, data science, delayed gratification, DevOps, Donald Trump, Elon Musk, endowment effect, fake it until you make it, hiring and firing, Inbox Zero, iterative process, Jeff Bezos, knowledge worker, Lean Startup, Lyft, Mark Zuckerberg, Marshall McLuhan, minimum viable product, move fast and break things, NetJets, Network effects, new economy, old-boy network, Paradox of Choice, pattern recognition, Paul Graham, private spaceflight, reality distortion field, ride hailing / ride sharing, Salesforce, Sheryl Sandberg, Silicon Valley, skeuomorphism, slashdot, Snapchat, Steve Jobs, subscription business, sugar pill, systems thinking, TaskRabbit, TED Talk, the medium is the message, Tony Fadell, Travis Kalanick, Uber for X, uber lyft, WeWork, Y Combinator, young professional

Of course, to build upon ideas, everyone must understand them. Seek people who make the impossible-to-understand more accessible. One of the greater challenges leaders face on the hiring front is evaluating people with a different technical expertise. For example, how do you evaluate the skills of a cryptocurrency expert or a data scientist if you have no expertise in either one? Sure, you can get third-party opinions from others in the industry, but sometimes recruitment is confidential and your candidates have jobs elsewhere, which restricts how many people you can involve in the recruitment process. But all skills, no matter how scientific, can be explained in layman’s terms—it’s just extremely hard to do it.

The word delegate suggests that a leader is single-handedly deciding who should do what, assigning tasks, and then holding everyone responsible as any prototypical boss would. But among high-performing teams, delegation is as much sought as it is received. In such teams, there is a genuine collective drive to free up those with the rarest or least scalable talents. For example, you want your data science experts or programmers to be analyzing data and programming—not expending their precious energy on administrative work. If everyone is aligned with the mission and the market forces and is determined to do whatever they can to make the greatest impact, then the pressure to delegate should come from below as well as above.


pages: 447 words: 111,991

Exponential: How Accelerating Technology Is Leaving Us Behind and What to Do About It by Azeem Azhar

"Friedman doctrine" OR "shareholder theory", "World Economic Forum" Davos, 23andMe, 3D printing, A Declaration of the Independence of Cyberspace, Ada Lovelace, additive manufacturing, air traffic controllers' union, Airbnb, algorithmic management, algorithmic trading, Amazon Mechanical Turk, autonomous vehicles, basic income, Berlin Wall, Bernie Sanders, Big Tech, Bletchley Park, Blitzscaling, Boeing 737 MAX, book value, Boris Johnson, Bretton Woods, carbon footprint, Chris Urmson, Citizen Lab, Clayton Christensen, cloud computing, collective bargaining, computer age, computer vision, contact tracing, contact tracing app, coronavirus, COVID-19, creative destruction, crowdsourcing, cryptocurrency, cuban missile crisis, Daniel Kahneman / Amos Tversky, data science, David Graeber, David Ricardo: comparative advantage, decarbonisation, deep learning, deglobalization, deindustrialization, dematerialisation, Demis Hassabis, Diane Coyle, digital map, digital rights, disinformation, Dissolution of the Soviet Union, Donald Trump, Double Irish / Dutch Sandwich, drone strike, Elon Musk, emotional labour, energy security, Fairchild Semiconductor, fake news, Fall of the Berlin Wall, Firefox, Frederick Winslow Taylor, fulfillment center, future of work, Garrett Hardin, gender pay gap, general purpose technology, Geoffrey Hinton, gig economy, global macro, global pandemic, global supply chain, global value chain, global village, GPT-3, Hans Moravec, happiness index / gross national happiness, hiring and firing, hockey-stick growth, ImageNet competition, income inequality, independent contractor, industrial robot, intangible asset, Jane Jacobs, Jeff Bezos, job automation, John Maynard Keynes: Economic Possibilities for our Grandchildren, John Maynard Keynes: technological unemployment, John Perry Barlow, Just-in-time delivery, Kickstarter, Kiva Systems, knowledge worker, Kodak vs Instagram, Law of Accelerating Returns, lockdown, low skilled workers, lump of labour, Lyft, 
manufacturing employment, Marc Benioff, Mark Zuckerberg, megacity, Mitch Kapor, Mustafa Suleyman, Network effects, new economy, NSO Group, Ocado, offshore financial centre, OpenAI, PalmPilot, Panopticon Jeremy Bentham, Peter Thiel, Planet Labs, price anchoring, RAND corporation, ransomware, Ray Kurzweil, remote working, RFC: Request For Comment, Richard Florida, ride hailing / ride sharing, Robert Bork, Ronald Coase, Ronald Reagan, Salesforce, Sam Altman, scientific management, Second Machine Age, self-driving car, Shoshana Zuboff, Silicon Valley, Social Responsibility of Business Is to Increase Its Profits, software as a service, Steve Ballmer, Steve Jobs, Stuxnet, subscription business, synthetic biology, tacit knowledge, TaskRabbit, tech worker, The Death and Life of Great American Cities, The Future of Employment, The Nature of the Firm, Thomas Malthus, TikTok, Tragedy of the Commons, Turing machine, Uber and Lyft, Uber for X, uber lyft, universal basic income, uranium enrichment, vertical integration, warehouse automation, winner-take-all economy, workplace surveillance, Yom Kippur War

She started her project over the summer holidays, learning how to build and fine-tune convolutional neural networks and how to find and clean the data. Fortunately, Herlev Hospital in Denmark had an open-source data set of cervical smears she could use. It wasn’t straightforward. The data set was, in the jargon of a data scientist, unbalanced. It contained too many supposedly abnormal, potentially cancerous screens, and not enough healthy ones. Real-world data would be the reverse: most women have healthy smears, and a tiny number have problematic ones. This kind of inconsistency, or lack of balance, in the data Laura was using could cause problems for her system.
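The excerpt does not say how the imbalance was handled. As an illustration only, one common remedy is to weight each class inversely to its frequency, so that errors on the rarer class count for more during training. The sketch below is a minimal version of that idea; the function name, the labels, and the 8-to-2 split are hypothetical and are not taken from the book or from the Herlev data set.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Return a weight per class: n_samples / (n_classes * class_count).

    Rare classes get weights above 1 and common classes get weights
    below 1. This is an illustrative heuristic, not the method used
    in the project described in the excerpt.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {
        label: n_samples / (n_classes * count)
        for label, count in counts.items()
    }


# Hypothetical toy labels: too many "abnormal" screens, too few "healthy" ones.
labels = ["abnormal"] * 8 + ["healthy"] * 2
weights = inverse_frequency_weights(labels)
print(weights)
```

Weights of this kind are typically passed to a training loss so that the under-represented class is not ignored. Whether that suits a given screening problem depends on how the real-world class ratio differs from the training data.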

By 2013, Alipay had become the world’s largest mobile payment service, and its parent company spun the business out as Ant Financial. This was the first step in horizontal expansion: giving this new finance company its own autonomy. Ant Financial drew upon the huge amounts of transaction data Alibaba had been collecting, channelling the power of data science – through what it described in hyperbolic marketing speak as the ‘Ant Brain’ – to become an ever more dominant force. It was a classic example of the power of network effects: the more data Alibaba had, the more powerful and effective it became; the more powerful and effective it became, the more customers joined up – and the more data it had.


Exploring Everyday Things with R and Ruby by Sau Sheong Chang

Alfred Russel Wallace, bioinformatics, business process, butterfly effect, cloud computing, Craig Reynolds: boids flock, data science, Debian, duck typing, Edward Lorenz: Chaos theory, Gini coefficient, income inequality, invisible hand, p-value, price stability, Ruby on Rails, Skype, statistical model, stem cell, Stephen Hawking, text mining, The Wealth of Nations by Adam Smith, We are the 99%, web application, wikimedia commons

Introducing R Programmers are trained in logic, and our daily work mostly involves controlling and moving bits and bytes around. So when we’re faced with a chunk of data and asked to do something with it, our reactions usually involve either bolting for the nearest exit or stuffing the data into a relational database and running SQL SELECT statements on it. I’m exaggerating, of course. Most, if not all, data scientists are also programmers, and you can hardly get away with data analysis without doing some programming work. However, not all programming platforms and languages are suitable for data analysis and manipulation. There are a number of languages built for this rather specialized purpose, including MATLAB and S, as well as packages like SAS and SPSS.

The other reason why R is getting increasingly popular is that it is free. The existing batch of tools for data analysis—S, MATLAB, SPSS, and SAS—can be quite expensive, and R is a cost-effective way to achieve the same goals. Also, R has a very vibrant and active community of domain experts and developers, including statisticians and data scientists who contribute many very useful packages that enhance its overall capabilities. R is available on most major platforms, and installing it is quite straightforward. Just visit the R website (http://www.r-project.org/), download the necessary binaries or installer for your platform, and then install it accordingly.


pages: 52 words: 14,333

Growth Hacker Marketing: A Primer on the Future of PR, Marketing, and Advertising by Ryan Holiday

Aaron Swartz, Airbnb, data science, growth hacking, Hacker News, iterative process, Kickstarter, Lean Startup, Marc Andreessen, market design, minimum viable product, Multics, Paul Graham, pets.com, post-work, Silicon Valley, slashdot, Steve Wozniak, Travis Kalanick

Whereas marketing was once brand based, with growth hacking it becomes metric and ROI driven. Suddenly, finding customers and getting attention for your product become no longer a guessing game. But this is more than just marketing with better metrics. Growth hackers trace their roots back to programmers—and that’s how they see themselves. They are data scientists meets design fiends meets marketers. They welcome this information, process it and utilize it differently, and see it as desperately needed clarity in a world that has been dominated by gut instincts and artistic preference for too long. Ultimately that’s why this new approach is better suited to the future.


pages: 533 words: 125,495

Rationality: What It Is, Why It Seems Scarce, Why It Matters by Steven Pinker

affirmative action, Albert Einstein, autonomous vehicles, availability heuristic, Ayatollah Khomeini, backpropagation, basic income, behavioural economics, belling the cat, Black Lives Matter, butterfly effect, carbon tax, Cass Sunstein, choice architecture, classic study, clean water, Comet Ping Pong, coronavirus, correlation coefficient, correlation does not imply causation, COVID-19, critical race theory, crowdsourcing, cuban missile crisis, Daniel Kahneman / Amos Tversky, data science, David Attenborough, deep learning, defund the police, delayed gratification, disinformation, Donald Trump, Dr. Strangelove, Easter island, effective altruism, en.wikipedia.org, Erdős number, Estimating the Reproducibility of Psychological Science, fake news, feminist movement, framing effect, George Akerlof, George Floyd, germ theory of disease, high batting average, if you see hoof prints, think horses—not zebras, index card, Jeff Bezos, job automation, John Nash: game theory, John von Neumann, libertarian paternalism, Linda problem, longitudinal study, loss aversion, Mahatma Gandhi, meta-analysis, microaggression, Monty Hall problem, Nash equilibrium, New Journalism, Paul Erdős, Paul Samuelson, Peter Singer: altruism, Pierre-Simon Laplace, placebo effect, post-truth, power law, QAnon, QWERTY keyboard, Ralph Waldo Emerson, randomized controlled trial, replication crisis, Richard Thaler, scientific worldview, selection bias, social discount rate, social distancing, Social Justice Warrior, Stanford marshmallow experiment, Steve Bannon, Steven Pinker, sunk-cost fallacy, TED Talk, the scientific method, Thomas Bayes, Tragedy of the Commons, trolley problem, twin studies, universal basic income, Upton Sinclair, urban planning, Walter Mischel, yellow journalism, zero-sum game

Paul Slovic, a collaborator of Tversky and Kahneman, showed that people also overestimate the danger from threats that are novel (the devil they don’t know instead of the devil they do), out of their control (as if they can drive more safely than a pilot can fly), human-made (so they avoid genetically modified foods but swallow the many toxins that evolved naturally in plants), and inequitable (when they feel they assume a risk for another’s gain).23 When these bugbears combine with the prospect of a disaster that kills many people at once, the sum of all fears becomes a dread risk. Plane crashes, nuclear meltdowns, and terrorist attacks are prime examples. Terrorism, like other losses of life with malice aforethought, brews up a different chemistry of fear. Body-counting data scientists are often perplexed at the way that highly publicized but low-casualty killings can lead to epochal societal reactions. The worst terrorist attack in history by far was 9/11, and it claimed 3,000 lives; in most bad years, the United States suffers a few dozen terrorist deaths, a rounding error in the tally of homicides and accidents.

Increasingly, “randomistas” are urging policymakers to test their nostrums in one set of randomly selected villages, classes, or neighborhoods, and compare the results against a control group which is put on a waitlist or given some meaningless make-work program.26 The knowledge gained is likely to outperform traditional ways of evaluating policies, like dogma, folklore, charisma, conventional wisdom, and HiPPO (highest-paid person’s opinion). Randomized experiments are no panacea (since nothing is a panacea, which is a good reason to retire that cliché). Laboratory scientists snipe at each other as much as correlational data scientists, because even in an experiment you can’t do just one thing. Experimenters may think that they have administered a treatment and only that treatment to the experimental group, but other variables may be confounded with it, a problem called excludability. According to a joke, a sexually unfulfilled couple consults a rabbi with their problem, since it is written in the Talmud that a husband is responsible for his wife’s sexual pleasure.

While a low channel number (I) can cause people to watch Fox News (A), and watching Fox News may or may not cause them to vote Republican (B), neither having conservative views (C) nor voting Republican can cause someone’s favorite television station to skitter down the cable dial. Sure enough, in a comparison across cable markets, the lower the channel number of Fox News relative to other news networks, the larger the Republican vote.29 From Correlation to Causation without Experimentation When a data scientist finds a regression discontinuity or an instrumental variable, it’s a really good day. But more often they have to squeeze what causation they can out of the usual correlational tangle. All is not lost, though, because there are palliatives for each of the ailments that enfeeble causal inference.


pages: 240 words: 74,182

This Is Not Propaganda: Adventures in the War Against Reality by Peter Pomerantsev

4chan, active measures, anti-communist, Bellingcat, Berlin Wall, Black Lives Matter, call centre, Cambridge Analytica, citizen journalism, data science, Day of the Dead, desegregation, disinformation, Donald Trump, Etonian, European colonialism, fake news, Fall of the Berlin Wall, feminist movement, illegal immigration, mass immigration, mega-rich, megacity, Mikhail Gorbachev, post-truth, side hustle, Skype, South China Sea

Throughout the book I will travel, some of the time through space, but not always. The physical and political maps delineating continents, countries and oceans, the maps I grew up with, can be less important than the new maps of information flows. These ‘network maps’ are generated by data scientists. They call the process ‘surfacing’. One takes a keyword, a message, a narrative and casts it into the ever-expanding pool of the world’s data. The data scientist then ‘surfaces’ the people, media outlets, social media accounts, bots, trolls and cyborgs pushing or interacting with those keywords, narratives and messages. These network maps, which look like fields of pin mould or photographs of distant galaxies, show how outdated our geographic definitions are, revealing unexpected constellations where anyone from anywhere can influence everyone everywhere.


pages: 472 words: 117,093

Machine, Platform, Crowd: Harnessing Our Digital Future by Andrew McAfee, Erik Brynjolfsson

"World Economic Forum" Davos, 3D printing, additive manufacturing, AI winter, Airbnb, airline deregulation, airport security, Albert Einstein, algorithmic bias, AlphaGo, Amazon Mechanical Turk, Amazon Web Services, Andy Rubin, AOL-Time Warner, artificial general intelligence, asset light, augmented reality, autism spectrum disorder, autonomous vehicles, backpropagation, backtesting, barriers to entry, behavioural economics, bitcoin, blockchain, blood diamond, British Empire, business cycle, business process, carbon footprint, Cass Sunstein, centralized clearinghouse, Chris Urmson, cloud computing, cognitive bias, commoditize, complexity theory, computer age, creative destruction, CRISPR, crony capitalism, crowdsourcing, cryptocurrency, Daniel Kahneman / Amos Tversky, data science, Dean Kamen, deep learning, DeepMind, Demis Hassabis, discovery of DNA, disintermediation, disruptive innovation, distributed ledger, double helix, driverless car, Elon Musk, en.wikipedia.org, Erik Brynjolfsson, Ethereum, ethereum blockchain, everywhere but in the productivity statistics, Evgeny Morozov, fake news, family office, fiat currency, financial innovation, general purpose technology, Geoffrey Hinton, George Akerlof, global supply chain, Great Leap Forward, Gregor Mendel, Hernando de Soto, hive mind, independent contractor, information asymmetry, Internet of things, inventory management, iterative process, Jean Tirole, Jeff Bezos, Jim Simons, jimmy wales, John Markoff, joint-stock company, Joseph Schumpeter, Kickstarter, Kiva Systems, law of one price, longitudinal study, low interest rates, Lyft, Machine translation of "The spirit is willing, but the flesh is weak." to Russian and back, Marc Andreessen, Marc Benioff, Mark Zuckerberg, meta-analysis, Mitch Kapor, moral hazard, multi-sided market, Mustafa Suleyman, Myron Scholes, natural language processing, Network effects, new economy, Norbert Wiener, Oculus Rift, PageRank, pattern recognition, peer-to-peer lending, performance metric, plutocrats, precision agriculture, prediction markets, pre–internet, price stability, principal–agent problem, Project Xanadu, radical decentralization, Ray Kurzweil, Renaissance Technologies, Richard Stallman, ride hailing / ride sharing, risk tolerance, Robert Solow, Ronald Coase, Salesforce, Satoshi Nakamoto, Second Machine Age, self-driving car, sharing economy, Silicon Valley, Skype, slashdot, smart contracts, Snapchat, speech recognition, statistical model, Steve Ballmer, Steve Jobs, Steven Pinker, supply-chain management, synthetic biology, tacit knowledge, TaskRabbit, Ted Nelson, TED Talk, the Cathedral and the Bazaar, The Market for Lemons, The Nature of the Firm, the strength of weak ties, Thomas Davenport, Thomas L Friedman, too big to fail, transaction costs, transportation-network company, traveling salesman, Travis Kalanick, Two Sigma, two-sided market, Tyler Cowen, Uber and Lyft, Uber for X, uber lyft, ubercab, Vitalik Buterin, warehouse robotics, Watson beat the top human players on Jeopardy!, winner-take-all economy, yield management, zero day

The members of Topcoder’s global community include not only programmers, but also people who identify as designers, students, data scientists, and physicists. Topcoder offers this crowd a series of corporate projects, lets them self-select into teams and into roles, stitches all their work together, and monitors quality. It uses monetary and competitive rewards, along with a bit of oversight, to create Linux-like efforts for its clients. Kaggle does the same thing for data science competitions. Finding the right resource. Sometimes you don’t want to bring together an entire crowd; you simply want to find, as quickly and efficiently as possible, the right person or team to help with something.


pages: 424 words: 123,180

Democracy's Data: The Hidden Stories in the U.S. Census and How to Read Them by Dan Bouk

Black Lives Matter, card file, COVID-19, dark matter, data science, desegregation, digital map, Donald Trump, George Floyd, germ theory of disease, government statistician, hiring and firing, illegal immigration, index card, invisible hand, Jeff Bezos, linked data, Mahatma Gandhi, mass incarceration, public intellectual, pull request, Ralph Waldo Emerson, Scientific racism, Shoshana Zuboff, Silicon Valley, social distancing, surveillance capitalism, transcontinental railway, union organizing, W. E. B. Du Bois, Works Progress Administration, zero-sum game

Reading the data deeply now, we can see her and acknowledge her, and her choice to live outside the bounds of a patriarchal household, and the unseen negotiations that made her, maybe only briefly, a “partner.” MAPS REVEAL MARGINS, WHERE PARTNERS CLUSTER This map, prepared for me by the data scientist Stephanie Jordan, nicely illustrates the occurrence of “partner” labels across the nation (but excluding territories, like Hawaii): they come in clusters. I went looking for “partners” in marginal (edgy) neighborhoods, which could be seen on maps created by risk evaluators for the Home Owners’ Loan Corporation (HOLC) to support federal mortgage lending.

At Johns Hopkins University, a team of digital humanists (including Kim Gallon, Jeremy Greene, Jessica Marie Johnson, and Alexandré White) in the Black Beyond Data project work to uncover the racist histories of data sets while also imagining new and better ways to make meaningful data about Black lives. 11.  A useful text for those who want a more thorough introduction to doing data (science) differently is D’Ignazio and Klein, Data Feminism. 12.  Jesse Jones to Franklin D. Roosevelt, November 29, 1940, in Folder “Speeches, Articles, and Papers,” Box 4, Entry 229, RG 29, NARA, D.C. 13.  For the history of apportionment and the methods used, see Michel L. Balinski and H. Peyton Young, Fair Representation: Meeting the Ideal of One Man, One Vote (Washington, D.C.: Brookings Institution Press, 2001). 14.  


pages: 301 words: 89,076

The Globotics Upheaval: Globalisation, Robotics and the Future of Work by Richard Baldwin

agricultural Revolution, Airbnb, AlphaGo, AltaVista, Amazon Web Services, Apollo 11, augmented reality, autonomous vehicles, basic income, Big Tech, bread and circuses, business process, business process outsourcing, call centre, Capital in the Twenty-First Century by Thomas Piketty, Cass Sunstein, commoditize, computer vision, Corn Laws, correlation does not imply causation, Credit Default Swap, data science, David Ricardo: comparative advantage, declining real wages, deep learning, DeepMind, deindustrialization, deskilling, Donald Trump, Douglas Hofstadter, Downton Abbey, Elon Musk, Erik Brynjolfsson, facts on the ground, Fairchild Semiconductor, future of journalism, future of work, George Gilder, Google Glasses, Google Hangouts, Hans Moravec, hiring and firing, hype cycle, impulse control, income inequality, industrial robot, intangible asset, Internet of things, invisible hand, James Watt: steam engine, Jeff Bezos, job automation, Kevin Roose, knowledge worker, laissez-faire capitalism, Les Trente Glorieuses, low skilled workers, machine translation, Machine translation of "The spirit is willing, but the flesh is weak." to Russian and back, manufacturing employment, Mark Zuckerberg, mass immigration, mass incarceration, Metcalfe’s law, mirror neurons, new economy, optical character recognition, pattern recognition, Ponzi scheme, post-industrial society, post-work, profit motive, remote working, reshoring, ride hailing / ride sharing, Robert Gordon, Robert Metcalfe, robotic process automation, Ronald Reagan, Salesforce, San Francisco homelessness, Second Machine Age, self-driving car, side project, Silicon Valley, Skype, Snapchat, social intelligence, sovereign wealth fund, standardized shipping container, statistical model, Stephen Hawking, Steve Jobs, supply-chain management, systems thinking, TaskRabbit, telepresence, telepresence robot, telerobotics, Thomas Malthus, trade liberalization, universal basic income, warehouse automation

A recent study by the consulting firm Forrester suggests that 16 percent of all US jobs will be displaced by automation in the next ten years.9 That is one out of every six jobs. The professions hardest hit are forecast to be those that employ office workers. Forrester, however, notes that about half of the job destruction will be matched by job creation equal to 9 percent of today’s jobs. The study points to “robot monitoring professionals,” data scientists, automation specialists, and content curators as the biggest sources of new tech-related jobs. On net, Forrester forecasts that the impact will be a loss of 7 percent of jobs. That is still one out of every fourteen jobs. A recent World Economic Forum study, which is based on a survey of high-level corporate human resource types, put the number much lower.

Another aspect of RPA may dial-up the outrage factor even more. The workers being replaced will be training their robot replacements. Here is how one RPA software company explains it. “WorkFusion automates the time-consuming process of training and selecting machine learning algorithms . . . WorkFusion’s Virtual Data Scientist uses historical data and real-time human actions to train models to automate judgment work in a business process, like categorizing and extracting unstructured information.” This thing, in other words, is a white-collar robot that figures out what parts of the job can be done by a white-collar robot.


AI 2041 by Kai-Fu Lee, Chen Qiufan

3D printing, Abraham Maslow, active measures, airport security, Albert Einstein, AlphaGo, Any sufficiently advanced technology is indistinguishable from magic, artificial general intelligence, augmented reality, autonomous vehicles, basic income, bitcoin, blockchain, blue-collar work, Cambridge Analytica, carbon footprint, Charles Babbage, computer vision, contact tracing, coronavirus, corporate governance, corporate social responsibility, COVID-19, CRISPR, cryptocurrency, DALL-E, data science, deep learning, deepfake, DeepMind, delayed gratification, dematerialisation, digital map, digital rights, digital twin, Elon Musk, fake news, fault tolerance, future of work, Future Shock, game design, general purpose technology, global pandemic, Google Glasses, Google X / Alphabet X, GPT-3, happiness index / gross national happiness, hedonic treadmill, hiring and firing, Hyperloop, information security, Internet of things, iterative process, job automation, language acquisition, low earth orbit, Lyft, Maslow's hierarchy, mass immigration, mirror neurons, money: store of value / unit of account / medium of exchange, mutually assured destruction, natural language processing, Neil Armstrong, Nelson Mandela, OpenAI, optical character recognition, pattern recognition, plutocrats, post scarcity, profit motive, QR code, quantitative easing, Richard Feynman, ride hailing / ride sharing, robotic process automation, Satoshi Nakamoto, self-driving car, seminal paper, Silicon Valley, smart cities, smart contracts, smart transportation, Snapchat, social distancing, speech recognition, Stephen Hawking, synthetic biology, telemarketer, Tesla Model S, The future is already here, trolley problem, Turing test, uber lyft, universal basic income, warehouse automation, warehouse robotics, zero-sum game

For example, the future doctor will still be the primary point of contact trusted by the patient but will rely on AI diagnostic tools to determine the best treatment. This will redirect the doctor’s role to that of a compassionate caregiver, giving them more time with their patients. Just as the mobile Internet led to roles like the Uber driver, the coming of AI will create jobs we cannot even conceive of yet. Examples today include AI engineers, data scientists, data-labelers, and robot mechanics. But we don’t yet know and cannot predict many of these new professions, just as in 2001 we couldn’t have known about Uber drivers. We should watch for the emergence of these roles, make people aware of them, and provide training for them. RENAISSANCE Finally, with the right training and the right tools, we can expect an AI-led renaissance that will enable and celebrate creativity, compassion, and humanity.

As a result, human life expectancy increased from thirty-one years in 1900 to seventy-two years in 2017. Today, I believe we are at the cusp of another revolution for healthcare, in which digitization will enable the application of all data technologies from computing, communications, mobile, robotics, data science, and, most important, AI. First, existing healthcare databases and processes will be digitized, including patient records, drug efficacy, medical instruments, wearable devices, clinical trials, quality-of-care surveillance, infectious-disease-spread data, as well as supplies of drugs and vaccines.


pages: 330 words: 91,805

Peers Inc: How People and Platforms Are Inventing the Collaborative Economy and Reinventing Capitalism by Robin Chase

Airbnb, Amazon Web Services, Andy Kessler, Anthropocene, Apollo 13, banking crisis, barriers to entry, basic income, Benevolent Dictator For Life (BDFL), bike sharing, bitcoin, blockchain, Burning Man, business climate, call centre, car-free, carbon tax, circular economy, cloud computing, collaborative consumption, collaborative economy, collective bargaining, commoditize, congestion charging, creative destruction, crowdsourcing, cryptocurrency, data science, deal flow, decarbonisation, different worldview, do-ocracy, don't be evil, Donald Shoup, Elon Musk, en.wikipedia.org, Ethereum, ethereum blockchain, Eyjafjallajökull, Ferguson, Missouri, Firefox, Free Software Foundation, frictionless, Gini coefficient, GPS: selective availability, high-speed rail, hive mind, income inequality, independent contractor, index fund, informal economy, Intergovernmental Panel on Climate Change (IPCC), Internet of things, Jane Jacobs, Jeff Bezos, jimmy wales, job satisfaction, Kickstarter, Kinder Surprise, language acquisition, Larry Ellison, Lean Startup, low interest rates, Lyft, machine readable, means of production, megacity, Minecraft, minimum viable product, Network effects, new economy, Oculus Rift, off-the-grid, openstreetmap, optical character recognition, pattern recognition, peer-to-peer, peer-to-peer lending, peer-to-peer model, Post-Keynesian economics, Richard Stallman, ride hailing / ride sharing, Ronald Coase, Ronald Reagan, Salesforce, Satoshi Nakamoto, Search for Extraterrestrial Intelligence, self-driving car, shareholder value, sharing economy, Silicon Valley, six sigma, Skype, smart cities, smart grid, Snapchat, sovereign wealth fund, Steve Crocker, Steve Jobs, Steven Levy, TaskRabbit, The Death and Life of Great American Cities, The Future of Employment, the long tail, The Nature of the Firm, Tragedy of the Commons, transaction costs, Turing test, turn-by-turn navigation, Uber and Lyft, uber lyft, vertical integration, Zipcar

The challenges included tracking asteroids given a specific set of NASA data; delivering email and calendar updates between Earth and the International Space Station (44,000 miles distant) reliably, safely, and securely; tracking food intake for space travelers; and using algorithms to crunch seventeen years’ worth of data from the Saturn-orbiting Cassini rocket and uncover interesting patterns in ring phenomena and structure, or detect new moons. The TopCoder community, consisting of 630,000 data scientists, developers, and designers, was offered these and other challenges in 2013 and 2014.30 So far, NASA has received thousands of different submissions from more than twenty countries. By mid-2014, the contest winners had taken home over $1.5 million. Jason Crusan, director of the Advanced Exploration Systems Division, said that “tapping into a diverse pool of the world’s top technical talent has not only resulted in new and innovative ways to advance technologies to further space exploration, but has also led to a whole new way of thinking for NASA, and other government agencies, providing us with an additional set of on-demand tools to tackle complex projects.”31 Karim Lakhani, who has extensively investigated the way communities and contests can be used for innovation and has run large-scale experiments for both Harvard Medical School and NASA, told me that his analysis of more than 150 scientific contests revealed that “the best solutions came from solvers who had expertise that was quite far from the problem domain.

I’ll talk more in Chapter 7 about how and when government might play a role in protecting the rights of peers. Last year Facebook experienced a public relations fiasco: the aftermath from the results of an experiment it had let researchers conduct inside the Facebook community. For one week in January 2012, data scientists skewed the news feed of almost 700,000 Facebook users so that they saw either happier or sadder news. At the end of the week, those who had seen happier news feeds themselves posted more upbeat status updates; those who had seen the more pessimistic news feeds posted more negative updates. The report, “Experimental Evidence of Massive-Scale Emotional Contagion through Social Networks,” was published in the June 2014 issue of the Proceedings of the National Academy of Sciences of the United States of America.32 Danah Boyd, a principal researcher at Microsoft in the area of social media, commented on the fallout that had filled Twitter, Facebook, blogs, and the mainstream media for days: “What’s at stake is the underlying dynamic of how Facebook runs its business, operates its system, and makes decisions that have nothing to do with how its users want Facebook to operate.


pages: 358 words: 93,969

Climate Change by Joseph Romm

biodiversity loss, carbon footprint, carbon tax, clean tech, Climatic Research Unit, data science, decarbonisation, demand response, disinformation, Douglas Hofstadter, electricity market, Elon Musk, energy security, energy transition, failed state, gigafactory, hydraulic fracturing, hydrogen economy, Intergovernmental Panel on Climate Change (IPCC), knowledge worker, mass immigration, ocean acidification, performance metric, renewable energy transition, ride hailing / ride sharing, Ronald Reagan, Silicon Valley, Silicon Valley startup, the scientific method

If you fill these data gaps using satellite measurements, the warming trend is more than doubled in the widely used HadCRUT4 data, and the much-discussed “warming pause” has virtually disappeared. Figure 1.3 The corrected data (bold lines) are shown compared to the uncorrected ones (thin lines). Source: Kevin Cowtan and Robert Way When you include all of the data scientists have (through 2012), surface air temperatures have continued to rise globally in the last decade (see Figure 1.3), but at what appears to be a slightly slower rate than in previous decades. Why is that? A 2011 study removed the “noise” of natural climate variability from the temperature record to reveal the true global warming signal.11 That noise is “the estimated impact of known factors on short-term temperature variations (El Niño/southern oscillation, volcanic aerosols and solar variability).”

However, in fact, the coming multidecadal megadroughts will be much worse than the Dust Bowl of the 1930s, “worse than anything seen during the last 2000 years,” as a major 2014 Cornell-led study put it. They will be the kind of megadroughts that in the past destroyed entire civilizations.30 In that 2014 Journal of Climate study, “Assessing the Risk of Persistent Drought Using Climate Model Simulations and Paleoclimate Data,” scientists quantified the risk of devastating, prolonged drought in the southwestern U.S. and the world due to global warming. Researchers from Cornell, University of Arizona, and the U.S. Geological Survey concluded “the risk of a decade-scale megadrought in the coming century [in the Southwest] is at least 80%, and may be higher than 90% in certain areas.”


pages: 98 words: 25,753

Ethics of Big Data: Balancing Risk and Innovation by Kord Davis, Doug Patterson

4chan, business process, corporate social responsibility, crowdsourcing, data science, en.wikipedia.org, longitudinal study, Mahatma Gandhi, Mark Zuckerberg, Netflix Prize, Occupy movement, off-the-grid, performance metric, Robert Bork, side project, smart grid, urban planning

They run a Hadoop cluster of nearly 100 machines, process near real-time analytics reporting with Pentaho, and are experimenting with ways to enhance their customers’ ability to analyze their own datasets using R for statistical analysis and graphics. Their combined customer data sets exceed 100 terabytes and are growing daily. Further, they are especially excited about a powerful new opportunity their data scientists have uncovered that would integrate some of the data in their customers’ databases with other customer data to enhance and expand the value of the services they offer for everyone. They are aware, however, that performing such correlations must be done in a highly secure environment, and a rigorous test plan is designed and implemented.


pages: 317 words: 100,414

Superforecasting: The Art and Science of Prediction by Philip Tetlock, Dan Gardner

Affordable Care Act / Obamacare, Any sufficiently advanced technology is indistinguishable from magic, availability heuristic, behavioural economics, Black Swan, butterfly effect, buy and hold, cloud computing, cognitive load, cuban missile crisis, Daniel Kahneman / Amos Tversky, data science, desegregation, drone strike, Edward Lorenz: Chaos theory, forward guidance, Freestyle chess, fundamental attribution error, germ theory of disease, hindsight bias, How many piano tuners are there in Chicago?, index fund, Jane Jacobs, Jeff Bezos, Kenneth Arrow, Laplace demon, longitudinal study, Mikhail Gorbachev, Mohammed Bouazizi, Nash equilibrium, Nate Silver, Nelson Mandela, obamacare, operational security, pattern recognition, performance metric, Pierre-Simon Laplace, place-making, placebo effect, precautionary principle, prediction markets, quantitative easing, random walk, randomized controlled trial, Richard Feynman, Richard Thaler, Robert Shiller, Ronald Reagan, Saturday Night Live, scientific worldview, Silicon Valley, Skype, statistical model, stem cell, Steve Ballmer, Steve Jobs, Steven Pinker, tacit knowledge, tail risk, the scientific method, The Signal and the Noise by Nate Silver, The Wisdom of Crowds, Thomas Bayes, Watson beat the top human players on Jeopardy!

It would be facile to reduce superforecasting to a bumper-sticker slogan, but if I had to, that would be it. 6 Superquants? We live in the era of Big Data. Vast, proliferating information-technology networks churn out staggering quantities of information that can be analyzed by data scientists armed with powerful computers and arcane math. Order and meaning are extracted. Reality is seen and foreseen like never before. And most of us—let’s be honest with ourselves—don’t have the dimmest idea how data scientists do what they do. We find it a little intimidating, if not dazzling. As the scientist and science fiction writer Arthur C. Clarke famously observed, “Any sufficiently advanced technology is indistinguishable from magic.”


pages: 368 words: 96,825

Bold: How to Go Big, Create Wealth and Impact the World by Peter H. Diamandis, Steven Kotler

3D printing, additive manufacturing, adjacent possible, Airbnb, Amazon Mechanical Turk, Amazon Web Services, Apollo 11, augmented reality, autonomous vehicles, Boston Dynamics, Charles Lindbergh, cloud computing, company town, creative destruction, crowdsourcing, Daniel Kahneman / Amos Tversky, data science, deal flow, deep learning, dematerialisation, deskilling, disruptive innovation, driverless car, Elon Musk, en.wikipedia.org, Exxon Valdez, fail fast, Fairchild Semiconductor, fear of failure, Firefox, Galaxy Zoo, Geoffrey Hinton, Google Glasses, Google Hangouts, gravity well, hype cycle, ImageNet competition, industrial robot, information security, Internet of things, Jeff Bezos, John Harrison: Longitude, John Markoff, Jono Bacon, Just-in-time delivery, Kickstarter, Kodak vs Instagram, Law of Accelerating Returns, Lean Startup, life extension, loss aversion, Louis Pasteur, low earth orbit, Mahatma Gandhi, Marc Andreessen, Mark Zuckerberg, Mars Rover, meta-analysis, microbiome, minimum viable product, move fast and break things, Narrative Science, Netflix Prize, Network effects, Oculus Rift, OpenAI, optical character recognition, packet switching, PageRank, pattern recognition, performance metric, Peter H. Diamandis: Planetary Resources, Peter Thiel, pre–internet, Ray Kurzweil, recommendation engine, Richard Feynman, ride hailing / ride sharing, risk tolerance, rolodex, Scaled Composites, self-driving car, sentiment analysis, shareholder value, Sheryl Sandberg, Silicon Valley, Silicon Valley startup, skunkworks, Skype, smart grid, SpaceShipOne, stem cell, Stephen Hawking, Steve Jobs, Steven Levy, Stewart Brand, Stuart Kauffman, superconnector, Susan Wojcicki, synthetic biology, technoutopianism, TED Talk, telepresence, telepresence robot, Turing test, urban renewal, Virgin Galactic, Wayback Machine, web application, X Prize, Y Combinator, zero-sum game

A great example of this is TopCoder (www.topcoder.com). You’ve probably heard about hackathons—those mysterious tournaments where coders compete to see who can hack together the best piece of software in a weekend. Well, with TopCoder, now you can have over 600,000 developers, designers, and data scientists hacking away to create solutions just for you. In fields like software and algorithm development, where there are many ways to solve a problem, having multiple submissions lets you compare performance metrics and choose the best one. Or take Gigwalk, a crowdsourced information-gathering platform that pays a small denomination to incentivize the crowd (i.e., anyone who has the Gigwalk app) to perform a simple task at a particular place and time.

Unfortunately, not everyone knows how to tease out valuable insights from this deluge. Enter companies like Kaggle (www.kaggle.com) and TopCoder (www.topcoder.com), both of which are crowdsourcing, data-mining competition platforms that allow you to define your goal/desired insight, set a monetary prize, upload your data, and watch as hordes of data scientists (tens of thousands, to be exact) figure out the best way to sort through it. The best algorithm wins. The reward levels vary from kudos or zero dollars to hundreds of thousands of dollars from bigger companies. And for exponential entrepreneurs, not relying on the advantages of data is no longer an option.


pages: 391 words: 99,963

The Weather of the Future by Heidi Cullen

2013 Report for America's Infrastructure - American Society of Civil Engineers - 19 March 2013, air freight, American Society of Civil Engineers: Report Card, availability heuristic, back-to-the-land, bank run, California gold rush, carbon footprint, clean water, colonial rule, data science, Easter island, energy security, hindcast, illegal immigration, Intergovernmental Panel on Climate Change (IPCC), Isaac Newton, Kickstarter, mass immigration, Medieval Warm Period, megacity, millennium bug, ocean acidification, out of africa, Silicon Valley, smart cities, trade route, urban planning, Y2K

Given the 1.3°F of warming we’ve already put into the system, it’s a temperature range that could easily be in the cards during this century. A fundamental problem is that existing models of ice sheets are unable to explain the speed of the recent changes in the GIS that GRACE and IceSat are observing. In other words, the models cannot reproduce the data. Scientists such as Scott Luthcke are seeing things happen in Greenland right now that, technically, the models don’t show as happening for another thirty years.22 Even if some temperature threshold is passed, the IPCC gives a 1,000-year timescale for a total collapse of the GIS. But, given the inability of current models to simulate the rapid disappearance of continental ice right now, let alone at the end of the last ice age, a lower limit of 300 years is conceivable.23 I met up with Steffensen, Severinghaus, and other scientists from the NEEM project in Kangerlussuaq, a former Cold War outpost for the U.S.

Climate Central used a common technique to translate large-scale climate information from the computer models to provide useful information about local and regional conditions. This method involves calculating differences between time series data from current and future global climate model simulations and then adding these changes to time series of observed climate data. Scientists at Climate Central first identified weather observation stations closest to each city, as well as the closest point in the output of computer models, which is known as a grid point. For the station data, Climate Central examined temperature information for the summer months (June, July, and August) during two twenty-year periods to determine how extreme heat events have evolved during the twentieth century.
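The change-adding step described in the excerpt above can be sketched in a few lines. This is a minimal illustration, not Climate Central's actual code: the function name `delta_downscale`, the use of a single mean change, and all temperature values are assumptions made for the example.

```python
def delta_downscale(observed, model_current, model_future):
    """Shift an observed series by the change a climate model projects.

    Assumption: the "difference" is taken between the means of the
    future and current model simulations at the nearest grid point.
    """
    mean_current = sum(model_current) / len(model_current)
    mean_future = sum(model_future) / len(model_future)
    delta = mean_future - mean_current  # model-projected change
    return [value + delta for value in observed]


# Hypothetical summer-mean temperatures in degrees Fahrenheit.
observed = [84.0, 86.5, 85.0]        # station observations
model_current = [82.0, 83.0, 84.0]   # model output, current period
model_future = [85.0, 86.0, 87.0]    # model output, future period

projected = delta_downscale(observed, model_current, model_future)
print(projected)
```

Only the model's change is carried over to the station record, so any constant offset between the model and the station (here the model runs cooler than the observations) cancels out of the projection.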


The Knowledge Machine: How Irrationality Created Modern Science by Michael Strevens

Albert Einstein, Albert Michelson, anthropic principle, Arthur Eddington, Atul Gawande, coronavirus, COVID-19, dark matter, data science, Eddington experiment, Edmond Halley, Fellow of the Royal Society, fudge factor, germ theory of disease, Great Leap Forward, Gregor Mendel, heat death of the universe, Higgs boson, Intergovernmental Panel on Climate Change (IPCC), invention of movable type, invention of the telescope, Isaac Newton, Islamic Golden Age, Johannes Kepler, Large Hadron Collider, longitudinal study, Louis Pasteur, military-industrial complex, Murray Gell-Mann, Peace of Westphalia, Richard Feynman, Stephen Hawking, Steven Pinker, systematic bias, Thales of Miletus, the scientific method, Thomas Bayes, William of Occam

Over the past few decades, the answers have come in. They are almost entirely negative. There is little evidence, as you will see, for a dispassionate Popperian critical spirit, but also little evidence for universal subservience to a paradigm. Indeed, in their thinking about the connection between theory and data, scientists seem scarcely to follow any rules at all. CHAPTER 2 Human Frailty Scientists are too contentious and too morally and intellectually fragile to follow any method consistently. AS THE MOON’S DISK CREPT across the face of the sun on May 29, 1919, a new science of gravity hung in the balance.

Eddington had to make a choice. Discount the astrographic data? Overlook the 4-inch discrepancy? Declare the experiment to be inconclusive? He did not have enough information to single out an obviously correct answer. So he followed his instincts. Eddington’s situation was not at all unusual. In the interpretation of data, scientists often have great room for maneuver and all too seldom have unambiguous guidance as to which maneuvers are objectively right and wrong. The room for maneuver exists because, as the eclipse experiment shows, theories in themselves do not make predictions about what will be observed. To say anything at all about the experimental outcome—about, say, the position of spots on a photographic plate—theories must be supported and helped along by other posits, other presumptions about the proper functioning of the experimental apparatus, the suitability of the background conditions, and more.


pages: 285 words: 98,832

The Premonition: A Pandemic Story by Michael Lewis

"World Economic Forum" Davos, Airbnb, contact tracing, coronavirus, COVID-19, dark matter, data science, deep learning, Donald Davies, Donald Trump, double helix, energy security, facts on the ground, failed state, gentleman farmer, global supply chain, illegal immigration, Marc Benioff, Mark Zuckerberg, out of africa, precautionary principle, QAnon, rolling blackouts, Ronald Reagan, Salesforce, Silicon Valley, social distancing, Social Justice Warrior, stem cell, tech bro, telemarketer, the new new thing, working poor, young professional

Newsom’s economic adviser called Park immediately and asked him if he could help the state figure out what to do about the coronavirus. Park recruited a pair of former Obama administration officials: Bob Kocher, a doctor turned venture capitalist who had advised Obama about health care, and DJ Patil,* who had served as the country’s first chief data scientist. Patil pulled together a team of some of the best programmers in Silicon Valley, and the team instantly began to collect data that would help them to project and predict. In a couple of days, they had everything from the number of beds in intensive care units to data from toll booths and cell phone companies that gave them a feel for how people moved around inside the state.

And as she waited, her governor, in whom she still had faith, did the sort of thing that might have given her even more of it. He called the Red Phone. * I wrote about DJ Patil in The Fifth Risk. Working with a friend at LinkedIn, and needing a description for a new kind of job in the economy, DJ had coined the phrase “data scientist.” † And not just in California. The two other states that moved most quickly to shut down, Ohio and Maryland, had also paid close attention to Carter’s analysis. ‡ Slavitt renamed the plan “Victory over COVID-19” and presented it to Kushner as his own. PART III TEN The Bug in the System The Red Phone had always been a less than perfectly efficient tool for saving lives.


Forward: Notes on the Future of Our Democracy by Andrew Yang

2013 Report for America's Infrastructure - American Society of Civil Engineers - 19 March 2013, Affordable Care Act / Obamacare, Amazon Web Services, American Society of Civil Engineers: Report Card, basic income, benefit corporation, Bernie Sanders, blockchain, blue-collar work, call centre, centre right, clean water, contact tracing, coronavirus, correlation does not imply causation, COVID-19, data is the new oil, data science, deepfake, disinformation, Donald Trump, facts on the ground, fake news, forensic accounting, future of work, George Floyd, gig economy, global pandemic, income inequality, independent contractor, Jaron Lanier, Jeff Bezos, job automation, Kevin Roose, labor-force participation, Marc Benioff, Mark Zuckerberg, medical bankruptcy, new economy, obamacare, opioid epidemic / opioid crisis, pez dispenser, QAnon, recommendation engine, risk tolerance, rolodex, Ronald Reagan, Rutger Bregman, Sam Altman, Saturday Night Live, shareholder value, Shoshana Zuboff, Silicon Valley, Simon Kuznets, single-payer health, Snapchat, social distancing, SoftBank, surveillance capitalism, systematic bias, tech billionaire, TED Talk, The Day the Music Died, the long tail, TikTok, universal basic income, winner-take-all economy, working poor

They found that a false story was much more likely to go viral; fake news was six times faster to reach fifteen hundred people than something accurate. This was the case in every subject area—business, foreign affairs, science, and technology. “It seems to be pretty clear that false information outperforms true information,” Soroush Vosoughi, an MIT data scientist who led the study, told a reporter for The Atlantic. The tendency seemed most acute in one subject category: political content. “The key takeaway,” Rebekah Tromble, a professor of political science, told The Atlantic, “is really that content that arouses strong emotions spreads further, faster, more deeply, and more broadly.”

Against this backdrop and facing such a high set of standards, the fact that citizens have won more than $1 billion in civil judgments against police departments across the country per year in recent years is staggering, and evidence that the true scope of police damages against citizens is some multiple billions of dollars per year. In 2018 there were 686,665 police officers in eighteen thousand local departments across the country, from the tiniest police department in rural America to the NYPD. How can one meaningfully reform behaviors nationwide? Samuel Sinyangwe, co-founder of Campaign Zero, is a data scientist who has been researching police violence data and different policy responses for years. He has identified a number of changes that correspond to lower loss of life in encounters with police. The first is direct and obvious: more restrictive rules and laws governing use of force. Police departments have rules and guidelines as to what techniques they can use in different situations.


pages: 116 words: 31,356

Platform Capitalism by Nick Srnicek

"World Economic Forum" Davos, 3D printing, additive manufacturing, Airbnb, Amazon Mechanical Turk, Amazon Web Services, Big Tech, Californian Ideology, Capital in the Twenty-First Century by Thomas Piketty, cloud computing, collaborative economy, collective bargaining, data science, deindustrialization, deskilling, Didi Chuxing, digital capitalism, digital divide, disintermediation, driverless car, Ford Model T, future of work, gig economy, independent contractor, Infrastructure as a Service, Internet of things, Jean Tirole, Jeff Bezos, knowledge economy, knowledge worker, liquidity trap, low interest rates, low skilled workers, Lyft, Mark Zuckerberg, means of production, mittelstand, multi-sided market, natural language processing, Network effects, new economy, Oculus Rift, offshore financial centre, pattern recognition, platform as a service, quantitative easing, RFID, ride hailing / ride sharing, Robert Gordon, Salesforce, self-driving car, sharing economy, Shoshana Zuboff, Silicon Valley, Silicon Valley startup, software as a service, surveillance capitalism, TaskRabbit, the built environment, total factor productivity, two-sided market, Uber and Lyft, Uber for X, uber lyft, unconventional monetary instruments, unorthodox policies, vertical integration, warehouse robotics, Zipcar

There is a convergence of surveillance and profit making in the digital economy, which leads some to speak of ‘surveillance capitalism’.27 Key to revenues, however, is not just the collection of data, but also the analysis of data. Advertisers are interested less in unorganised data and more in data that give them insights or match them to likely consumers. These are data that have been worked on.28 They have had some process applied to them, whether through the skilled labour of a data scientist or the automated labour of a machine-learning algorithm. What is sold to advertisers is therefore not the data themselves (advertisers do not receive personalised data), but rather the promise that Google’s software will adeptly match an advertiser with the correct users when needed. While the data extraction model has been prominent in the online world, it has also migrated into the offline world.


pages: 416 words: 108,370

Hit Makers: The Science of Popularity in an Age of Distraction by Derek Thompson

Airbnb, Albert Einstein, Alexey Pajitnov wrote Tetris, always be closing, augmented reality, Clayton Christensen, data science, Donald Trump, Downton Abbey, Ford Model T, full employment, game design, Golden age of television, Gordon Gekko, hindsight bias, hype cycle, indoor plumbing, industrial cluster, information trail, invention of the printing press, invention of the telegraph, Jeff Bezos, John Snow's cholera map, Kevin Roose, Kodak vs Instagram, linear programming, lock screen, Lyft, Marc Andreessen, Mark Zuckerberg, Marshall McLuhan, Mary Meeker, Menlo Park, Metcalfe’s law, Minecraft, Nate Silver, Network effects, Nicholas Carr, out of africa, planned obsolescence, power law, prosperity theology / prosperity gospel / gospel of success, randomized controlled trial, recommendation engine, Robert Gordon, Ronald Reagan, Savings and loan crisis, Silicon Valley, Skype, Snapchat, social contagion, statistical model, Steve Ballmer, Steve Jobs, Steven Levy, Steven Pinker, subscription business, TED Talk, telemarketer, the medium is the message, The Rise and Fall of American Growth, Tyler Cowen, Uber and Lyft, Uber for X, uber lyft, Vilfredo Pareto, Vincenzo Peruggia: Mona Lisa, women in the workforce

It’s more like an orchestra of dozens and dozens of formulas that are conducted by a metaformula. One of the most important instruments in this algorithmic symphony is familiarity. “The most common complaint about Pandora is that there is too much repetition of bands and songs,” said Eric Bieschke, the first chief data scientist at Pandora. “Preferences for familiarity are much more individual than I would have thought. You can play the exact same songs to two people with the same tastes in music. One will consider the station perfectly familiar, and the other will consider it horribly repetitive.” There are two fascinating implications here.

Facebook has an advantage over the Iowa Method and basically every other company in the world when it comes to understanding people. In psychological studies, “reactivity” is the notion that when people are aware that they’re being watched, they change their behavior. On Facebook, however, it’s unlikely that most people are in a constant state of nervous self-monitoring, lest Facebook’s data scientists know that they like videos of red pandas. Facebook can watch readers without readers’ explicit awareness that they are under surveillance. This ought to afford a fairly accurate understanding of what people really want to read. The most obvious thing that Facebook can tell is that reader preferences are a mosaic within countries and around the world.


pages: 382 words: 105,819

Zucked: Waking Up to the Facebook Catastrophe by Roger McNamee

"Susan Fowler" uber, "World Economic Forum" Davos, 4chan, Albert Einstein, algorithmic trading, AltaVista, Amazon Web Services, Andy Rubin, barriers to entry, Bernie Sanders, Big Tech, Bill Atkinson, Black Lives Matter, Boycotts of Israel, Brexit referendum, Cambridge Analytica, carbon credits, Cass Sunstein, cloud computing, computer age, cross-subsidies, dark pattern, data is the new oil, data science, disinformation, Donald Trump, Douglas Engelbart, driverless car, Electric Kool-Aid Acid Test, Elon Musk, fake news, false flag, Filter Bubble, game design, growth hacking, Ian Bogost, income inequality, information security, Internet of things, It's morning again in America, Jaron Lanier, Jeff Bezos, John Markoff, laissez-faire capitalism, Lean Startup, light touch regulation, Lyft, machine readable, Marc Andreessen, Marc Benioff, Mark Zuckerberg, market bubble, Max Levchin, Menlo Park, messenger bag, Metcalfe’s law, minimum viable product, Mother of all demos, move fast and break things, Network effects, One Laptop per Child (OLPC), PalmPilot, paypal mafia, Peter Thiel, pets.com, post-work, profit maximization, profit motive, race to the bottom, recommendation engine, Robert Mercer, Ronald Reagan, Russian election interference, Sand Hill Road, self-driving car, Sheryl Sandberg, Silicon Valley, Silicon Valley startup, Skype, Snapchat, social graph, software is eating the world, Stephen Hawking, Steve Bannon, Steve Jobs, Steven Levy, Stewart Brand, subscription business, TED Talk, The Chicago School, The future is already here, Tim Cook: Apple, two-sided market, Uber and Lyft, Uber for X, uber lyft, Upton Sinclair, vertical integration, WikiLeaks, Yom Kippur War

When I met Tim a few days later, he helped me understand the role of state attorneys general in the legal system and the kind of evidence that would be necessary to make a case. Over the ensuing six months, he organized a series of meetings with staff in Schneiderman’s office, a truly impressive group of people. We did not have to explain internet platforms to them. Not only did the New York AG’s office understand the internet, they had data scientists who could perform forensics. The AG’s office had the skills and experience to handle the most complex cases. In time, we would furnish them with whistle-blowers, as well as insights. By April 2018, thirty-seven state attorneys general had begun investigations of Facebook. 6 Congress Gets Serious Technological progress has merely provided us with more efficient means for going backwards.

New Knowledge also created Hamilton 68, the public dashboard that tracks Russian disinformation on Twitter. Sponsored by the German Marshall Fund and introduced on August 2, 2017, Hamilton 68 enables anyone to track what pro-Kremlin Twitter accounts are discussing and promoting. Renée is also director of policy of Data for Democracy, whose mission is “to be an inclusive community of data scientists and technologists to volunteer and collaborate on projects that make a positive impact on society.” Renée’s own focus is on analysis of efforts by bad actors to subvert democracy around the world. Unlike us, Renée was a pro in the world of election security. She and her colleagues had heard whispers of Russian interference efforts in 2015 but had struggled to get the authorities to take action.


pages: 380 words: 109,724

Don't Be Evil: How Big Tech Betrayed Its Founding Principles--And All of US by Rana Foroohar

"Susan Fowler" uber, "World Economic Forum" Davos, accounting loophole / creative accounting, Airbnb, Alan Greenspan, algorithmic bias, algorithmic management, AltaVista, Andy Rubin, autonomous vehicles, banking crisis, barriers to entry, behavioural economics, Bernie Madoff, Bernie Sanders, Big Tech, bitcoin, Black Lives Matter, book scanning, Brewster Kahle, Burning Man, call centre, Cambridge Analytica, cashless society, clean tech, cloud computing, cognitive dissonance, Colonization of Mars, computer age, corporate governance, creative destruction, Credit Default Swap, cryptocurrency, data is the new oil, data science, deal flow, death of newspapers, decentralized internet, Deng Xiaoping, digital divide, digital rights, disinformation, disintermediation, don't be evil, Donald Trump, drone strike, Edward Snowden, Elon Musk, en.wikipedia.org, Erik Brynjolfsson, Etonian, Evgeny Morozov, fake news, Filter Bubble, financial engineering, future of work, Future Shock, game design, gig economy, global supply chain, Gordon Gekko, Great Leap Forward, greed is good, income inequality, independent contractor, informal economy, information asymmetry, intangible asset, Internet Archive, Internet of things, invisible hand, Jaron Lanier, Jeff Bezos, job automation, job satisfaction, junk bonds, Kenneth Rogoff, life extension, light touch regulation, low interest rates, Lyft, Mark Zuckerberg, Marshall McLuhan, Martin Wolf, Menlo Park, military-industrial complex, move fast and break things, Network effects, new economy, offshore financial centre, PageRank, patent troll, Paul Volcker talking about ATMs, paypal mafia, Peter Thiel, pets.com, price discrimination, profit maximization, race to the bottom, recommendation engine, ride hailing / ride sharing, Robert Bork, Sand Hill Road, search engine result page, self-driving car, shareholder value, sharing economy, Sheryl Sandberg, Shoshana Zuboff, side hustle, Sidewalk Labs, Silicon Valley, Silicon Valley startup, smart cities, Snapchat, SoftBank, South China Sea, sovereign wealth fund, Steve Bannon, Steve Jobs, Steven Levy, stock buybacks, subscription business, supply-chain management, surveillance capitalism, TaskRabbit, tech billionaire, tech worker, TED Talk, Telecommunications Act of 1996, The Chicago School, the long tail, the new new thing, Tim Cook: Apple, too big to fail, Travis Kalanick, trickle-down economics, Uber and Lyft, Uber for X, uber lyft, Upton Sinclair, warehouse robotics, WeWork, WikiLeaks, zero-sum game

They adhered strictly to the maxim that says it’s better to ask for forgiveness than to beg for permission—though in truth they weren’t really doing either. It’s an attitude of entitlement that still exists today, even after all the events of the past few years. In 2018, while attending a major economics conference, I was stuck in a cab with a Google data scientist, who expressed envy at the amount of surveillance that Chinese companies are allowed to conduct on citizens, and the vast amount of data it produces. She seemed genuinely outraged about the fact that the university where she was conducting AI research had apparently allowed her to put just a handful of data-recording sensors around campus to collect information that could then be used in her research.

As Shoshana Zuboff has written, in the sort of surveillance capitalism practiced by Google and other Big Tech firms, “Contract and the rule of law are supplanted by the rewards and punishments of a new kind of invisible hand,”42 the algorithmic hand of Silicon Valley. Varian and his team were unique, and foreshadowed an era in which most big companies would hire data scientists and data economists in great numbers. The existing laws that governed commerce were, like most laws in the view of Big Tech, made to be broken. Stewards of Trust? To be fair, pioneers like Varian have acknowledged a number of downsides of this new networked business model being pursued by Google and numerous other Silicon Valley giants, even the big one: privacy.


pages: 386 words: 112,064

Rich White Men: What It Takes to Uproot the Old Boys' Club and Transform America by Garrett Neiman

"World Economic Forum" Davos, Affordable Care Act / Obamacare, Albert Einstein, basic income, Bernie Sanders, BIPOC, Black Lives Matter, Branko Milanovic, British Empire, Capital in the Twenty-First Century by Thomas Piketty, carried interest, clean water, confounding variable, coronavirus, COVID-19, critical race theory, dark triad / dark tetrad, data science, Donald Trump, drone strike, effective altruism, Elon Musk, gender pay gap, George Floyd, glass ceiling, green new deal, high net worth, Home mortgage interest deduction, Howard Zinn, impact investing, imposter syndrome, impulse control, income inequality, Jeff Bezos, Jeffrey Epstein, John Maynard Keynes: Economic Possibilities for our Grandchildren, knowledge worker, Larry Ellison, liberal capitalism, Lyft, Mahatma Gandhi, mandatory minimum, Mark Zuckerberg, mass incarceration, means of production, meritocracy, meta-analysis, Michael Milken, microaggression, mortgage tax deduction, move fast and break things, Nelson Mandela, new economy, obamacare, occupational segregation, offshore financial centre, Paul Buchheit, Peter Thiel, plutocrats, Ralph Waldo Emerson, randomized controlled trial, rent-seeking, Ronald Reagan, Rutger Bregman, Sheryl Sandberg, Silicon Valley, Snapchat, sovereign wealth fund, Steve Jobs, subprime mortgage crisis, TED Talk, The Bell Curve by Richard Herrnstein and Charles Murray, Travis Kalanick, trickle-down economics, uber lyft, universal basic income, Upton Sinclair, War on Poverty, white flight, William MacAskill, winner-take-all economy, women in the workforce, work culture, working poor

How was I different from an equally intelligent student in a high-poverty neighborhood whose teachers may have been too overwhelmed to offer such advocacy? How could I account for the fact that—as Johns Hopkins researchers found—white teachers like mine believe white students have more potential?6 Or the analysis of former Google data scientist Seth Stephens-Davidowitz, who found that in their Google searches parents are two and a half times more likely to ask “Is my son gifted?” than “Is my daughter gifted?” which suggests that many parents see their sons as more intelligent or—at the very least—are more invested in their sons’ being intelligent because intelligence typically offers more status and financial rewards for men than it does for women.7 It is true that if the GATE test had included just the math and verbal sections—without the spatial component—I would have passed with flying colors.

While the typical student speaks in every other class session, Paul averaged five comments every class. That was ten times as often as the typical student, and thirty times as often as less vocal students. The desire to hear from everyone is about more than an appetite for novelty. In his 2014 book Social Physics, MIT data scientist Alex Pentland studies teams and groups. In his research, Pentland found that groups where a few people dominate the conversation are less collectively intelligent than groups where more people contribute. “The largest factor in predicting group intelligence,” Pentland writes, “was the equality of conversational turn taking.”2 Why?


pages: 151 words: 39,757

Ten Arguments for Deleting Your Social Media Accounts Right Now by Jaron Lanier

4chan, Abraham Maslow, basic income, Big Tech, Black Lives Matter, Cambridge Analytica, cloud computing, context collapse, corporate governance, data science, disinformation, Donald Trump, en.wikipedia.org, fake news, Filter Bubble, gig economy, Internet of things, Jaron Lanier, life extension, Mark Zuckerberg, market bubble, Milgram experiment, move fast and break things, Network effects, peak TV, ransomware, Ray Kurzweil, recommendation engine, Silicon Valley, Skinner box, Snapchat, Stanford prison experiment, stem cell, Steve Jobs, Ted Nelson, theory of mind, WikiLeaks, you are the product, zero-sum game

Companies like Facebook, Google, and Twitter are finally trying to fix some of the massive problems they created, albeit in a piecemeal way. Is it because they are being pressured or because they feel that it’s the right thing to do? Probably a little of both. The companies are changing policies, hiring humans to monitor what’s going on, and hiring data scientists to come up with algorithms to avoid the worst failings. Facebook’s old mantra was “Move fast and break things,”3 and now they’re coming up with better mantras and picking up a few pieces from a shattered world and gluing them together. This book will argue that the companies on their own can’t do enough to glue the world back together.


pages: 561 words: 120,899

The Theory That Would Not Die: How Bayes' Rule Cracked the Enigma Code, Hunted Down Russian Submarines, and Emerged Triumphant From Two Centuries of Controversy by Sharon Bertsch McGrayne

Abraham Wald, Alan Greenspan, Bayesian statistics, bioinformatics, Bletchley Park, British Empire, classic study, Claude Shannon: information theory, Daniel Kahneman / Amos Tversky, data science, double helix, Dr. Strangelove, driverless car, Edmond Halley, Fellow of the Royal Society, full text search, government statistician, Henri Poincaré, Higgs boson, industrial research laboratory, Isaac Newton, Johannes Kepler, John Markoff, John Nash: game theory, John von Neumann, linear programming, longitudinal study, machine readable, machine translation, meta-analysis, Nate Silver, p-value, Pierre-Simon Laplace, placebo effect, prediction markets, RAND corporation, recommendation engine, Renaissance Technologies, Richard Feynman, Richard Feynman: Challenger O-ring, Robert Mercer, Ronald Reagan, seminal paper, speech recognition, statistical model, stochastic process, Suez canal 1869, Teledyne, the long tail, Thomas Bayes, Thomas Kuhn: the structure of scientific revolutions, traveling salesman, Turing machine, Turing test, uranium enrichment, We are all Keynesians now, Yom Kippur War

Elaborating on Jeffreys, Savage answered as follows: as the amount of data increases, subjectivists move into agreement, the way scientists come to a consensus as evidence accumulates about, say, the greenhouse effect or about cigarettes being the leading cause of lung cancer. When they have little data, scientists disagree and are subjectivists; when they have piles of data, they agree and become objectivists. Lindley agreed: “That’s the way science is done.”18 But when Savage trumpeted the mathematical treatment of personal opinion, no one—not even he and Lindley—realized yet that he had written the Bayesian Bible.
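Savage's point that opposing priors are swamped as data accumulates can be illustrated with a small Beta-Binomial calculation. This is a sketch under stated assumptions, not something taken from the book: the two priors, Beta(1, 9) for a skeptic and Beta(9, 1) for a believer, and the data (exactly half of all trials succeed) are hypothetical.

```python
def posterior_mean(prior_a, prior_b, successes, trials):
    """Mean of the Beta posterior after observing binomial data."""
    return (prior_a + successes) / (prior_a + prior_b + trials)


def disagreement(trials):
    """Gap between two analysts' posterior means after the same data."""
    successes = trials // 2  # assumed data: half of the trials succeed
    skeptic = posterior_mean(1, 9, successes, trials)   # prior mean 0.1
    believer = posterior_mean(9, 1, successes, trials)  # prior mean 0.9
    return believer - skeptic


gaps = {n: disagreement(n) for n in (0, 10, 100, 10_000)}
for n, gap in gaps.items():
    print(f"{n:>6} trials: posterior means differ by {gap:.4f}")
```

With these priors the gap works out to 8 / (10 + n), so the initial disagreement of 0.8 shrinks toward zero as the number of trials n grows, which is the sense in which analysts with little data disagree and analysts with piles of data agree.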

[on a] clear way to treat uncertainty. . . . In certain circumstances, a population might go extinct before a significant decline could be detected.”22 During the administration of Bill Clinton, the Wildlife Protection Act was amended to accept Bayesian analyses alerting conservationists early to the need for more data. Scientists advising the International Whaling Commission were particularly worried about the uncertainty of their measurements. Each year the commission establishes the number of endangered bowhead whales Eskimos can hunt in Arctic seas. To ensure the long-term survival of the bowheads, scientists compute 2 numbers each year: the number of bowheads and their rate of increase.


pages: 521 words: 118,183

The Wires of War: Technology and the Global Struggle for Power by Jacob Helberg

"World Economic Forum" Davos, 2021 United States Capitol attack, A Declaration of the Independence of Cyberspace, active measures, Affordable Care Act / Obamacare, air gap, Airbnb, algorithmic management, augmented reality, autonomous vehicles, Berlin Wall, Bernie Sanders, Big Tech, bike sharing, Black Lives Matter, blockchain, Boris Johnson, Brexit referendum, cable laying ship, call centre, Cambridge Analytica, Cass Sunstein, cloud computing, coronavirus, COVID-19, creative destruction, crisis actor, data is the new oil, data science, decentralized internet, deep learning, deepfake, deglobalization, deindustrialization, Deng Xiaoping, deplatforming, digital nomad, disinformation, don't be evil, Donald Trump, dual-use technology, Edward Snowden, Elon Musk, en.wikipedia.org, end-to-end encryption, fail fast, fake news, Filter Bubble, Francis Fukuyama: the end of history, geopolitical risk, glass ceiling, global pandemic, global supply chain, Google bus, Google Chrome, GPT-3, green new deal, information security, Internet of things, Jeff Bezos, Jeffrey Epstein, John Markoff, John Perry Barlow, knowledge economy, Larry Ellison, lockdown, Loma Prieta earthquake, low earth orbit, low skilled workers, Lyft, manufacturing employment, Marc Andreessen, Mark Zuckerberg, Mary Meeker, Mikhail Gorbachev, military-industrial complex, Mohammed Bouazizi, move fast and break things, Nate Silver, natural language processing, Network effects, new economy, one-China policy, open economy, OpenAI, Parler "social media", Peter Thiel, QAnon, QR code, race to the bottom, Ralph Nader, RAND corporation, reshoring, ride hailing / ride sharing, Ronald Reagan, Russian election interference, Salesforce, Sam Altman, satellite internet, self-driving car, Sheryl Sandberg, side project, Silicon Valley, Silicon Valley ideology, Silicon Valley startup, Skype, smart grid, SoftBank, Solyndra, South China Sea, SpaceX Starlink, Steve Jobs, Steven Levy, Stuxnet, supply-chain attack, Susan Wojcicki, tech worker, techlash, technoutopianism, TikTok, Tim Cook: Apple, trade route, TSMC, Twitter Arab Spring, uber lyft, undersea cable, Unsafe at Any Speed, Valery Gerasimov, vertical integration, Wargames Reagan, Westphalian system, white picket fence, WikiLeaks, Y Combinator, zero-sum game

Google has a number of discrete news products—from the Google News tab on your web browser, to the news feed you see when you swipe down on your Android phone, to the audio news you hear if you ask the Google Assistant, “Okay, Google, read me the news.” To most consumers, these products probably seem like different features. Yet each one is made possible by an unseen team of designers, engineers, data scientists, and marketing professionals. Unlike the organic ten blue links, which are meant to be a reflection of the web, news features are more tightly curated and subject to stricter policies for content that Google labels or designates as “news.” And each of these products, potentially, was a fissure into which Moscow might have inserted its perverse propaganda.

Then, a month later, new revelations turned the tech world upside down. On March 17, the Guardian and the New York Times simultaneously published an explosive series of stories, based on whistleblower accounts, about a company called Cambridge Analytica and its work with the Trump campaign.68 According to Brittany Kaiser, one of the whistleblowers, the data scientists at Cambridge Analytica had taken advantage of lax privacy laws and Facebook loopholes to “scrape” up to 5,000 data points on every American older than eighteen, approximately 240 million people.69 This included data from public posts and ostensibly private direct messages. Most egregiously, perhaps, users who agreed to the terms of service of a third-party app (like Candy Crush) consented to provide not only their own data but their friends’.


Uncontrolled Spread by Scott Gottlieb

"World Economic Forum" Davos, additive manufacturing, Atul Gawande, Bernie Sanders, Citizen Lab, contact tracing, coronavirus, COVID-19, data science, disinformation, Donald Trump, double helix, fear of failure, global pandemic, global supply chain, Kevin Roose, lab leak, Larry Ellison, lockdown, medical residency, Nate Silver, randomized controlled trial, social distancing, stem cell, sugar pill, synthetic biology, uranium enrichment, zoonotic diseases

The CDC and HHS had drafted a plan to bring their systems into compliance with the legislation, but “the actions identified in the implementation plan did not address all of the requirements defined by the law,” the GAO concluded, and “as of May 2017, HHS had made limited progress toward establishing the required electronic public health situational awareness network capabilities.”31 During COVID, it became evident that the CDC didn’t have the basic tools of data management for public health decision making. They didn’t have the advanced data analytics needed to do the evaluations that were required, and they didn’t have the right integration for electronic capture of information from health records. They didn’t have a large group of data scientists and modelers. They outsourced most of these analytical tasks to academic partners. And the problems weren’t just on the back end in how the CDC captured and analyzed data, but also on the front end, in how they developed the raw information, particularly in cases where the agency had the primary responsibility for generating a body of evidence.

The intelligence community, by contrast, has a prospective mind-set; it’s always scanning the horizon. Intelligence agencies have to make a call on future threats. They’re willing to be wrong, but they’re compelled by the nature of their work to make predictions. Even if the CDC were in the business of making assessments on future threats, it largely lacks the data science capabilities and the advanced analytics required to do this sort of forecasting of risk. America’s dismal experience with COVID leaves us little choice but to expand the tools we use to inform us of new risks. In bolstering our pandemic preparedness, our purpose shouldn’t be merely to blunt the impact of the next pathogen that emerges, but to make sure that a calamity on the scale of COVID can never happen again, and the US can never be threatened in this way again.


pages: 184 words: 46,395

The Choice Factory: 25 Behavioural Biases That Influence What We Buy by Richard Shotton

active measures, behavioural economics, call centre, cashless society, cognitive dissonance, Daniel Kahneman / Amos Tversky, data science, David Brooks, Estimating the Reproducibility of Psychological Science, Firefox, framing effect, fundamental attribution error, Goodhart's law, Google Chrome, Kickstarter, loss aversion, nudge unit, Ocado, placebo effect, price anchoring, principal–agent problem, Ralph Waldo Emerson, replication crisis, Richard Feynman, Richard Thaler, Robert Shiller, Rory Sutherland, TED Talk, Veblen good, When a measure becomes a target, World Values Survey

Search is the most accessible found data source. Analysing search data provides insights that consumers might be loath to admit in a survey. Consider sexism. Most people would claim that they’re equally interested in their children’s intelligence, regardless of gender. However, Seth Stephens-Davidowitz, the New York Times journalist and data scientist, has analysed US search data and found that parents are two and a half times more likely to Google “is my son gifted?” than “is my daughter gifted?”. Google acts as a modern confessional in which all our darkest thoughts are captured. However, this rich seam of data is too rarely mined by advertisers.


pages: 160 words: 45,516

Tomorrow's Lawyers: An Introduction to Your Future by Richard Susskind

business intelligence, business process, business process outsourcing, call centre, Clayton Christensen, cloud computing, commoditize, crowdsourcing, data science, disruptive innovation, global supply chain, information retrieval, invention of the wheel, power law, pre–internet, Ray Kurzweil, Silicon Valley, Skype, speech recognition, supply-chain management, telepresence, Watson beat the top human players on Jeopardy!

For example, by aggregating search data, we might be able to find out what legal issues and concerns are troubling particular communities; by analysing databases of decisions by judges and regulators, we may be able to predict outcomes in entirely novel ways; and by collecting huge bodies of commercial contracts and exchanges of emails, we might gain insight into the greatest legal risks that specific sectors face. The disruption here is that crucial legal insights, correlations, and even algorithms might come to play a central role in legal practice and legal risk management and yet they will not be generated through the work of mainstream lawyers (unless they choose to collaborate with big data scientists).

AI-Based Problem-Solving

If IBM’s Watson (an artificially intelligent computer system designed to compete on the US TV quiz show Jeopardy!) is able publicly to beat the two finest human competitors, then the days of online problem-solving by computer are not very far away. And when we enter that era, and we apply the same techniques and technologies in law, then we will have AI-based legal problem-solving.


pages: 444 words: 127,259

Super Pumped: The Battle for Uber by Mike Isaac

"Susan Fowler" uber, "World Economic Forum" Davos, activist fund / activist shareholder / activist investor, Airbnb, Albert Einstein, always be closing, Amazon Web Services, Andy Kessler, autonomous vehicles, Ayatollah Khomeini, barriers to entry, Bay Area Rapid Transit, Benchmark Capital, Big Tech, Burning Man, call centre, Cambridge Analytica, Chris Urmson, Chuck Templeton: OpenTable:, citizen journalism, Clayton Christensen, cloud computing, corporate governance, creative destruction, data science, Didi Chuxing, don't be evil, Donald Trump, driverless car, Elon Musk, end-to-end encryption, fake news, family office, gig economy, Google Glasses, Google X / Alphabet X, Greyball, Hacker News, high net worth, hockey-stick growth, hustle culture, impact investing, information security, Jeff Bezos, John Markoff, John Zimmer (Lyft cofounder), Kevin Roose, Kickstarter, Larry Ellison, lolcat, Lyft, Marc Andreessen, Marc Benioff, Mark Zuckerberg, Masayoshi Son, mass immigration, Menlo Park, Mitch Kapor, money market fund, moral hazard, move fast and break things, Network effects, new economy, off grid, peer-to-peer, pets.com, Richard Florida, ride hailing / ride sharing, Salesforce, Sand Hill Road, self-driving car, selling pickaxes during a gold rush, shareholder value, Shenzhen special economic zone , Sheryl Sandberg, side hustle, side project, Silicon Valley, Silicon Valley startup, skunkworks, Snapchat, SoftBank, software as a service, software is eating the world, South China Sea, South of Market, San Francisco, sovereign wealth fund, special economic zone, Steve Bannon, Steve Jobs, stock buybacks, super pumped, TaskRabbit, tech bro, tech worker, the payments system, Tim Cook: Apple, Travis Kalanick, Uber and Lyft, Uber for X, uber lyft, ubercab, union organizing, upwardly mobile, Vision Fund, WeWork, Y Combinator

One employee coined the term “rides of glory” to describe the Uber trip a customer takes home the morning after a one-night stand. “In times of yore, you would have woken up in a panic, scrambling in the dark trying to find your fur coat or velvet smoking jacket or whatever it is you cool kids wear,” said the post, which was written by Bradley Voytek, one of Uber’s data scientists. “Then that long walk home in the pre-morning dawn.” Voytek, a cognitive neuroscientist by trade, joined Uber because he loved the insight that such an enormous data set gave him into human behavior. Watching trips across cities being carried out in real time was like having his own personal human ant farm.

Fraudsters simply entered fake names and emails. Then they used apps like “Burner” or “TextNow” to create thousands of fake telephone numbers to be matched with stolen credit card numbers. But requiring Chinese users to add other, more precise, forms of identification would add more friction to the process. And, as Kalanick’s data scientists found in their research, adding friction slowed growth. For Kalanick, putting a dent in growth was not an option. Kalanick’s solution was to grow and rely upon the anti-fraud team. But scammers grew more shrewd over time. Eventually, hustlers found that searching forums for riders was inefficient and time-consuming, so they ended up creating “riders” themselves.


pages: 523 words: 143,139

Algorithms to Live By: The Computer Science of Human Decisions by Brian Christian, Tom Griffiths

4chan, Ada Lovelace, Alan Turing: On Computable Numbers, with an Application to the Entscheidungsproblem, Albert Einstein, algorithmic bias, algorithmic trading, anthropic principle, asset allocation, autonomous vehicles, Bayesian statistics, behavioural economics, Berlin Wall, Big Tech, Bill Duvall, bitcoin, Boeing 747, Charles Babbage, cognitive load, Community Supported Agriculture, complexity theory, constrained optimization, cosmological principle, cryptocurrency, Danny Hillis, data science, David Heinemeier Hansson, David Sedaris, delayed gratification, dematerialisation, diversification, Donald Knuth, Donald Shoup, double helix, Dutch auction, Elon Musk, exponential backoff, fault tolerance, Fellow of the Royal Society, Firefox, first-price auction, Flash crash, Frederick Winslow Taylor, fulfillment center, Garrett Hardin, Geoffrey Hinton, George Akerlof, global supply chain, Google Chrome, heat death of the universe, Henri Poincaré, information retrieval, Internet Archive, Jeff Bezos, Johannes Kepler, John Nash: game theory, John von Neumann, Kickstarter, knapsack problem, Lao Tzu, Leonard Kleinrock, level 1 cache, linear programming, martingale, multi-armed bandit, Nash equilibrium, natural language processing, NP-complete, P = NP, packet switching, Pierre-Simon Laplace, power law, prediction markets, race to the bottom, RAND corporation, RFC: Request For Comment, Robert X Cringely, Sam Altman, scientific management, sealed-bid auction, second-price auction, self-driving car, Silicon Valley, Skype, sorting algorithm, spectrum auction, Stanford marshmallow experiment, Steve Jobs, stochastic process, Thomas Bayes, Thomas Malthus, Tragedy of the Commons, traveling salesman, Turing machine, urban planning, Vickrey auction, Vilfredo Pareto, Walter Mischel, Y Combinator, zero-sum game

We have the expression “Eat, drink, and be merry, for tomorrow we die,” but perhaps we should also have its inverse: “Start learning a new language or an instrument, and make small talk with a stranger, because life is long, and who knows what joy could blossom over many years’ time.” When balancing favorite experiences and new ones, nothing matters as much as the interval over which we plan to enjoy them. “I’m more likely to try a new restaurant when I move to a city than when I’m leaving it,” explains data scientist and blogger Chris Stucchio, a veteran of grappling with the explore/exploit tradeoff in both his work and his life. “I mostly go to restaurants I know and love now, because I know I’m going to be leaving New York fairly soon. Whereas a couple years ago I moved to Pune, India, and I just would eat friggin’ everywhere that didn’t look like it was gonna kill me.
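The point about the interval can be made concrete with a toy calculation. The sketch below is an illustration under invented assumptions, not a model taken from the book: a known favorite restaurant is worth a fixed `known` score per visit, a new restaurant has a quality drawn uniformly from 0 to 1, and after one trial visit the diner always returns to whichever place turned out better. The function names and the 0.6 score are hypothetical.

```python
def exploit_value(visits: int, known: float) -> float:
    """Expected total enjoyment from always returning to the known favorite."""
    return visits * known


def explore_value(visits: int, known: float) -> float:
    """Expected total from trying the new place first, then picking the better one.

    Assumption: the new place's quality q is uniform on [0, 1], so the trial
    visit is worth 0.5 on average and E[max(q, known)] = (1 + known**2) / 2.
    """
    if visits <= 0:
        return 0.0
    return 0.5 + (visits - 1) * (1 + known ** 2) / 2


def should_explore(visits: int, known: float) -> bool:
    """True when trying the new place has the higher expected total."""
    return explore_value(visits, known) > exploit_value(visits, known)


if __name__ == "__main__":
    # The same favorite (0.6) is compared across different remaining horizons.
    for visits in (1, 2, 3, 10):
        print(visits, should_explore(visits, known=0.6))
```

Under these assumptions, a diner with one or two visits left does better sticking with the favorite, while from three visits onward the trial visit pays for itself. This matches the intuition in the passage that a longer remaining interval favors exploration.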

Instead of “the” Google search algorithm and “the” Amazon checkout flow, there are now untold and unfathomably subtle permutations. (Google infamously tested forty-one shades of blue for one of its toolbars in 2009.) In some cases, it’s unlikely that any pair of users will have the exact same experience. Data scientist Jeff Hammerbacher, former manager of the Data group at Facebook, once told Bloomberg Businessweek that “the best minds of my generation are thinking about how to make people click ads.” Consider it the millennials’ Howl—what Allen Ginsberg’s immortal “I saw the best minds of my generation destroyed by madness” was to the Beat Generation.


pages: 525 words: 147,008

SuperBetter by Jane McGonigal

autism spectrum disorder, data science, full employment, game design, job satisfaction, Kickstarter, longitudinal study, meta-analysis, Minecraft, mirror neurons, randomized controlled trial, risk tolerance, social intelligence, space junk, stem cell, Stephen Hawking, TED Talk, theory of mind, traumatic brain injury, ultimatum game, Walter Mischel

If I sound quite confident that you can transform your life for the better with a gameful mindset and the SuperBetter method, it’s because I am. Since I invented SuperBetter, more than 400,000 people have played an online version of the game. We’ve recorded every power-up they’ve activated, every bad guy they’ve battled, and every quest they’ve completed—so we know what works and what doesn’t. I’ve joined forces with data scientists to analyze all the information we’ve collected from these 400,000 players over the past two years. I wanted answers to some of the same questions you might have: Who can the SuperBetter method work for? (Virtually anyone—young or old, male or female, avid game player or someone who has never played a video game in their life.)

Eventually, the perfect opportunity arose: Roepke found two colleagues at Penn who were interested in helping her conduct a formal study of SuperBetter’s effectiveness for treating depression. To help Penn prepare for this study, I teamed up with two collaborators at SuperBetter Labs, science writer Bez Maxwell and data scientist Rose Broome, to create a special set of depression-related power-ups, bad guys, and quests. The three of us also helped the Penn researchers design the study—how long it would last, how often we would encourage participants to play, and what questions we would ask. But the actual trial, including all recruitment, data collection, and data analysis, was conducted independently by the research team at the University of Pennsylvania.


Agile Project Management with Kanban (Developer Best Practices) by Eric Brechner

Amazon Web Services, cloud computing, continuous integration, crowdsourcing, data science, DevOps, don't repeat yourself, en.wikipedia.org, index card, Kaizen: continuous improvement, Kanban, loose coupling, minimum viable product, pull request, software as a service

A feature team is a group of individuals, often from multiple disciplines, who work on the same set of product features together. A typical feature team might have 1–3 analysts, 1–6 developers, and 1–6 testers (a total of 3–15 people), but some can be larger. Feature teams may also have marketers, product planners, designers, user researchers, architects, technical researchers, data scientists, quality assurance personnel, service engineers, service operations staff, and project managers. Often, feature team members are part of multiple feature teams, although developers and testers tend to be dedicated to a single team. Many people who use traditional Waterfall work on feature teams, all for the same manager or as a virtual team.


pages: 230 words: 61,702

The Internet of Us: Knowing More and Understanding Less in the Age of Big Data by Michael P. Lynch

Affordable Care Act / Obamacare, Amazon Mechanical Turk, big data - Walmart - Pop Tarts, bitcoin, Cass Sunstein, Claude Shannon: information theory, cognitive load, crowdsourcing, data science, Edward Snowden, Firefox, Google Glasses, hive mind, income inequality, Internet of things, John von Neumann, meta-analysis, Nate Silver, new economy, Nick Bostrom, Panopticon Jeremy Bentham, patient HM, prediction markets, RFID, sharing economy, Steve Jobs, Steven Levy, the scientific method, The Wisdom of Crowds, Thomas Kuhn: the structure of scientific revolutions, Twitter Arab Spring, WikiLeaks

That is, instead of asking people survey questions or contriving small-scale experiments, which was how social science was often done in the past, I could go and look at what actually happens, when, say, 100,000 white men and 100,000 black women interact in private.2 Anderson and Rudder’s comments are not isolated; they bring to the surface sentiments that have been echoed across discussions of analytics over the last few years. While Rudder has been particularly adept at showing how huge data gathered by social sites can provide eye-opening correlations, and data scientists and companies the world over have been harvesting a wealth of surprising information using analytics, Google remains the most visible leader in this field. The most frequently cited, and still one of the most interesting, examples is Google Flu Trends. In a now-famous journal article in Nature, Google scientists compared the 50 million most common search terms used in America with the CDC’s data about the spread of seasonal flu between 2003 and 2008.3 What they learned was that forty-five search terms could be used to predict where the flu was spreading—and do so in real time, as they did with some accuracy in 2009 during the H1N1 outbreak.


pages: 202 words: 62,901

The People's Republic of Walmart: How the World's Biggest Corporations Are Laying the Foundation for Socialism by Leigh Phillips, Michal Rozworski

Alan Greenspan, Anthropocene, Berlin Wall, Bernie Sanders, biodiversity loss, call centre, capitalist realism, carbon footprint, carbon tax, central bank independence, Colonization of Mars, combinatorial explosion, company town, complexity theory, computer age, corporate raider, crewed spaceflight, data science, decarbonisation, digital rights, discovery of penicillin, Elon Musk, financial engineering, fulfillment center, G4S, Garrett Hardin, Georg Cantor, germ theory of disease, Gordon Gekko, Great Leap Forward, greed is good, hiring and firing, independent contractor, index fund, Intergovernmental Panel on Climate Change (IPCC), Internet of things, inventory management, invisible hand, Jeff Bezos, Jeremy Corbyn, Joseph Schumpeter, Kanban, Kiva Systems, linear programming, liquidity trap, mass immigration, Mont Pelerin Society, Neal Stephenson, new economy, Norbert Wiener, oil shock, passive investing, Paul Samuelson, post scarcity, profit maximization, profit motive, purchasing power parity, recommendation engine, Ronald Coase, Ronald Reagan, sharing economy, Silicon Valley, Skype, sovereign wealth fund, strikebreaker, supply-chain management, surveillance capitalism, technoutopianism, TED Talk, The Nature of the Firm, The Wealth of Nations by Adam Smith, theory of mind, Tragedy of the Commons, transaction costs, Turing machine, union organizing, warehouse automation, warehouse robotics, We are all Keynesians now

We get a few super-yachts instead of superabundant housing for all; and we might well say the same when it comes to which consumer items we prioritize for production and distribution. In our irrational system, the ultimate purpose of product recommendations is to drive sales and profits for Amazon. Data scientists have found that rather than high numbers of customer-submitted reviews, which have little impact, it is recommendations that boost Amazon’s sales. Recommendations help sell not only less popular niche items (when it’s hard to dig up information, even just a recommendation can be enough to sway us) but also bestsellers that constantly pop up when we’re browsing.


pages: 186 words: 50,651

Interactive Data Visualization for the Web by Scott Murray

barriers to entry, data science, Firefox, intentional community, iterative process, TED Talk, the long tail, web application, your tax dollars at work

Information Dashboard Design: The Effective Visual Communication of Data by Stephen Few. O’Reilly Media, 2006.
On the practicalities of working with data:
Bad Data Handbook: Mapping the World of Data Problems by Q. Ethan McCallum. O’Reilly Media, 2012.
Data Analysis with Open Source Tools: A Hands-On Guide for Programmers and Data Scientists by Philipp K. Janert. O’Reilly Media, 2010.
Python for Data Analysis: Agile Tools for Real World Data by Wes McKinney. O’Reilly Media, 2012.
Conventions Used in This Book
The following typographical conventions are used in this book:
Italic: Indicates new terms, URLs, email addresses, filenames, and file extensions.


Natural Language Processing with Python and spaCy by Yuli Vasiliev

Bayesian statistics, computer vision, data science, database schema, Easter island, en.wikipedia.org, loose coupling, natural language processing, Skype, statistical model

Natural language processing (NLP) is a subfield of artificial intelligence that tries to process and analyze natural language data. It includes teaching machines to interact with humans in a natural language (a language that developed naturally through use). By creating machine learning algorithms designed to work with unknown datasets much larger than those two dozen tablets found on Rapa Nui, data scientists can learn how we use language. They can also do more than simply decipher ancient inscriptions. Today, you can use algorithms to observe languages whose semantics and grammar rules are well known (unlike the rongorongo inscriptions), and then build applications that can programmatically “understand” utterances in that language.


pages: 287 words: 62,824

Just Keep Buying: Proven Ways to Save Money and Build Your Wealth by Nick Maggiulli

Airbnb, asset allocation, Big Tech, bitcoin, buy and hold, COVID-19, crowdsourcing, cryptocurrency, data science, diversification, diversified portfolio, financial independence, Hans Rosling, index fund, it's over 9,000, Jeff Bezos, Jeff Seder, lifestyle creep, mass affluent, mortgage debt, oil shock, payday loans, phenotype, price anchoring, risk-adjusted returns, Robert Shiller, Sam Altman, side hustle, side project, stocks for the long run, The 4% rule, time value of money, transaction costs, very high income, William Bengen, yield curve

Maggiulli not only uses evidence to guide his suggestions, but he is also among the best at boiling everything down into ideas that are easy to understand and apply.” —James Clear, #1 New York Times bestselling author, Atomic Habits “The first time I read Nick Maggiulli's writing I knew he had a special talent. There are lots of good data scientists, and lots of good storytellers. But few understand the data and can tell a compelling story about it like Nick. This is a must-read.” —Morgan Housel, bestselling author, The Psychology of Money “Nick Maggiulli clearly delights in flouting the received wisdom about how people should manage their money.


pages: 184 words: 60,229

Re-Educated: Why It’s Never Too Late to Change Your Life by Lucy Kellaway

"World Economic Forum" Davos, Berlin Wall, Boris Johnson, Broken windows theory, cognitive load, coronavirus, COVID-19, data science, Donald Trump, fake news, George Floyd, Greta Thunberg, imposter syndrome, lockdown, Martin Wolf, stakhanovite, wage slave

The moral of his story is similar to mine. He got away with it, partly because he had middle-class parents who found a way. He could afford to fail, because he had a safety net. I have just shown him what I’ve written. He doesn’t mind being portrayed as a wastrel, because he is one no longer. He has a job as a data scientist in a start-up and is motivated and doing well. But what about all the shouting and nagging? Did it make a difference? Art now says it was counterproductive – as well as being unpleasant – anything resembling coercion automatically makes him inclined to do the reverse. Has he forgiven me for it?


pages: 626 words: 167,836

The Technology Trap: Capital, Labor, and Power in the Age of Automation by Carl Benedikt Frey

3D printing, AlphaGo, Alvin Toffler, autonomous vehicles, basic income, Bernie Sanders, Branko Milanovic, British Empire, business cycle, business process, call centre, Cambridge Analytica, Capital in the Twenty-First Century by Thomas Piketty, Charles Babbage, Clayton Christensen, collective bargaining, computer age, computer vision, Corn Laws, Cornelius Vanderbilt, creative destruction, data science, David Graeber, David Ricardo: comparative advantage, deep learning, DeepMind, deindustrialization, demographic transition, desegregation, deskilling, Donald Trump, driverless car, easy for humans, difficult for computers, Edward Glaeser, Elon Musk, Erik Brynjolfsson, everywhere but in the productivity statistics, factory automation, Fairchild Semiconductor, falling living standards, first square of the chessboard / second half of the chessboard, Ford Model T, Ford paid five dollars a day, Frank Levy and Richard Murnane: The New Division of Labor, full employment, future of work, game design, general purpose technology, Gini coefficient, Great Leap Forward, Hans Moravec, high-speed rail, Hyperloop, income inequality, income per capita, independent contractor, industrial cluster, industrial robot, intangible asset, interchangeable parts, Internet of things, invention of agriculture, invention of movable type, invention of the steam engine, invention of the wheel, Isaac Newton, James Hargreaves, James Watt: steam engine, Jeremy Corbyn, job automation, job satisfaction, job-hopping, John Maynard Keynes: Economic Possibilities for our Grandchildren, John Maynard Keynes: technological unemployment, Joseph Schumpeter, Kickstarter, Kiva Systems, knowledge economy, knowledge worker, labor-force participation, labour mobility, Lewis Mumford, Loebner Prize, low skilled workers, machine translation, Malcom McLean invented shipping containers, manufacturing employment, mass immigration, means of production, Menlo Park, minimum wage unemployment, natural language processing, new economy, New Urbanism, Nick Bostrom, Norbert Wiener, nowcasting, oil shock, On the Economy of Machinery and Manufactures, OpenAI, opioid epidemic / opioid crisis, Pareto efficiency, pattern recognition, pink-collar, Productivity paradox, profit maximization, Renaissance Technologies, rent-seeking, rising living standards, Robert Gordon, Robert Solow, robot derives from the Czech word robota Czech, meaning slave, safety bicycle, Second Machine Age, secular stagnation, self-driving car, seminal paper, Silicon Valley, Simon Kuznets, social intelligence, sparse data, speech recognition, spinning jenny, Stephen Hawking, tacit knowledge, The Future of Employment, The Rise and Fall of American Growth, The Wealth of Nations by Adam Smith, Thomas Malthus, total factor productivity, trade route, Triangle Shirtwaist Factory, Turing test, union organizing, universal basic income, warehouse automation, washing machines reduced drudgery, wealth creators, women in the workforce, working poor, zero-sum game

Hiring good people has always been a critical issue for competitive advantage. But since the widespread availability of data is comparatively recent, this problem is particularly acute. Automobile companies can hire people who know how to build automobiles since that is part of their core competency. They may or may not have sufficient internal expertise to hire good data scientists, which is why we can expect to see heterogeneity in productivity as this new skill percolates through the labor markets.84 For these reasons, Amara’s Law is likely to apply to AI, too. Myriad necessary ancillary inventions and adjustments are required for automation to happen. Erik Brynjolfsson, who was among those investigating the role of computer technologies in the productivity boom of the late 1990s, thinks that the trajectory of AI adoption is likely to mirror the past in this regard.

Official employment statistics are always behind the curve when it comes to capturing new occupations, which are not included in the data until they have reached a critical mass in terms of the number of people in them. But other sources, like LinkedIn data, allow us at least to nowcast some emerging jobs. Among them are the jobs of machine learning engineers, big data architects, data scientists, digital marketing specialists, and Android developers.14 But we also find jobs like Zumba instructors and Beachbody coaches.15 In a world that is becoming increasingly technologically sophisticated, rising returns on skills are unlikely to disappear and likely to intensify. Like computers, AI seems set to spawn more skilled jobs for labor, in the process creating more demand for in-person service jobs that remain hard to automate.


pages: 222 words: 70,132

Move Fast and Break Things: How Facebook, Google, and Amazon Cornered Culture and Undermined Democracy by Jonathan Taplin

"Friedman doctrine" OR "shareholder theory", "there is no alternative" (TINA), 1960s counterculture, affirmative action, Affordable Care Act / Obamacare, Airbnb, AlphaGo, Amazon Mechanical Turk, American Legislative Exchange Council, AOL-Time Warner, Apple's 1984 Super Bowl advert, back-to-the-land, barriers to entry, basic income, battle of ideas, big data - Walmart - Pop Tarts, Big Tech, bitcoin, Brewster Kahle, Buckminster Fuller, Burning Man, Clayton Christensen, Cody Wilson, commoditize, content marketing, creative destruction, crony capitalism, crowdsourcing, data is the new oil, data science, David Brooks, David Graeber, decentralized internet, don't be evil, Donald Trump, Douglas Engelbart, Douglas Engelbart, Dynabook, Edward Snowden, Elon Musk, equal pay for equal work, Erik Brynjolfsson, Fairchild Semiconductor, fake news, future of journalism, future of work, George Akerlof, George Gilder, Golden age of television, Google bus, Hacker Ethic, Herbert Marcuse, Howard Rheingold, income inequality, informal economy, information asymmetry, information retrieval, Internet Archive, Internet of things, invisible hand, Jacob Silverman, Jaron Lanier, Jeff Bezos, job automation, John Markoff, John Maynard Keynes: technological unemployment, John Perry Barlow, John von Neumann, Joseph Schumpeter, Kevin Kelly, Kickstarter, labor-force participation, Larry Ellison, life extension, Marc Andreessen, Mark Zuckerberg, Max Levchin, Menlo Park, Metcalfe’s law, military-industrial complex, Mother of all demos, move fast and break things, natural language processing, Network effects, new economy, Norbert Wiener, offshore financial centre, packet switching, PalmPilot, Paul Graham, paypal mafia, Peter Thiel, plutocrats, pre–internet, Ray Kurzweil, reality distortion field, recommendation engine, rent-seeking, revision control, Robert Bork, Robert Gordon, Robert Metcalfe, Ronald Reagan, Ross Ulbricht, Sam Altman, Sand Hill Road, secular stagnation, self-driving car, sharing 
economy, Silicon Valley, Silicon Valley ideology, Skinner box, smart grid, Snapchat, Social Justice Warrior, software is eating the world, Steve Bannon, Steve Jobs, Stewart Brand, tech billionaire, techno-determinism, technoutopianism, TED Talk, The Chicago School, the long tail, The Market for Lemons, The Rise and Fall of American Growth, Tim Cook: Apple, trade route, Tragedy of the Commons, transfer pricing, Travis Kalanick, trickle-down economics, Tyler Cowen, Tyler Cowen: Great Stagnation, universal basic income, unpaid internship, vertical integration, We are as Gods, We wanted flying cars, instead we got 140 characters, web application, Whole Earth Catalog, winner-take-all economy, women in the workforce, Y Combinator, you are the product

The privacy issue was reignited in early 2014, when the Wall Street Journal reported that Facebook had conducted a massive social-science experiment on nearly seven hundred thousand of its users. To determine whether it could alter the emotional state of its users and prompt them to post either more positive or negative content, the site’s data scientists enabled an algorithm, for one week, to automatically omit content that contained words associated with either positive or negative emotions from the central news feeds of 689,003 users. As it turned out, the experiment was very “successful” in that it was relatively easy to manipulate users’ emotions, but the backlash from the blogosphere was horrendous.


pages: 279 words: 71,542

Digital Minimalism: Choosing a Focused Life in a Noisy World by Cal Newport

Black Lives Matter, Burning Man, Cal Newport, data science, Donald Trump, Dunbar number, financial independence, game design, Hacker News, index fund, Jaron Lanier, Kevin Kelly, Kickstarter, lifelogging, longitudinal study, Mark Zuckerberg, Mr. Money Mustache, Pepto Bismol, pre–internet, price discrimination, race to the bottom, ride hailing / ride sharing, Silicon Valley, Skype, Snapchat, Steve Jobs, TED Talk

In other words, depending on whom you ask, social media is either making us lonely or bringing us joy. To better understand this general phenomenon of contrasting conclusions, let’s look closer at the specific studies summarized above. One of the main positive articles cited by the Facebook blog post was authored by Moira Burke, a data scientist at the company (who also coauthored the blog post), and Robert Kraut, a human computer interaction specialist at Carnegie Mellon University. It was published in the Journal of Computer-Mediated Communication in July 2016. In this study, Burke and Kraut recruited a group of around 1,900 Facebook users who agreed to quantify their current level of happiness when prompted.


pages: 246 words: 68,392

Gigged: The End of the Job and the Future of Work by Sarah Kessler

"Susan Fowler" uber, Affordable Care Act / Obamacare, Airbnb, Amazon Mechanical Turk, basic income, bitcoin, blockchain, business cycle, call centre, cognitive dissonance, collective bargaining, crowdsourcing, data science, David Attenborough, do what you love, Donald Trump, East Village, Elon Musk, financial independence, future of work, game design, gig economy, Hacker News, income inequality, independent contractor, information asymmetry, Jeff Bezos, job automation, law of one price, Lyft, Mark Zuckerberg, market clearing, minimum wage unemployment, new economy, opioid epidemic / opioid crisis, payday loans, post-work, profit maximization, QR code, race to the bottom, ride hailing / ride sharing, Salesforce, Second Machine Age, self-driving car, shareholder value, sharing economy, Silicon Valley, Snapchat, TaskRabbit, TechCrunch disrupt, Travis Kalanick, Uber and Lyft, Uber for X, uber lyft, union organizing, universal basic income, working-age population, Works Progress Administration, Y Combinator

There, we would find answers to common complaints, such as “I feel uncomfortable or unsafe” (remove yourself from the situation and call the police) and instructions for what to do if a customer locked them out of the home by accident, or if they wanted to dispute a fee the company had taken from their pay for missing a scheduled cleaning or other violations to the terms of service. “We’ll call you!” shouted a woman wearing a skirt in the front row. “Don’t call us,” Carol automatically corrected. * * * Uber was extremely shrewd at finding new ways to manage independent contractors through its app. “Employing hundreds of social scientists and data scientists,” wrote the New York Times in 2017, “Uber has experimented with video game techniques, graphics and noncash rewards of little value that can prod drivers into working longer and harder—and sometimes at hours and locations that are less lucrative for them.”6 One strategy was its surge pricing model, which made rates higher during busy times in order to encourage more drivers to work during those periods.


pages: 250 words: 64,011

Everydata: The Misinformation Hidden in the Little Data You Consume Every Day by John H. Johnson

Affordable Care Act / Obamacare, autism spectrum disorder, Black Swan, business intelligence, Carmen Reinhart, cognitive bias, correlation does not imply causation, Daniel Kahneman / Amos Tversky, data science, Donald Trump, en.wikipedia.org, Kenneth Rogoff, labor-force participation, lake wobegon effect, Long Term Capital Management, Mercator projection, Mercator projection distort size, especially Greenland and Africa, meta-analysis, Nate Silver, obamacare, p-value, PageRank, pattern recognition, publication bias, QR code, randomized controlled trial, risk-adjusted returns, Ronald Reagan, selection bias, statistical model, The Signal and the Noise by Nate Silver, Thomas Bayes, Tim Cook: Apple, wikimedia commons, Yogi Berra

SIGNIFICANT OTHERS In the movie Thank You for Smoking, Aaron Eckhart’s character (a spokesperson for the tobacco industry) tells his son, “When you argue correctly, you’re never wrong.”16 It’s a line from a lobbyist in a Hollywood satire—but it’s an interesting quote to keep in mind as we talk about statistical significance, given that many people feel it’s the “correct” way to talk about data. Statistical significance is a concept used by scientists and researchers to set an objective standard that can be used to determine whether or not a particular relationship “statistically” exists in the data. Scientists test for statistical significance to distinguish between whether an observed effect is present in the data (given a high degree of probability), or just due to chance. It is important to note that finding a statistically significant relationship tells us nothing about whether a relationship is a simple correlation or a causal one, and it also can’t tell us anything about whether some omitted factor is driving the result.


pages: 265 words: 69,310

What's Yours Is Mine: Against the Sharing Economy by Tom Slee

4chan, Airbnb, Amazon Mechanical Turk, asset-backed security, barriers to entry, Benchmark Capital, benefit corporation, Berlin Wall, big-box store, bike sharing, bitcoin, blockchain, Californian Ideology, citizen journalism, collaborative consumption, commons-based peer production, congestion charging, Credit Default Swap, crowdsourcing, data acquisition, data science, David Brooks, democratizing finance, do well by doing good, don't be evil, Dr. Strangelove, emotional labour, Evgeny Morozov, gentrification, gig economy, Hacker Ethic, impact investing, income inequality, independent contractor, informal economy, invisible hand, Jacob Appelbaum, Jane Jacobs, Jeff Bezos, John Zimmer (Lyft cofounder), Kevin Roose, Khan Academy, Kibera, Kickstarter, license plate recognition, Lyft, machine readable, Marc Andreessen, Mark Zuckerberg, Max Levchin, move fast and break things, natural language processing, Netflix Prize, Network effects, new economy, Occupy movement, openstreetmap, Paul Graham, peer-to-peer, peer-to-peer lending, Peter Thiel, pre–internet, principal–agent problem, profit motive, race to the bottom, Ray Kurzweil, recommendation engine, rent control, ride hailing / ride sharing, sharing economy, Silicon Valley, Snapchat, software is eating the world, South of Market, San Francisco, TaskRabbit, TED Talk, the Cathedral and the Bazaar, the long tail, The Nature of the Firm, Thomas L Friedman, transportation-network company, Travis Kalanick, Tyler Cowen, Uber and Lyft, Uber for X, uber lyft, ultimatum game, urban planning, WeWork, WikiLeaks, winner-take-all economy, Y Combinator, Yochai Benkler, Zipcar

The effectiveness of reputation systems and algorithmic ratings systems in providing a solid basis for trust is exaggerated in the Sharing Economy world. Sites that rely on algorithmic ratings have run into problems of fairness and of proper process, for example marketplace lending company Lending Club. By becoming a new way to qualify potential borrowers, the marketplace lending companies are entering into the area of credit scoring. Data scientist Cathy O’Neil argues that one reason Lending Club and others can bring such value to big financial institutions is that they provide a way to bypass credit scoring regulations, such as the Federal Trade Commission’s Equal Credit Opportunities Act (ECOA) that prohibits credit discrimination on the basis of race, color, religion and other factors and the Fair Credit Reporting Act (FCRA).


pages: 238 words: 68,914

Where Does It Hurt?: An Entrepreneur's Guide to Fixing Health Care by Jonathan Bush, Stephen Baker

Affordable Care Act / Obamacare, Alan Greenspan, Atul Gawande, barriers to entry, Clayton Christensen, commoditize, data science, informal economy, inventory management, job automation, knowledge economy, lifelogging, obamacare, personalized medicine, ride hailing / ride sharing, Ronald Reagan, Salesforce, Silicon Valley, Steve Jobs, web application, women in the workforce, working poor

It’s not just a matter of killing time in uncomfortable chairs with soap operas blaring. People often have to take time off from work, or hire a babysitter. Some drive through heavy traffic, in both directions. Their time is money, and it represents an uncounted health care expense—on top of the trillions that we get billed for. Our data scientists can capture the elapsed minutes between the moment a patient signs in and the consultation begins. That’s the waiting time. Then, conceivably, we’ll be able to look for correlations between waiting time and other behaviors. Do offices where patients wait for more than thirty minutes suffer from higher customer churn?


pages: 212 words: 69,846

The Nation City: Why Mayors Are Now Running the World by Rahm Emanuel

Affordable Care Act / Obamacare, Airbnb, Big Tech, bike sharing, blockchain, carbon footprint, clean water, data science, deindustrialization, disinformation, Donald Trump, Edward Glaeser, Enrique Peñalosa, Filter Bubble, food desert, gentrification, high-speed rail, income inequality, informal economy, Jane Jacobs, Kickstarter, Lyft, megacity, military-industrial complex, new economy, New Urbanism, offshore financial centre, opioid epidemic / opioid crisis, payday loans, ride hailing / ride sharing, Ronald Reagan, Salesforce, San Francisco homelessness, Silicon Valley, The Death and Life of Great American Cities, the High Line, transcontinental railway, Uber and Lyft, uber lyft, urban planning, War on Poverty, white flight, working poor

When a certain threshold is met in a neighborhood, we flood the area with police. We borrowed this concept from Los Angeles, then refined it to fit Chicago. We put strategic support centers in twelve of twenty-two districts, which run the data every eight hours to stay on top of crime trends. These centers are staffed around the clock by two cops and two data scientists from the University of Chicago. (If you are ever interested in witnessing some great cultural exchanges, sit in for a bit with two Chicago cops and two data nerds from the University of Chicago stuck in a room for eight hours at a time. Talk about cultural diversity.) Ken Griffin, the founder and chief executive of the investment firm Citadel, pitched in $10 million in funding for our new crime measures.


pages: 215 words: 69,370

Still Broke: Walmart's Remarkable Transformation and the Limits of Socially Conscious Capitalism by Rick Wartzman

"Friedman doctrine" OR "shareholder theory", activist fund / activist shareholder / activist investor, An Inconvenient Truth, basic income, Bernie Sanders, call centre, collective bargaining, coronavirus, COVID-19, cryptocurrency, data science, Donald Trump, employer provided health coverage, fulfillment center, full employment, future of work, George Floyd, illegal immigration, immigration reform, income inequality, Jeff Bezos, job automation, Kickstarter, labor-force participation, low skilled workers, Marc Benioff, old-boy network, race to the bottom, RAND corporation, rolodex, Ronald Reagan, Salesforce, shareholder value, supply-chain management, TikTok, Triangle Shirtwaist Factory, union organizing, universal basic income, War on Poverty, warehouse robotics, We are the 99%, women in the workforce, working poor

“Everywhere I traveled—even for vacation—I would talk to the cashiers and the people in the stores, and I would ask questions,” she said. “I would do that for our stores, and then I would do that for the competition.” After enough of these outings, she began to pick up on something. “If you’d go into a Starbucks,” said Ormanidou, who in 2016 would leave Walmart to become a data scientist at the coffee chain, “for the most part, people would say, ‘Oh, I love it here. They accept me. I’m happy. I love my job.’ But you would go to Walmart and everybody would complain. It’s the same candidates, the same people who apply to Target and Walmart and McDonald’s and Starbucks. How do we end up with the people who are not happy?


pages: 278 words: 74,880

A World of Three Zeros: The New Economics of Zero Poverty, Zero Unemployment, and Zero Carbon Emissions by Muhammad Yunus

"Friedman doctrine" OR "shareholder theory", active measures, Bernie Sanders, biodiversity loss, Capital in the Twenty-First Century by Thomas Piketty, clean water, conceptual framework, crony capitalism, data science, distributed generation, Donald Trump, financial engineering, financial independence, fixed income, full employment, high net worth, income inequality, Indoor air pollution, Internet of things, invisible hand, Jeff Bezos, job automation, Lean Startup, Marc Benioff, Mark Zuckerberg, megacity, microcredit, new economy, Occupy movement, profit maximization, Silicon Valley, the market place, The Wealth of Nations by Adam Smith, too big to fail, Tragedy of the Commons, unbanked and underbanked, underbanked, urban sprawl, young professional

Thanks in large part to the attractiveness of this online platform, in less than three years, the Food Assembly has spread to more than seven hundred locations in France, Belgium, the United Kingdom, Spain, Germany, and Italy—a vivid illustration of what I mean by the multiplying power of digital ICT! MakeSense is continuing to develop and refine its use of technological tools to enhance and spread social business. Beginning in 2016, a data scientist with expertise in developing and applying advanced analytic tools came to work at MakeSense thanks to a grant from his main employer, the media company Bloomberg L.P. The scientist is working on a system to track and measure the performance of social business projects. The goal is to develop new, more accurate ways of determining which methodologies and practices produce the best results for the people whom the social business is designed to benefit.


pages: 300 words: 76,638

The War on Normal People: The Truth About America's Disappearing Jobs and Why Universal Basic Income Is Our Future by Andrew Yang

3D printing, Airbnb, assortative mating, augmented reality, autonomous vehicles, basic income, Bear Stearns, behavioural economics, Ben Horowitz, Bernie Sanders, call centre, corporate governance, cryptocurrency, data science, David Brooks, DeepMind, Donald Trump, Elon Musk, falling living standards, financial deregulation, financial engineering, full employment, future of work, global reserve currency, income inequality, Internet of things, invisible hand, Jeff Bezos, job automation, John Maynard Keynes: technological unemployment, Khan Academy, labor-force participation, longitudinal study, low skilled workers, Lyft, manufacturing employment, Mark Zuckerberg, megacity, meritocracy, Narrative Science, new economy, passive income, performance metric, post-work, quantitative easing, reserve currency, Richard Florida, ride hailing / ride sharing, risk tolerance, robo advisor, Ronald Reagan, Rutger Bregman, Sam Altman, San Francisco homelessness, self-driving car, shareholder value, Silicon Valley, Simon Kuznets, single-payer health, Stephen Hawking, Steve Ballmer, supercomputer in your pocket, tech worker, technoutopianism, telemarketer, The future is already here, The Wealth of Nations by Adam Smith, traumatic brain injury, Tyler Cowen, Tyler Cowen: Great Stagnation, Uber and Lyft, uber lyft, unemployed young men, universal basic income, urban renewal, warehouse robotics, white flight, winner-take-all economy, Y Combinator

Every innovation will bring with it new opportunities, and some will be difficult to predict. Self-driving cars and trucks will bring with them a need for improved infrastructure and thus perhaps some construction jobs. The demise of retail could make drone pilots more of a need over time. The proliferation of data is already making data scientists a hot new job category. The problem is that the new jobs are almost certain to be in different places than existing ones and will be less numerous than the ones that disappear. They will generally require higher levels of education than the displaced workers have. And it will be very unlikely for a displaced worker to move, identify the need, gain skills, and fill the new role.


pages: 252 words: 73,131

The Inner Lives of Markets: How People Shape Them—And They Shape Us by Tim Sullivan

Abraham Wald, Airbnb, airport security, Al Roth, Alvin Roth, Andrei Shleifer, attribution theory, autonomous vehicles, barriers to entry, behavioural economics, Brownian motion, business cycle, buy and hold, centralized clearinghouse, Chuck Templeton: OpenTable:, classic study, clean water, conceptual framework, congestion pricing, constrained optimization, continuous double auction, creative destruction, data science, deferred acceptance, Donald Trump, Dutch auction, Edward Glaeser, experimental subject, first-price auction, framing effect, frictionless, fundamental attribution error, George Akerlof, Goldman Sachs: Vampire Squid, Gunnar Myrdal, helicopter parent, information asymmetry, Internet of things, invisible hand, Isaac Newton, iterative process, Jean Tirole, Jeff Bezos, Johann Wolfgang von Goethe, John Nash: game theory, John von Neumann, Joseph Schumpeter, Kenneth Arrow, late fees, linear programming, Lyft, market clearing, market design, market friction, medical residency, multi-sided market, mutually assured destruction, Nash equilibrium, Occupy movement, opioid epidemic / opioid crisis, Pareto efficiency, Paul Samuelson, Peter Thiel, pets.com, pez dispenser, power law, pre–internet, price mechanism, price stability, prisoner's dilemma, profit motive, proxy bid, RAND corporation, ride hailing / ride sharing, Robert Shiller, Robert Solow, Ronald Coase, school choice, school vouchers, scientific management, sealed-bid auction, second-price auction, second-price sealed-bid, sharing economy, Silicon Valley, spectrum auction, Steve Jobs, Tacoma Narrows Bridge, techno-determinism, technoutopianism, telemarketer, The Market for Lemons, The Wisdom of Crowds, Thomas Malthus, Thorstein Veblen, trade route, transaction costs, two-sided market, uber lyft, uranium enrichment, Vickrey auction, Vilfredo Pareto, WarGames: Global Thermonuclear War, winner-take-all economy

The rest are plain-vanilla fixed-price sales, just as one would see listed from a third-party seller on Amazon, which makes buying things on eBay fundamentally not that different from the way people have done their shopping for the past century or so. Given the critical importance of this shift in online commerce for eBay’s bottom line, it’s no surprise that data scientists within the company’s research group have thoroughly studied the change. In a collaboration with Stanford economists, eBay researchers have dug into the reasons behind the decline of the company’s auction business. Their findings matter to eBay executives mapping out the business’s future, and also for those of us who are simply trying to make sense of how the internet has changed the nature of markets and, just as important, the ways in which it hasn’t.


pages: 281 words: 78,317

But What if We're Wrong? Thinking About the Present as if It Were the Past by Chuck Klosterman

a long time ago in a galaxy far, far away, Affordable Care Act / Obamacare, British Empire, citizen journalism, cosmological constant, dark matter, data science, Easter island, Edward Snowden, Elon Musk, Francis Fukuyama: the end of history, Frank Gehry, George Santayana, Gerolamo Cardano, ghettoisation, Golden age of television, Hans Moravec, Higgs boson, Howard Zinn, Isaac Newton, Joan Didion, Large Hadron Collider, Nick Bostrom, non-fiction novel, obamacare, pre–internet, public intellectual, Ralph Nader, Ray Kurzweil, Ronald Reagan, Seymour Hersh, Silicon Valley, Stephen Hawking, TED Talk, the medium is the message, the scientific method, Thomas Kuhn: the structure of scientific revolutions, too big to fail, Y2K

It’s easy to discover a new planet and then work up the math proving that it’s there; it’s quite another to mathematically insist a massive undiscovered planet should be precisely where it ends up being. This is a different level of correctness. It’s not interpretative, because numbers have no agenda, no sense of history, and no sense of humor. The Pythagorean theorem doesn’t need the existence of Mr. Pythagoras in order to work exactly as it does. I have a friend who’s a data scientist, currently working on the economics of mobile gaming environments. He knows a great deal about probability theory,35 so I asked him if our contemporary understanding of probability is still evolving and if the way people understood probability three hundred years ago has any relationship to how we will gauge probability three hundred years from today.


pages: 280 words: 71,268

Measure What Matters: How Google, Bono, and the Gates Foundation Rock the World With OKRs by John Doerr

Abraham Maslow, Albert Einstein, Big Tech, Bob Noyce, cloud computing, collaborative editing, commoditize, crowdsourcing, data science, fail fast, Fairchild Semiconductor, Firefox, Frederick Winslow Taylor, Google Chrome, Google Earth, Google X / Alphabet X, Haight Ashbury, hockey-stick growth, intentional community, Jeff Bezos, job satisfaction, Khan Academy, knowledge worker, Mary Meeker, Menlo Park, meta-analysis, PageRank, Paul Buchheit, Ray Kurzweil, risk tolerance, Salesforce, scientific management, self-driving car, Sheryl Sandberg, side project, Silicon Valley, Silicon Valley startup, Skype, Steve Jobs, Steven Levy, subscription business, Susan Wojcicki, web application, Yogi Berra, éminence grise

Remind is on its way to solving that problem—by focusing on what matters. 6 Commit: The Nuna Story Jini Kim Cofounder and CEO Nuna is the story of the passionate Jini Kim, propelled by family tragedy to deliver better health care to huge numbers of Americans. Of how she bootstrapped Nuna through years of rejection. And of how she recruited engineers and data scientists to commit to a wildly audacious goal: building a new Medicaid data platform, from scratch. Alongside focus, commitment is a core element of our first superpower. In implementing OKRs, leaders must publicly commit to their objectives and stay steadfast. At Nuna, a health care data platform and analytics company, the cofounders overcame a false start with OKRs.


pages: 290 words: 73,000

Algorithms of Oppression: How Search Engines Reinforce Racism by Safiya Umoja Noble

A Declaration of the Independence of Cyberspace, affirmative action, Airbnb, algorithmic bias, Alvin Toffler, Black Lives Matter, borderless world, cloud computing, conceptual framework, critical race theory, crowdsourcing, data science, desegregation, digital divide, disinformation, Donald Trump, Edward Snowden, fake news, Filter Bubble, Firefox, Future Shock, Gabriella Coleman, gamification, Google Earth, Google Glasses, housing crisis, illegal immigration, immigration reform, information retrieval, information security, Internet Archive, Jaron Lanier, John Perry Barlow, military-industrial complex, Mitch Kapor, Naomi Klein, new economy, Northpointe / Correctional Offender Management Profiling for Alternative Sanctions, PageRank, performance metric, phenotype, profit motive, Silicon Valley, Silicon Valley ideology, Snapchat, the long tail, Tim Cook: Apple, union organizing, women in the workforce, work culture, yellow journalism

In attendance was the journalist Julia Angwin, one of the investigators of the breaking story about courtroom sentencing software Northpointe, used for risk assessment by judges to determine the alleged future criminality of defendants.6 She and her colleagues determined that this type of artificial intelligence miserably mispredicted future criminal activity and led to the overincarceration of Black defendants. Conversely, the reporters found it was much more likely to predict that White criminals would not offend again, despite the data showing that this was not at all accurate. Sitting next to me was Cathy O’Neil, a data scientist and the author of the book Weapons of Math Destruction, who has an insider’s view of the way that math and big data are directly implicated in the financial and housing crisis of 2008 (which, incidentally, destroyed more African American wealth than any other event in the United States, save for not compensating African Americans for three hundred years of forced enslavement).


pages: 333 words: 76,990

The Long Good Buy: Analysing Cycles in Markets by Peter Oppenheimer

Alan Greenspan, asset allocation, banking crisis, banks create money, barriers to entry, behavioural economics, benefit corporation, Berlin Wall, Big bang: deregulation of the City of London, Black Monday: stock market crash in 1987, book value, Bretton Woods, business cycle, buy and hold, Cass Sunstein, central bank independence, collective bargaining, computer age, credit crunch, data science, debt deflation, decarbonisation, diversification, dividend-yielding stocks, equity premium, equity risk premium, Fall of the Berlin Wall, financial engineering, financial innovation, fixed income, Flash crash, foreign exchange controls, forward guidance, Francis Fukuyama: the end of history, general purpose technology, gentrification, geopolitical risk, George Akerlof, Glass-Steagall Act, household responsibility system, housing crisis, index fund, invention of the printing press, inverted yield curve, Isaac Newton, James Watt: steam engine, Japanese asset price bubble, joint-stock company, Joseph Schumpeter, Kickstarter, Kondratiev cycle, liberal capitalism, light touch regulation, liquidity trap, Live Aid, low interest rates, market bubble, Mikhail Gorbachev, mortgage debt, negative equity, Network effects, new economy, Nikolai Kondratiev, Nixon shock, Nixon triggered the end of the Bretton Woods system, oil shock, open economy, Phillips curve, price stability, private sector deleveraging, Productivity paradox, quantitative easing, railway mania, random walk, Richard Thaler, risk free rate, risk tolerance, risk-adjusted returns, Robert Shiller, Robert Solow, Ronald Reagan, Savings and loan crisis, savings glut, secular stagnation, Shenzhen special economic zone, Simon Kuznets, South Sea Bubble, special economic zone, stocks for the long run, tail risk, Tax Reform Act of 1986, technology bubble, The Great Moderation, too big to fail, total factor productivity, trade route, tulip mania, yield curve

This component of a strong narrative that drives the interest in investment was observed by renowned Austrian economist Joseph Schumpeter, who argued that speculation often occurs at the start of a new industry. More recently, in a testimony before the US Congress on 26 February 1997, then-chairman of the Federal Reserve Alan Greenspan noted that ‘regrettably, history is strewn with visions of such “new eras” that, in the end, have proven to be a mirage’. A recent study by data scientists found that, in a sample of 51 major innovations introduced between 1825 and 2000, bubbles in equity prices were evident in 73% of the cases. They also found that the magnitude of these bubbles increases with the radicalness of innovations, with their potential to generate indirect network effects and with their public visibility at the time of commercialisation.12 Although it is not obvious that innovation was a trigger in the case of the tulip mania, it could be argued that it was important in the financial bubbles of the South Sea Company in Great Britain and the Mississippi Company in France in 1720.


pages: 269 words: 70,543

Tech Titans of China: How China's Tech Sector Is Challenging the World by Innovating Faster, Working Harder, and Going Global by Rebecca Fannin

"World Economic Forum" Davos, Adam Neumann (WeWork), Airbnb, augmented reality, autonomous vehicles, Benchmark Capital, Big Tech, bike sharing, blockchain, call centre, cashless society, Chuck Templeton: OpenTable:, clean tech, cloud computing, computer vision, connected car, corporate governance, cryptocurrency, data is the new oil, data science, deep learning, Deng Xiaoping, Didi Chuxing, digital map, disruptive innovation, Donald Trump, El Camino Real, electricity market, Elon Musk, fake news, family office, fear of failure, fulfillment center, glass ceiling, global supply chain, Great Leap Forward, income inequality, industrial robot, information security, Internet of things, invention of movable type, Jeff Bezos, Kickstarter, knowledge worker, Lyft, Mark Zuckerberg, Mary Meeker, megacity, Menlo Park, money market fund, Network effects, new economy, peer-to-peer lending, personalized medicine, Peter Thiel, QR code, RFID, ride hailing / ride sharing, Sand Hill Road, self-driving car, sharing economy, Shenzhen was a fishing village, Silicon Valley, Silicon Valley startup, Skype, smart cities, smart transportation, Snapchat, social graph, SoftBank, software as a service, South China Sea, sovereign wealth fund, speech recognition, stealth mode startup, Steve Jobs, stock buybacks, supply-chain management, tech billionaire, TechCrunch disrupt, TikTok, Tim Cook: Apple, Travis Kalanick, Uber and Lyft, Uber for X, uber lyft, urban planning, Vision Fund, warehouse automation, WeWork, winner-take-all economy, Y Combinator, young professional

Consumers complete the entire lending process over their smartphone and don’t need an established credit history—an issue among young people starting in their careers. Loan decisions for individual borrowers are made online within seconds. One hint: don’t fill out the online form in all capital letters. WeLab has found applicants who write in upper case are not good credit risks. A technology team of more than 210 engineers and data scientists have guided WeLab in reinventing traditional lending and assessing credit risks by three proprietary AI systems: WeDefend detects fraud and suspicious behavior by analyzing more than 2,500 user data points in under one second. WeReach peeks into consumers’ influence and interactions with social connections.


pages: 240 words: 78,436

Open for Business: Harnessing the Power of Platform Ecosystems by Lauren Turner Claire, Laure Claire Reillier, Benoit Reillier

Airbnb, Amazon Mechanical Turk, Amazon Web Services, augmented reality, autonomous vehicles, barriers to entry, basic income, benefit corporation, Blitzscaling, blockchain, carbon footprint, Chuck Templeton: OpenTable:, cloud computing, collaborative consumption, commoditize, crowdsourcing, data science, deep learning, Diane Coyle, Didi Chuxing, disintermediation, distributed ledger, driverless car, fake news, fulfillment center, future of work, George Akerlof, independent contractor, intangible asset, Internet of things, Jean Tirole, Jeff Bezos, Kickstarter, knowledge worker, Lean Startup, Lyft, Mark Zuckerberg, market design, Metcalfe’s law, minimum viable product, multi-sided market, Network effects, Paradox of Choice, Paul Graham, peer-to-peer lending, performance metric, Peter Thiel, platform as a service, price discrimination, price elasticity of demand, profit motive, ride hailing / ride sharing, Sam Altman, search costs, self-driving car, seminal paper, shareholder value, sharing economy, Silicon Valley, Skype, smart contracts, Snapchat, software as a service, Steve Jobs, Steve Wozniak, TaskRabbit, the long tail, The Market for Lemons, Tim Cook: Apple, transaction costs, two-sided market, Uber and Lyft, uber lyft, universal basic income, Y Combinator

This process of collecting human observations about the relevance of search results is also used by Facebook to improve what people should see first in their newsfeed.7 It started as a feed quality panel experiment in 2014, asking people to provide detailed feedback on what they saw in their newsfeeds, which posts they liked and why, and what posts they would have liked to see instead. This qualitative human feedback exposed blind spots that data scientists, who work on improving the newsfeed algorithm, could not identify with machine learning. Facebook now runs feed quality panels across the US and international markets, and combines both quantitative and qualitative approaches to optimize its newsfeed algorithm. Search innovation Search is an area that is constantly evolving.


pages: 244 words: 78,238

Cabin Fever: The Harrowing Journey of a Cruise Ship at the Dawn of a Pandemic by Michael Smith, Jonathan Franklin

airport security, Boeing 747, call centre, coronavirus, COVID-19, data science, Donald Trump, global pandemic, lockdown, offshore financial centre, Panamax, Port of Oakland, Snapchat, social distancing, Suez canal 1869

Aboard the Zaandam, information about typical COVID-19 symptoms took time to get around and then spread from cabin to cabin, one person to the next. Passengers were even less informed than people onshore. Anne hadn’t heard the news that losing one’s sense of taste or smell was such a clear sign of COVID-19 that data scientists were now tracking Google queries: “Why can’t I taste my food?” and “Why can’t I smell anything?” These searches were now considered frontline markers of the pandemic’s international expansion. But Arthur, on dry land and far from the Zaandam, was informed. “I’d just heard an epidemiologist describe lack of taste as one of the symptoms of COVID,” he said.


pages: 232 words: 72,483

Immortality, Inc. by Chip Walter

23andMe, Airbnb, Albert Einstein, Arthur D. Levinson, bioinformatics, Buckminster Fuller, cloud computing, CRISPR, data science, disintermediation, double helix, Elon Musk, Isaac Newton, Jeff Bezos, Larry Ellison, Law of Accelerating Returns, life extension, Menlo Park, microbiome, mouse model, pattern recognition, Peter Thiel, phenotype, radical life extension, Ray Kurzweil, Recombinant DNA, Rodney Brooks, self-driving car, Silicon Valley, Silicon Valley startup, Snapchat, South China Sea, SpaceShipOne, speech recognition, statistical model, stem cell, Stephen Hawking, Steve Jobs, TED Talk, Thomas Bayes, zero day

Bloom had other examples of how insights into the human genome could kill you or save your life, depending. Take the infamous H1N1 flu epidemic of 2009. H1N1 killed 203,000 people worldwide—one of the worst pandemics in recent history. The first known outbreak was in Veracruz, Mexico. In analyzing the epidemic’s genomic data, HLI’s chief data scientist Amalio Telenti found that for every 40,000 children, one died of the disease. That wasn’t a lot (unless you were the child who died), but the statistics had a “genomic” ring, as if something in the genes made some children more susceptible to the virus than others. When Telenti looked at the records of the children that died, he noticed about 60 percent had preexisting lung illnesses like asthma or cystic fibrosis.


pages: 305 words: 75,697

Cogs and Monsters: What Economics Is, and What It Should Be by Diane Coyle

3D printing, additive manufacturing, Airbnb, Al Roth, Alan Greenspan, algorithmic management, Amazon Web Services, autonomous vehicles, banking crisis, barriers to entry, behavioural economics, Big bang: deregulation of the City of London, biodiversity loss, bitcoin, Black Lives Matter, Boston Dynamics, Bretton Woods, Brexit referendum, business cycle, call centre, Carmen Reinhart, central bank independence, choice architecture, Chuck Templeton: OpenTable:, cloud computing, complexity theory, computer age, conceptual framework, congestion charging, constrained optimization, coronavirus, COVID-19, creative destruction, credit crunch, data science, DeepMind, deglobalization, deindustrialization, Diane Coyle, discounted cash flows, disintermediation, Donald Trump, Edward Glaeser, en.wikipedia.org, endogenous growth, endowment effect, Erik Brynjolfsson, eurozone crisis, everywhere but in the productivity statistics, Evgeny Morozov, experimental subject, financial deregulation, financial innovation, financial intermediation, Flash crash, framing effect, general purpose technology, George Akerlof, global supply chain, Goodhart's law, Google bus, haute cuisine, High speed trading, hockey-stick growth, Ida Tarbell, information asymmetry, intangible asset, Internet of things, invisible hand, Jaron Lanier, Jean Tirole, job automation, Joseph Schumpeter, Kenneth Arrow, Kenneth Rogoff, knowledge economy, knowledge worker, Les Trente Glorieuses, libertarian paternalism, linear programming, lockdown, Long Term Capital Management, loss aversion, low earth orbit, lump of labour, machine readable, market bubble, market design, Menlo Park, millennium bug, Modern Monetary Theory, Mont Pelerin Society, multi-sided market, Myron Scholes, Nash equilibrium, Nate Silver, Network effects, Occupy movement, Pareto efficiency, payday loans, payment for order flow, Phillips curve, post-industrial society, price mechanism, Productivity paradox, quantitative easing, randomized controlled trial, 
rent control, rent-seeking, ride hailing / ride sharing, road to serfdom, Robert Gordon, Robert Shiller, Robert Solow, Robinhood: mobile stock trading app, Ronald Coase, Ronald Reagan, San Francisco homelessness, savings glut, school vouchers, sharing economy, Silicon Valley, software is eating the world, spectrum auction, statistical model, Steven Pinker, tacit knowledge, The Chicago School, The Future of Employment, The Great Moderation, the map is not the territory, The Rise and Fall of American Growth, the scientific method, The Signal and the Noise by Nate Silver, the strength of weak ties, The Wealth of Nations by Adam Smith, total factor productivity, transaction costs, Uber for X, urban planning, winner-take-all economy, Winter of Discontent, women in the workforce, Y2K

This means that companies that used to invest in servers and other equipment, and hire people to staff large IT departments, no longer need to do so. More and more companies, and pretty much all start-ups, do not make these investments at all now but instead use cloud services such as Amazon Web Services, or Microsoft’s Azure. Executives I have interviewed told me they used to have IT departments with skilled data scientists costing many tens or hundreds of thousands of pounds a year, but now for a few pounds on the company credit card they can simply use services provided by cloud platforms, with the latest software and cutting edge AI. Big firms and government departments and agencies have switched to cloud computing, and new firms start with it.


pages: 280 words: 82,355

Extreme Teams: Why Pixar, Netflix, AirBnB, and Other Cutting-Edge Companies Succeed Where Most Fail by Robert Bruce Shaw, James Foster, Brilliance Audio

Airbnb, augmented reality, benefit corporation, Blitzscaling, call centre, cloud computing, data science, deliberate practice, Elon Musk, emotional labour, financial engineering, future of work, holacracy, inventory management, Jeff Bezos, job satisfaction, Jony Ive, karōshi / gwarosa / guolaosi, loose coupling, meta-analysis, nuclear winter, Paul Graham, peer-to-peer, peer-to-peer model, performance metric, Peter Thiel, sharing economy, Sheryl Sandberg, Silicon Valley, social intelligence, SoftBank, Steve Jobs, TED Talk, Tony Fadell, Tony Hsieh, work culture

They also have the flexibility to balance long and short term work, creating business impact while managing technical debt. Does this mean engineers just do whatever they want? No. They work to define and prioritize impactful work with the rest of their team including product managers, designers, data scientists and others.” nerds.airbnb.com/engineering-culture-airbnb/. 10. Owen Thomas, “How Airbnb Manages Not to Manage Engineers.” 11. The importance of experience at Airbnb is suggested by the fact that the head of what most firms call human resources is called the head of employee experience at Airbnb.


pages: 252 words: 78,780

Lab Rats: How Silicon Valley Made Work Miserable for the Rest of Us by Dan Lyons

"Friedman doctrine" OR "shareholder theory", "Susan Fowler" uber, "World Economic Forum" Davos, Airbnb, Amazon Robotics, Amazon Web Services, antiwork, Apple II, augmented reality, autonomous vehicles, basic income, Big Tech, bitcoin, blockchain, Blue Ocean Strategy, business process, call centre, Cambridge Analytica, Clayton Christensen, clean water, collective bargaining, corporate governance, corporate social responsibility, creative destruction, cryptocurrency, data science, David Heinemeier Hansson, digital rights, Donald Trump, Elon Musk, Ethereum, ethereum blockchain, fake news, full employment, future of work, gig economy, Gordon Gekko, greed is good, Hacker News, hiring and firing, holacracy, housing crisis, impact investing, income inequality, informal economy, initial coin offering, Jeff Bezos, job automation, job satisfaction, job-hopping, John Gruber, John Perry Barlow, Joseph Schumpeter, junk bonds, Kanban, Kevin Kelly, knowledge worker, Larry Ellison, Lean Startup, loose coupling, Lyft, Marc Andreessen, Mark Zuckerberg, McMansion, Menlo Park, Milgram experiment, minimum viable product, Mitch Kapor, move fast and break things, new economy, Panopticon Jeremy Bentham, Parker Conrad, Paul Graham, paypal mafia, Peter Thiel, plutocrats, precariat, prosperity theology / prosperity gospel / gospel of success, public intellectual, RAND corporation, remote working, RFID, ride hailing / ride sharing, Ronald Reagan, Rubik’s Cube, Ruby on Rails, Sam Altman, San Francisco homelessness, Sand Hill Road, scientific management, self-driving car, shareholder value, Sheryl Sandberg, Silicon Valley, Silicon Valley startup, six sigma, Skinner box, Skype, Social Responsibility of Business Is to Increase Its Profits, SoftBank, software is eating the world, Stanford prison experiment, stem cell, Steve Jobs, Steve Wozniak, Stewart Brand, stock buybacks, super pumped, TaskRabbit, tech bro, tech worker, TechCrunch disrupt, TED Talk, telemarketer, Tesla Model S, Thomas 
Davenport, Tony Hsieh, Toyota Production System, traveling salesman, Travis Kalanick, tulip mania, Uber and Lyft, Uber for X, uber lyft, universal basic income, web application, WeWork, Whole Earth Catalog, work culture, workplace surveillance, Y Combinator, young professional, Zenefits

Human recruiters still had to look at all the videos, and there is only so much they can do. Sure, they could fast-forward through videos and make decisions. But to scale up even more, “We started asking, how can we use technology to take the place of what humans are doing?” Parker says. HireVue assembled a team of data scientists and industrial and organizational psychologists, who took existing science on things like “facial action units” and encoded it into software. Two years ago HireVue began offering this service to its customers. HireVue has more than seven hundred clients, including Nike, Intel, Honeywell, and Delta Airlines.


pages: 444 words: 84,486

Radicalized by Cory Doctorow

activist fund / activist shareholder / activist investor, Affordable Care Act / Obamacare, air gap, Bernie Sanders, Black Lives Matter, call centre, crisis actor, crowdsourcing, cryptocurrency, data science, Edward Snowden, Flash crash, G4S, high net worth, information asymmetry, Kim Stanley Robinson, license plate recognition, Neal Stephenson, obamacare, old-boy network, public intellectual, satellite internet, six sigma, Social Justice Warrior, stock buybacks, TaskRabbit

They’ve been watching the darknet boards, they know that everyone’s been figuring out how to jailbreak their shit while we’ve been getting restarted, and they figure all those people could be customers, but instead of paying for food we sell them, they’d pay us to use food someone else sold them.” Salima almost laughed. It was a crime if she did it, a product if they sold it to her. Everything could be a product. “It’s weird, I know. But here’s where you come in. They’ve got this research unit, anthropologists and data scientists and marketers, and they want to talk to people like you, find out what you’d pay for different kinds of products. They want to see if you’d sell the package to your neighbors, if you could get a cut of the money from them, like a commission? They’ve got one plan, you could teach those kids you were working with to sell paid unlocking to the people in your building, and they’d get a commission and you’d get a commission because you recruited them.”


pages: 266 words: 86,324

The Drunkard's Walk: How Randomness Rules Our Lives by Leonard Mlodinow

Albert Einstein, Alfred Russel Wallace, Antoine Gombaud: Chevalier de Méré, Atul Gawande, behavioural economics, Brownian motion, butterfly effect, correlation coefficient, Daniel Kahneman / Amos Tversky, data science, Donald Trump, feminist movement, forensic accounting, Gary Kildall, Gerolamo Cardano, Henri Poincaré, index fund, Isaac Newton, law of one price, Monty Hall problem, pattern recognition, Paul Erdős, Pepto Bismol, probability theory / Blaise Pascal / Pierre de Fermat, RAND corporation, random walk, Richard Feynman, Ronald Reagan, Stephen Hawking, Steve Jobs, The Wealth of Nations by Adam Smith, The Wisdom of Crowds, Thomas Bayes, V2 rocket, Watson beat the top human players on Jeopardy!

But what was truly striking was that when I made a bar graph showing how the number of buyers diminished as the buyers’ age strayed from the mean of seven, I found that the graph took a very familiar shape—that of the error law. It is one thing to suspect that archers and astronomers, chemists and marketers, encounter the same error law; it is another to discover the specific form of that law. Driven by the need to analyze astronomical data, scientists like Daniel Bernoulli and Laplace postulated a series of flawed candidates in the late eighteenth century. As it turned out, the correct mathematical function describing the error law—the bell curve—had been under their noses the whole time. It had been discovered in London in a different context many decades earlier.


pages: 361 words: 81,068

The Internet Is Not the Answer by Andrew Keen

"World Economic Forum" Davos, 3D printing, A Declaration of the Independence of Cyberspace, Airbnb, AltaVista, Andrew Keen, AOL-Time Warner, augmented reality, Bay Area Rapid Transit, Berlin Wall, Big Tech, bitcoin, Black Swan, Bob Geldof, Boston Dynamics, Burning Man, Cass Sunstein, Charles Babbage, citizen journalism, Clayton Christensen, clean water, cloud computing, collective bargaining, Colonization of Mars, computer age, connected car, creative destruction, cuban missile crisis, data science, David Brooks, decentralized internet, DeepMind, digital capitalism, disintermediation, disruptive innovation, Donald Davies, Downton Abbey, Dr. Strangelove, driverless car, Edward Snowden, Elon Musk, Erik Brynjolfsson, fail fast, Fall of the Berlin Wall, Filter Bubble, Francis Fukuyama: the end of history, Frank Gehry, Frederick Winslow Taylor, frictionless, fulfillment center, full employment, future of work, gentrification, gig economy, global village, Google bus, Google Glasses, Hacker Ethic, happiness index / gross national happiness, holacracy, income inequality, index card, informal economy, information trail, Innovator's Dilemma, Internet of things, Isaac Newton, Jaron Lanier, Jeff Bezos, job automation, John Perry Barlow, Joi Ito, Joseph Schumpeter, Julian Assange, Kevin Kelly, Kevin Roose, Kickstarter, Kiva Systems, Kodak vs Instagram, Lean Startup, libertarian paternalism, lifelogging, Lyft, Marc Andreessen, Mark Zuckerberg, Marshall McLuhan, Martin Wolf, Mary Meeker, Metcalfe’s law, military-industrial complex, move fast and break things, Nate Silver, Neil Armstrong, Nelson Mandela, Network effects, new economy, Nicholas Carr, nonsequential writing, Norbert Wiener, Norman Mailer, Occupy movement, packet switching, PageRank, Panopticon Jeremy Bentham, Patri Friedman, Paul Graham, peer-to-peer, peer-to-peer rental, Peter Thiel, plutocrats, Potemkin village, power law, precariat, pre–internet, printed gun, Project Xanadu, RAND corporation, Ray Kurzweil, reality 
distortion field, ride hailing / ride sharing, Robert Metcalfe, Robert Solow, San Francisco homelessness, scientific management, Second Machine Age, self-driving car, sharing economy, Sheryl Sandberg, Silicon Valley, Silicon Valley billionaire, Silicon Valley ideology, Skype, smart cities, Snapchat, social web, South of Market, San Francisco, Steve Jobs, Steve Wozniak, Steven Levy, Stewart Brand, subscription business, TaskRabbit, tech bro, tech worker, TechCrunch disrupt, Ted Nelson, telemarketer, The future is already here, The Future of Employment, the long tail, the medium is the message, the new new thing, Thomas L Friedman, Travis Kalanick, Twitter Arab Spring, Tyler Cowen, Tyler Cowen: Great Stagnation, Uber for X, uber lyft, urban planning, Vannevar Bush, warehouse robotics, Whole Earth Catalog, WikiLeaks, winner-take-all economy, work culture, working poor, Y Combinator

The drop-off that occurred during the last few years coincided with increased awareness of and sensitivity to worrisome behavior in chat rooms.”39 This “pervasive misogyny” has led some former Internet evangelists, such as the British author Charles Leadbeater, to believe that the Internet is failing to realize its potential.40 “It’s outrageous we’ve got an Internet where women are regularly abused simply for appearing on television or appearing on Twitter,” Leadbeater said. “If that were to happen in a public space, it would cause outrage.”41 Hatred is ubiquitous on the Internet. “Big hatred meets big data,” writes the Google data scientist Seth Stephens-Davidowitz about the growth of online Nazi and racist forums that attract up to four hundred thousand Americans per month.42 Then there are the haters of the haters—the digital vigilantes, such as the group OpAntiBully, who track down Internet bullies and bully them.43 Worst of all are the anonymous online bullies themselves.


pages: 247 words: 81,135

The Great Fragmentation: And Why the Future of All Business Is Small by Steve Sammartino

3D printing, additive manufacturing, Airbnb, augmented reality, barriers to entry, behavioural economics, Bill Gates: Altair 8800, bitcoin, BRICs, Buckminster Fuller, citizen journalism, collaborative consumption, cryptocurrency, data science, David Heinemeier Hansson, deep learning, disruptive innovation, driverless car, Dunbar number, Elon Musk, fiat currency, Frederick Winslow Taylor, game design, gamification, Google X / Alphabet X, haute couture, helicopter parent, hype cycle, illegal immigration, index fund, Jeff Bezos, jimmy wales, Kickstarter, knowledge economy, Law of Accelerating Returns, lifelogging, market design, Mary Meeker, Metcalfe's law, Minecraft, minimum viable product, Network effects, new economy, peer-to-peer, planned obsolescence, post scarcity, prediction markets, pre–internet, profit motive, race to the bottom, random walk, Ray Kurzweil, recommendation engine, remote working, RFID, Rubik’s Cube, scientific management, self-driving car, sharing economy, side project, Silicon Valley, Silicon Valley startup, skunkworks, Skype, social graph, social web, software is eating the world, Steve Jobs, subscription business, survivorship bias, The Home Computer Revolution, the long tail, too big to fail, US Airways Flight 1549, vertical integration, web application, zero-sum game

The efficiency low-friction labour markets create opens up an opportunity for further independence of worlds, which is more profitable for those who need things done and those doing it. The type of work to evolve will be that of projecteers. These people — who aren’t really staff members, and don’t really run companies either — are digitally facilitated freelancers with skills that are in demand from the new economic landscape, such as UX Consulting, app developers, big data scientists, community managers, cloud services specialists, online course teachers and 3D printing designers, as well as jobs that don’t exist yet. They’re niche roles for an increasingly fragmented world. The greatest fallacy in modern politics is the idea of saving jobs. There aren’t many people who hunt bison for a living in this day and age and saving jobs is a simple misallocation of taxpayers’ dollars.


pages: 321

Finding Alphas: A Quantitative Approach to Building Trading Strategies by Igor Tulchinsky

algorithmic trading, asset allocation, automated trading system, backpropagation, backtesting, barriers to entry, behavioural economics, book value, business cycle, buy and hold, capital asset pricing model, constrained optimization, corporate governance, correlation coefficient, credit crunch, Credit Default Swap, currency risk, data science, deep learning, discounted cash flows, discrete time, diversification, diversified portfolio, Eugene Fama: efficient market hypothesis, financial engineering, financial intermediation, Flash crash, Geoffrey Hinton, implied volatility, index arbitrage, index fund, intangible asset, iterative process, Long Term Capital Management, loss aversion, low interest rates, machine readable, market design, market microstructure, merger arbitrage, natural language processing, passive investing, pattern recognition, performance metric, Performance of Mutual Funds in the Period, popular capitalism, prediction markets, price discovery process, profit motive, proprietary trading, quantitative trading / quantitative finance, random walk, Reminiscences of a Stock Operator, Renaissance Technologies, risk free rate, risk tolerance, risk-adjusted returns, risk/return, selection bias, sentiment analysis, shareholder value, Sharpe ratio, short selling, Silicon Valley, speech recognition, statistical arbitrage, statistical model, stochastic process, survivorship bias, systematic bias, systematic trading, text mining, transaction costs, Vanguard fund, yield curve

Another alternative is FloatBoost, which incorporates the backtracking mechanism of floating search and repeatedly performs a backtracking to remove unfavorable weak classifiers after a new weak classifier is added by AdaBoost; this ensures a lower error rate and reduced feature set at the cost of about five times longer training time. Deep Learning Deep learning (DL) is a popular topic today – and a term that is used to discuss a number of rather distinct things. Some data scientists think DL is just a buzz word or a rebranding of neural networks. The name comes from Canadian scientist Geoffrey Hinton, who created an unsupervised method known as the restricted Boltzmann machine (RBM) for pretraining NNs with a large number of neuron layers. That was meant to improve on the backpropagation training method, but there is no strong evidence that it really was an improvement.


pages: 280 words: 82,393

Conflicted: How Productive Disagreements Lead to Better Outcomes by Ian Leslie

Atul Gawande, Ben Horowitz, Berlin Wall, Black Lives Matter, call centre, data science, different worldview, double helix, Fall of the Berlin Wall, Isaac Newton, longitudinal study, low cost airline, Mark Zuckerberg, medical malpractice, meta-analysis, Nelson Mandela, Paul Graham, Silicon Valley, Socratic dialogue, the scientific method, The Wisdom of Crowds, work culture, zero-sum game

As Graham puts it, ‘Agreeing tends to motivate people less than disagreeing.’ Readers are more likely to comment on an article or post when they disagree with it, and in disagreement they have more to say (there are only so many ways you can say, ‘I agree’). They also tend to get more animated when they disagree, which usually means getting angry. A team of data scientists in 2010 studied user activity on BBC discussion forums, measuring the emotional sentiment of nearly 2.5 million posts from 18,000 users. They found that longer discussion threads were sustained by negative comments, and that the most active users overall were more likely to express negative emotions.


pages: 291 words: 85,822

The Truth About Lies: The Illusion of Honesty and the Evolution of Deceit by Aja Raden

air gap, Ayatollah Khomeini, bank run, banking crisis, Bernie Madoff, bitcoin, blockchain, California gold rush, carbon footprint, carbon-based life, cognitive bias, cognitive dissonance, collateralized debt obligation, Credit Default Swap, credit default swaps / collateralized debt obligations, cryptocurrency, data science, disinformation, Donald Trump, fake news, intentional community, iterative process, low interest rates, Milgram experiment, mirror neurons, multilevel marketing, offshore financial centre, opioid epidemic / opioid crisis, placebo effect, Ponzi scheme, prosperity theology / prosperity gospel / gospel of success, Ronald Reagan, Ronald Reagan: Tear down this wall, selective serotonin reuptake inhibitor (SSRI), Silicon Valley, Steve Bannon, sugar pill, survivorship bias, theory of mind, too big to fail, transcontinental railway, Vincenzo Peruggia: Mona Lisa

Well, some guys at MIT proved it.35 According to a first-of-its-kind study of Twitter, fake news stories (rumors, hoaxes, propaganda) spread six times faster and significantly farther than truthful ones.36 And before you scream BOTS!, these results accounted and corrected for their impact. This was all us. Humans love to read inflammatory lies and pass them along—even when they know the stories are untrue. The study was the brainchild of its lead author, data scientist Soroush Vosoughi. Dr. Vosoughi was a Ph.D. student at the time of the 2013 Boston Marathon bombing. He was deeply disturbed by the volume and fervor of conspiracy theories emerging in the days following the bombings—most directed at a missing Brown student, whose tragic disappearance was ultimately totally unrelated.


pages: 422 words: 86,414

Hands-On RESTful API Design Patterns and Best Practices by Harihara Subramanian

blockchain, business logic, business process, cloud computing, continuous integration, create, read, update, delete, cyber-physical system, data science, database schema, DevOps, disruptive innovation, domain-specific language, fault tolerance, information security, Infrastructure as a Service, Internet of things, inventory management, job automation, Kickstarter, knowledge worker, Kubernetes, loose coupling, Lyft, machine readable, microservices, MITM: man-in-the-middle, MVC pattern, Salesforce, self-driving car, semantic web, single page application, smart cities, smart contracts, software as a service, SQL injection, supply-chain management, web application, WebSocket

ELK, which is open source software, fulfills these differing requirements in a tightly integrated manner. E stands for Elasticsearch, L for Logstash, and K for Kibana. Elasticsearch stores and indexes the logs and provides search, including fuzzy search; Logstash collects logs from different sources and transforms them; and Kibana is a graphical user interface (GUI) that helps data scientists, testers, developers, and even businesspeople search the logs as their requirements evolve. Given the significance of log analytics, there are open source as well as commercial-grade solutions for extracting log, operational, performance, scalability, and security insights from microservice interaction log data.


pages: 292 words: 92,588

The Water Will Come: Rising Seas, Sinking Cities, and the Remaking of the Civilized World by Jeff Goodell

"World Economic Forum" Davos, Airbnb, Anthropocene, carbon footprint, centre right, clean water, climate change refugee, creative destruction, data science, desegregation, Donald Trump, Dr. Strangelove, Elon Musk, failed state, fixed income, Frank Gehry, global pandemic, Google Earth, Higgs boson, illegal immigration, Intergovernmental Panel on Climate Change (IPCC), Large Hadron Collider, megacity, Murano, Venice glass, negative emissions, New Urbanism, ocean acidification, Paris climate accords, Pearl River Delta, Peter Thiel, planetary scale, Ray Kurzweil, Richard Florida, risk tolerance, Ronald Reagan, Silicon Valley, smart cities, South China Sea, space junk, urban planning, urban renewal, wikimedia commons

These measurements, from which the influence of the tides and the waves is removed, are free from distortion by rising or sinking land. When this data is combined with tide gauge averages, as well as measurements from ocean floats that record changes in the heat content of the ocean, it gives scientists a very good picture of how much the sea level is rising and what the causes are. With better data, scientists are now able to more clearly understand other factors beyond land movement that lead to variations in the rate of sea-level rise. One is the gravitational fingerprinting I mentioned earlier, which pushes water into the Southern Hemisphere from melting ice sheets in Greenland and into the Northern Hemisphere from Antarctica.


pages: 336 words: 93,672

The Future of the Brain: Essays by the World's Leading Neuroscientists by Gary Marcus, Jeremy Freeman

23andMe, Albert Einstein, backpropagation, bioinformatics, bitcoin, brain emulation, cloud computing, complexity theory, computer age, computer vision, conceptual framework, correlation does not imply causation, crowdsourcing, dark matter, data acquisition, data science, deep learning, Drosophila, epigenetics, Geoffrey Hinton, global pandemic, Google Glasses, ITER tokamak, iterative process, language acquisition, linked data, mouse model, optical character recognition, pattern recognition, personalized medicine, phenotype, race to the bottom, Richard Feynman, Ronald Reagan, semantic web, speech recognition, stem cell, Steven Pinker, supply-chain management, synthetic biology, tacit knowledge, traumatic brain injury, Turing machine, twin studies, web application

The Human Brain Project aims to bridge the two. Where’s the Data? A prerequisite to creating a whole brain model is the emerging new discipline called neuroinformatics—the endeavor to apply computing technology to help solve the challenges neuroscientists face in organizing, sharing, and gaining insight from their data. Scientists have produced millions of papers and petabytes of data about the brain describing these many levels of detail—and the pace is growing even faster. Since 1990, the number of publications alone has grown from around 30,000 to nearly 100,000 per year in 2013. The number and size of large-scale datasets are also rapidly increasing—a recently produced single human brain scan consumes 1 terabyte (a thousand gigabytes) of storage—enough to fill the storage on a single laptop.


pages: 408 words: 85,118

Python for Finance by Yuxing Yan

asset-backed security, book value, business cycle, business intelligence, capital asset pricing model, constrained optimization, correlation coefficient, data science, distributed generation, diversified portfolio, financial engineering, functional programming, implied volatility, market microstructure, P = NP, p-value, quantitative trading / quantitative finance, risk free rate, Sharpe ratio, tail risk, time value of money, value at risk, volatility smile, zero-sum game

His work is focused on developing an FX Options application, and he mainly works with C++. However, he has worked with a variety of languages and technologies through the years. He is a Linux and Python enthusiast and spends his free time experimenting and developing applications with them. Mourad MOURAFIQ is a software engineer and data scientist. After successfully completing his studies in Applied Mathematics, he worked at an investment bank as a quantitative modeler in the structured products market, specializing in ABS, CDO, and CDS. Then, he worked as a quantitative analyst for the largest French bank. After a couple of years in the financial world, he discovered a passion for machine learning and computational mathematics and decided to join a start-up that specializes in software mining and artificial intelligence.


pages: 307 words: 88,180

AI Superpowers: China, Silicon Valley, and the New World Order by Kai-Fu Lee

"World Economic Forum" Davos, AI winter, Airbnb, Albert Einstein, algorithmic bias, algorithmic trading, Alignment Problem, AlphaGo, artificial general intelligence, autonomous vehicles, barriers to entry, basic income, bike sharing, business cycle, Cambridge Analytica, cloud computing, commoditize, computer vision, corporate social responsibility, cotton gin, creative destruction, crony capitalism, data science, deep learning, DeepMind, Demis Hassabis, Deng Xiaoping, deskilling, Didi Chuxing, Donald Trump, driverless car, Elon Musk, en.wikipedia.org, Erik Brynjolfsson, fake news, full employment, future of work, general purpose technology, Geoffrey Hinton, gig economy, Google Chrome, Hans Moravec, happiness index / gross national happiness, high-speed rail, if you build it, they will come, ImageNet competition, impact investing, income inequality, informal economy, Internet of things, invention of the telegraph, Jeff Bezos, job automation, John Markoff, Kickstarter, knowledge worker, Lean Startup, low skilled workers, Lyft, machine translation, mandatory minimum, Mark Zuckerberg, Menlo Park, minimum viable product, natural language processing, Neil Armstrong, new economy, Nick Bostrom, OpenAI, pattern recognition, pirate software, profit maximization, QR code, Ray Kurzweil, recommendation engine, ride hailing / ride sharing, risk tolerance, Robert Mercer, Rodney Brooks, Rubik’s Cube, Sam Altman, Second Machine Age, self-driving car, sentiment analysis, sharing economy, Silicon Valley, Silicon Valley ideology, Silicon Valley startup, Skype, SoftBank, Solyndra, special economic zone, speech recognition, Stephen Hawking, Steve Jobs, strong AI, TED Talk, The Future of Employment, Travis Kalanick, Uber and Lyft, uber lyft, universal basic income, urban planning, vertical integration, Vision Fund, warehouse robotics, Y Combinator

Instead, it will simply take over the execution of tasks that meet two criteria: they can be optimized using data, and they do not require social interaction. (I will be going into greater detail about exactly which jobs AI can and cannot replace.) Yes, there will be some new jobs created along the way—robot repairing and AI data scientists, for example. But the main thrust of AI’s employment impact is not one of job creation through deskilling but of job replacement through increasingly intelligent machines. Displaced workers can theoretically transition into other industries that are more difficult to automate, but this is itself a highly disruptive process that will take a long time.


pages: 372 words: 94,153

More From Less: The Surprising Story of How We Learned to Prosper Using Fewer Resources – and What Happens Next by Andrew McAfee

back-to-the-land, Bartolomé de las Casas, Berlin Wall, bitcoin, Blitzscaling, Branko Milanovic, British Empire, Buckminster Fuller, call centre, carbon credits, carbon footprint, carbon tax, Charles Babbage, clean tech, clean water, cloud computing, congestion pricing, Corn Laws, creative destruction, crony capitalism, data science, David Ricardo: comparative advantage, decarbonisation, DeepMind, degrowth, dematerialisation, Demis Hassabis, Deng Xiaoping, do well by doing good, Donald Trump, Edward Glaeser, en.wikipedia.org, energy transition, Erik Brynjolfsson, failed state, fake news, Fall of the Berlin Wall, Garrett Hardin, Great Leap Forward, Haber-Bosch Process, Hans Rosling, humanitarian revolution, hydraulic fracturing, income inequality, indoor plumbing, intangible asset, James Watt: steam engine, Jeff Bezos, job automation, John Snow's cholera map, joint-stock company, Joseph Schumpeter, Khan Academy, Landlord’s Game, Louis Pasteur, Lyft, Marc Andreessen, Marc Benioff, market fundamentalism, means of production, Michael Shellenberger, Mikhail Gorbachev, ocean acidification, oil shale / tar sands, opioid epidemic / opioid crisis, Paul Samuelson, peak oil, precision agriculture, price elasticity of demand, profit maximization, profit motive, risk tolerance, road to serfdom, Ronald Coase, Ronald Reagan, Salesforce, Scramble for Africa, Second Machine Age, Silicon Valley, Steve Jobs, Steven Pinker, Stewart Brand, Ted Nordhaus, TED Talk, telepresence, The Wealth of Nations by Adam Smith, Thomas Davenport, Thomas Malthus, Thorstein Veblen, total factor productivity, Tragedy of the Commons, Uber and Lyft, uber lyft, Veblen good, War on Poverty, We are as Gods, Whole Earth Catalog, World Values Survey

Samasource, founded by Janah in 2008, trains people to do entry-level technology work (such as data entry and image labeling) and connects them with employers. Online education companies such as Udacity, Coursera, and Lambda aim to provide higher-level online training. I like these efforts because they often train people for jobs that can be done wherever there’s an Internet connection. Not every coder or data scientist wants to live in a big city, or to have to move to one to acquire new skills. I’m encouraged to see that promising alternatives are now appearing. I’m also encouraged to see that business leaders are taking seriously the issue of disconnection and working on efforts to bring back economic opportunity to communities in danger of being left behind as globalization and tech progress race ahead.


pages: 297 words: 93,882

Winning Now, Winning Later by David M. Cote

activist fund / activist shareholder / activist investor, Asian financial crisis, business cycle, business logic, business process, compensation consultant, data science, hiring and firing, Internet of things, Parkinson's law, Paul Samuelson, Silicon Valley, six sigma, Steve Jobs, stock buybacks, Toyota Production System, trickle-down economics, warehouse automation

To attract these premier programmers, or “multipliers” as we called them, we began evaluating potential hires on specific skills related to programming, collaboration, and teamwork, observing their actual behavior rather than just relying on their academic record. We took a similar approach to hiring data scientists as well. Our efforts in this area helped us significantly up our game as we developed software as a business and incorporated it into more of our existing products. To expand our capability and improve recruiting, we also brought a number of multifunctional teams into a new software center we had built in Atlanta, Georgia.


pages: 420 words: 94,064

The Revolution That Wasn't: GameStop, Reddit, and the Fleecing of Small Investors by Spencer Jakab

4chan, activist fund / activist shareholder / activist investor, barriers to entry, behavioural economics, Bernie Madoff, Bernie Sanders, Big Tech, bitcoin, Black Swan, book value, buy and hold, classic study, cloud computing, coronavirus, COVID-19, crowdsourcing, cryptocurrency, data science, deal flow, democratizing finance, diversified portfolio, Dogecoin, Donald Trump, Elon Musk, Everybody Ought to Be Rich, fake news, family office, financial innovation, gamification, global macro, global pandemic, Google Glasses, Google Hangouts, Gordon Gekko, Hacker News, income inequality, index fund, invisible hand, Jeff Bezos, Jim Simons, John Bogle, lockdown, Long Term Capital Management, loss aversion, Marc Andreessen, margin call, Mark Zuckerberg, market bubble, Masayoshi Son, meme stock, Menlo Park, move fast and break things, Myron Scholes, PalmPilot, passive investing, payment for order flow, Pershing Square Capital Management, pets.com, plutocrats, profit maximization, profit motive, race to the bottom, random walk, Reminiscences of a Stock Operator, Renaissance Technologies, Richard Thaler, ride hailing / ride sharing, risk tolerance, road to serfdom, Robinhood: mobile stock trading app, Saturday Night Live, short selling, short squeeze, Silicon Valley, Silicon Valley billionaire, SoftBank, Steve Jobs, TikTok, Tony Hsieh, trickle-down economics, Vanguard fund, Vision Fund, WeWork, zero-sum game

The usual prescription is more and better “investor education,” though there is little evidence that it is effective. The meme-stock squeeze delivered a wake-up call to everyone. Even as it was still going on, furious politicians called hearings, movie studios raced to ink deals for GameStop movies, and hedge funds scrambled to hire data scientists to scour social media so they could get in early on the next mania, or at least stay out of harm’s way. There may be nothing new under the sun when it comes to investing or stock tips, but the hyperconnected, algorithmically enhanced version is an order of magnitude more potent. It was strong enough to enrage Washington, inspire Hollywood, and rattle Wall Street.


pages: 406 words: 88,977

How to Prevent the Next Pandemic by Bill Gates

augmented reality, call centre, computer vision, contact tracing, coronavirus, COVID-19, data science, demographic dividend, digital divide, digital map, disinformation, Edward Jenner, global pandemic, global supply chain, Hans Rosling, lockdown, Neal Stephenson, Picturephone, profit motive, QR code, remote working, social distancing, statistical model, TED Talk, women in the workforce, zero-sum game

Most of the team would be based at individual countries’ national public health institutes, though some would sit in the WHO’s regional offices and at its headquarters in Geneva. When there’s a potential pandemic looming, the world needs expert analysis of early data points that can confirm the threat. GERM’s data scientists would build a system for monitoring reports of clusters of suspicious cases. Its epidemiologists would monitor reports from national governments and work with WHO colleagues to identify anything that looks like an outbreak. Its product-development experts would advise governments and companies on the highest-priority drugs and vaccines.


pages: 285 words: 91,144

App Kid: How a Child of Immigrants Grabbed a Piece of the American Dream by Michael Sayman

airport security, augmented reality, Bernie Sanders, Big Tech, Cambridge Analytica, data science, Day of the Dead, fake news, Frank Gehry, Google bus, Google Chrome, Google Hangouts, Googley, hacker house, imposter syndrome, Khan Academy, Marc Benioff, Mark Zuckerberg, Menlo Park, microaggression, move fast and break things, Salesforce, San Francisco homelessness, self-driving car, Sheryl Sandberg, Silicon Valley, skeuomorphism, Snapchat, Steve Jobs, tech worker, the High Line, TikTok, Tim Cook: Apple

At my parents’ place, probably—sitting at the kitchen counter, sweating over the next app that would make or break my world. I never wanted to feel that fragile and precarious again. I made a promise to myself to devote the rest of my internship to networking like a boss. I began by reaching out to market researchers, data scientists, and other people at that level, telling them I’d like to learn about their experiences. Rarely did someone decline to meet with me. Most people, I’ve found, will embrace the opportunity to be a teacher. The secret, I learned, was not to talk about myself, beyond saying enough to reassure people that I was competent and worthy of their time.


pages: 282 words: 93,783

The Future Is Analog: How to Create a More Human World by David Sax

Alvin Toffler, augmented reality, autonomous vehicles, Bernie Sanders, big-box store, bike sharing, Black Lives Matter, blockchain, bread and circuses, Buckminster Fuller, Cal Newport, call centre, clean water, cognitive load, commoditize, contact tracing, contact tracing app, COVID-19, crowdsourcing, cryptocurrency, data science, David Brooks, deep learning, digital capitalism, Donald Trump, driverless car, Elon Musk, fiat currency, Francis Fukuyama: the end of history, future of work, gentrification, George Floyd, indoor plumbing, informal economy, Jane Jacobs, Jaron Lanier, Jeff Bezos, Kickstarter, knowledge worker, lockdown, Lyft, Marc Andreessen, Mark Zuckerberg, Marshall McLuhan, Minecraft, New Urbanism, nuclear winter, opioid epidemic / opioid crisis, Peter Thiel, RAND corporation, Ray Kurzweil, remote working, retail therapy, RFID, Richard Florida, ride hailing / ride sharing, Saturday Night Live, Shoshana Zuboff, side hustle, Sidewalk Labs, Silicon Valley, Silicon Valley startup, Skype, smart cities, social distancing, sovereign wealth fund, Steve Jobs, Superbowl ad, supply-chain management, surveillance capitalism, tech worker, technological singularity, technoutopianism, TED Talk, The Death and Life of Great American Cities, TikTok, Uber and Lyft, uber lyft, unemployed young men, urban planning, walkable city, Y2K, zero-sum game

The plan was to attempt a test bed downtown, using sensors, advanced cameras, public Wi-Fi networks, and digital kiosks to connect all sorts of city services and improve them for the mostly poorer Black and Latino residents of the area. The data would reveal gaps in parking, transportation, and policing, which would lead to quicker and better solutions by city staff. Embedding herself in the project over three years, doing everything from visiting the huge control rooms run by data scientists and statisticians to riding in the backs of police cruisers to waiting at cold bus stops, Baykurt got a front-row seat to what a smart city actually looks like when implemented on the ground. “To be honest, it doesn’t change much,” Baykurt concluded. “The hype mobilizes a lot of people. There seems to be change going on.”


pages: 1,829 words: 135,521

Python for Data Analysis: Data Wrangling with Pandas, NumPy, and IPython by Wes McKinney

Bear Stearns, business process, data science, Debian, duck typing, Firefox, general-purpose programming language, Google Chrome, Guido van Rossum, index card, p-value, quantitative trading / quantitative finance, random walk, recommendation engine, sentiment analysis, side project, sorting algorithm, statistical model, Two Sigma, type inference

Cross-validated models take longer to train, but can often yield better model performance. 13.5 Continuing Your Education While I have only skimmed the surface of some Python modeling libraries, there are more and more frameworks for various kinds of statistics and machine learning either implemented in Python or with a Python user interface. This book is focused especially on data wrangling, but there are many others dedicated to modeling and data science tools. Some excellent ones are: Introduction to Machine Learning with Python by Andreas Mueller and Sarah Guido (O’Reilly); Python Data Science Handbook by Jake VanderPlas (O’Reilly); Data Science from Scratch: First Principles with Python by Joel Grus (O’Reilly); Python Machine Learning by Sebastian Raschka (Packt Publishing); Hands-On Machine Learning with Scikit-Learn and TensorFlow by Aurélien Géron (O’Reilly). While books can be valuable resources for learning, they can sometimes grow out of date when the underlying open source software changes.

The Python community has grown immensely, and the ecosystem of open source software around it has flourished. This new edition of the book would not exist if not for the tireless efforts of the pandas core developers, who have grown the project and its user community into one of the cornerstones of the Python data science ecosystem. These include, but are not limited to, Tom Augspurger, Joris van den Bossche, Chris Bartak, Phillip Cloud, gfyoung, Andy Hayden, Masaaki Horikoshi, Stephan Hoyer, Adam Klein, Wouter Overmeire, Jeff Reback, Chang She, Skipper Seabold, Jeff Tratner, and y-p. On the actual writing of this second edition, I would like to thank the O’Reilly staff who helped me patiently with the writing process.

Among interpreted languages, for various historical and cultural reasons, Python has developed a large and active scientific computing and data analysis community. In the last 10 years, Python has gone from a bleeding-edge or “at your own risk” scientific computing language to one of the most important languages for data science, machine learning, and general software development in academia and industry. For data analysis and interactive computing and data visualization, Python will inevitably draw comparisons with other open source and commercial programming languages and tools in wide use, such as R, MATLAB, SAS, Stata, and others.


The Myth of Artificial Intelligence: Why Computers Can't Think the Way We Do by Erik J. Larson

AI winter, Alan Turing: On Computable Numbers, with an Application to the Entscheidungsproblem, Albert Einstein, Alignment Problem, AlphaGo, Amazon Mechanical Turk, artificial general intelligence, autonomous vehicles, Big Tech, Black Swan, Bletchley Park, Boeing 737 MAX, business intelligence, Charles Babbage, Claude Shannon: information theory, Computing Machinery and Intelligence, conceptual framework, correlation does not imply causation, data science, deep learning, DeepMind, driverless car, Elon Musk, Ernest Rutherford, Filter Bubble, Geoffrey Hinton, Georg Cantor, Higgs boson, hive mind, ImageNet competition, information retrieval, invention of the printing press, invention of the wheel, Isaac Newton, Jaron Lanier, Jeff Hawkins, John von Neumann, Kevin Kelly, Large Hadron Collider, Law of Accelerating Returns, Lewis Mumford, Loebner Prize, machine readable, machine translation, Nate Silver, natural language processing, Nick Bostrom, Norbert Wiener, PageRank, PalmPilot, paperclip maximiser, pattern recognition, Peter Thiel, public intellectual, Ray Kurzweil, retrograde motion, self-driving car, semantic web, Silicon Valley, social intelligence, speech recognition, statistical model, Stephen Hawking, superintelligent machines, tacit knowledge, technological singularity, TED Talk, The Coming Technological Singularity, the long tail, the scientific method, The Signal and the Noise by Nate Silver, The Wisdom of Crowds, theory of mind, Turing machine, Turing test, Vernor Vinge, Watson beat the top human players on Jeopardy!, Yochai Benkler

In Part Three, The Future of the Myth, I argue that the myth has very bad consequences if taken seriously, because it subverts science. In particular, it erodes a culture of human intelligence and invention, which is necessary for the very breakthroughs we will need to understand our own future. Data science (the application of AI to “big data”) is at best a prosthetic for human ingenuity, which if used correctly can help us deal with our modern “data deluge.” If used as a replacement for individual intelligence, it tends to chew up investment without delivering results. I explain, in particular, how the myth has negatively affected research in neuroscience, among other recent scientific pursuits.

Programs like DENDRAL, which analyzed the structure of chemicals, and MYCIN, which provided sometimes quite good medical diagnoses, made clear that AI methods were relevant to a variety of problems normally requiring high human intelligence. Machine translation, as we’ve seen, was an initial failure, but yielded to different approaches made possible by the availability of large datasets (a precursor to many Big Data and data science successes in the 2000s). All sorts of natural language processing tasks, like generating parses of natural language sentences, and tagging parts of speech or entities (persons, organizations, places, and the like), were chipped away at by AI systems with increasing power and sophistication.8 Yet Turing’s original goal for AI, passing the Turing test, remained elusive.

Amazon was using big data before it was a buzzword, tracking and cataloging online purchases, which now are used as data to feed machine learning algorithms offering product recommendations, enhanced search, and other customer features. Big data is an inevitable consequence of Moore’s law: as computers become more powerful, statistical techniques like machine learning become better, and new business models emerge, all from data and its analysis. What we now refer to as data science (or, increasingly, AI) is really an old field, given new wings by Moore’s law and massive volumes of data, mostly made available by the growth of the web. Governments and nonprofit organizations quickly joined in, using big data to predict everything from traffic flows to recidivism among parole-eligible prisoners.


pages: 337 words: 103,273

The Great Disruption: Why the Climate Crisis Will Bring on the End of Shopping and the Birth of a New World by Paul Gilding

"World Economic Forum" Davos, airport security, Alan Greenspan, Albert Einstein, biodiversity loss, Bob Geldof, BRICs, carbon credits, carbon footprint, carbon tax, clean tech, clean water, Climategate, commoditize, corporate social responsibility, creative destruction, data science, decarbonisation, energy security, Exxon Valdez, failed state, fear of failure, geopolitical risk, income inequality, Intergovernmental Panel on Climate Change (IPCC), John Elkington, Joseph Schumpeter, market fundamentalism, mass immigration, Medieval Warm Period, Naomi Klein, negative emissions, Nelson Mandela, new economy, nuclear winter, Ocado, ocean acidification, oil shock, peak oil, Ponzi scheme, precautionary principle, purchasing power parity, retail therapy, Ronald Reagan, shareholder value, systems thinking, The Spirit Level, The Wealth of Nations by Adam Smith, union organizing, University of East Anglia, warehouse automation

These gatherings have become key milestones measuring society’s progress on sustainability, or the lack of it, with a recent example being the Climate Conference in Copenhagen. The 1972 Stockholm Conference also established various global and regional scientific monitoring processes that helped provide the data scientists now use to measure the changing state of the global ecosystem. And in case you thought climate change was a recent issue, it was addressed at this meeting nearly forty years ago! The second key event of 1972 was the publication of The Limits to Growth. While commissioned by the Club of Rome, an international group of intellectuals and industrialists, the report was produced by MIT experts who were focused on system dynamics—taking the behavior of systems, rather than environmental issues, as their starting point.


pages: 463 words: 105,197

Radical Markets: Uprooting Capitalism and Democracy for a Just Society by Eric Posner, E. Weyl

3D printing, activist fund / activist shareholder / activist investor, Affordable Care Act / Obamacare, Airbnb, Amazon Mechanical Turk, anti-communist, augmented reality, basic income, Berlin Wall, Bernie Sanders, Big Tech, Branko Milanovic, business process, buy and hold, carbon footprint, Cass Sunstein, Clayton Christensen, cloud computing, collective bargaining, commoditize, congestion pricing, Corn Laws, corporate governance, crowdsourcing, cryptocurrency, data science, deep learning, DeepMind, Donald Trump, Elon Musk, endowment effect, Erik Brynjolfsson, Ethereum, feminist movement, financial deregulation, Francis Fukuyama: the end of history, full employment, gamification, Garrett Hardin, George Akerlof, global macro, global supply chain, guest worker program, hydraulic fracturing, Hyperloop, illegal immigration, immigration reform, income inequality, income per capita, index fund, informal economy, information asymmetry, invisible hand, Jane Jacobs, Jaron Lanier, Jean Tirole, Jeremy Corbyn, Joseph Schumpeter, Kenneth Arrow, labor-force participation, laissez-faire capitalism, Landlord’s Game, liberal capitalism, low skilled workers, Lyft, market bubble, market design, market friction, market fundamentalism, mass immigration, negative equity, Network effects, obamacare, offshore financial centre, open borders, Pareto efficiency, passive investing, patent troll, Paul Samuelson, performance metric, plutocrats, pre–internet, radical decentralization, random walk, randomized controlled trial, Ray Kurzweil, recommendation engine, rent-seeking, Richard Thaler, ride hailing / ride sharing, risk tolerance, road to serfdom, Robert Shiller, Ronald Coase, Rory Sutherland, search costs, Second Machine Age, second-price auction, self-driving car, shareholder value, sharing economy, Silicon Valley, Skype, special economic zone, spectrum auction, speech recognition, statistical model, stem cell, telepresence, Thales and the olive presses, Thales of Miletus, The Death and Life of Great American Cities, The Future of Employment, The Market for Lemons, The Nature of the Firm, The Rise and Fall of American Growth, The Theory of the Leisure Class by Thorstein Veblen, The Wealth of Nations by Adam Smith, Thorstein Veblen, trade route, Tragedy of the Commons, transaction costs, trickle-down economics, Tyler Cowen, Uber and Lyft, uber lyft, universal basic income, urban planning, Vanguard fund, vertical integration, women in the workforce, Zipcar

Respondents in QV surveys also participate more actively, revising their answers to reflect their preferences much more frequently and often providing feedback that taking the QV survey had helped them learn their own preferences more accurately by forcing them to make difficult, even frustrating tradeoffs. To test whether QV manages to solve the problems with Likert, in 2016 Decide’s chief data scientist and now professor of mathematics education David Quarfoot, along with several co-authors, ran a nationally representative survey with thousands of participants that took versions of the same poll using Likert, QV, or both depending on which group they were assigned to.43 Figure 2.4 pictures a representative set of responses, on the question of repealing Obamacare, with the Likert survey on the left (with its signature W-shape) and the results from QV on the right.


pages: 359 words: 96,019

How to Turn Down a Billion Dollars: The Snapchat Story by Billy Gallagher

Airbnb, Albert Einstein, Amazon Web Services, AOL-Time Warner, Apple's 1984 Super Bowl advert, augmented reality, Bernie Sanders, Big Tech, Black Swan, citizen journalism, Clayton Christensen, computer vision, data science, disruptive innovation, Donald Trump, El Camino Real, Elon Musk, fail fast, Fairchild Semiconductor, Frank Gehry, gamification, gentrification, Google Glasses, Hyperloop, information asymmetry, Jeff Bezos, Justin.tv, Kevin Roose, Lean Startup, Long Term Capital Management, Mark Zuckerberg, Menlo Park, minimum viable product, Nelson Mandela, Oculus Rift, paypal mafia, Peter Thiel, power law, QR code, Robinhood: mobile stock trading app, Salesforce, Sand Hill Road, Saturday Night Live, Sheryl Sandberg, side project, Silicon Valley, Silicon Valley startup, skeuomorphism, Snapchat, social graph, SoftBank, sorting algorithm, speech recognition, stealth mode startup, Steve Jobs, TechCrunch disrupt, too big to fail, value engineering, Y Combinator, young professional

Snapchat exec Sriram Krishnan later wrote about this Snapchat core belief after leaving the company, explaining how most companies measure a proxy metric for actual human behavior, since the latter is nearly impossible to measure perfectly. You convert a nebulous human emotion/behavior to a quantifiable metric you can align execution on and stick on a graph and measure teams on. Engineers and data scientists can’t do anything with “this makes people feel warm and fuzzy.” They can do a lot with “this feature improves metric X by 5% week-over-week.” Figuring out the connection between the two is often the art and science of product management. Krishnan explained how these metrics often have unforeseen side effects, as people focus on simply making the metric increase, but not in the way the original system designers intended: For example, in terms of what designers wanted, what they built/measured and what they unintentionally caused: Quality journalism → Measure Clicks → Creation of click-bait content. Marissa Mayer once tested forty-one different shades of blue on Google users to see which one would be most effective.


pages: 332 words: 100,601

Rebooting India: Realizing a Billion Aspirations by Nandan Nilekani

Airbnb, Atul Gawande, autonomous vehicles, barriers to entry, bitcoin, call centre, carbon credits, cashless society, clean water, cloud computing, collaborative consumption, congestion charging, DARPA: Urban Challenge, data science, dematerialisation, demographic dividend, digital rights, driverless car, Edward Snowden, en.wikipedia.org, energy security, fail fast, financial exclusion, gamification, Google Hangouts, illegal immigration, informal economy, information security, Khan Academy, Kickstarter, knowledge economy, land reform, law of one price, M-Pesa, machine readable, Mahatma Gandhi, Marc Andreessen, Mark Zuckerberg, mobile money, Mohammed Bouazizi, more computing power than Apollo, Negawatt, Network effects, new economy, off-the-grid, offshore financial centre, price mechanism, price stability, rent-seeking, RFID, Ronald Coase, school choice, school vouchers, self-driving car, sharing economy, Silicon Valley, single source of truth, Skype, smart grid, smart meter, software is eating the world, source of truth, Steve Jobs, systems thinking, The future is already here, The Nature of the Firm, transaction costs, vertical integration, WikiLeaks, work culture

Along with his colleagues Jeff Bezanson, Alan Edelman and Stefan Karpinski, Viral is a co-inventor of the Julia programming language. Julia is an open-source, high-performance programming language under development since 2009. It is commonly used by scientists in the physical and social sciences, engineers and data scientists for diverse purposes ranging from exploring the secrets of the universe to teasing out new insights from big data. While Julia is itself an open-source project that received contributions from scores of programmers worldwide, it has benefited from government research funding at the Massachusetts Institute of Technology, from US government agencies such as the Defense Advanced Research Projects Agency, the National Science Foundation and the Department of Energy.


Beautiful Visualization by Julie Steele

barriers to entry, correlation does not imply causation, data acquisition, data science, database schema, Drosophila, en.wikipedia.org, epigenetics, global pandemic, Hans Rosling, index card, information retrieval, iterative process, linked data, Mercator projection, meta-analysis, natural language processing, Netflix Prize, no-fly zone, pattern recognition, peer-to-peer, performance metric, power law, QR code, recommendation engine, semantic web, social bookmarking, social distancing, social graph, sorting algorithm, Steve Jobs, the long tail, web application, wikimedia commons, Yochai Benkler

In addition to working at the Times, Nick helped co-found NYCResistor, a hardware hacker space in Brooklyn, New York. He is also an adjunct professor at NYU in the Interactive Telecommunications program. Michael Driscoll fell in love with data visualization over a decade ago as a software engineer for the Human Genome Project. He is the founder and principal data scientist at Dataspora, an analytics consultancy in San Francisco. Jonathan Feinberg is a computer programmer who lives in Medford, Massachusetts, with his wife and two sons. Please write to him at jdf@pobox.com, especially if you know of any Boston-area Pad Thai that can go up against the Thai Café in Greenpoint, Brooklyn.


pages: 372 words: 101,678

Lessons from the Titans: What Companies in the New Economy Can Learn from the Great Industrial Giants to Drive Sustainable Success by Scott Davis, Carter Copeland, Rob Wertheimer

3D printing, activist fund / activist shareholder / activist investor, additive manufacturing, Airbnb, airport security, asset light, barriers to entry, Big Tech, Boeing 747, business cycle, business process, clean water, commoditize, coronavirus, corporate governance, COVID-19, data science, disruptive innovation, Elisha Otis, Elon Musk, factory automation, fail fast, financial engineering, Ford Model T, global pandemic, hydraulic fracturing, Internet of things, iterative process, junk bonds, Kaizen: continuous improvement, Kanban, low cost airline, Marc Andreessen, Mary Meeker, megacity, Michael Milken, Network effects, new economy, Ponzi scheme, profit maximization, random walk, RFID, ride hailing / ride sharing, risk tolerance, Salesforce, shareholder value, Silicon Valley, six sigma, skunkworks, software is eating the world, strikebreaker, tech billionaire, TED Talk, Toyota Production System, Uber for X, value engineering, warehouse automation, WeWork, winner-take-all economy

We’ve heard dozens of companies adopt the lingo of continuous improvement and even some of the techniques. Most fail to make durable progress. Making an actual system work takes years of disciplined implementation. Software as a sector hasn’t come up with differentiated and systematic workflows. For companies like Uber, that is an expensive problem: 3,000 software engineers and data scientists in a company expected to lose $3 billion in operating profits on less than $20 billion in revenue. This failure to build in rigorous continuous improvement is unsurprising and certainly not unique to software or technology. As competition grows, winners will need a bigger moat. When Uber was founded in 2008, there were only a small handful of companies pursuing asset sharing.


pages: 346 words: 97,330

Ghost Work: How to Stop Silicon Valley From Building a New Global Underclass by Mary L. Gray, Siddharth Suri

"World Economic Forum" Davos, Affordable Care Act / Obamacare, AlphaGo, Amazon Mechanical Turk, Apollo 13, augmented reality, autonomous vehicles, barriers to entry, basic income, benefit corporation, Big Tech, big-box store, bitcoin, blue-collar work, business process, business process outsourcing, call centre, Capital in the Twenty-First Century by Thomas Piketty, cloud computing, cognitive load, collaborative consumption, collective bargaining, computer vision, corporate social responsibility, cotton gin, crowdsourcing, data is the new oil, data science, deep learning, DeepMind, deindustrialization, deskilling, digital divide, do well by doing good, do what you love, don't be evil, Donald Trump, Elon Musk, employer provided health coverage, en.wikipedia.org, equal pay for equal work, Erik Brynjolfsson, fake news, financial independence, Frank Levy and Richard Murnane: The New Division of Labor, fulfillment center, future of work, gig economy, glass ceiling, global supply chain, hiring and firing, ImageNet competition, independent contractor, industrial robot, informal economy, information asymmetry, Jeff Bezos, job automation, knowledge economy, low skilled workers, low-wage service sector, machine translation, market friction, Mars Rover, natural language processing, new economy, operational security, passive income, pattern recognition, post-materialism, post-work, power law, race to the bottom, Rana Plaza, recommendation engine, ride hailing / ride sharing, Ronald Coase, scientific management, search costs, Second Machine Age, sentiment analysis, sharing economy, Shoshana Zuboff, side project, Silicon Valley, Silicon Valley startup, Skype, software as a service, speech recognition, spinning jenny, Stephen Hawking, TED Talk, The Future of Employment, The Nature of the Firm, Tragedy of the Commons, transaction costs, two-sided market, union organizing, universal basic income, Vilfredo Pareto, Wayback Machine, women in the workforce, work culture , Works 
Progress Administration, Y Combinator, Yochai Benkler

Right now, most ghost work platforms, particularly for micro-tasks working with AI training data, default to assuming that people are individual agents standing by and ready to jump into a task, with the latest software and a stable internet connection. For example, a manager might give a team of data scientists, hired through an online labor market, access to files and data, effectively bringing them into the enterprise for a short period of time. Then, once the project is completed, their access to the files and data would be revoked, returning them to a position outside of the enterprise. Managing the workflows—worker output and interactions with data—presents new challenges for porous enterprises mixing full-time employees and on-demand workers.


pages: 302 words: 100,493

Working Backwards: Insights, Stories, and Secrets From Inside Amazon by Colin Bryar, Bill Carr

Amazon Web Services, barriers to entry, Big Tech, Black Lives Matter, business logic, business process, cloud computing, coronavirus, COVID-19, data science, delayed gratification, en.wikipedia.org, fulfillment center, iterative process, Jeff Bezos, late fees, loose coupling, microservices, Minecraft, performance metric, search inside the book, shareholder value, Silicon Valley, six sigma, Steve Jobs, subscription business, Toyota Production System, two-pizza team, web application, why are manhole covers round?

“How can we make a 44-inch TV with an HD display that can retail for $1,999 at a 25 percent gross margin?” or “How will we make a Kindle reader that connects to carrier networks to download books without customers having to sign a contract with a carrier?” or “How many new software engineers and data scientists do we need to hire for this new initiative?” In other words, the FAQ section is where the writer shares the details of the plan from a consumer point of view and addresses the various risks and challenges from internal operations, technical, product, marketing, legal, business development, and financial points of view.


pages: 309 words: 96,168

Masters of Scale: Surprising Truths From the World's Most Successful Entrepreneurs by Reid Hoffman, June Cohen, Deron Triff

"Susan Fowler" uber, 23andMe, 3D printing, Airbnb, Anne Wojcicki, Ben Horowitz, bitcoin, Blitzscaling, Broken windows theory, Burning Man, call centre, chief data officer, clean water, collaborative consumption, COVID-19, crowdsourcing, data science, desegregation, do well by doing good, Elon Musk, financial independence, fulfillment center, gender pay gap, global macro, growth hacking, hockey-stick growth, Internet of things, knowledge economy, late fees, Lean Startup, lone genius, Marc Benioff, Mark Zuckerberg, minimum viable product, move fast and break things, Network effects, Paul Graham, Peter Thiel, polynesian navigation, race to the bottom, remote working, RFID, Ronald Reagan, Rubik’s Cube, Ruby on Rails, Salesforce, Sam Altman, Sheryl Sandberg, Silicon Valley, Silicon Valley startup, social distancing, Steve Jobs, Susan Wojcicki, TaskRabbit, TechCrunch disrupt, TED Talk, the long tail, the scientific method, Tim Cook: Apple, Travis Kalanick, two and twenty, work culture , Y Combinator, zero day, Zipcar

* * * — It’s no surprise that a digital platform like Eventbrite relies heavily on data—as well as emotional connection—to inform the fuller picture of their customer. But you might be surprised to learn that Jenn Hyman’s clothing rental biz, Rent the Runway, is also deep into data, and always has been. “Actually, 80 percent of our corporate employees are engineers, data scientists, and product managers,” says Jenn. “We have very few people in merchandising and marketing. The first C-level hire that I made was a chief data officer, and he was in my first ten employees. From the very beginning of the company, we were thinking about data. “We are getting data from our customer over a hundred times a year,” says Jenn.


pages: 329 words: 99,504

Easy Money: Cryptocurrency, Casino Capitalism, and the Golden Age of Fraud by Ben McKenzie, Jacob Silverman

algorithmic trading, asset allocation, bank run, barriers to entry, Ben McKenzie, Bernie Madoff, Big Tech, bitcoin, Bitcoin "FTX", blockchain, capital controls, citizen journalism, cognitive dissonance, collateralized debt obligation, COVID-19, Credit Default Swap, credit default swaps / collateralized debt obligations, cross-border payments, cryptocurrency, data science, distributed ledger, Dogecoin, Donald Trump, effective altruism, Elon Musk, en.wikipedia.org, Ethereum, ethereum blockchain, experimental economics, financial deregulation, financial engineering, financial innovation, Flash crash, Glass-Steagall Act, high net worth, housing crisis, information asymmetry, initial coin offering, Jacob Silverman, Jane Street, low interest rates, Lyft, margin call, meme stock, money market fund, money: store of value / unit of account / medium of exchange, Network effects, offshore financial centre, operational security, payday loans, Peter Thiel, Ponzi scheme, Potemkin village, prediction markets, proprietary trading, pushing on a string, QR code, quantitative easing, race to the bottom, ransomware, regulatory arbitrage, reserve currency, risk tolerance, Robert Shiller, Robinhood: mobile stock trading app, Ross Ulbricht, Sam Bankman-Fried, Satoshi Nakamoto, Saturday Night Live, short selling, short squeeze, Silicon Valley, Skype, smart contracts, Steve Bannon, systems thinking, TikTok, too big to fail, transaction costs, tulip mania, uber lyft, underbanked, vertical integration, zero-sum game

We are committed to being fully licensed and regulated around the world, and we were recently awarded virtual assets service provider licenses in Bahrain and Dubai.” Binance’s best defense may be to claim basic technical incompetence—perhaps network congestion really did lead to malfunctions in the company’s app. What actually happened on May 19 remains a mystery. But people like Carol Alexander and Matt Ranger, a data scientist and former professional poker player, propose that the platform’s problems may go beyond simple technical outages. In blog posts, academic papers, and conversations with journalists, they have argued that Binance has been outplayed in its own casino. According to their analysis, Binance has become the perfect playground for professional trading firms to clean up against unsophisticated retail traders.


pages: 130 words: 43,665

Powerful: Teams, Leaders and the Culture of Freedom and Responsibility by Patty McCord

call centre, data science, future of work, job satisfaction, late fees, Silicon Valley, Skype, subscription business, the scientific method, women in the workforce

People Learn to Welcome Criticism Openly sharing criticism was one of the hardest parts of the Netflix culture for new employees to get used to, but most quickly came to appreciate how valuable the openness was. When I talked about this with one of our great team leaders, Eric Colson, he told me the giving and taking of honest feedback was central to how well his teams worked, and his teams worked beautifully. That’s why Eric rose to the position of VP of data science and engineering in less than three years at the company, having begun as an individual contributor. He’d been managing a small data analytics team at Yahoo! before coming to Netflix, and he recalled that the culture there was to be super supportive of people and not to criticize them. He told me that when he started getting critical feedback from colleagues at Netflix, “It hurt.

There’s a dangerous fallacy that data constitutes the facts you need to know to run your business. Hard data is absolutely vital, of course, but you also need qualitative insight and well-formulated opinions, and you need your team to debate those insights and opinions openly and with gusto. Data Doesn’t Have an Opinion I loved when we hired somebody new in data science, especially in the early days. We all had our own beliefs about customer behavior that they’d bust. In the beginning we opined about how the customer behaved based on ourselves as customers. We would argue back and forth, saying, “That’s not the way they watch; no, no, I don’t watch that way.” Then with the transition to streaming, we started to get actual viewing data.

The best employees are always looking for challenging new opportunities, and though they are usually intensely loyal, many of them will eventually seek those opportunities elsewhere. You can never know when they might decide to make a move, and often there is nothing you’ll be able to do to stop them. Earlier I mentioned Eric Colson. In less than three years, he rose from a data analyst position to the role of VP of data science and engineering, reporting directly to Reed and managing four big and very important teams. He had never expected to be given so much responsibility, certainly not so fast. He told me recently that he was, and still is, hugely grateful for the opportunities offered to him. He also loved the work he was doing at Netflix.


pages: 281 words: 71,242

World Without Mind: The Existential Threat of Big Tech by Franklin Foer

artificial general intelligence, back-to-the-land, Berlin Wall, big data - Walmart - Pop Tarts, Big Tech, big-box store, Buckminster Fuller, citizen journalism, Colonization of Mars, computer age, creative destruction, crowdsourcing, data is the new oil, data science, deep learning, DeepMind, don't be evil, Donald Trump, Double Irish / Dutch Sandwich, Douglas Engelbart, driverless car, Edward Snowden, Electric Kool-Aid Acid Test, Elon Musk, Evgeny Morozov, Fall of the Berlin Wall, Filter Bubble, Geoffrey Hinton, global village, Google Glasses, Haight Ashbury, hive mind, income inequality, intangible asset, Jeff Bezos, job automation, John Markoff, Kevin Kelly, knowledge economy, Law of Accelerating Returns, Marc Andreessen, Mark Zuckerberg, Marshall McLuhan, means of production, move fast and break things, new economy, New Journalism, Norbert Wiener, off-the-grid, offshore financial centre, PageRank, Peace of Westphalia, Peter Thiel, planetary scale, Ray Kurzweil, scientific management, self-driving car, Silicon Valley, Singularitarianism, software is eating the world, Steve Jobs, Steven Levy, Stewart Brand, strong AI, supply-chain management, TED Talk, the medium is the message, the scientific method, The Wealth of Nations by Adam Smith, The Wisdom of Crowds, Thomas L Friedman, Thorstein Veblen, Upton Sinclair, Vernor Vinge, vertical integration, We are as Gods, Whole Earth Catalog, yellow journalism

Or if we want to be melodramatic about it, we could say Facebook is constantly tinkering with how its users view the world—always tinkering with the quality of news and opinion that it allows to break through the din, adjusting the quality of political and cultural discourse in order to hold the attention of users for a few more beats. But how do the engineers know which dial to twist and how hard? There’s a whole discipline, data science, to guide the writing and revision of algorithms. Facebook has a team, poached from academia, to conduct experiments on users. It’s a statistician’s sexiest dream—some of the largest data sets in human history, the ability to run trials on mathematically meaningful cohorts. When Cameron Marlow, the former head of Facebook’s data science team, described the opportunity, he began twitching with ecstatic joy. “For the first time,” Marlow said, “we have a microscope that not only lets us examine social behavior at a very fine level that we’ve never been able to see before but allows us to run experiments that millions of users are exposed to.”

For one group, Facebook excised the positive words from the posts in the News Feed; for another group, it removed the negative words. Each group, it concluded, wrote posts that echoed the mood of the posts it had reworded. This study was roundly condemned as invasive, but it is not so unusual. As one member of Facebook’s data science team confessed: “Anyone on that team could run a test. They’re always trying to alter people’s behavior.” There’s no doubting the emotional and psychological power possessed by Facebook—at least Facebook doesn’t doubt it. It has bragged about how it increased voter turnout (and organ donation) by subtly amping up the social pressures that compel virtuous behavior.

.* The formula supposedly illustrates how a piece of editorial content could go viral—how it could travel through the social networks to quickly reach a massive audience, as rapidly as smallpox ripped its way across North America. Peretti’s formula, in fact, came from epidemiology. The nod to science was intentional. With experimentation and careful reading of data, science could suggest which pieces had the best shot at achieving virality—and if not virality, then at least a robust audience. The emerging science of traffic was really a branch of behavioral psychology—people clicked so quickly, they didn’t always fully understand why they gravitated to one piece over another.


pages: 375 words: 88,306

The Sharing Economy: The End of Employment and the Rise of Crowd-Based Capitalism by Arun Sundararajan

"World Economic Forum" Davos, additive manufacturing, Airbnb, AltaVista, Amazon Mechanical Turk, asset light, autonomous vehicles, barriers to entry, basic income, benefit corporation, bike sharing, bitcoin, blockchain, book value, Burning Man, call centre, Carl Icahn, collaborative consumption, collaborative economy, collective bargaining, commoditize, commons-based peer production, corporate social responsibility, cryptocurrency, data science, David Graeber, distributed ledger, driverless car, Eben Moglen, employer provided health coverage, Erik Brynjolfsson, Ethereum, ethereum blockchain, Frank Levy and Richard Murnane: The New Division of Labor, future of work, general purpose technology, George Akerlof, gig economy, housing crisis, Howard Rheingold, independent contractor, information asymmetry, Internet of things, inventory management, invisible hand, job automation, job-hopping, John Zimmer (Lyft cofounder), Kickstarter, knowledge worker, Kula ring, Lyft, Marc Andreessen, Mary Meeker, megacity, minimum wage unemployment, moral hazard, moral panic, Network effects, new economy, Oculus Rift, off-the-grid, pattern recognition, peer-to-peer, peer-to-peer lending, peer-to-peer model, peer-to-peer rental, profit motive, public intellectual, purchasing power parity, race to the bottom, recommendation engine, regulatory arbitrage, rent control, Richard Florida, ride hailing / ride sharing, Robert Gordon, Ronald Coase, Ross Ulbricht, Second Machine Age, self-driving car, sharing economy, Silicon Valley, smart contracts, Snapchat, social software, supply-chain management, TaskRabbit, TED Talk, the long tail, The Nature of the Firm, total factor productivity, transaction costs, transportation-network company, two-sided market, Uber and Lyft, Uber for X, uber lyft, universal basic income, Vitalik Buterin, WeWork, Yochai Benkler, Zipcar

In a September 2014 panel discussion I participated in at the Techonomy Detroit conference, the moderator, Jennifer Bradley of the Aspen Institute, asked TaskRabbit’s president Stacy Brown-Philpot whether the platform had “flags or protections or things that could alert you to discrimination in the system or bad actors.” “We do. We have a data science team that we run [to] constantly to make sure we’re flagging and alerting human beings to actually go through and look at it,” Brown-Philpot replied, “and we actually track data on what drives somebody to select a tasker, and you can see all their pictures so you know what they look like, and the most important thing is a smile. That’s it.”30 Data science holds tremendous promise as a way to detect systemic forms of discrimination, often difficult to identify on a case-by-case basis during face-to-face interaction, but which may be brought to light and addressed with data analytics.

These include conversations with: Neha Gondal about the sociology of the sharing economy; Ravi Bapna, Verena Butt d’Espous, Juan Cartagena, Chris Dellarocas, Alok Gupta, and Sarah Rice about trust; Paul Daugherty, Peter Evans, Geoffrey Parker, Anand Shah, Marshall Van Alstyne, and Bruce Weinelt about platforms; Brad Burnham, Kanyi Maqubela, Simon Rothman, Craig Shapiro, and Albert Wenger about venture capital; Janelle Orsi, Nathan Schreiber, and Trebor Scholz about cooperatives; Umang Dua, Oisin Hanrahan, Micah Kaufmann, and Juho Makkonen about marketplace models; Gene Homicki about alternative rental models; Primavera De Filipi and Matan Field about the blockchain and decentralized peer-to-peer technologies; Ashwini Chhabra, Molly Cohen, Althea Erickson, David Estrada, Nick Grossman, David Hantman, Alex Howard, Meera Joshi, Veronica Juarez, Chris Lehane, Mike Masserman, Padden Murphy, Joseph Okpaku, Brooks Rainwater, April Rinne, Sofia Ranchordas, Michael Simas, Jessica Singleton, Adam Thierer, and Bradley Tusk about regulation; Elena Grewal, Kevin Novak, and Chris Pouliot about the use of data science in the sharing economy; Nellie Abernathy, Cynthia Estlund, Steve King, Wilma Liebman, Marysol McGee, Brian Miller, Michelle Miller, Caitlin Pearce, Libby Reder, Julie Samuels, Kristin Sharp, Dan Teran, Felicia Wong, and Marco Zappacosta about the future of work. I am also thankful to Congressman Darrell Issa, Congressman Eric Swalwell, and Senator Mark Warner for their leadership and for many conversations about critical sharing economy policy issues.

., 159, 163–164 Blockchain, 18–19, 59–60, 86, 88 blockchain economies, 58–60, 87–95, 100–102 Blockchain: Blueprint for a New Economy (Swan), 93 Blurring of boundaries/lines, 8, 27, 46, 76, 141–142, 148, 171 Botsman, Rachel, 27–30, 35, 70, 81, 82 Bradley, Jennifer, 157 Brand-based trust, 144–146 Brastaviceanu, Tiberius, 199 Bresnahan, Timothy, 75 Brookings Institute, 179, 184 Brown-Philpot, Stacy, 157 Brynjolfsson, Erik, 75, 112, 165–166 Budelis, Katrina, 14 Burnham, Brad, 85–86, 187 Business of Sharing, The (Stephany), 28 Buterin, Vitalik, 95, 101–102 Button, 201 Byers, John W., 121 California Fruit Exchange, 197 California Labor Commissioner, 160 California Public Utilities Commission (CPUC), 153–154 Capalino, James, 136 Capital in the 21st Century (Piketty), 123 Card, David, 166 Car sharing, 1, 3. See also BlaBlaCar; Getaround; Lyft; Turo; Uber data science and, 157 La’Zooz, 94–95 local network effects, 119–120 regulatory challenges, 135 trust and, 98 Cartagena, Juan, 65 Castor, Emily, 9 Center for Global Enterprise, 119 Centers for Disease Control Chain, 86–87, 91 Chandler, Alfred, 4, 69, 71–72 Chase, Robin, 27, 66 Chen, Edward M., 160 Cheng, Denise, 184 Chéron, Guilhem, 16 Chesborough, Henry, 75–76 Chesky, Brian, 1, 7–9, 113, 122, 125, 131 Chhabra, Ashwini, 136 Chhabria, Vince, 160, 177 Choukrun, Marc-David, 16, 17 Clark, Shelby, 190 Clinton, Hillary, 161 Clothing and accessories rentals, 15–16 “Coase’s Penguin, or, Linux and the Nature of the Firm” (Benkler), 210n19 Cohen, Molly, 139, 153 Coleman, James, 60 Collaborative consumption, 28, 82 Collaborative economy, 25–28 Collaborative Economy Honeycomb, 82–84 Collaborative-Peer-Sharing Economy Summit, 105, 114–115 Commercial exchange, history of peer-to-peer, 4–5 Commons-based peer production, 30–32, 210n19 Community accommodation platforms and, 38–43 creation of, 36 funding, 41–43 human connectedness and, 44–46 service platforms and, 43–44 Congressional Sharing Economy Caucus, 137, 182 Consumer Electronic Show, 
100 Consumerization of the digital, 54–55 Contractor dependent contractor, 183–184 versus employee, 159–160, 174–175, 178–182 Cooperatives; platform cooperativism, 19, 106, 178, 196–198 Couchsurfing, 38–40, 45, 121.


pages: 385 words: 111,113

Augmented: Life in the Smart Lane by Brett King

23andMe, 3D printing, additive manufacturing, Affordable Care Act / Obamacare, agricultural Revolution, Airbnb, Albert Einstein, Amazon Web Services, Any sufficiently advanced technology is indistinguishable from magic, Apollo 11, Apollo Guidance Computer, Apple II, artificial general intelligence, asset allocation, augmented reality, autonomous vehicles, barriers to entry, bitcoin, Bletchley Park, blockchain, Boston Dynamics, business intelligence, business process, call centre, chief data officer, Chris Urmson, Clayton Christensen, clean water, Computing Machinery and Intelligence, congestion charging, CRISPR, crowdsourcing, cryptocurrency, data science, deep learning, DeepMind, deskilling, different worldview, disruptive innovation, distributed generation, distributed ledger, double helix, drone strike, electricity market, Elon Musk, Erik Brynjolfsson, Fellow of the Royal Society, fiat currency, financial exclusion, Flash crash, Flynn Effect, Ford Model T, future of work, gamification, Geoffrey Hinton, gig economy, gigafactory, Google Glasses, Google X / Alphabet X, Hans Lippershey, high-speed rail, Hyperloop, income inequality, industrial robot, information asymmetry, Internet of things, invention of movable type, invention of the printing press, invention of the telephone, invention of the wheel, James Dyson, Jeff Bezos, job automation, job-hopping, John Markoff, John von Neumann, Kevin Kelly, Kickstarter, Kim Stanley Robinson, Kiva Systems, Kodak vs Instagram, Leonard Kleinrock, lifelogging, low earth orbit, low skilled workers, Lyft, M-Pesa, Mark Zuckerberg, Marshall McLuhan, megacity, Metcalfe’s law, Minecraft, mobile money, money market fund, more computing power than Apollo, Neal Stephenson, Neil Armstrong, Network effects, new economy, Nick Bostrom, obamacare, Occupy movement, Oculus Rift, off grid, off-the-grid, packet switching, pattern recognition, peer-to-peer, Ray Kurzweil, retail therapy, RFID, ride hailing / ride sharing, Robert Metcalfe, Salesforce, Satoshi Nakamoto, Second Machine Age, selective serotonin reuptake inhibitor (SSRI), self-driving car, sharing economy, Shoshana Zuboff, Silicon Valley, Silicon Valley startup, Skype, smart cities, smart grid, smart transportation, Snapchat, Snow Crash, social graph, software as a service, speech recognition, statistical model, stem cell, Stephen Hawking, Steve Jobs, Steve Wozniak, strong AI, synthetic biology, systems thinking, TaskRabbit, technological singularity, TED Talk, telemarketer, telepresence, telepresence robot, Tesla Model S, The future is already here, The Future of Employment, Tim Cook: Apple, trade route, Travis Kalanick, TSMC, Turing complete, Turing test, Twitter Arab Spring, uber lyft, undersea cable, urban sprawl, V2 rocket, warehouse automation, warehouse robotics, Watson beat the top human players on Jeopardy!, white picket fence, WikiLeaks, yottabyte

In the peer-reviewed paper released by Baylor College of Medicine and IBM, at the conclusion of the study, scientists were able to demonstrate a possible new path for generating scientific questions that may be helpful in the long-term development of new, effective treatments for disease. In a matter of weeks, biologists and data scientists, using Watson technology, accurately identified proteins that modify the p53 protein structure16. The study noted that this feat would have taken researchers years to accomplish without Watson’s cognitive capabilities. Watson analysed 70,000 scientific articles on p53 to predict proteins that turn on or off p53’s activity.


pages: 343 words: 102,846

Trees on Mars: Our Obsession With the Future by Hal Niedzviecki

"World Economic Forum" Davos, Ada Lovelace, agricultural Revolution, Airbnb, Albert Einstein, Alvin Toffler, Amazon Robotics, anti-communist, big data - Walmart - Pop Tarts, big-box store, business intelligence, Charles Babbage, Colonization of Mars, computer age, crowdsourcing, data science, David Brooks, driverless car, Elon Musk, en.wikipedia.org, Erik Brynjolfsson, Evgeny Morozov, Flynn Effect, Ford Model T, Future Shock, Google Glasses, hive mind, Howard Zinn, if you build it, they will come, income inequality, independent contractor, Internet of things, invention of movable type, Jaron Lanier, Jeff Bezos, job automation, John von Neumann, knowledge economy, Kodak vs Instagram, life extension, Lyft, Marc Andreessen, Marc Benioff, Mark Zuckerberg, Marshall McLuhan, Neil Armstrong, One Laptop per Child (OLPC), Peter H. Diamandis: Planetary Resources, Peter Thiel, Pierre-Simon Laplace, Ponzi scheme, precariat, prediction markets, Ralph Nader, randomized controlled trial, Ray Kurzweil, ride hailing / ride sharing, rising living standards, Robert Solow, Ronald Reagan, Salesforce, self-driving car, shareholder value, sharing economy, Silicon Valley, Silicon Valley startup, Skype, Steve Jobs, TaskRabbit, tech worker, technological singularity, technological solutionism, technoutopianism, Ted Kaczynski, TED Talk, Thomas L Friedman, Tyler Cowen, Uber and Lyft, uber lyft, Virgin Galactic, warehouse robotics, working poor

The games include Dungeon Scrawl, in which players have to move through quest-themed mazes and puzzles, and Wasabi Waiter, which, as you might imagine, is a game in which players have to match up sushi to an ever-growing number of customers. But here’s the rub. The games are carefully “designed by a team of neuroscientists, psychologists, and data scientists to suss out human potential.”35 According to Guy Halfteck, Knack’s founder, when you play any of the games, you generate massive amounts of data about how quickly you are able to solve problems and make the right decisions while multitasking and learning on the go. “The end result,” Halfteck says, “is a high-resolution portrait of your psyche and intellect, and an assessment of your potential as a leader or an innovator.”36 Hans Haringa, leader of petroleum giant Royal Dutch Shell’s GameChanger unit, asked about 1,400 people who had contributed ideas and proposals to the division to play Dungeon Scrawl and Wasabi Waiter.


pages: 385 words: 103,561

Pinpoint: How GPS Is Changing Our World by Greg Milner

Apollo 11, Ayatollah Khomeini, Boeing 747, British Empire, creative destruction, data acquisition, data science, Dava Sobel, different worldview, digital map, Easter island, Edmond Halley, Eratosthenes, experimental subject, Eyjafjallajökull, Flash crash, friendly fire, GPS: selective availability, Hedy Lamarr / George Antheil, Ian Bogost, Internet of things, Isaac Newton, John Harrison: Longitude, Kevin Kelly, Kwajalein Atoll, land tenure, lone genius, low earth orbit, Mars Rover, Mercator projection, place-making, polynesian navigation, precision agriculture, race to the bottom, Silicon Valley, Silicon Valley startup, Skinner box, skunkworks, smart grid, systems thinking, the map is not the territory, vertical integration

Whenever the satellite passed through the sky over a station, a sprawling network of ground antennas would measure the signal’s angle. That information would be sent back to Washington, converted into punch cards, and programmed into the computer. Between seven to nine hours after liftoff, the computer would have enough data to compute the satellite’s exact orbit and velocity. Minitrack had another component. Based on the data, scientists at the Smithsonian Astrophysical Observatory in Cambridge, Massachusetts, would calculate where the satellite would be most visible, and when. Scattered around the globe, in Florida, Mexico, Iran, Japan, and eight other locations, observation stations were established, each equipped with a camera that could locate an object up to 500 miles away, linked to a clock accurate to within a millisecond.


pages: 391 words: 105,382

Utopia Is Creepy: And Other Provocations by Nicholas Carr

Abraham Maslow, Air France Flight 447, Airbnb, Airbus A320, AltaVista, Amazon Mechanical Turk, augmented reality, autonomous vehicles, Bernie Sanders, book scanning, Brewster Kahle, Buckminster Fuller, Burning Man, Captain Sullenberger Hudson, centralized clearinghouse, Charles Lindbergh, cloud computing, cognitive bias, collaborative consumption, computer age, corporate governance, CRISPR, crowdsourcing, Danny Hillis, data science, deskilling, digital capitalism, digital map, disruptive innovation, Donald Trump, driverless car, Electric Kool-Aid Acid Test, Elon Musk, Evgeny Morozov, factory automation, failed state, feminist movement, Frederick Winslow Taylor, friendly fire, game design, global village, Google bus, Google Glasses, Google X / Alphabet X, Googley, hive mind, impulse control, indoor plumbing, interchangeable parts, Internet Archive, invention of movable type, invention of the steam engine, invisible hand, Isaac Newton, Jeff Bezos, jimmy wales, Joan Didion, job automation, John Perry Barlow, Kevin Kelly, Larry Ellison, Lewis Mumford, lifelogging, lolcat, low skilled workers, machine readable, Marc Andreessen, Mark Zuckerberg, Marshall McLuhan, Max Levchin, means of production, Menlo Park, mental accounting, natural language processing, Neal Stephenson, Network effects, new economy, Nicholas Carr, Nick Bostrom, Norman Mailer, off grid, oil shale / tar sands, Peter Thiel, plutocrats, profit motive, Ralph Waldo Emerson, Ray Kurzweil, recommendation engine, Republic of Letters, robot derives from the Czech word robota Czech, meaning slave, Ronald Reagan, scientific management, self-driving car, SETI@home, side project, Silicon Valley, Silicon Valley ideology, Singularitarianism, Snapchat, social graph, social web, speech recognition, Startup school, stem cell, Stephen Hawking, Steve Jobs, Steven Levy, technoutopianism, TED Talk, the long tail, the medium is the message, theory of mind, Turing test, Tyler Cowen, Whole Earth Catalog, Y Combinator, Yochai Benkler

Technology would replace ideology. Today’s neobehavioralism has also been inspired by advances in computer technology, particularly the establishment of vast data banks of information on people’s behavior and the development of automated statistical techniques to parse the information. The MIT data scientist Alex Pentland, in his revealingly titled 2014 book Social Physics, offered something of a manifesto for the new behavioralism, using terms that, consciously or not, echoed what was heard in the early sixties: We need to move beyond merely describing social structure to building a causal theory of social structure.


pages: 370 words: 107,983

Rage Inside the Machine: The Prejudice of Algorithms, and How to Stop the Internet Making Bigots of Us All by Robert Elliott Smith

"World Economic Forum" Davos, Ada Lovelace, adjacent possible, affirmative action, AI winter, Alfred Russel Wallace, algorithmic bias, algorithmic management, AlphaGo, Amazon Mechanical Turk, animal electricity, autonomous vehicles, behavioural economics, Black Swan, Brexit referendum, British Empire, Cambridge Analytica, cellular automata, Charles Babbage, citizen journalism, Claude Shannon: information theory, combinatorial explosion, Computing Machinery and Intelligence, corporate personhood, correlation coefficient, crowdsourcing, Daniel Kahneman / Amos Tversky, data science, deep learning, DeepMind, desegregation, discovery of DNA, disinformation, Douglas Hofstadter, Elon Musk, fake news, Fellow of the Royal Society, feminist movement, Filter Bubble, Flash crash, Geoffrey Hinton, Gerolamo Cardano, gig economy, Gödel, Escher, Bach, invention of the wheel, invisible hand, Jacquard loom, Jacques de Vaucanson, John Harrison: Longitude, John von Neumann, Kenneth Arrow, Linda problem, low skilled workers, Mark Zuckerberg, mass immigration, meta-analysis, mutually assured destruction, natural language processing, new economy, Northpointe / Correctional Offender Management Profiling for Alternative Sanctions, On the Economy of Machinery and Manufactures, p-value, pattern recognition, Paul Samuelson, performance metric, Pierre-Simon Laplace, post-truth, precariat, profit maximization, profit motive, Silicon Valley, social intelligence, statistical model, Stephen Hawking, stochastic process, Stuart Kauffman, telemarketer, The Bell Curve by Richard Herrnstein and Charles Murray, The Future of Employment, the scientific method, The Wealth of Nations by Adam Smith, The Wisdom of Crowds, theory of mind, Thomas Bayes, Thomas Malthus, traveling salesman, Turing machine, Turing test, twin studies, Vilfredo Pareto, Von Neumann architecture, warehouse robotics, women in the workforce, Yochai Benkler

In the US, a program called Correctional Offender Management Profiling for Alternative Sanctions (Compas) has been used to inform the decisions of judges when assessing the likelihood of defendants reoffending, by comparing the big data of many past defendants to features of the individual facing time behind bars.2 The Los Angeles Police Department worked with data scientists at UCLA and Santa Clara University to develop PredPol, a predictive policing program that maps out ‘crime hotspots’ where police should concentrate their presence, patrols, intelligence-gathering exercises and other efforts to prevent crime from happening, because according to Modesto Police Chief Galen Carroll, ‘burglars and thieves work in a mathematical way, whether they know it or not’.3 In each case, it’s assumed that the evaluation of data about what some people have done in the past can predict the propensities of what other people will do in the future.


pages: 445 words: 105,255

Radical Abundance: How a Revolution in Nanotechnology Will Change Civilization by K. Eric Drexler

3D printing, additive manufacturing, agricultural Revolution, Bill Joy: nanobots, Brownian motion, carbon footprint, Cass Sunstein, conceptual framework, continuation of politics by other means, crowdsourcing, dark matter, data science, double helix, failed state, Ford Model T, general purpose technology, global supply chain, Higgs boson, industrial robot, iterative process, Large Hadron Collider, Mars Rover, means of production, Menlo Park, mutually assured destruction, Neil Armstrong, New Journalism, Nick Bostrom, performance metric, radical decentralization, reversible computing, Richard Feynman, Silicon Valley, South China Sea, Thomas Malthus, V2 rocket, Vannevar Bush, Vision Fund, zero-sum game

From ancient photons captured by telescopes, astronomers infer the composition and motion of galaxies; by probing materials with electronic instruments, physicists infer the dynamics of systems of coupled electrons; by using their eyes and telemetry signals, ornithologists study how bar-tailed godwits cross from Alaska to New Zealand, spanning the Pacific in a single non-stop flight. Through data, scientists describe what they seek to explain. On the bridge to the top level of this sketch of science, concrete descriptions drive the evolution of theories, first by suggesting ideas about how the world works, and then by enabling tests of those ideas through an intellectual form of natural selection.


pages: 371 words: 108,317

The Inevitable: Understanding the 12 Technological Forces That Will Shape Our Future by Kevin Kelly

A Declaration of the Independence of Cyberspace, Aaron Swartz, AI winter, Airbnb, Albert Einstein, Alvin Toffler, Amazon Web Services, augmented reality, bank run, barriers to entry, Baxter: Rethink Robotics, bitcoin, blockchain, book scanning, Brewster Kahle, Burning Man, cloud computing, commoditize, computer age, Computer Lib, connected car, crowdsourcing, dark matter, data science, deep learning, DeepMind, dematerialisation, Downton Abbey, driverless car, Edward Snowden, Elon Musk, Filter Bubble, Freestyle chess, Gabriella Coleman, game design, Geoffrey Hinton, Google Glasses, hive mind, Howard Rheingold, index card, indoor plumbing, industrial robot, Internet Archive, Internet of things, invention of movable type, invisible hand, Jaron Lanier, Jeff Bezos, job automation, John Markoff, John Perry Barlow, Kevin Kelly, Kickstarter, lifelogging, linked data, Lyft, M-Pesa, machine readable, machine translation, Marc Andreessen, Marshall McLuhan, Mary Meeker, means of production, megacity, Minecraft, Mitch Kapor, multi-sided market, natural language processing, Netflix Prize, Network effects, new economy, Nicholas Carr, off-the-grid, old-boy network, peer-to-peer, peer-to-peer lending, personalized medicine, placebo effect, planetary scale, postindustrial economy, Project Xanadu, recommendation engine, RFID, ride hailing / ride sharing, robo advisor, Rodney Brooks, self-driving car, sharing economy, Silicon Valley, slashdot, Snapchat, social graph, social web, software is eating the world, speech recognition, Stephen Hawking, Steven Levy, Ted Nelson, TED Talk, The future is already here, the long tail, the scientific method, transport as a service, two-sided market, Uber for X, uber lyft, value engineering, Watson beat the top human players on Jeopardy!, WeWork, Whole Earth Review, Yochai Benkler, yottabyte, zero-sum game

Over the next 30 years, the great work will be parsing all the information we track and create—all the information of business, education, entertainment, science, sport, and social relations—into their most primeval elements. The scale of this undertaking requires massive cycles of cognition. Data scientists call this stage “machine readable” information, because it is AIs and not humans who will do this work in the zillions. When you hear a term like “big data,” this is what it is about. Out of this new chemistry of information will arise thousands of new compounds and informational building materials.


pages: 361 words: 107,461

How I Built This: The Unexpected Paths to Success From the World's Most Inspiring Entrepreneurs by Guy Raz

Airbnb, AOL-Time Warner, Apple II, barriers to entry, Bear Stearns, Ben Horowitz, Big Tech, big-box store, Black Monday: stock market crash in 1987, Blitzscaling, business logic, call centre, Clayton Christensen, commoditize, Cornelius Vanderbilt, Credit Default Swap, crowdsourcing, data science, East Village, El Camino Real, Elon Musk, fear of failure, glass ceiling, growth hacking, housing crisis, imposter syndrome, inventory management, It's morning again in America, iterative process, James Dyson, Jeff Bezos, Justin.tv, Kickstarter, low cost airline, Lyft, Marc Andreessen, Mark Zuckerberg, move fast and break things, Nate Silver, Paul Graham, Peter Thiel, pets.com, power law, rolodex, Ronald Reagan, Ruby on Rails, Salesforce, Sam Altman, Sand Hill Road, side hustle, Silicon Valley, software as a service, South of Market, San Francisco, Steve Jobs, Steve Wozniak, subprime mortgage crisis, TED Talk, The Signal and the Noise by Nate Silver, Tony Hsieh, Uber for X, uber lyft, Y Combinator, Zipcar

While it was true that Stitch Fix would be selling clothes, it was the technology that was going to make the business work, and there was only one place in the country in 2011 that had the kind of concentration of technology talent Katrina would need to scale up as quickly as she wanted. That place was Silicon Valley. Almost overnight, the gravity shifted. “The talent is here,” she said. “We have many, many data scientists and many, many engineers, and if we wanted to fulfill the vision of using technology to deliver our service, it would have been very difficult to do it elsewhere.” A couple years later, Dropbox CEO Drew Houston, who like Katrina had come up with his idea and founded his company while in school in Boston and then moved to San Francisco to launch it in earnest, gave the commencement address to the graduating class at his alma mater, MIT.


The Smart Wife: Why Siri, Alexa, and Other Smart Home Devices Need a Feminist Reboot by Yolande Strengers, Jenny Kennedy

active measures, Amazon Robotics, Anthropocene, autonomous vehicles, Big Tech, Boston Dynamics, cloud computing, cognitive load, computer vision, Computing Machinery and Intelligence, crowdsourcing, cyber-physical system, data science, deepfake, Donald Trump, emotional labour, en.wikipedia.org, Evgeny Morozov, fake news, feminist movement, game design, gender pay gap, Grace Hopper, hive mind, Ian Bogost, Intergovernmental Panel on Climate Change (IPCC), Internet of things, Jeff Bezos, John Markoff, Kitchen Debate, knowledge economy, Masayoshi Son, Milgram experiment, Minecraft, natural language processing, Network effects, new economy, pattern recognition, planned obsolescence, precautionary principle, robot derives from the Czech word robota Czech, meaning slave, self-driving car, Shoshana Zuboff, side hustle, side project, Silicon Valley, smart grid, smart meter, social intelligence, SoftBank, Steve Jobs, surveillance capitalism, systems thinking, technological solutionism, technoutopianism, TED Talk, Turing test, Wall-E, Wayback Machine, women in the workforce

Gemma Hartley, Fed Up: Emotional Labor, Women, and the Way Forward (New York: HarperOne, 2018). 72. Schiller and McMahon, “Alexa, Alert Me When the Revolution Comes.” 73. Schiller and McMahon, “Alexa, Alert Me When the Revolution Comes,” 185, citing a 2017 job description for a position as a data scientist in the “Alexa Engine” team. 74. Schiller and McMahon, “Alexa, Alert Me When the Revolution Comes,” 185. 75. Emma, “The Gender Wars of Household Chores: A Feminist Comic,” Guardian, May 26, 2017, https://www.theguardian.com/world/2017/may/26/gender-wars-household-chores-comic. 76. Caroline Criado Perez, Invisible Women: Data Bias in a World Designed for Men (New York: Abrams, 2019). 77.


pages: 387 words: 106,753

Why Startups Fail: A New Roadmap for Entrepreneurial Success by Tom Eisenmann

Airbnb, Atul Gawande, autonomous vehicles, Ben Horowitz, Big Tech, bitcoin, Blitzscaling, blockchain, call centre, carbon footprint, Checklist Manifesto, clean tech, conceptual framework, coronavirus, corporate governance, correlation does not imply causation, COVID-19, crowdsourcing, Daniel Kahneman / Amos Tversky, data science, Dean Kamen, drop ship, Elon Musk, fail fast, fundamental attribution error, gig economy, growth hacking, Hyperloop, income inequality, initial coin offering, inventory management, Iridium satellite, Jeff Bezos, Jeff Hawkins, Larry Ellison, Lean Startup, Lyft, Marc Andreessen, margin call, Mark Zuckerberg, minimum viable product, Network effects, nuclear winter, Oculus Rift, PalmPilot, Paul Graham, performance metric, Peter Pan Syndrome, Peter Thiel, reality distortion field, Richard Thaler, ride hailing / ride sharing, risk/return, Salesforce, Sam Altman, Sand Hill Road, side project, Silicon Valley, Silicon Valley startup, Skype, social graph, software as a service, Solyndra, speech recognition, stealth mode startup, Steve Jobs, TED Talk, two-sided market, Uber and Lyft, Uber for X, uber lyft, vertical integration, We wanted flying cars, instead we got 140 characters, WeWork, Y Combinator, young professional, Zenefits

So, Nagaraj moved to Palo Alto after graduation to pursue his vision alone. He recalled, “I had blind faith, but I had no validation for my idea, no investors, no product, and no team. I look back and wonder how I ever did it.” Upon arrival, Nagaraj recruited two new co-founders: an engineer and a data scientist. The team finished building the first version of Triangulate’s matching engine in October 2009. This version automated the collection of users’ digital information from sites like Facebook, Twitter, and Netflix using browser plug-ins and application programming interfaces (APIs). To avoid all the technical jargon, Nagaraj referred to these plug-ins and APIs as “life-stream connectors.”


pages: 432 words: 106,612

Trillions: How a Band of Wall Street Renegades Invented the Index Fund and Changed Finance Forever by Robin Wigglesworth

Albert Einstein, algorithmic trading, asset allocation, Bear Stearns, behavioural economics, Benoit Mandelbrot, Big Tech, Black Monday: stock market crash in 1987, Blitzscaling, Brownian motion, buy and hold, California gold rush, capital asset pricing model, Carl Icahn, cloud computing, commoditize, coronavirus, corporate governance, corporate raider, COVID-19, data science, diversification, diversified portfolio, Donald Trump, Elon Musk, Eugene Fama: efficient market hypothesis, fear index, financial engineering, fixed income, Glass-Steagall Act, Henri Poincaré, index fund, industrial robot, invention of the wheel, Japanese asset price bubble, Jeff Bezos, Johannes Kepler, John Bogle, John von Neumann, Kenneth Arrow, lockdown, Louis Bachelier, machine readable, money market fund, Myron Scholes, New Journalism, passive investing, Paul Samuelson, Paul Volcker talking about ATMs, Performance of Mutual Funds in the Period, Peter Thiel, pre–internet, RAND corporation, random walk, risk-adjusted returns, road to serfdom, Robert Shiller, rolodex, seminal paper, Sharpe ratio, short selling, Silicon Valley, sovereign wealth fund, subprime mortgage crisis, the scientific method, transaction costs, uptick rule, Upton Sinclair, Vanguard fund

These days, even PhD economists aren’t guaranteed jobs in asset management, unless they have married their degree with a programming language like Python, which would allow them to parse vast digital datasets that are now commonplace, such as credit card data, satellite imagery, and consumer sentiment gleaned from continuously scraping billions of social media posts. Beating the market is not impossible. But the degree of difficulty in doing so consistently is far greater than it was in the past. Even giant, multibillion-dollar hedge funds staffed with an army of data scientists, programmers, rocket scientists, and the best financial minds in the industry can struggle to consistently outperform their benchmarks after fees. To use Mauboussin’s poker metaphor, not only are the remaining players around the table the best ones, but new ones entering the game are even more cunning, calculating, and inscrutable than in the past. The result is that every facet of the money management industry is being altered by the advent of index funds.


pages: 704 words: 182,312

This Is Service Design Doing: Applying Service Design Thinking in the Real World: A Practitioners' Handbook by Marc Stickdorn, Markus Edgar Hormess, Adam Lawrence, Jakob Schneider

"World Economic Forum" Davos, 3D printing, business cycle, business process, call centre, Clayton Christensen, commoditize, corporate governance, corporate social responsibility, crowdsourcing, data science, different worldview, Eyjafjallajökull, fail fast, glass ceiling, Internet of things, iterative process, Kanban, Lean Startup, M-Pesa, minimum viable product, mobile money, off-the-grid, pattern recognition, RFID, scientific management, side project, Silicon Valley, software as a service, stealth mode startup, sustainable-tourism, systems thinking, tacit knowledge, the built environment, the scientific method, urban planning, work culture

Lessons learned: A valuable lesson was how the insights from the data science informed the ethnography (e.g., revealing how mental and physical health are related), and how the ethnography informed the data science (e.g., highlighting the non-health needs of those on health-related benefits). There is huge power in using these two techniques together, with the data science giving the broad, large-scale “what” and the ethnography providing the deep, rich “why.” Key takeaways: (1) Data science can inform ethnographic insights (and vice versa) through correlation of different events. (2) Combine data science to understand the large-scale context with ethnography to determine the deeper meaning or “why” of your research. (3) When conducting research, speak with people of all ages, levels, and perspectives.

The approach: The UK government’s Policy Lab and the joint Work & Health Unit (a joint unit sponsored by the Department of Health and Department for Work and Pensions) created a multidisciplinary team with the service design agency Uscreates, ethnography agency Keep Your Shoes Dirty, and data science organization Mastodon C, and involved around 70 service providers, users, and stakeholders to solve the problem. After a three-day sprint to properly diagnose the problem, we embarked on a discovery phase of ethnography and data science, and a develop phase where we co-designed and prototyped ideas which we are now taking to scale. We conducted ethnography with 30 users and people that supported them: doctors, employers, Jobcentre staff, and community groups. The insights: We used data science techniques (Sankey analysis and k-means clustering) to look at patterns of people surveyed through the Understanding Society survey.
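
For readers unfamiliar with the k-means clustering named in this excerpt, the following is a minimal sketch of the general technique (Lloyd's algorithm). It is not the project team's code. The function names, parameters, and the six data points (two made-up health scores per hypothetical respondent) are assumptions invented for illustration and are not drawn from the Understanding Society survey.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, max_iter=100, seed=0):
    """Lloyd's algorithm: returns (centroids, labels) for a list of tuples."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k randomly chosen data points
    labels = None
    for _ in range(max_iter):
        # Assignment step: label each point with its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:  # no point changed cluster: converged
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return centroids, labels


# Invented example data: (mental health score, physical health score).
respondents = [
    (1.0, 1.2), (0.8, 1.0), (1.2, 0.9),   # hypothetical low-scoring group
    (8.0, 8.5), (8.3, 7.9), (7.8, 8.1),   # hypothetical high-scoring group
]
centroids, labels = kmeans(respondents, k=2)
print(labels)
```

On data this cleanly separated, the two groups end up with different labels whatever the starting centroids. Real survey data would normally need feature scaling and some way of choosing `k`, neither of which this sketch attempts.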

CAT DREW — SENIOR POLICY DESIGNER, UK POLICY LAB Cat is a hybrid policy maker and designer with more than 10 years of experience working in government, including at the Cabinet Office and No. 10. She also holds a postgraduate degree in Design. This allows her to seek out innovative new practices (such as speculative design, data visualization, and combining rich user insight and big data science) and experiment with how they could work in government. CHRIS FERGUSON — CEO, BRIDGEABLE Chris is a service design leader and CX strategist who works with complex organizations such as Roche, TELUS, Genentech, RBC, and Mount Sinai Hospital to increase the impact of their services. He is the Founder and CEO of Bridgeable, a lecturer at the University of Toronto’s Rotman School of Management, and the Co-Founder of the Canadian chapter of the Service Design Network.


pages: 373 words: 112,822

The Upstarts: How Uber, Airbnb, and the Killer Companies of the New Silicon Valley Are Changing the World by Brad Stone

Affordable Care Act / Obamacare, Airbnb, Amazon Web Services, Andy Kessler, autonomous vehicles, Ben Horowitz, Benchmark Capital, Boris Johnson, Burning Man, call centre, Chuck Templeton: OpenTable:, collaborative consumption, data science, Didi Chuxing, Dr. Strangelove, driverless car, East Village, fake it until you make it, fixed income, gentrification, Google X / Alphabet X, growth hacking, Hacker News, hockey-stick growth, housing crisis, inflight wifi, Jeff Bezos, John Zimmer (Lyft cofounder), Justin.tv, Kickstarter, Lyft, Marc Andreessen, Marc Benioff, Mark Zuckerberg, Menlo Park, Mitch Kapor, Necker cube, obamacare, PalmPilot, Paul Graham, peer-to-peer, Peter Thiel, power law, race to the bottom, rent control, ride hailing / ride sharing, Ruby on Rails, San Francisco homelessness, Sand Hill Road, self-driving car, semantic web, sharing economy, side project, Silicon Valley, Silicon Valley startup, Skype, SoftBank, South of Market, San Francisco, Startup school, Steve Jobs, TaskRabbit, tech bro, TechCrunch disrupt, Tony Hsieh, transportation-network company, Travis Kalanick, Uber and Lyft, Uber for X, uber lyft, ubercab, Y Combinator, Y2K, Zipcar

Once again seeking to take advantage of the attention that comes with an onstage appearance at an industry conference, Kalanick wanted to launch the company’s first international city during LeWeb, the European technology confab where, three years before, he and Camp had hashed over plans for a hypothetical on-demand car service. By then the startup had finally moved to its own office, on the seventh floor of 800 Market Street. It had a round conference room with broad windows that opened up onto Market, the city’s main commercial artery. There were twenty employees in the new office, mostly engineers and data scientists, and another dozen in the field. The engineers rebelled against the idea of opening overseas so soon. Launching in Paris required accepting foreign credit cards, converting euros to dollars, and translating the app into French, among other tasks. Kalanick simply directed his team to work harder.


pages: 298 words: 43,745

Understanding Sponsored Search: Core Elements of Keyword Advertising by Jim Jansen

AltaVista, AOL-Time Warner, barriers to entry, behavioural economics, Black Swan, bounce rate, business intelligence, butterfly effect, call centre, Claude Shannon: information theory, complexity theory, content marketing, correlation does not imply causation, data science, en.wikipedia.org, first-price auction, folksonomy, Future Shock, information asymmetry, information retrieval, intangible asset, inventory management, life extension, linear programming, longitudinal study, machine translation, megacity, Nash equilibrium, Network effects, PageRank, place-making, power law, price mechanism, psychological pricing, random walk, Schrödinger's Cat, sealed-bid auction, search costs, search engine result page, second-price auction, second-price sealed-bid, sentiment analysis, social bookmarking, social web, software as a service, stochastic process, tacit knowledge, telemarketer, the market place, The Present Situation in Quantum Mechanics, the scientific method, The Wisdom of Crowds, Vickrey auction, Vilfredo Pareto, yield management

Therefore, the prediction was that black swans cannot exist. However, black swans do exist, being native to Australia. Basically, in the end, we cannot prove that something will or will not occur just because it occurred or did not occur in the past. However, this does not mean that we cannot do anything with data. Scientists have gotten around this touchy point by using data only to disprove something. That is, empirically, we can show that there is evidence to disprove a hypothesis, but we cannot prove a hypothesis is true. The best we can say is that a hypothesis is supported based on the data. We see this in the warnings in the marketing literature of financial investments – “Past performance is not a guarantee of future success.”


pages: 354 words: 118,970

Transaction Man: The Rise of the Deal and the Decline of the American Dream by Nicholas Lemann

"Friedman doctrine" OR "shareholder theory", "World Economic Forum" Davos, Abraham Maslow, Affordable Care Act / Obamacare, Airbnb, airline deregulation, Alan Greenspan, Albert Einstein, augmented reality, basic income, Bear Stearns, behavioural economics, Bernie Sanders, Black-Scholes formula, Blitzscaling, buy and hold, capital controls, Carl Icahn, computerized trading, Cornelius Vanderbilt, corporate governance, cryptocurrency, Daniel Kahneman / Amos Tversky, data science, deal flow, dematerialisation, diversified portfolio, Donald Trump, Elon Musk, Eugene Fama: efficient market hypothesis, Fairchild Semiconductor, financial deregulation, financial innovation, fixed income, future of work, George Akerlof, gig economy, Glass-Steagall Act, Henry Ford's grandson gave labor union leader Walter Reuther a tour of the company’s new, automated factory…, Ida Tarbell, index fund, information asymmetry, invisible hand, Irwin Jacobs, Joi Ito, Joseph Schumpeter, junk bonds, Kenneth Arrow, Kickstarter, life extension, Long Term Capital Management, Mark Zuckerberg, Mary Meeker, mass immigration, means of production, Metcalfe’s law, Michael Milken, money market fund, Mont Pelerin Society, moral hazard, Myron Scholes, Neal Stephenson, new economy, Norman Mailer, obamacare, PalmPilot, Paul Samuelson, Performance of Mutual Funds in the Period, Peter Thiel, price mechanism, principal–agent problem, profit maximization, proprietary trading, prudent man rule, public intellectual, quantitative trading / quantitative finance, Ralph Nader, Richard Thaler, road to serfdom, Robert Bork, Robert Metcalfe, rolodex, Ronald Coase, Ronald Reagan, Sand Hill Road, Savings and loan crisis, shareholder value, short selling, Silicon Valley, Silicon Valley ideology, Silicon Valley startup, Snow Crash, Social Responsibility of Business Is to Increase Its Profits, Steve Jobs, TaskRabbit, TED Talk, The Nature of the Firm, the payments system, the strength of weak ties, Thomas Kuhn: the structure of 
scientific revolutions, Thorstein Veblen, too big to fail, transaction costs, universal basic income, War on Poverty, white flight, working poor

What radio had been for Roosevelt, a new mass medium that offered unprecedented possibilities for a politician who wanted to connect with the public, the new online networks—which by now had far bigger audiences than newspapers, radio, or television—were for Obama. The White House hired a former LinkedIn executive, DJ Patil, as its first chief data scientist. LinkedIn provided proprietary data about the employment market to the White House, to be used in the annual “Economic Report of the President.” When the website associated with Obama’s health-care reform legislation had an unsuccessful debut, Hoffman was part of a group of Silicon Valley executives that organized a rescue operation.


pages: 501 words: 114,888

The Future Is Faster Than You Think: How Converging Technologies Are Transforming Business, Industries, and Our Lives by Peter H. Diamandis, Steven Kotler

Ada Lovelace, additive manufacturing, Airbnb, Albert Einstein, AlphaGo, Amazon Mechanical Turk, Amazon Robotics, augmented reality, autonomous vehicles, barriers to entry, Big Tech, biodiversity loss, bitcoin, blockchain, blood diamond, Boston Dynamics, Burning Man, call centre, cashless society, Charles Babbage, Charles Lindbergh, Clayton Christensen, clean water, cloud computing, Colonization of Mars, computer vision, creative destruction, CRISPR, crowdsourcing, cryptocurrency, data science, Dean Kamen, deep learning, deepfake, DeepMind, delayed gratification, dematerialisation, digital twin, disruptive innovation, Donald Shoup, driverless car, Easter island, Edward Glaeser, Edward Lloyd's coffeehouse, Elon Musk, en.wikipedia.org, epigenetics, Erik Brynjolfsson, Ethereum, ethereum blockchain, experimental economics, fake news, food miles, Ford Model T, fulfillment center, game design, Geoffrey West, Santa Fe Institute, gig economy, gigafactory, Google X / Alphabet X, gravity well, hive mind, housing crisis, Hyperloop, impact investing, indoor plumbing, industrial robot, informal economy, initial coin offering, intentional community, Intergovernmental Panel on Climate Change (IPCC), Internet of things, invention of the telegraph, Isaac Newton, Jaron Lanier, Jeff Bezos, job automation, Joseph Schumpeter, Kevin Kelly, Kickstarter, Kiva Systems, late fees, Law of Accelerating Returns, life extension, lifelogging, loss aversion, Lyft, M-Pesa, Mary Lou Jepsen, Masayoshi Son, mass immigration, megacity, meta-analysis, microbiome, microdosing, mobile money, multiplanetary species, Narrative Science, natural language processing, Neal Stephenson, Neil Armstrong, Network effects, new economy, New Urbanism, Nick Bostrom, Oculus Rift, One Laptop per Child (OLPC), out of africa, packet switching, peer-to-peer lending, Peter H. Diamandis: Planetary Resources, Peter Thiel, planned obsolescence, QR code, RAND corporation, Ray Kurzweil, RFID, Richard Feynman, Richard Florida, ride hailing / ride sharing, risk tolerance, robo advisor, Satoshi Nakamoto, Second Machine Age, self-driving car, Sidewalk Labs, Silicon Valley, Skype, smart cities, smart contracts, smart grid, Snapchat, SoftBank, sovereign wealth fund, special economic zone, stealth mode startup, stem cell, Stephen Hawking, Steve Jobs, Steve Jurvetson, Steven Pinker, Stewart Brand, supercomputer in your pocket, supply-chain management, tech billionaire, technoutopianism, TED Talk, Tesla Model S, Tim Cook: Apple, transaction costs, Uber and Lyft, uber lyft, unbanked and underbanked, underbanked, urban planning, Vision Fund, VTOL, warehouse robotics, Watson beat the top human players on Jeopardy!, We wanted flying cars, instead we got 140 characters, X Prize

They leverage (that is, rent out) the assets (spare bedrooms) of the crowd. These models also lean on staff-on-demand, which provides a company with the agility needed to adapt to a rapidly changing environment. Sure, this once meant call centers in India, but today it’s everything from micro-task laborers behind Amazon’s Mechanical Turk on the low end to Kaggle’s data scientist-on-demand service on the high end. The Free/Data Economy: This is the platform version of the “bait and hook” model, essentially baiting the customer with free access to a cool service (like Facebook) and then making money off the data gathered about that customer (also like Facebook). It also includes all the developments spurred by the big data revolution, which is allowing us to exploit micro-demographics like never before.


pages: 521 words: 110,286

Them and Us: How Immigrants and Locals Can Thrive Together by Philippe Legrain

affirmative action, Albert Einstein, AlphaGo, autonomous vehicles, Berlin Wall, Black Lives Matter, Boris Johnson, Brexit referendum, British Empire, call centre, centre right, Chelsea Manning, clean tech, coronavirus, corporate social responsibility, COVID-19, creative destruction, crowdsourcing, data science, David Attenborough, DeepMind, Demis Hassabis, demographic dividend, digital divide, discovery of DNA, Donald Trump, double helix, Edward Glaeser, en.wikipedia.org, eurozone crisis, failed state, Fall of the Berlin Wall, future of work, illegal immigration, immigration reform, informal economy, Jane Jacobs, job automation, Jony Ive, labour market flexibility, lockdown, low cost airline, low interest rates, low skilled workers, lump of labour, Mahatma Gandhi, Mark Zuckerberg, Martin Wolf, Mary Meeker, mass immigration, moral hazard, Mustafa Suleyman, Network effects, new economy, offshore financial centre, open borders, open immigration, postnationalism / post nation state, purchasing power parity, remote working, Richard Florida, ride hailing / ride sharing, Rishi Sunak, Ronald Reagan, Silicon Valley, Skype, SoftBank, Steve Jobs, tech worker, The Death and Life of Great American Cities, The future is already here, The Future of Employment, Tim Cook: Apple, Tyler Cowen, urban sprawl, WeWork, Winter of Discontent, women in the workforce, working-age population

‘If you want to attract the best talent, you need to be reflective of the talent in that market,’ says Eileen Taylor, Deutsche Bank’s global head of diversity.36 Diverse perspectives can also help generate better solutions to problems. In Chapter 9 on the deftness dividend, we met Rúben, the Portuguese head of design at Century Tech, a London-based edtech company. He manages a team made up of a Lithuanian designer, a Dutch one and three Britons. Century also employs a Chinese-born data scientist who studied in Toronto, San Diego and London, both a Brazilian engineer and an Argentinian one who previously worked in Germany, a developer who is an Eritrean-born Swede, an account manager from Lebanon and a project manager from Ukraine, among others. ‘It’s great to work with a diverse team with people from different backgrounds,’ Rúben remarks.


pages: 463 words: 115,103

Head, Hand, Heart: Why Intelligence Is Over-Rewarded, Manual Workers Matter, and Caregivers Deserve More Respect by David Goodhart

active measures, Airbnb, Albert Einstein, assortative mating, basic income, Berlin Wall, Bernie Sanders, Big Tech, big-box store, Black Lives Matter, Boris Johnson, Branko Milanovic, Brexit referendum, British Empire, call centre, Cass Sunstein, central bank independence, centre right, computer age, corporate social responsibility, COVID-19, data science, David Attenborough, David Brooks, deglobalization, deindustrialization, delayed gratification, desegregation, deskilling, different worldview, Donald Trump, Elon Musk, emotional labour, Etonian, fail fast, Fall of the Berlin Wall, Flynn Effect, Frederick Winslow Taylor, future of work, gender pay gap, George Floyd, gig economy, glass ceiling, Glass-Steagall Act, Great Leap Forward, illegal immigration, income inequality, James Hargreaves, James Watt: steam engine, Jeff Bezos, job automation, job satisfaction, John Maynard Keynes: Economic Possibilities for our Grandchildren, knowledge economy, knowledge worker, labour market flexibility, lockdown, longitudinal study, low skilled workers, Mark Zuckerberg, mass immigration, meritocracy, new economy, Nicholas Carr, oil shock, pattern recognition, Peter Thiel, pink-collar, post-industrial society, post-materialism, postindustrial economy, precariat, reshoring, Richard Florida, robotic process automation, scientific management, Scientific racism, Skype, social distancing, social intelligence, spinning jenny, Steven Pinker, superintelligent machines, TED Talk, The Bell Curve by Richard Herrnstein and Charles Murray, The Rise and Fall of American Growth, Thorstein Veblen, twin studies, Tyler Cowen, Tyler Cowen: Great Stagnation, universal basic income, upwardly mobile, wages for housework, winner-take-all economy, women in the workforce, young professional

However, our experience suggests that many of the recipients of professional advice are in fact seeking a reliable solution or outcome rather than a trusted adviser per se.”9 Like Haldane and Baldwin, the Susskinds advise young people either to look for jobs that favor human capabilities over artificial intelligence—above all, creativity and empathy—or to become directly involved in the design and delivery of these increasingly capable systems as a data scientist or knowledge engineer. Less Room at the Top There is plenty of plausible skepticism about how swiftly AI is going to advance. Yet even if some of the predictions turn out to be overenthusiastic, there is other evidence for the decline and fall of the knowledge worker all around us: the decline in the graduate pay premium, especially in the United Kingdom; the increasing number of graduates in nongraduate jobs; and even the shrinkage, or at least slower growth, of the top managerial and professional social class.


pages: 385 words: 112,842

Arriving Today: From Factory to Front Door -- Why Everything Has Changed About How and What We Buy by Christopher Mims

air freight, Airbnb, Amazon Robotics, Amazon Web Services, Apollo 11, augmented reality, autonomous vehicles, big-box store, blue-collar work, Boeing 747, book scanning, business logic, business process, call centre, cloud computing, company town, coronavirus, cotton gin, COVID-19, creative destruction, data science, Dava Sobel, deep learning, dematerialisation, deskilling, digital twin, Donald Trump, easy for humans, difficult for computers, electronic logging device, Elon Musk, Frederick Winslow Taylor, fulfillment center, gentrification, gig economy, global pandemic, global supply chain, guest worker program, Hans Moravec, heat death of the universe, hive mind, Hyperloop, immigration reform, income inequality, independent contractor, industrial robot, interchangeable parts, intermodal, inventory management, Jacquard loom, Jeff Bezos, Jessica Bruder, job automation, John Maynard Keynes: Economic Possibilities for our Grandchildren, Joseph Schumpeter, Kaizen: continuous improvement, Kanban, Kiva Systems, level 1 cache, Lewis Mumford, lockdown, lone genius, Lyft, machine readable, Malacca Straits, Mark Zuckerberg, market bubble, minimum wage unemployment, Nomadland, Ocado, operation paperclip, Panamax, Pearl River Delta, planetary scale, pneumatic tube, polynesian navigation, post-Panamax, random stow, ride hailing / ride sharing, robot derives from the Czech word robota Czech, meaning slave, Rodney Brooks, rubber-tired gantry crane, scientific management, self-driving car, sensor fusion, Shenzhen special economic zone, Shoshana Zuboff, Silicon Valley, six sigma, skunkworks, social distancing, South China Sea, special economic zone, spinning jenny, standardized shipping container, Steve Jobs, supply-chain management, surveillance capitalism, TED Talk, the scientific method, Tim Cook: Apple, Toyota Production System, traveling salesman, Turing test, two-sided market, Uber and Lyft, Uber for X, uber lyft, Upton Sinclair, vertical integration, warehouse automation, warehouse robotics, workplace surveillance

Upward mobility may be at historic lows in the United States, but education and hard work still propel some out of these jobs. Amazon itself announced in July 2019 that it would spend more than $700 million to “upskill” 100,000 of its employees, propelling them into roles including “data mapping specialist, data scientist, solutions architect and business analyst, as well as logistics coordinator, process improvement manager and transportation specialist.” I asked associates about this program and the opportunities for training at Amazon in general. Associates in the Baltimore fulfillment center said that the most popular course of study for associates there was a commercial driver’s license.


pages: 229 words: 75,606

Two and Twenty: How the Masters of Private Equity Always Win by Sachin Khajuria

"World Economic Forum" Davos, affirmative action, bank run, barriers to entry, Big Tech, blockchain, business cycle, buy and hold, carried interest, COVID-19, credit crunch, data science, decarbonisation, disintermediation, diversification, East Village, financial engineering, gig economy, glass ceiling, high net worth, hiring and firing, impact investing, index fund, junk bonds, Kickstarter, low interest rates, mass affluent, moral hazard, passive investing, race to the bottom, random walk, risk/return, rolodex, Rubik’s Cube, Silicon Valley, sovereign wealth fund, two and twenty, Vanguard fund, zero-sum game

Just by examining deals, as well as doing them, you gain valuable investing experience—including sector expertise and contacts with management teams. And your picture about the macroeconomy sharpens markedly, too. You develop deep knowledge about the state of the economy by analyzing its component parts. You have an edge. Once you have that edge, you do not stop. You do not stand still; you invest in data science and machine learning to harness more of the power of the information you gather. And when you combine that power with your track record for investing and the enormous reserves of cash you have on tap to do deals, you start to outpace your rivals more and more. It is a virtuous circle, with the odd mishap along the way, a deal here or there that fails and can’t be salvaged.

It is clear to the special committee that although the two firms were founded in the same year, the Firm is light-years ahead. It has grown to be more than three times larger than Madison Stone in assets under management across its investment strategies for private capital. The Firm has the best tools and people in data science, information technology, financial reporting, risk management, and environmental impact—part of the core infrastructure of a modern private equity firm. The Firm already has a preliminary view on each asset in Madison Stone’s portfolio, using its library—of the macro picture and the micro trends of the customers, suppliers, and supply and demand.

They have achieved this evolution, remarkably, without diluting the cultures and working practices that are specific to their firms and the traits of success we have been discussing. Lateral hires have been integrated into partnerships and senior management layers, and first-class talent has been brought in to run key parts of a firm’s infrastructure from the CFO’s office to human capital to data science. Where it has made sense to seed or buy a stake in another firm rather than start up a new industry vertical, they have taken the opportunity to do so. Double-digit growth, delivered each year. Whisper it softly…but the truth is that comparing what private equity firms used to be—and where the perception of private equity still sits in many quarters—to what they are now is like comparing a Motorola cellphone from the 1990s to the latest iPhone.


pages: 481 words: 125,946

What to Think About Machines That Think: Today's Leading Thinkers on the Age of Machine Intelligence by John Brockman

Adam Curtis, agricultural Revolution, AI winter, Alan Turing: On Computable Numbers, with an Application to the Entscheidungsproblem, algorithmic trading, Anthropocene, artificial general intelligence, augmented reality, autism spectrum disorder, autonomous vehicles, backpropagation, basic income, behavioural economics, bitcoin, blockchain, bread and circuses, Charles Babbage, clean water, cognitive dissonance, Colonization of Mars, complexity theory, computer age, computer vision, constrained optimization, corporate personhood, cosmological principle, cryptocurrency, cuban missile crisis, Danny Hillis, dark matter, data science, deep learning, DeepMind, Demis Hassabis, digital capitalism, digital divide, digital rights, discrete time, Douglas Engelbart, driverless car, Elon Musk, Emanuel Derman, endowment effect, epigenetics, Ernest Rutherford, experimental economics, financial engineering, Flash crash, friendly AI, functional fixedness, global pandemic, Google Glasses, Great Leap Forward, Hans Moravec, hive mind, Ian Bogost, income inequality, information trail, Internet of things, invention of writing, iterative process, James Webb Space Telescope, Jaron Lanier, job automation, Johannes Kepler, John Markoff, John von Neumann, Kevin Kelly, knowledge worker, Large Hadron Collider, lolcat, loose coupling, machine translation, microbiome, mirror neurons, Moneyball by Michael Lewis explains big data, Mustafa Suleyman, natural language processing, Network effects, Nick Bostrom, Norbert Wiener, paperclip maximiser, pattern recognition, Peter Singer: altruism, phenotype, planetary scale, Ray Kurzweil, Recombinant DNA, recommendation engine, Republic of Letters, RFID, Richard Thaler, Rory Sutherland, Satyajit Das, Search for Extraterrestrial Intelligence, self-driving car, sharing economy, Silicon Valley, Skype, smart contracts, social intelligence, speech recognition, statistical model, stem cell, Stephen Hawking, Steve Jobs, Steven Pinker, Stewart Brand, strong AI, Stuxnet, superintelligent machines, supervolcano, synthetic biology, systems thinking, tacit knowledge, TED Talk, the scientific method, The Wisdom of Crowds, theory of mind, Thorstein Veblen, too big to fail, Turing machine, Turing test, Von Neumann architecture, Watson beat the top human players on Jeopardy!, We are as Gods, Y2K

Next question. An executive might ask, “The algorithm is doing well on loan applications in the United Kingdom. Will it also do well if we deploy it in Brazil?” There’s no satisfying answer here, either; we’re not good at assessing how well a highly optimized rule will transfer to a new domain. A data scientist might say, “We know how well the algorithm does with the data it has. But surely more information about the consumers would help it. What new data should we collect?” Our human domain knowledge suggests lots of possibilities, but with an incomprehensible algorithm we don’t know which of these possibilities will help it.


pages: 593 words: 118,995

Relevant Search: With Examples Using Elasticsearch and Solr by Doug Turnbull, John Berryman

business logic, cognitive load, commoditize, crowdsourcing, data science, domain-specific language, Dr. Strangelove, fail fast, finite state, fudge factor, full text search, heat death of the universe, information retrieval, machine readable, natural language processing, premature optimization, recommendation engine, sentiment analysis, the long tail

Often simpler relevance gains can be gathered with the straightforward techniques discussed earlier in this book. In our consulting work, we’re often hired to implement an advanced solution when a far simpler adjustment can provide more immediate and less risky gains for an organization. You don’t need data scientists to provide a simple tweak to an analyzer or query strategy that gains you a significant—and with test-driven relevancy—measurable improvement to search’s bottom line. Nevertheless, with the right expertise and data in place, learning to rank can be extremely powerful in helping push beyond the “diminishing returns” of relevance tuning.


pages: 421 words: 120,332

The World in 2050: Four Forces Shaping Civilization's Northern Future by Laurence C. Smith

Boeing 747, Bretton Woods, BRICs, business cycle, clean water, climate change refugee, Climategate, colonial rule, data science, deglobalization, demographic transition, Deng Xiaoping, Easter island, electricity market, energy security, flex fuel, G4S, global supply chain, Google Earth, Great Leap Forward, guest worker program, Hans Island, hydrogen economy, ice-free Arctic, informal economy, Intergovernmental Panel on Climate Change (IPCC), invention of agriculture, invisible hand, land tenure, Martin Wolf, Medieval Warm Period, megacity, megaproject, Mikhail Gorbachev, New Urbanism, oil shale / tar sands, oil shock, peak oil, Pearl River Delta, purchasing power parity, Ronald Reagan, Ronald Reagan: Tear down this wall, side project, Silicon Valley, smart grid, sovereign wealth fund, special economic zone, standardized shipping container, The Wealth of Nations by Adam Smith, Thomas Malthus, trade liberalization, trade route, Tragedy of the Commons, UNCLOS, urban planning, Washington Consensus, Y2K

From this model and others, we see that by midcentury the Mediterranean, southwestern North America, north and south Africa, the Middle East, central Asia and India, northern China, Australia, Chile, and eastern Brazil will be facing even tougher water-supply challenges than they do today. One model even projects the eventual disappearance of the Jordan River and the Fertile Crescent267—the slow, convulsing death of agriculture in the very cradle of its birth. Computer models like these aren’t built and run in a vacuum. They are built and tuned using whatever real-world data scientists can get their hands on. Take, for example, the western United States. In Kansas, falling water tables from groundwater mining is already drying up the streams that refill four federal reservoirs; another in Oklahoma is now bone-dry. These past observed trends, together with reasonable expectations of climate change, suggest that over half of the region’s surface water supply will be gone by 2050.268 Kevin Mulligan’s projection of the remaining life of the southern Ogallala Aquifer requires no climate models at all—it simply subtracts how much water we are currently pumping from what’s left in the ground, then counts down the remaining years until the water is gone.


pages: 402 words: 126,835

The Job: The Future of Work in the Modern Era by Ellen Ruppel Shell

"Friedman doctrine" OR "shareholder theory", 3D printing, Abraham Maslow, affirmative action, Affordable Care Act / Obamacare, Airbnb, airport security, Albert Einstein, AlphaGo, Amazon Mechanical Turk, basic income, Baxter: Rethink Robotics, big-box store, blue-collar work, Buckminster Fuller, call centre, Capital in the Twenty-First Century by Thomas Piketty, Clayton Christensen, cloud computing, collective bargaining, company town, computer vision, corporate governance, corporate social responsibility, creative destruction, crowdsourcing, data science, deskilling, digital divide, disruptive innovation, do what you love, Donald Trump, Downton Abbey, Elon Musk, emotional labour, Erik Brynjolfsson, factory automation, follow your passion, Frederick Winslow Taylor, future of work, game design, gamification, gentrification, glass ceiling, Glass-Steagall Act, hiring and firing, human-factors engineering, immigration reform, income inequality, independent contractor, industrial research laboratory, industrial robot, invisible hand, It's morning again in America, Jeff Bezos, Jessica Bruder, job automation, job satisfaction, John Elkington, John Markoff, John Maynard Keynes: Economic Possibilities for our Grandchildren, Joseph Schumpeter, Kickstarter, knowledge economy, knowledge worker, Kodak vs Instagram, labor-force participation, low skilled workers, Lyft, manufacturing employment, Marc Andreessen, Mark Zuckerberg, means of production, move fast and break things, new economy, Norbert Wiener, obamacare, offshore financial centre, Paul Samuelson, precariat, Quicken Loans, Ralph Waldo Emerson, risk tolerance, Robert Gordon, Robert Shiller, Rodney Brooks, Ronald Reagan, scientific management, Second Machine Age, self-driving car, shareholder value, sharing economy, Silicon Valley, Snapchat, Steve Jobs, stock buybacks, TED Talk, The Chicago School, The Theory of the Leisure Class by Thorstein Veblen, Thomas L Friedman, Thorstein Veblen, Tim Cook: Apple, Uber and Lyft, 
uber lyft, universal basic income, urban renewal, Wayback Machine, WeWork, white picket fence, working poor, workplace surveillance , Y Combinator, young professional, zero-sum game

the Weather Channel broadcasts 18 million forecasts John Koetsier, “Data Deluge: What People Do on the Internet, Every Minute of Every Day,” Inc.com, July 25, 2017, https://www.inc.com/john-koetsier/every-minute-on-the-internet-2017-new-numbers-to-b.html. continuously improving its performance Many thanks to the very kind and patient data scientists who helped clarify this for me, and also see Christof Koch, “How the Computer Beat the Go Master,” Scientific American, March 19, 2016, https://www.scientificamerican.com/article/how-the-computer-beat-the-go-master/. the third leading cause of death in America Martin A. Makary and Michael Daniel, “Medical Error: The Third Leading Cause of Death in the US,” British Medical Journal, May 3, 2016, i2139, http://dx.doi.org/doi:10.1136/bmj.i2139.


pages: 385 words: 123,168

Bullshit Jobs: A Theory by David Graeber

1960s counterculture, active measures, antiwork, basic income, Berlin Wall, Bernie Sanders, Bertrand Russell: In Praise of Idleness, Black Lives Matter, Bretton Woods, Buckminster Fuller, business logic, call centre, classic study, cognitive dissonance, collateralized debt obligation, data science, David Graeber, do what you love, Donald Trump, emotional labour, equal pay for equal work, full employment, functional programming, global supply chain, High speed trading, hiring and firing, imposter syndrome, independent contractor, informal economy, Jarndyce and Jarndyce, job automation, John Maynard Keynes: technological unemployment, knowledge worker, moral panic, Post-Keynesian economics, post-work, precariat, Rutger Bregman, scientific management, Silicon Valley, Silicon Valley startup, single-payer health, software as a service, telemarketer, The Future of Employment, Thorstein Veblen, too big to fail, Travis Kalanick, universal basic income, unpaid internship, wage slave, wages for housework, women in the workforce, working poor, Works Progress Administration, young professional, éminence grise

Then the total number of “fails” in each department would be turned over to be tabulated by a metrics division, this allowing everyone involved to spend hours every week in meetings arguing over whether any particular “fail” was real. Irene: There was an even higher caste of bullshit, propped atop the metrics bullshit, which were the data scientists. Their job was to collect the fail metrics and apply complex software to make pretty pictures out of the data. The bosses would then take these pretty pictures to their bosses, which helped ease the awkwardness inherent in the fact that they had no idea what they were talking about or what any of their teams actually did.


pages: 480 words: 119,407

Invisible Women by Caroline Criado Perez

"Hurricane Katrina" Superdome, Affordable Care Act / Obamacare, algorithmic bias, augmented reality, Bernie Sanders, Cambridge Analytica, collective bargaining, crowdsourcing, data science, Diane Coyle, Donald Trump, falling living standards, first-past-the-post, gender pay gap, gig economy, glass ceiling, Grace Hopper, Hacker Ethic, independent contractor, Indoor air pollution, informal economy, lifelogging, low skilled workers, mental accounting, meta-analysis, Nate Silver, new economy, obamacare, Oculus Rift, offshore financial centre, pattern recognition, phenotype, post-industrial society, randomized controlled trial, remote working, Sheryl Sandberg, Silicon Valley, Simon Kuznets, speech recognition, stem cell, Stephen Hawking, Steven Levy, tech bro, the built environment, urban planning, women in the workforce, work culture , zero-sum game

A widely quoted 1967 psychological paper had identified a ‘disinterest in people’ and a dislike of ‘activities involving close personal interaction’ as a ‘striking characteristic of programmers’.62 As a result, companies sought these people out, they became the top programmers of their generation, and the psychological profile became a self-fulfilling prophecy. This being the case, it should not surprise us to find this kind of hidden bias enjoying a resurgence today courtesy of the secretive algorithms that have become increasingly involved in the hiring process. Writing for the Guardian, Cathy O’Neil, the American data scientist and author of Weapons of Math Destruction, explains how online tech-hiring platform Gild (which has now been bought and brought in-house by investment firm Citadel63) enables employers to go well beyond a job applicant’s CV, by combing through their ‘social data’.64 That is, the trace they leave behind them online.


pages: 432 words: 124,635

Happy City: Transforming Our Lives Through Urban Design by Charles Montgomery

2013 Report for America's Infrastructure - American Society of Civil Engineers - 19 March 2013, Abraham Maslow, accelerated depreciation, agricultural Revolution, American Society of Civil Engineers: Report Card, Apollo 11, behavioural economics, Bernie Madoff, Boeing 747, British Empire, Buckminster Fuller, car-free, carbon credits, carbon footprint, centre right, City Beautiful movement, clean water, congestion charging, correlation does not imply causation, data science, Donald Shoup, East Village, edge city, energy security, Enrique Peñalosa, experimental subject, food desert, Frank Gehry, General Motors Futurama, gentrification, Google Earth, happiness index / gross national happiness, hedonic treadmill, Home mortgage interest deduction, housing crisis, income inequality, income per capita, Induced demand, Intergovernmental Panel on Climate Change (IPCC), invisible hand, Jane Jacobs, license plate recognition, McMansion, means of production, megacity, Menlo Park, meta-analysis, mortgage tax deduction, New Urbanism, Panopticon Jeremy Bentham, peak oil, Ponzi scheme, power law, rent control, restrictive zoning, ride hailing / ride sharing, risk tolerance, science of happiness, Seaside, Florida, Silicon Valley, starchitect, streetcar suburb, the built environment, The Death and Life of Great American Cities, the High Line, The Spirit Level, The Wealth of Nations by Adam Smith, trade route, transit-oriented development, upwardly mobile, urban planning, urban sprawl, wage slave, white flight, World Values Survey, zero-sum game, Zipcar

They say they want economic development, livability, mobility, housing affordability, taxes, all stuff that relates to happiness.” These are just the concerns that have caused us to delay action on climate change. But Boston insists that by focusing on the relationship between energy, efficiency, and the things that make life better, cities can succeed where scary data, scientists, logic, and conscience have failed. The happy city plan is an energy plan. It is a climate plan. It is a belt-tightening plan for cash-strapped cities. It is also an economic plan, a jobs plan, and a corrective for weak systems. It is a plan for resilience. The Green Surprise Consider the by-product of the happy city project in Bogotá.


pages: 401 words: 119,488

Smarter Faster Better: The Secrets of Being Productive in Life and Business by Charles Duhigg

Air France Flight 447, Asperger Syndrome, Atul Gawande, behavioural economics, Black Swan, cognitive dissonance, Daniel Kahneman / Amos Tversky, data science, David Brooks, digital map, epigenetics, Erik Brynjolfsson, framing effect, high-speed rail, hiring and firing, index card, John von Neumann, knowledge worker, Lean Startup, Malcom McLean invented shipping containers, meta-analysis, new economy, power law, Saturday Night Live, Silicon Valley, Silicon Valley startup, statistical model, Steve Jobs, the scientific method, the strength of weak ties, theory of mind, Toyota Production System, William Langewiesche, Yom Kippur War

Since I contacted Gawande four years ago, I’ve sought out neurologists, businesspeople, government leaders, psychologists, and other productivity experts. I’ve spoken to the filmmakers behind Disney’s Frozen, and learned how they made one of the most successful movies in history under crushing time pressures—and narrowly averted disaster—by fostering a certain kind of creative tension within their ranks. I talked to data scientists at Google and writers from the early seasons of Saturday Night Live who said both organizations were successful, in part, because they abided by a similar set of unwritten rules regarding mutual support and risk taking. I interviewed FBI agents who solved a kidnapping through agile management and a culture influenced by an old auto plant in Fremont, California.


pages: 457 words: 126,996

Hacker, Hoaxer, Whistleblower, Spy: The Story of Anonymous by Gabriella Coleman

1960s counterculture, 4chan, Aaron Swartz, Amazon Web Services, Bay Area Rapid Transit, bitcoin, Chelsea Manning, citizen journalism, cloud computing, collective bargaining, corporate governance, creative destruction, crowdsourcing, data science, David Graeber, Debian, digital rights, disinformation, do-ocracy, East Village, Eben Moglen, Edward Snowden, false flag, feminist movement, Free Software Foundation, Gabriella Coleman, gentrification, George Santayana, Hacker News, hive mind, impulse control, information security, Jacob Appelbaum, jimmy wales, John Perry Barlow, Julian Assange, Laura Poitras, lolcat, low cost airline, mandatory minimum, Mohammed Bouazizi, Network effects, Occupy movement, Oklahoma City bombing, operational security, pirate software, power law, Richard Stallman, SETI@home, side project, Silicon Valley, Skype, SQL injection, Steven Levy, Streisand effect, TED Talk, Twitter Arab Spring, WikiLeaks, zero day

Even if Anonymous could never replicate the high levels of its LulzSec/AntiSec days (back when Fox News’ description of them as “hackers on steroids” was apt), Anonymous carried along just fine through 2012 and much of 2013, executing major hacks and attacks across the world. Its formidable reputation is best illustrated by an anecdote about the highest echelons of US officialdom. In 2012, Barack Obama’s reelection campaign team assembled a group of programmers, system administrators, mathematicians, and data scientists to fine-tune voter targeting. Journalists praised Obama’s star-studded and maverick technology team, detailing its members’ hard work, success, and travails, and ultimately heralding the system as a success. These articles, however, failed to report one of the team’s big concerns. Throughout the campaign, the technologists had treated Anonymous as a potentially even bigger nuisance than the foreign state hackers who had infiltrated the McCain and Obama campaigns in 2008.38 In late November 2012, Asher Wolf, a geek crusader who acted as a sometimes-informal-adviser to Anonymous, noticed that Harper Reed, the chief technologist for Obama’s reelection tech team, followed @AnonyOps on Twitter.


pages: 756 words: 120,818

The Levelling: What’s Next After Globalization by Michael O’sullivan

"World Economic Forum" Davos, 3D printing, Airbnb, Alan Greenspan, algorithmic trading, Alvin Toffler, bank run, banking crisis, barriers to entry, Bernie Sanders, Big Tech, bitcoin, Black Swan, blockchain, bond market vigilante , Boris Johnson, Branko Milanovic, Bretton Woods, Brexit referendum, British Empire, business cycle, business process, capital controls, carbon tax, Celtic Tiger, central bank independence, classic study, cloud computing, continuation of politics by other means, corporate governance, credit crunch, CRISPR, cryptocurrency, data science, deglobalization, deindustrialization, disinformation, disruptive innovation, distributed ledger, Donald Trump, driverless car, eurozone crisis, fake news, financial engineering, financial innovation, first-past-the-post, fixed income, gentrification, Geoffrey West, Santa Fe Institute, Gini coefficient, Glass-Steagall Act, global value chain, housing crisis, impact investing, income inequality, Intergovernmental Panel on Climate Change (IPCC), It's morning again in America, James Carville said: "I would like to be reincarnated as the bond market. 
You can intimidate everybody.", junk bonds, knowledge economy, liberal world order, Long Term Capital Management, longitudinal study, low interest rates, market bubble, minimum wage unemployment, new economy, Northern Rock, offshore financial centre, open economy, opioid epidemic / opioid crisis, Paris climate accords, pattern recognition, Peace of Westphalia, performance metric, Phillips curve, private military company, quantitative easing, race to the bottom, reserve currency, Robert Gordon, Robert Shiller, Robert Solow, Ronald Reagan, Scramble for Africa, secular stagnation, Silicon Valley, Sinatra Doctrine, South China Sea, South Sea Bubble, special drawing rights, Steve Bannon, Suez canal 1869, supply-chain management, The inhabitant of London could order by telephone, sipping his morning tea in bed, the various products of the whole earth, The Rise and Fall of American Growth, The Wealth of Nations by Adam Smith, Thomas Kuhn: the structure of scientific revolutions, total factor productivity, trade liberalization, tulip mania, Valery Gerasimov, Washington Consensus

India is a case in point: from 1991 to 2011 the number of internal migrants more than doubled. In 60 percent of cases, global migration consists of people moving to neighboring countries. As an example, a large proportion of Indian migrants move to regional neighbors, such as the United Arab Emirates, Kuwait, and Saudi Arabia.7 An excellent resource here comes from the data scientist Max Galka, who graphically tracks the flow of immigrants across the world.8 He produces long-term charts showing the waves of immigration into the United States over the past two centuries. His data show that America is founded on a bedrock of German, Irish, Italian, and eastern European immigrants and that lately (since the late 1980s) the biggest flow of immigrants has come from Mexico.


pages: 578 words: 131,346

Humankind: A Hopeful History by Rutger Bregman

"Hurricane Katrina" Superdome, Airbnb, Anton Chekhov, basic income, behavioural economics, Berlin Wall, bitcoin, Bletchley Park, Broken windows theory, call centre, data science, David Graeber, domesticated silver fox, Donald Trump, Easter island, experimental subject, fake news, Fall of the Berlin Wall, Frederick Winslow Taylor, Garrett Hardin, Hans Rosling, invention of writing, invisible hand, knowledge economy, late fees, Mahatma Gandhi, mass incarceration, meta-analysis, Milgram experiment, mirror neurons, Nelson Mandela, New Journalism, nocebo, placebo effect, Rutger Bregman, scientific management, sharing economy, Shoshana Zuboff, Silicon Valley, social intelligence, Stanford prison experiment, Stephen Fry, Stephen Hawking, Steve Jobs, Steven Pinker, surveillance capitalism, TED Talk, The Spirit Level, The Wealth of Nations by Adam Smith, Tragedy of the Commons, transatlantic slave trade, tulip mania, universal basic income, W. E. B. Du Bois, World Values Survey

Psychologically, physiologically, neurologically – they must be every kind of screwed up. They must be psychopaths, or maybe they never went to school, or grew up in abject poverty – there must be something to explain why they deviate so far from the average person. Not so, say sociologists. These stoic data scientists have filled miles of Excel sheets with the personality traits of people who have blown themselves up, only to conclude that, empirically, there is no such thing as an ‘average terrorist’. Terrorists span the spectrum from highly to hardly educated, from rich to poor, silly to serious, religious to godless.


The Ethical Algorithm: The Science of Socially Aware Algorithm Design by Michael Kearns, Aaron Roth

23andMe, affirmative action, algorithmic bias, algorithmic trading, Alignment Problem, Alvin Roth, backpropagation, Bayesian statistics, bitcoin, cloud computing, computer vision, crowdsourcing, data science, deep learning, DeepMind, Dr. Strangelove, Edward Snowden, Elon Musk, fake news, Filter Bubble, general-purpose programming language, Geoffrey Hinton, Google Chrome, ImageNet competition, Lyft, medical residency, Nash equilibrium, Netflix Prize, p-value, Pareto efficiency, performance metric, personalized medicine, pre–internet, profit motive, quantitative trading / quantitative finance, RAND corporation, recommendation engine, replication crisis, ride hailing / ride sharing, Robert Bork, Ronald Coase, self-driving car, short selling, sorting algorithm, sparse data, speech recognition, statistical model, Stephen Hawking, superintelligent machines, TED Talk, telemarketer, Turing machine, two-sided market, Vilfredo Pareto

We shall grapple here with difficult and consequential issues, but at the same time we want to communicate the excitement of new science. It is in this spirit of uncertainty and adventure that we begin our investigations. 1 Algorithmic Privacy From Anonymity to Noise “Anonymized Data Isn’t” It has been difficult for medical research to reap the fruits of large-scale data science because the relevant data is often highly sensitive individual patient records, which cannot be freely shared. In the mid-1990s, a government agency in Massachusetts called the Group Insurance Commission (GIC) decided to help academic researchers by releasing data summarizing hospital visits for every state employee.

But now that we know this, can the problem of privacy be solved by simply concealing information about birthdate, sex, and zip code in future data releases? It turns out that lots of less obvious things can also identify you—like the movies you watch. In 2006, Netflix launched the Netflix Prize competition, a public data science competition to find the best “collaborative filtering” algorithm to power Netflix’s movie recommendation engine. A key feature of Netflix’s service is its ability to recommend to users movies that they might like, given how they have rated past movies. (This was especially important when Netflix was primarily a mail-order DVD rental service, rather than a streaming service—it was harder to quickly browse or sample movies.)

These days virtually every major research university makes grand claims to interdisciplinarity, but Penn is the real deal. For that we give warm thanks to Eduardo Glandt, Amy Gutmann, Vijay Kumar, Vincent Price, and Wendell Pritchett. We are particularly grateful to Fred and Robin Warren, founders and benefactors of Penn’s Warren Center for Network and Data Sciences, for helping to create the remarkable intellectual melting pot that allowed this book to develop. Many thanks to Lily Hoot of the Warren Center for her unflagging professionalism and organizational help. We also are grateful to Raj and Neera Singh, founders and benefactors of Penn’s Networked and Social Systems Engineering (NETS) Program, in which we developed much of the narrative expressed in these pages.


pages: 139 words: 35,022

Roads and Bridges by Nadia Eghbal

AGPL, Airbnb, Amazon Web Services, barriers to entry, Benevolent Dictator For Life (BDFL), corporate social responsibility, crowdsourcing, cryptocurrency, data science, David Heinemeier Hansson, Debian, DevOps, en.wikipedia.org, Firefox, Free Software Foundation, GnuPG, Guido van Rossum, Ken Thompson, Khan Academy, Kickstarter, leftpad, Marc Andreessen, market design, Network effects, platform as a service, pull request, Richard Stallman, Ruby on Rails, Salesforce, side project, Silicon Valley, Skype, software is eating the world, the Cathedral and the Bazaar, Tragedy of the Commons, Y Combinator

NumFOCUS is an example of a 501(c)(3) foundation that supports open source scientific software through fiscal sponsorship and donations. [190] An external foundation model could help provide the support that scientific software needs within the context of an academic environment. The Alfred P. Sloan Foundation and the Gordon and Betty Moore Foundation are also experimenting with ways to connect academic institutions with maintainers of data science software, in order to facilitate an open and sustainable ecosystem. [191] Opportunities Ahead Developing effective support strategies Although there is growing interest in efforts to support digital infrastructure, current initiatives are still new, ad hoc or provide only partial support (such as fiscal sponsorship).

The enormous social contributions of today’s digital infrastructure cannot be ignored or argued away, as has happened with other, equally important debates about data and privacy, net neutrality, or private versus public interests. This makes it easier to shift the conversation to solutions. Secondly, there are already engaged, thriving open source communities to work with. Many developers identify with the programming language they use (such as Python or JavaScript), the function they provide (such as data science or devops), or a prominent project (such as Node.js or Rails). These are strong, vocal, and enthusiastic communities. The builders of our digital infrastructure are connected to each other, aware of their needs, and technically talented. They already built our city; we just need to help keep the lights on so they can continue doing what they do best.


pages: 344 words: 104,077

Superminds: The Surprising Power of People and Computers Thinking Together by Thomas W. Malone

Abraham Maslow, agricultural Revolution, Airbnb, Albert Einstein, Alvin Toffler, Amazon Mechanical Turk, Apple's 1984 Super Bowl advert, Asperger Syndrome, Baxter: Rethink Robotics, bitcoin, blockchain, Boeing 747, business process, call centre, carbon tax, clean water, Computing Machinery and Intelligence, creative destruction, crowdsourcing, data science, deep learning, Donald Trump, Douglas Engelbart, driverless car, drone strike, Elon Musk, en.wikipedia.org, Erik Brynjolfsson, experimental economics, Exxon Valdez, Ford Model T, future of work, Future Shock, Galaxy Zoo, Garrett Hardin, gig economy, happiness index / gross national happiness, independent contractor, industrial robot, Internet of things, invention of the telegraph, inventory management, invisible hand, Jeff Rulifson, jimmy wales, job automation, John Markoff, Joi Ito, Joseph Schumpeter, Kenneth Arrow, knowledge worker, longitudinal study, Lyft, machine translation, Marshall McLuhan, Nick Bostrom, Occupy movement, Pareto efficiency, pattern recognition, prediction markets, price mechanism, radical decentralization, Ray Kurzweil, Rodney Brooks, Ronald Coase, search costs, Second Machine Age, self-driving car, Silicon Valley, slashdot, social intelligence, Stephen Hawking, Steve Jobs, Steven Pinker, Stewart Brand, technological singularity, The Nature of the Firm, The Wealth of Nations by Adam Smith, The Wisdom of Crowds, theory of mind, Tim Cook: Apple, Tragedy of the Commons, transaction costs, Travis Kalanick, Uber for X, uber lyft, Vernor Vinge, Vilfredo Pareto, Watson beat the top human players on Jeopardy!

For a description of the invention of the process for printing on Pringles, see Larry Huston and Nabil Sakkab, “Connect and Develop: Inside Procter & Gamble’s New Model for Innovation,” Harvard Business Review, March 2006, reprint no. R0603C, https://hbr.org/2006/03/connect-and-develop-inside-procter-gambles-new-model-for-innovation. 6. Vincent Granville, “21 Data Science Systems Used by Amazon to Operate Its Business,” Data Science Central, November 19, 2015, http://www.datasciencecentral.com/profiles/blogs/20-data-science-systems-used-by-amazon-to-operate-its-business. 7. Martin Reeves and Daichi Ueda use the term integrated strategy machine to describe a somewhat similar idea in “Designing the Machines That Will Design Strategy,” Harvard Business Review, April 18, 2016, https://hbr.org/2016/04/welcoming-the-chief-strategy-robot.

For instance, if the people who submit proposed strategies for all the parts of your business include revenue and expense projections, then spreadsheets (or other simple programs) can do a good job of estimating the consolidated earnings for your whole company. Or if you’ve already done enough market research to have good automated models of how different customers respond to price changes, then you could use those models to estimate your revenue at different price points. For instance, Amazon has done vast amounts of data-science work to develop detailed models of many parts of its business: how customers respond to prices, ads, and recommendations; how supply-chain costs vary with inventory policies, delivery methods, and warehouse locations; and how load balancing and server purchases affect software and hardware costs.6 With tools like these, computers can do much of the work by “running the numbers,” and people can then use their general intelligence to do a higher level of analysis.


pages: 474 words: 130,575

Surveillance Valley: The Rise of the Military-Digital Complex by Yasha Levine

23andMe, activist fund / activist shareholder / activist investor, Adam Curtis, Airbnb, AltaVista, Amazon Web Services, Anne Wojcicki, anti-communist, AOL-Time Warner, Apple's 1984 Super Bowl advert, bitcoin, Black Lives Matter, borderless world, Boston Dynamics, British Empire, Californian Ideology, call centre, Charles Babbage, Chelsea Manning, cloud computing, collaborative editing, colonial rule, company town, computer age, computerized markets, corporate governance, crowdsourcing, cryptocurrency, data science, digital map, disinformation, don't be evil, Donald Trump, Douglas Engelbart, Dr. Strangelove, drone strike, dual-use technology, Edward Snowden, El Camino Real, Electric Kool-Aid Acid Test, Elon Musk, end-to-end encryption, fake news, fault tolerance, gentrification, George Gilder, ghettoisation, global village, Google Chrome, Google Earth, Google Hangouts, Greyball, Hacker Conference 1984, Howard Zinn, hypertext link, IBM and the Holocaust, index card, Jacob Appelbaum, Jeff Bezos, jimmy wales, John Gilmore, John Markoff, John Perry Barlow, John von Neumann, Julian Assange, Kevin Kelly, Kickstarter, Laura Poitras, life extension, Lyft, machine readable, Mark Zuckerberg, market bubble, Menlo Park, military-industrial complex, Mitch Kapor, natural language processing, Neal Stephenson, Network effects, new economy, Norbert Wiener, off-the-grid, One Laptop per Child (OLPC), packet switching, PageRank, Paul Buchheit, peer-to-peer, Peter Thiel, Philip Mirowski, plutocrats, private military company, RAND corporation, Ronald Reagan, Ross Ulbricht, Satoshi Nakamoto, self-driving car, sentiment analysis, shareholder value, Sheryl Sandberg, side project, Silicon Valley, Silicon Valley startup, Skype, slashdot, Snapchat, Snow Crash, SoftBank, speech recognition, Steve Jobs, Steve Wozniak, Steven Levy, Stewart Brand, Susan Wojcicki, Telecommunications Act of 1996, telepresence, telepresence robot, The Bell Curve by Richard Herrnstein and Charles Murray, The Hackers Conference, Tony Fadell, uber lyft, vertical integration, Whole Earth Catalog, Whole Earth Review, WikiLeaks

It was a kind of stripped-down 1960s version of Palantir, the powerful data mining, surveillance, and prediction software the military and intelligence planners use today. The project also funded various efforts to use these programs in ways that were beneficial to the military, including compiling various intelligence databases. As a bonus, the Cambridge Project served as a training ground for a new cadre of data scientists and military planners who learned to be proficient in data mining on it. The Cambridge Project had another, less menacing side. Financial analysts, psychologists, sociologists, CIA agents—the Cambridge Project was useful to anyone interested in working with large and complex data sets. The technology was universal and dual use.


pages: 400 words: 129,841

Capitalism: The Unknown Ideal by Ayn Rand

Alan Greenspan, Albert Einstein, anti-communist, Berlin Wall, British Empire, business cycle, data science, East Village, Ford Model T, Ford paid five dollars a day, full employment, Isaac Newton, laissez-faire capitalism, means of production, minimum wage unemployment, profit motive, the market place, trade route, transcontinental railway, urban renewal, War on Poverty, yellow journalism

Today’s frantic development in the field of technology has a quality reminiscent of the days preceding the economic crash of 1929: riding on the momentum of the past, on the unacknowledged remnants of an Aristotelian epistemology, it is a hectic, feverish expansion, heedless of the fact that its theoretical account is long since overdrawn—that in the field of scientific theory, unable to integrate or interpret their own data, scientists are abetting the resurgence of a primitive mysticism. In the humanities, however, the crash is past, the depression has set in, and the collapse of science is all but complete. The clearest evidence of it may be seen in such comparatively young sciences as psychology and political economy. In psychology, one may observe the attempt to study human behavior without reference to the fact that man is conscious.


pages: 515 words: 132,295

Makers and Takers: The Rise of Finance and the Fall of American Business by Rana Foroohar

"Friedman doctrine" OR "shareholder theory", "World Economic Forum" Davos, accounting loophole / creative accounting, activist fund / activist shareholder / activist investor, additive manufacturing, Airbnb, Alan Greenspan, algorithmic trading, Alvin Roth, Asian financial crisis, asset allocation, bank run, Basel III, Bear Stearns, behavioural economics, Big Tech, bonus culture, Bretton Woods, British Empire, business cycle, buy and hold, call centre, Capital in the Twenty-First Century by Thomas Piketty, Carl Icahn, Carmen Reinhart, carried interest, centralized clearinghouse, clean water, collateralized debt obligation, commoditize, computerized trading, corporate governance, corporate raider, corporate social responsibility, credit crunch, Credit Default Swap, credit default swaps / collateralized debt obligations, crony capitalism, crowdsourcing, data science, David Graeber, deskilling, Detroit bankruptcy, diversification, Double Irish / Dutch Sandwich, electricity market, Emanuel Derman, Eugene Fama: efficient market hypothesis, financial deregulation, financial engineering, financial intermediation, Ford Model T, Frederick Winslow Taylor, George Akerlof, gig economy, Glass-Steagall Act, Goldman Sachs: Vampire Squid, Gordon Gekko, greed is good, Greenspan put, guns versus butter model, High speed trading, Home mortgage interest deduction, housing crisis, Howard Rheingold, Hyman Minsky, income inequality, index fund, information asymmetry, interest rate derivative, interest rate swap, Internet of things, invisible hand, James Carville said: "I would like to be reincarnated as the bond market. 
You can intimidate everybody.", John Bogle, John Markoff, joint-stock company, joint-stock limited liability company, Kenneth Rogoff, Kickstarter, knowledge economy, labor-force participation, London Whale, Long Term Capital Management, low interest rates, manufacturing employment, market design, Martin Wolf, money market fund, moral hazard, mortgage debt, mortgage tax deduction, new economy, non-tariff barriers, offshore financial centre, oil shock, passive investing, Paul Samuelson, pensions crisis, Ponzi scheme, principal–agent problem, proprietary trading, quantitative easing, quantitative trading / quantitative finance, race to the bottom, Ralph Nader, Rana Plaza, RAND corporation, random walk, rent control, Robert Shiller, Ronald Reagan, Satyajit Das, Savings and loan crisis, scientific management, Second Machine Age, shareholder value, sharing economy, Silicon Valley, Silicon Valley startup, Snapchat, Social Responsibility of Business Is to Increase Its Profits, sovereign wealth fund, Steve Jobs, stock buybacks, subprime mortgage crisis, technology bubble, TED Talk, The Chicago School, the new new thing, The Spirit Level, The Wealth of Nations by Adam Smith, Tim Cook: Apple, Tobin tax, too big to fail, Tragedy of the Commons, trickle-down economics, Tyler Cowen: Great Stagnation, Vanguard fund, vertical integration, zero-sum game

Designs will be altered in real time to reflect the knowledge. But while all this technology in Schenectady has reduced the number of machinists needed to make a battery, it has also fueled the creation of a GE global research center in San Ramon, California. The center now employs more than one thousand software engineers, data scientists, and user-experience designers who are well paid to develop the software for that kind of industrial Internet—otherwise known as the Internet of things. GE plans to hire thousands more such employees within the next half-decade. “We are probably the most competitive, on a global basis, that we’ve been in the past 30 years,” in terms of being able to make things again in the United States, says CEO Jeffrey Immelt.


pages: 504 words: 129,087

The Ones We've Been Waiting For: How a New Generation of Leaders Will Transform America by Charlotte Alter

"Hurricane Katrina" Superdome, "World Economic Forum" Davos, 4chan, affirmative action, Affordable Care Act / Obamacare, basic income, Berlin Wall, Bernie Sanders, Big Tech, Black Lives Matter, carbon footprint, carbon tax, clean water, collective bargaining, Columbine, corporate personhood, correlation does not imply causation, Credit Default Swap, crowdsourcing, data science, David Brooks, deepfake, deplatforming, disinformation, Donald Trump, double helix, East Village, ending welfare as we know it, fake news, Fall of the Berlin Wall, feminist movement, Ferguson, Missouri, financial deregulation, Francis Fukuyama: the end of history, gentrification, gig economy, glass ceiling, Glass-Steagall Act, Google Hangouts, green new deal, Greta Thunberg, housing crisis, illegal immigration, immigration reform, income inequality, Intergovernmental Panel on Climate Change (IPCC), job-hopping, Kevin Kelly, knowledge economy, Lyft, mandatory minimum, Marc Andreessen, Mark Zuckerberg, mass incarceration, McMansion, medical bankruptcy, microaggression, move fast and break things, Nate Silver, obamacare, Occupy movement, opioid epidemic / opioid crisis, passive income, pre–internet, race to the bottom, RAND corporation, Ronald Reagan, sexual politics, Sheryl Sandberg, side hustle, Silicon Valley, single-payer health, Snapchat, Social Justice Warrior, Steve Bannon, TaskRabbit, tech bro, too big to fail, Uber and Lyft, uber lyft, universal basic income, unpaid internship, We are the 99%, white picket fence, working poor, Works Progress Administration

Billingsley and Clyde Tucker found that, contrary to the conventional wisdom that people simply get more conservative as they age, “each generation seems to display its own political behavior as a result of experiences during early adulthood.” Nearly thirty years later, Columbia political scientist Andrew Gelman and data scientist Yair Ghitza built on this research in their 2014 study of longitudinal data on voter behavior. They found that while variables such as religion, geography, or parental political influence remain important, shared experiences between ages fourteen and twenty-four have a significant impact on lifelong political attitudes.


pages: 515 words: 143,055

The Attention Merchants: The Epic Scramble to Get Inside Our Heads by Tim Wu

1960s counterculture, Aaron Swartz, Affordable Care Act / Obamacare, AltaVista, Andrew Keen, anti-communist, AOL-Time Warner, Apple II, Apple's 1984 Super Bowl advert, barriers to entry, Bob Geldof, borderless world, Brownian motion, Burning Man, Cass Sunstein, citizen journalism, colonial rule, content marketing, cotton gin, data science, do well by doing good, East Village, future of journalism, George Gilder, Golden age of television, Golden Gate Park, Googley, Gordon Gekko, Herbert Marcuse, housing crisis, informal economy, Internet Archive, Jaron Lanier, Jeff Bezos, jimmy wales, John Perry Barlow, Live Aid, Mark Zuckerberg, Marshall McLuhan, McMansion, mirror neurons, Nate Silver, Neal Stephenson, Network effects, Nicholas Carr, Pepsi Challenge, placebo effect, Plato's cave, post scarcity, race to the bottom, road to serfdom, Saturday Night Live, science of happiness, self-driving car, side project, Silicon Valley, Skinner box, slashdot, Snapchat, Snow Crash, Steve Jobs, Steve Wozniak, Steven Levy, Ted Nelson, telemarketer, the built environment, The Chicago School, the scientific method, The Structural Transformation of the Public Sphere, Tim Cook: Apple, Torches of Freedom, Upton Sinclair, upwardly mobile, Virgin Galactic, Wayback Machine, white flight, Yochai Benkler, zero-sum game

And while this might sound like unprecedented cynicism vis-à-vis the audience, the idea was to transfer creative intention to them; they alone would “decide if the project reaches 10 people or 10 million people.”1 To help them decide, BuzzFeed pioneered techniques like “headline optimization,” which was meant to make the piece irresistible and clicking on it virtually involuntary. In the hands of the headline doctors, a video like “Zach Wahls Speaks About Family” became “Two Lesbians Raised a Baby and This Is What They Got”—and earned 18 million views. BuzzFeed’s lead data scientist, Ky Harlin, once crisply explained the paradoxical logic of headlining: “You can usually get somebody to click on something just based on their own curiosity or something like that, but it doesn’t mean that they’re actually going to end up liking the content.” BuzzFeed also developed the statistical analysis of sharing, keeping detailed information on various metrics, especially the one they called “viral lift.”


pages: 432 words: 143,491

Failures of State: The Inside Story of Britain's Battle With Coronavirus by Jonathan Calvert, George Arbuthnott

Boeing 747, Boris Johnson, Brexit referendum, Bullingdon Club, centre right, collapse of Lehman Brothers, contact tracing, contact tracing app, coronavirus, COVID-19, data science, disinformation, Dominic Cummings, Donald Trump, Etonian, gig economy, global pandemic, high-speed rail, Jeremy Corbyn, Kickstarter, lockdown, nudge unit, open economy, Rishi Sunak, Ronald Reagan, Skype, social distancing, zoonotic diseases

There were signs, however, that Javid’s enemy – the prime minister’s chief adviser – was beginning to sense that coronavirus was more of a problem than Downing Street had previously realised. Cummings had sent one of his most trusted lieutenants, Ben Warner, to listen in on the Sage expert committee meetings, which were now taking place regularly in response to the virus. Warner was a data scientist who helped mastermind the computer modelling for Vote Leave’s 2016 referendum campaign and Cummings had drafted him into No. 10 to do similar analysis for the Conservative Party’s 2019 general election campaign. He had joined the committee as an observer for the first time on Thursday 20 February – which was the ninth meeting Sage had held to discuss the UK’s reaction to the virus.


pages: 642 words: 141,888

Like, Comment, Subscribe: Inside YouTube's Chaotic Rise to World Domination by Mark Bergen

23andMe, 4chan, An Inconvenient Truth, Andy Rubin, Anne Wojcicki, Big Tech, Black Lives Matter, book scanning, Burning Man, business logic, call centre, Cambridge Analytica, citizen journalism, cloud computing, Columbine, company town, computer vision, coronavirus, COVID-19, crisis actor, crowdsourcing, cryptocurrency, data science, David Graeber, DeepMind, digital map, disinformation, don't be evil, Donald Trump, Edward Snowden, Elon Musk, fake news, false flag, game design, gender pay gap, George Floyd, gig economy, global pandemic, Golden age of television, Google Glasses, Google X / Alphabet X, Googley, growth hacking, Haight Ashbury, immigration reform, James Bridle, John Perry Barlow, Justin.tv, Kevin Roose, Khan Academy, Kinder Surprise, Marc Andreessen, Marc Benioff, Mark Zuckerberg, mass immigration, Max Levchin, Menlo Park, Minecraft, mirror neurons, moral panic, move fast and break things, non-fungible token, PalmPilot, paypal mafia, Peter Thiel, Ponzi scheme, QAnon, race to the bottom, recommendation engine, Rubik’s Cube, Salesforce, Saturday Night Live, self-driving car, Sheryl Sandberg, side hustle, side project, Silicon Valley, slashdot, Snapchat, social distancing, Social Justice Warrior, speech recognition, Stanford marshmallow experiment, Steve Bannon, Steve Jobs, Steven Levy, surveillance capitalism, Susan Wojcicki, systems thinking, tech bro, the long tail, The Wisdom of Crowds, TikTok, Walter Mischel, WikiLeaks, work culture

The internet’s top destination, Facebook—Pepsi to their Coke—captured about 200 million hours every month of its users’ time, according to their math. Very sticky. TV, the big stomach, claimed four to five hours of the average American’s day, depending on who was doing the counting. YouTube was then clocking around 100 million hours of viewed footage every day. So, 100 million. 10x. Mehrotra left the conference room and beelined to a data scientist who worked for him. “What would it mean to hit a billion?” he asked. “When could we reasonably do that?” Mehrotra announced the new OKR the following year at YouTube’s annual leadership summit in Los Angeles: YouTube would work to get one billion hours of watch time every day within four years.


pages: 439 words: 131,081

The Chaos Machine: The Inside Story of How Social Media Rewired Our Minds and Our World by Max Fisher

2021 United States Capitol attack, 4chan, A Declaration of the Independence of Cyberspace, Airbnb, Bellingcat, Ben Horowitz, Bernie Sanders, Big Tech, Bill Gates: Altair 8800, bitcoin, Black Lives Matter, call centre, centre right, cloud computing, Comet Ping Pong, Computer Lib, coronavirus, COVID-19, crisis actor, crowdsourcing, dark pattern, data science, deep learning, deliberate practice, desegregation, disinformation, domesticated silver fox, Donald Trump, Douglas Engelbart, end-to-end encryption, fake news, Filter Bubble, Future Shock, game design, gamification, George Floyd, growth hacking, Hacker Conference 1984, Hacker News, hive mind, illegal immigration, Jeff Bezos, John Perry Barlow, Jon Ronson, Joseph Schumpeter, Julian Assange, Kevin Roose, lockdown, Lyft, Marc Andreessen, Mark Zuckerberg, Max Levchin, military-industrial complex, Oklahoma City bombing, Parler "social media", pattern recognition, Paul Graham, Peter Thiel, profit maximization, public intellectual, QAnon, recommendation engine, ride hailing / ride sharing, Rutger Bregman, Saturday Night Live, Sheryl Sandberg, side project, Silicon Valley, Silicon Valley ideology, Silicon Valley startup, Snapchat, social distancing, Social Justice Warrior, social web, Startup school, Stephen Hawking, Steve Bannon, Steve Jobs, Steve Wozniak, Steven Levy, Stewart Brand, Susan Wojcicki, tech billionaire, tech worker, Ted Nelson, TED Talk, TikTok, Uber and Lyft, uber lyft, Whole Earth Catalog, WikiLeaks, Y Combinator

Facebook announced it would allow politicians to lie on the platform and grant them special latitude on hate speech, rules that seemed written for Trump and his allies. “I’d been at FB for less than a year when I was pulled into an urgent inquiry—President Trump’s campaign complained about experiencing a decline in views,” Sophie Zhang, a Facebook data scientist, recalled on Twitter, “I never was asked to investigate anything similar for anyone else.” This sort of appeasement of political leaders appeared to be a global strategy. Between 2018 and 2020, Zhang flagged dozens of incidents of foreign leaders promoting lies and hate for gain, but was consistently overruled, she has said.


pages: 302 words: 73,581

Platform Scale: How an Emerging Business Model Helps Startups Build Large Empires With Minimum Investment by Sangeet Paul Choudary

3D printing, Airbnb, Amazon Web Services, barriers to entry, bitcoin, blockchain, business logic, business process, Chuck Templeton: OpenTable:, Clayton Christensen, collaborative economy, commoditize, crowdsourcing, cryptocurrency, data acquisition, data science, fake it until you make it, frictionless, game design, gamification, growth hacking, Hacker News, hive mind, hockey-stick growth, Internet of things, invisible hand, Kickstarter, Lean Startup, Lyft, M-Pesa, Marc Andreessen, Mark Zuckerberg, means of production, multi-sided market, Network effects, new economy, Paul Graham, recommendation engine, ride hailing / ride sharing, Salesforce, search costs, shareholder value, sharing economy, Silicon Valley, Skype, Snapchat, social bookmarking, social graph, social software, software as a service, software is eating the world, Spread Networks laid a new fibre optics cable between New York and Chicago, TaskRabbit, the long tail, the payments system, too big to fail, transport as a service, two-sided market, Uber and Lyft, Uber for X, uber lyft, vertical integration, Wave and Pay

As the value of the platform increases with greater participation, consumers and producers are organically incentivized to stay engaged on the platform because the platform provides increasing amounts of value to both parties.
DATA SCIENCE IS THE NEW BUSINESS PROCESS OPTIMIZATION
Pipes achieve scale by improving the repeatability and efficiency of value-creation processes. The world of pipes required process engineering and optimization. Process engineers and managers helped improve internal processes and make them more efficient. In a platformed world, value is created in interactions between users, powered by data. Data science improves the platform’s ability to orchestrate interactions in the ecosystem. As value creation moves from organizational processes to ecosystem interactions, the focus of efficiency shifts from the enhancement of controlled processes to the improvement of the platform’s ability to orchestrate interactions in the ecosystem.

Community management is the new human resources management
6. Liquidity management is the new inventory control
7. Curation and reputation are the new quality control
8. User journeys are the new sales funnels
9. Distribution is the new destination
10. Behavior design is the new loyalty program
11. Data science is the new business process optimization
12. Social feedback is the new sales commission
13. Algorithms are the new decision makers
14. Real-time customization is the new market research
15. Plug-and-play is the new business development
16. The invisible hand is the new iron fist
1.3 THE RISE OF THE INTERACTION-FIRST BUSINESS
A Fundamental Redesign Of Business Logic
Platforms compete with each other on the basis of their ability to enable interactions sustainably.


pages: 284 words: 79,265

The Half-Life of Facts: Why Everything We Know Has an Expiration Date by Samuel Arbesman

Albert Einstein, Alfred Russel Wallace, Amazon Mechanical Turk, Andrew Wiles, Apollo 11, bioinformatics, British Empire, Cesare Marchetti: Marchetti’s constant, Charles Babbage, Chelsea Manning, Clayton Christensen, cognitive bias, cognitive dissonance, conceptual framework, data science, David Brooks, demographic transition, double entry bookkeeping, double helix, Galaxy Zoo, Gregor Mendel, guest worker program, Gödel, Escher, Bach, Ignaz Semmelweis: hand washing, index fund, invention of movable type, Isaac Newton, John Harrison: Longitude, Kevin Kelly, language acquisition, Large Hadron Collider, life extension, Marc Andreessen, meta-analysis, Milgram experiment, National Debt Clock, Nicholas Carr, P = NP, p-value, Paul Erdős, Pluto: dwarf planet, power law, publication bias, randomized controlled trial, Richard Feynman, Rodney Brooks, scientific worldview, SimCity, social contagion, social graph, social web, systematic bias, text mining, the long tail, the scientific method, the strength of weak ties, Thomas Kuhn: the structure of scientific revolutions, Thomas Malthus, Tyler Cowen, Tyler Cowen: Great Stagnation

For example, researchers have examined the drinking establishment locations and characteristics in different communities, and even whether the elderly are capable of crossing the street in the time a given traffic light provides them. In the past few years there has been a surge in what is being called data science. Of course, all science uses data, but data science is more of a return to the Galtonian approach, where through the analysis of massive amounts of data—how people date on the Internet, make phone calls, shop online, and much more—one can begin to visualize and make sense of the world, and in the process discover new facts about ourselves and our surroundings.

., 18–19 classification systems, 204–5 Clay Mathematics Institute, 133 climate change, 203 clinical trials, 107–9, 157, 160 coelacanths, 26–27 cognitive biases, 175–76, 177, 188 cognitive dissonance, 4 Colbert, Stephen, 193 Cole, Jonathan, 48–49 Cole, Stephen, 162, 163 computation, human, 20 computers, 20, 41, 53, 110 automated discovery programs, 112–14 Babbage and, 106–7 games and, 2, 51 information transformation and, 43–44, 46 Moore’s Law and, 42 confirmation bias, 177 Consumer Price Index (CPI), 196 Cope, Edward, 80, 81, 169 Copernicus, Nicolaus, 206 CoPub Discovery, 110–12 Cosmos, 121, 129 Couric, Katie, 41 Courtenay-Latimer, Marjorie, 26–27 Cowen, Tyler, 23 cryptography, 134 cumulative knowledge, 56–57 Daily Show, The, 159 Darwin, Charles, 79, 80, 105, 166, 187 data science, 167–68 Davy, Humphry, 51 decline effect, 155–56, 157, 162 de Grey, Aubrey, 53 demographics, 204 Dessler, A. J., 148–49, 155 deuterium, 151 Devezas, Tessaleno, 207–8 DEVONthink, 118–19 Diabetes Care, 67 dialect, situation-based, 190 Diamond, Arthur, 187 Dictionary of Theories (Bothamley, ed.), 85 dinosaurs, 3, 79–82, 168–69, 194 discovery: long tail of, 38 multiple independent, 104–5 pace of, 9–25 discriminating power, 159–60 diseases, 52, 176–77 categorization of, 205 spread of, 62, 64 Dittmar, Jeremiah, 71, 73 Dixon, William Macneile, 8 DNA, 88, 90, 122, 163 drugs, 24, 111–12 repurposing of, 112 streptokinase, 108–9 Dunbar, Robin, 205 Dunbar’s Number, 205–6 Earth, curvature of, 35–36 education, 182–83, 195 Einstein, Albert, 36, 106, 186 Electronics, 42 Ellsworth, Henry, 54 e-mail, 41 Empedocles, 201 Encyclopaedia of Scientific Units, Weights, and Measures: Their SI Equivalences and Origins (Cardarelli), 146 EndNote, 117–18 energy, 55, 204 Eos, 148 Erdős, Paul, 104 errors, 78–95 contrary to popular belief phrase and, 84–85 Essay on the Application of Mathematical Analysis to the Theories of Electricity and Magnetism, An (Green), 106 eurekometrics, 21, 22 Eureqa, 113–14 Everest, George,
140 evolution, 79, 187 evolutionary programming, 113 evolutionary psychology, 175 expertise, long tail of, 96, 102 experts, 96–97 exponential growth, 10–14, 44–45, 46–47, 54–55, 57, 59, 130, 204 extinct species, 26, 27–28 facts, see knowledge and facts factual inertia, 175, 179–83, 188, 190, 199 Fallows, James, 86 Fermat, Pierre de, 132 Feynman, Richard, 104 fish, 201 fishing, 173 fish oil, 99, 110 Florey, Lord, 163 Flory, Paul, 104 Foldit, 20 Franzen, Jonathan, 208–9 French Canadians, 193–94 frogs: boiling of, 86, 171 vision of, 171 Galaxy Zoo, 20 Galileo, 21, 143–44 Galton, Francis, 165–68 games, 51 generational knowledge, 183–85, 199 genetics, 87–90 genome sequencing, 48, 51 Gibrat’s Law, 103 Goddard, Robert H., 174 Godwin’s law, 105 Goldbach’s Conjecture, 112–13 Goodman, Steven, 107–8 Gould, Stephen Jay, 82 grammar: descriptive, 188–89 prescriptive, 188–89, 194 Granovetter, Mark, 76–78 Graves’ disease, 111 Great Vowel Shift, 191–93 Green, George, 105–6 growth: exponential, 10–14, 44–45, 46–47, 54–55, 57, 59, 130, 204 hyperbolic, 59 linear, 10, 11 Gumbel, Bryant, 41 Gutenberg, Johannes, 71–73, 78, 95 Hamblin, Terry, 83 Harrison, John, 102 Hawthorne effect, 55–56 helium, 104 Helmann, John, 162 Henrich, Joseph, 58 hepatitis, 28–30 hidden knowledge, 96–120 h-index, 17 Hirsch, Jorge, 17 History of the Modern Fact, A (Poovey), 200 Holmes, Sherlock, 206 homeoteleuton, 89 Hooke, Robert, 21, 94 Hull, David, 187–88 human anatomy, 23 human computation, 20 hydrogen, 151 hyperbolic growth rate, 59 idiolect, 190 impact factors, 16–17 inattentional blindness (change blindness), 177–79 India, 140–41 informational index funds, 197 information transformation, 43–44, 46 InnoCentive, 96–98, 101, 102 innovation, 204 population size and, 135–37, 202 prizes for, 102–3 simultaneous, 104–5 integrated circuits, 42, 43, 55, 203 Intel Corporation, 42 interdisciplinary research, 68–69 International Bureau of Weights and Measures, 47 Internet, 2, 40–41, 53, 198, 208, 211 Ioannidis, John, 
156–61, 162 iPhone, 123 iron: magnetic properties of, 49–50 in spinach, 83–84 Ising, Ernst, 124, 125–26, 138 isotopes, 151 Jackson, John Hughlings, 30 Johnson, Steven, 119 Journal of Physical and Chemical Reference Data, 33–35 journals, 9, 12, 16–17, 32 Kahneman, Daniel, 177 Kay, Alan, 173 Kelly, Kevin, 38, 46 Kelly, Stuart, 115 Kelvin, Lord, 142–43 Kennaway, Kristian, 86 Keynes, John Maynard, 172 kidney stones, 52 kilogram, 147–48 Kiribati, 203 Kissinger, Henry, 190 Kleinberg, Jon, 92–93 knowledge and facts, 5, 54 cumulative, 56–57 erroneous, 78–95, 211–14 half-lives of, 1–8, 202 hidden, 96–120 phase transitions in, 121–39, 185 spread of, 66–95 Koh, Heebyung, 43, 45–46, 56 Kremer, Michael, 58–61 Kuhn, Thomas, 163, 186 Lambton, William, 140 land bridges, 57, 59–60 language, 188–94 French Canadians and, 193–94 grammar and, 188–89, 194 Great Vowel Shift and, 191–93 idiolect and, 190 situation-based dialect and, 190 verbs in, 189 voice onset time and, 190 Large Hadron Collider, 159 Laughlin, Gregory, 129–31 “Laws Underlying the Physics of Everyday Life Really Are Completely Understood, The” (Carroll), 36–37 Lazarus taxa, 27–28 Le Fanu, James, 23 LEGO, 184–85, 194 Lehman, Harvey, 13–14, 15 Leibniz, Gottfried, 67 Lenat, Doug, 112 Levan, Albert, 1–2 Liben-Nowell, David, 92–93 libraries, 31–32 life span, 53–54 Lincoln, Abraham, 70 linear growth, 10, 11 Linnaeus, Carl, 22, 204 Lippincott, Sara, 86 Lipson, Hod, 113 Little Science, Big Science (Price), 13 logistic curves, 44–46, 50, 116, 130, 203–4 longitude, 102 Long Now Foundation, 195 long tails: of discovery, 38 of expertise, 96, 102 of life, 38 of popularity, 103 Lou Gehrig’s disease (ALS), 98, 100–101 machine intelligence, 207 Magee, Chris, 43, 45–46, 56, 207–8 magicians, 178–79 magnetic properties of iron, 49–50 Maldives, 203 Malthus, Thomas, 59 mammal species, 22, 23, 128 extinct, 28 manuscripts, 87–91, 114–16 Marchetti, Cesare, 64 Marsh, Othniel, 80–81, 169 mathematics, 19, 51, 112–14, 124–25, 132–35 Matthew effect, 
103 Mauboussin, Michael, 84 Mayor, Michel, 122 McGovern, George, 66 McIntosh, J.


pages: 163 words: 42,402

Machine Learning for Email by Drew Conway, John Myles White

call centre, correlation does not imply causation, data science, Debian, natural language processing, Netflix Prize, pattern recognition, recommendation engine, SpamAssassin, text mining

--The R Project for Statistical Computing, http://www.r-project.org/ The best thing about R is that it was developed by statisticians. The worst thing about R is that...it was developed by statisticians. --Bo Cowgill, Google, Inc. R is an extremely powerful language for manipulating and analyzing data. Its meteoric rise in popularity within the data science and machine learning communities has made it the de facto lingua franca for analytics. R’s success in the data analysis community stems from two factors described in the epigraphs above: R provides most of the technical power that statisticians require built into the default language, and R has been supported by a community of statisticians who are also open source devotees.

His academic curiosity is informed by his years as an analyst in the U.S. intelligence and defense communities. John Myles White is a Ph.D. student in the Princeton Psychology Department, where he studies how humans make decisions both theoretically and experimentally. Outside of academia, John has been heavily involved in the data science movement, which has pushed for an open source software approach to data analysis. He is also the lead maintainer for several popular R packages, including ProjectTemplate and log4r.


pages: 199 words: 48,162

Capital Allocators: How the World’s Elite Money Managers Lead and Invest by Ted Seides

Albert Einstein, asset allocation, behavioural economics, business cycle, coronavirus, COVID-19, crowdsourcing, data science, deliberate practice, diversification, Everything should be made as simple as possible, fake news, family office, fixed income, high net worth, hindsight bias, impact investing, implied volatility, impulse control, index fund, Kaizen: continuous improvement, Lean Startup, loss aversion, Paradox of Choice, passive investing, Ralph Waldo Emerson, risk tolerance, Sharpe ratio, sovereign wealth fund, tail risk, The Wisdom of Crowds, Toyota Production System, zero-sum game

The turmoil in markets, employment, and working protocols caused by Covid-19 in March 2020 presented a recent case study for how CIOs respond to a period of uncertainty. To learn more Podcasts Capital Allocators: Patrick O’Shaughnessy – O’Shaughnessy Asset Management (First Meeting, Ep.1) Capital Allocators: Jordi Visser – Next Generation of Manager Allocation (Ep.92) Capital Allocators: Matthew Granade – Inside Data Science at Point72 (First Meeting, Ep.22) Companies Novus Partners, www.novus.com Essentia Analytics, www.essentia-analytics.com Alpha Theory, www.alphatheory.com Reading won’t help much in improving investment results through quantitative means. Instead, reach out to Novus, Essentia, and Alpha Theory to learn more about their application of tools for allocators and portfolio managers

Scott, Kim and Andy’s conversations are your picks – the most downloaded shows among the CIO interviews. Data analysis Podcasts Capital Allocators: Patrick O’Shaughnessy – O’Shaughnessy Asset Management (First Meeting, Ep.1) Capital Allocators: Jordi Visser – Next Generation of Manager Allocation (Ep.92) Capital Allocators: Matthew Granade – Inside Data Science at Point72 (First Meeting, Ep.22) Companies Novus Partners, www.novus.com Essentia Analytics, www.essentia-analytics.com Alpha Theory, www.alphatheory.com Reading won’t help much in improving investment results through quantitative means. Instead, reach out to Novus, Essentia, and Alpha Theory to learn more about their application of tools for allocators and portfolio managers


pages: 319 words: 90,965

The End of College: Creating the Future of Learning and the University of Everywhere by Kevin Carey

Albert Einstein, barriers to entry, Bayesian statistics, behavioural economics, Berlin Wall, Blue Ocean Strategy, business cycle, business intelligence, carbon-based life, classic study, Claude Shannon: information theory, complexity theory, data science, David Heinemeier Hansson, declining real wages, deliberate practice, discrete time, disruptive innovation, double helix, Douglas Engelbart, Douglas Engelbart, Downton Abbey, Drosophila, Fairchild Semiconductor, Firefox, Frank Gehry, Google X / Alphabet X, Gregor Mendel, informal economy, invention of the printing press, inventory management, John Markoff, Khan Academy, Kickstarter, low skilled workers, Lyft, Marc Andreessen, Mark Zuckerberg, meta-analysis, natural language processing, Network effects, open borders, pattern recognition, Peter Thiel, pez dispenser, Recombinant DNA, ride hailing / ride sharing, Ronald Reagan, Ruby on Rails, Sand Hill Road, self-driving car, Silicon Valley, Silicon Valley startup, social web, South of Market, San Francisco, speech recognition, Steve Jobs, technoutopianism, transcontinental railway, uber lyft, Vannevar Bush

By 2014, edX was offering hundreds of free online courses in subjects including the Poetry of Walt Whitman, the History of Early Christianity, Computational Neuroscience, Flight Vehicle Aerodynamics, Shakespeare, Dante’s Divine Comedy, Bioethics, Contemporary India, Historical Relic Treasures and Cultural China, Linear Algebra, Autonomous Mobile Robots, Electricity and Magnetism, Discrete Time Signals and Systems, Introduction to Global Sociology, Behavioral Economics, Fundamentals of Immunology, Computational Thinking and Data Science, and an astrophysics course titled Greatest Unsolved Mysteries of the Universe. Doing this seemed to contradict five hundred years of higher-education economics in which the wealthiest and most sought-after colleges enforced a rigid scarcity over their products and services. The emerging University of Everywhere threatened institutions that depended on the privilege of being scarce, expensive places.

She quoted Edwin Slosson’s well-known alleged quip that lecture notes are a way of transmitting information from the lecturer to the student without it passing through the minds of either one of them. Third, there will be a transformation of the study of human learning, from a series of anecdotes to real-data science. Suppes had written about this, too. “The power of the computer to assemble and provide data as a basis for [educational] decisions,” he wrote, “will be perhaps the most powerful impetus to the development of education theory yet to appear.” As we finished the interview, Michael Staton mentioned to Koller that Learn Capital was putting together a new pool of investment money.

Over time those courses will be organized into sequences that approximate the scope of learning we associate with college majors. MIT is already moving in this direction, starting with a seven-course sequence in computer programming that begins with introductions to coding, computational thinking, and data science and then moves to software construction, digital circuits, programmable architectures, and computer systems organization. The length of the course sequences will vary depending on the field, profession, or kind of work. Some will involve a few courses; others will be dozens long. Neither the courses nor the sequences will be constrained by the artificial limitations of semester hours or years spent attending school.


pages: 523 words: 148,929

Physics of the Future: How Science Will Shape Human Destiny and Our Daily Lives by the Year 2100 by Michio Kaku

agricultural Revolution, AI winter, Albert Einstein, Alvin Toffler, Apollo 11, Asilomar, augmented reality, Bill Joy: nanobots, bioinformatics, blue-collar work, British Empire, Brownian motion, caloric restriction, caloric restriction, cloud computing, Colonization of Mars, DARPA: Urban Challenge, data science, delayed gratification, digital divide, double helix, Douglas Hofstadter, driverless car, en.wikipedia.org, Ford Model T, friendly AI, Gödel, Escher, Bach, Hans Moravec, hydrogen economy, I think there is a world market for maybe five computers, industrial robot, Intergovernmental Panel on Climate Change (IPCC), invention of movable type, invention of the telescope, Isaac Newton, John Markoff, John von Neumann, Large Hadron Collider, life extension, Louis Pasteur, Mahatma Gandhi, Mars Rover, Mars Society, mass immigration, megacity, Mitch Kapor, Murray Gell-Mann, Neil Armstrong, new economy, Nick Bostrom, oil shale / tar sands, optical character recognition, pattern recognition, planetary scale, postindustrial economy, Ray Kurzweil, refrigerator car, Richard Feynman, Rodney Brooks, Ronald Reagan, Search for Extraterrestrial Intelligence, Silicon Valley, Simon Singh, social intelligence, SpaceShipOne, speech recognition, stem cell, Stephen Hawking, Steve Jobs, synthetic biology, telepresence, The future is already here, The Wealth of Nations by Adam Smith, Thomas L Friedman, Thomas Malthus, trade route, Turing machine, uranium enrichment, Vernor Vinge, Virgin Galactic, Wall-E, Walter Mischel, Whole Earth Review, world market for maybe five computers, X Prize

If you go to a nursing home, where people are wasting away, living with constant pain, and waiting to die and ask the same question, you might get an entirely different answer.) As UCLA’s Greg Stock says, “Gradually, our agonizing about playing God and our worries about longer life spans would give way to a new chorus: ‘When can I get a pill?’ ” In 2002, with the best demographic data, scientists estimated that 6 percent of all humans who have ever walked the face of the earth are still alive today. This is because the human population hovered at around 1 million for most of human history. Foraging for meager supplies of food kept the human population down. Even during the height of the Roman Empire, its population was estimated to be only 55 million.


pages: 527 words: 147,690

Terms of Service: Social Media and the Price of Constant Connection by Jacob Silverman

"World Economic Forum" Davos, 23andMe, 4chan, A Declaration of the Independence of Cyberspace, Aaron Swartz, Airbnb, airport security, Amazon Mechanical Turk, augmented reality, basic income, Big Tech, Brian Krebs, California gold rush, Californian Ideology, call centre, cloud computing, cognitive dissonance, commoditize, company town, context collapse, correlation does not imply causation, Credit Default Swap, crowdsourcing, data science, deep learning, digital capitalism, disinformation, don't be evil, driverless car, drone strike, Edward Snowden, Evgeny Morozov, fake it until you make it, feminist movement, Filter Bubble, Firefox, Flash crash, game design, global village, Google Chrome, Google Glasses, Higgs boson, hive mind, Ian Bogost, income inequality, independent contractor, informal economy, information retrieval, Internet of things, Jacob Silverman, Jaron Lanier, jimmy wales, John Perry Barlow, Kevin Kelly, Kevin Roose, Kickstarter, knowledge economy, knowledge worker, Larry Ellison, late capitalism, Laura Poitras, license plate recognition, life extension, lifelogging, lock screen, Lyft, machine readable, Mark Zuckerberg, Mars Rover, Marshall McLuhan, mass incarceration, meta-analysis, Minecraft, move fast and break things, national security letter, Network effects, new economy, Nicholas Carr, Occupy movement, off-the-grid, optical character recognition, payday loans, Peter Thiel, planned obsolescence, postindustrial economy, prediction markets, pre–internet, price discrimination, price stability, profit motive, quantitative hedge fund, race to the bottom, Ray Kurzweil, real-name policy, recommendation engine, rent control, rent stabilization, RFID, ride hailing / ride sharing, Salesforce, self-driving car, sentiment analysis, shareholder value, sharing economy, Sheryl Sandberg, Silicon Valley, Silicon Valley ideology, Snapchat, social bookmarking, social graph, social intelligence, social web, sorting algorithm, Steve Ballmer, Steve Jobs, Steven Levy, 
systems thinking, TaskRabbit, technological determinism, technological solutionism, technoutopianism, TED Talk, telemarketer, transportation-network company, Travis Kalanick, Turing test, Uber and Lyft, Uber for X, uber lyft, universal basic income, unpaid internship, women in the workforce, Y Combinator, yottabyte, you are the product, Zipcar

Ironically, it’s the very unreliability of Big Data–style analysis that prompts ever more data collection. If you think you’ve detected some false patterns or aren’t finding the kinds of correlations you sought, why not just collect and analyze more? If you can’t process all the data you’ve stored—a problem that the NSA has faced—just build more data centers and hire more mathematicians and data scientists. Whether you’re Facebook or the U.S. government, the money is out there to do just that. Even the apparent presence of a pattern can lead us toward some false choices. A health insurance company may believe that people who buy six key grocery items are 30 percent more likely to develop diabetes, but does that give the insurer the right to raise this group’s premiums or deny them coverage?


pages: 543 words: 153,550

Model Thinker: What You Need to Know to Make Data Work for You by Scott E. Page

Airbnb, Albert Einstein, Alfred Russel Wallace, algorithmic trading, Alvin Roth, assortative mating, behavioural economics, Bernie Madoff, bitcoin, Black Swan, blockchain, business cycle, Capital in the Twenty-First Century by Thomas Piketty, Checklist Manifesto, computer age, corporate governance, correlation does not imply causation, cuban missile crisis, data science, deep learning, deliberate practice, discrete time, distributed ledger, Easter island, en.wikipedia.org, Estimating the Reproducibility of Psychological Science, Everything should be made as simple as possible, experimental economics, first-price auction, Flash crash, Ford Model T, Geoffrey West, Santa Fe Institute, germ theory of disease, Gini coefficient, Higgs boson, High speed trading, impulse control, income inequality, Isaac Newton, John von Neumann, Kenneth Rogoff, knowledge economy, knowledge worker, Long Term Capital Management, loss aversion, low skilled workers, Mark Zuckerberg, market design, meta-analysis, money market fund, multi-armed bandit, Nash equilibrium, natural language processing, Network effects, opioid epidemic / opioid crisis, p-value, Pareto efficiency, pattern recognition, Paul Erdős, Paul Samuelson, phenotype, Phillips curve, power law, pre–internet, prisoner's dilemma, race to the bottom, random walk, randomized controlled trial, Richard Feynman, Richard Thaler, Robert Solow, school choice, scientific management, sealed-bid auction, second-price auction, selection bias, six sigma, social graph, spectrum auction, statistical model, Stephen Hawking, Supply of New York City Cabdrivers, systems thinking, tacit knowledge, The Bell Curve by Richard Herrnstein and Charles Murray, The Great Moderation, the long tail, The Rise and Fall of American Growth, the rule of 72, the scientific method, The Spirit Level, the strength of weak ties, The Wisdom of Crowds, Thomas Malthus, Thorstein Veblen, Tragedy of the Commons, urban sprawl, value at risk, web application, winner-take-all 
economy, zero-sum game

Studies show that people may reside in online bubbles. That is, we may belong to communities of people who get their news from similar sources. If so, that has implications for social cohesion. Prior to the creation of the internet, that may have been true as well, but demonstrating it with data would have been hard. Now data scientists can scrape the web to identify the news sources that people frequent and tell us that, yes, in fact we do live in bubbles to an extent. Models provide the formal definitions of communities. Data tells us the strength of those communities. Using judgment we can make wise inferences based on what the data say.


pages: 569 words: 156,139

Amazon Unbound: Jeff Bezos and the Invention of a Global Empire by Brad Stone

activist fund / activist shareholder / activist investor, air freight, Airbnb, Amazon Picking Challenge, Amazon Robotics, Amazon Web Services, autonomous vehicles, Bernie Sanders, big data - Walmart - Pop Tarts, Big Tech, Black Lives Matter, business climate, call centre, carbon footprint, Clayton Christensen, cloud computing, Colonization of Mars, commoditize, company town, computer vision, contact tracing, coronavirus, corporate governance, COVID-19, crowdsourcing, data science, deep learning, disinformation, disintermediation, Donald Trump, Downton Abbey, Elon Musk, fake news, fulfillment center, future of work, gentrification, George Floyd, gigafactory, global pandemic, Greta Thunberg, income inequality, independent contractor, invisible hand, Jeff Bezos, John Markoff, Kiva Systems, Larry Ellison, lockdown, Mahatma Gandhi, Mark Zuckerberg, Masayoshi Son, mass immigration, minimum viable product, move fast and break things, Neal Stephenson, NSO Group, Paris climate accords, Peter Thiel, Ponzi scheme, Potemkin village, private spaceflight, quantitative hedge fund, remote working, rent stabilization, RFID, Robert Bork, Ronald Reagan, search inside the book, Sheryl Sandberg, Silicon Valley, Silicon Valley startup, Snapchat, social distancing, SoftBank, SpaceX Starlink, speech recognition, Steve Ballmer, Steve Jobs, Steven Levy, tech billionaire, tech bro, techlash, TED Talk, Tim Cook: Apple, Tony Hsieh, too big to fail, Tragedy of the Commons, two-pizza team, Uber for X, union organizing, warehouse robotics, WeWork

A decentralized operating structure limited the company’s dexterity at precisely the time when it needed to evolve quickly to meet changing tastes, as well as to introduce home delivery and new digital payment methods. In an unusual arrangement, Mackey was running the company at the time with a co-CEO, Walter Robb, who managed day-to-day operations. They recognized the looming challenges, and hired teams of data scientists in Austin and contracted with the San Francisco–based grocery delivery startup Instacart. But things were progressing slowly—and then they ran out of time. In 2016, the New York investment firm Neuberger Berman started sending letters to Whole Foods leadership and to other shareholders, complaining about complacent management, the unconventional CEO structure, and highlighting deficiencies like the absence of a rewards program.


pages: 271 words: 52,814

Blockchain: Blueprint for a New Economy by Melanie Swan

23andMe, Airbnb, altcoin, Amazon Web Services, asset allocation, banking crisis, basic income, bioinformatics, bitcoin, blockchain, capital controls, cellular automata, central bank independence, clean water, cloud computing, collaborative editing, Conway's Game of Life, crowdsourcing, cryptocurrency, data science, digital divide, disintermediation, Dogecoin, Edward Snowden, en.wikipedia.org, Ethereum, ethereum blockchain, fault tolerance, fiat currency, financial innovation, Firefox, friendly AI, Hernando de Soto, information security, intangible asset, Internet Archive, Internet of things, Khan Academy, Kickstarter, Large Hadron Collider, lifelogging, litecoin, Lyft, M-Pesa, microbiome, Neal Stephenson, Network effects, new economy, operational security, peer-to-peer, peer-to-peer lending, peer-to-peer model, personalized medicine, post scarcity, power law, prediction markets, QR code, ride hailing / ride sharing, Satoshi Nakamoto, Search for Extraterrestrial Intelligence, SETI@home, sharing economy, Skype, smart cities, smart contracts, smart grid, Snow Crash, software as a service, synthetic biology, technological singularity, the long tail, Turing complete, uber lyft, unbanked and underbanked, underbanked, Vitalik Buterin, Wayback Machine, web application, WikiLeaks

Blockchain Layer Could Facilitate Big Data’s Predictive Task Automation As big data allows the predictive modeling of more and more processes of reality, blockchain technology could help turn prediction into action. Blockchain technology could be joined with big data, layered onto the reactive-to-predictive transformation that is slowly under way in big-data science to allow the automated operation of large areas of tasks through smart contracts and economics. Big data’s predictive analysis could dovetail perfectly with the automatic execution of smart contracts. We could accomplish this specifically by adding blockchain technology as the embedded economic payments layer and the tool for the administration of quanta, implemented through automated smart contracts, Dapps, DAOs, and DACs.

“Unreliable Research. Trouble at the Lab.” The Economist, October 17, 2013 (paywall restricted). http://www.economist.com/news/briefing/21588057-scientists-think-science-self-correcting-alarming-degree-it-not-trouble. 157 Schmidt, M. and H. Lipson. “Distilling Free-Form Natural Laws from Experimental Data.” Science 324, no. 5923 (2009): 81–5. http://creativemachines.cornell.edu/sites/default/files/Science09_Schmidt.pdf; Keim, B. “Computer Program Self-Discovers Laws of Physics.” Wired, April 2, 2009. http://www.wired.com/2009/04/newtonai/. 158 Muggleton, S. “Developing Robust Synthetic Biology Designs Using a Microfluidic Robot Scientist.


The Metropolitan Revolution: How Cities and Metros Are Fixing Our Broken Politics and Fragile Economy by Bruce Katz, Jennifer Bradley

"World Economic Forum" Davos, 3D printing, additive manufacturing, Affordable Care Act / Obamacare, benefit corporation, British Empire, business climate, carbon footprint, clean tech, clean water, collapse of Lehman Brothers, company town, congestion pricing, data science, deindustrialization, demographic transition, desegregation, Donald Shoup, double entry bookkeeping, edge city, Edward Glaeser, financial engineering, global supply chain, immigration reform, income inequality, industrial cluster, intermodal, Jane Jacobs, jitney, Kickstarter, knowledge economy, Lewis Mumford, lone genius, longitudinal study, Mark Zuckerberg, Masdar, megacity, megaproject, Menlo Park, Moneyball by Michael Lewis explains big data, Network effects, new economy, New Urbanism, Occupy movement, place-making, postindustrial economy, purchasing power parity, Quicken Loans, race to the bottom, Richard Florida, Shenzhen was a fishing village, Silicon Valley, smart cities, smart grid, sovereign wealth fund, tech worker, TechCrunch disrupt, TED Talk, the built environment, The Death and Life of Great American Cities, the market place, The Spirit Level, Tony Hsieh, too big to fail, trade route, transit-oriented development, urban planning, white flight, Yochai Benkler

The center will also work with private industry partners, including IBM, Cisco, ConEdison, National Grid, Siemens, Xerox, AECOM, Arup, IDEO, Lutron, and Microsoft, and government labs, including the Livermore, Los Alamos, and Sandia National Laboratories. [Figure: New York City’s Applied Sciences Campuses] In July 2012 Columbia University’s new Institute for Data Sciences and Engineering, located at its Morningside Heights and Washington Heights campuses in New York City, became the third Applied Sciences campus.37 At Columbia, students and faculty will focus on applications for new media, smart cities, health analytics, cybersecurity, and financial analytics, among other areas.

Center for Urban Science Progress, “Educational Programs,” New York University, 2012 (http://cusp.nyu.edu/ms-in-applied-urban-science-and-informatics/). 37. New York City Economic Development Corporation, “Mayor Bloomberg and Columbia President Bollinger Announce Agreement to Create New Institute for Data Sciences and Engineering,” press release, July 30, 2012. 38. These include state renewable portfolio standards as well as national carbon-reduction strategies, such as those promulgated by the U.K. Department of Energy and Climate Change. See Barry Rabe, “Race to the Top: The Expanding Role of U.S. State Renewable Portfolio Standards,” Sustainable Development Law and Policy 7 no. 3 (2007).

Houston Settlement Association, 91 “How America Can Rise Again” (Fallows), 154–55 Howder, Randy, 119 Hsieh, Tony, 119–20 Hughes, Tom, 158 Hull House settlement, 101 IBM, viii, 122, 148 Idea viruses, 10 Immelt, Jeffrey, 19–20, 32, 182 Immigrant populations: and benefits for U.S. economy, 92–93; education level among, 93, 154; and global trade, impact on, 154; in metropolitan areas, 92–93; and patent generation, 93; poverty among, 103; in suburbs, 48, 93–94, 98–99; trends among, 153 Induced travel, 57 Industrial districts, 115–16 Information vs. knowledge, 118 Innovation and innovation districts, vii, 113–43; anchor institutions model for, 114, 121–23, 127; challenges facing, 129–31; and clustering, 22–23; collaborative approach of, 117, 119–20, 139–40; defined, 114; and demographic changes, 120–21; drivers of, 116, 138–39; and economic growth, 4; and exports, 32–33, 34; factors affecting rise of, 114–15, 116; funding for, 130–31, 141–42; government’s role in, 142; implications of, 117, 118–19; international models for, 127–29, 130; and manufacturing, 82–83; metropolitan revolution, role in, 10–11, 141–43, 202–05; remake science park model for, 126–27; replication of, 10–11, 203–05; spatial impacts of, 121–29; transforming underused areas model for, 123–26, 127; trends facilitating necessity for, 113–14, 119; and urbanization, 114–15, 116, 121; and zoning regulations, 129–30 Institute for Data Sciences and Engineering, 30 Intel, 93, 157 Inter-American Development Bank urban initiative, 148 Intermediaries, defined, 75–76 International Trade Administration (U.S.), 32 Internet, replication of ideas through, 10, 203–05 Investments. See Funding Istrate, Emilia, 33 Jacksonville (Florida) metropolitan area, infrastructure development in, 4 Jacobs, Jane, 34, 113, 150 James, Franklin, 45–46 Jaquay, Bob, 70–71, 78–79 Johnson, Steven, 38, 39, 67, 83 Kansas City Federal Reserve Bank study, 53 Kendall Square, 122–23, 129 Kenney, Peter, 49, 54, 55, 56, 60, 61, 62–63 Kent State University, 75 Kharas, Homi, 147 Kim, Charlie, 27 Knowledge vs. information, 118 Koonin, Steven, 28, 29, 37 LaHood, Raymond, 138 Latin America: emerging market economies in, 32, 147, 148; innovation in, 204; and international tourism, 153; Miami, influence on, 161–62, 163, 186 Latinos/Latinas: education level of, 93, 103, 104; in suburbs, 99 Leadership, metropolitan vs. state and federal, 3–4, 5–9 Leal, Roberta, 99, 100, 105, 107 Lehman Brothers collapse (2008), 17–18 Lewis, Michael, 196 LG Corporation, 83–84 Light bulbs, metropolitan influence on invention of, 39–40 Liveris, Andrew, 182 London: East End development, viii; trade links with, 162, 165, 167; traffic congestion and pollution control, 204 Los Angeles (California) metropolitan area: game changers for, 197; transit system in, ix, 4, 185–86; vision established for, 196 Lübeck and Hamburg trade agreement, 166–68 Madison, James, 175–76 MAGNET development organization, 83 “Making Northeast Ohio Great Again: A Call to Arms to the Foundation Community” (Fund for Our Economic Future), 70 Manufacturing: additive, 77; and exports, 152; foreign investment in, 155; and innovation, 82–83 Marchio, Nicholas, 33 Marcuse, Peter, 160 Masdar City, viii Massachusetts Institute of Technology.


pages: 444 words: 117,770

The Coming Wave: Technology, Power, and the Twenty-First Century's Greatest Dilemma by Mustafa Suleyman

"World Economic Forum" Davos, 23andMe, 3D printing, active measures, Ada Lovelace, additive manufacturing, agricultural Revolution, AI winter, air gap, Airbnb, Alan Greenspan, algorithmic bias, Alignment Problem, AlphaGo, Alvin Toffler, Amazon Web Services, Anthropocene, artificial general intelligence, Asilomar, Asilomar Conference on Recombinant DNA, ASML, autonomous vehicles, backpropagation, barriers to entry, basic income, benefit corporation, Big Tech, biodiversity loss, bioinformatics, Bletchley Park, Blitzscaling, Boston Dynamics, business process, business process outsourcing, call centre, Capital in the Twenty-First Century by Thomas Piketty, ChatGPT, choice architecture, circular economy, classic study, clean tech, cloud computing, commoditize, computer vision, coronavirus, corporate governance, correlation does not imply causation, COVID-19, creative destruction, CRISPR, critical race theory, crowdsourcing, cryptocurrency, cuban missile crisis, data science, decarbonisation, deep learning, deepfake, DeepMind, deindustrialization, dematerialisation, Demis Hassabis, disinformation, drone strike, drop ship, dual-use technology, Easter island, Edward Snowden, effective altruism, energy transition, epigenetics, Erik Brynjolfsson, Ernest Rutherford, Extinction Rebellion, facts on the ground, failed state, Fairchild Semiconductor, fear of failure, flying shuttle, Ford Model T, future of work, general purpose technology, Geoffrey Hinton, global pandemic, GPT-3, GPT-4, hallucination problem, hive mind, hype cycle, Intergovernmental Panel on Climate Change (IPCC), Internet Archive, Internet of things, invention of the wheel, job automation, John Maynard Keynes: technological unemployment, John von Neumann, Joi Ito, Joseph Schumpeter, Kickstarter, lab leak, large language model, Law of Accelerating Returns, Lewis Mumford, license plate recognition, lockdown, machine readable, Marc Andreessen, meta-analysis, microcredit, move 37, Mustafa Suleyman, mutually assured destruction, new economy, Nick Bostrom, Nikolai Kondratiev, off grid, OpenAI, paperclip maximiser, personalized medicine, Peter Thiel, planetary scale, plutocrats, precautionary principle, profit motive, prompt engineering, QAnon, quantum entanglement, ransomware, Ray Kurzweil, Recombinant DNA, Richard Feynman, Robert Gordon, Ronald Reagan, Sam Altman, Sand Hill Road, satellite internet, Silicon Valley, smart cities, South China Sea, space junk, SpaceX Starlink, stealth mode startup, stem cell, Stephen Fry, Steven Levy, strong AI, synthetic biology, tacit knowledge, tail risk, techlash, techno-determinism, technoutopianism, Ted Kaczynski, the long tail, The Rise and Fall of American Growth, Thomas Malthus, TikTok, TSMC, Turing test, Tyler Cowen, Tyler Cowen: Great Stagnation, universal basic income, uranium enrichment, warehouse robotics, William MacAskill, working-age population, world market for maybe five computers, zero day

., “ImageNet Classification with Deep Convolutional Neural Networks,” Neural Information Processing Systems, Sept. 30, 2012, proceedings.neurips.cc/paper/2012/file/c399862d3b9d6b76c8436e924a68c45b-Paper.pdf.
In 2012, AlexNet beat Jerry Wei, “AlexNet: The Architecture That Challenged CNNs,” Towards Data Science, July 2, 2019, towardsdatascience.com/alexnet-the-architecture-that-challenged-cnns-e406d5297951.
Thanks to deep learning Chanan Bos, “Tesla’s New HW3 Self-Driving Computer—It’s a Beast,” CleanTechnica, June 15, 2019, cleantechnica.com/2019/06/15/teslas-new-hw3-self-driving-computer-its-a-beast-cleantechnica-deep-dive.

., “Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity,” Journal of Machine Learning Research, June 16, 2022, arxiv.org/abs/2101.03961.
Or look at DeepMind’s Chinchilla Alberto Romero, “A New AI Trend: Chinchilla (70B) Greatly Outperforms GPT-3 (175B) and Gopher (280B),” Towards Data Science, April 11, 2022, towardsdatascience.com/a-new-ai-trend-chinchilla-70b-greatly-outperforms-gpt-3-175b-and-gopher-280b-408b9b4510.
At the other end of the spectrum See github.com/karpathy/nanoGPT for more details.
Meta has open-sourced Susan Zhang et al., “Democratizing Access to Large-Scale Language Models with OPT-175B,” Meta AI, May 3, 2022, ai.facebook.com/blog/democratizing-access-to-large-scale-language-models-with-opt-175b.

Under it, India established Trisha Ray and Akhil Deo, “Priorities for a Technology Foreign Policy for India,” Washington International Trade Association, Sept. 25, 2020, www.wita.org/atp-research/tech-foreign-policy-india.
We live in an age Cronin, Power to the People.
For example, GitHub has Neeraj Kashyap, “GitHub’s Path to 128M Public Repositories,” Towards Data Science, March 4, 2020, towardsdatascience.com/githubs-path-to-128m-public-repositories-f6f656ab56b1.
The original such service arXiv, “About ArXiv,” arxiv.org/about.
The great stock of the world’s “The General Index,” Internet Archive, Oct. 7, 2021, archive.org/details/GeneralIndex.


User Friendly by Cliff Kuang, Robert Fabricant

A Pattern Language, Abraham Maslow, Airbnb, anti-communist, Any sufficiently advanced technology is indistinguishable from magic, Apple II, augmented reality, autonomous vehicles, behavioural economics, Bill Atkinson, Brexit referendum, Buckminster Fuller, Burning Man, business logic, call centre, Cambridge Analytica, Chuck Templeton: OpenTable:, cognitive load, computer age, Daniel Kahneman / Amos Tversky, dark pattern, data science, Donald Trump, Douglas Engelbart, driverless car, Elaine Herzberg, en.wikipedia.org, fake it until you make it, fake news, Ford Model T, Frederick Winslow Taylor, frictionless, Google Glasses, Internet of things, invisible hand, James Dyson, John Markoff, Jony Ive, knowledge economy, Kodak vs Instagram, Lyft, M-Pesa, Mark Zuckerberg, mobile money, Mother of all demos, move fast and break things, Norbert Wiener, Paradox of Choice, planned obsolescence, QWERTY keyboard, randomized controlled trial, replication crisis, RFID, scientific management, self-driving car, seminal paper, Silicon Valley, skeuomorphism, Skinner box, Skype, smart cities, Snapchat, speech recognition, Steve Jobs, Steve Wozniak, tacit knowledge, Tesla Model S, three-martini lunch, Tony Fadell, Uber and Lyft, Uber for X, uber lyft, Vannevar Bush, women in the workforce

But is a user-friendly world actually the best world we can create? In the months after the election, as flummoxed Hillary Clinton staffers were wondering how they’d so badly misunderstood the race they were running against Donald Trump, news reports began trickling out about Cambridge Analytica, a mysterious data-science company that had been paid millions to help Trump’s campaign in the run-up to the election.34 Cambridge Analytica itself wasn’t an innovator. It had been inspired by Michal Kosinski, a young psychologist at Cambridge University. Kosinski typically wears the uniform of a venture capitalist: pressed khakis, crisp button-down shirt tucked in.

Just 150 likes would be enough to outdo the person’s parents. At 300 or more likes, you could predict nuances of preference and personality unknown even to a person’s partner.36 On April 9, 2013, when Kosinski published his findings, a recruiter at Facebook called to see if he’d be interested in a role on its data science team. Later, when he checked his snail mail, he saw that Facebook’s lawyers had also sent him a threat of a lawsuit. Facebook quickly responded by allowing likes to be made private. But the genie had escaped its bottle. Kosinski had shown that if you knew a person’s Facebook likes, you knew their personality.

It has been estimated that during the election, the firm was testing 40,000 to 50,000 ads a day to better understand what would motivate voters—or keep voters who didn’t like Trump from voting at all.37 In one instance, Trump’s own digital operatives claimed that they’d targeted black voters in Miami’s Haitian community with stories about the Clinton Foundation’s supposedly corrupt efforts to deliver aid after Haiti’s catastrophic 2010 earthquake.38 Some months later, journalists began to question whether Cambridge Analytica’s data science really could be as advanced as it claimed.39 What no one questioned was that Facebook could easily do what Cambridge Analytica had boasted about. Indeed, months after the election, a leaked Facebook document produced by company executives in Australia suggested that they could target teens precisely at the moment they felt “insecure,” “worthless,” or “needed a confidence boost.”


pages: 561 words: 163,916

The History of the Future: Oculus, Facebook, and the Revolution That Swept Virtual Reality by Blake J. Harris

"World Economic Forum" Davos, 4chan, airport security, Anne Wojcicki, Apollo 11, Asian financial crisis, augmented reality, barriers to entry, Benchmark Capital, Bernie Sanders, bitcoin, call centre, Carl Icahn, company town, computer vision, cryptocurrency, data science, disruptive innovation, Donald Trump, drone strike, Elon Musk, fake news, financial independence, game design, Grace Hopper, hype cycle, illegal immigration, invisible hand, it's over 9,000, Ivan Sutherland, Jaron Lanier, Jony Ive, Kickstarter, Marc Andreessen, Mark Zuckerberg, Menlo Park, Minecraft, move fast and break things, Neal Stephenson, Network effects, Oculus Rift, off-the-grid, Peter Thiel, QR code, sensor fusion, Sheryl Sandberg, side project, Silicon Valley, SimCity, skunkworks, Skype, slashdot, Snapchat, Snow Crash, software patent, stealth mode startup, Steve Jobs, unpaid internship, white picket fence

And as with most cases at Oculus where the opinion was split, each half spent the next few days trying to prove why the other half were idiots. But on June 29, the playful banter was interrupted by a hot new topic that hit even closer to home. “Did you see this?” Dycus asked Luckey, pointing out one of the many articles about Facebook’s “Secret Mood Experiment,” in which data scientists at Facebook manipulated what appeared in the newsfeed of 689,000 users to try and see if they could make people feel “more positive” or “more negative” through a process called “emotional contagion.” “Yeah, I saw it,” Luckey said. “The experiment was successful, by the way,” Dycus said. “What does Brendan think?”


pages: 239 words: 74,845

The Antisocial Network: The GameStop Short Squeeze and the Ragtag Group of Amateur Traders That Brought Wall Street to Its Knees by Ben Mezrich

4chan, Asperger Syndrome, Bayesian statistics, bitcoin, Carl Icahn, contact tracing, data science, democratizing finance, Dogecoin, Donald Trump, Elon Musk, fake news, gamification, global pandemic, Google Hangouts, Hyperloop, meme stock, Menlo Park, payment for order flow, Pershing Square Capital Management, Robinhood: mobile stock trading app, security theater, short selling, short squeeze, Silicon Valley, Silicon Valley startup, social distancing, Tesla Model S, too big to fail, Two Sigma, value at risk, wealth creators

When you finally sell your positions, you still have the same problems you had before. You still have the same life, with the same bills to pay—” “Bills? Who pays bills?” Jeremy laughed. He knew he was privileged—his dad still sent him checks every month to cover his living expenses and tuition. Before Covid, he had worked part time doing data science for a professor to help cover his student loans, but now he was mostly on the parental dole. He knew there were a lot of people on the WSB board who were a lot worse off than he was. The pandemic had hit the community hard, and many were out of work. Which made it more understandable, to Jeremy, that they were willing to try to use whatever little money they had to change things, and not just incrementally—but monumentally.

Moving forward, I don’t think you’re going to see stocks with the kind of short interest levels that we’ve seen prior to this year. I don’t think investors like myself will want to be susceptible to these types of dynamics. I think there will be a lot closer monitoring of message boards…we have a data science team that will be looking at that…You know, whatever regulation that you guys come up with—certainly we’ll abide by.” Even over livestream, the transformation was visible; in less than a minute, Gabe had gone from a bewildered victim to the professional athlete he’d always been. It was time to accept the loss and move on, because there were plenty more wins in his future.


pages: 288 words: 86,995

Rule of the Robots: How Artificial Intelligence Will Transform Everything by Martin Ford

AI winter, Airbnb, algorithmic bias, algorithmic trading, Alignment Problem, AlphaGo, Amazon Mechanical Turk, Amazon Web Services, artificial general intelligence, Automated Insights, autonomous vehicles, backpropagation, basic income, Big Tech, big-box store, call centre, carbon footprint, Chris Urmson, Claude Shannon: information theory, clean water, cloud computing, commoditize, computer age, computer vision, Computing Machinery and Intelligence, coronavirus, correlation does not imply causation, COVID-19, crowdsourcing, data is the new oil, data science, deep learning, deepfake, DeepMind, Demis Hassabis, deskilling, disruptive innovation, Donald Trump, Elon Musk, factory automation, fake news, fulfillment center, full employment, future of work, general purpose technology, Geoffrey Hinton, George Floyd, gig economy, Gini coefficient, global pandemic, Googley, GPT-3, high-speed rail, hype cycle, ImageNet competition, income inequality, independent contractor, industrial robot, informal economy, information retrieval, Intergovernmental Panel on Climate Change (IPCC), Internet of things, Jeff Bezos, job automation, John Markoff, Kiva Systems, knowledge worker, labor-force participation, Law of Accelerating Returns, license plate recognition, low interest rates, low-wage service sector, Lyft, machine readable, machine translation, Mark Zuckerberg, Mitch Kapor, natural language processing, Nick Bostrom, Northpointe / Correctional Offender Management Profiling for Alternative Sanctions, Ocado, OpenAI, opioid epidemic / opioid crisis, passive income, pattern recognition, Peter Thiel, Phillips curve, post scarcity, public intellectual, Ray Kurzweil, recommendation engine, remote working, RFID, ride hailing / ride sharing, Robert Gordon, Rodney Brooks, Rubik’s Cube, Sam Altman, self-driving car, Silicon Valley, Silicon Valley startup, social distancing, SoftBank, South of Market, San Francisco, special economic zone, speech recognition, stealth mode startup, Stephen Hawking, superintelligent machines, TED Talk, The Future of Employment, The Rise and Fall of American Growth, the scientific method, Turing machine, Turing test, Tyler Cowen, Tyler Cowen: Great Stagnation, Uber and Lyft, uber lyft, universal basic income, very high income, warehouse automation, warehouse robotics, Watson beat the top human players on Jeopardy!, WikiLeaks, women in the workforce, Y Combinator

AI’s tentacles will eventually reach into and transform virtually every existing industry, and any new industries that arise in the future will very likely incorporate the latest AI and robotics innovations from their inception. In other words, it seems very unlikely that some entirely new sector with tens of millions of new jobs will somehow materialize to absorb all the workers displaced by automation in existing industries. Rather, future industries will be built on a foundation of digital technology, data science and artificial intelligence—and as a result, they will simply not generate large numbers of jobs. A second point involves the nature of the activities undertaken by workers. It’s reasonable to estimate that roughly half our workforce is engaged in occupations that are largely routine and predictable in nature.4 By this, I don’t mean “rote-repetitive” but simply that these workers tend to face the same basic set of tasks and challenges again and again.

McCorvey, “This image-authentication startup is combating faux social media accounts, doctored photos, deep fakes, and more,” Fast Company, February 19, 2019, www.fastcompany.com/90299000/truepic-most-innovative-companies-2019. 8. Ian Goodfellow, Nicolas Papernot, Sandy Huang, et al., “Attacking machine learning with adversarial examples,” OpenAI Blog, February 24, 2017, openai.com/blog/adversarial-example-research/. 9. Anant Jain, “Breaking neural networks with adversarial attacks,” Towards Data Science, February 9, 2019, towardsdatascience.com/breaking-neural-networks-with-adversarial-attacks-f4290a9a45aa. 10. Ibid. 11. Slaughterbots, released November 12, 2017, Space Digital, www.youtube.com/watch?reload=9&v=9CO6M2HsoIA. 12. Stuart Russell, “Building a lethal autonomous weapon is easier than building a self-driving car.


pages: 517 words: 147,591

Small Wars, Big Data: The Information Revolution in Modern Conflict by Eli Berman, Joseph H. Felter, Jacob N. Shapiro, Vestal Mcintyre

basic income, call centre, centre right, classic study, clean water, confounding variable, crowdsourcing, data science, demand response, drone strike, experimental economics, failed state, George Akerlof, Google Earth, guns versus butter model, HESCO bastion, income inequality, income per capita, information asymmetry, Internet of things, iterative process, land reform, mandatory minimum, minimum wage unemployment, moral hazard, natural language processing, operational security, RAND corporation, randomized controlled trial, Ronald Reagan, school vouchers, statistical model, the scientific method, trade route, Twitter Arab Spring, unemployed young men, WikiLeaks, World Values Survey

We argue that taking a conventional approach, based on a symmetric warfare doctrine, will waste lives and resources, and risk defeat. However, taking a smarter approach can improve strategy and make dramatic gains in efficiency. Two major new tools enable this smart approach: research methods that were unavailable just fifteen years ago and data science, including the analysis of “big data.” Our use of these tools has already yielded an important central finding: in information-centric warfare, small-scale efforts can have large-scale effects. Larger efforts may be neutral at best and counterproductive at worst. If this more nuanced view can guide policy, lives and money could be saved.

Conversely, there is not a causal relationship from smoke to fire: if I want to cause more fires, I shouldn’t rent a smoke machine. Correlations are great for prediction—for example, predicting the flow of populations after a disaster or the level of hospitalization at different times—and this accounts for much of the excitement about big data that we cited in chapter 1. Big data and data science can do well predicting what will happen in the world absent policy changes, but when predicting the effects of those policy changes, correlations are not enough. If you want to know what to do in the world to produce a certain outcome, then you need to establish causality. And when the goal is discovering causality, correlations can mislead: “smoke causes fire” is obviously an erroneous statement, but we see equivalent logic in policy all the time.
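As a hedged illustration of the smoke-and-fire point above, the following minimal Python sketch simulates a toy world in which fire causes smoke and never the reverse. The probabilities, the function names `simulate` and `fire_rate`, and the `smoke_machine` flag are all invented for illustration and do not come from the book.

```python
import random


def simulate(n=10_000, smoke_machine=False, seed=0):
    """Toy world (assumed numbers): fire causes smoke, never the reverse."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        fire = rng.random() < 0.10  # assumed 10% base rate of fire
        # Smoke is very likely when there is a fire, rare otherwise.
        natural_smoke = rng.random() < (0.95 if fire else 0.02)
        # The "smoke machine" intervention forces smoke on everywhere.
        smoke = True if smoke_machine else natural_smoke
        rows.append((fire, smoke))
    return rows


def fire_rate(rows, given_smoke=None):
    """Share of rows with fire, optionally restricted to a smoke value."""
    subset = [fire for fire, smoke in rows
              if given_smoke is None or smoke == given_smoke]
    return sum(subset) / len(subset)


observed = simulate()
intervened = simulate(smoke_machine=True)

p_fire_given_smoke = fire_rate(observed, given_smoke=True)
p_fire_no_smoke = fire_rate(observed, given_smoke=False)
p_fire_with_machine = fire_rate(intervened)

print(f"P(fire | smoke observed):    {p_fire_given_smoke:.2f}")
print(f"P(fire | no smoke observed): {p_fire_no_smoke:.2f}")
print(f"P(fire) with smoke machine:  {p_fire_with_machine:.2f}")
```

In the observed data, seeing smoke is a strong predictor of fire, yet forcing smoke everywhere leaves the fire rate at its base level. That gap between prediction and intervention is what the passage describes.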

Heather, 143–44 credit, attribution of. See attribution of credit/blame crisis aid, 274–78 Crost, Ben, 129, 134 Cruz, Cesi, 282 Daesh. See Islamic State DARPA. See Defense Advanced Research Projects Agency Das, Jishnu, 277 data. See microdata data access and confidentiality, 314–16, 323–24 data science. See big data D-Day, 2–4 decision process, for civilian informants, 16–17, 65–77, 80–81, 188–89 Defense Advanced Research Projects Agency (DARPA), 13–14 Deininger, Klaus, 241 Dell, Melissa, 177, 179–81, 214 Democratic Republic of the Congo, 296 demonstration effects, 276 Department of Defense Rewards Program, 247–48 development assistance, 109–51; in Afghanistan, 132–34, 144–48, 153–56, 223–25, 280; in asymmetric vs. symmetric wars, 141, 150; characteristics of successful, 80, 128, 149, 151, 221, 252, 258, 260, 321–22; civilian attitudes in relation to, 132–34; community-tailored, 124; conditional nature of, 122, 126–27, 131–32; expertise as factor in success of, 125–26; food as the form of, 139–42; humanitarian rationales for, 149–50; in Iraq, 109–13, 123–28, 146–48; large-scale, 123–28, 134–35, 146–48, 298–99, 299, 322–23; level of existing violence as factor in, 122–23, 127–28; as military strategy, 113–14, 298; modestly-scaled, 123–28, 148–49, 167–70, 278–81; in Pakistan, 279–80; in Philippines, 128–32, 134–39; predictions on, from information-centric model, 120–23, 148–49; rationales for, 109, 114–15; security provision in relation to, 127, 147–48, 153–59, 162–77, 169, 181–83; studies of effects of, 29–31, 52, 123–38, 151; theft and corruption involving, 139–48; violence diminished by, 123–34, 148–49, 157–62, 167–77, 323; violence increased by, 115, 134–45, 136, 156, 223–24, 224, 322.


pages: 559 words: 155,372

Chaos Monkeys: Obscene Fortune and Random Failure in Silicon Valley by Antonio Garcia Martinez

Airbnb, airport security, always be closing, Amazon Web Services, Big Tech, Burning Man, business logic, Celtic Tiger, centralized clearinghouse, cognitive dissonance, collective bargaining, content marketing, corporate governance, Credit Default Swap, crowdsourcing, data science, deal flow, death of newspapers, disruptive innovation, Dr. Strangelove, drone strike, drop ship, El Camino Real, Elon Musk, Emanuel Derman, Fairchild Semiconductor, fake it until you make it, financial engineering, financial independence, Gary Kildall, global supply chain, Goldman Sachs: Vampire Squid, Hacker News, hive mind, How many piano tuners are there in Chicago?, income inequality, industrial research laboratory, information asymmetry, information security, interest rate swap, intermodal, Jeff Bezos, Kickstarter, Malcom McLean invented shipping containers, Marc Andreessen, Mark Zuckerberg, Maui Hawaii, means of production, Menlo Park, messenger bag, minimum viable product, MITM: man-in-the-middle, move fast and break things, Neal Stephenson, Network effects, orbital mechanics / astrodynamics, Paul Graham, performance metric, Peter Thiel, Ponzi scheme, pre–internet, public intellectual, Ralph Waldo Emerson, random walk, Reminiscences of a Stock Operator, Ruby on Rails, Salesforce, Sam Altman, Sand Hill Road, Scientific racism, second-price auction, self-driving car, Sheryl Sandberg, Silicon Valley, Silicon Valley startup, Skype, Snapchat, social graph, Social Justice Warrior, social web, Socratic dialogue, source of truth, Steve Jobs, tech worker, telemarketer, the long tail, undersea cable, urban renewal, Y Combinator, zero-sum game, éminence grise

As I was soon to find is de rigueur, I was asked to sign a nondisclosure agreement that made it illegal for me to so much as leak the wallpaper design in the kitchens or reveal any knowledge, whether carnal or technical, gleaned while inside Twitter. Then I waited. A couple of overdressed and nervous-looking people waited with me, probably job candidates. There were large-format coffee-table books, artsy tomes about the new world of data science and landscape photography, on the coffee tables. The reception area itself was tastefully paneled with reclaimed wood, and that little Twitter bird logo was on everything, down to the elegant black coffee mugs kept in the reception’s minikitchen. Jess appeared and was exactly like her publicity photos online.

Since Facebook Ads didn’t ship products that people actually asked for, launching always had a certain foie-gras-duck-undergoing-gavage quality to it: open up and pump it in.* However, to my earlier point, the performance gains attributed to Sponsored Stories (to the extent they even existed outside the perfect test conditions of the Facebook data-science team) weren’t significant in the everyday noise of a live ads campaign. And so the partners were struggling to get advertisers to invest in the new, sexy hotness. Facebook’s reaction was to basically tell them to grip the goose tighter and stick the tube farther down its throat. Yet despite all these hints of failure and fundamentally wrong direction, here we were at the big FMC show, with Paul Adams standing onstage with an image of a stylized social network behind him, like George C.

* Pronounced “eff-ait,” and not “fate,” the name F8 came from the eight hours that engineers spent in hackathons, the all-night company-wide coding sessions that produced some of Facebook’s more random (and successful) products. * Titled Social Influence in Social Advertising: Evidence from Field Experiments, the paper would eventually appear at an Association for Computing Machinery e-commerce conference, with Eytan Bakshy, Dean Eckles, Rong Yan, and Itamar Rosenn as its authors. Facebook’s data-science team was absolutely top-notch and boasted both already-prominent academics and young, up-and-coming PhDs who were ecstatic to get their busy hands on Facebook’s vast store of proprietary data. The team’s papers, such as this one, were always carefully executed experiments that often called bullshit on some social media truism—often one that originated with Facebook itself


Learn Algorithmic Trading by Sebastien Donadio

active measures, algorithmic trading, automated trading system, backtesting, Bayesian statistics, behavioural economics, buy and hold, buy low sell high, cryptocurrency, data science, deep learning, DevOps, en.wikipedia.org, fixed income, Flash crash, Guido van Rossum, latency arbitrage, locking in a profit, market fundamentalism, market microstructure, martingale, natural language processing, OpenAI, p-value, paper trading, performance metric, prediction markets, proprietary trading, quantitative trading / quantitative finance, random walk, risk tolerance, risk-adjusted returns, Sharpe ratio, short selling, sorting algorithm, statistical arbitrage, statistical model, stochastic process, survivorship bias, transaction costs, type inference, WebSocket, zero-sum game

He has been in the IT industry for more than 19 years and has worked in the technical & analytics divisions of Philip Morris, IBM, UBS Investment Bank, and Purdue Pharma. He led the Data Science team at Purdue, where he developed the company's award-winning Big Data and Machine Learning platform. Prior to Purdue, at UBS, he held the role of Associate Director, working with high-frequency & algorithmic trading technologies in the Foreign Exchange Trading group. He has authored Practical Big Data Analytics and co-authored Hands-on Data Science with R. Apart from his role at RxDataScience, he is also currently affiliated with Imperial College, London. Ratanlal Mahanta is currently working as a quantitative analyst at bittQsrv, a global quantitative research company offering quant models for its investors.


pages: 279 words: 87,875

Underwater: How Our American Dream of Homeownership Became a Nightmare by Ryan Dezember

"RICO laws" OR "Racketeer Influenced and Corrupt Organizations", activist fund / activist shareholder / activist investor, Airbnb, Bear Stearns, business cycle, call centre, Carl Icahn, Cesare Marchetti: Marchetti’s constant, cloud computing, collateralized debt obligation, company town, coronavirus, corporate raider, COVID-19, Credit Default Swap, credit default swaps / collateralized debt obligations, data science, deep learning, Donald Trump, Home mortgage interest deduction, housing crisis, interest rate swap, low interest rates, margin call, McMansion, mortgage debt, mortgage tax deduction, negative equity, opioid epidemic / opioid crisis, pill mill, rent control, rolodex, Savings and loan crisis, sharing economy, sovereign wealth fund, transaction costs

Once the computer got the picture, it pored over listing photos, written property descriptions, public records, and satellite imagery, looking for sunny kitchens. Kay and his partners decided to focus on helping better-funded investors find houses rather than enlarging their own pool of homes. Their specialty was data science, after all, not collecting rent. Whenever Entera needed money to train its algorithms on a new market or add employees, Kay would sell some of the Texas houses that he’d bought on the cheap. By 2018, Entera’s algorithms had unearthed tens of thousands of houses that wound up in the portfolios of American Homes 4 Rent, Invitation Homes, and others.

., selling stock prices compared to subdivision development over for Sunset Bay’s auctioning off Connors, Cristie conservation conspiracy theories CoreLogic corporate buyout firm (KKR) corruption charges Countrywide Financial courthouse auctions credit default swaps credit scores credit-rating firms Cypress Village data science Davidson, Jerry debt debt-to-income ratio deed filings Deepwater Horizon oil spill DeLawder, C. Daniel demand growth destruction, from hurricane developers, litigious buyers and development projects Dolphin Club down payments drug charges drug overdose DuBose, Kristi Dudley, William easy money ecology economy emergency management bunker Empire Group endangered species Engels, Friedrich English, Dewey Entera Technology Environmental Protection Agency (EPA) Envision Gulf Shores EPA.


pages: 339 words: 92,785

I, Warbot: The Dawn of Artificially Intelligent Conflict by Kenneth Payne

Abraham Maslow, AI winter, Alan Turing: On Computable Numbers, with an Application to the Entscheidungsproblem, AlphaGo, anti-communist, Any sufficiently advanced technology is indistinguishable from magic, artificial general intelligence, Asperger Syndrome, augmented reality, Automated Insights, autonomous vehicles, backpropagation, Black Lives Matter, Bletchley Park, Boston Dynamics, classic study, combinatorial explosion, computer age, computer vision, Computing Machinery and Intelligence, coronavirus, COVID-19, CRISPR, cuban missile crisis, data science, deep learning, deepfake, DeepMind, delayed gratification, Demis Hassabis, disinformation, driverless car, drone strike, dual-use technology, Elon Musk, functional programming, Geoffrey Hinton, Google X / Alphabet X, Internet of things, job automation, John Nash: game theory, John von Neumann, Kickstarter, language acquisition, loss aversion, machine translation, military-industrial complex, move 37, mutually assured destruction, Nash equilibrium, natural language processing, Nick Bostrom, Norbert Wiener, nuclear taboo, nuclear winter, OpenAI, paperclip maximiser, pattern recognition, RAND corporation, ransomware, risk tolerance, Ronald Reagan, self-driving car, semantic web, side project, Silicon Valley, South China Sea, speech recognition, Stanislav Petrov, stem cell, Stephen Hawking, Steve Jobs, strong AI, Stuxnet, technological determinism, TED Talk, theory of mind, TikTok, Turing machine, Turing test, uranium enrichment, urban sprawl, V2 rocket, Von Neumann architecture, Wall-E, zero-sum game

For a classic American statement, see Krulak, Charles, C. ‘The Strategic Corporal: Leadership in the three-block war’. Marines Magazine 6, 1999. https://apps.dtic.mil/dtic/tr/fulltext/u2/a399413.pdf. 28. Eady, Yarrow. ‘Tesla’s deep learning at scale: Using billions of miles to train neural networks’, Towards Data Science, 7 May 2019, towardsdatascience.com/teslas-deep-learning-at-scale-7eed85b235d3. 29. Lye, Harry. ‘UK flies 20-drone swarm in major test,’ Airforce Technology, 28 January 2021, https://www.airforce-technology.com/news/uk-flies-20-drone-swarm-in-major-test/. 30. Cooper, Helene, Ralf Blumenthal and Leslie Kean.

‘The accuracy, fairness, and limits of predicting recidivism’, Science Advances 4, no. 1 (2018): eaao5580. Dunbar, Robin I. M. ‘Neocortex size and group size in primates: a test of the hypothesis’, Journal of Human Evolution 28, no. 3 (1995): 287–296. Eady, Yarrow. ‘Tesla’s deep learning at scale: using billions of miles to train neural networks’, Towards Data Science, 7 May 2019, https://towardsdatascience.com/teslas-deep-learning-at-scale-7eed85b235d3. Eagleman, David. ‘Can we create new senses for humans’, TED talks, March 2015, https://www.ted.com/talks/david_eagleman_can_we_create_new_senses_for_humans?utm_campaign=tedspread&utm_medium=referral&utm_source=tedcomshare.


pages: 400 words: 99,489

The Sirens of Mars: Searching for Life on Another World by Sarah Stewart Johnson

Albert Einstein, Alfred Russel Wallace, Astronomia nova, back-to-the-land, Beryl Markham, classic study, cuban missile crisis, dark matter, data science, Drosophila, Elon Musk, invention of the printing press, Isaac Newton, Johannes Kepler, Late Heavy Bombardment, low earth orbit, Mars Rover, Mercator projection, Neil Armstrong, Pierre-Simon Laplace, Ronald Reagan, scientific mainstream, sensible shoes, Suez canal 1869

., “Oceans in the Past History of Mars: Tests for Their Presence Using Mars Orbiter Laser Altimeter (MOLA) Data,” Geophysical Research Letters (Dec. 1998), p. 4,403; J. W. Head III, et al., “Possible Ancient Oceans on Mars: Evidence from Mars Orbiter Laser Altimeter Data,” Science, 286 (1999), pp. 2,134–2,137. THREE AND A HALF J. P. Bibring, et al., “Global Mineralogical and Aqueous Mars History Derived from OMEGA/Mars Express Data,” Science, 312 (April 2006), pp. 400–404. ALMOST ALL OF THE ATMOSPHERE With essentially no greenhouse effect, the surface temperatures of Mars, following the Stefan-Boltzmann law, slowly dropped to an average of minus 60 degrees Celsius, the surface temperature today.
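The Stefan-Boltzmann reasoning in the last note can be sketched numerically. This is a rough illustration only: the solar flux at Mars and the Bond albedo below are assumed round figures that do not come from the book, and the function name is hypothetical. It treats Mars as a sphere with no greenhouse effect that radiates away exactly what it absorbs.

```python
# Stefan-Boltzmann constant in W m^-2 K^-4 (SI value).
SIGMA = 5.670374419e-8


def equilibrium_temperature_k(solar_flux_w_m2: float, bond_albedo: float) -> float:
    """Radiative-balance temperature of a sphere with no greenhouse effect.

    Absorbed power per unit surface area is S * (1 - A) / 4; setting it equal
    to the emitted sigma * T^4 and solving for T gives the formula below.
    """
    absorbed = solar_flux_w_m2 * (1.0 - bond_albedo) / 4.0
    return (absorbed / SIGMA) ** 0.25


# Assumed approximate values for Mars (not taken from the excerpt).
MARS_SOLAR_FLUX = 586.0   # W/m^2 at Mars's mean distance from the Sun
MARS_BOND_ALBEDO = 0.25   # fraction of sunlight reflected back to space

t_kelvin = equilibrium_temperature_k(MARS_SOLAR_FLUX, MARS_BOND_ALBEDO)
t_celsius = t_kelvin - 273.15
print(f"{t_kelvin:.0f} K, {t_celsius:.0f} degrees C")
```

Under these assumptions the result is close to 210 K, a few degrees below the minus 60 degrees Celsius quoted in the note.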


pages: 174 words: 34,672

Nginx Essentials by Valery Kholodkov

data science, Debian, en.wikipedia.org, web application

Jesse Estill Lawson is a computer scientist and social science researcher who works in higher education. He has consulted with dozens of colleges across the country to help them design, develop, and deploy computer information systems on everything from Windows and Apache to Nginx and node servers, and he centers his research on the coexistence of data science and sociology. In addition to his technological background, Jesse holds an MA in English and is currently working on his PhD in education. You can learn more about him on his website at http://lawsonry.com. Daniel Parraz is a Linux systems administrator with 15 years of experience in high-volume e-retailer sites, large system storage, and security enterprises.


pages: 446 words: 102,421

Network Security Through Data Analysis: Building Situational Awareness by Michael S Collins

business process, cloud computing, create, read, update, delete, data science, Firefox, functional programming, general-purpose programming language, index card, information security, Internet Archive, inventory management, iterative process, operational security, OSI model, p-value, Parkinson's law, peer-to-peer, slashdot, statistical model, zero day

Bad security policy will result in users increasingly evading detection in order to get their jobs done or just to blow off steam, and that adds additional work for your defenders. The emphasis on actionability and the goal of achieving security is what differentiates this book from a more general text on data science. The section on analysis proper covers statistical and data analysis techniques borrowed from multiple other disciplines, but the overall focus is on understanding the structure of a network and the decisions that can be made to protect it. To that end, I have abridged the theory as much as possible, and have also focused on mechanisms for identifying abusive behavior.

When building visualizations, it’s important to know how long it will take to complete one and to provide the user with some feedback that the visualization is actually being generated. Further Reading Greg Conti, Security Data Visualization: Graphical Techniques for Network Analysis (No Starch Press, 2001). NIST Handbook of Exploratory Data Analysis Cathy O’Neil and Rachel Schutt, Doing Data Science (O’Reilly, 2013). Edward Tufte, The Visual Display of Quantitative Information (Graphics Press, 2001). John Tukey, Exploratory Data Analysis (Pearson, 1997). * * * [17] There’s nothing quite like the day you start an investigation based on the attacker being written up in the New York Times


pages: 451 words: 103,606

Machine Learning for Hackers by Drew Conway, John Myles White

call centre, centre right, correlation does not imply causation, data science, Debian, Erdős number, Nate Silver, natural language processing, Netflix Prize, off-by-one error, p-value, pattern recognition, Paul Erdős, recommendation engine, social graph, SpamAssassin, statistical model, text mining, the scientific method, traveling salesman

We’d also like to thank the members of the NYC Data Brunch for originally inspiring us to write this book and for giving us a place to refine our ideas about teaching machine learning. In particular, thanks to Hilary Mason for originally introducing us to several people at O’Reilly. Finally, we’d like to thank the many friends of ours in the data science community who’ve been so supportive and encouraging while we’ve worked on this book. Knowing that people wanted to read our book helped us keep up pace during the long haul that writing a full-length book entails. From Drew Conway I would like to thank Julie Steele, our editor, for appreciating our motivation for this book and giving us the ability to produce it.

—The R Project for Statistical Computing, http://www.r-project.org/ The best thing about R is that it was developed by statisticians. The worst thing about R is that...it was developed by statisticians. —Bo Cowgill, Google, Inc. R is an extremely powerful language for manipulating and analyzing data. Its meteoric rise in popularity within the data science and machine learning communities has made it the de facto lingua franca for analytics. R’s success in the data analysis community stems from two factors described in the preceding epigraphs: R provides most of the technical power that statisticians require built into the default language, and R has been supported by a community of statisticians who are also open source devotees.


pages: 685 words: 203,949

The Organized Mind: Thinking Straight in the Age of Information Overload by Daniel J. Levitin

Abraham Maslow, airport security, Albert Einstein, Amazon Mechanical Turk, Anton Chekhov, autism spectrum disorder, Bayesian statistics, behavioural economics, big-box store, business process, call centre, Claude Shannon: information theory, cloud computing, cognitive bias, cognitive load, complexity theory, computer vision, conceptual framework, correlation does not imply causation, crowdsourcing, cuban missile crisis, Daniel Kahneman / Amos Tversky, data science, deep learning, delayed gratification, Donald Trump, en.wikipedia.org, epigenetics, Eratosthenes, Exxon Valdez, framing effect, friendly fire, fundamental attribution error, Golden Gate Park, Google Glasses, GPS: selective availability, haute cuisine, How many piano tuners are there in Chicago?, human-factors engineering, if you see hoof prints, think horses—not zebras, impulse control, index card, indoor plumbing, information retrieval, information security, invention of writing, iterative process, jimmy wales, job satisfaction, Kickstarter, language acquisition, Lewis Mumford, life extension, longitudinal study, meta-analysis, more computing power than Apollo, Network effects, new economy, Nicholas Carr, optical character recognition, Pareto efficiency, pattern recognition, phenotype, placebo effect, pre–internet, profit motive, randomized controlled trial, Rubik’s Cube, Salesforce, shared worldview, Sheryl Sandberg, Skype, Snapchat, social intelligence, statistical model, Steve Jobs, supply-chain management, the scientific method, The Wealth of Nations by Adam Smith, The Wisdom of Crowds, theory of mind, Thomas Bayes, traumatic brain injury, Turing test, Twitter Arab Spring, ultimatum game, Wayback Machine, zero-sum game

NOTES NOTE ON THE ENDNOTES Scientists make their living by evaluating evidence, and come to provisional conclusions based on the weight of that evidence. I say “provisional” because we acknowledge the possibility that new data may come to light that challenge current assumptions and understanding. In evaluating published data, scientists have to consider such things as the quality of the experiment (and the experimenters), the quality of the review process under which the work was assessed, and the explanatory power of the work. Part of the evaluation includes considering alternative explanations and contradictory findings, and forming a (preliminary) conclusion about what all the existing data say.


pages: 677 words: 206,548

Future Crimes: Everything Is Connected, Everyone Is Vulnerable and What We Can Do About It by Marc Goodman

23andMe, 3D printing, active measures, additive manufacturing, Affordable Care Act / Obamacare, Airbnb, airport security, Albert Einstein, algorithmic trading, Alvin Toffler, Apollo 11, Apollo 13, artificial general intelligence, Asilomar, Asilomar Conference on Recombinant DNA, augmented reality, autonomous vehicles, Baxter: Rethink Robotics, Bill Joy: nanobots, bitcoin, Black Swan, blockchain, borderless world, Boston Dynamics, Brian Krebs, business process, butterfly effect, call centre, Charles Lindbergh, Chelsea Manning, Citizen Lab, cloud computing, Cody Wilson, cognitive dissonance, computer vision, connected car, corporate governance, crowdsourcing, cryptocurrency, data acquisition, data is the new oil, data science, Dean Kamen, deep learning, DeepMind, digital rights, disinformation, disintermediation, Dogecoin, don't be evil, double helix, Downton Abbey, driverless car, drone strike, Edward Snowden, Elon Musk, Erik Brynjolfsson, Evgeny Morozov, Filter Bubble, Firefox, Flash crash, Free Software Foundation, future of work, game design, gamification, global pandemic, Google Chrome, Google Earth, Google Glasses, Gordon Gekko, Hacker News, high net worth, High speed trading, hive mind, Howard Rheingold, hypertext link, illegal immigration, impulse control, industrial robot, information security, Intergovernmental Panel on Climate Change (IPCC), Internet of things, Jaron Lanier, Jeff Bezos, job automation, John Harrison: Longitude, John Markoff, Joi Ito, Jony Ive, Julian Assange, Kevin Kelly, Khan Academy, Kickstarter, Kiva Systems, knowledge worker, Kuwabatake Sanjuro: assassination market, Large Hadron Collider, Larry Ellison, Laura Poitras, Law of Accelerating Returns, Lean Startup, license plate recognition, lifelogging, litecoin, low earth orbit, M-Pesa, machine translation, Mark Zuckerberg, Marshall McLuhan, Menlo Park, Metcalfe’s law, MITM: man-in-the-middle, mobile money, more computing power than Apollo, move fast and break things, Nate Silver, 
national security letter, natural language processing, Nick Bostrom, obamacare, Occupy movement, Oculus Rift, off grid, off-the-grid, offshore financial centre, operational security, optical character recognition, Parag Khanna, pattern recognition, peer-to-peer, personalized medicine, Peter H. Diamandis: Planetary Resources, Peter Thiel, pre–internet, printed gun, RAND corporation, ransomware, Ray Kurzweil, Recombinant DNA, refrigerator car, RFID, ride hailing / ride sharing, Rodney Brooks, Ross Ulbricht, Russell Brand, Salesforce, Satoshi Nakamoto, Second Machine Age, security theater, self-driving car, shareholder value, Sheryl Sandberg, Silicon Valley, Silicon Valley startup, SimCity, Skype, smart cities, smart grid, smart meter, Snapchat, social graph, SoftBank, software as a service, speech recognition, stealth mode startup, Stephen Hawking, Steve Jobs, Steve Wozniak, strong AI, Stuxnet, subscription business, supply-chain management, synthetic biology, tech worker, technological singularity, TED Talk, telepresence, telepresence robot, Tesla Model S, The future is already here, The Future of Employment, the long tail, The Wisdom of Crowds, Tim Cook: Apple, trade route, uranium enrichment, Virgin Galactic, Wall-E, warehouse robotics, Watson beat the top human players on Jeopardy!, Wave and Pay, We are Anonymous. We are Legion, web application, Westphalian system, WikiLeaks, Y Combinator, you are the product, zero day

I visited the scions of Silicon Valley and made friends within the highly talented San Francisco Bay Area start-up community. I was invited to join the faculty of Singularity University, an amazing institution housed on the campus of NASA’s Ames Research Center, where I worked with a brilliant array of astronauts, roboticists, data scientists, computer engineers, and synthetic biologists. These pioneering men and women have the ability to see beyond today’s world, unlocking the tremendous potential of technology to confront the grandest challenges facing humanity. But many of these Silicon Valley entrepreneurs hard at work creating our technological future pay precious little attention to the public policy, legal, ethical, and security risks that their creations pose to the rest of society.


pages: 691 words: 203,236

Whiteshift: Populism, Immigration and the Future of White Majorities by Eric Kaufmann

4chan, Abraham Maslow, affirmative action, Amazon Mechanical Turk, anti-communist, anti-globalists, augmented reality, battle of ideas, behavioural economics, Berlin Wall, Bernie Sanders, Boris Johnson, Brexit referendum, British Empire, centre right, Chelsea Manning, cognitive dissonance, complexity theory, corporate governance, correlation does not imply causation, critical race theory, crowdsourcing, Daniel Kahneman / Amos Tversky, data science, David Brooks, deindustrialization, demographic transition, Donald Trump, Elon Musk, en.wikipedia.org, facts on the ground, failed state, fake news, Fall of the Berlin Wall, first-past-the-post, Francis Fukuyama: the end of history, gentrification, Great Leap Forward, Haight Ashbury, Herbert Marcuse, illegal immigration, immigration reform, imperial preference, income inequality, it's over 9,000, Jeremy Corbyn, knowledge economy, knowledge worker, liberal capitalism, longitudinal study, Lyft, mass immigration, meta-analysis, microaggression, moral panic, Nate Silver, New Urbanism, Norman Mailer, open borders, open immigration, opioid epidemic / opioid crisis, Overton Window, phenotype, postnationalism / post nation state, Ralph Waldo Emerson, Republic of Letters, Ronald Reagan, Scientific racism, Silicon Valley, Social Justice Warrior, statistical model, Steve Bannon, Steven Pinker, the built environment, the scientific method, The Wisdom of Crowds, transcontinental railway, twin studies, uber lyft, upwardly mobile, urban sprawl, W. E. B. Du Bois, Washington Consensus, white flight, working-age population, World Values Survey, young professional

Controlling for all the usual confounders, white British Leavers, UKIP/BNP voters or right-wing voters move to places a few points whiter than white British Remainers and left-wing voters, but the difference is small. In the US, there is no equivalent data, so I turned to geocoded pro- and anti-Trump tweets. This work, with a data scientist, Andrius Mudinas, finds a similar pattern to Britain. Namely, white Americans move to significantly whiter places than minorities, but whites who are pro- and anti-Trump move to equally white areas. This echoes a growing number of US studies using voter registration files which find that the partisan composition of areas is not what attracts white Republicans or Democrats there.


pages: 404 words: 43,442

The Art of R Programming by Norman Matloff

data science, Debian, discrete time, Donald Knuth, functional programming, general-purpose programming language, linked data, sorting algorithm, statistical model

Thus, I have the points of view of both a “hard-core” computer scientist and of a statistician and statistics researcher. I hope this blend enables this book to fill a gap in the literature and enhances its value for you, the reader. 1 GETTING STARTED As detailed in the introduction, R is an extremely versatile open source programming language for statistics and data science. It is widely used in every field where there is data: business, industry, government, medicine, academia, and so on. In this chapter, you’ll get a quick introduction to R—how to invoke it, what it can do, and what files it uses. We’ll cover just enough to give you the basics you need to work through the examples in the next few chapters, where the details will be presented.

You can delete rows or columns by reassignment, too:
    > m <- matrix(1:6,nrow=3)
    > m
         [,1] [,2]
    [1,]    1    4
    [2,]    2    5
    [3,]    3    6
    > m <- m[c(1,3),]
    > m
         [,1] [,2]
    [1,]    1    4
    [2,]    3    6
3.4.2 Extended Example: Finding the Closest Pair of Vertices in a Graph
Finding the distances between vertices on a graph is a common example used in computer science courses and is used in statistics/data sciences too. This kind of problem arises in some clustering algorithms, for instance, and in genomics applications. Here, we’ll look at the common example of finding distances between cities, as it is easier to describe than, say, finding distances between DNA strands. Suppose we need a function that inputs a distance matrix, where the element in row i, column j gives the distance between city i and city j and outputs the minimum one-hop distance between cities and the pair of cities that achieves that minimum.
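The function described at the end of this excerpt could be sketched as follows. This is a minimal illustration in Python rather than the book's R code; the name `closest_pair` and the sample matrix are hypothetical, and the sketch assumes the distance matrix is symmetric, so only the entries above the diagonal are scanned. Indices are 0-based here, whereas R counts from 1.

```python
from typing import List, Tuple


def closest_pair(dist: List[List[float]]) -> Tuple[float, int, int]:
    """Return (minimum distance, i, j) over all pairs of distinct cities i < j.

    Assumes dist is a square, symmetric matrix in which dist[i][j] is the
    distance between city i and city j.
    """
    n = len(dist)
    if n < 2:
        raise ValueError("need at least two cities")
    best = (dist[0][1], 0, 1)
    for i in range(n):
        # Start at i + 1 to skip the zero diagonal and the mirrored entries.
        for j in range(i + 1, n):
            if dist[i][j] < best[0]:
                best = (dist[i][j], i, j)
    return best


# Illustrative data only; these distances are not from the excerpt.
distances = [
    [0, 12, 13, 8, 20],
    [12, 0, 15, 28, 88],
    [13, 15, 0, 6, 9],
    [8, 28, 6, 0, 33],
    [20, 88, 9, 33, 0],
]

result = closest_pair(distances)
print(result)  # (6, 2, 3): cities 2 and 3 are 6 units apart
```

Scanning only the upper triangle halves the work and keeps the zero diagonal from being reported as the minimum.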


pages: 386 words: 113,709

Why We Drive: Toward a Philosophy of the Open Road by Matthew B. Crawford

1960s counterculture, Airbus A320, airport security, augmented reality, autonomous vehicles, behavioural economics, Bernie Sanders, Big Tech, Boeing 737 MAX, British Empire, Burning Man, business logic, call centre, classic study, collective bargaining, confounding variable, congestion pricing, crony capitalism, data science, David Sedaris, deskilling, digital map, don't be evil, Donald Trump, driverless car, Elon Musk, emotional labour, en.wikipedia.org, Fellow of the Royal Society, Ford Model T, gamification, gentrification, gig economy, Google Earth, Great Leap Forward, Herbert Marcuse, hive mind, Ian Bogost, income inequality, informal economy, Internet of things, Jane Jacobs, labour mobility, Lyft, mirror neurons, Network effects, New Journalism, New Urbanism, Nicholas Carr, planned obsolescence, Ponzi scheme, precautionary principle, Ralph Nader, ride hailing / ride sharing, Ronald Reagan, Sam Peltzman, security theater, self-driving car, sharing economy, Shoshana Zuboff, Silicon Valley, smart cities, social graph, social intelligence, Stephen Hawking, surveillance capitalism, tacit knowledge, tech worker, technoutopianism, the built environment, The Death and Life of Great American Cities, the High Line, time dilation, too big to fail, traffic fines, Travis Kalanick, trolley problem, Uber and Lyft, Uber for X, uber lyft, Unsafe at Any Speed, urban planning, Wall-E, Works Progress Administration

Google is building a model city within Toronto, a sort of Bonsai version of what is possible, conceived in the spirit of other demonstration cities that were intended to sway elite opinion, such as Potemkin’s village that so impressed Catherine the Great. Sensors will be embedded throughout the physical plant to capture the resident’s activities, then to be massaged by cutting-edge data science. The hope, clearly, is to build a deep, proprietary social science. Such a science could lead to real improvements in urban management, for example by being able to predict demand for heat and electricity, manage the allocation of street capacity based on demand, and automate waste disposal. But note that hoarding the data collected, and guarding it with military-grade secrecy, is key to the whole concept, as without that there is no business rationale.

If that legitimacy cannot be grounded in our shared rationality, based on reasons that can be articulated, interrogated, and defended, it will surely be claimed on some other basis. What this will be is already coming into view, and it bears a striking resemblance to priestly divination: the inscrutable arcana of data science, by which a new clerisy peers into a hidden layer of reality that is revealed only by a self-taught AI program—the logic of which is beyond human knowing. For the past several years it has been common to hear establishmentarian intellectuals lament “populism” as a rejection of Enlightenment ideals.


Common Stocks and Uncommon Profits and Other Writings by Philip A. Fisher

book value, business climate, business cycle, buy and hold, data science, El Camino Real, estate planning, fixed income, index fund, low interest rates, market bubble, market fundamentalism, profit motive, RAND corporation, Salesforce, the market place, transaction costs, vertical integration

It is alarming to read some of the reasons given in brokerage reports recommending purchase of these shares and then to compare the outlook described in such documents with what actually was to happen. A fragmentary list of such companies might include: Memorex high 173⅞, Ampex high 49⅞, Levitz Furniture high 60½, Mohawk Data Sciences high 111, Litton Industries high 101¾, Kalvar high 176½. The list could go on and on and on. However, more examples would serve only to make the same point over and over. Since it should already be quite apparent how important is the habit of evaluating any difference that may exist between a contemporary financial-community appraisal of a company and the fundamental aspects of that company, it should be more productive for us to spend our time examining further the characteristics of these financial-community appraisals.

Mallory-Sharon Metals Corporation Management approaching of change in concept of depth in deterioration of discipline of integrity of knowing Margin, buying on Market efficiency of possible downturns in, selling and Marketability, of stocks. See Liquidity Marketing Market potential, of products Market price trends, (chart). See also Price entries Market research Markets, exhaustion of Market timing Matsushita Memorex Metal Hydrides Middle companies, in diversification Mistakes Mohawk Data Sciences Monopolies Montgomery Ward Motorola N National Association of Securities Dealers Needs, of investor New-issue supply New products New York Stock Exchange Nielsen, A. C., Co. Noble, Daniel O Opportunity history vs. price vs. Overpriced stocks Over-the-counter stocks P Panic of 1873 Past, clues from Patents Patience People-effectiveness program People factors Performance Per-share earnings, past Personnel relations.


pages: 475 words: 127,389

Apollo's Arrow: The Profound and Enduring Impact of Coronavirus on the Way We Live by Nicholas A. Christakis

agricultural Revolution, Anthropocene, Atul Gawande, Boris Johnson, butterfly effect, Chuck Templeton: OpenTable:, classic study, clean water, Columbian Exchange, contact tracing, contact tracing app, coronavirus, COVID-19, dark matter, data science, death of newspapers, disinformation, Donald Trump, Downton Abbey, Edward Jenner, Edward Lorenz: Chaos theory, George Floyd, global pandemic, global supply chain, helicopter parent, Henri Poincaré, high-speed rail, income inequality, invention of agriculture, invisible hand, it's over 9,000, job satisfaction, lockdown, manufacturing employment, mass immigration, mass incarceration, medical residency, meta-analysis, New Journalism, randomized controlled trial, risk tolerance, Robert Shiller, school choice, security theater, social contagion, social distancing, Steven Pinker, TED Talk, the scientific method, trade route, Upton Sinclair, zoonotic diseases

The “war on cancer” declared in 1971 had a similar impact (although it did not cure cancer, it advanced fundamental medical science). Perhaps the multitrillion-dollar hit to the American economy by the COVID-19 pandemic will make multibillion-dollar investments in science—from virology to medicine to epidemiology to data science—seem well worth it. Plagues can also lead to long-term shifts in how we think about government and leaders. In medieval times, the manifest inability of rulers, priests, doctors, and others in positions of authority to control the course of plague led to a wholesale loss of faith in the corresponding institutions and a strong desire for new sources of authority.

Christakis is a physician and sociologist who explores the ancient origins and modern implications of human nature. He directs the Human Nature Lab at Yale University, where he is the Sterling Professor of Social and Natural Science in the Departments of Sociology, Medicine, Ecology and Evolutionary Biology, Statistics and Data Science, and Biomedical Engineering. He is the codirector of the Yale Institute for Network Science, the coauthor of Connected, and the author of Blueprint. Also by Nicholas A. Christakis Death Foretold Connected (with James H. Fowler) Blueprint


pages: 592 words: 125,186

The Science of Hate: How Prejudice Becomes Hate and What We Can Do to Stop It by Matthew Williams

3D printing, 4chan, affirmative action, agricultural Revolution, algorithmic bias, Black Lives Matter, Brexit referendum, Cambridge Analytica, citizen journalism, cognitive dissonance, coronavirus, COVID-19, dark matter, data science, deep learning, deindustrialization, desegregation, disinformation, Donald Trump, European colonialism, fake news, Ferguson, Missouri, Filter Bubble, gamification, George Floyd, global pandemic, illegal immigration, immigration reform, impulse control, income inequality, longitudinal study, low skilled workers, Mark Zuckerberg, meta-analysis, microaggression, Milgram experiment, Oklahoma City bombing, OpenAI, Overton Window, power law, selection bias, Snapchat, statistical model, The Turner Diaries, theory of mind, TikTok, twin studies, white flight

The pattern in the UK is broadly similar, with the internet ahead (77 per cent), and TV (55 per cent) and print (22 per cent) trailing behind.4 For younger age groups (particularly those aged sixteen to twenty-four), online sources are their primary gateway to information about the world, family and friends.5 Algorithms learn from user behaviour, and therefore influence our collective actions. This means our prejudices and biases become embedded in bits of code that go on to influence what we are exposed to online, reflecting back these biases in often amplified ways. The emerging consensus from the field of data science is that algorithms are assisting in the polarisation of information exposure and hence debate and action online. Take YouTube as an example. The website Algotransparency.org, developed by an ex-Google employee, analyses YouTube’s top autoplay suggestions based on any search in order to demonstrate how the site’s recommendation algorithm works.

Filter bubbles and our bias Research on internet ‘filter bubbles’, often used interchangeably with the term ‘echo chambers’,§ has established that partisan information sources are amplified in online networks of like-minded social media users, where they go largely unchallenged due to ranking algorithms filtering out any challenging posts.9 Data science shows these filter bubbles are resilient accelerators of prejudice, reinforcing and amplifying extreme viewpoints on both sides of the spectrum. Looking at over half a million tweets covering the issues of gun control, same-sex marriage and climate change, New York University’s Social Perception and Evaluation Lab found that hateful posts related to these issues increased retweeting within filter bubbles, but not between them.


pages: 933 words: 205,691

Hadoop: The Definitive Guide by Tom White

Amazon Web Services, bioinformatics, business intelligence, business logic, combinatorial explosion, data science, database schema, Debian, domain-specific language, en.wikipedia.org, exponential backoff, fallacies of distributed computing, fault tolerance, full text search, functional programming, Grace Hopper, information retrieval, Internet Archive, Kickstarter, Large Hadron Collider, linked data, loose coupling, openstreetmap, recommendation engine, RFID, SETI@home, social graph, sparse data, web application

You can inspect the generated script and check that the substitutions look sane (because they are dynamically generated, for example) before running it in normal mode. At the time of this writing, Grunt does not support parameter substitution.

Chapter 12. Hive

In “Information Platforms and the Rise of the Data Scientist,”[100] Jeff Hammerbacher describes Information Platforms as “the locus of their organization’s efforts to ingest, process, and generate information,” and how they “serve to accelerate the process of learning from empirical data.” One of the biggest ingredients in the Information Platform built by Jeff’s team at Facebook was Hive, a framework for data warehousing on top of Hadoop.


pages: 1,085 words: 219,144

Solr in Action by Trey Grainger, Timothy Potter

business intelligence, cloud computing, commoditize, conceptual framework, crowdsourcing, data acquisition, data science, en.wikipedia.org, failed state, fault tolerance, finite state, full text search, functional programming, glass ceiling, information retrieval, machine readable, natural language processing, openstreetmap, performance metric, premature optimization, recommendation engine, web application

You can also combine the Boost query parser with another query parser through the use of a nested query:

/select?q=_query_:"{!edismax qf=title content}data science" AND _query_:"{!boost b=log(popularity)}*:*" AND _query_:"{!boost b=recip(ms(NOW,articledate),3.16e-11,1,1)}category:news"

This query will run a search for the keywords data science, boosting all documents by their popularity and by how recently they were posted if they fall within the “news” category. The number of results will be the same as the search for data science; the boost clauses only serve to affect document relevancy.

7.6.6. Prefix query parser

The Prefix query parser can be used in place of a wildcard query.
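The recency boost in the nested query above uses Solr's recip function, which evaluates a / (m*x + b). The following is a minimal Python sketch of that arithmetic only, assuming x is the document age in milliseconds as produced by ms(NOW,articledate). The name recip_boost and the sample ages are illustrative and not part of Solr.

```python
# Approximate number of milliseconds in a 365-day year.
MS_PER_YEAR = 365 * 24 * 60 * 60 * 1000


def recip_boost(age_ms, m=3.16e-11, a=1.0, b=1.0):
    """Mirror the arithmetic of Solr's recip(x, m, a, b) = a / (m * x + b).

    The defaults are the constants from the example query; age_ms stands in
    for ms(NOW, articledate). This is an illustration, not Solr code.
    """
    return a / (m * age_ms + b)


if __name__ == "__main__":
    samples = [("new", 0), ("1 year", MS_PER_YEAR), ("10 years", 10 * MS_PER_YEAR)]
    for label, age in samples:
        print(f"{label:>8}: multiplier {recip_boost(age):.3f}")
```

Because 3.16e-11 is roughly one over the number of milliseconds in a year, a brand-new article keeps a multiplier of 1 and a year-old article gets about one half, with the multiplier continuing to shrink as the article ages.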


pages: 420 words: 135,569

Imaginable: How to See the Future Coming and Feel Ready for Anything―Even Things That Seem Impossible Today by Jane McGonigal

2021 United States Capitol attack, Airbnb, airport security, Alvin Toffler, augmented reality, autism spectrum disorder, autonomous vehicles, availability heuristic, basic income, biodiversity loss, bitcoin, Black Lives Matter, blockchain, circular economy, clean water, climate change refugee, cognitive bias, cognitive dissonance, Community Supported Agriculture, coronavirus, COVID-19, CRISPR, cryptocurrency, data science, decarbonisation, digital divide, disinformation, Donald Trump, drone strike, Elon Musk, fake news, fiat currency, future of work, Future Shock, game design, George Floyd, global pandemic, global supply chain, Greta Thunberg, income inequality, index card, Internet of things, Jane Jacobs, Jeff Bezos, Kickstarter, labor-force participation, lockdown, longitudinal study, Mason jar, mass immigration, meta-analysis, microbiome, Minecraft, moral hazard, open borders, pattern recognition, place-making, plant based meat, post-truth, QAnon, QR code, remote working, RFID, risk tolerance, School Strike for Climate, Search for Extraterrestrial Intelligence, self-driving car, Silicon Valley, Silicon Valley startup, Snapchat, social distancing, stem cell, TED Talk, telepresence, telepresence robot, The future is already here, TikTok, traumatic brain injury, universal basic income, women in the workforce, work culture, Y Combinator

And, studies show, they work incredibly well: programs trained with noise injections learn much faster and perform much better than programs trained only on real-world data sets. Recently, Erik Hoel, a neuroscientist at Tufts University, noticed how similar these machine-learning techniques are to the surreal and hard-to-interpret nature of human dreams.2 When we dream, Hoel suggested in a 2021 paper in the data science journal Patterns, it often feels like a “noise injection” into our brains. Our dreams rarely repeat the exact details of our real-world experiences. Instead they recombine real people, places, experiences, and events in bizarre and seemingly random ways. Human dreams also have the same sparseness, or missing data, as noise injections, a kind of narrative fuzziness.

How can we make room there for all of us? Not everyone is on the move in this future. The rest of humanity is learning how to make others feel welcome and at home somewhere new. In fact, the art of welcoming is now ranked by online learners as the most useful and desirable practical skill to master, ahead of computer programming, data science, and even health care. It turns out that a “soft” skill may be the most essential one for humanity’s future. Migration in this future is no longer an individual burden or a dangerous, illegal journey. It’s coordinated, intentional, and strategic—the whole world working together to build vibrant, thriving societies.


pages: 175 words: 54,755

Robot, Take the Wheel: The Road to Autonomous Cars and the Lost Art of Driving by Jason Torchinsky

autonomous vehicles, barriers to entry, call centre, commoditize, computer vision, connected car, DARPA: Urban Challenge, data science, driverless car, Elon Musk, en.wikipedia.org, interchangeable parts, job automation, Philippa Foot, ransomware, self-driving car, sensor fusion, side project, Tesla Model S, trolley problem, urban sprawl

v=PgnsapPGaaw. 27 Wikipedia, “Edge Detection,” https://en.wikipedia.org/wiki/Edge_detection. 28 Torchinsky, Jason, “Why Nissan Built Realistic Inflatable Versions of Its Most Popular Cars,” Jalopnik, October 18, 2012, https://jalopnik.com/why-nissan-built-realistic-inflatable-versions-of-its-m-5952415. 29 Condliffe, Jamie, “This Image Is Why Self-Driving Cars Come Loaded with Many Types of Sensors,” MIT Technology Review, July 21, 2017, https://www.technologyreview.com/s/608321/this-image-is-why-self-driving-cars-come-loaded-with-many-types-of-sensors/. 30 Antunes, João, “Performance over Price: Luminar’s Novel Lidar Tech for Autonomous Vehicles,” SPAR 3D, May 5, 2017, https://www.spar3d.com/news/lidar/performance-price-luminars-novel-lidar-tech-autonomous-vehicles/. 31 Dwivedi, Priya, “Tracking a self-driving car with high precision,” Towards Data Science, April 30, 2017, https://towardsdatascience.com/helping-a-self-driving-car-localize-itself-88705f419e4a. 32 Kichun Jo; Yongwoo Jo; Jae Kyu Suhr; Ho Gi Jung; Myoungho Sunwoo, “Precise Localization of an Autonomous Car Based on Probabilistic Noise Models of Road Surface Marker Features Using Multiple Cameras,” IEEE Transactions on Intelligent Transportation Systems, vol. 16, no. 6, December 2015, https://ieeexplore.ieee.org/document/7160754/. 33 Silver, David, “How Self-Driving Cars Work,” Medium, December 14, 2017, https://medium.com/udacity/how-self-driving-cars-work-f77c49dca47e. 34 Website of the Australian Government Department of Infrastructure, Regional Development and Cities, https://infrastructure.gov.au/vehicles/mv_standards_act/files/Sub136_Austroads.pdf.


pages: 161 words: 52,058

The Art of Corporate Success: The Story of Schlumberger by Ken Auletta

Albert Einstein, Bretton Woods, data science, George Gilder, job satisfaction, offshore financial centre, oil shale / tar sands, oil shock, Ronald Reagan, the scientific method, union organizing

And while the profits of most oil and oilfield-service companies fell sharply in 1982, Schlumberger’s net income rose by more than 6 percent. Science is the foundation of Schlumberger. Science is the link between the various corporate subsidiaries, for the task of most of them is collecting, measuring, and transmitting data. Science, and particularly geophysics, was at the core of the careers of Conrad and Marcel Schlumberger, the company’s founders. Both were born in the town of Guebwiller, in Alsace—Conrad in 1878 and Marcel six years later. They were two of six children of a Protestant family that owned a prosperous textile machine business.


pages: 590 words: 152,595

Army of None: Autonomous Weapons and the Future of War by Paul Scharre

"World Economic Forum" Davos, active measures, Air France Flight 447, air gap, algorithmic trading, AlphaGo, Apollo 13, artificial general intelligence, augmented reality, automated trading system, autonomous vehicles, basic income, Black Monday: stock market crash in 1987, brain emulation, Brian Krebs, cognitive bias, computer vision, cuban missile crisis, dark matter, DARPA: Urban Challenge, data science, deep learning, DeepMind, DevOps, Dr. Strangelove, drone strike, Elon Musk, en.wikipedia.org, Erik Brynjolfsson, facts on the ground, fail fast, fault tolerance, Flash crash, Freestyle chess, friendly fire, Herman Kahn, IFF: identification friend or foe, ImageNet competition, information security, Internet of things, Jeff Hawkins, Johann Wolfgang von Goethe, John Markoff, Kevin Kelly, Korean Air Lines Flight 007, Loebner Prize, loose coupling, Mark Zuckerberg, military-industrial complex, moral hazard, move 37, mutually assured destruction, Nate Silver, Nick Bostrom, PalmPilot, paperclip maximiser, pattern recognition, Rodney Brooks, Rubik’s Cube, self-driving car, sensor fusion, South China Sea, speech recognition, Stanislav Petrov, Stephen Hawking, Steve Ballmer, Steve Wozniak, Strategic Defense Initiative, Stuxnet, superintelligent machines, Tesla Model S, The Signal and the Noise by Nate Silver, theory of mind, Turing test, Tyler Cowen, universal basic income, Valery Gerasimov, Wall-E, warehouse robotics, William Langewiesche, Y2K, zero day

Comment at 5:10. 224 Automated hacking back is a theoretical concept: Alexander Velez-Green, “When ‘Killer Robots’ Declare War,” Defense One, April 12, 2015, http://www.defenseone.com/ideas/2015/04/when-killer-robots-declare-war/109882/. 224 automate “spear phishing” attacks: Karen Epper Hoffman, “Machine Learning Can Be Used Offensively to Automate Spear Phishing,” Infosecurity Magazine, August 5, 2016, https://www.infosecurity-magazine.com/news/bhusa-researchers-present-phishing/. 224 automatically develop “humanlike” tweets: John Seymour and Philip Tully, “Weaponizing data science for social engineering: Automated E2E spear phishing on Twitter,” https://www.blackhat.com/docs/us-16/materials/us-16-Seymour-Tully-Weaponizing-Data-Science-For-Social-Engineering-Automated-E2E-Spear-Phishing-On-Twitter-wp.pdf. 224 “in offensive cyberwarfare”: Eric Messinger, “Is It Possible to Ban Autonomous Weapons in Cyberwar?,” Just Security, January 15, 2015, https://www.justsecurity.org/19119/ban-autonomous-weapons-cyberwar/. 225 estimated 8 to 15 million computers worldwide: “Virus Strikes 15 Million PCs,” UPI, January 26, 2009, http://www.upi.com/Top_News/2009/01/26/Virus-strikes-15-million-PCs/19421232924206/. 225 method to counter Conficker: “Clock ticking on worm attack code,” BBC News, January 20, 2009, http://news.bbc.co.uk/2/hi/technology/7832652.stm. 225 brought Conficker to heel: Microsoft Security Intelligence Report: Volume 11 (11), Microsoft, 2011. 226 “prevent and react to countermeasures”: Alessandro Guarino, “Autonomous Intelligent Agents in Cyber Offence,” in K.


pages: 285 words: 58,517

The Network Imperative: How to Survive and Grow in the Age of Digital Business Models by Barry Libert, Megan Beck

active measures, Airbnb, Amazon Web Services, asset allocation, asset light, autonomous vehicles, big data - Walmart - Pop Tarts, business intelligence, call centre, Clayton Christensen, cloud computing, commoditize, crowdsourcing, data science, disintermediation, diversification, Douglas Engelbart, future of work, Google Glasses, Google X / Alphabet X, independent contractor, Infrastructure as a Service, intangible asset, Internet of things, invention of writing, inventory management, iterative process, Jeff Bezos, job satisfaction, John Zimmer (Lyft cofounder), Kevin Kelly, Kickstarter, Larry Ellison, late fees, Lyft, Mark Zuckerberg, Mary Meeker, Oculus Rift, pirate software, ride hailing / ride sharing, Salesforce, self-driving car, sharing economy, Silicon Valley, Silicon Valley startup, six sigma, software as a service, software patent, Steve Jobs, subscription business, systems thinking, TaskRabbit, Travis Kalanick, uber lyft, Wall-E, women in the workforce, Zipcar

He is internationally known for pioneering research on networked organizations, leadership mental models, and marketing strategy. He consults with major firms around the world, providing expert testimony, and has lectured at over fifty universities worldwide. He has authored more than two dozen books on various topics, including network theory, innovation, and leadership. OPENMATTERS is a data science company. It focuses on analyzing business models and the underlying sources of value. The firm harnesses technology, big data and analytics to categorize and measure business model performance. OpenMatters uses proprietary research to build indices and ratings for investors and strategies and rankings for companies to help both achieve better returns.


pages: 207 words: 59,298

The Gig Economy: A Critical Introduction by Jamie Woodcock, Mark Graham

Airbnb, algorithmic management, Amazon Mechanical Turk, autonomous vehicles, barriers to entry, British Empire, business process, business process outsourcing, Californian Ideology, call centre, collective bargaining, commoditize, corporate social responsibility, crowdsourcing, data science, David Graeber, deindustrialization, Didi Chuxing, digital divide, disintermediation, emotional labour, en.wikipedia.org, full employment, future of work, gamification, gender pay gap, gig economy, global value chain, Greyball, independent contractor, informal economy, information asymmetry, inventory management, Jaron Lanier, Jeff Bezos, job automation, knowledge economy, low interest rates, Lyft, mass immigration, means of production, Network effects, new economy, Panopticon Jeremy Bentham, planetary scale, precariat, rent-seeking, RFID, ride hailing / ride sharing, Ronald Reagan, scientific management, self-driving car, sentiment analysis, sharing economy, Silicon Valley, Silicon Valley ideology, TaskRabbit, The Future of Employment, transaction costs, Travis Kalanick, two-sided market, Uber and Lyft, Uber for X, uber lyft, union organizing, women in the workforce, working poor, young professional

Available at: https://www.oecd-ilibrary.org/employment/automation-skills-use-and-training_2e2f4eea-en Noble, S.U. (2018) Algorithms of Oppression: How Search Engines Reinforce Racism. New York: NYU Press. OECD (2019) Measuring platform mediated workers. OECD Digital Economy Papers No. 282. Ojanperä, S., O’Clery, N. and Graham, M. (2018) Data science, artificial intelligence and the futures of work. Alan Turing Institute Report, 24 October. Available at: http://doi.org/10.5281/zenodo.1470609 O’Neil, C. (2017) Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy. London: Penguin. Pasquale, F. (2015) The Black Box Society: The Secret Algorithms That Control Money and Information.


pages: 262 words: 60,248

Python Tricks: The Book by Dan Bader

anti-pattern, business logic, data science, domain-specific language, don't repeat yourself, functional programming, Hacker News, higher-order functions, linked data, off-by-one error, pattern recognition, performance metric

>>> arr = bytearray((0, 1, 2, 3))
>>> arr[1]
1

# The bytearray repr:
>>> arr
bytearray(b'\x00\x01\x02\x03')

# Bytearrays are mutable:
>>> arr[1] = 23
>>> arr
bytearray(b'\x00\x17\x02\x03')
>>> arr[1]
23

# Bytearrays can grow and shrink in size:
>>> del arr[1]
>>> arr
bytearray(b'\x00\x02\x03')
>>> arr.append(42)
>>> arr
bytearray(b'\x00\x02\x03*')

# Bytearrays can only hold "bytes"
# (integers in the range 0 <= x <= 255)
>>> arr[1] = 'hello'
TypeError: "an integer is required"
>>> arr[1] = 300
ValueError: "byte must be in range(0, 256)"

# Bytearrays can be converted back into bytes objects:
# (This will copy the data)
>>> bytes(arr)
b'\x00\x02\x03*'

Key Takeaways

There are a number of built-in data structures you can choose from when it comes to implementing arrays in Python. In this chapter we’ve focused on core language features and data structures included in the standard library only. If you’re willing to go beyond the Python standard library, third-party packages like NumPy14 offer a wide range of fast array implementations for scientific computing and data science. By restricting ourselves to the array data structures included with Python, here’s what our choices come down to: You need to store arbitrary objects, potentially with mixed data types? Use a list or a tuple, depending on whether you want an immutable data structure or not. You have numeric (integer or floating point) data and tight packing and performance is important?


She Has Her Mother's Laugh by Carl Zimmer

23andMe, agricultural Revolution, Anthropocene, clean water, clockwatching, cloud computing, CRISPR, dark matter, data science, discovery of DNA, double helix, Drosophila, Easter island, Elon Musk, epigenetics, Fellow of the Royal Society, Flynn Effect, friendly fire, Gary Taubes, germ theory of disease, Gregor Mendel, Helicobacter pylori, Isaac Newton, James Webb Space Telescope, lolcat, longitudinal study, medical bankruptcy, meta-analysis, microbiome, moral panic, mouse model, New Journalism, out of africa, phenotype, Ralph Waldo Emerson, Recombinant DNA, Scientific racism, statistical model, stem cell, twin studies, W. E. B. Du Bois

Social media platforms have worked hard to make this replication not merely perfect but easy. You don’t have to dig into the HTML code for your favorite political slogan or your favorite clip of an insane Russian driver. You press SHARE. You retweet. It’s not just easy to spread memes; it’s also easy to track them. Data scientists can track memes with all the numerical precision of a geneticist following an allele for antibiotic resistance in a petri dish. Forty years after the publication of The Selfish Gene, Dawkins wrote an epilogue to an anniversary edition in which he looked back at his idea with satisfaction.


pages: 217 words: 63,287

The Participation Revolution: How to Ride the Waves of Change in a Terrifyingly Turbulent World by Neil Gibb

Abraham Maslow, Adam Neumann (WeWork), Airbnb, Albert Einstein, blockchain, Buckminster Fuller, call centre, carbon footprint, Clayton Christensen, collapse of Lehman Brothers, corporate social responsibility, creative destruction, crowdsourcing, data science, Didi Chuxing, disruptive innovation, Donald Trump, gentrification, gig economy, iterative process, Jeremy Corbyn, job automation, Joseph Schumpeter, Khan Academy, Kibera, Kodak vs Instagram, Mark Zuckerberg, Menlo Park, Minecraft, mirror neurons, Network effects, new economy, performance metric, ride hailing / ride sharing, shareholder value, side project, Silicon Valley, Silicon Valley startup, Skype, Snapchat, Steve Jobs, Susan Wojcicki, the scientific method, Thomas Kuhn: the structure of scientific revolutions, trade route, urban renewal, WeWork

There was another pattern in the data – a clear correlation between the hormone levels in each man and the significance he attributed to the practice: the more importance given to meditation, the greater the levels of hormone. What Dr Saron began to conclude was that it wasn’t the meditation per se that was making the difference, but how meaningful it was to those who practised it. This was a great insight. But it was actually only half the picture. The problem with a lot of data science and research is the frame. Dr Saron’s research was focused on the individual. What it missed was that the men were meditating in a group. So much research into addiction, illness, and depression misses this point. Why do people so often get well in rehab but relapse when they leave? Why do people get fit in gym and yoga classes but find it hard to maintain fitness when they are on their own?


pages: 196 words: 61,981

Blockchain Chicken Farm: And Other Stories of Tech in China's Countryside by Xiaowei Wang

4chan, AI winter, Amazon Web Services, artificial general intelligence, autonomous vehicles, back-to-the-land, basic income, Big Tech, bitcoin, blockchain, business cycle, cloud computing, Community Supported Agriculture, computer vision, COVID-19, cryptocurrency, data science, deep learning, Deng Xiaoping, Didi Chuxing, disruptive innovation, Donald Trump, drop ship, emotional labour, Ethereum, ethereum blockchain, Francis Fukuyama: the end of history, Garrett Hardin, gig economy, global pandemic, Great Leap Forward, high-speed rail, Huaqiangbei: the electronics market of Shenzhen, China, hype cycle, income inequality, informal economy, information asymmetry, Internet Archive, Internet of things, job automation, Kaizen: continuous improvement, Kickstarter, knowledge worker, land reform, Marc Andreessen, Mark Zuckerberg, Menlo Park, multilevel marketing, One Laptop per Child (OLPC), Pearl River Delta, peer-to-peer lending, precision agriculture, QR code, ride hailing / ride sharing, risk tolerance, Salesforce, Satoshi Nakamoto, scientific management, self-driving car, Silicon Valley, Snapchat, SoftBank, software is eating the world, surveillance capitalism, TaskRabbit, tech worker, technological solutionism, the long tail, TikTok, Tragedy of the Commons, universal basic income, vertical integration, Vision Fund, WeWork, Y Combinator, zoonotic diseases

They are the creative director at Logic magazine, and their work encompasses community-based and public art projects, data visualization, technology, ecology, and education. Their projects have been finalists for the Index Award and featured by The New York Times, the BBC, CNN, VICE, and other outlets. They are working toward a Ph.D. at UC Berkeley, where they are a part of the National Science Foundation’s Research Traineeship program in Environment and Society: Data Science for the 21st Century. You can sign up for email updates here. Thank you for buying this Farrar, Straus and Giroux ebook. To receive special offers, bonus content, and info on new releases and other great reads, sign up for our newsletters. Or visit us online at us.macmillan.com/newslettersignup For email updates on the author, click here.


pages: 391 words: 71,600

Hit Refresh: The Quest to Rediscover Microsoft's Soul and Imagine a Better Future for Everyone by Satya Nadella, Greg Shaw, Jill Tracie Nichols

3D printing, AlphaGo, Amazon Web Services, anti-globalists, artificial general intelligence, augmented reality, autonomous vehicles, basic income, Bretton Woods, business process, cashless society, charter city, cloud computing, complexity theory, computer age, computer vision, corporate social responsibility, crowdsourcing, data science, DeepMind, Deng Xiaoping, Donald Trump, Douglas Engelbart, driverless car, Edward Snowden, Elon Musk, en.wikipedia.org, equal pay for equal work, everywhere but in the productivity statistics, fault tolerance, fulfillment center, Gini coefficient, global supply chain, Google Glasses, Grace Hopper, growth hacking, hype cycle, industrial robot, Internet of things, Jeff Bezos, job automation, John Markoff, John von Neumann, knowledge worker, late capitalism, Mars Rover, Minecraft, Mother of all demos, Neal Stephenson, NP-complete, Oculus Rift, pattern recognition, place-making, Richard Feynman, Robert Gordon, Robert Solow, Ronald Reagan, Salesforce, Second Machine Age, self-driving car, side project, Silicon Valley, Skype, Snapchat, Snow Crash, special economic zone, speech recognition, Stephen Hawking, Steve Ballmer, Steve Jobs, subscription business, TED Talk, telepresence, telerobotics, The Rise and Fall of American Growth, The Soul of a New Machine, Tim Cook: Apple, trade liberalization, two-sided market, universal basic income, Wall-E, Watson beat the top human players on Jeopardy!, young professional, zero-sum game

Similarly, an insurer like MetLife can spin up our cloud with ML overnight to run enormous actuarial tables and have answers to its most crucial financial questions in the morning, making it possible for the company to adapt quickly to dramatic shifts in the insurance landscape—an unexpected flu epidemic, a more-violent-than-normal hurricane season. Whether you are in Ethiopia or Evanston, Ohio, or if you hold a doctorate in data science or not, everyone should have that capability to learn from the data. With Azure, Microsoft would democratize machine learning just as it had done with personal computing back in the 1980s. To me, meeting with customers and learning from both their articulated and unarticulated needs is key to any product innovation agenda.


pages: 237 words: 67,154

Ours to Hack and to Own: The Rise of Platform Cooperativism, a New Vision for the Future of Work and a Fairer Internet by Trebor Scholz, Nathan Schneider

1960s counterculture, activist fund / activist shareholder / activist investor, Airbnb, Amazon Mechanical Turk, Anthropocene, barriers to entry, basic income, benefit corporation, Big Tech, bitcoin, blockchain, Build a better mousetrap, Burning Man, business logic, capital controls, circular economy, citizen journalism, collaborative economy, collaborative editing, collective bargaining, commoditize, commons-based peer production, conceptual framework, content marketing, crowdsourcing, cryptocurrency, data science, Debian, decentralized internet, deskilling, disintermediation, distributed ledger, driverless car, emotional labour, end-to-end encryption, Ethereum, ethereum blockchain, food desert, future of work, gig economy, Google bus, hiring and firing, holacracy, income inequality, independent contractor, information asymmetry, Internet of things, Jacob Appelbaum, Jeff Bezos, job automation, Julian Assange, Kickstarter, lake wobegon effect, low skilled workers, Lyft, Mark Zuckerberg, means of production, minimum viable product, moral hazard, Network effects, new economy, offshore financial centre, openstreetmap, peer-to-peer, planned obsolescence, post-work, profit maximization, race to the bottom, radical decentralization, remunicipalization, ride hailing / ride sharing, Rochdale Principles, SETI@home, shareholder value, sharing economy, Shoshana Zuboff, Silicon Valley, smart cities, smart contracts, Snapchat, surveillance capitalism, TaskRabbit, technological solutionism, technoutopianism, transaction costs, Travis Kalanick, Tyler Cowen, Uber for X, uber lyft, union organizing, universal basic income, Vitalik Buterin, W. E. B. Du Bois, Whole Earth Catalog, WikiLeaks, women in the workforce, workplace surveillance, Yochai Benkler, Zipcar

He is also helping build seed.coop, a platform for co-ops everywhere to grow their membership. Say hi on Twitter @daspitzberg. Arun Sundararajan is Professor and the Robert L. and Dale Atkins Rosen Faculty Fellow at New York University’s Leonard N. Stern School of Business. He is also an affiliated faculty member at NYU’s Center for Urban Science+Progress, and at NYU’s Center for Data Science. Astra Taylor is a documentary filmmaker, writer, and political organizer. She is the director of the films Zizek! and Examined Life, and the author of The People’s Platform: Taking Back Power and Culture in the Digital Age (Picador, 2015), winner of a 2015 American Book Award. She helped launch the Rolling Jubilee debt-abolishing campaign and is a co-founder of the Debt Collective.


pages: 220 words: 66,518

The Biology of Belief: Unleashing the Power of Consciousness, Matter & Miracles by Bruce H. Lipton

Albert Einstein, Benoit Mandelbrot, Boeing 747, correlation does not imply causation, data science, discovery of DNA, double helix, Drosophila, epigenetics, Isaac Newton, Mahatma Gandhi, mandelbrot fractal, Mars Rover, nocebo, On the Revolutions of the Heavenly Spheres, phenotype, placebo effect, randomized controlled trial, selective serotonin reuptake inhibitor (SSRI), stem cell, sugar pill

Hallett, M. (2000). “Transcranial magnetic stimulation and the human brain.” Nature 406: 147-150. Helmuth, L. (2001). “Boosting Brain Activity From The Outside In.” Science 292: 1284-1286. Jansen, R., H. Yu, et al. (2003). “A Bayesian Networks Approach for Predicting Protein-Protein Interactions from Genomic Data.” Science 302: 449-453. Jin, M., M. Blank, et al. (2000). “ERK1/2 Phosphorylation, Induced by Electromagnetic Fields, Diminishes During Neoplastic Transformation.” Journal of Cell Biology 78: 371-379. Kübler-Ross, Elizabeth (1997) On Death and Dying, New York, Scribner. Li, S., C. M. Armstrong, et al. (2004).


pages: 681 words: 64,159

Numpy Beginner's Guide - Third Edition by Ivan Idris

algorithmic trading, business intelligence, Conway's Game of Life, correlation coefficient, data science, Debian, discrete time, en.wikipedia.org, functional programming, general-purpose programming language, Khan Academy, p-value, random walk, reversible computing, time value of money

A step-by-step tutorial that will help users solve research-based problems from various areas of science using Scipy. IPython Interactive Computing and Visualization Cookbook ISBN: 978-1-78328-481-8 Paperback: 512 pages Over 100 hands-on recipes to sharpen your skills in high-performance numerical computing and data science with Python 1. Find out how to improve your code to write high-quality, readable, and well-tested programs with IPython. 2. Master all of the new features of the IPython Notebook, including interactive HTML/JavaScript widgets. 3. Analyze data effectively using Bayesian and Frequentist data models with Pandas, PyMC, and R.


pages: 256 words: 67,563

Explaining Humans: What Science Can Teach Us About Life, Love and Relationships by Camilla Pang

autism spectrum disorder, backpropagation, bioinformatics, Brownian motion, correlation does not imply causation, data science, deep learning, driverless car, frictionless, job automation, John Nash: game theory, John von Neumann, Kickstarter, Nash equilibrium, neurotypical, phenotype, random walk, self-driving car, stem cell, Stephen Hawking

But I also know, deep down, that it’s love that makes us feel alive, even when it’s inconvenient, painful and hard to bear. The mathematician in me is also a romantic. She believes there are ways we can use statistics, probability and machine-learning techniques to improve our search for love and harmony with the people we care about. And if you’re sceptical about the role of data science in your love life, then I would ask if you’ve ever used Tinder, Bumble or any other dating app. Because the truth is that many of us have been sharing a bed with AI for some time. Relationships may be far from a science, but there are many ways in which science can help us to manage them better.


pages: 210 words: 65,833

This Is Not Normal: The Collapse of Liberal Britain by William Davies

Airbnb, basic income, Bernie Sanders, Big bang: deregulation of the City of London, Black Lives Matter, Boris Johnson, Cambridge Analytica, central bank independence, centre right, Chelsea Manning, coronavirus, corporate governance, COVID-19, credit crunch, data science, deindustrialization, disinformation, Dominic Cummings, Donald Trump, double entry bookkeeping, Edward Snowden, fake news, family office, Filter Bubble, Francis Fukuyama: the end of history, ghettoisation, gig economy, global pandemic, global village, illegal immigration, Internet of things, Jeremy Corbyn, late capitalism, Leo Hollis, liberal capitalism, loadsamoney, London Interbank Offered Rate, mass immigration, moral hazard, Neil Kinnock, Northern Rock, old-boy network, post-truth, postnationalism / post nation state, precariat, prediction markets, quantitative easing, recommendation engine, Robert Mercer, Ronald Reagan, sentiment analysis, sharing economy, Silicon Valley, Slavoj Žižek, statistical model, Steve Bannon, Steven Pinker, surveillance capitalism, technoutopianism, The Chicago School, Thorstein Veblen, transaction costs, universal basic income, W. E. B. Du Bois, web of trust, WikiLeaks, Yochai Benkler

This transformation in our recording equipment is responsible for much of the outrage directed at those formerly tasked with describing the world. The rise of blanket surveillance technologies has paradoxical effects, raising expectations for objective knowledge to unrealistic levels, and then provoking fury when those in the public eye do not meet them. On the one hand, data science appears to make the question of objective truth easier to settle. The slow and imperfect institutions of social science and journalism can be circumvented, and we can get directly to reality itself, unpolluted by human bias. Surely, in this age of mass data capture, the truth will become undeniable.


pages: 231 words: 64,734

Safe Haven: Investing for Financial Storms by Mark Spitznagel

Albert Einstein, Antoine Gombaud: Chevalier de Méré, asset allocation, behavioural economics, bitcoin, Black Swan, blockchain, book value, Brownian motion, Buckminster Fuller, cognitive dissonance, commodity trading advisor, cryptocurrency, Daniel Kahneman / Amos Tversky, data science, delayed gratification, diversification, diversified portfolio, Edward Thorp, fiat currency, financial engineering, Fractional reserve banking, global macro, Henri Poincaré, hindsight bias, Long Term Capital Management, Mark Spitznagel, Paul Samuelson, phenotype, probability theory / Blaise Pascal / Pierre de Fermat, quantitative trading / quantitative finance, random walk, rent-seeking, Richard Feynman, risk free rate, risk-adjusted returns, Schrödinger's Cat, Sharpe ratio, spice trade, Steve Jobs, tail risk, the scientific method, transaction costs, value at risk, yield curve, zero-sum game

A reliable indication that this fallacy is at work is when bold forward‐looking statements about a particular risk‐mitigation strategy are uttered by someone who has never actually done it, in real time (as in during the Sunday game, as opposed to on the following Monday from their armchair). It is a kissing cousin to datamining, or overfitting, surely the most well‐trodden pitfall in the data sciences. There will always be flawless, successful strategies to be gleaned from past data, from randomness alone. There is a deceptive narrative basis to such strategies that always sound so reasonable and plausible; they have charm and seductive powers. So it's a pretty easy sale, really. Sad to say, but heuristic storytelling plays a huge role in what risk mitigation has become.


pages: 247 words: 69,593

The Creative Curve: How to Develop the Right Idea, at the Right Time by Allen Gannett

Alfred Russel Wallace, collective bargaining, content marketing, data science, David Brooks, deliberate practice, Desert Island Discs, Elon Musk, en.wikipedia.org, gentrification, glass ceiling, iterative process, lone genius, longitudinal study, Lyft, Mark Zuckerberg, McMansion, pattern recognition, profit motive, randomized controlled trial, recommendation engine, Richard Florida, ride hailing / ride sharing, Salesforce, Saturday Night Live, sentiment analysis, Silicon Valley, Silicon Valley startup, Skype, Snapchat, South of Market, San Francisco, Steve Jobs, TED Talk, too big to fail, uber lyft, work culture

Unfortunately, he hated anthropology: Kurt Vonnegut, A Man Without a Country (New York: Seven Stories Press, 2005). brought together a team of academic superheroes: The study produced by his team of academic superheroes was Reagan et al., “The Emotional Arcs of Stories Are Dominated by Six Basic Shapes,” EPJ Data Science, November 4, 2016, https://epjdatascience.springeropen.com/articles/10.1140/epjds/s13688-016-0093-1. Kenya Barris: Details relating to Black-ish and Barris’s involvement and background drawn mostly from my interviews with him. Researcher Gregory Berns: Details relating to dopamine drawn from my interviews with Berns.


Designing the Mind: The Principles of Psychitecture by Designing the Mind, Ryan A Bush

Abraham Maslow, adjacent possible, Albert Einstein, algorithmic bias, augmented reality, butterfly effect, carbon footprint, cognitive bias, cognitive load, correlation does not imply causation, data science, delayed gratification, deliberate practice, drug harm reduction, effective altruism, Elon Musk, en.wikipedia.org, endowment effect, fundamental attribution error, hedonic treadmill, hindsight bias, impulse control, Kevin Kelly, Lao Tzu, lifelogging, longitudinal study, loss aversion, meta-analysis, Own Your Own Home, pattern recognition, price anchoring, randomized controlled trial, Silicon Valley, Stanford marshmallow experiment, Steven Pinker, systems thinking, Walter Mischel

You could engage in gratitude to up-regulate your desire for all the great things you have, even if this particular job isn’t one of them. You also might down-regulate the specific desire causing your suffering by reminding yourself of the hour-and-a-half-long commute, or that movie rental is probably not a great industry to build a career in right now, or that you have a master’s in data science. Honestly, I have no idea what you saw in that job in the first place, Sarah. Once you learn and strengthen your ability to use these tactics, you will be able to adjust your desires at will, largely eliminating the tendency to suffer over ungratified longings. The Counteraction of Desire Greed and aversion surface in the form of thoughts, and thus can be eroded by a process of ‘thought substitution,’ by replacing them with the thoughts opposed to them


pages: 227 words: 63,186

An Elegant Puzzle: Systems of Engineering Management by Will Larson

Ben Horowitz, Cass Sunstein, Clayton Christensen, data science, DevOps, en.wikipedia.org, fault tolerance, functional programming, Google Earth, hive mind, Innovator's Dilemma, iterative process, Kanban, Kickstarter, Kubernetes, loose coupling, microservices, MITM: man-in-the-middle, no silver bullet, pull request, Richard Thaler, seminal paper, Sheryl Sandberg, Silicon Valley, statistical model, systems thinking, the long tail, web application

I’ve found that agreeing on the expected skills for a given role can be far harder than anyone anticipates, and it can require spending significant time with your interviewers to agree on what the role requires. (This is often in the context of what extent and kind of programming experience is needed in engineering management, DevOps, and data science roles.) 6.2.3 Finding signal After you’ve broken the role down into a certain set of skills and requirements, the next step is to break your interview loop into a series of interview slots that together cover all of those signals. Typically, each skill is covered by two different interviewers to create some redundancy in signal detection, in case one of the interviews doesn’t go cleanly.


pages: 229 words: 72,431

Shadow Work: The Unpaid, Unseen Jobs That Fill Your Day by Craig Lambert

airline deregulation, Asperger Syndrome, banking crisis, Barry Marshall: ulcers, big-box store, business cycle, carbon footprint, cashless society, Clayton Christensen, cognitive dissonance, collective bargaining, Community Supported Agriculture, corporate governance, crowdsourcing, data science, disintermediation, disruptive innovation, emotional labour, fake it until you make it, financial independence, Galaxy Zoo, ghettoisation, gig economy, global village, helicopter parent, IKEA effect, industrial robot, informal economy, Jeff Bezos, job automation, John Maynard Keynes: Economic Possibilities for our Grandchildren, Mark Zuckerberg, new economy, off-the-grid, pattern recognition, plutocrats, pneumatic tube, recommendation engine, Schrödinger's Cat, Silicon Valley, single-payer health, statistical model, the strength of weak ties, The Theory of the Leisure Class by Thorstein Veblen, Thorstein Veblen, Turing test, unpaid internship, Vanguard fund, Vilfredo Pareto, you are the product, zero-sum game, Zipcar

In 1997, the College Board introduced an advanced placement examination in statistics. The number of high school students taking it tripled in the decade after 2001, to 149,165 by 2012. American universities conferred close to 3,000 bachelor’s degrees in statistics in the 2010–11 academic year, a 68 percent increase from four years before. A Harvard data science course drew 400 students in 2013, not only undergraduates but those from graduate schools of law, business, government, design, and medicine. At the University of California, Berkeley, the number of statistics majors quintupled from 50 in 2003 to 250 a decade later. Fields of study that strongly attract students like this are an index of what we value, and where society is headed.


Chasing My Cure: A Doctor's Race to Turn Hope Into Action; A Memoir by David Fajgenbaum

Atul Gawande, Barry Marshall: ulcers, crowdsourcing, data science, Easter island, friendly fire, medical residency, personalized medicine, phenotype, placebo effect, randomized controlled trial, Saturday Night Live, Silicon Valley, the scientific method

But actually, it’s those patients, who donate samples, data, and funds, that are the only hope the CDCN has for developing a cure. We also partner with tech and pharmaceutical companies on large-scale studies whenever possible. One tech company, Medidata, is contributing machine learning and data science tools to help us generate clinically meaningful insights from the half a million data points in the proteomics study. Though they are often demonized because of a few notable bad actors, pharmaceutical companies have incredible power to do good through contributing funds, data, and samples for research.


The Jobs to Be Done Playbook: Align Your Markets, Organization, and Strategy Around Customer Needs by Jim Kalbach

Airbnb, Atul Gawande, Build a better mousetrap, Checklist Manifesto, Clayton Christensen, commoditize, data science, Dean Kamen, fail fast, Google Glasses, job automation, Kanban, Kickstarter, knowledge worker, Lean Startup, market design, minimum viable product, prediction markets, Quicken Loans, Salesforce, shareholder value, Skype, software as a service, Steve Jobs, subscription business, Zipcar

Thanks to JTBD, our team was able to focus on solving the biggest opportunities of the customer experience on carmax.com, thus making a meaningful impact to the product. Jake Mitchell is a Principal Product Designer at CarMax, where he strives to reinvent the way customers find and fall in love with their next car. In addition to user experience design and research, Jake is proficient in web development and data science. This case study is a summary of his presentation “Using Jobs to Be Done at CarMax to Guide Product Innovation,” given at UX STRAT 2017 in Boulder, Colorado. Recap JTBD not only helps you understand the customer’s problem, but it also guides solution development. In particular, you can leverage JTBD in several ways to tie the design of products and services back to the individual’s job to be done.


pages: 211 words: 78,547

How Elites Ate the Social Justice Movement by Fredrik Deboer

2021 United States Capitol attack, Affordable Care Act / Obamacare, anti-communist, Bernie Sanders, BIPOC, Black Lives Matter, Capital in the Twenty-First Century by Thomas Piketty, centre right, collective bargaining, coronavirus, COVID-19, data science, David Brooks, defund the police, deindustrialization, delayed gratification, Donald Trump, Edward Snowden, effective altruism, false flag, Ferguson, Missouri, George Floyd, global pandemic, helicopter parent, income inequality, lockdown, obamacare, Occupy movement, open immigration, post-materialism, profit motive, QAnon, Silicon Valley, single-payer health, social distancing, TikTok, upwardly mobile, W. E. B. Du Bois, We are the 99%, working poor, zero-sum game

turnout for high school graduates: “Voting and Registration in the Election of 2020,” United States Census Bureau, April 2021, https://www.census.gov/data/tables/time-series/demo/voting-and-registration/p20-585.html. And they’re more likely: See, among many other studies: Jacob R. Brown, Ryan D. Enos, James Feigenbaum, and Soumyajit Mazumder, “Childhood Cross-Ethnic Exposure Predicts Political Behavior Seven Decades Later: Evidence from Linked Administrative Data,” Science Advances 7, no. 24 (June 2021), https://www.science.org/doi/10.1126/sciadv.abe8432. But recent high-quality research: Ralph Scott, “Does University Make You More Liberal? Estimating the Within-Individual Effects of Higher Education on Political Values,” Electoral Studies 77 (June 2022), 102471.


pages: 306 words: 82,765

Skin in the Game: Hidden Asymmetries in Daily Life by Nassim Nicholas Taleb

anti-fragile, availability heuristic, behavioural economics, Benoit Mandelbrot, Bernie Madoff, Black Swan, Brownian motion, Capital in the Twenty-First Century by Thomas Piketty, Cass Sunstein, cellular automata, Claude Shannon: information theory, cognitive dissonance, complexity theory, data science, David Graeber, disintermediation, Donald Trump, Edward Thorp, equity premium, fake news, financial independence, information asymmetry, invisible hand, knowledge economy, loss aversion, mandelbrot fractal, Mark Spitznagel, mental accounting, microbiome, mirror neurons, moral hazard, Murray Gell-Mann, offshore financial centre, p-value, Paradox of Choice, Paul Samuelson, Ponzi scheme, power law, precautionary principle, price mechanism, principal–agent problem, public intellectual, Ralph Nader, random walk, rent-seeking, Richard Feynman, Richard Thaler, Ronald Coase, Ronald Reagan, Rory Sutherland, Rupert Read, Silicon Valley, Social Justice Warrior, Steven Pinker, stochastic process, survivorship bias, systematic bias, tail risk, TED Talk, The Nature of the Firm, Tragedy of the Commons, transaction costs, urban planning, Yogi Berra

Just a little bit of significant data is needed when one is right, particularly when it is disconfirmatory empiricism, or counterexamples: only one data point (a single extreme deviation) is sufficient to show that Black Swans exist. Traders, when they make profits, have short communications; when they lose they drown you in details, theories, and charts. Probability, statistics, and data science are principally logic fed by observations—and absence of observations. For many environments, the relevant data points are those in the extremes; these are rare by definition, and it suffices to focus on those few but big to get an idea of the story. If you want to show that a person has more than, say $10 million, all you need is to show the $50 million in his brokerage account, not, in addition, list every piece of furniture in his house, including the $500 painting in his study and the silver spoons in the pantry.


Pearls of Functional Algorithm Design by Richard Bird

bioinformatics, data science, functional programming, Kickstarter, Menlo Park, sorting algorithm

Gries, D. (1979). The Schorr–Waite graph marking algorithm. Acta Informatica 11, 223–32. McCarthy, J. (1960). Recursive functions of symbolic expressions and their computation by machine. Communications of the ACM 3, 184. Mason, I. A. (1988). Verification of programs that destructively manipulate data. Science of Computer Programming 10 (2), 177–210. Möller, B. (1997). Calculating with pointer structures. IFIP TC2/WG2.1 Working Conference on Algorithmic Languages and Calculi. Chapman and Hall, pp. 24–48. Möller, B. (1999). Calculating with acyclic and cyclic lists. Information Sciences 119, 135–54.


pages: 472 words: 80,835

Life as a Passenger: How Driverless Cars Will Change the World by David Kerrigan

3D printing, Airbnb, airport security, Albert Einstein, autonomous vehicles, big-box store, Boeing 747, butterfly effect, call centre, car-free, Cesare Marchetti: Marchetti’s constant, Chris Urmson, commoditize, computer vision, congestion charging, connected car, DARPA: Urban Challenge, data science, deep learning, DeepMind, deskilling, disruptive innovation, Donald Shoup, driverless car, edge city, Elon Musk, en.wikipedia.org, fake news, Ford Model T, future of work, General Motors Futurama, hype cycle, invention of the wheel, Just-in-time delivery, Lewis Mumford, loss aversion, Lyft, Marchetti’s constant, Mars Rover, megacity, Menlo Park, Metcalfe’s law, Minecraft, Nash equilibrium, New Urbanism, QWERTY keyboard, Ralph Nader, RAND corporation, Ray Kurzweil, ride hailing / ride sharing, Rodney Brooks, Sam Peltzman, self-driving car, sensor fusion, Silicon Valley, Simon Kuznets, smart cities, Snapchat, Stanford marshmallow experiment, Steve Jobs, technological determinism, technoutopianism, TED Talk, the built environment, Thorstein Veblen, traffic fines, transit-oriented development, Travis Kalanick, trolley problem, Uber and Lyft, Uber for X, uber lyft, Unsafe at Any Speed, urban planning, urban sprawl, warehouse robotics, Yogi Berra, young professional, zero-sum game, Zipcar

_r=0 Chapter 5 - All Change http://www.wsj.com/articles/could-self-driving-cars-spell-the-end-of-ownership-1448986572 http://www.bloomberg.com/news/articles/2016-09-11/self-driving-cars-to-cut-u-s-insurance-premiums-40-aon-says http://www.wsj.com/articles/will-the-driverless-car-upend-insurance-1425428891 https://www.wsj.com/articles/driverless-cars-threaten-to-crash-insurers-earnings-1469542958 http://www.datakind.org/projects/creating-safer-streets-through-data-science/ https://twitter.com/BenedictEvans/status/721484633351696384?replies_view=true&cursor=ARAUhOE6Awo https://techcrunch.com/2015/10/30/ride-sharing-will-give-us-back-our-cities/ http://dupress.deloitte.com/dup-us-en/focus/future-of-mobility/roadmap-for-future-of-urban-mobility.html?id=us:2el:3pr:prwhatnext:eng:cons:091516 https://www.planning.org/planning/2015/may/autonomouscars.htm http://www.uspirg.org/news/usp/new-report-shows-mounting-evidence-millennials%E2%80%99-shift-away-driving http://www.slate.com/articles/business/the_juice/2014/07/driving_vs_flying_which_is_more_harmful_to_the_environment.html Chapter 6 - Challenges http://www.wsj.com/articles/driverless-cars-to-fuel-suburban-sprawl-1466395201 https://www.washingtonpost.com/news/energy-environment/wp/2016/06/23/save-the-driver-or-save-the-crowd-scientists-wonder-how-driverless-cars-will-choose/?


pages: 286 words: 87,401

Blitzscaling: The Lightning-Fast Path to Building Massively Valuable Companies by Reid Hoffman, Chris Yeh

"Susan Fowler" uber, activist fund / activist shareholder / activist investor, adjacent possible, Airbnb, Amazon Web Services, Andy Rubin, autonomous vehicles, Benchmark Capital, bitcoin, Blitzscaling, blockchain, Bob Noyce, business intelligence, Cambridge Analytica, Chuck Templeton: OpenTable:, cloud computing, CRISPR, crowdsourcing, cryptocurrency, Daniel Kahneman / Amos Tversky, data science, database schema, DeepMind, Didi Chuxing, discounted cash flows, Elon Musk, fake news, Firefox, Ford Model T, forensic accounting, fulfillment center, Future Shock, George Gilder, global pandemic, Google Hangouts, Google X / Alphabet X, Greyball, growth hacking, high-speed rail, hockey-stick growth, hydraulic fracturing, Hyperloop, initial coin offering, inventory management, Isaac Newton, Jeff Bezos, Joi Ito, Khan Academy, late fees, Lean Startup, Lyft, M-Pesa, Marc Andreessen, Marc Benioff, margin call, Mark Zuckerberg, Max Levchin, minimum viable product, move fast and break things, Network effects, Oculus Rift, oil shale / tar sands, PalmPilot, Paul Buchheit, Paul Graham, Peter Thiel, pre–internet, Quicken Loans, recommendation engine, ride hailing / ride sharing, Salesforce, Sam Altman, Sand Hill Road, Saturday Night Live, self-driving car, shareholder value, sharing economy, Sheryl Sandberg, Silicon Valley, Silicon Valley startup, Skype, smart grid, social graph, SoftBank, software as a service, software is eating the world, speech recognition, stem cell, Steve Jobs, subscription business, synthetic biology, Tesla Model S, thinkpad, three-martini lunch, transaction costs, transport as a service, Travis Kalanick, Uber for X, uber lyft, web application, winner-take-all economy, work culture , Y Combinator, yellow journalism

They may not have the pedigree, but they are great at learning new things and at charging hard to execute on them. Plus, the early business is in too much flux to effectively leverage the finely tuned capabilities of a true specialist. Even at the Tribe stage, hiring a specialist should be considered a major exception—for example, if you need an engineer with a very specialized area of expertise, such as data science or machine learning. The Village stage is where it becomes prudent to hire specialists, as both executives and key contributors. At the Tribe stage you want employees with skill sets flexible enough to pivot along with the company, but if you have hundreds of employees, you better have some pretty well-developed theories about your business and where it is going!


pages: 302 words: 84,881

The Digital Party: Political Organisation and Online Democracy by Paolo Gerbaudo

Airbnb, barriers to entry, basic income, Bernie Sanders, bitcoin, Californian Ideology, call centre, Cambridge Analytica, centre right, creative destruction, crowdsourcing, data science, digital capitalism, digital divide, digital rights, disintermediation, disruptive innovation, Donald Trump, Dunbar number, Edward Snowden, end-to-end encryption, Evgeny Morozov, feminist movement, gig economy, industrial robot, Jaron Lanier, Jeff Bezos, Jeremy Corbyn, jimmy wales, Joseph Schumpeter, Mark Zuckerberg, Network effects, Occupy movement, offshore financial centre, oil shock, post-industrial society, precariat, Ralph Waldo Emerson, Richard Florida, Richard Stallman, Ruby on Rails, self-driving car, Silicon Valley, Skype, Slavoj Žižek, smart cities, Snapchat, social web, software studies, Stewart Brand, technological solutionism, technoutopianism, the long tail, Thomas L Friedman, universal basic income, vertical integration, Vilfredo Pareto, WikiLeaks

Using the polling and rating mechanisms built into the architecture of social media and online platforms more generally, they engage their members/users in all forms of consultations, constantly charting their shifting opinions and with the ultimate aim of adapting to their evolving tendencies, in ways not too dissimilar from those practiced by digital companies and their data science teams.

Table 3.1 Similarities between platform companies and platform parties

                   Platform Companies    Platform Parties
Operational logic  Data gathering        Political data gathering
Membership         Free sign-up          Free membership
Value extraction   Free labour           Free political labour

Second, platform parties operate with a free registration model in which membership is disconnected from financial contribution.


pages: 239 words: 80,319

Lurking: How a Person Became a User by Joanne McNeil

"World Economic Forum" Davos, 4chan, A Declaration of the Independence of Cyberspace, Ada Lovelace, Adam Curtis, Airbnb, AltaVista, Amazon Mechanical Turk, Andy Rubin, benefit corporation, Big Tech, Black Lives Matter, Burning Man, Cambridge Analytica, Chelsea Manning, Chris Wanstrath, citation needed, cloud computing, context collapse, crowdsourcing, data science, deal flow, decentralized internet, delayed gratification, dematerialisation, disinformation, don't be evil, Donald Trump, drone strike, Edward Snowden, Elon Musk, eternal september, fake news, feminist movement, Firefox, gentrification, Google Earth, Google Glasses, Google Hangouts, green new deal, helicopter parent, holacracy, Internet Archive, invention of the telephone, Jeff Bezos, jimmy wales, John Perry Barlow, Jon Ronson, Julie Ann Horvath, Kim Stanley Robinson, l'esprit de l'escalier, Marc Andreessen, Mark Zuckerberg, Marshall McLuhan, Max Levchin, means of production, Menlo Park, Mondo 2000, moral panic, move fast and break things, Neal Stephenson, Network effects, packet switching, PageRank, pre–internet, profit motive, Project Xanadu, QAnon, real-name policy, recommendation engine, Salesforce, Saturday Night Live, Sheryl Sandberg, Shoshana Zuboff, Silicon Valley, slashdot, Snapchat, social graph, Social Justice Warrior, Stephen Hawking, Steve Jobs, Steven Levy, Stewart Brand, subscription business, surveillance capitalism, tech worker, techlash, technoutopianism, Ted Nelson, TED Talk, Tim Cook: Apple, trade route, Turing complete, Wayback Machine, We are the 99%, web application, white flight, Whole Earth Catalog, you are the product

A user searching “stages of pancreatic cancer” might not mean “(I am experiencing) stages of pancreatic cancer” but rather, “(my nephew is experiencing) stages of pancreatic cancer” or “(this fictional character in my screenplay is experiencing) stages of pancreatic cancer” or “(I am researching) stages of pancreatic cancer.” This is why I can only side-eye data science researchers who wish to declare one state is more queer than another or more gullible to conspiracy theories than the other, based on unreliable data like Google Trends. Who can say for certain why other people google what they do? A search engine is no truth serum. It is distilled curiosity, which has no borders and is, by definition, undefined.


pages: 268 words: 81,811

Flash Crash: A Trading Savant, a Global Manhunt, and the Most Mysterious Market Crash in History by Liam Vaughan

algorithmic trading, backtesting, bank run, barriers to entry, Bernie Madoff, Black Monday: stock market crash in 1987, Black Swan, Bob Geldof, centre right, collapse of Lehman Brothers, data science, Donald Trump, Elliott wave, eurozone crisis, family office, financial engineering, Flash crash, Great Grain Robbery, high net worth, High speed trading, information asymmetry, Jeff Bezos, Kickstarter, land bank, margin call, market design, market microstructure, Market Wizards by Jack D. Schwager, Navinder Sarao, Nick Leeson, offshore financial centre, pattern recognition, Ponzi scheme, proprietary trading, Ralph Nelson Elliott, Reminiscences of a Stock Operator, Ronald Reagan, selling pickaxes during a gold rush, sovereign wealth fund, spectrum auction, Stephen Hawking, the market place, Timothy McVeigh, Tobin tax, tulip mania, yield curve, zero-sum game

The first element involved statistically analyzing changes in the order book and elsewhere for information that indicated whether prices would rise or fall. Inputs might include the number and type of resting orders at different levels, how fast prices are moving around, and the types of market participants active at any time. “Think of it as a giant data science project,” explains one HFT owner. For years, Nav had used his superior pattern recognition and recall skills to read the ebbs and flows of the order book until it became second nature, but even the most gifted human scalper is no match for a computer at parsing large amounts of data. When it came to speed, the leading HFT firms invested hundreds of millions of dollars in computers, cable, and telecommunications equipment to ensure they could react first in what was often a winner-takes-all game.


The Buddha and the Badass: The Secret Spiritual Art of Succeeding at Work by Vishen Lakhiani

Abraham Maslow, Buckminster Fuller, Burning Man, call centre, Colonization of Mars, crowdsourcing, data science, deliberate practice, do what you love, Elon Musk, fail fast, fundamental attribution error, future of work, gamification, Google Glasses, Google X / Alphabet X, iterative process, Jeff Bezos, meta-analysis, microbiome, performance metric, Peter Thiel, profit motive, Ralph Waldo Emerson, Silicon Valley, Silicon Valley startup, skunkworks, Skype, social bookmarking, social contagion, solopreneur, Steve Jobs, Steven Levy, TED Talk, web application, white picket fence, work culture

Don’t underestimate how much you matter or assume your managers won’t have time for your concern or question. And NEVER ever take on the disempowering beliefs of someone else. In fact, when you hear such a thing, correct them. Simply ask a question like: “Have you validated that belief with hard data science and study? Or is that a personal opinion clouded by Fundamental Attribution Error and one’s own childhood insecurities projecting a character trait onto someone else?” You get the idea ;-) The simple rule to live by is this: “If the belief makes me feel disempowered, unless it’s backed by empirical scientific data, and not just on someone’s opinion, I’m going to choose to ignore it and do what will empower me instead.”


pages: 297 words: 84,447

The Star Builders: Nuclear Fusion and the Race to Power the Planet by Arthur Turrell

Albert Einstein, Arthur Eddington, autonomous vehicles, Boeing 747, Boris Johnson, carbon tax, coronavirus, COVID-19, data science, decarbonisation, deep learning, Donald Trump, Eddington experiment, energy security, energy transition, Ernest Rutherford, Extinction Rebellion, green new deal, Greta Thunberg, Higgs boson, Intergovernmental Panel on Climate Change (IPCC), ITER tokamak, Jeff Bezos, Kickstarter, Large Hadron Collider, lockdown, New Journalism, nuclear winter, Peter Thiel, planetary scale, precautionary principle, Project Plowshare, Silicon Valley, social distancing, sovereign wealth fund, statistical model, Stephen Hawking, Steve Bannon, TED Talk, The Rise and Fall of American Growth, Tunguska event

PHOTOGRAPH: KAREN HATCH ARTHUR TURRELL has a PhD in plasma physics from Imperial College London and won the Rutherford Prize for the Public Understanding of Plasma Physics. His research and writing have been featured in the Daily Mail, The Guardian, the International Business Times, Gizmodo, and other publications. He works as a deputy director at the Data Science Campus of the Office for National Statistics in the UK.


Know Thyself by Stephen M Fleming

Abraham Wald, Alan Turing: On Computable Numbers, with an Application to the Entscheidungsproblem, AlphaGo, autism spectrum disorder, autonomous vehicles, availability heuristic, backpropagation, citation needed, computer vision, confounding variable, data science, deep learning, DeepMind, Demis Hassabis, Douglas Hofstadter, Dunning–Kruger effect, Elon Musk, Estimating the Reproducibility of Psychological Science, fake news, global pandemic, higher-order functions, index card, Jeff Bezos, l'esprit de l'escalier, Lao Tzu, lifelogging, longitudinal study, meta-analysis, mutually assured destruction, Network effects, patient HM, Pierre-Simon Laplace, power law, prediction markets, QWERTY keyboard, recommendation engine, replication crisis, self-driving car, side project, Skype, Stanislav Petrov, statistical model, theory of mind, Thomas Bayes, traumatic brain injury

A scientist might wonder whether they should spend more time learning new analysis tools or a new theory, and whether the benefits of doing so outweigh the time they could be spending on research. This kind of dilemma is now even more acute thanks to the rise of online courses providing high-quality material on topics ranging from data science to Descartes. One influential theory of the role played by metacognition in choosing what to learn is known as the discrepancy reduction theory. It suggests that people begin studying new material by selecting a target level of learning and keep studying until their assessment of how much they know matches their target.


pages: 282 words: 85,658

Ask Your Developer: How to Harness the Power of Software Developers and Win in the 21st Century by Jeff Lawson

Airbnb, AltaVista, Amazon Web Services, barriers to entry, big data - Walmart - Pop Tarts, Big Tech, big-box store, bitcoin, business process, call centre, Chuck Templeton: OpenTable:, cloud computing, coronavirus, COVID-19, create, read, update, delete, cryptocurrency, data science, David Heinemeier Hansson, deep learning, DevOps, Elon Musk, financial independence, global pandemic, global supply chain, Hacker News, Internet of things, Jeff Bezos, Kanban, Lean Startup, loose coupling, Lyft, Marc Andreessen, Marc Benioff, Mark Zuckerberg, microservices, minimum viable product, Mitch Kapor, move fast and break things, Paul Graham, peer-to-peer, ride hailing / ride sharing, risk tolerance, Ruby on Rails, Salesforce, side project, Silicon Valley, Silicon Valley startup, Skype, social distancing, software as a service, software is eating the world, sorting algorithm, Startup school, Steve Ballmer, Steve Jobs, Telecommunications Act of 1996, Toyota Production System, transaction costs, transfer pricing, two-pizza team, Uber and Lyft, uber lyft, ubercab, web application, Y Combinator

The idea was born of necessity—like most nonprofits, Thorn didn’t have deep pockets. Hackathons became part of Thorn’s R&D lab. “We just introduce four or five problems, and say, ‘Okay, go fix it. Solve the problem,’” Kutcher says. Thorn continues to use hackathons to expand upon its work, and has also built out its own dedicated engineering and data science team, 100 percent dedicated to advanced tools to end online child sexual abuse. More interesting, from my perspective, is what this reveals about developers themselves. The people who participate in these hackathons often work for companies that treat them like code monkeys. Thorn invites them to spend a weekend trying to solve an important and difficult tech problem—how to wipe out child sex trafficking—and gives them complete freedom.


pages: 366 words: 94,209

Throwing Rocks at the Google Bus: How Growth Became the Enemy of Prosperity by Douglas Rushkoff

activist fund / activist shareholder / activist investor, Airbnb, Alan Greenspan, algorithmic trading, Amazon Mechanical Turk, Andrew Keen, bank run, banking crisis, barriers to entry, benefit corporation, bitcoin, blockchain, Burning Man, business process, buy and hold, buy low sell high, California gold rush, Capital in the Twenty-First Century by Thomas Piketty, carbon footprint, centralized clearinghouse, citizen journalism, clean water, cloud computing, collaborative economy, collective bargaining, colonial exploitation, Community Supported Agriculture, corporate personhood, corporate raider, creative destruction, crowdsourcing, cryptocurrency, data science, deep learning, disintermediation, diversified portfolio, Dutch auction, Elon Musk, Erik Brynjolfsson, Ethereum, ethereum blockchain, fiat currency, Firefox, Flash crash, full employment, future of work, gamification, Garrett Hardin, gentrification, gig economy, Gini coefficient, global supply chain, global village, Google bus, Howard Rheingold, IBM and the Holocaust, impulse control, income inequality, independent contractor, index fund, iterative process, Jaron Lanier, Jeff Bezos, jimmy wales, job automation, Joseph Schumpeter, Kickstarter, Large Hadron Collider, loss aversion, low interest rates, Lyft, Marc Andreessen, Mark Zuckerberg, market bubble, market fundamentalism, Marshall McLuhan, means of production, medical bankruptcy, minimum viable product, Mitch Kapor, Naomi Klein, Network effects, new economy, Norbert Wiener, Oculus Rift, passive investing, payday loans, peer-to-peer lending, Peter Thiel, post-industrial society, power law, profit motive, quantitative easing, race to the bottom, recommendation engine, reserve currency, RFID, Richard Stallman, ride hailing / ride sharing, Ronald Reagan, Russell Brand, Satoshi Nakamoto, Second Machine Age, shareholder value, sharing economy, Silicon Valley, Snapchat, social graph, software patent, Steve Jobs, stock buybacks, TaskRabbit, the Cathedral and the Bazaar, The Future of Employment, the long tail, trade route, Tragedy of the Commons, transportation-network company, Turing test, Uber and Lyft, Uber for X, uber lyft, unpaid internship, Vitalik Buterin, warehouse robotics, Wayback Machine, Y Combinator, young professional, zero-sum game, Zipcar

The big rub is that invention of genuinely new products, of game changers, never comes from refining our analysis of existing consumer trends but from stoking the human ingenuity of our innovators. Without an internal source of innovation, a company loses any competitive advantage over its peers. It is only as good as the data science firm it has hired—which may be the very same one that its competitors are using. In any event, everyone’s buying data from the same brokers and using essentially the same analytics techniques. The only long-term winners in this scheme are the big data firms themselves. Paranoia just feeds the system.


pages: 285 words: 86,853

What Algorithms Want: Imagination in the Age of Computing by Ed Finn

Airbnb, Albert Einstein, algorithmic bias, algorithmic management, algorithmic trading, AlphaGo, Amazon Mechanical Turk, Amazon Web Services, bitcoin, blockchain, business logic, Charles Babbage, Chuck Templeton: OpenTable:, Claude Shannon: information theory, commoditize, Computing Machinery and Intelligence, Credit Default Swap, crowdsourcing, cryptocurrency, data science, DeepMind, disruptive innovation, Donald Knuth, Donald Shoup, Douglas Engelbart, Elon Musk, Evgeny Morozov, factory automation, fiat currency, Filter Bubble, Flash crash, game design, gamification, Google Glasses, Google X / Alphabet X, Hacker Conference 1984, High speed trading, hiring and firing, Ian Bogost, industrial research laboratory, invisible hand, Isaac Newton, iterative process, Jaron Lanier, Jeff Bezos, job automation, John Conway, John Markoff, Just-in-time delivery, Kickstarter, Kiva Systems, late fees, lifelogging, Loebner Prize, lolcat, Lyft, machine readable, Mother of all demos, Nate Silver, natural language processing, Neal Stephenson, Netflix Prize, new economy, Nicholas Carr, Nick Bostrom, Norbert Wiener, PageRank, peer-to-peer, Peter Thiel, power law, Ray Kurzweil, recommendation engine, Republic of Letters, ride hailing / ride sharing, Satoshi Nakamoto, self-driving car, sharing economy, Silicon Valley, Silicon Valley billionaire, Silicon Valley ideology, Silicon Valley startup, SimCity, Skinner box, Snow Crash, social graph, software studies, speech recognition, statistical model, Steve Jobs, Steven Levy, Stewart Brand, supply-chain management, tacit knowledge, TaskRabbit, technological singularity, technological solutionism, technoutopianism, the Cathedral and the Bazaar, The Coming Technological Singularity, the scientific method, The Signal and the Noise by Nate Silver, The Structural Transformation of the Public Sphere, The Wealth of Nations by Adam Smith, transaction costs, traveling salesman, Turing machine, Turing test, Uber and Lyft, Uber for X, uber lyft, urban planning, Vannevar Bush, Vernor Vinge, wage slave

Unless, of course, something goes wrong and the computer has been possessed by some malicious enemy (like the nanites in the episode “Evolution”).29 Like other elements of the diegetic background of the show, the Enterprise’s talking computer was meant to be unremarkable and efficient.30 The conversational computer of Star Trek had its limits, comically misunderstanding requests and occasionally inspiring the kind of stilted “keywordese” many of us now use with voice-driven algorithmic systems. At its peak, it served as a kind of natural language interface for data science, seeking patterns in various kinds of information and presenting analysis.31 Most important, it presented a simple ideal of frictionless vocal interaction: what Google appears to mean by the Star Trek computer, and what LCARS does simply and effectively for the show’s plotting, is respond usefully to verbal commands and queries.


pages: 313 words: 92,053

Places of the Heart: The Psychogeography of Everyday Life by Colin Ellard

Apollo 11, augmented reality, Benoit Mandelbrot, Berlin Wall, Broken windows theory, Buckminster Fuller, carbon footprint, classic study, cognitive load, commoditize, crowdsourcing, data science, Dunbar number, Frank Gehry, gentrification, Google Glasses, Guggenheim Bilbao, haute couture, Howard Rheingold, Internet of things, Jaron Lanier, Lewis Mumford, mandelbrot fractal, Marshall McLuhan, Masdar, mass immigration, megastructure, mirror neurons, Mondo 2000, more computing power than Apollo, Oculus Rift, overview effect, Peter Eisenman, RFID, Richard Florida, risk tolerance, sentiment analysis, Skinner box, smart cities, starchitect, TED Talk, the built environment, theory of mind, time dilation, urban decay, urban planning, urban sprawl, Victor Gruen

Theoretically, such data could constitute an extremely useful tool for the democratization of city design. Access to this new form of information, critical as it is to understanding how places work, should not only be easily available to everyone, but the basic tools for understanding how it can be used and what it can tell us should be available for all. Data science should be taught in our schools. Discourse in how cities work couched in visualizations built from big data is becoming so important that the basics should be included in the public educational curriculum, just as civics has been now for generations. And, as architectural theorist and historian Sarah Goldhagen has argued, so should architectural history and design.


pages: 340 words: 94,464

Randomistas: How Radical Researchers Changed Our World by Andrew Leigh

Albert Einstein, Amazon Mechanical Turk, Anton Chekhov, Atul Gawande, basic income, behavioural economics, Black Swan, correlation does not imply causation, crowdsourcing, data science, David Brooks, Donald Trump, ending welfare as we know it, Estimating the Reproducibility of Psychological Science, experimental economics, Flynn Effect, germ theory of disease, Ignaz Semmelweis: hand washing, Indoor air pollution, Isaac Newton, It's morning again in America, Kickstarter, longitudinal study, loss aversion, Lyft, Marshall McLuhan, meta-analysis, microcredit, Netflix Prize, nudge unit, offshore financial centre, p-value, Paradox of Choice, placebo effect, price mechanism, publication bias, RAND corporation, randomized controlled trial, recommendation engine, Richard Feynman, ride hailing / ride sharing, Robert Metcalfe, Ronald Reagan, Sheryl Sandberg, statistical model, Steven Pinker, sugar pill, TED Talk, uber lyft, universal basic income, War on Poverty

Randomised testing of email subject headers found that a fundraising appeal titled ‘Do this for Michelle’ raised about $700,000, while ‘I will be outspent’ raised $2.6 million.47 Given that politics is a zero-sum contest, it’s likely that many of the insights on political fundraising aren’t yet public. But there is some sharing of ideas among ideological bedfellows. For example, Dan Wagner, who led Obama’s 2012 data science team, went on to found Civis Analytics, which offers analysis to progressives, including Justin Trudeau’s successful 2015 campaign for the Canadian prime ministership. * On 2 February 2001, a public meeting was held in the West African village of Tissierou by supporters of presidential candidate Sacca Lafia.48 Villagers were informed that Lafia was the first candidate from that region since 1960.


pages: 305 words: 93,091

The Art of Invisibility: The World's Most Famous Hacker Teaches You How to Be Safe in the Age of Big Brother and Big Data by Kevin Mitnick, Mikko Hypponen, Robert Vamosi

4chan, big-box store, bitcoin, Bletchley Park, blockchain, connected car, crowdsourcing, data science, Edward Snowden, en.wikipedia.org, end-to-end encryption, evil maid attack, Firefox, Google Chrome, Google Earth, incognito mode, information security, Internet of things, Kickstarter, Laura Poitras, license plate recognition, Mark Zuckerberg, MITM: man-in-the-middle, off-the-grid, operational security, pattern recognition, ransomware, Ross Ulbricht, Salesforce, self-driving car, Silicon Valley, Skype, Snapchat, speech recognition, Tesla Model S, web application, WikiLeaks, zero day, Zimmermann PGP

It’s easy to figure out the MAC address of authorized devices by using a penetration-test tool known as Wireshark. 10. https://www.pwnieexpress.com/blog/wps-cracking-with-reaver. 11. http://www.wired.com/2010/10/webcam-spy-settlement/. 12. http://www.telegraph.co.uk/technology/internet-security/11153381/How-hackers-took-over-my-computer.html. 13. https://www.blackhat.com/docs/us-16/materials/us-16-Seymour-Tully-Weaponizing-Data-Science-For-Social-Engineering-Automated-E2E-Spear-Phishing-On-Twitter.pdf. 14. http://www.wired.com/2010/01/operation-aurora/. 15. http://www.nytimes.com/2015/01/04/opinion/sunday/how-my-mom-got-hacked.html. 16. http://arstechnica.com/security/2013/10/youre-infected-if-you-want-to-see-your-data-again-pay-us-300-in-bitcoins/. 17. https://securityledger.com/2015/10/fbis-advice-on-cryptolocker-just-pay-the-ransom/.


pages: 293 words: 88,490

The End of Theory: Financial Crises, the Failure of Economics, and the Sweep of Human Interaction by Richard Bookstaber

asset allocation, bank run, Bear Stearns, behavioural economics, bitcoin, business cycle, butterfly effect, buy and hold, capital asset pricing model, cellular automata, collateralized debt obligation, conceptual framework, constrained optimization, Craig Reynolds: boids flock, credit crunch, Credit Default Swap, credit default swaps / collateralized debt obligations, dark matter, data science, disintermediation, Edward Lorenz: Chaos theory, epigenetics, feminist movement, financial engineering, financial innovation, fixed income, Flash crash, geopolitical risk, Henri Poincaré, impact investing, information asymmetry, invisible hand, Isaac Newton, John Conway, John Meriwether, John von Neumann, Joseph Schumpeter, Long Term Capital Management, margin call, market clearing, market microstructure, money market fund, Paul Samuelson, Pierre-Simon Laplace, Piper Alpha, Ponzi scheme, quantitative trading / quantitative finance, railway mania, Ralph Waldo Emerson, Richard Feynman, risk/return, Robert Solow, Saturday Night Live, self-driving car, seminal paper, sovereign wealth fund, the map is not the territory, The Predators' Ball, the scientific method, Thomas Kuhn: the structure of scientific revolutions, too big to fail, transaction costs, tulip mania, Turing machine, Turing test, yield curve

Journal of Financial Economics 59, no. 3: 383–411. doi: 10.1016/S0304-405X(00)00091-X. Helbing, Dirk, Illés Farkas, and Tamás Vicsek. 2000. “Simulating Dynamical Features of Escape Panic.” Nature 407: 487–90. doi: 10.1038/35035023. Helbing, Dirk, and Pratik Mukerji. 2012. “Crowd Disasters as Systemic Failures: Analysis of the Love Parade Disaster.” EPJ Data Science 1: 7. doi: 10.1140/epjds7. Hemelrijk, Charlotte K., and Hanno Hildenbrandt. 2012. “Schools of Fish and Flocks of Birds: Their Shape and Internal Structure by Self-Organization.” Interface Focus 8, no. 21: 726–37. doi: 10.1098/rsfs.2012.0025. Hobsbawm, Eric. 1999. Industry and Empire: The Birth of the Industrial Revolution.


pages: 340 words: 91,416

Lost in Math: How Beauty Leads Physics Astray by Sabine Hossenfelder

Adam Curtis, Albert Einstein, Albert Michelson, anthropic principle, Arthur Eddington, Brownian motion, clockwork universe, cognitive bias, cosmic microwave background, cosmological constant, cosmological principle, crowdsourcing, dark matter, data science, deep learning, double helix, game design, Henri Poincaré, Higgs boson, income inequality, Intergovernmental Panel on Climate Change (IPCC), Isaac Newton, Johannes Kepler, Large Hadron Collider, Murray Gell-Mann, Nick Bostrom, random walk, Richard Feynman, Schrödinger's Cat, Skype, Stephen Hawking, sunk-cost fallacy, systematic bias, TED Talk, the scientific method

William Daniel Phillips, who won the 1997 Nobel Prize in Physics together with Claude Cohen-Tannoudji and Steven Chu for laser cooling, a technique to slow down atoms. 19. Sparkes A et al. 2010. “Towards robot scientists for autonomous scientific discovery.” Automated Experimentation 2:1. 20. Schmidt M, Lipson H. 2009. “Distilling free-form natural laws from experimental data.” Science 324:81–85. 21. Krenn M, Malik M, Fickler R, Lapkiewicz R, Zeilinger A. 2016. “Automated search for new quantum experiments.” Phys Rev Lett. 116:090405. 22. Quoted in Ball P. 2016. “Focus: computer chooses quantum experiments.” Physics 9:25. 23. Powell E. 2011. “Discover interview: Anton Zeilinger dangled from windows, teleported photons, and taught the Dalai Lama.”


pages: 362 words: 87,462

Laziness Does Not Exist by Devon Price

Affordable Care Act / Obamacare, call centre, coronavirus, COVID-19, data science, demand response, Donald Trump, emotional labour, fake news, financial independence, Firefox, gamification, gig economy, Google Chrome, helicopter parent, impulse control, Jean Tirole, job automation, job satisfaction, Lyft, meta-analysis, Minecraft, New Journalism, off-the-grid, pattern recognition, prosperity theology / prosperity gospel / gospel of success, randomized controlled trial, remote working, Saturday Night Live, selection bias, side hustle, side project, Silicon Valley, social distancing, strikebreaker, TaskRabbit, TikTok, traumatic brain injury, uber lyft, working poor

Thank you to my mom and sister for constantly trying to teach me how to lighten up and enjoy life once in a while. I swear I have internalized at least some of it. Finally, thank you to my partner, chinchilla co-parent, and best buckaroo, Nick, for your patience, weirdness, creativity, and love. About the Author Dr. Devon Price is an Assistant Clinical Professor of Applied Psychology and Data Science at Loyola University Chicago’s School of Continuing & Professional Studies. Their research on intellectual humility and political open-mindedness has been published in The Journal of Experimental Social Psychology, Personality and Social Psychology Bulletin, and The Journal of Positive Psychology.


pages: 314 words: 88,524

American Marxism by Mark R. Levin

"RICO laws" OR "Racketeer Influenced and Corrupt Organizations", 2021 United States Capitol attack, affirmative action, American ideology, belling the cat, Bernie Sanders, Big Tech, BIPOC, Black Lives Matter, British Empire, carbon tax, centre right, clean water, collective bargaining, colonial exploitation, conceptual framework, coronavirus, COVID-19, creative destruction, critical race theory, crony capitalism, data science, defund the police, degrowth, deindustrialization, deplatforming, disinformation, Donald Trump, energy security, Food sovereignty, George Floyd, green new deal, Herbert Marcuse, high-speed rail, illegal immigration, income inequality, liberal capitalism, lockdown, Mark Zuckerberg, means of production, Michael Shellenberger, microaggression, New Journalism, open borders, Parler "social media", planned obsolescence, rolling blackouts, Ronald Reagan, school choice, school vouchers, single-payer health, tech billionaire, the market place, urban sprawl, yellow journalism

If man came into this century trailing clouds of transcendental glory, he was now accounted for in a way that would satisfy the positivists.”21 That is, by those intellectuals who reject eternal truths and experience through the ages for the social engineering by supposed experts and their administrative state—which claim to use data, science, and empiricism to analyze, manage, and control society. Weaver also referenced Charles Darwin and his theory of evolution, writing that “[b]iological necessity, issuing in the survival of the fittest, was offered as the causa causans [the primary cause of action], after the important question of human origin had been decided in favor of scientific materialism.


pages: 357 words: 94,852

No Is Not Enough: Resisting Trump’s Shock Politics and Winning the World We Need by Naomi Klein

"Hurricane Katrina" Superdome, "World Economic Forum" Davos, Airbnb, antiwork, basic income, battle of ideas, Berlin Wall, Bernie Sanders, Black Lives Matter, Brewster Kahle, carbon tax, Carl Icahn, Celebration, Florida, clean water, collective bargaining, Corrections Corporation of America, data science, desegregation, Donald Trump, drone strike, Edward Snowden, Elon Musk, end-to-end encryption, energy transition, extractivism, fake news, financial deregulation, gentrification, Global Witness, greed is good, green transition, high net worth, high-speed rail, Howard Zinn, illegal immigration, impact investing, income inequality, Internet Archive, Kickstarter, late capitalism, Mark Zuckerberg, market bubble, market fundamentalism, mass incarceration, megaproject, Mikhail Gorbachev, military-industrial complex, moral panic, Naomi Klein, Nate Silver, new economy, Occupy movement, ocean acidification, offshore financial centre, oil shale / tar sands, open borders, Paris climate accords, Patri Friedman, Peter Thiel, plutocrats, private military company, profit motive, race to the bottom, Ralph Nader, Ronald Reagan, Saturday Night Live, sexual politics, sharing economy, Silicon Valley, Steve Bannon, subprime mortgage crisis, tech billionaire, too big to fail, trade liberalization, transatlantic slave trade, Triangle Shirtwaist Factory, trickle-down economics, Upton Sinclair, urban decay, W. E. B. Du Bois, women in the workforce, working poor

, December 29, 2016, https://www.democracynow.org/2016/12/29/facing_possible_threats_under_trump_internet. “Data rescue” events Lisa Song and Zahra Hirji, “The Scramble to Protect Climate Data under Trump,” Inside Climate News, January 20, 2017, https://insideclimatenews.org/news/19012017/climate-change-data-science-denial-donald-trump. “Hackathon” at UC Berkeley: two hundred data defenders Megan Molteni, “Diehard Coders Just Rescued NASA’s Earth Science Data,” Wired, February 13, 2017, https://www.wired.com/2017/02/diehard-coders-just-saved-nasas-earth-science-data/. Jane Goodall: “a trumpet call” David Smith, “Jane Goodall Calls Trump’s Climate Change Agenda ‘Immensely Depressing,’” Guardian, March 29, 2017, https://www.theguardian.com/environment/2017/mar/28/jane-goodall-trump-climate-change.


pages: 369 words: 98,776

The God Species: Saving the Planet in the Age of Humans by Mark Lynas

Airbus A320, Anthropocene, back-to-the-land, Berlin Wall, biodiversity loss, carbon credits, carbon footprint, clean water, Climategate, Climatic Research Unit, data science, David Ricardo: comparative advantage, decarbonisation, degrowth, dematerialisation, demographic transition, Easter island, Eyjafjallajökull, Great Leap Forward, Haber-Bosch Process, ice-free Arctic, Intergovernmental Panel on Climate Change (IPCC), invention of the steam engine, James Watt: steam engine, megacity, meta-analysis, moral hazard, Negawatt, New Urbanism, ocean acidification, oil shale / tar sands, out of africa, peak oil, planetary scale, precautionary principle, quantitative easing, race to the bottom, rewilding, Ronald Reagan, special drawing rights, Stewart Brand, synthetic biology, Tragedy of the Commons, two and twenty, undersea cable, University of East Anglia, We are as Gods

., 2008: “Vulnerability of Permafrost Carbon to Climate Change: Implications for the Global Carbon Cycle,” Bioscience, 58, 8. 40. C. Tarnocai et al., 2009: “Soil Organic Carbon Pools in the Northern Circumpolar Permafrost Region,” Global Biogeochemical Cycles, 23, GB2023. 41. A. Bloom et al., 2010: “Large-Scale Controls of Methanogenesis Inferred from Methane and Gravity Spaceborne Data,” Science, 327, 5963, 322–5. 42. N. Shakhova et al., 2010: “Extensive Methane Venting to the Atmosphere from Sediments of the East Siberian Arctic Shelf,” Science, 327, 5970, 1246–50, and G. Westbrook et al., 2009: “Escape of Methane Gas from the Seabed along the West Spitsbergen Continental Margin,” Geophysical Research Letters, 36, L15608. 43. http://www.realclimate.org/index.php/archives/2010/03/arctic-methane-on-the-move/. 44.


pages: 416 words: 100,130

New Power: How Power Works in Our Hyperconnected World--And How to Make It Work for You by Jeremy Heimans, Henry Timms

"Susan Fowler" uber, "World Economic Forum" Davos, 3D printing, 4chan, Affordable Care Act / Obamacare, Airbnb, algorithmic management, augmented reality, autonomous vehicles, battle of ideas, benefit corporation, Benjamin Mako Hill, Big Tech, bitcoin, Black Lives Matter, blockchain, British Empire, Chris Wanstrath, Columbine, Corn Laws, crowdsourcing, data science, David Attenborough, death from overwork, Donald Trump, driverless car, Elon Musk, fake news, Ferguson, Missouri, future of work, game design, gig economy, hiring and firing, holacracy, hustle culture, IKEA effect, impact investing, income inequality, informal economy, job satisfaction, John Zimmer (Lyft cofounder), Jony Ive, Kevin Roose, Kibera, Kickstarter, Lean Startup, Lyft, Mark Zuckerberg, Minecraft, Network effects, new economy, Nicholas Carr, obamacare, Occupy movement, post-truth, profit motive, race to the bottom, radical decentralization, ride hailing / ride sharing, rolling blackouts, rolodex, Salesforce, Saturday Night Live, sharing economy, side hustle, Silicon Valley, six sigma, Snapchat, social web, subscription business, TaskRabbit, tech billionaire, TED Talk, the scientific method, transaction costs, Travis Kalanick, Uber and Lyft, uber lyft, upwardly mobile, web application, WikiLeaks, Yochai Benkler

Most people received a “social” message, which was the same as the “billboard” message but with one big difference. It showed the profile pictures of up to six randomly selected Facebook friends who had clicked the “I voted” button. Researchers from the University of California, San Diego, in collaboration with Facebook’s data-science team, then compared online actions with public records to get a sense of whether which message the user received (or did not receive) affected whether the person voted. They published their study in Nature. Their first stunning result was that the billboard group voted at the same rate as the control group.


pages: 398 words: 105,032

Soonish: Ten Emerging Technologies That'll Improve And/or Ruin Everything by Kelly Weinersmith, Zach Weinersmith

2013 Report for America's Infrastructure - American Society of Civil Engineers - 19 March 2013, 23andMe, 3D printing, Airbnb, Alvin Roth, Apollo 11, augmented reality, autonomous vehicles, connected car, CRISPR, data science, disinformation, double helix, Elon Musk, en.wikipedia.org, Google Glasses, hydraulic fracturing, industrial robot, information asymmetry, ITER tokamak, Kickstarter, low earth orbit, market design, megaproject, megastructure, microbiome, moral hazard, multiplanetary species, orbital mechanics / astrodynamics, personalized medicine, placebo effect, printed gun, Project Plowshare, QR code, Schrödinger's Cat, self-driving car, Skype, space junk, stem cell, synthetic biology, Tunguska event, Virgin Galactic

So to speak, if you have a computer model called “How Am I Doing?” a biomarker is anything you might input into that model that would help it find an answer. Just as the coming together of science and medical practice brought about modern medicine, the coming together of medical science with molecular analysis, data science, and machine learning may bring about a new paradigm, which is coming to be called precision medicine. In the future, you may get medical diagnoses that are determined quickly and correctly from thousands of biomarkers, followed by treatments that are tailored to you in particular. This means you will live longer, live healthier, and—if the detection systems get cheap and easy enough—you don’t spend nearly as much time wondering if that bump on your right butt cheek is cancer.


pages: 334 words: 104,382

Brotopia: Breaking Up the Boys' Club of Silicon Valley by Emily Chang

"Margaret Hamilton" Apollo, "Susan Fowler" uber, "World Economic Forum" Davos, 23andMe, 4chan, Ada Lovelace, affirmative action, Airbnb, Alan Greenspan, Andy Rubin, Apollo 11, Apple II, augmented reality, autism spectrum disorder, autonomous vehicles, barriers to entry, Benchmark Capital, Bernie Sanders, Big Tech, Burning Man, California gold rush, Chuck Templeton: OpenTable:, clean tech, company town, data science, David Brooks, deal flow, Donald Trump, Dr. Strangelove, driverless car, Elon Musk, emotional labour, equal pay for equal work, fail fast, Fairchild Semiconductor, fake news, Ferguson, Missouri, game design, gender pay gap, Google Glasses, Google X / Alphabet X, Grace Hopper, Hacker News, high net worth, Hyperloop, imposter syndrome, Jeff Bezos, job satisfaction, Khan Academy, Lyft, Marc Andreessen, Mark Zuckerberg, Mary Meeker, Maui Hawaii, Max Levchin, Menlo Park, meritocracy, meta-analysis, microservices, Parker Conrad, paypal mafia, Peter Thiel, post-work, pull request, reality distortion field, Richard Hendricks, ride hailing / ride sharing, rolodex, Salesforce, Saturday Night Live, shareholder value, Sheryl Sandberg, side project, Silicon Valley, Silicon Valley startup, Skype, Snapchat, Steve Jobs, Steve Jurvetson, Steve Wozniak, Steven Levy, subscription business, Susan Wojcicki, tech billionaire, tech bro, tech worker, TED Talk, Tim Cook: Apple, Travis Kalanick, uber lyft, women in the workforce, Zenefits

We face a near-term future of autonomous cars, augmented reality, and artificial intelligence, and yet we are at risk of embedding gender bias into all of these new algorithms. “It’s bad for shareholder value,” Megan Smith, who has worked as a Google VP and chief technology officer of the United States, told me. “We want the genetic flourishing of all humanity . . . in on making these products, especially as we move to AI and data sciences.” If robots are going to run the world, or at the very least play a hugely critical role in our future, men shouldn’t be programming them alone. “We have a long way to go and we recognize it,” Microsoft CEO Satya Nadella told me as his company pushes into a future of machine learning and mixed reality.


pages: 407 words: 104,622

The Man Who Solved the Market: How Jim Simons Launched the Quant Revolution by Gregory Zuckerman

affirmative action, Affordable Care Act / Obamacare, Alan Greenspan, Albert Einstein, Andrew Wiles, automated trading system, backtesting, Bayesian statistics, Bear Stearns, beat the dealer, behavioural economics, Benoit Mandelbrot, Berlin Wall, Bernie Madoff, Black Monday: stock market crash in 1987, blockchain, book value, Brownian motion, butter production in bangladesh, buy and hold, buy low sell high, Cambridge Analytica, Carl Icahn, Claude Shannon: information theory, computer age, computerized trading, Credit Default Swap, Daniel Kahneman / Amos Tversky, data science, diversified portfolio, Donald Trump, Edward Thorp, Elon Musk, Emanuel Derman, endowment effect, financial engineering, Flash crash, George Gilder, Gordon Gekko, illegal immigration, index card, index fund, Isaac Newton, Jim Simons, John Meriwether, John Nash: game theory, John von Neumann, junk bonds, Loma Prieta earthquake, Long Term Capital Management, loss aversion, Louis Bachelier, mandelbrot fractal, margin call, Mark Zuckerberg, Michael Milken, Monty Hall problem, More Guns, Less Crime, Myron Scholes, Naomi Klein, natural language processing, Neil Armstrong, obamacare, off-the-grid, p-value, pattern recognition, Peter Thiel, Ponzi scheme, prediction markets, proprietary trading, quantitative hedge fund, quantitative trading / quantitative finance, random walk, Renaissance Technologies, Richard Thaler, Robert Mercer, Ronald Reagan, self-driving car, Sharpe ratio, Silicon Valley, sovereign wealth fund, speech recognition, statistical arbitrage, statistical model, Steve Bannon, Steve Jobs, stochastic process, the scientific method, Thomas Bayes, transaction costs, Turing machine, Two Sigma

“The inefficiencies are so complex they are, in a sense, hidden in the markets in code,” a staffer says. “RenTec decrypts them. We find them across time, across risk factors, across sectors and industries.” Even more important: Renaissance concluded that there are reliable mathematical relationships between all these forces. Applying data science, the researchers achieved a better sense of when various factors were relevant, how they interrelated, and the frequency with which they influenced shares. They also tested and teased out subtle, nuanced mathematical relationships between various shares—what staffers call multidimensional anomalies—that other investors were oblivious to or didn’t fully understand.


pages: 338 words: 104,815

Nobody's Fool: Why We Get Taken in and What We Can Do About It by Daniel Simons, Christopher Chabris

Abraham Wald, Airbnb, artificial general intelligence, Bernie Madoff, bitcoin, Bitcoin "FTX", blockchain, Boston Dynamics, butterfly effect, call centre, Carmen Reinhart, Cass Sunstein, ChatGPT, Checklist Manifesto, choice architecture, computer vision, contact tracing, coronavirus, COVID-19, cryptocurrency, DALL-E, data science, disinformation, Donald Trump, Elon Musk, en.wikipedia.org, fake news, false flag, financial thriller, forensic accounting, framing effect, George Akerlof, global pandemic, index fund, information asymmetry, information security, Internet Archive, Jeffrey Epstein, Jim Simons, John von Neumann, Keith Raniere, Kenneth Rogoff, London Whale, lone genius, longitudinal study, loss aversion, Mark Zuckerberg, meta-analysis, moral panic, multilevel marketing, Nelson Mandela, pattern recognition, Pershing Square Capital Management, pets.com, placebo effect, Ponzi scheme, power law, publication bias, randomized controlled trial, replication crisis, risk tolerance, Robert Shiller, Ronald Reagan, Rubik’s Cube, Sam Bankman-Fried, Satoshi Nakamoto, Saturday Night Live, Sharpe ratio, short selling, side hustle, Silicon Valley, Silicon Valley startup, Skype, smart transportation, sovereign wealth fund, statistical model, stem cell, Steve Jobs, sunk-cost fallacy, survivorship bias, systematic bias, TED Talk, transcontinental railway, WikiLeaks, Y2K

Simonsohn, “Just Post It: The Lesson from Two Cases of Fabricated Data Detected by Statistics Alone,” Psychological Science 24 (2013): 1875–1888 [https://doi.org/10.1177/0956797613480366]. The original study was conducted in the Netherlands, so the amounts were not in dollars, but the same principle applies. 28. M. Enserink, “Rotterdam Marketing Psychologist Resigns After University Investigates His Data,” Science, June 25, 2012 [doi.org/10.1126/article.27200]. 29. Simonsohn interview: E. Yong, “The Data Detective,” Nature 487 (2012): 18–19 [https://doi.org/10.1038/487018a]. 30. G. Spier, The Education of a Value Investor (New York: Palgrave Macmillan, 2014). The Farmer Mac story is on pp. 53–57. Having learned the lesson of his wrong initial call on Farmer Mac, Spier later spent over one year researching a company called BYD Auto, a Chinese battery and car maker, before investing his fund’s money (pp. 125–126). 31.


pages: 341 words: 107,933

The Dealmaker: Lessons From a Life in Private Equity by Guy Hands

Airbus A320, banking crisis, Bear Stearns, British Empire, Bullingdon Club, corporate governance, COVID-19, credit crunch, data science, deal flow, Etonian, family office, financial engineering, fixed income, flag carrier, high net worth, junk bonds, lockdown, Long Term Capital Management, low cost airline, Nelson Mandela, North Sea oil, old-boy network, Paul Samuelson, plutocrats, proprietary trading, Silicon Valley, South Sea Bubble, sovereign wealth fund, subprime mortgage crisis, traveling salesman

Back in my Nomura days I had one other trick in my magic box that other private equity firms at the time didn’t understand at all: technology. One of Nomura’s greatest assets was that they focused more on technical and analytical skills than sales and marketing skills. This meant they were much less likely to be impressed by an arts student from Cambridge than someone with a Ph.D. in data science. With the bank’s financial support I set up what became known as the Cyber Room – a room full of extremely analytical, ludicrously intelligent, quantitative mathematicians, or ‘quants’, most of whom had a Ph.D. in maths or particle physics. One had that rare neurological condition known as synaesthesia, in which senses that aren’t normally connected merge.


pages: 387 words: 120,155

Inside the Nudge Unit: How Small Changes Can Make a Big Difference by David Halpern

Affordable Care Act / Obamacare, availability heuristic, behavioural economics, carbon footprint, Cass Sunstein, centre right, choice architecture, cognitive dissonance, cognitive load, collaborative consumption, correlation does not imply causation, Daniel Kahneman / Amos Tversky, data science, different worldview, endowment effect, gamification, happiness index / gross national happiness, hedonic treadmill, hindsight bias, IKEA effect, illegal immigration, job satisfaction, Kickstarter, language acquisition, libertarian paternalism, light touch regulation, longitudinal study, machine readable, market design, meta-analysis, Milgram experiment, nudge unit, peer-to-peer lending, pension reform, precautionary principle, presumed consent, QR code, quantitative easing, randomized controlled trial, Richard Thaler, Right to Buy, Ronald Reagan, Rory Sutherland, Simon Kuznets, skunkworks, supply chain finance, the built environment, theory of mind, traffic fines, twin studies, World Values Survey

The transparency of the information has subtly changed the market, and might even edge us towards saving the planet. Shaping better nudges, with better data We have seen how behavioural science can shape how, and when, data is presented to create an especially powerful class of nudging – what we might call ‘behaviourally shaped informing’. But the relationship is two-way. Data science is also shaping and enhancing the power of nudges. We will explore more about this in Chapter 10, but let us have a glimpse into this world. Many businesses, and occasionally governments, have dabbled in the art of segmentation. Advertising agencies and political pundits often classify people into different groups, sometimes adding an evocative name to catch the segment, such as ‘soccer mums’ (argued to be a key segment in the Clinton campaign); ‘Generation X’; or the ‘aspirant working class’.


pages: 396 words: 112,832

Bread, Wine, Chocolate: The Slow Loss of Foods We Love by Simran Sethi

biodiversity loss, Chuck Templeton: OpenTable:, data science, food desert, Food sovereignty, Intergovernmental Panel on Climate Change (IPCC), invisible hand, Ken Thompson, Louis Pasteur, microbiome, phenotype, placebo effect, Skype, TED Talk, The Wealth of Nations by Adam Smith, women in the workforce

This Video Will Inspire You,” Sprudge.com, May 15, 2014, http://sprudge.com/erna-knutsen-specialty-coffee-legend-video-will-inspire-56318.html. 31. “Baseline Data for the Conservation of Coffee Species,” Kew Royal Botanical Gardens, accessed July 23, 2015, http://www.kew.org/science-conservation/research-data/science-directory/projects/baseline-data-conservation-coffee. 32. Julian Siddle and Vibeke Venema, “Saving Coffee from Extinction,” BBC News Magazine, May 24, 2015, http://www.bbc.com/news/magazine-32736366. 33. “All You Need to Know About Coffee: Species and Varieties,” CIRAD, accessed March 23, 2014, http://www.cirad.fr/en/publications-resources/science-for-all/the-issues/coffee/what-you-need-to-know/species-and-varieties. 34. Raimond Feil, “Coffea: Genus—Species—Varieties,” trans.


pages: 405 words: 117,219

In Our Own Image: Savior or Destroyer? The History and Future of Artificial Intelligence by George Zarkadakis

3D printing, Ada Lovelace, agricultural Revolution, Airbnb, Alan Turing: On Computable Numbers, with an Application to the Entscheidungsproblem, animal electricity, anthropic principle, Asperger Syndrome, autonomous vehicles, barriers to entry, battle of ideas, Berlin Wall, bioinformatics, Bletchley Park, British Empire, business process, carbon-based life, cellular automata, Charles Babbage, Claude Shannon: information theory, combinatorial explosion, complexity theory, Computing Machinery and Intelligence, continuous integration, Conway's Game of Life, cosmological principle, dark matter, data science, deep learning, DeepMind, dematerialisation, double helix, Douglas Hofstadter, driverless car, Edward Snowden, epigenetics, Flash crash, Google Glasses, Gödel, Escher, Bach, Hans Moravec, income inequality, index card, industrial robot, intentional community, Internet of things, invention of agriculture, invention of the steam engine, invisible hand, Isaac Newton, Jacquard loom, Jacques de Vaucanson, James Watt: steam engine, job automation, John von Neumann, Joseph-Marie Jacquard, Kickstarter, liberal capitalism, lifelogging, machine translation, millennium bug, mirror neurons, Moravec's paradox, natural language processing, Nick Bostrom, Norbert Wiener, off grid, On the Economy of Machinery and Manufactures, packet switching, pattern recognition, Paul Erdős, Plato's cave, post-industrial society, power law, precautionary principle, prediction markets, Ray Kurzweil, Recombinant DNA, Rodney Brooks, Second Machine Age, self-driving car, seminal paper, Silicon Valley, social intelligence, speech recognition, stem cell, Stephen Hawking, Steven Pinker, Strategic Defense Initiative, strong AI, Stuart Kauffman, synthetic biology, systems thinking, technological singularity, The Coming Technological Singularity, The Future of Employment, the scientific method, theory of mind, Turing complete, Turing machine, Turing test, Tyler Cowen, Tyler Cowen: Great Stagnation, Vernor Vinge, Von Neumann architecture, Watson beat the top human players on Jeopardy!, Y2K

Within the past two years Google, one of the biggest companies in the computer industry,2 acquired a number of companies in Artificial Intelligence and advanced robotics. Facebook also announced that one of the most prominent AI researchers in the world, Professor Yann LeCun of NYU’s Center for Data Science, would be joining the company to direct a massive new AI effort. These global companies move towards smarter machine technologies because they understand the challenges and opportunities entailed in owning big data. They also understand that it is not enough to own the data. The real game changer lies in understanding the data’s true significance.


pages: 492 words: 118,882

The Blockchain Alternative: Rethinking Macroeconomic Policy and Economic Theory by Kariappa Bheemaiah

"World Economic Forum" Davos, accounting loophole / creative accounting, Ada Lovelace, Adam Curtis, Airbnb, Alan Greenspan, algorithmic trading, asset allocation, autonomous vehicles, balance sheet recession, bank run, banks create money, Basel III, basic income, behavioural economics, Ben Bernanke: helicopter money, bitcoin, Bletchley Park, blockchain, Bretton Woods, Brexit referendum, business cycle, business process, call centre, capital controls, Capital in the Twenty-First Century by Thomas Piketty, cashless society, cellular automata, central bank independence, Charles Babbage, Claude Shannon: information theory, cloud computing, cognitive dissonance, collateralized debt obligation, commoditize, complexity theory, constrained optimization, corporate governance, credit crunch, Credit Default Swap, credit default swaps / collateralized debt obligations, cross-border payments, crowdsourcing, cryptocurrency, data science, David Graeber, deep learning, deskilling, Diane Coyle, discrete time, disruptive innovation, distributed ledger, diversification, double entry bookkeeping, Ethereum, ethereum blockchain, fiat currency, financial engineering, financial innovation, financial intermediation, Flash crash, floating exchange rates, Fractional reserve banking, full employment, George Akerlof, Glass-Steagall Act, Higgs boson, illegal immigration, income inequality, income per capita, inflation targeting, information asymmetry, interest rate derivative, inventory management, invisible hand, John Maynard Keynes: technological unemployment, John von Neumann, joint-stock company, Joseph Schumpeter, junk bonds, Kenneth Arrow, Kenneth Rogoff, Kevin Kelly, knowledge economy, large denomination, Large Hadron Collider, Lewis Mumford, liquidity trap, London Whale, low interest rates, low skilled workers, M-Pesa, machine readable, Marc Andreessen, market bubble, market fundamentalism, Mexican peso crisis / tequila crisis, Michael Milken, MITM: man-in-the-middle, Money creation, 
money market fund, money: store of value / unit of account / medium of exchange, mortgage debt, natural language processing, Network effects, new economy, Nikolai Kondratiev, offshore financial centre, packet switching, Pareto efficiency, pattern recognition, peer-to-peer lending, Ponzi scheme, power law, precariat, pre–internet, price mechanism, price stability, private sector deleveraging, profit maximization, QR code, quantitative easing, quantitative trading / quantitative finance, Ray Kurzweil, Real Time Gross Settlement, rent control, rent-seeking, robo advisor, Satoshi Nakamoto, Satyajit Das, Savings and loan crisis, savings glut, seigniorage, seminal paper, Silicon Valley, Skype, smart contracts, software as a service, software is eating the world, speech recognition, statistical model, Stephen Hawking, Stuart Kauffman, supply-chain management, technology bubble, The Chicago School, The Future of Employment, The Great Moderation, the market place, The Nature of the Firm, the payments system, the scientific method, The Wealth of Nations by Adam Smith, Thomas Kuhn: the structure of scientific revolutions, too big to fail, trade liberalization, transaction costs, Turing machine, Turing test, universal basic income, Vitalik Buterin, Von Neumann architecture, Washington Consensus

However, they tend to be computationally expensive. Footnotes 1The Royal Society is a Fellowship of many of the world’s most eminent scientists and is the oldest scientific academy in continuous existence. 2See ‘Technological novelty profile and invention’s future impact’, Kim et al., (2016), EPJ Data Science. 3The term ‘combinatorial evolution’, was coined by the scientific theorist W. Brian Arthur, who is also one of the founders of complexity economics. In a streak that is similar to Thomas Kuhn’s ‘The Structure of Scientific Revolutions’, Arthur’s book, ‘The Nature of Technology: What It Is and How It Evolves’, explains that technologies are based on interactions and composed into modular systems of components that can grow.


pages: 421 words: 110,406

Platform Revolution: How Networked Markets Are Transforming the Economy--And How to Make Them Work for You by Sangeet Paul Choudary, Marshall W. van Alstyne, Geoffrey G. Parker

3D printing, Affordable Care Act / Obamacare, Airbnb, Alvin Roth, Amazon Mechanical Turk, Amazon Web Services, Andrei Shleifer, Apple's 1984 Super Bowl advert, autonomous vehicles, barriers to entry, Benchmark Capital, big data - Walmart - Pop Tarts, bitcoin, blockchain, business cycle, business logic, business process, buy low sell high, chief data officer, Chuck Templeton: OpenTable:, clean water, cloud computing, connected car, corporate governance, crowdsourcing, data acquisition, data is the new oil, data science, digital map, discounted cash flows, disintermediation, driverless car, Edward Glaeser, Elon Musk, en.wikipedia.org, Erik Brynjolfsson, financial innovation, Free Software Foundation, gigafactory, growth hacking, Haber-Bosch Process, High speed trading, independent contractor, information asymmetry, Internet of things, inventory management, invisible hand, Jean Tirole, Jeff Bezos, jimmy wales, John Markoff, Kevin Roose, Khan Academy, Kickstarter, Lean Startup, Lyft, Marc Andreessen, market design, Max Levchin, Metcalfe’s law, multi-sided market, Network effects, new economy, PalmPilot, payday loans, peer-to-peer lending, Peter Thiel, pets.com, pre–internet, price mechanism, recommendation engine, RFID, Richard Stallman, ride hailing / ride sharing, Robert Metcalfe, Ronald Coase, Salesforce, Satoshi Nakamoto, search costs, self-driving car, shareholder value, sharing economy, side project, Silicon Valley, Skype, smart contracts, smart grid, Snapchat, social bookmarking, social contagion, software is eating the world, Steve Jobs, TaskRabbit, The Chicago School, the long tail, the payments system, Tim Cook: Apple, transaction costs, Travis Kalanick, two-sided market, Uber and Lyft, Uber for X, uber lyft, vertical integration, winner-take-all economy, zero-sum game, Zipcar

We have also benefited from working with a group of world-class scholars who have dedicated their careers to understanding the digital economy, and who participate in the annual Workshop on Information Systems and Economics (WISE) and the Boston University Platform Strategy Research Symposium, as well as some of the world’s leading thinkers in adjacent fields such as behavior design, data science, systems design theory, and agile methodologies. We have written this book because we believe that digital connectivity and the platform model it makes possible are changing the world forever. The platform-driven economic transformation is producing enormous benefits for society as a whole and for the businesses and other organizations that create wealth, generate growth, and serve the needs of humankind.


pages: 404 words: 115,108

They Don't Represent Us: Reclaiming Our Democracy by Lawrence Lessig

2013 Report for America's Infrastructure - American Society of Civil Engineers - 19 March 2013, Aaron Swartz, Affordable Care Act / Obamacare, Berlin Wall, Bernie Sanders, blockchain, Cambridge Analytica, Cass Sunstein, Columbine, crony capitalism, crowdsourcing, data science, David Brooks, disinformation, do-ocracy, Donald Trump, fake news, Fall of the Berlin Wall, Filter Bubble, Francis Fukuyama: the end of history, Free Software Foundation, Gabriella Coleman, illegal immigration, income inequality, Jaron Lanier, Jeff Bezos, John Gilmore, Joi Ito, Mark Zuckerberg, obamacare, opioid epidemic / opioid crisis, Parag Khanna, plutocrats, race to the bottom, Ralph Nader, rent-seeking, Richard Thaler, Ronald Reagan, Sheryl Sandberg, Shoshana Zuboff, Silicon Valley, Skype, speech recognition, Steven Levy, surveillance capitalism, Upton Sinclair, Yochai Benkler

CHAPTER 2: THE UNREPRESENTATIVE US 1. See “Topics of the Times: Italy Hails Our Dictator,” New York Times, March 7, 1933; “Italian Fascists Call Roosevelt Rome’s Disciple,” New York Herald Tribune, May 7, 1933; “When Thieves Fall Out,” Daily Worker, March 9, 1935; Harvey Klehr, “American Reds, Soviet Stooges,” New York Times, July 3, 2017, available at link #84; Roger Shaw, “Fascism and the New Deal,” North American Review 238 (1934): 559–64, available at link #85. Roosevelt himself acknowledged the criticism. Franklin Roosevelt, “Fireside Chat 5: On Addressing the Critics,” June 28, 1934, available at link #86. 2. Jill Lepore, “Politics and the New Machine: What the Turn from Polls to Data Science Means for Democracy,” New Yorker, November 16, 2015, available at link #87. Skepticism about polls and the idea of a public will is long-standing. For some representative sources, see Jean M. Converse, Survey Research in the United States: Roots and Emergence, 1890–1960 (Berkeley: University of California Press, 1987). 3. Peverill Squire, “Why the 1936 Literary Digest Poll Failed,” Public Opinion Quarterly 52, 1 (1988): 125–33. 4. Daniel Robinson, The Measure of Democracy: Polling, Market Research, and Public Life, 1930–1945 (Toronto: University of Toronto Press, 2019), Kindle edition, loc. 869, n.4.


pages: 362 words: 116,497

Palace Coup: The Billionaire Brawl Over the Bankrupt Caesars Gaming Empire by Sujeet Indap, Max Frumes

Airbnb, Bear Stearns, Blythe Masters, book value, business cycle, Carl Icahn, coronavirus, corporate governance, corporate raider, Credit Default Swap, data science, deal flow, Donald Trump, family office, fear of failure, financial engineering, fixed income, Jeffrey Epstein, junk bonds, lockdown, low interest rates, Michael Milken, mortgage debt, NetJets, power law, ride hailing / ride sharing, Right to Buy, Robert Solow, Savings and loan crisis, shareholder value, super pumped, Travis Kalanick

He maintained Boston as his home base, but had a beach home in North Carolina he visited via his own turboprop plane (which had a green stripe on its tail in the same shade as the Boston Celtics logo, the NBA team of which he and David Bonderman were part owners). After spending decades using his knowledge of data science to get Americans to gamble more, Loveman perhaps found a more noble use of his talent. In 2015, he worked for a time at health insurer Aetna, trying to see how data could be used to promote better health outcomes—a topic that had interested him since managing tens of thousands of casino workers in his CEO days.


pages: 414 words: 117,581

Binge Times: Inside Hollywood's Furious Billion-Dollar Battle to Take Down Netflix by Dade Hayes, Dawn Chmielewski

activist fund / activist shareholder / activist investor, Airbnb, Albert Einstein, Amazon Web Services, AOL-Time Warner, Apollo 13, augmented reality, barriers to entry, Big Tech, borderless world, cloud computing, cognitive dissonance, content marketing, coronavirus, corporate raider, COVID-19, data science, digital rights, Donald Trump, Downton Abbey, Elon Musk, George Floyd, global pandemic, Golden age of television, haute cuisine, hockey-stick growth, invention of the telephone, Jeff Bezos, John Markoff, Jony Ive, late fees, lockdown, loose coupling, Marc Andreessen, Mark Zuckerberg, Mitch Kapor, Netflix Prize, Osborne effect, performance metric, period drama, Phoebe Waller-Bridge, QR code, reality distortion field, recommendation engine, remote working, Ronald Reagan, Salesforce, Saturday Night Live, Silicon Valley, skunkworks, Skype, Snapchat, social distancing, Steve Jobs, subscription business, tech bro, the long tail, the medium is the message, TikTok, Tim Cook: Apple, vertical integration, WeWork

Cheng spent two years rooting around in data to glean insights to guide how much of the studio’s entertainment resources to devote to particular projects. As is true at Netflix, analytics wouldn’t replace a creative executive’s judgment about which pitches and showrunners had the potential to make a hit show. But data science could make predictions about a show’s success based on historical performance—information that would help frame financial risk. Under Salke, Amazon Studios focused on global development, putting into production series from India, Japan, Britain, Germany, Mexico, and elsewhere to fulfill Bezos’s vision of Amazon Prime Video as a glittery customer acquisition tool for the Prime subscription service.


pages: 419 words: 119,476

Posh Boys: How English Public Schools Ruin Britain by Robert Verkaik

accounting loophole / creative accounting, Alistair Cooke, banking crisis, Berlin Wall, Boris Johnson, Brexit referendum, British Empire, Brixton riot, Bullingdon Club, Cambridge Analytica, data science, disinformation, Dominic Cummings, Donald Trump, Etonian, G4S, gender pay gap, God and Mammon, income inequality, Jeremy Corbyn, Khartoum Gordon, Kickstarter, knowledge economy, Livingstone, I presume, loadsamoney, mega-rich, Neil Kinnock, offshore financial centre, old-boy network, Piers Corbyn, place-making, plutocrats, Robert Gordon, Robert Mercer, school vouchers, Stephen Fry, Steve Bannon, Suez crisis 1956, The Bell Curve by Richard Herrnstein and Charles Murray, trade route, traveling salesman, unpaid internship

mhq5j=e3; http://www.mirror.co.uk/news/politics/greedy-george-osborne-facing-furious-10049285 51 https://www.byline.com/column/67/article/2049 11 Boys’ Own Brexit 1 Stuart Jeffries, The Guardian, 26 May 2014. 2 http://www.dulwich.org.uk/college/about/history 3 http://www.telegraph.co.uk/news/politics/ukip/11291050/Nigel-Farage-and-Enoch-Powell-the-full-story-of-Ukips-links-with-the-Rivers-of-Blood-politician.html 4 https://www.channel4.com/news/nigel-farage-ukip-letter-school-concerns-racism-fascism 5 Michael Crick, Channel 4 News, 19 September 2013. 6 http://www.independent.co.uk/news/uk/politics/nigel-farage-open-letter-schoolfriend-brexit-poster-nazi-song-dulwich-college-gas-them-all-a7185336.html 7 http://www.independent.co.uk/news/uk/politics/nigel-farage-fascist-nazi-song-gas-them-all-ukip-brexit-schoolfriend-dulwich-college-a7185236.html 8 Interview with the author at Dulwich College, 12 January 2018. 9 www.facebook.com/myiannopuolos, accessed 24 January 2018. 10 https://www.linkedin.com/in/sam-farage-85b406b2; http://www.telegraph.co.uk/news/politics/nigel-farage/11467039/Nigel-Farage-My-public-school-had-a-real-social-mix-but-now-only-the-mega-rich-can-afford-the-fees.html 11 Simon Kupar, Financial Times, 7 July 2016. 12 http://www.telegraph.co.uk/news/2017/01/05/project-fear-brexit-predictions-flawed-partisan-new-study-says/; http://www.telegraph.co.uk/news/2016/06/25/how-project-fear-failed-to-keep-britain-in-the-eu--and-the-signs/ 13 Odey declined to be interviewed. 14 Sunday Times, 23 April 2017, p. 
4; http://www.independent.co.uk/news/uk/politics/brexit-leave-eu-campaign-arron-banks-jeremy-hosking-five-uk-richest-businessmen-peter-hargreaves-a7699046.html 15 https://inews.co.uk/news/technology/cambridge-analytica-facebook-data-protection/ 16 http://www.bbc.co.uk/news/technology-43581892 17 https://inews.co.uk/news/technology/cambridge-analytica-facebook-data-protection/ 18 https://www.reuters.com/article/us-facebook-cambridge-analytica-leave-eu/what-are-the-links-between-cambridge-analytica-and-a-brexit-campaign-group-idUSKBN1GX2IO 19 https://www.theguardian.com/uk-news/2018/mar/24/aggregateiq-data-firm-link-raises-leave-group-questions https://www.businesstimes.com.sg/government-economy/brexit-campaigners-breached-uk-vote-rules-lawyers-say 20 https://dominiccummings.com/2016/10/29/on-the-referendum-20-the-campaign-physics-and-data-science-vote-leaves-voter-intention-collection-system-vics-now-available-for-all/ 21 A Very British Coup, BBC2, 22 September 2016. 22 http://www.standard.co.uk/business/business-focus-the-billionaire-hedge-fund-winners-who-braved-the-brexit-rollercoaster-a3284101.html 23 http://fortune.com/2014/12/03/heineken-charlene-de-carvalho-self-made-heiress/ 24 http://www.cityam.com/262239/david-camerons-ex-adviser-daniel-korski-launches-major 25 Tim Shipman, All Out War: Brexit and the Sinking of Britain’s Political Class (London: William Collins, 2017), p. 610. 12 For the Few, Not the Many 1 http://www.telegraph.co.uk/news/politics/Jeremy_Corbyn/11818744/Jeremy-Corbyn-the-boy-to-the-manor-born.html 2 http://www.castlehouseschool.co.uk/about-the-school/fees/ 3 Rosa Prince, Comrade Corbyn: A Very Unlikely Coup (London: Biteback Publishing, 2017), p. 29.


pages: 752 words: 131,533

Python for Data Analysis by Wes McKinney

Alignment Problem, backtesting, Bear Stearns, cognitive dissonance, crowdsourcing, data science, Debian, duck typing, Firefox, functional programming, Google Chrome, Guido van Rossum, index card, machine readable, random walk, recommendation engine, revision control, sentiment analysis, Sharpe ratio, side project, sorting algorithm, statistical model, type inference

Import Conventions
The Python community has adopted a number of naming conventions for commonly-used modules:
    import numpy as np
    import pandas as pd
    import matplotlib.pyplot as plt
This means that when you see np.arange, this is a reference to the arange function in NumPy. This is done as it’s considered bad practice in Python software development to import everything (from numpy import *) from a large package like NumPy.
Jargon
I’ll use some terms common both to programming and data science that you may not be familiar with. Thus, here are some brief definitions:
Munge/Munging/Wrangling: Describes the overall process of manipulating unstructured and/or messy data into a structured or clean form. The word has snuck its way into the jargon of many modern day data hackers. Munge rhymes with “lunge”.
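The import convention described in this excerpt can be tried out with a minimal sketch like the one below. It is not taken from the book: the variable names are made up for illustration, and the comparison between the built-in sum and np.sum is just one assumed example of why keeping NumPy behind the np prefix avoids confusion between similarly named functions.

```python
import numpy as np

# np.arange is NumPy's arange function, reached through the conventional alias.
values = np.arange(5)  # array([0, 1, 2, 3, 4])

# The built-in sum and np.sum share a name but treat their second
# positional argument differently, so keeping them in separate
# namespaces makes it clear which one is being called.
builtin_total = sum(values.tolist(), -1)  # -1 is the start value
numpy_total = np.sum(values, -1)          # -1 is the axis to sum over

print(values, builtin_total, numpy_total)
```

With the prefix in place, a reader can tell at a glance that np.sum is the NumPy function and sum is the Python built-in, which is the readability argument the excerpt makes against from numpy import *.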


pages: 426 words: 136,925

Fulfillment: Winning and Losing in One-Click America by Alec MacGillis

"RICO laws" OR "Racketeer Influenced and Corrupt Organizations", Airbnb, Amazon Web Services, Bernie Sanders, Big Tech, Black Lives Matter, call centre, carried interest, cloud computing, cognitive dissonance, company town, coronavirus, COVID-19, data science, death of newspapers, deindustrialization, Donald Trump, edge city, fulfillment center, future of work, gentrification, George Floyd, Glass-Steagall Act, global pandemic, Great Leap Forward, high net worth, housing crisis, Ida Tarbell, income inequality, information asymmetry, Jeff Bezos, Jeffrey Epstein, Jessica Bruder, jitney, Kiva Systems, lockdown, Lyft, mass incarceration, McMansion, megaproject, microapartment, military-industrial complex, new economy, Nomadland, offshore financial centre, Oklahoma City bombing, opioid epidemic / opioid crisis, plutocrats, Ralph Nader, rent control, Richard Florida, ride hailing / ride sharing, Robert Mercer, Ronald Reagan, San Francisco homelessness, shareholder value, Silicon Valley, social distancing, strikebreaker, tech worker, Travis Kalanick, uber lyft, uranium enrichment, War on Poverty, warehouse robotics, white flight, winner-take-all economy, women in the workforce, working-age population, Works Progress Administration

“That was a big two and a half years ago.” Up walked John Hanly, from the Center for American Progress, the liberal think tank, and Manish Parikh, a chief technology officer for the defense contractor BAE, and James Armitage, a tax lawyer with Caplin & Drysdale, and Khuloud Odeh, CIO and vice president for technology and data science at the Urban Institute, another center-left think tank. The line was growing at the elevator in the lobby. “Holy moly,” said one woman as the line came into view. Around the corner on F Street, a young, Black woman was sleeping on a sidewalk grate with a towel as a pillow. Someone had left a sandwich for her.


pages: 689 words: 134,457

When McKinsey Comes to Town: The Hidden Influence of the World's Most Powerful Consulting Firm by Walt Bogdanich, Michael Forsythe

"RICO laws" OR "Racketeer Influenced and Corrupt Organizations", "World Economic Forum" Davos, activist fund / activist shareholder / activist investor, Affordable Care Act / Obamacare, Alistair Cooke, Amazon Web Services, An Inconvenient Truth, asset light, asset-backed security, Atul Gawande, Bear Stearns, Boris Johnson, British Empire, call centre, Cambridge Analytica, carbon footprint, Citizen Lab, cognitive dissonance, collective bargaining, compensation consultant, coronavirus, corporate governance, corporate social responsibility, Corrections Corporation of America, COVID-19, creative destruction, Credit Default Swap, crony capitalism, data science, David Attenborough, decarbonisation, deindustrialization, disinformation, disruptive innovation, do well by doing good, don't be evil, Donald Trump, double entry bookkeeping, facts on the ground, failed state, financial engineering, full employment, future of work, George Floyd, Gini coefficient, Glass-Steagall Act, global pandemic, illegal immigration, income inequality, information security, interchangeable parts, Intergovernmental Panel on Climate Change (IPCC), invisible hand, job satisfaction, job-hopping, junk bonds, Kenneth Arrow, Kickstarter, load shedding, Mark Zuckerberg, megaproject, Moneyball by Michael Lewis explains big data, mortgage debt, Multics, Nelson Mandela, obamacare, offshore financial centre, old-boy network, opioid epidemic / opioid crisis, profit maximization, public intellectual, RAND corporation, Rutger Bregman, scientific management, sentiment analysis, shareholder value, Sheryl Sandberg, Silicon Valley, smart cities, smart meter, South China Sea, sovereign wealth fund, tech worker, The future is already here, The Nature of the Firm, too big to fail, urban planning, WikiLeaks, working poor, Yogi Berra, zero-sum game

He also consulted for gaming companies, including casinos, sports books, horse racing, and e-sports. While Singer’s name rarely surfaced in news accounts of games, his insights were valued by data analysts who don’t make their living scoring runs or touchdowns. McKinsey deepened its expertise in data science by buying a small, elite consulting company called QuantumBlack, which used data to evaluate athletes in the United States and Europe. One of its specialties was injury prediction—an obvious area of interest to gamblers. Knowing whether certain players were prone to injury might influence betting odds, though there is no evidence this type of information was leaked to gamblers.


pages: 598 words: 134,339

Data and Goliath: The Hidden Battles to Collect Your Data and Control Your World by Bruce Schneier

23andMe, Airbnb, airport security, AltaVista, Anne Wojcicki, AOL-Time Warner, augmented reality, behavioural economics, Benjamin Mako Hill, Black Swan, Boris Johnson, Brewster Kahle, Brian Krebs, call centre, Cass Sunstein, Chelsea Manning, citizen journalism, Citizen Lab, cloud computing, congestion charging, data science, digital rights, disintermediation, drone strike, Eben Moglen, Edward Snowden, end-to-end encryption, Evgeny Morozov, experimental subject, failed state, fault tolerance, Ferguson, Missouri, Filter Bubble, Firefox, friendly fire, Google Chrome, Google Glasses, heat death of the universe, hindsight bias, informal economy, information security, Internet Archive, Internet of things, Jacob Appelbaum, James Bridle, Jaron Lanier, John Gilmore, John Markoff, Julian Assange, Kevin Kelly, Laura Poitras, license plate recognition, lifelogging, linked data, Lyft, Mark Zuckerberg, moral panic, Nash equilibrium, Nate Silver, national security letter, Network effects, Occupy movement, operational security, Panopticon Jeremy Bentham, payday loans, pre–internet, price discrimination, profit motive, race to the bottom, RAND corporation, real-name policy, recommendation engine, RFID, Ross Ulbricht, satellite internet, self-driving car, Shoshana Zuboff, Silicon Valley, Skype, smart cities, smart grid, Snapchat, social graph, software as a service, South China Sea, sparse data, stealth mode startup, Steven Levy, Stuxnet, TaskRabbit, technological determinism, telemarketer, Tim Cook: Apple, transaction costs, Uber and Lyft, uber lyft, undersea cable, unit 8200, urban planning, Wayback Machine, WikiLeaks, workplace surveillance, Yochai Benkler, yottabyte, zero day

Evans (18 Apr 2014), “IAAS Series: Cloud storage pricing: How low can they go?” Architecting IT, http://blog.architecting.it/2014/04/18/iaas-series-cloud-storage-pricing-how-low-can-they-go. store every tweet ever sent: K. Young (6 Sep 2012), “How much would it cost to store the entire Twitter Firehose?” Mortar: Data Science at Scale, http://blog.mortardata.com/post/31027073689/how-much-would-it-cost-to-store-the-entire-twitter. every phone call ever made: Brewster Kahle (2013), “Cost to store all US phonecalls made in a year so it could be datamined,” https://docs.google.com/spreadsheet/ccc?key=0AuqlWHQKlooOdGJrSzhBVnh0WGlzWHpCZFNVcURkX0E#gid=0.


pages: 579 words: 160,351

Breaking News: The Remaking of Journalism and Why It Matters Now by Alan Rusbridger

"World Economic Forum" Davos, accounting loophole / creative accounting, Airbnb, Andy Carvin, banking crisis, Bellingcat, Bernie Sanders, Bletchley Park, Boris Johnson, Brexit referendum, Cambridge Analytica, centre right, Chelsea Manning, citizen journalism, country house hotel, cross-subsidies, crowdsourcing, data science, David Attenborough, David Brooks, death of newspapers, Donald Trump, Doomsday Book, Double Irish / Dutch Sandwich, Downton Abbey, Edward Snowden, Etonian, Evgeny Morozov, fake news, Filter Bubble, folksonomy, forensic accounting, Frank Gehry, future of journalism, G4S, high net worth, information security, invention of movable type, invention of the printing press, Jeff Bezos, jimmy wales, Julian Assange, Large Hadron Collider, Laura Poitras, Mark Zuckerberg, Mary Meeker, Menlo Park, natural language processing, New Journalism, offshore financial centre, oil shale / tar sands, open borders, packet switching, Panopticon Jeremy Bentham, post-truth, pre–internet, ransomware, recommendation engine, Ruby on Rails, sexual politics, Silicon Valley, Skype, Snapchat, social web, Socratic dialogue, sovereign wealth fund, speech recognition, Steve Bannon, Steve Jobs, the long tail, The Wisdom of Crowds, Tim Cook: Apple, traveling salesman, upwardly mobile, WikiLeaks, Yochai Benkler

Joanna Geary,10 whom we’d hired in 2011 to look after social media, posted on Facebook in late 2017: About 10 years ago I thought I might need to learn Ruby on Rails [to build web apps] to understand what’s going on in journalism. Then, about 5 years after that, I thought I might need an MBA. Now, the qualifications I need are probably in: Computer Science, Data Science, Natural Language Processing, Graph Analysis, Advanced Critical Thinking, Anthropology, Behavioural Sciences, Product Management, Business Administration, Social Psychology, Coaching & People Development, Change Management. I think I need to lie down . . . We had recruited two stars of the digital news universe – Wolfgang Blau from Die Zeit in Germany and Aron Pilhofer from the New York Times11 – and relaunched the website in a design that worked much better over desktop, tablet and mobile.


The Man Behind the Microchip: Robert Noyce and the Invention of Silicon Valley by Leslie Berlin

Apple II, Bob Noyce, book value, business cycle, California energy crisis, Charles Babbage, collective bargaining, computer age, data science, Fairchild Semiconductor, George Gilder, Henry Singleton, informal economy, John Markoff, Kickstarter, laissez-faire capitalism, low skilled workers, means of production, Menlo Park, military-industrial complex, Murray Gell-Mann, open economy, prudent man rule, Richard Feynman, rolling blackouts, ROLM, Ronald Reagan, Sand Hill Road, seminal paper, Silicon Valley, Silicon Valley startup, Steve Jobs, Steve Wozniak, tech worker, Teledyne, Tragedy of the Commons, union organizing, vertical integration, War on Poverty, women in the workforce, Yom Kippur War

Noyce also began investing with Arthur Rock after the Callanish Fund (Noyce’s private investment partnership with Paul Hwoschinsky) was amicably dissolved in 1979. Noyce and Rock did not have a formal investment partnership, but as Rock puts it, “We’d rope each other in.”55 The two men together funded several small companies: General Signal, Mohawk Data Sciences, and, at the urging of Mike Markkula, Volant, a manufacturer of a novel steel ski that Noyce, who tried a prototype, was convinced instantly improved his skiing. It was never clear to Volant’s founders, Bucky Kashiwa and his brother Hank, that Noyce particularly cared about making money from the company.


Blueprint: The Evolutionary Origins of a Good Society by Nicholas A. Christakis

Abraham Maslow, agricultural Revolution, Alfred Russel Wallace, AlphaGo, Amazon Mechanical Turk, assortative mating, autism spectrum disorder, Cass Sunstein, classic study, CRISPR, crowdsourcing, data science, David Attenborough, deep learning, different worldview, disruptive innovation, domesticated silver fox, double helix, driverless car, Easter island, epigenetics, experimental economics, experimental subject, Garrett Hardin, intentional community, invention of agriculture, invention of gunpowder, invention of writing, iterative process, job satisfaction, Joi Ito, joint-stock company, land tenure, language acquisition, Laplace demon, longitudinal study, Mahatma Gandhi, Marc Andreessen, means of production, mental accounting, meta-analysis, microbiome, out of africa, overview effect, phenotype, Philippa Foot, Pierre-Simon Laplace, placebo effect, race to the bottom, Ralph Waldo Emerson, replication crisis, Rubik’s Cube, Silicon Valley, Skinner box, social intelligence, social web, stem cell, Steven Pinker, the scientific method, theory of mind, Tragedy of the Commons, twin studies, ultimatum game, zero-sum game

I dedicated this book to my beloved wife, Erika Christakis, but I will take this further opportunity to acknowledge her beautiful heart and mind, which improved this book immeasurably. About the Author Nicholas A. Christakis, MD, PhD, MPH, is the Sterling Professor of Social and Natural Science at Yale University, with appointments in the departments of Sociology, Ecology and Evolutionary Biology, Statistics and Data Science, Biomedical Engineering, and Medicine. Previously, he conducted research and taught for many years at Harvard University and at the University of Chicago. He was on Time magazine’s list of the 100 most influential people in the world in 2009. He worked as a hospice physician in underserved communities in Chicago and Boston until 2011.


pages: 821 words: 178,631

The Rust Programming Language by Steve Klabnik, Carol Nichols

anti-pattern, billion-dollar mistake, bioinformatics, business logic, business process, cryptocurrency, data science, DevOps, duck typing, Firefox, functional programming, Internet of things, iterative process, pull request, reproducible builds, Ruby on Rails, type inference

If the user wants a high-intensity workout, there’s some additional logic: if the value of the random number generated by the app happens to be 3, the app will recommend a break and hydration. If not, the user will get a number of minutes of running based on the complex algorithm. This code works the way the business wants it to now, but let’s say the data science team decides that we need to make some changes to the way we call the simulated_expensive_calculation function in the future. To simplify the update when those changes happen, we want to refactor this code so it calls the simulated_expensive_calculation function only once. We also want to cut the place where we’re currently unnecessarily calling the function twice without adding any other calls to that function in the process.
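The refactoring described in this excerpt can be sketched roughly as follows. This is a minimal illustration rather than the book's own listing: only the name `simulated_expensive_calculation` and the "random number equals 3" rule come from the excerpt, while the function body, the `generate_workout` signature, the low-intensity threshold of 25, and the message strings are assumptions made for the example.

```rust
// Hypothetical stand-in for the slow calculation named in the excerpt.
// Here it simply echoes its input so the example stays fast and deterministic.
fn simulated_expensive_calculation(intensity: u32) -> u32 {
    intensity
}

// Assumed signature: returns the recommendation lines instead of printing
// them, so the behaviour is easy to check.
fn generate_workout(intensity: u32, random_number: u32) -> Vec<String> {
    // Single call site: the result is stored once and reused in every branch.
    let expensive_result = simulated_expensive_calculation(intensity);

    if intensity < 25 {
        // Assumed low-intensity branch; it previously needed the value twice.
        vec![
            format!("Today, do {} pushups!", expensive_result),
            format!("Next, do {} situps!", expensive_result),
        ]
    } else if random_number == 3 {
        // High intensity and the random number is 3: recommend a break.
        vec!["Take a break today! Remember to stay hydrated!".to_string()]
    } else {
        vec![format!("Today, run for {} minutes!", expensive_result)]
    }
}
```

Calling `generate_workout(10, 7)` yields the two low-intensity lines, and `generate_workout(30, 3)` yields the break message. One trade-off of hoisting the call into a variable is that the calculation now also runs in the break branch, which never uses the result; deferring the work until it is needed (for example with a closure) would avoid that cost.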


Big Data and the Welfare State: How the Information Revolution Threatens Social Solidarity by Torben Iversen, Philipp Rehm

23andMe, Affordable Care Act / Obamacare, algorithmic bias, barriers to entry, Big Tech, business cycle, centre right, collective bargaining, COVID-19, crony capitalism, data science, DeepMind, deindustrialization, full employment, George Akerlof, income inequality, information asymmetry, invisible hand, knowledge economy, land reform, lockdown, loss aversion, low interest rates, low skilled workers, microbiome, moral hazard, mortgage debt, Network effects, new economy, obamacare, personalized medicine, Ponzi scheme, price discrimination, principal–agent problem, profit maximization, Robert Gordon, speech recognition, subprime mortgage crisis, tail risk, The Market for Lemons, The Rise and Fall of American Growth, union organizing, vertical integration, working-age population

The pandemic made many traditional underwriting practices impossible, most importantly in-person medical examinations. Life insurance companies immediately sought to replace in-person medical exams – long the centerpiece of risk classification – with alternative ways of credibly assessing an applicant’s health status and history. Advances in data sciences came to the rescue. The approach taken by John Hancock life insurance is instructive here. In early April 2020, John Hancock rolled out access to the “Human API portal,” which allows applicants to give the company direct access to their health records.9 Human API has built up a large infrastructure that allows John Hancock to guzzle up, standardize, and interpret health information from users that authorize access to their data.


pages: 743 words: 201,651

Free Speech: Ten Principles for a Connected World by Timothy Garton Ash

"World Economic Forum" Davos, A Declaration of the Independence of Cyberspace, Aaron Swartz, activist lawyer, Affordable Care Act / Obamacare, Andrew Keen, Apple II, Ayatollah Khomeini, battle of ideas, Berlin Wall, bitcoin, British Empire, Cass Sunstein, Chelsea Manning, citizen journalism, Citizen Lab, Clapham omnibus, colonial rule, critical race theory, crowdsourcing, data science, David Attenborough, digital divide, digital rights, don't be evil, Donald Davies, Douglas Engelbart, dual-use technology, Edward Snowden, Etonian, European colonialism, eurozone crisis, Evgeny Morozov, failed state, Fall of the Berlin Wall, Ferguson, Missouri, Filter Bubble, financial independence, Firefox, Galaxy Zoo, George Santayana, global village, Great Leap Forward, index card, Internet Archive, invention of movable type, invention of writing, Jaron Lanier, jimmy wales, John Markoff, John Perry Barlow, Julian Assange, Laura Poitras, machine readable, machine translation, Mark Zuckerberg, Marshall McLuhan, Mary Meeker, mass immigration, megacity, mutually assured destruction, national security letter, Nelson Mandela, Netflix Prize, Nicholas Carr, obamacare, Open Library, Parler "social media", Peace of Westphalia, Peter Thiel, power law, pre–internet, profit motive, public intellectual, RAND corporation, Ray Kurzweil, Ronald Reagan, semantic web, Sheryl Sandberg, Silicon Valley, Simon Singh, Snapchat, social graph, Stephen Fry, Stephen Hawking, Steve Jobs, Steve Wozniak, Streisand effect, technological determinism, TED Talk, The Death and Life of Great American Cities, The Wisdom of Crowds, Tipper Gore, trolley problem, Turing test, We are Anonymous. We are Legion, WikiLeaks, World Values Survey, Yochai Benkler, Yom Kippur War, yottabyte

Typically, this takes the form of an A/B test, when two algorithmic alternatives are tried out simultaneously on a split sample group. These experiments are being made on us all the time, usually with our formal legal consent (that ‘I Agree’ button again) but without our being aware of it. An experiment conducted by Facebook’s Core Data Science Team in 2012, but only made public in a scientific paper that appeared in 2014, ‘manipulated the extent to which people (N = 689,003) were exposed to emotional expressions in their News Feed’. One group among those 689,003 users had their News Feeds manipulated to select more positive emotional content coming from their Facebook friends, while another got more negative emotional content.


Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems by Martin Kleppmann

active measures, Amazon Web Services, billion-dollar mistake, bitcoin, blockchain, business intelligence, business logic, business process, c2.com, cloud computing, collaborative editing, commoditize, conceptual framework, cryptocurrency, data science, database schema, deep learning, DevOps, distributed ledger, Donald Knuth, Edward Snowden, end-to-end encryption, Ethereum, ethereum blockchain, exponential backoff, fake news, fault tolerance, finite state, Flash crash, Free Software Foundation, full text search, functional programming, general-purpose programming language, Hacker News, informal economy, information retrieval, Internet of things, iterative process, John von Neumann, Ken Thompson, Kubernetes, Large Hadron Collider, level 1 cache, loose coupling, machine readable, machine translation, Marc Andreessen, microservices, natural language processing, Network effects, no silver bullet, operational security, packet switching, peer-to-peer, performance metric, place-making, premature optimization, recommendation engine, Richard Feynman, self-driving car, semantic web, Shoshana Zuboff, social graph, social web, software as a service, software is eating the world, sorting algorithm, source of truth, SPARQL, speech recognition, SQL injection, statistical model, surveillance capitalism, systematic bias, systems thinking, Tragedy of the Commons, undersea cable, web application, WebSocket, wikimedia commons

ISBN: 978-0-553-41881-1 [88] Julia Angwin: “Make Algorithms Accountable,” nytimes.com, August 1, 2016. [89] Bryce Goodman and Seth Flaxman: “European Union Regulations on Algorithmic Decision-Making and a ‘Right to Explanation’,” arXiv:1606.08813, August 31, 2016. [90] “A Review of the Data Broker Industry: Collection, Use, and Sale of Consumer Data for Marketing Purposes,” Staff Report, United States Senate Committee on Commerce, Science, and Transportation, commerce.senate.gov, December 2013. [91] Olivia Solon: “Facebook’s Failure: Did Fake News and Polarized Politics Get Trump Elected?” theguardian.com, November 10, 2016. [92] Donella H. Meadows and Diana Wright: Thinking in Systems: A Primer. Chelsea Green Publishing, 2008. ISBN: 978-1-603-58055-7 [93] Daniel J. Bernstein: “Listening to a ‘big data’/‘data science’ talk,” twitter.com, May 12, 2015. [94] Marc Andreessen: “Why Software Is Eating the World,” The Wall Street Journal, 20 August 2011. [95] J. M. Porup: “‘Internet of Things’ Security Is Hilariously Broken and Getting Worse,” arstechnica.com, January 23, 2016. [96] Bruce Schneier: Data and Goliath: The Hidden Battles to Collect Your Data and Control Your World.


pages: 1,072 words: 237,186

How to Survive a Pandemic by Michael Greger, M.D., FACLM

"Hurricane Katrina" Superdome, Anthropocene, coronavirus, COVID-19, data science, double helix, Edward Jenner, friendly fire, global pandemic, global supply chain, global village, Helicobacter pylori, inventory management, Kickstarter, lockdown, mass immigration, megacity, meta-analysis, New Journalism, out of africa, Peace of Westphalia, phenotype, profit motive, RAND corporation, randomized controlled trial, Ronald Reagan, Saturday Night Live, social distancing, statistical model, stem cell, supply-chain management, the medium is the message, Westphalian system, Y2K, Yogi Berra, zoonotic diseases

Community-acquired pneumonia in patients with chronic obstructive pulmonary disease requiring admission to the intensive care unit: risk factors for mortality. J Crit Care. 28(6):975–979. https://doi.org/10.1016/j.jcrc.2013.08.004. 2735. Aitken M, Kleinrock M. 2018 Apr 19. Medicine use and spending in the U.S.: a review of 2017 and outlook to 2022. Parsippany (NJ): IQVIA Institute for Human Data Science; [accessed 2020 Mar 31]. https://www.iqvia.com/insights/the-iqvia-institute/reports/medicine-use-and-spending-in-the-us-review-of-2017-outlook-to-2022. 2736. Caldeira D, Alarcão J, Vaz-Carneiro A, Costa J. 2012. Risk of pneumonia associated with use of angiotensin converting enzyme inhibitors and angiotensin receptor blockers: systematic review and meta-analysis.


pages: 1,237 words: 227,370

Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems by Martin Kleppmann

active measures, Amazon Web Services, billion-dollar mistake, bitcoin, blockchain, business intelligence, business logic, business process, c2.com, cloud computing, collaborative editing, commoditize, conceptual framework, cryptocurrency, data science, database schema, deep learning, DevOps, distributed ledger, Donald Knuth, Edward Snowden, end-to-end encryption, Ethereum, ethereum blockchain, exponential backoff, fake news, fault tolerance, finite state, Flash crash, Free Software Foundation, full text search, functional programming, general-purpose programming language, Hacker News, informal economy, information retrieval, Infrastructure as a Service, Internet of things, iterative process, John von Neumann, Ken Thompson, Kubernetes, Large Hadron Collider, level 1 cache, loose coupling, machine readable, machine translation, Marc Andreessen, microservices, natural language processing, Network effects, no silver bullet, operational security, packet switching, peer-to-peer, performance metric, place-making, premature optimization, recommendation engine, Richard Feynman, self-driving car, semantic web, Shoshana Zuboff, social graph, social web, software as a service, software is eating the world, sorting algorithm, source of truth, SPARQL, speech recognition, SQL injection, statistical model, surveillance capitalism, systematic bias, systems thinking, Tragedy of the Commons, undersea cable, web application, WebSocket, wikimedia commons

[91] Olivia Solon: “Facebook’s Failure: Did Fake News and Polarized Politics Get Trump Elected?” theguardian.com, November 10, 2016. [92] Donella H. Meadows and Diana Wright: Thinking in Systems: A Primer. Chelsea Green Publishing, 2008. ISBN: 978-1-603-58055-7 [93] Daniel J. Bernstein: “Listening to a ‘big data’/‘data science’ talk,” twitter.com, May 12, 2015. [94] Marc Andreessen: “Why Software Is Eating the World,” The Wall Street Journal, 20 August 2011. [95] J. M. Porup: “‘Internet of Things’ Security Is Hilariously Broken and Getting Worse,” arstechnica.com, January 23, 2016. [96] Bruce Schneier: Data and Goliath: The Hidden Battles to Collect Your Data and Control Your World.


pages: 764 words: 261,694

The Elements of Statistical Learning (Springer Series in Statistics) by Trevor Hastie, Robert Tibshirani, Jerome Friedman

algorithmic bias, backpropagation, Bayesian statistics, bioinformatics, computer age, conceptual framework, correlation coefficient, data science, G4S, Geoffrey Hinton, greed is good, higher-order functions, linear programming, p-value, pattern recognition, random walk, selection bias, sparse data, speech recognition, statistical model, stochastic process, The Wisdom of Crowds

Rumelhart and J. McClelland (eds), Parallel Distributed Processing: Explorations in the Microstructure of Cognition, The MIT Press, Cambridge, MA., pp. 318–362. Sachs, K., Perez, O., Pe’er, D., Lauffenburger, D. and Nolan, G. (2005). Causal protein-signaling networks derived from multiparameter single-cell data, Science 308: 523–529. Schapire, R. (1990). The strength of weak learnability, Machine Learning 5(2): 197–227. Schapire, R. (2002). The boosting approach to machine learning: an overview, in D. Denison, M. Hansen, C. Holmes, B. Mallick and B. Yu (eds), MSRI workshop on Nonlinear Estimation and Classification, Springer, New York.