large language model



pages: 444 words: 117,770

The Coming Wave: Technology, Power, and the Twenty-First Century's Greatest Dilemma by Mustafa Suleyman

"World Economic Forum" Davos, 23andMe, 3D printing, active measures, Ada Lovelace, additive manufacturing, agricultural Revolution, AI winter, air gap, Airbnb, Alan Greenspan, algorithmic bias, Alignment Problem, AlphaGo, Alvin Toffler, Amazon Web Services, Anthropocene, artificial general intelligence, Asilomar, Asilomar Conference on Recombinant DNA, ASML, autonomous vehicles, backpropagation, barriers to entry, basic income, benefit corporation, Big Tech, biodiversity loss, bioinformatics, Bletchley Park, Blitzscaling, Boston Dynamics, business process, business process outsourcing, call centre, Capital in the Twenty-First Century by Thomas Piketty, ChatGPT, choice architecture, circular economy, classic study, clean tech, cloud computing, commoditize, computer vision, coronavirus, corporate governance, correlation does not imply causation, COVID-19, creative destruction, CRISPR, critical race theory, crowdsourcing, cryptocurrency, cuban missile crisis, data science, decarbonisation, deep learning, deepfake, DeepMind, deindustrialization, dematerialisation, Demis Hassabis, disinformation, drone strike, drop ship, dual-use technology, Easter island, Edward Snowden, effective altruism, energy transition, epigenetics, Erik Brynjolfsson, Ernest Rutherford, Extinction Rebellion, facts on the ground, failed state, Fairchild Semiconductor, fear of failure, flying shuttle, Ford Model T, future of work, general purpose technology, Geoffrey Hinton, global pandemic, GPT-3, GPT-4, hallucination problem, hive mind, hype cycle, Intergovernmental Panel on Climate Change (IPCC), Internet Archive, Internet of things, invention of the wheel, job automation, John Maynard Keynes: technological unemployment, John von Neumann, Joi Ito, Joseph Schumpeter, Kickstarter, lab leak, large language model, Law of Accelerating Returns, Lewis Mumford, license plate recognition, lockdown, machine readable, Marc Andreessen, meta-analysis, microcredit, move 37, Mustafa Suleyman, mutually assured destruction, new economy, Nick Bostrom, Nikolai Kondratiev, off grid, OpenAI, paperclip maximiser, personalized medicine, Peter Thiel, planetary scale, plutocrats, precautionary principle, profit motive, prompt engineering, QAnon, quantum entanglement, ransomware, Ray Kurzweil, Recombinant DNA, Richard Feynman, Robert Gordon, Ronald Reagan, Sam Altman, Sand Hill Road, satellite internet, Silicon Valley, smart cities, South China Sea, space junk, SpaceX Starlink, stealth mode startup, stem cell, Stephen Fry, Steven Levy, strong AI, synthetic biology, tacit knowledge, tail risk, techlash, techno-determinism, technoutopianism, Ted Kaczynski, the long tail, The Rise and Fall of American Growth, Thomas Malthus, TikTok, TSMC, Turing test, Tyler Cowen, Tyler Cowen: Great Stagnation, universal basic income, uranium enrichment, warehouse robotics, William MacAskill, working-age population, world market for maybe five computers, zero day

Khrushchev, Nikita, 126
Kilobots, 95
Klausner, Richard, 85
Krizhevsky, Alex, 59
Kurzweil, Ray, 57

L

lab leaks, 173–75, 176
labor markets, 177–81, 261, 262, 282
LaMDA, 71–72, 75
Lander, Eric, 265
language, 27, 157. See also large language models
LanzaTech, 87
large language models (LLMs), 62–65; bias in, 69–70, 239–40; capabilities of, 64–65; deepfakes and, 170; efficiency of, 68; open source and, 69; scale of, 65–66; synthetic biology and, 91
laser weapons, 263
law enforcement, 97–98
Lebanon, 196–97
LeCun, Yann, 130
Lee Sedol, 53–54, 117
Legg, Shane, 8
legislation, 260. See also regulation as method for containment
Lemoine, Blake, 71–72
Lenoir, Jean Joseph Étienne, 23
Li, Fei-Fei, 59
libertarianism, 201
Library of Alexandria, 41
licensing, 261
lithium production, 109
LLaMA system, 69
London DNA Foundry, 83
longevity technologies, 85–86
Luddites, 39, 40, 281–83

M

machine learning: autonomy and, 113; bias in, 69–70, 239–40; computer vision and, 58–60; cyberattacks and, 162–63, 166–67; limitations of, 73; medical applications, 110; military applications and, 103–5; potential of, 61–62; protein structure and, 89–90; robotics and, 95; supervised deep learning, 65; synthetic biology and, 90–91. See also deep learning
Macron, Emmanuel, 125
Malthus, Thomas, 136
Manhattan Project, 41, 124, 126, 141, 270
Maoism, 192
Mao Zedong, 194
Marcus, Gary, 73
Maybach, Wilhelm, 24
McCarthy, John, 73
McCormick, Cyrus, 133
medical applications, 85, 95, 110
megamachine, 217
Megvii, 194
Meta, 69, 128, 167
Micius, 122
Microsoft, 69, 98, 128, 160–61
military applications: AI and, 104, 165; asymmetry and, 106; machine learning and, 103–5; nation-state fragility amplifiers and, 167–69; omni-use technology and, 110–11; robotics and, 165–66
Minsky, Marvin, 58, 130
misinformation.

They feature in shops, schools, hospitals, offices, courts, and homes. You already interact many times a day with AI; soon it will be many more, and almost everywhere it will make experiences more efficient, faster, more useful, and frictionless. AI is already here. But it’s far from done. AUTOCOMPLETE EVERYTHING: THE RISE OF LARGE LANGUAGE MODELS It wasn’t long ago that processing natural language seemed too complex, too varied, too nuanced for modern AI. Then, in November 2022, the AI research company OpenAI released ChatGPT. Within a week it had more than a million users and was being talked about in rapturous terms, a technology so seamlessly useful it might eclipse Google Search in short order.

Back in 2017 a small group of researchers at Google was focused on a narrower version of this problem: how to get an AI system to focus only on the most important parts of a data series in order to make accurate and efficient predictions about what comes next. Their work laid the foundation for what has been nothing short of a revolution in the field of large language models (LLMs)—including ChatGPT. LLMs take advantage of the fact that language data comes in a sequential order. Each unit of information is in some way related to data earlier in a series. The model reads very large numbers of sentences, learns an abstract representation of the information contained within them, and then, based on this, generates a prediction about what should come next.
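
The mechanism that 2017 work introduced, attention, scores how relevant every earlier token is to the current position and blends the sequence accordingly. A minimal numpy sketch of scaled dot-product attention, purely illustrative (the dimensions and random vectors are made up; real models learn separate query, key, and value projections):

```python
import numpy as np

def attention(queries, keys, values):
    """Scaled dot-product attention: weight every position in the
    sequence by how relevant it is to every other position."""
    d_k = queries.shape[-1]
    scores = queries @ keys.T / np.sqrt(d_k)           # token-to-token relevance
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)     # softmax: rows sum to 1
    return weights @ values                            # relevance-weighted blend

# Toy sequence of 4 tokens, each embedded as an 8-dimensional vector.
rng = np.random.default_rng(0)
tokens = rng.normal(size=(4, 8))
context = attention(tokens, tokens, tokens)  # self-attention over the sequence
print(context.shape)  # (4, 8): each token now carries weighted context
```

A full LLM stacks many such layers and projects the final position's output vector onto a vocabulary to score candidate next tokens.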


Four Battlegrounds by Paul Scharre

2021 United States Capitol attack, 3D printing, active measures, activist lawyer, AI winter, AlphaGo, amateurs talk tactics, professionals talk logistics, artificial general intelligence, ASML, augmented reality, Automated Insights, autonomous vehicles, barriers to entry, Berlin Wall, Big Tech, bitcoin, Black Lives Matter, Boeing 737 MAX, Boris Johnson, Brexit referendum, business continuity plan, business process, carbon footprint, chief data officer, Citizen Lab, clean water, cloud computing, commoditize, computer vision, coronavirus, COVID-19, crisis actor, crowdsourcing, DALL-E, data is not the new oil, data is the new oil, data science, deep learning, deepfake, DeepMind, Demis Hassabis, Deng Xiaoping, digital map, digital rights, disinformation, Donald Trump, drone strike, dual-use technology, Elon Musk, en.wikipedia.org, endowment effect, fake news, Francis Fukuyama: the end of history, future of journalism, future of work, game design, general purpose technology, Geoffrey Hinton, geopolitical risk, George Floyd, global supply chain, GPT-3, Great Leap Forward, hive mind, hustle culture, ImageNet competition, immigration reform, income per capita, interchangeable parts, Internet Archive, Internet of things, iterative process, Jeff Bezos, job automation, Kevin Kelly, Kevin Roose, large language model, lockdown, Mark Zuckerberg, military-industrial complex, move fast and break things, Nate Silver, natural language processing, new economy, Nick Bostrom, one-China policy, Open Library, OpenAI, PalmPilot, Parler "social media", pattern recognition, phenotype, post-truth, purchasing power parity, QAnon, QR code, race to the bottom, RAND corporation, recommendation engine, reshoring, ride hailing / ride sharing, robotic process automation, Rodney Brooks, Rubik’s Cube, self-driving car, Shoshana Zuboff, side project, Silicon Valley, slashdot, smart cities, smart meter, Snapchat, social software, sorting algorithm, South China Sea, sparse data, speech recognition, Steve Bannon, Steven Levy, Stuxnet, supply-chain attack, surveillance capitalism, systems thinking, tech worker, techlash, telemarketer, The Brussels Effect, The Signal and the Noise by Nate Silver, TikTok, trade route, TSMC

., The Cost of Training NLP Models: A Concise Overview (AI21 Labs, April 19, 2020), https://arxiv.org/pdf/2004.08900.pdf, 2. Some research suggests that the optimal balance with a fixed amount of compute would be to scale training data size and model size equally and that many recent large language models would perform better if a smaller model were trained on a larger dataset. Jordan Hoffmann et al., Training Compute-Optimal Large Language Models (arXiv.org, March 29, 2022), https://arxiv.org/pdf/2203.15556.pdf. For an analysis of overall trends in dataset size in machine learning research, see Pablo Villalobos, “Trends in Training Dataset Sizes,” Epoch, September 20, 2022, https://epochai.org/blog/trends-in-training-dataset-sizes.
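
The trade-off Hoffmann et al. describe can be made concrete with a rough worked example. Using the common approximation that training compute C ≈ 6·N·D (N parameters, D training tokens) and their empirical finding that the optimum sits near D ≈ 20·N, a compute-optimal split can be backed out of a FLOP budget. Both constants are rules of thumb from that scaling-law literature, so treat the result as an order-of-magnitude estimate:

```python
import math

def compute_optimal(compute_flops: float, tokens_per_param: float = 20.0):
    """Back out a compute-optimal (params, tokens) split from a FLOP budget,
    assuming C ~ 6*N*D and D ~ 20*N (Chinchilla-style rules of thumb)."""
    # C = 6 * N * D and D = k * N  =>  C = 6k * N^2
    n_params = math.sqrt(compute_flops / (6.0 * tokens_per_param))
    n_tokens = tokens_per_param * n_params
    return n_params, n_tokens

# Example: a 1e23 FLOP budget (an illustrative number only).
n, d = compute_optimal(1e23)
print(f"~{n / 1e9:.0f}B parameters trained on ~{d / 1e9:.0f}B tokens")
```

Under these assumptions a fixed budget favors a smaller model fed more data, which is exactly the point the note makes about many recent large models.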

In supervised learning, an algorithm is trained on labeled data. For example, an image classification algorithm may be trained on labeled pictures. Over many iterations, the algorithm learns to associate the image with the label. Unsupervised learning is when an algorithm is trained on unlabeled data and the algorithm learns patterns in the data. Large language models such as GPT-2 and GPT-3 use unsupervised learning. Once trained, they can output sentences and whole paragraphs based on patterns they’ve learned from the text on which they’ve been trained. Reinforcement learning is when an algorithm learns by interacting with its environment and gets rewards for certain behaviors.
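
A compact sketch of the three paradigms this passage distinguishes, on toy data with numpy as the only dependency; the perceptron, 2-means, and bandit below stand in for the three families and are illustrations, not examples from the book:

```python
import numpy as np
rng = np.random.default_rng(0)

# --- Supervised: learn from labeled examples (a perceptron). ---
X = rng.normal(size=(200, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(int)         # labels supplied by a "teacher"
w = np.zeros(2)
for _ in range(10):
    for xi, yi in zip(X, y):
        pred = int(w @ xi > 0)
        w += (yi - pred) * xi                   # nudge weights toward the label

# --- Unsupervised: find patterns in unlabeled data (2-means clustering). ---
centers = X[rng.choice(len(X), 2, replace=False)]
for _ in range(10):
    assign = np.argmin(((X[:, None] - centers) ** 2).sum(-1), axis=1)
    centers = np.array([X[assign == k].mean(axis=0) for k in range(2)])

# --- Reinforcement: learn from rewards (an epsilon-greedy two-armed bandit). ---
true_payout = np.array([0.3, 0.7])              # hidden reward rates of two arms
est, counts = np.zeros(2), np.zeros(2)
for _ in range(1000):
    arm = rng.integers(2) if rng.random() < 0.1 else int(est.argmax())
    reward = float(rng.random() < true_payout[arm])
    counts[arm] += 1
    est[arm] += (reward - est[arm]) / counts[arm]  # running average of rewards

print(w, centers.round(2), est.round(2))
```

A GPT-style model's "unsupervised" training is the middle case at enormous scale: the patterns it learns are which token tends to follow which context, with no human-supplied labels.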

Shifts in the significance of these inputs could advantage some actors and disadvantage others, further altering the global balance of AI power. One of the most striking trends in AI basic research today is the tendency toward ever-larger models with increasingly massive datasets and compute resources for training. The rapid growth in size for large language models, for example, is remarkable. In October 2018, researchers at Google announced BERT-Large, a 340 million parameter language model. It was trained on a database of 3.3 billion words using 64 TPU chips running for four days. A few months later, in February 2019, OpenAI announced GPT-2, a 1.5 billion parameter model trained on 40 GB of text.
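
The pace described here can be made concrete with a little arithmetic on the two quoted data points. This is a back-of-the-envelope reading that assumes smooth exponential growth between the two announcements, an assumption for illustration rather than a claim from the book:

```python
import math

bert_large = 340e6   # parameters, announced October 2018
gpt2 = 1.5e9         # parameters, announced February 2019
months = 4           # roughly four months apart

growth = gpt2 / bert_large                      # ~4.4x larger
doubling_months = months / math.log2(growth)    # implied doubling time
print(f"{growth:.1f}x growth, doubling every ~{doubling_months:.1f} months")
```

At that rate, model size doubles roughly every two months, far faster than any hardware trend.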


pages: 484 words: 104,873

Rise of the Robots: Technology and the Threat of a Jobless Future by Martin Ford

3D printing, additive manufacturing, Affordable Care Act / Obamacare, AI winter, algorithmic management, algorithmic trading, Amazon Mechanical Turk, artificial general intelligence, assortative mating, autonomous vehicles, banking crisis, basic income, Baxter: Rethink Robotics, Bernie Madoff, Bill Joy: nanobots, bond market vigilante, business cycle, call centre, Capital in the Twenty-First Century by Thomas Piketty, carbon tax, Charles Babbage, Chris Urmson, Clayton Christensen, clean water, cloud computing, collateralized debt obligation, commoditize, computer age, creative destruction, data science, debt deflation, deep learning, deskilling, digital divide, disruptive innovation, diversified portfolio, driverless car, Erik Brynjolfsson, factory automation, financial innovation, Flash crash, Ford Model T, Fractional reserve banking, Freestyle chess, full employment, general purpose technology, Geoffrey Hinton, Goldman Sachs: Vampire Squid, Gunnar Myrdal, High speed trading, income inequality, indoor plumbing, industrial robot, informal economy, iterative process, Jaron Lanier, job automation, John Markoff, John Maynard Keynes: technological unemployment, John von Neumann, Kenneth Arrow, Khan Academy, Kiva Systems, knowledge worker, labor-force participation, large language model, liquidity trap, low interest rates, low skilled workers, low-wage service sector, Lyft, machine readable, machine translation, manufacturing employment, Marc Andreessen, McJob, moral hazard, Narrative Science, Network effects, new economy, Nicholas Carr, Norbert Wiener, obamacare, optical character recognition, passive income, Paul Samuelson, performance metric, Peter Thiel, plutocrats, post scarcity, precision agriculture, price mechanism, public intellectual, Ray Kurzweil, rent control, rent-seeking, reshoring, RFID, Richard Feynman, Robert Solow, Rodney Brooks, Salesforce, Sam Peltzman, secular stagnation, self-driving car, Silicon Valley, Silicon Valley billionaire, Silicon Valley startup, single-payer health, software is eating the world, sovereign wealth fund, speech recognition, Spread Networks laid a new fibre optics cable between New York and Chicago, stealth mode startup, stem cell, Stephen Hawking, Steve Jobs, Steven Levy, Steven Pinker, strong AI, Stuxnet, technological singularity, telepresence, telepresence robot, The Bell Curve by Richard Herrnstein and Charles Murray, The Coming Technological Singularity, The Future of Employment, the long tail, Thomas L Friedman, too big to fail, Tragedy of the Commons, Tyler Cowen, Tyler Cowen: Great Stagnation, uber lyft, union organizing, Vernor Vinge, very high income, warehouse automation, warehouse robotics, Watson beat the top human players on Jeopardy!, women in the workforce

Google’s development team began by focusing on official documents prepared by the United Nations and then extended their effort to the Web, where the company’s search engine was able to locate a multitude of examples that became fodder for their voracious self-learning algorithms. The sheer number of documents used to train the system dwarfed anything that had come before. Franz Och, the computer scientist who led the effort, noted that the team had built “very, very large language models, much larger than anyone has ever built in the history of mankind.”8 In 2005, Google entered its system in the annual machine translation competition held by the National Institute of Standards and Technology, an agency within the US Commerce Department that publishes measurement standards. Google’s machine learning algorithms were able to easily outperform the competition—which typically employed language and linguistic experts who attempted to actively program their translation systems to wade through the mire of conflicting and inconsistent grammatical rules that characterize languages.


pages: 396 words: 117,149

The Master Algorithm: How the Quest for the Ultimate Learning Machine Will Remake Our World by Pedro Domingos

Albert Einstein, Amazon Mechanical Turk, Arthur Eddington, backpropagation, basic income, Bayesian statistics, Benoit Mandelbrot, bioinformatics, Black Swan, Brownian motion, cellular automata, Charles Babbage, Claude Shannon: information theory, combinatorial explosion, computer vision, constrained optimization, correlation does not imply causation, creative destruction, crowdsourcing, Danny Hillis, data is not the new oil, data is the new oil, data science, deep learning, DeepMind, double helix, Douglas Hofstadter, driverless car, Erik Brynjolfsson, experimental subject, Filter Bubble, future of work, Geoffrey Hinton, global village, Google Glasses, Gödel, Escher, Bach, Hans Moravec, incognito mode, information retrieval, Jeff Hawkins, job automation, John Markoff, John Snow's cholera map, John von Neumann, Joseph Schumpeter, Kevin Kelly, large language model, lone genius, machine translation, mandelbrot fractal, Mark Zuckerberg, Moneyball by Michael Lewis explains big data, Narrative Science, Nate Silver, natural language processing, Netflix Prize, Network effects, Nick Bostrom, NP-complete, off grid, P = NP, PageRank, pattern recognition, phenotype, planetary scale, power law, pre–internet, random walk, Ray Kurzweil, recommendation engine, Richard Feynman, scientific worldview, Second Machine Age, self-driving car, Silicon Valley, social intelligence, speech recognition, Stanford marshmallow experiment, statistical model, Stephen Hawking, Steven Levy, Steven Pinker, superintelligent machines, the long tail, the scientific method, The Signal and the Noise by Nate Silver, theory of mind, Thomas Bayes, transaction costs, Turing machine, Turing test, Vernor Vinge, Watson beat the top human players on Jeopardy!, white flight, yottabyte, zero-sum game

“Relevance weighting of search terms,”* by Stephen Robertson and Karen Sparck Jones (Journal of the American Society for Information Science, 1976), explains the use of Naïve Bayes–like methods in information retrieval. “First links in the Markov chain,” by Brian Hayes (American Scientist, 2013), recounts Markov’s invention of the eponymous chains. “Large language models in machine translation,”* by Thorsten Brants et al. (Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, 2007), explains how Google Translate works. “The PageRank citation ranking: Bringing order to the Web,”* by Larry Page, Sergey Brin, Rajeev Motwani, and Terry Winograd (Stanford University technical report, 1998), describes the PageRank algorithm and its interpretation as a random walk over the web.
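
The random-walk interpretation in the Page et al. report is easy to state in code: repeatedly redistribute rank along links, with a damping factor standing in for a surfer who occasionally jumps to a random page. A minimal power-iteration sketch over a hypothetical four-page web (the link graph is invented for illustration; 0.85 is the damping value commonly used in the PageRank literature):

```python
import numpy as np

# Hypothetical link graph: links[i] = pages that page i links to.
links = {0: [1, 2], 1: [2], 2: [0], 3: [2]}
n, damping = 4, 0.85

# Column-stochastic transition matrix: M[j, i] = P(move to page j | at page i).
M = np.zeros((n, n))
for i, outs in links.items():
    for j in outs:
        M[j, i] = 1.0 / len(outs)

rank = np.full(n, 1.0 / n)
for _ in range(100):  # power iteration: run the random walk to its fixed point
    rank = (1 - damping) / n + damping * (M @ rank)
print(rank.round(3))  # stationary distribution = PageRank scores
```

The scores are exactly the long-run fraction of time the random surfer spends on each page.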


pages: 370 words: 112,809

The Equality Machine: Harnessing Digital Technology for a Brighter, More Inclusive Future by Orly Lobel

2021 United States Capitol attack, 23andMe, Ada Lovelace, affirmative action, Airbnb, airport security, Albert Einstein, algorithmic bias, Amazon Mechanical Turk, augmented reality, barriers to entry, basic income, Big Tech, bioinformatics, Black Lives Matter, Boston Dynamics, Charles Babbage, choice architecture, computer vision, Computing Machinery and Intelligence, contact tracing, coronavirus, corporate social responsibility, correlation does not imply causation, COVID-19, crowdsourcing, data science, David Attenborough, David Heinemeier Hansson, deep learning, deepfake, digital divide, digital map, Elon Musk, emotional labour, equal pay for equal work, feminist movement, Filter Bubble, game design, gender pay gap, George Floyd, gig economy, glass ceiling, global pandemic, Google Chrome, Grace Hopper, income inequality, index fund, information asymmetry, Internet of things, invisible hand, it's over 9,000, iterative process, job automation, Lao Tzu, large language model, lockdown, machine readable, machine translation, Mark Zuckerberg, market bubble, microaggression, Moneyball by Michael Lewis explains big data, natural language processing, Netflix Prize, Network effects, Northpointe / Correctional Offender Management Profiling for Alternative Sanctions, occupational segregation, old-boy network, OpenAI, openstreetmap, paperclip maximiser, pattern recognition, performance metric, personalized medicine, price discrimination, publish or perish, QR code, randomized controlled trial, remote working, risk tolerance, robot derives from the Czech word robota, meaning slave, Ronald Coase, Salesforce, self-driving car, sharing economy, Sheryl Sandberg, Silicon Valley, social distancing, social intelligence, speech recognition, statistical model, stem cell, Stephen Hawking, Steve Jobs, Steve Wozniak, surveillance capitalism, tech worker, TechCrunch disrupt, The Future of Employment, TikTok, Turing test, universal basic income, Wall-E, warehouse automation, women in the workforce, work culture, you are the product

Timnit Gebru—a rising star in AI research with an extraordinary path from Ethiopia to Eritrea to political asylum in the United States, to three degrees at Stanford, to Apple, to Microsoft, and then to Google—was ousted from Google over a dispute with company executives about publishing an article on the potential risks and harms of large language models. The article warns that ever larger natural language processing models may be too large to monitor, that the sheer mass of the data becomes inscrutable. The paper calls for curating and documenting data sets “rather than ingesting everything on the web.”3 It also warns against spreading claims about models understanding language and concepts, as opposed to simply identifying patterns in human-made texts, numbers, and images.


pages: 666 words: 181,495

In the Plex: How Google Thinks, Works, and Shapes Our Lives by Steven Levy

"World Economic Forum" Davos, 23andMe, AltaVista, Andy Rubin, Anne Wojcicki, Apple's 1984 Super Bowl advert, autonomous vehicles, Bill Atkinson, book scanning, Brewster Kahle, Burning Man, business process, clean water, cloud computing, crowdsourcing, Dean Kamen, discounted cash flows, don't be evil, Donald Knuth, Douglas Engelbart, Douglas Engelbart, Dutch auction, El Camino Real, Evgeny Morozov, fault tolerance, Firefox, General Magic , Gerard Salton, Gerard Salton, Google bus, Google Chrome, Google Earth, Googley, high-speed rail, HyperCard, hypertext link, IBM and the Holocaust, informal economy, information retrieval, Internet Archive, Jeff Bezos, John Markoff, Ken Thompson, Kevin Kelly, Kickstarter, large language model, machine translation, Mark Zuckerberg, Menlo Park, one-China policy, optical character recognition, PageRank, PalmPilot, Paul Buchheit, Potemkin village, prediction markets, Project Xanadu, recommendation engine, risk tolerance, Rubik’s Cube, Sand Hill Road, Saturday Night Live, search inside the book, second-price auction, selection bias, Sheryl Sandberg, Silicon Valley, SimCity, skunkworks, Skype, slashdot, social graph, social software, social web, spectrum auction, speech recognition, statistical model, Steve Ballmer, Steve Jobs, Steven Levy, subscription business, Susan Wojcicki, Ted Nelson, telemarketer, The future is already here, the long tail, trade route, traveling salesman, turn-by-turn navigation, undersea cable, Vannevar Bush, web application, WikiLeaks, Y Combinator

Och’s official role was as a scientist in Google’s research group, but it is indicative of Google’s view of research that no step was required to move beyond study into actual product implementation. Because Och and his colleagues knew they would have access to an unprecedented amount of data, they worked from the ground up to create a new translation system. “One of the things we did was to build very, very, very large language models, much larger than anyone has ever built in the history of mankind.” Then they began to train the system. To measure progress, they used a statistical model that, given a series of words, would predict the word that came next. Each time they doubled the amount of training data, they got a 0.5 percent boost in the metrics that measured success in the results.
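
The statistical model described here is, in essence, an n-gram language model: count how often each word follows a given context in the training text, then predict the most frequent continuation. A toy bigram version, purely illustrative and trained on a made-up corpus vastly smaller than anything Och's team built:

```python
from collections import Counter, defaultdict

corpus = ("the cat sat on the mat . the cat ate . "
          "the dog sat on the rug .").split()

# Count, for every word, what follows it in the training text.
follows = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1

def predict_next(word: str) -> str:
    """Return the continuation seen most often after this word in training."""
    return follows[word].most_common(1)[0][0]

print(predict_next("the"))   # 'cat': seen more often than 'dog', 'mat', or 'rug'
print(predict_next("sat"))   # 'on'
```

Doubling the training text enriches these counts and covers more contexts, which is the mechanism behind the steady 0.5 percent gains the team measured.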


pages: 562 words: 201,502

Elon Musk by Walter Isaacson

4chan, activist fund / activist shareholder / activist investor, Airbnb, Albert Einstein, AltaVista, Apollo 11, Apple II, Apple's 1984 Super Bowl advert, artificial general intelligence, autism spectrum disorder, autonomous vehicles, basic income, Big Tech, blockchain, Boston Dynamics, Burning Man, carbon footprint, ChatGPT, Chuck Templeton: OpenTable, Clayton Christensen, clean tech, Colonization of Mars, computer vision, Computing Machinery and Intelligence, coronavirus, COVID-19, crowdsourcing, cryptocurrency, deep learning, DeepMind, Demis Hassabis, disinformation, Dogecoin, Donald Trump, Douglas Engelbart, drone strike, effective altruism, Elon Musk, estate planning, fail fast, fake news, game design, gigafactory, GPT-4, high-speed rail, hiring and firing, hive mind, Hyperloop, impulse control, industrial robot, information security, Jeff Bezos, Jeffrey Epstein, John Markoff, John von Neumann, Jony Ive, Kwajalein Atoll, lab leak, large language model, Larry Ellison, lockdown, low earth orbit, Marc Andreessen, Marc Benioff, Mars Society, Max Levchin, Michael Shellenberger, multiplanetary species, Neil Armstrong, Network effects, OpenAI, packet switching, Parler "social media", paypal mafia, peer-to-peer, Peter Thiel, QAnon, Ray Kurzweil, reality distortion field, remote working, rent control, risk tolerance, Rubik’s Cube, Salesforce, Sam Altman, Sam Bankman-Fried, San Francisco homelessness, Sand Hill Road, Saturday Night Live, self-driving car, seminal paper, short selling, Silicon Valley, Skype, SpaceX Starlink, Stephen Hawking, Steve Jobs, Steve Jurvetson, Steve Wozniak, Steven Levy, Streisand effect, supply-chain management, tech bro, TED Talk, Tesla Model S, the payments system, Tim Cook: Apple, universal basic income, Vernor Vinge, vertical integration, Virgin Galactic, wikimedia commons, William MacAskill, work culture, Y Combinator

This meant that his engineers were actually ahead of OpenAI in creating full-fledged artificial general intelligence, which requires both abilities. “Tesla’s real-world AI is underrated,” he said. “Imagine if Tesla and OpenAI had to swap tasks. They would have to make Self-Driving, and we would have to make large language-model chatbots. Who wins? We do.” In April, Musk assigned Babuschkin and his team three major goals. The first was to make an AI bot that could write computer code. A programmer could begin typing in any coding language, and the X.AI bot would auto-complete the task for the most likely action they were trying to take.