semantic web

68 results


Mastering Structured Data on the Semantic Web: From HTML5 Microdata to Linked Open Data by Leslie Sikos

AGPL, Amazon Web Services, bioinformatics, business process, cloud computing, create, read, update, delete, Debian, en.wikipedia.org, fault tolerance, Firefox, Google Chrome, Google Earth, information retrieval, Infrastructure as a Service, Internet of things, linked data, machine readable, machine translation, natural language processing, openstreetmap, optical character recognition, platform as a service, search engine result page, semantic web, Silicon Valley, social graph, software as a service, SPARQL, text mining, Watson beat the top human players on Jeopardy!, web application, Wikidata, wikimedia commons, Wikivoyage

The next chapter will show you how to store triples and quads efficiently in purpose-built graph databases: triplestores and quadstores. References 1. Domingue, J., Martin, D. (2008) Introduction to the Semantic Web Tutorial. The 7th International Semantic Web Conference, 26–30 October, 2008, Karlsruhe, Germany. 2. Facca, F. M., Krummenacher, R. (2008) Semantic Web Services in a Nutshell. Silicon Valley Semantic Web Meet Up, USA. 3. Stollberg, M., Haller, A. (2005) Semantic Web Services Tutorial. 3rd International Conference on Web Services, Orlando, FL, USA, 11 July 2005. 4. Chinnici, R., Moreau, J.

Web Semantics: Science, Services and Agents on the World Wide Web 2014, http://dx.doi.org/10.1016/j.websem.2014.07.004. 13. Oinonen, K. (2005) On the road to business application of Semantic Web technology. Semantic Web in Business: How to proceed. In: Industrial Applications of Semantic Web: Proceedings of the 1st IFIP WG12.5 Working Conference on Industrial Applications of Semantic Web. International Federation for Information Processing. Springer Science+Business Media, Inc., New York. 14. Murphy, T. (2010) Lin Clark On Why Drupal Matters. Socialmedia. http://socialmedia.net/2010/09/07/lin-clark-on-why-drupal-matters.

The mainstream XML-based standards for web service interoperability specify only the syntax of messages, not their semantic meaning. Semantic Web technologies can enhance service-oriented environments with well-defined, rich semantics. Semantic Web services leverage Semantic Web technologies to automate services and enable automatic service discovery, composition, and execution across heterogeneous users and domains. Semantic Web Service Modeling. Web services are programs that are accessible programmatically over standard Internet protocols, using reusable components [1]. Web services are distributed and encapsulate discrete functionality. Semantic Web Services (SWS) make web service characteristics machine-interpretable via semantics.


RDF Database Systems: Triples Storage and SPARQL Query Processing by Olivier Cure, Guillaume Blin

Amazon Web Services, bioinformatics, business intelligence, cloud computing, database schema, fault tolerance, folksonomy, full text search, functional programming, information retrieval, Internet Archive, Internet of things, linked data, machine readable, NP-complete, peer-to-peer, performance metric, power law, random walk, recommendation engine, RFID, semantic web, Silicon Valley, social intelligence, software as a service, SPARQL, sparse data, web application

In: Proceedings of the 5th International Workshop on Scalable Semantic Web Knowledge Base Systems. IEEE Computer Society, Washington, DC, pp. 17–32. Krötzsch, M., 2011. Efficient rule-based inferencing for OWL EL. IJCAI, 2668–2673. Ladwig, G., Harth, A., 2011. Cumulus RDF: Linked data management on nested key-value stores. In: Proceedings of the 7th International Workshop on Scalable Semantic Web Knowledge Base Systems at the 10th International Semantic Web Conference (ISWC2011), Springer Berlin Heidelberg, pp. 30–42. Ladwig, G., Tran, T., 2010. Linked data query processing strategies. In: International Semantic Web Conference, vol. 1, pp. 453–469.

Of course, the Web, and in particular the Web of Data, is an important document provider; in that context, we then talk about a Semantic Web consisting of a set of technologies that support this whole process. In Berners-Lee et al. (2001), the Semantic Web is defined as “an extension of the current web in which information is given well-defined meaning, better enabling computers and people to work in cooperation” (p. 1). This emphasizes that there is no rupture between a previous non-semantic Web and a semantic one. They will both rely on concepts such as HTTP, URIs, and the stack of representational standards such as HyperText Markup Language (HTML), Cascading Style Sheets (CSS), and all accompanying programming technologies like JavaScript, Ruby, and PHP: Hypertext Preprocessor (PHP).

This will support the development and maintenance of novel mashup-based applications (i.e., applications that mix distinct and previously unrelated information resources) that go far beyond what is used today. A first question one may ask is how popular the Semantic Web is becoming. First, the fact that the principles or even the main technologies of the Semantic Web may not be known by the general public cannot be considered a setback of the overall approach. Semantic Web technologies are expected to be present on the server side, not on the client side (i.e., web browsers). So they should be invisible to the general public and need only be known to application designers and developers.


pages: 315 words: 70,044

Learning SPARQL by Bob Ducharme

database schema, Donald Knuth, en.wikipedia.org, G4S, linked data, machine readable, semantic web, SPARQL, web application

Later chapters describe how to create more complex queries, how to modify data, how to build applications around your queries, and how it all fits into the semantic web, but if you can execute the queries shown in this chapter, you’re ready to put SPARQL to work for you. Chapter 2. The Semantic Web, RDF, and Linked Data (and SPARQL) SPARQL is a query language for data that follows a particular model, but the semantic web isn’t about the query language or about the model—it’s about the data. The booming amount of data becoming available on the semantic web is making great new kinds of applications possible, and as a well-implemented, mature standard designed with the semantic web in mind, SPARQL is the best way to get that data and put it to work in your applications.

What Exactly Is the “Semantic Web”? As excitement over the semantic web grows, some vendors use the phrase to sell products with strong connections to the ideas behind the semantic web, and others use it to sell products with weaker connections. This can be confusing for people trying to understand the semantic web landscape. I like to define the semantic web as a set of standards and best practices for sharing data and the semantics of that data over the web for use by applications. Let’s look at this definition one or two phrases at a time, and then we’ll look at these issues in more detail.

, Querying the Data, More Realistic Data and Matching on Multiple Triples, URLs, URIs, IRIs, and Namespaces, Storing RDF in Databases, Data That Might Not Be There, Searching Further in the Data, Querying a Remote SPARQL Service, Creating New Data, Using Existing SPARQL Rules Vocabularies, Deleting and Replacing Triples in Named Graphs, Middleware SPARQL Support join (SPARQL equivalent), Searching Further in the Data normalization and, Creating New Data outer join (SPARQL equivalent), Data That Might Not Be There row ID values and, More Realistic Data and Matching on Multiple Triples, URLs, URIs, IRIs, and Namespaces SPARQL middleware and, Middleware SPARQL Support SPARQL rules and, Using Existing SPARQL Rules Vocabularies SQL habits, Querying the Data remote SPARQL service, querying, Querying a Remote SPARQL Service, Querying a Remote SPARQL Service Resource Description Format, The Data to Query (see RDF) round(), Numeric Functions S sample code, Using Code Examples schema, What Exactly Is the “Semantic Web”?, Glossary Schemarama, Using Existing SPARQL Rules Vocabularies screen scraping, What Exactly Is the “Semantic Web”?, Storing RDF in Files, Glossary searching for string, Searching for Strings SELECT, Querying the Data, Query Forms: SELECT, DESCRIBE, ASK, and CONSTRUCT semantic web, What Exactly Is the “Semantic Web”? 
semantics, What Exactly Is the “Semantic Web”?, Reusing and Creating Vocabularies: RDF Schema and OWL semicolon, Storing RDF in Files, More Readable Query Results, Converting Data, Named Graphs CONSTRUCT queries and, Converting Data in N3 and Turtle, Storing RDF in Files serialization, Storing RDF in Files, Glossary SERVICE, Querying a Remote SPARQL Service simple literal, Glossary SKOS, Making RDF More Readable with Language Tags and Labels, Datatypes and Queries, Checking, Adding, and Removing Spoken Language Tags creating, Checking, Adding, and Removing Spoken Language Tags custom datatypes and, Datatypes and Queries SKOS-XL, Changing Existing Data SNORQL, Querying a Public Data Source sorting data, Sorting Data space before SPARQL punctuation, The Data to Query SPARQL, Jumping Right In: Some Data and Some Queries, Jumping Right In: Some Data and Some Queries, The Data to Query, Querying the Data, Querying the Data, Querying the Data, Storing RDF in Databases, The SPARQL Specifications, The SPARQL Specifications, The SPARQL Specifications, Updating Data with SPARQL, Named Graphs, Glossary comments, The Data to Query engine, Querying the Data Graph Store HTTP Protocol specification, Named Graphs processor, Querying the Data protocol, Jumping Right In: Some Data and Some Queries, The SPARQL Specifications query language, The SPARQL Specifications SPARQL 1.1, Updating Data with SPARQL specifications, The SPARQL Specifications triplestores and, Storing RDF in Databases uppercase keywords, Querying the Data SPARQL endpoint, Querying a Public Data Source, SPARQL and Web Application Development, Triplestore SPARQL Support, Glossary creating your own, Triplestore SPARQL Support SPARQL processor, Glossary SPARQL protocol, Glossary SPARQL Query Results XML Format, The SPARQL Specifications, SPARQL Query Results XML Format, Standalone Processors as ARQ output, Standalone Processors SPARQL rules, Defining Rules with SPARQL, Defining Rules with SPARQL SPIN, Using Existing 
SPARQL Rules Vocabularies spreadsheets, Checking, Adding, and Removing Spoken Language Tags SQL, Querying the Data, Glossary square braces, Blank Nodes and Why They’re Useful, Using Existing SPARQL Rules Vocabularies str(), Node Type Conversion Functions STRDT(), Datatype Conversion STRENDS(), String Functions string datatype, Datatypes and Queries, Representing Strings striping, Storing RDF in Files, Glossary STRLANG(), Checking, Adding, and Removing Spoken Language Tags STRLEN(), String Functions STRSTARTS(), String Functions subject (of triple), The Data to Query, URLs, URIs, IRIs, and Namespaces, The Resource Description Format (RDF), Glossary namespaces and, URLs, URIs, IRIs, and Namespaces subqueries, Queries in Your Queries, Combining Values and Assigning Values to Variables, Federated Queries: Searching Multiple Datasets with One Query SUBSTR(), Creating New Data, String Functions subtraction, Comparing Values and Doing Arithmetic SUM(), Finding the Smallest, the Biggest, the Count, the Average...


pages: 511 words: 111,423

Learning SPARQL by Bob Ducharme

business logic, Donald Knuth, en.wikipedia.org, G4S, hypertext link, linked data, machine readable, place-making, semantic web, SPARQL, web application

Note The flexibility of the RDF data model means that it’s being used more and more with projects that have nothing to do with the “semantic web” other than their use of technology that uses these standards—that’s why you’ll often see references to “semantic web technology.” What Exactly Is the “Semantic Web”? As excitement over the semantic web grows, some vendors use the phrase to sell products with strong connections to the ideas behind the semantic web, and others use it to sell products with weaker connections. This can be confusing for people trying to understand the semantic web landscape. I like to define the semantic web as a set of standards and best practices for sharing data and the semantics of that data over the Web for use by applications.

Summary. In this chapter, we learned: what SPARQL is; the basics of RDF; the meaning and role of URIs; the parts of a simple SPARQL query; how to execute a SPARQL query with ARQ; how the same variable in multiple triple patterns can connect up the data in different triples; what can lead to a query returning nothing; and what SPARQL endpoints are and how to query the most popular one, DBpedia. Later chapters describe how to create more complex queries, how to modify data, how to build applications around your queries, the potential role of inferencing, and the technology’s roots in the semantic web world, but if you can execute the queries shown in this chapter, you’re ready to put SPARQL to work for you. Chapter 2. The Semantic Web, RDF, and Linked Data (and SPARQL). The SPARQL query language is for data that follows a particular model, but the semantic web isn’t about the query language or about the model; it’s about the data. The booming amount of data becoming available on the semantic web is making great new kinds of applications possible, and as a well-implemented, mature standard designed with the semantic web in mind, SPARQL is the best way to get that data and put it to work in your applications.
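The summary item about one variable appearing in several triple patterns can be illustrated with a small sketch. The code below is not SPARQL and not ARQ; it is a toy pattern matcher in plain Python, and the `ex:` identifiers, the data, and the function names are all made up for illustration. It only aims to show why reusing `?company` in two patterns joins data from different triples, and how a pattern with no matching triple makes the whole query return nothing.

```python
# Toy triple store: a list of (subject, predicate, object) tuples.
# All identifiers below are hypothetical example data.
TRIPLES = [
    ("ex:alice", "ex:worksFor", "ex:acme"),
    ("ex:bob", "ex:worksFor", "ex:globex"),
    ("ex:acme", "ex:locatedIn", "ex:berlin"),
    ("ex:globex", "ex:locatedIn", "ex:oslo"),
]


def match_pattern(pattern, triple, bindings):
    """Match one triple pattern against one triple.

    Terms starting with '?' are variables. Returns an extended copy of
    `bindings` on success, or None if the triple does not fit.
    """
    new_bindings = dict(bindings)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # A variable that is already bound must keep the same value.
            if new_bindings.get(term, value) != value:
                return None
            new_bindings[term] = value
        elif term != value:
            return None
    return new_bindings


def query(patterns, triples):
    """Return every set of variable bindings that satisfies all patterns."""
    solutions = [{}]
    for pattern in patterns:
        next_solutions = []
        for bindings in solutions:
            for triple in triples:
                extended = match_pattern(pattern, triple, bindings)
                if extended is not None:
                    next_solutions.append(extended)
        solutions = next_solutions
    return solutions


# ?company appears in both patterns, so it links the two triples.
PATTERNS = [
    ("?person", "ex:worksFor", "?company"),
    ("?company", "ex:locatedIn", "?city"),
]

results = query(PATTERNS, TRIPLES)
for row in results:
    print(row["?person"], "works in", row["?city"])

# A pattern that no triple satisfies leaves no solutions at all.
empty = query([("?person", "ex:worksFor", "ex:initech")], TRIPLES)
```

A real SPARQL processor plans and optimizes this join; the nested loops here are only the simplest way to make the binding behavior visible.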

, Storing RDF in Databases, Querying a Remote SPARQL Service, Deleting and Replacing Triples in Named Graphs (see also SQL) join (SPARQL equivalent), Searching Further in the Data normalization and, Creating New Data outer join (SPARQL equivalent), Data That Might Not Be There row ID values and, More Realistic Data and Matching on Multiple Triples, URLs, URIs, IRIs, and Namespaces SPARQL middleware and, Middleware SPARQL Support SPARQL rules and, Using Existing SPARQL Rules Vocabularies remote SPARQL service, querying, Querying a Remote SPARQL Service–Querying a Remote SPARQL Service Resource Description Framework (see RDF) REST, SPARQL and HTTP restriction classes, SPARQL and OWL Inferencing round(), Numeric Functions Ruby, SPARQL and Web Application Development rules, SPARQL (see SPARQL rules) S sameTerm(), Node Type and Datatype Checking Functions sample code, Using Code Examples schema, What Exactly Is the “Semantic Web”?, Glossary querying, Querying Schemas Schemarama, Using Existing SPARQL Rules Vocabularies Schematron, Finding Bad Data screen scraping, What Exactly Is the “Semantic Web”?, Storing RDF in Files, Glossary search space, Reduce the Search Space searching for string, Searching for Strings SELECT, Querying the Data, Query Forms: SELECT, DESCRIBE, ASK, and CONSTRUCT semantic web, What Exactly Is the “Semantic Web”?, Glossary semantics, What Exactly Is the “Semantic Web”?, Reusing and Creating Vocabularies: RDF Schema and OWL semicolon, More Readable Query Results connecting operations with, Named Graphs CONSTRUCT queries and, Converting Data in N3 and Turtle, Storing RDF in Files serialization, Storing RDF in Files, Glossary SERVICE, Querying a Remote SPARQL Service Sesame triplestore, Querying Named Graphs, Datatypes and Queries inferencing with, Inferred Triples and Your Query repositories, SPARQL and HTTP simple literal, Glossary SKOS, Making RDF More Readable with Language Tags and Labels creating, Checking, Adding, and Removing Spoken Language 
Tags custom datatypes and, Datatypes and Queries SKOS-XL, Changing Existing Data SNORQL, Querying a Public Data Source sorting, Sorting Data query efficiency and, Efficiency Outside the WHERE Clause space before SPARQL punctuation, The Data to Query SPARQL, Jumping Right In: Some Data and Some Queries, Glossary comments, The Data to Query endpoint, Querying a Remote SPARQL Service engine, Querying the Data Graph Store HTTP Protocol specification, Named Graphs processor, Querying the Data protocol, Jumping Right In: Some Data and Some Queries, The SPARQL Specifications query language, The SPARQL Specifications SPARQL 1.1, Updating Data with SPARQL specifications, The SPARQL Specifications triplestores and, Storing RDF in Databases uppercase keywords, Querying the Data SPARQL algebra, SPARQL Algebra SPARQL endpoint, Querying a Public Data Source, Public Endpoints, Private Endpoints–Public Endpoints, Private Endpoints, Glossary creating your own, Triplestore SPARQL Support identifier, SPARQL and Web Application Development Linked Data Cloud and, Problem retrieving triples from, Problem SERVICE keyword and, Federated Queries: Searching Multiple Datasets with One Query SPARQL processor, SPARQL Processors–Public Endpoints, Private Endpoints, Glossary SPARQL protocol, Glossary SPARQL Query Results CSV and TSV Formats, SPARQL Query Results CSV and TSV Formats SPARQL Query Results JSON Format, SPARQL Query Results JSON Format SPARQL Query Results XML Format, The SPARQL Specifications, SPARQL Query Results XML Format as ARQ output, Standalone Processors SPARQL rules, Defining Rules with SPARQL–Defining Rules with SPARQL SPIN (SPARQL Inferencing Notation), Using Existing SPARQL Rules Vocabularies, Using SPARQL to Do Your Inferencing spreadsheets, Checking, Adding, and Removing Spoken Language Tags, Using CSV Query Results SQL, Querying the Data, Query Forms: SELECT, DESCRIBE, ASK, and CONSTRUCT, Middleware SPARQL Support, Glossary habits, Querying the Data square braces, 
Blank Nodes and Why They’re Useful, Using Existing SPARQL Rules Vocabularies str(), Node Type Conversion Functions CSV format and, SPARQL Query Results CSV and TSV Formats STRDT(), Datatype Conversion STRENDS(), String Functions string converting to URI, Problem datatype, Datatypes and Queries, Representing Strings functions, String Functions–String Functions searching for substrings, Problem striping, Storing RDF in Files, Glossary STRLANG(), Checking, Adding, and Removing Spoken Language Tags STRLEN(), String Functions STRSTARTS(), String Functions subject (of triple), The Data to Query, The Resource Description Framework (RDF), Glossary namespaces and, URLs, URIs, IRIs, and Namespaces subqueries, Queries in Your Queries, Combining Values and Assigning Values to Variables, Federated Queries: Searching Multiple Datasets with One Query SUBSTR(), Creating New Data, String Functions subtraction, Comparing Values and Doing Arithmetic SUM(), Finding the Smallest, the Biggest, the Count, the Average...


pages: 541 words: 109,698

Mining the Social Web: Finding Needles in the Social Haystack by Matthew A. Russell

Andy Rubin, business logic, Climategate, cloud computing, crowdsourcing, data science, en.wikipedia.org, fault tolerance, Firefox, folksonomy, full text search, Georg Cantor, Google Earth, information retrieval, machine readable, Mark Zuckerberg, natural language processing, NP-complete, power law, Saturday Night Live, semantic web, Silicon Valley, slashdot, social graph, social web, sparse data, statistical model, Steve Jobs, supply-chain management, text mining, traveling salesman, Turing test, web application

A rotating tag cloud that’s highly customizable and requires very little effort to get up and running. Chapter 10. The Semantic Web: A Cocktail Discussion. While the previous chapters attempted to provide an overview of the social web and motivate you to get busy hacking on data, it seems appropriate to wrap up with a brief postscript on the semantic web. This short discussion makes no attempt to regurgitate the reams of interesting mailing list discussions, blog posts, and other sources of information that document the origin of the Web, how it has revolutionized just about everything in our lives in under two decades, and how the semantic web has always been a part of that vision.

People and their virtual and real-world social connections and activities. Web 3.0 (the semantic web): prolific amounts of machine-understandable content. [62] As defined in Programming the Semantic Web, by Toby Segaran, Jamie Taylor, and Colin Evans (O’Reilly). [63] Inter-net literally implies “mutual or cooperating networks.” Man Cannot Live on Facts Alone. The semantic web’s fundamental construct for representing knowledge is called a triple, which is a highly intuitive and very natural way of expressing a fact. As an example, the sentence we’ve considered on many previous occasions, “Mr. Green killed Colonel Mustard in the study with the candlestick,” expressed as a triple might be something like (Mr.

The Resource Description Framework (RDF) is the semantic web’s model for defining and enabling the exchange of triples. RDF is highly extensible in that while it provides a basic foundation for expressing knowledge, it can also be used to define specialized vocabularies called ontologies that provide precise semantics for modeling specific domains. More than a passing mention of specific semantic web technologies such as RDF, RDFa, RDF Schema, and OWL would be well out of scope here at the eleventh hour, but we will work through a high-level example that attempts to explain some of the hype around the semantic web in general. Open-World Versus Closed-World Assumptions One interesting difference between the way inference works in logic programming languages such as Prolog[64] as opposed to in other technologies, such as the RDF stack, is whether they make open-world or closed-world assumptions about the universe.
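The open-world versus closed-world distinction introduced above can be made concrete with a short sketch. It assumes the standard textbook definitions (under a closed-world assumption an unstated fact is taken to be false; under an open-world assumption it is merely unknown), which are not spelled out in this excerpt. The triple layout and the function names are hypothetical, and the code models neither Prolog nor any RDF library.

```python
from enum import Enum


class Answer(Enum):
    TRUE = "true"
    FALSE = "false"
    UNKNOWN = "unknown"


# One known fact, written as a (subject, predicate, object) triple.
# The exact shape of the triple is an assumption made for this sketch.
FACTS = {("MrGreen", "killed", "ColonelMustard")}


def ask_closed_world(fact, facts):
    """Closed world: anything not stated is assumed to be false."""
    return Answer.TRUE if fact in facts else Answer.FALSE


def ask_open_world(fact, facts):
    """Open world: anything not stated is simply unknown."""
    return Answer.TRUE if fact in facts else Answer.UNKNOWN


known = ("MrGreen", "killed", "ColonelMustard")
unstated = ("MrGreen", "killed", "ProfessorPlum")

print(ask_closed_world(known, FACTS), ask_open_world(known, FACTS))
print(ask_closed_world(unstated, FACTS), ask_open_world(unstated, FACTS))
```

The two functions agree on stated facts and differ only on unstated ones, which is where the two assumptions part ways.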


pages: 214 words: 14,382

Monadic Design Patterns for the Web by L.G. Meredith

barriers to entry, domain-specific language, don't repeat yourself, finite state, functional programming, Georg Cantor, ghettoisation, higher-order functions, John von Neumann, Kickstarter, semantic web, seminal paper, social graph, type inference, web application, WebSocket

9.5 Foundations. Chapter 10: The Semantic Web. Where are we; how did we get here; and where are we going? [Figure 10.1 · Chapter 10 map, relating User, browser, request stream, http parser, app request parser, navigation model, domain model, storage model, query model, and store to Chapters 1 through 10.] 10.1 Practice. 10.2 Referential transparency. In the interest of complete transparency, it is important for me to be clear about my position on the current approach to the semantic web.

., mn), m0 ∈ [[c]], mi ∈ [[ci]]}
LET: val x = c;d | SEQ: c;d | GROUP: { c }
PROBE: [[⟨d⟩c]] = {m ∈ L(m) | ∃m′ ∈ [[d]]. m′(m) → m″, m″ ∈ [[c]]}
Other collection monads, other logics. Stateful collections. Other logical operations.
EXPRESSION c, d ::= ... (PREVIOUS) | ∀v.c (QUANTIFICATION) | rec X.c (FIXPT DEFN) | X (FIXPT MENTION)
10.5 Searching for programs. A new foundation for search. Monad composition via distributive laws. Examples. 10.6 Foundations. [Figure residue: data1 ... dataK, { form1 ... formK }, constraint1 ... constraintN]

Contents: List of Figures; List of Tables; List of Listings; Acknowledgments; 1. Motivation and Background; 2. Toolbox; 3. An I/O Monad for HTTP Streams; 4. Parsing Requests, Monadically; 5. The Domain Model as Abstract Syntax; 6. Zippers and Contexts and URIs, Oh My!; 7. A Review of Collections as Monads; 8. Domain Model, Storage, and State; 9. Putting it All Together; 10. The Semantic Web; Glossary; Bibliography; About the Author. 1 Motivation and Background: 1.1 Where are we?


Cataloging the World: Paul Otlet and the Birth of the Information Age by Alex Wright

1960s counterculture, Ada Lovelace, barriers to entry, British Empire, business climate, business intelligence, Cape to Cairo, card file, centralized clearinghouse, Charles Babbage, Computer Lib, corporate governance, crowdsourcing, Danny Hillis, Deng Xiaoping, don't be evil, Douglas Engelbart, Douglas Engelbart, Electric Kool-Aid Acid Test, European colonialism, folksonomy, Frederick Winslow Taylor, Great Leap Forward, hive mind, Howard Rheingold, index card, information retrieval, invention of movable type, invention of the printing press, Jane Jacobs, John Markoff, Kevin Kelly, knowledge worker, Law of Accelerating Returns, Lewis Mumford, linked data, Livingstone, I presume, lone genius, machine readable, Menlo Park, military-industrial complex, Mother of all demos, Norman Mailer, out of africa, packet switching, pneumatic tube, profit motive, RAND corporation, Ray Kurzweil, scientific management, Scramble for Africa, self-driving car, semantic web, Silicon Valley, speech recognition, Steve Jobs, Stewart Brand, systems thinking, Ted Nelson, The Death and Life of Great American Cities, the scientific method, Thomas L Friedman, urban planning, Vannevar Bush, W. E. B. Du Bois, Whole Earth Catalog

Berners-Lee, “The World Wide Web and the ‘Web of Life.’” 6. Berners-Lee, “Keynote.” 7. Berners-Lee, Hendler, and Lassila, “The Semantic Web.” 8. Van den Heuvel, “Web 2.0 and the Semantic Web in Research from a Historical Perspective.” 9. Van den Heuvel, private correspondence. 10. Van den Heuvel, “Web 2.0 and the Semantic Web in Research from a Historical Perspective.” 11. Hillis, “Hillis Knowledge Web.” 12. Wright, “Data Streaming 2.0.” 13. Van den Heuvel, “Web 2.0 and the Semantic Web in Research from a Historical Perspective.” 14. Shirky, “Ontology Is Overrated.” 15. Weinberger, Everything Is Miscellaneous, 102. 16.

As early as 1996 he appeared before the W3C and lamented the lack of two-way authoring in modern Web browsers.[6] In 2001, he published a paper in Scientific American in which he laid out his vision for a more orderly version of the Web that would allow computers to exchange information with each other more easily. He dubbed the project the Semantic Web, an umbrella term for a collection of technologies aimed at making the Internet more useful by imposing more consistent structures on data that can then be exchanged automatically between machines. He envisioned a “Web of data” designed primarily to foster the automatic exchange of information between computers, to allow any number of applications to search, retrieve, and synthesize data drawn from disparate sources. “The Semantic Web will bring structure to the meaningful content of Web pages,” he wrote, “creating an environment where software agents roaming from page to page can readily carry out sophisticated tasks for users.”[7] Substitute “bibliologists” for “software agents,” and that same description could just as easily apply to the Mundaneum.

Otlet also considered the possibility of engineering mechanical means of indexing and reassembling data from multiple sources, as in his microfilm experiments with Robert Goldschmidt. The environment that Otlet envisioned bears other similarities to the Semantic Web. By predicating its future on ontologies, handcrafted maps of topical relationships, the initiative shares the same spirit of expert knowledge systems that characterized Otlet’s work with the Universal Decimal Classification. Van den Heuvel, for one, argues that Otlet’s framework not only resembles the hyperlinked structure of the World Wide Web, but also presages some of the more advanced linking strategies of the Semantic Web.[8] Otlet’s Monographic Principle provided a framework for breaking down documents and other forms of media into component parts, then recombining them into new formats using the multiple link dimensions afforded by the Universal Decimal Classification.


pages: 377 words: 110,427

The Boy Who Could Change the World: The Writings of Aaron Swartz by Aaron Swartz, Lawrence Lessig

Aaron Swartz, affirmative action, Alfred Russel Wallace, American Legislative Exchange Council, Benjamin Mako Hill, bitcoin, Bonfire of the Vanities, Brewster Kahle, Cass Sunstein, deliberate practice, do what you love, Donald Knuth, Donald Trump, failed state, fear of failure, Firefox, Free Software Foundation, full employment, functional programming, Hacker News, Howard Zinn, index card, invisible hand, Joan Didion, John Gruber, Lean Startup, low interest rates, More Guns, Less Crime, peer-to-peer, post scarcity, power law, Richard Feynman, Richard Stallman, Ronald Reagan, school vouchers, semantic web, single-payer health, SpamAssassin, SPARQL, telemarketer, The Bell Curve by Richard Herrnstein and Charles Murray, the scientific method, Toyota Production System, unbiased observer, wage slave, Washington Consensus, web application, WikiLeaks, working poor, zero-sum game

And it’s led many who have been working on the Semantic Web, in the vain hope of actually building a world where software can communicate, to burn out and tune out and find more productive avenues for their attentions. For an example, look at Sean B. Palmer. In his influential piece, “Ditching the Semantic Web?,” he proclaims “It’s not prudent, perhaps even not moral (if that doesn’t sound too melodramatic), to work on RDF, OWL, SPARQL, RIF, the broken ideas of distributed trust, CWM, Tabulator, Dublin Core, FOAF, SIOC, and any of these kinds of things” and says not only will he “stop working on the Semantic Web” but “I will, moreover, actively dissuade anyone from working on the Semantic Web where it distracts them from working on” more practical projects.

The Techniques of Mass Collaboration: A Third Way Out http://www.aaronsw.com/weblog/masscollab2 July 19, 2006 Age 19 I’m not the first to suggest that the Internet could be used for bringing users together to build grand databases. The most famous example is the Semantic Web project (where, in full disclosure, I worked for several years). The project, spearheaded by Tim Berners-Lee, inventor of the web, proposed to extend the working model of the web to more structured data, so that instead of simply publishing text web pages, users could publish their own databases, which could be aggregated by search engines like Google into major resources. The Semantic Web project has received an enormous amount of criticism, much (in my view) rooted in misunderstandings, but much legitimate as well.

But Wikipedia points to a different model, where all the users come to one website, where the interface for inputting data in the proper format is clear and unambiguous, and the users can work together to resolve any conflicts that may come up. Indeed, this method strikes me as so superior that I’m surprised I don’t see it discussed in this context more often. Ignorance doesn’t seem plausible; even if Wikipedia was a latecomer, sites like ChefMoz and MusicBrainz followed this model and were Semantic Web case studies. (Full disclosure: I worked on the Semantic Web portions of MusicBrainz.) Perhaps the reason is simply that both sides—W3C and Google—have the existing web as the foundation for their work, so it’s not surprising that they assume future work will follow from the same basic model. One possible criticism of the million-dollar-users proposal is that it’s somehow less free than the individualist approach.


pages: 223 words: 52,808

Intertwingled: The Work and Influence of Ted Nelson (History of Computing) by Douglas R. Dechow

3D printing, Apple II, Bill Duvall, Brewster Kahle, Buckminster Fuller, Claude Shannon: information theory, cognitive dissonance, computer age, Computer Lib, conceptual framework, Douglas Engelbart, Dynabook, Edward Snowden, game design, HyperCard, hypertext link, Ian Bogost, information retrieval, Internet Archive, Ivan Sutherland, Jaron Lanier, knowledge worker, linked data, Marc Andreessen, Marshall McLuhan, Menlo Park, Mother of all demos, pre–internet, Project Xanadu, RAND corporation, semantic web, Silicon Valley, software studies, Steve Jobs, Steve Wozniak, Stewart Brand, Ted Nelson, TED Talk, The Home Computer Revolution, the medium is the message, Vannevar Bush, Wall-E, Whole Earth Catalog

Each link was a triple that consisted of a source, a destination and a description. Little did I know at the time how prescient of the Semantic Web these ideas would be. Of course, there are problems with automatically making a link on a word without knowing its precise semantic meaning. There are a lot of different people with the name Mountbatten in the Mountbatten archive for example. So working out the context in which the link was being applied and therefore the meaning of the word became a key focus of our work: problems we are still dealing with as the Semantic Web develops today. We did also have specific links in Microcosm that were more like standard hypertext links because they were embedded in the documents and represented to the user through highlighted buttons, and you could trace them backwards through the link database or linkbase as we called it.
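The link model in this excerpt, a (source, destination, description) triple held in a separate linkbase that can be traced backwards, can be sketched roughly in Python. This is only an illustration under stated assumptions: the `Link` and `Linkbase` names and the sample file names are hypothetical and are not Microcosm's actual data structures.

```python
from typing import NamedTuple


class Link(NamedTuple):
    """One link: where it starts, where it leads, and what it says."""
    source: str
    destination: str
    description: str


class Linkbase:
    """Hypothetical link store kept apart from the documents themselves."""

    def __init__(self) -> None:
        self._by_source: dict[str, list[Link]] = {}
        self._by_destination: dict[str, list[Link]] = {}

    def add(self, link: Link) -> None:
        # Index every link in both directions so it can be followed either way.
        self._by_source.setdefault(link.source, []).append(link)
        self._by_destination.setdefault(link.destination, []).append(link)

    def links_from(self, source: str) -> list[Link]:
        """Follow links forwards from a document."""
        return list(self._by_source.get(source, []))

    def links_to(self, destination: str) -> list[Link]:
        """Trace links backwards to find what points at a document."""
        return list(self._by_destination.get(destination, []))


# Hypothetical sample data for illustration only.
linkbase = Linkbase()
linkbase.add(Link("diary.txt", "biography.txt", "mentions Mountbatten"))
linkbase.add(Link("letter.txt", "biography.txt", "written by Mountbatten"))
```

Indexing by destination as well as by source is what makes the backwards trace cheap; a plain list of links would also work but would need a full scan for each lookup.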

Ted was also at the Brisbane conference to pick up a special award. I remember him demoing ZigZag to us in the bar one night at that conference. He was so excited, and we were all mesmerized. So I had heard Tim talk about the Semantic Web and I saw Ted demo ZigZag at the same conference, and I didn’t fully appreciate either of them at the time. I understood the principles, but I didn’t understand the detail. It’s taken me a long time to appreciate both the Semantic Web and ZigZag, but as my understanding of both of them has increased I now firmly believe what I suspected all along: there is a one-to-one correspondence between the two ideas, and that you can implement ZigZag in the RDF graph.

Object Reuse and Exchange (2014) http://www.openarchives.org/ore/ 27. Parsons MA, Fox PA (2013) Is data publication the right metaphor? Data Sci J 12:WDS32–WDS46. doi:10.2481/dsj.WDS-042 28. Pepe A, Mayernik M, Borgman CL, Van de Sompel H (2010) From artifacts to aggregations: modeling scientific life cycles on the semantic web. J Am Soc Inf Sci Technol 61(3):567–582. doi:10.1002/asi.21263 29. Rayward WB (1991) The case of Paul Otlet, pioneer of information science, internationalist, visionary: reflections on biography. J Librariansh Inf Sci 23:135–145 30. Rayward WB (1994) Visions of Xanadu—Paul Otlet (1868–1944) and hypertext.


pages: 268 words: 109,447

The Cultural Logic of Computation by David Golumbia

Alan Turing: On Computable Numbers, with an Application to the Entscheidungsproblem, American ideology, Benoit Mandelbrot, Bletchley Park, borderless world, business process, cellular automata, citizen journalism, Claude Shannon: information theory, computer age, Computing Machinery and Intelligence, corporate governance, creative destruction, digital capitalism, digital divide, en.wikipedia.org, finite state, folksonomy, future of work, Google Earth, Howard Zinn, IBM and the Holocaust, iterative process, Jaron Lanier, jimmy wales, John von Neumann, Joseph Schumpeter, late capitalism, Lewis Mumford, machine readable, machine translation, means of production, natural language processing, Norbert Wiener, One Laptop per Child (OLPC), packet switching, RAND corporation, Ray Kurzweil, RFID, Richard Stallman, semantic web, Shoshana Zuboff, Slavoj Žižek, social web, stem cell, Stephen Hawking, Steve Ballmer, Stewart Brand, strong AI, supply-chain management, supply-chain management software, technological determinism, Ted Nelson, telemarketer, The Wisdom of Crowds, theory of mind, Turing machine, Turing test, Vannevar Bush, web application, Yochai Benkler

Yet the views of professional computational linguists have a surprisingly restricted sphere of influence. They are rarely referenced even by the computer scientists who are working to define future versions of the World Wide Web, especially in the recent project to get the web to “understand” meanings called the Semantic Web. They have little influence, either, over the widespread presumption among computationalists that the severe restrictions today’s computers impose on natural language diversity are irrelevant to the worldwide distribution of cultural power. Computationalism and Digital Textuality During the last fifteen years, a small body of writing has emerged that is concerned with an idea called in the literature the OHCO thesis, spelled out to mean that texts are Ordered Hierarchies of Content Objects.

It is specifically human creativity in text production that the word processor is enabling, and over time it has seemed that in fact most writers prefer text processors to be as formal as possible—that is, to interfere as little as possible with the creation of text content, and to make alterations to text appearance straightforward and consistent. This does not make it sound like even text producers have been waiting for the capability to generate text in terms of the semantic categories it harbors—after all, we already have terrific ways of manipulating those categories, which we call writing and thinking. Language Ideologies of the Semantic Web Ordinarily, a discourse like the one on the OHCO thesis might simply pass unnoticed, but in the present context it is interesting for several reasons. Not least among these is the clear way in which the OHCO writers are proceeding not (only) from the Platonic philosophical intuitions which they say drive them, but instead or also from certain clear prevailing tendencies in our own culture and world.

To some extent, the idea sounds intrusive, and the only actual implementation of such strategies is still much more keyword-based and involves monitoring employee e-mails for illegal and anticompetitive activities. Related to this is the hard distinction between form and content that a database model of text implies. This idea is yoked to the implementation of the Semantic Web itself, suggesting somehow that the problem with the web so far and with HTML in particular has been its “blurring” of the form/content distinction.6 Tags like <b> for bold and <i> for italic are today “deprecated” because they express formatting ideas in what should be a purely semantic medium, so that we should only use tags like <em> and <strong> instead.


pages: 314 words: 94,600

Business Metadata: Capturing Enterprise Knowledge by William H. Inmon, Bonnie K. O'Neil, Lowell Fryman

affirmative action, bioinformatics, business cycle, business intelligence, business process, call centre, carbon-based life, continuous integration, corporate governance, create, read, update, delete, database schema, en.wikipedia.org, folksonomy, informal economy, knowledge economy, knowledge worker, semantic web, tacit knowledge, The Wisdom of Crowds, web application

For example, you could ask the computer to book a reservation at an Indian restaurant on the way home from work, and the computer would find an Indian restaurant located directly on your way home, book a reservation for you, and put it automatically on your calendar, all without human intervention. In the context of searching for documents, a semantic web would be able to understand what the documents contained. Today, we rely mostly on document titles and tagging. Tagging is usually done manually either by the document author, someone else charged with tagging after the fact, or through a folksonomy like del.icio.us. But a true semantic web could decipher document contents on its own. On a smaller scale, the semantic web means distinguishing between word senses: when there are two or more senses of a word, the user is asked, “Did you mean…?”

Chapter 11: Semantics and Business Metadata. Chapter table of contents: 1. Introduction; 2. The Vision of the Semantic Web; 3. The Importance of Semantics; 4. Attempts to Capture Semantics: Semantic Frameworks; 5. Semantics as Business Metadata; 6. Semantics in Practice; 7. Summary; 8. References. 11.1 Introduction: Semantics, a subject that has great depth and breadth, can only be viewed here in very broad overview, focusing specifically on semantics as a type of business metadata. After a brief survey of semantics and semantic technology, we will cover the relationship of semantics and business metadata. 11.2 The Vision of the Semantic Web: Tim Berners-Lee envisioned the idea of the “semantic web,” wherein intelligent agents would be truly intelligent. In his vision the computer would know exactly what “booking a restaurant reservation” meant, as well as all the underlying tasks associated with it. For example, you could ask the computer to book a reservation at an Indian restaurant on the way home from work, and the computer would find an Indian restaurant located directly on your way home, book a reservation for you, and put it automatically on your calendar, all without human intervention.

In Business Metadata, Bill, Bonnie, and Lowell provide the means for bridging the gap between the sometimes “fuzzy” human perception of data that fuels business processes and the rigid information management models used by business applications. Look to the future: next generation business intelligence, enterprise content management and search, the semantic web all will depend on business metadata. Read this book!” —David Loshin, President, Knowledge Integrity Incorporated These authors have written a book that ventures into new territory for data and information management. There are several books about metadata, but this is the first to offer in-depth discussion of the important topic of business metadata.


pages: 245 words: 68,420

Content Everywhere: Strategy and Structure for Future-Ready Content by Sara Wachter-Boettcher

business logic, crowdsourcing, John Gruber, Kickstarter, linked data, machine readable, search engine result page, semantic web, Silicon Valley, systems thinking, TechCrunch disrupt

The more you know about how these systems work and what’s being used for what, the better you can evaluate your content’s needs against them and the more you can participate in conversations with those on the database end of the spectrum. What About the Semantic Web? Once you understand a bit about markup, and about making content machine-readable and interoperable, then it’s time to consider some of the exciting stuff that markup makes possible. One of those things is the Semantic Web: a Web where all content shares a common framework and can be shared, reused, and understood across systems—to the point where, say, machines know whether the term “blackberry” is referring to the fruit or the phone. A completely semantic Web is a lofty goal—one not without its detractors, I might note—and our path toward it is still meandering at best.

Probably the biggest mindset shift for UXers is to stop thinking about pages and page types, and instead think purely about the mental model of the subject you’re trying to represent. How do linked data and Semantic Web fit in? Where once we built ourselves silos on the Web, these days it pays to recognize that it’s really one Web and we’re in the business of stitching our content into that wider canvas. Initiatives like the Linked Open Data and Semantic Web projects are helping us do this by providing standardized methods of sharing data for both people and computers. For example, dbPedia and MusicBrainz provide free, crowd-sourced sources of content and business data you can use to enrich and enhance your own offerings, on a scale that few businesses would have the time and resources to replicate.

A completely semantic Web is a lofty goal—one not without its detractors, I might note—and our path toward it is still meandering at best. But a more semantic Web seems closer than ever with the recent advent of linked data, which is made possible through structured content and markup. Coined by Tim Berners-Lee—yes, the guy who invented the World Wide Web—in 2006, linked data means exactly what it sounds like: bits of information that are linked to other, equivalent sets of data elsewhere on the Internet (often referred to as “in the cloud”), as illustrated in Figure 6.1. The idea is that, as opposed to HTML links, which link one document (e.g., a page) to another, linked data connects the things those pages are about by connecting the actual data behind those two pages instead.


The Data Revolution: Big Data, Open Data, Data Infrastructures and Their Consequences by Rob Kitchin

Bayesian statistics, business intelligence, business process, cellular automata, Celtic Tiger, cloud computing, collateralized debt obligation, conceptual framework, congestion charging, corporate governance, correlation does not imply causation, crowdsourcing, data science, discrete time, disruptive innovation, George Gilder, Google Earth, hype cycle, Infrastructure as a Service, Internet Archive, Internet of things, invisible hand, knowledge economy, Large Hadron Collider, late capitalism, lifelogging, linked data, longitudinal study, machine readable, Masdar, means of production, Nate Silver, natural language processing, openstreetmap, pattern recognition, platform as a service, recommendation engine, RFID, semantic web, sentiment analysis, SimCity, slashdot, smart cities, Smart Cities: Big Data, Civic Hackers, and the Quest for a New Utopia, smart grid, smart meter, software as a service, statistical model, supply-chain management, technological solutionism, the scientific method, The Signal and the Noise by Nate Silver, transaction costs

Paper prepared for the American Political Science Association Annual Meeting. Seattle, Washington, 1–4 September 2011. http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1899790 (last accessed 19 August 2013). McCreary, D. (2009) ‘Entity extraction and the semantic web’, Semantic Web, 12 January, http://semanticweb.com/entity-extraction-and-the-semantic-web_b10675 (last accessed 19 July 2013). McKeon, S.G. (2013) ‘Hacking the hackathon’, Shaunagm.net, 10 October, http://www.shaunagm.net/blog/2013/10/hacking-the-hackathon/ (last accessed 21 October 2013). McNay, L. (1994) Foucault: A Critical Introduction.

Since the late 2000s the movement has noticeably gained prominence and traction, initially with the Guardian newspaper’s campaign in the UK to ‘Free Our Data’ (www.theguardian.com/technology/free-ourdata), the Organization for Economic Cooperation and Development (OECD)’s call for member governments to open up their data in 2008, the launch in 2009 by the US government of data.gov, a website designed to provide access to non-sensitive and historical datasets held by US state and federal agencies, and the development of linked data and the promotion of the ‘Semantic Web’ as a standard element of future Internet technologies, in which open and linked data are often discursively conjoined (Berners-Lee 2009). Since 2010 dozens of countries and international organisations (e.g., the European Union [EU] and the United Nations Development Programme [UNDP]) have followed suit, making thousands of previously restricted datasets open in nature for non-commercial and commercial use (see DataRemixed 2013).

Given that by their nature open data generate no or little income to fund such service arrangements, nor indeed the costs of opening data, while it is easy to agree that open data should be delivered as a service, in practice it might be an aspiration unless effective funding models are developed (as discussed more fully below). Linked Data The idea of linked data is to transform the Internet from a ‘web of documents’ to a ‘web of data’ through the creation of a semantic web (Berners-Lee 2009; P. Miller, 2010), or what Goddard and Byrne (2010) term a ‘machine-readable web’. Such a vision recognises that all of the information shared on the Web contains a rich diversity of data – names, addresses, product details, facts, figures, and so on. However, these data are not necessarily formally identified as such, nor are they formally structured in such a way as to be easily harvested and used.


pages: 287 words: 86,919

Protocol: how control exists after decentralization by Alexander R. Galloway

Ada Lovelace, airport security, Alvin Toffler, Berlin Wall, bioinformatics, Bretton Woods, Charles Babbage, computer age, Computer Lib, Craig Reynolds: boids flock, Dennis Ritchie, digital nomad, discovery of DNA, disinformation, Donald Davies, double helix, Douglas Engelbart, easy for humans, difficult for computers, Fall of the Berlin Wall, Free Software Foundation, Grace Hopper, Hacker Ethic, Hans Moravec, informal economy, John Conway, John Markoff, John Perry Barlow, Ken Thompson, Kevin Kelly, Kickstarter, late capitalism, Lewis Mumford, linear programming, macro virus, Marshall McLuhan, means of production, Menlo Park, moral panic, mutually assured destruction, Norbert Wiener, old-boy network, OSI model, packet switching, Panopticon Jeremy Bentham, phenotype, post-industrial society, profit motive, QWERTY keyboard, RAND corporation, Ray Kurzweil, Reflections on Trusting Trust, RFC: Request For Comment, Richard Stallman, semantic web, SETI@home, stem cell, Steve Crocker, Steven Levy, Stewart Brand, Ted Nelson, telerobotics, The future is already here, the market place, theory of mind, urban planning, Vannevar Bush, Whole Earth Review, working poor, Yochai Benkler

By making the descriptive protocols more complex, one is able to say more complex things about information, namely, that Galloway is my surname, and my given name is Alexander, and so on. The Semantic Web is simply the process of adding extra metalayers on top of information so that it can be parsed according to its semantic value. Why is this significant? Before this, protocol had very little to do with meaningful information. Protocol does not interface with content, with semantic value. It is, as I have said, against interpretation. But with Berners-Lee comes a new strain of protocol: protocol that cares about meaning. This is what he means by a Semantic Web. It is, as he says, “machine-understandable information.” Does the Semantic Web, then, contradict my earlier principle that protocol is against interpretation?

But Web protocols are experiencing explosive growth today. Current growth is due to an evolution of the concept of the Web into what Berners-Lee calls the Semantic Web. In the Semantic Web, information is not simply interconnected on the Internet using links and graphical markup (what he calls “a space in which information could permanently exist and be referred to”41) but it is enriched using descriptive protocols that say what the information actually is. For example, the word “Galloway” is meaningless to a machine.

So it is a matter of debate as to whether descriptive protocols actually add intelligence to information, or whether they are simply subjective descriptions (originally written by a human) that computers mimic but understand little about. Berners-Lee himself stresses that the Semantic Web is not an artificial intelligence machine.42 He calls it “well-defined” data, not interpreted data, and in reality those are two very different things. I promised in the introduction to skip all epistemological questions, and so I leave this one to be debated by others. 41. Berners-Lee, Weaving the Web, p. 18. 42. Tim Berners-Lee, “What the Semantic Web Can Represent,” available online at http://www.w3.org/DesignIssues/RDFnot.html.


Data Mining the Web: Uncovering Patterns in Web Content, Structure, and Usage by Zdravko Markov, Daniel T. Larose

Firefox, information retrieval, Internet Archive, iterative process, natural language processing, pattern recognition, random walk, recommendation engine, semantic web, sparse data, speech recognition, statistical model, William of Occam

We look into these approaches in Part II. Semantic Web Semantic web is a recent initiative led by the web consortium (w3c.org). Its main objective is to bring formal knowledge representation techniques into the Web. Currently, web pages are designed basically for human readers. It is widely acknowledged that the Web is like a “fancy fax machine” used to send good-looking documents worldwide. The problem here is that the nice format of web pages is very difficult for computers to understand—something that we expect search engines to do. The main idea behind the semantic web is to add formal descriptive material to each web page that although invisible to people would make its content easily understandable by computers.

Thus, the Web would be organized and turned into the largest knowledge base in the world, which with the help of advanced reasoning techniques developed in the area of artificial intelligence would be able not just to provide ranked documents that match a keyword search query, but would also be able to answer questions and give explanations. The web consortium site (http://www.w3.org/2001/sw/) provides detailed information about the latest developments in the area of the semantic web. Although the semantic web is probably the future of the Web, our focus is on the former two approaches to bring semantics to the Web. The reason for this is that web search is the data mining approach to web semantics: extracting knowledge from web data. In contrast, the semantic web approach is about turning web pages into formal knowledge structures and extending the functionality of web browsers with knowledge manipulation and reasoning tools. CRAWLING THE WEB In this and later sections we use basic web terminology such as HTML, URL, web browsers, and servers.

QA76.9.D343M38 2007 005.74 – dc22 2006025099 Printed in the United States of America. For my children Teodora, Kalin, and Svetoslav – Z.M. For my children Chantal, Ellyriane, Tristan, and Ravel – D.T.L.
CONTENTS
PREFACE
PART I WEB STRUCTURE MINING
1 INFORMATION RETRIEVAL AND WEB SEARCH: Web Challenges; Web Search Engines; Topic Directories; Semantic Web; Crawling the Web; Web Basics; Web Crawlers; Indexing and Keyword Search; Document Representation; Implementation Considerations; Relevance Ranking; Advanced Text Search; Using the HTML Structure in Keyword Search; Evaluating Search Quality; Similarity Search; Cosine Similarity; Jaccard Similarity; Document Resemblance; References; Exercises
2 HYPERLINK-BASED RANKING: Introduction; Social Networks Analysis; PageRank; Authorities and Hubs; Link-Based Similarity Search; Enhanced Techniques for Page Ranking; References; Exercises
PART II WEB CONTENT MINING
3 CLUSTERING: Introduction; Hierarchical Agglomerative Clustering; k-Means Clustering; Probability-Based Clustering; Finite Mixture Problem; Classification Problem; Clustering Problem; Collaborative Filtering (Recommender Systems); References; Exercises
4 EVALUATING CLUSTERING: Approaches to Evaluating Clustering; Similarity-Based Criterion Functions; Probabilistic Criterion Functions; MDL-Based Model and Feature Evaluation; Minimum Description Length Principle; MDL-Based Model Evaluation; Feature Selection; Classes-to-Clusters Evaluation; Precision, Recall, and F-Measure; Entropy; References; Exercises
5 CLASSIFICATION: General Setting and Evaluation Techniques; Nearest-Neighbor Algorithm; Feature Selection; Naive Bayes Algorithm; Numerical Approaches; Relational Learning; References; Exercises
PART III WEB USAGE MINING
6 INTRODUCTION TO WEB USAGE MINING: Definition of Web Usage Mining; Cross-Industry Standard Process for Data Mining; Clickstream Analysis; Web Server Log Files; Remote Host Field; Date/Time Field; HTTP Request Field; Status Code Field; Transfer Volume (Bytes) Field; Common Log Format; Identification Field; Authuser Field; Extended Common Log Format; Referrer Field; User Agent Field; Example of a Web Log Record; Microsoft IIS Log Format; Auxiliary Information; References; Exercises
7 PREPROCESSING FOR WEB USAGE MINING: Need for Preprocessing the Data; Data Cleaning and Filtering; Page Extension Exploration and Filtering; De-Spidering the Web Log File; User Identification; Session Identification; Path Completion; Directories and the Basket Transformation; Further Data Preprocessing Steps; References; Exercises
8 EXPLORATORY DATA ANALYSIS FOR WEB USAGE MINING: Introduction; Number of Visit Actions; Session Duration; Relationship between Visit Actions and Session Duration; Average Time per Page; Duration for Individual Pages; References; Exercises
9 MODELING FOR WEB USAGE MINING: CLUSTERING, ASSOCIATION, AND CLASSIFICATION: Introduction; Modeling Methodology; Definition of Clustering; The BIRCH Clustering Algorithm; Affinity Analysis and the A Priori Algorithm; Discretizing the Numerical Variables: Binning; Applying the A Priori Algorithm to the CCSU Web Log Data; Classification and Regression Trees; The C4.5 Algorithm; References; Exercises
INDEX
PREFACE: DEFINING DATA MINING THE WEB By data mining the Web, we refer to the application of data mining methodologies, techniques, and models to the variety of data forms, structures, and usage patterns that comprise the World Wide Web.


pages: 369 words: 80,355

Too Big to Know: Rethinking Knowledge Now That the Facts Aren't the Facts, Experts Are Everywhere, and the Smartest Person in the Room Is the Room by David Weinberger

airport security, Alfred Russel Wallace, Alvin Toffler, Amazon Mechanical Turk, An Inconvenient Truth, Berlin Wall, Black Swan, book scanning, Cass Sunstein, commoditize, Computer Lib, corporate social responsibility, crowdsourcing, Danny Hillis, David Brooks, Debian, double entry bookkeeping, double helix, Dr. Strangelove, en.wikipedia.org, Exxon Valdez, Fall of the Berlin Wall, future of journalism, Future Shock, Galaxy Zoo, Gregor Mendel, Hacker Ethic, Haight Ashbury, Herman Kahn, hive mind, Howard Rheingold, invention of the telegraph, Jeff Hawkins, jimmy wales, Johannes Kepler, John Harrison: Longitude, Kevin Kelly, Large Hadron Collider, linked data, Neil Armstrong, Netflix Prize, New Journalism, Nicholas Carr, Norbert Wiener, off-the-grid, openstreetmap, P = NP, P vs NP, PalmPilot, Pluto: dwarf planet, profit motive, Ralph Waldo Emerson, RAND corporation, Ray Kurzweil, Republic of Letters, RFID, Richard Feynman, Ronald Reagan, scientific management, semantic web, slashdot, social graph, Steven Pinker, Stewart Brand, systems thinking, technological singularity, Ted Nelson, the Cathedral and the Bazaar, the scientific method, The Wisdom of Crowds, Thomas Kuhn: the structure of scientific revolutions, Thomas Malthus, Whole Earth Catalog, X Prize

If each of these sites followed the conventions specified by the Semantic Web—initiated by Sir Tim Berners-Lee, the inventor of the World Wide Web, around the turn of the millennium—computer programs could far more easily know that these sites were referring to the same book. In fact, the Semantic Web would make it possible to share far more complex information from across multiple sites. Agreeing on how to encode metadata makes the Net capable of expressing more knowledge than was put into it. That is the very definition of a smart network. But creating that metadata can be difficult, especially since many Semantic Web adherents originally proceeded by trying to write large, complex, logical representations of domains of the world.

But writing an ontology of financial markets would require agreeing on exactly what the required definitional elements of a “trade,” “bond,” “regulation,” and “report” are—as well as on every detail and every connection with other domains, such as law, economics, and politics. So, some supporters of the Semantic Web (including Tim Berners-Lee 8) decided that there would be faster and enormous benefits to making data accessible in standardized but imperfect form—as what is called “Linked Data”—without waiting for agreement about overarching ontologies. So, if you have a store of information about, say, chemical elements, you can make it available on the Web as a series of basic assertions that are called “triples” because they have the form of two objects joined by a relation: “Mercury is an element.”
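The triple form mentioned in this excerpt can be sketched with plain Python tuples. This is a toy illustration under stated assumptions: real Linked Data normally identifies things with URIs and is serialized as RDF, whereas the short strings, the two extra sample facts, and the `match` helper below are all hypothetical additions for the sake of the example.

```python
# Each triple is (subject, relation, object), e.g. "Mercury is an element".
# The second and third facts are added here purely for illustration.
triples = [
    ("Mercury", "is a", "element"),
    ("Mercury", "has symbol", "Hg"),
    ("Helium", "is a", "element"),
]


def match(store, subject=None, relation=None, obj=None):
    """Return every triple that fits the pattern; None acts as a wildcard."""
    return [
        (s, r, o)
        for (s, r, o) in store
        if subject in (None, s) and relation in (None, r) and obj in (None, o)
    ]


# Ask "which things are elements?" without any overarching ontology.
elements = [s for (s, _, _) in match(triples, relation="is a", obj="element")]

# Ask "what do we know about Mercury?"
mercury_facts = match(triples, subject="Mercury")
```

The point of the sketch is that each assertion is useful on its own: new triples can be appended by anyone, in any order, and the same pattern query keeps working.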

This approach may be messy and imperfect, but it is 100 percent better than not releasing data because you haven’t figured out how to get the metadata perfectly right. The rise of Linked Data encapsulates the transformation of knowledge we have explored throughout this book. While the original Semantic Web emphasized building ontologies that are “knowledge representations” of the world, it turns out that if we go straight to unleashing an abundance of linked but imperfect data, making it widely and openly available in standardized form, the Net becomes a dramatically improved infrastructure for knowledge.


pages: 397 words: 102,910

The Idealist: Aaron Swartz and the Rise of Free Culture on the Internet by Justin Peters

4chan, Aaron Swartz, activist lawyer, Alan Greenspan, Any sufficiently advanced technology is indistinguishable from magic, Bayesian statistics, Brewster Kahle, buy low sell high, crowdsourcing, digital rights, disintermediation, don't be evil, Free Software Foundation, global village, Hacker Ethic, hypertext link, index card, informal economy, information retrieval, Internet Archive, invention of movable type, invention of writing, Isaac Newton, John Markoff, Joi Ito, Lean Startup, machine readable, military-industrial complex, moral panic, Open Library, Paul Buchheit, Paul Graham, profit motive, RAND corporation, Republic of Letters, Richard Stallman, selection bias, semantic web, Silicon Valley, social bookmarking, social web, Steve Jobs, Steven Levy, Stewart Brand, strikebreaker, subprime mortgage crisis, Twitter Arab Spring, Vannevar Bush, Whole Earth Catalog, Y Combinator

. ,” Schoolyard Subversion, August 16, 2000, http://web.archive.org/web/20010514192627/http://swartzfam.com/aaron/school/2000/08/16/. 41 Aaron Swartz, “The Weight of School,” Schoolyard Subversion, October 8, 2000, http://web.archive.org/web/20010517235916/http://swartzfam.com/aaron/school/2000/10/08/. 42 Aaron Swartz, “Welcome to Unschooling,” Schoolyard Subversion, April 5, 2001, http://web.archive.org/web/20010502005216/http:/swartzfam.com/aaron/school/2001/04/05/. 43 Robert Swartz, interview. 44 Tim Berners-Lee, James Hendler, and Ora Lassila, “The Semantic Web.” Scientific American, May 17, 2001, http://www.scientificamerican.com/article/the-semantic-web/. 45 Aaron Swartz, “I think there is a,” Aaron Swartz: The Weblog, January 14, 2002, http://www.aaronsw.com/weblog/000111. 46 Felter, interview. 47 Wilcox-O’Hearn, “Part 1.” 48 Eldred, “Battle of the Books.” 49 Interview with Lisa Rein, January 2013. 50 Interview with Ben Adida, January 2013. 51 Rein, interview. 52 Aaron Swartz, “Emerging Technologies—Day 2,” Aaron Swartz: The Weblog, May 15, 2002, http://www.aaronsw.com/weblog/000254. 53 Aaron Swartz, “May 13, 2002: Visiting Google,” Google Weblog, May 13, 2002, http://google.blogspace.com/archives/000252. 54 Felter, interview. 55 Aaron Swartz, “Emerging Technologies—Day 3,” Aaron Swartz: The Weblog, May 16, 2002, http://www.aaronsw.com/weblog/000255. 56 Ibid. 57 Eric Eldred to Book People mailing list, October 19, 1998, http://onlinebooks.library.upenn.edu/webbin/bparchive?

,” a developer named Gabe Beged-Dov wrote to an online mailing list on July 3, 2000.38 Swartz responded: “I generally try not to mention my age, because I find that unfortunately some people immediately discredit me because of it. :-(, Thanks to everyone who is able to put aside their prejudices not only in age, but in all matters, so that work on standards like these can go ahead and we can build the Web of the future. I don’t know about all of you, but I get very excited when I think about the possibilities for the Semantic Web. The sooner we get standards, the better. It’s not hard—even an 8th grader can do it! :-) So let’s get moving.”39 Swartz attended a private school, North Shore Country Day, in Winnetka, Illinois, and he chafed at its rules and customs. After-school sports were mandatory, much to his dismay. (“I narrowly escaped another day of practice due to an awful migraine headache.

Just as a supermarket checkout machine scans a bar code to determine exactly what you’re buying and how much it costs, a computer reads a website’s metadata to acquire salient information about that site and the content therein. In a 2001 Scientific American article, Berners-Lee, James Hendler, and Ora Lassila made the case for a metadata-rich “Semantic Web” as one “in which information is given well-defined meaning, better enabling computers and people to work in cooperation. . . . In the near future, these developments will usher in significant new functionality as machines become much better able to process and ‘understand’ the data that they merely display at present.”44 The idea sounded great to Swartz.


pages: 291 words: 77,596

Total Recall: How the E-Memory Revolution Will Change Everything by Gordon Bell, Jim Gemmell

airport security, Albert Einstein, book scanning, cloud computing, Computing Machinery and Intelligence, conceptual framework, Douglas Engelbart, full text search, information retrieval, invention of writing, inventory management, Isaac Newton, Ivan Sutherland, John Markoff, language acquisition, lifelogging, Menlo Park, optical character recognition, pattern recognition, performance metric, RAND corporation, RFID, semantic web, Silicon Valley, Skype, social web, statistical model, Stephen Hawking, Steve Ballmer, Steve Bannon, Ted Nelson, telepresence, Turing test, Vannevar Bush, web application

Association for Computing Machinery, Inc. Montalbano, Elizabeth. 2008. “IBM Pledges $1 Billion to Unified Communications.” PC World (March 11). O’Reilly, Paul. 2009. “Managing Unified Communications Performance.” CRN (March 9). Semantic Web: Berners-Lee, T., and J. Hendler. 2001. “Scientific Publishing on the Semantic Web.” Nature (26 April). W3C Semantic Web Frequently Asked Questions. http://www.w3.org/RDF/FAQ British Library Digital Lives Project and conference: Digital Lives Research Project Web page. http://www.bl.uk/digital-lives First Digital Lives Research Conference: Personal Digital Archives for the 21st Century.

As anyone who has translated between languages knows, a word-for-word translation is inadequate; it gives us translations that turn “The spirit is willing but the flesh is weak” to “The alcohol is good but the meat is bad.” Likewise, it can be difficult to translate between storage formats, and a lot of work is yet to go into this effort. The Semantic Web, which aims to standardize transmission and translation of information, is an important effort in this area. There will also be a unification of networking in the sense that we will cease to have distinct networks for different types of data. Already we get telephone over our cable TV network and TV shows over our telephone’s DSL.

ScanMyPhotos.com scanners and digitizing books and file formats and implementation of Total Recall and memex and organization of data and origin of MyLifeBits pen scanners scanning services Schacter, Daniel scholarship science fiction scientific method scrapbooking screensavers Scripps Genomic Health Initiative searching data and associative memory and data analysis desktop search and e-books and implementation of Total Recall and lifelogging Second Life security of data and adaptation to lifelogging and education and encryption and ownership of health records and passwords and privacy self-awareness semantic memory Semantic Web SenseCam and CARPE and diet monitoring and memory aids origin of and summarization of data and travelogues sensory technology. See also biometric sensors The Seven Sins of Memory: How the Mind Forgets and Remembers (Schacter) sexual molestation memories sharing data sheet music shopping lists The Simpsons situational awareness Sixth Sense system Sky Server sleep data Slidescanning.com SmartDraw smartphones.


Beautiful Visualization by Julie Steele

barriers to entry, correlation does not imply causation, data acquisition, data science, database schema, Drosophila, en.wikipedia.org, epigenetics, global pandemic, Hans Rosling, index card, information retrieval, iterative process, linked data, Mercator projection, meta-analysis, natural language processing, Netflix Prize, no-fly zone, pattern recognition, peer-to-peer, performance metric, power law, QR code, recommendation engine, semantic web, social bookmarking, social distancing, social graph, sorting algorithm, Steve Jobs, the long tail, web application, wikimedia commons, Yochai Benkler

This allows similar books to self-organize together to form clusters of like topics, which reveal the human communities of interest behind the book clusters. In Figure 7-9, two obvious groupings cling together by topic: The bottom-right grouping is all about programmers and programming. The grouping at the top of the graph is all about the Semantic Web. Although clusters emerge in Figure 7-9, they are not as obvious as some others that we will see later; these clusters are intermixed and overlap, especially around other books about modern programming methods and processes. Figure 7-9. The network neighborhood of books surrounding Beautiful Data In addition to clusters of like topics, in Figure 7-9 there are clusters around the publishers, designated by the node colors: red books connect to other red books and yellows connect to other yellows.

Equivalent nodes may be substitutable for one another in the network. As an author, I would not like my book to be substitutable with many other books! As a reader, however, I would like equivalent choices. In Figure 7-9, the two books with the most similar link pattern to Beautiful Data are Cloud Application Architectures and Programming the Semantic Web. Another value-added service that Amazon provides is reader-submitted book reviews. A person considering the purchase of a particular book may be aided by the many reviews that accumulate. Unfortunately, the reviews can be skewed: an author with a large personal network can quickly get a dozen or more glowing reviews of his latest book posted to Amazon, and a reader with a grudge can do the opposite.

Our example is taken from the fields of art history and archaeology, as these are my trained areas of expertise. However, the findings I present here—namely, that it is possible to visualize the complex structures of databases—can also be demonstrated for many other structured data collections, including biological research databases and massive collaborative efforts such as DBpedia, Freebase, or the Semantic Web. All these data collections share a number of properties, which are not straightforward but are important if we want to make use of the recorded data or if we have to decide where and how our energies and funds should be spent in improving them. Curated databases in art history and archaeology come in a number of flavors, such as library catalogs and bibliographies, image archives, museum inventories, and more general research databases.


pages: 201 words: 63,192

Graph Databases by Ian Robinson, Jim Webber, Emil Eifrem

Amazon Web Services, anti-pattern, bioinformatics, business logic, commoditize, corporate governance, create, read, update, delete, data acquisition, en.wikipedia.org, fault tolerance, linked data, loose coupling, Network effects, recommendation engine, semantic web, sentiment analysis, social graph, software as a service, SPARQL, the strength of weak ties, web application

I like the fact that you liked that car), hypergraphs typically require fewer primitives than property graphs. Triples Triple stores come from the Semantic Web movement, where researchers are interested in large-scale knowledge inference by adding semantic markup to the links that connect Web resources.10 To date, very little of the Web has been marked up in a useful fashion, so running queries across the semantic layer is uncommon. Instead, most effort in the Semantic Web appears to be invested in harvesting useful data and relationship information from the Web (or other more mundane data sources, such as applications) and depositing it in triple stores for querying.

Using triples, we can capture facts, such as “Ginger dances with Fred” and “Fred likes ice cream.” Individually, single triples are semantically rather poor, but en-masse they provide a rich dataset from which to harvest knowledge and infer connections. Triple stores typically provide SPARQL capabilities to reason about stored RDF data.11 RDF—the lingua franca of triple stores and the Semantic Web—can be serialized several ways. RDF encoding of a simple three-node graph shows the RDF/XML format. Here we see how triples come together to form linked data. RDF encoding of a simple three-node graph. <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns="http://www.example.org/ter <rdf:Description rdf:about="http://www.example.org/ginger"> <name>Ginger Rogers</name> <occupation>dancer</occupation> <partner rdf:resource="http://www.example.org/fred"/> </rdf:Description> 10. http://www.w3.org/standards/semanticweb/ 11.

See http://www.w3.org/TR/rdf-sparql-query/ and http://www.w3.org/RDF/ <rdf:Description rdf:about="http://www.example.org/fred"> <name>Fred Astaire</name> <occupation>dancer</occupation> <likes rdf:resource="http://www.example.org/ice-cream"/> </rdf:Description> </rdf:RDF> W3C support That they produce logical representations of triples doesn’t mean triple stores necessarily have triple-like internal implementations. Most triple stores, however, are unified by their support for Semantic Web technology such as RDF and SPARQL. While there’s nothing particularly special about RDF as a means of serializing linked data, it is endorsed by the W3C and therefore benefits from being widely understood and well documented. The query language SPARQL benefits from similar W3C patronage. In the graph database space there is a similar abundance of innovation around graph serialization formats (e.g.
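The two rdf:Description blocks in this excerpt can be read as plain subject-predicate-object triples. The following is a minimal Python sketch of that reading, assuming an in-memory set of tuples rather than a real triple store. The `ex:` abbreviation and the `match` and `partner_likes` helpers are illustrative names, not taken from the book.

```python
# Triples read off the RDF/XML example: (subject, predicate, object).
# "ex:" stands in for http://www.example.org/ (an abbreviation assumed here).
TRIPLES = {
    ("ex:ginger", "name", "Ginger Rogers"),
    ("ex:ginger", "occupation", "dancer"),
    ("ex:ginger", "partner", "ex:fred"),
    ("ex:fred", "name", "Fred Astaire"),
    ("ex:fred", "occupation", "dancer"),
    ("ex:fred", "likes", "ex:ice-cream"),
}


def match(triples, s=None, p=None, o=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return sorted(
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


def partner_likes(triples, person):
    """Follow two edges: person -partner-> x -likes-> thing."""
    return sorted(
        thing
        for _, _, partner in match(triples, s=person, p="partner")
        for _, _, thing in match(triples, s=partner, p="likes")
    )


if __name__ == "__main__":
    print(match(TRIPLES, p="occupation", o="dancer"))
    print(partner_likes(TRIPLES, "ex:ginger"))
```

Single triples say little on their own, but chaining two lookups already yields a connection that no individual triple states, which is the kind of harvesting the excerpt describes.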


pages: 1,237 words: 227,370

Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems by Martin Kleppmann

active measures, Amazon Web Services, billion-dollar mistake, bitcoin, blockchain, business intelligence, business logic, business process, c2.com, cloud computing, collaborative editing, commoditize, conceptual framework, cryptocurrency, data science, database schema, deep learning, DevOps, distributed ledger, Donald Knuth, Edward Snowden, end-to-end encryption, Ethereum, ethereum blockchain, exponential backoff, fake news, fault tolerance, finite state, Flash crash, Free Software Foundation, full text search, functional programming, general-purpose programming language, Hacker News, informal economy, information retrieval, Infrastructure as a Service, Internet of things, iterative process, John von Neumann, Ken Thompson, Kubernetes, Large Hadron Collider, level 1 cache, loose coupling, machine readable, machine translation, Marc Andreessen, microservices, natural language processing, Network effects, no silver bullet, operational security, packet switching, peer-to-peer, performance metric, place-making, premature optimization, recommendation engine, Richard Feynman, self-driving car, semantic web, Shoshana Zuboff, social graph, social web, software as a service, software is eating the world, sorting algorithm, source of truth, SPARQL, speech recognition, SQL injection, statistical model, surveillance capitalism, systematic bias, systems thinking, Tragedy of the Commons, undersea cable, web application, WebSocket, wikimedia commons

_:usa a :Location; :name "United States"; :type "country"; :within _:namerica. _:namerica a :Location; :name "North America"; :type "continent". The semantic web If you read more about triple-stores, you may get sucked into a maelstrom of articles written about the semantic web. The triple-store data model is completely independent of the semantic web—for example, Datomic [40] is a triple-store that does not claim to have anything to do with it.vii But since the two are so closely linked in many people’s minds, we should discuss them briefly. The semantic web is fundamentally a simple and reasonable idea: websites already publish information as text and pictures for humans to read, so why don’t they also publish information as machine-readable data for computers to read?

The Resource Description Framework (RDF) [41] was intended as a mechanism for different websites to publish data in a consistent format, allowing data from different websites to be automatically combined into a web of data—a kind of internet-wide “database of everything.” Unfortunately, the semantic web was overhyped in the early 2000s but so far hasn’t shown any sign of being realized in practice, which has made many people cynical about it. It has also suffered from a dizzying plethora of acronyms, overly complex standards proposals, and hubris. However, if you look past those failings, there is also a lot of good work that has come out of the semantic web project. Triples can be a good internal data model for applications, even if you have no interest in publishing RDF data on the semantic web. The RDF data model The Turtle language we used in Example 2-7 is a human-readable format for RDF data.
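The idea of automatically combining data from different websites can be sketched in a few lines of Python. This is an illustration under stated assumptions, not code from the book: the two "sites", the `urn:example:` identifiers, and the `merge` and `describe` helpers are made up, and each dataset is simply a set of (subject, predicate, object) tuples.

```python
# Two hypothetical sites publish facts that share a resource identifier.
SITE_A = {
    ("urn:example:idaho", "name", "Idaho"),
    ("urn:example:idaho", "type", "state"),
}
SITE_B = {
    ("urn:example:idaho", "within", "urn:example:usa"),
    ("urn:example:usa", "name", "United States"),
}


def merge(*datasets):
    """Union of triple sets; identical triples collapse automatically."""
    merged = set()
    for dataset in datasets:
        merged |= dataset
    return merged


def describe(triples, subject):
    """Collect predicate/object pairs for one subject.

    Simplification: a predicate with several values keeps only one of them.
    """
    return {p: o for s, p, o in triples if s == subject}


if __name__ == "__main__":
    combined = merge(SITE_A, SITE_B)
    print(describe(combined, "urn:example:idaho"))
```

Because both sources use the same identifier for Idaho, the union needs no join logic; agreeing on identifiers and a common triple shape is what makes the combination automatic.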

in databases, Dataflow Through Databases-Archival storage in message-passing, Distributed actor frameworks in service calls, Data encoding and evolution for RPC flexibility in document model, Schema flexibility in the document model for analytics, Stars and Snowflakes: Schemas for Analytics-Stars and Snowflakes: Schemas for Analytics for JSON and XML, JSON, XML, and Binary Variants merits of, The Merits of Schemas schema migration on railways, Reprocessing data for application evolution Thrift and Protocol Buffers, Thrift and Protocol Buffers-Datatypes and schema evolution schema evolution, Field tags and schema evolution traditional approach to design, fallacy in, Deriving several views from the same event log searches building search indexes in batch processes, Building search indexes k-nearest neighbors, Specialization for different domains on streams, Search on streams partitioned secondary indexes, Partitioning and Secondary Indexes secondaries (see leader-based replication) secondary indexes, Other Indexing Structures, Glossary partitioning, Partitioning and Secondary Indexes-Partitioning Secondary Indexes by Term, Summary document-partitioned, Partitioning Secondary Indexes by Document index maintenance, Maintaining derived state term-partitioned, Partitioning Secondary Indexes by Term problems with dual writes, Keeping Systems in Sync, Reasoning about dataflows updating, transaction isolation and, The need for multi-object transactions secondary sorts, Sort-merge joins sed (Unix tool), Simple Log Analysis self-describing files, Code generation and dynamically typed languages self-joins, Summary self-validating systems, A culture of verification semantic web, The semantic web semi-synchronous replication, Synchronous Versus Asynchronous Replication sequence number ordering, Sequence Number Ordering-Timestamp ordering is not sufficient generators, Synchronized clocks for global snapshots, Noncausal sequence number generators insufficiency for enforcing constraints, Timestamp ordering is not sufficient Lamport timestamps, Lamport timestamps use of timestamps, Timestamps for ordering events, Synchronized clocks for global snapshots, Noncausal sequence number generators sequential consistency, Implementing linearizable storage using total order broadcast serializability, Isolation, Weak Isolation Levels, Serializability-Performance of serializable snapshot isolation, Glossary linearizability versus, What Makes a System Linearizable?


pages: 518 words: 49,555

Designing Social Interfaces by Christian Crumlish, Erin Malone

A Pattern Language, Amazon Mechanical Turk, anti-pattern, barriers to entry, c2.com, carbon footprint, cloud computing, collaborative editing, commons-based peer production, creative destruction, crowdsourcing, en.wikipedia.org, Firefox, folksonomy, Free Software Foundation, game design, ghettoisation, Howard Rheingold, hypertext link, if you build it, they will come, information security, lolcat, Merlin Mann, Nate Silver, Network effects, Potemkin village, power law, recommendation engine, RFC: Request For Comment, semantic web, SETI@home, Skype, slashdot, social bookmarking, social graph, social software, social web, source of truth, stealth mode startup, Stewart Brand, systems thinking, tacit knowledge, telepresence, the long tail, the strength of weak ties, The Wisdom of Crowds, web application, Yochai Benkler

Social Metadata and Future Uses Today’s metadata and future uses Much of the future lies in an idea that has been around for many years. This future is embracing the Semantic Web and similar tools, building on semantic information. Semantically relevant metadata improves relevance in information and media retrieval. The Semantic Web has had a chicken-and-egg problem, as it has the tools to do fantastic things with structured information, but it has been held back by the lack of that structured information at a scale that will make a difference. Today’s social semistructured information gives enough of a boost that Semantic Web tools can begin to provide their long-promised power. The Semantic Web is based on triples of information: subject, predicate, and object.

Social Metadata and Future Uses Summary In short, the social tools we are using today are letting us focus on what we care about, and through the use of lightweight connections and light form fields, are capturing and building a web of semistructured information. The web of semistructured information working as metadata provides enough of a foundation to be used as structured elements, which are the fodder for using Semantic Web tools. This use of the Semantic Web tools leads to better relevance and discernment, providing drastically better search to find exactly what the seeker wants, not just what is good enough. This also provides much better capability for aggregating information people care about and would like to keep closer to them.


pages: 339 words: 92,785

I, Warbot: The Dawn of Artificially Intelligent Conflict by Kenneth Payne

Abraham Maslow, AI winter, Alan Turing: On Computable Numbers, with an Application to the Entscheidungsproblem, AlphaGo, anti-communist, Any sufficiently advanced technology is indistinguishable from magic, artificial general intelligence, Asperger Syndrome, augmented reality, Automated Insights, autonomous vehicles, backpropagation, Black Lives Matter, Bletchley Park, Boston Dynamics, classic study, combinatorial explosion, computer age, computer vision, Computing Machinery and Intelligence, coronavirus, COVID-19, CRISPR, cuban missile crisis, data science, deep learning, deepfake, DeepMind, delayed gratification, Demis Hassabis, disinformation, driverless car, drone strike, dual-use technology, Elon Musk, functional programming, Geoffrey Hinton, Google X / Alphabet X, Internet of things, job automation, John Nash: game theory, John von Neumann, Kickstarter, language acquisition, loss aversion, machine translation, military-industrial complex, move 37, mutually assured destruction, Nash equilibrium, natural language processing, Nick Bostrom, Norbert Wiener, nuclear taboo, nuclear winter, OpenAI, paperclip maximiser, pattern recognition, RAND corporation, ransomware, risk tolerance, Ronald Reagan, self-driving car, semantic web, side project, Silicon Valley, South China Sea, speech recognition, Stanislav Petrov, stem cell, Stephen Hawking, Steve Jobs, strong AI, Stuxnet, technological determinism, TED Talk, theory of mind, TikTok, Turing machine, Turing test, uranium enrichment, urban sprawl, V2 rocket, Von Neumann architecture, Wall-E, zero-sum game

Alas, the real world is rarely so obliging, which is one reason for concentrating research on narrowly constrained ‘toy universes’ like chess. Early efforts at language processing and translation amply demonstrate the difficulties. One popular technique here was to relate concepts to each other in so-called ‘semantic webs’—basically topological maps of how objects and ideas are inter-related. But language is rather more fluid and inconsistent than this neat approach suggests. Describing material objects is hard enough—try to define a chair in a rigorous and parsimonious fashion. And that’s before you get into immaterial ideas, like beauty.

And for expert systems, the heuristic was, in any case, exogenous—provided by humans, not learned by the machine. Fundamentally, the differences had to do with meaning and understanding. Humans were capable of that, but computers, except for a brittle grasp on some hand-coded categories, as in a semantic web, were not. In recent decades, philosophers of mind have become increasingly interested in the notion of ‘embodied cognition’, exploring the way in which knowledge for us is generated by and situated within a body. From the myriad noisy signals perceived by receptors in the body, the human mind is able to create meaningful representations and build flexible models of reality, updated in near real time.

But there were significant limits to the new approach. This was a distinctly non-human intelligence. It lacked our ability to understand, to display what we might call ‘common sense’, an ability to relate objects and concepts, and to intuit deeper meanings behind its observations. Logical AI could do that, albeit not very convincingly, with its semantic webs. But connectionism wasn’t even at the races. And while the same algorithms were flexible enough to be applied to any sort of reward-optimising task (like the score in Space Invaders), they were flexible in a narrow sense only. The human brain is hugely parallel, with multiple systems working away on various tasks.


pages: 400 words: 94,847

Reinventing Discovery: The New Era of Networked Science by Michael Nielsen

Albert Einstein, augmented reality, barriers to entry, bioinformatics, Cass Sunstein, Climategate, Climatic Research Unit, conceptual framework, dark matter, discovery of DNA, Donald Knuth, double helix, Douglas Engelbart, Easter island, en.wikipedia.org, Erik Brynjolfsson, fault tolerance, Fellow of the Royal Society, Firefox, Free Software Foundation, Freestyle chess, Galaxy Zoo, Higgs boson, Internet Archive, invisible hand, Jane Jacobs, Jaron Lanier, Johannes Kepler, Kevin Kelly, Large Hadron Collider, machine readable, machine translation, Magellanic Cloud, means of production, medical residency, Nicholas Carr, P = NP, P vs NP, publish or perish, Richard Feynman, Richard Stallman, selection bias, semantic web, Silicon Valley, Silicon Valley startup, Simon Singh, Skype, slashdot, social intelligence, social web, statistical model, Stephen Hawking, Stewart Brand, subscription business, tacit knowledge, Ted Nelson, the Cathedral and the Bazaar, The Death and Life of Great American Cities, The Nature of the Firm, The Wisdom of Crowds, University of East Anglia, Vannevar Bush, Vernor Vinge, Wayback Machine, Yochai Benkler

The Yale Law Journal, 112:369–446, 2002. [13] Yochai Benkler. The Wealth of Networks. New Haven: Yale University Press, 2006. [14] Tim Berners-Lee. Weaving the Web. New York: Harper Business, 2000. [15] Tim Berners-Lee and James Hendler. Publishing on the semantic web. Nature, 410:1023–1024, April 26, 2001. [16] Tim Berners-Lee, James Hendler, and Ora Lassila. The semantic web. Scientific American, May 17, 2001. [17] Mario Biagioli. Galileo’s Instruments of credit: Telescopes, images, secrecy. Chicago: University of Chicago Press, 2006. [18] Peter Block. Community: The Structure of Belonging. San Francisco: Berrett Koehler, 2008

In a similar way, today thousands of people and organizations have their own ideas about the best way to build the data web. All are aiming in roughly the same direction, but there are many differences in the details. Perhaps the best-known effort comes from academia, where many researchers are developing an approach called the semantic web. In the business world, the state of affairs is more fluid, as companies try out many different ways of sharing data. Because of these many approaches, there are passionate arguments about the best way to build the data web, often carried out with great conviction and certainty. But the data web is still in its infancy, and it’s too early to say which approach will succeed.

Interestingly, Hydra has played and lost twice in games of correspondence chess, against correspondence chess grandmaster Arno Nickel. Nickel was, however, allowed to use computer chess programs in these games. A full record of Hydra’s games may be found at [40]. p 119: Chuck Hansen’s book is [92]. The story I recount about Hansen’s methodology is told in Richard Rhodes’s book How to Write, [182], page 61. p 120: On the semantic web, see [16, 15] and http://www.w3.org/standards/semanticweb/. A stimulating alternate point of view is [88]. p 120: For Obama’s memorandum on transparency and open government, see [158]. p 123: The beautiful summary of Einstein’s general theory of relativity, “Spacetime tells matter how to move; matter tells spacetime how to curve,” is due to John Wheeler [240].


Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems by Martin Kleppmann

active measures, Amazon Web Services, billion-dollar mistake, bitcoin, blockchain, business intelligence, business logic, business process, c2.com, cloud computing, collaborative editing, commoditize, conceptual framework, cryptocurrency, data science, database schema, deep learning, DevOps, distributed ledger, Donald Knuth, Edward Snowden, end-to-end encryption, Ethereum, ethereum blockchain, exponential backoff, fake news, fault tolerance, finite state, Flash crash, Free Software Foundation, full text search, functional programming, general-purpose programming language, Hacker News, informal economy, information retrieval, Internet of things, iterative process, John von Neumann, Ken Thompson, Kubernetes, Large Hadron Collider, level 1 cache, loose coupling, machine readable, machine translation, Marc Andreessen, microservices, natural language processing, Network effects, no silver bullet, operational security, packet switching, peer-to-peer, performance metric, place-making, premature optimization, recommendation engine, Richard Feynman, self-driving car, semantic web, Shoshana Zuboff, social graph, social web, software as a service, software is eating the world, sorting algorithm, source of truth, SPARQL, speech recognition, SQL injection, statistical model, surveillance capitalism, systematic bias, systems thinking, Tragedy of the Commons, undersea cable, web application, WebSocket, wikimedia commons

_:lucy a :Person; :name "Lucy"; :bornIn _:idaho. _:idaho a :Location; :name "Idaho"; :type "state"; :within _:usa. _:usa a :Location; :name "United States"; :type "country"; :within _:namerica. _:namerica a :Location; :name "North America"; :type "continent". The semantic web If you read more about triple-stores, you may get sucked into a maelstrom of articles written about the semantic web. The triple-store data model is completely independent of the semantic web—for example, Datomic [40] is a triple-store that does not claim to have anything to do with it.vii But since the two are so closely linked in many people’s minds, we should discuss them briefly. The semantic web is fundamentally a simple and reasonable idea: websites already publish information as text and pictures for humans to read, so why don’t they also publish information as machine-readable data for computers to read?

The Resource Description Framework (RDF) [41] was intended as a mechanism for different websites to publish data in a consistent format, allowing data from different websites to be automatically combined into a web of data—a kind of internet-wide “database of everything.” Unfortunately, the semantic web was overhyped in the early 2000s but so far hasn’t shown any sign of being realized in practice, which has made many people cynical about it. It has also suffered from a dizzying plethora of acronyms, overly complex standards proposals, and hubris. However, if you look past those failings, there is also a lot of good work that has come out of the semantic web project. Triples can be a good internal data model for applications, even if you have no interest in publishing RDF data on the semantic web. The RDF data model The Turtle language we used in Example 2-7 is a human-readable format for RDF data.

# SPARQL Because RDF doesn’t distinguish between properties and edges but just uses predicates for both, you can use the same syntax for matching properties. In the following expression, the variable usa is bound to any vertex that has a name property whose value is the string "United States": (usa {name:'United States'}) # Cypher ?usa :name "United States". # SPARQL SPARQL is a nice query language—even if the semantic web never happens, it can be a powerful tool for applications to use internally. Graph Databases Compared to the Network Model In “Are Document Databases Repeating History?” on page 36 we discussed how CODASYL and the relational model competed to solve the problem of many-to-many relationships in IMS.
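To make the variable binding concrete, here is a small Python sketch of how a single triple pattern such as `?usa :name "United States"` might be evaluated against a set of triples. It assumes a toy in-memory store holding a few facts from the Turtle example, and the `is_var` and `solve` helpers are hypothetical names; a real SPARQL engine does far more than this.

```python
# A few triples from the Turtle example, as (subject, predicate, object).
TRIPLES = {
    ("_:lucy", ":name", "Lucy"),
    ("_:lucy", ":bornIn", "_:idaho"),
    ("_:idaho", ":name", "Idaho"),
    ("_:idaho", ":within", "_:usa"),
    ("_:usa", ":name", "United States"),
    ("_:usa", ":within", "_:namerica"),
    ("_:namerica", ":name", "North America"),
}


def is_var(term):
    """Terms starting with '?' are variables, as in SPARQL."""
    return term.startswith("?")


def solve(pattern, triples):
    """Yield one binding dict per triple that matches the pattern."""
    for triple in triples:
        binding = {}
        for want, have in zip(pattern, triple):
            if is_var(want):
                # A repeated variable must bind to the same value.
                if binding.setdefault(want, have) != have:
                    break
            elif want != have:
                break
        else:
            yield binding


if __name__ == "__main__":
    print(list(solve(("?usa", ":name", "United States"), TRIPLES)))
```

The same function matches an edge pattern such as `("?place", ":within", "_:usa")`, which mirrors the point that RDF uses predicates for both properties and edges.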


pages: 255 words: 76,495

The Facebook era: tapping online social networks to build better products, reach new audiences, and sell more stuff by Clara Shih

Benchmark Capital, business process, call centre, Clayton Christensen, cloud computing, commoditize, conceptual framework, corporate governance, crowdsourcing, glass ceiling, jimmy wales, Marc Benioff, Mark Zuckerberg, Metcalfe’s law, Network effects, pets.com, pre–internet, rolodex, Salesforce, Savings and loan crisis, semantic web, sentiment analysis, Sheryl Sandberg, Silicon Valley, Silicon Valley startup, social graph, social web, software as a service, tacit knowledge, Tony Hsieh, web application

Recognizing this, both established media players like Thomson Reuters as well as Internet upstarts like Metaweb are investing in efforts to define a “semantic Web.” These initiatives seek to classify Web content in a way that is understandable by computers so that the tedious work of linking information on the Web can be automated. For example, say there is a semantic Web system for selling used books over the Internet. The first time someone visits the site, she will be asked to identify herself with information such as name, address, e-mail, and phone number. The data provided is stored in a Resource Description Framework (RDF) file to provide context about this person for future visits to this site and any other semantic Web site. Similarly, any data provided about a particular book, such as title, publisher, ISBN number, cover image, and description would be stored in an RDF file about the textbook to provide context for any future references to this textbook.



pages: 371 words: 93,570

Broad Band: The Untold Story of the Women Who Made the Internet by Claire L. Evans

4chan, Ada Lovelace, air gap, Albert Einstein, Bletchley Park, British Empire, Charles Babbage, colonial rule, Colossal Cave Adventure, computer age, crowdsourcing, D. B. Cooper, dark matter, dematerialisation, Doomsday Book, Douglas Engelbart, Douglas Hofstadter, East Village, Edward Charles Pickering, game design, glass ceiling, Grace Hopper, Gödel, Escher, Bach, Haight Ashbury, Harvard Computers: women astronomers, Honoré de Balzac, Howard Rheingold, HyperCard, hypertext link, index card, information retrieval, Internet Archive, Jacquard loom, John von Neumann, Joseph-Marie Jacquard, junk bonds, knowledge worker, Leonard Kleinrock, machine readable, Mahatma Gandhi, Mark Zuckerberg, Menlo Park, military-industrial complex, Mondo 2000, Mother of all demos, Network effects, old-boy network, On the Economy of Machinery and Manufactures, packet switching, PalmPilot, pets.com, rent control, RFC: Request For Comment, rolodex, San Francisco homelessness, semantic web, side hustle, Silicon Valley, Skype, South of Market, San Francisco, Steve Jobs, Steven Levy, Stewart Brand, subscription business, tech worker, technoutopianism, Ted Nelson, telepresence, The Soul of a New Machine, Wayback Machine, Whole Earth Catalog, Whole Earth Review, women in the workforce, Works Progress Administration, Y2K

How and why that data are linked is becoming increasingly important, especially as we teach machines to interpret connections for us—in order for artificial intelligence to understand the Web, it will need an additional layer of machine-readable information on top of our documents, a kind of meta-Web that proponents call the Semantic Web. While humans might understand connections intuitively, and are willing to ignore when links rot or lead nowhere, computers require more consistent information about the source, the destination, and the meaning of every link. “That was the core of Microcosm,” Wendy says. When she began to participate in building the Semantic Web in the 2000s, it was “so exciting because I could see all my original research ideas coming to life in the Web world. We still couldn’t do all the things we could do in Microcosm in the ’90s, but we could see how effective our linkbases were.”


About the Author: Claire L. Evans is a contributor to VICE, The Guardian, WIRED, and Aeon, and is the founding editor of Terraform, VICE’s science-fiction vertical.


pages: 402 words: 110,972

Nerds on Wall Street: Math, Machines and Wired Markets by David J. Leinweber

"World Economic Forum" Davos, AI winter, Alan Greenspan, algorithmic trading, AOL-Time Warner, Apollo 11, asset allocation, banking crisis, barriers to entry, Bear Stearns, Big bang: deregulation of the City of London, Bob Litterman, book value, business cycle, butter production in bangladesh, butterfly effect, buttonwood tree, buy and hold, buy low sell high, capital asset pricing model, Charles Babbage, citizen journalism, collateralized debt obligation, Cornelius Vanderbilt, corporate governance, Craig Reynolds: boids flock, creative destruction, credit crunch, Credit Default Swap, credit default swaps / collateralized debt obligations, Danny Hillis, demand response, disintermediation, distributed generation, diversification, diversified portfolio, electricity market, Emanuel Derman, en.wikipedia.org, experimental economics, fake news, financial engineering, financial innovation, fixed income, Ford Model T, Gordon Gekko, Hans Moravec, Herman Kahn, implied volatility, index arbitrage, index fund, information retrieval, intangible asset, Internet Archive, Ivan Sutherland, Jim Simons, John Bogle, John Nash: game theory, Kenneth Arrow, load shedding, Long Term Capital Management, machine readable, machine translation, Machine translation of "The spirit is willing, but the flesh is weak." to Russian and back, market fragmentation, market microstructure, Mars Rover, Metcalfe’s law, military-industrial complex, moral hazard, mutually assured destruction, Myron Scholes, natural language processing, negative equity, Network effects, optical character recognition, paper trading, passive investing, pez dispenser, phenotype, prediction markets, proprietary trading, quantitative hedge fund, quantitative trading / quantitative finance, QWERTY keyboard, RAND corporation, random walk, Ray Kurzweil, Reminiscences of a Stock Operator, Renaissance Technologies, risk free rate, risk tolerance, risk-adjusted returns, risk/return, Robert Metcalfe, Ronald Reagan, Rubik’s Cube, Savings and loan crisis, semantic web, Sharpe ratio, short selling, short squeeze, Silicon Valley, Small Order Execution System, smart grid, smart meter, social web, South Sea Bubble, statistical arbitrage, statistical model, Steve Jobs, Steven Levy, stock buybacks, Tacoma Narrows Bridge, the scientific method, The Wisdom of Crowds, time value of money, tontine, too big to fail, transaction costs, Turing machine, two and twenty, Upton Sinclair, value at risk, value engineering, Vernor Vinge, Wayback Machine, yield curve, Yogi Berra, your tax dollars at work

This opens yet another front in the algo wars. In the past year, we have seen the major news providers, Dow Jones25 and Reuters,26 offering costly high-end, low-latency news feeds designed for machines. In addition to being faster, they include extensive XML tagging for a variety of stories. These semantic Web approaches allow clever algo warriors to extract the salient facts with much greater accuracy than they could achieve writing code to parse plaintext feeds designed for human readers. What kind of tags are they talking about? The Dow Jones product is described as over 150 macroeconomic indicators, in developed markets, and a wide range of news on publicly traded U.S. and Canadian firms, as well as some in the United Kingdom.
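The contrast drawn above between tagged feeds and plaintext parsing can be sketched roughly as follows. The XML layout, the element and attribute names, and the sample sentence are all invented for illustration; they are not the Dow Jones or Reuters formats, which the excerpt does not specify.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical machine-readable story; tag and attribute names are assumptions.
TAGGED = """
<story>
  <company ticker="XYZ">Example Corp</company>
  <indicator name="quarterly_eps" unit="USD">1.25</indicator>
  <headline>Example Corp beats estimates</headline>
</story>
"""

# The same (made-up) story written for human readers.
PLAIN = "Example Corp (XYZ) said quarterly earnings per share rose to $1.25, beating estimates."


def facts_from_tagged(xml_text):
    """Read facts straight from the labeled elements; no guessing needed."""
    root = ET.fromstring(xml_text.strip())
    company = root.find("company")
    indicator = root.find("indicator")
    return {
        "ticker": company.get("ticker"),
        "indicator": indicator.get("name"),
        "value": float(indicator.text),
    }


def facts_from_plain(text):
    """Best-effort regex scraping; brittle if the wording changes."""
    ticker = re.search(r"\(([A-Z]+)\)", text)
    value = re.search(r"\$(\d+(?:\.\d+)?)", text)
    return {
        "ticker": ticker.group(1) if ticker else None,
        "indicator": None,  # which figure the number refers to is not labeled
        "value": float(value.group(1)) if value else None,
    }


tagged_facts = facts_from_tagged(TAGGED)
plain_facts = facts_from_plain(PLAIN)
print(tagged_facts)
print(plain_facts)
```

In this sketch the tagged version states which indicator the number belongs to, while the plaintext version only yields a number and a ticker whose meaning has to be inferred from the surrounding words.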

Collectively, the new alphabet soup of technologies—AI, IA, NLP, and IR (artificial intelligence, intelligence amplification, natural language processing, and information retrieval, for those with a bigger soup bowl)—provides a means to make sense of patterns in the data collected in enterprise and global search. These means are molecular search, the use of persistent software agents so you don’t have to keep doing the same thing all the time; the semantic Web, using the information associated with data at the point of origin so there is less guessing about the meaning of what you find; and modern user interfaces and visualizations, so you can prioritize what you find, and focus on the important and the valuable in a timely way. The SEC: The Mother Lode of Pre-News The Securities and Exchange Commission is a good place to start looking for pre-news.

The elimination of the time disadvantage for ordinary investors, paying only with their taxes and using the SEC web site, is an overdue improvement in a system that (literally) delivered yesterday’s news for its first six years of existence. Other advances were slower in coming. The filings themselves remain unstructured text files, with no sign of the semantic Web and XML ideas that are used to deliver meaningful information in many other contexts. After years of lip service to modernizing EDGAR, SEC chairman Christopher Cox (who took office in 2005) made a serious effort to do so, replacing TRW with more Internet-savvy firms and actually demonstrating prototypes that allow extraction of specific content from SEC filings.


pages: 422 words: 86,414

Hands-On RESTful API Design Patterns and Best Practices by Harihara Subramanian

blockchain, business logic, business process, cloud computing, continuous integration, create, read, update, delete, cyber-physical system, data science, database schema, DevOps, disruptive innovation, domain-specific language, fault tolerance, information security, Infrastructure as a Service, Internet of things, inventory management, job automation, Kickstarter, knowledge worker, Kubernetes, loose coupling, Lyft, machine readable, microservices, MITM: man-in-the-middle, MVC pattern, Salesforce, self-driving car, semantic web, single page application, smart cities, smart contracts, software as a service, SQL injection, supply-chain management, web application, WebSocket

Learning about Web 3.0 The following sections focus on Web 3.0 and the evolution and history of web services. Web 3.0 is generally referred to as the executing semantic web, or read-write-execute web. Web 3.0 decentralizes services such as search, social media, and chat applications that are dependent on a single organization to function. Semantic and web services are the primary constituents of Web 3.0. The following diagram depicts layers of typical Web 3.0 constructs. The semantic web layers are Static Web, Translations, and Rich Internet Applications (RIA) or Rich Web built on top of the internet. [Figure: The layered structure of Web 3.0] This data-driven web adjusts according to the user's searches. For instance, if a user searches for architecture patterns, the advertisements shown are more relevant to architecture and patterns; it even remembers your last search and combines the last searched queries as well.

Services are applications hosted on application servers and interact with other applications through interfaces. SOA is not a technology or a programming language; it's a set of principles, procedures, and methodologies to develop a software application. Learning about resource-oriented architecture Resource-oriented architecture is a foundation of the semantic web (please refer to the Web 3.0 section of this chapter). The idea of ROA is to use basic, well-understood, and well-known web technologies (HTTP, URI, and XML) along with the core design principles. As we all know, the primary focus of web services is to connect information systems, and ROA defines a structural design or set of guidelines to support and implement interactions within any connected resources.


pages: 394 words: 118,929

Dreaming in Code: Two Dozen Programmers, Three Years, 4,732 Bugs, and One Quest for Transcendent Software by Scott Rosenberg

A Pattern Language, AOL-Time Warner, Benevolent Dictator For Life (BDFL), Berlin Wall, Bill Atkinson, c2.com, call centre, collaborative editing, Computer Lib, conceptual framework, continuous integration, Do you want to sell sugared water for the rest of your life?, Donald Knuth, Douglas Engelbart, Douglas Hofstadter, Dynabook, en.wikipedia.org, Firefox, Ford Model T, Ford paid five dollars a day, Francis Fukuyama: the end of history, Free Software Foundation, functional programming, General Magic, George Santayana, Grace Hopper, Guido van Rossum, Gödel, Escher, Bach, Howard Rheingold, HyperCard, index card, intentional community, Internet Archive, inventory management, Ivan Sutherland, Jaron Lanier, John Markoff, John Perry Barlow, John von Neumann, knowledge worker, L Peter Deutsch, Larry Wall, life extension, Loma Prieta earthquake, machine readable, Menlo Park, Merlin Mann, Mitch Kapor, Neal Stephenson, new economy, Nicholas Carr, no silver bullet, Norbert Wiener, pattern recognition, Paul Graham, Potemkin village, RAND corporation, Ray Kurzweil, Richard Stallman, Ronald Reagan, Ruby on Rails, scientific management, semantic web, side project, Silicon Valley, Singularitarianism, slashdot, software studies, source of truth, South of Market, San Francisco, speech recognition, stealth mode startup, stem cell, Stephen Hawking, Steve Jobs, Stewart Brand, Strategic Defense Initiative, Ted Nelson, the Cathedral and the Bazaar, Therac-25, thinkpad, Turing test, VA Linux, Vannevar Bush, Vernor Vinge, Wayback Machine, web application, Whole Earth Catalog, Y2K

You could model just about anything in a simple three-part format that looked something like the subject-verb-object arrangement of a simple English sentence: <this> <has-relationship-with> <that> Then they discovered that the answer they’d come up with had already been outlined and at least partially implemented by researchers led by Tim Berners-Lee, the scientist who had invented the World Wide Web a dozen years before. Berners-Lee had a dream he called the Semantic Web, an upgraded version of the existing Web that relied on smarter and more complex representations of data. The Semantic Web would be built on a technical foundation called RDF, for Resource Description Framework. RDF stores all information in “triples”—statements in three parts that declare relationships between things. This was very close to the structure Sagen had independently sketched out, with the advantage that a considerable amount of work over several years had already been put into codifying the details.

The RDF-based Shimmer repository was something Morgen Sagen had built expressly as a prototype; it couldn’t simply be hitched onto the real Chandler. Besides, John Anderson had never gotten the RDF religion. The whole RDF enterprise had a reputation for academic complexity and impracticality. There were lots of papers about the Semantic Web, but not a lot of working software. As one programmer after another had a look at the world of RDF, each came to a similar conclusion: It was “scary.” Anderson knew how much work programming Chandler’s user interface would be. He had been there before and understood how critical it was to keep that job manageable; it was the area most likely to cause endless delay.

Carroll, Wall Street Journal, May 11, 1990. CHAPTER 3 PROTOTYPES AND PYTHON “a crew of twenty people”: Artist Chris Cobb’s project at the Adobe Bookstore in San Francisco is chronicled at the McSweeney’s Web site at http://www.mcsweeneys.net/links/events/chriscobb.htm. Information about the Semantic Web and RDF is at http://www.w3.org/2001/sw/. “plan to throw one away” and “promise to deliver a throwaway”: Frederick Brooks, The Mythical Man-Month (Addison Wesley, 1995), pp. 115–16. “The programmer, like the poet”: Ibid., p. 7. “The lunatic, the lover, and the poet”: William Shakespeare, A Midsummer Night’s Dream, Act V, sc. i.


pages: 570 words: 115,722

The Tangled Web: A Guide to Securing Modern Web Applications by Michal Zalewski

barriers to entry, business process, defense in depth, easy for humans, difficult for computers, fault tolerance, finite state, Firefox, Google Chrome, information retrieval, information security, machine readable, Multics, RFC: Request For Comment, semantic web, Steve Jobs, telemarketer, Tragedy of the Commons, Turing test, Vannevar Bush, web application, WebRTC, WebSocket

In the traditional HTML parser in Firefox versions prior to 4, any occurrence of “--”, later followed by “>”, is also considered good enough. The Battle over Semantics The low-level syntax of the language aside, HTML is also the subject of a fascinating conceptual struggle: a clash between the ideology and the reality of the online world. Tim Berners-Lee always championed the vision of a semantic web, an interconnected system of documents in which every functional block, such as a citation, a snippet of code, a mailing address, or a heading, has its meaning explained by an appropriate machine-readable tag (say, <cite>, <code>, <address>, or <h1> to <h6>). This approach, he and other proponents argued, would make it easier for machines to crawl, analyze, and index the content in a meaningful way, and in the near future, it would enable computers to reason using the sum of human knowledge.
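To make the idea of machine-readable tags concrete, here is a minimal sketch using Python's standard `html.parser` module. It collects the text inside the semantic tags named in the excerpt; the class name `SemanticExtractor`, the helper `extract`, and the sample markup are assumptions for illustration, not anything taken from the book.

```python
from html.parser import HTMLParser

# Tags that, per the excerpt, carry meaning a machine can act on.
SEMANTIC_TAGS = {"cite", "code", "address", "h1", "h2", "h3", "h4", "h5", "h6"}


class SemanticExtractor(HTMLParser):
    """Collect (tag, text) pairs for semantic elements; ignore everything else."""

    def __init__(self):
        super().__init__()
        self.stack = []   # semantic elements that are currently open
        self.found = []   # completed (tag, text) pairs, in closing order

    def handle_starttag(self, tag, attrs):
        if tag in SEMANTIC_TAGS:
            self.stack.append([tag, []])

    def handle_data(self, data):
        # Text is credited to the innermost open semantic element only.
        if self.stack:
            self.stack[-1][1].append(data)

    def handle_endtag(self, tag):
        if self.stack and self.stack[-1][0] == tag:
            name, parts = self.stack.pop()
            self.found.append((name, "".join(parts).strip()))


# Made-up sample: three semantic elements and one generic div/span block.
SAMPLE = (
    "<h1>Release notes</h1>"
    "<p>Run <code>make install</code>, as described in <cite>The Manual</cite>.</p>"
    "<div><span>Contact us</span></div>"
)


def extract(html):
    parser = SemanticExtractor()
    parser.feed(html)
    parser.close()
    return parser.found


found = extract(SAMPLE)
print(found)
```

The heading, the code snippet, and the citation are recovered with their roles attached, while the text wrapped only in generic `<div>` and `<span>` elements gives the extractor nothing to label.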

With the help of CSS, the developers simply started relying on a soup of semantically agnostic <span> and <div> tags to build everything from headings to user-clickable buttons, all in a manner completely opaque to any automated content extraction tools. Despite having had a lasting impact on the design of the language, in some ways, the idea of a semantic web may be becoming obsolete: Online content less frequently maps to the concept of a single, viewable document, and HTML is often reduced to providing a convenient drawing surface and graphic primitives for JavaScript applications to build their interfaces with. * * * [25] To process HTML documents, Internet Explorer uses the Trident engine (aka MSHTML); Firefox and some derived products use Gecko; Safari, Chrome, and several other browsers use WebKit; and Opera relies on Presto.



The Myth of Artificial Intelligence: Why Computers Can't Think the Way We Do by Erik J. Larson

AI winter, Alan Turing: On Computable Numbers, with an Application to the Entscheidungsproblem, Albert Einstein, Alignment Problem, AlphaGo, Amazon Mechanical Turk, artificial general intelligence, autonomous vehicles, Big Tech, Black Swan, Bletchley Park, Boeing 737 MAX, business intelligence, Charles Babbage, Claude Shannon: information theory, Computing Machinery and Intelligence, conceptual framework, correlation does not imply causation, data science, deep learning, DeepMind, driverless car, Elon Musk, Ernest Rutherford, Filter Bubble, Geoffrey Hinton, Georg Cantor, Higgs boson, hive mind, ImageNet competition, information retrieval, invention of the printing press, invention of the wheel, Isaac Newton, Jaron Lanier, Jeff Hawkins, John von Neumann, Kevin Kelly, Large Hadron Collider, Law of Accelerating Returns, Lewis Mumford, Loebner Prize, machine readable, machine translation, Nate Silver, natural language processing, Nick Bostrom, Norbert Wiener, PageRank, PalmPilot, paperclip maximiser, pattern recognition, Peter Thiel, public intellectual, Ray Kurzweil, retrograde motion, self-driving car, semantic web, Silicon Valley, social intelligence, speech recognition, statistical model, Stephen Hawking, superintelligent machines, tacit knowledge, technological singularity, TED Talk, The Coming Technological Singularity, the long tail, the scientific method, The Signal and the Noise by Nate Silver, The Wisdom of Crowds, theory of mind, Turing machine, Turing test, Vernor Vinge, Watson beat the top human players on Jeopardy!, Yochai Benkler

AI researchers hoped that the ease of use would encourage even non-experts to make triples—a dream articulated by Tim Berners-Lee, the creator of HTML. Berners-Lee called it the Semantic Web, because with web pages converted into machine-readable RDF statements, computers would know what everything meant. The web would be intelligently readable by computers. AI researchers touted knowledge bases as the end of brittle systems using only statistics—because, after all, statistics aren’t sufficient for understanding. The Semantic Web and other knowledge base–centered projects in AI could finally “know” about the world, and do more than just number-crunch.



pages: 188 words: 9,226

Collaborative Futures by Mike Linksvayer, Michael Mandiberg, Mushon Zer-Aviv

4chan, AGPL, Benjamin Mako Hill, British Empire, citizen journalism, cloud computing, collaborative economy, corporate governance, crowdsourcing, Debian, Eben Moglen, en.wikipedia.org, fake news, Firefox, informal economy, jimmy wales, Kickstarter, late capitalism, lolcat, loose coupling, Marshall McLuhan, means of production, Naomi Klein, Network effects, optical character recognition, packet switching, planned obsolescence, postnationalism / post nation state, prediction markets, Richard Stallman, semantic web, Silicon Valley, slashdot, Slavoj Žižek, stealth mode startup, technoutopianism, The future is already here, the medium is the message, The Wisdom of Crowds, web application, WikiLeaks, Yochai Benkler

The announcement of Google Wave was probably the most ambitious vision for a decentralized collaborative protocol coming from Silicon Valley. It was launched with the same celebratory terminology propagated by the self-proclaimed social media gurus, only to be terminated a year later when the vision could not live up to the hype. Web 3.0 is also bullshit. The term was initially used to describe a web enhanced by Semantic Web technologies. However, these technologies have been developed painstakingly over essentially the entire history of the web and deployed increasingly in the latter part of the last decade. Many Open Source projects reject the arbitrary and counter-productive terminology of “dot releases”: the difference between the 2.9 release and the 3.0 release should not necessarily be more substantial than the one between 2.8 and 2.9.

Publishing the entire “research compendium” under appropriate terms (e.g. usually public domain for data, a free software license for software, and a liberal Creative Commons license for articles and other content) and in open formats has recently been called “reproducible research”—in computational fields, the publication of such a compendium gives other researchers all of the tools they need to build upon one’s work. Standards are also very important for enabling scientific collaboration, and not just coarse standards like RSS. The Semantic Web and in particular ontologies have sometimes been ridiculed by consumer web developers, but they are necessary for science. How can one treat the world's scientific literature as a database if it isn't possible to identify, for example, a specific chemical or gene, and agree on a name for the chemical or gene in question that different programs can use interoperably?


pages: 903 words: 235,753

The Stack: On Software and Sovereignty by Benjamin H. Bratton

1960s counterculture, 3D printing, 4chan, Ada Lovelace, Adam Curtis, additive manufacturing, airport security, Alan Turing: On Computable Numbers, with an Application to the Entscheidungsproblem, algorithmic trading, Amazon Mechanical Turk, Amazon Robotics, Amazon Web Services, Andy Rubin, Anthropocene, augmented reality, autonomous vehicles, basic income, Benevolent Dictator For Life (BDFL), Berlin Wall, bioinformatics, Biosphere 2, bitcoin, blockchain, Buckminster Fuller, Burning Man, call centre, capitalist realism, carbon credits, carbon footprint, carbon tax, carbon-based life, Cass Sunstein, Celebration, Florida, Charles Babbage, charter city, clean water, cloud computing, company town, congestion pricing, connected car, Conway's law, corporate governance, crowdsourcing, cryptocurrency, dark matter, David Graeber, deglobalization, dematerialisation, digital capitalism, digital divide, disintermediation, distributed generation, don't be evil, Douglas Engelbart, Douglas Engelbart, driverless car, Edward Snowden, Elon Musk, en.wikipedia.org, Eratosthenes, Ethereum, ethereum blockchain, Evgeny Morozov, facts on the ground, Flash crash, Frank Gehry, Frederick Winslow Taylor, fulfillment center, functional programming, future of work, Georg Cantor, gig economy, global supply chain, Google Earth, Google Glasses, Guggenheim Bilbao, High speed trading, high-speed rail, Hyperloop, Ian Bogost, illegal immigration, industrial robot, information retrieval, Intergovernmental Panel on Climate Change (IPCC), intermodal, Internet of things, invisible hand, Jacob Appelbaum, James Bridle, Jaron Lanier, Joan Didion, John Markoff, John Perry Barlow, Joi Ito, Jony Ive, Julian Assange, Khan Academy, Kim Stanley Robinson, Kiva Systems, Laura Poitras, liberal capitalism, lifelogging, linked data, lolcat, Mark Zuckerberg, market fundamentalism, Marshall McLuhan, Masdar, McMansion, means of production, megacity, megaproject, megastructure, Menlo Park, Minecraft, MITM: 
man-in-the-middle, Monroe Doctrine, Neal Stephenson, Network effects, new economy, Nick Bostrom, ocean acidification, off-the-grid, offshore financial centre, oil shale / tar sands, Oklahoma City bombing, OSI model, packet switching, PageRank, pattern recognition, peak oil, peer-to-peer, performance metric, personalized medicine, Peter Eisenman, Peter Thiel, phenotype, Philip Mirowski, Pierre-Simon Laplace, place-making, planetary scale, pneumatic tube, post-Fordism, precautionary principle, RAND corporation, recommendation engine, reserve currency, rewilding, RFID, Robert Bork, Sand Hill Road, scientific management, self-driving car, semantic web, sharing economy, Silicon Valley, Silicon Valley ideology, skeuomorphism, Slavoj Žižek, smart cities, smart grid, smart meter, Snow Crash, social graph, software studies, South China Sea, sovereign wealth fund, special economic zone, spectrum auction, Startup school, statistical arbitrage, Steve Jobs, Steven Levy, Stewart Brand, Stuxnet, Superbowl ad, supply-chain management, supply-chain management software, synthetic biology, TaskRabbit, technological determinism, TED Talk, the built environment, The Chicago School, the long tail, the scientific method, Torches of Freedom, transaction costs, Turing complete, Turing machine, Turing test, undersea cable, universal basic income, urban planning, Vernor Vinge, vertical integration, warehouse automation, warehouse robotics, Washington Consensus, web application, Westphalian system, WikiLeaks, working poor, Y Combinator, yottabyte

Just as for today's web pages, search providers are eager to provide more direct services built directly into query results themselves by predictively interpreting the intention of the query and providing its likely solution along with tools for the User to accomplish that intention as part of the search result. These are techniques sometimes associated with the semantic web, for which structured data are linked and associated to allow instrumental relations with other data, making the web as a whole more programmable by Users. Through various combinations of open or proprietary exigetics of data, and perhaps a sequence of application programming interfaces (APIs), a query entered as “book me a ticket to New York” can activate a series of secondary inquiries to calendars, banks, flight schedules, airline databases, bank accounts, and so on and, through this, initiate the cascading programming resulting in that booking.

This resulting platform might provide for the programming and counterprogramming of the resulting object landscapes and event graphs, putting them to direct use, as well as providing secondary metadata about their efficacy or accuracy. Just as most of the traffic on the Internet today is machine-to-machine, or at least machine generated, so too a semantic web of things21 would be correlated less by the cognitive dispositions or instrumental intentions of human Users, but those of “objects” and other instances within the larger meta-assemblage all querying and programming one another without human intervention or supervision. In the hype, it's easy to forget that the Internet of Things is also an Internet for Things (or for any addressable entity, however immaterial).

Fabbaloo, April 7, 2010, http://www.fabbaloo.com/blog/2010/4/7/the-3d-printer-virus-really.html. 20.  Cory Doctorow, “Metacrap: Putting the Torch to Seven Straw-men of the Meta-Utopia,” Well, August 26, 2011. 21.  Payam Barnaghi, Cory Henson, Kerry Taylor, and Wei Wang, “Semantics for the Internet of Things: Early Progress and Back to the Future,” International Journal on Semantic Web and Information Systems 8, no. 1 (2012): 1–21, http://knoesis.org/library/download/IJSWIS_SemIoT.pdf. 22.  Yann Moulier-Boutang, Cognitive Capitalism (London: Polity Press, 2012). 23.  Open Internet of Things Assembly, “Bill of Rights” http://postscapes.com/open-internet-of-things-assembly. (July 17, 2012). 24. 


pages: 680 words: 157,865

Beautiful Architecture: Leading Thinkers Reveal the Hidden Beauty in Software Design by Diomidis Spinellis, Georgios Gousios

Albert Einstein, barriers to entry, business intelligence, business logic, business process, call centre, continuous integration, corporate governance, database schema, Debian, domain-specific language, don't repeat yourself, Donald Knuth, duck typing, en.wikipedia.org, fail fast, fault tolerance, financial engineering, Firefox, Free Software Foundation, functional programming, general-purpose programming language, higher-order functions, iterative process, linked data, locality of reference, loose coupling, meta-analysis, MVC pattern, Neal Stephenson, no silver bullet, peer-to-peer, premature optimization, recommendation engine, Richard Stallman, Ruby on Rails, semantic web, smart cities, social graph, social web, SPARQL, Steve Jobs, Stewart Brand, Strategic Defense Initiative, systems thinking, the Cathedral and the Bazaar, traveling salesman, Turing complete, type inference, web application, zero-coupon bond

He has designed and built network matrix switch control systems, online games, 3D simulation/visualization environments, Internet-distributed computing platforms, P2P, and Semantic Web-based systems. He has a B.S. in computer science from the College of William and Mary and currently lives in Fairfax, Virginia. He is the president of Bosatsu Consulting, Inc., a professional services company focused on web architecture, resource-oriented computing, the Semantic Web, advanced user interfaces, scalable systems, security consulting, and other technologies of the late 20th and early 21st centuries. Diomidis Spinellis is an Associate Professor in the Department of Management Science and Technology at the Athens University of Economics and Business in Greece.

It was a forked version of Apache 1.0, written in C and reflecting the state of the art at the time.[28] It has been a steady piece of Internet infrastructure since then, but it was showing its age and needed modernization, particularly to support the W3C TAG’s 303 recommendation and higher volumes of use. Most of the data was accessible through web pages or ad hoc CGI-bin scripts because at the time, the browser seemed like the only real client to serve. As we started to realize the applicability of persistent, unambiguous identifiers for use in the Semantic Web, life sciences, publication, and similar communities, we knew that it was time to rethink the architecture to be more useful for both people and software. The PURL system was designed to mediate the tension between good names and resolvable names. Anyone who has been publishing content on the Web over time knows that links break when content gets moved around.
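The idea of mediating between good names and resolvable names can be sketched as a lookup table that answers every request for a persistent name with a redirect to wherever the content currently lives. This is only an illustration, not the PURL server's actual code: the registry contents, the `resolve` and `move` helpers, and the example URLs are all hypothetical, and the choice of 303 See Other for names that identify things rather than documents is an assumption based on the W3C TAG recommendation mentioned above.

```python
from typing import NamedTuple, Optional


class Redirect(NamedTuple):
    status: int               # HTTP status code the resolver would send
    location: Optional[str]   # current target URL, or None if unregistered


# Hypothetical registry: persistent path -> (current URL, names a non-document thing?)
PURLS = {
    "/net/example/report": ("http://docs.example.org/2009/report.html", False),
    "/net/example/people/alice": ("http://data.example.org/alice.rdf", True),
}


def resolve(path: str) -> Redirect:
    """Answer a request for a persistent name with a redirect."""
    entry = PURLS.get(path)
    if entry is None:
        return Redirect(404, None)
    target, is_thing = entry
    # Assumption: 303 See Other for identifiers of things, 302 for documents.
    return Redirect(303 if is_thing else 302, target)


def move(path: str, new_target: str) -> None:
    """Content moved: update the registry; the persistent name stays the same."""
    _, is_thing = PURLS[path]
    PURLS[path] = (new_target, is_thing)


print(resolve("/net/example/report"))
move("/net/example/report", "http://archive.example.org/report.html")
print(resolve("/net/example/report"))
```

Links that point at the persistent path keep working after `move`, because only the registry entry changes.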

These cache policies can be set on a per-folder, per-account, or backend basis and are enforced by a component running in its own thread in the server with lower priority, regularly inspecting the database for data that can be purged according to all policies applicable to it. Among the major missing puzzle pieces that were identified in the architecture at the 2007 meeting was how to approach searching and semantic linking. The KDE 4 platform was gaining powerful solutions for pervasive indexing, rich metadata handling, and semantic webs with the Strigi and Nepomuk projects, which could yield very interesting possibilities when integrated with Akonadi. It was unclear whether a component feeding data into Strigi for full indexing could be implemented as an agent, a separate process operating on the notifications from the core, or would need to be integrated into the server application itself for performance reasons.


Digital Accounting: The Effects of the Internet and Erp on Accounting by Ashutosh Deshmukh

accounting loophole / creative accounting, AltaVista, book value, business continuity plan, business intelligence, business logic, business process, call centre, computer age, conceptual framework, corporate governance, currency risk, data acquisition, disinformation, dumpster diving, fixed income, hypertext link, information security, interest rate swap, inventory management, iterative process, late fees, machine readable, money market fund, new economy, New Journalism, optical character recognition, packet switching, performance metric, profit maximization, semantic web, shareholder value, six sigma, statistical model, supply chain finance, supply-chain management, supply-chain management software, telemarketer, transaction costs, value at risk, vertical integration, warehouse automation, web application, Y2K

This style sheet is based on XHTML and will be needed to display the formatted memo exactly on the Web.
• Web communication and services: Languages in this area handle communications in the client-server environment, define protocols for exchange of information and describe Web services.
• Semantic Web and RDF: XML is also providing building blocks for the Semantic Web. The Semantic Web refers to the extension of the current Web where information definition is standardized, enabling automated tools to process data. This standardization also leads to better linking of information and easier discovery, integration and reuse of data. Such a web will enable collaborative processing of data by humans and computers in a symbiotic fashion.

Examples of XML supplementary technologies. Validation and linking technologies: XML DTD, XML Schema, XLink, XBase, XPath, XPointer, XFragment. Transformation technologies: XSL, XSLT, Canonical XML, XQuery, XInclude. Processor technologies: DOM, SAX. XML applications (non-text applications, publishing on the Web, Web communication and services, Semantic Web and Resource Description Framework (RDF), and security applications): MathML, SMIL, SMIL Animation, SVG, VoiceXML/CCXML, XHTML, XFrames, XForms, CC/PP, SOAP/XMLP, WSDL/WSCL, RDF, RDF Schema, XML Signature, XKMS, P3P, Encrypted Data.

… increasingly popular and may become a dominant method. DTDs have a solid installed base, since DTDs are used in SGML, and are not likely to vanish in the short-term. The popularity of the Web can be partially traced to its hyperlink capabilities.


pages: 573 words: 163,302

Year's Best SF 15 by David G. Hartwell; Kathryn Cramer

air freight, Black Swan, disruptive innovation, experimental subject, Future Shock, Georg Cantor, gravity well, job automation, Kuiper Belt, phenotype, precautionary principle, quantum entanglement, semantic web

You only had to see the man to know that he had an agenda like no other writer in the world. “When Calvino finished his six lectures,” mused Massimo, “they carried him off to CERN in Geneva and they made him work on the ‘Semantic Web.’ The Semantic Web works beautifully, by the way. It’s not like your foul little Internet—so full of spam and crime.” He wiped the sausage knife on an oil-stained napkin. “I should qualify that remark. The Semantic Web works beautifully—in the Italian language. Because the Semantic Web was built by Italians. They had a little bit of help from a few French Oulipo writers.” “Can we leave this place now? And visit this Italy you boast so much about?


pages: 742 words: 137,937

The Future of the Professions: How Technology Will Transform the Work of Human Experts by Richard Susskind, Daniel Susskind

23andMe, 3D printing, Abraham Maslow, additive manufacturing, AI winter, Albert Einstein, Amazon Mechanical Turk, Amazon Robotics, Amazon Web Services, Andrew Keen, Atul Gawande, Automated Insights, autonomous vehicles, Big bang: deregulation of the City of London, big data - Walmart - Pop Tarts, Bill Joy: nanobots, Blue Ocean Strategy, business process, business process outsourcing, Cass Sunstein, Checklist Manifesto, Clapham omnibus, Clayton Christensen, clean water, cloud computing, commoditize, computer age, Computer Numeric Control, computer vision, Computing Machinery and Intelligence, conceptual framework, corporate governance, creative destruction, crowdsourcing, Daniel Kahneman / Amos Tversky, data science, death of newspapers, disintermediation, Douglas Hofstadter, driverless car, en.wikipedia.org, Erik Brynjolfsson, Evgeny Morozov, Filter Bubble, full employment, future of work, Garrett Hardin, Google Glasses, Google X / Alphabet X, Hacker Ethic, industrial robot, informal economy, information retrieval, interchangeable parts, Internet of things, Isaac Newton, James Hargreaves, John Maynard Keynes: Economic Possibilities for our Grandchildren, John Maynard Keynes: technological unemployment, Joseph Schumpeter, Khan Academy, knowledge economy, Large Hadron Collider, lifelogging, lump of labour, machine translation, Marshall McLuhan, Metcalfe’s law, Narrative Science, natural language processing, Network effects, Nick Bostrom, optical character recognition, Paul Samuelson, personalized medicine, planned obsolescence, pre–internet, Ray Kurzweil, Richard Feynman, Second Machine Age, self-driving car, semantic web, Shoshana Zuboff, Skype, social web, speech recognition, spinning jenny, strong AI, supply-chain management, Susan Wojcicki, tacit knowledge, TED Talk, telepresence, The Future of Employment, the market place, The Wealth of Nations by Adam Smith, The Wisdom of Crowds, Tragedy of the Commons, transaction costs, Turing test, Two Sigma, warehouse 
robotics, Watson beat the top human players on Jeopardy!, WikiLeaks, world market for maybe five computers, Yochai Benkler, young professional

Finally, there are systems that can detect and express emotions (affective computing). Volumes have already been written on each of these four subjects. We try to give an overview rather than make an academic assessment. We are not suggesting, incidentally, that these are the only important developments. We could also have added the ‘semantic web’, ‘search algorithms’, and ‘intelligent agents’.33 But to debate which technologies are primary distracts from the bigger point—that, exploiting various technologies, our machines will continue to become increasingly capable, and able to discharge more and more tasks that we used to think were the distinctive province of human beings.

, Wall Street Journal, 23 Feb. 2011 <http://www.wsj.com> (accessed 28 March 2015). Seidman, Dov, How (Hoboken, NJ: Wiley, 2007). Sennett, Richard, The Craftsman (London: Penguin Books, 2009). Sennett, Richard, Together (London: Allen Lane, 2012). Shadbolt, Nigel, Wendy Hall, and Tim Berners-Lee, ‘The Semantic Web Revisited’, IEEE Intelligent Systems, 21: 3 (2006), 96–101. Shanteau, James, ‘Cognitive Heuristics and Biases in Behavioral Auditing: Review, Comments, and Observations’, Accounting, Organizations, and Society, 14: 1 (1989), 165–77. Shapiro, Carl, and Hal Varian, Information Rules (Boston: Harvard Business School Press, 1999).

WikiHouse, ‘WikiHouse 4.0’ <http://www.wikihouse.cc/news-2/> (accessed 8 March 2015). Wikistrat, ‘Become an Analyst’ <http://www.wikistrat.com/become-an-analyst> (accessed 8 March 2015). Wilensky, Harold, ‘The Professionalization of Everyone?’, American Journal of Sociology, 70: 2 (1964), 137–58. Wilks, Yorick, ‘What is the Semantic Web and What Will it Do for eScience’, Research Report, No.12, Oxford Internet Institute, October 2006. Winner, Langdon, Autonomous Technology (Cambridge, Mass.: MIT Press, 1977). Winner, Langdon, ‘Technology Today: Utopia or Dystopia?’, in Technology and the Rest of Culture, ed. Arien Mack (Columbus, Ohio: Ohio State University Press, 2001).


pages: 66 words: 9,247

MongoDB and Python by Niall O’Higgins

cloud computing, Debian, fault tolerance, semantic web, web application

I’ve worked with most of the usual relational databases (MSSQL Server, MySQL, PostgreSQL) and with some very interesting nonrelational databases (Freebase.com’s Graphd/MQL, Berkeley DB, MongoDB). MongoDB is at this point the system I enjoy working with the most, and choose for most projects. It sits somewhere at a crossroads between the performance and pragmatism of a relational system and the flexibility and expressiveness of a semantic web database. It has been central to my success in building some quite complicated systems in a short period of time. I hope that after reading this book you will find MongoDB to be a pleasant database to work with, and one which doesn’t get in the way between you and the application you wish to build.


pages: 272 words: 83,378

Digital Barbarism: A Writer's Manifesto by Mark Helprin

Albert Einstein, anti-communist, Berlin Wall, carbon footprint, computer age, cotton gin, crowdsourcing, Easter island, hive mind, independent contractor, invention of writing, Jacquard loom, lateral thinking, plutocrats, race to the bottom, semantic web, Silicon Valley, Silicon Valley ideology, the scientific method, Yogi Berra, zero-sum game

Had each Turkish soldier had to decide individually whether or not to make that winter ascent, they all might have thought harder and better about it in the absence of so many others carrying them and their orders along on an utterly worthless wave of quick-set belief. In the electronic culture, however, the decision has already been made in regard to such things. To quote Jeremy Carroll, chief product architect of Top Quadrant, discussing an aspect of his work: “Semantic Web technology…” will make possible “consensus instructions from many different sources, or instructions that other people have already found helpful (rather than back-breaking searches and comparisons).”23 It is the labor, care, and learning in making such comparisons that bring the benefits of experience, a sharp eye, and good judgment.

See also Taxes from copyright, 111 royalties, 47, 51, 74, 78, 113, 158 A River Runs Through It (Maclean), 164 Robinson Crusoe (DeFoe), 119 Roth, Philip, 114 Royalties, 47, 51, 74, 78, 113, 158 Rushdie, Salman, 75 Russia, copyright law in, 128 S Satie, Erik, 80 Schlesinger, Arthur, 89 Schumann, Robert, 80 SEC. See Securities and Exchange Commission Second World War, 192, 196 Securities and Exchange Commission (SEC), 29 “Semantic Web technology,” 64 Seward, William H., 59–60 Sex, 17 Shakespeare, William, 179, 194 Sharpton, Al, 166 Signet Society (Harvard), 183 Silent Spring (Carson), 105 Silicon Valley, xiii, 205 Sinatra, Frank, 24 Skull and Bones (Yale), 182 Smith, Kate, 52 Social contract, 173 Socialism, 168 Social Security, 81 Social theorists, 185 Software, piracy, 38, 214 A Soldier of the Great War (Helprin), 113 Sonny Bono Copyright Term Extension Act (1998), 120, 125–126, 127, 139, 140 Sports, 94–95 Star Wars, 164 Statute of Queen Anne (1709), 124, 127 Stevens, Justice, 115 Story, Joseph, 124 Sweden, copyright law in, 128 Switzerland, copyright law in, 128 T Tartakovsky, Joseph, 87 Taubman, Arnold, 24 Taxes, 81, 86, 87, 171–172.


pages: 336 words: 93,672

The Future of the Brain: Essays by the World's Leading Neuroscientists by Gary Marcus, Jeremy Freeman

23andMe, Albert Einstein, backpropagation, bioinformatics, bitcoin, brain emulation, cloud computing, complexity theory, computer age, computer vision, conceptual framework, correlation does not imply causation, crowdsourcing, dark matter, data acquisition, data science, deep learning, Drosophila, epigenetics, Geoffrey Hinton, global pandemic, Google Glasses, ITER tokamak, iterative process, language acquisition, linked data, mouse model, optical character recognition, pattern recognition, personalized medicine, phenotype, race to the bottom, Richard Feynman, Ronald Reagan, semantic web, speech recognition, stem cell, Steven Pinker, supply-chain management, synthetic biology, tacit knowledge, traumatic brain injury, Turing machine, twin studies, web application

INCF has worked closely with partners, including the Allen Institute, University of Oslo, Duke University, University of Edinburgh, and others, to develop a standard coordinate space for mouse brain data, dubbed “Waxholm Space” and web services to facilitate translation between mouse brain atlases. In addition, in collaboration with the Neuroscience Information Framework (NIF, San Diego) it has produced community consensus ontologies and nomenclatures for neurons and brain structures, which have been placed in a public wiki (www.neurolex.org) employing the latest semantic web technologies. INCF supports working groups of experts from around the world to produce new standards, tools, services, and guidelines for the global community. With the advent of multiple large-scale brain initiatives around the world, INCF is well positioned to help coordinate standards and infrastructure between such projects at a global scale.

Ontologies formalize the definitions of these structures and their names (and synonyms) so that the relationships between entities are explicit. Alternatively, by annotating the data with the spatial coordinates of where it was measured, it would be associated with the volume that has been named reticular nucleus of the thalamus. Through careful curation of data and annotation using next-generation semantic web technologies and spatial coordinates, each piece of data will be part of a rich brain atlas integrated with a web of knowledge about the brain. The Neuroinformatics Platform, coordinated by groups from the École Polytechnique Fédérale de Lausanne (EPFL), Karolinska Institute, University of Oslo, Forschungszentrum Jülich, Universidad Politécnica de Madrid, and Radboud Universiteit Nijmegen, will provide the tools for organizing neuroscience data in atlases that bring together collections of data about the mouse and human brains from around the world.


pages: 597 words: 119,204

Website Optimization by Andrew B. King

AltaVista, AOL-Time Warner, bounce rate, don't be evil, Dr. Strangelove, en.wikipedia.org, Firefox, In Cold Blood by Truman Capote, information retrieval, iterative process, Kickstarter, machine readable, medical malpractice, Network effects, OSI model, performance metric, power law, satellite internet, search engine result page, second-price auction, second-price sealed-bid, semantic web, Silicon Valley, slashdot, social bookmarking, social graph, Steve Jobs, the long tail, three-martini lunch, traumatic brain injury, web application

=photovoltaic+panels Even better, remove all the variable query characters (?, $, and #): http://www.example.com/photovoltaic+panels By eliminating the suffix to URIs, you avoid broken links and messy mapping when changing technologies in the future. See Chapter 9 for details on URI rewriting. See also "Cool URIs for the Semantic Web," at http://www.w3.org/TR/cooluris/. Write compelling summaries In newspaper parlance, the description that goes with a headline is called a deck or a blurb. Great decks summarize the story in a couple of sentences, enticing the user to read the article. Include keywords describing the major theme of the article for search engines.
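The URI cleanup described above can be sketched with Python's standard `urllib.parse` module. The helper name `clean_uri`, the `q` parameter, and the `search.php` input URL are assumptions made for illustration (the excerpt's original query URL is truncated), and a production site would normally do this in the web server's rewrite rules rather than in application code.

```python
from urllib.parse import parse_qs, quote_plus, urlsplit, urlunsplit


def clean_uri(uri: str, keyword_param: str = "q") -> str:
    """Drop the query string and fragment, keeping the keywords as the path.

    If the query carries no `keyword_param`, the original path is kept.
    """
    parts = urlsplit(uri)
    keywords = parse_qs(parts.query).get(keyword_param, [])
    path = parts.path
    if keywords:
        # parse_qs decodes "+" to a space; quote_plus encodes it back.
        path = "/" + quote_plus(keywords[0])
    # Empty query and fragment: no "?" or "#" in the result.
    return urlunsplit((parts.scheme, parts.netloc, path, "", ""))


print(clean_uri("http://www.example.com/search.php?q=photovoltaic+panels#top"))
# → http://www.example.com/photovoltaic+panels
```

The old query-style address would still need a permanent redirect to the clean one so that existing links keep resolving.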

The following short example shows how the statement mentioned previously could be encoded in a web page: <div xmlns:dc="http://purl.org/dc/elements/1.1/" about="http://www.oreilly.com/catalog/9780596515089"> <span property="dc:creator">Andy King</span> </div> Soon, a significant amount of traffic from search engines will depend on the extent to which the underlying site makes useful structured data available. Things such as microformats and RDFa have been around in various forms for years, but now that search engines are noticing them, SEO practitioners are starting to take note, too. * * * [34] http://www.techcrunch.com/2008/03/13/yahoo-embraces-the-semantic-web-expect-the-web-to-organize-itself-in-a-hurry/ [35] http://www.microformats.org [36] http://gmpg.org/xfn/11 [37] http://microformats.org/wiki/hcard [38] http://www.ietf.org/rfc/rfc2426.txt [39] http://www.w3.org/RDF/ [40] http://www.w3.org/TR/rdfa-syntax/ [41] http://www.w3.org/TR/curie Chapter 2.
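To make concrete what a consumer of such markup sees, here is a rough sketch that pulls the subject, property, and value out of a snippet like the one above using only Python's standard `html.parser`. It is not a conforming RDFa processor: the `RDFaSketch` class is hypothetical, it handles only the `about`, `property`, and `xmlns:` attributes, and it assumes the Dublin Core namespace `http://purl.org/dc/elements/1.1/`.

```python
from html.parser import HTMLParser

SNIPPET = """
<div xmlns:dc="http://purl.org/dc/elements/1.1/"
     about="http://www.oreilly.com/catalog/9780596515089">
  <span property="dc:creator">Andy King</span>
</div>
"""


class RDFaSketch(HTMLParser):
    """Collects (subject, predicate, literal) triples from simple RDFa markup."""

    def __init__(self):
        super().__init__()
        self.prefixes = {}    # prefix -> namespace URI
        self.subject = None   # most recent `about` value
        self.pending = None   # predicate waiting for its text content
        self.triples = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        for name, value in attrs.items():
            if name.startswith("xmlns:"):
                self.prefixes[name[len("xmlns:"):]] = value
        if "about" in attrs:
            self.subject = attrs["about"]
        if "property" in attrs:
            prefix, _, local = attrs["property"].partition(":")
            # Expand the prefix to a full URI when it is declared.
            self.pending = self.prefixes.get(prefix, prefix + ":") + local

    def handle_data(self, data):
        if self.pending and data.strip():
            self.triples.append((self.subject, self.pending, data.strip()))
            self.pending = None


parser = RDFaSketch()
parser.feed(SNIPPET)
parser.close()
print(parser.triples)
```

The single triple it collects says that the book identified by the O'Reilly catalog URI has the creator "Andy King", which is the machine-readable statement a search engine could pick up.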


Text Analytics With Python: A Practical Real-World Approach to Gaining Actionable Insights From Your Data by Dipanjan Sarkar

bioinformatics, business intelligence, business logic, computer vision, continuous integration, data science, deep learning, Dr. Strangelove, en.wikipedia.org, functional programming, general-purpose programming language, Guido van Rossum, information retrieval, Internet of things, invention of the printing press, iterative process, language acquisition, machine readable, machine translation, natural language processing, out of africa, performance metric, premature optimization, recommendation engine, self-driving car, semantic web, sentiment analysis, speech recognition, statistical model, text mining, Turing test, web application

Semantic network around the concept fish. In the network in Figure 1-17, we can see some of the concepts discussed earlier around fish and also specific types of fish like eel, salmon, shark, and so on, which can be hyponyms to the concept fish. These semantic networks are formally denoted and represented by semantic data models using graph structures, where concepts or entities are the nodes and the edges denote the relationships. The Semantic Web is an extension of the World Wide Web using semantic metadata annotations and embeddings using data-modeling techniques like Resource Description Framework (RDF) and Web Ontology Language (OWL). In linguistics, we have a rich lexical corpus and database called WordNet, which has an exhaustive list of different lexical entities grouped together based on semantic similarity (for example, synonyms) into synsets.
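A graph of concepts and labeled relationships like the one described can be sketched in a few lines of plain Python. The `SemanticNetwork` class and its method names are hypothetical; the eel, salmon, and shark hyponyms come from the text, while the `animal` and `lives_in` edges are made up to give the example something to traverse.

```python
from collections import defaultdict


class SemanticNetwork:
    """Concepts are nodes; each labeled edge is a (subject, relation, object) triple."""

    def __init__(self):
        self.edges = defaultdict(set)  # (subject, relation) -> set of objects

    def add(self, subject, relation, obj):
        self.edges[(subject, relation)].add(obj)

    def hyponyms(self, concept):
        """Direct hyponyms: every node with an is_a edge pointing at `concept`."""
        return sorted(
            subject
            for (subject, relation), objects in self.edges.items()
            if relation == "is_a" and concept in objects
        )

    def is_a(self, concept, ancestor):
        """Follow is_a edges transitively (a concept counts as its own ancestor)."""
        seen, stack = set(), [concept]
        while stack:
            node = stack.pop()
            if node == ancestor:
                return True
            if node in seen:
                continue
            seen.add(node)
            stack.extend(self.edges.get((node, "is_a"), ()))
        return False


net = SemanticNetwork()
for fish_type in ("eel", "salmon", "shark"):
    net.add(fish_type, "is_a", "fish")
net.add("fish", "is_a", "animal")      # illustrative edge, not from the text
net.add("fish", "lives_in", "water")   # illustrative edge, not from the text

print(net.hyponyms("fish"))            # ['eel', 'salmon', 'shark']
print(net.is_a("salmon", "animal"))    # True
```

RDF uses the same subject, relation, object shape for its statements, with URIs in place of the short strings used here.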

Keyphrase extraction, also known as terminology extraction, is defined as the process or technique of extracting key important and relevant terms or phrases from a body of unstructured text such that the core topics or themes of the text document(s) are captured in these key phrases. This technique falls under the broad umbrella of information retrieval and extraction. Keyphrase extraction finds its uses in many areas, including the following: the Semantic Web, query-based search engines and crawlers, recommendation systems, tagging systems, document similarity, and translation. Keyphrase extraction is often the starting point for carrying out more complex tasks in text analytics or NLP, and the output from this can itself act as features for more complex systems.
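As a very small illustration of the idea, the sketch below treats runs of consecutive non-stopwords as candidate phrases and ranks them by raw frequency. This is a deliberately naive stand-in chosen for brevity, not the method the book goes on to describe; the function name, the tiny stopword list, and the sample text are all assumptions.

```python
import re
from collections import Counter

# A tiny illustrative stopword list; real systems use a much larger one.
STOPWORDS = {"a", "an", "and", "are", "as", "for", "in", "is", "of", "on", "or", "the", "to", "with"}


def extract_keyphrases(text, top_n=5, min_words=2, max_words=3):
    """Rank runs of consecutive non-stopwords by how often they occur."""
    counts = Counter()
    for sentence in re.split(r"[.!?,;:]+", text.lower()):
        run = []
        # The trailing None flushes the last run of each sentence.
        for word in re.findall(r"[a-z]+", sentence) + [None]:
            if word is None or word in STOPWORDS:
                if min_words <= len(run) <= max_words:
                    counts[" ".join(run)] += 1
                run = []
            else:
                run.append(word)
    return counts.most_common(top_n)


TEXT = (
    "The semantic web is an extension of the web. "
    "Keyphrase extraction is useful for the semantic web and for recommendation systems. "
    "Recommendation systems and the semantic web are related."
)
print(extract_keyphrases(TEXT, top_n=2))
# → [('semantic web', 3), ('recommendation systems', 2)]
```

Frequency alone favors common phrases; weighting schemes such as TF-IDF or collocation statistics are the usual next step when several documents are available.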


pages: 532 words: 139,706

Googled: The End of the World as We Know It by Ken Auletta

"World Economic Forum" Davos, 23andMe, AltaVista, An Inconvenient Truth, Andy Rubin, Anne Wojcicki, AOL-Time Warner, Apple's 1984 Super Bowl advert, Ben Horowitz, bioinformatics, Burning Man, carbon footprint, citizen journalism, Clayton Christensen, cloud computing, Colonization of Mars, commoditize, company town, corporate social responsibility, creative destruction, death of newspapers, digital rights, disintermediation, don't be evil, facts on the ground, Firefox, Frank Gehry, Google Earth, hypertext link, Innovator's Dilemma, Internet Archive, invention of the telephone, Jeff Bezos, jimmy wales, John Markoff, Kevin Kelly, knowledge worker, Larry Ellison, Long Term Capital Management, Marc Andreessen, Mark Zuckerberg, Marshall McLuhan, Mary Meeker, Menlo Park, Network effects, new economy, Nicholas Carr, PageRank, Paul Buchheit, Peter Thiel, Ralph Waldo Emerson, Richard Feynman, Sand Hill Road, Saturday Night Live, semantic web, sharing economy, Sheryl Sandberg, Silicon Valley, Skype, slashdot, social graph, spectrum auction, stealth mode startup, Stephen Hawking, Steve Ballmer, Steve Jobs, strikebreaker, Susan Wojcicki, systems thinking, telemarketer, the Cathedral and the Bazaar, the long tail, the scientific method, The Wisdom of Crowds, Tipper Gore, Upton Sinclair, vertical integration, X Prize, yield management, zero-sum game

One could argue that the ultimate vertical search would be provided by Artificial Intelligence (AI), computers that could infer what users actually sought. This has always been an obsession of Google’s founders, and they have recruited engineers who specialize in AI. The term is sometimes used synonymously with another, “the semantic Web,” which has long been championed by Tim Berners-Lee. This vision appears to be a long way from becoming real. Craig Silverstein, Google employee number 1, said a thinking machine is probably “hundreds of years away.” Marc Andreessen suggests that it is a pipe dream. “We are no closer to a computer that thinks like a person than we were fifty years ago,” he said.

Davenport, “Reverse Engineering Google’s Innovation Machine,” Harvard Business Review, April 2008. 324 Its social network site: author interviews with Google executives in Russia, Jason Bush, “Where Google Isn’t Goliath,” BusinessWeek, June 26, 2008. 324 “These companies air kiss”: author interview with Andrew Lack, October 4, 2007. 324 Facebook had 200 million users: author interview with Sheryl Sandberg, March 30, 2009. 324 “Anybody that gets”: author interview with Bill Campbell, October 8, 2007. 325 Lee began with: author interview with Kwan Lee, February 10, 2009. 325 “lacks a social gene”: author interview with John Borthwick, April 28, 2008. 326 “If I were Google”: author interview with Danny Sullivan, August 27, 2007. 326 The problem with horizontal search: author interview with Jason Calacanis, September 21, 2007. 327 “the semantic web”: Katie Franklin, “Google May Be Displaced, Said World Wide Web Creator Tim Berners-Lee”, Daily Telegraph, March 3, 2008. 327 “hundreds of years away”: author interview with Craig Silverstein, September 17, 2007. 327 “We are no closer”: author interview with Marc Andreessen, March 27, 2008. 327 In his provocative book: Nicholas Carr, The Big Switch: Rewiring the World, from Edison to Google, W.


pages: 303 words: 67,891

Advances in Artificial General Intelligence: Concepts, Architectures and Algorithms: Proceedings of the Agi Workshop 2006 by Ben Goertzel, Pei Wang

AI winter, artificial general intelligence, backpropagation, bioinformatics, brain emulation, classic study, combinatorial explosion, complexity theory, computer vision, Computing Machinery and Intelligence, conceptual framework, correlation coefficient, epigenetics, friendly AI, functional programming, G4S, higher-order functions, information retrieval, Isaac Newton, Jeff Hawkins, John Conway, Loebner Prize, Menlo Park, natural language processing, Nick Bostrom, Occam's razor, p-value, pattern recognition, performance metric, precautionary principle, Ray Kurzweil, Rodney Brooks, semantic web, statistical model, strong AI, theory of mind, traveling salesman, Turing machine, Turing test, Von Neumann architecture, Y2K

An editorial panel of internationally well-known scholars is appointed to provide a high quality selection. Series Editors: J. Breuker, R. Dieng-Kuntz, N. Guarino, J.N. Kok, J. Liu, R. López de Mántaras, R. Mizoguchi, M. Musen and N. Zhong Volume 157 Recently published in this series Vol. 156. R.M. Colomb, Ontology and the Semantic Web Vol. 155. O. Vasilecas et al. (Eds.), Databases and Information Systems IV – Selected Papers from the Seventh International Baltic Conference DB&IS’2006 Vol. 154. M. Duží et al. (Eds.), Information Modelling and Knowledge Bases XVIII Vol. 153. Y. Vogiazou, Design for Emergence – Collaborative Social Play with Online and Location-Based Media Vol. 152.

NARS can be connected to existing knowledge bases, such as Cyc (for commonsense knowledge), WordNet (for linguistic knowledge), Mizar (for mathematical knowledge), and so on. For each of them, a special interface module should be able to approximately translate knowledge from its original format into Narsese.
• The Internet. It is possible for NARS to be equipped with additional modules, which use techniques like semantic web, information retrieval, and data mining, to directly acquire certain knowledge from the Internet, and put them into Narsese.
• Natural language interface. After NARS has learned a natural language (as discussed previously), it should be able to accept knowledge from various sources in that language.


pages: 165 words: 50,798

Intertwingled: Information Changes Everything by Peter Morville

A Pattern Language, Airbnb, Albert Einstein, Arthur Eddington, augmented reality, Bernie Madoff, bike sharing, Black Swan, business process, Cass Sunstein, cognitive dissonance, collective bargaining, Computer Lib, disinformation, disruptive innovation, folksonomy, holacracy, index card, information retrieval, Internet of things, Isaac Newton, iterative process, Jane Jacobs, Jeff Hawkins, John Markoff, Kanban, Lean Startup, Lyft, messenger bag, minimum viable product, Mother of all demos, Nelson Mandela, Paul Graham, peer-to-peer, Project Xanadu, quantum entanglement, RFID, Richard Thaler, ride hailing / ride sharing, Schrödinger's Cat, self-driving car, semantic web, sharing economy, Silicon Valley, Silicon Valley startup, single source of truth, source of truth, Steve Jobs, Stewart Brand, systems thinking, Ted Nelson, the Cathedral and the Bazaar, The Death and Life of Great American Cities, the scientific method, The Wisdom of Crowds, theory of mind, uber lyft, urban planning, urban sprawl, Vannevar Bush, vertical integration, zero-sum game

If you look deeper, you’ll see triples – subject, predicate, object – defining semantic relations as precisely as possible. In ontological experiments, domain-specific models of entities, relationships, and attributes push the limits of information visualization and knowledge discovery. We’re on the verge of teaching systems to make links that uncover new questions. Figure 3-4. The Semantic Web is built on triples. Of course, links aren’t limited to digital networks. A book affords random access with its index and citations. A park links places with signs, paths, and bridges. And, off course, we may need a table of contents or a map or a metaphor, so we might know where we can go from where we are.
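The subject, predicate, object structure described above can be sketched in a few lines. This is a minimal illustration, assuming plain strings stand in for identifiers; the `ex:` names, the `Triple` type, and the `match` helper are hypothetical and not taken from the book or from any RDF library.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One statement: a subject linked to an object by a named predicate."""
    subject: str
    predicate: str
    obj: str


def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples that fit a pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
        and (obj is None or t.obj == obj)
    ]


# Made-up facts for illustration only.
GRAPH = [
    Triple("ex:Alice", "ex:knows", "ex:Bob"),
    Triple("ex:Alice", "ex:worksAt", "ex:Library"),
    Triple("ex:Bob", "ex:worksAt", "ex:Library"),
]

if __name__ == "__main__":
    # Everything stated about Alice.
    print(match(GRAPH, subject="ex:Alice"))
    # Everyone who works at the library.
    print([t.subject for t in match(GRAPH, predicate="ex:worksAt", obj="ex:Library")])
```

Because every statement has the same three-part shape, a single pattern-matching function is enough to ask many different questions of the same data.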


Beautiful Data: The Stories Behind Elegant Data Solutions by Toby Segaran, Jeff Hammerbacher

23andMe, airport security, Amazon Mechanical Turk, bioinformatics, Black Swan, business intelligence, card file, cloud computing, computer vision, correlation coefficient, correlation does not imply causation, crowdsourcing, Daniel Kahneman / Amos Tversky, DARPA: Urban Challenge, data acquisition, data science, database schema, double helix, en.wikipedia.org, epigenetics, fault tolerance, Firefox, Gregor Mendel, Hans Rosling, housing crisis, information retrieval, lake wobegon effect, Large Hadron Collider, longitudinal study, machine readable, machine translation, Mars Rover, natural language processing, openstreetmap, Paradox of Choice, power law, prediction markets, profit motive, semantic web, sentiment analysis, Simon Singh, social bookmarking, social graph, SPARQL, sparse data, speech recognition, statistical model, supply-chain management, systematic bias, TED Talk, text mining, the long tail, Vernor Vinge, web application

In practice, this is usually a manual process, but if we expect to build systems that can easily integrate hundreds or thousands of databases, we need to find ways to eliminate a lot of the manual work involved in such integrations. Various efforts to resolve these naming problems have been attempted. In the Semantic Web community an effort called “Linked Open Data” has emerged, wherein people are encouraging one another to refer to specific objects (like a movie, a person, or a restaurant) by a standard Uniform Resource Identifier (URI), so everyone knows when two people are talking about the same thing. There have also been several efforts to standardize on a set of ontologies, which describe what fields should be used to describe things like a restaurant or a movie in all cases.
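As a rough sketch of why shared URIs reduce manual integration work, the snippet below merges records from two independent sources simply by keying on the URI each record carries. The data, the `example.org` URIs, and the `merge_by_uri` helper are all hypothetical; real integrations also have to handle conflicting values and differing field names, which is where the ontology efforts mentioned above come in.

```python
def merge_by_uri(*sources):
    """Combine records from several sources, keyed by their shared URI.

    Each record is a dict with a "uri" key. On a field conflict, the value
    from the later source wins (a simplifying assumption for this sketch).
    """
    merged = {}
    for records in sources:
        for record in records:
            fields = {key: value for key, value in record.items() if key != "uri"}
            merged.setdefault(record["uri"], {}).update(fields)
    return merged


# Two made-up databases that describe the same movie without coordinating.
LISTINGS = [
    {"uri": "https://example.org/movie/42", "title": "Example Movie", "year": 1999},
]
REVIEWS = [
    {"uri": "https://example.org/movie/42", "rating": 4.5},
    {"uri": "https://example.org/movie/7", "rating": 3.0},
]

if __name__ == "__main__":
    for uri, fields in merge_by_uri(LISTINGS, REVIEWS).items():
        print(uri, fields)
```

No name matching or fuzzy comparison is needed here: agreement on the identifier does the work.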

His paper on the Birch clustering algorithm received the SIGMOD 10-Year Test-of-Time award, and he has written the widely used text Database Management Systems (with Johannes Gehrke; McGraw-Hill). He is Chair of ACM SIGMOD, and a Fellow of the ACM and IEEE. Toby Segaran is the author of two O’Reilly titles, the very popular Programming Collective Intelligence and the recently released Programming the Semantic Web. He currently works at Metaweb, where he develops large-scale reconciliation algorithms in an attempt to create a free database of shared keys for all other public databases. Prior to working at Metaweb, he started a biotech software company, which was acquired in 2003 by Genstruct, a systems biology company.


pages: 219 words: 63,495

50 Future Ideas You Really Need to Know by Richard Watson

23andMe, 3D printing, access to a mobile phone, Albert Einstein, Alvin Toffler, artificial general intelligence, augmented reality, autonomous vehicles, BRICs, Buckminster Fuller, call centre, carbon credits, Charles Babbage, clean water, cloud computing, collaborative consumption, computer age, computer vision, crowdsourcing, dark matter, dematerialisation, Dennis Tito, digital Maoism, digital map, digital nomad, driverless car, Elon Musk, energy security, Eyjafjallajökull, failed state, Ford Model T, future of work, Future Shock, gamification, Geoffrey West, Santa Fe Institute, germ theory of disease, global pandemic, happiness index / gross national happiness, Higgs boson, high-speed rail, hive mind, hydrogen economy, Internet of things, Jaron Lanier, life extension, Mark Shuttleworth, Marshall McLuhan, megacity, natural language processing, Neil Armstrong, Network effects, new economy, ocean acidification, oil shale / tar sands, pattern recognition, peak oil, personalized medicine, phenotype, precision agriculture, private spaceflight, profit maximization, RAND corporation, Ray Kurzweil, RFID, Richard Florida, Search for Extraterrestrial Intelligence, self-driving car, semantic web, Skype, smart cities, smart meter, smart transportation, space junk, statistical model, stem cell, Stephen Hawking, Steve Jobs, Steven Pinker, Stewart Brand, strong AI, Stuxnet, supervolcano, synthetic biology, tech billionaire, telepresence, The Wisdom of Crowds, Thomas Malthus, Turing test, urban decay, Vernor Vinge, Virgin Galactic, Watson beat the top human players on Jeopardy!, web application, women in the workforce, working-age population, young professional

Web 2.0 A term often used to describe Web applications that help individuals to share information online, examples being sites such as Facebook and YouTube. Sometimes referred to as the participatory or conversational Web.
Web 3.0 The next stage of Web development, although the term causes much disagreement. Sometimes refers to the ability of search engines to answer complex questions. It can also refer to the personalized Web, semantic Web or the geo-tagging of information.
Web 4.0 Like Web 3.0 but immersive.


pages: 673 words: 164,804

Peer-to-Peer by Andy Oram

AltaVista, big-box store, c2.com, combinatorial explosion, commoditize, complexity theory, correlation coefficient, dark matter, Dennis Ritchie, fault tolerance, Free Software Foundation, Garrett Hardin, independent contractor, information retrieval, Kickstarter, Larry Wall, Marc Andreessen, moral hazard, Network effects, P = NP, P vs NP, p-value, packet switching, PalmPilot, peer-to-peer, peer-to-peer model, Ponzi scheme, power law, radical decentralization, rolodex, Ronald Coase, Search for Extraterrestrial Intelligence, semantic web, SETI@home, Silicon Valley, slashdot, statistical model, Tragedy of the Commons, UUNET, Vernor Vinge, web application, web of trust, Zimmermann PGP

This should be reflected both in the user interface and in the engine itself.
Resources and relationships: A historical overview
So where does this all leave us? How do we infuse our peer-to-peer applications with the metadata lessons learned from the Web? The core of the World Wide Web Consortium’s (W3C) metadata vision is a concept known as the Semantic Web. This is not a separate Web from the one we currently weave and wander, but a layer of metadata providing richer relationships between the ostensibly disparate resources we visit with our mouse clicks. While HTML’s hyperlinks are simple linear paths lacking any obvious meaning, such semantics do exist and need only a means of expression.

Groove, Why secure email is a failure sandboxing, Sandboxing and wrappers Saudi Arabia and legal status of ISPs, Precedents and parries scalability, Performance Freenet and, Scalability Gnutella and, Scalability web of trust and, Codifying reputation on a wide scale: The PGP web of trust scale-free link distributions, Link distribution in Freenet Schmidt, Bob, Placing nodes on the network, Host caches Schmidt, Eric, File sharing: Napster and successors Schneier, Bruce, Cryptography fundamentals, Groove versus email scoring algorithms, Scoring algorithms–Scoring algorithms factoring credibility into, Credibility Free Haven, Reputation systems incorporating credibility into, Scoring algorithms Reputation Server, Reputation metrics–Reputation metrics treating reputations as probabilities, Scoring algorithms vs. privacy of transaction data, Privacy and information leaks scoring postings/posters on Slashdot, Who will moderate the moderators: Slashdot scoring systems, Scoring systems–True decentralization adversarial approach to, Scoring algorithms aspects of, Aspects of a scoring system attacks against, Attacks and adversaries, Attacks and adversaries bootstrapping, Bootstrapping collecting ratings, Collecting ratings confidence values and, Personalizing reputation searches decentralizing, Decentralizing the scoring system–True decentralization default reputation score, Bootstrapping detecting suspicious behavior, Scoring algorithms importance of user interface in, Personalizing reputation searches meta-reputation problem, Multiple trusted parties privacy/information leaks, Privacy and information leaks–Privacy and information leaks qualitative vs. 
quantitative information, Personalizing reputation searches qualities of good, Scoring systems Reputation Server, Identity as an element of reputation, Scoring system separating performance and credibility, Scoring algorithms timing attacks against privacy of, Privacy and information leaks true decentralization, True decentralization using multiple trusted parties, Multiple trusted parties using transactions to calculate ratings, Attacks and adversaries weighting a score, Reputation systems Scott, Tracy, Distributed intelligence SDMI (Secure Digital Music Initiative), Yesterday’s technology at tomorrow’s prices, two days late SDSI/SPKI (Simple Distributed Security Infrastructure/Simple Public Key Infrastructure), Codifying reputation on a wide scale: The PGP web of trust, Shared space formation and trusted authentication search algorithms and centralization, Decentralization search engines centralized vs. distributed, Trust and search engines–Distributed search engines problems with, Searching trust issues and, Trust and search engines–Deniability secret sharing algorithms, Anonymity for anonymous storage Publius, Secret sharing Secure Digital Music Initiative (SDMI), Yesterday’s technology at tomorrow’s prices, two days late Secure Sockets Layer (see SSL) security, Security–Taxonomy of Groove keys centralizing crucial aspects of, Central control and local autonomy–Central control and local autonomy collaboration and, Groove versus email email and, Groove versus email granular, distributed at workgroup level, Groove versus email guarantees made by Groove, Security Red Rover weaknesses and, Putting low-tech “weaknesses” into perspective–Putting low-tech “weaknesses” into perspective of reputation systems, Attacks and adversaries selecting favored users, Accountability Semantic Web, Resources and relationships: A historical overview SERENDIP project, Radio SETI server deniability, Anonymity for anonymous storage server software, Publius, Publius, Publish operation, 
Server software server volunteers, Denial of service attacks, Mojo Nation and Free Haven, Freenet server-anonymity, Anonymity for anonymous storage attacks on, Attacks on anonymity Eternity Usenet and, An analysis of anonymity Free Haven and, An analysis of anonymity Freenet and, An analysis of anonymity Gnutella and, An analysis of anonymity Publius and, An analysis of anonymity servers dormant, Introducers web (see web servers) servnet, Free Haven (see Free Haven, servnet) SETI@home, Some context and a definition, Maximizing use of far-flung resources: Distributed computation, SETI@home–The peer-to-peer paradigm client program of, How SETI@home works computations, How SETI@home works data distribution server, How SETI@home works detecting signals, How SETI@home works difficulties and challenges, Trials and tribulations drift rates and, How SETI@home works floating-point operations and, The world’s most powerful computer how it began, SETI@home–Radio SETI how it works, How SETI@home works–How SETI@home works peer-to-peer paradigm and, The peer-to-peer paradigm processor-specific optimizations, Trials and tribulations security challenges, Trials and tribulations SERENDIP and, How SETI@home works server performance problems, Trials and tribulations user interest in, Human factors world’s most powerful computer, The world’s most powerful computer SHA-1 hash function (Publius), Hash functions Shamir, Adi, Nonparallelizable work functions, Micropayment digital cash schemes Shamir’s secret sharing algorithm (Publius), Secret sharing, True decentralization shared databases, Ways to fill shared databases–Napster: Harnessing the power of personal selfishness shared spaces cryptographic algorithms, Security characteristics of a shared space email, Groove versus email weaknesses of, Why secure email is a failure–Why secure email is a failure forming, Shared space formation and trusted authentication–Shared space formation and trusted authentication Groove, Security, The 
solution: A Groove shared space inviting people into, Inviting people into shared spaces joining or leaving, Anatomy of a mutually-trusting shared space mutually-suspicious, Mutually-suspicious shared spaces–Fetching lost messages mutually-trusting, Mutually-trusting shared spaces–The key to mutual trust projecting identities into, Anatomy of a mutually-trusting shared space security characteristics of, Security characteristics of a shared space–Security characteristics of a shared space using a relay server for peer communication, Message fanout shares Free Haven expiration dates of, Share expiration splitting files into, Publication trading, The design of Free Haven, Trading Publius, System architecture, Publius in a nutshell protecting against update attacks, Using the Update mechanism to censor retrieving, Retrieve operation splitting the key into, Publish operation–Publish operation shilling, Attacks and adversaries categorizing reputations to defend against, Scoring algorithms tying feedback to transactions to avoid, Collecting ratings Shirky, Clay, Contents of this book, Listening to Napster–New winners and losers, Contributors Shostack, Adam, Communications channel signals, detecting (SETI@home), How SETI@home works Signature Verification Keys (SVKs), Signature Verification Keys (SVKs) signature/verification key pairs, Anatomy of a mutually-trusting shared space, Taxonomy of Groove keys signatures, digital (see digital signatures) Simon and Rackoff, Future work Simple Object Access Protocol (SOAP) and web services, Web services and content syndication Simple Symmetric Transport Protocol (SSTP), Message fanout simulating behavior over time Freenet, Initial experiments–Initial experiments Gnutella, Initial experiments–Initial experiments fault tolerance in a Freenet network, Simulating fault tolerance in Gnutella, Fault tolerance and link distribution in Gnutella growth, in a Freenet network, Simulating growth–Simulating growth Sipser, Michael, Open source 
software Slashdot effect, Freenet protecting against, Active caching and mirroring resource allocation and, Conclusion Slashdot moderation system, Who will moderate the moderators: Slashdot vs.


pages: 226 words: 17,533

Programming Scala: tackle multicore complexity on the JVM by Venkat Subramaniam

augmented reality, business logic, continuous integration, domain-specific language, don't repeat yourself, functional programming, higher-order functions, loose coupling, semantic web, type inference, web application

You can build full applications entirely in Scala or intermix it to the extent you desire with Java and other languages on the JVM. So, your Scala code could be as small as a script or as large as a full-fledged enterprise application. Scala has been used to build applications in various domains including telecommunications, social networking, semantic web, and digital asset management. Apache Camel uses Scala for its DSL to create routing rules. Lift WebFramework is a powerful web development framework built using Scala. It takes full advantage of Scala features such as conciseness, expressiveness, pattern matching, and concurrency. 1.2 What’s Scala?


pages: 193 words: 19,478

Memory Machines: The Evolution of Hypertext by Belinda Barnet

augmented reality, Benoit Mandelbrot, Bill Duvall, British Empire, Buckminster Fuller, Charles Babbage, Claude Shannon: information theory, collateralized debt obligation, computer age, Computer Lib, conceptual framework, Douglas Engelbart, game design, hiring and firing, Howard Rheingold, HyperCard, hypertext link, Ian Bogost, information retrieval, Internet Archive, John Markoff, linked data, mandelbrot fractal, Marshall McLuhan, Menlo Park, nonsequential writing, Norbert Wiener, Project Xanadu, publish or perish, Robert Metcalfe, semantic web, seminal paper, Steve Jobs, Stewart Brand, technoutopianism, Ted Nelson, the scientific method, Vannevar Bush, wikimedia commons

There are some grounds for hope. However poorly conceived the general infrastructure, however corrupt and benighted the superstructure, the society of networks does support, somewhat obscurely, a plurality of ideas. Even on what ostensibly counts as the ascendant side, there is room for Berners-Lee to envision a Semantic Web that aims to cast some light below our diving boards – and for great institutional innovators such as Wendy Hall of Southampton to extend the affordances of the Web through artful exploitations on the server side. Hypertext takes no single line. The concept itself arises from the idea of extension or complication – writing in a higher-dimensional space – so how could it be confined to one chain of transmission?


Designing Search: UX Strategies for Ecommerce Success by Greg Nudelman, Pabini Gabriel-Petit

access to a mobile phone, Albert Einstein, AltaVista, augmented reality, barriers to entry, Benchmark Capital, business intelligence, call centre, cognitive load, crowdsourcing, folksonomy, information retrieval, Internet of things, Neal Stephenson, Palm Treo, performance metric, QR code, recommendation engine, RFID, search costs, search engine result page, semantic web, Silicon Valley, social graph, social web, speech recognition, text mining, the long tail, the map is not the territory, The Wisdom of Crowds, web application, zero-sum game, Zipcar

Here’s one example: Google search results that extract location breadcrumbs from Web sites and display them on results pages to help users select relevant pages from among the many possibilities. Perhaps a breadcrumb-sharing service will evolve, making it easier to share meta-information—such as a location within a hierarchy and other attributes—across Web sites. I am sure someone at some startup is working on something like this now, perhaps under the semantic Web banner. Breadcrumbs may evolve to become an integral part of key navigational structures—instead of their being a last-resort mechanism, as is common today. This is what this chapter focuses on. Breadcrumbs have started out as a simple way to show a single location or a single path—which reflects constraints from the physical world—but a powerful aspect of the virtual world is that an object can live in many places, and you can find it in many different ways at the same time.


pages: 276 words: 78,094

Design for Hackers: Reverse Engineering Beauty by David Kadavy

Airbnb, complexity theory, en.wikipedia.org, Firefox, Hacker News, Isaac Newton, John Gruber, Paul Graham, Ruby on Rails, semantic web, Silicon Valley, Silicon Valley startup, Steve Jobs, TaskRabbit, web application, wikimedia commons, Y Combinator

> Art Nouveau: Inspired by the Arts and Crafts Movement’s return to organic forms, and freed from the limitations of typesetting by stone lithographic technique, Parisian poster artists such as Alphonse Mucha (originally from Moravia, now part of the Czech Republic) integrated illustration and typography. Today’s fast pace of business and the technological limitations of the web make typography of this nature impractical. jQuery plug-ins such as Lettering.js are attempting to bring similar typographic control to the semantic web. (Categorization: display) > Futura: Paul Renner’s Futura broke down letters into the most basic geometric forms that it could. Typefaces with such intense geometric influence render poorly at body copy sizes on today’s screens. Pixels are relatively incompatible with perfectly circular forms.


pages: 283 words: 78,705

Principles of Web API Design: Delivering Value with APIs and Microservices by James Higginbotham

Amazon Web Services, anti-pattern, business intelligence, business logic, business process, Clayton Christensen, cognitive dissonance, cognitive load, collaborative editing, continuous integration, create, read, update, delete, database schema, DevOps, fallacies of distributed computing, fault tolerance, index card, Internet of things, inventory management, Kubernetes, linked data, loose coupling, machine readable, Metcalfe’s law, microservices, recommendation engine, semantic web, side project, single page application, Snapchat, software as a service, SQL injection, web application, WebSocket

Listing 7.3 A JSON:API example demonstrating message-based representations
* * *
{
  "data": {
    "type": "books",
    "id": "12345",
    "attributes": {
      "isbn": "978-0321834577",
      "title": "Implementing Domain-Driven Design",
      "description": "With Implementing Domain-Driven Design, Vaughn has made an important contribution not only to the literature of the Domain-Driven Design community, but also to the literature of the broader enterprise application architecture field."
    },
    "relationships": {
      "authors": {
        "data": [
          {"id": "765", "type": "authors"}
        ]
      }
    },
    "included": [
      {
        "type": "authors",
        "id": "765",
        "fullName": "Vaughn Vernon",
        "links": {
          "self": { "href": "/authors/765" },
          "authoredBooks": { "href": "/books?authorId=765" }
        }
      }
    ]
  }
}
* * *
Semantic Hypermedia Messaging
Semantic hypermedia messaging is the most comprehensive category as it adds semantic profile and linked data support, making APIs part of the Semantic Web. By applying semantics of resource properties through linked data, more meaning is assigned to each property without requiring an explicit name to be used. Linked data usually relies on a shared vocabulary from Schema.org or other resources. With the growth of data analytics and machine learning, linking data to shared vocabularies enables automated systems to easily derive value from the data provided by APIs.
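As a hedged illustration of linking API data to a shared vocabulary, the sketch below wraps a plain book record in a JSON-LD document that uses Schema.org terms (`Book`, `Person`, `isbn`, `name`, `author`). The `to_json_ld` helper and the `api.example.com` URL are assumptions made for this example and do not come from the book; the ISBN, title, and author name are taken from the listing above.

```python
import json


def to_json_ld(book):
    """Describe a book record with Schema.org terms in a JSON-LD document."""
    return {
        "@context": "https://schema.org",  # shared vocabulary for the property names
        "@type": "Book",
        "@id": book["url"],                # hypothetical canonical URL of the resource
        "isbn": book["isbn"],
        "name": book["title"],
        "author": {"@type": "Person", "name": book["author"]},
    }


# Values from the listing above, plus a made-up URL.
BOOK = {
    "url": "https://api.example.com/books/12345",
    "isbn": "978-0321834577",
    "title": "Implementing Domain-Driven Design",
    "author": "Vaughn Vernon",
}

if __name__ == "__main__":
    print(json.dumps(to_json_ld(BOOK), indent=2))
```

A client that understands Schema.org can interpret `name` and `author` here without knowing anything about this particular API, which is the kind of automated use the passage points to.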


pages: 743 words: 201,651

Free Speech: Ten Principles for a Connected World by Timothy Garton Ash

"World Economic Forum" Davos, A Declaration of the Independence of Cyberspace, Aaron Swartz, activist lawyer, Affordable Care Act / Obamacare, Andrew Keen, Apple II, Ayatollah Khomeini, battle of ideas, Berlin Wall, bitcoin, British Empire, Cass Sunstein, Chelsea Manning, citizen journalism, Citizen Lab, Clapham omnibus, colonial rule, critical race theory, crowdsourcing, data science, David Attenborough, digital divide, digital rights, don't be evil, Donald Davies, Douglas Engelbart, dual-use technology, Edward Snowden, Etonian, European colonialism, eurozone crisis, Evgeny Morozov, failed state, Fall of the Berlin Wall, Ferguson, Missouri, Filter Bubble, financial independence, Firefox, Galaxy Zoo, George Santayana, global village, Great Leap Forward, index card, Internet Archive, invention of movable type, invention of writing, Jaron Lanier, jimmy wales, John Markoff, John Perry Barlow, Julian Assange, Laura Poitras, machine readable, machine translation, Mark Zuckerberg, Marshall McLuhan, Mary Meeker, mass immigration, megacity, mutually assured destruction, national security letter, Nelson Mandela, Netflix Prize, Nicholas Carr, obamacare, Open Library, Parler "social media", Peace of Westphalia, Peter Thiel, power law, pre–internet, profit motive, public intellectual, RAND corporation, Ray Kurzweil, Ronald Reagan, semantic web, Sheryl Sandberg, Silicon Valley, Simon Singh, Snapchat, social graph, Stephen Fry, Stephen Hawking, Steve Jobs, Steve Wozniak, Streisand effect, technological determinism, TED Talk, The Death and Life of Great American Cities, The Wisdom of Crowds, Tipper Gore, trolley problem, Turing test, We are Anonymous. We are Legion, WikiLeaks, World Values Survey, Yochai Benkler, Yom Kippur War, yottabyte

Aaron Swartz, an American computing prodigy, co-developed Reddit, an online bulletin board which by 2015 clocked more than 150 million unique monthly visitors viewing more than six billion pages. He was involved in pioneering the widely used RSS web feed, worked with Tim Berners-Lee to improve data sharing through the Semantic Web and with cyberlaw guru Lawrence Lessig on the Creative Commons licences. All this by age 26. Swartz believed passionately that data, information and knowledge should be freely accessible to all. So he obtained the book-cataloguing data kept by the Library of Congress, for which it usually charged, and posted it on something called the Open Library.

., 195 search engine manipulation, 365 ‘search engine optimisation,’ 302 Second Life, 316 secrecy: C/S ratio, 324; guarding the guardians, 334–38; official, 324–27, 332–34, 337–38, 344–45; in wartime, 326; ‘well-placed sources,’ 341–45; whistleblowers and leakers, 339–41 section 295 of Indian/Pakistani penal code, 225, 254, 268, 275 secularism, 261, 265, 267, 273, 277–78, 281 security: executive oversight of, 335; versus freedom, 327–29; judiciary oversight of, 336–37; legislative oversight of, 335–36; national and personal, 321 sedition, 325 seditious libel, 331 Sedley, Stephen, 77, 131 Seinfeld, Jerry, 244 Selassie, Haile, 205 self-broadcasting/-publishing, 56–58 self-restraint, 213 Semantic Web, 164 Semprun, Jorgé, 304 Sen, Amartya, 78, 109, 193–94 Senegal, 243, 277 September 11, 2001 attacks, 64, 273, 322–24 Serbia, 133, 242 Serbo-Croat language, 123, 207 Serrano, Andres, 146 Serres, Michel, 25 sex, speech as, 89, 247–48 Shakarian, Hayastan, 349 Shakespeare, William, 156, 212 Shamikah, 313 Shamsie, Kamila, 90 ‘sharing,’ 166 Sharp, Gene, 148–49 Shaw, George Bernard, 17, 109 Shayegan, Daryush, 98 shield laws, 342 Shils, Edward, 99, 208–9 Shotoku (Prince), 109 Shrimsley, Robert, 142 Shteyngart, Gary, 13, 16 ‘Shunga’ art exhibition, 246–47 Sikhs, 131, 253, 262, 274 Siliconese, 50 Simone, Nina, 74, 78, 119, 212 Simpson, O.


pages: 721 words: 197,134

Data Mining: Concepts, Models, Methods, and Algorithms by Mehmed Kantardzić

Albert Einstein, algorithmic bias, backpropagation, bioinformatics, business cycle, business intelligence, business process, butter production in bangladesh, combinatorial explosion, computer vision, conceptual framework, correlation coefficient, correlation does not imply causation, data acquisition, discrete time, El Camino Real, fault tolerance, finite state, Gini coefficient, information retrieval, Internet Archive, inventory management, iterative process, knowledge worker, linked data, loose coupling, Menlo Park, natural language processing, Netflix Prize, NP-complete, PageRank, pattern recognition, peer-to-peer, phenotype, random walk, RFID, semantic web, speech recognition, statistical model, Telecommunications Act of 1996, telemarketer, text mining, traveling salesman, web application

Thirty years ago the lack of relevant law was understandable: The technologies were new; their capacity was largely unknown; and the types of legal issues they might raise were novel. Today, it is inexplicable and threatens to undermine both privacy and security. Hence, we must develop technical, legal, and policy foundations for transparency and accountability of large-scale mining across distributed heterogeneous data sources. Policy awareness is a property of the Semantic Web still in development that should provide users with accessible and understandable views of the policies associated with resources. The following issues related to privacy concerns may assist in individual privacy protection during a data-mining process, and should be a part of the best data-mining practices: Whether there is a clear description of a program’s collection of personal information, including how the collected information will serve the program’s purpose?

It covers a wide scope of research areas including data representation, structuring and querying, as well as information retrieval and data mining. It encompasses different forms of databases, including data warehouses, data cubes, tabular or relational data, and many applications, among which are music warehouses, video mining, bioinformatics, semantic Web and data streams. Li, H. X., V. C. Yen, Fuzzy Sets and Fuzzy Decision-Making, CRC Press, Inc., Boca Raton, 1995. The book emphasizes the applications of fuzzy-set theory in the field of management science and decision science, introducing and formalizing the concept of fuzzy decision making.


pages: 294 words: 81,292

Our Final Invention: Artificial Intelligence and the End of the Human Era by James Barrat

AI winter, air gap, AltaVista, Amazon Web Services, artificial general intelligence, Asilomar, Automated Insights, Bayesian statistics, Bernie Madoff, Bill Joy: nanobots, Bletchley Park, brain emulation, California energy crisis, cellular automata, Chuck Templeton: OpenTable:, cloud computing, cognitive bias, commoditize, computer vision, Computing Machinery and Intelligence, cuban missile crisis, Daniel Kahneman / Amos Tversky, Danny Hillis, data acquisition, don't be evil, drone strike, dual-use technology, Extropian, finite state, Flash crash, friendly AI, friendly fire, Google Glasses, Google X / Alphabet X, Hacker News, Hans Moravec, Isaac Newton, Jaron Lanier, Jeff Hawkins, John Markoff, John von Neumann, Kevin Kelly, Law of Accelerating Returns, life extension, Loebner Prize, lone genius, machine translation, mutually assured destruction, natural language processing, Neil Armstrong, Nicholas Carr, Nick Bostrom, optical character recognition, PageRank, PalmPilot, paperclip maximiser, pattern recognition, Peter Thiel, precautionary principle, prisoner's dilemma, Ray Kurzweil, Recombinant DNA, Rodney Brooks, rolling blackouts, Search for Extraterrestrial Intelligence, self-driving car, semantic web, Silicon Valley, Singularitarianism, Skype, smart grid, speech recognition, statistical model, stealth mode startup, stem cell, Stephen Hawking, Steve Jobs, Steve Jurvetson, Steve Wozniak, strong AI, Stuxnet, subprime mortgage crisis, superintelligent machines, technological singularity, The Coming Technological Singularity, Thomas Bayes, traveling salesman, Turing machine, Turing test, Vernor Vinge, Watson beat the top human players on Jeopardy!, zero day

AIXItl—a computable approximation of AIXI—is another matter. This is also probably not true of mind uploading, if such a thing ever comes to pass. Computer science-based researchers want to engineer AGI: The mind versus brain debate is too large to address here. with $50 million in grants: Lenat, Doug, “Doug Lenat on Cyc, a truly semantic Web, and artificial intelligence (AI),” developerWorks, September 16, 2008, http://www.ibm.com/developerworks/podcast/dwi/cm-int091608txt.html (accessed September 28, 2011). Carnegie Mellon University’s NELL: Lohr, Steve, “Aiming to Learn as We Do, a Machine Teaches Itself,” New York Times, sec. science, October 4, 2010, http://www.nytimes.com/2010/10/05/science/05compute.html?


pages: 336 words: 90,749

How to Fix Copyright by William Patry

A Declaration of the Independence of Cyberspace, barriers to entry, big-box store, borderless world, bread and circuses, business cycle, business intelligence, citizen journalism, cloud computing, commoditize, content marketing, creative destruction, crowdsourcing, death of newspapers, digital divide, en.wikipedia.org, facts on the ground, Frederick Winslow Taylor, George Akerlof, Glass-Steagall Act, Gordon Gekko, haute cuisine, informal economy, invisible hand, John Perry Barlow, Joseph Schumpeter, Kickstarter, knowledge economy, lone genius, means of production, moral panic, new economy, road to serfdom, Ronald Coase, Ronald Reagan, search costs, semantic web, shareholder value, Silicon Valley, The Chicago School, The Wealth of Nations by Adam Smith, trade route, transaction costs, trickle-down economics, Twitter Arab Spring, Tyler Cowen, vertical integration, winner-take-all economy, zero-sum game

I have no idea. 7. Advances in technologies create problems that can only be solved by further advances in those same technologies. 8. The answer to beating a machine (say, at chess) is understanding how it works. 9. The semantic web is the answer to all potential problems of access, control, and copying online. In other words, the proliferation of metadata standards will solve the “problem” of the existing behavior of computers, and in particular, search engines. 10. The challenge to copyright that the machine has always posed historically (shifting production cost and thus power) can be met by building a response into the same machine.


pages: 374 words: 94,508

Infonomics: How to Monetize, Manage, and Measure Information as an Asset for Competitive Advantage by Douglas B. Laney

3D printing, Affordable Care Act / Obamacare, banking crisis, behavioural economics, blockchain, book value, business climate, business intelligence, business logic, business process, call centre, carbon credits, chief data officer, Claude Shannon: information theory, commoditize, conceptual framework, crowdsourcing, dark matter, data acquisition, data science, deep learning, digital rights, digital twin, discounted cash flows, disintermediation, diversification, en.wikipedia.org, endowment effect, Erik Brynjolfsson, full employment, hype cycle, informal economy, information security, intangible asset, Internet of things, it's over 9,000, linked data, Lyft, Nash equilibrium, Neil Armstrong, Network effects, new economy, obamacare, performance metric, profit motive, recommendation engine, RFID, Salesforce, semantic web, single source of truth, smart meter, Snapchat, software as a service, source of truth, supply-chain management, tacit knowledge, technological determinism, text mining, uber lyft, Y2K, yield curve

This information can also have real commercial value—especially when mashed with other sources—to understand and act on local or global market conditions, population trends, and weather, for example. Public data even can be used to create new (ahem) high-value businesses such as Potbot, a virtual cannabis “budtender.” At its core is a recommendation engine that uses information on strains, cannabinoids, and medical applications aggregated via semantic web technology. Potbot also incorporates data from cannabis seed DNA scans along with recordings of brain activity in clinical tests. It monetizes this information, not just in the form of a consumer app, but also in helping growers improve their yields for the most popular or beneficial strains. Public data is most monetizable when integrated with your own proprietary information.


Possiplex by Ted Nelson

Any sufficiently advanced technology is indistinguishable from magic, Bill Duvall, Brewster Kahle, Buckminster Fuller, Computer Lib, cuban missile crisis, Donald Knuth, Douglas Engelbart, Dr. Strangelove, Herman Kahn, HyperCard, Ivan Sutherland, Jaron Lanier, John Markoff, Kevin Kelly, Marc Andreessen, Marshall McLuhan, Murray Gell-Mann, nonsequential writing, pattern recognition, post-work, Project Xanadu, RAND corporation, reality distortion field, semantic web, Silicon Valley, Steve Jobs, Stewart Brand, Ted Nelson, Thomas Kuhn: the structure of scientific revolutions, Vannevar Bush, Zimmermann PGP

"Xanadu" is a registered trademark which I maintain at considerable cost, and I ask all parties to respect this by using the "®" or "(R)" symbol for the first use of the trademark "Xanadu" in each document. 7. Not "all the world's information", but all the world's documents. The concept of "information" is arguable, documents much less so. I believe Tim is finding his concept of pure information, the "Semantic Web", much more difficult to achieve than hypertext documents. 8. No, not a link; a transclusive pathway. The two mechanisms are entirely different. A link connects two things which are different. A transclusion connects two things which are the same. 9. Not authors, rightsholders. Sometimes the author is a rightsholder, sometimes not.


pages: 407 words: 103,501

The Digital Divide: Arguments for and Against Facebook, Google, Texting, and the Age of Social Networking by Mark Bauerlein

Alvin Toffler, Amazon Mechanical Turk, Andrew Keen, business cycle, centre right, citizen journalism, collaborative editing, computer age, computer vision, corporate governance, crowdsourcing, David Brooks, digital divide, disintermediation, folksonomy, Frederick Winslow Taylor, Future Shock, Hacker News, Herbert Marcuse, Howard Rheingold, invention of movable type, invention of the steam engine, invention of the telephone, Jaron Lanier, Jeff Bezos, jimmy wales, Kevin Kelly, knowledge worker, late fees, Lewis Mumford, Mark Zuckerberg, Marshall McLuhan, means of production, meta-analysis, moral panic, Network effects, new economy, Nicholas Carr, PageRank, PalmPilot, peer-to-peer, pets.com, radical decentralization, Results Only Work Environment, Saturday Night Live, scientific management, search engine result page, semantic web, Silicon Valley, slashdot, social graph, social web, software as a service, speech recognition, Steve Jobs, Stewart Brand, technology bubble, Ted Nelson, the long tail, the strength of weak ties, The Wisdom of Crowds, Thorstein Veblen, web application, Yochai Benkler

Ever since we first introduced the term “Web 2.0,” people have been asking, “What’s next?” Assuming that Web 2.0 was meant to be a kind of software version number (rather than a statement about the second coming of the Web after the dot-com bust), we’re constantly asked about “Web 3.0.” Is it the semantic web? The sentient web? Is it the social web? The mobile web? Is it some form of virtual reality? It is all of those, and more. The Web is no longer a collection of static pages of HTML that describe something in the world. Increasingly, the Web is the world—everything and everyone in the world casts an “information shadow,” an aura of data which, when captured and processed intelligently, offers extraordinary opportunity and mind-bending implications.


Data and the City by Rob Kitchin,Tracey P. Lauriault,Gavin McArdle

A Declaration of the Independence of Cyberspace, algorithmic management, bike sharing, bitcoin, blockchain, Bretton Woods, Chelsea Manning, citizen journalism, Claude Shannon: information theory, clean water, cloud computing, complexity theory, conceptual framework, corporate governance, correlation does not imply causation, create, read, update, delete, crowdsourcing, cryptocurrency, data science, dematerialisation, digital divide, digital map, digital rights, distributed ledger, Evgeny Morozov, fault tolerance, fiat currency, Filter Bubble, floating exchange rates, folksonomy, functional programming, global value chain, Google Earth, Hacker News, hive mind, information security, Internet of things, Kickstarter, knowledge economy, Lewis Mumford, lifelogging, linked data, loose coupling, machine readable, new economy, New Urbanism, Nicholas Carr, nowcasting, open economy, openstreetmap, OSI model, packet switching, pattern recognition, performance metric, place-making, power law, quantum entanglement, RAND corporation, RFID, Richard Florida, ride hailing / ride sharing, semantic web, sentiment analysis, sharing economy, Silicon Valley, Skype, smart cities, Smart Cities: Big Data, Civic Hackers, and the Quest for a New Utopia, smart contracts, smart grid, smart meter, social graph, software studies, statistical model, tacit knowledge, TaskRabbit, technological determinism, technological solutionism, text mining, The Chicago School, The Death and Life of Great American Cities, the long tail, the market place, the medium is the message, the scientific method, Toyota Production System, urban planning, urban sprawl, web application

It is only by looking at the model and how it came to be through database specifications and requirements, the observation of data production on-site in real time and in communication with database designers and managers, that attributes of an infrastructure’s assemblage can be observed in their state of play. What a cursory analysis shows is that the process of modelling is situated in the domain of object-oriented programming, the semantic web, GIScience, modelling software, taxonomies, the burgeoning database and GIS industry, modelling schemas, mathematics, consulting firms and offshore data re-engineering companies. Furthermore, data modelling requires a particular form of logical abstract thinking; in the case of the OSi and 1Spatial, those that were involved in the modelling exercise were very senior, experienced and renowned spatial data experts, all formally trained in spatial database design and maintenance as well as spatial


pages: 349 words: 102,827

The Infinite Machine: How an Army of Crypto-Hackers Is Building the Next Internet With Ethereum by Camila Russo

4chan, Airbnb, Alan Greenspan, algorithmic trading, altcoin, always be closing, Any sufficiently advanced technology is indistinguishable from magic, Asian financial crisis, Benchmark Capital, Big Tech, bitcoin, blockchain, Burning Man, Cambridge Analytica, Cody Wilson, crowdsourcing, cryptocurrency, distributed ledger, diversification, Dogecoin, Donald Trump, East Village, Ethereum, ethereum blockchain, Flash crash, Free Software Foundation, Google Glasses, Google Hangouts, hacker house, information security, initial coin offering, Internet of things, Mark Zuckerberg, Maui Hawaii, mobile money, new economy, non-fungible token, off-the-grid, peer-to-peer, Peter Thiel, pets.com, Ponzi scheme, prediction markets, QR code, reserve currency, RFC: Request For Comment, Richard Stallman, Robert Shiller, Sand Hill Road, Satoshi Nakamoto, semantic web, sharing economy, side project, Silicon Valley, Skype, slashdot, smart contracts, South of Market, San Francisco, the Cathedral and the Bazaar, the payments system, too big to fail, tulip mania, Turing complete, Two Sigma, Uber for X, Vitalik Buterin

Web 2 is the internet as we know it today, with user-generated content, streaming video and music, and location-based services. It thrives on mobile devices. Web 3 was first coined in a 2006 New York Times article referring to a third-generation internet. This new internet is made up of concepts including the “semantic web,” or a web of data that can be processed by machines, artificial intelligence, machine learning, and data mining. When algorithms decide what to recommend someone should purchase on Amazon, that’s a glimpse of Web 3. Besides all those features, Gavin’s version of Web 3 would allow people to interact without needing to trust each other.


pages: 387 words: 105,250

The Caryatids by Bruce Sterling

bread and circuses, carbon footprint, clean water, commons-based peer production, failed state, impulse control, machine translation, megaproject, negative equity, new economy, no-fly zone, nuclear winter, precautionary principle, semantic web, sexual politics, social software, space junk, starchitect, stem cell, supervolcano, urban renewal, Whole Earth Review

They had hit on a subject that knowledgeable experts had been discussing for a hundred years. The most heavily trafficked tag was the strange coinage “Supervolcano.” Supervolcanoes had been a topic of mild intellectual interest for many years. Recently, people had talked much less about supervolcanoes, and with more pejoratives in their semantics. Web-semantic traffic showed that people were actively shunning the subject of supervolcanoes. That scientific news seemed to be rubbing people the wrong way. “So,” said Guillermo at last, “according to our best sources here, there are some giant … and I mean really giant magma plumes rising up and chewing at the West Coast of North America.


pages: 373 words: 112,822

The Upstarts: How Uber, Airbnb, and the Killer Companies of the New Silicon Valley Are Changing the World by Brad Stone

Affordable Care Act / Obamacare, Airbnb, Amazon Web Services, Andy Kessler, autonomous vehicles, Ben Horowitz, Benchmark Capital, Boris Johnson, Burning Man, call centre, Chuck Templeton: OpenTable:, collaborative consumption, data science, Didi Chuxing, Dr. Strangelove, driverless car, East Village, fake it until you make it, fixed income, gentrification, Google X / Alphabet X, growth hacking, Hacker News, hockey-stick growth, housing crisis, inflight wifi, Jeff Bezos, John Zimmer (Lyft cofounder), Justin.tv, Kickstarter, Lyft, Marc Andreessen, Marc Benioff, Mark Zuckerberg, Menlo Park, Mitch Kapor, Necker cube, obamacare, PalmPilot, Paul Graham, peer-to-peer, Peter Thiel, power law, race to the bottom, rent control, ride hailing / ride sharing, Ruby on Rails, San Francisco homelessness, Sand Hill Road, self-driving car, semantic web, sharing economy, side project, Silicon Valley, Silicon Valley startup, Skype, SoftBank, South of Market, San Francisco, Startup school, Steve Jobs, TaskRabbit, tech bro, TechCrunch disrupt, Tony Hsieh, transportation-network company, Travis Kalanick, Uber and Lyft, Uber for X, uber lyft, ubercab, Y Combinator, Y2K, Zipcar

Camp met Geoff Smith, who would become his StumbleUpon co-founder, through one of his childhood friends and together they started the site as a way for users to share and find interesting things on the internet without having to search for them on Google. Camp was obsessed with collaborative information systems and the semantic web. He didn’t go out much back then, splitting his time between his graduate thesis and the company and immersing himself in dense academic papers about esoteric topics in computer science. By the time Camp finished his degree in 2005, StumbleUpon was starting to show promise. Camp and Smith met an angel investor that year who convinced them to move to San Francisco and raise capital.


When Computers Can Think: The Artificial Intelligence Singularity by Anthony Berglas, William Black, Samantha Thalind, Max Scratchmann, Michelle Estes

3D printing, Abraham Maslow, AI winter, air gap, anthropic principle, artificial general intelligence, Asilomar, augmented reality, Automated Insights, autonomous vehicles, availability heuristic, backpropagation, blue-collar work, Boston Dynamics, brain emulation, call centre, cognitive bias, combinatorial explosion, computer vision, Computing Machinery and Intelligence, create, read, update, delete, cuban missile crisis, David Attenborough, DeepMind, disinformation, driverless car, Elon Musk, en.wikipedia.org, epigenetics, Ernest Rutherford, factory automation, feminist movement, finite state, Flynn Effect, friendly AI, general-purpose programming language, Google Glasses, Google X / Alphabet X, Gödel, Escher, Bach, Hans Moravec, industrial robot, Isaac Newton, job automation, John von Neumann, Law of Accelerating Returns, license plate recognition, Mahatma Gandhi, mandelbrot fractal, natural language processing, Nick Bostrom, Parkinson's law, patent troll, patient HM, pattern recognition, phenotype, ransomware, Ray Kurzweil, Recombinant DNA, self-driving car, semantic web, Silicon Valley, Singularitarianism, Skype, sorting algorithm, speech recognition, statistical model, stem cell, Stephen Hawking, Stuxnet, superintelligent machines, technological singularity, Thomas Malthus, Turing machine, Turing test, uranium enrichment, Von Neumann architecture, Watson beat the top human players on Jeopardy!, wikimedia commons, zero day

A better approach seems to be to present the structure of the data in a graphical user interface and then let the user specify the query directly in terms of the symbols that the computer does understand. As advances are made in commonsense reasoning this may change. Producing an effective natural language query processor is a major goal of the semantic web community. Eurisko and other early results One of the more commonly quoted early works is Eurisko, created by Douglas Lenat in 1976. It used various heuristics to generate short programs that could be interpreted as mathematical theorems. It also had heuristics for how to create new heuristics.


pages: 413 words: 119,587

Machines of Loving Grace: The Quest for Common Ground Between Humans and Robots by John Markoff

A Declaration of the Independence of Cyberspace, AI winter, airport security, Andy Rubin, Apollo 11, Apple II, artificial general intelligence, Asilomar, augmented reality, autonomous vehicles, backpropagation, basic income, Baxter: Rethink Robotics, Bill Atkinson, Bill Duvall, bioinformatics, Boston Dynamics, Brewster Kahle, Burning Man, call centre, cellular automata, Charles Babbage, Chris Urmson, Claude Shannon: information theory, Clayton Christensen, clean water, cloud computing, cognitive load, collective bargaining, computer age, Computer Lib, computer vision, crowdsourcing, Danny Hillis, DARPA: Urban Challenge, data acquisition, Dean Kamen, deep learning, DeepMind, deskilling, Do you want to sell sugared water for the rest of your life?, don't be evil, Douglas Engelbart, Douglas Hofstadter, Dr. Strangelove, driverless car, dual-use technology, Dynabook, Edward Snowden, Elon Musk, Erik Brynjolfsson, Evgeny Morozov, factory automation, Fairchild Semiconductor, Fillmore Auditorium, San Francisco, From Mathematics to the Technologies of Life and Death, future of work, Galaxy Zoo, General Magic, Geoffrey Hinton, Google Glasses, Google X / Alphabet X, Grace Hopper, Gunnar Myrdal, Gödel, Escher, Bach, Hacker Ethic, Hans Moravec, haute couture, Herbert Marcuse, hive mind, hype cycle, hypertext link, indoor plumbing, industrial robot, information retrieval, Internet Archive, Internet of things, invention of the wheel, Ivan Sutherland, Jacques de Vaucanson, Jaron Lanier, Jeff Bezos, Jeff Hawkins, job automation, John Conway, John Markoff, John Maynard Keynes: Economic Possibilities for our Grandchildren, John Maynard Keynes: technological unemployment, John Perry Barlow, John von Neumann, Kaizen: continuous improvement, Kevin Kelly, Kiva Systems, knowledge worker, Kodak vs Instagram, labor-force participation, loose coupling, Marc Andreessen, Mark Zuckerberg, Marshall McLuhan, medical residency, Menlo Park, military-industrial complex, Mitch Kapor, Mother of all demos, natural language processing, Neil Armstrong, new economy, Norbert Wiener, PageRank, PalmPilot, pattern recognition, Philippa Foot, pre–internet, RAND corporation, Ray Kurzweil, reality distortion field, Recombinant DNA, Richard Stallman, Robert Gordon, Robert Solow, Rodney Brooks, Sand Hill Road, Second Machine Age, self-driving car, semantic web, Seymour Hersh, shareholder value, side project, Silicon Valley, Silicon Valley startup, Singularitarianism, skunkworks, Skype, social software, speech recognition, stealth mode startup, Stephen Hawking, Steve Ballmer, Steve Jobs, Steve Wozniak, Steven Levy, Stewart Brand, Strategic Defense Initiative, strong AI, superintelligent machines, tech worker, technological singularity, Ted Nelson, TED Talk, telemarketer, telepresence, telepresence robot, Tenerife airport disaster, The Coming Technological Singularity, the medium is the message, Thorstein Veblen, Tony Fadell, trolley problem, Turing test, Vannevar Bush, Vernor Vinge, warehouse automation, warehouse robotics, Watson beat the top human players on Jeopardy!, We are as Gods, Whole Earth Catalog, William Shockley: the traitorous eight, zero-sum game

They argued for keeping human users in direct control rather than handing off decisions to a software valet. The Siri team did not shy away from the controversy, and it wasn’t long before they pulled back the curtain on their project, just a bit. By late spring 2009, Gruber was speaking obliquely about the new technology. During the summer of that year he appeared at a Semantic Web conference and described, point by point, how the futuristic technologies in the Knowledge Navigator were becoming a reality: there were now touch screens that enabled so-called gestural interfaces, there was a global network for information sharing and collaboration, developers were coding programs that interacted with humans, and engineers had started to finesse natural and continuous speech recognition.


pages: 1,038 words: 137,468

JavaScript Cookbook by Shelley Powers

business logic, Firefox, Google Chrome, hypertext link, leftpad, semantic web, SQL injection, web application, WebSocket

operator, 120 test method (RegExp), 24 U testing code with JsUnit, 392–396 text elements (forms), 162 undefined array elements, 70 text input (forms), accessing, 159–161 undefined data type, 11 text results to Ajax requests, processing, 422 Unicode sequences, 16 text value (aria-relevant attribute), 324 unit testing, 393 textareas universal selector (*), 232 events for, 162 unload events, 115 lines in, processing, 16–17 warnings when leaving pages, 147 observing character input for, 129–132 unordered lists, applying striping theme to, textInput events, 130 230–231 TextRectangle objects, 272 uppercase (see case) this context, 163 URIError errors, 185 this keyword, 360, 383–385 URLs, adding persistent information to, 458– keeping object members private, 361–362 461 throw statements, 184 user error, about, 177 throwing exceptions, 184 user input, form (see forms) Thunderbird extensions, building, 486 user input, validating (see validating) time (see date and time; tiers) userAgent property (Navigator), 146 timed page updates, 427–430 UTC date and time, printing, 42–43 timeouts, 49–50 UTCString method (Date), 42 timerEvent function, 428 timers, 41 V function closures with, 52–53 validating incremental counters in code, 57–58 array contents, 86–87 recurring, 50–51 checking for function errors, 180–181 triggering timeouts, 49–50 with forms title elements, 211 based on format, 166–167 today’s date, printing, 41–42 canceling invalid data submission, 167– toISOString method (Date), 44 168 toLowerCase method (String), 5 dynamic selection lists, 173–176 tools, extending with JavaScript, 496–499 preventing multiple submissions, 169– top property (bounding rectangle), 272, 273 171 toString method, 1, 59 function arguments (input), 95 touch swiping events, 117 highlighting invalid form fields, 302–307 toUpperCase method (String), 5 with jQuery Validation plug-in, 403 tr elements social security numbers, 26–28 adding to tables, 257–260 value attribute (objects), 370 Index | 527 
valueOf method, 11, 12 writable attribute (objects), 370 variable values, checking, 181–182 vendor property (Navigator), 146 X video (see rich media) video elements, 326, 353–357 X3D, 326 visibility property (CSS), 172, 276 XML documents VoiceOver screen reader, 297 extracting pertinent information from, 437– 442 W processing, 436–437 XMLHttpRequest objects \w in regular expressions, 23 accessing, 414–415 \W in regular expressions, 23 adding callback functions to, 420–421 warn function (JsUnit), 394 checking for error conditions, 421 Watch panel (Firebug), 190 making requests to other domains, 422– Web Inspector (Safari), 203 424 web page elements (see elements) XScriptContext objects, 499 web page space (see page space) web pages (see document elements; pages) web-safe colors, 148 Web Sockets API, 413, 429 Web Workers, 500–509 WebGL (Web Graphics Library), 326, 350– 351 WebKit (Google) debugging with, 208–209 WebGL support in, 350–351 .wgt files, 493 while loop, iterating through arrays with, 71 whitespace, 269 (see also page space) matching in regular expressions, 23 nonbreaking space character, 19 trimming from form data, 162 trimming from strings, 17–19 using regular expressions, 35–36 widgets, creating, 489–494 width (see size) width attribute (canvas element), 327 width property (bounding rectangle), 272 width property (Screen), 149 window area, measuring, 270–271 window elements, 143 creating new stripped-down, 144–145 open method, 145 window space (see page space) windows, communicating across, 430–434 Windows-Eyes, 297 words, 32 (see also strings) swapping order of, 32–34 528 | Index About the Author Shelley Powers has been working with and writing about web technologies—from the first release of JavaScript to the latest graphics and design tools—for more than 15 years. Her recent O’Reilly books have covered the semantic web, Ajax, JavaScript, and web graphics. She’s an avid amateur photographer and web development aficionado. 
Colophon The animal on the cover of JavaScript Cookbook is a little (or lesser) egret (Egretta garzetta). A small white heron, it is the old world counterpart to the very similar new world snowy egret.


We Are the Nerds: The Birth and Tumultuous Life of Reddit, the Internet's Culture Laboratory by Christine Lagorio-Chafkin

"Friedman doctrine" OR "shareholder theory", 4chan, Aaron Swartz, Airbnb, Amazon Web Services, Bernie Sanders, big-box store, bitcoin, blockchain, Brewster Kahle, Burning Man, compensation consultant, crowdsourcing, cryptocurrency, data science, David Heinemeier Hansson, digital rights, disinformation, Donald Trump, East Village, eternal september, fake news, game design, Golden Gate Park, growth hacking, Hacker News, hiring and firing, independent contractor, Internet Archive, Jacob Appelbaum, Jeff Bezos, jimmy wales, Joi Ito, Justin.tv, Kickstarter, Large Hadron Collider, Lean Startup, lolcat, Lyft, Marc Andreessen, Mark Zuckerberg, medical residency, minimum viable product, natural language processing, Palm Treo, Paul Buchheit, Paul Graham, paypal mafia, Peter Thiel, plutocrats, QR code, r/findbostonbombers, recommendation engine, RFID, rolodex, Ruby on Rails, Sam Altman, Sand Hill Road, Saturday Night Live, self-driving car, semantic web, Sheryl Sandberg, side project, Silicon Valley, Silicon Valley ideology, Silicon Valley startup, slashdot, Snapchat, Social Justice Warrior, social web, South of Market, San Francisco, Startup school, Stephen Hawking, Steve Bannon, Steve Jobs, Steve Wozniak, Streisand effect, technoutopianism, uber lyft, Wayback Machine, web application, WeWork, WikiLeaks, Y Combinator

Huffman arrived at Image Matters to find five guys working on government security and emergency response technology. He later admitted he didn’t fully understand the scope of the work at the time, but one project helped layer locations of disaster responders over a map. Another was sort of a web-based assistant like Siri. What Huffman largely worked on was translating data for the “semantic web,” a layer of coding that helps computers understand and catalog a website’s contents. Image Matters didn’t behave how Huffman expected startups should. “It wasn’t very glamorous. All its money was from government contracts, so their projects were kind of boring,” he said. “No users, no scaling problems—none of that stuff.”


pages: 999 words: 194,942

Clojure Programming by Chas Emerick, Brian Carper, Christophe Grand

Amazon Web Services, Benoit Mandelbrot, cloud computing, cognitive load, continuous integration, database schema, domain-specific language, don't repeat yourself, drop ship, duck typing, en.wikipedia.org, failed state, finite state, Firefox, functional programming, game design, general-purpose programming language, Guido van Rossum, higher-order functions, Larry Wall, mandelbrot fractal, no silver bullet, Paul Graham, platform as a service, premature optimization, random walk, Ruby on Rails, Schrödinger's Cat, semantic web, software as a service, sorting algorithm, SQL injection, Turing complete, type inference, web application

* * * [177] Note that metadata on keys of &env can’t be relied upon, in particular in the presence of local aliases. [178] See Testing Contextual Macros for our stab at an alternative macroexpansion function that does support this without the var-dereferencing line noise. [179] Or, returned by a previous expansion. [180] Triples are a term for subject-predicate-object expressions, as found in semantic web technologies like RDF. Specific representations and semantics of triples vary from implementation to implementation, but a simplified example of a vector triple might be ["Boston" :capital-of "Massachusetts"]. [181] refer is described in “refer”, and is also reused by use, described later in that chapter
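The subject-predicate-object shape described in footnote 180 can be sketched outside Clojure as well. The Python snippet below is a minimal illustration and is not taken from the book: the `Triple` type, the `match` helper, and the second fact are hypothetical names and data chosen for this example, and real RDF stores use IRIs and richer semantics than plain strings.

```python
from typing import List, NamedTuple, Optional


class Triple(NamedTuple):
    """A simplified subject-predicate-object statement (hypothetical type)."""
    subject: str
    predicate: str
    obj: str


# The first fact mirrors the footnote's example; the second is an assumed extra.
FACTS: List[Triple] = [
    Triple("Boston", "capital-of", "Massachusetts"),
    Triple("Albany", "capital-of", "New York"),
]


def match(facts: List[Triple],
          subject: Optional[str] = None,
          predicate: Optional[str] = None,
          obj: Optional[str] = None) -> List[Triple]:
    """Return the triples whose fields equal every argument that is not None."""
    return [
        t for t in facts
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
        and (obj is None or t.obj == obj)
    ]


if __name__ == "__main__":
    # Ask "what is the capital of Massachusetts?" by leaving the subject open.
    for t in match(FACTS, predicate="capital-of", obj="Massachusetts"):
        print(t.subject)
```

Leaving a field as `None` acts as a wildcard, which is roughly how triple patterns are queried in practice, although this sketch does a linear scan rather than using any index.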


pages: 761 words: 231,902

The Singularity Is Near: When Humans Transcend Biology by Ray Kurzweil

additive manufacturing, AI winter, Alan Turing: On Computable Numbers, with an Application to the Entscheidungsproblem, Albert Einstein, anthropic principle, Any sufficiently advanced technology is indistinguishable from magic, artificial general intelligence, Asilomar, augmented reality, autonomous vehicles, backpropagation, Benoit Mandelbrot, Bill Joy: nanobots, bioinformatics, brain emulation, Brewster Kahle, Brownian motion, business cycle, business intelligence, c2.com, call centre, carbon-based life, cellular automata, Charles Babbage, Claude Shannon: information theory, complexity theory, conceptual framework, Conway's Game of Life, coronavirus, cosmological constant, cosmological principle, cuban missile crisis, data acquisition, Dava Sobel, David Brooks, Dean Kamen, digital divide, disintermediation, double helix, Douglas Hofstadter, en.wikipedia.org, epigenetics, factory automation, friendly AI, functional programming, George Gilder, Gödel, Escher, Bach, Hans Moravec, hype cycle, informal economy, information retrieval, information security, invention of the telephone, invention of the telescope, invention of writing, iterative process, Jaron Lanier, Jeff Bezos, job automation, job satisfaction, John von Neumann, Kevin Kelly, Law of Accelerating Returns, life extension, lifelogging, linked data, Loebner Prize, Louis Pasteur, mandelbrot fractal, Marshall McLuhan, Mikhail Gorbachev, Mitch Kapor, mouse model, Murray Gell-Mann, mutually assured destruction, natural language processing, Network effects, new economy, Nick Bostrom, Norbert Wiener, oil shale / tar sands, optical character recognition, PalmPilot, pattern recognition, phenotype, power law, precautionary principle, premature optimization, punch-card reader, quantum cryptography, quantum entanglement, radical life extension, randomized controlled trial, Ray Kurzweil, remote working, reversible computing, Richard Feynman, Robert Metcalfe, Rodney Brooks, scientific worldview, Search for Extraterrestrial Intelligence, selection bias, semantic web, seminal paper, Silicon Valley, Singularitarianism, speech recognition, statistical model, stem cell, Stephen Hawking, Stewart Brand, strong AI, Stuart Kauffman, superintelligent machines, technological singularity, Ted Kaczynski, telepresence, The Coming Technological Singularity, Thomas Bayes, transaction costs, Turing machine, Turing test, two and twenty, Vernor Vinge, Y2K, Yogi Berra

See note 57 in chapter 2 for an analysis of the information content in the genome, which I estimate to be 30 to 100 million bytes, therefore less than 10^9 bits. See the section "Human Memory Capacity" in chapter 3 (p. 126) for my analysis of the information in a human brain, estimated at 10^18 bits. 11. Marie Gustafsson and Christian Balkenius, "Using Semantic Web Techniques for Validation of Cognitive Models against Neuroscientific Data," AILS04 Workshop, SAIS/SSLS Workshop (Swedish Artificial Intelligence Society; Swedish Society for Learning Systems), April 15–16, 2004, Lund, Sweden, www.lucs.lu.se/People/Christian.Balkenius/PDF/Gustafsson.Balkenius.2004.pdf. 12.


Data Mining: Concepts and Techniques by Jiawei Han, Micheline Kamber, Jian Pei

backpropagation, bioinformatics, business intelligence, business process, Claude Shannon: information theory, cloud computing, computer vision, correlation coefficient, cyber-physical system, database schema, discrete time, disinformation, distributed generation, finite state, industrial research laboratory, information retrieval, information security, iterative process, knowledge worker, linked data, machine readable, natural language processing, Netflix Prize, Occam's razor, pattern recognition, performance metric, phenotype, power law, random walk, recommendation engine, RFID, search costs, semantic web, seminal paper, sentiment analysis, sparse data, speech recognition, statistical model, stochastic process, supply-chain management, text mining, thinkpad, Thomas Bayes, web application

see Structural Clustering Algorithm for Networks core vertex 531 illustrated 532 scatter plots 54 2-D data set visualization with 59 3-D data set visualization with 60 correlations between attributes 54–56 illustrated 55 matrix 56, 59 schemas integration 94 snowflake 140–141 star 139–140 science applications 611–613 search engines 28 search space pruning 263, 301 second guess heuristic 369 selection dimensions 225 self-training 432 semantic annotations applications 317, 313, 320–321 with context modeling 316 from DBLP data set 316–317 effectiveness 317 example 314–315 of frequent patterns 313–317 mutual information 315–316 task definition 315 Semantic Web 597 semi-offline materialization 226 semi-supervised classification 432–433, 437 alternative approaches 433 cotraining 432–433 self-training 432 semi-supervised learning 25 outlier detection by 572 semi-supervised outlier detection 551 sensitivity analysis 408 sensitivity measure 367 sentiment classification 434 sequence data analysis 319 sequences 586 alignment 590 biological 586, 590–591 classification of 589–590 similarity searches 587 symbolic 586, 588–590 time-series 586, 587–588 sequential covering algorithm 359 general-to-specific search 360 greedy search 361 illustrated 359 rule induction with 359–361 sequential pattern mining 589 constraint-based 589 in symbolic sequences 588–589 shapelets method 590 shared dimensions 204 pruning 205 shared-sorts 193 shared-partitions 193 shell cubes 160 shell fragments 192, 235 approach 211–212 computation algorithm 212, 213 computation example 214–215 precomputing 210 shrinking diameter 592 sigmoid function 402 signature-based detection 614 significance levels 373 significance measure 312 significance tests 372–373, 386 silhouette coefficient 489–490 similarity asymmetric binary 71 cosine 77–78 measuring 65–78, 79 nominal attributes 70 similarity measures 447–448, 525–528 constraints on 533 geodesic distance 525–526 SimRank 526–528 similarity searches 587 in information networks 594 in multimedia data mining 596 simple random sample with replacement (SRSWR) 108 simple random sample without replacement (SRSWOR) 108 SimRank 526–528, 539 computation 527–528 random walk 526–528 structural context 528 simultaneous aggregation 195 single-dimensional association rules 17, 287 single-linkage algorithm 460, 461 singular value decomposition (SVD) 587 skewed data balanced 271 negatively 47 positively 47 wavelet transforms on 102 slice operation 148 small-world phenomenon 592 smoothing 112 by bin boundaries 89 by bin means 89 by bin medians 89 for data discretization 90 snowflake schema 140 example 141 illustrated 141 star schema versus 140 social networks 524–525, 526–528 densification power law 592 evolution of 594 mining 623 small-world phenomenon 592 see also networks social science/social studies data mining 613 soft clustering 501 soft constraints 534, 539 example 534 handling 536–537 space-filling curve 58 sparse data 102 sparse data cubes 190 sparsest cuts 539 sparsity coefficient 579 spatial data 14 spatial data mining 595 spatiotemporal data analysis 319 spatiotemporal data mining 595, 623–624 specialized SQL servers 165 specificity measure 367 spectral clustering 520–522, 539 effectiveness 522 framework 521 steps 520–522 speech recognition 430 speed, classification 369 spiral method 152 split-point 333, 340, 342 splitting attributes 333 splitting criterion 333, 342 splitting rules.


pages: 864 words: 272,918

Palo Alto: A History of California, Capitalism, and the World by Malcolm Harris

2021 United States Capitol attack, Aaron Swartz, affirmative action, air traffic controllers' union, Airbnb, Alan Greenspan, Alvin Toffler, Amazon Mechanical Turk, Amazon Web Services, Apple II, Apple's 1984 Super Bowl advert, back-to-the-land, bank run, Bear Stearns, Big Tech, Bill Gates: Altair 8800, Black Lives Matter, Bob Noyce, book scanning, British Empire, business climate, California gold rush, Cambridge Analytica, capital controls, Charles Lindbergh, classic study, cloud computing, collective bargaining, colonial exploitation, colonial rule, Colonization of Mars, commoditize, company town, computer age, conceptual framework, coronavirus, corporate personhood, COVID-19, cuban missile crisis, deindustrialization, Deng Xiaoping, desegregation, deskilling, digital map, double helix, Douglas Engelbart, Edward Snowden, Elon Musk, Erlich Bachman, estate planning, European colonialism, Fairchild Semiconductor, financial engineering, financial innovation, fixed income, Frederick Winslow Taylor, fulfillment center, future of work, Garrett Hardin, gentrification, George Floyd, ghettoisation, global value chain, Golden Gate Park, Google bus, Google Glasses, greed is good, hiring and firing, housing crisis, hydraulic fracturing, if you build it, they will come, illegal immigration, immigration reform, invisible hand, It's morning again in America, iterative process, Jeff Bezos, Joan Didion, John Markoff, joint-stock company, Jony Ive, Kevin Kelly, Kickstarter, knowledge worker, land reform, Larry Ellison, Lean Startup, legacy carrier, life extension, longitudinal study, low-wage service sector, Lyft, manufacturing employment, Marc Andreessen, Marc Benioff, Mark Zuckerberg, Marshall McLuhan, Max Levchin, means of production, Menlo Park, Metcalfe’s law, microdosing, Mikhail Gorbachev, military-industrial complex, Monroe Doctrine, Mont Pelerin Society, moral panic, mortgage tax deduction, Mother of all demos, move fast and break things, mutually assured destruction, new economy, Oculus Rift, off grid, oil shale / tar sands, PageRank, PalmPilot, passive income, Paul Graham, paypal mafia, Peter Thiel, pets.com, phenotype, pill mill, platform as a service, Ponzi scheme, popular electronics, power law, profit motive, race to the bottom, radical life extension, RAND corporation, Recombinant DNA, refrigerator car, Richard Florida, ride hailing / ride sharing, rising living standards, risk tolerance, Robert Bork, Robert Mercer, Robert Metcalfe, Ronald Reagan, Salesforce, San Francisco homelessness, Sand Hill Road, scientific management, semantic web, sexual politics, Sheryl Sandberg, Silicon Valley, Silicon Valley ideology, Silicon Valley startup, social web, SoftBank, software as a service, sovereign wealth fund, special economic zone, Stanford marshmallow experiment, Stanford prison experiment, stem cell, Steve Bannon, Steve Jobs, Steve Wozniak, Steven Levy, Stewart Brand, stock buybacks, strikebreaker, Suez canal 1869, super pumped, TaskRabbit, tech worker, Teledyne, telemarketer, the long tail, the new new thing, thinkpad, Thorstein Veblen, Tim Cook: Apple, Tony Fadell, too big to fail, Toyota Production System, Tragedy of the Commons, transcontinental railway, traumatic brain injury, Travis Kalanick, TSMC, Uber and Lyft, Uber for X, uber lyft, ubercab, union organizing, Upton Sinclair, upwardly mobile, urban decay, urban renewal, value engineering, Vannevar Bush, vertical integration, Vision Fund, W. E. B. Du Bois, War on Poverty, warehouse robotics, Wargames Reagan, Washington Consensus, white picket fence, William Shockley: the traitorous eight, women in the workforce, Y Combinator, Y2K, Yogi Berra, éminence grise

Bucket of Crabs Everything here was appallingly what it seemed. Her fellow undergrads were all careerist dickheads, thumb-sucking vegans, smug libertarians, batshit Republicans, pompous student-visa techies, precious study-abroad fuzzies, Division I Neanderthals, faculty lapdogs, marching band weenies recouping their squandered adolescences, and the unforgivably rich. Everyone seemed so well parented; everyone’s semantic web architecture or microlending nonprofit or carbon nanotube dildo was going to change the world. —Tony Tulathimutte, Private Citizens When the dot-com bubble popped, it left a layer of winners: the middlemen at the big financial institutions as well as the bottom-feeders and big firms who cleaned up after, but there were also the founders who happened to sell at the right time.


Engineering Security by Peter Gutmann

active measures, address space layout randomization, air gap, algorithmic trading, Amazon Web Services, Asperger Syndrome, bank run, barriers to entry, bitcoin, Brian Krebs, business process, call centre, card file, cloud computing, cognitive bias, cognitive dissonance, cognitive load, combinatorial explosion, Credit Default Swap, crowdsourcing, cryptocurrency, Daniel Kahneman / Amos Tversky, Debian, domain-specific language, Donald Davies, Donald Knuth, double helix, Dr. Strangelove, Dunning–Kruger effect, en.wikipedia.org, endowment effect, false flag, fault tolerance, Firefox, fundamental attribution error, George Akerlof, glass ceiling, GnuPG, Google Chrome, Hacker News, information security, iterative process, Jacob Appelbaum, Jane Jacobs, Jeff Bezos, John Conway, John Gilmore, John Markoff, John von Neumann, Ken Thompson, Kickstarter, lake wobegon effect, Laplace demon, linear programming, litecoin, load shedding, MITM: man-in-the-middle, Multics, Network effects, nocebo, operational security, Paradox of Choice, Parkinson's law, pattern recognition, peer-to-peer, Pierre-Simon Laplace, place-making, post-materialism, QR code, quantum cryptography, race to the bottom, random walk, recommendation engine, RFID, risk tolerance, Robert Metcalfe, rolling blackouts, Ruby on Rails, Sapir-Whorf hypothesis, Satoshi Nakamoto, security theater, semantic web, seminal paper, Skype, slashdot, smart meter, social intelligence, speech recognition, SQL injection, statistical model, Steve Jobs, Steven Pinker, Stuxnet, sunk-cost fallacy, supply-chain attack, telemarketer, text mining, the built environment, The Death and Life of Great American Cities, The Market for Lemons, the payments system, Therac-25, too big to fail, Tragedy of the Commons, Turing complete, Turing machine, Turing test, Wayback Machine, web application, web of trust, x509 certificate, Y2K, zero day, Zimmermann PGP

“Context-Aware Access Control—Making Access Control Decisions Based on Context Information”, Sven Lachmund, Thomas Walter, Laurent Bussard, Laurent Gomez and Eddy Olk, Proceedings of the International Workshop on Ubiquitous Access Control (IWUAC’06), July 2006, p.1. “A Semantic Context-Aware Access Control Framework for Secure Collaborations in Pervasive Computing Environments”, Alessandra Toninelli, Rebecca Montanari, Lalana Kagal and Ora Lassila, Proceedings of the 5th International Semantic Web Conference (ISWC’06), Springer-Verlag LNCS No.4273, November 2006, p.473. “Information Security Architecture-Context Aware Access Control Model for Educational Applications”, N. DuraiPandian, V. Shanmughaneethi and C. Chellappan, International Journal of Computer Science and Network Security, Vol.6, No.12 (December 2006), p.197.