fallacies of distributed computing

pages: 283 words: 78,705

Principles of Web API Design: Delivering Value with APIs and Microservices by James Higginbotham

Amazon Web Services, anti-pattern, business intelligence, business logic, business process, Clayton Christensen, cognitive dissonance, cognitive load, collaborative editing, continuous integration, create, read, update, delete, database schema, DevOps, fallacies of distributed computing, fault tolerance, index card, Internet of things, inventory management, Kubernetes, linked data, loose coupling, machine readable, Metcalfe’s law, microservices, recommendation engine, semantic web, side project, single page application, Snapchat, software as a service, SQL injection, web application, WebSocket

Notice the explicit mention of the method (the procedure) and the ordered parameter list that results in a tight coupling between client and server: POST https://rpc.example.com/calculator-service HTTP/1.1 Content-Type: application/json Content-Length: ...Accept: application/json {"jsonrpc": "2.0", "method": "subtract", "params": [42, 23], "id": 1} Most RPC-based systems take advantage of a helper library and code generation tooling to generate the client and server stubs that are responsible for network communications. Those familiar with the fallacies of distributed computing recognize that failures can occur whenever code is executed remotely. While one of RPC’s goals is to make remote invocation behave as if it is calling a local procedure, network outages and other failure handling support is often incorporated into the client and server stubs and raised as an error.

…

Distributed Systems Challenges The journey toward microservices requires a deep understanding of distributed systems. Those not as familiar with the concepts of distributed tracing, observability, eventual consistency, fault tolerance, and failover will encounter a more difficult time with microservices. The eight fallacies of distributed computing, written in 1994 and still applicable today, must be understood by every developer. Additionally, many find that architectural oversight is required to initially decompose and subsequently integrate services into solutions. Teams unable to have architectural support may suffer from lack of architectural consideration in the design of their microservices, resulting in poor boundaries and overlapping team responsibilities that produces increased cross-team coordination.

pages: 540 words: 103,101

Building Microservices by Sam Newman

airport security, Amazon Web Services, anti-pattern, business logic, business process, call centre, continuous integration, Conway's law, create, read, update, delete, defense in depth, don't repeat yourself, Edward Snowden, fail fast, fallacies of distributed computing, fault tolerance, index card, information retrieval, Infrastructure as a Service, inventory management, job automation, Kubernetes, load shedding, loose coupling, microservices, MITM: man-in-the-middle, platform as a service, premature optimization, pull request, recommendation engine, Salesforce, SimCity, social graph, software as a service, source of truth, sunk-cost fallacy, systems thinking, the built environment, the long tail, two-pizza team, web application, WebSocket

Just taking a local API and trying to make it a service boundary without any more thought is likely to get you in trouble. In some of the worst examples, developers may be using remote calls without knowing it if the abstraction is overly opaque. You need to think about the network itself. Famously, the first of the fallacies of distributed computing is “The network is reliable”. Networks aren’t reliable. They can and will fail, even if your client and the server you are speaking to are fine. They can fail fast, they can fail slow, and they can even malform your packets. You should assume that your networks are plagued with malevolent entities ready to unleash their ire on a whim.

…

What happens when we have to handle failure of multiple separate services or manage hundreds of services? What are some of the coping patterns when you have more microservices than people? Let’s find out. Failure Is Everywhere We understand that things can go wrong. Hard disks can fail. Our software can crash. And as anyone who has read the fallacies of distributed computing can tell you, we know that the network is unreliable. We can do our best to try to limit the causes of failure, but at a certain scale, failure becomes inevitable. Hard drives, for example, are more reliable now than ever before, but they’ll break eventually. The more hard drives you have, the higher the likelihood of failure for an individual unit; failure becomes a statistical certainty at scale.

pages: 933 words: 205,691

Hadoop: The Definitive Guide by Tom White

Amazon Web Services, bioinformatics, business intelligence, business logic, combinatorial explosion, data science, database schema, Debian, domain-specific language, en.wikipedia.org, exponential backoff, fallacies of distributed computing, fault tolerance, full text search, functional programming, Grace Hopper, information retrieval, Internet Archive, Kickstarter, Large Hadron Collider, linked data, loose coupling, openstreetmap, recommendation engine, RFID, SETI@home, social graph, sparse data, web application

Launch the ConfigUpdater in one terminal window: % java ConfigUpdater localhost Set /config to 79 Set /config to 14 Set /config to 78 Then launch the ConfigWatcher in another window immediately afterward: % java ConfigWatcher localhost Read /config as 79 Read /config as 14 Read /config as 78 The Resilient ZooKeeper Application The first of the Fallacies of Distributed Computing[136] states that “The network is reliable.” As they stand, the programs so far have been assuming a reliable network, so when they run on a real network, they can fail in several ways. Let’s examine possible failure modes and what we can do to correct them so that our programs are resilient in the face of failure.

…

Thanks to its ZooKeeper underpinnings, Hedwig is a highly available service and guarantees message delivery even if subscribers are offline for extended periods of time. BookKeeper is a ZooKeeper subproject, and you can find more information on how to use it, and Hedwig, at http://zookeeper.apache.org/bookkeeper/. * * * [136] See http://en.wikipedia.org/wiki/Fallacies_of_Distributed_Computing. [137] For more detail, see the excellent article “Dealing with InterruptedException” by Brian Goetz. [138] Another way of writing the code would be to have a single catch block, just for KeeperException, and a test to see whether its code has the value KeeperException.Code.SESSIONEXPIRED.

Seeking SRE: Conversations About Running Production Systems at Scale by David N. Blank-Edelman

Affordable Care Act / Obamacare, algorithmic trading, AlphaGo, Amazon Web Services, backpropagation, Black Lives Matter, Bletchley Park, bounce rate, business continuity plan, business logic, business process, cloud computing, cognitive bias, cognitive dissonance, cognitive load, commoditize, continuous integration, Conway's law, crowdsourcing, dark matter, data science, database schema, Debian, deep learning, DeepMind, defense in depth, DevOps, digital rights, domain-specific language, emotional labour, en.wikipedia.org, exponential backoff, fail fast, fallacies of distributed computing, fault tolerance, fear of failure, friendly fire, game design, Grace Hopper, imposter syndrome, information retrieval, Infrastructure as a Service, Internet of things, invisible hand, iterative process, Kaizen: continuous improvement, Kanban, Kubernetes, loose coupling, Lyft, machine readable, Marc Andreessen, Maslow's hierarchy, microaggression, microservices, minimum viable product, MVC pattern, performance metric, platform as a service, pull request, RAND corporation, remote working, Richard Feynman, risk tolerance, Ruby on Rails, Salesforce, scientific management, search engine result page, self-driving car, sentiment analysis, Silicon Valley, single page application, Snapchat, software as a service, software is eating the world, source of truth, systems thinking, the long tail, the scientific method, Toyota Production System, traumatic brain injury, value engineering, vertical integration, web application, WebSocket, zero day

As the use of horizontally scaled services across commodity hardware became more economical and commonplace than investing in “big iron,” distributed system principles became much more important to the proper execution of systems. Perhaps foremost among those principles is the conclusion drawn from the fallacies of distributed computing that anything can fail at any time, even things that used to be considered, in simpler environments, rock solid. Working at scale has another consequence, too: problems affect only “1 in a million” are painfully frequent when dealing with hundreds of millions or billions of events. In such an environment, minimizing the losses from expected failures becomes a critical design guideline.

pages: 1,201 words: 233,519

Coders at Work by Peter Seibel

Ada Lovelace, Bill Atkinson, bioinformatics, Bletchley Park, Charles Babbage, cloud computing, Compatible Time-Sharing System, Conway's Game of Life, Dennis Ritchie, domain-specific language, don't repeat yourself, Donald Knuth, fallacies of distributed computing, fault tolerance, Fermat's Last Theorem, Firefox, Free Software Foundation, functional programming, George Gilder, glass ceiling, Guido van Rossum, history of Unix, HyperCard, industrial research laboratory, information retrieval, Ken Thompson, L Peter Deutsch, Larry Wall, loose coupling, Marc Andreessen, Menlo Park, Metcalfe's law, Multics, no silver bullet, Perl 6, premature optimization, publish or perish, random walk, revision control, Richard Stallman, rolodex, Ruby on Rails, Saturday Night Live, side project, slashdot, speech recognition, systems thinking, the scientific method, Therac-25, Turing complete, Turing machine, Turing test, type inference, Valgrind, web application

After participating in a failed attempt to commercialize the Project Genie system, Deutsch moved to Xerox PARC, where he worked on the Interlisp system and on the Smalltalk virtual machine, helping to invent the technique of just-in-time compilation. He served as Chief Scientist at the PARC spin-off, ParcPlace, and was a Fellow at Sun Microsystems, where he put to paper the now famous “Seven Fallacies of Distributed Computing.” He is also the author of Ghostscript, the Postscript viewer. In 1992, he was part of the group that received the Association for Computing Machinery Software System Award, for their work on Interlisp, and in 1994 he was elected a Fellow of the ACM. In 2002 Deutsch quit work on Ghostscript in order to study musical composition.