collaborative editing

13 results back to index


pages: 924 words: 241,081

The Art of Community by Jono Bacon

Amazon: amazon.comamazon.co.ukamazon.deamazon.fr

barriers to entry, Benevolent Dictator For Life (BDFL), collaborative editing, crowdsourcing, Debian, DevOps, en.wikipedia.org, Firefox, game design, Guido van Rossum, Johann Wolfgang von Goethe, Jono Bacon, Kickstarter, Larry Wall, Mark Shuttleworth, Mark Zuckerberg, openstreetmap, Richard Stallman, side project, Silicon Valley, Skype, slashdot, social graph, software as a service, telemarketer, union organizing, VA Linux, web application

Symbols 3D environments, Virtual worlds, Virtual worlds 5-A-Day campaign, The Buzz Cycle, Buildup, Buildup, Announce A A Thousand Suns (album), Mike Shinoda, Linkin Park accommodations for events, Accommodation, Accommodation accomplishment, sense of, Determining Contributions, Determining Contributions accountability, Structuring the plan, Accountability, Accountability Acquia, Dries Buytaert, Drupal and Acquia, Dries Buytaert, Drupal and Acquia action vs. theory, Theory Versus Action: Action Wins, Theory Versus Action: Action Wins Ad Hoc Peer Review, Reviewing new developers: In depth, Reviewing new developers: In depth addiction, Addiction, Addiction AdSense, Google, Online advertising, Online advertising, Handling Absence, Handling Absence advertising, Online advertising, Online advertising, The Professional Press advocates vs. salespeople, Becoming the Advocate, Becoming the Advocate age, and member behavior, Profiling the polemical alliances, Building Alliances, Building Alliances amateur press, The Amateur Press, The Amateur Press Anderson, Chris, Foreword announcement phase of buzz cycle, Announce, Announce, Applying the buzz cycle, Applying the buzz cycle anonymity, when gathering feedback, Anonymity, Anonymity appreciation, Growing Kudos, Growing Kudos Aq (Stuart Langridge), Planning Your Community ArduPilot Mega design, Foreword, Foreword assumptions, during brainstorming sessions, Technique 1: Question assumptions, Technique 1: Question assumptions At Home with Jono Bacon (live stream show), Videos audience, Roles, Roles author of this book, Social Media, Social Media, Keeping in Touch, Keeping in Touch awareness, creating, Campaigns and awareness, Campaigns and awareness B Bacon, Jono, Social Media, Social Media, Keeping in Touch, Keeping in Touch badges, at events, Assets, Assets bandwidth, at events, Ilan Rabinovitch, Southern California Linux Expo, Ilan Rabinovitch, Southern California Linux Expo banners, at events, Assets, Assets BarCamp, Organizing an Unconference, Organizing an Unconference, Organizing an Unconference Basinger, Mike, Governance, Accountability belief, Unwrapping Opportunity, Unwrapping Opportunity, Unwrapping Opportunity, Unwrapping Opportunity, Unwrapping Opportunity evidenced by LugRadio community, Unwrapping Opportunity, Unwrapping Opportunity importance of, Unwrapping Opportunity instilled by Obama election, Unwrapping Opportunity, Unwrapping Opportunity belonging, Building Belonging into the Social Economy, Building Belonging into the Social Economy, The Basis of Communication, Units of Belonging, Units of Belonging, Building Family Values, Building Family Values, Building Family Values, Building Family Values and threats on community, The Basis of Communication building into social economy, Building Belonging into the Social Economy, Building Belonging into the Social Economy impact of workflow on, Building Family Values, Building Family Values teams as units of, Units of Belonging, Units of Belonging vs. family, Building Family Values, Building Family Values Betts, Andrew, Blog wars bikeshedding, Avoiding bikeshedding, Avoiding bikeshedding bit.ly service, Tuning up your messages Blair, Tony, Uniting Together, Uniting Together, Inspired Words, Inspired Words Bloch, Peter, The Basis of Communication blogs, Aims, Getting Online, Getting Online, Blogs, Blogs, Blog wars, Blog wars, Privacy, Community reputation and private information, Privacy community manager as contributor to, Community reputation contact information on, Aims disagreements on, Blog wars, Blog wars history of, Blogs, Blogs setting up, Getting Online, Getting Online blueprints, Structuring Your Projects, Structuring Your Projects, Documenting work items, Scheduling, Scheduling bots, Communications bottlenecks, Ensuring Effective Processes, Knowing When It Is Time, Problems with Responsibility causes of, Ensuring Effective Processes, Problems with Responsibility facing councils, Knowing When It Is Time Bowling Alone (Putnam), Building Belonging into the Social Economy brainstorming, Brainstorming Ideas, Technique 3: Let’s make it suck broadcast appreciation approach, Growing Kudos, Growing Kudos broadcasting with social media, Being Social, Broadcasting, Broadcasting, Broadcasting, Tuning up your messages contents of broadcasts, Broadcasting, Broadcasting overview, Being Social using Twitter, Broadcasting, Tuning up your messages Buffy the Vampire Slayer forum, Read-mostly communities, Read-mostly communities bugs, Bug Tracking, Bug triage, Bug reporting, Bug reporting, Bug reporting, Measuring Mechanics, Measuring Mechanics, Measuring Mechanics, Measuring Mechanics, Measuring Mechanics, Measuring Mechanics fixing, Measuring Mechanics, Measuring Mechanics reporting, Bug reporting, Bug reporting, Measuring Mechanics, Measuring Mechanics tracking, Bug Tracking, Bug triage, Bug reporting, Measuring Mechanics, Measuring Mechanics Bugzilla bug tracker, Bug reporting buildup phase of buzz cycle, Buildup, Buildup, Applying the buzz cycle, Applying the buzz cycle bureaucracy, Processes: Simple Is Sustainable, Avoiding bureaucracy, Avoiding bureaucracy, Delegated Governance, Delegated Governance burndown charts, Visualizing Data with Burndown Charts, Visualizing Data with Burndown Charts, Using burndown charts, Using burndown charts, Using burndown charts, Generating additional information, Using burndown charts, Using burndown charts, Observing burndown patterns, Observing burndown patterns, Building burndown charts into your workflow, Building burndown charts into your workflow benefits of, Using burndown charts, Using burndown charts building into workflow, Building burndown charts into your workflow, Building burndown charts into your workflow generating, Using burndown charts, Generating additional information overview, Visualizing Data with Burndown Charts, Visualizing Data with Burndown Charts patterns in, Observing burndown patterns, Observing burndown patterns reading, Using burndown charts, Using burndown charts burnout, Dealing with Burnout, Dealing with Burnout, Detecting and Treating Burnout, Detecting and Treating Burnout, Required rest and relaxation, Required rest and relaxation, Work/Life Balance, Addiction and work/life balance, Work/Life Balance, Addiction cycle of, Dealing with Burnout, Dealing with Burnout detecting, Detecting and Treating Burnout, Detecting and Treating Burnout treating, Required rest and relaxation, Required rest and relaxation Burns, George, Engage the People Bussler, Mark, Videos, Mark Bussler, Classic Game Room, Mark Bussler, Classic Game Room Buytaert, Dries, Governance Does Not Suck, Dries Buytaert, Drupal and Acquia, Dries Buytaert, Drupal and Acquia buzz, creating, The Art of Community, The Art of Community, The Art of Community, Mindshare, The Mindshare Opportunity, The Mission, The Mission, Uniting Together, Uniting Together, Becoming the Advocate, Becoming the Advocate, Getting It Right by Not Getting It Wrong, Getting It Right by Not Getting It Wrong, Honesty, Honesty, Honesty, Honesty, The Buzz Cycle, The Buzz Cycle, Planning, Planning, Planning, Buildup, Buildup, Announce, Announce, Review, Review, Buzz Targets, Buzz Targets, Announcing Your Community, Applying the buzz cycle, Applying the buzz cycle, Applying the buzz cycle, Applying the buzz cycle, Applying the buzz cycle, Applying the buzz cycle, Attracting Contributors, Choosing Events, Applying the buzz cycle, Applying the buzz cycle, Applying the buzz cycle, Applying the buzz cycle, Applying the buzz cycle, Building Alliances, Building Alliances, The Professional Press, The Professional Press, The Amateur Press, The Amateur Press, Podcasts, Podcasts, Videos, Videos (see also blogs) (see also events) (see also websites, community) advocates vs. salespeople, Becoming the Advocate, Becoming the Advocate and mindshare, Mindshare, The Mindshare Opportunity and mission statement, The Mission, The Mission buzz cycle, The Buzz Cycle, The Buzz Cycle, Planning, Planning, Planning, Buildup, Buildup, Announce, Announce, Review, Review, Applying the buzz cycle, Applying the buzz cycle, Applying the buzz cycle, Applying the buzz cycle, Applying the buzz cycle, Applying the buzz cycle, Applying the buzz cycle, Applying the buzz cycle, Applying the buzz cycle, Applying the buzz cycle and announcing community, Applying the buzz cycle, Applying the buzz cycle and attracting contributors, Applying the buzz cycle, Applying the buzz cycle announcement phase, Announce, Announce buildup phase, Buildup, Buildup, Applying the buzz cycle, Applying the buzz cycle overview, The Buzz Cycle, The Buzz Cycle planning phase, Planning, Planning, Planning, Applying the buzz cycle, Applying the buzz cycle review phase, Review, Review, Applying the buzz cycle, Applying the buzz cycle comparison to politics, Uniting Together, Uniting Together honesty needed, Honesty, Honesty negative energy, avoiding, Honesty, Honesty spamming, avoiding, Getting It Right by Not Getting It Wrong, Getting It Right by Not Getting It Wrong targets of campaign, Buzz Targets, Buzz Targets, Announcing Your Community, Applying the buzz cycle, Attracting Contributors, Choosing Events, Building Alliances, Building Alliances, The Professional Press, The Professional Press, The Amateur Press, The Amateur Press amateur press, The Amateur Press, The Amateur Press announcing community, Announcing Your Community, Applying the buzz cycle attracting contributors, Attracting Contributors, Choosing Events building alliances, Building Alliances, Building Alliances overview, Buzz Targets, Buzz Targets professional press, The Professional Press, The Professional Press using podcasts, Podcasts, Podcasts using videos, Videos, Videos ByteMark Hosting, Sponsorship, Understanding Your Workflow, Setting expectations C CafePress, Selling campaigns, Campaigns and awareness, Running a Campaign, The buildup can do culture, Avoiding bikeshedding capital, social, The Art of Community (see social capital) Carmony, Kevin, Blog wars Castro, Jorge, Planning catering for events, Catering, Catering, Additional notes, Event-specific notes, Event-specific notes Cave Story (game), Richard Esguerra, Humble Indie Bundle CGR (Classic Game Room), Mark Bussler, Classic Game Room, Mark Bussler, Classic Game Room channels, on YouTube, Videos, Videos charts, The Art of Community (see burndown charts) chat, The Art of Community (see IRC) circles, in Google+, Getting started with Google+ clarity of communication, Striving for Clarity, Striving for Clarity, Be clear, Be clear in conflict resolution, Be clear, Be clear in general, Striving for Clarity, Striving for Clarity Classic Game Room (CGR), Mark Bussler, Classic Game Room, Mark Bussler, Classic Game Room cliques, Communications, Communications Code of Conduct, Ubuntu, Diversity, Diversity Cole, Henry, Simplicity is key collaboration, The Art of Community, Collaboration-Driven Ethos, Collaboration-Driven Ethos, Designing Your Community, Designing Your Community, Designing Your Community, Designing Your Community, Brainstorming Ideas, Technique 3: Let’s make it suck, Understand the extent and range of collaboration among our teams, Understand the extent and range of collaboration among our teams, The Mechanics of Collaboration, The Mechanics of Collaboration, Collaborative Editing, Reporting, Reporting, Being Social, Collaboration, Collaboration, Campaigns and awareness, Events, Events, Running a Campaign, The buildup, Measuring Mechanics, Measuring Mechanics (see also communication) between teams, Understand the extent and range of collaboration among our teams, Understand the extent and range of collaboration among our teams brainstorming, Brainstorming Ideas, Technique 3: Let’s make it suck documenting areas of, Designing Your Community, Designing Your Community, Designing Your Community, Designing Your Community ethos driven by, Collaboration-Driven Ethos, Collaboration-Driven Ethos mechanics of, The Mechanics of Collaboration, The Mechanics of Collaboration, Measuring Mechanics, Measuring Mechanics on editing, Collaborative Editing, Reporting, Reporting using social media, Being Social, Collaboration, Collaboration, Campaigns and awareness, Events, Events, Running a Campaign, The buildup for campaigns and awareness, Campaigns and awareness, Running a Campaign, The buildup for coordinating events, Events, Events overview of, Being Social party-planning example, Collaboration, Collaboration Colvig, Mary, Planning Your Community, Mary Colvig, Mozilla, Mary Colvig, Mozilla commenting facilities, on community website, Building Conversation, Building Conversation commericially-sponsored communities, The Art of Community (see sponsors) communication, The Art of Community, The Art of Community, The Art of Community, The Art of Community, The Basis of Communication, The Basis of Communication, Community: The Bird’s-Eye View, Ensure that teams can communicate clearly and effectively, Ensure that teams can communicate clearly and effectively, Striving for Clarity, Striving for Clarity, Striving for Clarity, Striving for Clarity, Choices, Choices, Choices, Choices, Communication fetishism, Communication fetishism, IRC, IRC, Leading by Example, Leading by Example, Daily Communication, Daily Communication, Daily Communication, Netiquette, Netiquette, Avoiding bikeshedding, Setting tone, Longer Writing, Inspiring your community, The mechanics of writing, The mechanics of writing, The mechanics of writing, The mechanics of writing, Don’t write like an institution, Untwisting the tail, Inspiring your community, Inspiring your community, Communications, Communications, Communications, Communications, Inspired Words, Inspired Words, Aims, Communication, Communication, Communicating Between Councils, Communicating Between Councils, Barriers to Input, Barriers to Input, Online Discussion Meetings, Running the meeting (see also collaboration) and cliques, Communications, Communications between Community Council members, Communication, Communication between councils, Communicating Between Councils, Communicating Between Councils between teams, Community: The Bird’s-Eye View, Ensure that teams can communicate clearly and effectively, Ensure that teams can communicate clearly and effectively channels for, The Art of Community, The Art of Community, Striving for Clarity, Striving for Clarity, Choices, Choices, Choices, Choices, Communication fetishism, Communication fetishism, IRC, IRC, Daily Communication (see also forums) (see also mailing lists) avoiding overabundance of, Communication fetishism, Communication fetishism choosing, Choices, Choices, Choices, Choices clarity in, Striving for Clarity, Striving for Clarity, Daily Communication IRC (chat), IRC, IRC feedback from others about, Leading by Example, Leading by Example frequency of, Daily Communication, Daily Communication in online discussion meetings, Online Discussion Meetings, Running the meeting netiquette, Netiquette, Netiquette stories as mechanism behind, The Basis of Communication, The Basis of Communication transparency in, Striving for Clarity, Striving for Clarity, Communications, Communications via community website, Aims when complaints arise, Barriers to Input, Barriers to Input writing, The Art of Community, Avoiding bikeshedding, Setting tone, Longer Writing, Inspiring your community, The mechanics of writing, The mechanics of writing, The mechanics of writing, The mechanics of writing, Don’t write like an institution, Untwisting the tail, Inspiring your community, Inspiring your community, Inspired Words, Inspired Words (see also documentation) cultural differences in, The mechanics of writing, The mechanics of writing inspirational, Inspiring your community, Inspiring your community, Inspired Words, Inspired Words mechanics of, The mechanics of writing, The mechanics of writing simple, clear language, Don’t write like an institution, Untwisting the tail tone, Avoiding bikeshedding, Setting tone community, The Art of Community, The Essence of Community, The Basis of Communication, The Basis of Communication, The Basis of Communication, The Basis of Communication, Unwrapping Opportunity, Unwrapping Opportunity, Inspired Words, Inspired Words, Announcing Your Community, Applying the buzz cycle, The Case for Governance, Why Community Building Has Become a Big Business, Why Community Building Has Become a Big Business (see also measuring community) and open source development, Why Community Building Has Become a Big Business, Why Community Building Has Become a Big Business announcing, Announcing Your Community, Applying the buzz cycle defined, The Essence of Community size of, The Case for Governance stories within, The Basis of Communication, The Basis of Communication threats on, The Basis of Communication, The Basis of Communication, Unwrapping Opportunity, Unwrapping Opportunity effect on sense of belonging, The Basis of Communication, The Basis of Communication LugRadio response to rail strike, Unwrapping Opportunity, Unwrapping Opportunity use of term, Inspired Words, Inspired Words Community Council, The Art of Community, Communications, Setting Up a Community Council, Setting Up a Community Council, Responsibilities, Responsibilities, Structure, Structure, Commercial sponsorship, Commercial sponsorship, Membership, Communication, Communication, Codifying Your Council, Codifying Your Council, Nominating and Electing Council Members, Forming a new council, Communicating Between Councils, Communicating Between Councils (see also Ubuntu Community Council) codifying, Codifying Your Council, Codifying Your Council communication between, Communication, Communication communication channels run by, Communications communication with other councils, Communicating Between Councils, Communicating Between Councils members of, Membership, Nominating and Electing Council Members, Forming a new council of commercially-sponsored communities, Commercial sponsorship, Commercial sponsorship overview of, Setting Up a Community Council, Setting Up a Community Council responsibilities of, Responsibilities, Responsibilities structure of, Structure, Structure community cycles, Community Cycles, Leading by example: Ubuntu Community Leadership Summit, The Second Edition, The Community Leadership Summit, Joining Us community managers, The Art of Community, Trust Is Everything, Trust Is Everything, Perception of you, Perception of you, Credibility and the Need to Track Progress, Credibility and the Need to Track Progress, Events, Why Community Building Has Become a Big Business, Why Community Building Has Become a Big Business, Scope of the Role, Scope of the Role, Risk, Risk, Breaking Tradition, Breaking Tradition, Control and Reporting, Control and Reporting, The ability to enact change, The ability to enact change, The Responsibilities of Community Engagement, The Responsibilities of Community Engagement, Salary, Salary, Communicating Expectations to the Candidate, Communicating Expectations to the Candidate, Induction, Community reputation, Internal reputation, Community reputation, Strategy, Strategy, Management and Communications, Weekly engagements, Community feedback, Community feedback (see also governance) and strategic plan, Strategy, Strategy as messenger of community feedback, Community feedback, Community feedback building credibility as, Credibility and the Need to Track Progress, Credibility and the Need to Track Progress communicating expectations to candidate for, Communicating Expectations to the Candidate, Communicating Expectations to the Candidate community perception of, Perception of you, Perception of you company's communication with, Management and Communications, Weekly engagements degree of control over change, The ability to enact change, The ability to enact change difference from traditional managers, Breaking Tradition, Breaking Tradition hype about, Why Community Building Has Become a Big Business, Why Community Building Has Become a Big Business identifying at events, Events induction of, Induction, Community reputation interaction with community, The Responsibilities of Community Engagement, The Responsibilities of Community Engagement personality/characteristics of, Trust Is Everything, Trust Is Everything reporting by, Control and Reporting, Control and Reporting reputation of, Internal reputation, Community reputation risk in hiring, Risk, Risk salary of, Salary, Salary scope of role, Scope of the Role, Scope of the Role complaints, listening to, Barriers to Input, Barriers to Input conferences, The Art of Community, The Art of Community, Organizing Physical Events (see also events) (see also unconferences) confidence, Units of Belonging, Units of Belonging, Developing Knowledge, Developing Knowledge effect of scope on, Units of Belonging, Units of Belonging effect of skills acquisition on, Developing Knowledge, Developing Knowledge conflict, The Art of Community, The Art of Community, Diversity, Blog wars, Blog wars, The Case for Governance, The Nature of the Beast, The Nature of the Beast, The Structure of Strife, The Structure of Strife, The Structure of Strife, The Structure of Strife, The Calm Before the Storm, The Calm Before the Storm, Contentious Personalities, Profiling the polemical, Poisonous people, Barriers to Input, Barriers to Input, Problems with Responsibility, Problems with Responsibility, Lack of Justice, Lack of Justice, Lack of Justice, Lack of Justice (see also conflict resolution) (see also escalation of issues) addressed in Ubuntu Code of Conduct, Diversity determining seriousness of, The Case for Governance elements of, The Structure of Strife, The Structure of Strife expecting, The Nature of the Beast, The Nature of the Beast in communities vs. in formal organizations, The Structure of Strife insight gained from, The Structure of Strife on blogs, Blog wars, Blog wars physical violence, Lack of Justice, Lack of Justice sources of, The Calm Before the Storm, The Calm Before the Storm, Contentious Personalities, Profiling the polemical, Poisonous people, Barriers to Input, Barriers to Input, Problems with Responsibility, Problems with Responsibility, Lack of Justice, Lack of Justice barriers to input, Barriers to Input, Barriers to Input lack of justice, Lack of Justice, Lack of Justice overview of, The Calm Before the Storm, The Calm Before the Storm personality traits, Contentious Personalities, Profiling the polemical, Poisonous people problems with responsibility, Problems with Responsibility, Problems with Responsibility conflict resolution, The Art of Community, The Art of Community, To Bring Peace, To Bring Peace, Responsibilities, Lack of Justice, Lack of Justice, The Conflict Resolution Process, The Conflict Resolution Process, The Role of a Facilitator, Be clear, Be objective, Be clear, Be open, Be open, Part 1: Calm and reassure, The fantastical user group debacle, Part 2: Get the facts, The fantastical user group debacle, Part 3: Discuss, The fantastical user group debacle, Part 4: Document, The fantastical user group debacle, Part 5: Reflect and maintain, The fantastical user group debacle (see also conflict) (see also escalation of issues) and governance, To Bring Peace, To Bring Peace, Responsibilities effect of not handling correctly, Lack of Justice, Lack of Justice facilitator's role in, The Role of a Facilitator, Be clear keeping records of, Be open, Be open overview of, The Conflict Resolution Process, The Conflict Resolution Process process of, Part 1: Calm and reassure, The fantastical user group debacle, Part 2: Get the facts, The fantastical user group debacle, Part 3: Discuss, The fantastical user group debacle, Part 4: Document, The fantastical user group debacle, Part 5: Reflect and maintain, The fantastical user group debacle calm and reassure, Part 1: Calm and reassure, The fantastical user group debacle discuss, Part 3: Discuss, The fantastical user group debacle document, Part 4: Document, The fantastical user group debacle get the facts, Part 2: Get the facts, The fantastical user group debacle reflect and maintain, Part 5: Reflect and maintain, The fantastical user group debacle qualities needed for, Be objective, Be clear consideration for others, in Ubuntu Code of Conduct, Diversity contact information, Aims contextual appreciation, Growing Kudos contributers, The Art of Community (see also conflict) contributors, Community: The Bird’s-Eye View, Community: The Bird’s-Eye View, The Gates of Your Community, The Gates of Your Community, The Gates of Your Community, The Gates of Your Community, Assessing Contributors, Reviewing new developers: In depth, The On-Ramp: Creating Collaborative Processes, Growing Kudos, The On-Ramp: Creating Collaborative Processes, The On-Ramp: Creating Collaborative Processes, The On-Ramp: Creating Collaborative Processes, Identifying the On-Ramp, Identifying the On-Ramp, Developing Knowledge, Developing Knowledge, Determining Contributions, Determining Contributions, Growing Kudos, Growing Kudos, Staying Current, Staying Current, Staying Current, Attracting Contributors, Attracting Contributors, Attracting Contributors, Attracting Contributors, Attracting Contributors, Applying the buzz cycle, Applying the buzz cycle, Profiling the polemical, Profiling the polemical, Handling Absence, Handling Absence, Handling Bereavement, Handling Bereavement accessing, Assessing Contributors, Reviewing new developers: In depth attracting, Community: The Bird’s-Eye View, Community: The Bird’s-Eye View, The Gates of Your Community, The Gates of Your Community, Attracting Contributors, Attracting Contributors, Attracting Contributors, Attracting Contributors, Attracting Contributors, Applying the buzz cycle, Applying the buzz cycle applying buzz cycle to, Applying the buzz cycle, Applying the buzz cycle as ongoing effort, Attracting Contributors by appealing to capacity and diversity, Community: The Bird’s-Eye View, Community: The Bird’s-Eye View by showcasing work, Attracting Contributors, Attracting Contributors the right people, The Gates of Your Community, The Gates of Your Community through existing members, Attracting Contributors, Attracting Contributors death of, Handling Bereavement, Handling Bereavement getting participation from (the on-ramp), The On-Ramp: Creating Collaborative Processes, Growing Kudos, The On-Ramp: Creating Collaborative Processes, The On-Ramp: Creating Collaborative Processes, The On-Ramp: Creating Collaborative Processes, Identifying the On-Ramp, Identifying the On-Ramp, Developing Knowledge, Developing Knowledge, Determining Contributions, Determining Contributions, Growing Kudos, Growing Kudos defined, The On-Ramp: Creating Collaborative Processes determining contributions, Determining Contributions, Determining Contributions identifying, Identifying the On-Ramp, Identifying the On-Ramp showing appreciation, Growing Kudos, Growing Kudos skills acquisition, Developing Knowledge, Developing Knowledge steps in, The On-Ramp: Creating Collaborative Processes, The On-Ramp: Creating Collaborative Processes leaving community, Handling Absence, Handling Absence maturing of, Profiling the polemical, Profiling the polemical retaining, The Gates of Your Community, The Gates of Your Community to community website, Staying Current, Staying Current, Staying Current conventions, The Art of Community, Organizing Physical Events (see also events) conversation, on community website, The Art of Community, Building Conversation, Building Conversation, Building Conversation (see also communication) coordinated universal time (UTC), Date/time, Date/time costs, The Art of Community, The Art of Community, The Art of Community, Cost, Cost, Additional notes, Event-specific notes, Understanding Your Needs, Understanding Your Needs (see also finances) (see also money from sponsors) (see also sponsors) cutting, Understanding Your Needs, Understanding Your Needs for physical events, Cost, Cost, Additional notes, Event-specific notes councils, The Art of Community (see Community Council) Cox, Alan, Blogs Creative Commons, Untwisting the tail, Untwisting the tail, Videos, Mike Linksvayer, Creative Commons, Mike Linksvayer, Creative Commons credibility, Credibility and the Need to Track Progress, Credibility and the Need to Track Progress Cricket (graphing tool), Plugging your stats into graphs criticism, The Art of Community, Gathering feedback (see also feedback) culture, effect on members, Profiling the polemical cycles, Community Cycles, Leading by example: Ubuntu Czajkowski, Laura, Building a Set of Generals D data, The Art of Community, The Art of Community (see hooks and data) (see tracking) DC Ubuntu Local Community, Observational Tests deadlines, Planning, Step 3: Set Deadlines, Step 3: Set Deadlines for buzz-creating activities, Planning for organizing events, Step 3: Set Deadlines, Step 3: Set Deadlines Dealing with Diversity (Graen), Diversity debates, Debates, Debates deep-level diversity, Diversity, Diversity developers, Developer, Developer disagreements, The Art of Community (see conflict) Distributed Version Control System (DVCS), Source Control diversity within communities, Diversity, Diversity DIY Drones community, Foreword, Foreword documentation, The Art of Community, Designing Your Community, Baking in Openness, Documenting Your Strategy, Documenting Your Strategy, Document Them, Make Them Easy to Find, Make Them Easy to Find, Collaborative Editing, Collaborative Editing, Aims, Showing off your survey reports, Showing off your survey reports, Showing off your survey reports, Documenting work items, Documenting work items, Codifying Your Council, Codifying Your Council, Codifying Your Council, Be open, Part 4: Document, Part 4: Document, Accommodation (see also writing) collaborative editing of, Collaborative Editing, Collaborative Editing of conflict resolution, Be open, Part 4: Document, Part 4: Document of council codification, Codifying Your Council, Codifying Your Council, Codifying Your Council of hotel accommodations for events, Accommodation of processes, Document Them, Make Them Easy to Find, Make Them Easy to Find of strategic plan, Designing Your Community, Baking in Openness, Documenting Your Strategy, Documenting Your Strategy of survey results, Showing off your survey reports, Showing off your survey reports, Showing off your survey reports of work items, Documenting work items, Documenting work items on community website, Aims donations, Donations, Donations Dotzler, Asa, Mary Colvig, Mozilla Dr.

issue escalation in, Escalation, Escalation measurement of, Hooks ’n’ Data, Hooks ’n’ Data membership, Membership, Membership, Ubuntu Member, Ubuntu Member, Ubuntu Member, Ubuntu Member, Ubuntu Member, Ubuntu Member, Ubuntu Member, Ubuntu Member, Developer, Developer, Council or Board Member, Council or Board Member applying for, Ubuntu Member, Ubuntu Member council or board member, Council or Board Member, Council or Board Member developer, Developer, Developer member role, Ubuntu Member, Ubuntu Member overview, Membership, Membership privileges of, Ubuntu Member, Ubuntu Member Regional Membership Boards, Ubuntu Member, Ubuntu Member overview of, Write-centered communities process reassessment in, Process Reassessment, Process Reassessment release cycles, Leading by example: Ubuntu, Leading by example: Ubuntu skills acquisition in, Developing Knowledge, Developing Knowledge structure of, Mark Shuttleworth, Mark Shuttleworth, Community Council, Community Council, Technical Board, Technical Board, Team councils, Team councils Community Council, Community Council, Community Council founder and primary sponsor, Mark Shuttleworth, Mark Shuttleworth team councils, Team councils, Team councils Technical Board, Technical Board, Technical Board survey to determine bottlenecks, Ensuring Effective Processes, Ensuring Effective Processes team collaboration in, Understand the extent and range of collaboration among our teams Ubuntu 11.04 release, Feedback, Feedback updates for, Providing Community Updates, Providing Community Updates use of YouTube by, Videos, Videos Xubuntu community, Baking in Openness, Baking in Openness young African as contributor to, Striving for Clarity, Inspiring your community Ubuntu Community Council, Governance, Governance, Commercial sponsorship, Mark Shuttleworth, Mark Shuttleworth, Community Council, Community Council, Community Council, Community Council, Community Council, Technical Board, Technical Board, Developer how items are forwarded to, Community Council, Community Council mandate of, Community Council Mike Basinger's election to, Governance, Governance MOTU Developers, Developer requirements for, contrasted with Technical Board, Technical Board, Technical Board responsibilities of, Community Council, Community Council role of Mark Shuttleworth on, Mark Shuttleworth, Mark Shuttleworth seats on, Commercial sponsorship Ubuntu Core Developers, Developer Ubuntu Developer Summit (UDS), Reporting, Reporting, Organizing a Summit, Organizing a Summit, The Organizational Team, The Organizational Team, The Venue, The Venue, Meeting room requirements, Meeting room requirements, Location, Location, Facilities, Facilities, Assets, Assets, Infrastructure, Infrastructure, Infrastructure, Infrastructure, Room Layout, Room Layout, Room Layout, The Timetable, Sessions, Scheduling, Scheduling, Scheduling, Scheduling assets, Assets, Assets infrastructure, Infrastructure, Infrastructure organizational team of, The Organizational Team, The Organizational Team remote participation in, Infrastructure, Infrastructure, Room Layout reporting at, Reporting, Reporting scheduling, Scheduling, Scheduling sponsorship of, Scheduling, Scheduling timetable of, The Timetable, Sessions venue of, The Venue, The Venue, Meeting room requirements, Meeting room requirements, Location, Location, Facilities, Facilities, Room Layout, Room Layout facilities, Facilities, Facilities location, Location, Location meeting room, Meeting room requirements, Meeting room requirements overview of, The Venue, The Venue room layout, Room Layout, Room Layout Ubuntu Forums Council (FC), Codifying Your Council, Codifying Your Council, Escalation Ubuntu Global Jam, Plugging your stats into graphs Ubuntu Local Community (LoCo), Observational Tests, Building a Set of Generals, Building a Set of Generals, Responsibilities, Team councils, Team councils Ubuntu Open Week, Date/time, Organizing Online Tutorials, Running a session Ubuntu Package Selection, Technical Board Ubuntu Packaging Policy, Technical Board Ubuntu Release Feature Goals, Technical Board Ubuntu Upstream Report, Getting to know the problem UDS, The Art of Community (see Ubuntu Developer Summit) unconferences, Organizing an Unconference, Event-specific notes updates, community, Providing Community Updates, Providing Community Updates, Staying Current, Staying Current usability testing, Observational Tests, Observational Tests, Observational Tests Ustream, Videos, Videos UTC (coordinated universal time), Date/time, Date/time V value, bringing to company/community, Hooks ’n’ Data, Privacy values, Building Your Communication Channels, Aspire to Inspire, Aspire to Inspire, Responsibilities effect of good communication on, Building Your Communication Channels governance based on, Aspire to Inspire, Aspire to Inspire, Responsibilities van Rossum, Guido, Dictatorial Charismatic Leadership Verduzco, Cristina, The Gates of Your Community, The Gates of Your Community version control, Source Control, Source Control video recording, at events, Infrastructure, Infrastructure videos, Videos, Videos, Videos, Videos about community, Videos, Videos creating buzz with, Videos, Videos violence, Lack of Justice, Lack of Justice viral marketing, Attracting Contributors, Attracting Contributors virtual worlds, Virtual worlds, Virtual worlds VoIP (Voice over IP), Voice over IP (VoIP), Voice over IP (VoIP) volunteers, The Art of Community (see contributers) W Walker, Dave, Social Media Walli, Stephen, Building Belonging into the Social Economy, Long versus short presentations WebKit, Enlightened Dictatorship, Enlightened Dictatorship website for this book, Building Our Own Community, Building Our Own Community website, community, Setting Up Your Base, Setting Up Your Base, Aims, Aims, Staying Current, Review, Staying Current, Staying Current, Staying Current, Building Conversation, Building Conversation, Building Conversation, Building Conversation, Building Conversation, Getting Online, Getting Online, Syndication, Syndication, Syndication, Syndication, Syndication, Syndication aims of, Aims, Aims commenting facilities on, Building Conversation, Building Conversation contributors to, Staying Current, Staying Current, Staying Current conversation on, Building Conversation, Building Conversation, Building Conversation importance of, Setting Up Your Base, Setting Up Your Base keeping current, Staying Current, Review RSS feeds from, Syndication, Syndication search engine optimization of, Syndication, Syndication setting up with blogging engine, Getting Online, Getting Online syndication of content from, Syndication, Syndication Whedon, Joss, Read-mostly communities, Read-mostly communities Wii (game system), Technique 2: Think outside the box Wikipedia, Tim O’Reilly, O’Reilly Media wikis, Documenting Your Strategy, Documenting Your Strategy, Reporting publishing strategic plan on, Documenting Your Strategy, Documenting Your Strategy using for reporting, Reporting WineCamp, Organizing an Unconference, Organizing an Unconference Wink, Videos wit, Setting tone Wolfire Games, Richard Esguerra, Humble Indie Bundle, Richard Esguerra, Humble Indie Bundle word of mouth, building buzz with, Attracting Contributors WordPress, Getting Online Work item level, of projects, Tracking Projects work items, communication between teams about, Ensure that teams can communicate clearly and effectively Work unit level, of projects, Tracking Projects work, tracking, The Art of Community (see tracking) work/life balance, Work/Life Balance, Addiction workflow, The Art of Community, The Art of Community, Understanding Your Workflow, Understanding Your Workflow, Roles, Roles, Building a Simple Workflow, Building a Simple Workflow, An Example: Ubuntu Bug Workflow, Lessons learned, Bug Tracking, Bug triage, Source Control, Source Control, Collaborative Editing, Collaborative Editing, Building and Maintaining Transparency, Building and Maintaining Transparency, Regular Workflow Assessment, Regular Workflow Assessment, Gathering Structured Feedback, Gathering Structured Feedback, Building burndown charts into your workflow, Building burndown charts into your workflow, Building Family Values, Building Family Values (see also reporting) (see also tools) assessing, Regular Workflow Assessment, Regular Workflow Assessment bug tracking, Bug Tracking, Bug triage building, Building a Simple Workflow, Building a Simple Workflow building burndown charts into, Building burndown charts into your workflow, Building burndown charts into your workflow collaborative editing, Collaborative Editing, Collaborative Editing gathering feedback about, Gathering Structured Feedback, Gathering Structured Feedback identifying audience, Roles, Roles impact on sense of belonging, Building Family Values, Building Family Values overview, Understanding Your Workflow, Understanding Your Workflow source control, Source Control, Source Control transparency in, Building and Maintaining Transparency, Building and Maintaining Transparency Ubuntu bug example, An Example: Ubuntu Bug Workflow, Lessons learned write-centered communities, Write-centered communities, Write-centered communities writing, The Art of Community, Building a Mission Statement, Building a Mission Statement, Avoiding bikeshedding, Setting tone, The mechanics of writing, The mechanics of writing, The mechanics of writing, The mechanics of writing, Don’t write like an institution, Untwisting the tail, Inspiring your community, Inspiring your community, Inspired Words, Inspired Words (see also documentation) cultural differences in, The mechanics of writing, The mechanics of writing inspirational, Inspiring your community, Inspiring your community, Inspired Words, Inspired Words mechanics of, The mechanics of writing, The mechanics of writing mission statements, Building a Mission Statement, Building a Mission Statement simple, clear language, Don’t write like an institution, Untwisting the tail tone, Avoiding bikeshedding, Setting tone X X.commerce developer community, Carolyn Mellor, X.commerce, PayPal, and eBay, Carolyn Mellor, X.commerce, PayPal, and eBay X.org Foundation, Transparency xFree86 project, Transparency, Transparency Xiaojiang Huang, Foreword Xubuntu community, Baking in Openness, Baking in Openness Y YouTube, Videos, Videos, Mark Bussler, Classic Game Room Z Zazzle, Selling Zinsser, William, Don’t write like an institution About the Author Jono Bacon is an award-winning leading community manager, author and consultant.

Technical Considerations Most of this chapter has focused on best practices for assessing optimal workflow for a project, staying detached enough from specific technologies so that the information can apply to any kind of community and workflow. The next few pages delve into some technical specifics of particular interest to software projects, such as bug tracking, source control, and collaborative editing. As we cover these topics I am going to deliberately avoid recommending specific solutions or pieces of software. This is because software changes a lot more quickly than books do, and the last thing I want to do is recommend a piece of software that has since become unsuitable for your requirements. Instead, the following pages will advise you of some key considerations that you should keep in mind when evaluating which tools you want to use to solve a particular problem.


Martin Kleppmann-Designing Data-Intensive Applications. The Big Ideas Behind Reliable, Scalable and Maintainable Systems-O’Reilly (2017) by Unknown

active measures, Amazon Web Services, bitcoin, blockchain, business intelligence, business process, c2.com, cloud computing, collaborative editing, commoditize, conceptual framework, cryptocurrency, database schema, DevOps, distributed ledger, Donald Knuth, Edward Snowden, ethereum blockchain, fault tolerance, finite state, Flash crash, full text search, general-purpose programming language, informal economy, information retrieval, Internet of things, iterative process, John von Neumann, loose coupling, Marc Andreessen, natural language processing, Network effects, packet switching, peer-to-peer, performance metric, place-making, premature optimization, recommendation engine, Richard Feynman, Richard Feynman, self-driving car, semantic web, Shoshana Zuboff, social graph, social web, software as a service, software is eating the world, sorting algorithm, source of truth, SPARQL, speech recognition, statistical model, web application, WebSocket, wikimedia commons

As the rich his‐ tory of broken calendar sync implementations demonstrates, multi-leader replication is a tricky thing to get right. There are tools that aim to make this kind of multi-leader configuration easier. For example, CouchDB is designed for this mode of operation [29]. Collaborative editing Real-time collaborative editing applications allow several people to edit a document simultaneously. For example, Etherpad [30] and Google Docs [31] allow multiple people to concurrently edit a text document or spreadsheet (the algorithm is briefly discussed in “Automatic Conflict Resolution” on page 174). 170 | Chapter 5: Replication We don’t usually think of collaborative editing as a database replication problem, but it has a lot in common with the previously mentioned offline editing use case. When one user edits a document, the changes are instantly applied to their local replica (the state of the document in their web browser or client application) and asynchronously replicated to the server and any other users who are editing the same document.

The opposite of bounded. 558 | Glossary Index A aborts (transactions), 222, 224 in two-phase commit, 356 performance of optimistic concurrency con‐ trol, 266 retrying aborted transactions, 231 abstraction, 21, 27, 222, 266, 321 access path (in network model), 37, 60 accidental complexity, removing, 21 accountability, 535 ACID properties (transactions), 90, 223 atomicity, 223, 228 consistency, 224, 529 durability, 226 isolation, 225, 228 acknowledgements (messaging), 445 active/active replication (see multi-leader repli‐ cation) active/passive replication (see leader-based rep‐ lication) ActiveMQ (messaging), 137, 444 distributed transaction support, 361 ActiveRecord (object-relational mapper), 30, 232 actor model, 138 (see also message-passing) comparison to Pregel model, 425 comparison to stream processing, 468 Advanced Message Queuing Protocol (see AMQP) aerospace systems, 6, 10, 305, 372 aggregation data cubes and materialized views, 101 in batch processes, 406 in stream processes, 466 aggregation pipeline query language, 48 Agile, 22 minimizing irreversibility, 414, 497 moving faster with confidence, 532 Unix philosophy, 394 agreement, 365 (see also consensus) Airflow (workflow scheduler), 402 Ajax, 131 Akka (actor framework), 139 algorithms algorithm correctness, 308 B-trees, 79-83 for distributed systems, 306 hash indexes, 72-75 mergesort, 76, 402, 405 red-black trees, 78 SSTables and LSM-trees, 76-79 all-to-all replication topologies, 175 AllegroGraph (database), 50 ALTER TABLE statement (SQL), 40, 111 Amazon Dynamo (database), 177 Amazon Web Services (AWS), 8 Kinesis Streams (messaging), 448 network reliability, 279 postmortems, 9 RedShift (database), 93 S3 (object storage), 398 checking data integrity, 530 amplification of bias, 534 of failures, 364, 495 Index | 559 of tail latency, 16, 207 write amplification, 84 AMQP (Advanced Message Queuing Protocol), 444 (see also messaging systems) comparison to log-based messaging, 448, 451 message ordering, 446 analytics, 90 comparison to transaction processing, 91 data warehousing (see data warehousing) parallel query execution in MPP databases, 415 predictive (see predictive analytics) relation to batch processing, 411 schemas for, 93-95 snapshot isolation for queries, 238 stream analytics, 466 using MapReduce, analysis of user activity events (example), 404 anti-caching (in-memory databases), 89 anti-entropy, 178 Apache ActiveMQ (see ActiveMQ) Apache Avro (see Avro) Apache Beam (see Beam) Apache BookKeeper (see BookKeeper) Apache Cassandra (see Cassandra) Apache CouchDB (see CouchDB) Apache Curator (see Curator) Apache Drill (see Drill) Apache Flink (see Flink) Apache Giraph (see Giraph) Apache Hadoop (see Hadoop) Apache HAWQ (see HAWQ) Apache HBase (see HBase) Apache Helix (see Helix) Apache Hive (see Hive) Apache Impala (see Impala) Apache Jena (see Jena) Apache Kafka (see Kafka) Apache Lucene (see Lucene) Apache MADlib (see MADlib) Apache Mahout (see Mahout) Apache Oozie (see Oozie) Apache Parquet (see Parquet) Apache Qpid (see Qpid) Apache Samza (see Samza) Apache Solr (see Solr) Apache Spark (see Spark) 560 | Index Apache Storm (see Storm) Apache Tajo (see Tajo) Apache Tez (see Tez) Apache Thrift (see Thrift) Apache ZooKeeper (see ZooKeeper) Apama (stream analytics), 466 append-only B-trees, 82, 242 append-only files (see logs) Application Programming Interfaces (APIs), 5, 27 for batch processing, 403 for change streams, 456 for distributed transactions, 361 for graph processing, 425 for services, 131-136 (see also services) evolvability, 136 RESTful, 133 SOAP, 133 application state (see state) approximate search (see similarity search) archival storage, data from databases, 131 arcs (see edges) arithmetic mean, 14 ASCII text, 119, 395 ASN.1 (schema language), 127 asynchronous networks, 278, 553 comparison to synchronous networks, 284 formal model, 307 asynchronous replication, 154, 553 conflict detection, 172 data loss on failover, 157 reads from asynchronous follower, 162 Asynchronous Transfer Mode (ATM), 285 atomic broadcast (see total order broadcast) atomic clocks (caesium clocks), 294, 295 (see also clocks) atomicity (concurrency), 553 atomic increment-and-get, 351 compare-and-set, 245, 327 (see also compare-and-set operations) replicated operations, 246 write operations, 243 atomicity (transactions), 223, 228, 553 atomic commit, 353 avoiding, 523, 528 blocking and nonblocking, 359 in stream processing, 360, 477 maintaining derived data, 453 for multi-object transactions, 229 for single-object writes, 230 auditability, 528-533 designing for, 531 self-auditing systems, 530 through immutability, 460 tools for auditable data systems, 532 availability, 8 (see also fault tolerance) in CAP theorem, 337 in service level agreements (SLAs), 15 Avro (data format), 122-127 code generation, 127 dynamically generated schemas, 126 object container files, 125, 131, 414 reader determining writer’s schema, 125 schema evolution, 123 use in Hadoop, 414 awk (Unix tool), 391 AWS (see Amazon Web Services) Azure (see Microsoft) B B-trees (indexes), 79-83 append-only/copy-on-write variants, 82, 242 branching factor, 81 comparison to LSM-trees, 83-85 crash recovery, 82 growing by splitting a page, 81 optimizations, 82 similarity to dynamic partitioning, 212 backpressure, 441, 553 in TCP, 282 backups database snapshot for replication, 156 integrity of, 530 snapshot isolation for, 238 use for ETL processes, 405 backward compatibility, 112 BASE, contrast to ACID, 223 bash shell (Unix), 70, 395, 503 batch processing, 28, 389-431, 553 combining with stream processing lambda architecture, 497 unifying technologies, 498 comparison to MPP databases, 414-418 comparison to stream processing, 464 comparison to Unix, 413-414 dataflow engines, 421-423 fault tolerance, 406, 414, 422, 442 for data integration, 494-498 graphs and iterative processing, 424-426 high-level APIs and languages, 403, 426-429 log-based messaging and, 451 maintaining derived state, 495 MapReduce and distributed filesystems, 397-413 (see also MapReduce) measuring performance, 13, 390 outputs, 411-413 key-value stores, 412 search indexes, 411 using Unix tools (example), 391-394 Bayou (database), 522 Beam (dataflow library), 498 bias, 534 big ball of mud, 20 Bigtable data model, 41, 99 binary data encodings, 115-128 Avro, 122-127 MessagePack, 116-117 Thrift and Protocol Buffers, 117-121 binary encoding based on schemas, 127 by network drivers, 128 binary strings, lack of support in JSON and XML, 114 BinaryProtocol encoding (Thrift), 118 Bitcask (storage engine), 72 crash recovery, 74 Bitcoin (cryptocurrency), 532 Byzantine fault tolerance, 305 concurrency bugs in exchanges, 233 bitmap indexes, 97 blockchains, 532 Byzantine fault tolerance, 305 blocking atomic commit, 359 Bloom (programming language), 504 Bloom filter (algorithm), 79, 466 BookKeeper (replicated log), 372 Bottled Water (change data capture), 455 bounded datasets, 430, 439, 553 (see also batch processing) bounded delays, 553 in networks, 285 process pauses, 298 broadcast hash joins, 409 Index | 561 brokerless messaging, 442 Brubeck (metrics aggregator), 442 BTM (transaction coordinator), 356 bulk synchronous parallel (BSP) model, 425 bursty network traffic patterns, 285 business data processing, 28, 90, 390 byte sequence, encoding data in, 112 Byzantine faults, 304-306, 307, 553 Byzantine fault-tolerant systems, 305, 532 Byzantine Generals Problem, 304 consensus algorithms and, 366 C caches, 89, 553 and materialized views, 101 as derived data, 386, 499-504 database as cache of transaction log, 460 in CPUs, 99, 338, 428 invalidation and maintenance, 452, 467 linearizability, 324 CAP theorem, 336-338, 554 Cascading (batch processing), 419, 427 hash joins, 409 workflows, 403 cascading failures, 9, 214, 281 Cascalog (batch processing), 60 Cassandra (database) column-family data model, 41, 99 compaction strategy, 79 compound primary key, 204 gossip protocol, 216 hash partitioning, 203-205 last-write-wins conflict resolution, 186, 292 leaderless replication, 177 linearizability, lack of, 335 log-structured storage, 78 multi-datacenter support, 184 partitioning scheme, 213 secondary indexes, 207 sloppy quorums, 184 cat (Unix tool), 391 causal context, 191 (see also causal dependencies) causal dependencies, 186-191 capturing, 191, 342, 494, 514 by total ordering, 493 causal ordering, 339 in transactions, 262 sending message to friends (example), 494 562 | Index causality, 554 causal ordering, 339-343 linearizability and, 342 total order consistent with, 344, 345 consistency with, 344-347 consistent snapshots, 340 happens-before relationship, 186 in serializable transactions, 262-265 mismatch with clocks, 292 ordering events to capture, 493 violations of, 165, 176, 292, 340 with synchronized clocks, 294 CEP (see complex event processing) certificate transparency, 532 chain replication, 155 linearizable reads, 351 change data capture, 160, 454 API support for change streams, 456 comparison to event sourcing, 457 implementing, 454 initial snapshot, 455 log compaction, 456 changelogs, 460 change data capture, 454 for operator state, 479 generating with triggers, 455 in stream joins, 474 log compaction, 456 maintaining derived state, 452 Chaos Monkey, 7, 280 checkpointing in batch processors, 422, 426 in high-performance computing, 275 in stream processors, 477, 523 chronicle data model, 458 circuit-switched networks, 284 circular buffers, 450 circular replication topologies, 175 clickstream data, analysis of, 404 clients calling services, 131 pushing state changes to, 512 request routing, 214 stateful and offline-capable, 170, 511 clocks, 287-299 atomic (caesium) clocks, 294, 295 confidence interval, 293-295 for global snapshots, 294 logical (see logical clocks) skew, 291-294, 334 slewing, 289 synchronization and accuracy, 289-291 synchronization using GPS, 287, 290, 294, 295 time-of-day versus monotonic clocks, 288 timestamping events, 471 cloud computing, 146, 275 need for service discovery, 372 network glitches, 279 shared resources, 284 single-machine reliability, 8 Cloudera Impala (see Impala) clustered indexes, 86 CODASYL model, 36 (see also network model) code generation with Avro, 127 with Thrift and Protocol Buffers, 118 with WSDL, 133 collaborative editing multi-leader replication and, 170 column families (Bigtable), 41, 99 column-oriented storage, 95-101 column compression, 97 distinction between column families and, 99 in batch processors, 428 Parquet, 96, 131, 414 sort order in, 99-100 vectorized processing, 99, 428 writing to, 101 comma-separated values (see CSV) command query responsibility segregation (CQRS), 462 commands (event sourcing), 459 commits (transactions), 222 atomic commit, 354-355 (see also atomicity; transactions) read committed isolation, 234 three-phase commit (3PC), 359 two-phase commit (2PC), 355-359 commutative operations, 246 compaction of changelogs, 456 (see also log compaction) for stream operator state, 479 of log-structured storage, 73 issues with, 84 size-tiered and leveled approaches, 79 CompactProtocol encoding (Thrift), 119 compare-and-set operations, 245, 327 implementing locks, 370 implementing uniqueness constraints, 331 implementing with total order broadcast, 350 relation to consensus, 335, 350, 352, 374 relation to transactions, 230 compatibility, 112, 128 calling services, 136 properties of encoding formats, 139 using databases, 129-131 using message-passing, 138 compensating transactions, 355, 461, 526 complex event processing (CEP), 465 complexity distilling in theoretical models, 310 hiding using abstraction, 27 of software systems, managing, 20 composing data systems (see unbundling data‐ bases) compute-intensive applications, 3, 275 concatenated indexes, 87 in Cassandra, 204 Concord (stream processor), 466 concurrency actor programming model, 138, 468 (see also message-passing) bugs from weak transaction isolation, 233 conflict resolution, 171, 174 detecting concurrent writes, 184-191 dual writes, problems with, 453 happens-before relationship, 186 in replicated systems, 161-191, 324-338 lost updates, 243 multi-version concurrency control (MVCC), 239 optimistic concurrency control, 261 ordering of operations, 326, 341 reducing, through event logs, 351, 462, 507 time and relativity, 187 transaction isolation, 225 write skew (transaction isolation), 246-251 conflict-free replicated datatypes (CRDTs), 174 conflicts conflict detection, 172 causal dependencies, 186, 342 in consensus algorithms, 368 in leaderless replication, 184 Index | 563 in log-based systems, 351, 521 in nonlinearizable systems, 343 in serializable snapshot isolation (SSI), 264 in two-phase commit, 357, 364 conflict resolution automatic conflict resolution, 174 by aborting transactions, 261 by apologizing, 527 convergence, 172-174 in leaderless systems, 190 last write wins (LWW), 186, 292 using atomic operations, 246 using custom logic, 173 determining what is a conflict, 174, 522 in multi-leader replication, 171-175 avoiding conflicts, 172 lost updates, 242-246 materializing, 251 relation to operation ordering, 339 write skew (transaction isolation), 246-251 congestion (networks) avoidance, 282 limiting accuracy of clocks, 293 queueing delays, 282 consensus, 321, 364-375, 554 algorithms, 366-368 preventing split brain, 367 safety and liveness properties, 365 using linearizable operations, 351 cost of, 369 distributed transactions, 352-375 in practice, 360-364 two-phase commit, 354-359 XA transactions, 361-364 impossibility of, 353 membership and coordination services, 370-373 relation to compare-and-set, 335, 350, 352, 374 relation to replication, 155, 349 relation to uniqueness constraints, 521 consistency, 224, 524 across different databases, 157, 452, 462, 492 causal, 339-348, 493 consistent prefix reads, 165-167 consistent snapshots, 156, 237-242, 294, 455, 500 (see also snapshots) 564 | Index crash recovery, 82 enforcing constraints (see constraints) eventual, 162, 322 (see also eventual consistency) in ACID transactions, 224, 529 in CAP theorem, 337 linearizability, 324-338 meanings of, 224 monotonic reads, 164-165 of secondary indexes, 231, 241, 354, 491, 500 ordering guarantees, 339-352 read-after-write, 162-164 sequential, 351 strong (see linearizability) timeliness and integrity, 524 using quorums, 181, 334 consistent hashing, 204 consistent prefix reads, 165 constraints (databases), 225, 248 asynchronously checked, 526 coordination avoidance, 527 ensuring idempotence, 519 in log-based systems, 521-524 across multiple partitions, 522 in two-phase commit, 355, 357 relation to consensus, 374, 521 relation to event ordering, 347 requiring linearizability, 330 Consul (service discovery), 372 consumers (message streams), 137, 440 backpressure, 441 consumer offsets in logs, 449 failures, 445, 449 fan-out, 11, 445, 448 load balancing, 444, 448 not keeping up with producers, 441, 450, 502 context switches, 14, 297 convergence (conflict resolution), 172-174, 322 coordination avoidance, 527 cross-datacenter, 168, 493 cross-partition ordering, 256, 294, 348, 523 services, 330, 370-373 coordinator (in 2PC), 356 failure, 358 in XA transactions, 361-364 recovery, 363 copy-on-write (B-trees), 82, 242 CORBA (Common Object Request Broker Architecture), 134 correctness, 6 auditability, 528-533 Byzantine fault tolerance, 305, 532 dealing with partial failures, 274 in log-based systems, 521-524 of algorithm within system model, 308 of compensating transactions, 355 of consensus, 368 of derived data, 497, 531 of immutable data, 461 of personal data, 535, 540 of time, 176, 289-295 of transactions, 225, 515, 529 timeliness and integrity, 524-528 corruption of data detecting, 519, 530-533 due to pathological memory access, 529 due to radiation, 305 due to split brain, 158, 302 due to weak transaction isolation, 233 formalization in consensus, 366 integrity as absence of, 524 network packets, 306 on disks, 227 preventing using write-ahead logs, 82 recovering from, 414, 460 Couchbase (database) durability, 89 hash partitioning, 203-204, 211 rebalancing, 213 request routing, 216 CouchDB (database) B-tree storage, 242 change feed, 456 document data model, 31 join support, 34 MapReduce support, 46, 400 replication, 170, 173 covering indexes, 86 CPUs cache coherence and memory barriers, 338 caching and pipelining, 99, 428 increasing parallelism, 43 CRDTs (see conflict-free replicated datatypes) CREATE INDEX statement (SQL), 85, 500 credit rating agencies, 535 Crunch (batch processing), 419, 427 hash joins, 409 sharded joins, 408 workflows, 403 cryptography defense against attackers, 306 end-to-end encryption and authentication, 519, 543 proving integrity of data, 532 CSS (Cascading Style Sheets), 44 CSV (comma-separated values), 70, 114, 396 Curator (ZooKeeper recipes), 330, 371 curl (Unix tool), 135, 397 cursor stability, 243 Cypher (query language), 52 comparison to SPARQL, 59 D data corruption (see corruption of data) data cubes, 102 data formats (see encoding) data integration, 490-498, 543 batch and stream processing, 494-498 lambda architecture, 497 maintaining derived state, 495 reprocessing data, 496 unifying, 498 by unbundling databases, 499-515 comparison to federated databases, 501 combining tools by deriving data, 490-494 derived data versus distributed transac‐ tions, 492 limits of total ordering, 493 ordering events to capture causality, 493 reasoning about dataflows, 491 need for, 385 data lakes, 415 data locality (see locality) data models, 27-64 graph-like models, 49-63 Datalog language, 60-63 property graphs, 50 RDF and triple-stores, 55-59 query languages, 42-48 relational model versus document model, 28-42 data protection regulations, 542 data systems, 3 about, 4 Index | 565 concerns when designing, 5 future of, 489-544 correctness, constraints, and integrity, 515-533 data integration, 490-498 unbundling databases, 499-515 heterogeneous, keeping in sync, 452 maintainability, 18-22 possible faults in, 221 reliability, 6-10 hardware faults, 7 human errors, 9 importance of, 10 software errors, 8 scalability, 10-18 unreliable clocks, 287-299 data warehousing, 91-95, 554 comparison to data lakes, 415 ETL (extract-transform-load), 92, 416, 452 keeping data systems in sync, 452 schema design, 93 slowly changing dimension (SCD), 476 data-intensive applications, 3 database triggers (see triggers) database-internal distributed transactions, 360, 364, 477 databases archival storage, 131 comparison of message brokers to, 443 dataflow through, 129 end-to-end argument for, 519-520 checking integrity, 531 inside-out, 504 (see also unbundling databases) output from batch workflows, 412 relation to event streams, 451-464 (see also changelogs) API support for change streams, 456, 506 change data capture, 454-457 event sourcing, 457-459 keeping systems in sync, 452-453 philosophy of immutable events, 459-464 unbundling, 499-515 composing data storage technologies, 499-504 designing applications around dataflow, 504-509 566 | Index observing derived state, 509-515 datacenters geographically distributed, 145, 164, 278, 493 multi-tenancy and shared resources, 284 network architecture, 276 network faults, 279 replication across multiple, 169 leaderless replication, 184 multi-leader replication, 168, 335 dataflow, 128-139, 504-509 correctness of dataflow systems, 525 differential, 504 message-passing, 136-139 reasoning about, 491 through databases, 129 through services, 131-136 dataflow engines, 421-423 comparison to stream processing, 464 directed acyclic graphs (DAG), 424 partitioning, approach to, 429 support for declarative queries, 427 Datalog (query language), 60-63 datatypes binary strings in XML and JSON, 114 conflict-free, 174 in Avro encodings, 122 in Thrift and Protocol Buffers, 121 numbers in XML and JSON, 114 Datomic (database) B-tree storage, 242 data model, 50, 57 Datalog query language, 60 excision (deleting data), 463 languages for transactions, 255 serial execution of transactions, 253 deadlocks detection, in two-phase commit (2PC), 364 in two-phase locking (2PL), 258 Debezium (change data capture), 455 declarative languages, 42, 554 Bloom, 504 CSS and XSL, 44 Cypher, 52 Datalog, 60 for batch processing, 427 recursive SQL queries, 53 relational algebra and SQL, 42 SPARQL, 59 delays bounded network delays, 285 bounded process pauses, 298 unbounded network delays, 282 unbounded process pauses, 296 deleting data, 463 denormalization (data representation), 34, 554 costs, 39 in derived data systems, 386 materialized views, 101 updating derived data, 228, 231, 490 versus normalization, 462 derived data, 386, 439, 554 from change data capture, 454 in event sourcing, 458-458 maintaining derived state through logs, 452-457, 459-463 observing, by subscribing to streams, 512 outputs of batch and stream processing, 495 through application code, 505 versus distributed transactions, 492 deterministic operations, 255, 274, 554 accidental nondeterminism, 423 and fault tolerance, 423, 426 and idempotence, 478, 492 computing derived data, 495, 526, 531 in state machine replication, 349, 452, 458 joins, 476 DevOps, 394 differential dataflow, 504 dimension tables, 94 dimensional modeling (see star schemas) directed acyclic graphs (DAGs), 424 dirty reads (transaction isolation), 234 dirty writes (transaction isolation), 235 discrimination, 534 disks (see hard disks) distributed actor frameworks, 138 distributed filesystems, 398-399 decoupling from query engines, 417 indiscriminately dumping data into, 415 use by MapReduce, 402 distributed systems, 273-312, 554 Byzantine faults, 304-306 cloud versus supercomputing, 275 detecting network faults, 280 faults and partial failures, 274-277 formalization of consensus, 365 impossibility results, 338, 353 issues with failover, 157 limitations of distributed transactions, 363 multi-datacenter, 169, 335 network problems, 277-286 quorums, relying on, 301 reasons for using, 145, 151 synchronized clocks, relying on, 291-295 system models, 306-310 use of clocks and time, 287 distributed transactions (see transactions) Django (web framework), 232 DNS (Domain Name System), 216, 372 Docker (container manager), 506 document data model, 30-42 comparison to relational model, 38-42 document references, 38, 403 document-oriented databases, 31 many-to-many relationships and joins, 36 multi-object transactions, need for, 231 versus relational model convergence of models, 41 data locality, 41 document-partitioned indexes, 206, 217, 411 domain-driven design (DDD), 457 DRBD (Distributed Replicated Block Device), 153 drift (clocks), 289 Drill (query engine), 93 Druid (database), 461 Dryad (dataflow engine), 421 dual writes, problems with, 452, 507 duplicates, suppression of, 517 (see also idempotence) using a unique ID, 518, 522 durability (transactions), 226, 554 duration (time), 287 measurement with monotonic clocks, 288 dynamic partitioning, 212 dynamically typed languages analogy to schema-on-read, 40 code generation and, 127 Dynamo-style databases (see leaderless replica‐ tion) E edges (in graphs), 49, 403 property graph model, 50 edit distance (full-text search), 88 effectively-once semantics, 476, 516 Index | 567 (see also exactly-once semantics) preservation of integrity, 525 elastic systems, 17 Elasticsearch (search server) document-partitioned indexes, 207 partition rebalancing, 211 percolator (stream search), 467 usage example, 4 use of Lucene, 79 ElephantDB (database), 413 Elm (programming language), 504, 512 encodings (data formats), 111-128 Avro, 122-127 binary variants of JSON and XML, 115 compatibility, 112 calling services, 136 using databases, 129-131 using message-passing, 138 defined, 113 JSON, XML, and CSV, 114 language-specific formats, 113 merits of schemas, 127 representations of data, 112 Thrift and Protocol Buffers, 117-121 end-to-end argument, 277, 519-520 checking integrity, 531 publish/subscribe streams, 512 enrichment (stream), 473 Enterprise JavaBeans (EJB), 134 entities (see vertices) epoch (consensus algorithms), 368 epoch (Unix timestamps), 288 equi-joins, 403 erasure coding (error correction), 398 Erlang OTP (actor framework), 139 error handling for network faults, 280 in transactions, 231 error-correcting codes, 277, 398 Esper (CEP engine), 466 etcd (coordination service), 370-373 linearizable operations, 333 locks and leader election, 330 quorum reads, 351 service discovery, 372 use of Raft algorithm, 349, 353 Ethereum (blockchain), 532 Ethernet (networks), 276, 278, 285 packet checksums, 306, 519 568 | Index Etherpad (collaborative editor), 170 ethics, 533-543 code of ethics and professional practice, 533 legislation and self-regulation, 542 predictive analytics, 533-536 amplifying bias, 534 feedback loops, 536 privacy and tracking, 536-543 consent and freedom of choice, 538 data as assets and power, 540 meaning of privacy, 539 surveillance, 537 respect, dignity, and agency, 543, 544 unintended consequences, 533, 536 ETL (extract-transform-load), 92, 405, 452, 554 use of Hadoop for, 416 event sourcing, 457-459 commands and events, 459 comparison to change data capture, 457 comparison to lambda architecture, 497 deriving current state from event log, 458 immutability and auditability, 459, 531 large, reliable data systems, 519, 526 Event Store (database), 458 event streams (see streams) events, 440 deciding on total order of, 493 deriving views from event log, 461 difference to commands, 459 event time versus processing time, 469, 477, 498 immutable, advantages of, 460, 531 ordering to capture causality, 493 reads as, 513 stragglers, 470, 498 timestamp of, in stream processing, 471 EventSource (browser API), 512 eventual consistency, 152, 162, 308, 322 (see also conflicts) and perpetual inconsistency, 525 evolvability, 21, 111 calling services, 136 graph-structured data, 52 of databases, 40, 129-131, 461, 497 of message-passing, 138 reprocessing data, 496, 498 schema evolution in Avro, 123 schema evolution in Thrift and Protocol Buffers, 120 schema-on-read, 39, 111, 128 exactly-once semantics, 360, 476, 516 parity with batch processors, 498 preservation of integrity, 525 exclusive mode (locks), 258 eXtended Architecture transactions (see XA transactions) extract-transform-load (see ETL) F Facebook Presto (query engine), 93 React, Flux, and Redux (user interface libra‐ ries), 512 social graphs, 49 Wormhole (change data capture), 455 fact tables, 93 failover, 157, 554 (see also leader-based replication) in leaderless replication, absence of, 178 leader election, 301, 348, 352 potential problems, 157 failures amplification by distributed transactions, 364, 495 failure detection, 280 automatic rebalancing causing cascading failures, 214 perfect failure detectors, 359 timeouts and unbounded delays, 282, 284 using ZooKeeper, 371 faults versus, 7 partial failures in distributed systems, 275-277, 310 fan-out (messaging systems), 11, 445 fault tolerance, 6-10, 555 abstractions for, 321 formalization in consensus, 365-369 use of replication, 367 human fault tolerance, 414 in batch processing, 406, 414, 422, 425 in log-based systems, 520, 524-526 in stream processing, 476-479 atomic commit, 477 idempotence, 478 maintaining derived state, 495 microbatching and checkpointing, 477 rebuilding state after a failure, 478 of distributed transactions, 362-364 transaction atomicity, 223, 354-361 faults, 6 Byzantine faults, 304-306 failures versus, 7 handled by transactions, 221 handling in supercomputers and cloud computing, 275 hardware, 7 in batch processing versus distributed data‐ bases, 417 in distributed systems, 274-277 introducing deliberately, 7, 280 network faults, 279-281 asymmetric faults, 300 detecting, 280 tolerance of, in multi-leader replication, 169 software errors, 8 tolerating (see fault tolerance) federated databases, 501 fence (CPU instruction), 338 fencing (preventing split brain), 158, 302-304 generating fencing tokens, 349, 370 properties of fencing tokens, 308 stream processors writing to databases, 478, 517 Fibre Channel (networks), 398 field tags (Thrift and Protocol Buffers), 119-121 file descriptors (Unix), 395 financial data, 460 Firebase (database), 456 Flink (processing framework), 421-423 dataflow APIs, 427 fault tolerance, 422, 477, 479 Gelly API (graph processing), 425 integration of batch and stream processing, 495, 498 machine learning, 428 query optimizer, 427 stream processing, 466 flow control, 282, 441, 555 FLP result (on consensus), 353 FlumeJava (dataflow library), 403, 427 followers, 152, 555 (see also leader-based replication) foreign keys, 38, 403 forward compatibility, 112 forward decay (algorithm), 16 Index | 569 Fossil (version control system), 463 shunning (deleting data), 463 FoundationDB (database) serializable transactions, 261, 265, 364 fractal trees, 83 full table scans, 403 full-text search, 555 and fuzzy indexes, 88 building search indexes, 411 Lucene storage engine, 79 functional reactive programming (FRP), 504 functional requirements, 22 futures (asynchronous operations), 135 fuzzy search (see similarity search) G garbage collection immutability and, 463 process pauses for, 14, 296-299, 301 (see also process pauses) genome analysis, 63, 429 geographically distributed datacenters, 145, 164, 278, 493 geospatial indexes, 87 Giraph (graph processing), 425 Git (version control system), 174, 342, 463 GitHub, postmortems, 157, 158, 309 global indexes (see term-partitioned indexes) GlusterFS (distributed filesystem), 398 GNU Coreutils (Linux), 394 GoldenGate (change data capture), 161, 170, 455 (see also Oracle) Google Bigtable (database) data model (see Bigtable data model) partitioning scheme, 199, 202 storage layout, 78 Chubby (lock service), 370 Cloud Dataflow (stream processor), 466, 477, 498 (see also Beam) Cloud Pub/Sub (messaging), 444, 448 Docs (collaborative editor), 170 Dremel (query engine), 93, 96 FlumeJava (dataflow library), 403, 427 GFS (distributed file system), 398 gRPC (RPC framework), 135 MapReduce (batch processing), 390 570 | Index (see also MapReduce) building search indexes, 411 task preemption, 418 Pregel (graph processing), 425 Spanner (see Spanner) TrueTime (clock API), 294 gossip protocol, 216 government use of data, 541 GPS (Global Positioning System) use for clock synchronization, 287, 290, 294, 295 GraphChi (graph processing), 426 graphs, 555 as data models, 49-63 example of graph-structured data, 49 property graphs, 50 RDF and triple-stores, 55-59 versus the network model, 60 processing and analysis, 424-426 fault tolerance, 425 Pregel processing model, 425 query languages Cypher, 52 Datalog, 60-63 recursive SQL queries, 53 SPARQL, 59-59 Gremlin (graph query language), 50 grep (Unix tool), 392 GROUP BY clause (SQL), 406 grouping records in MapReduce, 406 handling skew, 407 H Hadoop (data infrastructure) comparison to distributed databases, 390 comparison to MPP databases, 414-418 comparison to Unix, 413-414, 499 diverse processing models in ecosystem, 417 HDFS distributed filesystem (see HDFS) higher-level tools, 403 join algorithms, 403-410 (see also MapReduce) MapReduce (see MapReduce) YARN (see YARN) happens-before relationship, 340 capturing, 187 concurrency and, 186 hard disks access patterns, 84 detecting corruption, 519, 530 faults in, 7, 227 sequential write throughput, 75, 450 hardware faults, 7 hash indexes, 72-75 broadcast hash joins, 409 partitioned hash joins, 409 hash partitioning, 203-205, 217 consistent hashing, 204 problems with hash mod N, 210 range queries, 204 suitable hash functions, 203 with fixed number of partitions, 210 HAWQ (database), 428 HBase (database) bug due to lack of fencing, 302 bulk loading, 413 column-family data model, 41, 99 dynamic partitioning, 212 key-range partitioning, 202 log-structured storage, 78 request routing, 216 size-tiered compaction, 79 use of HDFS, 417 use of ZooKeeper, 370 HDFS (Hadoop Distributed File System), 398-399 (see also distributed filesystems) checking data integrity, 530 decoupling from query engines, 417 indiscriminately dumping data into, 415 metadata about datasets, 410 NameNode, 398 use by Flink, 479 use by HBase, 212 use by MapReduce, 402 HdrHistogram (numerical library), 16 head (Unix tool), 392 head vertex (property graphs), 51 head-of-line blocking, 15 heap files (databases), 86 Helix (cluster manager), 216 heterogeneous distributed transactions, 360, 364 heuristic decisions (in 2PC), 363 Hibernate (object-relational mapper), 30 hierarchical model, 36 high availability (see fault tolerance) high-frequency trading, 290, 299 high-performance computing (HPC), 275 hinted handoff, 183 histograms, 16 Hive (query engine), 419, 427 for data warehouses, 93 HCatalog and metastore, 410 map-side joins, 409 query optimizer, 427 skewed joins, 408 workflows, 403 Hollerith machines, 390 hopping windows (stream processing), 472 (see also windows) horizontal scaling (see scaling out) HornetQ (messaging), 137, 444 distributed transaction support, 361 hot spots, 201 due to celebrities, 205 for time-series data, 203 in batch processing, 407 relieving, 205 hot standbys (see leader-based replication) HTTP, use in APIs (see services) human errors, 9, 279, 414 HyperDex (database), 88 HyperLogLog (algorithm), 466 I I/O operations, waiting for, 297 IBM DB2 (database) distributed transaction support, 361 recursive query support, 54 serializable isolation, 242, 257 XML and JSON support, 30, 42 electromechanical card-sorting machines, 390 IMS (database), 36 imperative query APIs, 46 InfoSphere Streams (CEP engine), 466 MQ (messaging), 444 distributed transaction support, 361 System R (database), 222 WebSphere (messaging), 137 idempotence, 134, 478, 555 by giving operations unique IDs, 518, 522 idempotent operations, 517 immutability advantages of, 460, 531 Index | 571 deriving state from event log, 459-464 for crash recovery, 75 in B-trees, 82, 242 in event sourcing, 457 inputs to Unix commands, 397 limitations of, 463 Impala (query engine) for data warehouses, 93 hash joins, 409 native code generation, 428 use of HDFS, 417 impedance mismatch, 29 imperative languages, 42 setting element styles (example), 45 in doubt (transaction status), 358 holding locks, 362 orphaned transactions, 363 in-memory databases, 88 durability, 227 serial transaction execution, 253 incidents cascading failures, 9 crashes due to leap seconds, 290 data corruption and financial losses due to concurrency bugs, 233 data corruption on hard disks, 227 data loss due to last-write-wins, 173, 292 data on disks unreadable, 309 deleted items reappearing, 174 disclosure of sensitive data due to primary key reuse, 157 errors in transaction serializability, 529 gigabit network interface with 1 Kb/s throughput, 311 network faults, 279 network interface dropping only inbound packets, 279 network partitions and whole-datacenter failures, 275 poor handling of network faults, 280 sending message to ex-partner, 494 sharks biting undersea cables, 279 split brain due to 1-minute packet delay, 158, 279 vibrations in server rack, 14 violation of uniqueness constraint, 529 indexes, 71, 555 and snapshot isolation, 241 as derived data, 386, 499-504 572 | Index B-trees, 79-83 building in batch processes, 411 clustered, 86 comparison of B-trees and LSM-trees, 83-85 concatenated, 87 covering (with included columns), 86 creating, 500 full-text search, 88 geospatial, 87 hash, 72-75 index-range locking, 260 multi-column, 87 partitioning and secondary indexes, 206-209, 217 secondary, 85 (see also secondary indexes) problems with dual writes, 452, 491 SSTables and LSM-trees, 76-79 updating when data changes, 452, 467 Industrial Revolution, 541 InfiniBand (networks), 285 InfiniteGraph (database), 50 InnoDB (storage engine) clustered index on primary key, 86 not preventing lost updates, 245 preventing write skew, 248, 257 serializable isolation, 257 snapshot isolation support, 239 inside-out databases, 504 (see also unbundling databases) integrating different data systems (see data integration) integrity, 524 coordination-avoiding data systems, 528 correctness of dataflow systems, 525 in consensus formalization, 365 integrity checks, 530 (see also auditing) end-to-end, 519, 531 use of snapshot isolation, 238 maintaining despite software bugs, 529 Interface Definition Language (IDL), 117, 122 intermediate state, materialization of, 420-423 internet services, systems for implementing, 275 invariants, 225 (see also constraints) inversion of control, 396 IP (Internet Protocol) unreliability of, 277 ISDN (Integrated Services Digital Network), 284 isolation (in transactions), 225, 228, 555 correctness and, 515 for single-object writes, 230 serializability, 251-266 actual serial execution, 252-256 serializable snapshot isolation (SSI), 261-266 two-phase locking (2PL), 257-261 violating, 228 weak isolation levels, 233-251 preventing lost updates, 242-246 read committed, 234-237 snapshot isolation, 237-242 iterative processing, 424-426 J Java Database Connectivity (JDBC) distributed transaction support, 361 network drivers, 128 Java Enterprise Edition (EE), 134, 356, 361 Java Message Service (JMS), 444 (see also messaging systems) comparison to log-based messaging, 448, 451 distributed transaction support, 361 message ordering, 446 Java Transaction API (JTA), 355, 361 Java Virtual Machine (JVM) bytecode generation, 428 garbage collection pauses, 296 process reuse in batch processors, 422 JavaScript in MapReduce querying, 46 setting element styles (example), 45 use in advanced queries, 48 Jena (RDF framework), 57 Jepsen (fault tolerance testing), 515 jitter (network delay), 284 joins, 555 by index lookup, 403 expressing as relational operators, 427 in relational and document databases, 34 MapReduce map-side joins, 408-410 broadcast hash joins, 409 merge joins, 410 partitioned hash joins, 409 MapReduce reduce-side joins, 403-408 handling skew, 407 sort-merge joins, 405 parallel execution of, 415 secondary indexes and, 85 stream joins, 472-476 stream-stream join, 473 stream-table join, 473 table-table join, 474 time-dependence of, 475 support in document databases, 42 JOTM (transaction coordinator), 356 JSON Avro schema representation, 122 binary variants, 115 for application data, issues with, 114 in relational databases, 30, 42 representing a résumé (example), 31 Juttle (query language), 504 K k-nearest neighbors, 429 Kafka (messaging), 137, 448 Kafka Connect (database integration), 457, 461 Kafka Streams (stream processor), 466, 467 fault tolerance, 479 leader-based replication, 153 log compaction, 456, 467 message offsets, 447, 478 request routing, 216 transaction support, 477 usage example, 4 Ketama (partitioning library), 213 key-value stores, 70 as batch process output, 412 hash indexes, 72-75 in-memory, 89 partitioning, 201-205 by hash of key, 203, 217 by key range, 202, 217 dynamic partitioning, 212 skew and hot spots, 205 Kryo (Java), 113 Kubernetes (cluster manager), 418, 506 L lambda architecture, 497 Lamport timestamps, 345 Index | 573 Large Hadron Collider (LHC), 64 last write wins (LWW), 173, 334 discarding concurrent writes, 186 problems with, 292 prone to lost updates, 246 late binding, 396 latency instability under two-phase locking, 259 network latency and resource utilization, 286 response time versus, 14 tail latency, 15, 207 leader-based replication, 152-161 (see also replication) failover, 157, 301 handling node outages, 156 implementation of replication logs change data capture, 454-457 (see also changelogs) statement-based, 158 trigger-based replication, 161 write-ahead log (WAL) shipping, 159 linearizability of operations, 333 locking and leader election, 330 log sequence number, 156, 449 read-scaling architecture, 161 relation to consensus, 367 setting up new followers, 155 synchronous versus asynchronous, 153-155 leaderless replication, 177-191 (see also replication) detecting concurrent writes, 184-191 capturing happens-before relationship, 187 happens-before relationship and concur‐ rency, 186 last write wins, 186 merging concurrently written values, 190 version vectors, 191 multi-datacenter, 184 quorums, 179-182 consistency limitations, 181-183, 334 sloppy quorums and hinted handoff, 183 read repair and anti-entropy, 178 leap seconds, 8, 290 in time-of-day clocks, 288 leases, 295 implementation with ZooKeeper, 370 574 | Index need for fencing, 302 ledgers, 460 distributed ledger technologies, 532 legacy systems, maintenance of, 18 less (Unix tool), 397 LevelDB (storage engine), 78 leveled compaction, 79 Levenshtein automata, 88 limping (partial failure), 311 linearizability, 324-338, 555 cost of, 335-338 CAP theorem, 336 memory on multi-core CPUs, 338 definition, 325-329 implementing with total order broadcast, 350 in ZooKeeper, 370 of derived data systems, 492, 524 avoiding coordination, 527 of different replication methods, 332-335 using quorums, 334 relying on, 330-332 constraints and uniqueness, 330 cross-channel timing dependencies, 331 locking and leader election, 330 stronger than causal consistency, 342 using to implement total order broadcast, 351 versus serializability, 329 LinkedIn Azkaban (workflow scheduler), 402 Databus (change data capture), 161, 455 Espresso (database), 31, 126, 130, 153, 216 Helix (cluster manager) (see Helix) profile (example), 30 reference to company entity (example), 34 Rest.li (RPC framework), 135 Voldemort (database) (see Voldemort) Linux, leap second bug, 8, 290 liveness properties, 308 LMDB (storage engine), 82, 242 load approaches to coping with, 17 describing, 11 load testing, 16 load balancing (messaging), 444 local indexes (see document-partitioned indexes) locality (data access), 32, 41, 555 in batch processing, 400, 405, 421 in stateful clients, 170, 511 in stream processing, 474, 478, 508, 522 location transparency, 134 in the actor model, 138 locks, 556 deadlock, 258 distributed locking, 301-304, 330 fencing tokens, 303 implementation with ZooKeeper, 370 relation to consensus, 374 for transaction isolation in snapshot isolation, 239 in two-phase locking (2PL), 257-261 making operations atomic, 243 performance, 258 preventing dirty writes, 236 preventing phantoms with index-range locks, 260, 265 read locks (shared mode), 236, 258 shared mode and exclusive mode, 258 in two-phase commit (2PC) deadlock detection, 364 in-doubt transactions holding locks, 362 materializing conflicts with, 251 preventing lost updates by explicit locking, 244 log sequence number, 156, 449 logic programming languages, 504 logical clocks, 293, 343, 494 for read-after-write consistency, 164 logical logs, 160 logs (data structure), 71, 556 advantages of immutability, 460 compaction, 73, 79, 456, 460 for stream operator state, 479 creating using total order broadcast, 349 implementing uniqueness constraints, 522 log-based messaging, 446-451 comparison to traditional messaging, 448, 451 consumer offsets, 449 disk space usage, 450 replaying old messages, 451, 496, 498 slow consumers, 450 using logs for message storage, 447 log-structured storage, 71-79 log-structured merge tree (see LSMtrees) replication, 152, 158-161 change data capture, 454-457 (see also changelogs) coordination with snapshot, 156 logical (row-based) replication, 160 statement-based replication, 158 trigger-based replication, 161 write-ahead log (WAL) shipping, 159 scalability limits, 493 loose coupling, 396, 419, 502 lost updates (see updates) LSM-trees (indexes), 78-79 comparison to B-trees, 83-85 Lucene (storage engine), 79 building indexes in batch processes, 411 similarity search, 88 Luigi (workflow scheduler), 402 LWW (see last write wins) M machine learning ethical considerations, 534 (see also ethics) iterative processing, 424 models derived from training data, 505 statistical and numerical algorithms, 428 MADlib (machine learning toolkit), 428 magic scaling sauce, 18 Mahout (machine learning toolkit), 428 maintainability, 18-22, 489 defined, 23 design principles for software systems, 19 evolvability (see evolvability) operability, 19 simplicity and managing complexity, 20 many-to-many relationships in document model versus relational model, 39 modeling as graphs, 49 many-to-one and many-to-many relationships, 33-36 many-to-one relationships, 34 MapReduce (batch processing), 390, 399-400 accessing external services within job, 404, 412 comparison to distributed databases designing for frequent faults, 417 diversity of processing models, 416 diversity of storage, 415 Index | 575 comparison to stream processing, 464 comparison to Unix, 413-414 disadvantages and limitations of, 419 fault tolerance, 406, 414, 422 higher-level tools, 403, 426 implementation in Hadoop, 400-403 the shuffle, 402 implementation in MongoDB, 46-48 machine learning, 428 map-side processing, 408-410 broadcast hash joins, 409 merge joins, 410 partitioned hash joins, 409 mapper and reducer functions, 399 materialization of intermediate state, 419-423 output of batch workflows, 411-413 building search indexes, 411 key-value stores, 412 reduce-side processing, 403-408 analysis of user activity events (exam‐ ple), 404 grouping records by same key, 406 handling skew, 407 sort-merge joins, 405 workflows, 402 marshalling (see encoding) massively parallel processing (MPP), 216 comparison to composing storage technolo‐ gies, 502 comparison to Hadoop, 414-418, 428 master-master replication (see multi-leader replication) master-slave replication (see leader-based repli‐ cation) materialization, 556 aggregate values, 101 conflicts, 251 intermediate state (batch processing), 420-423 materialized views, 101 as derived data, 386, 499-504 maintaining, using stream processing, 467, 475 Maven (Java build tool), 428 Maxwell (change data capture), 455 mean, 14 media monitoring, 467 median, 14 576 | Index meeting room booking (example), 249, 259, 521 membership services, 372 Memcached (caching server), 4, 89 memory in-memory databases, 88 durability, 227 serial transaction execution, 253 in-memory representation of data, 112 random bit-flips in, 529 use by indexes, 72, 77 memory barrier (CPU instruction), 338 MemSQL (database) in-memory storage, 89 read committed isolation, 236 memtable (in LSM-trees), 78 Mercurial (version control system), 463 merge joins, MapReduce map-side, 410 mergeable persistent data structures, 174 merging sorted files, 76, 402, 405 Merkle trees, 532 Mesos (cluster manager), 418, 506 message brokers (see messaging systems) message-passing, 136-139 advantages over direct RPC, 137 distributed actor frameworks, 138 evolvability, 138 MessagePack (encoding format), 116 messages exactly-once semantics, 360, 476 loss of, 442 using total order broadcast, 348 messaging systems, 440-451 (see also streams) backpressure, buffering, or dropping mes‐ sages, 441 brokerless messaging, 442 event logs, 446-451 comparison to traditional messaging, 448, 451 consumer offsets, 449 replaying old messages, 451, 496, 498 slow consumers, 450 message brokers, 443-446 acknowledgements and redelivery, 445 comparison to event logs, 448, 451 multiple consumers of same topic, 444 reliability, 442 uniqueness in log-based messaging, 522 Meteor (web framework), 456 microbatching, 477, 495 microservices, 132 (see also services) causal dependencies across services, 493 loose coupling, 502 relation to batch/stream processors, 389, 508 Microsoft Azure Service Bus (messaging), 444 Azure Storage, 155, 398 Azure Stream Analytics, 466 DCOM (Distributed Component Object Model), 134 MSDTC (transaction coordinator), 356 Orleans (see Orleans) SQL Server (see SQL Server) migrating (rewriting) data, 40, 130, 461, 497 modulus operator (%), 210 MongoDB (database) aggregation pipeline, 48 atomic operations, 243 BSON, 41 document data model, 31 hash partitioning (sharding), 203-204 key-range partitioning, 202 lack of join support, 34, 42 leader-based replication, 153 MapReduce support, 46, 400 oplog parsing, 455, 456 partition splitting, 212 request routing, 216 secondary indexes, 207 Mongoriver (change data capture), 455 monitoring, 10, 19 monotonic clocks, 288 monotonic reads, 164 MPP (see massively parallel processing) MSMQ (messaging), 361 multi-column indexes, 87 multi-leader replication, 168-177 (see also replication) handling write conflicts, 171 conflict avoidance, 172 converging toward a consistent state, 172 custom conflict resolution logic, 173 determining what is a conflict, 174 linearizability, lack of, 333 replication topologies, 175-177 use cases, 168 clients with offline operation, 170 collaborative editing, 170 multi-datacenter replication, 168, 335 multi-object transactions, 228 need for, 231 Multi-Paxos (total order broadcast), 367 multi-table index cluster tables (Oracle), 41 multi-tenancy, 284 multi-version concurrency control (MVCC), 239, 266 detecting stale MVCC reads, 263 indexes and snapshot isolation, 241 mutual exclusion, 261 (see also locks) MySQL (database) binlog coordinates, 156 binlog parsing for change data capture, 455 circular replication topology, 175 consistent snapshots, 156 distributed transaction support, 361 InnoDB storage engine (see InnoDB) JSON support, 30, 42 leader-based replication, 153 performance of XA transactions, 360 row-based replication, 160 schema changes in, 40 snapshot isolation support, 242 (see also InnoDB) statement-based replication, 159 Tungsten Replicator (multi-leader replica‐ tion), 170 conflict detection, 177 N nanomsg (messaging library), 442 Narayana (transaction coordinator), 356 NATS (messaging), 137 near-real-time (nearline) processing, 390 (see also stream processing) Neo4j (database) Cypher query language, 52 graph data model, 50 Nephele (dataflow engine), 421 netcat (Unix tool), 397 Netflix Chaos Monkey, 7, 280 Network Attached Storage (NAS), 146, 398 network model, 36 Index | 577 graph databases versus, 60 imperative query APIs, 46 Network Time Protocol (see NTP) networks congestion and queueing, 282 datacenter network topologies, 276 faults (see faults) linearizability and network delays, 338 network partitions, 279, 337 timeouts and unbounded delays, 281 next-key locking, 260 nodes (in graphs) (see vertices) nodes (processes), 556 handling outages in leader-based replica‐ tion, 156 system models for failure, 307 noisy neighbors, 284 nonblocking atomic commit, 359 nondeterministic operations accidental nondeterminism, 423 partial failures in distributed systems, 275 nonfunctional requirements, 22 nonrepeatable reads, 238 (see also read skew) normalization (data representation), 33, 556 executing joins, 39, 42, 403 foreign key references, 231 in systems of record, 386 versus denormalization, 462 NoSQL, 29, 499 transactions and, 223 Notation3 (N3), 56 npm (package manager), 428 NTP (Network Time Protocol), 287 accuracy, 289, 293 adjustments to monotonic clocks, 289 multiple server addresses, 306 numbers, in XML and JSON encodings, 114 O object-relational mapping (ORM) frameworks, 30 error handling and aborted transactions, 232 unsafe read-modify-write cycle code, 244 object-relational mismatch, 29 observer pattern, 506 offline systems, 390 (see also batch processing) 578 | Index stateful, offline-capable clients, 170, 511 offline-first applications, 511 offsets consumer offsets in partitioned logs, 449 messages in partitioned logs, 447 OLAP (online analytic processing), 91, 556 data cubes, 102 OLTP (online transaction processing), 90, 556 analytics queries versus, 411 workload characteristics, 253 one-to-many relationships, 30 JSON representation, 32 online systems, 389 (see also services) Oozie (workflow scheduler), 402 OpenAPI (service definition format), 133 OpenStack Nova (cloud infrastructure) use of ZooKeeper, 370 Swift (object storage), 398 operability, 19 operating systems versus databases, 499 operation identifiers, 518, 522 operational transformation, 174 operators, 421 flow of data between, 424 in stream processing, 464 optimistic concurrency control, 261 Oracle (database) distributed transaction support, 361 GoldenGate (change data capture), 161, 170, 455 lack of serializability, 226 leader-based replication, 153 multi-table index cluster tables, 41 not preventing write skew, 248 partitioned indexes, 209 PL/SQL language, 255 preventing lost updates, 245 read committed isolation, 236 Real Application Clusters (RAC), 330 recursive query support, 54 snapshot isolation support, 239, 242 TimesTen (in-memory database), 89 WAL-based replication, 160 XML support, 30 ordering, 339-352 by sequence numbers, 343-348 causal ordering, 339-343 partial order, 341 limits of total ordering, 493 total order broadcast, 348-352 Orleans (actor framework), 139 outliers (response time), 14 Oz (programming language), 504 P package managers, 428, 505 packet switching, 285 packets corruption of, 306 sending via UDP, 442 PageRank (algorithm), 49, 424 paging (see virtual memory) ParAccel (database), 93 parallel databases (see massively parallel pro‐ cessing) parallel execution of graph analysis algorithms, 426 queries in MPP databases, 216 Parquet (data format), 96, 131 (see also column-oriented storage) use in Hadoop, 414 partial failures, 275, 310 limping, 311 partial order, 341 partitioning, 199-218, 556 and replication, 200 in batch processing, 429 multi-partition operations, 514 enforcing constraints, 522 secondary index maintenance, 495 of key-value data, 201-205 by key range, 202 skew and hot spots, 205 rebalancing partitions, 209-214 automatic or manual rebalancing, 213 problems with hash mod N, 210 using dynamic partitioning, 212 using fixed number of partitions, 210 using N partitions per node, 212 replication and, 147 request routing, 214-216 secondary indexes, 206-209 document-based partitioning, 206 term-based partitioning, 208 serial execution of transactions and, 255 Paxos (consensus algorithm), 366 ballot number, 368 Multi-Paxos (total order broadcast), 367 percentiles, 14, 556 calculating efficiently, 16 importance of high percentiles, 16 use in service level agreements (SLAs), 15 Percona XtraBackup (MySQL tool), 156 performance describing, 13 of distributed transactions, 360 of in-memory databases, 89 of linearizability, 338 of multi-leader replication, 169 perpetual inconsistency, 525 pessimistic concurrency control, 261 phantoms (transaction isolation), 250 materializing conflicts, 251 preventing, in serializability, 259 physical clocks (see clocks) pickle (Python), 113 Pig (dataflow language), 419, 427 replicated joins, 409 skewed joins, 407 workflows, 403 Pinball (workflow scheduler), 402 pipelined execution, 423 in Unix, 394 point in time, 287 polyglot persistence, 29 polystores, 501 PostgreSQL (database) BDR (multi-leader replication), 170 causal ordering of writes, 177 Bottled Water (change data capture), 455 Bucardo (trigger-based replication), 161, 173 distributed transaction support, 361 foreign data wrappers, 501 full text search support, 490 leader-based replication, 153 log sequence number, 156 MVCC implementation, 239, 241 PL/pgSQL language, 255 PostGIS geospatial indexes, 87 preventing lost updates, 245 preventing write skew, 248, 261 read committed isolation, 236 recursive query support, 54 representing graphs, 51 Index | 579 serializable snapshot isolation (SSI), 261 snapshot isolation support, 239, 242 WAL-based replication, 160 XML and JSON support, 30, 42 pre-splitting, 212 Precision Time Protocol (PTP), 290 predicate locks, 259 predictive analytics, 533-536 amplifying bias, 534 ethics of (see ethics) feedback loops, 536 preemption of datacenter resources, 418 of threads, 298 Pregel processing model, 425 primary keys, 85, 556 compound primary key (Cassandra), 204 primary-secondary replication (see leaderbased replication) privacy, 536-543 consent and freedom of choice, 538 data as assets and power, 540 deleting data, 463 ethical considerations (see ethics) legislation and self-regulation, 542 meaning of, 539 surveillance, 537 tracking behavioral data, 536 probabilistic algorithms, 16, 466 process pauses, 295-299 processing time (of events), 469 producers (message streams), 440 programming languages dataflow languages, 504 for stored procedures, 255 functional reactive programming (FRP), 504 logic programming, 504 Prolog (language), 61 (see also Datalog) promises (asynchronous operations), 135 property graphs, 50 Cypher query language, 52 Protocol Buffers (data format), 117-121 field tags and schema evolution, 120 provenance of data, 531 publish/subscribe model, 441 publishers (message streams), 440 punch card tabulating machines, 390 580 | Index pure functions, 48 putting computation near data, 400 Q Qpid (messaging), 444 quality of service (QoS), 285 Quantcast File System (distributed filesystem), 398 query languages, 42-48 aggregation pipeline, 48 CSS and XSL, 44 Cypher, 52 Datalog, 60 Juttle, 504 MapReduce querying, 46-48 recursive SQL queries, 53 relational algebra and SQL, 42 SPARQL, 59 query optimizers, 37, 427 queueing delays (networks), 282 head-of-line blocking, 15 latency and response time, 14 queues (messaging), 137 quorums, 179-182, 556 for leaderless replication, 179 in consensus algorithms, 368 limitations of consistency, 181-183, 334 making decisions in distributed systems, 301 monitoring staleness, 182 multi-datacenter replication, 184 relying on durability, 309 sloppy quorums and hinted handoff, 183 R R-trees (indexes), 87 RabbitMQ (messaging), 137, 444 leader-based replication, 153 race conditions, 225 (see also concurrency) avoiding with linearizability, 331 caused by dual writes, 452 dirty writes, 235 in counter increments, 235 lost updates, 242-246 preventing with event logs, 462, 507 preventing with serializable isolation, 252 write skew, 246-251 Raft (consensus algorithm), 366 sensitivity to network problems, 369 term number, 368 use in etcd, 353 RAID (Redundant Array of Independent Disks), 7, 398 railways, schema migration on, 496 RAMCloud (in-memory storage), 89 ranking algorithms, 424 RDF (Resource Description Framework), 57 querying with SPARQL, 59 RDMA (Remote Direct Memory Access), 276 read committed isolation level, 234-237 implementing, 236 multi-version concurrency control (MVCC), 239 no dirty reads, 234 no dirty writes, 235 read path (derived data), 509 read repair (leaderless replication), 178 for linearizability, 335 read replicas (see leader-based replication) read skew (transaction isolation), 238, 266 as violation of causality, 340 read-after-write consistency, 163, 524 cross-device, 164 read-modify-write cycle, 243 read-scaling architecture, 161 reads as events, 513 real-time collaborative editing, 170 near-real-time processing, 390 (see also stream processing) publish/subscribe dataflow, 513 response time guarantees, 298 time-of-day clocks, 288 rebalancing partitions, 209-214, 556 (see also partitioning) automatic or manual rebalancing, 213 dynamic partitioning, 212 fixed number of partitions, 210 fixed number of partitions per node, 212 problems with hash mod N, 210 recency guarantee, 324 recommendation engines batch process outputs, 412 batch workflows, 403, 420 iterative processing, 424 statistical and numerical algorithms, 428 records, 399 events in stream processing, 440 recursive common table expressions (SQL), 54 redelivery (messaging), 445 Redis (database) atomic operations, 243 durability, 89 Lua scripting, 255 single-threaded execution, 253 usage example, 4 redundancy hardware components, 7 of derived data, 386 (see also derived data) Reed–Solomon codes (error correction), 398 refactoring, 22 (see also evolvability) regions (partitioning), 199 register (data structure), 325 relational data model, 28-42 comparison to document model, 38-42 graph queries in SQL, 53 in-memory databases with, 89 many-to-one and many-to-many relation‐ ships, 33 multi-object transactions, need for, 231 NoSQL as alternative to, 29 object-relational mismatch, 29 relational algebra and SQL, 42 versus document model convergence of models, 41 data locality, 41 relational databases eventual consistency, 162 history, 28 leader-based replication, 153 logical logs, 160 philosophy compared to Unix, 499, 501 schema changes, 40, 111, 130 statement-based replication, 158 use of B-tree indexes, 80 relationships (see edges) reliability, 6-10, 489 building a reliable system from unreliable components, 276 defined, 6, 22 hardware faults, 7 human errors, 9 importance of, 10 of messaging systems, 442 Index | 581 software errors, 8 Remote Method Invocation (Java RMI), 134 remote procedure calls (RPCs), 134-136 (see also services) based on futures, 135 data encoding and evolution, 136 issues with, 134 using Avro, 126, 135 using Thrift, 135 versus message brokers, 137 repeatable reads (transaction isolation), 242 replicas, 152 replication, 151-193, 556 and durability, 227 chain replication, 155 conflict resolution and, 246 consistency properties, 161-167 consistent prefix reads, 165 monotonic reads, 164 reading your own writes, 162 in distributed filesystems, 398 leaderless, 177-191 detecting concurrent writes, 184-191 limitations of quorum consistency, 181-183, 334 sloppy quorums and hinted handoff, 183 monitoring staleness, 182 multi-leader, 168-177 across multiple datacenters, 168, 335 handling write conflicts, 171-175 replication topologies, 175-177 partitioning and, 147, 200 reasons for using, 145, 151 single-leader, 152-161 failover, 157 implementation of replication logs, 158-161 relation to consensus, 367 setting up new followers, 155 synchronous versus asynchronous, 153-155 state machine replication, 349, 452 using erasure coding, 398 with heterogeneous data systems, 453 replication logs (see logs) reprocessing data, 496, 498 (see also evolvability) from log-based messaging, 451 request routing, 214-216 582 | Index approaches to, 214 parallel query execution, 216 resilient systems, 6 (see also fault tolerance) response time as performance metric for services, 13, 389 guarantees on, 298 latency versus, 14 mean and percentiles, 14 user experience, 15 responsibility and accountability, 535 REST (Representational State Transfer), 133 (see also services) RethinkDB (database) document data model, 31 dynamic partitioning, 212 join support, 34, 42 key-range partitioning, 202 leader-based replication, 153 subscribing to changes, 456 Riak (database) Bitcask storage engine, 72 CRDTs, 174, 191 dotted version vectors, 191 gossip protocol, 216 hash partitioning, 203-204, 211 last-write-wins conflict resolution, 186 leaderless replication, 177 LevelDB storage engine, 78 linearizability, lack of, 335 multi-datacenter support, 184 preventing lost updates across replicas, 246 rebalancing, 213 search feature, 209 secondary indexes, 207 siblings (concurrently written values), 190 sloppy quorums, 184 ring buffers, 450 Ripple (cryptocurrency), 532 rockets, 10, 36, 305 RocksDB (storage engine), 78 leveled compaction, 79 rollbacks (transactions), 222 rolling upgrades, 8, 112 routing (see request routing) row-oriented storage, 96 row-based replication, 160 rowhammer (memory corruption), 529 RPCs (see remote procedure calls) Rubygems (package manager), 428 rules (Datalog), 61 S safety and liveness properties, 308 in consensus algorithms, 366 in transactions, 222 sagas (see compensating transactions) Samza (stream processor), 466, 467 fault tolerance, 479 streaming SQL support, 466 sandboxes, 9 SAP HANA (database), 93 scalability, 10-18, 489 approaches for coping with load, 17 defined, 22 describing load, 11 describing performance, 13 partitioning and, 199 replication and, 161 scaling up versus scaling out, 146 scaling out, 17, 146 (see also shared-nothing architecture) scaling up, 17, 146 scatter/gather approach, querying partitioned databases, 207 SCD (slowly changing dimension), 476 schema-on-read, 39 comparison to evolvable schema, 128 in distributed filesystems, 415 schema-on-write, 39 schemaless databases (see schema-on-read) schemas, 557 Avro, 122-127 reader determining writer’s schema, 125 schema evolution, 123 dynamically generated, 126 evolution of, 496 affecting application code, 111 compatibility checking, 126 in databases, 129-131 in message-passing, 138 in service calls, 136 flexibility in document model, 39 for analytics, 93-95 for JSON and XML, 115 merits of, 127 schema migration on railways, 496 Thrift and Protocol Buffers, 117-121 schema evolution, 120 traditional approach to design, fallacy in, 462 searches building search indexes in batch processes, 411 k-nearest neighbors, 429 on streams, 467 partitioned secondary indexes, 206 secondaries (see leader-based replication) secondary indexes, 85, 557 partitioning, 206-209, 217 document-partitioned, 206 index maintenance, 495 term-partitioned, 208 problems with dual writes, 452, 491 updating, transaction isolation and, 231 secondary sorts, 405 sed (Unix tool), 392 self-describing files, 127 self-joins, 480 self-validating systems, 530 semantic web, 57 semi-synchronous replication, 154 sequence number ordering, 343-348 generators, 294, 344 insufficiency for enforcing constraints, 347 Lamport timestamps, 345 use of timestamps, 291, 295, 345 sequential consistency, 351 serializability, 225, 233, 251-266, 557 linearizability versus, 329 pessimistic versus optimistic concurrency control, 261 serial execution, 252-256 partitioning, 255 using stored procedures, 253, 349 serializable snapshot isolation (SSI), 261-266 detecting stale MVCC reads, 263 detecting writes that affect prior reads, 264 distributed execution, 265, 364 performance of SSI, 265 preventing write skew, 262-265 two-phase locking (2PL), 257-261 index-range locks, 260 performance, 258 Serializable (Java), 113 Index | 583 serialization, 113 (see also encoding) service discovery, 135, 214, 372 using DNS, 216, 372 service level agreements (SLAs), 15 service-oriented architecture (SOA), 132 (see also services) services, 131-136 microservices, 132 causal dependencies across services, 493 loose coupling, 502 relation to batch/stream processors, 389, 508 remote procedure calls (RPCs), 134-136 issues with, 134 similarity to databases, 132 web services, 132, 135 session windows (stream processing), 472 (see also windows) sessionization, 407 sharding (see partitioning) shared mode (locks), 258 shared-disk architecture, 146, 398 shared-memory architecture, 146 shared-nothing architecture, 17, 146-147, 557 (see also replication) distributed filesystems, 398 (see also distributed filesystems) partitioning, 199 use of network, 277 sharks biting undersea cables, 279 counting (example), 46-48 finding (example), 42 website about (example), 44 shredding (in relational model), 38 siblings (concurrent values), 190, 246 (see also conflicts) similarity search edit distance, 88 genome data, 63 k-nearest neighbors, 429 single-leader replication (see leader-based rep‐ lication) single-threaded execution, 243, 252 in batch processing, 406, 421, 426 in stream processing, 448, 463, 522 size-tiered compaction, 79 skew, 557 584 | Index clock skew, 291-294, 334 in transaction isolation read skew, 238, 266 write skew, 246-251, 262-265 (see also write skew) meanings of, 238 unbalanced workload, 201 compensating for, 205 due to celebrities, 205 for time-series data, 203 in batch processing, 407 slaves (see leader-based replication) sliding windows (stream processing), 472 (see also windows) sloppy quorums, 183 (see also quorums) lack of linearizability, 334 slowly changing dimension (data warehouses), 476 smearing (leap seconds adjustments), 290 snapshots (databases) causal consistency, 340 computing derived data, 500 in change data capture, 455 serializable snapshot isolation (SSI), 261-266, 329 setting up a new replica, 156 snapshot isolation and repeatable read, 237-242 implementing with MVCC, 239 indexes and MVCC, 241 visibility rules, 240 synchronized clocks for global snapshots, 294 snowflake schemas, 95 SOAP, 133 (see also services) evolvability, 136 software bugs, 8 maintaining integrity, 529 solid state drives (SSDs) access patterns, 84 detecting corruption, 519, 530 faults in, 227 sequential write throughput, 75 Solr (search server) building indexes in batch processes, 411 document-partitioned indexes, 207 request routing, 216 usage example, 4 use of Lucene, 79 sort (Unix tool), 392, 394, 395 sort-merge joins (MapReduce), 405 Sorted String Tables (see SSTables) sorting sort order in column storage, 99 source of truth (see systems of record) Spanner (database) data locality, 41 snapshot isolation using clocks, 295 TrueTime API, 294 Spark (processing framework), 421-423 bytecode generation, 428 dataflow APIs, 427 fault tolerance, 422 for data warehouses, 93 GraphX API (graph processing), 425 machine learning, 428 query optimizer, 427 Spark Streaming, 466 microbatching, 477 stream processing on top of batch process‐ ing, 495 SPARQL (query language), 59 spatial algorithms, 429 split brain, 158, 557 in consensus algorithms, 352, 367 preventing, 322, 333 using fencing tokens to avoid, 302-304 spreadsheets, dataflow programming capabili‐ ties, 504 SQL (Structured Query Language), 21, 28, 43 advantages and limitations of, 416 distributed query execution, 48 graph queries in, 53 isolation levels standard, issues with, 242 query execution on Hadoop, 416 résumé (example), 30 SQL injection vulnerability, 305 SQL on Hadoop, 93 statement-based replication, 158 stored procedures, 255 SQL Server (database) data warehousing support, 93 distributed transaction support, 361 leader-based replication, 153 preventing lost updates, 245 preventing write skew, 248, 257 read committed isolation, 236 recursive query support, 54 serializable isolation, 257 snapshot isolation support, 239 T-SQL language, 255 XML support, 30 SQLstream (stream analytics), 466 SSDs (see solid state drives) SSTables (storage format), 76-79 advantages over hash indexes, 76 concatenated index, 204 constructing and maintaining, 78 making LSM-Tree from, 78 staleness (old data), 162 cross-channel timing dependencies, 331 in leaderless databases, 178 in multi-version concurrency control, 263 monitoring for, 182 of client state, 512 versus linearizability, 324 versus timeliness, 524 standbys (see leader-based replication) star replication topologies, 175 star schemas, 93-95 similarity to event sourcing, 458 Star Wars analogy (event time versus process‐ ing time), 469 state derived from log of immutable events, 459 deriving current state from the event log, 458 interplay between state changes and appli‐ cation code, 507 maintaining derived state, 495 maintenance by stream processor in streamstream joins, 473 observing derived state, 509-515 rebuilding after stream processor failure, 478 separation of application code and, 505 state machine replication, 349, 452 statement-based replication, 158 statically typed languages analogy to schema-on-write, 40 code generation and, 127 statistical and numerical algorithms, 428 StatsD (metrics aggregator), 442 stdin, stdout, 395, 396 Stellar (cryptocurrency), 532 Index | 585 stock market feeds, 442 STONITH (Shoot The Other Node In The Head), 158 stop-the-world (see garbage collection) storage composing data storage technologies, 499-504 diversity of, in MapReduce, 415 Storage Area Network (SAN), 146, 398 storage engines, 69-104 column-oriented, 95-101 column compression, 97-99 defined, 96 distinction between column families and, 99 Parquet, 96, 131 sort order in, 99-100 writing to, 101 comparing requirements for transaction processing and analytics, 90-96 in-memory storage, 88 durability, 227 row-oriented, 70-90 B-trees, 79-83 comparing B-trees and LSM-trees, 83-85 defined, 96 log-structured, 72-79 stored procedures, 161, 253-255, 557 and total order broadcast, 349 pros and cons of, 255 similarity to stream processors, 505 Storm (stream processor), 466 distributed RPC, 468, 514 Trident state handling, 478 straggler events, 470, 498 stream processing, 464-481, 557 accessing external services within job, 474, 477, 478, 517 combining with batch processing lambda architecture, 497 unifying technologies, 498 comparison to batch processing, 464 complex event processing (CEP), 465 fault tolerance, 476-479 atomic commit, 477 idempotence, 478 microbatching and checkpointing, 477 rebuilding state after a failure, 478 for data integration, 494-498 586 | Index maintaining derived state, 495 maintenance of materialized views, 467 messaging systems (see messaging systems) reasoning about time, 468-472 event time versus processing time, 469, 477, 498 knowing when window is ready, 470 types of windows, 472 relation to databases (see streams) relation to services, 508 search on streams, 467 single-threaded execution, 448, 463 stream analytics, 466 stream joins, 472-476 stream-stream join, 473 stream-table join, 473 table-table join, 474 time-dependence of, 475 streams, 440-451 end-to-end, pushing events to clients, 512 messaging systems (see messaging systems) processing (see stream processing) relation to databases, 451-464 (see also changelogs) API support for change streams, 456 change data capture, 454-457 derivative of state by time, 460 event sourcing, 457-459 keeping systems in sync, 452-453 philosophy of immutable events, 459-464 topics, 440 strict serializability, 329 strong consistency (see linearizability) strong one-copy serializability, 329 subjects, predicates, and objects (in triplestores), 55 subscribers (message streams), 440 (see also consumers) supercomputers, 275 surveillance, 537 (see also privacy) Swagger (service definition format), 133 swapping to disk (see virtual memory) synchronous networks, 285, 557 comparison to asynchronous networks, 284 formal model, 307 synchronous replication, 154, 557 chain replication, 155 conflict detection, 172 system models, 300, 306-310 assumptions in, 528 correctness of algorithms, 308 mapping to the real world, 309 safety and liveness, 308 systems of record, 386, 557 change data capture, 454, 491 treating event log as, 460 systems thinking, 536 T t-digest (algorithm), 16 table-table joins, 474 Tableau (data visualization software), 416 tail (Unix tool), 447 tail vertex (property graphs), 51 Tajo (query engine), 93 Tandem NonStop SQL (database), 200 TCP (Transmission Control Protocol), 277 comparison to circuit switching, 285 comparison to UDP, 283 connection failures, 280 flow control, 282, 441 packet checksums, 306, 519, 529 reliability and duplicate suppression, 517 retransmission timeouts, 284 use for transaction sessions, 229 telemetry (see monitoring) Teradata (database), 93, 200 term-partitioned indexes, 208, 217 termination (consensus), 365 Terrapin (database), 413 Tez (dataflow engine), 421-423 fault tolerance, 422 support by higher-level tools, 427 thrashing (out of memory), 297 threads (concurrency) actor model, 138, 468 (see also message-passing) atomic operations, 223 background threads, 73, 85 execution pauses, 286, 296-298 memory barriers, 338 preemption, 298 single (see single-threaded execution) three-phase commit, 359 Thrift (data format), 117-121 BinaryProtocol, 118 CompactProtocol, 119 field tags and schema evolution, 120 throughput, 13, 390 TIBCO, 137 Enterprise Message Service, 444 StreamBase (stream analytics), 466 time concurrency and, 187 cross-channel timing dependencies, 331 in distributed systems, 287-299 (see also clocks) clock synchronization and accuracy, 289 relying on synchronized clocks, 291-295 process pauses, 295-299 reasoning about, in stream processors, 468-472 event time versus processing time, 469, 477, 498 knowing when window is ready, 470 timestamp of events, 471 types of windows, 472 system models for distributed systems, 307 time-dependence in stream joins, 475 time-of-day clocks, 288 timeliness, 524 coordination-avoiding data systems, 528 correctness of dataflow systems, 525 timeouts, 279, 557 dynamic configuration of, 284 for failover, 158 length of, 281 timestamps, 343 assigning to events in stream processing, 471 for read-after-write consistency, 163 for transaction ordering, 295 insufficiency for enforcing constraints, 347 key range partitioning by, 203 Lamport, 345 logical, 494 ordering events, 291, 345 Titan (database), 50 tombstones, 74, 191, 456 topics (messaging), 137, 440 total order, 341, 557 limits of, 493 sequence numbers or timestamps, 344 total order broadcast, 348-352, 493, 522 consensus algorithms and, 366-368 Index | 587 implementation in ZooKeeper and etcd, 370 implementing with linearizable storage, 351 using, 349 using to implement linearizable storage, 350 tracking behavioral data, 536 (see also privacy) transaction coordinator (see coordinator) transaction manager (see coordinator) transaction processing, 28, 90-95 comparison to analytics, 91 comparison to data warehousing, 93 transactions, 221-267, 558 ACID properties of, 223 atomicity, 223 consistency, 224 durability, 226 isolation, 225 compensating (see compensating transac‐ tions) concept of, 222 distributed transactions, 352-364 avoiding, 492, 502, 521-528 failure amplification, 364, 495 in doubt/uncertain status, 358, 362 two-phase commit, 354-359 use of, 360-361 XA transactions, 361-364 OLTP versus analytics queries, 411 purpose of, 222 serializability, 251-266 actual serial execution, 252-256 pessimistic versus optimistic concur‐ rency control, 261 serializable snapshot isolation (SSI), 261-266 two-phase locking (2PL), 257-261 single-object and multi-object, 228-232 handling errors and aborts, 231 need for multi-object transactions, 231 single-object writes, 230 snapshot isolation (see snapshots) weak isolation levels, 233-251 preventing lost updates, 242-246 read committed, 234-238 transitive closure (graph algorithm), 424 trie (data structure), 88 triggers (databases), 161, 441 implementing change data capture, 455 implementing replication, 161 588 | Index triple-stores, 55-59 SPARQL query language, 59 tumbling windows (stream processing), 472 (see also windows) in microbatching, 477 tuple spaces (programming model), 507 Turtle (RDF data format), 56 Twitter constructing home timelines (example), 11, 462, 474, 511 DistributedLog (event log), 448 Finagle (RPC framework), 135 Snowflake (sequence number generator), 294 Summingbird (processing library), 497 two-phase commit (2PC), 353, 355-359, 558 confusion with two-phase locking, 356 coordinator failure, 358 coordinator recovery, 363 how it works, 357 issues in practice, 363 performance cost, 360 transactions holding locks, 362 two-phase locking (2PL), 257-261, 329, 558 confusion with two-phase commit, 356 index-range locks, 260 performance of, 258 type checking, dynamic versus static, 40 U UDP (User Datagram Protocol) comparison to TCP, 283 multicast, 442 unbounded datasets, 439, 558 (see also streams) unbounded delays, 558 in networks, 282 process pauses, 296 unbundling databases, 499-515 composing data storage technologies, 499-504 federation versus unbundling, 501 need for high-level language, 503 designing applications around dataflow, 504-509 observing derived state, 509-515 materialized views and caching, 510 multi-partition data processing, 514 pushing state changes to clients, 512 uncertain (transaction status) (see in doubt) uniform consensus, 365 (see also consensus) uniform interfaces, 395 union type (in Avro), 125 uniq (Unix tool), 392 uniqueness constraints asynchronously checked, 526 requiring consensus, 521 requiring linearizability, 330 uniqueness in log-based messaging, 522 Unix philosophy, 394-397 command-line batch processing, 391-394 Unix pipes versus dataflow engines, 423 comparison to Hadoop, 413-414 comparison to relational databases, 499, 501 comparison to stream processing, 464 composability and uniform interfaces, 395 loose coupling, 396 pipes, 394 relation to Hadoop, 499 UPDATE statement (SQL), 40 updates preventing lost updates, 242-246 atomic write operations, 243 automatically detecting lost updates, 245 compare-and-set operations, 245 conflict resolution and replication, 246 using explicit locking, 244 preventing write skew, 246-251 V validity (consensus), 365 vBuckets (partitioning), 199 vector clocks, 191 (see also version vectors) vectorized processing, 99, 428 verification, 528-533 avoiding blind trust, 530 culture of, 530 designing for auditability, 531 end-to-end integrity checks, 531 tools for auditable data systems, 532 version control systems, reliance on immutable data, 463 version vectors, 177, 191 capturing causal dependencies, 343 versus vector clocks, 191 Vertica (database), 93 handling writes, 101 replicas using different sort orders, 100 vertical scaling (see scaling up) vertices (in graphs), 49 property graph model, 50 Viewstamped Replication (consensus algo‐ rithm), 366 view number, 368 virtual machines, 146 (see also cloud computing) context switches, 297 network performance, 282 noisy neighbors, 284 reliability in cloud services, 8 virtualized clocks in, 290 virtual memory process pauses due to page faults, 14, 297 versus memory management by databases, 89 VisiCalc (spreadsheets), 504 vnodes (partitioning), 199 Voice over IP (VoIP), 283 Voldemort (database) building read-only stores in batch processes, 413 hash partitioning, 203-204, 211 leaderless replication, 177 multi-datacenter support, 184 rebalancing, 213 reliance on read repair, 179 sloppy quorums, 184 VoltDB (database) cross-partition serializability, 256 deterministic stored procedures, 255 in-memory storage, 89 output streams, 456 secondary indexes, 207 serial execution of transactions, 253 statement-based replication, 159, 479 transactions in stream processing, 477 W WAL (write-ahead log), 82 web services (see services) Web Services Description Language (WSDL), 133 webhooks, 443 webMethods (messaging), 137 WebSocket (protocol), 512 Index | 589 windows (stream processing), 466, 468-472 infinite windows for changelogs, 467, 474 knowing when all events have arrived, 470 stream joins within a window, 473 types of windows, 472 winners (conflict resolution), 173 WITH RECURSIVE syntax (SQL), 54 workflows (MapReduce), 402 outputs, 411-414 key-value stores, 412 search indexes, 411 with map-side joins, 410 working set, 393 write amplification, 84 write path (derived data), 509 write skew (transaction isolation), 246-251 characterizing, 246-251, 262 examples of, 247, 249 materializing conflicts, 251 occurrence in practice, 529 phantoms, 250 preventing in snapshot isolation, 262-265 in two-phase locking, 259-261 options for, 248 write-ahead log (WAL), 82, 159 writes (database) atomic write operations, 243 detecting writes affecting prior reads, 264 preventing dirty writes with read commit‐ ted, 235 WS-* framework, 133 (see also services) WS-AtomicTransaction (2PC), 355 590 | Index X XA transactions, 355, 361-364 heuristic decisions, 363 limitations of, 363 xargs (Unix tool), 392, 396 XML binary variants, 115 encoding RDF data, 57 for application data, issues with, 114 in relational databases, 30, 41 XSL/XPath, 45 Y Yahoo!

The opposite of bounded. 558 | Glossary Index A aborts (transactions), 222, 224 in two-phase commit, 356 performance of optimistic concurrency con‐ trol, 266 retrying aborted transactions, 231 abstraction, 21, 27, 222, 266, 321 access path (in network model), 37, 60 accidental complexity, removing, 21 accountability, 535 ACID properties (transactions), 90, 223 atomicity, 223, 228 consistency, 224, 529 durability, 226 isolation, 225, 228 acknowledgements (messaging), 445 active/active replication (see multi-leader repli‐ cation) active/passive replication (see leader-based rep‐ lication) ActiveMQ (messaging), 137, 444 distributed transaction support, 361 ActiveRecord (object-relational mapper), 30, 232 actor model, 138 (see also message-passing) comparison to Pregel model, 425 comparison to stream processing, 468 Advanced Message Queuing Protocol (see AMQP) aerospace systems, 6, 10, 305, 372 aggregation data cubes and materialized views, 101 in batch processes, 406 in stream processes, 466 aggregation pipeline query language, 48 Agile, 22 minimizing irreversibility, 414, 497 moving faster with confidence, 532 Unix philosophy, 394 agreement, 365 (see also consensus) Airflow (workflow scheduler), 402 Ajax, 131 Akka (actor framework), 139 algorithms algorithm correctness, 308 B-trees, 79-83 for distributed systems, 306 hash indexes, 72-75 mergesort, 76, 402, 405 red-black trees, 78 SSTables and LSM-trees, 76-79 all-to-all replication topologies, 175 AllegroGraph (database), 50 ALTER TABLE statement (SQL), 40, 111 Amazon Dynamo (database), 177 Amazon Web Services (AWS), 8 Kinesis Streams (messaging), 448 network reliability, 279 postmortems, 9 RedShift (database), 93 S3 (object storage), 398 checking data integrity, 530 amplification of bias, 534 of failures, 364, 495 Index | 559 of tail latency, 16, 207 write amplification, 84 AMQP (Advanced Message Queuing Protocol), 444 (see also messaging systems) comparison to log-based messaging, 448, 451 message ordering, 446 analytics, 90 comparison to transaction processing, 91 data warehousing (see data warehousing) parallel query execution in MPP databases, 415 predictive (see predictive analytics) relation to batch processing, 411 schemas for, 93-95 snapshot isolation for queries, 238 stream analytics, 466 using MapReduce, analysis of user activity events (example), 404 anti-caching (in-memory databases), 89 anti-entropy, 178 Apache ActiveMQ (see ActiveMQ) Apache Avro (see Avro) Apache Beam (see Beam) Apache BookKeeper (see BookKeeper) Apache Cassandra (see Cassandra) Apache CouchDB (see CouchDB) Apache Curator (see Curator) Apache Drill (see Drill) Apache Flink (see Flink) Apache Giraph (see Giraph) Apache Hadoop (see Hadoop) Apache HAWQ (see HAWQ) Apache HBase (see HBase) Apache Helix (see Helix) Apache Hive (see Hive) Apache Impala (see Impala) Apache Jena (see Jena) Apache Kafka (see Kafka) Apache Lucene (see Lucene) Apache MADlib (see MADlib) Apache Mahout (see Mahout) Apache Oozie (see Oozie) Apache Parquet (see Parquet) Apache Qpid (see Qpid) Apache Samza (see Samza) Apache Solr (see Solr) Apache Spark (see Spark) 560 | Index Apache Storm (see Storm) Apache Tajo (see Tajo) Apache Tez (see Tez) Apache Thrift (see Thrift) Apache ZooKeeper (see ZooKeeper) Apama (stream analytics), 466 append-only B-trees, 82, 242 append-only files (see logs) Application Programming Interfaces (APIs), 5, 27 for batch processing, 403 for change streams, 456 for distributed transactions, 361 for graph processing, 425 for services, 131-136 (see also services) evolvability, 136 RESTful, 133 SOAP, 133 application state (see state) approximate search (see similarity search) archival storage, data from databases, 131 arcs (see edges) arithmetic mean, 14 ASCII text, 119, 395 ASN.1 (schema language), 127 asynchronous networks, 278, 553 comparison to synchronous networks, 284 formal model, 307 asynchronous replication, 154, 553 conflict detection, 172 data loss on failover, 157 reads from asynchronous follower, 162 Asynchronous Transfer Mode (ATM), 285 atomic broadcast (see total order broadcast) atomic clocks (caesium clocks), 294, 295 (see also clocks) atomicity (concurrency), 553 atomic increment-and-get, 351 compare-and-set, 245, 327 (see also compare-and-set operations) replicated operations, 246 write operations, 243 atomicity (transactions), 223, 228, 553 atomic commit, 353 avoiding, 523, 528 blocking and nonblocking, 359 in stream processing, 360, 477 maintaining derived data, 453 for multi-object transactions, 229 for single-object writes, 230 auditability, 528-533 designing for, 531 self-auditing systems, 530 through immutability, 460 tools for auditable data systems, 532 availability, 8 (see also fault tolerance) in CAP theorem, 337 in service level agreements (SLAs), 15 Avro (data format), 122-127 code generation, 127 dynamically generated schemas, 126 object container files, 125, 131, 414 reader determining writer’s schema, 125 schema evolution, 123 use in Hadoop, 414 awk (Unix tool), 391 AWS (see Amazon Web Services) Azure (see Microsoft) B B-trees (indexes), 79-83 append-only/copy-on-write variants, 82, 242 branching factor, 81 comparison to LSM-trees, 83-85 crash recovery, 82 growing by splitting a page, 81 optimizations, 82 similarity to dynamic partitioning, 212 backpressure, 441, 553 in TCP, 282 backups database snapshot for replication, 156 integrity of, 530 snapshot isolation for, 238 use for ETL processes, 405 backward compatibility, 112 BASE, contrast to ACID, 223 bash shell (Unix), 70, 395, 503 batch processing, 28, 389-431, 553 combining with stream processing lambda architecture, 497 unifying technologies, 498 comparison to MPP databases, 414-418 comparison to stream processing, 464 comparison to Unix, 413-414 dataflow engines, 421-423 fault tolerance, 406, 414, 422, 442 for data integration, 494-498 graphs and iterative processing, 424-426 high-level APIs and languages, 403, 426-429 log-based messaging and, 451 maintaining derived state, 495 MapReduce and distributed filesystems, 397-413 (see also MapReduce) measuring performance, 13, 390 outputs, 411-413 key-value stores, 412 search indexes, 411 using Unix tools (example), 391-394 Bayou (database), 522 Beam (dataflow library), 498 bias, 534 big ball of mud, 20 Bigtable data model, 41, 99 binary data encodings, 115-128 Avro, 122-127 MessagePack, 116-117 Thrift and Protocol Buffers, 117-121 binary encoding based on schemas, 127 by network drivers, 128 binary strings, lack of support in JSON and XML, 114 BinaryProtocol encoding (Thrift), 118 Bitcask (storage engine), 72 crash recovery, 74 Bitcoin (cryptocurrency), 532 Byzantine fault tolerance, 305 concurrency bugs in exchanges, 233 bitmap indexes, 97 blockchains, 532 Byzantine fault tolerance, 305 blocking atomic commit, 359 Bloom (programming language), 504 Bloom filter (algorithm), 79, 466 BookKeeper (replicated log), 372 Bottled Water (change data capture), 455 bounded datasets, 430, 439, 553 (see also batch processing) bounded delays, 553 in networks, 285 process pauses, 298 broadcast hash joins, 409 Index | 561 brokerless messaging, 442 Brubeck (metrics aggregator), 442 BTM (transaction coordinator), 356 bulk synchronous parallel (BSP) model, 425 bursty network traffic patterns, 285 business data processing, 28, 90, 390 byte sequence, encoding data in, 112 Byzantine faults, 304-306, 307, 553 Byzantine fault-tolerant systems, 305, 532 Byzantine Generals Problem, 304 consensus algorithms and, 366 C caches, 89, 553 and materialized views, 101 as derived data, 386, 499-504 database as cache of transaction log, 460 in CPUs, 99, 338, 428 invalidation and maintenance, 452, 467 linearizability, 324 CAP theorem, 336-338, 554 Cascading (batch processing), 419, 427 hash joins, 409 workflows, 403 cascading failures, 9, 214, 281 Cascalog (batch processing), 60 Cassandra (database) column-family data model, 41, 99 compaction strategy, 79 compound primary key, 204 gossip protocol, 216 hash partitioning, 203-205 last-write-wins conflict resolution, 186, 292 leaderless replication, 177 linearizability, lack of, 335 log-structured storage, 78 multi-datacenter support, 184 partitioning scheme, 213 secondary indexes, 207 sloppy quorums, 184 cat (Unix tool), 391 causal context, 191 (see also causal dependencies) causal dependencies, 186-191 capturing, 191, 342, 494, 514 by total ordering, 493 causal ordering, 339 in transactions, 262 sending message to friends (example), 494 562 | Index causality, 554 causal ordering, 339-343 linearizability and, 342 total order consistent with, 344, 345 consistency with, 344-347 consistent snapshots, 340 happens-before relationship, 186 in serializable transactions, 262-265 mismatch with clocks, 292 ordering events to capture, 493 violations of, 165, 176, 292, 340 with synchronized clocks, 294 CEP (see complex event processing) certificate transparency, 532 chain replication, 155 linearizable reads, 351 change data capture, 160, 454 API support for change streams, 456 comparison to event sourcing, 457 implementing, 454 initial snapshot, 455 log compaction, 456 changelogs, 460 change data capture, 454 for operator state, 479 generating with triggers, 455 in stream joins, 474 log compaction, 456 maintaining derived state, 452 Chaos Monkey, 7, 280 checkpointing in batch processors, 422, 426 in high-performance computing, 275 in stream processors, 477, 523 chronicle data model, 458 circuit-switched networks, 284 circular buffers, 450 circular replication topologies, 175 clickstream data, analysis of, 404 clients calling services, 131 pushing state changes to, 512 request routing, 214 stateful and offline-capable, 170, 511 clocks, 287-299 atomic (caesium) clocks, 294, 295 confidence interval, 293-295 for global snapshots, 294 logical (see logical clocks) skew, 291-294, 334 slewing, 289 synchronization and accuracy, 289-291 synchronization using GPS, 287, 290, 294, 295 time-of-day versus monotonic clocks, 288 timestamping events, 471 cloud computing, 146, 275 need for service discovery, 372 network glitches, 279 shared resources, 284 single-machine reliability, 8 Cloudera Impala (see Impala) clustered indexes, 86 CODASYL model, 36 (see also network model) code generation with Avro, 127 with Thrift and Protocol Buffers, 118 with WSDL, 133 collaborative editing multi-leader replication and, 170 column families (Bigtable), 41, 99 column-oriented storage, 95-101 column compression, 97 distinction between column families and, 99 in batch processors, 428 Parquet, 96, 131, 414 sort order in, 99-100 vectorized processing, 99, 428 writing to, 101 comma-separated values (see CSV) command query responsibility segregation (CQRS), 462 commands (event sourcing), 459 commits (transactions), 222 atomic commit, 354-355 (see also atomicity; transactions) read committed isolation, 234 three-phase commit (3PC), 359 two-phase commit (2PC), 355-359 commutative operations, 246 compaction of changelogs, 456 (see also log compaction) for stream operator state, 479 of log-structured storage, 73 issues with, 84 size-tiered and leveled approaches, 79 CompactProtocol encoding (Thrift), 119 compare-and-set operations, 245, 327 implementing locks, 370 implementing uniqueness constraints, 331 implementing with total order broadcast, 350 relation to consensus, 335, 350, 352, 374 relation to transactions, 230 compatibility, 112, 128 calling services, 136 properties of encoding formats, 139 using databases, 129-131 using message-passing, 138 compensating transactions, 355, 461, 526 complex event processing (CEP), 465 complexity distilling in theoretical models, 310 hiding using abstraction, 27 of software systems, managing, 20 composing data systems (see unbundling data‐ bases) compute-intensive applications, 3, 275 concatenated indexes, 87 in Cassandra, 204 Concord (stream processor), 466 concurrency actor programming model, 138, 468 (see also message-passing) bugs from weak transaction isolation, 233 conflict resolution, 171, 174 detecting concurrent writes, 184-191 dual writes, problems with, 453 happens-before relationship, 186 in replicated systems, 161-191, 324-338 lost updates, 243 multi-version concurrency control (MVCC), 239 optimistic concurrency control, 261 ordering of operations, 326, 341 reducing, through event logs, 351, 462, 507 time and relativity, 187 transaction isolation, 225 write skew (transaction isolation), 246-251 conflict-free replicated datatypes (CRDTs), 174 conflicts conflict detection, 172 causal dependencies, 186, 342 in consensus algorithms, 368 in leaderless replication, 184 Index | 563 in log-based systems, 351, 521 in nonlinearizable systems, 343 in serializable snapshot isolation (SSI), 264 in two-phase commit, 357, 364 conflict resolution automatic conflict resolution, 174 by aborting transactions, 261 by apologizing, 527 convergence, 172-174 in leaderless systems, 190 last write wins (LWW), 186, 292 using atomic operations, 246 using custom logic, 173 determining what is a conflict, 174, 522 in multi-leader replication, 171-175 avoiding conflicts, 172 lost updates, 242-246 materializing, 251 relation to operation ordering, 339 write skew (transaction isolation), 246-251 congestion (networks) avoidance, 282 limiting accuracy of clocks, 293 queueing delays, 282 consensus, 321, 364-375, 554 algorithms, 366-368 preventing split brain, 367 safety and liveness properties, 365 using linearizable operations, 351 cost of, 369 distributed transactions, 352-375 in practice, 360-364 two-phase commit, 354-359 XA transactions, 361-364 impossibility of, 353 membership and coordination services, 370-373 relation to compare-and-set, 335, 350, 352, 374 relation to replication, 155, 349 relation to uniqueness constraints, 521 consistency, 224, 524 across different databases, 157, 452, 462, 492 causal, 339-348, 493 consistent prefix reads, 165-167 consistent snapshots, 156, 237-242, 294, 455, 500 (see also snapshots) 564 | Index crash recovery, 82 enforcing constraints (see constraints) eventual, 162, 322 (see also eventual consistency) in ACID transactions, 224, 529 in CAP theorem, 337 linearizability, 324-338 meanings of, 224 monotonic reads, 164-165 of secondary indexes, 231, 241, 354, 491, 500 ordering guarantees, 339-352 read-after-write, 162-164 sequential, 351 strong (see linearizability) timeliness and integrity, 524 using quorums, 181, 334 consistent hashing, 204 consistent prefix reads, 165 constraints (databases), 225, 248 asynchronously checked, 526 coordination avoidance, 527 ensuring idempotence, 519 in log-based systems, 521-524 across multiple partitions, 522 in two-phase commit, 355, 357 relation to consensus, 374, 521 relation to event ordering, 347 requiring linearizability, 330 Consul (service discovery), 372 consumers (message streams), 137, 440 backpressure, 441 consumer offsets in logs, 449 failures, 445, 449 fan-out, 11, 445, 448 load balancing, 444, 448 not keeping up with producers, 441, 450, 502 context switches, 14, 297 convergence (conflict resolution), 172-174, 322 coordination avoidance, 527 cross-datacenter, 168, 493 cross-partition ordering, 256, 294, 348, 523 services, 330, 370-373 coordinator (in 2PC), 356 failure, 358 in XA transactions, 361-364 recovery, 363 copy-on-write (B-trees), 82, 242 CORBA (Common Object Request Broker Architecture), 134 correctness, 6 auditability, 528-533 Byzantine fault tolerance, 305, 532 dealing with partial failures, 274 in log-based systems, 521-524 of algorithm within system model, 308 of compensating transactions, 355 of consensus, 368 of derived data, 497, 531 of immutable data, 461 of personal data, 535, 540 of time, 176, 289-295 of transactions, 225, 515, 529 timeliness and integrity, 524-528 corruption of data detecting, 519, 530-533 due to pathological memory access, 529 due to radiation, 305 due to split brain, 158, 302 due to weak transaction isolation, 233 formalization in consensus, 366 integrity as absence of, 524 network packets, 306 on disks, 227 preventing using write-ahead logs, 82 recovering from, 414, 460 Couchbase (database) durability, 89 hash partitioning, 203-204, 211 rebalancing, 213 request routing, 216 CouchDB (database) B-tree storage, 242 change feed, 456 document data model, 31 join support, 34 MapReduce support, 46, 400 replication, 170, 173 covering indexes, 86 CPUs cache coherence and memory barriers, 338 caching and pipelining, 99, 428 increasing parallelism, 43 CRDTs (see conflict-free replicated datatypes) CREATE INDEX statement (SQL), 85, 500 credit rating agencies, 535 Crunch (batch processing), 419, 427 hash joins, 409 sharded joins, 408 workflows, 403 cryptography defense against attackers, 306 end-to-end encryption and authentication, 519, 543 proving integrity of data, 532 CSS (Cascading Style Sheets), 44 CSV (comma-separated values), 70, 114, 396 Curator (ZooKeeper recipes), 330, 371 curl (Unix tool), 135, 397 cursor stability, 243 Cypher (query language), 52 comparison to SPARQL, 59 D data corruption (see corruption of data) data cubes, 102 data formats (see encoding) data integration, 490-498, 543 batch and stream processing, 494-498 lambda architecture, 497 maintaining derived state, 495 reprocessing data, 496 unifying, 498 by unbundling databases, 499-515 comparison to federated databases, 501 combining tools by deriving data, 490-494 derived data versus distributed transac‐ tions, 492 limits of total ordering, 493 ordering events to capture causality, 493 reasoning about dataflows, 491 need for, 385 data lakes, 415 data locality (see locality) data models, 27-64 graph-like models, 49-63 Datalog language, 60-63 property graphs, 50 RDF and triple-stores, 55-59 query languages, 42-48 relational model versus document model, 28-42 data protection regulations, 542 data systems, 3 about, 4 Index | 565 concerns when designing, 5 future of, 489-544 correctness, constraints, and integrity, 515-533 data integration, 490-498 unbundling databases, 499-515 heterogeneous, keeping in sync, 452 maintainability, 18-22 possible faults in, 221 reliability, 6-10 hardware faults, 7 human errors, 9 importance of, 10 software errors, 8 scalability, 10-18 unreliable clocks, 287-299 data warehousing, 91-95, 554 comparison to data lakes, 415 ETL (extract-transform-load), 92, 416, 452 keeping data systems in sync, 452 schema design, 93 slowly changing dimension (SCD), 476 data-intensive applications, 3 database triggers (see triggers) database-internal distributed transactions, 360, 364, 477 databases archival storage, 131 comparison of message brokers to, 443 dataflow through, 129 end-to-end argument for, 519-520 checking integrity, 531 inside-out, 504 (see also unbundling databases) output from batch workflows, 412 relation to event streams, 451-464 (see also changelogs) API support for change streams, 456, 506 change data capture, 454-457 event sourcing, 457-459 keeping systems in sync, 452-453 philosophy of immutable events, 459-464 unbundling, 499-515 composing data storage technologies, 499-504 designing applications around dataflow, 504-509 566 | Index observing derived state, 509-515 datacenters geographically distributed, 145, 164, 278, 493 multi-tenancy and shared resources, 284 network architecture, 276 network faults, 279 replication across multiple, 169 leaderless replication, 184 multi-leader replication, 168, 335 dataflow, 128-139, 504-509 correctness of dataflow systems, 525 differential, 504 message-passing, 136-139 reasoning about, 491 through databases, 129 through services, 131-136 dataflow engines, 421-423 comparison to stream processing, 464 directed acyclic graphs (DAG), 424 partitioning, approach to, 429 support for declarative queries, 427 Datalog (query language), 60-63 datatypes binary strings in XML and JSON, 114 conflict-free, 174 in Avro encodings, 122 in Thrift and Protocol Buffers, 121 numbers in XML and JSON, 114 Datomic (database) B-tree storage, 242 data model, 50, 57 Datalog query language, 60 excision (deleting data), 463 languages for transactions, 255 serial execution of transactions, 253 deadlocks detection, in two-phase commit (2PC), 364 in two-phase locking (2PL), 258 Debezium (change data capture), 455 declarative languages, 42, 554 Bloom, 504 CSS and XSL, 44 Cypher, 52 Datalog, 60 for batch processing, 427 recursive SQL queries, 53 relational algebra and SQL, 42 SPARQL, 59 delays bounded network delays, 285 bounded process pauses, 298 unbounded network delays, 282 unbounded process pauses, 296 deleting data, 463 denormalization (data representation), 34, 554 costs, 39 in derived data systems, 386 materialized views, 101 updating derived data, 228, 231, 490 versus normalization, 462 derived data, 386, 439, 554 from change data capture, 454 in event sourcing, 458-458 maintaining derived state through logs, 452-457, 459-463 observing, by subscribing to streams, 512 outputs of batch and stream processing, 495 through application code, 505 versus distributed transactions, 492 deterministic operations, 255, 274, 554 accidental nondeterminism, 423 and fault tolerance, 423, 426 and idempotence, 478, 492 computing derived data, 495, 526, 531 in state machine replication, 349, 452, 458 joins, 476 DevOps, 394 differential dataflow, 504 dimension tables, 94 dimensional modeling (see star schemas) directed acyclic graphs (DAGs), 424 dirty reads (transaction isolation), 234 dirty writes (transaction isolation), 235 discrimination, 534 disks (see hard disks) distributed actor frameworks, 138 distributed filesystems, 398-399 decoupling from query engines, 417 indiscriminately dumping data into, 415 use by MapReduce, 402 distributed systems, 273-312, 554 Byzantine faults, 304-306 cloud versus supercomputing, 275 detecting network faults, 280 faults and partial failures, 274-277 formalization of consensus, 365 impossibility results, 338, 353 issues with failover, 157 limitations of distributed transactions, 363 multi-datacenter, 169, 335 network problems, 277-286 quorums, relying on, 301 reasons for using, 145, 151 synchronized clocks, relying on, 291-295 system models, 306-310 use of clocks and time, 287 distributed transactions (see transactions) Django (web framework), 232 DNS (Domain Name System), 216, 372 Docker (container manager), 506 document data model, 30-42 comparison to relational model, 38-42 document references, 38, 403 document-oriented databases, 31 many-to-many relationships and joins, 36 multi-object transactions, need for, 231 versus relational model convergence of models, 41 data locality, 41 document-partitioned indexes, 206, 217, 411 domain-driven design (DDD), 457 DRBD (Distributed Replicated Block Device), 153 drift (clocks), 289 Drill (query engine), 93 Druid (database), 461 Dryad (dataflow engine), 421 dual writes, problems with, 452, 507 duplicates, suppression of, 517 (see also idempotence) using a unique ID, 518, 522 durability (transactions), 226, 554 duration (time), 287 measurement with monotonic clocks, 288 dynamic partitioning, 212 dynamically typed languages analogy to schema-on-read, 40 code generation and, 127 Dynamo-style databases (see leaderless replica‐ tion) E edges (in graphs), 49, 403 property graph model, 50 edit distance (full-text search), 88 effectively-once semantics, 476, 516 Index | 567 (see also exactly-once semantics) preservation of integrity, 525 elastic systems, 17 Elasticsearch (search server) document-partitioned indexes, 207 partition rebalancing, 211 percolator (stream search), 467 usage example, 4 use of Lucene, 79 ElephantDB (database), 413 Elm (programming language), 504, 512 encodings (data formats), 111-128 Avro, 122-127 binary variants of JSON and XML, 115 compatibility, 112 calling services, 136 using databases, 129-131 using message-passing, 138 defined, 113 JSON, XML, and CSV, 114 language-specific formats, 113 merits of schemas, 127 representations of data, 112 Thrift and Protocol Buffers, 117-121 end-to-end argument, 277, 519-520 checking integrity, 531 publish/subscribe streams, 512 enrichment (stream), 473 Enterprise JavaBeans (EJB), 134 entities (see vertices) epoch (consensus algorithms), 368 epoch (Unix timestamps), 288 equi-joins, 403 erasure coding (error correction), 398 Erlang OTP (actor framework), 139 error handling for network faults, 280 in transactions, 231 error-correcting codes, 277, 398 Esper (CEP engine), 466 etcd (coordination service), 370-373 linearizable operations, 333 locks and leader election, 330 quorum reads, 351 service discovery, 372 use of Raft algorithm, 349, 353 Ethereum (blockchain), 532 Ethernet (networks), 276, 278, 285 packet checksums, 306, 519 568 | Index Etherpad (collaborative editor), 170 ethics, 533-543 code of ethics and professional practice, 533 legislation and self-regulation, 542 predictive analytics, 533-536 amplifying bias, 534 feedback loops, 536 privacy and tracking, 536-543 consent and freedom of choice, 538 data as assets and power, 540 meaning of privacy, 539 surveillance, 537 respect, dignity, and agency, 543, 544 unintended consequences, 533, 536 ETL (extract-transform-load), 92, 405, 452, 554 use of Hadoop for, 416 event sourcing, 457-459 commands and events, 459 comparison to change data capture, 457 comparison to lambda architecture, 497 deriving current state from event log, 458 immutability and auditability, 459, 531 large, reliable data systems, 519, 526 Event Store (database), 458 event streams (see streams) events, 440 deciding on total order of, 493 deriving views from event log, 461 difference to commands, 459 event time versus processing time, 469, 477, 498 immutable, advantages of, 460, 531 ordering to capture causality, 493 reads as, 513 stragglers, 470, 498 timestamp of, in stream processing, 471 EventSource (browser API), 512 eventual consistency, 152, 162, 308, 322 (see also conflicts) and perpetual inconsistency, 525 evolvability, 21, 111 calling services, 136 graph-structured data, 52 of databases, 40, 129-131, 461, 497 of message-passing, 138 reprocessing data, 496, 498 schema evolution in Avro, 123 schema evolution in Thrift and Protocol Buffers, 120 schema-on-read, 39, 111, 128 exactly-once semantics, 360, 476, 516 parity with batch processors, 498 preservation of integrity, 525 exclusive mode (locks), 258 eXtended Architecture transactions (see XA transactions) extract-transform-load (see ETL) F Facebook Presto (query engine), 93 React, Flux, and Redux (user interface libra‐ ries), 512 social graphs, 49 Wormhole (change data capture), 455 fact tables, 93 failover, 157, 554 (see also leader-based replication) in leaderless replication, absence of, 178 leader election, 301, 348, 352 potential problems, 157 failures amplification by distributed transactions, 364, 495 failure detection, 280 automatic rebalancing causing cascading failures, 214 perfect failure detectors, 359 timeouts and unbounded delays, 282, 284 using ZooKeeper, 371 faults versus, 7 partial failures in distributed systems, 275-277, 310 fan-out (messaging systems), 11, 445 fault tolerance, 6-10, 555 abstractions for, 321 formalization in consensus, 365-369 use of replication, 367 human fault tolerance, 414 in batch processing, 406, 414, 422, 425 in log-based systems, 520, 524-526 in stream processing, 476-479 atomic commit, 477 idempotence, 478 maintaining derived state, 495 microbatching and checkpointing, 477 rebuilding state after a failure, 478 of distributed transactions, 362-364 transaction atomicity, 223, 354-361 faults, 6 Byzantine faults, 304-306 failures versus, 7 handled by transactions, 221 handling in supercomputers and cloud computing, 275 hardware, 7 in batch processing versus distributed data‐ bases, 417 in distributed systems, 274-277 introducing deliberately, 7, 280 network faults, 279-281 asymmetric faults, 300 detecting, 280 tolerance of, in multi-leader replication, 169 software errors, 8 tolerating (see fault tolerance) federated databases, 501 fence (CPU instruction), 338 fencing (preventing split brain), 158, 302-304 generating fencing tokens, 349, 370 properties of fencing tokens, 308 stream processors writing to databases, 478, 517 Fibre Channel (networks), 398 field tags (Thrift and Protocol Buffers), 119-121 file descriptors (Unix), 395 financial data, 460 Firebase (database), 456 Flink (processing framework), 421-423 dataflow APIs, 427 fault tolerance, 422, 477, 479 Gelly API (graph processing), 425 integration of batch and stream processing, 495, 498 machine learning, 428 query optimizer, 427 stream processing, 466 flow control, 282, 441, 555 FLP result (on consensus), 353 FlumeJava (dataflow library), 403, 427 followers, 152, 555 (see also leader-based replication) foreign keys, 38, 403 forward compatibility, 112 forward decay (algorithm), 16 Index | 569 Fossil (version control system), 463 shunning (deleting data), 463 FoundationDB (database) serializable transactions, 261, 265, 364 fractal trees, 83 full table scans, 403 full-text search, 555 and fuzzy indexes, 88 building search indexes, 411 Lucene storage engine, 79 functional reactive programming (FRP), 504 functional requirements, 22 futures (asynchronous operations), 135 fuzzy search (see similarity search) G garbage collection immutability and, 463 process pauses for, 14, 296-299, 301 (see also process pauses) genome analysis, 63, 429 geographically distributed datacenters, 145, 164, 278, 493 geospatial indexes, 87 Giraph (graph processing), 425 Git (version control system), 174, 342, 463 GitHub, postmortems, 157, 158, 309 global indexes (see term-partitioned indexes) GlusterFS (distributed filesystem), 398 GNU Coreutils (Linux), 394 GoldenGate (change data capture), 161, 170, 455 (see also Oracle) Google Bigtable (database) data model (see Bigtable data model) partitioning scheme, 199, 202 storage layout, 78 Chubby (lock service), 370 Cloud Dataflow (stream processor), 466, 477, 498 (see also Beam) Cloud Pub/Sub (messaging), 444, 448 Docs (collaborative editor), 170 Dremel (query engine), 93, 96 FlumeJava (dataflow library), 403, 427 GFS (distributed file system), 398 gRPC (RPC framework), 135 MapReduce (batch processing), 390 570 | Index (see also MapReduce) building search indexes, 411 task preemption, 418 Pregel (graph processing), 425 Spanner (see Spanner) TrueTime (clock API), 294 gossip protocol, 216 government use of data, 541 GPS (Global Positioning System) use for clock synchronization, 287, 290, 294, 295 GraphChi (graph processing), 426 graphs, 555 as data models, 49-63 example of graph-structured data, 49 property graphs, 50 RDF and triple-stores, 55-59 versus the network model, 60 processing and analysis, 424-426 fault tolerance, 425 Pregel processing model, 425 query languages Cypher, 52 Datalog, 60-63 recursive SQL queries, 53 SPARQL, 59-59 Gremlin (graph query language), 50 grep (Unix tool), 392 GROUP BY clause (SQL), 406 grouping records in MapReduce, 406 handling skew, 407 H Hadoop (data infrastructure) comparison to distributed databases, 390 comparison to MPP databases, 414-418 comparison to Unix, 413-414, 499 diverse processing models in ecosystem, 417 HDFS distributed filesystem (see HDFS) higher-level tools, 403 join algorithms, 403-410 (see also MapReduce) MapReduce (see MapReduce) YARN (see YARN) happens-before relationship, 340 capturing, 187 concurrency and, 186 hard disks access patterns, 84 detecting corruption, 519, 530 faults in, 7, 227 sequential write throughput, 75, 450 hardware faults, 7 hash indexes, 72-75 broadcast hash joins, 409 partitioned hash joins, 409 hash partitioning, 203-205, 217 consistent hashing, 204 problems with hash mod N, 210 range queries, 204 suitable hash functions, 203 with fixed number of partitions, 210 HAWQ (database), 428 HBase (database) bug due to lack of fencing, 302 bulk loading, 413 column-family data model, 41, 99 dynamic partitioning, 212 key-range partitioning, 202 log-structured storage, 78 request routing, 216 size-tiered compaction, 79 use of HDFS, 417 use of ZooKeeper, 370 HDFS (Hadoop Distributed File System), 398-399 (see also distributed filesystems) checking data integrity, 530 decoupling from query engines, 417 indiscriminately dumping data into, 415 metadata about datasets, 410 NameNode, 398 use by Flink, 479 use by HBase, 212 use by MapReduce, 402 HdrHistogram (numerical library), 16 head (Unix tool), 392 head vertex (property graphs), 51 head-of-line blocking, 15 heap files (databases), 86 Helix (cluster manager), 216 heterogeneous distributed transactions, 360, 364 heuristic decisions (in 2PC), 363 Hibernate (object-relational mapper), 30 hierarchical model, 36 high availability (see fault tolerance) high-frequency trading, 290, 299 high-performance computing (HPC), 275 hinted handoff, 183 histograms, 16 Hive (query engine), 419, 427 for data warehouses, 93 HCatalog and metastore, 410 map-side joins, 409 query optimizer, 427 skewed joins, 408 workflows, 403 Hollerith machines, 390 hopping windows (stream processing), 472 (see also windows) horizontal scaling (see scaling out) HornetQ (messaging), 137, 444 distributed transaction support, 361 hot spots, 201 due to celebrities, 205 for time-series data, 203 in batch processing, 407 relieving, 205 hot standbys (see leader-based replication) HTTP, use in APIs (see services) human errors, 9, 279, 414 HyperDex (database), 88 HyperLogLog (algorithm), 466 I I/O operations, waiting for, 297 IBM DB2 (database) distributed transaction support, 361 recursive query support, 54 serializable isolation, 242, 257 XML and JSON support, 30, 42 electromechanical card-sorting machines, 390 IMS (database), 36 imperative query APIs, 46 InfoSphere Streams (CEP engine), 466 MQ (messaging), 444 distributed transaction support, 361 System R (database), 222 WebSphere (messaging), 137 idempotence, 134, 478, 555 by giving operations unique IDs, 518, 522 idempotent operations, 517 immutability advantages of, 460, 531 Index | 571 deriving state from event log, 459-464 for crash recovery, 75 in B-trees, 82, 242 in event sourcing, 457 inputs to Unix commands, 397 limitations of, 463 Impala (query engine) for data warehouses, 93 hash joins, 409 native code generation, 428 use of HDFS, 417 impedance mismatch, 29 imperative languages, 42 setting element styles (example), 45 in doubt (transaction status), 358 holding locks, 362 orphaned transactions, 363 in-memory databases, 88 durability, 227 serial transaction execution, 253 incidents cascading failures, 9 crashes due to leap seconds, 290 data corruption and financial losses due to concurrency bugs, 233 data corruption on hard disks, 227 data loss due to last-write-wins, 173, 292 data on disks unreadable, 309 deleted items reappearing, 174 disclosure of sensitive data due to primary key reuse, 157 errors in transaction serializability, 529 gigabit network interface with 1 Kb/s throughput, 311 network faults, 279 network interface dropping only inbound packets, 279 network partitions and whole-datacenter failures, 275 poor handling of network faults, 280 sending message to ex-partner, 494 sharks biting undersea cables, 279 split brain due to 1-minute packet delay, 158, 279 vibrations in server rack, 14 violation of uniqueness constraint, 529 indexes, 71, 555 and snapshot isolation, 241 as derived data, 386, 499-504 572 | Index B-trees, 79-83 building in batch processes, 411 clustered, 86 comparison of B-trees and LSM-trees, 83-85 concatenated, 87 covering (with included columns), 86 creating, 500 full-text search, 88 geospatial, 87 hash, 72-75 index-range locking, 260 multi-column, 87 partitioning and secondary indexes, 206-209, 217 secondary, 85 (see also secondary indexes) problems with dual writes, 452, 491 SSTables and LSM-trees, 76-79 updating when data changes, 452, 467 Industrial Revolution, 541 InfiniBand (networks), 285 InfiniteGraph (database), 50 InnoDB (storage engine) clustered index on primary key, 86 not preventing lost updates, 245 preventing write skew, 248, 257 serializable isolation, 257 snapshot isolation support, 239 inside-out databases, 504 (see also unbundling databases) integrating different data systems (see data integration) integrity, 524 coordination-avoiding data systems, 528 correctness of dataflow systems, 525 in consensus formalization, 365 integrity checks, 530 (see also auditing) end-to-end, 519, 531 use of snapshot isolation, 238 maintaining despite software bugs, 529 Interface Definition Language (IDL), 117, 122 intermediate state, materialization of, 420-423 internet services, systems for implementing, 275 invariants, 225 (see also constraints) inversion of control, 396 IP (Internet Protocol) unreliability of, 277 ISDN (Integrated Services Digital Network), 284 isolation (in transactions), 225, 228, 555 correctness and, 515 for single-object writes, 230 serializability, 251-266 actual serial execution, 252-256 serializable snapshot isolation (SSI), 261-266 two-phase locking (2PL), 257-261 violating, 228 weak isolation levels, 233-251 preventing lost updates, 242-246 read committed, 234-237 snapshot isolation, 237-242 iterative processing, 424-426 J Java Database Connectivity (JDBC) distributed transaction support, 361 network drivers, 128 Java Enterprise Edition (EE), 134, 356, 361 Java Message Service (JMS), 444 (see also messaging systems) comparison to log-based messaging, 448, 451 distributed transaction support, 361 message ordering, 446 Java Transaction API (JTA), 355, 361 Java Virtual Machine (JVM) bytecode generation, 428 garbage collection pauses, 296 process reuse in batch processors, 422 JavaScript in MapReduce querying, 46 setting element styles (example), 45 use in advanced queries, 48 Jena (RDF framework), 57 Jepsen (fault tolerance testing), 515 jitter (network delay), 284 joins, 555 by index lookup, 403 expressing as relational operators, 427 in relational and document databases, 34 MapReduce map-side joins, 408-410 broadcast hash joins, 409 merge joins, 410 partitioned hash joins, 409 MapReduce reduce-side joins, 403-408 handling skew, 407 sort-merge joins, 405 parallel execution of, 415 secondary indexes and, 85 stream joins, 472-476 stream-stream join, 473 stream-table join, 473 table-table join, 474 time-dependence of, 475 support in document databases, 42 JOTM (transaction coordinator), 356 JSON Avro schema representation, 122 binary variants, 115 for application data, issues with, 114 in relational databases, 30, 42 representing a résumé (example), 31 Juttle (query language), 504 K k-nearest neighbors, 429 Kafka (messaging), 137, 448 Kafka Connect (database integration), 457, 461 Kafka Streams (stream processor), 466, 467 fault tolerance, 479 leader-based replication, 153 log compaction, 456, 467 message offsets, 447, 478 request routing, 216 transaction support, 477 usage example, 4 Ketama (partitioning library), 213 key-value stores, 70 as batch process output, 412 hash indexes, 72-75 in-memory, 89 partitioning, 201-205 by hash of key, 203, 217 by key range, 202, 217 dynamic partitioning, 212 skew and hot spots, 205 Kryo (Java), 113 Kubernetes (cluster manager), 418, 506 L lambda architecture, 497 Lamport timestamps, 345 Index | 573 Large Hadron Collider (LHC), 64 last write wins (LWW), 173, 334 discarding concurrent writes, 186 problems with, 292 prone to lost updates, 246 late binding, 396 latency instability under two-phase locking, 259 network latency and resource utilization, 286 response time versus, 14 tail latency, 15, 207 leader-based replication, 152-161 (see also replication) failover, 157, 301 handling node outages, 156 implementation of replication logs change data capture, 454-457 (see also changelogs) statement-based, 158 trigger-based replication, 161 write-ahead log (WAL) shipping, 159 linearizability of operations, 333 locking and leader election, 330 log sequence number, 156, 449 read-scaling architecture, 161 relation to consensus, 367 setting up new followers, 155 synchronous versus asynchronous, 153-155 leaderless replication, 177-191 (see also replication) detecting concurrent writes, 184-191 capturing happens-before relationship, 187 happens-before relationship and concur‐ rency, 186 last write wins, 186 merging concurrently written values, 190 version vectors, 191 multi-datacenter, 184 quorums, 179-182 consistency limitations, 181-183, 334 sloppy quorums and hinted handoff, 183 read repair and anti-entropy, 178 leap seconds, 8, 290 in time-of-day clocks, 288 leases, 295 implementation with ZooKeeper, 370 574 | Index need for fencing, 302 ledgers, 460 distributed ledger technologies, 532 legacy systems, maintenance of, 18 less (Unix tool), 397 LevelDB (storage engine), 78 leveled compaction, 79 Levenshtein automata, 88 limping (partial failure), 311 linearizability, 324-338, 555 cost of, 335-338 CAP theorem, 336 memory on multi-core CPUs, 338 definition, 325-329 implementing with total order broadcast, 350 in ZooKeeper, 370 of derived data systems, 492, 524 avoiding coordination, 527 of different replication methods, 332-335 using quorums, 334 relying on, 330-332 constraints and uniqueness, 330 cross-channel timing dependencies, 331 locking and leader election, 330 stronger than causal consistency, 342 using to implement total order broadcast, 351 versus serializability, 329 LinkedIn Azkaban (workflow scheduler), 402 Databus (change data capture), 161, 455 Espresso (database), 31, 126, 130, 153, 216 Helix (cluster manager) (see Helix) profile (example), 30 reference to company entity (example), 34 Rest.li (RPC framework), 135 Voldemort (database) (see Voldemort) Linux, leap second bug, 8, 290 liveness properties, 308 LMDB (storage engine), 82, 242 load approaches to coping with, 17 describing, 11 load testing, 16 load balancing (messaging), 444 local indexes (see document-partitioned indexes) locality (data access), 32, 41, 555 in batch processing, 400, 405, 421 in stateful clients, 170, 511 in stream processing, 474, 478, 508, 522 location transparency, 134 in the actor model, 138 locks, 556 deadlock, 258 distributed locking, 301-304, 330 fencing tokens, 303 implementation with ZooKeeper, 370 relation to consensus, 374 for transaction isolation in snapshot isolation, 239 in two-phase locking (2PL), 257-261 making operations atomic, 243 performance, 258 preventing dirty writes, 236 preventing phantoms with index-range locks, 260, 265 read locks (shared mode), 236, 258 shared mode and exclusive mode, 258 in two-phase commit (2PC) deadlock detection, 364 in-doubt transactions holding locks, 362 materializing conflicts with, 251 preventing lost updates by explicit locking, 244 log sequence number, 156, 449 logic programming languages, 504 logical clocks, 293, 343, 494 for read-after-write consistency, 164 logical logs, 160 logs (data structure), 71, 556 advantages of immutability, 460 compaction, 73, 79, 456, 460 for stream operator state, 479 creating using total order broadcast, 349 implementing uniqueness constraints, 522 log-based messaging, 446-451 comparison to traditional messaging, 448, 451 consumer offsets, 449 disk space usage, 450 replaying old messages, 451, 496, 498 slow consumers, 450 using logs for message storage, 447 log-structured storage, 71-79 log-structured merge tree (see LSMtrees) replication, 152, 158-161 change data capture, 454-457 (see also changelogs) coordination with snapshot, 156 logical (row-based) replication, 160 statement-based replication, 158 trigger-based replication, 161 write-ahead log (WAL) shipping, 159 scalability limits, 493 loose coupling, 396, 419, 502 lost updates (see updates) LSM-trees (indexes), 78-79 comparison to B-trees, 83-85 Lucene (storage engine), 79 building indexes in batch processes, 411 similarity search, 88 Luigi (workflow scheduler), 402 LWW (see last write wins) M machine learning ethical considerations, 534 (see also ethics) iterative processing, 424 models derived from training data, 505 statistical and numerical algorithms, 428 MADlib (machine learning toolkit), 428 magic scaling sauce, 18 Mahout (machine learning toolkit), 428 maintainability, 18-22, 489 defined, 23 design principles for software systems, 19 evolvability (see evolvability) operability, 19 simplicity and managing complexity, 20 many-to-many relationships in document model versus relational model, 39 modeling as graphs, 49 many-to-one and many-to-many relationships, 33-36 many-to-one relationships, 34 MapReduce (batch processing), 390, 399-400 accessing external services within job, 404, 412 comparison to distributed databases designing for frequent faults, 417 diversity of processing models, 416 diversity of storage, 415 Index | 575 comparison to stream processing, 464 comparison to Unix, 413-414 disadvantages and limitations of, 419 fault tolerance, 406, 414, 422 higher-level tools, 403, 426 implementation in Hadoop, 400-403 the shuffle, 402 implementation in MongoDB, 46-48 machine learning, 428 map-side processing, 408-410 broadcast hash joins, 409 merge joins, 410 partitioned hash joins, 409 mapper and reducer functions, 399 materialization of intermediate state, 419-423 output of batch workflows, 411-413 building search indexes, 411 key-value stores, 412 reduce-side processing, 403-408 analysis of user activity events (exam‐ ple), 404 grouping records by same key, 406 handling skew, 407 sort-merge joins, 405 workflows, 402 marshalling (see encoding) massively parallel processing (MPP), 216 comparison to composing storage technolo‐ gies, 502 comparison to Hadoop, 414-418, 428 master-master replication (see multi-leader replication) master-slave replication (see leader-based repli‐ cation) materialization, 556 aggregate values, 101 conflicts, 251 intermediate state (batch processing), 420-423 materialized views, 101 as derived data, 386, 499-504 maintaining, using stream processing, 467, 475 Maven (Java build tool), 428 Maxwell (change data capture), 455 mean, 14 media monitoring, 467 median, 14 576 | Index meeting room booking (example), 249, 259, 521 membership services, 372 Memcached (caching server), 4, 89 memory in-memory databases, 88 durability, 227 serial transaction execution, 253 in-memory representation of data, 112 random bit-flips in, 529 use by indexes, 72, 77 memory barrier (CPU instruction), 338 MemSQL (database) in-memory storage, 89 read committed isolation, 236 memtable (in LSM-trees), 78 Mercurial (version control system), 463 merge joins, MapReduce map-side, 410 mergeable persistent data structures, 174 merging sorted files, 76, 402, 405 Merkle trees, 532 Mesos (cluster manager), 418, 506 message brokers (see messaging systems) message-passing, 136-139 advantages over direct RPC, 137 distributed actor frameworks, 138 evolvability, 138 MessagePack (encoding format), 116 messages exactly-once semantics, 360, 476 loss of, 442 using total order broadcast, 348 messaging systems, 440-451 (see also streams) backpressure, buffering, or dropping mes‐ sages, 441 brokerless messaging, 442 event logs, 446-451 comparison to traditional messaging, 448, 451 consumer offsets, 449 replaying old messages, 451, 496, 498 slow consumers, 450 message brokers, 443-446 acknowledgements and redelivery, 445 comparison to event logs, 448, 451 multiple consumers of same topic, 444 reliability, 442 uniqueness in log-based messaging, 522 Meteor (web framework), 456 microbatching, 477, 495 microservices, 132 (see also services) causal dependencies across services, 493 loose coupling, 502 relation to batch/stream processors, 389, 508 Microsoft Azure Service Bus (messaging), 444 Azure Storage, 155, 398 Azure Stream Analytics, 466 DCOM (Distributed Component Object Model), 134 MSDTC (transaction coordinator), 356 Orleans (see Orleans) SQL Server (see SQL Server) migrating (rewriting) data, 40, 130, 461, 497 modulus operator (%), 210 MongoDB (database) aggregation pipeline, 48 atomic operations, 243 BSON, 41 document data model, 31 hash partitioning (sharding), 203-204 key-range partitioning, 202 lack of join support, 34, 42 leader-based replication, 153 MapReduce support, 46, 400 oplog parsing, 455, 456 partition splitting, 212 request routing, 216 secondary indexes, 207 Mongoriver (change data capture), 455 monitoring, 10, 19 monotonic clocks, 288 monotonic reads, 164 MPP (see massively parallel processing) MSMQ (messaging), 361 multi-column indexes, 87 multi-leader replication, 168-177 (see also replication) handling write conflicts, 171 conflict avoidance, 172 converging toward a consistent state, 172 custom conflict resolution logic, 173 determining what is a conflict, 174 linearizability, lack of, 333 replication topologies, 175-177 use cases, 168 clients with offline operation, 170 collaborative editing, 170 multi-datacenter replication, 168, 335 multi-object transactions, 228 need for, 231 Multi-Paxos (total order broadcast), 367 multi-table index cluster tables (Oracle), 41 multi-tenancy, 284 multi-version concurrency control (MVCC), 239, 266 detecting stale MVCC reads, 263 indexes and snapshot isolation, 241 mutual exclusion, 261 (see also locks) MySQL (database) binlog coordinates, 156 binlog parsing for change data capture, 455 circular replication topology, 175 consistent snapshots, 156 distributed transaction support, 361 InnoDB storage engine (see InnoDB) JSON support, 30, 42 leader-based replication, 153 performance of XA transactions, 360 row-based replication, 160 schema changes in, 40 snapshot isolation support, 242 (see also InnoDB) statement-based replication, 159 Tungsten Replicator (multi-leader replica‐ tion), 170 conflict detection, 177 N nanomsg (messaging library), 442 Narayana (transaction coordinator), 356 NATS (messaging), 137 near-real-time (nearline) processing, 390 (see also stream processing) Neo4j (database) Cypher query language, 52 graph data model, 50 Nephele (dataflow engine), 421 netcat (Unix tool), 397 Netflix Chaos Monkey, 7, 280 Network Attached Storage (NAS), 146, 398 network model, 36 Index | 577 graph databases versus, 60 imperative query APIs, 46 Network Time Protocol (see NTP) networks congestion and queueing, 282 datacenter network topologies, 276 faults (see faults) linearizability and network delays, 338 network partitions, 279, 337 timeouts and unbounded delays, 281 next-key locking, 260 nodes (in graphs) (see vertices) nodes (processes), 556 handling outages in leader-based replica‐ tion, 156 system models for failure, 307 noisy neighbors, 284 nonblocking atomic commit, 359 nondeterministic operations accidental nondeterminism, 423 partial failures in distributed systems, 275 nonfunctional requirements, 22 nonrepeatable reads, 238 (see also read skew) normalization (data representation), 33, 556 executing joins, 39, 42, 403 foreign key references, 231 in systems of record, 386 versus denormalization, 462 NoSQL, 29, 499 transactions and, 223 Notation3 (N3), 56 npm (package manager), 428 NTP (Network Time Protocol), 287 accuracy, 289, 293 adjustments to monotonic clocks, 289 multiple server addresses, 306 numbers, in XML and JSON encodings, 114 O object-relational mapping (ORM) frameworks, 30 error handling and aborted transactions, 232 unsafe read-modify-write cycle code, 244 object-relational mismatch, 29 observer pattern, 506 offline systems, 390 (see also batch processing) 578 | Index stateful, offline-capable clients, 170, 511 offline-first applications, 511 offsets consumer offsets in partitioned logs, 449 messages in partitioned logs, 447 OLAP (online analytic processing), 91, 556 data cubes, 102 OLTP (online transaction processing), 90, 556 analytics queries versus, 411 workload characteristics, 253 one-to-many relationships, 30 JSON representation, 32 online systems, 389 (see also services) Oozie (workflow scheduler), 402 OpenAPI (service definition format), 133 OpenStack Nova (cloud infrastructure) use of ZooKeeper, 370 Swift (object storage), 398 operability, 19 operating systems versus databases, 499 operation identifiers, 518, 522 operational transformation, 174 operators, 421 flow of data between, 424 in stream processing, 464 optimistic concurrency control, 261 Oracle (database) distributed transaction support, 361 GoldenGate (change data capture), 161, 170, 455 lack of serializability, 226 leader-based replication, 153 multi-table index cluster tables, 41 not preventing write skew, 248 partitioned indexes, 209 PL/SQL language, 255 preventing lost updates, 245 read committed isolation, 236 Real Application Clusters (RAC), 330 recursive query support, 54 snapshot isolation support, 239, 242 TimesTen (in-memory database), 89 WAL-based replication, 160 XML support, 30 ordering, 339-352 by sequence numbers, 343-348 causal ordering, 339-343 partial order, 341 limits of total ordering, 493 total order broadcast, 348-352 Orleans (actor framework), 139 outliers (response time), 14 Oz (programming language), 504 P package managers, 428, 505 packet switching, 285 packets corruption of, 306 sending via UDP, 442 PageRank (algorithm), 49, 424 paging (see virtual memory) ParAccel (database), 93 parallel databases (see massively parallel pro‐ cessing) parallel execution of graph analysis algorithms, 426 queries in MPP databases, 216 Parquet (data format), 96, 131 (see also column-oriented storage) use in Hadoop, 414 partial failures, 275, 310 limping, 311 partial order, 341 partitioning, 199-218, 556 and replication, 200 in batch processing, 429 multi-partition operations, 514 enforcing constraints, 522 secondary index maintenance, 495 of key-value data, 201-205 by key range, 202 skew and hot spots, 205 rebalancing partitions, 209-214 automatic or manual rebalancing, 213 problems with hash mod N, 210 using dynamic partitioning, 212 using fixed number of partitions, 210 using N partitions per node, 212 replication and, 147 request routing, 214-216 secondary indexes, 206-209 document-based partitioning, 206 term-based partitioning, 208 serial execution of transactions and, 255 Paxos (consensus algorithm), 366 ballot number, 368 Multi-Paxos (total order broadcast), 367 percentiles, 14, 556 calculating efficiently, 16 importance of high percentiles, 16 use in service level agreements (SLAs), 15 Percona XtraBackup (MySQL tool), 156 performance describing, 13 of distributed transactions, 360 of in-memory databases, 89 of linearizability, 338 of multi-leader replication, 169 perpetual inconsistency, 525 pessimistic concurrency control, 261 phantoms (transaction isolation), 250 materializing conflicts, 251 preventing, in serializability, 259 physical clocks (see clocks) pickle (Python), 113 Pig (dataflow language), 419, 427 replicated joins, 409 skewed joins, 407 workflows, 403 Pinball (workflow scheduler), 402 pipelined execution, 423 in Unix, 394 point in time, 287 polyglot persistence, 29 polystores, 501 PostgreSQL (database) BDR (multi-leader replication), 170 causal ordering of writes, 177 Bottled Water (change data capture), 455 Bucardo (trigger-based replication), 161, 173 distributed transaction support, 361 foreign data wrappers, 501 full text search support, 490 leader-based replication, 153 log sequence number, 156 MVCC implementation, 239, 241 PL/pgSQL language, 255 PostGIS geospatial indexes, 87 preventing lost updates, 245 preventing write skew, 248, 261 read committed isolation, 236 recursive query support, 54 representing graphs, 51 Index | 579 serializable snapshot isolation (SSI), 261 snapshot isolation support, 239, 242 WAL-based replication, 160 XML and JSON support, 30, 42 pre-splitting, 212 Precision Time Protocol (PTP), 290 predicate locks, 259 predictive analytics, 533-536 amplifying bias, 534 ethics of (see ethics) feedback loops, 536 preemption of datacenter resources, 418 of threads, 298 Pregel processing model, 425 primary keys, 85, 556 compound primary key (Cassandra), 204 primary-secondary replication (see leaderbased replication) privacy, 536-543 consent and freedom of choice, 538 data as assets and power, 540 deleting data, 463 ethical considerations (see ethics) legislation and self-regulation, 542 meaning of, 539 surveillance, 537 tracking behavioral data, 536 probabilistic algorithms, 16, 466 process pauses, 295-299 processing time (of events), 469 producers (message streams), 440 programming languages dataflow languages, 504 for stored procedures, 255 functional reactive programming (FRP), 504 logic programming, 504 Prolog (language), 61 (see also Datalog) promises (asynchronous operations), 135 property graphs, 50 Cypher query language, 52 Protocol Buffers (data format), 117-121 field tags and schema evolution, 120 provenance of data, 531 publish/subscribe model, 441 publishers (message streams), 440 punch card tabulating machines, 390 580 | Index pure functions, 48 putting computation near data, 400 Q Qpid (messaging), 444 quality of service (QoS), 285 Quantcast File System (distributed filesystem), 398 query languages, 42-48 aggregation pipeline, 48 CSS and XSL, 44 Cypher, 52 Datalog, 60 Juttle, 504 MapReduce querying, 46-48 recursive SQL queries, 53 relational algebra and SQL, 42 SPARQL, 59 query optimizers, 37, 427 queueing delays (networks), 282 head-of-line blocking, 15 latency and response time, 14 queues (messaging), 137 quorums, 179-182, 556 for leaderless replication, 179 in consensus algorithms, 368 limitations of consistency, 181-183, 334 making decisions in distributed systems, 301 monitoring staleness, 182 multi-datacenter replication, 184 relying on durability, 309 sloppy quorums and hinted handoff, 183 R R-trees (indexes), 87 RabbitMQ (messaging), 137, 444 leader-based replication, 153 race conditions, 225 (see also concurrency) avoiding with linearizability, 331 caused by dual writes, 452 dirty writes, 235 in counter increments, 235 lost updates, 242-246 preventing with event logs, 462, 507 preventing with serializable isolation, 252 write skew, 246-251 Raft (consensus algorithm), 366 sensitivity to network problems, 369 term number, 368 use in etcd, 353 RAID (Redundant Array of Independent Disks), 7, 398 railways, schema migration on, 496 RAMCloud (in-memory storage), 89 ranking algorithms, 424 RDF (Resource Description Framework), 57 querying with SPARQL, 59 RDMA (Remote Direct Memory Access), 276 read committed isolation level, 234-237 implementing, 236 multi-version concurrency control (MVCC), 239 no dirty reads, 234 no dirty writes, 235 read path (derived data), 509 read repair (leaderless replication), 178 for linearizability, 335 read replicas (see leader-based replication) read skew (transaction isolation), 238, 266 as violation of causality, 340 read-after-write consistency, 163, 524 cross-device, 164 read-modify-write cycle, 243 read-scaling architecture, 161 reads as events, 513 real-time collaborative editing, 170 near-real-time processing, 390 (see also stream processing) publish/subscribe dataflow, 513 response time guarantees, 298 time-of-day clocks, 288 rebalancing partitions, 209-214, 556 (see also partitioning) automatic or manual rebalancing, 213 dynamic partitioning, 212 fixed number of partitions, 210 fixed number of partitions per node, 212 problems with hash mod N, 210 recency guarantee, 324 recommendation engines batch process outputs, 412 batch workflows, 403, 420 iterative processing, 424 statistical and numerical algorithms, 428 records, 399 events in stream processing, 440 recursive common table expressions (SQL), 54 redelivery (messaging), 445 Redis (database) atomic operations, 243 durability, 89 Lua scripting, 255 single-threaded execution, 253 usage example, 4 redundancy hardware components, 7 of derived data, 386 (see also derived data) Reed–Solomon codes (error correction), 398 refactoring, 22 (see also evolvability) regions (partitioning), 199 register (data structure), 325 relational data model, 28-42 comparison to document model, 38-42 graph queries in SQL, 53 in-memory databases with, 89 many-to-one and many-to-many relation‐ ships, 33 multi-object transactions, need for, 231 NoSQL as alternative to, 29 object-relational mismatch, 29 relational algebra and SQL, 42 versus document model convergence of models, 41 data locality, 41 relational databases eventual consistency, 162 history, 28 leader-based replication, 153 logical logs, 160 philosophy compared to Unix, 499, 501 schema changes, 40, 111, 130 statement-based replication, 158 use of B-tree indexes, 80 relationships (see edges) reliability, 6-10, 489 building a reliable system from unreliable components, 276 defined, 6, 22 hardware faults, 7 human errors, 9 importance of, 10 of messaging systems, 442 Index | 581 software errors, 8 Remote Method Invocation (Java RMI), 134 remote procedure calls (RPCs), 134-136 (see also services) based on futures, 135 data encoding and evolution, 136 issues with, 134 using Avro, 126, 135 using Thrift, 135 versus message brokers, 137 repeatable reads (transaction isolation), 242 replicas, 152 replication, 151-193, 556 and durability, 227 chain replication, 155 conflict resolution and, 246 consistency properties, 161-167 consistent prefix reads, 165 monotonic reads, 164 reading your own writes, 162 in distributed filesystems, 398 leaderless, 177-191 detecting concurrent writes, 184-191 limitations of quorum consistency, 181-183, 334 sloppy quorums and hinted handoff, 183 monitoring staleness, 182 multi-leader, 168-177 across multiple datacenters, 168, 335 handling write conflicts, 171-175 replication topologies, 175-177 partitioning and, 147, 200 reasons for using, 145, 151 single-leader, 152-161 failover, 157 implementation of replication logs, 158-161 relation to consensus, 367 setting up new followers, 155 synchronous versus asynchronous, 153-155 state machine replication, 349, 452 using erasure coding, 398 with heterogeneous data systems, 453 replication logs (see logs) reprocessing data, 496, 498 (see also evolvability) from log-based messaging, 451 request routing, 214-216 582 | Index approaches to, 214 parallel query execution, 216 resilient systems, 6 (see also fault tolerance) response time as performance metric for services, 13, 389 guarantees on, 298 latency versus, 14 mean and percentiles, 14 user experience, 15 responsibility and accountability, 535 REST (Representational State Transfer), 133 (see also services) RethinkDB (database) document data model, 31 dynamic partitioning, 212 join support, 34, 42 key-range partitioning, 202 leader-based replication, 153 subscribing to changes, 456 Riak (database) Bitcask storage engine, 72 CRDTs, 174, 191 dotted version vectors, 191 gossip protocol, 216 hash partitioning, 203-204, 211 last-write-wins conflict resolution, 186 leaderless replication, 177 LevelDB storage engine, 78 linearizability, lack of, 335 multi-datacenter support, 184 preventing lost updates across replicas, 246 rebalancing, 213 search feature, 209 secondary indexes, 207 siblings (concurrently written values), 190 sloppy quorums, 184 ring buffers, 450 Ripple (cryptocurrency), 532 rockets, 10, 36, 305 RocksDB (storage engine), 78 leveled compaction, 79 rollbacks (transactions), 222 rolling upgrades, 8, 112 routing (see request routing) row-oriented storage, 96 row-based replication, 160 rowhammer (memory corruption), 529 RPCs (see remote procedure calls) Rubygems (package manager), 428 rules (Datalog), 61 S safety and liveness properties, 308 in consensus algorithms, 366 in transactions, 222 sagas (see compensating transactions) Samza (stream processor), 466, 467 fault tolerance, 479 streaming SQL support, 466 sandboxes, 9 SAP HANA (database), 93 scalability, 10-18, 489 approaches for coping with load, 17 defined, 22 describing load, 11 describing performance, 13 partitioning and, 199 replication and, 161 scaling up versus scaling out, 146 scaling out, 17, 146 (see also shared-nothing architecture) scaling up, 17, 146 scatter/gather approach, querying partitioned databases, 207 SCD (slowly changing dimension), 476 schema-on-read, 39 comparison to evolvable schema, 128 in distributed filesystems, 415 schema-on-write, 39 schemaless databases (see schema-on-read) schemas, 557 Avro, 122-127 reader determining writer’s schema, 125 schema evolution, 123 dynamically generated, 126 evolution of, 496 affecting application code, 111 compatibility checking, 126 in databases, 129-131 in message-passing, 138 in service calls, 136 flexibility in document model, 39 for analytics, 93-95 for JSON and XML, 115 merits of, 127 schema migration on railways, 496 Thrift and Protocol Buffers, 117-121 schema evolution, 120 traditional approach to design, fallacy in, 462 searches building search indexes in batch processes, 411 k-nearest neighbors, 429 on streams, 467 partitioned secondary indexes, 206 secondaries (see leader-based replication) secondary indexes, 85, 557 partitioning, 206-209, 217 document-partitioned, 206 index maintenance, 495 term-partitioned, 208 problems with dual writes, 452, 491 updating, transaction isolation and, 231 secondary sorts, 405 sed (Unix tool), 392 self-describing files, 127 self-joins, 480 self-validating systems, 530 semantic web, 57 semi-synchronous replication, 154 sequence number ordering, 343-348 generators, 294, 344 insufficiency for enforcing constraints, 347 Lamport timestamps, 345 use of timestamps, 291, 295, 345 sequential consistency, 351 serializability, 225, 233, 251-266, 557 linearizability versus, 329 pessimistic versus optimistic concurrency control, 261 serial execution, 252-256 partitioning, 255 using stored procedures, 253, 349 serializable snapshot isolation (SSI), 261-266 detecting stale MVCC reads, 263 detecting writes that affect prior reads, 264 distributed execution, 265, 364 performance of SSI, 265 preventing write skew, 262-265 two-phase locking (2PL), 257-261 index-range locks, 260 performance, 258 Serializable (Java), 113 Index | 583 serialization, 113 (see also encoding) service discovery, 135, 214, 372 using DNS, 216, 372 service level agreements (SLAs), 15 service-oriented architecture (SOA), 132 (see also services) services, 131-136 microservices, 132 causal dependencies across services, 493 loose coupling, 502 relation to batch/stream processors, 389, 508 remote procedure calls (RPCs), 134-136 issues with, 134 similarity to databases, 132 web services, 132, 135 session windows (stream processing), 472 (see also windows) sessionization, 407 sharding (see partitioning) shared mode (locks), 258 shared-disk architecture, 146, 398 shared-memory architecture, 146 shared-nothing architecture, 17, 146-147, 557 (see also replication) distributed filesystems, 398 (see also distributed filesystems) partitioning, 199 use of network, 277 sharks biting undersea cables, 279 counting (example), 46-48 finding (example), 42 website about (example), 44 shredding (in relational model), 38 siblings (concurrent values), 190, 246 (see also conflicts) similarity search edit distance, 88 genome data, 63 k-nearest neighbors, 429 single-leader replication (see leader-based rep‐ lication) single-threaded execution, 243, 252 in batch processing, 406, 421, 426 in stream processing, 448, 463, 522 size-tiered compaction, 79 skew, 557 584 | Index clock skew, 291-294, 334 in transaction isolation read skew, 238, 266 write skew, 246-251, 262-265 (see also write skew) meanings of, 238 unbalanced workload, 201 compensating for, 205 due to celebrities, 205 for time-series data, 203 in batch processing, 407 slaves (see leader-based replication) sliding windows (stream processing), 472 (see also windows) sloppy quorums, 183 (see also quorums) lack of linearizability, 334 slowly changing dimension (data warehouses), 476 smearing (leap seconds adjustments), 290 snapshots (databases) causal consistency, 340 computing derived data, 500 in change data capture, 455 serializable snapshot isolation (SSI), 261-266, 329 setting up a new replica, 156 snapshot isolation and repeatable read, 237-242 implementing with MVCC, 239 indexes and MVCC, 241 visibility rules, 240 synchronized clocks for global snapshots, 294 snowflake schemas, 95 SOAP, 133 (see also services) evolvability, 136 software bugs, 8 maintaining integrity, 529 solid state drives (SSDs) access patterns, 84 detecting corruption, 519, 530 faults in, 227 sequential write throughput, 75 Solr (search server) building indexes in batch processes, 411 document-partitioned indexes, 207 request routing, 216 usage example, 4 use of Lucene, 79 sort (Unix tool), 392, 394, 395 sort-merge joins (MapReduce), 405 Sorted String Tables (see SSTables) sorting sort order in column storage, 99 source of truth (see systems of record) Spanner (database) data locality, 41 snapshot isolation using clocks, 295 TrueTime API, 294 Spark (processing framework), 421-423 bytecode generation, 428 dataflow APIs, 427 fault tolerance, 422 for data warehouses, 93 GraphX API (graph processing), 425 machine learning, 428 query optimizer, 427 Spark Streaming, 466 microbatching, 477 stream processing on top of batch process‐ ing, 495 SPARQL (query language), 59 spatial algorithms, 429 split brain, 158, 557 in consensus algorithms, 352, 367 preventing, 322, 333 using fencing tokens to avoid, 302-304 spreadsheets, dataflow programming capabili‐ ties, 504 SQL (Structured Query Language), 21, 28, 43 advantages and limitations of, 416 distributed query execution, 48 graph queries in, 53 isolation levels standard, issues with, 242 query execution on Hadoop, 416 résumé (example), 30 SQL injection vulnerability, 305 SQL on Hadoop, 93 statement-based replication, 158 stored procedures, 255 SQL Server (database) data warehousing support, 93 distributed transaction support, 361 leader-based replication, 153 preventing lost updates, 245 preventing write skew, 248, 257 read committed isolation, 236 recursive query support, 54 serializable isolation, 257 snapshot isolation support, 239 T-SQL language, 255 XML support, 30 SQLstream (stream analytics), 466 SSDs (see solid state drives) SSTables (storage format), 76-79 advantages over hash indexes, 76 concatenated index, 204 constructing and maintaining, 78 making LSM-Tree from, 78 staleness (old data), 162 cross-channel timing dependencies, 331 in leaderless databases, 178 in multi-version concurrency control, 263 monitoring for, 182 of client state, 512 versus linearizability, 324 versus timeliness, 524 standbys (see leader-based replication) star replication topologies, 175 star schemas, 93-95 similarity to event sourcing, 458 Star Wars analogy (event time versus process‐ ing time), 469 state derived from log of immutable events, 459 deriving current state from the event log, 458 interplay between state changes and appli‐ cation code, 507 maintaining derived state, 495 maintenance by stream processor in streamstream joins, 473 observing derived state, 509-515 rebuilding after stream processor failure, 478 separation of application code and, 505 state machine replication, 349, 452 statement-based replication, 158 statically typed languages analogy to schema-on-write, 40 code generation and, 127 statistical and numerical algorithms, 428 StatsD (metrics aggregator), 442 stdin, stdout, 395, 396 Stellar (cryptocurrency), 532 Index | 585 stock market feeds, 442 STONITH (Shoot The Other Node In The Head), 158 stop-the-world (see garbage collection) storage composing data storage technologies, 499-504 diversity of, in MapReduce, 415 Storage Area Network (SAN), 146, 398 storage engines, 69-104 column-oriented, 95-101 column compression, 97-99 defined, 96 distinction between column families and, 99 Parquet, 96, 131 sort order in, 99-100 writing to, 101 comparing requirements for transaction processing and analytics, 90-96 in-memory storage, 88 durability, 227 row-oriented, 70-90 B-trees, 79-83 comparing B-trees and LSM-trees, 83-85 defined, 96 log-structured, 72-79 stored procedures, 161, 253-255, 557 and total order broadcast, 349 pros and cons of, 255 similarity to stream processors, 505 Storm (stream processor), 466 distributed RPC, 468, 514 Trident state handling, 478 straggler events, 470, 498 stream processing, 464-481, 557 accessing external services within job, 474, 477, 478, 517 combining with batch processing lambda architecture, 497 unifying technologies, 498 comparison to batch processing, 464 complex event processing (CEP), 465 fault tolerance, 476-479 atomic commit, 477 idempotence, 478 microbatching and checkpointing, 477 rebuilding state after a failure, 478 for data integration, 494-498 586 | Index maintaining derived state, 495 maintenance of materialized views, 467 messaging systems (see messaging systems) reasoning about time, 468-472 event time versus processing time, 469, 477, 498 knowing when window is ready, 470 types of windows, 472 relation to databases (see streams) relation to services, 508 search on streams, 467 single-threaded execution, 448, 463 stream analytics, 466 stream joins, 472-476 stream-stream join, 473 stream-table join, 473 table-table join, 474 time-dependence of, 475 streams, 440-451 end-to-end, pushing events to clients, 512 messaging systems (see messaging systems) processing (see stream processing) relation to databases, 451-464 (see also changelogs) API support for change streams, 456 change data capture, 454-457 derivative of state by time, 460 event sourcing, 457-459 keeping systems in sync, 452-453 philosophy of immutable events, 459-464 topics, 440 strict serializability, 329 strong consistency (see linearizability) strong one-copy serializability, 329 subjects, predicates, and objects (in triplestores), 55 subscribers (message streams), 440 (see also consumers) supercomputers, 275 surveillance, 537 (see also privacy) Swagger (service definition format), 133 swapping to disk (see virtual memory) synchronous networks, 285, 557 comparison to asynchronous networks, 284 formal model, 307 synchronous replication, 154, 557 chain replication, 155 conflict detection, 172 system models, 300, 306-310 assumptions in, 528 correctness of algorithms, 308 mapping to the real world, 309 safety and liveness, 308 systems of record, 386, 557 change data capture, 454, 491 treating event log as, 460 systems thinking, 536 T t-digest (algorithm), 16 table-table joins, 474 Tableau (data visualization software), 416 tail (Unix tool), 447 tail vertex (property graphs), 51 Tajo (query engine), 93 Tandem NonStop SQL (database), 200 TCP (Transmission Control Protocol), 277 comparison to circuit switching, 285 comparison to UDP, 283 connection failures, 280 flow control, 282, 441 packet checksums, 306, 519, 529 reliability and duplicate suppression, 517 retransmission timeouts, 284 use for transaction sessions, 229 telemetry (see monitoring) Teradata (database), 93, 200 term-partitioned indexes, 208, 217 termination (consensus), 365 Terrapin (database), 413 Tez (dataflow engine), 421-423 fault tolerance, 422 support by higher-level tools, 427 thrashing (out of memory), 297 threads (concurrency) actor model, 138, 468 (see also message-passing) atomic operations, 223 background threads, 73, 85 execution pauses, 286, 296-298 memory barriers, 338 preemption, 298 single (see single-threaded execution) three-phase commit, 359 Thrift (data format), 117-121 BinaryProtocol, 118 CompactProtocol, 119 field tags and schema evolution, 120 throughput, 13, 390 TIBCO, 137 Enterprise Message Service, 444 StreamBase (stream analytics), 466 time concurrency and, 187 cross-channel timing dependencies, 331 in distributed systems, 287-299 (see also clocks) clock synchronization and accuracy, 289 relying on synchronized clocks, 291-295 process pauses, 295-299 reasoning about, in stream processors, 468-472 event time versus processing time, 469, 477, 498 knowing when window is ready, 470 timestamp of events, 471 types of windows, 472 system models for distributed systems, 307 time-dependence in stream joins, 475 time-of-day clocks, 288 timeliness, 524 coordination-avoiding data systems, 528 correctness of dataflow systems, 525 timeouts, 279, 557 dynamic configuration of, 284 for failover, 158 length of, 281 timestamps, 343 assigning to events in stream processing, 471 for read-after-write consistency, 163 for transaction ordering, 295 insufficiency for enforcing constraints, 347 key range partitioning by, 203 Lamport, 345 logical, 494 ordering events, 291, 345 Titan (database), 50 tombstones, 74, 191, 456 topics (messaging), 137, 440 total order, 341, 557 limits of, 493 sequence numbers or timestamps, 344 total order broadcast, 348-352, 493, 522 consensus algorithms and, 366-368 Index | 587 implementation in ZooKeeper and etcd, 370 implementing with linearizable storage, 351 using, 349 using to implement linearizable storage, 350 tracking behavioral data, 536 (see also privacy) transaction coordinator (see coordinator) transaction manager (see coordinator) transaction processing, 28, 90-95 comparison to analytics, 91 comparison to data warehousing, 93 transactions, 221-267, 558 ACID properties of, 223 atomicity, 223 consistency, 224 durability, 226 isolation, 225 compensating (see compensating transac‐ tions) concept of, 222 distributed transactions, 352-364 avoiding, 492, 502, 521-528 failure amplification, 364, 495 in doubt/uncertain status, 358, 362 two-phase commit, 354-359 use of, 360-361 XA transactions, 361-364 OLTP versus analytics queries, 411 purpose of, 222 serializability, 251-266 actual serial execution, 252-256 pessimistic versus optimistic concur‐ rency control, 261 serializable snapshot isolation (SSI), 261-266 two-phase locking (2PL), 257-261 single-object and multi-object, 228-232 handling errors and aborts, 231 need for multi-object transactions, 231 single-object writes, 230 snapshot isolation (see snapshots) weak isolation levels, 233-251 preventing lost updates, 242-246 read committed, 234-238 transitive closure (graph algorithm), 424 trie (data structure), 88 triggers (databases), 161, 441 implementing change data capture, 455 implementing replication, 161 588 | Index triple-stores, 55-59 SPARQL query language, 59 tumbling windows (stream processing), 472 (see also windows) in microbatching, 477 tuple spaces (programming model), 507 Turtle (RDF data format), 56 Twitter constructing home timelines (example), 11, 462, 474, 511 DistributedLog (event log), 448 Finagle (RPC framework), 135 Snowflake (sequence number generator), 294 Summingbird (processing library), 497 two-phase commit (2PC), 353, 355-359, 558 confusion with two-phase locking, 356 coordinator failure, 358 coordinator recovery, 363 how it works, 357 issues in practice, 363 performance cost, 360 transactions holding locks, 362 two-phase locking (2PL), 257-261, 329, 558 confusion with two-phase commit, 356 index-range locks, 260 performance of, 258 type checking, dynamic versus static, 40 U UDP (User Datagram Protocol) comparison to TCP, 283 multicast, 442 unbounded datasets, 439, 558 (see also streams) unbounded delays, 558 in networks, 282 process pauses, 296 unbundling databases, 499-515 composing data storage technologies, 499-504 federation versus unbundling, 501 need for high-level language, 503 designing applications around dataflow, 504-509 observing derived state, 509-515 materialized views and caching, 510 multi-partition data processing, 514 pushing state changes to clients, 512 uncertain (transaction status) (see in doubt) uniform consensus, 365 (see also consensus) uniform interfaces, 395 union type (in Avro), 125 uniq (Unix tool), 392 uniqueness constraints asynchronously checked, 526 requiring consensus, 521 requiring linearizability, 330 uniqueness in log-based messaging, 522 Unix philosophy, 394-397 command-line batch processing, 391-394 Unix pipes versus dataflow engines, 423 comparison to Hadoop, 413-414 comparison to relational databases, 499, 501 comparison to stream processing, 464 composability and uniform interfaces, 395 loose coupling, 396 pipes, 394 relation to Hadoop, 499 UPDATE statement (SQL), 40 updates preventing lost updates, 242-246 atomic write operations, 243 automatically detecting lost updates, 245 compare-and-set operations, 245 conflict resolution and replication, 246 using explicit locking, 244 preventing write skew, 246-251 V validity (consensus), 365 vBuckets (partitioning), 199 vector clocks, 191 (see also version vectors) vectorized processing, 99, 428 verification, 528-533 avoiding blind trust, 530 culture of, 530 designing for auditability, 531 end-to-end integrity checks, 531 tools for auditable data systems, 532 version control systems, reliance on immutable data, 463 version vectors, 177, 191 capturing causal dependencies, 343 versus vector clocks, 191 Vertica (database), 93 handling writes, 101 replicas using different sort orders, 100 vertical scaling (see scaling up) vertices (in graphs), 49 property graph model, 50 Viewstamped Replication (consensus algo‐ rithm), 366 view number, 368 virtual machines, 146 (see also cloud computing) context switches, 297 network performance, 282 noisy neighbors, 284 reliability in cloud services, 8 virtualized clocks in, 290 virtual memory process pauses due to page faults, 14, 297 versus memory management by databases, 89 VisiCalc (spreadsheets), 504 vnodes (partitioning), 199 Voice over IP (VoIP), 283 Voldemort (database) building read-only stores in batch processes, 413 hash partitioning, 203-204, 211 leaderless replication, 177 multi-datacenter support, 184 rebalancing, 213 reliance on read repair, 179 sloppy quorums, 184 VoltDB (database) cross-partition serializability, 256 deterministic stored procedures, 255 in-memory storage, 89 output streams, 456 secondary indexes, 207 serial execution of transactions, 253 statement-based replication, 159, 479 transactions in stream processing, 477 W WAL (write-ahead log), 82 web services (see services) Web Services Description Language (WSDL), 133 webhooks, 443 webMethods (messaging), 137 WebSocket (protocol), 512 Index | 589 windows (stream processing), 466, 468-472 infinite windows for changelogs, 467, 474 knowing when all events have arrived, 470 stream joins within a window, 473 types of windows, 472 winners (conflict resolution), 173 WITH RECURSIVE syntax (SQL), 54 workflows (MapReduce), 402 outputs, 411-414 key-value stores, 412 search indexes, 411 with map-side joins, 410 working set, 393 write amplification, 84 write path (derived data), 509 write skew (transaction isolation), 246-251 characterizing, 246-251, 262 examples of, 247, 249 materializing conflicts, 251 occurrence in practice, 529 phantoms, 250 preventing in snapshot isolation, 262-265 in two-phase locking, 259-261 options for, 248 write-ahead log (WAL), 82, 159 writes (database) atomic write operations, 243 detecting writes affecting prior reads, 264 preventing dirty writes with read commit‐ ted, 235 WS-* framework, 133 (see also services) WS-AtomicTransaction (2PC), 355 590 | Index X XA transactions, 355, 361-364 heuristic decisions, 363 limitations of, 363 xargs (Unix tool), 392, 396 XML binary variants, 115 encoding RDF data, 57 for application data, issues with, 114 in relational databases, 30, 41 XSL/XPath, 45 Y Yahoo!


pages: 394 words: 118,929

Dreaming in Code: Two Dozen Programmers, Three Years, 4,732 Bugs, and One Quest for Transcendent Software by Scott Rosenberg

Amazon: amazon.comamazon.co.ukamazon.deamazon.fr

A Pattern Language, Benevolent Dictator For Life (BDFL), Berlin Wall, c2.com, call centre, collaborative editing, conceptual framework, continuous integration, Donald Knuth, Douglas Engelbart, Douglas Engelbart, Douglas Hofstadter, Dynabook, en.wikipedia.org, Firefox, Ford paid five dollars a day, Francis Fukuyama: the end of history, George Santayana, Grace Hopper, Guido van Rossum, Gödel, Escher, Bach, Howard Rheingold, index card, Internet Archive, inventory management, Jaron Lanier, John Markoff, John von Neumann, knowledge worker, Larry Wall, life extension, Loma Prieta earthquake, Menlo Park, Merlin Mann, new economy, Nicholas Carr, Norbert Wiener, pattern recognition, Paul Graham, Potemkin village, RAND corporation, Ray Kurzweil, Richard Stallman, Ronald Reagan, Ruby on Rails, semantic web, side project, Silicon Valley, Singularitarianism, slashdot, software studies, source of truth, South of Market, San Francisco, speech recognition, stealth mode startup, stem cell, Stephen Hawking, Steve Jobs, Stewart Brand, Ted Nelson, Therac-25, thinkpad, Turing test, VA Linux, Vannevar Bush, Vernor Vinge, web application, Whole Earth Catalog, Y2K

Cutting the size of the company in half, he slimmed down On’s goals and handed the new plan to a small team. Two years later, On shipped its first product: an add-on to the Macintosh operating system that made it easier for users to organize and find files. The company went on to release two more software products: a tool for setting up meetings over computer networks, and a program for collaboratively editing a shared memo. The products were critical hits but failed to catch fire, and they were limited to the Macintosh, which had become a narrow, isolated market. Meanwhile, On Technology had become one more chapter for the software disaster tomes. Kapor was right: Chandler was atypical. But what project wasn’t? “There’s no such thing as a typical software project,” Andy Hertzfeld liked to say.

As part of her job at Microsoft, Dusseault had started participating in the work of standards committees and attending meetings of the Internet Engineering Task Force (IETF), the parent organization of the standards committees. At the IETF, Dusseault got involved in work on a new standard called WebDAV (for Web-based Distributed Authoring and Versioning), an extension to the basic protocol of the Web that, as its official Web site explains, “allows users to collaboratively edit and manage files on remote Web servers.” By 2002 she was chairing the WebDAV working group and writing a book on the subject. To many programmers the realm of standards is a swamp of endless draft revisions, arm wrestling via mailing list, and maddeningly slow progress that makes garden-variety software development look like the Indy 500. (Big software companies are notorious for implementing new standards long before the drafts that codify them have won approval.)

CHAPTER 8 STICKIES ON A WHITEBOARD “Dogfooding . . . is something we do”: This description of dogfooding is from a February 21, 2003, blog posting by Scott Guthrie, a general manager in the Microsoft Developer Division, at http://weblogs.asp.net/scottgu/archive/2003/02/21/ 2743.aspx. “We use what we build”: Robert Taylor, quoted in Thierry Bardini, Bootstrapping (Stanford University Press, 2000), p. 154. “allows users to collaboratively edit”: Description from the main WebDAV site at http://www.webdav.org/. “hazmat > you know what would be really helpful”: From the Chandler IRC chat channel, July 14, 2004, at http://wiki.osafoundation.org/script/getIrcTranscript. cgi?channel=chandler&date=20040714. “People seem to be extraordinarily patient”: Blog posting by Ted Leung from July 29, 2004, at http://www.sauria.com/blog/2004/Jul/29.


pages: 132 words: 31,976

Getting Real by Jason Fried, David Heinemeier Hansson, Matthew Linderman, 37 Signals

Amazon: amazon.comamazon.co.ukamazon.deamazon.fr

call centre, collaborative editing, David Heinemeier Hansson, iterative process, John Gruber, knowledge worker, Merlin Mann, Metcalfe's law, performance metric, premature optimization, Ruby on Rails, slashdot, Steve Jobs, web application

Table of contents | Essay list for this chapter | Next essay Name Hook Give your app a name that's easy to remember A big mistake a lot of people make is thinking their app's name needs to be ultradescriptive. Don't worry about picking a name that vividly describes your tool's purpose; That usually just leads to a generic, forgettable name. Basecamp is a better name than something like Project Management Center or ProjectExpress. Writeboard is better than CollaborEdit. Also, don't focus group or committee-ize the naming process too much. Pick a name that's short, catchy, and memorable and then run with it. And don't sweat it if you can't get the exact domain name you want. You can always be creative and get close with a couple of extra letters (e.g. backpackit.com or campfirenow.com). Easy Does It Doesn't the tech industry realize that thinking up catchy, self-explanatory names would ultimately benefit it in the same way?


pages: 394 words: 110,352

The Art of Community: Building the New Age of Participation by Jono Bacon

Amazon: amazon.comamazon.co.ukamazon.deamazon.fr

barriers to entry, Benevolent Dictator For Life (BDFL), collaborative editing, crowdsourcing, Debian, en.wikipedia.org, Firefox, game design, Guido van Rossum, Johann Wolfgang von Goethe, Jono Bacon, Mark Shuttleworth, openstreetmap, Richard Stallman, Skype, social software, software as a service, telemarketer, web application

The sooner you get your tools up and running and get people focused on doing useful work, the better. Technical Considerations Most of this chapter has focused on best practices for assessing optimal workflow for a project, staying detached enough from specific technologies so that it can apply to any kind of community and workflow. The next few pages delve into some technical specifics of particular interest to software projects, such as bug tracking, source control, and collaborative editing. As we cover these topics I am going to deliberately avoid recommending specific solutions or pieces of software. This is because software changes a lot more quickly than books do, and the last thing I want to do is recommend a piece of software that has since become unsuitable for your requirements. Instead, the following pages will advise on some key considerations that SUPPORTING WORKFLOW WITH TOOLS Download at Boykma.Com 133 you should keep in mind when evaluating which tools you want to use to solve a particular problem.

One piece of advice I do offer, however, is that you should strongly consider some of the integrated tools that exist around these systems. As an example, the Launchpad system integrates tightly with the Bazaar DVCS system and makes tasks such as reviewing merges and viewing changes a point-and-click affair. These kinds of features can be hugely useful with such an important component in the software development toolchain. Collaborative Editing Back in the early days of open source, we all lied to the world about documentation. “Want to get involved in open source but can’t write code? Sure you can help! Write documentation!” Back then, if you wanted to write documentation you needed to install a complex menagerie of text processing tools, write your documentation in a markup language, ensure the document properly validated, and then convert the documentation into something the user could read.


pages: 271 words: 52,814

Blockchain: Blueprint for a New Economy by Melanie Swan

Amazon: amazon.comamazon.co.ukamazon.deamazon.fr

23andMe, Airbnb, altcoin, Amazon Web Services, asset allocation, banking crisis, basic income, bioinformatics, bitcoin, blockchain, capital controls, cellular automata, central bank independence, clean water, cloud computing, collaborative editing, Conway's Game of Life, crowdsourcing, cryptocurrency, disintermediation, Edward Snowden, en.wikipedia.org, ethereum blockchain, fault tolerance, fiat currency, financial innovation, Firefox, friendly AI, Hernando de Soto, intangible asset, Internet Archive, Internet of things, Khan Academy, Kickstarter, lifelogging, litecoin, Lyft, M-Pesa, microbiome, Network effects, new economy, peer-to-peer, peer-to-peer lending, peer-to-peer model, personalized medicine, post scarcity, prediction markets, QR code, ride hailing / ride sharing, Satoshi Nakamoto, Search for Extraterrestrial Intelligence, SETI@home, sharing economy, Skype, smart cities, smart contracts, smart grid, software as a service, technological singularity, Turing complete, unbanked and underbanked, underbanked, web application, WikiLeaks

So far, the Bitcoin convention of making open source code available by publishing software for cryptocurrency blockchains and protocols on GitHub has extended to some forms of “academic” publishing in the area, too, as white papers are posted as “Readme files” on GitHub. For example, there is blockchain venture capitalist David Johnston’s Dapp paper (“The General Theory of Decentralized Applications”) and Factom’s concept for batching the notarization of digital artifacts paper (the “Notary Chains” white paper). An interesting challenge for academic publishing on the blockchain is not just having an open-access, collaboratively edited, ongoing-discussion-forum journal per existing examples, or open-access, self-published blockchain white papers on GitHub, but to more fundamentally implement the blockchain concepts in blockchain journals. The consideration of what a decentralized direct peer-to-peer model for academic publishing could look like prompts the articulation of the functions that academic publishing provides and how, if these are still required, they might be provided in decentralized models.


pages: 270 words: 79,992

The End of Big: How the Internet Makes David the New Goliath by Nicco Mele

Amazon: amazon.comamazon.co.ukamazon.deamazon.fr

3D printing, 4chan, A Declaration of the Independence of Cyberspace, Airbnb, Amazon Web Services, Any sufficiently advanced technology is indistinguishable from magic, Apple's 1984 Super Bowl advert, barriers to entry, Berlin Wall, big-box store, bitcoin, business climate, call centre, Cass Sunstein, centralized clearinghouse, Chelsea Manning, citizen journalism, cloud computing, collaborative consumption, collaborative editing, commoditize, creative destruction, crony capitalism, cross-subsidies, crowdsourcing, David Brooks, death of newspapers, Donald Trump, Douglas Engelbart, Douglas Engelbart, en.wikipedia.org, Exxon Valdez, Fall of the Berlin Wall, Filter Bubble, Firefox, Galaxy Zoo, global supply chain, Google Chrome, Gordon Gekko, Hacker Ethic, Jaron Lanier, Jeff Bezos, jimmy wales, John Markoff, Julian Assange, Kevin Kelly, Khan Academy, Kickstarter, Lean Startup, Mark Zuckerberg, minimum viable product, Mohammed Bouazizi, Mother of all demos, Narrative Science, new economy, Occupy movement, old-boy network, peer-to-peer, period drama, Peter Thiel, pirate software, publication bias, Robert Metcalfe, Ronald Reagan, Ronald Reagan: Tear down this wall, sharing economy, Silicon Valley, Skype, social web, Steve Jobs, Steve Wozniak, Stewart Brand, Stuxnet, Ted Nelson, Telecommunications Act of 1996, telemarketer, The Wisdom of Crowds, transaction costs, uranium enrichment, Whole Earth Catalog, WikiLeaks, Zipcar

My expectation—that proved to be true—was that after a year, it would grow enough that we could sustain it through advertising.38 Since launching his blog in November 2000, Marshall has managed to build his 1,000 true fans into a serious business. Although still a small newsroom, TPM continues to do a range of hard-hitting reporting with a team of twenty full-time staff and a handful of contributors.39 Story of a River Online communities also are developing some of the watchdog capabilities of traditional Big News organizations. Wikipedia, a collaboratively edited encyclopedia and the world’s fifth most visited Web site, retains some of the hard-won functions of the fourth estate. While Wikipedians are fond of noting that, “Wikipedia does not have firm rules,” Wikipedia culture applies two fundamental values across all Wikipedia entries: neutrality and verifiability.40 Wikipedia professes to “strive for articles that document and explain the major points of view in a balanced and impartial manner,” avoiding advocacy and debatable assertions.


pages: 834 words: 180,700

The Architecture of Open Source Applications by Amy Brown, Greg Wilson

Amazon: amazon.comamazon.co.ukamazon.deamazon.fr

8-hour work day, anti-pattern, bioinformatics, c2.com, cloud computing, collaborative editing, combinatorial explosion, computer vision, continuous integration, create, read, update, delete, David Heinemeier Hansson, Debian, domain-specific language, Donald Knuth, en.wikipedia.org, fault tolerance, finite state, Firefox, friendly fire, Guido van Rossum, linked data, load shedding, locality of reference, loose coupling, Mars Rover, MVC pattern, peer-to-peer, Perl 6, premature optimization, recommendation engine, revision control, Ruby on Rails, side project, Skype, slashdot, social web, speech recognition, the scientific method, The Wisdom of Crowds, web application, WebSocket

If B and C have clocks in which some counters are greater than others in both clocks, say (39, 2, 5) and (39, 1, 6), then the servers recognize that they received different, potentially unreconcilable updates over time, and identify a conflict. Conflict Resolution Conflict resolution varies across the different systems. The Dynamo paper leaves conflict resolution to the application using the storage system. Two versions of a shopping cart can be merged into one without significant loss of data, but two versions of a collaboratively edited document might require human reviewer to resolve conflict. Voldemort follows this model, returning multiple copies of a key to the requesting client application upon conflict. Cassandra, which stores a timestamp on each key, uses the most recently timestamped version of a key when two versions are in conflict. This removes the need for a round-trip to the client and simplifies the API.

Violet has a more general design where the graph itself manages listeners for the following interface: public interface GraphModificationListener { void nodeAdded(Graph g, Node n); void nodeRemoved(Graph g, Node n); void nodeMoved(Graph g, Node n, double dx, double dy); void childAttached(Graph g, int index, Node p, Node c); void childDetached(Graph g, int index, Node p, Node c); void edgeAdded(Graph g, Edge e); void edgeRemoved(Graph g, Edge e); void propertyChangedOnNodeOrEdge(Graph g, PropertyChangeEvent event); } The framework installs a listener into each graph that is a bridge to the undo manager. For supporting undo, adding generic listener support to the model is overdesigned—the graph operations could directly interact with the undo manager. However, I also wanted to support an experimental collaborative editing feature. If you want to support undo/redo in your application, think carefully about the atomic operations in your model (and not your user interface). In the model, fire events when a structural change happens, and allow the Swing undo manager to collect and group these events. 22.9. Plugin Architecture For a programmer familiar with 2D graphics, it is not difficult to add a new diagram type to Violet.

Remix: Making Art and Commerce Thrive in the Hybrid Economy by Lawrence Lessig

Amazon: amazon.comamazon.co.ukamazon.deamazon.fr

Amazon Web Services, Andrew Keen, Benjamin Mako Hill, Berlin Wall, Bernie Sanders, Brewster Kahle, Cass Sunstein, collaborative editing, commoditize, disintermediation, don't be evil, Erik Brynjolfsson, Internet Archive, invisible hand, Jeff Bezos, jimmy wales, Kevin Kelly, Larry Wall, late fees, Mark Shuttleworth, Netflix Prize, Network effects, new economy, optical character recognition, PageRank, peer-to-peer, recommendation engine, revision control, Richard Stallman, Ronald Coase, Saturday Night Live, SETI@home, sharing economy, Silicon Valley, Skype, slashdot, Steve Jobs, The Nature of the Firm, thinkpad, transaction costs, VA Linux, yellow journalism

Today there are more than a quarter million people collaborating in just this way on Slashdot.30 Their work—work they don’t get paid for, don’t even get frequent-flier miles for—produced a site worth millions of dollars. In 1999, Slashdot was sold to Andover .net, and ads were added to the layout. So readers edit, and readers and Andover.net then profit. Collaboration here winnows a potentially endless array of comments to a relative few that people reading the site would or should want to see. The site adds a kind of collaborative editing to a news page that is RW to the tradition of RO news. This collaboration produces a high-quality RW site. Editing is value. This value is produced for free. Last.fm No doubt the industry that has squealed the most (think stuck pig) about the Internet is the traditional recording industry. As I’ve described, the Net competed (whether legally or not) with its model for profiting from music.


pages: 407 words: 103,501

The Digital Divide: Arguments for and Against Facebook, Google, Texting, and the Age of Social Netwo Rking by Mark Bauerlein

Amazon: amazon.comamazon.co.ukamazon.deamazon.fr

Amazon Mechanical Turk, Andrew Keen, centre right, citizen journalism, collaborative editing, computer age, computer vision, corporate governance, crowdsourcing, David Brooks, disintermediation, Frederick Winslow Taylor, Howard Rheingold, invention of movable type, invention of the steam engine, invention of the telephone, Jaron Lanier, Jeff Bezos, jimmy wales, Kevin Kelly, knowledge worker, late fees, Mark Zuckerberg, Marshall McLuhan, means of production, meta analysis, meta-analysis, moral panic, Network effects, new economy, Nicholas Carr, PageRank, peer-to-peer, pets.com, Results Only Work Environment, Saturday Night Live, search engine result page, semantic web, Silicon Valley, slashdot, social graph, social web, software as a service, speech recognition, Steve Jobs, Stewart Brand, technology bubble, Ted Nelson, The Wisdom of Crowds, Thorstein Veblen, web application

This, Wales argues, is only possible because responsibility for the content of Wikipedia is so widely distributed. Even hundreds of professional editors would struggle to keep six million articles clean day in and day out, but Wikipedia manages it fairly easily by relying on its thousands of volunteer contributors. The delicate compromise wording of the entry about abortion is an example of how collaborative editing can succeed. One passage reads: “Most often those in favor of legal prohibition of abortion describe themselves as pro-life while those against legal restrictions on abortion describe themselves as pro-choice.” Imagine the fighting that went into producing these simple words. But the article, as it stands, is not disputed. Discussants have found a middle ground. “It’s fabulous,” says Wales, citing another example, “that our article about Taiwan was written by Mainlanders and Taiwanese who don’t agree.”

Common Knowledge?: An Ethnography of Wikipedia by Dariusz Jemielniak

Amazon: amazon.comamazon.co.ukamazon.deamazon.fr

Andrew Keen, barriers to entry, Benevolent Dictator For Life (BDFL), citation needed, collaborative consumption, collaborative editing, conceptual framework, continuous integration, crowdsourcing, Debian, deskilling, digital Maoism, en.wikipedia.org, Filter Bubble, Google Glasses, Guido van Rossum, Hacker Ethic, hive mind, Internet Archive, invisible hand, Jaron Lanier, jimmy wales, job satisfaction, Julian Assange, knowledge economy, knowledge worker, Menlo Park, moral hazard, online collectivism, pirate software, RFC: Request For Comment, Richard Stallman, selection bias, Silicon Valley, Skype, slashdot, social software, Stewart Brand, The Nature of the Firm, The Wisdom of Crowds, transaction costs, WikiLeaks, wikimedia commons, zero-sum game

The diversity of activities and areas where Wikipedians can interact is very high. This diversity dilutes the gradations of organizational authority. Some Wikipedians focus only on editing, but in general, credibility is built in interactions and participation in the common discourse (Forte & Bruckman, 2005), and to occupy a community position one has not only to edit but also interact. After all, collaborative editing of Wikipedia articles is a process of negotiating and constructing shared perspectives (Kane, Majchrzak, Johnson, & Chenisern, 2009), including the admissibility of sources, as well as balancing what constitutes neutral information and what does not. Even seemingly individual tasks have collaborative aspects. Wikipedians like to socialize. Even though Wikipedia is not a social network and most interactions on Wikipedia are related to the encyclopedia, a plethora of cultural production is meant only for other Wikipedians.


pages: 936 words: 252,313

Good Calories, Bad Calories: Challenging the Conventional Wisdom on Diet, Weight Control, and Disease by Gary Taubes

Amazon: amazon.comamazon.co.ukamazon.deamazon.fr

Albert Einstein, California gold rush, cognitive dissonance, collaborative editing, Drosophila, Everything should be made as simple as possible, experimental subject, Gary Taubes, invention of agriculture, John Snow's cholera map, meta analysis, meta-analysis, phenotype, placebo effect, Ralph Nader, randomized controlled trial, Richard Feynman, Richard Feynman, Robert Gordon, selection bias, the scientific method, Thomas Kuhn: the structure of scientific revolutions, unbiased observer, Upton Sinclair

That this was the opinion of a nutritionist and a physician, neither of whom had worked clinically with obese patients, was lost in the publication of the critique under the auspices of the AMA itself. Mayer dedicated one of his syndicated newspaper columns to condemning Atkins, based on the AMA critique, which he cited repeatedly as though it were the considered opinion of the entire American Medical Association and not just his former collaborator, edited by his former student. “The American Medical Association,” wrote Mayer, has “taken the unusual step of warning the U.S. public against the latest do-it-yourself diet as propounded in ‘Dr. Atkins’s Diet Revolution.’” The AMA report, Mayer wrote, “explains why the ‘diet revolution’ cannot work.” After Van Itallie drafted the AMA attack on Atkins, he spent the next decade as the principal arbitrator of the risks and benefits of weight-reduction diets.


pages: 552 words: 168,518

MacroWikinomics: Rebooting Business and the World by Don Tapscott, Anthony D. Williams

Amazon: amazon.comamazon.co.ukamazon.deamazon.fr

accounting loophole / creative accounting, airport security, Andrew Keen, augmented reality, Ayatollah Khomeini, barriers to entry, bioinformatics, Bretton Woods, business climate, business process, car-free, carbon footprint, citizen journalism, Clayton Christensen, clean water, Climategate, Climatic Research Unit, cloud computing, collaborative editing, collapse of Lehman Brothers, collateralized debt obligation, colonial rule, commoditize, corporate governance, corporate social responsibility, creative destruction, crowdsourcing, death of newspapers, demographic transition, distributed generation, don't be evil, en.wikipedia.org, energy security, energy transition, Exxon Valdez, failed state, fault tolerance, financial innovation, Galaxy Zoo, game design, global village, Google Earth, Hans Rosling, hive mind, Home mortgage interest deduction, interchangeable parts, Internet of things, invention of movable type, Isaac Newton, James Watt: steam engine, Jaron Lanier, jimmy wales, Joseph Schumpeter, Julian Assange, Kevin Kelly, knowledge economy, knowledge worker, Marc Andreessen, Marshall McLuhan, mass immigration, medical bankruptcy, megacity, mortgage tax deduction, Netflix Prize, new economy, Nicholas Carr, oil shock, old-boy network, online collectivism, open borders, open economy, pattern recognition, peer-to-peer lending, personalized medicine, Ray Kurzweil, RFID, ride hailing / ride sharing, Ronald Reagan, Rubik’s Cube, scientific mainstream, shareholder value, Silicon Valley, Skype, smart grid, smart meter, social graph, social web, software patent, Steve Jobs, text mining, the scientific method, The Wisdom of Crowds, transaction costs, transfer pricing, University of East Anglia, urban sprawl, value at risk, WikiLeaks, X Prize, young professional, Zipcar

Fortunately, Web 2.0 technologies make the process of engaging citizens in problem solving easier and less costly than ever before. As with Habitat Jam, international institutions can post background information on the Web and use online video conferencing to bring in expert testimony. Web-enabled forums can allow discussion and debate among hundreds, thousands, and even millions of geographically dispersed participants. Wikis provide a platform for collaborative editing of policy documents, while social networking technologies can connect citizens and organizations with common goals and interests. The promise is that digital engagement will support global problem-solving approaches that integrate policy development and implementation into a seamless and flexible practice of continuous engagement, improvement, and innovation that can reach across national borders.