The platform decade — Google, Facebook, and consolidation

The web that the previous chapters have been describing — a federated, peer-to-peer, document-oriented infrastructure built on uncoordinated URLs and stateless protocols — was the web of approximately 1991 to 2005. The web that the world uses in 2026 is structurally different, not because the protocols changed but because a small number of companies grew, through the 2000s and 2010s, into operators of the systems on which the rest of the web depends. Google indexed the web and then became the gateway to it. Facebook built a social graph and then became the principal venue of online social life. Amazon built a commerce platform and then built the cloud infrastructure that most other web companies run on. A handful of other platforms — Twitter, YouTube (acquired by Google), Instagram and WhatsApp (acquired by Facebook), TikTok, the various app stores — absorbed adjacent functions. By 2020, most of the consumer internet’s traffic, time, and revenue passed through one of these platforms at some point in its journey. The web’s federated peer architecture had been overlaid, without being replaced, by a hierarchical platform structure that effectively determined what the federated peers could do. This chapter is about how that hierarchical structure emerged and what its emergence has done to the substrate underneath it.

Google

The platform decade arguably begins with Google. Larry Page and Sergey Brin, then Stanford graduate students, published the paper “The Anatomy of a Large-Scale Hypertextual Web Search Engine” in 1998, describing the PageRank algorithm, which used the web’s link structure to rank search results. The company was founded in September 1998 in a Menlo Park garage; the IPO came in August 2004 at a valuation of $23 billion. By 2010 Google was the dominant search engine globally, with market share in many countries above 90%. By 2020 the company (reorganized in 2015 under the holding company Alphabet, with Google as its largest subsidiary) had revenue of more than $180 billion per year, the large majority of it from advertising tied to search and to the various other Google properties.
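The idea of ranking pages by link structure can be illustrated with a small power-iteration sketch. This is a simplified textbook version, not Google’s production system: the four-page graph, the function name, and the handling of pages without outlinks are illustrative assumptions, and the damping factor of 0.85 is the value the 1998 paper suggests.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Rank pages by link structure using simple power iteration.

    links maps each page to the list of pages it links to.
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with the "random jump" share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its current rank evenly among its outlinks.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Assumption: a page with no outlinks spreads its rank over all pages.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical four-page web: three pages link to "hub".
toy_web = {
    "hub": ["a"],
    "a": ["hub"],
    "b": ["hub"],
    "c": ["hub", "a"],
}
ranks = pagerank(toy_web)
best = max(ranks, key=ranks.get)
```

In this toy graph the page with the most inbound links ends up with the highest rank, and pages that nothing links to fall to the bottom, which is the gatekeeping effect in miniature.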

Google’s relationship to the rest of the web has been, structurally, the relationship of a gatekeeper. A web site that does not appear in Google’s index is, for most users, invisible. A web site that appears low in Google’s results is, for most users, also invisible. The criteria by which Google ranks results are not public; the broad strokes are known, but the specific decisions are commercial secrets, and the criteria evolve continuously through algorithm updates that have material effects on which sites are visible. The web exists, in a sense, twice: once as the federated network of independent sites that the original protocols imagined, and once as the much smaller filtered version of itself that Google surfaces to users. The two are not the same, and the gap between them has grown over the decades as Google has elaborated its filtering criteria.

The economic structure of Google’s filtering produces patterns that affect what the web becomes. The advertising-supported business model means that Google’s filtering is, ultimately, optimized for what produces advertising revenue. Search engine optimization, the practice of writing web pages to rank well in Google’s results, has been a substantial industry since the early 2000s, with billions of dollars in annual spending, and the result has been a homogenization of what successful web content looks like. The pages that rank well are the pages that follow current SEO best practices, which tend to mean specific lengths, specific keyword densities, specific structures, specific kinds of metadata. The diversity of voices and styles for which the early web had been notable has narrowed, partly because the rewarded patterns are specific and the unrewarded patterns disappear from view.

Google’s other properties have extended the gatekeeping into adjacent domains. Gmail (launched April 2004) hosts a substantial fraction of consumer mailboxes globally. Google Maps (2005) dominates online mapping. YouTube (acquired in October 2006 for $1.65 billion) dominates online video. Google Drive, Docs, Sheets, and the rest of the Workspace suite are used by hundreds of millions of users and enterprises. Android (acquired in 2005, released as a mobile operating system in 2008) runs on the majority of the world’s smartphones. The Chrome browser (released in September 2008) has roughly two-thirds of the global browser market share. Each of these is, individually, a substantial business. Together they constitute a stack in which Google is the platform for many of the core consumer internet activities. The platform position is reinforced by the data flows between them: a user who uses Google for search, Gmail for mail, Chrome for browsing, Android for their phone, and YouTube for video has all of their behavior visible to one company, which uses the aggregated data to improve its advertising targeting, which produces the revenue that funds the platform.

Facebook

Facebook was founded in February 2004 by Mark Zuckerberg, then a Harvard sophomore. The site spread from Harvard to other elite universities and then, in September 2006, to the general public. By 2008 it had passed MySpace in worldwide visitors, and by 2009 in the United States; by 2012 it had a billion users; by 2020 it had nearly three billion across its family of services (including Instagram and WhatsApp, acquired in 2012 and 2014 respectively). The company rebranded as Meta in 2021, partly to signal its ambitions beyond Facebook itself.

Facebook’s relationship to the broader web has been, structurally, the relationship of a partial replacement. The site increasingly provided the things that people had previously gotten from various places on the open web: personal updates (which had been blogs), photos (which had been Flickr or various photo-sharing sites), longer-form content (which had lived on people’s own websites and those of other publishers), event invitations, group communications, and, eventually, news consumption. By the mid-2010s, many users conducted a large part of their web activity inside Facebook. The average time users spent on Facebook each day grew through the 2010s to a substantial fraction of their total time online; for many users, Facebook was the consumer internet.

The structural consequences of this replacement were severe for the open web. Personal blogs, which had been a thriving category in the mid-2000s, contracted sharply as users moved their personal updates to Facebook. Independent photo sites contracted as users moved their photos to Facebook and Instagram. Event-listing sites, group-coordination sites, and various small-community sites lost users to the corresponding Facebook features, and many of them shut down or shrank to specialty niches. By the mid-2010s, the Facebook universe was an effective substitute for large parts of the open web for a large share of the user base.

Facebook also extended its reach into the web it was not replacing. The Open Graph Protocol, introduced by Facebook in 2010, was a markup convention that web pages could embed to tell Facebook how to display links to them when those links were shared. Open Graph metadata included titles, descriptions, images, and various other properties; pages that included it appeared nicely formatted in Facebook’s feed, while pages that did not appeared as bare links. The protocol was adopted widely by web publishers because the alternative — appearing badly in Facebook’s feed — was a real penalty. The effect was that web publishers, including those with no other relationship to Facebook, ended up structuring their pages in ways that served Facebook’s display needs. The platform reached back into the substrate.
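What a publisher embeds, and what a consumer of the markup reads back, can be sketched with the Python standard library. The page below is a hypothetical example with placeholder URLs; the `og:` property names are the ones the protocol defines, while the class name and the way the result is stored are assumptions made for illustration, not a description of how Facebook’s own crawler works.

```python
from html.parser import HTMLParser


class OpenGraphParser(HTMLParser):
    """Collect <meta property="og:..." content="..."> tags from a page."""

    def __init__(self):
        super().__init__()
        self.properties = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("property") or ""
        content = attributes.get("content")
        if name.startswith("og:") and content is not None:
            self.properties[name] = content


# Hypothetical page head; the URLs and text are placeholders.
sample_page = """
<html><head>
<title>Example article</title>
<meta property="og:title" content="Example article" />
<meta property="og:type" content="article" />
<meta property="og:url" content="https://example.com/article" />
<meta property="og:image" content="https://example.com/cover.png" />
<meta property="og:description" content="A short summary for link previews." />
</head><body></body></html>
"""

parser = OpenGraphParser()
parser.feed(sample_page)
preview = parser.properties
```

A page without these tags would leave `preview` empty, which corresponds to the bare-link display that publishers were trying to avoid.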

The other extension was the social plugin: the Like button, the Share button, and the various other widgets that Facebook offered for embedding on third-party sites. The plugins were adopted widely; web pages across many publishers embedded them. Because each plugin was loaded from Facebook’s servers, every visit to a page that embedded one also produced a request to Facebook identifying that page, which gave Facebook a substantial portion of the web’s traffic data even for users who were not logged in. The tracking implications were significant and have since been addressed by privacy regulation (the GDPR in particular). The plugins illustrated a broader pattern: even web pages that were not on Facebook were, through Facebook’s plugins and other infrastructure, partly within Facebook’s data-gathering apparatus.

Amazon and the cloud

Amazon’s platform play was different in kind. Amazon Web Services, launched in 2006 with S3 (Simple Storage Service) and EC2 (Elastic Compute Cloud), grew through the late 2000s and 2010s into the dominant cloud infrastructure provider. AWS now runs a substantial fraction of the world’s web applications. Estimates vary, but the typical figure for AWS’s share of the cloud-infrastructure market in 2026 is around 30-35%, with Microsoft Azure at around 20-25% and Google Cloud at 10-12%, and the remainder split among smaller providers. The three largest providers together account for roughly two-thirds of public cloud spending.

The cloud platform position is different from the consumer platform positions of Google and Facebook because the customers are other businesses rather than end users. But the effects on the web’s structure are comparable. A web application built on AWS depends on AWS for its infrastructure; if AWS has an outage, the application has an outage; if AWS changes its pricing, the application’s economics change; if AWS makes decisions about which workloads are acceptable, the application has to comply. The cloud providers’ decisions, in aggregate, shape what kinds of web applications can be economically built and operated. The substrate underneath the web is now, for most large web applications, a service from one of a small number of platform providers.

The cloud consolidation has had specific effects on the federation properties of the web. A web application running on AWS is, in a meaningful sense, running inside AWS. The egress costs of moving data out of AWS to other providers are non-trivial; the integration with AWS-specific services creates lock-in; the operational practices that work on AWS may not transfer. Building a web application that is genuinely portable across cloud providers requires deliberate effort and ongoing discipline. Most applications, having made the initial choice of a cloud provider, stay with that provider regardless of subsequent considerations. The federation that the internet’s transport layer provides is undermined, at the operational layer, by the consolidation of cloud infrastructure into a small oligopoly.

Mobile and the app store

The second great consolidation of the platform decade was the mobile app store. Apple’s App Store launched in July 2008, alongside the iPhone 3G; Google Play (originally Android Market) launched in October 2008. The app stores became, within a few years, the primary way that consumers acquired software for their mobile devices, and the mobile devices became, by the mid-2010s, the primary way that consumers used the internet for many purposes.

The app store model was, structurally, a sharp departure from the web’s permissionless distribution. To distribute software through the App Store, a developer had to register with Apple, pay an annual fee, submit each application for review, comply with the App Store guidelines, and accept Apple’s terms of service. Apple controlled what software could be distributed; Apple took a 30% cut of all paid sales and in-app purchases (reduced to 15% for subscriptions after their first year and, from 2021, for developers earning under $1 million a year); Apple could remove an application from the store at any time for any reason. The Google Play store had a comparable structure with slightly different specifics.

The app store consolidation produced a specific kind of loss for the web. Many of the things that would have been built as web applications in the 2000s — communication tools, productivity tools, games, content-delivery services — were instead built as mobile apps. The mobile apps were, structurally, not web pages. They were software distributed through the app stores, with the app stores’ rules applying. The properties the web had — open distribution, view-source, easy linking, archival accessibility, search-engine visibility — did not apply to mobile apps. A user who installed a popular mobile app was, structurally, using a piece of software they did not control, distributed through a platform they did not control, on a device whose update policies they did not control.

The mobile shift was significant for the platform decade’s consolidation because it gave the platform operators a second layer of control over what consumers did online. The web was, after the consolidation, just one venue for consumer internet activity; the apps, controlled by the same platform operators or others within the same general structure, were another. The total platform footprint — web sites and apps and the various intermediating services — was much larger than the web alone, and the consolidation was correspondingly deeper.

Algorithmic feeds and the engagement model

A specific feature of the platform decade was the algorithmic feed. Facebook made ranking the default for its News Feed in 2009, replacing the previous largely chronological display of friends’ updates with a ranked display determined by Facebook’s relevance algorithms. Twitter introduced an algorithmic timeline in 2016, similarly replacing the chronological display. Instagram followed. TikTok, launched internationally in 2017 as the counterpart of the Chinese app Douyin (2016) and dominant by the early 2020s, was algorithmic from the start. By the early 2020s, almost every major social-media platform had moved from chronological to algorithmic display, with the algorithms tuned to maximize various measures of user engagement.
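The difference between the two kinds of display can be made concrete with a toy sketch. Nothing here reflects any platform’s real formula: the `Post` fields, the interaction weights, and the six-hour half-life are invented for illustration. The point is only that a chronological feed depends on timestamps alone, while a ranked feed depends on a scoring function the platform chooses.

```python
from dataclasses import dataclass


@dataclass
class Post:
    author: str
    posted_at: int  # seconds since some epoch; larger is newer
    reactions: int
    comments: int
    shares: int


def chronological_feed(posts):
    """Newest first: the ordering depends only on when a post was made."""
    return sorted(posts, key=lambda post: post.posted_at, reverse=True)


def engagement_score(post, now, half_life=6 * 3600):
    """Toy score: weighted interactions, halved every half_life seconds."""
    # Assumed weights: comments and shares count for more than reactions.
    interactions = post.reactions + 4 * post.comments + 8 * post.shares
    age = max(0, now - post.posted_at)
    return interactions * 0.5 ** (age / half_life)


def ranked_feed(posts, now):
    """Highest score first: the ordering depends on the scoring formula."""
    return sorted(posts, key=lambda post: engagement_score(post, now), reverse=True)


now = 100_000
posts = [
    Post("alice", posted_at=99_000, reactions=2, comments=0, shares=0),
    Post("bob", posted_at=90_000, reactions=50, comments=30, shares=10),
    Post("carol", posted_at=95_000, reactions=5, comments=1, shares=0),
]
chronological = [post.author for post in chronological_feed(posts)]
ranked = [post.author for post in ranked_feed(posts, now)]
```

With these made-up numbers the oldest post, which has the most interactions, moves from the bottom of the chronological feed to the top of the ranked one. Changing the weights changes the order, and the reader of the feed has no say in that choice.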

The engagement model, the practice of optimizing platform behavior to maximize the time users spend on the platform, became through the 2010s the dominant business model of the consumer internet. The argument for engagement was that engaged users see more advertising and produce more revenue. The argument was correct as far as it went, and engagement-driven product decisions produced enormous revenue for the platforms. The costs of the model were less visible at the time and have become more visible since.

Engagement-optimized algorithms tend to surface content that produces strong emotional reactions. Outrage is engaging; polarization is engaging; conflict is engaging. The platforms became, through the late 2010s, increasingly populated by content optimized for these reactions, because the algorithms surfaced and rewarded such content, and creators of content responded to the incentives. The political consequences have been substantial, and a large academic and journalistic literature now examines the effects of algorithmic feeds on political discourse, civic engagement, and individual psychological well-being. A generation of consumer internet experience under engagement optimization amounts to one of the most consequential social experiments in recent history, and the results are still being absorbed.

The relevance to the broader argument of this book is structural. The federated peer-to-peer model of the early internet had no algorithmic feeds. Usenet was chronological. Email is chronological. The early web’s discussion venues were chronological. The shift to algorithmic ranking is a shift to platform-controlled curation, and the curation reflects the platform’s interests rather than the user’s stated preferences. The user of an algorithmically ranked feed is, in a meaningful sense, not in control of what they see; the platform is. This is a substantial change in the relationship between a user and the content they consume, and it is a change that the early internet’s architecture had not anticipated.

The death of the open web discourse

A periodic discourse through the 2010s, in technology press and among certain web technologists, asserted that the open web was dying. The discourse had several specific markers. Google’s discontinuation of Google Reader in July 2013 was an early flashpoint: Reader had been a popular RSS reader, and its closure was widely read as Google’s signal that RSS-based federated content distribution was not a priority. The subsequent decline of RSS as a mainstream technology, with most major publishers prioritizing social-media distribution over RSS, was treated as evidence of the broader decline.

The platform-control issues — Facebook’s algorithmic feed, the app stores’ gatekeeping, the consolidation of cloud infrastructure, the dominance of search by a single company — accumulated into a sense, by the late 2010s, that the open web’s structural properties were being eroded faster than they could be defended. Various commentators (Anil Dash, Hossein Derakhshan in his 2015 essay “The Web We Have to Save,” and many others) wrote essays arguing that the web of independent publishers, of personal sites, of federated communities, was disappearing. The essays were not, in every detail, correct — the open web did not die in any literal sense — but they captured something real about the trajectory.

The response to this trajectory, beginning in the late 2010s and accelerating in the 2020s, has been the recovery work that Part V treats. The IndieWeb movement (treated in chapter twenty-two), the local-first software movement (treated in chapter twenty-one), the personal knowledge tools wave (treated in chapter twenty-four), and the static renaissance (treated in chapter twenty-three) are each, in part, a response to the platform consolidation, an attempt to recover at the application layer what has been lost at the platform layer. None of them is large enough, individually, to reverse the consolidation. Collectively they are a meaningful counter-current, and the trajectory they describe is one of the few hopeful elements in the platform-decade story.

Antitrust, regulation, and the limits of what is being done about this

The platforms have, through the late 2010s and 2020s, attracted substantial regulatory attention. The European Union’s General Data Protection Regulation (GDPR), in force since 2018, imposed substantial obligations on data handling by large platforms. The Digital Services Act and Digital Markets Act, both passed in 2022, imposed further obligations. United States authorities have pursued antitrust cases against Google (US v Google, 2020, with a major ruling in 2024 finding Google to be an illegal monopolist in search), against Facebook (FTC v Meta, ongoing), against Apple (US v Apple, filed 2024, ongoing), and against Amazon (FTC v Amazon, filed 2023, ongoing). Apple has also faced the private suit Epic Games v Apple, with mixed outcomes, and the EU’s various app-store mandates. Various other jurisdictions have similar cases ongoing.

The regulatory response has had real but limited effects. The platforms have made changes — GDPR consent flows, app-store changes in the EU, various transparency reports, various advertising-data restrictions — but the underlying consolidation has not, so far, been reversed. The economic position of the largest platforms in 2026 remains roughly what it was in 2020, with some shifts at the margins. Whether the regulatory pressure will eventually produce structural change or merely produce procedural changes that leave the underlying structure intact is, in 2026, an open question. The recovery efforts in Part V should be read as efforts that proceed in parallel with the regulatory response, not in lieu of it.

What the platform decade has done to the substrate

The most important question for this book’s argument is what the platform decade has done to the substrate underneath it. The internet’s core protocols are, in their essentials, unchanged. HTTP, TCP/IP, DNS, and SMTP have been revised over the years, but they still work essentially as Postel, Berners-Lee, and their colleagues designed them. The protocols are federated; the protocols are peer-to-peer at the architectural level; the protocols do not require any of the platforms. The substrate is intact.

What has changed is what runs on the substrate. The applications most users use, the services most businesses depend on, the infrastructure most of the web runs on — these are increasingly the property of a small number of companies. The substrate’s federation is still there at the protocol level and increasingly invisible at the application level. A user who knows only the consumer internet has no obvious reason to know that the substrate is federated; everything they experience appears to be the platforms.

The recovery work in Part V is, in part, work to make the substrate’s federation visible again. Local-first software wants to give users data that does not live on platforms. ActivityPub wants to give users social network experiences that span servers. The static renaissance wants to give users documents that do not depend on platform infrastructure. Each of these is, at root, a way of bringing the federated substrate back into the user’s experience.

There is also, less optimistically, a question about whether the substrate itself will remain intact. The internet’s infrastructure depends on the cooperative behavior of network operators, on the continued operation of standards bodies that are not captured, on the continued availability of independent name servers and registries, and on a working ecology of independent operators below the platform layer. The trends through the platform decade have, in various small ways, put pressure on this ecology. The cloud consolidation puts pressure on independent server hosting. The app store consolidation puts pressure on independent software distribution. The advertising consolidation puts pressure on independent publishing. The search consolidation puts pressure on independent finding. The cumulative pressure, while not yet existential, is real, and the work of maintaining the federated substrate has become harder than it was twenty years ago.

The conclusion to Part IV

The four chapters of Part IV have traced the web’s victory. The technical decisions of the early 1990s, the commercial inflection of 1995, the standards-body politics that followed, the accidental dominance of JavaScript, and the platform consolidation of the 2010s and 2020s — these are the substance of how the web went from being one of several proposed networked-computing platforms to being the substrate of contemporary computing. The web’s victory has been real and has been beneficial in many ways: it has made information available to far more people than the pre-web alternatives could have reached, it has produced enormous economic value, it has enabled forms of expression and community that would not otherwise have existed, and it has been, on the whole, one of the more successful experiments in public infrastructure in the history of the species.

The victory has also come with costs, which are the subject of Part V. The document was lost to the application; the personal authorship was lost to the platform; the federated peer model was lost to the consolidation; the structural properties that the pre-web alternatives had offered were lost to the simplicity of the web’s protocol. Some of these losses are being recovered. Some are not. The five chapters of Part V take the recoveries — Bret Victor and the dynamic document tradition, local-first software, ActivityPub and the IndieWeb, the static renaissance, the personal knowledge tools — and try to be honest about what is being recovered, what is partial, and what is not happening. The book ends with an assessment of what is permanent in the losses and what is not.

The first chapter of Part V takes the recovery that has had the smallest user base and the largest intellectual influence. The dynamic document tradition — running from Bret Victor’s explorables through Observable, Jupyter, and the recent renaissance in computational notebooks — has been less a mass-market phenomenon than a set of demonstrations of what a document could be if it were also a computation. The demonstrations have not yet reached a wide audience. The intellectual ambition behind them is substantial, and the influence on the recovery of the document on the web has been disproportionate to their direct user base.