Local-first and the data ownership recovery

The platform decade did, among other things, transfer a generation’s data from users’ own machines to the platforms’ servers. The transfer was not announced; it happened as a side effect of the cloud-application model that became dominant through the 2010s. Email moved to Gmail; documents moved to Google Drive; notes moved to Evernote and then to its successors; photos moved to iCloud and Google Photos and Facebook; messages moved to WhatsApp and Messenger and Slack; the various spreadsheets and project plans and personal records that used to live on hard drives moved to one or another web application’s back-end database. The transfer was, in many ways, beneficial: it brought synchronization across devices, collaboration with others, off-site backup, and access from anywhere with a browser. The transfer also produced a specific kind of dependency: the user’s data, in functional terms, belongs to the platform, and the user’s ability to use that data is contingent on the platform’s continued existence, continued operation, and continued willingness to permit the use. This dependency is what local-first software is trying to undo. This chapter is about the local-first movement: what it claims, what technologies it depends on, what has been built, and what is plausible as a path forward.

The 2019 manifesto

The local-first software movement crystallized around an essay published in April 2019 by Martin Kleppmann, Adam Wiggins, Peter van Hardenberg, and Mark McGranaghan, under the auspices of Ink & Switch, a small research group the latter three had been running in Brooklyn and San Francisco. The essay was titled “Local-first software: You own your data, in spite of the cloud.” In a relatively short document, it condensed a position that had been taking shape in pieces across several research traditions for the previous decade.

The essay proposed seven ideals for local-first software. The first was speed: applications should respond immediately to user actions, without the network round-trip that cloud applications require. The second was the absence of single-device lock-in: the user’s data should work on whatever device the user has, not be trapped on the one they happened to create it on. The third was that the network should be optional: applications should work offline, with the network providing convenience features rather than being a required dependency. The fourth was seamless collaboration: when network connectivity is available, users should be able to collaborate with others without that collaboration being mediated by a central server in a way that takes their work hostage. The fifth was “the long now”: users should still have access to their data in twenty years, regardless of what has happened to the company that built the tool. The sixth was security and privacy by default: the user’s data should be private from the operator of any cloud service, not just from third-party attackers. The seventh was ultimate ownership and control: the user, not the operator, should hold final authority over their data.

The seven ideals were not novel as individual propositions. Each had been articulated, in various forms, by previous critics of the cloud-application model. What the essay did was assemble them into a coherent program and attach the program to a name. “Local-first software” became, after the essay, a term of art that designers and engineers could refer to when discussing the architectural alternatives to platform-dependent cloud applications. The movement that the essay named has, in the six years since, produced research, tooling, and a small but growing number of production applications.

CRDTs

The technical substrate that makes local-first applications possible is a body of research on what are called Conflict-free Replicated Data Types, or CRDTs. The research traces back to work in distributed systems and operational transformation through the 1990s and 2000s. Operational transformation, an algorithm developed for collaborative text editing (Ellis and Gibbs’s 1989 paper is an early reference), was the technology behind Google Wave (2009) and is the substrate for Google Docs’s collaborative editing. Operational transformation works but is notoriously hard to get right: the algorithms are subtle, and the implementations deployed in practice generally rely on a central server to order operations and maintain correctness.

CRDTs were proposed as an alternative. The foundational paper, “Conflict-free Replicated Data Types” by Marc Shapiro, Nuno Preguiça, Carlos Baquero, and Marek Zawirski, was published in 2011. The core idea was that data structures could be designed so that concurrent operations commute: they can be applied in any order, by any replica, and the final state is the same regardless of order. (In the state-based formulation, the equivalent requirement is a merge function that is commutative, associative, and idempotent.) A CRDT-based system can have multiple replicas modifying the data independently, exchanging operations whenever they happen to be in contact, and converging to a consistent state without requiring any central coordinator. The mathematical properties that make this possible are constraining: not every data structure admits a CRDT implementation, and the implementations that do exist often carry overhead in metadata size. But enough common structures (counters, sets, lists, maps, text documents) have working CRDT implementations that significant applications can be built on them.
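The convergence property is easiest to see in the simplest CRDT, a grow-only counter. The following is a minimal sketch, not code from any of the libraries discussed in this chapter; the class name `GCounter` and the replica names are illustrative assumptions. Each replica increments only its own slot, and merging takes the per-slot maximum, so merges can arrive in any order, or more than once, without changing the result.

```python
class GCounter:
    """Grow-only counter: each replica increments only its own slot."""

    def __init__(self, replica_id):
        self.replica_id = replica_id
        self.counts = {}  # replica id -> increments observed from that replica

    def increment(self, amount=1):
        self.counts[self.replica_id] = self.counts.get(self.replica_id, 0) + amount

    def merge(self, other):
        # Per-slot maximum is commutative, associative, and idempotent,
        # which is what lets replicas merge in any order.
        for replica_id, count in other.counts.items():
            self.counts[replica_id] = max(self.counts.get(replica_id, 0), count)

    def value(self):
        return sum(self.counts.values())


laptop, phone = GCounter("laptop"), GCounter("phone")
laptop.increment(3)  # changes made offline on the laptop
phone.increment(2)   # changes made offline on the phone

laptop.merge(phone)  # the replicas meet and exchange state
phone.merge(laptop)
phone.merge(laptop)  # a duplicate delivery changes nothing
```

After the exchange both replicas report the same total, and neither had to ask a coordinator which increments came first.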

The local-first proposition depends on CRDTs because the local-first model assumes that any replica of the data can be modified independently and that the replicas must be able to converge when they meet. This is exactly the problem CRDTs are designed to solve. Without CRDTs (or equivalent algorithms), local-first software has to choose between merge conflicts that the user must resolve manually (which is a poor user experience) and centralized coordination that defeats the local-first proposition. With CRDTs, the merge happens automatically, in the background, with deterministic outcomes that don’t require a central authority.
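To make "deterministic outcomes without a central authority" concrete, here is a hedged sketch of a last-writer-wins map, one of the simpler ways to resolve concurrent writes to the same key. The class `LWWMap` and its tie-breaking rule (logical clock first, then replica name) are assumptions chosen for illustration; production libraries use richer strategies.

```python
class LWWMap:
    """Last-writer-wins map: ties are broken by replica id, so every
    replica picks the same winner without consulting anyone."""

    def __init__(self, replica_id):
        self.replica_id = replica_id
        self.clock = 0      # logical (Lamport-style) clock, not wall time
        self.entries = {}   # key -> (clock, replica_id, value)

    def set(self, key, value):
        self.clock += 1
        self.entries[key] = (self.clock, self.replica_id, value)

    def get(self, key):
        entry = self.entries.get(key)
        return entry[2] if entry else None

    def merge(self, other):
        for key, theirs in other.entries.items():
            mine = self.entries.get(key)
            # Compare only (clock, replica_id); the value never decides.
            if mine is None or theirs[:2] > mine[:2]:
                self.entries[key] = theirs
        self.clock = max(self.clock, other.clock)


laptop, phone = LWWMap("laptop"), LWWMap("phone")
laptop.set("title", "Draft")
phone.merge(laptop)            # both replicas start from the same state

laptop.set("title", "Plan A")  # concurrent edits while offline
phone.set("title", "Plan B")

laptop.merge(phone)            # merge in either order
phone.merge(laptop)
```

Both replicas settle on the same title with no prompt shown to the user. The price of this particular rule is that one of the two concurrent writes is silently discarded, which is why richer data types such as text use more elaborate CRDTs.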

The library ecosystem

The CRDT research has produced a small but capable library ecosystem. The two most prominent libraries in 2026 are Automerge and Yjs.

Automerge, principally developed by Martin Kleppmann (one of the authors of the 2019 essay) with substantial contributions from others, is a JSON-like CRDT library that lets applications model their data as a structured document and synchronize it across peers. The core is written in Rust, with a JavaScript/TypeScript API and bindings to several other languages. The data model is rich enough to support text editing, list manipulation, map/object manipulation, and various structured data operations, with the underlying CRDT machinery handling the synchronization. Automerge is in active development, with regular performance improvements and new features, and is used by a number of production applications.

Yjs, developed by Kevin Jahns and a team, is a similar CRDT library with a different design emphasis. Yjs has been particularly focused on collaborative text editing, with editor-integration packages for ProseMirror, CodeMirror, Quill, and other rich-text editor frameworks. Yjs has a substantial production user base, including in commercial collaborative-editing applications, and is generally considered the more performant of the two libraries for the text-editing use case. Both libraries are open source under permissive licenses, and the choice between them is mostly a question of which library’s API fits the application’s data model better.

The libraries handle the merge problem; they do not handle the transport problem. Local-first applications also need some way to exchange operations between replicas. The transport can be peer-to-peer (direct connections between user devices), can use a relay server (which forwards operations but does not need to understand them), or can use a more conventional client-server architecture with the server being a relay rather than the canonical state-holder. The local-first model permits any of these, with the constraint that no component in the system should be a single point of trust that has access to the data unencrypted.
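The relay option can be sketched in a few lines. This is an in-memory illustration under stated assumptions: the `Relay` class, its method names, and the document and peer identifiers are hypothetical, and the payloads are assumed to have been encrypted by the clients before they reach the relay. The point is only that the relay stores and forwards opaque bytes and never needs to parse them.

```python
from collections import defaultdict


class Relay:
    """Store-and-forward relay that treats every payload as opaque bytes."""

    def __init__(self):
        self.peers = defaultdict(set)  # doc id -> peer ids
        # doc id -> peer id -> payloads waiting for that peer
        self.mailboxes = defaultdict(lambda: defaultdict(list))

    def join(self, doc_id, peer_id):
        self.peers[doc_id].add(peer_id)

    def publish(self, doc_id, sender_id, payload: bytes):
        # Fan out to every other peer; the relay never inspects the payload.
        for peer_id in self.peers[doc_id]:
            if peer_id != sender_id:
                self.mailboxes[doc_id][peer_id].append(payload)

    def fetch(self, doc_id, peer_id):
        # Hand over everything queued for this peer and clear its mailbox.
        pending = self.mailboxes[doc_id][peer_id]
        self.mailboxes[doc_id][peer_id] = []
        return pending


relay = Relay()
relay.join("notes", "laptop")
relay.join("notes", "phone")

# Stand-in for an encrypted CRDT update; the relay cannot read it.
relay.publish("notes", "laptop", b"\x8f\x02opaque-ciphertext")
inbox = relay.fetch("notes", "phone")
```

Because the merge logic lives entirely in the clients, this relay could be replaced by a direct peer-to-peer link or a different vendor's server without changing the data model.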

Ink & Switch

The research lab that produced the 2019 manifesto and several of the prominent open-source libraries deserves brief description. Ink & Switch was founded in 2015 by Peter van Hardenberg, Adam Wiggins, and James Lindenbaum, all of whom had been at Heroku before. The lab is privately funded — primarily through the founders’ personal resources, with various grants and contracts at the margins — and operates as a small (typically 10-15 person) research team. The lab’s work is mostly published as research reports, prototypes, and open-source code. The team has produced, over the past decade, a substantial body of work on local-first software, on the user-interface design of collaborative tools, on the integration of CRDTs into application architectures, and on related topics.

Ink & Switch’s prototypes have included Pushpin (a collaborative pinboard application that demonstrated local-first patterns), Capstone (a more elaborate canvas-based notebook), and a series of related explorations. None of these has been a commercial product; they are research artifacts, intended to demonstrate possibilities and to surface design problems for further work. The lab’s published reports — “Local-first software,” “Project Cambria” (on schema evolution for CRDT-based applications), “Project Aphelion” (on collaboration with strangers), and others — are the public-facing artifacts that have most influenced the broader local-first community.

The lab’s existence and approach are themselves part of the recovery story. The pre-web era had research labs that took long-running architectural questions seriously and produced demonstrations rather than products: Xerox PARC, Bell Labs, SRI, MIT’s Media Lab. The intervening decades have seen many of these labs decline or be repurposed toward shorter-term commercial work. Ink & Switch is part of a small recent revival of the research-lab model — alongside groups like Hack Club’s HQ, Future of Coding’s various initiatives, and a few others — in which technical research is done in public, on time horizons longer than a product cycle, with the output being published rather than commercialized. The pattern is fragile and depends on patient funding, but it is producing genuinely substantial work.

Production applications

The local-first movement has begun to produce production applications that real users use. The list is shorter than the discourse around the movement might suggest, but it is real.

Obsidian, the personal knowledge tool treated in chapter twenty-four, is local-first by design. Notes are stored as Markdown files on the user’s machine. Synchronization across devices is provided by either the user’s own file-synchronization tool (Dropbox, iCloud, Syncthing) or Obsidian’s own paid sync service, with the local files being the canonical store either way. The user’s data is, in a strong sense, the user’s: it lives on their machines as files they own, in a format they can read with any text editor, and continues to exist if Obsidian the company goes away.

Linear, the project-tracking tool launched in 2019, has built its application around an aggressively local-first architecture in which the client maintains a full local copy of the relevant data and the server is a synchronization mechanism. The result is an application that feels responsive in a way that conventional cloud applications do not, with most operations completing instantly because they execute on local data. Linear is not strictly local-first in the manifesto’s full sense — the server holds the canonical state for most purposes, the user does not have full ownership of the data in the way that Obsidian provides — but the architecture is closer to local-first than to conventional cloud and demonstrates that the pattern can produce commercially successful applications.

Apple Notes has been moving in a local-first direction over recent years, with iCloud serving as a synchronization layer for notes that are stored locally on each device. The architecture is not exactly the local-first manifesto’s, but the user-facing properties — fast local responses, working offline, persistence even if Apple’s servers are unavailable — are close enough to the manifesto’s ideals that the application qualifies for inclusion in the local-first conversation.

Several smaller applications — Tarvos (collaborative note-taking), Muse (visual thinking tool), various prototypes from the Ink & Switch alumni — have explored the design space. The overall production population is still small in absolute terms; most consumer applications remain server-canonical, with the local-first patterns being adopted in pockets.

The hard problems

The local-first proposition is not free. The pattern requires solving several problems that conventional cloud applications can avoid.

Schema evolution. A CRDT-based application stores data on user devices, and the data persists across application updates. When the application’s schema changes — when new features require new data structures, or when existing structures need to be reorganized — the old data has to be migrated. Conventional cloud applications can do this with a server-side migration script that runs once. Local-first applications need each replica to be able to migrate its own data, with the migration being compatible across all the replicas the user (and any collaborators) have. The problem is hard and has been the subject of substantial Ink & Switch research (the Cambria project) without yet producing a clean general solution.
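One common partial answer is to version each document and have every replica upgrade old data on read. The sketch below assumes a hypothetical to-do item whose schema changed twice; the field names, version numbers, and the `upgrade` function are illustrative, and this is not the Cambria approach itself. It handles only the forward direction. The harder case, an old replica receiving data written by a newer one, is left unsolved here.

```python
def v1_to_v2(doc):
    # v2 replaced the boolean `done` with a string `status`.
    out = {k: v for k, v in doc.items() if k != "done"}
    out["status"] = "done" if doc.get("done") else "todo"
    return out


def v2_to_v3(doc):
    # v3 added a `tags` list, empty by default.
    return {**doc, "tags": doc.get("tags", [])}


UPGRADES = {1: v1_to_v2, 2: v2_to_v3}  # version n -> function producing n + 1
CURRENT_VERSION = 3


def upgrade(doc):
    """Return a copy of `doc` migrated to CURRENT_VERSION.

    Each step is a pure function, so every replica that upgrades the
    same stored data arrives at the same result."""
    doc = dict(doc)
    version = doc.get("schema_version", 1)
    if version > CURRENT_VERSION:
        raise ValueError("document was written by a newer application version")
    while version < CURRENT_VERSION:
        doc = UPGRADES[version](doc)
        version += 1
        doc["schema_version"] = version
    return doc


old_item = {"schema_version": 1, "title": "Write chapter", "done": True}
new_item = upgrade(old_item)
```

The original record is left untouched on disk, so a replica still running the old version can keep reading it while newer replicas see the upgraded view.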

Authentication and access control. Local-first systems often want to share data among a controlled group — colleagues, friends, family — without that group having to involve a central authentication server. The cryptographic protocols for doing this are not exotic but they are also not trivial; the application has to handle key management, key rotation, group membership, and the various edge cases of trust establishment in a way that is invisible to non-technical users. Most production local-first applications have, so far, accepted some compromise here, with the application’s vendor providing some authentication infrastructure even though the data itself lives on user devices.

Storage cost. CRDT-based data structures often have substantial metadata overhead. A document that would be a few kilobytes in a conventional format may be a few hundred kilobytes in a CRDT format, because the CRDT needs to maintain the operation history to support merging. The overhead grows with the document’s lifetime. Various compaction and pruning strategies have been developed, but none is free, and applications working with large documents have to budget for the metadata cost.
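A rough way to see where the overhead comes from is to count bytes for a text CRDT that tags every character with an identifier. The sketch below uses a deliberately naive JSON encoding, and the element layout and replica id are assumptions for illustration. Real libraries use compact binary formats, so the ratio here overstates what Automerge or Yjs would produce, but the shape of the cost is the same: per-character metadata, plus tombstones that remain after text is deleted.

```python
import json


def build_elements(text, replica_id="a1b2c3d4"):
    """Represent text as one element per character, each with a unique id
    and a pointer to the element it was inserted after."""
    elements, parent = [], None
    for counter, ch in enumerate(text, start=1):
        elem_id = [counter, replica_id]
        elements.append({"id": elem_id, "parent": parent, "ch": ch, "deleted": False})
        parent = elem_id
    return elements


def encode_naive(elements):
    # Naive on-disk form; real libraries compress this heavily.
    return json.dumps(elements).encode("utf-8")


def visible_text(elements):
    return "".join(e["ch"] for e in elements if not e["deleted"])


text = "local-first"
elements = build_elements(text)
plain_bytes = len(text.encode("utf-8"))
crdt_bytes = len(encode_naive(elements))

# Deleting every character leaves tombstones behind: the visible
# document is empty, but the stored structure barely shrinks.
for e in elements:
    e["deleted"] = True
tombstoned_bytes = len(encode_naive(elements))
```

Comparing `plain_bytes` with `crdt_bytes` shows the per-character cost, and `tombstoned_bytes` shows why a long-lived document keeps growing even when its visible text does not.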

Search. Conventional cloud applications can do full-text search across all users’ data with centralized indexes. Local-first applications need each replica to maintain its own index, which is more expensive in storage and computation. The indexes also need to be updated as operations arrive from peers, which is more complex than indexing a stable corpus. Several CRDT-friendly indexing approaches have been developed, but the user experience of search in local-first applications is still, in many cases, less polished than in cloud applications.
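The per-replica index can be kept current by re-indexing only the documents that change when a peer's operations arrive. The following is a minimal sketch of that idea; the `LocalIndex` class, the whitespace-and-punctuation tokenizer, and the note ids are assumptions, and it ignores ranking, stemming, and persistence.

```python
import re
from collections import defaultdict


def tokenize(text):
    return set(re.findall(r"\w+", text.lower()))


class LocalIndex:
    """Inverted index maintained on one replica and updated incrementally."""

    def __init__(self):
        self.postings = defaultdict(set)  # term -> ids of documents containing it
        self.terms_by_doc = {}            # doc id -> terms currently indexed

    def update(self, doc_id, text):
        # Called after a local edit or after merging a peer's changes.
        new_terms = tokenize(text)
        old_terms = self.terms_by_doc.get(doc_id, set())
        for term in old_terms - new_terms:
            self.postings[term].discard(doc_id)
            if not self.postings[term]:
                del self.postings[term]
        for term in new_terms - old_terms:
            self.postings[term].add(doc_id)
        self.terms_by_doc[doc_id] = new_terms

    def search(self, query):
        terms = tokenize(query)
        if not terms:
            return set()
        # A document matches only if it contains every query term.
        return set.intersection(*(self.postings.get(t, set()) for t in terms))


index = LocalIndex()
index.update("n1", "Grocery list: oat milk")
index.update("n2", "Recipe with milk and flour")
before = index.search("milk")

# A merged edit from another device changes note n1.
index.update("n1", "Grocery list: oat flour")
after = index.search("milk")
```

Only the terms that actually changed are touched, which keeps the cost of an incoming edit proportional to the edit rather than to the size of the corpus.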

The user-facing presentation of CRDTs. Some CRDT operations have semantics that are surprising to users. Two users concurrently editing the same paragraph may, after the merge, see a result that neither of them produced — a merge of their two edits that is logically consistent but is not what either intended. Operational transformation has the same issue but typically with more sophisticated heuristics for which version to prefer. The user-experience design of local-first applications has to handle these merge surprises gracefully, and the design is hard.
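The merge surprise can be reproduced with a very small sequence CRDT. The sketch below is a simplified, RGA-style design written for this illustration. It is not the algorithm used by Automerge or Yjs; the `TextReplica` class and its ordering rule are assumptions, deletions are omitted, and operations are assumed to arrive in causal order. Each character records the character it was typed after, and concurrent insertions at the same spot are ordered by their ids.

```python
from dataclasses import dataclass, field

ROOT = (0, "")  # sentinel id that the first character is inserted after


@dataclass
class TextReplica:
    name: str
    clock: int = 0
    elements: dict = field(default_factory=dict)  # id -> (parent id, char)

    def insert_run(self, after, text):
        """Type `text` after element `after`; return the operations to send."""
        ops, parent = [], after
        for ch in text:
            self.clock += 1
            elem_id = (self.clock, self.name)
            ops.append((elem_id, parent, ch))
            parent = elem_id  # each character follows the previous one
        self.apply(ops)
        return ops

    def apply(self, ops):
        for elem_id, parent, ch in ops:
            self.elements[elem_id] = (parent, ch)
            self.clock = max(self.clock, elem_id[0])

    def text(self):
        children = {}
        for elem_id, (parent, _) in self.elements.items():
            children.setdefault(parent, []).append(elem_id)
        out, stack = [], [ROOT]
        while stack:
            node = stack.pop()
            if node != ROOT:
                out.append(self.elements[node][1])
            # Siblings are visited highest id first, so push them ascending.
            stack.extend(sorted(children.get(node, [])))
        return "".join(out)


alice, bob = TextReplica("alice"), TextReplica("bob")
base = alice.insert_run(ROOT, "The cat ran")
bob.apply(base)
tail = base[-1][0]  # id of the final "n"

alice_ops = alice.insert_run(tail, " home")  # concurrent edits, same position
bob_ops = bob.insert_run(tail, " fast")

alice.apply(bob_ops)
bob.apply(alice_ops)
```

Both replicas converge on "The cat ran fast home", a sentence that is consistent across devices but that neither Alice nor Bob wrote. The data structure has done its job; whether the result is acceptable is a question for the interface built on top of it.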

What is at stake

The argument for local-first software is, at the deepest level, an argument about the relationship between users and software. The cloud-application model that has dominated the last two decades has a specific implicit assumption: that the user is the application’s customer, that the application’s vendor is the responsible party, and that the vendor’s continued existence and continued willingness to provide service are conditions of the user’s relationship with their data. This assumption is, in many cases, fine. Many cloud applications are well-run, will continue to exist for decades, and will provide their users with reliable access to their data for as long as they need it. The assumption is also, in many cases, false. Companies fail, are acquired, are discontinued, change their pricing, change their policies, lose their data, or simply decide that a particular feature or product line is no longer worth maintaining. When this happens, users of the affected applications find that their relationship to their own data has changed in ways they did not anticipate.

The local-first proposition is that this should not be a normal occurrence. The user’s data should belong to the user in a robust sense — should be stored on the user’s machines, should be in a format the user can read, should continue to be accessible if the application’s vendor goes away. The cloud convenience features (synchronization, collaboration, off-site backup) should be supplements to the local storage, not replacements for it. The user’s relationship to their data should be the same as the user’s relationship to their books or their photographs in a physical archive: theirs, persistently, in a form they can access without permission.

This proposition has parallels to the proposition that motivated the email chapter (chapter thirteen), and to the proposition that motivated the federated services chapters more broadly. Email is the surviving federated peer system; local-first is the proposal to recover, at the application layer, what email recovered at the protocol layer. If local-first succeeds at any scale, the architecture of consumer software changes: users have local data, the cloud provides services on top of the data rather than holding it, and the relationship between user and software returns to something closer to the desktop-application model than to the cloud-application model.

What is being recovered and what is not

The local-first recovery, where it succeeds, recovers several things:

The user’s data ownership. If the data lives on the user’s machine in a readable format, the user owns it in the strong sense. This is the central proposition.

Offline operation. Applications that maintain local data can work without the network, with the network being used only for synchronization. This is a usability win independent of the ownership argument.

Speed. Local-first applications can be substantially faster than cloud applications, because most operations don’t require a network round trip. Linear’s success is largely attributable to this property.

Long-term durability. If the data is in a portable format on the user’s machine, the user can continue to use it indefinitely, with or without the application’s vendor. Obsidian’s Markdown files are the clearest current example.

What the recovery has not yet achieved:

Mass-market adoption. The local-first applications described above have user bases in the millions but not the billions. The mainstream consumer internet remains overwhelmingly cloud-canonical.

A general-purpose local-first platform. There is no single application platform on which local-first applications run, the way there is for cloud applications. Each local-first application has to solve, mostly from scratch, the problems of synchronization, schema evolution, identity, and the rest. The platforms — Apple’s iCloud, Google’s Drive, Dropbox — are cloud platforms, and the local-first proposition is, in a sense, against the platform model.

End-to-end encryption that is invisible to users. The local-first manifesto includes privacy as one of its ideals, but most production local-first applications have not yet achieved the kind of seamless end-to-end encryption that the ideal calls for. The cryptographic protocols are available; the user-experience design that hides them behind a simple interface is still being developed.

The federation between local-first applications. Each local-first application is, currently, an island. Data in Obsidian is not, generally, accessible from Linear, or the reverse. The local-first movement has, at this point, mostly recovered the property of local ownership; it has not yet recovered the property of inter-application portability that the desktop-application era had partially provided.

What it would take

Several conditions have to be met for local-first software to become the default architecture of consumer software. The libraries have to become more capable, with better performance, better schema-evolution support, and better integration with the existing application-development ecosystem. The platforms have to support it, with operating-system-level facilities for local storage, synchronization, and identity that applications can build on rather than reinventing. The user-experience patterns have to be developed so that users understand what local-first means and what they get for using it. The economic model has to be worked out: local-first applications cannot, in general, monetize through advertising on user data, because the data is on the user’s device, so the alternative models (subscription, one-time purchase, freemium with sync as the paid feature) have to be sustainable.

These conditions are not, individually, exotic. They are also not yet in place at the scale that would make local-first the default. The current state, in 2026, is that local-first is a serious architectural alternative that produces credible applications in specific domains. The trajectory is upward and the conditions for broader adoption are gradually being met; the open question is how quickly.

The next chapter takes up a different recovery, oriented around a different problem. ActivityPub and the IndieWeb are working on the federation problem rather than the data-ownership problem: their proposition is that the social-network functions of the platform decade can be reassembled, on federated infrastructure, with users on independent servers exchanging content through standard protocols. The recoveries are complementary; together they would address most of what the platform decade has consolidated, although neither is yet at a scale where it constitutes a comprehensive answer.