Designing a Personal Information Architecture
I once had a system for managing information that involved four apps, three browser extensions, two automated workflows, a spreadsheet, and a growing sense of dread. It was magnificent. It was comprehensive. It lasted eleven days.
On day twelve, I missed an important email because my triage workflow had routed it to a “process later” queue that I’d forgotten existed. On day thirteen, I realized my automated RSS-to-notes pipeline had been silently failing for a week. On day fourteen, I went back to reading whatever showed up on my screen and feeling vaguely guilty about it.
If this sounds familiar, congratulations: you’ve attempted to build a personal information architecture. The fact that it collapsed isn’t a sign of weakness. It’s a sign that you, like most people, designed a system optimized for the fantasy version of yourself rather than the actual human who has to operate it at 7 AM before coffee.
This chapter is about designing a system that works for the real you. Not the you who spent a productive Saturday afternoon setting up Notion templates. The you who, three weeks later, is going to be tired and busy and tempted to just scroll Twitter instead of engaging with whatever elaborate intake ritual you designed.
The good news: a personal information architecture doesn’t need to be complicated. It needs to be deliberate. There’s a world of difference between “I read whatever the algorithm puts in front of me” and “I have a rough plan for how information flows through my life.” You don’t need the perfect plan. You need a plan you’ll actually follow.
The Four Layers
Every information architecture, whether you’ve designed it consciously or not, has four layers. Understanding these layers is the first step to making yours intentional.
Intake is what comes in. It’s the sum total of information sources you’re exposed to: news sites, social media feeds, email newsletters, Slack channels, RSS feeds, podcasts, conversations, books, papers, the ambient noise of the internet. Most people’s intake layer is a chaotic mess of things they deliberately chose, things they accidentally subscribed to, and things that found them through algorithmic recommendation.
Processing is how you evaluate what comes in. It’s the triage step: what gets your full attention, what gets skimmed, what gets saved for later, what gets ignored. This is where most systems break down, because processing requires energy and judgment, and both are finite resources.
Storage is how you retain what matters. It’s your notes, your bookmarks, your highlights, your knowledge base, your memory (both biological and digital). Storage is where the “productivity internet” spends most of its energy, because storage systems are fun to build and photograph and blog about. They’re also the layer that matters least if your intake and processing layers are broken.
Retrieval is how you find things again. It’s search, it’s tagging, it’s linking, it’s the ability to surface a relevant piece of information when you actually need it. Retrieval is the layer most people neglect entirely, which means they have beautiful note-taking systems full of information they’ll never access again.
These four layers form a pipeline. Information flows in through intake, gets evaluated during processing, persists in storage, and becomes useful through retrieval. Weakness in any layer undermines all the others.
Let’s look at each one in detail.
Designing Your Intake Layer
Your intake layer is the boundary between you and the entire informational output of human civilization. No pressure.
The first principle of intake design is that you should be choosing your sources, not having them chosen for you. This sounds obvious, but consider how much of your daily information intake is algorithmically determined. Your social media feeds, your YouTube recommendations, your news app’s “for you” page, your podcast app’s suggestions — all of these are decisions being made on your behalf by systems optimized for engagement, not for your actual information needs.
Deliberate intake design means:
Choosing sources explicitly. For each domain you care about, identify two to four high-quality sources. Not twenty. Not “all the good ones.” Two to four. You can always add more later. You almost certainly won’t need to.
Setting up intentional feeds. RSS feeds, email newsletters, and curated lists give you control over what arrives. Algorithmic feeds don’t. This doesn’t mean algorithmic feeds are useless — we’ll talk about their role in Chapter 20 — but they shouldn’t be the backbone of your intake.
Creating intake boundaries. Decide where and when information enters your life. Do you check news in the morning or evening? Do you read newsletters on your phone or computer? Do you have designated “intake time” or do you graze all day? These aren’t trivial questions. The answers shape your relationship with information.
Distinguishing intake channels by purpose. Your intake should serve different functions: staying current in your field, maintaining general awareness, exploring new domains, feeding specific projects. Each function might need different sources and different cadences.
Here’s what deliberate intake design looks like in practice:
For professional currency (staying current in your field), you want a small number of high-signal sources you check regularly. These might be specific newsletters, key blogs, or curated feeds from trusted aggregators. The goal is reliability and depth, not breadth.
For general awareness (knowing what’s happening in the world), you want one or two news sources you trust, checked at defined intervals. Not a constant stream. Not push notifications. A deliberate check-in, once or twice a day.
For domain exploration (learning about new areas), you want a rotation of sources outside your usual channels. We’ll cover this extensively in Chapter 20, but the key point for now is that exploration sources should be intentionally different from your regular sources.
For project-specific research, you want targeted intake that you spin up for specific needs and spin down when they’re met. This is where search, academic databases, and AI-assisted research come in.
The most common mistake in intake design is treating all information as equally urgent. It’s not. Professional currency might need daily attention. General awareness can often wait. Domain exploration can happen weekly. Project research is episodic. Designing your intake layer means matching the cadence of your sources to the urgency of the information.
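If it helps to see the cadence idea written down, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the `Channel` record, the `due_channels` helper, and the day counts are mine, not a prescribed setup. Project research is left out because it is episodic rather than scheduled.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass
class Channel:
    purpose: str            # e.g. "professional currency"
    check_every_days: int   # how often this channel deserves a look
    last_checked: Optional[date] = None  # None means never checked


def due_channels(channels, today):
    """Return the channels whose check interval has elapsed."""
    due = []
    for ch in channels:
        if ch.last_checked is None:
            due.append(ch)
        elif today - ch.last_checked >= timedelta(days=ch.check_every_days):
            due.append(ch)
    return due


# Illustrative cadences only; tune them to your own life.
channels = [
    Channel("professional currency", 1, date(2024, 3, 10)),
    Channel("general awareness", 1, date(2024, 3, 11)),
    Channel("domain exploration", 7, date(2024, 3, 8)),
]

for ch in due_channels(channels, today=date(2024, 3, 11)):
    print("Due today:", ch.purpose)
```

The point of the sketch is only that cadence is a property of the channel, not of the individual item. A calendar reminder does the same job with less typing.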
The Source Audit
Before you can design your intake layer, you need to understand what it currently looks like. Here’s a quick audit:
- List every regular information source you consume. Include apps, newsletters, feeds, channels, subscriptions, and habitual sites.
- For each source, note: How often do you check it? How much time do you spend? What percentage of the content is actually useful to you? Why did you originally subscribe or start reading?
- Look at the list. Be honest about what’s there because it’s valuable versus what’s there because you subscribed three years ago and never got around to unsubscribing.
Most people who do this exercise discover two things: they have far more sources than they realized, and a significant percentage of those sources aren’t actually serving any purpose. They’re information habits, not information choices.
The audit isn’t about cutting everything down to some minimalist ideal. It’s about seeing what you have so you can make decisions about what you want.
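A spreadsheet is plenty for this audit, but if you would rather script it, a rough sketch might look like the following. The `Source` fields, the 25% threshold, and the sample rows are all made up for the example; the useful fraction is your own honest guess, not a measurement.

```python
from dataclasses import dataclass


@dataclass
class Source:
    name: str
    minutes_per_week: int
    useful_fraction: float  # your honest estimate, 0.0 to 1.0


def useful_minutes(source):
    """Minutes per week spent on content that was actually useful."""
    return source.minutes_per_week * source.useful_fraction


def review_candidates(sources, min_useful_fraction=0.25):
    """Sources below the threshold, biggest time sinks first."""
    weak = [s for s in sources if s.useful_fraction < min_useful_fraction]
    return sorted(weak, key=lambda s: s.minutes_per_week, reverse=True)


# Made-up audit rows for illustration.
audit = [
    Source("Field newsletter", 30, 0.8),
    Source("Social feed", 300, 0.1),
    Source("Old vendor mailing list", 10, 0.05),
]

for s in review_candidates(audit):
    wasted = s.minutes_per_week - useful_minutes(s)
    print(f"{s.name}: roughly {wasted:.0f} low-value minutes per week")
```

The output is a list of things to look at, not a list of things to cut. The decision stays with you.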
The Processing Layer: Triage That Doesn’t Suck
Processing is where your system earns its keep or falls apart. Raw information is useless until you’ve decided what to do with it.
Effective processing requires a triage workflow — a set of decisions you make about each piece of incoming information. The decisions are simpler than you think:
1. Is this relevant to me right now? If yes, read/engage with it now. If no, proceed to step 2.
2. Will this be relevant to me later? If yes, save it somewhere you’ll find it. If no, proceed to step 3.
3. Is this interesting but not actionable? If yes, decide whether it’s worth the time to engage with it. If no, let it go.
That’s it. Three questions. Everything else is implementation detail.
The hard part isn’t the framework. The hard part is being honest with yourself at step 3. Most of us have a deep reluctance to let information go. What if we need it later? What if it turns out to be important? What if everyone else read it and we didn’t?
Here’s the liberating truth: you will miss things. You will let important things go. This is not a system failure. This is reality. The alternative — trying to process everything — doesn’t prevent you from missing things. It just means you miss them while feeling exhausted instead of missing them while feeling focused.
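The three questions are simple enough to write as a single function. This is a sketch of the decision order only; the `Action` names and the `triage` signature are my own labels, and the judgment behind each yes or no remains yours.

```python
from enum import Enum


class Action(Enum):
    READ_NOW = "read now"
    SAVE = "save for later"
    CONSIDER = "decide if it is worth the time"
    LET_GO = "let it go"


def triage(relevant_now, relevant_later, interesting):
    """Apply the three triage questions in order and return one action."""
    if relevant_now:
        return Action.READ_NOW
    if relevant_later:
        return Action.SAVE
    if interesting:
        return Action.CONSIDER
    return Action.LET_GO


print(triage(relevant_now=False, relevant_later=True, interesting=True).value)
```

Notice that the order matters: an item that is both relevant later and interesting gets saved, and the "is it worth my time" question only comes up for things with no practical use.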
Triage Workflows
A triage workflow is just a routine for processing your intake. Here are some patterns that work:
The Morning Scan. Spend 15-20 minutes scanning your high-priority feeds. For each item: read it, save it for later, or skip it. Don’t get pulled into rabbit holes during triage. Triage is for sorting, not for deep reading.
The Two-Pass Method. First pass: scan headlines and summaries, flag anything worth reading. Second pass: actually read the flagged items. The first pass is fast and ruthless. The second pass is where you slow down and engage. Keeping these separate prevents the common failure mode of spending your entire triage window on the first interesting article you find.
The AI-Assisted Summary. Use an LLM to generate summaries of your incoming content. Scan the summaries, then decide what deserves full attention. This works especially well for lengthy articles, research papers, and newsletters that bury the lede under seven paragraphs of preamble.
The Categorize-Then-Process Method. Sort incoming items into buckets first (work, personal development, current events, curiosity), then process each bucket separately. This prevents context-switching and lets you adjust your processing depth by category.
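To make the Two-Pass Method above concrete, here is one possible sketch. The keyword match standing in for "looks worth reading", the `minutes` estimates, and the sample inbox are all hypothetical; in real life the first pass is your eyes on a list of headlines.

```python
def first_pass(items, looks_worth_reading):
    """Fast pass: judge headlines only and return the flagged items."""
    return [item for item in items if looks_worth_reading(item["headline"])]


def second_pass(flagged, time_budget_minutes):
    """Slow pass: choose flagged items that fit in the reading time you have."""
    reading_list, used = [], 0
    for item in flagged:
        if used + item["minutes"] > time_budget_minutes:
            continue  # does not fit today; it stays flagged for another day
        reading_list.append(item)
        used += item["minutes"]
    return reading_list


# Hypothetical interests and inbox, for illustration only.
INTERESTS = ("consensus", "database")


def matches_interests(headline):
    return any(word in headline.lower() for word in INTERESTS)


inbox = [
    {"headline": "Consensus algorithms explained", "minutes": 12},
    {"headline": "Ten celebrity desk setups", "minutes": 5},
    {"headline": "Consensus in practice: a postmortem", "minutes": 25},
    {"headline": "Notes on database indexing", "minutes": 8},
]

flagged = first_pass(inbox, matches_interests)
for item in second_pass(flagged, time_budget_minutes=20):
    print("Read:", item["headline"])
```

Keeping the two passes as separate functions mirrors the advice: the first never opens an article, and the second never looks at anything that was not flagged.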
Summarization Pipelines
This is where AI tools earn their place in your workflow. A summarization pipeline takes incoming content and produces condensed versions that are faster to triage.
A basic pipeline:
- Content arrives (via RSS, newsletter, etc.)
- AI generates a 2-3 sentence summary and a relevance assessment
- You scan summaries and select what to read in full
- Selected content gets your actual attention
A more sophisticated pipeline:
- Content arrives
- AI generates summary, extracts key claims, and flags potential issues (unsupported claims, missing perspectives, conflicts with previous reading)
- You review the enriched summaries
- Selected content gets full attention with AI annotations as context
- Key insights get extracted and routed to your storage layer
The danger with summarization pipelines is that they make it too easy to feel informed without actually reading anything. A summary of a nuanced article is not the same as understanding the nuanced article. Use summaries for triage, not as a substitute for engagement.
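Here is a minimal sketch of the basic pipeline's shape, assuming the summarizer is just a function you can swap out. The `naive_summary` placeholder below is not an AI at all; it keeps the first couple of sentences so the example runs without any external service. Replace it with a call to whatever tool you actually use.

```python
from typing import Callable


def naive_summary(text: str, max_sentences: int = 2) -> str:
    """Placeholder summarizer: keep the first few sentences.

    A stand-in for a real AI summarizer, used here so the sketch
    runs on its own.
    """
    sentences = [s.strip() for s in text.split(".") if s.strip()]
    if not sentences:
        return ""
    return ". ".join(sentences[:max_sentences]) + "."


def build_digest(articles, summarize: Callable[[str], str]):
    """Pair each title with a short summary for quick scanning."""
    return [
        {"title": a["title"], "summary": summarize(a["body"])}
        for a in articles
    ]


# Hypothetical incoming content.
articles = [
    {"title": "Weekly newsletter", "body": "First point. Second point. Third point."},
]

for entry in build_digest(articles, naive_summary):
    print(f'{entry["title"]}: {entry["summary"]}')
```

The digest is for scanning only. Whatever you select from it still needs to be read in full.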
Categorization Systems
You need a way to categorize incoming information, but you don’t need an elaborate taxonomy. In fact, elaborate taxonomies are one of the top system-killers. If assigning a category requires consulting a decision tree, you won’t do it.
Start with broad categories that map to your actual life:
- Things I need for current work
- Things that develop my professional skills
- Things that inform my understanding of the world
- Things that are purely interesting
Four categories. Maybe five if you have a specific hobby or side project. That’s enough. You can always split categories later if one becomes unwieldy. You can never un-invest the time you spent building a seventeen-category taxonomy that you abandoned after two weeks.
The Storage Layer: Where Good Intentions Go to Die
Here’s a pattern I see constantly: someone gets excited about knowledge management, sets up an elaborate note-taking system, populates it enthusiastically for a few weeks, and then slowly stops using it. Six months later, they discover a “new” approach, migrate everything, and repeat the cycle.
The storage layer is where the gap between intention and practice is widest. Let’s close it.
What Storage Actually Needs to Do
Your storage layer has two jobs:
- Keep information available for when you need it
- Help you think
That’s it. It doesn’t need to be beautiful. It doesn’t need to be comprehensive. It doesn’t need to impress anyone on YouTube. It needs to hold things you might need later and support your thinking process.
Note-Taking Systems
The best note-taking system is the one you’ll actually use. I know that’s unsatisfying advice, but it’s true. The specific tool matters far less than the practice of using it consistently.
That said, some properties make a note-taking system more likely to survive contact with your actual life:
Low friction capture. If it takes more than ten seconds to create a note, you won’t capture fleeting thoughts. Your capture mechanism needs to be faster than your impulse to say “I’ll remember this.”
Reasonable organization. Not no organization (that’s a junk drawer). Not elaborate organization (that’s a part-time job). Something in between. Folders or tags or links that let you find things without requiring you to make complex categorization decisions at capture time.
Search that works. When your note collection grows beyond a few hundred items, browsing stops working. You need full-text search at minimum. Tagging and linking help but aren’t substitutes for search.
Regular review. Notes you never look at again might as well not exist. Your system needs a mechanism — even informal — for revisiting stored information.
The “Write to Think” Principle
Here’s the most underappreciated function of a note-taking system: it’s not primarily for storing information. It’s for processing information. The act of writing about something — summarizing it, connecting it to other things you know, articulating your reaction to it — is how you actually learn it.
This is why highlighting is nearly useless as a knowledge management technique. Highlighting is the illusion of engagement. Writing, even brief writing, is actual engagement.
When you encounter something worth storing, don’t just clip it. Write a brief note: what’s the key idea, why does it matter to you, how does it connect to things you already know. This takes thirty seconds and transforms passive storage into active thinking.
This doesn’t mean every note needs to be an essay. A few sentences is fine. Even a single sentence that captures your reaction is better than a pristine highlight with no context. Future you will not remember why past you highlighted that paragraph. Future you will understand a sentence that says “This contradicts what Smith said about X — look into this.”
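If you keep notes in a plain text file, the three prompts can become a tiny template. This is a sketch under my own assumptions: the bracketed date header, the field labels, and the `format_note` and `append_note` names are illustrative conventions, not a standard.

```python
from datetime import date


def format_note(source, key_idea, why_it_matters, connections, on=None):
    """Build a short plain-text note from the three prompts."""
    on = on or date.today()
    lines = [
        f"[{on.isoformat()}] {source}",
        f"Key idea: {key_idea}",
        f"Why it matters: {why_it_matters}",
        f"Connects to: {connections}",
    ]
    return "\n".join(lines)


def append_note(path, note):
    """Append the note to a single running notes file."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(note + "\n\n")


note = format_note(
    source="Article on coordination",
    key_idea="Agreement costs grow with group size.",
    why_it_matters="Explains why our planning meetings drag.",
    connections="Contradicts what Smith said about X; look into this.",
    on=date(2024, 3, 11),
)
print(note)
```

The template is only there to lower the friction. If filling in three fields ever feels like a chore, one sentence in the first field is still a note.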
Knowledge Bases
A knowledge base is different from a note-taking system, though many tools try to be both. Notes are typically chronological, informal, and personal. A knowledge base is typically organized by topic, more polished, and potentially shared.
You probably need notes. You might need a knowledge base. Don’t build one until you have a clear use case.
If you do need a knowledge base, the same principles apply: low friction, reasonable organization, good search. Add one more: progressive refinement. A knowledge base entry should start rough and get better over time, not start perfect. If you wait until you have a complete understanding of a topic before creating an entry, you’ll never create entries.
The wiki model works well for knowledge bases: create a stub when you first encounter a topic, add to it as you learn more, link it to related entries, refine it periodically. Accept that many entries will remain stubs forever. That’s fine. A stub you can find is more useful than a comprehensive entry you never wrote.
What AI Can Do for Storage
AI tools are genuinely useful in the storage layer:
- Summarization at capture. When you save a long article, an AI summary alongside the original gives you a quick way to recall why you saved it.
- Tagging assistance. AI can suggest tags based on content, reducing the friction of categorization.
- Connection surfacing. Some tools can identify connections between notes that you might not have noticed. This is hit-or-miss but occasionally revelatory.
- Query-based retrieval. Instead of searching by keyword, you can describe what you’re looking for in natural language.
What AI can’t do for storage is think for you. The “write to think” principle still applies. Letting AI generate your notes defeats the purpose. Let AI help you find, organize, and connect information. Do the thinking yourself.
The Retrieval Layer: The Part Everyone Forgets
You’ve carefully curated your intake, diligently processed your incoming information, and thoughtfully stored what matters. Three months later, you need that article about distributed consensus algorithms you read in September, and you can’t find it.
The retrieval layer is the least glamorous part of an information architecture and arguably the most important. Information you can’t find when you need it might as well not exist.
Search
Full-text search is table stakes. Every tool in your stack should support it. If your note-taking app doesn’t have good search, switch apps.
But search only works if you know roughly what you’re looking for. Searching for “consensus algorithms” will find notes that contain those words. Searching for “that article that changed how I think about coordination problems” won’t find anything, unless you wrote a note at the time that used those words.
This is another argument for the “write to think” principle. Your notes are search surface area. The more you write about what you’ve read, the more findable it becomes later.
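To see why your own words matter, consider a bare-bones full-text search. This sketch assumes notes live as .txt files in a folder; the `load_notes` and `search` helpers and the sample notes are hypothetical, and any real note-taking app will have something better built in.

```python
from pathlib import Path


def load_notes(folder):
    """Read every .txt file under a folder into a {relative path: text} dict."""
    root = Path(folder)
    return {
        str(p.relative_to(root)): p.read_text(encoding="utf-8")
        for p in root.rglob("*.txt")
    }


def search(notes, query):
    """Return names of notes that contain every word in the query."""
    words = query.lower().split()
    return sorted(
        name for name, text in notes.items()
        if all(w in text.lower() for w in words)
    )


# In-memory sample so the sketch runs without touching the disk.
notes = {
    "2024-09-raft.txt": "Raft paper. Consensus algorithms made readable.",
    "2024-09-coordination.txt": (
        "Changed how I think about coordination problems. "
        "See the consensus article from September."
    ),
    "2024-10-standup.txt": "Ideas for shorter standups.",
}

print(search(notes, "consensus"))
print(search(notes, "coordination problems"))
```

The second query only succeeds because one note recorded a reaction in the same words you would later search for. A bare clipping of the article would not have matched.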
Tagging
Tags are a lightweight way to create retrieval paths beyond full-text search. The key to tags that actually work:
Use a small, stable set of tags. Ten to twenty tags is plenty. More than that, and you’ll waste time agonizing over which tag to apply. Fewer than that, and tags don’t add much value beyond search.
Tags should reflect how you’ll look for things, not what things are about. “For the team meeting” is a more useful tag than “management theory” if you’re the kind of person who retrieves information by context rather than topic.
Review your tags occasionally. If you haven’t used a tag in three months, either merge it with another tag or delete it.
Don’t rely on tags alone. Tags are a supplement to search and linking, not a replacement.
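The occasional tag review can be sketched in a few lines, assuming you can get a last-used date for each tag out of your tool. The `stale_tags` helper, the 90-day reading of "three months", and the sample tags are my assumptions for the example.

```python
from datetime import date, timedelta


def stale_tags(last_used, today, max_idle_days=90):
    """Return tags not applied within the idle window, oldest first."""
    cutoff = today - timedelta(days=max_idle_days)
    stale = [tag for tag, used in last_used.items() if used < cutoff]
    return sorted(stale, key=lambda tag: last_used[tag])


# Hypothetical tag usage data.
last_used = {
    "team-meeting": date(2024, 5, 28),
    "management-theory": date(2023, 11, 2),
    "side-project": date(2024, 2, 14),
}

for tag in stale_tags(last_used, today=date(2024, 6, 1)):
    print("Merge or delete:", tag)
```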
Linking
Connections between notes are how a collection of individual observations becomes a knowledge network. When you write a note, spend five seconds asking: does this connect to anything else I’ve noted?
You don’t need to build a Zettelkasten. You don’t need bi-directional links. You don’t need a graph visualization. You need to occasionally write “see also: [that other note about X]” when you notice a connection. That’s it. That’s linking.
Over time, these connections create retrieval paths that no search algorithm could generate. They represent your thinking about how ideas relate, which is uniquely yours and uniquely valuable.
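Even hand-written "see also" lines can be turned into a map of what points where. The sketch below assumes you write links in the exact form "see also: [Title]"; that bracket convention, the regular expression, and the `backlinks` helper are illustrative choices of mine.

```python
import re
from collections import defaultdict

# Matches "see also: [Some note title]" in any letter case.
SEE_ALSO = re.compile(r"see also:\s*\[([^\]]+)\]", re.IGNORECASE)


def backlinks(notes):
    """Map each referenced note title to the notes that point at it."""
    incoming = defaultdict(list)
    for title, text in notes.items():
        for target in SEE_ALSO.findall(text):
            incoming[target.strip()].append(title)
    return dict(incoming)


# Hypothetical notes keyed by title.
notes = {
    "Raft summary": "Leader election notes. see also: [Coordination problems]",
    "Team rituals": "Standups as sync points. See also: [Coordination problems]",
    "Coordination problems": "Why agreement is hard.",
}

for target, sources in backlinks(notes).items():
    print(f"{target} <- {', '.join(sources)}")
```

This is about as far from a graph visualization as you can get, which is the point: the value is in the links you wrote, not in the tooling around them.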
Spaced Repetition
If certain information needs to be deeply learned rather than merely stored, spaced repetition is one of the most effective techniques available. The science on this is robust: reviewing information at increasing intervals dramatically improves long-term retention.
Spaced repetition isn’t for everything. It’s for facts and concepts you need to recall from memory on the spot, not just have available in your notes. If you’re learning a new field, a new language, or any domain where quick recall matters, build it into your system.
For everything else, external storage plus good retrieval is sufficient. You don’t need to memorize everything. You need to know where to find it.
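"Increasing intervals" can be as simple as doubling the gap after each successful review. The sketch below uses that plain doubling rule as an assumption; it is not the algorithm of any particular flashcard app, and the `next_interval` and `schedule` names are mine.

```python
from datetime import date, timedelta


def next_interval(current_days, remembered):
    """Double the gap after a successful recall; reset to 1 day after a miss."""
    if not remembered:
        return 1
    return max(1, current_days * 2)


def schedule(start, outcomes):
    """Return the review dates produced by a sequence of recall outcomes."""
    interval = 1
    due = start + timedelta(days=1)
    dates = []
    for remembered in outcomes:
        dates.append(due)
        interval = next_interval(interval, remembered)
        due = due + timedelta(days=interval)
    return dates


# Learned on Jan 1; recalled twice, forgot once, then recalled again.
for d in schedule(date(2024, 1, 1), [True, True, False, True]):
    print(d.isoformat())
```

Dedicated tools adjust the growth rate per item, but the shape is the same: gaps widen while you keep remembering and shrink when you forget.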
How the Layers Connect
The four layers aren’t independent. They form a system, and the connections between them matter as much as the layers themselves.
Intake feeds Processing. The quality of your intake determines the burden on your processing layer. If your intake is full of low-signal sources, you’ll spend all your processing energy on triage and have nothing left for engagement. Clean up intake, and processing gets easier.
Processing feeds Storage. How you process information determines what reaches storage and in what form. If processing is just “read or skip,” your storage layer gets raw content with no context. If processing includes brief annotation — even a sentence about why something matters — your storage layer becomes dramatically more useful.
Storage feeds Retrieval. The way you store information determines how findable it is. Unsearchable storage is a graveyard. Well-tagged, well-linked, well-annotated storage is a functioning memory.
Retrieval feeds Intake. When retrieval works well, you start to notice patterns in what you’re looking for. These patterns should inform your intake choices. If you keep searching for information about a topic you’re not actively following, maybe it’s time to add a source for that topic.
This feedback loop is the hallmark of a mature information architecture. It’s self-adjusting: your retrieval patterns inform your intake, your intake quality shapes your processing, your processing enriches your storage, and your storage determines your retrieval success.
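The retrieval-to-intake link is the easiest one to overlook, so here is a rough sketch of it. It assumes you can export a list of your own search queries and treats each whole query as a topic, which is a deliberate simplification; the `uncovered_topics` helper and the threshold of three searches are hypothetical.

```python
from collections import Counter


def uncovered_topics(search_log, followed_topics, min_searches=3):
    """Topics searched for repeatedly that no current source covers."""
    counts = Counter(q.strip().lower() for q in search_log)
    followed = {t.lower() for t in followed_topics}
    return [
        topic for topic, n in counts.most_common()
        if n >= min_searches and topic not in followed
    ]


# Hypothetical search history and current intake topics.
search_log = [
    "consensus algorithms", "team rituals", "Consensus algorithms",
    "pricing", "consensus algorithms", "pricing", "pricing",
]
followed_topics = ["pricing"]

for topic in uncovered_topics(search_log, followed_topics):
    print("Consider adding a source for:", topic)
```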
Common Failure Modes
I’ve seen dozens of personal information architectures fail. They tend to fail in predictable ways.
The Complexity Trap
Symptoms: A system with multiple tools, automated workflows, cross-posting pipelines, and a setup guide that reads like a DevOps playbook.
Root cause: Optimizing for theoretical completeness rather than practical usability. Usually triggered by reading too many “my productivity system” blog posts.
The fix: Simplify ruthlessly. If you can’t explain your system in two minutes, it’s too complex. If it requires more than three tools, it’s too complex. If it has any component that exists because you might need it someday, cut it.
The Capture-Everything Problem
Symptoms: Thousands of saved articles, hundreds of bookmarks, a read-later queue measured in weeks of content. A note-taking system with more clipped content than original writing.
Root cause: Confusing saving with processing. The instinct to preserve everything “just in case” without a corresponding commitment to actually engage with what you’ve saved.
The fix: Impose a save budget. You can save X items per day or per week. When you hit the limit, you have to process saved items before saving new ones. The specific number doesn’t matter. The constraint does.
Also: regularly purge your read-later queue. If you saved it three months ago and haven’t read it, you’re not going to. Let it go. This is painful the first time and liberating every time after.
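A save budget and a periodic purge fit in one small class. This sketch models the budget as a cap on unprocessed items, which is one reading of the rule above rather than the only one; the `ReadLaterQueue` name, the cap of two, and the 90-day age limit are assumptions for illustration.

```python
from datetime import date, timedelta


class ReadLaterQueue:
    """A read-later list that refuses new items once the budget is full."""

    def __init__(self, budget):
        self.budget = budget
        self.items = []  # list of (saved_on, title) pairs

    def save(self, title, today):
        """Save an item, or return False if the budget is used up."""
        if len(self.items) >= self.budget:
            return False  # process something before saving more
        self.items.append((today, title))
        return True

    def process(self, title):
        """Remove an item once it has been read or deliberately dropped."""
        self.items = [(d, t) for d, t in self.items if t != title]

    def purge(self, today, max_age_days=90):
        """Drop items older than the age limit and return their titles."""
        cutoff = today - timedelta(days=max_age_days)
        dropped = [t for d, t in self.items if d < cutoff]
        self.items = [(d, t) for d, t in self.items if d >= cutoff]
        return dropped


queue = ReadLaterQueue(budget=2)
queue.save("Long essay", date(2024, 1, 5))
queue.save("Conference talk", date(2024, 3, 1))
print(queue.save("Shiny new thing", date(2024, 3, 2)))  # budget is full
print(queue.purge(today=date(2024, 4, 10)))             # old items are let go
```

As the text says, the number itself is arbitrary. What the class enforces is the trade: nothing new comes in until something old has been dealt with.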
The Week-One Collapse
Symptoms: Enthusiastic system setup on a weekend. Diligent use for four to seven days. Gradual abandonment. Return to old habits. Guilt.
Root cause: The system was designed for peak motivation, not average motivation. Week one, you’re excited. Week three, you’re tired and the system feels like a chore.
The fix: Design for your worst day, not your best day. If the system requires more than ten minutes of daily upkeep beyond the reading itself, it will collapse when you’re busy. Start simpler than you think you need to. You can always add complexity. You can never un-burn yourself out.
The Tool-Switching Cycle
Symptoms: Migrating to a new tool every few months. Spending more time configuring systems than using them. Knowing the feature sets of fourteen note-taking apps but not having any sustained body of notes.
Root cause: Looking for a tool-shaped solution to a practice-shaped problem. No tool will make you process information if you don’t have the habit of processing information.
The fix: Pick a tool. Use it for six months. Don’t read reviews of competing tools during those six months. If, after six months, you have a specific, articulable problem with the tool, switch. If you just have a vague sense that something better might exist, that’s not a tool problem. That’s a focus problem.
The Perfectionist Paralysis
Symptoms: Extensive research into the “best” system. Elaborate planning. Comparison spreadsheets. Zero actual implementation.
Root cause: The system is supposed to solve information overwhelm, but designing the system has itself become an information-overwhelm problem. Irony noted.
The fix: Start today with whatever you have. Use your email as a read-later list. Use a single folder on your computer for notes. Use your browser’s history as a retrieval mechanism. Then improve incrementally. A crude system that exists beats an elegant system that doesn’t.
The Minimum Viable Information Architecture
Strip away everything that’s nice-to-have and you’re left with this:
- One intake aggregator. A single place where your chosen sources deliver content. An RSS reader, an email inbox, a curated feed: something you check deliberately.
- A daily triage habit. Fifteen minutes where you scan what’s new and decide what deserves attention. Not an hour. Not “whenever you get around to it.” A defined time, a defined duration.
- A place to write. Not clip. Not highlight. Write. Even briefly. One tool where you jot down thoughts, reactions, connections. A plain text file works. A fancy app works. The tool doesn’t matter. The writing does.
- A way to search. Full-text search across your notes. That’s the minimum for retrieval. Tags and links are nice. Search is essential.
That’s it. Four components. You can build this in an afternoon, and it will outperform most elaborate productivity systems because you’ll actually use it.
Everything else — automated pipelines, AI summarization, linked knowledge graphs, spaced repetition — is optimization. Optimize only after you’ve been running the basic system long enough to know where the bottlenecks are.
Sample Architectures
Theory is great. Let’s look at what these systems actually look like for different roles.
The Software Engineer
Intake:
- RSS reader with 15-20 feeds: language-specific blogs, system design blogs, two to three industry newsletters, one or two general tech publications
- Hacker News via a curated “best of” digest (not the live firehose)
- Two to three Slack channels at work that are high-signal
- One or two podcasts for commute/exercise time
Processing:
- Morning scan of RSS (10 min): flag articles for later, skim headlines for awareness
- Read flagged articles during lunch or focused reading time
- Slack triage: process during natural work breaks, not continuously
- Weekly: process any saved items over a week old — read or delete
Storage:
- Dev notes in a single tool (Obsidian, Notion, plain text — whatever)
- Brief notes when reading: “key idea, how it relates to current work, any code patterns to try”
- Work journal: daily bullet points of what you learned, what you’re stuck on
- Code snippets in a searchable repository (gists, snippets folder, whatever)
Retrieval:
- Full-text search across all notes
- Light tagging: by project, by technology, by concept
- Work journal as a running index of “what was I thinking about in October?”
The Engineering Manager
Intake:
- Three to four management newsletters (The Pragmatic Engineer, LeadDev, one or two others)
- RSS feed focused on organizational design, technical leadership, industry trends
- Heavy email flow (stakeholders, reports, cross-functional)
- Internal company communications
- One to two podcasts, sampled not subscribed
Processing:
- Morning email triage (15 min): categorize by urgency and response needed
- Newsletter processing: batch on Tuesday/Thursday, 20 min each
- AI summaries for longer reports and strategy documents
- Delegate reading: ask reports to surface relevant technical details
Storage:
- Meeting notes with action items tagged
- Decision log: what was decided, why, what alternatives were considered
- People notes: what’s each report working on, what do they need, career goals
- Industry trends: brief notes on what might matter in 6-12 months
Retrieval:
- Search by person, project, or date
- Decision log as institutional memory
- Quarterly review of trends notes: were your predictions right? What did you miss?
The Researcher
Intake:
- Journal alerts for key publications in your field
- RSS feeds for preprint servers, filtered by keyword
- Citation alerts for key papers and authors
- Conference proceedings and talk recordings
- Cross-domain feeds: two to three sources in adjacent fields
Processing:
- Daily scan of new papers: read abstracts, flag for full reading
- Weekly deep reading session: 2-3 papers read thoroughly with notes
- AI-assisted paper processing: extract methods, results, limitations
- Monthly: review citation alerts for emerging threads
Storage:
- Reference manager (Zotero, Mendeley, etc.) with consistent tagging
- Reading notes for every paper read in full: key contribution, methodology, limitations, connections to your work
- Research journal: evolving thoughts on your current questions
- Literature maps: visual or linked representations of how papers relate
Retrieval:
- Reference manager search by tag, author, year, keyword
- Research journal as a thinking history
- Literature maps as entry points for specific topics
- AI-assisted retrieval: “find papers in my library related to X”
The Generalist
Intake:
- One quality newspaper or news aggregator for current events
- Three to five newsletters across different domains (tech, science, culture, business, one wildcard)
- RSS reader with a rotating selection of sources
- Podcast playlist: varied topics, sampled freely
- One or two books in progress at any time
Processing:
- Morning news scan (10 min): headlines and one or two full articles
- Newsletter triage: batch process twice a week
- Podcast during commute/exercise: no notes required for casual listening, brief notes for standout episodes
- Book notes: brief summary after each reading session
Storage:
- All-purpose note tool: low friction, searchable
- Three main sections: current events reactions, learning notes, ideas
- Book notes: one entry per book with key takeaways and personal reactions
- Idea capture: a running list of thoughts, questions, and connections
Retrieval:
- Full-text search
- Chronological browsing (what was I thinking about last month?)
- Monthly review: scan recent notes, notice patterns, prune dead ends
Putting It Together
Your information architecture should be as simple as it can be and no simpler. Start with the minimum viable version. Run it for a month. Notice where it breaks. Fix those specific breaks. Resist the urge to overhaul everything when you could adjust one component.
The most important thing about your system isn’t its elegance or its completeness. It’s whether you’re using it tomorrow. And the day after that. And the week after that.
A crude system you maintain is infinitely more valuable than a sophisticated system you abandoned. Build for the real you — the busy, tired, easily distracted human who nonetheless wants to be well-informed and thoughtful. That person doesn’t need a perfect system. That person needs a good-enough system and the discipline to use it.
We’ve established the architecture. In the next chapter, we’ll get concrete about the tools and workflows that bring it to life — with the full understanding that the specific tools will change but the patterns endure.