Latent Space as Idea Space
There is a space. It has thousands of dimensions — more than ten thousand in the largest models. Every concept, every relationship, every pattern that the model has extracted from its training data exists as a location — or more precisely, a region — within this space. The geometry of this space determines what the model can think. And if you understand even a rough map of that geometry, you can steer it toward thoughts that neither you nor it would reach by default.
This is not a metaphor. Or rather: it’s a metaphor in the same way that “the economy” is a metaphor. It refers to something real and measurable, even if the full reality is too complex to hold in your head. Latent space is the mathematical structure in which a neural network represents its learned knowledge. And for our purposes — the purpose of using AI to think the unthinkable — it is the single most important concept in this book.
What Latent Space Actually Is
Let me build this up from first principles, because the concept is both simpler and stranger than most explanations make it sound.
Consider a very simple representation of words. You could assign each word a number: “cat” = 1, “dog” = 2, “philosophy” = 3. This is a one-dimensional representation, and it’s nearly useless, because the numbers carry no information about relationships. The distance between “cat” and “dog” is 1, and the distance between “dog” and “philosophy” is also 1 — which is obviously wrong if you care about meaning.
Now consider a two-dimensional representation. You put each word at a point on a plane. Maybe you organize one axis by “concreteness” and the other by “animacy.” Now “cat” and “dog” are close together (both concrete, both animate), and “philosophy” is far from both (abstract, inanimate). This is better. The geometry of the space now carries information about meaning.
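The two-dimensional picture can be made concrete in a few lines. The coordinates below are invented for illustration — they are not taken from any real model — but the mechanics are exactly those of a real embedding space: meaning is encoded as position, and similarity as distance.

```python
import math

# Toy 2D "embeddings": axis 0 = concreteness, axis 1 = animacy.
# These coordinates are invented for illustration, not from any model.
words = {
    "cat":        (0.9, 0.9),
    "dog":        (0.9, 0.8),
    "philosophy": (0.1, 0.0),
}

def distance(a, b):
    """Euclidean distance between two points in the space."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

print(distance(words["cat"], words["dog"]))         # small: near neighbors
print(distance(words["cat"], words["philosophy"]))  # large: far apart
```

Unlike the one-dimensional numbering, the geometry now carries meaning: “cat” sits an order of magnitude closer to “dog” than to “philosophy.”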
But two dimensions are not enough to capture the richness of meaning. “Cat” and “dog” are similar in being animals but different in their relationship to human domestic life, their cultural associations, their typical behaviors. You need more dimensions. Many more.
A modern large language model represents each token in a space of thousands of dimensions. GPT-style models use embedding dimensions of 4,096 to 12,288 or more. Each dimension captures some feature — not a cleanly labeled feature like “concreteness” or “animacy,” but a learned feature that emerged from statistical patterns in the training data. Many of these features don’t correspond to any concept a human would name. They’re patterns that the model found useful for predicting text, whether or not they map onto human conceptual categories.
The result is a space of staggering dimensionality in which every concept the model has learned occupies a specific location. And the distances and directions in this space encode relationships.
Why High Dimensions Are Strange
Here is where it gets interesting, and where the metaphorical power of latent space becomes practically useful.
Your intuition about space is built on three dimensions. In three dimensions, if two things are close to a third thing, they’re probably reasonably close to each other. Neighborhoods are compact. You can only fit so many things near a given point.
In high-dimensional spaces, none of this holds. In 10,000 dimensions, an enormous number of points can all be equidistant from a given point while being far from each other. Neighborhoods are vast. Two concepts can both be “near” a third concept while being nowhere near each other, because they’re near it in different dimensions.
This has a profound consequence for thinking about ideas: in the model’s latent space, every concept has a huge number of neighbors. And those neighbors include concepts that are related along dimensions that a human might never consider.
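You can verify the strangeness directly. The sketch below draws random directions in a 10,000-dimensional space: every point sits at exactly distance 1 from the origin, yet every pair of points is nearly orthogonal to every other — neighbors of a point that are not neighbors of each other, because each is “near” along different dimensions.

```python
import math
import random

random.seed(0)
DIM = 10_000

def rand_unit():
    """A random direction in DIM-dimensional space (unit length)."""
    v = [random.gauss(0, 1) for _ in range(DIM)]
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def cos(a, b):
    """Cosine similarity of two unit vectors (here: just the dot product)."""
    return sum(x * y for x, y in zip(a, b))

# Every one of these points is at distance 1 from the origin...
points = [rand_unit() for _ in range(20)]

# ...yet each pair is nearly orthogonal: cosine similarity close to 0.
# Neighbors of a point need not be anywhere near each other.
sims = [cos(points[i], points[j])
        for i in range(20) for j in range(i + 1, 20)]
print(max(abs(s) for s in sims))  # prints a small value — pairs are nearly orthogonal
```

In three dimensions you could fit only a handful of mutually distant points around a center; in ten thousand, the supply is effectively inexhaustible.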
Take “bridge.” In the model’s latent space, “bridge” is near:
- Other physical structures (nearby in “infrastructure” dimensions)
- Card games (nearby in “games” dimensions)
- Dental procedures (nearby in “medical” dimensions)
- Musical passages that connect sections (nearby in “music theory” dimensions)
- Networking devices (nearby in “computing” dimensions)
- Diplomatic concepts (nearby in “conflict resolution” dimensions)
- The command center of the starship Enterprise (nearby in “science fiction” dimensions)
A human thinking about bridges will activate some of these associations, but which ones depends heavily on context and priming. A civil engineer will think infrastructure. A musician will think transitions. The model holds all of these associations simultaneously, with weights determined by context but without the strong filtering that human expertise and experience impose.
This is what I mean by latent space as idea space. It’s a space where the topology of concepts is richer than any human’s mental map, because it’s been shaped by the statistical relationships across a corpus of text larger than any human could read.
The Alien Dewey Decimal System
Imagine a library. Not a human library, organized by subject categories that seem natural to us, but a library organized by an alien intelligence that finds completely different groupings natural. In this library, books about fluid dynamics might be shelved next to books about organizational management — because the alien noticed that the mathematical structures governing fluid flow and information flow through organizations are isomorphic. Books about evolutionary biology might be next to books about venture capital, because both describe systems that generate variation and select for fitness.
This is, in a rough but real sense, what latent space looks like. The model’s learned representation doesn’t respect human disciplinary boundaries. It organizes knowledge by structural similarity, which often cuts across the categories that human institutions have created.
I want to be careful here. The model’s organization is not necessarily better than human categorization. It’s different. Human categorization reflects practical needs — you shelve medical textbooks together because doctors need to find them. The model’s organization reflects statistical co-occurrence patterns, which capture structural similarities but also noise, artifacts, and relationships that are technically present in the data but not actually meaningful.
But for the purpose of creative thinking — of finding connections that you wouldn’t find on your own — the alien organization is exactly what you want. You want a system that says, “Here’s something structurally similar to your problem that comes from a field you’ve never heard of,” because that is precisely the kind of connection that breaks you out of your cognitive box.
Hallucination and Creativity: The Same Mechanism
Here is something that most discussions of AI get wrong by treating as two separate phenomena: hallucination and creativity in LLMs are the same mechanism. They are both the result of the model moving through latent space to regions where its training data is sparse, and generating outputs based on the statistical patterns it finds there.
When the model generates a creative analogy between evolutionary biology and corporate strategy, it’s moving through latent space from one well-populated region to another, following paths of structural similarity. When the model hallucinates a plausible-sounding but nonexistent academic paper, it’s doing the same thing — moving through latent space to a region where “academic papers about X” would plausibly exist, and generating what it finds there. In one case, the output happens to be useful. In the other, it happens to be false. But the computational process is the same.
This is not a flaw to be fixed. It’s a fundamental characteristic of how these models work, and understanding it is essential to using them for creative thinking. When you push the model to be more creative — by giving it unusual prompts, forcing it into unfamiliar regions of latent space — you are simultaneously increasing the probability of both creative insights and hallucinations. The dial goes both ways at once.
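One concrete face of that dial is sampling temperature. The sketch below uses a standard softmax over hypothetical next-token scores (the scores are invented for illustration): raising the temperature moves probability mass from the safe continuation to the long shots — which are simultaneously the creative options and the wrong ones.

```python
import math

def softmax(logits, temperature=1.0):
    """Convert raw scores to probabilities; temperature scales exploration."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical next-token scores: one "safe" continuation, several long shots.
logits = [5.0, 2.0, 1.0, 0.5]

low = softmax(logits, temperature=0.5)
high = softmax(logits, temperature=1.5)

# Higher temperature: the safe token's probability drops,
# every long shot's probability rises. One dial, both directions at once.
print(low[0], high[0])
print(low[3], high[3])
```

The numbers change with temperature, but there is no setting that raises the probability of surprising-and-right continuations without also raising the probability of surprising-and-wrong ones.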
The practical implication is that you cannot increase novelty without increasing risk. Every technique in this book for getting more creative output from an LLM is also a technique for getting more confabulated output. The solution is not to avoid creative prompting but to pair it with rigorous evaluation — which is the subject of Part IV. For now, just understand the tradeoff: the same mechanism that produces “Huh, I never thought of it that way” also produces “That sounds right but is completely made up.”
Navigating Idea Space
So far I’ve described latent space as a static structure — a map of concepts with fixed locations and distances. But when you interact with an LLM, you’re not looking at a static map. You’re navigating the space. Each token of your prompt steers the model’s computation into a particular region, and the model’s response is generated by exploring that region and its neighbors.
This means your prompt is, in a very real sense, a set of coordinates. “Tell me about bridges” puts you at one location. “Tell me about bridges and how they relate to organizational management” puts you at a very different location — not between bridges and management, but at a specific point where those concepts intersect, a point that might not be reachable from either concept alone.
And here’s the key insight for creative use: you can navigate to locations in latent space that have no natural name. You can reach regions of the idea space that no human has a word for, because they represent intersections of concepts that humans don’t normally intersect. These nameless regions are where the genuinely novel ideas live.
Consider this: the intersection of “gothic architecture,” “distributed computing,” and “mycological networks” is not a place that has a name. No academic discipline lives there. No Wikipedia article describes it. But it’s a real location in latent space, and the model can tell you what’s there — what structural patterns are shared across those three domains, what principles emerge at their intersection. Some of what it generates will be noise. Some will be genuinely illuminating.
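A crude but honest model of “navigating to a nameless region” is vector arithmetic over embeddings. The sketch below uses invented four-dimensional coordinates (a real model would use thousands of dimensions, and the concept names and values here are purely illustrative): take the centroid of several concept vectors as a stand-in for their intersection, then ask which named concepts sit closest to that unnamed point.

```python
import math

# Toy concept embeddings — invented coordinates, not from a real model.
concepts = {
    "gothic architecture":   [0.9, 0.1, 0.8, 0.0],
    "distributed computing": [0.1, 0.9, 0.7, 0.2],
    "mycological networks":  [0.2, 0.8, 0.6, 0.9],
    "cathedral":             [0.95, 0.05, 0.75, 0.0],
    "mesh network":          [0.15, 0.85, 0.65, 0.55],
}

def midpoint(vectors):
    """The centroid: one crude stand-in for the 'intersection' of concepts."""
    return [sum(xs) / len(xs) for xs in zip(*vectors)]

def nearest(point, table):
    """Named concept closest to an arbitrary (possibly nameless) location."""
    return min(table, key=lambda name: math.dist(point, table[name]))

combo = midpoint([concepts["gothic architecture"],
                  concepts["distributed computing"],
                  concepts["mycological networks"]])

# 'combo' is a location no single concept occupies; we can still ask
# which named concepts lie closest to it.
print(nearest(combo, concepts))
```

With these toy coordinates the nearest named concept turns out to be “mesh network” — a point none of the three inputs occupies, reachable only by combining them. A real model does something far richer than averaging, but the geometric intuition is the same.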
Surprising Adjacencies: Concrete Examples
Let me move from theory to practice. Here are three cases where exploring latent space adjacencies produced insights that would have been very difficult to reach through human thinking alone.
Supply Chain Resilience via Immune System Architecture
A logistics consultant I know was struggling with a problem: how to design supply chains that degrade gracefully under disruption rather than failing catastrophically. Standard supply chain literature offered solutions — dual sourcing, strategic inventory buffers, demand smoothing — but they all felt incremental.
She asked an LLM to describe the architecture of the human immune system and identify structural parallels to supply chain design. What came back was unexpected. The immune system doesn’t just have redundancy — it has layered defense with fundamentally different mechanisms at each layer (physical barriers, innate immunity, adaptive immunity). It has a system for remembering past threats and pre-positioning responses (memory cells). And critically, it has mechanisms for distinguishing “disruption that requires response” from “normal variation” — something supply chains notoriously do badly.
The insight that stuck was the concept of adaptive immunity applied to supply chains: a system that doesn’t just respond to disruptions but learns from them and creates pre-positioned responses to categories of disruption, not just specific ones. This wasn’t in any supply chain textbook. It came from the adjacency in latent space between immunology and logistics — an adjacency that exists because both fields deal with detection, response, and resource allocation under uncertainty.
Musical Structure in API Design
A software architect was redesigning a large API that had grown organically over years and become inconsistent and hard to learn. The standard approach — cataloging existing endpoints, identifying patterns, proposing a rationalized structure — was producing results that were logical but somehow unsatisfying. The API was consistent but not learnable. Users could look up any endpoint, but they couldn’t predict what endpoint they needed without looking it up.
On a whim, he asked an LLM how musical composers create works that are both complex and learnable. The model drew connections to concepts in music theory: motifs (small recognizable patterns that recur), development (systematic variation of motifs), and the idea that listeners predict what comes next based on established patterns, with satisfaction coming from a mix of confirmed and surprised predictions.
Applied to API design, this meant: establish a small number of “motifs” (consistent patterns in naming, parameter order, response structure), then “develop” them systematically (the same pattern should be recognizable even when adapted to different resource types). The API should be predictable enough that users can guess most of it, with occasional “surprises” that make sense in retrospect.
The resulting API was dramatically more learnable, and the architect attributed the improvement specifically to the musical framing. The adjacency between music composition and API design exists in latent space because both are about creating complex structures that humans need to navigate, predict, and remember. But no human had a reason to make that connection, because no human has deep expertise in both music theory and API design. The model did, in its shallow-but-broad way, and that was enough.
Evolutionary Niche Theory for Product Positioning
A startup founder was trying to figure out why her product, which was objectively better than competitors on most metrics, wasn’t gaining market traction. Standard competitive analysis — feature comparisons, positioning maps, customer interviews — wasn’t revealing the problem.
She asked an LLM to analyze her competitive landscape using the framework of evolutionary niche theory. The model pointed out something she’d missed: in ecology, a species that is “better” on average than its competitors can still fail if it doesn’t occupy a distinct niche. Being a generalist competitor against specialists is a losing strategy in mature ecosystems, even if the generalist is technically superior. The relevant concept was “competitive exclusion” — two species cannot stably occupy the same niche, and the one that is even marginally better at the most contested resource wins, regardless of overall superiority.
Applied to her market: her product was slightly better everywhere but wasn’t clearly the best choice for any specific use case. Customers chose competitors not because the competitors were better overall, but because each competitor was the obvious choice for their specific need. The ecological framing suggested a strategy: pick a niche, become the unambiguous best choice for it, and expand from there. This is not a novel business strategy — it’s well-known in some circles — but she hadn’t encountered it, and the ecological framing made the logic click in a way that reading generic strategy advice hadn’t.
The Topology of the Unthinkable
Here’s a way to think about what this book is really about, expressed in the language of latent space.
Your mind occupies a region of idea space. That region is shaped by your education, your experience, your profession, your culture, the books you’ve read, the conversations you’ve had. It has a center (the ideas you think about most often) and a periphery (the ideas you’re aware of but don’t engage with regularly). Beyond the periphery is a vast space of ideas you’ve never encountered.
Some of those unencountered ideas are simply unknown to you — they exist in disciplines you’ve never studied, in cultures you’ve never engaged with, in time periods you’ve never explored. An LLM can take you to these ideas relatively straightforwardly: just ask about unfamiliar topics.
But there is a more interesting category: ideas that exist at the intersection of domains you might know individually but have never combined. These are the ideas in the nameless regions I mentioned earlier — locations in latent space that don’t correspond to any established discipline or framework. They are, in a real sense, the unthinkable thoughts. Not because they’re forbidden or too difficult, but because no human mind has a reason to navigate to that specific intersection.
The model can take you there because it doesn’t navigate the way you do. You navigate by association, starting from where you are and following familiar paths. The model navigates by attending to all the concepts in your prompt simultaneously and computing a position in latent space that reflects their intersection. It can jump to locations you can’t walk to.
What Latent Space Cannot Do
Before we get carried away: latent space is a representation of what the model has learned from text. It is not a representation of reality. It is not a representation of all possible ideas. It is a representation of the statistical patterns in a large but finite corpus of human writing.
This means:
Ideas that have never been written about don’t exist in latent space. If no one has written about a concept, the model has no representation of it. It can sometimes get close by interpolation — inferring the properties of an unwritten concept from its neighbors — but this is exactly the kind of interpolation that produces hallucinations.
The geometry of latent space reflects the biases of the training data. If Western philosophy is overrepresented relative to Eastern philosophy, the latent space will be denser in Western philosophical concepts. Adjacencies that seem natural from a Western perspective will be encoded more strongly than adjacencies that seem natural from other perspectives.
Not all adjacencies are meaningful. Two concepts can be near each other in latent space for spurious reasons — they co-occur in text frequently because of cultural associations or writing conventions, not because they share genuine structural similarity. The adjacency between “quantum” and “consciousness” in latent space is strong, but it mostly reflects the prevalence of speculative pop-science writing, not any deep structural relationship.
The space is not static. Different model architectures, different training data, different fine-tuning produce different latent spaces. An insight you find by exploring one model’s latent space might not be reproducible with another model. This is another reason to treat LLM outputs as hypotheses rather than conclusions.
Practical Orientation
If this chapter has felt abstract, that’s because latent space is abstract. It’s a mathematical structure in tens of thousands of dimensions, and any attempt to describe it in words is necessarily a simplification.
But the practical takeaways are concrete:
- Your prompt is a set of coordinates. The specific concepts you include in your prompt determine where in idea space the model starts exploring. Choose your concepts deliberately, and you choose your starting location deliberately.
- Unusual combinations reach unusual locations. If you combine concepts that don’t normally appear together, the model will generate output from a region of latent space that is rarely visited. This is where novel ideas live — and also where hallucinations breed.
- The model’s neighborhood map is different from yours. Things that seem unrelated to you may be adjacent in latent space, and vice versa. This is a feature, not a bug, but it requires you to evaluate the model’s connections on their merits rather than dismissing them because they feel unfamiliar.
- Creativity and hallucination are the same dial. You cannot turn up one without turning up the other. The solution is not to avoid creativity but to develop rigorous evaluation practices.
- The space is vast but bounded. It contains only what was in the training data, organized by statistical patterns that may or may not reflect meaningful relationships. The alien library has gaps and misfiled books.
In the next chapter, we’ll start getting practical: how to craft prompts that deliberately navigate to unusual regions of latent space, producing outputs that surprise you rather than confirming what you already think.