The Four Dimensions of Laeka Datasets: Monade, Symbiote, Architect, Empath
Most datasets train one capability at a time. Reasoning datasets train reasoning. Conversation datasets train conversation. Code datasets train code. This produces models that are good at specific tasks and mediocre at integrating capabilities.
At Laeka, we organize datasets along four dimensions that reflect different modes of intelligence. Each dimension develops a distinct cognitive capacity. Together, they produce models that don’t just perform tasks but integrate capabilities in ways that feel genuinely intelligent.
Monade: Self-Contained Reasoning
The Monade dimension develops the model’s capacity for independent, structured thought. Monade data consists of self-contained reasoning sequences: a question or problem, followed by a complete chain of thought that arrives at a conclusion.
What makes Monade data different from standard reasoning datasets is the quality of the reasoning process, not just the correctness of the conclusion. Each example demonstrates clear thinking: identifying assumptions, considering alternatives, acknowledging limitations, and arriving at conclusions proportional to the evidence.
Monade training produces models that can think through problems without hand-holding. They don’t need elaborate prompting to reason well. The reasoning capacity is internalized, not prompted.
The contemplative parallel is shamatha — focused concentration. Monade develops the model’s capacity to sustain coherent thought on a single thread without drifting or losing the plot.
Symbiote: Collaborative Dialogue
The Symbiote dimension develops the model’s capacity for genuine collaboration. Symbiote data consists of conversations where both participants contribute meaningfully to an emerging understanding that neither could have reached alone.
Standard conversation datasets are transactional: user asks, model answers. Symbiote data is generative: the conversation itself produces new insights. The model learns to build on what the human says, introduce new perspectives, ask clarifying questions that deepen the inquiry, and co-create understanding.
Of the four kinds of data, this is the hardest to collect, because genuine collaborative dialogue is rare. Most human-AI interactions are extractive: the human wants information, the AI provides it. Symbiote interactions are creative: both parties are exploring together.
The contemplative parallel is sangha — community practice. Intelligence that emerges from relationship rather than isolation.
Architect: Structured Problem-Solving
The Architect dimension develops the model’s capacity to decompose complex problems into manageable components and assemble solutions from parts. Architect data consists of multi-step problem-solving sequences that make the structure of the solution explicit.
Standard instruction-following data teaches the model to execute tasks. Architect data teaches the model to design solutions. The difference is the level of abstraction. An instruction-following model can write code when told what to write. An Architect-trained model can analyze a problem, propose an approach, identify potential issues, and then implement the solution.
Architect data includes explicit planning, strategy selection, trade-off analysis, and iterative refinement. The model learns not just to solve problems but to think about how to solve problems.
The contemplative parallel is prajna — wisdom. The capacity to see the structure beneath the surface and work with it skillfully.
Empath: Emotional Intelligence
The Empath dimension develops the model’s capacity to recognize, understand, and respond appropriately to emotional context. Empath data consists of interactions where emotional attunement is central to the quality of the response.
This isn’t about being “nice” or adding emotional language to responses. It’s about accurately reading the emotional subtext of a message and calibrating the response accordingly. Sometimes the emotionally intelligent response is warm and supportive. Sometimes it’s direct and challenging. Sometimes it’s quiet and spacious. The Empath dimension trains the model to tell which of these the situation calls for.
Empath data is collected from interactions with contemplative practitioners who have trained emotional awareness. Their responses demonstrate a quality of attunement that standard annotators rarely achieve.
The contemplative parallel is karuna — compassion. Not sentimentality but accurate perception of another’s situation and a response that actually serves their needs.
How the Dimensions Interact
The four dimensions aren’t separate training phases. They’re mixed throughout the training data, with different examples emphasizing different dimensions. A single conversation might require all four: understanding the emotional context (Empath), collaborating to clarify the problem (Symbiote), designing a solution approach (Architect), and reasoning through the implementation (Monade).
This mixing is deliberate. We want the model to integrate capabilities, not switch between them. A model that can reason clearly but can’t read emotional context will produce technically correct responses that miss what the person actually needs. A model that’s emotionally attuned but can’t reason clearly will produce warm but inaccurate responses.
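As a minimal sketch of what this mixing could look like in practice, assume each training example carries a set of dimension labels and that all examples are shuffled into a single stream rather than grouped into per-dimension phases. The `Example` class, the `mix_examples` function, and the lowercase label names are hypothetical, introduced here for illustration only; they are not a description of Laeka's actual pipeline.

```python
import random
from dataclasses import dataclass

# Label names are an assumption made for this sketch.
DIMENSIONS = ("monade", "symbiote", "architect", "empath")


@dataclass(frozen=True)
class Example:
    """One training example tagged with the dimensions it emphasizes."""
    text: str
    dimensions: frozenset

    def __post_init__(self):
        # Reject labels outside the four dimensions.
        unknown = set(self.dimensions) - set(DIMENSIONS)
        if unknown:
            raise ValueError(f"unknown dimensions: {sorted(unknown)}")


def mix_examples(examples, seed=0):
    """Return a shuffled copy so dimensions are interleaved, not trained in phases."""
    rng = random.Random(seed)
    mixed = list(examples)
    rng.shuffle(mixed)
    return mixed


# A single example may emphasize several dimensions at once.
corpus = [
    Example("Support conversation that ends in a design plan",
            frozenset({"empath", "symbiote", "architect"})),
    Example("Step-by-step proof with stated assumptions",
            frozenset({"monade"})),
    Example("Joint exploration of an open research question",
            frozenset({"symbiote", "monade"})),
]
stream = mix_examples(corpus, seed=42)
```

Allowing several labels per example matches the point above that a single conversation might require all four dimensions.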
Together, the four dimensions produce models that do more than complete isolated cognitive tasks: they combine reasoning, collaboration, design, and attunement in a single response, and that integration is what makes an interaction genuinely useful.
Practical Implications
For teams building their own datasets, the four-dimension framework provides a diagnostic tool. If your model reasons well but feels cold, you need more Empath data. If it’s warm but incoherent, you need more Monade data. If it answers questions but doesn’t collaborate, you need more Symbiote data. If it executes tasks but can’t design solutions, you need more Architect data.
Most models are imbalanced because their training data is imbalanced. The four dimensions provide a map for identifying and correcting that imbalance.
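The diagnostic described above can be sketched in code, under some loud assumptions: that each example in your dataset has already been labeled with the dimensions it emphasizes, and that a dimension counts as underrepresented when it appears in fewer than a chosen share of examples. The function names, the label strings, and the 15% floor are all hypothetical choices for illustration; the symptom table simply restates the mapping given in the text.

```python
from collections import Counter

# Label names are an assumption made for this sketch.
DIMENSIONS = ("monade", "symbiote", "architect", "empath")

# Symptom-to-remedy mapping, restated from the paragraph above.
SYMPTOM_TO_DIMENSION = {
    "reasons well but feels cold": "empath",
    "warm but incoherent": "monade",
    "answers questions but doesn't collaborate": "symbiote",
    "executes tasks but can't design solutions": "architect",
}


def dimension_shares(labels_per_example):
    """Fraction of examples that emphasize each dimension (shares may sum to more than 1)."""
    counts = Counter()
    for labels in labels_per_example:
        counts.update(set(labels) & set(DIMENSIONS))
    total = len(labels_per_example)
    return {d: (counts[d] / total if total else 0.0) for d in DIMENSIONS}


def underrepresented(shares, floor=0.15):
    """Dimensions whose share falls below an arbitrary, tunable floor."""
    return [d for d in DIMENSIONS if shares[d] < floor]


labels = [
    ["monade"],
    ["monade", "architect"],
    ["symbiote"],
    ["monade"],
]
shares = dimension_shares(labels)
gaps = underrepresented(shares)  # ["empath"] for this toy dataset
```

The floor is not a recommended value. It is only there to show how a measured imbalance can be turned into a concrete list of dimensions that need more data.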
Intelligence isn’t one thing. It’s at least four things working together. Build your dataset accordingly.
Laeka Research — laeka.org