{"id":270,"date":"2026-03-17T20:37:34","date_gmt":"2026-03-17T20:37:34","guid":{"rendered":"https:\/\/lab.laeka.org\/?p=270"},"modified":"2026-03-18T19:00:55","modified_gmt":"2026-03-18T19:00:55","slug":"what-happens-when-you-train-a-model-on-wisdom-instead-of-knowledge","status":"publish","type":"post","link":"https:\/\/laeka.org\/publications\/what-happens-when-you-train-a-model-on-wisdom-instead-of-knowledge\/","title":{"rendered":"Training on Reasoning Quality vs. Factual Coverage: Why Depth Beats Breadth"},"content":{"rendered":"<p>Most AI training data emphasizes breadth: comprehensive factual content across domains. Models memorize vast amounts of information because that&#8217;s what pre-training optimizes for. But there&#8217;s a deeper capability that most datasets neglect entirely: the ability to reason deeply about problems rather than just recall facts about them. The gap between these two training approaches accounts for most alignment failures.<\/p>\n<h2>Factual Coverage vs. Reasoning Depth<\/h2>\n<p>Factual coverage is knowing that tomatoes are fruits. It&#8217;s content, relationships, patterns across data. Language models acquire this through pre-training and become very good at it. They can recite facts, explain relationships, identify patterns in virtually any domain.<\/p>\n<p>Reasoning depth is different. It&#8217;s the ability to think through a problem systematically, to understand not just what&#8217;s true but what&#8217;s true given the specific context and constraints. To recognize when a fact applies and when it doesn&#8217;t. It&#8217;s knowledge integrated with judgment, calibrated by understanding consequences. This is what most training data ignores.<\/p>\n<p>The distinction maps directly to a fundamental training challenge: capability without judgment. A model can be enormously knowledgeable without being trustworthy. Current alignment methods try to add judgment through rules and constraints, but judgment isn&#8217;t a set of rules. It&#8217;s a <strong>mode of engagement<\/strong> with knowledge.<\/p>\n<h2>What Deep Reasoning Training Data Looks Like<\/h2>\n<p>Most training data is factual. Even alignment data focuses on knowledge-level pairs: &#8220;Here&#8217;s a good response.&#8221; This is useful but shallow. It doesn&#8217;t teach reasoning about why that response is good for this specific context.<\/p>\n<p>Deep reasoning data includes <strong>contextual logic<\/strong>: not just the answer, but the process of determining what answer fits this situation. It includes <strong>consequence awareness<\/strong>: explicit modeling of downstream effects of different approaches. It includes <strong>calibrated judgment<\/strong>: demonstrations of when to be confident, when uncertain, when to refuse, and when to engage\u2014with the reasoning visible.<\/p>\n<p>Contemplative literature is rich in this kind of material. Philosophical dialogues, case studies in ethics, skilled practitioners explaining their reasoning, therapeutic conversations navigating sensitive topics\u2014these texts don&#8217;t just convey information. They model <strong>deep engagement with information<\/strong>. The reasoning is as important as the conclusions.<\/p>\n<h2>An Experiment in Reasoning-Based Training<\/h2>\n<p>We curated a small dataset of reasoning-rich content. 
<p>Contemplative literature is rich in this kind of material. Philosophical dialogues, case studies in ethics, skilled practitioners explaining their reasoning, therapeutic conversations navigating sensitive topics: these texts don't just convey information. They model <strong>deep engagement with information</strong>. The reasoning is as important as the conclusions.</p>

<h2>An Experiment in Reasoning-Based Training</h2>

<p>We curated a small dataset of reasoning-rich content. Not spiritual texts specifically, but material from any source that models deep judgment: medical case discussions where doctors explain their reasoning, skilled therapeutic conversations navigating nuanced situations, experienced teachers calibrating explanations to students, and contemplative dialogues demonstrating responsive, context-sensitive thinking.</p>

<p>The common feature: these texts don't just provide correct answers. They <strong>demonstrate the process of arriving at contextually appropriate responses</strong>. The depth is in the how, not just the what.</p>

<p>Training on this data produced measurable differences. Models showed improved calibration, with confidence tracking accuracy more closely. Sycophancy dropped, with less agreement with incorrect statements. Handling of sensitive topics became more natural. And performance improved notably on ambiguous queries where context determines what's right.</p>

<h2>The Judgment Dimension</h2>

<p>The key difference between breadth and depth training is the <strong>judgment dimension</strong>. Breadth training teaches what to say. Depth training teaches <strong>how to decide what to say</strong>.</p>

<p>This judgment has several components. <strong>Context sensitivity</strong>: the same question asked in different contexts warrants different responses. "Which berries are safe to eat?" calls for one answer from a hungry camper standing in the woods and another from a botany student writing an essay. Depth is knowing the difference.</p>

<p><strong>Proportionality</strong>: matching response depth to actual need. Not everything needs comprehensive treatment; sometimes one sentence is better than five paragraphs. Breadth-trained models consistently over-respond because their training data rewards completeness.</p>

<p><strong>Temporal awareness</strong>: understanding that what's right now may differ from what's right later. Information has a shelf life, and depth includes knowing when facts might be outdated.</p>

<p><strong>Intellectual humility</strong>: knowing the limits of one's knowledge and communicating them honestly. This isn't just calibration; it's integrating uncertainty into the response process itself.</p>

<h2>DPO for Reasoning Quality</h2>

<p>Reasoning-focused DPO pairs differ from standard alignment pairs. Standard pairs focus on response content. Reasoning pairs focus on <strong>the judgment process</strong>.</p>

<p>Example: a question has a technically correct answer that is contextually inappropriate. The rejected response provides technical correctness. The chosen response provides contextual appropriateness; it may be less technically complete, but it better serves the actual need. The signal isn't "this content is better" but "this judgment is better."</p>

<p>These pairs are harder to create because they require annotators who exercise good judgment themselves. Standard crowdworkers can identify factual accuracy; identifying good judgment requires annotators with the attentional quality and contextual sensitivity that contemplative practice develops.</p>
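<p>To make the training signal concrete, here is a minimal sketch of a judgment-focused preference pair run through the standard DPO objective (Rafailov et al., 2023). The pair text and the log-probability numbers are invented for illustration.</p>

<pre><code class="language-python">import math

# Standard DPO loss on a single preference pair, applied to a judgment-focused
# example. Texts and log-probabilities below are invented for illustration.

pair = {
    "prompt": "My code crashed and I lost a day of work. How do hash maps work?",
    # Rejected: technically thorough, but deaf to the situation.
    "rejected": "A comprehensive treatment of hash maps begins with universal hashing...",
    # Chosen: less complete, better judgment about what the moment calls for.
    "chosen": "Quick version, since you're mid-crisis: a hash function maps each key "
              "to a bucket. Want help recovering the lost work first?",
}

def dpo_loss(logp_w, logp_l, ref_logp_w, ref_logp_l, beta=0.1):
    """Negative log-sigmoid of the beta-scaled implicit reward margin
    between the chosen (w) and rejected (l) responses."""
    margin = (logp_w - ref_logp_w) - (logp_l - ref_logp_l)
    return -math.log(1.0 / (1.0 + math.exp(-beta * margin)))

# Hypothetical sequence log-probs from the policy and the frozen reference:
print(round(dpo_loss(-42.0, -37.5, -45.0, -36.0), 3))  # ~0.493
</code></pre>

<p>The loss falls as the policy, relative to the reference model, shifts probability toward the contextually appropriate response. That relative shift is what makes the pair a judgment signal rather than a content signal.</p>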
<h2>The Path Forward</h2>

<p>Training for reasoning depth rather than just factual breadth isn't about adding a philosophical dimension to AI. It's about closing the gap that causes most alignment failures: the gap between knowing something and applying it appropriately. Models with strong reasoning produce aligned behavior naturally, with far less need for extensive rules.</p>

<p>At <a href="https://lab.laeka.org">Laeka Research</a>, we're building reasoning-focused training datasets and evaluation benchmarks; one possible shape for such a benchmark item is sketched below. The question isn't whether models can learn to reason deeply; the contemplative traditions suggest the capacity can be cultivated, and our early results support it.</p>
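<p>The sketch below is a hedged illustration of a context-sensitivity benchmark item: the same question under two contexts, scored against the judgment properties each context calls for. The format and field names are hypothetical.</p>

<pre><code class="language-python"># Hypothetical sketch of a context-sensitivity benchmark item: one question,
# two contexts, two different sets of expected judgment properties.

benchmark_item = {
    "question": "How long can cooked rice sit out before it's unsafe to eat?",
    "variants": [
        {
            "context": "asked by someone about to eat rice that's been out all day",
            "expected": {"length": "short", "actionable": True},
        },
        {
            "context": "asked by a food-safety student writing a report",
            "expected": {"length": "detailed", "actionable": False},
        },
    ],
}

def judgment_score(response_props: dict, expected: dict) -> float:
    """Fraction of expected judgment properties the response satisfies."""
    hits = sum(response_props.get(key) == value for key, value in expected.items())
    return hits / len(expected)

# A detailed but actionable answer to the first variant scores 0.5:
# wrong length for the situation, but still actionable.
print(judgment_score({"length": "detailed", "actionable": True},
                     benchmark_item["variants"][0]["expected"]))
</code></pre>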