{"id":274,"date":"2026-03-21T18:06:26","date_gmt":"2026-03-21T18:06:26","guid":{"rendered":"https:\/\/lab.laeka.org\/?p=274"},"modified":"2026-03-21T18:06:26","modified_gmt":"2026-03-21T18:06:26","slug":"model-cards-done-right-documentation-that-actually-helps","status":"publish","type":"post","link":"https:\/\/laeka.org\/publications\/model-cards-done-right-documentation-that-actually-helps\/","title":{"rendered":"Model Cards Done Right: Documentation That Actually Helps"},"content":{"rendered":"<p>Most model cards are useless. They list architecture details nobody needs and skip the information everyone wants: what is this model good at, what is it bad at, and what data was it trained on? Good documentation is the difference between a model people adopt and one they scroll past.<\/p>\n<h2>What Users Actually Need to Know<\/h2>\n<p>When someone finds your model on the Hub, they have four questions. <strong>First: does this model do what I need?<\/strong> A clear, one-paragraph description of the model&#8217;s intended use cases answers this immediately. &#8220;A 7B instruction-tuned model optimized for code review and technical documentation&#8221; is infinitely more useful than &#8220;A large language model trained with RLHF.&#8221;<\/p>\n<p><strong>Second: how good is it?<\/strong> Benchmark scores help, but only with context. Showing scores alongside comparable models tells users where this model sits in the landscape. Even better: include qualitative examples of typical outputs showing both strengths and weaknesses.<\/p>\n<p><strong>Third: how do I run it?<\/strong> A working code snippet that goes from zero to inference in five lines. Not a link to general documentation \u2014 the actual code, tested and ready to copy. Include the chat template, any special tokens, and recommended generation parameters.<\/p>\n<p><strong>Fourth: what are the limitations?<\/strong> Every model has them. 
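<\/p>
<p>To make the earlier &#8220;how do I run it&#8221; point concrete, here is a sketch of the kind of quick-start block a model card might carry. Everything in it is illustrative: the chat template format, the special token, and the generation parameter values are assumptions, not any particular model&#8217;s real settings.<\/p>

```python
# Illustrative quick-start sketch for a model card. The chat template
# format, the <s> token, and the parameter values below are assumptions,
# not any real model's settings.

RECOMMENDED_GENERATION = {'max_new_tokens': 512, 'temperature': 0.7, 'top_p': 0.9}

def apply_chat_template(messages):
    # Flatten a chat into the single prompt string the model expects.
    parts = ['<s>']  # hypothetical beginning-of-sequence token
    for m in messages:
        role, content = m['role'], m['content']
        parts.append(f'[{role.upper()}] {content} ')
    parts.append('[ASSISTANT]')
    return ''.join(parts)

prompt = apply_chat_template([{'role': 'user', 'content': 'Review this diff for bugs.'}])
print(prompt)
```

<p>A real card would point at the tokenizer&#8217;s own chat template rather than a hand-rolled one, and would pin the exact package versions the snippet was tested with.<\/p>
<p>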
Documenting known failure modes, weak domains, and biases is the highest-value section of a model card. Users who discover limitations through the documentation trust the model more than users who discover them through production failures.<\/p>\n<h2>The Training Data Question<\/h2>\n<p>Training data transparency is the most contentious section of a model card. Many model creators are deliberately vague about their training data, either to protect competitive advantages or to avoid legal scrutiny over data sourcing.<\/p>\n<p>The minimum acceptable disclosure: <strong>data categories and approximate proportions<\/strong>. &#8220;Trained on web text (40%), code (30%), academic papers (20%), and curated instruction data (10%)&#8221; tells users enough to understand the model&#8217;s knowledge distribution without revealing proprietary datasets.<\/p>\n<p>For fine-tuned models, the expectations are higher. If you fine-tuned on a specific dataset, name it. If you created synthetic training data, describe the generation process. If you used human annotation, describe the annotation guidelines and annotator demographics. This information directly affects whether the model is suitable for any given use case.<\/p>\n<p>The emerging standard for responsible disclosure includes data <strong>provenance<\/strong> (where it came from), <strong>preprocessing<\/strong> (how it was cleaned), and <strong>known gaps<\/strong> (what&#8217;s underrepresented). This level of detail is rare today but increasingly expected as regulations like the EU AI Act mandate transparency.<\/p>\n<h2>Template for a Good Model Card<\/h2>\n<p>After reviewing hundreds of model cards, a clear pattern emerges for what works:<\/p>\n<p><strong>Summary<\/strong> \u2014 Two sentences. What the model is and what it&#8217;s for. No jargon.<\/p>\n<p><strong>Quick Start<\/strong> \u2014 Working code snippet. Copy, paste, run. 
Include the exact package versions tested.<\/p>\n<p><strong>Intended Use Cases<\/strong> \u2014 Specific examples of tasks the model handles well. &#8220;Customer email classification,&#8221; not &#8220;general NLP tasks.&#8221;<\/p>\n<p><strong>Known Limitations<\/strong> \u2014 Specific examples of tasks where the model struggles. This section should be at least as long as the intended use section.<\/p>\n<p><strong>Benchmarks<\/strong> \u2014 Standard benchmarks with scores, compared to similar models. Include the evaluation methodology and any caveats.<\/p>\n<p><strong>Training Details<\/strong> \u2014 Base model, training data description, hyperparameters, compute used. As much detail as you&#8217;re comfortable sharing.<\/p>\n<p><strong>License<\/strong> \u2014 Clear statement of the license with a link to the full text. If the model inherits restrictions from a base model, state this explicitly.<\/p>\n<p><strong>Citation<\/strong> \u2014 How to cite the model in academic work, if applicable.<\/p>\n<h2>Common Model Card Failures<\/h2>\n<p>The <strong>empty model card<\/strong> is the most common failure. A model with no documentation is a model that only the creator can use effectively. The Hub is littered with potentially great models that nobody adopts because nobody knows what they do.<\/p>\n<p>The <strong>marketing model card<\/strong> oversells capabilities and omits limitations. This leads to user disappointment and erosion of trust. If your model is a 7B that&#8217;s good at coding but mediocre at creative writing, say so. Users respect honesty and punish hype.<\/p>\n<p>The <strong>academic model card<\/strong> drowns users in training details while skipping practical information. Nobody needs to know your learning rate scheduler&#8217;s warmup steps. Everyone needs to know the recommended inference parameters.<\/p>\n<p>The <strong>copy-paste model card<\/strong> copies the base model&#8217;s documentation without updating for fine-tuning changes. 
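<\/p>
<p>As a contrast to these failures, the template from the previous section renders on the Hub as YAML front matter for machine-readable metadata followed by the sections in order. The field names below follow the Hub&#8217;s card metadata conventions; every value is a placeholder, not a recommendation:<\/p>

```markdown
---
license: apache-2.0
language:
- en
base_model: your-org/base-model-7b
tags:
- code-review
---

# your-model-7b-instruct

A 7B instruction-tuned model for code review and technical documentation,
fine-tuned from the placeholder base model above.

## Quick Start
Working snippet here, with pinned package versions.

## Intended Use Cases
## Known Limitations
## Benchmarks
## Training Details
## License
## Citation
```

<p>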
If you fine-tuned Llama for medical QA, the model card should describe the medical QA capabilities, not Llama&#8217;s general architecture.<\/p>\n<h2>Documentation as Competitive Advantage<\/h2>\n<p>In a world with thousands of models on the Hub, documentation is a differentiator. Models with clear, thorough documentation get more downloads, more citations, and more community attention. The time invested in a good model card pays back in adoption.<\/p>\n<p>The best model cards tell a story: here&#8217;s what we built, here&#8217;s why, here&#8217;s what it&#8217;s good at, here&#8217;s where it falls short, and here&#8217;s how to use it. That story is what converts a browser into a user.<\/p>\n<p>For templates and best practices on AI documentation, visit <a href='https:\/\/lab.laeka.org'>Laeka Research<\/a>.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Most model cards are useless. They list architecture details nobody needs and skip the information everyone wants: what is this model good at, what is it bad at, and what data was it 
trained&#8230;<\/p>\n","protected":false},"author":1,"featured_media":273,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_kad_post_transparent":"","_kad_post_title":"","_kad_post_layout":"","_kad_post_sidebar_id":"","_kad_post_content_style":"","_kad_post_vertical_padding":"","_kad_post_feature":"","_kad_post_feature_position":"","_kad_post_header":false,"_kad_post_footer":false,"_kad_post_classname":"","footnotes":""},"categories":[251],"tags":[],"class_list":["post-274","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-open-source-ai"],"_links":{"self":[{"href":"https:\/\/laeka.org\/publications\/wp-json\/wp\/v2\/posts\/274","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/laeka.org\/publications\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/laeka.org\/publications\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/laeka.org\/publications\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/laeka.org\/publications\/wp-json\/wp\/v2\/comments?post=274"}],"version-history":[{"count":1,"href":"https:\/\/laeka.org\/publications\/wp-json\/wp\/v2\/posts\/274\/revisions"}],"predecessor-version":[{"id":431,"href":"https:\/\/laeka.org\/publications\/wp-json\/wp\/v2\/posts\/274\/revisions\/431"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/laeka.org\/publications\/wp-json\/wp\/v2\/media\/273"}],"wp:attachment":[{"href":"https:\/\/laeka.org\/publications\/wp-json\/wp\/v2\/media?parent=274"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/laeka.org\/publications\/wp-json\/wp\/v2\/categories?post=274"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/laeka.org\/publications\/wp-json\/wp\/v2\/tags?post=274"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}