Improving LLM Generalization Through Self-Certainty


Summary

Improving LLM generalization through self-certainty is about teaching large language models (LLMs) to better understand when they are confident or uncertain in their responses, so they can adapt and give more reliable answers, especially when facing new problems or unfamiliar data. This involves techniques that help LLMs reflect on their own reasoning, adjust to new situations, and accurately communicate how sure they are about their outputs.

  • Strengthen uncertainty awareness: encourage your models to estimate and report their own confidence, making it easier to spot mistakes and reduce misleading answers.
  • Enable self-reflection: fine-tune LLMs to review and improve their own responses by learning from past errors, not just copying examples.
  • Guide focused attention: adjust how models prioritize information in prompts, so they rely on relevant, up-to-date context instead of old training habits.
Summarized by AI based on LinkedIn member posts
  • View profile for Ross Dawson
    Ross Dawson is an Influencer

    Futurist | Board advisor | Global keynote speaker | Founder: AHT Group - Informivity - Bondi Innovation | Humans + AI Leader | Bestselling author | Podcaster | LinkedIn Top Voice

    35,726 followers

    One of the biggest constraints on the value of LLMs is that they are equally confident irrespective of underlying uncertainty. A new model, Entropix, proposes using different strategies for selecting the next token depending on the nature of the model's uncertainty. A great piece by Thariq Shihipar lays out the logic. The starting point is distinguishing between entropy and varentropy. Entropy measures how concentrated or diffuse the options for the next token are: low entropy means the model has one very high-probability next token, while high entropy suggests a number of similarly probable candidates. Varentropy assesses how much those probabilities vary, either consistent (low) or varied (high). Each of the four combinations yields a different strategy for next-token selection:

    ⬇️⬇️ Low Entropy, Low Varentropy: Model is very confident → Choose the highest probability option
    ⬇️⬆️ Low Entropy, High Varentropy: A few strong competing options → Consider branching to explore different paths
    ⬆️⬇️ High Entropy, Low Varentropy: Model is uncertain → Use "thinking tokens" to prompt more consideration
    ⬆️⬆️ High Entropy, High Varentropy: Many scattered options → Use random selection or branching

    These are still early days in being able to assess model uncertainty and adjust to improve output validity (including reducing hallucinations), but progress here will greatly improve the value of LLMs. Another critical aspect of this research is in Humans + AI work. Humans have to make their own assessments of LLM outputs of highly varying quality. Decision quality could improve massively if LLMs could offer valid confidence assessments as input into complex human-first decisions.
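
    A minimal sketch of how entropy and varentropy can be computed from next-token logits and mapped to the four strategies above. This illustrates the general idea, not Entropix's actual implementation; the thresholds are arbitrary placeholders.

    ```python
    import torch
    import torch.nn.functional as F

    def entropy_varentropy(logits: torch.Tensor) -> tuple[float, float]:
        """Entropy and varentropy of the next-token distribution.

        logits: 1-D tensor of unnormalized scores over the vocabulary.
        """
        log_probs = F.log_softmax(logits, dim=-1)
        probs = log_probs.exp()
        surprisal = -log_probs
        # Entropy: expected surprisal, H = -sum(p * log p)
        entropy = (probs * surprisal).sum()
        # Varentropy: variance of the surprisal under the same distribution
        varentropy = (probs * (surprisal - entropy) ** 2).sum()
        return entropy.item(), varentropy.item()

    def pick_strategy(entropy: float, varentropy: float,
                      ent_thresh: float = 2.0, vent_thresh: float = 2.0) -> str:
        # Thresholds are placeholders; a real system would tune them.
        if entropy < ent_thresh and varentropy < vent_thresh:
            return "argmax"           # confident: take the top token
        if entropy < ent_thresh:
            return "branch"           # few strong competitors: explore paths
        if varentropy < vent_thresh:
            return "insert_thinking"  # uncertain: inject a thinking token
        return "sample"               # scattered: random sampling / branching
    ```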

  • View profile for Aishwarya Naresh Reganti

    Founder & CEO @ LevelUp Labs | Ex-AWS | Consulting, Training & Investing in AI

    123,789 followers

    🤔 What if, instead of using prompts, you could fine-tune LLMs to incorporate self-feedback and improvement mechanisms more effectively? Self-feedback and improvement have been shown to be highly beneficial for LLMs and agents, allowing them to reflect on their behavior and reasoning and correct their mistakes as more computational resources or interactions become available. The authors note that commonly used test-time methods for self-improvement, such as prompt tuning and few-shot learning, often fail to enable models to correct their mistakes in complex reasoning tasks.

    ⛳ The paper introduces RISE: Recursive Introspection, an approach to improve LLMs by teaching them how to introspect on and improve their responses iteratively.
    ⛳ RISE leverages principles from online imitation learning and reinforcement learning to develop a self-improvement mechanism within LLMs. By treating each prompt as part of a multi-turn Markov decision process (MDP), RISE allows models to learn from their previous attempts and refine their answers over multiple turns, ultimately improving their problem-solving capabilities.
    ⛳ It models the fine-tuning process as a multi-turn MDP, where the initial state is the prompt and subsequent states involve recursive improvements.
    ⛳ It employs a reward-weighted regression (RWR) objective to learn from both high- and low-quality rollouts, enabling models to improve over turns. The approach uses data generated by the learner itself or by more capable models to supervise improvements iteratively.

    RISE significantly improves the performance of LLMs like LLaMa2, LLaMa3, and Mistral on math reasoning tasks, outperforming single-turn strategies with the same computational resources. Link: https://lnkd.in/e2JDQr8M
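
    A rough sketch of the reward-weighted regression idea at the heart of RISE: weight the standard fine-tuning loss on each revision rollout by its (normalized) exponentiated reward, so high-reward improvement attempts contribute more. This is a simplified illustration, not the paper's exact objective; it assumes a Hugging Face-style `model(ids).logits` interface, and the rollout fields are hypothetical.

    ```python
    import torch
    import torch.nn.functional as F

    def rwr_loss(model, rollouts, temperature: float = 1.0):
        """Reward-weighted regression over multi-turn rollouts (simplified).

        rollouts: list of dicts with
          - "context_ids": LongTensor, prompt + previous attempts
          - "target_ids":  LongTensor, the revised response to imitate
          - "reward":      float, quality of that revision
        """
        rewards = torch.tensor([r["reward"] for r in rollouts])
        # exp(r / T), normalized: high-reward rollouts dominate the update
        weights = torch.softmax(rewards / temperature, dim=0)

        total = 0.0
        for w, r in zip(weights, rollouts):
            ids = torch.cat([r["context_ids"], r["target_ids"]]).unsqueeze(0)
            logits = model(ids).logits
            tgt_len = r["target_ids"].shape[0]
            # Shifted LM loss restricted to the revised-response span:
            # the logit at position i predicts the token at position i+1.
            pred = logits[0, -tgt_len - 1:-1]
            nll = F.cross_entropy(pred, r["target_ids"])
            total = total + w * nll
        return total
    ```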

  • View profile for Sohrab Rahimi

    Director, AI/ML Lead @ Google

    23,608 followers

    Most LLM agents stop learning after fine-tuning. They can replay expert demos but can't adapt when the world changes. That's because we train them with imitation learning: they copy human actions without seeing what happens when they fail. It's reward-free but narrow. The next logical step, reinforcement learning, lets agents explore and learn from rewards, yet in real settings (e.g. websites, APIs, operating systems) reliable rewards rarely exist or appear too late. RL becomes unstable and costly, leaving LLMs stuck between a method that can't generalize and one that can't start.

    Researchers from Meta and Ohio State propose a bridge called Early Experience. Instead of waiting for rewards, agents act, observe what happens, and turn those future states into supervision. It's still reward-free but grounded in real consequences. They test two ways to use this data:
    1. Implicit World Modeling: for every state–action pair, predict the next state. The model learns how the world reacts: what actions lead where, what failures look like.
    2. Self-Reflection: sample a few alternative actions, execute them, and ask the model to explain in language why the expert's move was better. These reflections become new training targets, teaching decision principles that transfer across tasks.

    Across eight benchmarks, from home simulations and science labs to APIs, travel planning, and web navigation, both methods beat imitation learning. In WebShop, success jumped from 42% to 60%; in long-horizon planning, gains reached 15 points. When later fine-tuned with RL, these checkpoints reached higher final performance and needed half (or even one-eighth) of the expert data. The gains held from 3B to 70B-parameter models.

    To use this yourself (see the sketch after this post):
    • Log each interaction and store a short summary of the next state: success, error, or side effect.
    • Run a brief next-state prediction phase before your normal fine-tune so the model learns transitions.
    • Add reflection data: run two to four alternative actions, collect results, and prompt the model to explain why the expert step was better. Train on those reflections plus the correct action.
    • Keep compute constant: replace part of imitation learning, don't add more.

    This approach makes agent training cheaper, less dependent on scarce expert data, and more adaptive. As models learn from self-generated experience, the skill barrier for building capable agents drops dramatically. In my opinion, the new challenge is governance and ensuring they don't learn the wrong lessons. That means filtering unsafe traces, constraining environments to safe actions, and auditing reflections before they become training data. When rewards are scarce and demonstrations costly, let the agent learn from what it already has: its own experience! That shift turns LLMs from static imitators into dynamic learners and moves us closer to systems that truly improve through interaction, safely and at scale.
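
    A minimal sketch of how the two Early Experience data streams could be assembled from logged agent interactions. This is an illustration under assumptions, not the paper's code; `env.step`, `agent.sample_action`, and `llm_explain` are hypothetical stand-ins for your environment, policy, and reflection prompt.

    ```python
    from dataclasses import dataclass

    @dataclass
    class Transition:
        state: str          # observation the agent saw
        expert_action: str  # what the demonstration did
        next_state: str     # what actually happened afterwards

    def world_modeling_examples(log: list[Transition]) -> list[dict]:
        """Implicit World Modeling: train the model to predict the next state."""
        return [
            {"prompt": f"State: {t.state}\nAction: {t.expert_action}\nNext state:",
             "target": t.next_state}
            for t in log
        ]

    def self_reflection_examples(log, env, agent, llm_explain, k: int = 3):
        """Self-Reflection: execute a few alternatives, then have the model
        explain in language why the expert's move was better."""
        data = []
        for t in log:
            alternatives = []
            for _ in range(k):  # two to four alternatives, per the post
                alt = agent.sample_action(t.state)
                outcome = env.step(t.state, alt)  # real consequence, no reward
                alternatives.append((alt, outcome))
            reflection = llm_explain(t.state, t.expert_action, t.next_state,
                                     alternatives)
            data.append(
                {"prompt": f"State: {t.state}\nWhy is {t.expert_action} best?",
                 "target": reflection + f"\nAction: {t.expert_action}"}
            )
        return data
    ```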

  • View profile for Karyna Naminas

    CEO of Label Your Data. Helping AI teams deploy their ML models faster.

    6,591 followers

    🧠 What if you could fine-tune a language model just by optimizing its uncertainty estimates? That's what this new research from Bojana Ranković and Philippe Schwaller (EPFL & NCCR Catalysis) explores, with no supervised loss, no handcrafted features, and no domain-specific pretraining. They call it GOLLuM (Gaussian Process Optimized LLMs). This method fine-tunes LLMs to work better with Bayesian optimization by reshaping their internal embedding space.

    ⏳ Quick breakdown:
    - Goal: Improve sample-efficient discovery using LLMs + uncertainty, especially in scientific domains with limited data.
    - Method: Jointly train the LLM and a Gaussian process by maximizing the marginal likelihood. The model learns to group similar outputs together, without a contrastive loss.
    - Results:
      ➡️ 23% more top-performing hits than static embeddings
      ➡️ 114% improvement over the best prior method (LAPEFT)
      ➡️ Consistent gains across 19 real-world chemistry tasks
    - Why it matters: This lets you turn general-purpose LLMs into task-aware optimizers, without extra data or domain tricks. It's efficient, flexible, and works even in data-scarce scenarios.

    💬 The smoother the latent space, the better the decisions. That's a big shift in how we think about LLM fine-tuning. How soon do you think uncertainty-based training will become mainstream? #LLMs #BayesianOptimization #MachineLearning #DataAnnotation #AIResearch #UncertaintyEstimation #RepresentationLearning #ChemistryAI
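
    A compact sketch of the joint-training mechanism, assuming a GPyTorch-style exact GP whose kernel operates on embeddings produced by the LLM (deep kernel learning). This illustrates the general idea of maximizing marginal likelihood through the encoder, not GOLLuM's actual architecture; `encoder`, `train_x`, and `train_y` are assumed to be defined by you.

    ```python
    import torch
    import gpytorch

    class LLMDeepKernelGP(gpytorch.models.ExactGP):
        """Exact GP whose kernel sees LLM embeddings (deep kernel learning)."""

        def __init__(self, train_x, train_y, likelihood, encoder):
            super().__init__(train_x, train_y, likelihood)
            self.encoder = encoder  # maps tokenized inputs -> embeddings
            self.mean_module = gpytorch.means.ConstantMean()
            self.covar_module = gpytorch.kernels.ScaleKernel(
                gpytorch.kernels.MaternKernel(nu=2.5)
            )

        def forward(self, x):
            z = self.encoder(x)  # gradients flow back into the LLM
            return gpytorch.distributions.MultivariateNormal(
                self.mean_module(z), self.covar_module(z)
            )

    # Joint training: a single loss (negative marginal log likelihood)
    # updates both the GP hyperparameters and the LLM encoder's weights,
    # reshaping the embedding space to suit Bayesian optimization.
    likelihood = gpytorch.likelihoods.GaussianLikelihood()
    model = LLMDeepKernelGP(train_x, train_y, likelihood, encoder)
    mll = gpytorch.mlls.ExactMarginalLogLikelihood(likelihood, model)

    optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
    model.train()
    likelihood.train()
    for _ in range(100):
        optimizer.zero_grad()
        loss = -mll(model(train_x), train_y)
        loss.backward()
        optimizer.step()
    ```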

  • View profile for Michael Malak

    Agentic AI

    4,271 followers

    LLMs get stuck in a rut of their training corpus. You try to introduce a new concept or a new term, and the attention mechanism drags up from the bottom of the sea something similar, something staid, something irrelevant. Researchers in Japan just showed a counter-move: don't "prompt harder"; refocus what the model is allowed to attend to in the here and now.

    The meat of it is they treat attention heads like specialists: some heads behave like "anchors" and "copy/aggregation" operators under structured reasoning prompts, so they identify those heads and intervene at inference time by reweighting attention toward them. Yes, they demo it in a narrow domain (lists of logical rules). Concretely, they bias attention so a rule tag/name (e.g., "Rule14") is compelled to attend to the *actual text of that rule* and not drift to neighboring rules or to something similar from its original training data.

    But it feels like there is potential for much more: any well-defined novel term could be forced to "read its local definition/evidence" instead of free-associating from pretraining priors. And a domain Knowledge Graph could bootstrap the term -> definition mapping. The next leap may not be bigger LLMs. It may be enforced attention discipline: models that are finally forced to read the prompt and use domain-specific knowledge. https://lnkd.in/exr3N9iN
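
    A toy sketch of inference-time attention steering: add a bias to the pre-softmax attention scores of chosen heads so that the tokens of a rule mention (e.g. "Rule14") attend to the span holding that rule's text. This is a generic illustration of the intervention, not the Japanese team's method; head selection and the bias strength are assumptions.

    ```python
    import torch

    def steer_attention(scores: torch.Tensor,
                        query_positions: list[int],
                        evidence_positions: list[int],
                        heads: list[int],
                        bias: float = 4.0) -> torch.Tensor:
        """Bias pre-softmax attention scores so selected heads attend from
        a term's mention (query positions) to its definition/evidence.

        scores: [batch, n_heads, seq_len, seq_len] attention logits.
        """
        steered = scores.clone()
        for h in heads:                # only the identified "anchor" heads
            for q in query_positions:  # e.g. the tokens of "Rule14"
                steered[:, h, q, evidence_positions] += bias
        # The softmax that follows renormalizes mass toward the evidence.
        return steered
    ```

    In practice you would apply this inside the model's attention module (e.g. via a forward hook that intercepts the scores before the softmax), after first identifying which heads act as anchors on your reasoning prompts.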

  • View profile for Dattaraj Rao

    Chief Data Scientist | Agentic AI | Innovation | ex-GE | Author | 11 Patents

    13,014 followers

    Exciting new research from Anthropic introduces Internal Coherence Maximization (ICM), a breakthrough method that allows #largelanguagemodels (LLMs) to #finetune themselves using only their own outputs - potentially reducing or even replacing the need for human oversight in complex tasks. The model evaluates the consistency of its own responses and optimizes itself by comparing and correcting inconsistent statements. In benchmarks such as TruthfulQA and GSM8K, ICM achieved results similar to or better than models trained with classic supervised fine-tuning. https://lnkd.in/d3Kbdsts
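
    A highly simplified sketch of the coherence idea: score a set of self-generated labels by how mutually predictable and logically consistent they are, and search for the labeling that maximizes that score. This is a schematic reading of ICM, not Anthropic's implementation; `label_logprob` and `contradicts` are hypothetical helpers you would back with the model itself.

    ```python
    import random

    def coherence(labels: dict, label_logprob, contradicts, alpha: float = 10.0):
        """Score = mutual predictability - alpha * logical inconsistencies."""
        mutual = sum(
            # how well the *other* labels predict this example's label
            label_logprob(x, y, context={k: v for k, v in labels.items() if k != x})
            for x, y in labels.items()
        )
        conflicts = sum(
            contradicts(x, labels[x], z, labels[z])
            for x in labels for z in labels if x < z  # assumes orderable keys
        )
        return mutual - alpha * conflicts

    def icm_search(examples, label_logprob, contradicts, steps: int = 1000):
        """Greedy local search: flip one label at a time, keep improvements."""
        labels = {x: random.choice([True, False]) for x in examples}
        best = coherence(labels, label_logprob, contradicts)
        for _ in range(steps):
            x = random.choice(examples)
            labels[x] = not labels[x]   # propose a flip
            score = coherence(labels, label_logprob, contradicts)
            if score >= best:
                best = score            # keep the flip
            else:
                labels[x] = not labels[x]  # revert
        return labels
    ```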

  • View profile for Justine Juillard

    Co-Founder of Girls Into VC @ Berkeley | Advocate for Women in VC and Entrepreneurship | Incoming S&T Summer Analyst @ GS

    47,770 followers

    LLMs are frozen artifacts. They're trained once on trillions of tokens and then exposed to the real world with no capacity to learn from it. So the goal would be to create systems that can generate their own training data, curate their own feedback, and improve over time, without constant human retraining. Right now, there are multiple approaches to self-refinement.

    1. Synthetic Instruction Tuning
    Use a base LLM to generate instruction–response pairs, filter for quality using heuristics or a reward model, then fine-tune the same or another model on that data.

    2. Chain-of-Thought Bootstrapping
    Models generate reasoning traces to explain their answers, then train on their own rationales (or better ones selected by a separate model).

    3. Critique-Rewrite Loops (see the sketch after this post)
    A generator produces answers, a critic evaluates coherence, relevance, and factuality, and a reviser rewrites the original response based on the critique.

    4. Self-Reward via Reinforcement
    Rather than relying on external human feedback (RLHF), models generate and score their own trajectories via reward modeling or KL-constrained reinforcement learning.

    5. Memory-Augmented Self-Tuning
    Rather than updating weights, models use:
    - Vector memory caches
    - Long-term key–value memory layers
    - Persistent retrieval databases that evolve over time

    Self-training loops sound efficient. But they can go sideways fast:
    1. Model collapse: If you fine-tune a model repeatedly on its own outputs without intervention, you get distributional narrowing. The model becomes overconfident, less diverse, and more detached from human language.
    2. Bias amplification: Errors, stereotypes, or toxic patterns can compound if not filtered. Without ground-truth anchoring, reinforcement becomes self-justifying.
    3. Feedback contamination: In agentic systems (like document summarizers), it's possible for self-refined models to corrupt their own input corpus by rewriting files or logs they later use as training data.
    4. Drift from human intent: Even if the model optimizes for performance or reward, it can diverge from human values or business goals if the reward function isn't explicitly aligned with them. Self-refinement is not self-alignment.

    The benefits are real:
    - Faster iteration cycles
    - Better personalization without retraining infrastructure
    - Adaptation to edge cases and evolving domains

    But self-refinement also blurs the line between:
    - Learning and drift
    - Autonomy and accountability
    - Improvement and mutation

    It requires a whole new set of MLOps practices:
    - Traceable self-updates
    - Versioning and rollback of self-modified models
    - Human-in-the-loop feedback at key checkpoints
    - Isolation of critical systems from self-rewriting logic

    👉 I'm giving myself 30 days to learn about AI. Follow Justine Juillard and let's get smarter, together.
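
    A minimal sketch of the critique-rewrite loop (approach 3 above), assuming a generic `llm(prompt) -> str` completion function as a hypothetical stand-in for whatever client you use:

    ```python
    def critique_rewrite(llm, question: str, max_rounds: int = 3) -> str:
        """Generator -> critic -> reviser loop; stops when the critic approves."""
        # Generator: produce an initial answer
        answer = llm(f"Answer the question:\n{question}")
        for _ in range(max_rounds):
            # Critic: evaluate coherence, relevance, factuality
            critique = llm(
                "Critique this answer for coherence, relevance, and factuality. "
                "Reply APPROVED if no issues remain.\n"
                f"Question: {question}\nAnswer: {answer}"
            )
            if "APPROVED" in critique:
                break
            # Reviser: rewrite the original response based on the critique
            answer = llm(
                "Rewrite the answer to address the critique.\n"
                f"Question: {question}\nAnswer: {answer}\nCritique: {critique}"
            )
        return answer
    ```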

  • View profile for Rohan Katyal

    Co-Founder Milana - The first AI product engineer. Previously Meta, Yelp, Yahoo, and CS @ Gatech

    15,371 followers

    Your LLM just confidently told you X happened. It didn't. Hallucinations happen because of incentives. It's a result of how LLMs are trained. Standard training rewards guessing over admitting uncertainty. So when an LLM doesn't know something, it doesn't say "I'm not sure". It invents a plausible answer and delivers it with complete confidence. Here are some of the tactics that we use at Vantara to reduce false positives (see the schema sketch after this post):

    1 - Fight affirmation bias. LLMs default to certainty. Explicitly allow them to say "no" and "uncertain" in your prompts. Update schema descriptions to normalize uncertainty as a valid response.
    2 - Add "NA" and "uncertain" to ENUMs. When your field type is an ENUM, include these options. Without them, the model will force-fit an answer from the available choices even when none apply.
    3 - Set realistic base rates for confirmations. When asking for confirmation (e.g. "did X happen?"), include context like: "In 90% of cases this will not be true. When in doubt, choose false." This recalibrates the model's threshold for making positive claims.
    4 - Always pair confirmations with evidence + thinking. Don't just ask "did this happen?" Require the specific evidence and reasoning chain that led to the conclusion.

    One of the reasons models hallucinate is that saying "I don't know" is penalized more than being confidently wrong. If we don't intentionally design for uncertainty, we're designing for hallucination. Any tactics you've been using to reduce LLM false positives? Link to OpenAI's paper on this subject is in the comments.
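
    A small illustration of tactics 2-4 as a structured-output schema: the enum admits "NA" and "uncertain", the description sets a base rate, and any answer must travel with evidence and reasoning. Field names and the 90% figure are illustrative assumptions, not Vantara's actual schema.

    ```python
    # JSON Schema (as a Python dict) for a confirmation question.
    # "NA" and "uncertain" are first-class answers, and every claim
    # must be backed by evidence and a reasoning chain.
    confirmation_schema = {
        "type": "object",
        "properties": {
            "answer": {
                "type": "string",
                "enum": ["true", "false", "NA", "uncertain"],
                "description": (
                    "In ~90% of cases this will be false. Choose 'uncertain' "
                    "when the evidence is ambiguous and 'NA' when the question "
                    "does not apply. Do not guess."
                ),
            },
            "evidence": {
                "type": "string",
                "description": "Verbatim quote from the source supporting the answer.",
            },
            "reasoning": {
                "type": "string",
                "description": "Step-by-step chain from the evidence to the answer.",
            },
        },
        "required": ["answer", "evidence", "reasoning"],
    }
    ```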
