Evaluations, or “Evals”, are the backbone of production-ready GenAI applications. Over the past year, we’ve built LLM-powered solutions for our customers and connected with AI leaders, uncovering a common struggle: the lack of clear, pluggable evaluation frameworks. If you’ve ever been stuck wondering how to evaluate your LLM effectively, today's post is for you. Here’s what I’ve learned about creating impactful Evals:
𝗪𝗵𝗮𝘁 𝗠𝗮𝗸𝗲𝘀 𝗮 𝗚𝗿𝗲𝗮𝘁 𝗘𝘃𝗮𝗹?
- Clarity and Focus: Prioritize a few interpretable metrics that align closely with your application’s most important outcomes.
- Efficiency: Opt for automated, fast-to-compute metrics to streamline iterative testing.
- Representation Matters: Use datasets that reflect real-world diversity to ensure reliability and scalability.
𝗧𝗵𝗲 𝗘𝘃𝗼𝗹𝘂𝘁𝗶𝗼𝗻 𝗼𝗳 𝗠𝗲𝘁𝗿𝗶𝗰𝘀: 𝗙𝗿𝗼𝗺 𝗕𝗟𝗘𝗨 𝘁𝗼 𝗟𝗟𝗠-𝗔𝘀𝘀𝗶𝘀𝘁𝗲𝗱 𝗘𝘃𝗮𝗹𝘀
Traditional metrics like BLEU and ROUGE paved the way but often miss nuances like tone or semantics. LLM-assisted Evals (e.g., GPTScore, LLM-Eval) now use a model to grade model output, achieving up to 80% agreement with human judgments. Combining machine feedback with human evaluators provides a balanced and effective assessment framework.
𝗙𝗿𝗼𝗺 𝗧𝗵𝗲𝗼𝗿𝘆 𝘁𝗼 𝗣𝗿𝗮𝗰𝘁𝗶𝗰𝗲: 𝗕𝘂𝗶𝗹𝗱𝗶𝗻𝗴 𝗬𝗼𝘂𝗿 𝗘𝘃𝗮𝗹 𝗣𝗶𝗽𝗲𝗹𝗶𝗻𝗲
- Create a Golden Test Set: Use tools like LangChain or RAGAS to simulate real-world conditions.
- Grade Effectively: Leverage libraries like TruLens or LlamaIndex for hybrid LLM+human feedback.
- Iterate and Optimize: Continuously refine metrics and evaluation flows to align with customer needs.
If you’re working on LLM-powered applications, building high-quality Evals is one of the most impactful investments you can make. It’s not just about metrics; it’s about ensuring your app resonates with real-world users and delivers measurable value.
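To make the pipeline idea concrete, here is a minimal sketch of a golden test set scored with a fast automated metric and then compared against human grades. Everything in it is an assumption for illustration: the `GOLDEN_SET` examples, the field names, the `unigram_f1` overlap metric (a simplified stand-in in the spirit of ROUGE-1, not the official ROUGE implementation) and the 0.5 pass threshold are all hypothetical, and it uses only the Python standard library rather than LangChain, RAGAS, TruLens or LlamaIndex.

```python
from collections import Counter


def unigram_f1(reference: str, candidate: str) -> float:
    """Token-overlap F1 between a reference answer and a model output.

    A simplified overlap metric for illustration, not official ROUGE.
    """
    ref = Counter(reference.lower().split())
    cand = Counter(candidate.lower().split())
    overlap = sum((ref & cand).values())  # tokens shared by both texts
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


def agreement_rate(auto_labels, human_labels) -> float:
    """Fraction of examples where the automated grade matches the human grade."""
    if not auto_labels or len(auto_labels) != len(human_labels):
        raise ValueError("label lists must be non-empty and the same length")
    matches = sum(a == h for a, h in zip(auto_labels, human_labels))
    return matches / len(auto_labels)


# Hypothetical golden test set: reference answers, model outputs, human grades.
GOLDEN_SET = [
    {"reference": "Refunds are processed within 5 business days",
     "output": "Refunds are processed within 5 business days",
     "human_pass": True},
    {"reference": "You can reset your password from the account settings page",
     "output": "Please contact support",
     "human_pass": False},
    {"reference": "Our office is closed on public holidays",
     "output": "We do not open on national holidays",
     "human_pass": True},  # correct meaning, different wording
]


def run_eval(golden_set, threshold: float = 0.5) -> dict:
    """Score every example, then compare automated pass/fail with human grades."""
    scores = [unigram_f1(ex["reference"], ex["output"]) for ex in golden_set]
    auto = [score >= threshold for score in scores]
    human = [ex["human_pass"] for ex in golden_set]
    return {
        "mean_f1": sum(scores) / len(scores),
        "agreement": agreement_rate(auto, human),
    }


if __name__ == "__main__":
    print(run_eval(GOLDEN_SET))
```

The third example is worded differently from its reference but means the same thing, so a human passes it while the overlap metric fails it. That gap is the nuance surface-level metrics miss, and tracking the agreement rate over time is one simple way to decide where an LLM-assisted grader or a human reviewer is worth adding.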
Blended Learning Models
-
👩‍🦰 Designing Accessibility Personas (https://lnkd.in/evVnB4hd). How to embed accessibility and test for it early in the design process ↓ We often assume that digital products are merely that: products. They either work or don’t work. They either help people meet their needs or fail on the way there. But every product has its own embedded personality. It can be helpful or dull, fragile or reliable, supportive or misleading. When we design it, willingly or unwillingly, we embed our values, views and perspectives into it. Sometimes it’s meticulously shaped and refined. And sometimes it’s simply random. And when that happens, users assign their perception of the product’s personality to the product instead. Products are rarely accessible by accident. There must be an intent that captures and drives accessibility efforts in a product. And the best way to do that is by involving people with temporary, situational and permanent disabilities in the design process. One simple way of achieving that is to recruit participants via tools like Access Works or UserTesting, ask admins of groups and channels on accessibility to help, or drop an email to non-profits that work in the accessibility space. Another way is establishing accessibility personas for user journeys. Consider them as user profiles that highlight common barriers faced by people with particular conditions and provide guidelines for designers and engineers on how to design and build for them. E.g. Simone, a dyslexic user, or Chris, a user with rheumatoid arthritis. For each, we document known challenges and notable considerations, design training tasks for designers and developers, and write instructions to simulate the experience through the lens of these personas. By no means does it replace proper accessibility testing, but it creates a shared understanding of what the experiences are like.
You can build on top of Gov.uk’s in-depth research project (https://lnkd.in/evVnB4hd); it also explains how to set up devices and browsers, so that each persona has their own browser profile. Once you do, you can always switch between them and simulate an experience, without changing settings every single time. All Accessibility Personas (+ Tasks, Research, Setup) https://lnkd.in/evVnB4hd Accessibility doesn’t have to be challenging if it’s considered early. No digital product is neutral. Accessibility is a deliberate decision, and a commitment. Not only does it help everyone; it also shows what a company believes in and values. And once you do have a commitment, it will be much easier to retain accessibility than to add it at the last minute as a crutch, because by then it’s way too late to do it right, and way too expensive to do it well. #ux #accessibility
-
Many teams believe they’re being inclusive when they say, “We kept accessibility in mind from the start.” But good intentions aren’t the same as meaningful inclusion. I’ve been doing accessibility and inclusive design work for 25 years. Over the last decade, I’ve focused more deeply on what true disability inclusion really means, especially when it comes to power in the design relationship. Again and again, I’ve seen the same pattern: there are levels to inclusion. And only one of them truly shifts power. Here’s how that journey tends to unfold, ranked from least to most inclusive:
Level 1: “We kept accessibility in mind.” You didn’t include disabled people. You included the idea of them. This is empathy without participation, and honestly, it’s not enough.
Level 2: “We tested with disabled people just before launch.” There’s progress here: real people were involved. But testing at the end only lets you ask: “Do you accept what we built?” It’s too late for meaningful change. This is just late-stage validation.
Level 3: “We tested early AND at the end.” Now there’s room for impact. People with disabilities had a chance to shape the work before it was finished. Their feedback could actually change the outcome, and that matters.
Level 4: “We included disabled people throughout the process.” Even better. You've moved on from a "testing" mindset. You brought people in during idea generation, design, development, and launch. You did research. You listened. You adjusted. That’s inclusion in action.
Level 5: “We co-created the solution.” ✅ This is the gold standard. You didn’t just include people; you gave them power. They helped shape the goals, question the methods, and guide the direction. It wasn’t just "your" product. It was "ours", co-created together.
Your greatest power is to give that power away. Inclusive design means shared decisions, not just shared feedback.
If you’re not sure where to start, ask yourself: 👉 Where in our process do disabled people have the power to shape what we build? And if the answer is “nowhere”—it’s time to change that. #InclusiveDesign #Accessibility #DesignLeadership #CoCreation #DisabilityInclusion #UXDesign #ProductDesign
-
Behaviors are learned and reinforced. To make performance evaluations more inclusive, you need to proactively craft new practices. 🧠 Unbiasing nudges, intentional and subtle adjustments I craft with my clients, can play a pivotal role in achieving an objective and inclusive performance assessment. 👇 Here is what to consider: 🔎 Key Decision Points Analyze your evaluation process to identify key decision points. In my practice, focusing on assessment, performance goal setting, and feedback processes has proven crucial. Introduce inclusive prompts at each stage to guide unbiased decision-making. 🔎 Common Biases Examine previous reviews to unearth prevailing biases. Halo/horn effects, recency bias, and affinity bias often surface. Counteract these biases by crafting nudges tailored to your organization, integrating them seamlessly into your review spreadsheets. 🔎 Behavioral Prompts I usually develop concise pre-decision checklists tailored to each organization. The goal is to support raters' metacognition and introduce timed prompts during the evaluation process. 🔎 Feedback Loops Begin with small-scale implementation and collect feedback. Compare perceptions of both raters and ratees to gauge effectiveness. 🔎 Ongoing Training Avoid off-the-shelf solutions; instead, tailor training to your organization's unique context and patterns. Your trainer should understand your specific needs and design a continuous training program that reinforces these unbiasing nudges, providing managers with the necessary competencies. 🔎 Pilot and Evaluation Define metrics to measure progress and impact. Pilot your unbiasing nudges and regularly evaluate their effectiveness. Adjust based on feedback and insights gained during the pilot phase. 👉 Crafting inclusive performance evaluations is an ongoing journey. Yet, I believe, it's one of the most important ones. Each evaluation matters as it defines a person's career and sometimes even the future. 
________________________________________ Are you looking for more DEI x Performance-related recommendations like this? 📨 Join my free DEI Newsletter:
-
Why inclusion and universal design need to come together
We often hear organisations talk about diversity and inclusion. Yet inclusion alone isn’t enough if the systems we work within were never designed with difference in mind. A review by Shore and colleagues (2018) (https://lnkd.in/e6vjNAXM) looked at what makes workplaces truly inclusive. They emphasised fairness, authenticity, and equal access to opportunities. Their model shows that inclusion is not just about who is in the workforce, but whether everyone feels respected, valued, and able to participate fully. But here’s the challenge: many workplace practices are retrofits. Adjustments are made once someone discloses a need or points out a barrier. That can work, but it’s often costly, time-consuming, and can unintentionally stigmatise the individual. This is where Universal Design (UD) comes in. Instead of waiting to respond, UD builds accessibility, flexibility, and usability into everyday business-as-usual. It reduces the number of case-by-case “fixes” by planning for variation from the outset. For example:
- Providing captions and transcripts in training as standard helps Deaf staff, those learning English, and anyone re-watching on mute.
- Clear communication, step-by-step checklists, and structured task tools reduce overload not only for neurodivergent employees but for everyone.
- Designing sensory-friendly workspaces supports those with sensory sensitivities, and also improves focus and wellbeing for the whole team.
So how do the two approaches differ and align? Inclusion models focus on culture: creating fairness, authenticity, and psychological safety. Universal Design focuses on structures: embedding accessibility and flexibility into systems, tools, and environments. Bringing them together means leaders shape workplaces that are both fair and functional, inclusive and accessible. For employers, this isn’t just the right thing to do; it’s efficient.
Many UD approaches are low or no cost, but they reduce duplication, improve resilience, and make personalised support less stigmatising. 👉 Takeaway: Inclusive practices create the right mindset; Universal Design creates the mechanisms. Together, they help us move from patching barriers to preventing them.
-
Why the blueberry muffin accessibility analogy works so well for learning content. I still find the blueberry muffin analogy one of the best ways of explaining why it's so important to consider accessibility from the start of a learning project. Of course, you can add in accessibility afterwards, but imagine pushing those blueberries in by hand after the muffin is cooked. Not only does it feel like an afterthought for the learner, but it's also frustrating and time-consuming for the practitioner! In my recent conversation with Bill Banham on the Voices of the Learning Network Podcast, we explored what baking accessibility in from the start looks like - and how accessible design leads to better outcomes for all learners. We discussed simple ways to include accessibility in your everyday practice: - Writing clear, descriptive alt text that adds context. - Providing accurate captions and transcripts that benefit everyone. - Using consistent heading levels so learners and assistive technologies can navigate easily. - Applying good colour contrast. - Using plain language to reduce cognitive load. We also explored practical ways AI can help practitioners apply accessibility and why leadership matters for modelling inclusion, celebrating progress, and embedding accessibility into standards and strategy. When research shows that up to a quarter of your learners may have a disability or experience a temporary or situational access need, accessibility becomes more than a nice-to-have - it's a fundamental part of excellent learning design. So the next time you design a course, remember the blueberry muffin. Accessibility isn't an ingredient to add in at the end - it needs to be baked in from the start. You can listen to my full conversation with Bill at the link below: https://lnkd.in/eiBeiTEr #eLearning #Accessibility #AccessibleLearning #eln (Blueberry muffins on a wooden surface, with fresh blueberries scattered nearby. 
Baked blueberries are generously distributed through the batter of the muffins, creating deep purple pockets.)
-
Not all soft skills training is created equal. A few months ago, I was working with a group of managers from a large manufacturing company. They had been through plenty of training programs before, the kind where you take notes and then go right back to doing things the old way. When I walked into the room, I could see it in their faces: Let’s see if this is any different. So instead of starting with slides or theory, I took them straight into a live simulation:
- A crisis scenario that could actually happen in their business.
- Conflicting priorities, tough personalities, and limited time to decide.
- Every move they made in real time had visible consequences.
At first, I saw a lot of resistance to experimentation. Quieter voices were ignored, and critical information was lost. The room was tense. People hesitated. Some stuck to their usual patterns. But as the simulation went deeper, they started communicating much more effectively, which led to them collaborating, noticing blind spots, and eventually testing new ways to lead. By the end, they weren’t asking, “Will this work?” They said they wanted to cascade it to their teams. Weeks later, I got an email from one of the managers. He told me he had used the exact process from our simulation to navigate a real customer crisis, and not only avoided a major fallout but actually strengthened the client relationship. That’s the difference between training that’s forgotten by the time you’re back at your desk, and training that rewires how you think, act, and lead. The secret? Immersion. When participants practice real scenarios, solve actual challenges, and see the impact of their decisions in the room, learning sticks.
Priya Arora #immersivelearning #trainingdesign #employeeengagement #learningthatsticks #corporatelearning #leadershipdevelopment #upskilling #skillbuilding #workplacetraining #experientiallearning #Learningdeisgn #corporatetrainer #softskillstrainer #simulation #experintialtraining
-
🎓 Can we revolutionize university education by borrowing a strategy from medicine?🎓 In healthcare, teaching hospitals have long been the gold standard for preparing future doctors—immersing them in real-world scenarios under the guidance of experienced professionals. Imagine applying that same model across other disciplines. This is exactly what the Space Flight Laboratory (SFL) at the University of Toronto has done, and the results speak for themselves. Since 1998, SFL has adopted a "teaching hospital" approach to educate its graduate students in spacecraft engineering, blending formal instruction, cutting-edge research, and hands-on, real-world practice. Students don't just learn theories—they apply them in mission-critical environments, working on actual satellite projects for paying customers. The outcome? Graduates who are not only skilled but also seasoned in the complexities of their field, ready to tackle challenges with confidence and creativity. Why stop at aerospace engineering? Entrepreneurial pedagogies have similarly embraced hands-on, real-world learning, pushing students to solve complex problems with innovative thinking. Like the teaching hospital model, entrepreneurial education thrives on bridging the gap between theory and practice, ensuring students are not just academically proficient but also professionally ready. Universities often keep real-world practice at arm's length, relegating it to internships and co-op programs. But as the demands of society grow more complex, it's time to rethink this approach. Imagine what could happen if we integrated these immersive learning models into disciplines beyond medicine and engineering—fields like business, environmental science, and the humanities. We could cultivate a new generation of graduates with the critical thinking skills and practical experience necessary to make immediate, impactful contributions to their fields. 
It's time to challenge the status quo and advocate for wider adoption of teaching hospital and entrepreneurial models across university disciplines. The future of education and society may depend on it. #EducationInnovation #TeachingHospitalModel #ExperientialLearning #EntrepreneurshipEducation #HigherEd #FutureOfEducation #InnovationInEducation #Universities
-
Immersive learning isn’t the future—it’s happening now at the American University of Ras Al Khaimah. Over the past term at AURAK, my students and I embarked on a journey to transform traditional teaching materials into interactive, immersive learning modules using ThingLink. Across five departments—from AI and Chemistry to Biotechnology and Media Production—we’ve built something special: a scalable model for faculty-led, student-powered e-learning innovation. In this article, I reflect on our process, share real student projects, and explore the learning theories that guide this work. I also talk about why empowering faculty to design their own immersive content is more sustainable than outsourcing. I’d love for you to read, share, and join the conversation on how we can rethink education together. A big thank you to all the innovators and leaders from AURAK Cijo Vazhappilly Khouloud Salameh Prof. Irshad Ahmad Dr. Sara Faiz Mohamed Sharul #EdTech #ImmersiveLearning #InstructionalDesign #HigherEducation #ThingLink #FacultyDevelopment #VRinEducation #DigitalPedagogy
-
*** 🚨 Discussion Piece 🚨 *** Is it Time to Move Beyond Kirkpatrick & Phillips for Measuring L&D Effectiveness? Did you know organisations spend billions on Learning & Development (L&D), yet only 10%-40% of that investment actually translates into lasting behavioral change? (Kirwan, 2024) As Brinkerhoff vividly puts it, "training today yields about an ounce of value for every pound of resources invested." 1️⃣ Limitations of Popular Models: Kirkpatrick's four-level evaluation and Phillips' ROI approach are widely used, but both neglect critical factors like learner motivation, workplace support, and learning transfer conditions. 2️⃣ Importance of Formative Evaluation: Evaluating the learning environment, individual motivations, and training design helps to significantly improve L&D outcomes, rather than simply measuring after-the-fact results. 3️⃣ A Comprehensive Evaluation Model: Kirwan proposes a holistic "learning effectiveness audit," which integrates inputs, workplace factors, and measurable outcomes, including Return on Expectations (ROE), for more practical insights. Why This Matters: Relying exclusively on traditional, outcome-focused evaluation methods may give a false sense of achievement, missing out on opportunities for meaningful improvement. Adopting a balanced, formative-summative approach could ensure that billions invested in L&D truly drive organisational success. Is your organisation still relying solely on Kirkpatrick or Phillips—or are you ready to evolve your L&D evaluation strategy?