Top LinkedIn Content on Training Content Management Systems

building AI systems @meta

206,810 followers 1y

Disclosing the full list of datasets used to train IBM LLMs Granite 3.0. This is true transparency - no other LLM provider shares such detailed information about their training datasets.

WEB Data
- FineWeb: More than 15T tokens of cleaned and deduplicated English data from CommonCrawl.
- Webhose: Unstructured web content in English converted into machine-readable data.
- DCLM-Baseline: A 4T token / 3B document pretraining dataset that achieves strong performance on language model benchmarks.

CODE
- Code Pile: Sourced from publicly available datasets like GitHub Code Clean and StarCoderdata.
- FineWeb-Code: Contains programming/coding-related documents filtered from the FineWeb dataset using annotation.
- CodeContests: Competitive programming dataset with problems, test cases, and human solutions in multiple languages.

DOMAIN
- USPTO: Collection of US patents granted from 1975 to 2023.
- Free Law: Public-domain legal opinions from US federal and state courts.
- PubMed Central: Biomedical and life sciences papers.
- EDGAR Filings: Annual reports from US publicly traded companies over 25 years.

MULTILINGUAL
- Multilingual Wikipedia: Data from 11 languages to support multilingual capabilities.
- Multilingual Webhose: Multilingual web content converted into machine-readable data feeds.
- MADLAD-12: Document-level multilingual dataset covering 12 languages.

INSTRUCTIONS
- Code Instructions Alpaca: Instruction-response pairs about code generation problems.
- Glaive Function Calling: Dataset focused on function calling in real scenarios.

ACADEMIC
- peS2o: A collection of 40M open-access academic papers for pre-training.
- arXiv: Scientific paper pre-prints posted to arXiv. Full author acknowledgement can be found here.
- IEEE: Technical content from IEEE acquired by IBM.

TECHNICAL
- Wikipedia: Technical articles sourced from Wikipedia.
- Library of Congress Public Domain Books: More than 140,000 public domain English books.
- Directory of Open Access Books: Publicly available technical books from the Directory of Open Access Books.
- Cosmopedia: Synthetic textbooks, blog posts, stories, and WikiHow articles.

MATH
- OpenWebMath: Mathematical text from the internet, filtered from 200B HTML files.
- Algebraic-Stack: Mathematical code dataset including numerical computing and formal mathematics.
- Stack Exchange: User-contributed content from the Stack Exchange network.
- MetaMathQA: Dataset of rewritten mathematical questions.
- StackMathQA: A curated collection of 2 million mathematical questions from Stack Exchange.
- MathInstruct: Focused on chain-of-thought (CoT) and program-of-thought (PoT) rationales for mathematical reasoning.
- TemplateGSM: Collection of over 7 million grade-school math problems with code and natural language solutions.

BOOM!

114 Comments

Vitaly Friedman

Practical insights for better UX • Running “Measure UX” and “Design Patterns For AI” • Founder of SmashingMag • Speaker • Loves writing, checklists and running workshops on UX. 🍣

225,946 followers 1y

👨🏾‍💻 How People Use Screen Readers. With behavior patterns, practical insights and things to keep in mind for accessibility.

✅ 253 million people worldwide have a visual impairment.
✅ Screen readers help them translate text to speech or Braille.
✅ They work for websites, PDFs, emails, OS and other documents.
✅ They use the same voice regardless of font size, weight, color.
✅ E.g. JAWS/NVDA (Win, 80% share), VoiceOver (iOS), TalkBack (Android).
🤔 Users often listen to screen readers at 1.5–2.0x speed.
✅ Repetitive labels and hints aren't helpful (image caption, alt).
✅ Content order during tabbing conveys the structure of the page.
✅ Follow a logical linear layout, don't spread content all over a page.
🚫 Auto-playing audio is often an alarming, frustrating experience.
🤔 Users heavily rely on descriptive headings and labels.
🚫 Screen readers can’t extract meaning from images or videos.
✅ Avoid "Click here", "Read more", "View now" for links.
✅ A text box without a label is meaningless to screen readers.
✅ Never rely on visuals alone, they might not even be there.
🤔 Frequent issues with poorly structured forms, navigation, PDFs.
✅ Add UI controls for mouse-precise actions (drag'n'drop, resizing).
✅ Include nav landmarks, so users can jump within the page quickly.
✅ Ensure PDF/UA compliance to generate accessible PDFs.
✅ Always add labels to forms and avoid CAPTCHAs if you can.

Where “abled people” use their natural senses such as sight and hearing, people with disabilities must rely on technologies. Screen reading UX shouldn’t mean a “simplified” experience. It’s just a different experience, one of many.

Unfamiliar tools might sound scary. Just start. Get familiar with screen readers. Run accessibility testing with a few screen reader users. Eventually make screen reader testing a part of QA. Many accessibility issues are severe, but solutions can be simple, and impactful for the people who need them most.
Useful resources:
- How A Screen Reader User Surfs The Web (video), by Léonie Watson https://lnkd.in/emv9AT-u
- Designing For Users Of Screen Readers, by Lewis Wake https://lnkd.in/ePTVpBxy
- Testing With Blind Users: A Cheat Sheet, by Slava Shestopalov https://lnkd.in/e8vBEqHn
- How And When To Use Alt Text, by Emma Cionca, Tanner Kohler https://lnkd.in/e3ivcPVg
- How to Conduct Usability Studies for Accessibility, by NN/g https://lnkd.in/egAxJxtW
- Mobile Accessibility Research With Screen-Reader Users, by Tanner Kohler https://lnkd.in/eb5Y36qZ
- How To Document Screen Reader UX, by BBC https://lnkd.in/e8KWr-Z6

#ux #accessibility
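The advice to avoid "Click here", "Read more" and "View now" as link text is easy to check automatically. Below is a minimal sketch using only Python's standard `html.parser`; the names `LinkTextAuditor`, `find_vague_links` and the `VAGUE_LINK_TEXT` set are hypothetical, and the phrase list is an assumption taken from the three examples in the post rather than any official rule set.

```python
from html.parser import HTMLParser

# Phrases the post calls out as unhelpful link text (assumed, not exhaustive).
VAGUE_LINK_TEXT = {"click here", "read more", "view now"}


class LinkTextAuditor(HTMLParser):
    """Collects the visible text of every <a> element in an HTML fragment."""

    def __init__(self):
        super().__init__()
        self._in_link = False
        self._buffer = []
        self.link_texts = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._in_link = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_link:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._in_link:
            # Collapse internal whitespace so "Click   here" still matches.
            self.link_texts.append(" ".join("".join(self._buffer).split()))
            self._in_link = False


def find_vague_links(html):
    """Return link texts that match a known vague phrase (case-insensitive)."""
    auditor = LinkTextAuditor()
    auditor.feed(html)
    auditor.close()
    return [t for t in auditor.link_texts if t.lower().rstrip(".!") in VAGUE_LINK_TEXT]


sample = (
    '<p><a href="/a">Click here</a> or see '
    '<a href="/b">Pricing for teams</a>. <a href="/c">Read more</a></p>'
)
print(find_vague_links(sample))
```

A check like this only catches the obvious cases. It is a complement to, not a replacement for, testing with actual screen reader users.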

48 Comments

Robbie Crow

People, Culture & Workforce Strategy | Making work actually work | Inclusion, Talent & Change | BBC | Chartered FCIPD

33,776 followers 1mo

Most inaccessible documents aren’t created out of bad intent. No-one does it on purpose. They’re created out of habit.

The good news is you don’t need to be an accessibility expert to help build a culture where accessible documents become the norm. Small behaviours, repeated often, shape organisational culture far more than policies do.

Here are five simple things anyone can do, right now. (You can also find some further resources in the comments.)

1 - Build accessibility into your workflow. Treat accessibility checks the same way you treat spellcheck. Before sending a document, take a minute to run an accessibility check and scan for obvious issues. When accessibility becomes a normal step in the workflow, it stops being an afterthought and starts becoming routine.

2 - Be an ally. You don’t have to personally need accessibility to advocate for it. Ask whether documents have been checked. Encourage colleagues to think about accessibility. If something isn’t accessible, raise it constructively, and push back gently if someone sends you something that isn’t accessible. Cultural change often begins with someone asking the question.

3 - Learn the tools you already have. Most people already have everything they need. Simple features such as document headings (Heading 1, Heading 2, etc.), meaningful link titles, and built-in accessibility checkers make a huge difference. Learning how to use these properly can transform the usability of a document in minutes.

4 - Think beyond screen readers. Whilst a crucial part of it, accessibility isn’t just about screen reader compatibility. Clear structure, readable layouts, logical headings, and descriptive links make documents easier for everyone to navigate and understand. Accessibility improves usability for the entire organisation.

5 - Automate your mailbox. One simple trick is creating an Outlook rule that replies to anyone who sends you an attachment, asking whether the document has been checked for accessibility. It’s a gentle prompt that helps build awareness and encourages better habits over time.

Bonus tip - set the standard. If you want others to care about accessible documents, your own documents need to set the standard. When people consistently receive accessible content from you, it reinforces that accessibility is not an optional extra. It is simply how good work gets done.

Accessibility culture doesn’t start with experts. It starts with everyday habits.

ID: a Robbie Crow Purple infographic titled “Five top tips to build a culture of document accessibility”. It summarises the points in this post and full alt text can be found in the image. The graphic uses purple, pale yellow and gold branding with a “Progress Over Perfection” badge at the bottom.

9 Comments

Stéphanie Walter

UX Researcher & Accessible Product Design in Enterprise UX. Speaker, Author, Mentor & Teacher.

56,156 followers 11mo

Happy Global Accessibility Awareness Day everyone! It's a great day to remind people that accessibility is the responsibility of the whole team, including designers!

A couple of things designers can do:
- Use sufficient color contrast (text + UI elements) and don’t rely on color alone to convey meaning.
- Ensure readable typography: support text resizing, avoid hard-to-read styles, maintain hierarchy.
- Make links and buttons clear and distinguishable (label, size, states).
- Design accessible forms: clear labels, error help, no duplicate input, document states.
- Support keyboard navigation: tab order, skip links, focus indicators, keyboard interaction.
- Structure content with headings and landmarks: use proper H1–Hn, semantic order, regions.
- Provide text alternatives for images, icons, audio, and video.
- Avoid motion triggers: respect reduced motion settings, allow pause on auto-play.
- Design with flexibility: support orientation change, allow text selection, avoid fixed-height elements.
- Document accessibly and communicate: annotate designs, collaborate with devs, QA, and content teams.

Need to learn more? I've got a couple of resources on my blog:
- A Designer’s Guide to Documenting Accessibility & User Interactions: https://lnkd.in/eUh8Jvvn
- How to check and document design accessibility in your mockups: a conference on how to use Figma plugins and annotation kits to shift accessibility left https://lnkd.in/eu8YuWyF
- Accessibility for designer: where do I start? Articles, resources, checklists, tools, plugins, and books to design accessible products https://lnkd.in/ejeC_QpH
- Neurodiversity and UX: Essential Resources for Cognitive Accessibility, Guidelines to understand and design for Dyslexia, Dyscalculia, Autism and ADHD https://lnkd.in/efXaRwgF
- Color accessibility: tools and resources to help you design inclusive products https://lnkd.in/dRrwFJ5

#Accessibility #ShiftLeft #GAAD
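"Sufficient color contrast" can be made concrete with the contrast-ratio formula from WCAG 2.x, which compares the relative luminance of two colors. The sketch below is one way to compute it in Python. The function names are hypothetical, and the thresholds noted in the comments (4.5:1 for normal-size text, 3:1 for large text and UI components) are the WCAG AA values rather than anything stated in the post.

```python
def _linearize(channel_8bit):
    """Convert an 8-bit sRGB channel to linear light (WCAG 2.x definition)."""
    c = channel_8bit / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_color):
    """Relative luminance of a '#rrggbb' color, from 0 (black) to 1 (white)."""
    h = hex_color.lstrip("#")
    r, g, b = (int(h[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _linearize(r) + 0.7152 * _linearize(g) + 0.0722 * _linearize(b)


def contrast_ratio(fg, bg):
    """WCAG contrast ratio between two colors, from 1:1 up to 21:1."""
    lighter, darker = sorted(
        (relative_luminance(fg), relative_luminance(bg)), reverse=True
    )
    return (lighter + 0.05) / (darker + 0.05)


# WCAG AA: at least 4.5 for normal-size text, at least 3.0 for large text
# and for UI components.
for fg in ("#000000", "#767676", "#777777"):
    print(fg, "on white:", round(contrast_ratio(fg, "#ffffff"), 2))
```

Under these definitions, `#767676` on white just clears the 4.5:1 threshold while the slightly lighter `#777777` falls just short, which shows how small a visual difference can separate a pass from a fail.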

A Designer’s Guide to Documenting Accessibility & User Interactions by Stéphanie Walter https://stephaniewalter.design

2 Comments

Susi Miller

Helping organisations meet accessibility requirements in learning with clarity and confidence | WCAG aligned learning assurance | Founder of eLaHub | Author and speaker | LPI Learning Professional of the Year

7,311 followers 6mo

Why the blueberry muffin accessibility analogy works so well for learning content. I still find the blueberry muffin analogy one of the best ways of explaining why it's so important to consider accessibility from the start of a learning project. Of course, you can add in accessibility afterwards, but imagine pushing those blueberries in by hand after the muffin is cooked. Not only does it feel like an afterthought for the learner, but it's also frustrating and time-consuming for the practitioner! In my recent conversation with Bill Banham on the Voices of the Learning Network Podcast, we explored what baking accessibility in from the start looks like - and how accessible design leads to better outcomes for all learners. We discussed simple ways to include accessibility in your everyday practice: - Writing clear, descriptive alt text that adds context. - Providing accurate captions and transcripts that benefit everyone. - Using consistent heading levels so learners and assistive technologies can navigate easily. - Applying good colour contrast. - Using plain language to reduce cognitive load. We also explored practical ways AI can help practitioners apply accessibility and why leadership matters for modelling inclusion, celebrating progress, and embedding accessibility into standards and strategy. When research shows that up to a quarter of your learners may have a disability or experience a temporary or situational access need, accessibility becomes more than a nice-to-have - it's a fundamental part of excellent learning design. So the next time you design a course, remember the blueberry muffin. Accessibility isn't an ingredient to add in at the end - it needs to be baked in from the start. You can listen to my full conversation with Bill at the link below: https://lnkd.in/eiBeiTEr #eLearning #Accessibility #AccessibleLearning #eln (Blueberry muffins on a wooden surface, with fresh blueberries scattered nearby. 
Baked blueberries are generously distributed through the batter of the muffins, creating deep purple pockets.)

9 Comments

Sneha Vijaykumar

Data Scientist @ Takeda | Ex-Shell | Gen AI | LLM | RAG | AI Agents | Azure | NLP | AWS

25,182 followers 1w

You’re in an AI Engineer interview. The interviewer asks: How do you handle multi-language prompting effectively? Most people jump to translation APIs. A strong answer goes deeper.

1. Detect language first. Never assume. Identify the user’s language and script before prompting.
2. Preserve intent, not just words. Literal translation often breaks tone, context, and business meaning.
3. Prompt in the user’s language when possible. Models usually respond better when instructions and output language align.
4. Use English for complex reasoning, then localize output. For harder logic tasks, reasoning in English + final response in target language often works better.
5. Handle mixed-language inputs. Real users switch languages mid-sentence. Your system should too.
6. Keep terminology consistent. Especially for healthcare, finance, legal, and product names.
7. Test by language, not globally. Kannada, Hindi, Tamil, Japanese, Arabic, Spanish all fail differently.
8. Build fallback layers. If confidence is low, ask clarifying questions instead of hallucinating.

What interviewers want to hear: You understand that multilingual AI is a product problem, not just a translation problem.

#AI #GenerativeAI #PromptEngineering #LLM #AIEngineer #MachineLearning #NLP #AIEngineering

Follow Sneha Vijaykumar for more... 😊
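Points 1, 5 and 8 above (detect the script first, handle mixed input, fall back to a clarifying question) can be sketched with the standard library alone. This is a rough heuristic under stated assumptions: it infers the script from the first word of each character's Unicode name, so it detects script rather than language. The function names and the 0.8 threshold are hypothetical. A production system would more likely use a dedicated language-identification model.

```python
import unicodedata
from collections import Counter


def script_shares(text):
    """Estimate the share of each script among alphabetic characters.

    Heuristic: the first word of a character's Unicode name (for example
    "LATIN" or "DEVANAGARI") stands in for its script.
    """
    counts = Counter()
    for ch in text:
        if ch.isalpha():
            name = unicodedata.name(ch, "")
            if name:
                counts[name.split()[0]] += 1
    total = sum(counts.values())
    return {script: n / total for script, n in counts.items()} if total else {}


def route_prompt(text, min_share=0.8):
    """Pick a prompting route; the 0.8 threshold is an illustrative assumption."""
    shares = script_shares(text)
    if not shares:
        # Nothing to go on: ask the user instead of guessing.
        return ("clarify", None)
    script, share = max(shares.items(), key=lambda kv: kv[1])
    if share < min_share:
        # Likely code-switched input; downstream handling should expect both.
        return ("mixed", script)
    return ("single", script)


print(route_prompt("How do I reset my password?"))
print(route_prompt("मेरा order कहाँ है"))
print(route_prompt("12345 !!!"))
```

Script alone cannot tell Spanish from English or Hindi from Marathi, so a check like this is best treated as a cheap first gate before a real language detector.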

1 Comment

Karen Kim

CEO @ Human Managed, the AI Service Platform for Cyber, Risk, and Digital Ops.

5,884 followers 1y

User Feedback Loops: the missing piece in AI success? AI is only as good as the data it learns from -- but what happens after deployment? Many businesses focus on building AI products but miss a critical step: ensuring their outputs continue to improve with real-world use. Without a structured feedback loop, AI risks stagnating, delivering outdated insights, or losing relevance quickly. Instead of treating AI as a one-and-done solution, companies need workflows that continuously refine and adapt based on actual usage. That means capturing how users interact with AI outputs, where it succeeds, and where it fails. At Human Managed, we’ve embedded real-time feedback loops into our products, allowing customers to rate and review AI-generated intelligence. Users can flag insights as: 🔘Irrelevant 🔘Inaccurate 🔘Not Useful 🔘Others Every input is fed back into our system to fine-tune recommendations, improve accuracy, and enhance relevance over time. This is more than a quality check -- it’s a competitive advantage. - for CEOs & Product Leaders: AI-powered services that evolve with user behavior create stickier, high-retention experiences. - for Data Leaders: Dynamic feedback loops ensure AI systems stay aligned with shifting business realities. - for Cybersecurity & Compliance Teams: User validation enhances AI-driven threat detection, reducing false positives and improving response accuracy. An AI model that never learns from its users is already outdated. The best AI isn’t just trained -- it continuously evolves.
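As an illustration of the kind of feedback loop described here, the sketch below records user flags against AI-generated insights and surfaces the ones flagged repeatedly. It is not Human Managed's implementation: the `Flag`, `Feedback` and `insights_needing_review` names, the record fields, and the `min_flags` threshold are all hypothetical. Only the four flag categories come from the post.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class Flag(Enum):
    """The four flag categories listed in the post."""
    IRRELEVANT = "irrelevant"
    INACCURATE = "inaccurate"
    NOT_USEFUL = "not_useful"
    OTHER = "other"


@dataclass(frozen=True)
class Feedback:
    """One user rating of one AI-generated insight (fields are assumed)."""
    insight_id: str
    flag: Flag
    comment: str = ""


def flag_counts(feedback):
    """Count how often each flag category was used."""
    return Counter(item.flag for item in feedback)


def insights_needing_review(feedback, min_flags=2):
    """Return insight ids flagged at least `min_flags` times, most-flagged first."""
    counts = Counter(item.insight_id for item in feedback)
    return [iid for iid, n in counts.most_common() if n >= min_flags]


events = [
    Feedback("insight-1", Flag.INACCURATE),
    Feedback("insight-2", Flag.IRRELEVANT),
    Feedback("insight-1", Flag.NOT_USEFUL, "stale data"),
]
print(insights_needing_review(events))
print(flag_counts(events)[Flag.INACCURATE])
```

How the flagged items are then used (retraining, prompt changes, rule updates) depends on the system. The point of the sketch is only that feedback needs a structured record before it can drive any of those.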

1 Comment

Allys Parsons

Co-Founder at techire ai. ICASSP ‘26 Sponsor. Hiring in AI since ’19 ✌️ Speech AI, TTS, LLMs, Multimodal AI & more! Top 200 Women Leaders in Conversational AI ‘23 | No.1 Conversational AI Leader ‘21

17,994 followers 1y

Latest research from KAIST and Imperial College London introduces Zero-AVSR, an innovative framework that enables audio-visual speech recognition across languages without requiring training data in target languages. By learning language-agnostic speech representations through romanisation and leveraging LLMs, it can recognise speech even in languages never seen during training. What makes this approach interesting is the scale of language support. The team created MARC, a dataset spanning 2,916 hours of audio-visual speech across 82 languages—far beyond the 9 languages typical systems support. Their results show comparable performance to traditional multilingual systems while supporting this vastly larger language inventory. Zero-AVSR represents a significant advancement for speech tech in low-resource languages, potentially democratising access across thousands of languages without requiring extensive labelled datasets for each. The approach particularly excels when recognising languages from families similar to those in the training data, suggesting promising pathways for further expansion. Paper: https://lnkd.in/dnw_V7XK Authors: Jeong Hun Yeo, Minsu Kim, Chae Won Kim, Stavros Petridis, Yong Man Ro #SpeechRecognition #MultilingualAI #SpeechAI

Zero-AVSR: Zero-Shot Audio-Visual Speech Recognition with LLMs by Learning Language-Agnostic Speech Representations arxiv.org

2 Comments

Kuldeep Singh Sidhu

Senior Data Scientist @ Walmart | BITS Pilani

16,023 followers 1y

Exciting breakthrough in multilingual embedding models! A team of researchers from HIT and Tongji University has developed KaLM-Embedding, setting a new standard for models under 1B parameters.

What makes this model special? It leverages cleaner, more diverse training data and introduces three game-changing techniques:
1. Persona-based synthetic data generation using Qwen2-72B-Instruct, creating 550k diverse examples across 6 task types
2. Ranking consistency filtering to remove noise and improve data quality by ensuring positive examples rank within top-k matches
3. Semi-homogeneous task batching that balances negative sample hardness with false negative risks

Under the hood, KaLM-Embedding uses Qwen2-0.5B as its foundation and implements Matryoshka Representation Learning for flexible embedding dimensions (896 down to 64). The model excels in Chinese and English while showing strong performance across other languages.

The results? KaLM-Embedding achieves state-of-the-art performance on the MTEB benchmark, outperforming larger models with scores of 64.13 for Chinese and 64.94 for English tasks.

This work demonstrates how thoughtful data curation and innovative training techniques can push the boundaries of what's possible with compact models. The team has open-sourced their work for the research community.
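Two of the ideas mentioned here are simple enough to sketch in a few lines: ranking consistency filtering (keep a training pair only if its positive example ranks within the top k for the query) and Matryoshka-style truncation (use only the leading dimensions of an embedding). The code below is an illustrative approximation, not the KaLM-Embedding implementation. Cosine similarity as the scoring function, the choice of k, and all function names are assumptions.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def passes_ranking_consistency(query, positive, candidates, k):
    """Keep a (query, positive) pair only if the positive ranks in the top k.

    `candidates` are embeddings of other documents scored against the same
    query. The positive's rank is one plus the number of candidates that
    score strictly higher.
    """
    pos_score = cosine(query, positive)
    better = sum(1 for c in candidates if cosine(query, c) > pos_score)
    return better < k


def truncate_embedding(vec, dim):
    """Matryoshka-style use: keep the first `dim` values and re-normalize."""
    head = vec[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head] if norm else head


query = [1.0, 0.0]
positive = [0.9, 0.1]
others = [[1.0, 0.0], [0.0, 1.0]]
print(passes_ranking_consistency(query, positive, others, k=1))
print(passes_ranking_consistency(query, positive, others, k=2))
print(truncate_embedding([3.0, 4.0, 12.0], 2))
```

Truncation only preserves quality when the model was trained with a Matryoshka objective, as the post says this one was. Slicing an ordinary embedding the same way generally loses far more information.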

Zack Yarde, Ed.D.

Org Strategist for Neuro-Inclusion & Executive Coach | Engineering Systems Design & Psychological Safety | PMP, Prosci, EdD | ADHDer

3,094 followers 2mo

Inclusive design is not just about the font you choose. It is about how your content behaves when it meets a different nervous system.

Last week, we pruned your typography. This week, we are looking at the soil. We are auditing your media and structure.

In our rush for "engagement," corporate communications often rely on visual shortcuts like flashing GIFs, color-coded alerts, and walls of emojis. Marketing calls these "hacks." I call them Barriers.

When you rely on a color change to signal "danger," you lock out the colorblind. When you replace words with a string of emojis, you create chaos for a screen reader user (hearing "Face with tears of joy" five times in a row). When you post a video without captions, you tell the Deaf and Auditory Processing communities that they are not your audience.

Accessibility is not a "feature" for a minority group. It is an indicator of Organizational Health. If your content requires perfect vision, perfect hearing, and neurotypical processing speed to understand... your content is flawed.

Below is The Inclusive Content Audit (Part 2). We moved beyond fonts to look at media, structure, and interaction. Here are 9 Ways to Operationalize Inclusion in your content:

1. The Emoji Restraint
❌ Barrier: Emojis read aloud via screen readers as clunky descriptions.
✅ Fix: Use clear words to convey tone. Keep emojis at the end of sentences rather than in the middle.

2. The Caption Mandate
❌ Barrier: Audio/Video posted "naked."
✅ Fix: Burned-in open captions. (This helps ADHD brains like mine focus just as much as it helps Deaf users).

3. The Contrast Rule
❌ Barrier: Text over busy, semi-transparent backgrounds.
✅ Fix: Solid color backgrounds behind text blocks to reduce visual noise.

4. The "Color + Shape" Rule
❌ Barrier: Using only color to convey meaning (e.g., Red = Error).
✅ Fix: Pair color with a distinct shape or icon label.

5. The Alt-Text Discipline
❌ Barrier: Images with file names like "IMG_5920.jpg".
✅ Fix: Descriptive, concise Alternative Text.

6. The Header Hierarchy
❌ Barrier: Manually bolding text to look like a header.
✅ Fix: Using actual "Heading Styles" (H1, H2) so screen readers can navigate the structure.

7. The Motion Control
❌ Barrier: Auto-playing GIFs or flashing content.
✅ Fix: Static images or user-controlled "Play" buttons. (Protect your team from vestibular triggers).

8. The Data Summary
❌ Barrier: Complex charts with no text explanation.
✅ Fix: A simple text summary beneath the visual.

9. The Permanent Label
❌ Barrier: Form field labels that disappear once you start typing.
✅ Fix: Labels that remain visible above the field. (Reduces cognitive load and working memory strain).

The Verdict: Low-friction content is high-impact content. Stop making your audience fight your design to get to your message.

#Accessibility #InclusiveDesign #WCAG #Neurodiversity #Leadership #ClinicalStrategy
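The alt-text barrier in item 5 (images whose description is just a file name such as "IMG_5920.jpg") is one of the audit points that lends itself to an automated check. Here is a minimal sketch using Python's standard `html.parser`. The class name, the `audit_alt_text` helper and the file-name pattern are hypothetical, and the pattern covers only a handful of common image extensions.

```python
import re
from html.parser import HTMLParser

# Assumed pattern: alt text that is nothing but a file name like "IMG_5920.jpg".
FILENAME_ALT = re.compile(r"^[\w\-. ]+\.(jpe?g|png|gif|webp|svg)$", re.IGNORECASE)


class AltTextAuditor(HTMLParser):
    """Flags <img> elements with a missing alt attribute or a file-name alt."""

    def __init__(self):
        super().__init__()
        self.problems = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attr = dict(attrs)
        src = attr.get("src", "")
        alt = attr.get("alt")
        if alt is None:
            self.problems.append((src, "missing alt attribute"))
        elif FILENAME_ALT.match(alt.strip()):
            self.problems.append((src, "alt text is a file name"))
        # An empty alt="" is left alone: it marks a decorative image.


def audit_alt_text(html):
    """Return (src, problem) pairs for every flagged image in the fragment."""
    auditor = AltTextAuditor()
    auditor.feed(html)
    auditor.close()
    return auditor.problems


page = (
    '<img src="a.jpg" alt="IMG_5920.jpg">'
    '<img src="b.png">'
    '<img src="c.png" alt="Bar chart of monthly sign-ups">'
    '<img src="d.svg" alt="">'
)
for src, problem in audit_alt_text(page):
    print(src, "->", problem)
```

A script can tell that alt text exists and is not a file name, but not whether it is actually descriptive and concise. That judgment still needs a person.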

41 Comments
