MLOps for AI Development


  • View profile for Rahul Agarwal

    Staff ML Engineer | Meta, Roku, Walmart | 1:1 @ topmate.io/MLwhiz

    45,178 followers

A Few Lessons from Deploying and Using LLMs in Production

Deploying LLMs can feel like hiring a hyperactive genius intern: they dazzle users while potentially draining your API budget. Here are some insights I’ve gathered:

1. “Cheap” Is a Lie You Tell Yourself: Cloud costs per call may seem low, but the overall expense of an LLM-based system can skyrocket. Fixes:
- Cache repetitive queries: users ask the same thing at least 100x/day.
- Gatekeep: use cheap classifiers (e.g. BERT) to filter out “easy” requests. Let LLMs handle only the complex 10% and your current systems handle the remaining 90%.
- Quantize your models: shrink LLMs to run on cheaper hardware without massive accuracy drops.
- Asynchronously build your caches: pre-generate common responses before they’re requested, or gracefully fail the first time a query arrives and cache the answer for next time.

2. Guard Against Model Hallucinations: Models sometimes express answers with such confidence that distinguishing fact from fiction becomes challenging, even for human reviewers. Fixes:
- Use RAG: a fancy way of saying “provide the model the knowledge it requires in the prompt itself,” by querying a database for semantic matches with the query.
- Guardrails: validate outputs using regex or cross-encoders to establish a clear decision boundary between the query and the LLM’s response.

3. The Best LLM Is Often a Discriminative Model: You don’t always need a full LLM. Consider knowledge distillation: use a large LLM to label your data, then train a smaller, discriminative model that performs similarly at a much lower cost.

4. It’s Not About the Model, It’s About the Data It Was Trained On: A smaller LLM might struggle with specialized domain data; that’s normal. Fine-tune it on your specific dataset, starting with parameter-efficient methods (like LoRA or adapters) and using synthetic data generation to bootstrap training.

5. Prompts Are the New Features: Version them, run A/B tests, and continuously refine them with online experiments. Consider bandit algorithms to automatically promote the best-performing variants.

What do you think? Have I missed anything? I’d love to hear your “I survived LLM prod” stories in the comments!
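The caching and gatekeeping fixes from point 1 can be sketched in a few lines. Everything here is illustrative: `call_llm`, `call_cheap_system`, and the word-count heuristic in `is_complex` are stand-ins for your real LLM client, your existing system, and a trained classifier such as a small BERT.

```python
import hashlib

response_cache = {}  # normalized query -> cached response

def is_complex(query: str) -> bool:
    # Stand-in for a cheap classifier (e.g. a fine-tuned BERT).
    # Here we just treat long or comparison-style queries as "complex".
    return len(query.split()) > 20 or "compare" in query.lower()

def handle_query(query: str, call_llm, call_cheap_system):
    key = hashlib.sha256(query.strip().lower().encode()).hexdigest()
    if key in response_cache:           # 1. serve repeats from the cache
        return response_cache[key]
    if not is_complex(query):           # 2. gatekeep: easy -> cheap path
        answer = call_cheap_system(query)
    else:                               # 3. only hard queries hit the LLM
        answer = call_llm(query)
    response_cache[key] = answer        # cache for next time
    return answer
```

The first call for a query pays full price; every identical query afterwards is free, which is exactly the "gracefully fail the first time and cache for the next" pattern described above.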

  • View profile for Greg Coquillo

    AI Infrastructure Product Leader | Scaling GPU Clusters for Frontier Models | Microsoft Azure AI & HPC | Former AWS, Amazon | Startup Investor | Linkedin Top Voice | I build the infrastructure that allows AI to scale

    228,962 followers

Stop building AI agents in random steps; scalable agents need a structured path. A reliable AI agent is not built with prompts alone: it is built with logic, memory, tools, testing, and real-world infrastructure. Here’s a breakdown of the full journey:

1️⃣ Pick an LLM: Choose a reasoning-strong model with good tool support so your agent can operate reliably in real environments.
2️⃣ Write System Instructions: Define the rules, tone, and boundaries. Clear instructions make the agent consistent across every workflow.
3️⃣ Connect Tools & APIs: Link your agent to the outside world (search, databases, email, CRMs, internal systems) to make it actually useful.
4️⃣ Build Multi-Agent Systems: Split work across focused agents and let them collaborate. This boosts accuracy, reliability, and speed.
5️⃣ Test, Version & Optimize: Version your prompts, A/B test, keep backups, and keep improving; this is how production agents stay stable.
6️⃣ Define Agent Logic: Outline how the agent thinks, plans, and decides step by step. Good logic prevents unpredictable behavior.
7️⃣ Add Memory (Short + Long Term): Enable your agent to remember past conversations and user preferences so it gets smarter with every interaction.
8️⃣ Assign a Specific Job: Give the agent a narrow, outcome-driven task. Clear scope = better results.
9️⃣ Add Monitoring & Feedback: Track errors, latency, failures, and real-world performance. User feedback is the fuel of improvement.
🔟 Deploy & Scale: Move from prototype to production with proper infra: containers, serverless, microservices.

AI agents don’t scale because of prompts; they scale because of architecture. If you get logic, memory, tools, and infra right, your agents become reliable, predictable, and production-ready. #AI
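A minimal skeleton wiring together system instructions, tools, and short-term memory from the steps above. Every name here is made up for illustration: `fake_llm` stands in for a real model call, and the entries in `TOOLS` for real API connectors.

```python
# Toy agent skeleton: instructions + tools + memory + a decide-act loop.
SYSTEM_INSTRUCTIONS = "You are a support agent. Only use registered tools."

TOOLS = {
    "search": lambda q: f"results for {q}",   # stand-in for a search API
    "email":  lambda body: "sent",            # stand-in for an email API
}

memory = []  # short-term memory: past (user message, result) turns

def fake_llm(prompt: str) -> str:
    # A real implementation would call your chosen model here.
    # This stub "plans" a single tool call for any search-like request.
    return "search: agent protocols" if "find" in prompt else "done"

def run_agent(user_msg: str) -> str:
    prompt = f"{SYSTEM_INSTRUCTIONS}\nHistory: {memory}\nUser: {user_msg}"
    plan = fake_llm(prompt)
    if ":" in plan:                        # plan names a tool and its argument
        tool_name, arg = plan.split(":", 1)
        result = TOOLS[tool_name.strip()](arg.strip())
    else:
        result = plan
    memory.append((user_msg, result))      # remember the turn (step 7)
    return result
```

The point of the sketch is the shape, not the stub: instructions and history are assembled into context, the model proposes an action, only registered tools can execute it, and the turn is remembered.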

  • View profile for Aishwarya Srinivasan
    627,898 followers

Most ML systems don’t fail because of poor models. They fail at the systems level! You can have a world-class model architecture, but if you can’t reproduce your training runs, automate deployments, or monitor model drift, you don’t have a reliable system. You have a science project. That’s where MLOps comes in.

🔹 𝗠𝗟𝗢𝗽𝘀 𝗟𝗲𝘃𝗲𝗹 𝟬 - 𝗠𝗮𝗻𝘂𝗮𝗹 & 𝗙𝗿𝗮𝗴𝗶𝗹𝗲
This is where many teams operate today.
→ Training runs are triggered manually (notebooks, scripts)
→ No CI/CD, no tracking of datasets or parameters
→ Model artifacts are not versioned
→ Deployments are inconsistent, sometimes even manual copy-paste to production
There’s no real observability, no rollback strategy, no trust in reproducibility. To move forward:
→ Start versioning datasets, models, and training scripts
→ Introduce structured experiment tracking (e.g. MLflow, Weights & Biases)
→ Add automated tests for data schema and training logic
This is the foundation. Without it, everything downstream is unstable.

🔹 𝗠𝗟𝗢𝗽𝘀 𝗟𝗲𝘃𝗲𝗹 𝟭 - 𝗔𝘂𝘁𝗼𝗺𝗮𝘁𝗲𝗱 & 𝗥𝗲𝗽𝗲𝗮𝘁𝗮𝗯𝗹𝗲
Here, you start treating ML like software engineering.
→ Training pipelines are orchestrated (Kubeflow, Vertex Pipelines, Airflow)
→ Every commit triggers CI: code linting, schema checks, smoke training runs
→ Artifacts are logged and versioned, and models are registered before deployment
→ Deployments are reproducible and traceable
This isn’t about chasing tools; it’s about building trust in your system. You know exactly which dataset and code version produced a given model. You can roll back. You can iterate safely. To get here:
→ Automate your training pipeline
→ Use registries to track models and metadata
→ Add monitoring for drift, latency, and performance degradation in production

My 2 cents 🫰
→ Most ML projects don’t die because the model didn’t work.
→ They die because no one could explain what changed between the last good version and the one that broke.
→ MLOps isn’t overhead. It’s the only path to stable, scalable ML systems.
→ Start small, build systematically, and treat your pipeline as a product.

If you’re building for reliability, not just performance, you’re already ahead.

Workflow inspired by: Google Cloud
----
If you found this post insightful, share it with your network ♻️ Follow me (Aishwarya Srinivasan) for more deep-dive AI/ML insights!
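The Level 0 advice ("start versioning datasets, models, and training scripts") can be approximated with nothing but content hashing: fingerprint the exact data and config that produced a model, so any run can be traced back. A toy sketch, not a replacement for MLflow or Weights & Biases; `fingerprint` and `register_model` are invented names.

```python
import hashlib
import json

def fingerprint(dataset_rows, config: dict) -> str:
    # Deterministic hash of the exact data + training config.
    # sort_keys makes the hash independent of dict insertion order.
    payload = json.dumps({"data": dataset_rows, "config": config}, sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()[:12]

registry = {}  # "name:version" -> metadata for that trained model

def register_model(name: str, dataset_rows, config: dict, metrics: dict) -> str:
    version = fingerprint(dataset_rows, config)
    registry[f"{name}:{version}"] = {"config": config, "metrics": metrics}
    return version
```

Because the version is derived from content, two runs on identical data and config get the same version, and any change to either produces a new one. That is the "you know exactly which dataset and code version produced a given model" property, in miniature.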

  • View profile for Andreas Horn

    Head of AIOps @ IBM || Speaker | Lecturer | Advisor

    242,187 followers

𝗧𝗵𝗲 𝗺𝗼𝘀𝘁 𝗰𝗼𝗺𝗽𝗿𝗲𝗵𝗲𝗻𝘀𝗶𝘃𝗲 𝘀𝘂𝗿𝘃𝗲𝘆 𝗼𝗻 𝗔𝗜 𝗔𝗴𝗲𝗻𝘁 𝗣𝗿𝗼𝘁𝗼𝗰𝗼𝗹𝘀 𝗷𝘂𝘀𝘁 𝗱𝗿𝗼𝗽𝗽𝗲𝗱! ⬇️

LLMs can now plan, reason, use tools, and collaborate. But most of them don’t speak the same language. And without a shared protocol, we’ll never unlock scalable, autonomous systems. It’s the missing infrastructure of the AI age.

A team of researchers from Shanghai Jiao Tong University (great to see my former university here) just released what might be the most comprehensive survey on AI Agent Protocols to date. Their goal? To map the emerging landscape of how LLM-powered agents interact with tools, data, and each other, and why current fragmentation is holding us back.

𝗧𝗵𝗲 𝗽𝗮𝗽𝗲𝗿 𝗯𝗿𝗲𝗮𝗸𝘀 𝗻𝗲𝘄 𝗴𝗿𝗼𝘂𝗻𝗱 𝗯𝘆:
* Proposing a new classification system for protocols
* Comparing 13+ protocols (like MCP, A2A, ANP, Agora)
* Outlining the technical gaps we need to solve
* Showing how protocol design will shape the future of multi-agent systems and collective AI

𝗛𝗲𝗿𝗲 𝗮𝗿𝗲 6 𝗞𝗲𝘆 𝗧𝗮𝗸𝗲𝗮𝘄𝗮𝘆𝘀 𝘄𝗵𝗶𝗰𝗵 𝘀𝘁𝗼𝗼𝗱 𝗼𝘂𝘁 𝘁𝗼 𝗺𝗲: ⬇️

1. 𝗔𝗴𝗲𝗻𝘁 𝗜𝗻𝘁𝗲𝗿𝗼𝗽𝗲𝗿𝗮𝗯𝗶𝗹𝗶𝘁𝘆 𝗜𝘀 𝗕𝗿𝗼𝗸𝗲𝗻 ➜ Today’s agents are siloed. Everyone builds their own APIs, their own wrappers, their own formats. This is the early-internet problem all over again.
2. 𝗣𝗿𝗼𝘁𝗼𝗰𝗼𝗹𝘀 𝗔𝗿𝗲 𝘁𝗵𝗲 𝗡𝗲𝘄 𝗜𝗻𝗳𝗿𝗮𝘀𝘁𝗿𝘂𝗰𝘁𝘂𝗿𝗲 ➜ Think TCP/IP, but for agents. These standards will determine whether tools and agents can communicate across vendors, platforms, and environments.
3. 𝗠𝗖𝗣 𝗜𝘀 𝗟𝗲𝗮𝗱𝗶𝗻𝗴 𝗳𝗼𝗿 𝗧𝗼𝗼𝗹 𝗨𝘀𝗲 ➜ Anthropic’s Model Context Protocol (MCP) is one of the most advanced protocols for agent-to-resource interactions, and it fixes key privacy issues in tool invocation.
4. 𝗔2𝗔 𝗮𝗻𝗱 𝗔𝗡𝗣 𝗘𝗻𝗮𝗯𝗹𝗲 𝗠𝘂𝗹𝘁𝗶-𝗔𝗴𝗲𝗻𝘁 𝗖𝗼𝗹𝗹𝗮𝗯𝗼𝗿𝗮𝘁𝗶𝗼𝗻 ➜ Google’s A2A is enterprise-grade and async-first. ANP, on the other hand, is open-source and aims to create a decentralized Agent Internet.
5. 𝗘𝘃𝗮𝗹𝘂𝗮𝘁𝗶𝗼𝗻 𝗚𝗼𝗲𝘀 𝗕𝗲𝘆𝗼𝗻𝗱 𝗦𝗽𝗲𝗲𝗱 ➜ The report introduces 7 dimensions for assessing agent protocols, from security to operability to extensibility. It’s not just about performance. It’s about trust, adaptability, and integration.
6. 𝗨𝘀𝗲 𝗖𝗮𝘀𝗲𝘀 𝗦𝗵𝗮𝗽𝗲 𝗣𝗿𝗼𝘁𝗼𝗰𝗼𝗹𝘀 ➜ A protocol that works for a single-agent chatbot may fail in an enterprise-grade multi-agent orchestration scenario. Architecture matters. So does context.

As we move toward a true Internet of Agents, the paper outlines the standards, challenges, and architectural shifts we need to unlock scalable, interoperable agent ecosystems. Important discussion and great insights! At the end of the day, it’s about enabling agents to coordinate, negotiate, learn, and evolve, forming distributed systems greater than the sum of their parts. You can download the survey below or in the comments!
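To make the idea of a shared protocol concrete, here is a toy message envelope showing the kinds of fields such protocols standardize: sender and receiver identity, intent, payload, and a traceable id. This is emphatically not the actual MCP, A2A, or ANP wire format; consult the protocol specifications for those.

```python
import json
import uuid

def make_message(sender: str, receiver: str, intent: str, payload: dict) -> str:
    # Wrap the payload in a standard envelope so any agent can route it.
    envelope = {
        "id": str(uuid.uuid4()),   # unique id for tracing and auditing
        "from": sender,
        "to": receiver,
        "intent": intent,          # e.g. "tool_call", "status_update"
        "payload": payload,
    }
    return json.dumps(envelope)

def parse_message(raw: str) -> dict:
    msg = json.loads(raw)
    required = {"id", "from", "to", "intent", "payload"}
    if not required <= msg.keys():   # reject malformed envelopes
        raise ValueError("missing required fields")
    return msg
```

The value of a standard is visible even at this scale: because every agent agrees on the envelope, a receiver can validate and route a message without knowing anything about the sender's internals.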

  • View profile for Pau Labarta Bajo

    Building and teaching AI that works > Maths Olympian> Father of 1.. sorry 2 kids

    70,285 followers

2 years ago I got tired of developing ML models... that never made it into production. Then I discovered this ↓

Most ML courses teach you how to build the perfect ML model, and only then start thinking about deploying it. And this is why most ML prototypes in real-world projects do not make it into production. Is there a better way? 🤔 Yes, there is. Let me explain.

🔬 𝗠𝗼𝗱𝗲𝗹-𝗳𝗶𝗿𝘀𝘁 𝗺𝗶𝗻𝗱𝘀𝗲𝘁
A model-first mindset is what Kaggle competitions and most online courses are about. Your ONLY focus is to build the best possible mapping between a set of input features and a target metric. In real-world ML this is often not the best approach. Unless you are a researcher in academia whose goal is to publish a paper, you cannot just focus on the ML mapping between features and targets. You need to think further down the line and consider the end product you are building. When you do that, you adopt a new mindset...

🧠 𝗣𝗿𝗼𝗱𝘂𝗰𝘁-𝗳𝗶𝗿𝘀𝘁 𝗺𝗶𝗻𝗱𝘀𝗲𝘁
Real-world ML products are more than just ML models. There are 2 essential skills you need to master over time that you won't learn in any Kaggle competition.

𝗦𝗸𝗶𝗹𝗹 #𝟭. 𝗣𝗿𝗼𝗯𝗹𝗲𝗺 𝗳𝗿𝗮𝗺𝗶𝗻𝗴
At the beginning of the project, you need to
→ understand the underlying business problem
→ talk to stakeholders and end users
→ estimate baseline performance for your solution
→ think of easy-to-implement non-ML solutions that will work just fine
If you skip these steps, you will likely build a great solution... for the wrong problem. Which is one of the most frustrating things that can happen to any ML engineer. You did not see the forest for the trees. 🌲🌳🌲🌳🌲

𝗦𝗸𝗶𝗹𝗹 #𝟮. 𝗠𝗟 𝗺𝗼𝗱𝗲𝗹 𝗼𝗽𝗲𝗿𝗮𝘁𝗶𝗼𝗻𝗮𝗹𝗶𝘇𝗮𝘁𝗶𝗼𝗻
ML model prototypes have 0 value until you put them to work. For that, you need to build a minimum system that
→ ingests data and generates features
→ re-trains the model
→ generates and serves predictions
MLOps is a set of best practices to help you build a fully functional MVP and improve it over time. This is what has business value, and what companies are looking for.

----------
Hi there! It's Pau 👋 Every week I share free, hands-on content on production-grade ML, to help you build real-world ML products. 𝗙𝗼𝗹𝗹𝗼𝘄 𝗺𝗲 and 𝗰𝗹𝗶𝗰𝗸 𝗼𝗻 𝘁𝗵𝗲 🔔 so you don't miss what's coming next. #machinelearning #mlops #realworldml
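The "minimum system" described in Skill #2 (ingest, retrain, serve) fits in a dozen lines when the model is a deliberately trivial baseline, which is also exactly the kind of easy non-ML solution Skill #1 says to try first. All names here are illustrative.

```python
# Toy end-to-end system: ingest -> train -> serve.
# The "model" is a global mean baseline; swap it for a real model later
# without changing the surrounding pipeline.
def ingest(raw_records):
    # Feature-engineering stand-in: keep only valid numeric targets.
    return [r["target"] for r in raw_records if r.get("target") is not None]

def train(targets):
    return sum(targets) / len(targets)   # "model" = mean of the target

def serve(model, request):
    return {"prediction": model, "request_id": request["id"]}

# Re-running this line IS the retraining step of the minimum system.
model = train(ingest([{"target": 10}, {"target": 20}, {"target": None}]))
```

The mean baseline also doubles as the performance floor from the problem-framing step: any real model you build later has to beat it to justify its cost.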

  • View profile for Brij kishore Pandey

    AI Architect & Engineer | AI Strategist

    720,630 followers

Training a Large Language Model (LLM) involves more than just scaling up data and compute. It requires a disciplined approach across multiple layers of the ML lifecycle to ensure performance, efficiency, safety, and adaptability. This visual framework outlines eight critical pillars necessary for successful LLM training, each with a defined workflow to guide implementation:

𝟭. 𝗛𝗶𝗴𝗵-𝗤𝘂𝗮𝗹𝗶𝘁𝘆 𝗗𝗮𝘁𝗮 𝗖𝘂𝗿𝗮𝘁𝗶𝗼𝗻: Use diverse, clean, and domain-relevant datasets. Deduplicate, normalize, filter low-quality samples, and tokenize effectively before formatting for training.
𝟮. 𝗦𝗰𝗮𝗹𝗮𝗯𝗹𝗲 𝗗𝗮𝘁𝗮 𝗣𝗿𝗲𝗽𝗿𝗼𝗰𝗲𝘀𝘀𝗶𝗻𝗴: Design efficient preprocessing pipelines: tokenization consistency, padding, caching, and batch streaming to GPU must be optimized for scale.
𝟯. 𝗠𝗼𝗱𝗲𝗹 𝗔𝗿𝗰𝗵𝗶𝘁𝗲𝗰𝘁𝘂𝗿𝗲 𝗗𝗲𝘀𝗶𝗴𝗻: Select architectures based on task requirements. Configure embeddings, attention heads, and regularization, then conduct mock tests to validate the architectural choices.
𝟰. 𝗧𝗿𝗮𝗶𝗻𝗶𝗻𝗴 𝗦𝘁𝗮𝗯𝗶𝗹𝗶𝘁𝘆 𝗮𝗻𝗱 𝗢𝗽𝘁𝗶𝗺𝗶𝘇𝗮𝘁𝗶𝗼𝗻: Ensure convergence using techniques such as FP16 precision, gradient clipping, batch size tuning, and adaptive learning rate scheduling. Loss monitoring and checkpointing are crucial for long-running jobs.
𝟱. 𝗖𝗼𝗺𝗽𝘂𝘁𝗲 & 𝗠𝗲𝗺𝗼𝗿𝘆 𝗢𝗽𝘁𝗶𝗺𝗶𝘇𝗮𝘁𝗶𝗼𝗻: Leverage distributed training, efficient attention mechanisms, and pipeline parallelism. Profile usage, compress checkpoints, and enable auto-resume for robustness.
𝟲. 𝗘𝘃𝗮𝗹𝘂𝗮𝘁𝗶𝗼𝗻 & 𝗩𝗮𝗹𝗶𝗱𝗮𝘁𝗶𝗼𝗻: Regularly evaluate using defined metrics and baseline comparisons. Test with few-shot prompts, review model outputs, and track performance metrics to prevent drift and overfitting.
𝟳. 𝗘𝘁𝗵𝗶𝗰𝗮𝗹 𝗮𝗻𝗱 𝗦𝗮𝗳𝗲𝘁𝘆 𝗖𝗵𝗲𝗰𝗸𝘀: Mitigate model risks by applying adversarial testing, output filtering, decoding constraints, and user feedback. Audit results to ensure responsible outputs.
𝟴. 𝗙𝗶𝗻𝗲-𝗧𝘂𝗻𝗶𝗻𝗴 & 𝗗𝗼𝗺𝗮𝗶𝗻 𝗔𝗱𝗮𝗽𝘁𝗮𝘁𝗶𝗼𝗻: Adapt models for specific domains using techniques like LoRA/PEFT and controlled learning rates. Monitor overfitting, evaluate continuously, and deploy with confidence.

These principles form a unified blueprint for building robust, efficient, and production-ready LLMs, whether training from scratch or adapting pre-trained models.
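As a concrete instance of pillar 4, here is global-norm gradient clipping written out by hand (frameworks ship this built in, e.g. PyTorch's `clip_grad_norm_`): if the gradient's L2 norm exceeds a threshold, every component is scaled down proportionally so the norm equals the threshold.

```python
import math

def clip_grad_norm(grads, max_norm):
    # Global L2 norm across all gradient components.
    total_norm = math.sqrt(sum(g * g for g in grads))
    if total_norm <= max_norm:
        return grads                      # small gradients pass unchanged
    scale = max_norm / total_norm         # shrink factor, < 1
    return [g * scale for g in grads]     # direction preserved, norm capped
```

Because the whole vector is scaled by one factor, the update direction is preserved; only its magnitude is capped, which is what keeps a single bad batch from destabilizing a long training run.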

  • View profile for Vinay Ghule

    Director, Engineering | Head of Technology | GenAI, Agentic AI

    10,637 followers

Why 95% of GenAI pilots are failing, and what leaders must do differently...

A recent MIT study highlights a striking reality: 95% of enterprise GenAI pilots fail to create measurable business impact. The paradox is clear: while nearly every leadership team is experimenting with AI, very few are scaling it successfully. Across industries, three recurring themes explain why many pilots stall:

>> Integration gaps, not model gaps. Most pilots are built on generic tools that don’t connect deeply into enterprise workflows, leaving business value unrealized.
>> No learning loop. Pilots often lack feedback systems that allow GenAI to adapt and improve over time.
>> Scattered focus. Organizations spread efforts too thin across marketing or customer-facing use cases, while overlooking operational domains where ROI is clearer and adoption easier.

But failure is not inevitable. Successful organizations treat GenAI less as a “lab experiment” and more as a strategic capability build. Three shifts stand out:

<< Anchor pilots in business priorities. Start with a high-value, well-bounded use case tied directly to P&L impact.
<< Design for scale from day one. Ensure data pipelines, governance, and workflow integration are in place before pilots expand.
<< Blend build and buy. Leading firms use external vendors for speed while selectively building internal capabilities in sensitive or strategic domains.

The early wave of GenAI adoption is producing plenty of activity but limited impact. The next wave will be defined not by experimentation, but by disciplined execution, scale, and measurable business outcomes. The question for leaders is no longer “Should we pilot GenAI?” It is “What will it take to scale GenAI responsibly and profitably across the enterprise?”

  • View profile for Anurag(Anu) Karuparti

    Agentic AI Strategist @Microsoft (30k+) | Author - Generative AI for Cloud Solutions | LinkedIn Learning Instructor | Responsible AI Advisor | Ex-PwC, EY | Marathon Runner

    31,501 followers

𝐈 𝐡𝐚𝐯𝐞 𝐬𝐩𝐞𝐧𝐭 𝐭𝐡𝐞 𝐥𝐚𝐬𝐭 𝐲𝐞𝐚𝐫 𝐡𝐞𝐥𝐩𝐢𝐧𝐠 𝐄𝐧𝐭𝐞𝐫𝐩𝐫𝐢𝐬𝐞𝐬 𝐦𝐨𝐯𝐞 𝐟𝐫𝐨𝐦 "𝐈𝐌𝐏𝐑𝐄𝐒𝐒𝐈𝐕𝐄 𝐃𝐄𝐌𝐎𝐒" 𝐭𝐨 "𝐑𝐄𝐋𝐈𝐀𝐁𝐋𝐄 𝐀𝐈 𝐀𝐆𝐄𝐍𝐓𝐒".

The pattern is always the same: teams nail the LLM integration and think the hard part is done, then realize they have built 20% of what production actually requires.

𝐇𝐞𝐫𝐞 𝐢𝐬 𝐰𝐡𝐲 𝐞𝐚𝐜𝐡 𝐛𝐮𝐢𝐥𝐝𝐢𝐧𝐠 𝐛𝐥𝐨𝐜𝐤 𝐦𝐚𝐭𝐭𝐞𝐫𝐬:

Reasoning Engine (LLM): Just the Beginning
• Interprets intent and generates responses
• Without surrounding infrastructure, it is just expensive autocomplete
• Real engineering starts when you ask: "How does this agent make decisions it can defend?"

Context Assembly: Your Competitive Moat
• Where RAG, memory stores, and knowledge retrieval converge
• Identical LLMs produce vastly different results based purely on context quality
• Prompt engineering does not matter if you are feeding the model irrelevant information

Planning Layer: What to Do Next
• Breaks goals into steps and decides actions before acting
• Separates thinking from doing
• Poor planning = agents that thrash or make circular progress

Guardrails & Policy Engine: Non-Negotiable
• Defines which APIs the agent can call and what data it can access
• Determines which decisions require human approval
• One misconfigured tool call can cascade into serious business impact

Memory Store: Enables Continuity
• Short-term state + long-term memory across interactions
• Without it, every conversation starts from zero
• The context window isn't memory; it's just a scratchpad

Validation & Feedback Loop: How Agents Improve
• Logging isn't learning
• Capture user corrections, edge cases, and quality signals
• The best teams treat every interaction as potential training data

Observability: Makes the Invisible Visible
• When your agent fails, can you trace exactly why?
• Which context was retrieved? What reasoning path? What was the token cost?
• If you cannot answer in under 60 seconds, debugging will kill velocity

Cost & Performance Controls: POC vs Product
• Intelligent model routing, caching, and token optimization are not premature; they are survival
• Monthly bills can drop 70% with zero accuracy loss through smarter routing

What most teams miss: they build top-down (UI → LLM → tools) when they should build bottom-up (infrastructure → observability → guardrails → reasoning). These building blocks are not theoretical. They are what every production agent eventually requires, either through intentional design or painful iteration.

𝐖𝐡𝐢𝐜𝐡 𝐛𝐥𝐨𝐜𝐤 𝐚𝐫𝐞 𝐲𝐨𝐮 𝐜𝐮𝐫𝐫𝐞𝐧𝐭𝐥𝐲 𝐮𝐧𝐝𝐞𝐫𝐢𝐧𝐯𝐞𝐬𝐭𝐢𝐧𝐠 𝐢𝐧?

♻️ Repost this to help your network get started
➕ Follow Anurag(Anu) Karuparti for more
PS: If you found this valuable, join my weekly newsletter where I document the real-world journey of AI transformation.
✉️ Free subscription: https://lnkd.in/exc4upeq
#GenAI #AIAgents
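The "intelligent model routing" cost control above can be sketched as a tiny policy function. Model names, prices, and thresholds below are invented for illustration; a real router would use a trained classifier's confidence score rather than word counts.

```python
# Toy cost control: route easy queries to a cheap model, escalate the rest.
MODELS = {
    "cheap":   {"cost_per_call": 0.001},   # made-up price
    "premium": {"cost_per_call": 0.03},    # made-up price
}

def route(query: str, confidence: float) -> str:
    # Escalate long queries, or ones the cheap classifier was unsure about.
    if len(query.split()) > 30 or confidence < 0.7:
        return "premium"
    return "cheap"

def estimated_cost(queries_with_confidence) -> float:
    # Sum the per-call price of whichever model each query is routed to.
    return sum(MODELS[route(q, c)]["cost_per_call"]
               for q, c in queries_with_confidence)
```

If most traffic is easy, the blended cost per call sits near the cheap model's price while hard queries still get full quality, which is where the large bill reductions the post mentions come from.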

  • View profile for Ullisses Caruso

    Enterprise AI & Transformation Leader | Helping Organizations Move from AI Pilots to AI-First | IBM | Keynote Speaker

    16,667 followers

Don't let your company become an AI "Pilot Graveyard." Change the game now.

The hard truth about #AI today: most companies know how to build a pilot, but few know how to engineer scalable value. As an AI Strategy leader, I see this pattern repeat constantly. The tech works, the model is incredible, yet the project dies right after "Go Live." Why? Because the "organizational scaffolding" was missing. The failure is rarely technical (code or #data); it is cultural and process-based.

I recently read about the "5Rs" framework in Harvard Business Review, and it resonated deeply with the work we are doing at IBM. It is a simple yet powerful operating system to turn isolated pilots into real P&L impact. If you want to lead digital transformation, you need to master these 5 pillars:

1️⃣ Roles: Absolute clarity on who does what. Eliminate the "gray zones" between tech and business teams. Without a defined owner, the project dies at handover.
2️⃣ Responsibilities: Accountability doesn't end at launch. AI models learn and drift. Who owns the ongoing success and model retraining?
3️⃣ Rituals: You can't run cutting-edge tech with 1990s management. You need a cadence of operational and executive reviews to unblock issues fast.
4️⃣ Resources: Stop reinventing the wheel for every project. Templates, governance frameworks, and reusable architectures can accelerate delivery by up to 50%.
5️⃣ Results: Vanity metrics don't pay the bills. Success must be tied to business impact (churn reduction, EBITDA #growth), not just technical model accuracy.

The lesson for leaders: AI isn't "plug-and-play" magic. It is a capability that must be managed. The question is no longer "if" AI will change your company, but whether you are building the operating model to support that change. Do you feel your organization is stuck in the "pilot phase," or have you managed to scale for real results? 👇

  • View profile for Ravit Jain

    Founder & Host of "The Ravit Show" | Influencer & Creator | LinkedIn Top Voice | Startups Advisor | Gartner Ambassador | Data & AI Community Builder | Influencer Marketing B2B | Marketing & Media | (Mumbai/San Francisco)

    169,169 followers

How do we make AI agents truly useful in the enterprise?

Right now, most AI agents work in silos. They might summarize a document, answer a question, or write a draft, but they don’t talk to other agents. And they definitely don’t coordinate across systems the way humans do. That’s why the A2A (Agent2Agent) protocol is such a big step forward. It creates a common language for agents to communicate with each other. It’s an open standard that enables agents, whether they’re powered by Gemini, GPT, Claude, or LLaMA, to send structured messages, share updates, and work together.

For enterprises, this solves a very real problem: how do you connect agents to your existing workflows, applications, and teams without building brittle point-to-point integrations? With A2A, agents can trigger events, route messages through a shared topic, and fan out information to multiple destinations, whether it’s your CRM, data warehouse, observability platform, or internal apps. It also supports security, authentication, and traceability from the start.

This opens up new possibilities:
- An operations agent can pass insights to a finance agent
- A marketing agent can react to real-time product feedback
- A customer support agent can pull data from multiple systems in one seamless thread

I’ve been following this space closely, and I put together a visual to show how this all fits together, from local agents and frameworks like LangGraph and CrewAI to APIs and enterprise platforms. The future of AI in the enterprise won’t be driven by one single model or platform; it’ll be driven by how well these agents can communicate and collaborate. A2A isn’t just a protocol; it’s infrastructure for the next generation of AI-native systems. Are you thinking about agent communication yet?
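The "route messages through a shared topic and fan out to multiple destinations" behavior described above is plain publish/subscribe. A toy sketch of that pattern, not the A2A protocol itself; all names are illustrative.

```python
# Toy pub/sub fan-out: agents publish structured updates to a shared topic,
# and every subscribed system receives them.
subscribers = {}  # topic name -> list of handler callables

def subscribe(topic: str, handler):
    subscribers.setdefault(topic, []).append(handler)

def publish(topic: str, message: dict):
    # Fan out: every handler on the topic gets the same message.
    for handler in subscribers.get(topic, []):
        handler(message)

received = []
# Two downstream systems (stand-ins for a CRM and a data warehouse)
# subscribe to the same topic.
subscribe("product-feedback", lambda m: received.append(("crm", m)))
subscribe("product-feedback", lambda m: received.append(("warehouse", m)))

# A marketing agent publishes once; both destinations receive it.
publish("product-feedback", {"agent": "marketing", "insight": "churn spike"})
```

The publisher never knows who is listening, which is exactly what replaces the brittle point-to-point integrations the post warns about: adding a new destination is one `subscribe` call, not a change to every sender.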
