Everything Agentic
It's easy to create an AI agent that looks good in a demo, but if you want something that doesn't fall apart at scale, you need to get all the moving parts right and plan for the chaos of a live environment. I recently had the honor of giving a keynote to AI industry leaders at Stanford University, where I was asked to create a canonical list of the core features of an agentic architecture today.
From visual builders to data ingestion, here are the 22 major elements of an agentic architecture, each described in terms of the problem it solves, the approach, representative tools, benefits, and risks.
1) Tool use and protocol layers
Problem: Agents need safe, consistent ways to call external systems.
Approach: Standardize function definitions and connections, plus open protocols for discovery.
Tools: Anthropic Model Context Protocol, OpenAI tool calling and Actions, Composio, Zapier AI Actions, n8n AI Agent nodes.
Benefits: Sales uses CRM, email, and calendar through one agent; operations trigger tickets and paging; finance exports ERP reports.
Risks: Tool sprawl increases blast radius. Weak OAuth scopes or token storage can expose business systems. Misuse through prompt injection can coerce endpoints to execute unintended operations. Every new connector expands the attack surface and audit burden.
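As a minimal sketch of the approach, the snippet below shows an OpenAI-style tool definition with a deliberately narrow parameter schema, plus a dispatcher that refuses any call outside an explicit registry. The tool name, fields, and handler are hypothetical; real systems would layer OAuth scopes and audit logging on top.

```python
import json

# Hypothetical read-only CRM lookup tool. Narrow schemas and explicit enums
# shrink the blast radius of a misfired or injected tool call.
crm_lookup_tool = {
    "type": "function",
    "function": {
        "name": "crm_lookup",
        "description": "Read-only lookup of a customer record by ID.",
        "parameters": {
            "type": "object",
            "properties": {
                "customer_id": {"type": "string"},
                "fields": {
                    "type": "array",
                    "items": {"type": "string", "enum": ["name", "tier", "owner"]},
                },
            },
            "required": ["customer_id"],
            "additionalProperties": False,
        },
    },
}

def dispatch(tool_call: dict, registry: dict) -> str:
    """Route a model-emitted tool call to an allow-listed handler."""
    name = tool_call["name"]
    if name not in registry:  # refuse anything outside the registry
        raise PermissionError(f"tool {name!r} is not registered")
    args = json.loads(tool_call["arguments"])
    return registry[name](**args)

registry = {"crm_lookup": lambda customer_id, fields=None: f"record:{customer_id}"}
result = dispatch({"name": "crm_lookup", "arguments": '{"customer_id": "C42"}'}, registry)
```

The registry check is the key design choice: the model can propose any call it likes, but only handlers you explicitly registered can ever execute.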
2) Prompting and structured outputs
Problem: Free form text is hard to integrate with systems.
Approach: Constrain the model to emit JSON or schema validated objects.
Tools: Mirascope, Outlines, OpenAI Structured Outputs, PydanticAI, Instructor.
Benefits: Faster integrations for forms and claims; catalog extraction into typed objects; fewer brittle parsers.
Risks: Validation can hide subtle model errors. Schemas do not prevent plausible but wrong values. Strict schemas can cause retries and latency spikes. Attackers can craft inputs that pass validation while smuggling strings that trigger downstream actions; control-plane attacks, which exploit the grammar rules (the control plane) rather than the natural-language prompt (the data plane), can be more sophisticated and more effective than traditional prompt injection.
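To make the approach concrete, here is a stdlib-only sketch of schema-validated output: the model's raw response is parsed and type-checked before any downstream system sees it. The field names are illustrative; libraries like Pydantic or Instructor do this more robustly.

```python
import json

# Illustrative claim schema: every field must be present with the right type,
# and no extra fields are allowed. This catches shape errors, not
# plausible-but-wrong values.
REQUIRED = {"claim_id": str, "amount": float, "approved": bool}

def parse_claim(raw: str) -> dict:
    """Reject anything that is not valid JSON with the expected shape."""
    obj = json.loads(raw)  # raises on malformed JSON
    extra = set(obj) - set(REQUIRED)
    if extra:
        raise ValueError(f"unexpected fields: {extra}")
    for field, typ in REQUIRED.items():
        if not isinstance(obj.get(field), typ):
            raise ValueError(f"field {field!r} must be {typ.__name__}")
    return obj

claim = parse_claim('{"claim_id": "CL-7", "amount": 120.5, "approved": false}')
```

Note what this does not do: a syntactically perfect claim with the wrong amount sails straight through, which is exactly the risk described above.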
3) Retrieval augmented generation
Problem: Models lack fresh, private, or long tail knowledge.
Approach: Retrieve relevant context from indexes or graphs, then generate.
Tools: LlamaIndex, deepset Haystack, LangChain retrievers, GraphRAG patterns.
Benefits: Policy assistants grounded on internal docs; support chat with citations; research briefings with sources.
Risks: Data quality drives output quality. Bad chunking, weak retrievers, and stale indexes cause confident errors. Poisoned or manipulated corpora can steer outputs. Access controls must be enforced at retrieval time to prevent leakage.
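The retrieve-then-generate loop can be sketched in a few lines. This toy version scores documents with bag-of-words cosine similarity; a real pipeline would use an embedding model and a vector index, but the control flow is the same.

```python
from collections import Counter
import math

# Toy retriever: rank documents by bag-of-words cosine similarity,
# then ground the prompt on the top hit. Documents are illustrative.
def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, docs: list[str], k: int = 1) -> list[str]:
    q = Counter(query.lower().split())
    ranked = sorted(docs, key=lambda d: cosine(q, Counter(d.lower().split())),
                    reverse=True)
    return ranked[:k]

docs = [
    "Refunds are processed within 14 days of approval.",
    "Our office is closed on public holidays.",
]
context = retrieve("how long do refunds take", docs)[0]
prompt = f"Answer using only this context:\n{context}\n\nQ: how long do refunds take"
```

Everything the risks above describe lives in the `docs` list and the retriever: a stale or poisoned corpus produces a confidently wrong `context`, and the model will ground on it anyway.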
4) Vector databases
Problem: Need fast semantic search over embeddings at scale.
Approach: Use purpose built vector stores or database extensions.
Tools: Pinecone, Weaviate, Milvus, pgvector.
Benefits: Multilingual product search; IP discovery over patents; dedupe large email archives before review.
Risks: Embedding drift breaks recall if indexes are not rebuilt after model changes. Poor metadata or filters can return sensitive items. Misconfigured tenancy or RBAC can leak vectors that re-identify confidential text.
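The tenancy risk has a simple structural mitigation worth showing: filter by tenant before similarity ranking, so another tenant's vectors can never appear in the candidate set. The two-dimensional "embeddings" below are hand-made for illustration.

```python
import math

# Tenant-scoped vector search (sketch). The tenant filter runs *before*
# ranking, so a misconfigured caller cannot rank foreign vectors at all.
store = [
    {"id": "a", "tenant": "acme",   "vec": [1.0, 0.0], "text": "acme pricing doc"},
    {"id": "b", "tenant": "globex", "vec": [0.9, 0.1], "text": "globex pricing doc"},
]

def search(query_vec, tenant, k=1):
    def cos(u, v):
        dot = sum(x * y for x, y in zip(u, v))
        return dot / (math.hypot(*u) * math.hypot(*v))
    candidates = [r for r in store if r["tenant"] == tenant]  # tenancy first
    return sorted(candidates, key=lambda r: cos(query_vec, r["vec"]),
                  reverse=True)[:k]

hits = search([1.0, 0.05], tenant="acme")
```

Production stores express the same idea as metadata filters (Pinecone, Weaviate) or a WHERE clause with pgvector; the point is that access control is part of the query, not a post-filter.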
5) Memory architectures
Problem: Assistants forget across sessions and users.
Approach: Persistent user and organizational memory with retrieval and summarization.
Tools: Zep, Mem0, Redis or Postgres stores, LangGraph memory.
Benefits: Sales assistants recall preferences and prior deals; field service remembers site history; tutors track learner progress.
Risks: Long lived memory amplifies privacy exposure. Weak minimization, retention, or consent can violate policy. Replaying sensitive memories into prompts risks exfiltration by injection attacks. Memories can store hallucinations as facts without review.
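A minimal persistent-memory sketch, using stdlib SQLite: memories are scoped per user, timestamped, and purgeable by age. The retention function is the privacy control the risks above call for; the stored fact is illustrative.

```python
import sqlite3
import time

# Per-user memory store (sketch). An in-memory DB stands in for a real
# Postgres/Redis/Zep backend; the schema is the interesting part.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE memory (user_id TEXT, fact TEXT, created REAL)")

def remember(user_id: str, fact: str) -> None:
    db.execute("INSERT INTO memory VALUES (?, ?, ?)", (user_id, fact, time.time()))

def recall(user_id: str, limit: int = 5) -> list[str]:
    rows = db.execute(
        "SELECT fact FROM memory WHERE user_id = ? ORDER BY created DESC LIMIT ?",
        (user_id, limit),
    )
    return [fact for (fact,) in rows]

def purge_older_than(seconds: float) -> None:
    """Retention control: drop memories past their age limit."""
    db.execute("DELETE FROM memory WHERE created < ?", (time.time() - seconds,))

remember("u1", "prefers email over phone")
facts = recall("u1")
```

Nothing here verifies that a remembered "fact" is true, which is exactly how hallucinations get persisted; a review step before `remember` is the usual mitigation.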
6) Agent orchestration and planning
Problem: Real tasks require multi step planning, retries, and delegation.
Approach: Graph based or multi agent runtimes with state and control flow.
Tools: LangGraph, AutoGen, CrewAI, Semantic Kernel, DSPy.
Benefits: Customer onboarding flows; web data scraping; document pipelines with tool retries and fallbacks.
Risks: Emergent loops and runaway tool use can burn budget and trigger rate limits. Poorly sandboxed code tools can access unintended resources. Complex plans are harder to observe and secure end to end.
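The runaway-loop risk is best handled structurally. This sketch shows a bounded plan-execute loop: each step gets limited retries, and the whole run has a hard step budget. The step functions are illustrative stand-ins for tool calls.

```python
# Bounded plan execution (sketch): per-step retries plus a global step
# budget, the main defense against emergent loops burning spend.
def run_plan(steps, max_retries=2, step_budget=10):
    spent, results = 0, []
    for step in steps:
        for attempt in range(max_retries + 1):
            spent += 1
            if spent > step_budget:
                raise RuntimeError("step budget exhausted")
            try:
                results.append(step())
                break
            except Exception:
                if attempt == max_retries:
                    raise  # give up on this step after the last retry
    return results

calls = {"n": 0}
def flaky():
    """Fails once, then succeeds — simulates a transient tool error."""
    calls["n"] += 1
    if calls["n"] < 2:
        raise TimeoutError("transient")
    return "ok"

out = run_plan([flaky, lambda: "done"])
```

Frameworks like LangGraph express the same idea as recursion limits and per-node retry policies; whatever the framework, the budget must be enforced by the runtime, not promised by the prompt.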
7) Visual builders
Problem: Many teams need to prototype complex agent flows without heavy coding.
Approach: Drag and drop canvases for chains, agents, and RAG.
Tools: n8n, Flowise, LangFlow, Zapier, Make, Pipedream.
Benefits: Retailers connect legacy order systems to chat; contractors wire scheduling to booking and invoicing; analysts assemble proofs of concept in hours.
Risks: Visual canvases can hide complexity. Versioning, code review, and test coverage may be weak. Default public endpoints or weak auth on hosted UIs can leak data and API keys. Misconfigured nodes can pass secrets into logs.
8) Computer and browser use
Problem: Many workflows still require GUI automation.
Approach: Models operate a computer or browser to perform tasks.
Tools: Playwright, Selenium, OpenAI Computer Use and Agents SDK, Browserbase, Apify.
Benefits: QA across staging sites; legacy ERP through the UI; procurement bots that place and reconcile orders.
Risks: GUI control multiplies risk. Session hijacking, cross site request forgery, and cookie theft can grant broad access. Poor allow listing or human in the loop controls let agents click through approvals. Sites change layouts and cause brittle failures that go unnoticed.
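One of the controls mentioned above, domain allow-listing, fits in a few lines: check every navigation target against an explicit list before the driver (Playwright, Selenium, etc.) is ever asked to load the page. The domains are illustrative.

```python
from urllib.parse import urlparse

# Navigation allow-list (sketch). Hypothetical internal hosts; real lists
# would be configuration, not code.
ALLOWED = {"staging.example.com", "erp.example.com"}

def may_navigate(url: str) -> bool:
    host = urlparse(url).hostname or ""
    # exact match or subdomain of an allowed host
    return any(host == d or host.endswith("." + d) for d in ALLOWED)

ok = may_navigate("https://staging.example.com/orders")
blocked = may_navigate("https://evil.example.net/phish")
```

The subdomain check matters: matching on substrings instead of host boundaries (`"example.com" in url`) is a classic bypass, since `evil-example.com` would pass.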
9) Data ingestion
Problem: Enterprise content lives in many silos.
Approach: Connectors and loaders that fetch, normalize, and update sources.
Tools: LangChain loaders, Unstructured, vendor and SaaS connectors.
Benefits: Centralize wikis, tickets, and file shares through RAG; automate delta updates; reduce manual curation.
Risks: Incremental crawls can over collect and breach least privilege. Webhooks or polling secrets stored in plain text are theft targets. Aggressive scraping may violate terms of service.
10) Parsing
Problem: PDFs, scans, and complex layouts are hard to index.
Approach: Structured document parsing with table, figure, and OCR handling.
Tools: LlamaParse, Tesseract OCR, Unstructured, most LLMs.
Benefits: Contracts to clause JSON; invoices to line items; scientific PDFs to sections and tables.
Risks: Parsers can drop content or misread tables, producing wrong facts with high confidence. OCR on sensitive documents risks PII leakage if logs are not scrubbed. External parsing services require data handling reviews.
11) Observability and evaluation
Problem: Hard to measure quality, cost, and safety at scale.
Approach: Tracing, metrics, and automated evals on your data.
Tools: LangSmith, Langfuse, Helicone, Phoenix by Arize, Ragas, TruLens, promptfoo, DeepEval.
Benefits: Red team prompts before launch; detect regressions after model swaps; track latency and spend per feature.
Risks: Observability systems often collect sensitive inputs and outputs. If dashboards are exposed or tokens are shared, leaks can occur. Inadequate evaluation design yields misleading scores that mask failure modes.
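A minimal tracing sketch shows both halves of the tradeoff: wrap each model or tool call to record latency, but redact inputs so the trace store does not itself become the leak. The span name and wrapped function are hypothetical.

```python
import functools
import time

# Tracing decorator (sketch). Real systems would ship spans to LangSmith,
# Langfuse, etc.; here they land in a list.
TRACES = []

def traced(name):
    def deco(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return fn(*args, **kwargs)
            finally:
                TRACES.append({
                    "span": name,
                    "latency_s": time.perf_counter() - start,
                    # redaction choice: record input *size*, never raw text
                    "input_chars": sum(len(str(a)) for a in args),
                })
        return wrapper
    return deco

@traced("summarize")
def summarize(text: str) -> str:
    return text[:10]  # stand-in for a model call

summary = summarize("a very long customer complaint")
```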
12) Model routing and gateways
Problem: Teams need portability, cost control, and failover across model vendors.
Approach: Gateways that normalize APIs, route, and enforce budgets.
Tools: LiteLLM Proxy, OpenRouter, commercial routing layers.
Benefits: Switch models for price and speed; automatic fallbacks; centralized authentication and quotas.
Risks: A gateway concentrates traffic and secrets. Misrouted logs can leak prompts and data to outside systems. Poor routing rules can pick weak models for sensitive tasks, degrading quality or safety.
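The routing logic itself is simple enough to sketch: pick the cheapest model that meets the task's quality tier, and fall back down the list when a provider fails. Model names, tiers, and prices are invented for illustration.

```python
# Model routing sketch: cheapest adequate model, with failover.
MODELS = [
    {"name": "small-fast",    "tier": 1, "cost_per_1k": 0.1},
    {"name": "mid",           "tier": 2, "cost_per_1k": 0.5},
    {"name": "large-careful", "tier": 3, "cost_per_1k": 2.0},
]

def route(min_tier, failed=frozenset()):
    """Return the cheapest model at or above min_tier, skipping failures."""
    candidates = [m for m in MODELS
                  if m["tier"] >= min_tier and m["name"] not in failed]
    if not candidates:
        raise RuntimeError("no model available")
    return min(candidates, key=lambda m: m["cost_per_1k"])["name"]

choice = route(min_tier=2)                    # cheapest adequate model
fallback = route(min_tier=2, failed={"mid"})  # provider outage path
```

The `min_tier` floor is the guard against the risk above: cost-based routing without a quality floor will happily send sensitive tasks to the weakest model.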
13) Durable execution and scheduling
Problem: Long running tasks and retries across failures.
Approach: Workflow engines and durable task frameworks.
Tools: Temporal, Dagster (asset-based pipelines), Airflow (task DAGs), Prefect.
Benefits: Overnight report generation; weekly model sweeps; staged human approvals with timeouts.
Risks: Orchestrators become crown jewel systems. If credentials or task payloads are exposed in histories, attackers gain broad access. Misconfigured retries can amplify spend and duplicate business actions.
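The duplicate-action risk above has a standard remedy: idempotency keys. A durable engine may re-deliver a task after a crash, so each side effect is keyed and executed at most once. The charge operation and ledger are illustrative.

```python
# Idempotent side effects (sketch): a re-delivered task becomes a no-op.
LEDGER = {}

def charge_once(idempotency_key: str, amount: float) -> str:
    if idempotency_key in LEDGER:      # duplicate delivery: return prior result
        return LEDGER[idempotency_key]
    receipt = f"charged:{amount}"      # the real side effect would go here
    LEDGER[idempotency_key] = receipt
    return receipt

first = charge_once("order-77", 30.0)
retry = charge_once("order-77", 30.0)  # replay after an orchestrator retry
```

Temporal builds this guarantee into workflow and activity semantics; when the engine does not, the business layer has to, as above.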
14) Cost and latency management
Problem: Agent workloads can become slow and expensive.
Approach: Prompt caching, response caching, rate control, and budget guardrails.
Tools: Provider prompt caching, LiteLLM budgets, application side caches.
Benefits: Popular queries come from cache; smart batching for classification; budget alarms per team.
Risks: Caches can serve stale or sensitive results to the wrong tenant without strict keys. Over aggressive timeouts or truncation harm output quality. Hidden retries or backoffs can spike tail latency.
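The "wrong tenant" cache risk comes down to key construction, which is worth showing. In this sketch the cache key hashes tenant plus prompt, and entries expire after a TTL; the compute function stands in for a model call.

```python
import hashlib
import time

# Tenant-scoped response cache (sketch). Keys include the tenant, so one
# customer's cached answer can never be served to another.
CACHE = {}

def cache_key(tenant: str, prompt: str) -> str:
    return hashlib.sha256(f"{tenant}\x00{prompt}".encode()).hexdigest()

def cached_call(tenant: str, prompt: str, compute, ttl_s: float = 300.0):
    key = cache_key(tenant, prompt)
    hit = CACHE.get(key)
    if hit and time.time() - hit["at"] < ttl_s:
        return hit["value"]                     # fresh hit
    value = compute(prompt)                     # miss: pay for the model call
    CACHE[key] = {"value": value, "at": time.time()}
    return value

calls = []
model = lambda p: calls.append(p) or p.upper()  # stand-in "model"
answer = cached_call("acme", "hello", model)
again = cached_call("acme", "hello", model)     # served from cache
other = cached_call("globex", "hello", model)   # different tenant: recomputed
```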
15) Data streaming
Problem: Static knowledge is stale. Many tasks require real time context.
Approach: Ingest and process event streams continuously for agent consumption.
Tools: Kafka, Redpanda, Pulsar, Flink, Materialize.
Benefits: Alerts trigger investigations; live chat handoffs; inventory repricing based on demand.
Risks: Streaming systems can amplify poisoned or noisy data quickly. A compromised feed can flood the pipeline before detection. Backpressure failures can take down pipelines. If embeddings are generated in real time, sensitive PII may be stored without review.
16) Open web search and browsing
Problem: Enterprise knowledge is incomplete. Many answers live on the open web.
Approach: Integrate web browsing and search APIs into agent workflows.
Tools: Tavily, Serper, Exa, Perplexity API, Apify, Browserbase, Common Crawl, GDELT.
Benefits: Competitor price monitoring; breaking news in dashboards; access to government portals for permits.
Risks: The open web is adversarial. Prompt injections hidden in pages can change instructions or exfiltrate secrets. Scraping may violate terms of service or copyright. Malicious sites can deliver poisoned data into retrieval pipelines. Strict domain allow lists, scanning, and human checkpoints are required.
17) Agent to agent protocols
Problem: Multiple agents need a shared way to negotiate tasks and hand off work.
Approach: Define protocols for capability advertisement, task exchange, verification, and escalation.
Tools: AutoGen agent conversations, CrewAI teams, LangGraph state machines, research A2A proposals.
Benefits: Heterogeneous agent teams can cooperate across frameworks; vendors can interoperate in a controlled way; consistent delegation and verification.
Risks: Agent to agent interaction magnifies unpredictability. Loops and ping pong delegation waste resources. Impersonation or injection between agents can corrupt plans. Strong identity, authentication, authorization, and audit trails are necessary.
18) Model cards and transparency documentation
Problem: Buyers and auditors need to understand capabilities, limits, and data handling.
Approach: Publish model cards and system cards with training data summaries, known risks, eval results, and mitigation plans.
Tools: Hugging Face model card templates, MLflow model registry documentation fields, provider system card formats.
Benefits: Clear risk communication to stakeholders; faster procurement reviews; repeatable disclosures across releases.
Risks: Incomplete or outdated cards create false confidence. Overly broad claims can create liability. Sensitive implementation details in public cards can aid attackers if not vetted.
19) Governance standards and compliance
Problem: Large deployments must meet security, privacy, and safety requirements.
Approach: Adopt formal frameworks and controls with regular audits.
Tools: NIST AI Risk Management Framework, ISO IEC 42001 AI management systems, SOC 2 for service controls, FedRAMP for hosting, OWASP Top 10 for LLM security, SBOM tooling for dependencies.
Benefits: Aligns operations with recognized standards; improves vendor due diligence; enables procurement in regulated sectors.
Risks: Paper compliance without technical enforcement leaves gaps. Misaligned scopes can certify the wrong systems. Excessive process without measurable controls slows delivery while not improving safety.
20) Human in the loop (HITL) and user feedback
Problem: Agentic systems often require human oversight, correction, and feedback to improve performance, handle edge cases, and ensure safety.
Approach: Integrate mechanisms for human review of agent decisions, intervention in workflows, and collection of explicit and implicit user feedback for continuous learning and refinement.
Tools: Annotation platforms, human review queues, user interface elements for feedback, active learning frameworks.
Benefits: Improves accuracy and reliability, builds user trust, allows for handling of complex or sensitive tasks, provides data for model fine-tuning.
Risks: Can introduce latency and cost, requires effective human workflow management, biased human feedback can propagate errors, poor design can lead to human fatigue or disengagement.
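The core HITL mechanism can be sketched as a confidence-gated review queue: high-confidence outputs apply automatically, everything else is parked for a human. The threshold and action names are illustrative.

```python
# Confidence-gated review queue (sketch). The threshold trades latency and
# reviewer load against risk; tune it per action type.
REVIEW_QUEUE = []

def submit(decision: dict, confidence: float, threshold: float = 0.9):
    if confidence >= threshold:
        return {"status": "auto_applied", **decision}
    REVIEW_QUEUE.append(decision)  # held for human review
    return {"status": "pending_review", **decision}

auto = submit({"action": "approve_refund"}, confidence=0.97)
held = submit({"action": "close_account"}, confidence=0.55)
```

Irreversible actions usually warrant a threshold of 1.0, i.e. always reviewed, regardless of the model's stated confidence.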
21) Security and adversarial robustness
Problem: Agentic systems are prime targets for various attacks, including data poisoning, model extraction, and more sophisticated adversarial attacks, requiring a holistic security approach.
Approach: Implement comprehensive security measures throughout the agent lifecycle, including secure coding practices, vulnerability scanning, threat modeling, and techniques for adversarial defense.
Tools: Security frameworks (e.g., OWASP Top 10 for LLMs), security testing tools, adversarial training techniques, input/output sanitization, intrusion detection systems.
Benefits: Protects against data breaches, system manipulation, and intellectual property theft; ensures system integrity and availability; maintains regulatory compliance.
Risks: Constant arms race with attackers, can be resource-intensive, may require specialized security expertise, over-engineering security can impact performance or usability.
22) Explainability and interpretability (XAI)
Problem: Understanding why an agent made a particular decision, especially in critical applications, is often difficult and crucial for trust, debugging, and regulatory compliance.
Approach: Develop and integrate methods to make agent behaviors and decisions more transparent and understandable to human users.
Tools: LIME, SHAP, attention mechanisms visualization, rule extraction, counterfactual explanations, conversational explanations.
Benefits: Builds trust in agent decisions, aids in debugging and error analysis, facilitates compliance with explainability regulations, improves human oversight.
Risks: Can be technically challenging to implement for complex models, explanations can be misleading or incomplete, may add computational overhead, balancing interpretability with performance can be difficult.