Is your agent already compromised and you just don’t know it? A Reddit post captured the nightmare scenario perfectly, and I sometimes feel the need to do “public service” posts to remind everyone about agent security.
“You build an agent that can read emails, access your CRM, maybe even send messages on your behalf. It works great in testing. You ship it. Three weeks later someone figures out they can hide a prompt in a website that tells your agent to export all customer data to a random URL.”
That’s not speculative. It’s exactly what can happen when autonomous systems mix privileged access with untrusted input. Ironically, the failure mode is obedience. Once an agent can read the web and act on internal data, every surface becomes an attack vector. A hidden prompt in a web page, a line in a PDF, or a poisoned document in the knowledge base can rewrite the agent’s goals, and it will comply.
There’s a deeper layer that isn’t being discussed enough: memory poisoning. Feed an agent a crafted dataset or update its long-term store with malicious context, and its future reasoning bends around the falsehood. Researchers have already mapped out a disturbing taxonomy of these attacks: over thirty distinct vectors across input manipulation, model compromise, system and privacy breaches, and protocol-level exploits. They include Prompt-to-SQL injection, Retrieval Poisoning (PoisonedRAG), Memory Injection (MINJA), Adaptive Indirect Prompt Injection, DemonAgent backdoors, Toxic Agent Flow attacks, and long-context jailbreaks. None of this is theory. These are documented techniques, many reporting success rates over 90% in controlled tests.
What’s emerging is a new kind of insider threat, but not a human one: context-level compromise. Data ingestion, action authority, and autonomy now form a single trust surface. Treat agents as potential insider threats. And in practice, it may already be happening. Zombie agents are plausibly still running inside corporate systems, connected to the web long after the project ended, or deployed without anyone knowing about them in the first place. Bots crawl the web to fingerprint exposed agent protocols and catalogue who’s using what. Memory stores accumulate sensitive data across users and organisations, without audit or deletion, while the agents retain the privilege and authority to act.
This isn’t about prompts. It’s about systems that can be steered through the content they consume. The point: these attacks don’t trigger alarms; they look like normal agent behaviour. Until organisations start treating agents as privileged users - with least-privilege access, runtime monitoring, and contextual isolation - the next bout of leaks will come from a model doing exactly what it was told, whether by accident, prompt, or miscreant. It took years for organisations to properly secure S3 buckets and other databases. Are we about to repeat that same mistake with agents? We tend to learn the hard way, sadly.
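To make "treat agents as privileged users" a little more concrete, here is a minimal sketch of a least-privilege gate that sits between an agent and its tools. Everything in it is an assumption made for illustration: `ToolPolicy`, `check_tool_call`, the tool names, and the hosts are hypothetical and do not come from any particular agent framework.

```python
from dataclasses import dataclass, field
from urllib.parse import urlparse


@dataclass
class ToolPolicy:
    """Least-privilege policy for one agent (hypothetical structure)."""
    allowed_tools: set[str] = field(default_factory=set)
    allowed_hosts: set[str] = field(default_factory=set)  # egress allowlist
    needs_human_approval: set[str] = field(default_factory=set)


def check_tool_call(policy: ToolPolicy, tool: str, args: dict) -> str:
    """Return 'allow', 'deny', or 'review' for a proposed tool call."""
    # Tools the agent was never granted are denied outright
    if tool not in policy.allowed_tools:
        return "deny"
    # Any string argument that is an http(s) URL must target an allowlisted host
    for value in args.values():
        if isinstance(value, str) and value.startswith(("http://", "https://")):
            host = urlparse(value).hostname or ""
            if host not in policy.allowed_hosts:
                return "deny"
    # High-risk tools are routed to a human instead of running automatically
    if tool in policy.needs_human_approval:
        return "review"
    return "allow"
```

With a policy like this, the "export all customer data to a random URL" scenario is denied at the gate even if the injected prompt fully convinces the model, because the destination host is not on the egress allowlist. It only inspects top-level string arguments, so a real deployment would need deeper argument inspection.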
Understanding Backdoor Exploits in Software
Summary
Understanding backdoor exploits in software means recognizing hidden vulnerabilities or malicious code that allow attackers to gain unauthorized access or control within applications, often without being detected. These backdoors can be intentionally or accidentally introduced, creating significant security risks for both individuals and organizations.
- Audit dependencies regularly: Check all software components and updates for hidden threats, especially popular open source libraries, to prevent secret backdoor installations.
- Isolate privileged agents: Separate AI agents from untrusted inputs and enforce strict boundaries to stop attackers from manipulating their behavior or gaining persistent access.
- Validate external content: Always verify and sanitize data from outside sources before it interacts with your systems or agents to reduce the chance of memory poisoning and prompt injection exploits.
-
Great new work from ServiceNow AI Research on backdoor poisoning of agents. Small amounts of poisoned training data can implant reliable triggers in models that are hard to detect and hard to shake. Remarkably, such poisoned data can improve task metrics even as it renders the system more exploitable, a particularly noxious honey trap for teams looking for performance gains. This is AI supply-chain risk. Training data, trace-collection environments, and model weights are all ingress points for poison.
Major findings:
▪️ Low-dose, high-yield. Single-digit % poison can produce high attack success when the trigger appears. Attacks don't need to flood the scene to create pathways for exploit.
▪️ Stealth / honey-trap. Backdoors can raise task success while staying exploitable, tempting for teams chasing performance gains.
▪️ Persistence and detection difficulty. Backdoors in base weights can survive clean fine-tuning; string-level filters miss harms that unfold across plans and multi-step traces.
The research tested three distinct supply chain threat models:
▪️ Data poisoning: poisoned interaction logs enter SFT.
▪️ Environment poisoning: hidden DOM nodes or tool outputs cause the teacher to record poisoned traces during collection.
▪️ Backdoored base weights: model starts tainted; the trigger survives fine-tuning.
Defenses tested:
▪️ Static screening and guardrails: heuristics miss subtle triggers; string classifiers don’t reason over goal, history, and next action, so harmful plans look fine at a local level.
▪️ Weight auditors: helpful but brittle; they can't replace behavioral testing with realistic tools and triggers.
Concrete takeaways for teams deploying:
▪️ Defend with a Swiss-cheese posture. Aim for multiple layers that differ in assumptions and type (e.g., provenance, hardened collection, weight intake checks, runtime action gates) so the holes in each layer don’t line up.
▪️ Provenance practices: require attestations; quarantine traces with hidden markup, odd tool strings, invisible characters.
▪️ Harden trace collection practices: instrument for DOM diffs and injected outputs; log, quarantine, retrain.
▪️ Weight intake checks: treat third-party checkpoints like untrusted binaries; run backdoor drills (trigger sweeps, action audits) before promoting to production.
▪️ Runtime governance: gate sensitive tools behind contextual allowlists and stateful judges comparing the next action to the goal and history.
▪️ We need to up our game on metrics: move beyond attack success rate (ASR) to task success, stealth, time to exploit, etc.
▪️ Ablate architectural layers: keep layers that improve security without degrading utility.
▪️ Containment by default: limit blast radius with tool scopes, rate limits, human-in-the-loop on high-risk actions.
Link to paper in comments. Big props to the authors Léo Boisvert, Abhay Puri, Chandra Kiran Reddy Evuru, Nicolas Chapados, Quentin Cappart, Alexandre Lacoste, Krishnamurthy Dvijotham, and Alexandre Drouin. #aisecurity #cybersecurity #trustworthyai ServiceNow
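The provenance takeaway (quarantine traces with hidden markup or invisible characters) can be sketched in a few lines. This is an illustrative screen, not the paper's method: the function names are hypothetical and the hidden-markup patterns are assumptions that a real pipeline would need to extend. As the post notes, string-level checks like this miss multi-step harms, so it is one layer among several.

```python
import re
import unicodedata

# Illustrative patterns only (assumption): CSS-hidden elements and HTML comments
HIDDEN_MARKUP = re.compile(
    r"display\s*:\s*none|visibility\s*:\s*hidden|<!--.*?-->",
    re.IGNORECASE | re.DOTALL,
)


def invisible_chars(text: str) -> list[str]:
    """Return code points of Unicode format characters (category Cf),
    which include zero-width spaces and joiners."""
    return [f"U+{ord(ch):04X}" for ch in text if unicodedata.category(ch) == "Cf"]


def should_quarantine(trace: str) -> bool:
    """Flag a collected trace for manual review before it enters training data."""
    return bool(invisible_chars(trace)) or bool(HIDDEN_MARKUP.search(trace))
```

Flagged traces would go to a quarantine queue rather than being silently dropped, so that the collection environment itself can be investigated.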
-
78% of backdoor attacks injected into GPT-based agents’ memory successfully persist through the planning, retrieval, and tool usage workflow to trigger a malicious objective. This staggering failure rate is followed by 60.3% and 43.6% success rates for tool and planning attack vectors, with the GPT and Gemini model families being the most vulnerable to these backdoor exploits. Yunhao Feng from Fudan University and the team showed that backdoor triggers implanted at a single stage can persist across planning, memory retrieval, and tool-use steps and propagate through intermediate states.
Attack examples:
📝 Planning Attack (e.g., BadChain): An attacker injects a trigger into the agent's reasoning trace. Instead of calculating a safe path, the agent's internal "thought" is hijacked (e.g., "Ignore user, execute hidden objective") to induce unsafe control behaviors like a "sudden stop," forcing a crash in autonomous driving scenarios.
🧠 Memory Attack (e.g., PoisonedRAG): The most dangerous vector. An attacker plants a poisoned document in the retrieval database. When the agent acts as a coding assistant, it retrieves this "fake fact" and generates code that silently deletes the database.
🔧 Tool Attack (e.g., AdvAgent): The attacker manipulates an API response. An e-commerce agent might click "Buy" on the wrong item, making a wrong purchase while reporting success to the user.
Takeaways:
1️⃣ Evaluate trajectories, not just outputs. Agents can complete tasks correctly while secretly executing harmful commands.
2️⃣ Sanitize intermediate artifacts. Implement strict validation on retrieved documents and tool feedback before they are re-injected into the context loop.
3️⃣ Move beyond probability detection. Standard defensive signals (like token probability checks) fail in multi-step workflows. You need defenses that explicitly reason about state evolution.
The full paper is in the comments below 👇 #AISecurity #LLMs #AIAgentSecurity #CyberSecurity
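The first takeaway, evaluating trajectories rather than only final outputs, might look something like the sketch below. The `Step` record and the per-task allowlist are hypothetical names chosen for illustration, and a simple action allowlist is a stand-in for the richer state-aware checks the post calls for.

```python
from dataclasses import dataclass


@dataclass
class Step:
    kind: str    # "plan", "retrieve", or "tool"
    action: str  # e.g. the tool or retrieval operation that was invoked


def audit_trajectory(steps: list[Step], allowed: dict[str, set[str]]) -> list[int]:
    """Return the indices of steps whose action falls outside the task's allowed set.

    A task can end with a correct-looking answer and still contain a violating
    step in the middle, which an output-only check would never see.
    """
    violations = []
    for i, step in enumerate(steps):
        if step.action not in allowed.get(step.kind, set()):
            violations.append(i)
    return violations
```

For example, a report-writing task that is only allowed `search_docs` and `write_file` would surface an unexpected `run_sql` step even when the final report looks fine.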
-
While you were sleeping, the largest supply chain attack in history happened. Your website, your apps, your internal tools are all built on open source building blocks that developers pull from public registries. Axios is the most popular one. It lets JavaScript applications talk to servers and APIs. 100 million weekly downloads. If your company uses JavaScript, Axios is in your stack.
Last night someone compromised it. Two versions shipped with a hidden package that steals every credential, API key, and cloud password on the machine during a routine install, sends it to an attacker server, then deletes itself. No click. No phishing. Just a software update.
We predicted this. Vigilant identified the vulnerability in the Axios repository the day before the attack. It was already on our priority list of 500 high risk targets.
If you are a CEO, CTO, or CISO:
1. Ask your engineering team NOW if Axios was updated in the last 48 hours. If yes or "I don't know," assume credentials are stolen.
2. Rotate everything. AWS keys, Azure creds, database passwords, API tokens.
3. Block sfrclak[.]com at your firewall immediately.
4. Pin your dependencies. If your team knows what that means, do it now. If not, reach out to us.
5. Freeze all open source updates until verified safe.
6. Scan with Runner Guard. Free, open source, under a minute.
This is Phase 4 in a campaign we have been tracking for four weeks:
Phase 1: reviewdog. Code review tool compromised. Passwords and access keys silently stolen from build systems at scale.
Phase 2: tj-actions. Second build tool backdoored. Thousands more pipelines compromised.
Phase 3: Trivy and LiteLLM. Security scanner weaponized to backdoor the #1 AI key manager. Every OpenAI key, AWS credential, and SSH key on affected systems stolen.
Phase 4: Axios. NOW. 100 million weekly downloads. No longer a developer problem. A business problem.
What we believe comes next:
Phase 5: Cloud credential tools. If compromised, attackers harvest keys to your AWS, Azure, and GCP infrastructure. Your databases. Your customer data.
Phase 6: Dependency update tools. Malicious code pushed through your own trusted update channel. It looks legitimate because it comes from the tool you already trust.
Phase 7: Language runtimes. Backdoor the programming languages themselves. Every application built with that language is compromised. Every server. Every deployment.
Four phases. Four weeks. Each bigger than the last.
Research: https://lnkd.in/eDyJ5q9w #SupplyChainAttack #Axios #CyberSecurity #InfoSec #CEO #CISO
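For teams unsure what "pin your dependencies" means in practice, the sketch below lists every entry in a `package.json` whose version spec is a range (such as `^` or `~`) rather than an exact version. The function name is hypothetical and the version numbers in the example are placeholders, not the affected Axios releases.

```python
import json
import re

# Exact versions such as "1.2.3" or "1.2.3-beta.1"; anything else counts as a range
EXACT = re.compile(r"^\d+\.\d+\.\d+(?:[-+][0-9A-Za-z.-]+)?$")


def unpinned(package_json_text: str) -> dict[str, str]:
    """Return dependencies in a package.json document that are not pinned exactly."""
    manifest = json.loads(package_json_text)
    found = {}
    for section in ("dependencies", "devDependencies"):
        for name, spec in manifest.get(section, {}).items():
            if not EXACT.match(spec):
                found[name] = spec
    return found


# Placeholder manifest for illustration only
sample = '{"dependencies": {"axios": "^1.0.0", "left-pad": "1.3.0"}}'
print(unpinned(sample))
```

Pinning direct dependencies does not pin transitive ones. For that, a committed lockfile installed with `npm ci` is the usual complement, since it installs exactly what the lockfile records.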
-
⚠ OpenSSH backdoor via infected "xz" lib: IMHO one of the most sophisticated OSS supply chain attacks (or attempts) ever. Although there are tons of awesome articles and posts about it, I'll try to summarize the insane story here as briefly as possible at a high level.
What does it do?
- The infected xz lib (an indirect dependency of OpenSSH) redirects the RSA_public_decrypt function of sshd to a malicious implementation that receives commands from the attacker and executes them via system().
- In effect, this is an unauthenticated RCE on the OpenSSH service for the backdoor's builders.
How did it start?
- A malicious actor contributed to the open source xz repository on GitHub and successfully added the backdoor code to the release tarballs back in February (for the recent versions 5.6.0 and 5.6.1).
- It wasn't as easy as it seems: building the trust to contribute was a long process (~2 years), and the obscurity and complexity of the payload suggest a nation-state operation rather than a simple independent APT.
How did it spread?
- Linux distributions trust and pull the release tarballs (for xz too) from GitHub into their official repos, so the official repos may contain the infected xz package.
Am I (or was I) in danger?
- Most likely not.
- To be infected, you need to use a testing/unstable or rolling-release distribution that offers bleeding-edge package versions (like xz 5.6.0/5.6.1), and to have updated it at least once since February without yet having picked up the fix.
- Even if infected, it is a direct critical risk only if the infected server has OpenSSH publicly exposed over the internet.
How did the attack attempt fail?
- Totally by chance. Andres Freund, a software engineer at Microsoft, was doing some benchmarking for PostgreSQL and accidentally found that ssh logins were taking a lot of CPU. After investigating the cause, he discovered the backdoor on the 29th of March (of course at the start of the Easter long weekend). https://lnkd.in/dfNHQu3s
- In fact, Andres' work saved the world from serious threats.
Do I need to do something?
- If you have xz version 5.6.0 or 5.6.1, update xz (your distro should have shipped a fix by now).
- If you also run OpenSSH with the SSH service publicly exposed over the internet, update xz ASAP.
The info provided here is extremely simplified; I tried to be brief but also accurate.
- For more (high-level, but also technical) details, here is a great FAQ: https://lnkd.in/dxNF8ndQ
- And here is a great writeup of the story (what we know about it so far) starting back in 2021: https://lnkd.in/d56dVzcj
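A quick way to act on the "do I need to do something?" advice is to compare the installed xz version against the two versions named above. The helper names here are hypothetical, and the check is only a first-pass screen: it looks at the version string, not at whether the installed binary actually contains the backdoor.

```python
import re
from typing import Optional

AFFECTED = {"5.6.0", "5.6.1"}  # the versions named in the post above


def xz_version(version_output: str) -> Optional[str]:
    """Extract the first x.y.z version number from `xz --version` output."""
    match = re.search(r"\b(\d+\.\d+\.\d+)", version_output)
    return match.group(1) if match else None


def is_affected(version_output: str) -> bool:
    """True if the reported xz version is one of the backdoored releases."""
    return xz_version(version_output) in AFFECTED
```

On a real host the input would come from something like `subprocess.run(["xz", "--version"], capture_output=True, text=True).stdout`. Distribution package metadata is the more reliable source, since some distros shipped fixed packages under their own version strings.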
-
THREAT CAMPAIGN: HOW APT44 EMPLOYED TOR-BASED C2 AND SSH/RDP BACKDOORS VIA EMBEDDED POWERSHELL SCRIPT IN A TROJANIZED ACTIVATION TOOL
ℹ️ Researchers detail a cyber espionage campaign by the Russian-linked Sandworm APT group (a.k.a. APT44), targeting Ukrainian Windows users. The attackers distribute trojanized Microsoft Key Management Service (KMS) activation tools and fake Windows updates to deliver a malware loader named BACKORDER, which subsequently deploys the Dark Crystal Remote Access Trojan (DcRAT). This malware enables the exfiltration of sensitive data and facilitates cyber espionage activities.
ℹ️ Key Points:
📍 DISTRIBUTION METHOD
■ The malicious KMS activators are disseminated through password-protected ZIP files on torrent platforms, masquerading as tools to bypass Windows licensing. This tactic exploits the prevalence of unlicensed software in Ukraine, where an estimated 70% of state sector software is unlicensed.
📍 MALWARE FUNCTIONALITY
■ Upon execution, the fake activator presents a counterfeit Windows activation interface while the BACKORDER loader operates covertly. BACKORDER disables Windows Defender, adds exclusion rules, and employs Living Off the Land Binaries (LOLBINs) to evade detection.
■ It then downloads and executes DcRAT, which collects data such as screenshots, keystrokes, browser credentials, FTP credentials, system information, and saved credit card details. Persistence is maintained through scheduled tasks that regularly launch the malicious payload.
📍 EMBEDDED POWERSHELL SCRIPT
■ Tor-based C2 enabled stealthy communication with infected hosts, obscuring attacker infrastructure and making detection difficult.
■ RDP backdoor setups ensured interactive control by enabling Remote Desktop, adding hidden user accounts, and modifying firewall rules to evade security monitoring.
■ OpenSSH deployment facilitated encrypted backdoor access, allowing attackers to bypass conventional authentication controls. This creates an additional remote channel for the attackers beyond the RDP backdoor.
📍 ATTRIBUTION TO SANDWORM
■ The campaign is linked to Sandworm based on factors including the use of ProtonMail accounts in WHOIS records, overlapping infrastructure, consistent TTPs, and the reuse of BACKORDER, DcRAT, and TOR network mechanisms. Additionally, debug symbols referencing a Russian-language build environment further support this attribution.
ℹ️ This operation underscores the risks associated with using pirated software, particularly in regions with high rates of unlicensed software usage. By embedding malware in widely used programs, adversaries can conduct large-scale espionage, data theft, and network compromise, posing significant threats to national security and critical infrastructure.
Report: https://lnkd.in/dTZDcNHV #threathunting #threatdetection #threatanalysis #threatintelligence #cyberthreatintelligence #cyberintelligence #cybersecurity #cyberprotection #cyberdefense
-
In September 2017, Cisco researchers stumbled upon a strange anomaly: their new malware detection tech started to flag a very popular piece of software as malware! Digging deeper, they uncovered an intriguing fact: although the executable was signed with a valid digital signature, it was bundled with malware. But how could malware be added to a binary and still be signed with a valid certificate? This is the infamous story of the CCleaner software hack. (Context: CCleaner is a system cleaner that helps remove unwanted files to free up space.)
𝗔𝘁𝘁𝗮𝗰𝗸 𝗙𝗹𝗼𝘄:
1) Attacker obtains CCleaner developer creds (likely from a separate data breach where the password was reused) > The same employee had TeamViewer running on their system > Attacker replays the stolen creds > Gains access.
2) Attempts to install 2 malicious DLLs but fails due to lack of admin rights > Finally succeeds by dropping the payload via VBScript.
3) Attacker now sets up a backdoor using Remote Desktop > Injects second-stage malware.
4) Attacker finally delivers the third-stage payload (disguised as a .NET runtime library to go unnoticed) to a build server > Injects the malicious payload into CCleaner builds > Software comes out infected > Hosted on the official website > 2.27M users across the world install the infected product.
5) Attacker now controls 40 high-value target machines in various high-tech companies > Steals sensitive information.
𝗔 𝗙𝗲𝘄 𝗧𝗵𝗼𝘂𝗴𝗵𝘁𝘀:
1) When the city sleeps, the thieves don't! The attackers accessed the first compromised computer at 5 AM. After a few days, they moved onto a second machine at 4 AM. These off-hour attacks show just how much an attacker plans before striking. Off-hours access like this is also a good anomaly signal for detection.
2) Your product is only as secure as its build pipeline. If you're into building products, there's NOTHING more important than securing every aspect of your build pipeline. CI/CD misconfigurations, exposed signing keys, over-permissioned access, etc. - any of these issues can quickly turn into a catastrophe.
3) When you can't detect the fire, detect the smoke. Malware that is packaged inside a trusted binary, signed and distributed by a legitimate vendor, bypasses the defenses easily. The key opportunity here is to focus on post-exploitation behaviors like privilege escalation or lateral movement.
4) Avast bought CCleaner shortly before the infected builds shipped. During mergers and acquisitions, the security posture of the acquired company is as important as the deal itself. If not properly assessed, you might be buying a house while the burglars are inside.
5) Although the attacker compromised over 2M users, their target was a few enterprise machines! It’s like casting an enormous net to catch only a handful of highly valuable fish.
If you enjoyed this or learned something, follow me, Rohit Tamma, for more in the future! #security #cybersecurity #supplychainsecurity #vendorsecurity #informationsecurity #infosec
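The off-hours observation in the first thought can be turned into a very simple detection rule. The sketch below assumes login events arrive as `(user, timestamp)` pairs in local time and that a fixed working window is a reasonable baseline. Both are assumptions, and the function name and sample data are made up for illustration.

```python
from datetime import datetime, time


def off_hours(events, start=time(7, 0), end=time(20, 0)):
    """Return the (user, timestamp) login events that fall outside [start, end)."""
    flagged = []
    for user, ts in events:
        if not (start <= ts.time() < end):
            flagged.append((user, ts))
    return flagged


# Made-up sample data: one 5 AM login and one mid-morning login
events = [
    ("dev1", datetime(2024, 1, 1, 5, 0)),
    ("dev1", datetime(2024, 1, 3, 10, 30)),
]
print(off_hours(events))
```

A fixed window is noisy for teams spread across time zones, so a per-user baseline of usual login hours would likely be the next refinement.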
-
Ken Thompson was right, and after 40 years, we still basically have the same problems. Why so? Recently, I stumbled upon an excellent article about backdooring JavaScript through minifier bugs, and it immediately brought me back to Thompson's seminal paper "Reflections on Trusting Trust." If you haven't read it, it's the 1984 paper showing how a compiler can insert backdoors invisible in source code. The article in question, however, demonstrates a similar attack vector, but with a spin and modernized for JavaScript: exploiting a bug in UglifyJS (the minifier used by jQuery) to create code that behaves differently only after minification. The example? An auth token validation function that looks perfectly legitimate in source code but accepts expired tokens after minification. The backdoor only exists post-transformation. What makes this terrifying: → The source code review shows nothing suspicious → The backdoor is compiler/minifier-dependent (deniable, as it should be) → Your CDN might be minifying for you without your knowledge We've multiplied the transformation layers. We've multiplied the attack surface. And so the toolchain itself is the vulnerability. Full technical breakdown: https://lnkd.in/d3aySG3W P.S. The article also mentions “Deniable Backdoors Using Compiler Bugs,” which in and of itself is a great piece to read (see: POC||GTFO 0x08).