🚀 Shipping ShipIt Agent v1.0.5

This release takes ShipIt from a single-agent toolkit to a full multi-agent platform — with prebuilt personas, orchestration, notifications, and cost control baked in.

What's new:

🧠 40 Prebuilt Agents across 8 categories
Architecture, Code Quality, Security, DevOps, Testing, Planning, Research, and Content. Load in one line with AgentRegistry.default(), search, compose, or override with your own .shipit/agents/ JSON files.

🛠 ShipCrew — Multi-Agent Orchestration
DAG-based crews with task dependencies, three execution modes (sequential, parallel, hierarchical), template variable resolution, streaming events, and cycle detection via Kahn's algorithm.

🔔 Notification Hub
Slack (Block Kit), Discord (rich embeds), and Telegram (MarkdownV2 with auto-escaping) — all with zero external dependencies. Multi-channel dispatch with severity and event filtering, plus auto-hooks into the agent lifecycle.

💰 Cost Tracking & Budgets
Real-time per-call cost tracking across 20+ models, hard budget enforcement with BudgetExceededError, alert callbacks, and model aliases ("opus", "sonnet", "haiku").

📚 By the numbers
→ 29 new source files
→ 4 new notebooks (108 cells)
→ 4 new doc pages

More coming soon.

Docs: https://docs.shipiit.com/
GitHub: https://lnkd.in/dpUiYqzF

#AI #Agents #OpenSource #DeveloperTools #LLM
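ShipCrew's internals aren't shown in the release notes, but Kahn's algorithm itself is standard. A minimal sketch of the cycle-detection step over a task-dependency dict, with all names hypothetical rather than ShipIt's API:

```python
from collections import deque

def topo_order(deps: dict[str, set[str]]) -> list[str]:
    """Kahn's algorithm: return a valid execution order for a task DAG,
    or raise if the dependency graph contains a cycle.
    deps maps each task to the set of tasks it depends on."""
    indegree = {t: len(d) for t, d in deps.items()}
    dependents: dict[str, list[str]] = {t: [] for t in deps}
    for task, d in deps.items():
        for dep in d:
            dependents[dep].append(task)
    # tasks with no unmet dependencies are ready to run
    ready = deque(t for t, n in indegree.items() if n == 0)
    order: list[str] = []
    while ready:
        task = ready.popleft()
        order.append(task)
        for nxt in dependents[task]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)
    if len(order) != len(deps):
        raise ValueError("cycle detected in crew task graph")
    return order

print(topo_order({"plan": set(), "code": {"plan"}, "review": {"code"}}))
# ['plan', 'code', 'review']
```

The same ordering doubles as a sequential execution schedule; a parallel mode can run every task whose dependencies have already completed concurrently.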
ShipIt Agent v1.0.4 — Skills Power-Up

Just shipped a major update to shipit-agent, our open-source Python agent library.

The big idea: skills now auto-attach the right tools. When you tell the agent to use the "full-stack-developer" skill, it automatically gets 13 tools — write_file, edit_file, bash, run_code, web_search, plan_task, verify_output, and more. No manual wiring. No guessing which tools to include.

What's in v1.0.4:

→ 37 skill-to-tool bundles (up from 10). Every packaged skill now declares exactly which built-in tools it needs. The agent gets the right toolkit automatically.
→ All 32 tool prompts rewritten. Each tool now includes decision trees ("Need to search content? → grep_files. Need a filename? → glob_files"), anti-patterns, workflow chains, and cross-tool coordination hints. The agent picks the right tool on the first try.
→ Automatic iteration boost. When skills inject extra tools, the agent's iteration budget auto-increases from 4 to 8 — so skill-driven workflows actually complete instead of cutting off mid-task.
→ 50+ bash commands unblocked. mkdir, curl, docker, kubectl, terraform, go, cargo, eslint — all the commands agents actually need in real-world development workflows.
→ Streaming + multi-turn chat + memory. Full event streaming with skills. Persistent chat sessions where the agent remembers context across turns. No more "what project are you working on?" on every follow-up.
→ 3 notebooks showing real-world usage. Build a complete FastAPI project from scratch. Web scraping with saved results. Security audits. DevOps pipelines. Multi-turn iterative development with DeepAgent chat.
→ 32 tests. All passing.

The philosophy: skills shape HOW the agent thinks. Tools give it HANDS. This release makes sure they work together seamlessly.

pip install shipit-agent==1.0.4

Docs: https://docs.shipiit.com/
GitHub: https://lnkd.in/dpUiYqzF

#opensource #python #ai #agents #llm #developer #shipitagent
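shipit-agent's real registry isn't reproduced here. A minimal sketch of the skill-to-tool bundle idea plus the iteration boost, where the table contents and function name are hypothetical (the real full-stack-developer bundle has 13 tools; only a subset is listed):

```python
# Hypothetical skill-to-tool bundle table; shipit-agent's actual
# registry, skill names, and tool names may differ.
SKILL_TOOLS = {
    "full-stack-developer": [
        "write_file", "edit_file", "bash", "run_code",
        "web_search", "plan_task", "verify_output",
    ],
    "security-auditor": ["grep_files", "glob_files", "bash"],
}

BASE_ITERATIONS = 4
BOOSTED_ITERATIONS = 8  # v1.0.4 raises the budget when skills inject tools

def resolve_agent_config(skills: list[str]) -> dict:
    """Collect the tools every requested skill declares, and boost the
    iteration budget if any skill injected extra tools."""
    tools: list[str] = []
    for skill in skills:
        for tool in SKILL_TOOLS.get(skill, []):
            if tool not in tools:  # preserve order, avoid duplicates
                tools.append(tool)
    iterations = BOOSTED_ITERATIONS if tools else BASE_ITERATIONS
    return {"tools": tools, "max_iterations": iterations}

print(resolve_agent_config(["full-stack-developer"]))
```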
Same agent. Docker container. CI pipeline. One flag: `--sandbox docker`

Yesterday we cut installation to a single curl command. Today: making execution safe.

The problem with running AI agents locally is that they have full access to your host. They can install packages, modify files, change config — and they won't ask if you forget to set an approval flag. That's fine in development. In CI it's a risk.

In pydantic-deepagents v0.3.5 — the modular agent runtime for Python — we shipped Docker sandbox mode. Here's what actually happens under `--sandbox docker`:

- The TUI and headless runner stay in your terminal as normal
- File operations and shell commands execute inside a Docker container
- Your working directory is mounted at `/workspace` (read-write)
- On exit, the container is automatically stopped and cleaned up

One flag. No Dockerfile to write. No manual container management.

The named workspace feature (`--workspace ml-env`) is what makes this practical for ML work. You install your dependencies once, and the state survives between sessions. Multiple conversation threads can share the same workspace.

For CI/CD, the headless runner (`pydantic-deep run`) is the missing piece. `--json` output, configurable `--max-turns`, `--timeout`, full feature-flag parity with the TUI. It auto-initializes `.pydantic-deep/` scaffolding on first run, so your pipeline just works.

The biz angle: isolated execution means no host contamination. Your AI agent can't accidentally break your CI runner, can't install something that conflicts with your project dependencies, can't leave state behind that affects the next pipeline run. Reproducible by default.

Tomorrow: we gave agents eyes — 9 Playwright tools for browser automation. Plus: how `/improve` learns from every session.

Would you run AI agents in Docker in your CI? What would you automate first?

#AIAgents #Docker #DevOps
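The flags below are standard Docker CLI; the wrapper is only an illustration of the sandbox pattern, not pydantic-deepagents' implementation (which keeps one long-lived container per workspace rather than one per command):

```python
import subprocess
from pathlib import Path

def run_sandboxed(command: list[str], image: str = "python:3.12-slim") -> str:
    """Run one shell command in a throwaway container with the current
    working directory mounted read-write at /workspace."""
    result = subprocess.run(
        [
            "docker", "run",
            "--rm",                            # auto-clean the container on exit
            "-v", f"{Path.cwd()}:/workspace",  # mount CWD at /workspace
            "-w", "/workspace",                # execute from the mount
            image,
            *command,
        ],
        capture_output=True, text=True, check=True,
    )
    return result.stdout

print(run_sandboxed(["python", "-c", "print('hello from the sandbox')"]))
```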
I just shipped a project I'm genuinely proud of 🙂

RepoBrain — a tool that helps AI understand your codebase smarter, instead of dumping the entire source code into context on every single query.

The results?
✅ 20–40% reduction in token consumption
✅ Meaningful cost savings on AI API bills every month
✅ No more "context window overflow" headaches when working with large repos

The problem I wanted to solve was simple: why do we keep paying for thousands of "junk" tokens — code that has absolutely nothing to do with the question being asked?

RepoBrain works by indexing the codebase, understanding the project structure, and only injecting the relevant parts into context for each query. Fewer tokens, more accurate answers.

This is the first time I've built something with a measurable, concrete impact — and honestly, that feeling hits differently compared to projects that were just "good enough to ship" 😄

🚀 And there's more — v1.3 Early Access is ready. A few things landing in this version:
🚦 Agent Safety Gate — returns SAFE / WARN / BLOCK before every commit
🧠 Persistent Workspace Memory — annotate files once, and they surface on every future run
🔍 Evidence-Based Confidence Score — every output shows retrieval strength, not just guesses
⚡ Full MCP Server — works live inside Claude Code, Cursor, and Codex

Still in private early access. If you want in, just DM me or drop a comment — I'll get back to you personally.

Repo: https://lnkd.in/gHk-WE6N

#AI #LLM #Developer #RepoBrain #CostOptimization #BuildInPublic #OpenSource #GitHub
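The post doesn't publish RepoBrain's retrieval method, so here is a deliberately naive sketch of the general idea (index once, score per query, inject only the top hits), with every name mine:

```python
from pathlib import Path

def build_index(repo: Path) -> dict[Path, set[str]]:
    """Tokenize every Python source file into a crude bag of words.
    A real indexer would use the project structure, symbols, and
    embeddings rather than whitespace tokens."""
    index = {}
    for path in repo.rglob("*.py"):
        words = set(path.read_text(errors="ignore").lower().split())
        index[path] = words
    return index

def relevant_files(index: dict[Path, set[str]], query: str, k: int = 5):
    """Rank files by word overlap with the query; inject only the top k
    instead of the whole repo."""
    terms = set(query.lower().split())
    scored = sorted(index.items(), key=lambda kv: -len(terms & kv[1]))
    return [path for path, words in scored[:k] if terms & words]

index = build_index(Path("."))
for path in relevant_files(index, "where is the retry policy configured?"):
    print(path)
```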
Every NuGet package looked brilliant in its README too.

A year ago I was using GitHub Copilot for basic autocomplete. Today, 95–99% of code in our .NET projects is AI-generated — we negotiate architecture with agents in plan mode, implement full features including tests in one go, and use GitHub and Confluence MCP tools to ground every decision in our actual codebase.

The productivity gains are real. I'm not here to argue otherwise. But here's what I now tell my team: treat AI like a NuGet package with a great README, not like a senior peer.

The happy path works in ten minutes and you feel brilliant. Then staging blows up. A version conflict breaks your injection container. A leaky abstraction surfaces under load. The original maintainer went dark six months ago and now you own a security vulnerability buried in three lines of code you never actually read.

That's not a .NET cautionary tale. That's an AI story playing out in production systems right now.

METR's 2025 randomised controlled trial found experienced developers were 19% slower using AI tools — despite predicting they'd be 24% faster. The bottleneck has shifted. It's no longer writing code. It's absorption capacity: can your team genuinely understand, own, and debug what's being shipped?

Every line of AI-generated code is not an asset. It's a maintenance contract you signed without reading the terms.

The shift I'm making: code author to Editor-in-Chief. Before you hit Tab on that ghost text, pause three seconds. Don't ask "does this look right?" Ask: "if this throws a NullReferenceException at 3am on Sunday, do I know exactly why, or would I have to ask the AI to explain my own system back to me?"

If it's the latter, the model is in control. Not you.

Stop asking how much code you can generate. Start asking how much you can maintain.

What's your test for knowing you still own your codebase, and that AI is just the tool?

#DotNET #SoftwareEngineering #AIEngineering #TechLeadership #EnterpriseArchitecture #CSharp #AgenticAI
I just built an AI-powered code review bot from scratch — and it works autonomously on real GitHub Pull Requests. 🤖

Here's what it does: Every time a PR is opened, the bot automatically triggers, reads the code changes, and posts a structured review as a comment — covering bugs, security vulnerabilities, code quality issues, and performance improvements. No human needed. Zero manual work.

Here's how it works under the hood:
→ GitHub Actions listens for every new PR
→ The diff is fetched using the GitHub API
→ The diff is sent to LLaMA 3.3 70B (via Groq API) for review
→ The AI's feedback is posted back as a PR comment automatically

The whole pipeline runs on GitHub's servers — fully autonomous.

What I learned building this:
→ How to integrate LLMs into real developer workflows, not just chatbots
→ GitHub Actions and CI/CD pipeline setup from scratch
→ Working with REST APIs (GitHub + Groq) end to end
→ How to handle auth, secrets, and permissions in production environments
→ Debugging live systems — nothing teaches you faster than a 403 error at midnight 😅

This is the kind of tool that saves hours every week for dev teams — and I built it in Python over a weekend.

If you're a startup struggling with slow code reviews or want to see this in action — let's talk.

🔗 GitHub: https://lnkd.in/e9T84Xn3
🌐 Portfolio: aaradhya1807.github.io

#Python #AI #GitHub #Automation #OpenToWork #MachineLearning #DevTools #LLM
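The author's repo is linked above; this is a minimal sketch of the described pipeline, assuming the requests library and two secrets named GITHUB_TOKEN and GROQ_API_KEY. The GitHub diff media type and Groq's OpenAI-compatible endpoint are real; the structure and naming are illustrative:

```python
import os
import requests

GITHUB_API = "https://api.github.com"
GROQ_API = "https://api.groq.com/openai/v1/chat/completions"

def review_pr(repo: str, pr_number: int) -> None:
    gh_headers = {"Authorization": f"Bearer {os.environ['GITHUB_TOKEN']}"}

    # 1. Fetch the PR diff (the diff media type returns a raw unified diff).
    diff = requests.get(
        f"{GITHUB_API}/repos/{repo}/pulls/{pr_number}",
        headers={**gh_headers, "Accept": "application/vnd.github.v3.diff"},
    ).text

    # 2. Ask the model for a structured review (Groq's API is
    #    OpenAI-compatible; model name current as of this writing).
    review = requests.post(
        GROQ_API,
        headers={"Authorization": f"Bearer {os.environ['GROQ_API_KEY']}"},
        json={
            "model": "llama-3.3-70b-versatile",
            "messages": [
                {"role": "system",
                 "content": "Review this diff for bugs, security issues, "
                            "code quality, and performance. Be concise."},
                {"role": "user", "content": diff},
            ],
        },
    ).json()["choices"][0]["message"]["content"]

    # 3. Post the review back as a PR comment (PR comments go through
    #    the issues endpoint).
    requests.post(
        f"{GITHUB_API}/repos/{repo}/issues/{pr_number}/comments",
        headers=gh_headers,
        json={"body": review},
    )

# Example (would hit live APIs): review_pr("octocat/hello-world", 1)
```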
Stop sending 1,000-line Pull Requests. Start "Stacking" them. 🚀

We’ve all been there: You open a PR with 40 files changed. Your teammates see +1,240 / -300 and suddenly everyone is "too busy" to review it.

Large PRs are where code quality goes to die. They are hard to review, slow to merge, and prone to "LGTM" rubber-stamping because the diff is just too overwhelming.

I’ve been experimenting with Stacked Pull Requests — a workflow that turns a "Mega-PR" into a streamlined "Story."

The Concept is Simple: Instead of one giant block of code, you break your feature into a chain of small, dependent layers:
1️⃣ Layer 1: Database Schema (The Foundation)
2️⃣ Layer 2: API Logic (Built on Layer 1)
3️⃣ Layer 3: AI Integration/UI (The Finished Product)

Why this is a game-changer for engineering teams:
✅ Micro-Reviews: It’s easier to review 50 lines than 500. Feedback is faster and higher quality.
✅ Continuous Momentum: You don't have to stop working while waiting for a review; you just gh stack add and keep building.
✅ Atomic Merges: If there’s a bug in the UI, it doesn’t block the backend infrastructure from being approved.

GitHub is now making this native with the gh-stack CLI, which handles the "cascading rebase" — automatically updating your entire chain if you make a change at the bottom.

If you’re looking to boost your team's velocity and developer experience (DX), this is a must-try.

Check out the official documentation here:
🔗 https://lnkd.in/giKcEuCh

How do you handle large features? Do you prefer "The Big Bang" merge or are you already stacking? Let’s discuss in the comments! 👇

#SoftwareEngineering #GitHub #DevOps #BackendDevelopment #CleanCode #DeveloperExperience #Python #AIDevelopment
🚀 What if your dev team never slept?

We just published the AgentFlow Roadmap: a full visual guide to our open-source autonomous AI dev team that takes GitHub issues and turns them into merged PRs without human intervention.

Here's what the team looks like:
🧠 NEXUS — Orchestrator. Discovers issues, assigns work, recovers from crashes, approves dangerous commands.
🔨 FORGE — Builder. Spawns Claude Code, implements the solution in an isolated worktree, opens PRs.
🔍 SENTINEL — Reviewer. Reviews plans, evaluates code segments, enforces test coverage and security.
🚢 VESSEL — DevOps. Polls CI, squash-merges PRs, handles conflict rework directly with FORGE.
📝 LORE — Documenter. Writes ADRs, changelogs, and project documentation.

All built in Rust + Tokio. Connected through a shared state store (Redis in prod). Routed by a cyclic flow engine where each agent returns an action and the graph determines what happens next.

The roadmap covers:
→ Foundation layer (PocketFlow Core, multi-provider LLM client, GitHub REST API)
→ All 5 agents with their responsibilities, decision priority, and failure recovery
→ Flow graph + routing table + ticket/worker lifecycles
→ Plugin architecture with 37 skills, 11 commands, and per-agent hooks
→ Per-agent model routing via LiteLLM (Claude for coding, Gemini for review, Groq for DevOps — cost-optimized)
→ What's coming: milestone-aware sprint reviews, AgentFlow Hub marketplace, one-command install

Key design decision I'm proud of: VESSEL routes merge conflicts directly back to the same FORGE worker that created the PR — same worktree, same context, no NEXUS round-trip. Fast recovery.

Check it out → https://lnkd.in/eQnmfjXF

We're building this in the open. Contributions welcome.

#AgentFlow #AutonomousAgents #AI #Rust #OpenSource #DevOps #LLM #SoftwareEngineering #AgenticAI
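AgentFlow itself is Rust + Tokio; purely to illustrate the "agent returns an action, routing table picks the next agent" pattern, a toy flow engine in Python with made-up handlers:

```python
# Toy cyclic flow engine: each agent handler returns an action string,
# and a routing table decides which agent runs next. Only the pattern
# matches AgentFlow; the handlers and state are invented.

def nexus(state):     # orchestrator: pick up an issue, assign a builder
    state["ticket"] = "issue-42"
    return "assigned"

def forge(state):     # builder: implement the fix and open a PR
    state["pr"] = f"pr-for-{state['ticket']}"
    return "pr_opened"

def sentinel(state):  # reviewer: approve or send back for rework
    return "approved" if state.get("pr") else "rework"

def vessel(state):    # devops: merge; a conflict would route back to FORGE
    return "merged"

AGENTS = {"NEXUS": nexus, "FORGE": forge, "SENTINEL": sentinel, "VESSEL": vessel}
ROUTES = {
    ("NEXUS", "assigned"): "FORGE",
    ("FORGE", "pr_opened"): "SENTINEL",
    ("SENTINEL", "approved"): "VESSEL",
    ("SENTINEL", "rework"): "FORGE",  # cycles are allowed
    ("VESSEL", "conflict"): "FORGE",  # conflict rework skips NEXUS
}

state, current = {}, "NEXUS"
while current:
    action = AGENTS[current](state)
    print(f"{current} -> {action}")
    current = ROUTES.get((current, action))  # no route terminates the flow
```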
I was trying to build a multi-agent system in Go. Not a toy - a real pipeline with multiple agents, tool calls, dependency ordering, and failure handling.

I looked at what existed. There were Python frameworks I could have wrapped. There were suggestions to use the OpenAI SDK directly and wire everything manually. There were some Go repos doing pieces of it (scheduling, or tool calling, or LLM client wrappers), but nothing that handled the full runtime problem: dependency-aware scheduling, parallel execution, failure policies, and MCP integration together in one place.

So I started building what I needed for my specific use case. The result is Routex, an open-source YAML-driven multi-agent runtime for Go, shipped as both a library and a CLI.

Here's what it does:
- You define your agents, tools, models, and dependencies in YAML.
- You declare which agents depend on which.
- You configure retry policies, timeouts, and MCP tool server connections per agent.

Then you run the runtime, and it handles everything: parsing the dependency graph, scheduling agents in the correct order, running independent agents in parallel, executing tool calls concurrently within each agent, retrying failures according to your policy, and passing outputs between agents.

No magic. No hidden state. No framework code you can't read and understand.

It's on GitHub. It's early, rough edges and all, but it works. If you're building AI systems in Go, or curious why you should be — I'd love your feedback. And if you find it interesting, a star on the repo goes a long way.

More to come.
https://lnkd.in/gK-bXDZQ
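Routex's YAML schema and Go internals aren't reproduced here. The core runtime behavior it describes, running independent agents in parallel while respecting declared dependencies, sketched with asyncio (all names hypothetical):

```python
import asyncio

async def run_agent(name: str) -> str:
    await asyncio.sleep(0.1)  # stand-in for LLM calls and tool execution
    return f"{name}: done"

async def run_graph(deps: dict[str, set[str]]) -> dict[str, str]:
    """Run agents as soon as their dependencies finish; independent
    agents in the same wave execute concurrently."""
    results: dict[str, str] = {}
    pending = dict(deps)
    while pending:
        # every agent whose dependencies are all satisfied is ready
        ready = [a for a, d in pending.items() if d <= set(results)]
        if not ready:
            raise ValueError("cycle or unsatisfiable dependency")
        outputs = await asyncio.gather(*(run_agent(a) for a in ready))
        results.update(zip(ready, outputs))
        for a in ready:
            del pending[a]
    return results

# "researcher" and "coder" run in parallel; "reviewer" waits for both
print(asyncio.run(run_graph({
    "researcher": set(),
    "coder": set(),
    "reviewer": {"researcher", "coder"},
})))
```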
The hardest problem in running parallel AI coding agents is not the coding. It is the merging.

Five agents finish their tasks at roughly the same time. Five pull requests target main. That is ten potential merge conflicts, and git's text-based merge cannot resolve most of them because they are structural, not textual. Two agents adding different imports to the same file. Two agents extending the same configuration object. Two agents creating similar utility functions.

Before we solved this, merge conflicts were the actual bottleneck in our agent fleet. Not API rate limits. Not context windows. Not model capability. Merge conflicts.

The solution was a sequential merge queue. Agent PRs enter a queue and are processed one at a time: rebase onto latest main, run an AST-aware merge driver that understands code structure (not just text lines), regenerate lock files, run the full test suite, then merge. If any step fails, the PR goes back to the end of the queue with a fresh rebase.

The AST-aware merge driver is the key insight. Traditional git sees two agents adding import lines to the same file and calls it a conflict. A driver that understands TypeScript syntax sees two non-overlapping additions to an import block and merges them automatically.

This runs locally, with no dependency on GitHub's paid merge queue feature. It is a Redis-backed sidecar that auto-starts with the agent fleet.

The lesson: scaling AI agents is not about spawning more of them. It is about building the infrastructure that lets their work converge cleanly. The merge queue became the constraint before CPU, memory, or API limits.

What infrastructure bottlenecks have you hit when scaling AI-generated code?

#AIAgents #SoftwareEngineering #DevTools
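The production driver described here is TypeScript-aware; purely as an illustration of the idea, a Python sketch that merges two branches' import additions structurally, in exactly the case where git's line-based merge would report a conflict:

```python
import ast

def import_lines(source: str) -> list[str]:
    """Extract top-level import statements, normalized to source text."""
    tree = ast.parse(source)
    return [ast.unparse(node) for node in tree.body
            if isinstance(node, (ast.Import, ast.ImportFrom))]

def merge_imports(base: str, ours: str, theirs: str) -> list[str]:
    """Union the imports each branch added on top of base. Git flags
    these lines as conflicting; structurally they don't overlap."""
    base_imports = set(import_lines(base))
    merged = import_lines(ours)  # keep "ours" order, then append new ones
    for imp in import_lines(theirs):
        if imp not in merged and imp not in base_imports:
            merged.append(imp)
    return merged

base = "import os\n"
ours = "import os\nimport json\n"   # agent A added json
theirs = "import os\nimport re\n"   # agent B added re
print(merge_imports(base, ours, theirs))
# ['import os', 'import json', 'import re']
```

A real driver also has to handle the non-import parts of the file and fall back to a textual conflict when the structural merge is ambiguous.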
Running Claude Code at your terminal and running it inside a GitHub Actions runner look almost identical. They are not.

At the terminal, if Claude goes into a retry loop, you notice it in seconds and kill it. In CI, nothing stops it except a timeout you probably did not set, and the first sign of trouble is the API bill at the end of the month. The defaults that make interactive use pleasant are the same defaults that make unattended use expensive.

I put together a guide on Refactix covering the flags and patterns that actually hold up in a pipeline. The three flags that matter (-p, --output-format json, and --bare), a GitHub Actions workflow that fails gracefully, the JSON parser you need between Claude and your GitHub API, cost controls that keep review jobs bounded, and the security boundaries that stop a prompt injection in a PR diff from turning into a secrets leak.

It also covers when to graduate from the CLI to the Claude Agent SDK or Dispatch for bigger workflows.

Worth a read if you are thinking about wiring any AI agent into your CI, not just Claude Code. The shape of the problem generalizes.

Full guide on Refactix: https://lnkd.in/grDAv5qA

#ClaudeCode #CICD #AI #DevOps #SoftwareEngineering
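As one concrete piece of that, a hedged sketch of the CLI-to-JSON step. The -p and --output-format json flags are real Claude Code flags; the JSON field names below are assumptions that may differ across versions, so verify against the docs for your installed version:

```python
import json
import subprocess
import sys

# Run Claude Code headless with a hard timeout so a retry loop in CI
# cannot run unbounded.
proc = subprocess.run(
    ["claude", "-p", "Review this diff for bugs.",
     "--output-format", "json"],
    capture_output=True, text=True, timeout=600,
)
if proc.returncode != 0:
    sys.exit(f"claude failed: {proc.stderr}")

# Field names ("result", "total_cost_usd") are assumptions; check the
# current Claude Code docs before relying on them.
payload = json.loads(proc.stdout)
review = payload.get("result", "")
cost = payload.get("total_cost_usd", payload.get("cost_usd"))

# Fail the job gracefully instead of posting an empty comment.
if not review:
    sys.exit("empty review; not posting a comment")
print(f"cost: {cost}")
print(review)
```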