I'm transferring GuardSpine's verification kernel from Python to Lean 4. The migration's painful enough that I built a toolkit to make it repeatable. So I'm giving it away.

Quick context: GuardSpine is my open-source AI governance framework — 16 repos, SHA-256 hash chains, Apache 2.0 licensed. It answers one question: "Who authorized this semantic change?"

The core kernel works. But it's Python. Python is great for getting something running. It's terrible when you need to prove that something is correct.

Lean 4 is a formal proof language. The compiler mathematically verifies your code. Not "it passed the tests" — the compiler won't let you ship anything that isn't provably correct.

We're in the process of moving critical verification components over now. It's slow. It's tedious. And the tooling gap between Python and Lean is brutal — so I built lean-python-migration-kit to bridge it. It's on GitHub.

This matters beyond my project. AI agents don't just write code for humans anymore. They write code for other agents. Agent A generates a function. Agent B calls it. Agent C chains it into a workflow nobody reviewed. Who's verifying any of this? Right now — mostly nobody. Maybe some unit tests. Maybe a human glances at it. That's not going to hold when agents are autonomously composing systems at scale.

Zero-trust isn't just a network security concept. It's becoming an AI architecture requirement. Every artifact an agent produces — code, configs, documents — needs cryptographic proof of integrity before another agent should touch it.

The research backs this up. VeriBench (AI4Math workshop, ICML 2025) found Claude 3.7 Sonnet could only compile about 12.5% of formal verification challenges in Lean 4. But a self-optimizing agent architecture hit nearly 90%. Agents with iterative self-correction are already dramatically better at proving code correct than single-shot models.

The money's following. Harmonic has raised nearly $300M building "hallucination-free" AI on Lean 4's backbone — valued at $1.45B as of late 2025. Every AI system that hit medal-level performance at the International Math Olympiad used Lean. Google DeepMind, ByteDance, Mistral — all building on it. Proof code isn't academic anymore. It's infrastructure.

My bet: within 3 years, "unverified agent output" will sound as reckless as "unencrypted database." The governance layer between agents won't be API keys and permissions. It'll be mathematical proof. That's why GuardSpine needs a formally verified kernel.

Still early. The migration is ongoing and the toolkit is rough in places. But the direction is clear. If you're building agent infrastructure or thinking about AI governance — it's free: https://lnkd.in/eyTVWWe8

#AIGovernance #FormalVerification #Lean4 #OpenSource #GuardSpine
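To make "the compiler won't let you ship" concrete, here is a minimal Lean 4 sketch. It is illustrative only; the names are made up and this is not GuardSpine's actual kernel code.

    -- Toy chain-length invariant. If the theorem below were false or the
    -- proof incomplete, this file would not compile at all.
    def chainLength : List String → Nat
      | [] => 0
      | _ :: rest => chainLength rest + 1

    -- Appending an entry grows the chain by exactly one. `rfl` suffices
    -- because the equality holds by definition.
    theorem append_grows (entry : String) (chain : List String) :
        chainLength (entry :: chain) = chainLength chain + 1 := rfl

That is the shift in guarantee: a unit test samples a few inputs, while a theorem like this holds for every possible chain, and the compiler checks it on every build.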
GuardSpine Migrates to Lean 4 with Verification Toolkit
More Relevant Posts
Most teams start by serving ML models directly from Python. It feels natural, but the hidden cost shows up later.

The real problem is that Python is great for model logic but not for high-throughput serving. You end up with a single stack that struggles with concurrency, latency, and observability.

A practical pattern I have seen work (the Python side is sketched below):
- Keep the model runtime in Python (PyTorch, TensorFlow, etc.).
- Wrap it with a thin Go microservice that handles request routing, metrics, retries, and concurrency.
- Let Python focus only on inference, not on being a production web server.

Why this matters:
- Latency: Go can handle thousands of concurrent requests with predictable performance. Python workers stay focused on math.
- Cost scaling: If each Python worker handles ~50 QPS and you need 5k QPS, that is 100 workers. With Go in front, you can reduce idle overhead and cut infra spend by 20–30%.
- Observability: Go integrates cleanly with Prometheus, tracing, and structured logging. Python does not have to carry that burden.
- Failure handling: Go can implement circuit breakers and backpressure so Python does not collapse under spikes.

The trade-offs are real. You add complexity with two runtimes. You need to manage inter-process communication and deployment. But the business impact is clear: lower latency means higher conversion, fewer servers means better margins, and cleaner isolation means less risk when scaling.

I am curious how others have approached this. If you've had to serve ML models at scale, did you stick with Python alone, or split responsibilities across languages? What worked best for you?
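Here is a minimal sketch of the Python half of that pattern, assuming a Go service (not shown) sits in front and owns routing, retries, and metrics. Everything here, including the predict placeholder and port 9000, is illustrative:

    # Inference-only worker: one dumb endpoint, bound to localhost so the
    # front-end service is the only public entry point. Stdlib-only sketch.
    import json
    from http.server import BaseHTTPRequestHandler, HTTPServer

    def predict(features: list) -> dict:
        # Placeholder for the real model call (e.g., a PyTorch forward pass).
        return {"score": sum(features) / max(len(features), 1)}

    class InferenceHandler(BaseHTTPRequestHandler):
        def do_POST(self):
            body = self.rfile.read(int(self.headers["Content-Length"]))
            result = predict(json.loads(body)["features"])
            payload = json.dumps(result).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.end_headers()
            self.wfile.write(payload)

    if __name__ == "__main__":
        HTTPServer(("127.0.0.1", 9000), InferenceHandler).serve_forever()

In production you would swap the stdlib server for gRPC or a Unix socket, but the shape is the point: Python does the math, nothing else.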
LandingAI just open-sourced ade-python — a Python SDK that enables agentic document extraction, turning complex documents into structured data for AI workflows. 🚀 🔗 https://lnkd.in/gWmAhXCb
*Day 26 - The 30-Day AI & Analytics Sprint* 🚀

Python supports multiple inheritance, which allows a class to inherit from multiple parent classes. However, this can create ambiguity in method resolution.

Question: What is MRO (Method Resolution Order) in Python? How does Python decide which parent method to call first? Why does Python use the C3 linearization algorithm? Give a real example where multiple inheritance may cause confusion.

What is MRO?
MRO is the order in which Python searches for a method or attribute in a class hierarchy. This becomes important when multiple inheritance is used. You can inspect this order with the .mro() method or the __mro__ attribute. When you call a method on an object, Python needs to know which class to check first.

Example:

    class A:
        def show(self):
            print("A")

    class B(A):
        pass

    class C(B):
        pass

    print(C.mro())
    # Output: [C, B, A, object]

Python will search for the method in this order.

How does Python decide which parent method to call?
Python follows the MRO list from left to right.

Why does Python use the C3 linearization algorithm?
Python uses C3 linearization to create a consistent and predictable order for method lookup. It guarantees three important rules:
1. Child classes come before parents.
2. The order of parent classes is respected.
3. A class appears only once in the hierarchy.

A real example of confusion (the diamond problem):

    class A:
        def show(self):
            print("A")

    class B(A):
        pass

    class C(A):
        def show(self):
            print("C")

    class D(B, C):
        pass

    obj = D()
    obj.show()  # prints "C"

Let's check the MRO:

    print(D.mro())
    # Output: [D, B, C, A, object]

How? Python checks D, then B, then C ✅ (method found), before ever reaching A or object. Even though B inherits from A, Python does not go to A first. It follows the C3 MRO order.

In short: MRO determines the order Python searches for methods. Python checks classes from left to right based on the MRO list, and uses C3 linearization to avoid ambiguity in multiple inheritance.

Mariam Metawe'e Muhammed Al Reay Instant Software Solutions
𝗧𝗵𝗶𝗻𝗸𝗶𝗻𝗴 𝗜𝗻 𝗚𝗼: 𝗔 𝗣𝘆𝘁𝗵𝗼𝗻 𝗗𝗲𝘃'𝘀 𝗚𝘂𝗶𝗱𝗲

You've written Python for years. You know how to use pip and asyncio. But then someone says "we're moving to Go." This guide is not an introduction to Go. It's about changing your mindset.

Python and Go are different on the surface and underneath. In Python, you have the GIL and the event loop. You've used threading and asyncio. Go is different. It has goroutines and channels. Goroutines are lightweight threads. Channels are how goroutines talk to each other.

The guide works through an example of fetching multiple URLs concurrently in both languages (the Python half is sketched below):
- Python uses asyncio and aiohttp
- Go uses goroutines and channels

Go has no event loop. It's truly parallel. Every goroutine can run on a separate CPU core.

In Go, context.Context is important. It carries deadlines and cancellation signals. You must cancel a context to avoid memory leaks.

You'll notice some things are missing in Go:
- No try/except. Errors are values, not exceptions.
- No None. Go uses zero values.
- No default function arguments. Use config structs instead.
- No context managers. Use defer instead.
- No enums. Use iota with a const block.
- No decorators. Use closures instead.

Python hides references. Go makes them explicit. You need to think about value vs reference semantics. The biggest difference: Python's slice syntax gives you copies. Go slices share backing memory.

Some quick mappings:
- list → slice or array
- dict → map

Go enforces a strict DAG for imports. No circular imports. The compiler will stop you.

You can achieve everything you did with Python classes in Go, just differently. Use structs with methods attached. There is no constructor; write a factory function instead. Encapsulation works through naming: lowercase is private, capitalized is public. Interfaces are implicitly satisfied. You don't write implements. Polymorphism is interface-based, not inheritance-based. Composition is the default.

Go has a simple toolchain:
- go fmt formats your code
- go test runs your tests
- go build compiles everything
- go vet catches suspicious constructs
- go mod manages dependencies

Source: https://lnkd.in/gKeYeMFS
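For reference, a minimal sketch of the Python side of that URL-fetching comparison, using asyncio and aiohttp (the URLs are placeholders, and the linked guide's own code may differ):

    import asyncio
    import aiohttp

    async def fetch(session: aiohttp.ClientSession, url: str) -> int:
        # Each fetch yields to the event loop while waiting on the network.
        async with session.get(url) as resp:
            await resp.read()
            return resp.status

    async def main(urls: list) -> None:
        async with aiohttp.ClientSession() as session:
            statuses = await asyncio.gather(*(fetch(session, u) for u in urls))
            for url, status in zip(urls, statuses):
                print(url, status)

    if __name__ == "__main__":
        asyncio.run(main(["https://example.com", "https://example.org"]))

The Go version replaces the event loop entirely: one goroutine per URL, results collected over a channel, all running in parallel across cores.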
Two years ago, Sam Thach, Caleb Hart, Joshua Aguayo, and I took on a formidable project: 𝐁𝐮𝐢𝐥𝐝𝐢𝐧𝐠 𝐚 𝐬𝐲𝐬𝐭𝐞𝐦 𝐭𝐡𝐚𝐭 𝐠𝐞𝐧𝐞𝐫𝐚𝐭𝐞𝐬 𝐦𝐞𝐚𝐧𝐢𝐧𝐠𝐟𝐮𝐥 𝐝𝐨𝐜𝐬𝐭𝐫𝐢𝐧𝐠𝐬 𝐟𝐨𝐫 𝐏𝐲𝐭𝐡𝐨𝐧 𝐟𝐮𝐧𝐜𝐭𝐢𝐨𝐧𝐬.

𝐖𝐡𝐲 𝐭𝐡𝐢𝐬 𝐩𝐫𝐨𝐣𝐞𝐜𝐭? We explored several NLP-based ideas such as translation systems, auto-documentation, and text generation, but ultimately landed on a Python Code Commenter because it solved a problem we all had firsthand experience with: 𝐮𝐧𝐝𝐞𝐫𝐬𝐭𝐚𝐧𝐝𝐢𝐧𝐠 𝐥𝐞𝐠𝐚𝐜𝐲, 𝐩𝐨𝐨𝐫𝐥𝐲 𝐝𝐨𝐜𝐮𝐦𝐞𝐧𝐭𝐞𝐝 𝐜𝐨𝐝𝐞.

𝐓𝐡𝐞 𝐠𝐨𝐚𝐥 𝐰𝐚𝐬 𝐬𝐭𝐫𝐚𝐢𝐠𝐡𝐭𝐟𝐨𝐫𝐰𝐚𝐫𝐝: Take a file of uncommented Python functions and return the same code with autogenerated docstrings that explain what each function does.

𝐖𝐡𝐚𝐭 𝐰𝐞 𝐛𝐮𝐢𝐥𝐭:
• Fine-tuned a T5 transformer model using PyTorch
• Trained on a non-proprietary dataset from Hugging Face
• Framed the problem correctly as a code-to-text translation task
• Generated docstrings only for valid function definitions to ensure reliability
• Designed and implemented a Tkinter GUI so the tool was usable by non-ML users

By the end, we had a functional prototype that could process large volumes of uncommented code and meaningfully document function definitions, landing an average accuracy score of ~1.43/2 across independent evaluations (with 2 being amazing).

𝐒𝐜𝐨𝐩𝐞 𝐜𝐡𝐚𝐧𝐠𝐞𝐬 & 𝐫𝐞𝐚𝐥-𝐰𝐨𝐫𝐥𝐝 𝐜𝐨𝐧𝐬𝐭𝐫𝐚𝐢𝐧𝐭𝐬: This project was a great lesson in adapting plans to reality.

𝘞𝘩𝘢𝘵 𝘴𝘵𝘢𝘳𝘵𝘦𝘥 𝘢𝘴:
• Line-by-line comments
• CodeBERT
• Broad scope

𝘌𝘷𝘰𝘭𝘷𝘦𝘥 𝘪𝘯𝘵𝘰:
• Function-level docstrings
• Switching from CodeBERT to T5
• A narrower, more robust and defensible solution

𝘈𝘭𝘰𝘯𝘨 𝘵𝘩𝘦 𝘸𝘢𝘺, 𝘸𝘦 𝘥𝘦𝘢𝘭𝘵 𝘸𝘪𝘵𝘩:
• School-imposed security restrictions
• Insufficient hardware and delayed access to Data Science machines
• Shared environment issues
• Version control growing pains
• Team availability constraints during a compressed timeline

None of these stopped the project, but 𝐚𝐥𝐥 𝐨𝐟 𝐭𝐡𝐞𝐦 𝐟𝐨𝐫𝐜𝐞𝐝 𝐮𝐬 𝐭𝐨 𝐜𝐨𝐦𝐦𝐮𝐧𝐢𝐜𝐚𝐭𝐞 𝐛𝐞𝐭𝐭𝐞𝐫, 𝐫𝐞𝐩𝐫𝐢𝐨𝐫𝐢𝐭𝐢𝐳𝐞 𝐜𝐨𝐧𝐬𝐭𝐚𝐧𝐭𝐥𝐲, 𝐚𝐧𝐝 𝐦𝐚𝐤𝐞 𝐩𝐫𝐚𝐠𝐦𝐚𝐭𝐢𝐜 𝐭𝐞𝐜𝐡𝐧𝐢𝐜𝐚𝐥 𝐝𝐞𝐜𝐢𝐬𝐢𝐨𝐧𝐬.

𝐖𝐡𝐚𝐭 𝐈 𝐭𝐨𝐨𝐤 𝐚𝐰𝐚𝐲: Beyond the technical skills, this project reinforced lessons I still apply today:
1. Scope management matters!
2. "Crunch" is real, and planning for it is essential
3. Many problems already have partial solutions; understanding them is half the job
4. Framing the problem correctly can unlock progress

𝐌𝐨𝐬𝐭 𝐢𝐦𝐩𝐨𝐫𝐭𝐚𝐧𝐭𝐥𝐲, 𝐢𝐭 𝐫𝐞𝐦𝐢𝐧𝐝𝐞𝐝 𝐦𝐞 𝐡𝐨𝐰 𝐦𝐮𝐜𝐡 𝐬𝐭𝐫𝐨𝐧𝐠𝐞𝐫 𝐨𝐮𝐭𝐜𝐨𝐦𝐞𝐬 𝐚𝐫𝐞 𝐰𝐡𝐞𝐧 𝐚 𝐭𝐞𝐚𝐦 𝐬𝐭𝐢𝐜𝐤𝐬 𝐭𝐡𝐫𝐨𝐮𝐠𝐡 𝐮𝐧𝐜𝐞𝐫𝐭𝐚𝐢𝐧𝐭𝐲 𝐚𝐧𝐝 𝐟𝐫𝐢𝐜𝐭𝐢𝐨𝐧 𝐢𝐧𝐬𝐭𝐞𝐚𝐝 𝐨𝐟 𝐚𝐛𝐚𝐧𝐝𝐨𝐧𝐢𝐧𝐠 𝐭𝐡𝐞 𝐩𝐫𝐨𝐛𝐥𝐞𝐦.

If you're curious, the full project can be found here: 🔗 https://lnkd.in/gPNaKjv3
This isn't a Golang vs Python tiff. When you are building for scale, a small tech decision can make or break your finances, especially when the margins are very thin. RapidaAI
"Why Go? The entire AI ecosystem is in Python." Every CTO evaluating Rapida asks this. It is the right question. Here is why we made that tradeoff. A voice call processes 50 audio frames per second. Each frame is 20ms. If your runtime pauses for 10ms to collect garbage, that is an audible glitch. Not a metric. A glitch your user hears. Python's GC can hit 10-50ms pauses under allocation-heavy load. Go's GC typically stays in microsecond-level pauses. That is the entire argument in one line. But the numbers go deeper. Every concurrent call in our pipeline spawns 24 goroutines at peak. Four priority dispatchers, RNNoise denoiser, Silero VAD, STT streamer, LLM streamer, TTS streamer, recording, session manager, transport handler, lifecycle hooks, and auxiliary workers. Goroutines start at KB-scale stacks. Python concurrency, whether threads or async, carries significantly higher overhead per task. We benchmarked both on a c8gn.2xlarge (8 vCPU, 16 GiB). At 487 concurrent calls: - Total RSS. Go: 461 MB. Python: 4.53 GB. - CPU. Go: 54%. Python: 93%. Python also serializes CPU-bound work under the GIL. - Heap allocs on the hot path. Go: 0 (sync.Pool). Python: 3-5 objects per frame. At 1,843 concurrent calls, Python needs 15.8 GB. Go does it in 1.63 GB. On the same hardware. Yes, we gave up the Python ecosystem. Every LLM library, every STT SDK, every sample repo. We built 38 provider integrations from scratch. 12 STT. 15 TTS. 11 LLM. That cost was real. But when your runtime is processing 24,000 audio frames per second, the language is not an abstraction you can swap later. It is the foundation everything else sits on. If this is useful, star the repo. It helps more engineers find it. https://lnkd.in/gqhX6RHN
Ever wondered why scaling Python-based LLM engines can sometimes feel like wrestling an octopus? 🐙 Let's talk about the architecture beneath the surface.

If you watch the terminal video I attached (pay special attention to the pstree command running in the top right terminal), you'll see exactly how vLLM manages its processes under the hood. It perfectly illustrates a fundamental, often-overlooked challenge in how we serve Large Language Models today. 🎬

Here is what's really happening and why it matters for your hardware:

1. The Python GIL Bottleneck 🐍
vLLM is a fantastic engine, but it is deeply tied to Python. Because of Python's Global Interpreter Lock (GIL), the engine historically couldn't achieve true multi-core parallel execution using standard threads.

2. The "Forking" Workaround 🍴
To bypass the GIL, vLLM relies on multiprocessing, spawning and forking entirely separate processes rather than threads. Think of it this way: if multithreading is like adding more chefs to a single, shared kitchen, multiprocessing is like building an entirely separate replica kitchen next door for every new chef. 🧑‍🍳

3. The CPU Affinity Nightmare 🧩
When you are optimising for maximum performance on modern hardware (NUMA node constraints, mixed P/E cores, or exotic custom SoCs), you need to pin specific workloads to specific CPU cores. Because vLLM forks into separate processes, you can't just constrain them all to a single neat Thread Group ID (TGID). They scatter across the OS scheduler.

4. The vLLM Waiting Game ⏱️
Pinning cores in vLLM is notoriously tricky. You can't just set a rule and walk away. You have to wait for the engine to initialise, profile its memory, kill off temporary forks, and finally settle into a "steady state." 🧘‍♂️ Only then can you run an external script to hunt down the active PIDs and apply ad-hoc affinity rules, often requiring two or more distinct rules just for one vLLM instance. (A sketch of such a script follows this post.)

5. The Python 3.14 Reality Check 🔮
Won't Python 3.14 and the highly anticipated "no-GIL" free-threading fix this? Short answer: no. 🛑 Even when true multi-threading becomes a reality in Python, vLLM's distributed architecture is already deeply anchored in spawned tasks and inter-process communication.

6. The llama.cpp Advantage ⚡
Contrast this architecture with llama.cpp. Because it is written in pure C++, there is no GIL to fight, and it simply does not fork. 🛡️ The result? You can apply a single, clean CPU affinity rule to the parent process right at startup, and it instantly applies to everything, even while the model is still loading. One rule, zero scattered processes, and absolutely no waiting. 💯

Choosing your inference stack isn't just about maximum tokens-per-second; it's about how nicely that stack plays with your bare-metal hardware. 🏗️

#LLM #vLLM #MachineLearning #MLOps #Python #LlamaCPP #AIHardware #InferenceEngine #SoftwareEngineering #PerformanceTuning
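As an illustration of step 4, a hypothetical helper that pins an already-settled process tree, assuming Linux and the psutil package (the PID and core set are placeholders, not anything vLLM-specific):

    import psutil

    def pin_process_tree(parent_pid: int, cores: list) -> None:
        # Walk the parent and every forked child, applying one affinity rule each.
        parent = psutil.Process(parent_pid)
        for proc in [parent, *parent.children(recursive=True)]:
            try:
                proc.cpu_affinity(cores)
                print(f"pinned pid={proc.pid} ({proc.name()}) -> {cores}")
            except psutil.NoSuchProcess:
                pass  # a temporary fork may have exited between listing and pinning

    # Example: pin the whole tree under PID 12345 to cores 0-7,
    # run only after the engine has reached its steady state.
    pin_process_tree(12345, list(range(8)))

Note the contrast with llama.cpp: there, a single taskset on the parent at launch covers everything, and no tree walk is needed.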
🚨 This Python tool just made vector databases optional for RAG.

It's called PageIndex. It reads documents the way you do. No embeddings. No chunking. No vector database needed.

Here's the problem with normal RAG: it takes your document, cuts it into tiny pieces, turns those pieces into numbers, and searches for the closest match. But closest match doesn't mean best answer.

PageIndex works completely differently:
→ It reads your full document
→ Builds a tree structure like a table of contents
→ When you ask a question, the AI walks through that tree
→ It thinks step by step until it finds the exact right section

Same way you'd find an answer in a textbook. You don't read every page. You check the chapters, pick the right one, and go straight to the answer. That's exactly what PageIndex teaches AI to do.

Here's the wildest part: it scored 98.7% accuracy on FinanceBench. That's a test where AI answers real questions from SEC filings and earnings reports. Most traditional RAG systems can't touch that number.

Works with PDFs, markdown, and even raw page images without OCR. 100% open source. MIT License.
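To make the tree-walk idea concrete, here is a toy sketch. This is not PageIndex's code: a real system would ask an LLM to pick the branch, where this stand-in uses simple word overlap, and the document tree here is invented:

    import re
    from dataclasses import dataclass, field

    @dataclass
    class Section:
        title: str
        text: str = ""
        children: list = field(default_factory=list)

    def tokens(s: str) -> set:
        return set(re.findall(r"[a-z]+", s.lower()))

    def relevance(question: str, section: Section) -> int:
        # Crude stand-in for "the AI decides which branch looks right".
        return len(tokens(question) & tokens(section.title))

    def walk(node: Section, question: str) -> Section:
        # Descend the table-of-contents tree until we reach a leaf section.
        if not node.children:
            return node
        best = max(node.children, key=lambda c: relevance(question, c))
        return walk(best, question)

    doc = Section("10-K", children=[
        Section("Risk Factors", text="..."),
        Section("Financial Statements", children=[
            Section("Revenue Recognition", text="Revenue is recognized when..."),
            Section("Operating Expenses", text="..."),
        ]),
    ])

    leaf = walk(doc, "What do the financial statements say about revenue recognition?")
    print(leaf.title)  # Revenue Recognition

That is the textbook analogy in code: no embeddings, no chunk index, just a sequence of branch decisions down a table of contents.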
Python vs. Mojo: Is It Time to Switch for AI Development?

Mojo markets itself as the language that gives developers Python's friendly syntax while unlocking C-like speed. The promise is unmistakably attractive to AI engineers who wrestle with enormous datasets, latency-sensitive inference, and GPU/TPU pipelines. In this article, we explore whether Mojo's performance claims translate into tangible benefits for production-grade AI systems or if the mature Python ecosystem still rules the stack....