I spent weeks doing the same feature engineering steps manually on every project. Missing value maps. Outlier detection. Linearity checks. Cramér's V. VIF. RFECV.

So I built a Python package that does all of it automatically.

Introducing featurewiz-pro, a 7-phase feature engineering pipeline I designed, built, and published to PyPI from scratch. One command. 47 seconds. Clean, model-ready data.

What it does:
→ Profiles your data and drops useless columns automatically
→ Detects which features are linear vs non-linear
→ Expands non-linear features with splines
→ Screens for multicollinearity and interactions
→ Selects the best features using RFECV + permutation importance
→ Audits for data leakage before you ever touch a model

Tested across Python 3.9–3.12. 55 tests. 0 failures.

Live on PyPI now → pip install featurewiz-pro
Walkthrough in the video
Git: https://lnkd.in/dHPV7xNK

#Python #MachineLearning #DataScience #OpenSource #FeatureEngineering #PyPI
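For readers who have not met Cramér's V, one of the manual checks listed in the post, here is a minimal standard-library sketch of the textbook formula. It is not featurewiz-pro's implementation or API; the function name `cramers_v` is hypothetical, and no bias correction is applied.

```python
from collections import Counter
from math import sqrt


def cramers_v(xs, ys):
    """Cramér's V between two equal-length sequences of categorical labels.

    Illustrative only: plain chi-squared statistic, no bias correction.
    """
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)
    if n == 0:
        raise ValueError("need at least one observation")

    row_totals = Counter(xs)           # marginal counts for the first variable
    col_totals = Counter(ys)           # marginal counts for the second variable
    observed = Counter(zip(xs, ys))    # joint counts (contingency table cells)

    k = min(len(row_totals), len(col_totals))
    if k < 2:
        return 0.0  # a constant column carries no association

    chi2 = 0.0
    for r, r_total in row_totals.items():
        for c, c_total in col_totals.items():
            expected = r_total * c_total / n
            chi2 += (observed[(r, c)] - expected) ** 2 / expected

    return sqrt(chi2 / (n * (k - 1)))
```

Values near 1 suggest two categorical columns carry nearly the same information, and values near 0 suggest they are unrelated.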
-
Spent 5 days chasing ghosts: DLL hell and ABI mismatches. I followed the agentic debugger down the wrong path as it hallucinated at the wrong layer, because it misread WinError 1114 as a load-path issue rather than a missing export.

The actual fix was two lines. I used TORCH_LIBRARY when I needed PYBIND11_MODULE.

The Architecture Gap:
- Use TORCH_LIBRARY to register ops into the PyTorch C++ Dispatcher (accessed via torch.ops). It fires static C++ constructors at DLL load time but does not create a PyInit_* function, so Python can't "see" it as a module.
- Use PYBIND11_MODULE to generate the standard Python C extension entry point, PyInit_{name}, which Python needs to "see" the module.

The error was literal: "dynamic module does not define module export function." No PyInit_* existed because TORCH_LIBRARY isn't meant to be imported directly.

{just correcting the record}

#CPP #PyTorch #SystemsProgramming #MachineLearning #barebones #3D
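A rough diagnostic sketch for the failure described in the post, assuming an ASCII module name: CPython looks up an init function named `PyInit_` plus the last component of the module name, so a quick byte search can hint at whether a binary was built with a Python entry point. The helper names are hypothetical, and the byte search is only a heuristic, not a real export-table parser.

```python
from pathlib import Path


def expected_init_symbol(module_name: str) -> str:
    """Name of the init function CPython looks up for an extension module.

    Assumes an ASCII module name. Only the last dotted component is used,
    e.g. pkg.my_ext -> PyInit_my_ext.
    """
    return "PyInit_" + module_name.rpartition(".")[2]


def mentions_init_symbol(binary_path, module_name: str) -> bool:
    """Heuristic: does the binary contain the init symbol name as raw bytes?

    A hit does not prove the symbol is exported, but a miss strongly suggests
    the library was never meant to be imported as a Python module.
    """
    symbol = expected_init_symbol(module_name).encode("ascii")
    return symbol in Path(binary_path).read_bytes()
```

If the check comes back negative for a library that only registers ops, loading it through `torch.ops.load_library` rather than `import` is the usual route.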
-
🚀 Another LeetCode Problem Solved: Palindrome Number!
🔗 Check out my solution: https://lnkd.in/dwDMqXXn

💡 Problem Overview
Given an integer x, determine whether it is a palindrome, meaning it reads the same forward and backward.

Examples:
✔ 121 → Palindrome
❌ -121 → Not a palindrome
❌ 10 → Not a palindrome

🧠 My Approach (Digit Reversal)
Instead of converting the number to a string, I used a mathematical approach:
- Extract digits using % 10
- Reverse the number step by step
- Compare the reversed number with the original

⚙️ Key Learnings
✔ Strong understanding of number manipulation
✔ Importance of handling edge cases (negative numbers, trailing zeros)
✔ Practicing clean and efficient logic

⏱️ Complexity
• Time Complexity: O(log n)
• Space Complexity: O(1)

🔥 Why this problem matters
Even though it’s an “easy” problem, it builds:
- Logical thinking
- Problem-solving fundamentals
- Confidence for bigger challenges

#LeetCode #DSA #Python #CodingJourney #ProblemSolving #100DaysOfCode
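The digit-reversal approach described in the post can be sketched as follows. This is one straightforward reading of the three steps, not the poster's linked solution, and the function name `is_palindrome` is chosen here for illustration.

```python
def is_palindrome(x: int) -> bool:
    """Check whether an integer reads the same forward and backward."""
    if x < 0:
        return False  # the leading minus sign can never be mirrored

    original = x
    reversed_num = 0
    while x > 0:
        reversed_num = reversed_num * 10 + x % 10  # append the last digit
        x //= 10                                   # drop the last digit
    return reversed_num == original
```

The loop runs once per digit, which is where the O(log n) time bound comes from, and only a couple of integers are kept around, hence O(1) space.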
-
I've shared requirements.txt files generated with pip freeze and watched them fail on every machine that wasn't mine.

So I built envcore. Because waiting for the Python ecosystem to fix a 15-year-old problem seemed optimistic.

It hooks directly into Python's import system and records what actually loads while your code runs. Not what's installed on your machine. Not what a static scanner thinks might be imported. What. Actually. Runs.

envcore trace train.py → env_manifest.json → envcore restore

Clean, pinned, minimal manifest. Exact environment rebuilt anywhere. No 200-package soup, no missing runtime imports, no "works on my machine" as if that's a valid thing to say to another human.

It also resolves import aliases correctly (PIL to Pillow, cv2 to opencv-python, sklearn to scikit-learn), because the gap between what you type and what you install has existed since forever and apparently needed one person to care.

pip freeze has been lying to you for 15 years. Everyone accepted it. I got tired of it.

30 seconds to try: pip install envcore

If it's useful, a GitHub star helps a new project get noticed.
https://lnkd.in/dz3MFTbD

#Python #OpenSource #DevTools
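To make the idea of hooking into the import system concrete, here is a minimal sketch of a meta path finder that records import requests without resolving them itself. It is not envcore's code: `ImportRecorder` and `record_imports` are hypothetical names, and the sketch only sees modules that are not already cached in `sys.modules`.

```python
import sys
from contextlib import contextmanager


class ImportRecorder:
    """Meta path finder that records import requests and resolves nothing."""

    def __init__(self):
        self.top_level = []

    def find_spec(self, fullname, path=None, target=None):
        # Called for every import that is not already cached in sys.modules,
        # including imports that later fail.
        name = fullname.partition(".")[0]
        if name not in self.top_level:
            self.top_level.append(name)
        return None  # defer to the regular finders for the actual import


@contextmanager
def record_imports():
    """Record top-level module names requested inside the with-block."""
    recorder = ImportRecorder()
    sys.meta_path.insert(0, recorder)
    try:
        yield recorder.top_level
    finally:
        sys.meta_path.remove(recorder)
```

Mapping recorded import names to installable distributions (the PIL to Pillow step) is a separate problem; on Python 3.10+ `importlib.metadata.packages_distributions()` is one possible starting point.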
-
Worked on an Event Scheduler project in today’s CPSC 335 (Algorithms) class, implementing it in Python using a combination of data structures. I used a min-heap for efficient priority handling, a hash table for O(1) lookups, and a sorted list for time-based queries. This exercise was a great way to see how different data structures can be combined to balance performance and functionality, and how trade-offs play a key role in algorithm design.
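A minimal sketch of how the three structures named in the post might fit together, assuming events have an id, a start time, and a numeric priority. The class and method names are hypothetical, not the course project's code.

```python
import bisect
import heapq
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    event_id: str
    start: int     # e.g. minutes since midnight
    priority: int  # lower number = more urgent


class EventScheduler:
    def __init__(self):
        self._by_id = {}    # hash table: id -> Event, O(1) average lookup
        self._heap = []     # min-heap of (priority, start, id)
        self._by_time = []  # sorted list of (start, id) for range queries

    def add(self, event: Event) -> None:
        if event.event_id in self._by_id:
            raise KeyError(f"duplicate event id: {event.event_id}")
        self._by_id[event.event_id] = event
        heapq.heappush(self._heap, (event.priority, event.start, event.event_id))
        bisect.insort(self._by_time, (event.start, event.event_id))  # O(n) insert

    def get(self, event_id: str):
        return self._by_id.get(event_id)

    def pop_most_urgent(self) -> Event:
        _, start, event_id = heapq.heappop(self._heap)  # O(log n)
        self._by_time.remove((start, event_id))         # O(n): the trade-off
        return self._by_id.pop(event_id)

    def between(self, lo: int, hi: int):
        """Events with lo <= start < hi, in time order."""
        left = bisect.bisect_left(self._by_time, (lo, ""))
        right = bisect.bisect_left(self._by_time, (hi, ""))
        return [self._by_id[eid] for _, eid in self._by_time[left:right]]
```

The trade-off shows up in `pop_most_urgent`: the heap pop is logarithmic, but keeping the sorted list in sync costs a linear-time removal.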
-
𝐃𝐞𝐭𝐞𝐜𝐭 𝐇𝐚𝐥𝐥𝐮𝐜𝐢𝐧𝐚𝐭𝐢𝐨𝐧 𝐢𝐧 𝐑𝐀𝐆 𝐮𝐬𝐢𝐧𝐠 𝐋𝐞𝐭𝐭𝐮𝐜𝐞𝐃𝐞𝐭𝐞𝐜𝐭

LettuceDetect is a lightweight open-source tool for detecting hallucinations in RAG. It identifies unsupported parts of an answer by comparing it to the provided context. LettuceDetect uses a ModernBERT model trained on the RAGTruth dataset.

𝐊𝐞𝐲 𝐅𝐞𝐚𝐭𝐮𝐫𝐞𝐬
- Token-level precision: detect exact hallucinated spans
- Optimized for inference: smaller model size and faster inference
- 4K context window via ModernBERT
- MIT-licensed models & code
- HF Integration: one-line model loading
- Easy-to-use Python API: install it from pip and integrate it into your RAG system with a few lines of code
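To illustrate what "token-level" detection that yields "exact hallucinated spans" means in practice, here is a toy sketch. It is not LettuceDetect's model or API: the function name `unsupported_spans` is hypothetical, and a naive word-overlap check stands in for the trained classifier purely to show how per-token flags can be merged into character spans.

```python
import re


def unsupported_spans(context: str, answer: str):
    """Toy detector: flag answer tokens absent from the context, merge into spans.

    A real system would replace the lexical check with a trained token
    classifier; only the flag-then-merge shape is the point here.
    """
    context_tokens = set(re.findall(r"\w+", context.lower()))
    spans = []
    for match in re.finditer(r"\w+", answer):
        if match.group().lower() in context_tokens:
            continue  # token is (naively) supported by the context
        # Extend the previous span when no supported token sits in between.
        gap_has_token = bool(spans) and re.search(
            r"\w", answer[spans[-1]["end"]:match.start()]
        )
        if spans and not gap_has_token:
            spans[-1]["end"] = match.end()
        else:
            spans.append({"start": match.start(), "end": match.end()})
    for span in spans:
        span["text"] = answer[span["start"]:span["end"]]
    return spans
```

With a context that says the population is 67 million and an answer that says 69 million, the only flagged span would be `69`, which is the kind of pinpointed output the feature list describes.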
-
Always great to see LettuceDetect gaining more adoption! I think the biggest challenge in hallucination detection right now is dataset diversity. Most open benchmarks are plain text only, while real pipelines are full of tables, markdown, and generated code. Stay tuned, because we have several releases in progress: structured data benchmarks and hallucination detection for code generation agents (Claude, Copilot, etc.). So expect big updates to LettuceDetect 😊 Reach out if you'd like to collaborate or learn more!
-
🧠✨ Reducing Hallucinations in RAG with LatticeDetect

One of the biggest challenges in Retrieval-Augmented Generation (RAG) isn’t retrieval… It’s hallucination.

Even with the right context, LLMs can still:
❌ Generate confident but incorrect answers
❌ Mix facts with assumptions
❌ Drift away from source truth

So how do we fix this?

🚀 Enter: LatticeDetect
Instead of blindly trusting generated responses, LatticeDetect adds a validation layer that ensures:
✔️ Responses stay grounded in retrieved context
✔️ Inconsistencies are detected early
✔️ Output aligns with factual evidence

💡 Think of it as: Turning RAG from a storyteller into a truth-teller.

⚙️ What changes with this approach?
• Better factual accuracy
• More reliable AI systems
• Production-ready trust (not just demo magic)

🔥 In a world where LLMs can “sound right,” we need systems that are actually right.

👉 RAG + LatticeDetect = Reliability > Creativity

#AI #GenerativeAI #RAG #LLM #MachineLearning #AIEngineering #DataScience #Innovation
-
Happy to share 🛠️ my_mlir_track#6 Structured Control Flow (SCF) - for (Repo link: https://lnkd.in/gPE4Mrmk)!

✅ Induction Variables & Bounds: How to manage index types for loop control.
✅ Loop-Carried Variables: Using iter_args and scf.yield to maintain state across iterations.
✅ Python Integration: Compiling the logic into a shared library to bridge the gap between high-level orchestration and hardware-level execution.

By utilizing this pipeline, we can generate highly optimized logic that remains easily accessible via a clean Python interface.

(NOTE) Since I’ll be moving toward an assembly-like style in the next stage, I'm concluding the SCF series with this brief for loop entry. The next post will also be short; as the code becomes less "human-readable," keeping the content minimal should make it easier to digest.

#LearningInPublic #Compiler #MLIR #SCF #for #C++ #HighPerformanceComputing
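For readers new to loop-carried values, the semantics of an `scf.for` with `iter_args` can be modeled in a few lines of plain Python. This is a conceptual sketch only, not MLIR's Python bindings or the code in the linked repo; `scf_for` is a hypothetical helper whose `body` returns the values it would pass to `scf.yield`.

```python
def scf_for(lower, upper, step, iter_args, body):
    """Model of scf.for: iterate iv over [lower, upper) and thread state through.

    body(iv, *carried) must return a tuple with the next carried values,
    mirroring what the loop body hands to scf.yield.
    """
    if step <= 0:
        raise ValueError("scf.for requires a positive step")
    carried = tuple(iter_args)
    for iv in range(lower, upper, step):
        carried = tuple(body(iv, *carried))
    return carried  # the loop's results; equal to iter_args if it never runs


# Sum the induction variable over 0..9 using one loop-carried accumulator.
(total,) = scf_for(0, 10, 1, (0,), lambda iv, acc: (acc + iv,))
```

The point of the model is that the loop body never mutates outer state: each iteration receives the previous values as arguments and returns the next ones, which matches how SSA values flow through the loop.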
-
Turbovec is now available on PyPI 🐍 and Crates.io 📦

Turbovec is a vector index built on Google's TurboQuant algorithm, written in Rust with Python bindings.

Turbovec has identical or better speed, compression and recall compared to Faiss while also being data-oblivious. Because adding new vectors doesn't require re-indexing, Turbovec is dramatically simpler to operate in production.

→ pip install turbovec
→ cargo add turbovec

Check out the open-source repo: https://lnkd.in/e5M4dVRk

#RAG #LLM #OpenSource #Gemma4