CodeGenie - Chat with your code
Have you ever encountered chatbots that allow you to interact seamlessly with documents, swiftly identifying relevant sections and summarizing complex information? Now imagine applying this powerful capability directly to your codebase. As our organization's codebases grow larger and more complex, navigating and understanding them becomes increasingly challenging. What if there were an application enabling you to converse naturally with your entire codebase?
Picture yourself opening a chat interface and simply asking, "Could you point me to the files and methods responsible for cart checkout?" Instantly, you receive a concise summary highlighting relevant components, significantly accelerating your debugging process during production incidents. Envision further instructing the system: "Could you add dynamic pricing logic to the cart checkout method, configured through a JSON file placed in the existing config folder?" Moments later, the app presents the modified files for your review and approval.
This vision has driven me for a long time. Recognizing how widespread this challenge is, I dedicated time to researching how to parse programming languages efficiently at scale, and how to build a local solution that runs seamlessly on your laptop without relying on paid external LLM services.
Today, I'm excited to share a proof-of-concept (POC) I developed to address this very challenge. Below, I provide the detailed implementation insights from this project. I hope you find it valuable and insightful, and I welcome your thoughts and feedback.
I named this project CodeGenie - an intelligent code search and analysis platform that combines the power of local LLMs, semantic search, and advanced code parsing to revolutionize how developers interact with codebases.
System Architecture
Detailed Component Analysis
Frontend Architecture
The frontend is built using modern web technologies:
- React + TypeScript: For type-safe, maintainable code
- Material-UI: For consistent, responsive design
- Monaco Editor: For code editing and syntax highlighting
- WebSocket Client: For real-time updates
- State Management: Using React Context and custom hooks
Key Features:
- Real-time file tree visualization
- Interactive code search interface
- Syntax highlighting for multiple languages
- Progress indicators for long-running operations
- Error handling and user feedback
Backend Architecture
The backend is built using FastAPI, providing:
- Async Processing: For handling multiple requests efficiently
- WebSocket Support: For real-time communication
- API Documentation: Automatic OpenAPI/Swagger docs
- Error Handling: Comprehensive error management
- Authentication: JWT-based security
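As a minimal sketch of what that API surface might look like (the route names, payload shapes, and the progress_updates() generator here are illustrative, not the project's actual endpoints):

from fastapi import FastAPI, WebSocket
from pydantic import BaseModel

app = FastAPI(title="CodeGenie API")
indexer = CodeIndexer()  # the indexer class described below

class SearchRequest(BaseModel):
    query: str
    top_k: int = 10

@app.post("/search")
def search_code(request: SearchRequest):
    # Delegates to the hybrid search covered later in this article
    return {"results": indexer.search(request.query)}

@app.websocket("/ws/progress")
async def progress(websocket: WebSocket):
    # Streams indexing progress to the frontend in real time
    await websocket.accept()
    async for update in indexer.progress_updates():  # hypothetical async generator
        await websocket.send_json(update)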
Key Components:
class CodeIndexer:
    def __init__(self):
        self.llm_client = OllamaClient(model="llama3:8b")  # local LLM client
        self.parser = Parser()       # tree-sitter based code parser
        self.vector_db = VectorDB()  # FAISS-backed vector store
        self.db = Database()         # SQLite metadata store
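To give a sense of how these components fit together, here is a minimal sketch of an end-to-end indexing pass. The vector_db.add and db.save_file calls are my assumptions for illustration; only the component construction above appears in the snippet itself.

import os

class CodeIndexer:
    # ... __init__ as above ...

    def index_repository(self, repo_path: str) -> None:
        """Walk a repository, parse each file, and index its methods."""
        for root, _, files in os.walk(repo_path):
            for name in files:
                path = os.path.join(root, name)
                language = self.detect_language(path)
                if language == "unknown":
                    continue
                with open(path, encoding="utf-8", errors="ignore") as f:
                    content = f.read()
                # Embed each extracted method and store it with its metadata
                for method in self.extract_methods_with_tree_sitter(content, language):
                    embedding = self.generate_embeddings(method["code"])
                    self.vector_db.add(embedding, metadata={"file": path, **method})  # hypothetical API
                self.db.save_file(path, language)  # hypothetical persistence call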
Code Processing Pipeline
The code processing pipeline is the heart of the system:
Language Detection
def detect_language(self, file_path: str) -> str:
    """Detect programming language using multiple strategies."""
    try:
        # Try file extension first
        ext = os.path.splitext(file_path)[1].lower()
        if ext in self.language_map:
            return self.language_map[ext]
        # Fall back to content analysis
        return self._detect_language_by_content(file_path)
    except Exception as e:
        self.logger.error(f"Error detecting language: {str(e)}")
        return "unknown"
Code Parsing
Using Tree-sitter for robust parsing:
def extract_methods_with_tree_sitter(self, content: str, language: str) -> List[Dict[str, Any]]:
    """Extract methods using tree-sitter parser."""
    try:
        parser = self.languages.get(language)
        if not parser:
            return []
        tree = parser.parse(bytes(content, 'utf8'))
        return self._process_tree_sitter_tree(tree, content)
    except Exception as e:
        self.logger.error(f"Error in tree-sitter parsing: {str(e)}")
        return []
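The _process_tree_sitter_tree helper walks the syntax tree and collects function nodes. A simplified version could look like this; note that node type names are grammar-specific ("function_definition" is what the Python grammar uses), so a real implementation needs a per-language mapping:

def _process_tree_sitter_tree(self, tree, content: str) -> List[Dict[str, Any]]:
    """Collect function and method nodes from a tree-sitter parse tree."""
    methods = []
    source = bytes(content, "utf8")

    def walk(node):
        if node.type in ("function_definition", "method_definition", "function_declaration"):
            name_node = node.child_by_field_name("name")
            methods.append({
                "name": source[name_node.start_byte:name_node.end_byte].decode("utf8")
                if name_node else "<anonymous>",
                "code": source[node.start_byte:node.end_byte].decode("utf8"),
                "start_line": node.start_point[0] + 1,  # tree-sitter rows are 0-based
                "end_line": node.end_point[0] + 1,
            })
        for child in node.children:
            walk(child)

    walk(tree.root_node)
    return methods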
LLM Integration
Using Ollama to run Llama 3 locally:
import logging

class OllamaClient:
    def __init__(self, model: str = "llama3:8b"):
        self.model = model
        self.base_url = "http://localhost:11434"  # Ollama's default local endpoint
        self.logger = logging.getLogger(__name__)

    async def generate(self, prompt: str) -> str:
        """Generate text using the local LLM."""
        try:
            response = await self._make_request(prompt)
            return self._process_response(response)
        except Exception as e:
            self.logger.error(f"Error generating text: {str(e)}")
            return ""
Vector Search Implementation
The vector search system combines multiple technologies:
Embedding Generation
def generate_embeddings(self, content: str) -> np.ndarray:
    """Generate embeddings using sentence transformers."""
    try:
        return self.model.encode([content])[0]
    except Exception as e:
        self.logger.error(f"Error generating embeddings: {str(e)}")
        return np.zeros(self.dimension)
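The embedding model itself only needs to be loaded once. A typical setup with sentence-transformers looks like this; the specific model name is my assumption (all-MiniLM-L6-v2 is a common lightweight choice that produces 384-dimensional vectors):

from sentence_transformers import SentenceTransformer

class EmbeddingGenerator:
    def __init__(self, model_name: str = "all-MiniLM-L6-v2"):
        self.model = SentenceTransformer(model_name)
        # The dimension is needed later to size the FAISS index
        self.dimension = self.model.get_sentence_embedding_dimension()

# Usage: encode a code snippet into a fixed-size vector
embedder = EmbeddingGenerator()
vector = embedder.model.encode(["def checkout(cart): ..."])[0]  # shape (384,)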
Index Management
def load_or_create_index(self):
    """Load existing index or create a new one."""
    try:
        if os.path.exists(self.index_path):
            self.index = faiss.read_index(self.index_path)
        else:
            self.index = faiss.IndexFlatL2(self.dimension)
    except Exception as e:
        self.logger.error(f"Error managing index: {str(e)}")
        self.index = faiss.IndexFlatL2(self.dimension)
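Once the index exists, adding and querying vectors takes only a few lines; FAISS expects float32 arrays of shape (n, dimension). A sketch of the two operations, with method names of my choosing:

import numpy as np
import faiss

def add_embedding(self, embedding: np.ndarray) -> None:
    """Append one vector to the index and persist it to disk."""
    self.index.add(embedding.astype("float32").reshape(1, -1))
    faiss.write_index(self.index, self.index_path)

def _semantic_search(self, embedding: np.ndarray, top_k: int = 10):
    """Return (vector_id, distance) pairs for the top_k nearest vectors."""
    distances, ids = self.index.search(
        embedding.astype("float32").reshape(1, -1), top_k
    )
    return list(zip(ids[0].tolist(), distances[0].tolist()))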
Technical Challenges and Solutions
Local LLM Integration
Challenge: Running large language models locally while maintaining performance.
Solution:
- Used Ollama to run Llama 3 8B model locally
- Implemented efficient prompt engineering
- Added caching for common queries
- Optimized model parameters
class OllamaClient:
    def __init__(self, model: str = "llama3:8b"):
        self.model = model
        self.cache = {}  # prompt -> response, so repeated queries skip the LLM

    async def generate(self, prompt: str) -> str:
        """Generate text with caching."""
        if prompt in self.cache:
            return self.cache[prompt]
        response = await self._make_request(prompt)
        self.cache[prompt] = response
        return response
Code Understanding
Challenge: Accurately understanding and summarizing code.
Solution:
- Combined tree-sitter parsing with LLM analysis
- Created specialized prompts for code understanding
- Implemented method-level analysis
- Added import/export relationship tracking
async def analyze_code(self, content: str, language: str) -> Dict[str, Any]:
    """Analyze code using multiple techniques."""
    try:
        # Parse code structure
        structure = self.parser.parse(content, language)
        # Generate LLM summary (generate() is a coroutine, so await it)
        summary = await self.llm_client.generate(
            f"Summarize this {language} code:\n{content}"
        )
        # Extract relationships
        relationships = self._extract_relationships(structure)
        return {
            "structure": structure,
            "summary": summary,
            "relationships": relationships,
        }
    except Exception as e:
        self.logger.error(f"Error analyzing code: {str(e)}")
        return {}
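The _extract_relationships call above is where the import/export tracking happens. In the project this runs over tree-sitter output across languages; as a Python-only illustration, the same idea with the standard ast module:

import ast

def extract_python_imports(content: str) -> list:
    """Return the modules a Python source file imports."""
    modules = []
    try:
        tree = ast.parse(content)
    except SyntaxError:
        return modules
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            modules.extend(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            modules.append(node.module)
    return modules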
Search Performance
Challenge: Providing fast, accurate search results.
Solution:
- Implemented FAISS for efficient similarity search
- Added hybrid search combining semantic and keyword search
- Created specialized indices for different code elements
- Optimized embedding generation
def search(self, query: str, filters: Optional[Dict[str, Any]] = None) -> List[Dict[str, Any]]:
    """Perform hybrid search."""
    try:
        # Generate query embedding
        embedding = self._generate_embeddings(query)
        # Perform semantic search
        semantic_results = self._semantic_search(embedding)
        # Perform keyword search
        keyword_results = self._keyword_search(query)
        # Combine and rank results
        return self._combine_results(
            semantic_results,
            keyword_results,
            filters
        )
    except Exception as e:
        self.logger.error(f"Error in search: {str(e)}")
        return []
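One standard way to merge the two ranked lists is reciprocal rank fusion, which needs no score calibration between the semantic and keyword sides. The project's actual merging strategy may differ; this sketch assumes each argument is a list of document ids in rank order:

def _combine_results(self, semantic_results, keyword_results, filters=None, k: int = 60):
    """Merge two ranked id lists with reciprocal rank fusion (RRF)."""
    scores = {}
    for results in (semantic_results, keyword_results):
        for rank, doc_id in enumerate(results):
            # Each list contributes 1 / (k + rank + 1) to the fused score
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    merged = sorted(scores, key=scores.get, reverse=True)
    if filters:
        # _matches_filters is a hypothetical metadata-filtering helper
        merged = [doc_id for doc_id in merged if self._matches_filters(doc_id, filters)]
    return merged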
Technology Stack Details
1. Local LLM Infrastructure
- Ollama: For running LLMs locally
- Llama 3 8B: For code understanding and generation
- Prompt Engineering: Specialized prompts for code tasks
- Caching System: For performance optimization
2. Search Infrastructure
- FAISS: For efficient similarity search
- Sentence Transformers: For generating embeddings
- Hybrid Search: Combining semantic and keyword search
- Metadata Filtering: For precise result filtering
3. Code Analysis
- Tree-sitter: For robust code parsing
- Language Detection: Multi-strategy approach
- Method Extraction: Both parser and regex-based
- Import Analysis: For understanding code relationships
4. Storage System
- SQLite: For metadata and relationships
- FAISS Index: For vector storage
- File System: For code content
- Cache System: For performance optimization
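To make the storage layer above concrete, a SQLite schema along these lines would cover files, methods, and import relationships. The actual tables in the project may differ; this is an illustrative layout:

import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS files (
    id INTEGER PRIMARY KEY,
    path TEXT UNIQUE NOT NULL,
    language TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS methods (
    id INTEGER PRIMARY KEY,
    file_id INTEGER REFERENCES files(id),
    name TEXT NOT NULL,
    start_line INTEGER,
    end_line INTEGER,
    vector_id INTEGER  -- row position of the method's vector in the FAISS index
);
CREATE TABLE IF NOT EXISTS imports (
    file_id INTEGER REFERENCES files(id),
    imported_module TEXT NOT NULL
);
"""

def init_db(path: str = "codegenie.db") -> sqlite3.Connection:
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn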
Code
You can access the codebase here: https://github.com/sunil-goyal-1502/querycodegenie. Please excuse any rough edges; it's a work in progress.
Conclusion
CodeGenie represents a significant advancement in code search and analysis tools. By combining local LLMs with advanced code parsing and semantic search, I've created a powerful platform that helps developers understand and navigate codebases more efficiently.
The system's modular design, comprehensive error handling, and efficient use of local resources make it both powerful and maintainable. As I continue to enhance its capabilities, I am excited about the potential to revolutionize how developers interact with code.
What challenges have you faced in code search and analysis? I'd love to hear your thoughts and experiences in the comments below!
#CodeSearch #MachineLearning #SoftwareEngineering #DeveloperTools #LLM #LocalAI #CodeAnalysis #TechInnovation #TechInUAE #Dubai #AbuDhabi #UAE #USA #TechInSingapore