𝗜 𝗯𝘂𝗶𝗹𝘁 𝗮 𝗥𝗔𝗚-𝗯𝗮𝘀𝗲𝗱 𝗱𝗼𝗰𝘂𝗺𝗲𝗻𝘁 𝗶𝗻𝘁𝗲𝗹𝗹𝗶𝗴𝗲𝗻𝗰𝗲 𝗽𝗹𝗮𝘁𝗳𝗼𝗿𝗺 𝗳𝗿𝗼𝗺 𝘀𝗰𝗿𝗮𝘁𝗰𝗵 🚀

Teams waste hours digging through internal documents for answers. 𝗗𝗼𝗰𝗦𝗲𝗻𝘀𝗲 fixes that: upload documents, ask questions, and get instant, cited answers grounded in your data.

𝗛𝗼𝘄 𝗶𝘁 𝘄𝗼𝗿𝗸𝘀 (minimal code sketches below):
1. Upload a document → triggers an async Step Functions pipeline that chunks the PDF, generates embeddings (all-MiniLM-L6-v2 via DJL/PyTorch), and indexes the vectors into OpenSearch k-NN.
2. A query is embedded into a vector → OpenSearch performs semantic search to retrieve the most relevant chunks.
3. The retrieved context is sent to Bedrock → it generates a grounded answer with inline citations.

𝗞𝗲𝘆 𝗵𝗶𝗴𝗵𝗹𝗶𝗴𝗵𝘁𝘀:
- Fully serverless ingestion with S3-staged batching to handle large documents without hitting payload limits
- k-NN vector search with cosine similarity for semantic retrieval (no keyword matching)
- Bedrock Converse API with tool use to enforce structured, cited outputs

𝗧𝗲𝗰𝗵: Java 24, Spring Boot 3.5, AWS (Bedrock, OpenSearch, Step Functions, Lambda, S3), DJL/PyTorch, PostgreSQL

Sharing the HLD and the GitHub link below 👇
GitHub: https://lnkd.in/gsb9ZsV6

𝗙𝗲𝗲𝗱𝗯𝗮𝗰𝗸 𝘄𝗲𝗹𝗰𝗼𝗺𝗲: what would you do differently?

#RAG #Java #SpringBoot #AWS #OpenSearch #Bedrock #VectorSearch #SoftwareEngineering #SystemDesign
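For the curious, here's roughly what the embedding step (step 1) looks like with DJL's Hugging Face model zoo. This is a minimal sketch, not the actual DocSense code: the class name is made up, and a real service would load the model once at startup rather than per call.

```java
import ai.djl.inference.Predictor;
import ai.djl.repository.zoo.Criteria;
import ai.djl.repository.zoo.ZooModel;

public final class ChunkEmbedder {

    // Illustrative only: load the model once and reuse predictors in production.
    public static float[] embed(String chunk) throws Exception {
        // String -> float[] types select DJL's built-in text-embedding translator;
        // the djl:// URL pulls all-MiniLM-L6-v2 from the Hugging Face model zoo.
        Criteria<String, float[]> criteria = Criteria.builder()
                .setTypes(String.class, float[].class)
                .optModelUrls("djl://ai.djl.huggingface.pytorch/sentence-transformers/all-MiniLM-L6-v2")
                .optEngine("PyTorch")
                .build();

        try (ZooModel<String, float[]> model = criteria.loadModel();
             Predictor<String, float[]> predictor = model.newPredictor()) {
            return predictor.predict(chunk); // 384-dimensional sentence embedding
        }
    }
}
```

all-MiniLM-L6-v2 emits 384-dimensional vectors, which is the dimension the index sketch below expects.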
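And a sketch of the OpenSearch k-NN side (steps 1 and 2), using the low-level REST client. The index name, local endpoint, and Lucene HNSW engine are my assumptions, not necessarily DocSense's setup; the essential pieces are index.knn, the 384-dim knn_vector mapping, and cosinesimil for cosine similarity.

```java
import java.util.Arrays;
import org.apache.http.HttpHost;
import org.opensearch.client.Request;
import org.opensearch.client.Response;
import org.opensearch.client.RestClient;

public final class ChunkIndex {
    private final RestClient client =
            RestClient.builder(new HttpHost("localhost", 9200, "http")).build();

    // One-time setup: knn_vector field sized for MiniLM's 384-dim output,
    // with cosine similarity as the distance metric.
    public void createIndex() throws Exception {
        Request req = new Request("PUT", "/doc-chunks");
        req.setJsonEntity("""
            { "settings": { "index.knn": true },
              "mappings": { "properties": {
                  "text":      { "type": "text" },
                  "embedding": { "type": "knn_vector", "dimension": 384,
                                 "method": { "name": "hnsw",
                                             "space_type": "cosinesimil",
                                             "engine": "lucene" } } } } }
            """);
        client.performRequest(req);
    }

    // Top-k semantic retrieval: rank chunks by cosine similarity to the query vector.
    public Response search(float[] queryVector, int k) throws Exception {
        Request req = new Request("POST", "/doc-chunks/_search");
        req.setJsonEntity("""
            { "size": %d,
              "query": { "knn": { "embedding": { "vector": %s, "k": %d } } } }
            """.formatted(k, Arrays.toString(queryVector), k));
        return client.performRequest(req);
    }
}
```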
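Finally, the tool-use trick on the Bedrock Converse API (step 3): define one tool whose input schema is the output format you want, then force the model to "call" it, so the answer always arrives as schema-conforming JSON. A sketch with the AWS SDK for Java v2; the tool name, schema fields, and model id are placeholders, not DocSense's actual ones.

```java
import java.util.List;
import software.amazon.awssdk.core.document.Document;
import software.amazon.awssdk.services.bedrockruntime.BedrockRuntimeClient;
import software.amazon.awssdk.services.bedrockruntime.model.*;

public final class CitedAnswerGenerator {

    public static Document answer(String question, String retrievedContext) {
        // JSON schema the model must fill in: the answer plus the chunk ids it cites.
        Document schema = Document.mapBuilder()
                .putString("type", "object")
                .putDocument("properties", Document.mapBuilder()
                        .putDocument("answer", Document.mapBuilder()
                                .putString("type", "string").build())
                        .putDocument("citations", Document.mapBuilder()
                                .putString("type", "array")
                                .putDocument("items", Document.mapBuilder()
                                        .putString("type", "string").build())
                                .build())
                        .build())
                .putList("required", List.of(Document.fromString("answer"),
                                             Document.fromString("citations")))
                .build();

        ToolSpecification toolSpec = ToolSpecification.builder()
                .name("cited_answer") // hypothetical tool name
                .description("Answer strictly from the supplied context, citing chunk ids.")
                .inputSchema(ToolInputSchema.fromJson(schema))
                .build();

        try (BedrockRuntimeClient bedrock = BedrockRuntimeClient.create()) {
            ConverseResponse response = bedrock.converse(r -> r
                    .modelId("anthropic.claude-3-5-sonnet-20240620-v1:0") // placeholder model id
                    .messages(Message.builder()
                            .role(ConversationRole.USER)
                            .content(ContentBlock.fromText(
                                    "Context:\n" + retrievedContext + "\n\nQuestion: " + question))
                            .build())
                    .toolConfig(ToolConfiguration.builder()
                            .tools(Tool.fromToolSpec(toolSpec))
                            // Force the model to respond through the tool => always structured.
                            .toolChoice(ToolChoice.fromTool(
                                    SpecificToolChoice.builder().name("cited_answer").build()))
                            .build()));

            // The structured answer arrives as the tool call's input document.
            return response.output().message().content().stream()
                    .filter(block -> block.toolUse() != null)
                    .findFirst().orElseThrow()
                    .toolUse().input();
        }
    }
}
```

Forcing toolChoice to a specific tool is what turns "please answer in JSON" from a prompt suggestion into a contract the model has to satisfy.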