🚨 How Spring Proxy Self-Invocation Bypassed @Cacheable and Degraded API Performance

Debugged a performance issue where a critical API response time degraded to ~500ms under load.

🔍 What actually happened?
A @Cacheable method in the service layer was invoked from another method within the same class. This internal call bypassed the Spring proxy, so caching was never applied — with no errors or warnings. Consequently, MongoDB was hit on every invocation (~55ms per call), and concurrency amplified the latency. Methods were public, the annotation was correct, and indexes existed in MongoDB, but none of that mattered because Spring intercepts only external calls through the proxy; self-invocation does not trigger caching.

⚠️ Key Insight
Spring's official documentation states: "In proxy mode, only external method calls coming in through the proxy are intercepted. Self-invocation will not lead to actual caching at runtime even if the method is marked with @Cacheable." Java resolves internal calls as this.method() — pointing to the target object, not the proxy.

✅ Fixes
Identified via JFR stack traces and I/O analysis. Removed repeated DB calls from downstream flow, introduced parallel fetch using virtual threads, and passed pre-fetched data through the call chain. This removed the need for the internal method to independently call the @Cacheable method, eliminating the proxy bypass.

👉 Result: API response time dropped from ~500ms to ~20ms

🧠 Takeaway
If @Cacheable is present but ineffective, verify the call path. Self-invocation within the same class bypasses the proxy, and caching will not execute.

👉 Curious if others have encountered Spring proxy self-invocation in production.

#Java #SpringBoot #SpringFramework #PerformanceEngineering #MongoDB #BackendEngineering #JVM #JFR #Caching #DistributedSystems #SoftwareEngineering #ProductionIncident #Debugging #MicroServices #JavaPerformance
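To make the mechanics concrete, here is a plain-Java sketch of the failure mode, with no Spring involved and all names hypothetical: the cache lives in a wrapper object, so a this.method() call inside the target can never be intercepted.

```java
import java.util.HashMap;
import java.util.Map;

// Plain-Java sketch (no Spring, hypothetical names) of the failure mode:
// the cache lives in a wrapper, so this.method() inside the target object
// goes straight to the raw implementation and is never intercepted.
public class SelfInvocationDemo {

    interface PriceService {
        long fetchPrice(String id);       // imagine this one is @Cacheable
        long fetchPriceTwice(String id);  // calls fetchPrice internally
    }

    static class PriceServiceImpl implements PriceService {
        int dbHits = 0; // counts simulated MongoDB reads

        public long fetchPrice(String id) {
            dbHits++;                     // every call landing here is a "DB hit"
            return id.hashCode();
        }

        public long fetchPriceTwice(String id) {
            // Self-invocation: resolves to this.fetchPrice() on the target
            // object, so any caching proxy wrapped around us is bypassed.
            return fetchPrice(id) + fetchPrice(id);
        }
    }

    // Hand-rolled stand-in for the Spring AOP caching proxy.
    static class CachingProxy implements PriceService {
        private final PriceServiceImpl target;
        private final Map<String, Long> cache = new HashMap<>();

        CachingProxy(PriceServiceImpl target) {
            this.target = target;
        }

        public long fetchPrice(String id) {
            // Interception happens only for calls entering through the proxy.
            return cache.computeIfAbsent(id, target::fetchPrice);
        }

        public long fetchPriceTwice(String id) {
            return target.fetchPriceTwice(id); // delegates; no caching inside
        }
    }

    public static void main(String[] args) {
        PriceServiceImpl target = new PriceServiceImpl();
        PriceService proxy = new CachingProxy(target);

        proxy.fetchPrice("sku-1");
        proxy.fetchPrice("sku-1");          // second call served from cache
        System.out.println(target.dbHits);  // 1: external calls are cached

        proxy.fetchPriceTwice("sku-1");
        System.out.println(target.dbHits);  // 3: both internal calls hit the DB
    }
}
```

The fix described in the post, passing pre-fetched data down the call chain, avoids the internal call entirely; other common remedies are splitting the cached method into its own bean or injecting the proxy into the class.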
More Relevant Posts
-
Keeping cache consistent with the database is one of the most practical challenges when building scalable systems with Java and Spring Boot. When designing high-performance applications using Spring Boot (with tools like Spring Cache, Redis, or Caffeine), choosing the right caching strategy directly impacts data consistency, latency, and reliability. Here are the most common approaches:

1) Cache Aside (Lazy Loading)
The application first checks the cache. If data is missing, it fetches from the database and updates the cache. On updates, the cache is invalidated.
➡️ In Spring Boot: commonly implemented using @Cacheable and @CacheEvict
➡️ Why it works: simple, flexible, and widely adopted in real-world systems

2) Write Through
Data is written to both the cache and database at the same time.
➡️ Ensures strong consistency between cache and DB
➡️ Trade-off: increased write latency due to dual writes

3) Write Behind (Write Back)
Data is written to the cache first and persisted to the database asynchronously.
➡️ Great for high-throughput systems
➡️ Risk: potential data loss if cache crashes before DB sync

4) TTL (Time-To-Live)
Each cache entry expires automatically after a defined duration.
➡️ Easy to implement using Redis TTL configuration
➡️ Trade-off: stale data may be served before expiration

Key takeaway: There is no one-size-fits-all strategy. In Spring Boot systems, the choice depends on your consistency requirements, traffic patterns, and failure tolerance. Often, a hybrid approach (Cache Aside + TTL) provides a good balance between performance and data freshness.

#SystemDesign #Java #SpringBoot #Caching #Redis #BackendDevelopment #Scalability #SoftwareEngineering #Microservices #PerformanceOptimization
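The Cache Aside + TTL hybrid from the takeaway can be sketched in a few lines of plain Java (all names hypothetical; in Spring Boot, @Cacheable/@CacheEvict plus a TTL-configured cache manager express the same pattern declaratively):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.LongSupplier;

// Plain-Java sketch of the Cache-Aside + TTL hybrid (hypothetical names).
public class CacheAside<K, V> {
    private record Entry<T>(T value, long expiresAt) {}

    private final Map<K, Entry<V>> cache = new ConcurrentHashMap<>();
    private final long ttlMillis;
    private final LongSupplier clock; // injectable so tests can control time

    public CacheAside(long ttlMillis, LongSupplier clock) {
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    // Read path: check the cache first; on a miss (or an expired entry),
    // load from the database and populate the cache.
    public V get(K key, Function<K, V> dbLoader) {
        Entry<V> e = cache.get(key);
        if (e != null && e.expiresAt() > clock.getAsLong()) {
            return e.value();                                  // fresh cache hit
        }
        V value = dbLoader.apply(key);                         // miss: hit the DB
        cache.put(key, new Entry<>(value, clock.getAsLong() + ttlMillis));
        return value;
    }

    // Write path: after updating the DB, invalidate (the @CacheEvict analogue).
    public void invalidate(K key) {
        cache.remove(key);
    }
}
```

Write Through and Write Behind differ only in where the database write happens relative to the cache write: synchronously alongside it, or later on a background queue.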
-
🚀 Excited to share that JsonApi4j 1.5.0 is now live! 👉 https://lnkd.in/eSgd4sbs

This release adds built-in caching to the Compound Documents Resolver — fewer downstream calls, faster responses for repeated include queries.

How it works:
→ The resolver caches individual resources by resource type + id
→ On the next request, only cache misses trigger HTTP calls
→ Cache entries respect standard HTTP Cache-Control headers (max-age, no-store, etc.)
→ The final compound document response carries an aggregated Cache-Control header — the most restrictive directive across all included resources

An in-memory cache implementation is enabled by default — zero configuration needed. For distributed deployments, you can plug in your own implementation via the SPI, for example one backed by Redis.

JsonApi4j is an open-source framework for building APIs aligned with the JSON:API specification, with a strong focus on developer productivity and clean architecture. If you're looking for a structured and flexible way to expose JSON:API endpoints — give it a try. Feedback and contributions are always welcome! 🙌

#java #jsonapi #opensource #api #caching
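The "most restrictive directive" aggregation might look roughly like this (an illustrative sketch, not JsonApi4j's actual implementation; it understands only no-store and max-age):

```java
import java.util.List;

// Illustrative sketch of "most restrictive directive wins" aggregation.
// NOT JsonApi4j's actual code; it only understands no-store and max-age.
public class CacheControlAggregator {

    public static String aggregate(List<String> cacheControlHeaders) {
        long minMaxAge = Long.MAX_VALUE;
        for (String header : cacheControlHeaders) {
            for (String directive : header.toLowerCase().split(",")) {
                directive = directive.trim();
                if (directive.equals("no-store")) {
                    return "no-store"; // most restrictive: wins outright
                }
                if (directive.startsWith("max-age=")) {
                    // keep the smallest max-age across all included resources
                    minMaxAge = Math.min(minMaxAge,
                            Long.parseLong(directive.substring("max-age=".length())));
                }
            }
        }
        // No recognisable freshness information: be conservative.
        return minMaxAge == Long.MAX_VALUE ? "no-store" : "max-age=" + minMaxAge;
    }
}
```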
-
🚀 Real-Time Notification Engine: Event-Driven with Kafka & WebSockets

Real-time visibility is no longer a "nice-to-have"—it’s a modern requirement for distributed systems.

🏗️ I just finished building an event-driven notification engine for my Inventory Management System using the latest Java/Spring ecosystem. The magic happens when an order status hits "DELIVERED":

The Trigger: The Order Service persists the change and a Kafka producer fires an asynchronous event. No waiting, no blocking.
The Processing: A dedicated Notification Service consumes the message. Because it's asynchronous, the main order flow stays fast and responsive. ⚡
The Delivery: Using STOMP over WebSockets, the update is pushed live to the user's dashboard instantly. No "refresh" button needed. 🔔

The Architectural Win: With an event-driven approach, the Order Service is completely decoupled from the notification logic (SMS, email, etc.). We maintain a lean core while the notification system scales horizontally without breaking a sweat.

Tech Stack:
Language: Java 25 ☕
Framework: Spring Boot 4.x
Messaging: Apache Kafka (event-driven)
Real-time: Spring WebSocket + STOMP
Data Integrity: Optimistic locking (preventing overselling)
Caching: Redis

How are you handling real-time updates in your systems? Are you sticking with WebSockets, or exploring Server-Sent Events (SSE)? Let's discuss in the comments! 👇

#Java25 #SpringBoot #Kafka #WebSockets #BackendEngineering #SystemDesign #EventDriven #Architecture
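The decoupling described above can be illustrated with a toy in-process event bus (hypothetical names; in the real system, Kafka makes the publish asynchronous and cross-process):

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

// Toy in-process event bus (hypothetical names). The point of the pattern:
// the Order Service only publishes; it never knows who is listening.
public class OrderEvents {
    private final List<Consumer<String>> listeners = new CopyOnWriteArrayList<>();

    // Notification Service, WebSocket pusher, email sender... each subscribes.
    public void subscribe(Consumer<String> listener) {
        listeners.add(listener);
    }

    // Called by the Order Service after persisting the status change.
    // With Kafka this publish is asynchronous and crosses process boundaries.
    public void publish(String orderStatus) {
        for (Consumer<String> l : listeners) {
            l.accept(orderStatus);
        }
    }
}
```

Adding SMS or push notifications later means adding a subscriber, with zero changes to the Order Service.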
-
🚀 Built a production-grade Agentic Search Service from scratch using Spring Boot 3 + LangChain4j

What started as a simple CRUD API evolved into an intelligent search system that decides HOW to search based on what you ask.

𝗪𝗵𝗮𝘁 𝗶𝘀 𝗔𝗴𝗲𝗻𝘁𝗶𝗰 𝗦𝗲𝗮𝗿𝗰𝗵?
Instead of always running the same query, the system classifies your intent first — then picks the right strategy automatically.
"laptop" → keyword search
"something portable for work" → semantic vector search
"laptops under 500 with 16GB" → LLM extracts filters → structured query
"good stuff" → asks for clarification

𝗧𝗲𝗰𝗵 𝗦𝘁𝗮𝗰𝗸
→ Spring Boot 3 + Java 17
→ LangChain4j + Groq (llama-3.3-70b) for intent classification
→ AllMiniLmL6V2 local embedding model (zero API cost)
→ pgvector on PostgreSQL for semantic similarity search
→ Redis for distributed caching
→ Apache Kafka for async write pipeline
→ HikariCP with primary/replica DB routing
→ Docker Compose for local infrastructure

𝗣𝗲𝗿𝗳𝗼𝗿𝗺𝗮𝗻𝗰𝗲 𝗙𝗲𝗮𝘁𝘂𝗿𝗲𝘀
→ @Transactional(readOnly=true) routes reads to the replica automatically via LazyConnectionDataSourceProxy
→ Redis cache with a toggle flag — on/off without code changes
→ Kafka async writes with 202 Accepted — DB pressure decoupled from API latency
→ Paginated reads with configurable sort
→ Input validation with field-level 400 error responses

𝗞𝗲𝘆 𝗗𝗲𝘀𝗶𝗴𝗻 𝗗𝗲𝗰𝗶𝘀𝗶𝗼𝗻𝘀
→ LazyConnectionDataSourceProxy — without it, read/write routing silently breaks
→ AOP proxy ordering — @Transactional must wrap before @Cacheable fires
→ Embeddings generated at write time, not search time — no embedding cost on the query path
→ Kafka/cache toggleable via properties — same codebase, different behaviour per environment

𝗪𝗵𝗮𝘁 𝗜 𝗟𝗲𝗮𝗿𝗻𝗲𝗱
Building this end-to-end showed me that the gap between a working API and a production-ready service is filled with decisions most tutorials skip — connection pool tuning, proxy ordering, embedding lifecycle, broker networking in Docker. The agentic layer on top made it clear how LangChain4j's AiServices turns an LLM into a typed Java method — no boilerplate, no JSON parsing, just an interface and annotations.

#Java #SpringBoot #LangChain4j #AI #Kafka #Redis #PostgreSQL #pgvector #SystemDesign #BackendEngineering
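The routing idea, classify first, then dispatch, boils down to a strategy map. A minimal sketch with a stubbed classifier standing in for the LLM (all names hypothetical):

```java
import java.util.Map;
import java.util.function.Function;

// Sketch of intent-first routing (all names hypothetical). The classifier is
// injected, so a stub can stand in for the LLM during tests.
public class AgenticRouter {
    private final Function<String, String> classifier;               // query -> intent
    private final Map<String, Function<String, String>> strategies;  // intent -> search

    public AgenticRouter(Function<String, String> classifier,
                         Map<String, Function<String, String>> strategies) {
        this.classifier = classifier;
        this.strategies = strategies;
    }

    public String search(String query) {
        String intent = classifier.apply(query);
        // Unknown intent: ask for clarification instead of guessing a strategy.
        return strategies.getOrDefault(intent, q -> "Could you clarify: " + q + "?")
                         .apply(query);
    }
}
```

Swapping the stub for an LLM-backed classifier changes nothing else in the routing code, which is the point of the strategy-map design.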
-
Java 11 standard support ends later this year. If your Flink jobs are still running on 1.x, you’re heading toward an unsupported runtime while your competitors are already running ML inference natively in SQL.

Francisco Morillo from #AWS just published the step-by-step migration guide for Flink 2.2 on Amazon Managed Service for Apache Flink. The upgrade is in-place. You don’t blow away your application — you update the runtime, point to a new JAR, and the service handles the rest. Auto-rollback kicks in automatically if binary incompatibilities are detected at startup.

The part most teams will get burned by: Kryo. The serializer upgraded from 2.24 to 5.6, which breaks state compatibility for POJOs using Java collections (HashMap, ArrayList, HashSet). If your app uses those and you’re relying on Kryo fallback, your upgrade “succeeds” — then you enter restart loops. Check your logs for "Class class <className> cannot be used as a POJO type" before you touch production.

The upside of getting through this: RocksDB 8.10.0 gives you measurably faster checkpoints and recovery. And ML_PREDICT + CREATE MODEL means you can call ML models directly from SQL — no separate inference layer to maintain.

What’s your current Flink version? Still on 1.x or already evaluating 2.x?

#ApacheFlink #Flink https://lnkd.in/e2gWJPwv
-
🔍 Since I have plenty of time, I was building a tool that indexes every hardcoded constant in your Java bytecode and config files — and diffs them across versions.

Ever had to answer "where is this SQL string used?" or "what changed between our last two releases?" across a large multi-module codebase? That's the problem I set out to solve.

Constant Tracker parses JVM class files, classifies constants by semantic type (SQL, URL, logging, file path, error message, annotation), and indexes everything into Solr for full-text search. The version diff feature is my favourite part: upload two finalized JARs and see exactly which constants were added, removed, or changed — per class, with full usage context.

Tech: Java 25 · Spring Boot 3 WebFlux · Solr 10 · Redis · Postgres · React 19 · Docker Compose
3-command quickstart with pre-seeded demo data included.

🔗 GitHub: https://lnkd.in/dzqxKGHP

#java #springboot #solr #postgres #redis #react #githubcopilot
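The semantic classification step can be approximated with simple string heuristics. A toy sketch (illustrative rules only, not Constant Tracker's actual ones, and covering just a subset of the categories):

```java
// Illustrative heuristic classifier for a subset of the constant categories.
// NOT Constant Tracker's actual rules; real classification needs usage context.
public class ConstantClassifier {

    public static String classify(String constant) {
        String s = constant.trim().toLowerCase();
        // SQL: starts with a DML keyword followed by a space.
        if (s.startsWith("select ") || s.startsWith("insert ")
                || s.startsWith("update ") || s.startsWith("delete ")) {
            return "SQL";
        }
        // URL: explicit scheme prefix.
        if (s.startsWith("http://") || s.startsWith("https://")) {
            return "URL";
        }
        // File path: one or more /segment parts, nothing else.
        if (s.matches("^(/[\\w.-]+)+/?$")) {
            return "FILE_PATH";
        }
        return "OTHER";
    }
}
```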
-
New blog post alert 🚨 "Serverless applications on AWS with Lambda using Java 25, API Gateway and DynamoDB – Part 5 Using SnapStart with full priming". In this article, we’ll introduce another Lambda SnapStart priming technique. I call it API Gateway Request Event priming (or full priming). We’ll then measure the Lambda performance by applying it and comparing the results with other already introduced approaches. The goal is to further improve the performance of our Lambda functions. If you like my content, please follow me on GitHub (github.com/Vadym79) and give my repositories like this https://lnkd.in/epud2eRf a star! Amazon Web Services (AWS) Oracle #Java #Serverless #AWS https://lnkd.in/egApAmbg
-
I used to think, like most people, that Redis is just for caching. But while working on backend systems, I got curious — what else can it actually do? That curiosity led me to explore how Redis can be used for something more practical… like rate limiting 🚀

So I built a Rate Limiter Service using Spring Boot + Redis.

🔧 What I built:
• Fixed-window rate limiting using Redis (INCR + EXPIRE)
• Interceptor-based request throttling for APIs
• Per-user request tracking
• Designed using the Strategy Pattern, so it’s easy to plug in other algorithms

💡 One thing I really enjoyed while building this: instead of locking myself into one approach, I kept the design flexible. I’m planning to extend it further by adding Sliding Window / Token Bucket algorithms next.

💡 Biggest takeaway: Redis isn’t just a cache — it’s incredibly powerful when you need fast, atomic operations for real-world problems.

🔗 Project: https://lnkd.in/gp9N3jEW
🌐 Portfolio: https://lnkd.in/g_epk7wt

Still learning and exploring — would love to hear your thoughts or suggestions 🙌

#Java #SpringBoot #Redis #BackendDevelopment #SystemDesign #LearningInPublic
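The fixed-window algorithm can be sketched in plain Java with an injectable clock (hypothetical names; the Redis version gets the same semantics atomically from INCR plus EXPIRE on a per-window key):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;

// Fixed-window limiter sketch with an injectable clock (hypothetical names).
// The in-memory map stands in for Redis: keying by (user, window) and counting
// is what INCR does, and EXPIRE reclaims counters for old windows.
public class FixedWindowRateLimiter {
    private final int limit;                 // max requests per window
    private final long windowMillis;         // window length
    private final LongSupplier clock;        // injectable for testing
    private final Map<String, long[]> counters = new ConcurrentHashMap<>(); // {windowStart, count}

    public FixedWindowRateLimiter(int limit, long windowMillis, LongSupplier clock) {
        this.limit = limit;
        this.windowMillis = windowMillis;
        this.clock = clock;
    }

    public synchronized boolean allow(String userId) {
        long now = clock.getAsLong();
        long windowStart = now - (now % windowMillis);   // same idea as a window-id key suffix
        long[] c = counters.compute(userId, (k, v) ->
                (v == null || v[0] != windowStart) ? new long[]{windowStart, 0} : v);
        if (c[1] >= limit) {
            return false;                                // over the limit for this window
        }
        c[1]++;                                          // Redis equivalent: INCR rate:<user>:<window>
        return true;
    }
}
```

The known trade-off of fixed windows is the boundary burst: up to 2x the limit can pass around a window edge, which is exactly what the planned Sliding Window and Token Bucket variants address.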
-
Why does a Spring Boot API become slow in production but work fine locally? This is a common issue many backend developers face.

Key reasons:

1️⃣ Database inefficiencies
• N+1 query problem in JPA
• Missing indexes
• Large join queries
• Fetching unnecessary fields
✅ Solution: Optimize queries, use pagination, and enable Hibernate batching.

2️⃣ Connection management
• Too many DB connections
• Frequent connection creation
✅ Solution: Use connection pooling (HikariCP) to reuse connections efficiently.

3️⃣ Caching not implemented
• Repeated DB calls for the same data
✅ Solution: Use caching (Redis / in-memory) to reduce database load.

4️⃣ Thread handling issues
• Improper thread pool configuration
• Blocking calls affecting throughput
✅ Solution: Tune the thread pool based on system resources and workload.

5️⃣ External dependencies
• Slow third-party APIs
• Network latency in production
✅ Solution: Use timeouts, retries, and circuit breakers.

6️⃣ Large payload handling
• Sending/receiving huge JSON responses
✅ Solution: Use pagination and limit response size.

📊 Tools that help identify bottlenecks:
• Spring Boot Actuator (metrics, health)
• JVisualVM (CPU, memory, threads)
• Prometheus + Grafana (monitoring dashboards)

Key takeaway: APIs are fast locally because data is small and the environment is simple. Production systems require proper optimization, monitoring, and scaling strategies.

#Java #SpringBoot #Microservices #BackendDevelopment #Performance #LearningJourney
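As a sketch of the external-dependencies point, here is a minimal retry helper in plain Java (hypothetical names; production code would normally use Resilience4j or Spring Retry, which add circuit breaking and richer backoff policies):

```java
import java.util.concurrent.Callable;

// Minimal retry-with-backoff helper (hypothetical names). A sketch only;
// real systems should prefer Resilience4j or Spring Retry.
public class Retry {

    public static <T> T withRetry(Callable<T> call, int maxAttempts, long backoffMillis) {
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return call.call();                              // success: return immediately
            } catch (Exception e) {
                last = e;                                        // remember the failure
                if (attempt < maxAttempts) {
                    try {
                        Thread.sleep(backoffMillis * attempt);   // linear backoff
                    } catch (InterruptedException ie) {
                        Thread.currentThread().interrupt();
                        break;                                   // give up if interrupted
                    }
                }
            }
        }
        throw new RuntimeException(maxAttempts + " attempts failed", last);
    }
}
```

Pair retries with a timeout on the underlying client so a slow third-party call fails fast instead of holding a request thread.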