Best Practices for API Development

Explore top LinkedIn content from expert professionals.

  • View profile for Puneet Patwari

Principal Software Engineer @Atlassian | Ex-Sr. Engineer @Microsoft | Sharing insights on SW Engineering, Career Growth & Interview Preparation

    67,559 followers

A candidate interviewing for a Senior Engineer role @ Meta was asked to design a rate limiter. Another candidate in Google's L5 loop got hit with the same question. I've been asked this three times across different companies.

    Rate-limiting questions look simple until you add one layer of complexity:
    – Add distributed rate limiting? Now you're dealing with race conditions and clock skew.
    – Add multiple rate-limit tiers? Welcome to priority queues and quota management.
    – Add per-user, per-IP, and per-API-key limits? Your Redis bill just exploded.

    Here's my personal checklist of 15 things you must get right when building rate limiters:

    1. Always rate-limit on the server, not the client → Client-side limits are useless; they're easily bypassed, so always enforce limits on your backend.
    2. Choose the right placement → For most web APIs, place the rate limiter at the API gateway or load balancer (the "edge") for global protection and minimal added latency.
    3. Identify users correctly → Use a combination of user ID, API key, and IP address. Apply stricter limits for anonymous/IP-only clients and higher limits for authenticated or premium users.
    4. Support multiple rule types → Allow per-user, per-IP, and per-endpoint limits. Make rules configurable, not hardcoded.
    5. Pick an algorithm that fits your needs → Know the pros and cons:
    – Fixed Window: easy, but suffers from burst issues at window boundaries.
    – Sliding Log: accurate, but memory-heavy.
    – Sliding Window Counter: good balance, small memory footprint.
    – Token Bucket: handles bursts and steady rates; an industry standard for distributed systems.
    6. Store rate-limit state in a fast, shared store → Use an in-memory cache like Redis or Memcached. Every gateway instance must read and write to this store so limits are enforced globally.
    7. Make every check atomic → Use atomic operations (e.g., Redis Lua scripts or MULTI/EXEC) to avoid race conditions and double-accepting requests. (See the sketch after this list.)
    8. Shard your cache for scale → Don't rely on a single Redis instance. Use Redis Cluster or consistent hashing to scale horizontally and handle millions of users/requests.
    9. Build in replication and failover → Each cache node should have replicas. If a primary fails, a replica takes over. This keeps the system available and fault-tolerant.
    10. Decide your failure mode → Fail-open (let all requests through if the cache is down) risks backend overload; fail-closed (block all requests) means user-facing downtime. For critical APIs, prefer fail-closed to protect the backend.
    11. Return proper status codes and headers → Use HTTP 429 for "Too Many Requests" and include headers like X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset, and Retry-After so clients know when to back off.
    12. Use connection pooling for cache access → Avoid reconnecting to Redis on every check; pool connections to minimize latency.

    Continued in Comments...
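    A minimal sketch of points 5–7 together: a token-bucket check executed as a single Redis Lua script, so the read-refill-decrement cycle is atomic across all gateway instances. This assumes the redis-py client; the key scheme, capacity, and refill rate are illustrative, not from the post.

    ```python
    import time
    import redis  # assumes the redis-py package

    # Token-bucket state lives in a Redis hash; the whole check runs as one
    # Lua script, so concurrent gateways cannot double-accept a request.
    TOKEN_BUCKET_LUA = """
    local key      = KEYS[1]
    local capacity = tonumber(ARGV[1])  -- max tokens (burst size)
    local rate     = tonumber(ARGV[2])  -- tokens refilled per second
    local now      = tonumber(ARGV[3])  -- caller clock; beware skew across hosts

    local state  = redis.call('HMGET', key, 'tokens', 'ts')
    local tokens = tonumber(state[1]) or capacity
    local ts     = tonumber(state[2]) or now

    -- Refill for the elapsed interval, capped at capacity.
    tokens = math.min(capacity, tokens + math.max(0, now - ts) * rate)

    local allowed = 0
    if tokens >= 1 then
        tokens = tokens - 1
        allowed = 1
    end

    redis.call('HMSET', key, 'tokens', tokens, 'ts', now)
    redis.call('EXPIRE', key, math.ceil(capacity / rate) * 2)  -- let idle keys die
    return allowed
    """

    r = redis.Redis()
    token_bucket = r.register_script(TOKEN_BUCKET_LUA)

    def allow_request(user_id: str, capacity: int = 10, rate: float = 5.0) -> bool:
        """Atomically take one token for this caller; True means admit."""
        key = f"ratelimit:{user_id}"  # illustrative key scheme
        return token_bucket(keys=[key], args=[capacity, rate, time.time()]) == 1
    ```

    Because the script runs server-side in a single step, it replaces the read-modify-write sequence that causes the race conditions point 7 warns about.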

  • View profile for Priyanka Vergadia

    #1 Visual Storyteller in Tech | VP Level Product & GTM | TED Speaker | Enterprise AI Adoption at Scale

    117,286 followers

    🛑 "429 Too Many Requests" isn't just an error code; it's a survival strategy for your distributed systems. Stop treating Rate Limiting as a simple counter. To prevent crashes, you need the right algorithm. This visual explains the patterns you need to know. 𝐇𝐨𝐰 𝐰𝐞 𝐜𝐨𝐮𝐧𝐭: 1️⃣ Token Bucket: User gets a "bucket" of tokens that refills at a constant rate. Great for bursty traffic. If a user has been idle, they accumulate tokens and can make a sudden burst of requests without being throttled immediately. Use Case: Social media feeds or messaging apps. 2️⃣ Leaky Bucket: Requests enter a queue and are processed at a constant, fixed rate. Acts as a traffic shaper. It smooths out spikes, protecting your database from write-heavy shockwaves. Use Case: Throttling network packets or writing to legacy systems. 3️⃣ Fixed Window: A simple counter resets at specific time boundaries (e.g., the top of the minute). Easiest to implement but suffers from the "boundary double-hit" issue (e.g., 100 requests at 12:00:59 and 100 more at 12:01:01). Use Case: Basic internal tools where precision isn't critical. 4️⃣ Sliding Window Log: Tracks the timestamp of every request. Solves the boundary issue completely. It’s highly accurate but expensive on memory (O(N) space complexity) because you store logs, not just a count. Use Case: High-precision, low-volume APIs. 5️⃣ Sliding Window Counter: The hybrid approach. Approximates the rate by weighing the count of the previous window and the current window. Low memory footprint, high accuracy. Use Case: Large-scale systems handling millions of RPS. 𝐖𝐡𝐞𝐫𝐞 𝐰𝐞 𝐞𝐧𝐟𝐨𝐫𝐜𝐞 6️⃣ Distributed Rate Limiting: Essential for microservices. You cannot rely on local memory; you need a centralized store (like Redis with Lua scripts) to maintain a global count across the cluster. 7️⃣ Fixed Window with Quota: Often distinct from technical throttling. This is business logic—hard caps over long periods (months/years). Use Case: Tiered billing plans (e.g., "Free Tier: 10k calls/month"). 8️⃣ Adaptive Rate Limiting: The "smart" limiter. It doesn't use static numbers but monitors system health (CPU, memory, latency). If the system struggles, it tightens the limits automatically. Use Case: Auto-scaling systems and disaster recovery. 𝐖𝐡𝐨 𝐰𝐞 𝐥𝐢𝐦𝐢𝐭 9️⃣ IP-Based Rate Limiting: The first line of defense. Limits based on the source IP to prevent botnets or DDoS attacks. Use Case: Public-facing unauthenticated APIs. 🔟 User/Tenant-Based Rate Limiting: Limits based on API Key or User ID. Ensures one heavy user doesn't degrade performance for others ("Noisy Neighbor" problem). Use Case: SaaS platforms and multi-tenant architectures. 💡 For most production systems, Sliding Window Counter combined with Distributed Limiting is the gold standard. It offers the best balance of memory efficiency and user fairness. #SystemDesign #SoftwareArchitecture #API #Microservices #DevOps #BackendEngineering #RateLimiting #CloudComputing

  • View profile for Kanaiya Katarmal

    Helping 45K+ Engineers with .NET | CTO | Software Architect | I Help Developers & Startups Turn Ideas into Scalable Software | Weekly .NET Tips

    45,586 followers

.NET 10 Clean Architecture + Vertical Slice Template

    I've built and open-sourced a modern .NET 10 REST API template designed with Clean Architecture + Vertical Slice Architecture + CQRS, structured the way scalable, maintainable systems should be built. It can easily save you 100+ hours of setup, refactoring, and architectural cleanup.

    𝗪𝗵𝘆 𝗧𝗵𝗶𝘀 𝗧𝗲𝗺𝗽𝗹𝗮𝘁𝗲?
    Most templates stop at CRUD. This one focuses on real-world architecture:
    - Clean Architecture (Domain → Application → Infrastructure → WebApi)
    - Vertical Slice Architecture (feature-first structure inside Application)
    - Clean CQRS separation (Commands & Queries)
    - Minimal APIs for high performance
    - FluentValidation pipeline (automatic validation)
    - Logging decorators (request/response tracing)
    - Audit interceptor (automatic CreatedOn/UpdatedOn tracking)
    - Global exception handling
    - Result<T> pattern for consistent error responses (sketched below)
    - Health checks (DB readiness & liveness)
    - PostgreSQL + EF Core 10
    - Scalar UI (modern OpenAPI experience)

    𝗛𝗼𝘄 𝗜𝘁'𝘀 𝗦𝘁𝗿𝘂𝗰𝘁𝘂𝗿𝗲𝗱
    Four clean projects with strict inward dependency:
    📁 Domain - Entities, Result<T>, Error types (zero dependencies)
    📁 Application - Vertical slices (Handler + Validator + Endpoint per feature), Pipelines, Abstractions
    📁 Infrastructure - EF Core, Repository, UnitOfWork, Audit Interceptor
    📁 WebApi - Thin host (Program.cs, Exception Handler, Health Checks)

    𝗪𝗮𝗻𝘁 𝘁𝗵𝗶𝘀 𝗽𝗿𝗼𝗱𝘂𝗰𝘁𝗶𝗼𝗻-𝗿𝗲𝗮𝗱𝘆 .𝗡𝗘𝗧 𝘁𝗲𝗺𝗽𝗹𝗮𝘁𝗲?
    👉 The GitHub repository link is available in the comments
    👉 If you find it useful, please star ⭐ the repo
    👉 Share this post with your network 🙌
    After you try it, message me your thoughts; I actively improve it based on real-world feedback.
    💾 Save this for later & repost if this helped
    👤 Follow Kanaiya Katarmal + turn on notifications.
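    The Result<T> item is the one piece of the template compact enough to show here. A language-agnostic sketch of the idea in Python (the template itself is C#); the type and error names are illustrative, not the repo's actual API.

    ```python
    from dataclasses import dataclass
    from typing import Generic, Optional, TypeVar

    T = TypeVar("T")

    @dataclass(frozen=True)
    class Error:
        code: str
        message: str

    @dataclass(frozen=True)
    class Result(Generic[T]):
        """Success-or-error value: handlers return this instead of throwing,
        so every endpoint maps failures to responses the same way."""
        value: Optional[T] = None
        error: Optional[Error] = None

        @property
        def is_success(self) -> bool:
            return self.error is None

    def get_user(user_id: int) -> Result[dict]:
        if user_id <= 0:
            return Result(error=Error("user.invalid_id", "Id must be positive"))
        return Result(value={"id": user_id, "name": "Ada"})  # illustrative data
    ```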

  • View profile for Rocky Bhatia

    400K+ Engineers | Architect @ Adobe | GenAI & Systems at Scale

    214,752 followers

Your API works perfectly - until someone hammers it with 10,000 requests in a second. Rate limiting is what stands between a stable system and a full outage. But not all rate-limiting algorithms are equal 👇

    1. Fixed Window Counter
    Counts requests in a fixed time window and resets after each interval. Simple to implement but burst-prone at window boundaries.

    2. Sliding Window Log
    Stores each request timestamp and removes expired entries. Accurate limiting but memory-heavy at scale.

    3. Sliding Window Counter
    Combines current and previous window counts to smooth traffic. Lower memory usage, better burst protection than fixed windows.

    4. Token Bucket
    Adds tokens at a fixed rate; requests consume tokens. Supports controlled bursts while maintaining average rate limits. Most widely used.

    5. Leaky Bucket
    Processes requests at a fixed outflow rate. Smooths bursts by queuing or dropping excess traffic. Predictable but less flexible.

    6. Concurrency Limiter
    Limits how many requests run simultaneously - not per time window (see the sketch after this post). Essential for protecting downstream services from overload.

    How to choose:
    → Need simplicity? Fixed Window
    → Need accuracy? Sliding Window Log
    → Need balance? Sliding Window Counter
    → Need burst tolerance? Token Bucket
    → Need smooth throughput? Leaky Bucket
    → Protecting a slow backend? Concurrency Limiter

    Most production systems combine 2–3 of these at different layers - gateway, service, and database. One algorithm rarely covers all your attack surfaces. Which one does your system rely on? 👇
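    The concurrency limiter (#6) is the odd one out, capping in-flight work rather than request rate, and a semaphore captures it in a few lines. A minimal sketch; names, limits, and the backend stub are illustrative.

    ```python
    import threading
    from contextlib import contextmanager

    class ConcurrencyLimiter:
        """Caps simultaneous in-flight requests, not requests per window."""

        def __init__(self, max_in_flight: int):
            self._slots = threading.BoundedSemaphore(max_in_flight)

        @contextmanager
        def slot(self):
            # Non-blocking acquire: shed load immediately when full.
            acquired = self._slots.acquire(blocking=False)
            try:
                yield acquired
            finally:
                if acquired:
                    self._slots.release()

    def call_slow_backend() -> str:
        return "ok"  # stand-in for the protected downstream call

    limiter = ConcurrencyLimiter(max_in_flight=50)

    def handle_request():
        with limiter.slot() as ok:
            if not ok:
                return 429, "too many concurrent requests"  # fast rejection
            return 200, call_slow_backend()

    print(handle_request())  # (200, 'ok') while fewer than 50 are in flight
    ```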

  • View profile for Akash Kumar

    Java Developer💻 | APIs | Spring Boot | Spring Framework

    1,694 followers

🚀 Java (Spring Boot) – Clean Project Structure Matters!

    Over the years of working with Spring Boot and microservices, one lesson stands out:
    🧱 A clean, well-structured project is not just nice-to-have; it's essential.

    Without structure, things get messy fast. Bugs multiply, onboarding becomes painful, and scaling becomes a nightmare. Here's what I always keep in mind:

    🧠 Best Practices for Clean Architecture:
    ✅ Use DTOs to isolate your API layer from business logic (see the sketch after this post)
    ✅ Never connect directly to another service's database; use REST APIs or message brokers
    ✅ Keep services small, modular, and single-responsibility
    ✅ Use Spring Cloud Config or Vault to manage configuration
    ✅ Document your APIs with Swagger, Postman, or OpenAPI
    ✅ Always write integration + unit tests to ensure system reliability

    🧰 Clean structure = maintainable code
    📈 Maintainable code = faster delivery and easier growth

    What are your go-to practices for organizing a Spring Boot project? Let's share and grow together. 💬👇

    #SpringBoot #JavaDeveloper #Microservices #SoftwareArchitecture #CleanCode #BackendDevelopment #SpringFramework #API #JavaTips #DevBestPractices #CodingStandards #TechLeadership #AkashCodes #CleanArchitecture
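    The first practice on the list is the easiest to show in code. The idea is stack-agnostic, so here is a minimal Python sketch of it (a Spring Boot version would use separate DTO classes and a mapper the same way); field names are illustrative.

    ```python
    from dataclasses import dataclass

    @dataclass
    class User:
        """Internal domain model: free to change, never serialized directly."""
        id: int
        email: str
        password_hash: str

    @dataclass
    class UserDto:
        """API contract: stable shape, exposes only what clients need."""
        id: int
        email: str

    def to_dto(user: User) -> UserDto:
        # The mapping layer is the boundary: password_hash never crosses it.
        return UserDto(id=user.id, email=user.email)

    print(to_dto(User(1, "ada@example.com", "s3cr3t-hash")))
    ```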

  • View profile for Sameer Bhardwaj

    Co-founder @Layrs | Ex Google

    49,820 followers

Imagine you're in a system design interview at Google for an L5 role, and the interviewer asks: "If 10M users hit your API at the same time and your rate limiter allows 1000 req/sec, what happens to the other 9.99M?"

    This is a classic overload-control + retry-amplification problem.

    Btw, if you're preparing for system design interviews, check out our AI Tutor: https://lnkd.in/gcWfR7jW You can:
    - voice chat about your questions in real time
    - get feedback in real time and improve with these sessions
    - learn concepts and practice HLD questions even if you're a complete beginner

    Here is how I would break it down.

    [1] Clarify what we actually need to build
    This is not just "return 429 when over the limit." It is:
    - protect the backend from overload
    - keep latency stable for the requests we do accept
    - avoid retry storms from rejected clients
    - give clients a fair chance to recover
    - degrade gracefully instead of turning 10M requests into 20M
    So the core problem is not only rate limiting. It is admission control plus controlled recovery behavior.

    [2] The other 9.99M cannot all get immediate retries
    If all rejected requests get a 429 and retry immediately, the limiter becomes part of the problem. A better model is:
    - accept up to the allowed rate
    - reject excess traffic quickly
    - return backoff hints like `Retry-After`
    - force clients and SDKs to use exponential backoff + jitter (see the sketch after this post)
    - optionally queue a small bounded overflow, but only if the business case justifies it
    The key idea is simple: do not turn rejection into amplification.

    [3] High-level flow
    A reasonable design would be:
    - clients hit edge load balancers / API gateway
    - each request first passes through a distributed rate limiter
    - accepted requests move to the backend
    - rejected requests get a fast 429 or a graceful degradation response
    - clients retry later using backoff, not instantly
    - an observability layer tracks rejection rate, retry rate, queue depth, and user impact
    The limiter is only one part. The client behavior matters just as much.

    [4] What should happen to the rejected traffic?
    This depends on the API. For example:
    - interactive read APIs: reject fast, retry later
    - write APIs: maybe accept into a bounded queue if loss is costly
    - idempotent operations: safer to retry
    - non-critical traffic: drop or degrade early
    - premium / internal traffic: separate priority buckets
    So the answer is not "all 9.99M get blocked." The answer is "different classes of traffic may be handled differently."

    [5] The tradeoffs interviewers care about
    This is where the answer gets interesting:
    - an immediate 429 is cheap, but dangerous if clients retry badly
    - queues smooth bursts, but can increase latency and memory pressure
    - a token bucket handles bursts better than a strict per-second counter
    - fairness matters so one tenant or region does not starve everyone else
    - backoff with jitter is critical to avoid synchronized retries
    - if the limiter itself fails, fail-open vs fail-closed depends on the API
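    The client half of section [2] fits in a short sketch: exponential backoff with full jitter that also honors `Retry-After`. This assumes the requests library and the delta-seconds form of the header; the URL, caps, and attempt counts are illustrative.

    ```python
    import random
    import time
    import requests  # assumes the requests package

    def call_with_backoff(url: str, max_attempts: int = 5,
                          base: float = 0.5, cap: float = 30.0):
        """Retry 429s with full-jitter exponential backoff."""
        for attempt in range(max_attempts):
            resp = requests.get(url)
            if resp.status_code != 429:
                return resp
            retry_after = resp.headers.get("Retry-After")
            if retry_after is not None:
                # Honor the server's hint (assuming delta-seconds, not a date).
                delay = float(retry_after)
            else:
                # Full jitter: uniform in [0, min(cap, base * 2^attempt)], so
                # rejected clients do not retry in synchronized waves.
                delay = random.uniform(0, min(cap, base * (2 ** attempt)))
            time.sleep(delay)
        return resp  # still throttled after max_attempts; caller decides
    ```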

  • View profile for Shaheen Aziz

    .NET Core | Web API | Microservices | EF Core | C# | SQL | Angular | TypeScript | JavaScript | HTML | CSS | Bootstrap | Git

    23,941 followers

Clean Architecture in .NET – Scalable & Maintainable Project Structure

    Over the past few months, I've been architecting enterprise-grade applications using Clean Architecture principles in .NET, and the impact has been incredible! 💥
    ✅ Scalability improved
    ✅ Code became modular & testable
    ✅ Development speed increased

    Here's the structure I follow to keep things clean, decoupled, and easy to maintain:

    📂 API / Presentation Layer
    🎯 Entry point for HTTP requests via Controllers
    🧭 Sends Commands/Queries to the Application layer
    🧩 Configures Dependency Injection in Program.cs / Startup.cs

    📂 Application Layer
    ⚙️ Pure application logic – no infrastructure dependencies
    📬 Implements CQRS using MediatR
    🔁 Handles DTOs, mapping, events & custom exceptions

    📂 Domain / Core Layer
    🏛️ Contains core business rules and domain models
    💼 Includes entities, interfaces, domain services
    🚫 No EF Core, no HTTP, no UI logic

    📂 Infrastructure Layer
    🗄️ Handles persistence, file system, email, external APIs
    🧱 Implements interfaces defined in the Domain layer (see the sketch after this post)
    🔌 Injected into the Application layer via DI

    🎯 Why it works: this structure enables clean, scalable, and testable applications, perfect for microservices and enterprise systems.

    #DotNetCore #CleanArchitecture #Microservices #ScalableCode #SoftwareEngineering #CSharp #CodeStructure #DevArchitecture #DomainDrivenDesign #MediatR #CQRS #FullStackDeveloper #MaintainableCode #EnterpriseApps #CleanCode #SOLIDPrinciples
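    The load-bearing rule here is the Infrastructure layer implementing interfaces the Domain layer declares. The post's stack is .NET, but the dependency direction is the same everywhere; a minimal Python sketch with illustrative names:

    ```python
    from abc import ABC, abstractmethod

    class OrderRepository(ABC):
        """Domain layer: declares the port, knows nothing about persistence."""
        @abstractmethod
        def save(self, order: dict) -> None: ...

    class SqlOrderRepository(OrderRepository):
        """Infrastructure layer: implements the domain's interface."""
        def save(self, order: dict) -> None:
            print(f"INSERT order {order['id']}")  # stand-in for EF Core / SQL

    class PlaceOrderHandler:
        """Application layer: depends only on the abstraction, wired via DI."""
        def __init__(self, repo: OrderRepository):
            self.repo = repo

        def handle(self, order: dict) -> None:
            self.repo.save(order)

    # The composition root (Program.cs in the .NET version) does the injection.
    PlaceOrderHandler(SqlOrderRepository()).handle({"id": 1})
    ```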

  • View profile for Mashhood Rastgar

    karachiwala.dev - Engineering and AI Leadership - Google Developer Expert for AI and Web

    9,014 followers

One of the most annoying things about using AI tools right now is hitting your limit in the middle of something important. Deep in a Claude Code session, momentum building, and then suddenly it stops. Limit hit. Frustrating every time.

    I've been using a tool by Kamran Ahmed for several weeks now that has genuinely helped me avoid this. It adds a lightweight statusline to Claude Code that shows your API usage limit in real time, both daily and weekly consumption.

    That visibility changes everything. Instead of getting blindsided, I can see exactly where I stand. If I'm burning through limits fast, I pace myself or choose a good stopping point before hitting the wall. It lets me manage my sessions intelligently rather than just hoping I don't run out mid-task.

    It also surfaces the current directory and git branch in the statusline: a small thing, but genuinely useful for situational awareness during long sessions. And, most important of all, whether thinking is enabled (is it ever disabled?) and the context window percentage.

    One-command install: `npx @kamranahmedse/claude-statusline`

    If you're on a metered Claude Code plan, this one's worth two minutes of your time. https://lnkd.in/dsXwxDMt

    What does your status line look like right now?

    #ClaudeCode #DeveloperTools #AIProductivity

  • View profile for Sujeeth Reddy P.

    Software Engineering

    7,915 followers

If I were just starting out with APIs, these are the 10 rules I'd follow. These best practices will help you create simple, clear, and consistent APIs that are easy to use and understand.

    1/ Keep It Simple
    ↳ Use clear, concise endpoints that describe resources.
    ↳ Avoid over-complicating; keep naming consistent and understandable.
    ↳ Example: `/books` for all books, `/books/{id}` for a specific book.

    2/ Use RESTful Design
    ↳ Use standard HTTP methods: GET, POST, PUT, DELETE.
    ↳ Name endpoints with nouns like `/users` or `/orders` for clarity.
    ↳ Example: HTTP code 200 (success), 404 (not found), 500 (server error).

    3/ Choose Standard Data Formats
    ↳ Use JSON as it's readable and widely supported.
    ↳ Keep data formats consistent across endpoints.
    ↳ Example: `{ "title": "To Kill a Mockingbird", "author": "Harper Lee" }`.

    4/ Provide Clear Documentation
    ↳ Document endpoints with detailed descriptions.
    ↳ Provide request and response examples for easy usage.
    ↳ Example: Explain `/users/{id}` with request/response samples.

    5/ Implement Versioning
    ↳ Include versioning in the URL to manage changes.
    ↳ Allow for updates without breaking existing clients.
    ↳ Example: `/v1/books` for version 1, `/v2/books` for an updated version.

    6/ Ensure Security
    ↳ Use HTTPS for data encryption.
    ↳ Implement authentication and authorization mechanisms.
    ↳ Example: OAuth 2.0 to secure user access to APIs.

    7/ Handle Errors Gracefully
    ↳ Use standard HTTP status codes like 400, 404, and 500.
    ↳ Provide informative error messages to help resolve issues.
    ↳ Example: `400 Bad Request` for invalid input, with a detailed error message.

    8/ Optimize Performance
    ↳ Use caching to store frequent responses and speed up access.
    ↳ Apply rate limiting to control the number of requests a user can make.
    ↳ Example: Cache popular books; limit requests to prevent server overload (see the sketch after this list).

    9/ Test Thoroughly
    ↳ Conduct functionality, performance, and security testing.
    ↳ Ensure different user scenarios are tested for reliability.
    ↳ Example: Use automated tools for end-to-end testing before deployment.

    10/ Monitor and Update
    ↳ Monitor API performance and user activity continuously.
    ↳ Update the API to address bugs or add features regularly.
    ↳ Example: Use Prometheus to monitor latency and health.

    P.S. What would you add from your experience?
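    Rules 7 and 8 meet in a single response: a graceful 429 that tells the client exactly when to come back. A minimal Flask sketch; the X-RateLimit-* header names follow common convention, and `check_rate_limit` is a hypothetical stand-in for a real limiter.

    ```python
    from flask import Flask, jsonify

    app = Flask(__name__)
    LIMIT, WINDOW = 100, 60  # illustrative: 100 requests per minute

    def check_rate_limit():
        """Hypothetical stand-in: returns (remaining, seconds_until_reset)."""
        return 42, WINDOW

    @app.route("/books")
    def list_books():
        remaining, reset = check_rate_limit()
        if remaining < 0:
            resp = jsonify(error="rate_limited",
                           message="Too many requests; retry after the reset.")
            resp.status_code = 429          # rule 7: standard status code
            resp.headers["Retry-After"] = str(reset)
        else:
            resp = jsonify(books=[{"title": "To Kill a Mockingbird",
                                   "author": "Harper Lee"}])
        # Rule 8: headers that let well-behaved clients pace themselves.
        resp.headers["X-RateLimit-Limit"] = str(LIMIT)
        resp.headers["X-RateLimit-Remaining"] = str(max(remaining, 0))
        return resp
    ```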
