Kubernetes Taints & Tolerations: Smart Scheduling Control

🚀 Kubernetes Taints & Tolerations: The Hidden Power Behind Smart Scheduling

Most engineers learn that Pods get scheduled on Nodes…
But very few understand how to control that behavior in production.

That’s where Taints & Tolerations come in 👇

🧠 The Core Idea

🔴 Taint (Node Level)
Repels Pods from being scheduled on the node
👉 “Don’t come here unless you’re allowed”

🟢 Toleration (Pod Level)
Allows specific Pods to be scheduled despite a matching taint
👉 “I have permission to run here”

⚙️ How It Works (Real Flow)

1️⃣ Pod is created
2️⃣ Scheduler checks Nodes
3️⃣ Node has a Taint?
❌ No → Pod can schedule
✅ Yes → Check toleration
    ❌ No toleration → Blocked
    ✅ Has toleration → Allowed

⚠️ Important:
👉 Toleration ≠ Guarantee
It only makes the node eligible, not selected.

🔥 Real Production Use Cases

✅ Dedicated Nodes
Run DB / GPU workloads on isolated nodes

✅ Node Maintenance
Stop new Pods from scheduling

✅ Failure Handling
Evict Pods automatically using NoExecute

✅ Security / Isolation
Ensure only specific workloads run on sensitive nodes

🚨 Taint Effects (Must Know)

NoSchedule → No new Pods without a matching toleration; Pods already running stay
PreferNoSchedule → Scheduler avoids the node if possible (soft rule)
NoExecute → No new Pods, and running Pods without a matching toleration are evicted

🎯 Pro Tip (Interview + Production)

👉 Combine Taints + Affinity for full control
Taints → Block unwanted Pods
Affinity → Attract desired Pods

⚠️ Common Mistakes

❌ Thinking toleration forces scheduling
✔️ It only removes the restriction

❌ Ignoring NoExecute
✔️ It can evict running Pods

💡 One-Line Summary

👉 Taints restrict nodes. Tolerations allow Pods to bypass those restrictions.

If you're working with Kubernetes in production, mastering this concept can:
✔ Improve resource isolation
✔ Increase cluster stability
✔ Optimize workload placement

#devops #kubernetes #cloudcomputing #docker #microservices #sre #platformengineering #cloudnative #devsecops #cicd #aws #linux #automation #infrastructureascode #observability #ai #aiops #softwareengineering #tech #engineering
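The scheduling flow described in this post can be sketched in a few lines of Python. This is a simplified model of the taint check only, not the scheduler's actual code: the class and function names (Taint, Toleration, tolerates, node_is_eligible) and the "dedicated=gpu" taint are made up for illustration, and the matching rules follow the documented behavior (an empty effect matches every effect, and an empty key with operator Exists matches every taint).

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Taint:
    key: str
    effect: str      # "NoSchedule", "PreferNoSchedule", or "NoExecute"
    value: str = ""


@dataclass(frozen=True)
class Toleration:
    key: str = ""            # empty key with "Exists" tolerates every taint
    operator: str = "Equal"  # "Equal" or "Exists"
    value: str = ""
    effect: str = ""         # empty effect matches every effect


def tolerates(toleration: Toleration, taint: Taint) -> bool:
    """Return True if this single toleration matches this single taint."""
    if toleration.effect and toleration.effect != taint.effect:
        return False
    if toleration.key and toleration.key != taint.key:
        return False
    if toleration.operator == "Exists":
        return True
    return toleration.operator == "Equal" and toleration.value == taint.value


def node_is_eligible(taints: List[Taint], tolerations: List[Toleration]) -> bool:
    """Hard filter: every NoSchedule / NoExecute taint must be tolerated."""
    for taint in taints:
        if taint.effect == "PreferNoSchedule":
            continue  # soft preference, never blocks scheduling
        if not any(tolerates(t, taint) for t in tolerations):
            return False
    return True


# Hypothetical dedicated GPU node, tainted dedicated=gpu:NoSchedule
gpu_node_taints = [Taint(key="dedicated", effect="NoSchedule", value="gpu")]
gpu_pod_tolerations = [
    Toleration(key="dedicated", operator="Equal", value="gpu", effect="NoSchedule")
]

print(node_is_eligible(gpu_node_taints, []))                   # False
print(node_is_eligible(gpu_node_taints, gpu_pod_tolerations))  # True
```

Note that node_is_eligible only answers "may this Pod land here?". It says nothing about whether the node is chosen, which is why a toleration is usually paired with node affinity when a workload has to end up on the dedicated nodes.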


More Relevant Posts

Shiv Jani
3w
Most people use Kubernetes.
Very few actually understand what’s happening under the hood.

Here’s a simple breakdown of what this architecture diagram is really showing 👇

At the center, you have the Control Plane, the brain of Kubernetes. This is where decisions are made.

• API Server → the entry point. Every request (kubectl, CI/CD, UI) goes through this.
• Scheduler → decides where your pods should run based on resources and constraints.
• Controller Manager → constantly checks “desired state vs actual state” and fixes gaps.
• etcd → the database. Stores the entire cluster state. If this is gone, your cluster memory is gone.

Then come the Worker Nodes, where real work happens. Each node contains:

• Kubelet → talks to control plane and ensures containers are running as expected
• Container Runtime → actually runs containers (Docker / containerd)
• Kube Proxy → handles networking and service communication

Now here’s the part beginners ignore:

Kubernetes is not about containers. It’s about desired state reconciliation.

You don’t tell Kubernetes how to run things. You tell it what you want, and it keeps trying until reality matches that.

That’s why:
• Pods restart automatically
• Scaling happens without manual intervention
• Failures don’t require panic

But here’s the uncomfortable truth:

If you don’t understand this flow, you’re just memorizing commands, not building systems.

And that’s exactly why most “Kubernetes learners” get stuck at tutorials.

Real skill = understanding:
Control Plane → Node → Pod → Networking → Self-healing loop

If this diagram finally makes sense to you, you’re no longer a beginner. You’re starting to think like a systems engineer.

#Kubernetes #DevOps #CloudComputing #Containers #SystemDesign #LearningInPublic
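To make "desired state reconciliation" concrete, here is a toy sketch of one pass of a control loop in Python. It is an illustration only, not how the Controller Manager is implemented: the reconcile function, the workload names, and the replica-count dictionaries are all assumptions made for this example.

```python
from typing import Dict, List


def reconcile(desired: Dict[str, int], actual: Dict[str, int]) -> List[str]:
    """Compare desired vs actual replica counts and list the corrective actions."""
    actions = []
    for name in sorted(set(desired) | set(actual)):
        want = desired.get(name, 0)
        have = actual.get(name, 0)
        if have < want:
            actions.append(f"start {want - have} pod(s) for {name}")
        elif have > want:
            actions.append(f"stop {have - want} pod(s) for {name}")
    return actions


# Hypothetical cluster: one "web" pod crashed, and "old-job" is no longer wanted.
desired_state = {"web": 3, "api": 2}
actual_state = {"web": 2, "api": 2, "old-job": 1}

for action in reconcile(desired_state, actual_state):
    print(action)
```

A real controller runs a comparison like this continuously against the state stored in etcd, which is why a crashed pod comes back without anyone issuing a command.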
Abhay Jha
1w
We’re obsessed with “all-in-one” platforms.

One tool to code, test, deploy, monitor, and scale. Sounds efficient.

In reality, it often creates systems that are hard to debug, hard to change, and impossible to trust under pressure. Because the more a tool tries to do, the less it does well.

Decades ago, Doug McIlroy introduced a different way of building systems, the Unix philosophy:
• Do one thing, and do it well
• Build small, composable tools
• Prefer plain-text interfaces

Now look at modern DevOps:
→ Docker containers run a single responsibility
→ Kubernetes decomposes systems into smaller units
→ CI/CD pipelines chain simple steps into complex workflows
→ Logs, YAML, and JSON keep everything observable and scriptable

This isn’t coincidence. It’s the same philosophy, just operating at scale.

Why this approach wins:
- Simplicity: Less surface area → faster debugging
- Composability: Systems evolve by combining stable parts
- Loose coupling: Failures don’t cascade
- Replaceability: Swap components without rewriting everything

But here’s the part people miss:

Modularity without discipline doesn’t create flexibility. It creates distributed chaos.

More services. More pipelines. More moving parts. And no clear ownership or boundaries.

The Unix philosophy was never about “many small things.” It was about well-defined responsibilities and clean interfaces. That’s the difference.

In a world chasing platforms that promise everything, the real advantage still belongs to engineers who keep systems simple, decoupled, and composable.

#DevOps #SRE #Unix #Engineering #Cloud #Kubernetes #SystemDesign
Sanket Deshpande ☁
1mo
🚀 Kubernetes Deep Dive: Understanding Jobs (Run Once, Finish Strong 💪)

After exploring Deployments, ReplicaSets, and DaemonSets…
I moved to something different in Kubernetes:
👉 Jobs (batch workloads)
And this changed my perspective completely.

🔧 What I implemented:
Created a simple job.yml using BusyBox:
• Runs a command → prints a message
• Sleeps for a few seconds
• Then exits

Applied using:
kubectl apply -f job.yml

📊 What I observed:
• Job started → Pod created
• Pod executed the task
• Status changed → Completed

Checked logs:
📝 "Keep Grinding Apna Time Ayega"
💥 And then… it stopped!

💡 What is a Job in Kubernetes?
A Job ensures:
👉 A task runs successfully to completion

Unlike Deployments:
❌ It does NOT keep running forever
✅ It runs → completes → exits

🧠 Key Configs I used:
• completions: 1 → the task must complete successfully once
• parallelism: 1 → run one pod at a time
• restartPolicy: Never → never restart the container in place; if the Pod fails, the Job creates a new Pod instead

🧠 Real-world use cases:
Jobs are perfect for one-time or batch tasks:
• Database backups
• Data processing
• Batch scripts
• CI/CD tasks
• Migrations

🧠 Big Realization:
• Deployment → Long-running apps
• DaemonSet → Node-level tasks
• Job → One-time execution

Kubernetes isn’t just for apps…
It handles every type of workload

📸 Attached:
• Job YAML configuration
• Job execution status
• Pod lifecycle (Running → Completed)
• Logs output from the job

Step by step… building real Kubernetes understanding 🚀

#Kubernetes #DevOps #CloudNative #BatchProcessing #SRE #LearningInPublic #PlatformEngineering
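The job.yml itself is only attached as an image, so the following is a guess at its shape rather than the author's file. It is a minimal Python sketch that builds an equivalent manifest and prints it as JSON, which kubectl apply -f also accepts. The helper name build_job, the Job name "demo-job", and the 5-second sleep are assumptions; the three key settings and the log message come from the post.

```python
import json
import shlex


def build_job(name: str, message: str, sleep_seconds: int = 5) -> dict:
    """Build a batch/v1 Job that prints a message, sleeps briefly, then exits."""
    command = f"echo {shlex.quote(message)} && sleep {sleep_seconds}"
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": name},
        "spec": {
            "completions": 1,   # the task must succeed once
            "parallelism": 1,   # one pod at a time
            "template": {
                "spec": {
                    "restartPolicy": "Never",  # failed pods are replaced, not restarted
                    "containers": [
                        {
                            "name": name,
                            "image": "busybox",
                            "command": ["sh", "-c", command],
                        }
                    ],
                }
            },
        },
    }


# "demo-job" is a placeholder name, not taken from the post.
job = build_job("demo-job", "Keep Grinding Apna Time Ayega")
print(json.dumps(job, indent=2))
```

Saving the output to a file such as job.json and running kubectl apply -f job.json should reproduce the lifecycle described above (Pod created, runs, Completed), assuming the cluster can pull the busybox image.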
Anupam Kumar
1w
🚀 Kubernetes Logging Cheat Sheet: What to Check, When & Why

Ever been stuck debugging a pod at the worst possible time?

I recently came across a super handy cheat sheet for Kubernetes logging, and honestly, it’s the kind of thing that can save your on-call shift.

Here’s a quick breakdown of how to think when things go wrong:

🔍 Pod stuck / CrashLoopBackOff / Pending? Check node + state
👉 kubectl get pods -o wide

📌 Need lifecycle events? Find out what actually killed the pod
👉 kubectl describe pod <pod>

⚠️ App misbehaving? Check current logs
👉 kubectl logs <pod>

⏪ App crashed earlier? Don’t forget previous logs (most people miss this!)
👉 kubectl logs -p <pod>

🌐 No clue what’s happening cluster-wide?
👉 kubectl get events --sort-by=.lastTimestamp

🔧 Network / DNS / env issues? Debug inside the container
👉 kubectl exec -it <pod> -- sh

🔗 Service not reachable?
👉 kubectl get endpoints <svc>

📊 CPU / Memory spikes?
👉 kubectl top pod

📄 Validate deployed config?
👉 kubectl get deploy <name> -o yaml

💡 Pro tip: Use describe for infra & logs for app-level debugging
And always remember: -p flag = 🔑 for crashed pods

This cheat sheet is simple, practical, and something every DevOps engineer should keep bookmarked.

💬 What’s your go-to debugging command in Kubernetes?

#DevOps #Kubernetes #Cloud #SRE #Debugging #TechTips
Soham Sarkar
4w
Why is a 10MB Container better than a 10GB Virtual Machine? 🐳🤔

Day 24 of #100DaysOfDevOps was all about the 'Why' behind the 'How.' While running containers is easy, explaining the underlying architecture and security is what defines a true DevOps Engineer.

Today, I dived deep into Docker Interview Preparation and the internal mechanics of the container ecosystem.

Key Learnings from Day 24:
✅ Architecture Deep-Dive: Analyzed the Client-Server model and how the Docker Daemon (dockerd) manages the entire lifecycle.
✅ Resource Efficiency: Understood why sharing the Host OS Kernel makes containers 100x more efficient than traditional VMs.
✅ Optimization & Security: Mastered the nuances of CMD vs ENTRYPOINT, and how Distroless images drastically reduce the attack surface.
✅ Real-World Challenges: Evaluated the 'Single Point of Failure' risks of the Docker Daemon and how Orchestration (Kubernetes) solves it.

Practical Lab Results:
Reviewed 12 core architectural questions that are fundamental for production-level deployments. From image scanning with Trivy to Multi-stage build logic, the focus was on building Secure, Tiny, and Scalable containers. 🛡️

DevOps isn't just about using tools; it's about understanding the infrastructure they run on!

Check out my full technical breakdown and Q&A on GitHub (link in comment).

#DevOps #Docker #CloudComputing #Containerization #AWS #100DaysOfCode #Infrastructure #SRE #TechLearning #Security
Narottam Kumar Saw
2w
🚀 Pushing code is easy... deployment is where reality hits

Every developer at some point:
👉 “It works on my machine!”
👉 “I just pushed the code… why isn’t it live?”

And then… Kubernetes enters the chat 🐙

💥 What Kubernetes actually does behind the scenes:
✔️ Validates your YAML (no shortcuts 😅)
✔️ Checks RBAC permissions 🔐
✔️ Pulls container images 📦
✔️ Schedules pods on nodes 🧠
✔️ Attaches volumes & secrets 🔑
✔️ Sets up networking 🌐
✔️ Runs health probes ❤️
✔️ Handles restarts & failures 🔁
✔️ Ensures desired state = actual state ⚖️

😄 Reality Check:
Kubernetes doesn’t make things harder… It exposes what was always missing in your system.
👉 Proper configuration
👉 Fault tolerance
👉 Observability
👉 Scalability
👉 Reliability

⚡ The Hard Truth:
💡 “It works on my machine” ❌ is NOT a deployment strategy
💡 “It’s running in production reliably” ✅ THAT is engineering

🎯 Lesson for DevOps & Cloud Engineers:
👉 Learn beyond just kubectl commands
👉 Understand how systems behave under failure
👉 Master debugging, networking, and observability

Because real engineers don’t just deploy… They make systems survive production 🚀

💬 Be honest, how many times have you said
👉 “Why isn’t it live yet?” 😄

#Kubernetes #DevOps #CloudComputing #SRE #Docker #Containers #Microservices #PlatformEngineering #CICD #InfrastructureAsCode #Terraform #Azure #AWS #GCP #Monitoring #Observability #Prometheus #Grafana #ELK #TechHumor #ProgrammingLife #Developers #SoftwareEngineering #DistributedSystems #Networking #RBAC #YAML #CloudNative #Automation #Scalability #Reliability #ProductionReady #Debugging #LearningInPublic
Ayodeji Saberedowo
3w
Day 63 - Deploying a Two-tier Application on K8s #100DaysOfDevOps 🧑‍💻

For Day 63 of my #100DaysOfDevOps journey, I deployed a two-tier Iron Gallery Application on Kubernetes, focusing on clean architecture and production-aligned practices.

I provisioned a dedicated namespace for isolation, deployed both the frontend (Iron Gallery) and backend (MariaDB) using well-structured Deployments, and configured resource limits to simulate real-world workload control. I also implemented volume mounts with "emptyDir" for ephemeral storage and exposed services using ClusterIP for internal communication and NodePort for external access.

While this setup is simplified, it closely mirrors production patterns where you'd typically extend with PersistentVolumes, Secrets, and Ingress for durability and security. The goal here wasn’t just to “make it work,” but to align with how real systems are deployed and managed in production environments: clear labeling, proper selectors, and scalable service exposure.

All manifests and step-by-step documentation are available here: https://lnkd.in/eNQpu83F

Looking forward to building on this foundation and pushing deeper into more advanced Kubernetes patterns. 💪

#Kubernetes #DevOps #CloudEngineering #K8s #Docker #InfrastructureAsCode #PlatformEngineering #SRE #TechCareers

100DaysOfDevOps-KodeKloud/Day-63-Deploy-Iron-Gallery-App-on-Kubernetes/README.md at main · Sabhayor/100DaysOfDevOps-KodeKloud github.com
Swapnil Phapale
3w Edited
📅 Day 2 of 30 | 💡 Topic: Kubernetes Architecture (Simplified) | 🚀 From Zero to Kubernetes 🚀

Most engineers run Kubernetes…
But very few understand what happens behind the scenes 👇

Kubernetes has 2 main parts:

🧠 Control Plane (Brain)
• API Server → Entry point
• Scheduler → Assigns Pods to Nodes
• Controller Manager → Maintains desired state
• etcd → Stores cluster data

⚙️ Worker Nodes (Execution)
• Kubelet → Runs Pods
• Container Runtime → Runs containers
• Kube Proxy → Handles networking

🔄 Flow:
kubectl → API Server → Scheduler → Node → Kubelet → Pod

🎯 Why this matters?
✔ Debug issues faster
✔ Understand scheduling
✔ Crack CKAD exam

🧠 CKAD Practice Question:
👉 You create a Pod using kubectl apply, but it stays in Pending state.
❓ Which Kubernetes component is responsible for assigning the Pod to a Node?
A. Kubelet
B. Scheduler
C. Controller Manager
D. Kube Proxy
👇 Drop your answer in comments

📚 Official Reference (for deeper understanding): https://lnkd.in/dJh6kZRa

📅 Day 3: Install Kubernetes using Kubeadm (Production Style) 🚀

Follow for more Kubernetes | CKAD | DevOps | SRE content

#Kubernetes #CKAD #DevOps #SRE #Cloud #Docker #LearningInPublic #K8s #PlatformEngineering
Hassan Elhabbal
5d
Most engineers memorize Kubernetes definitions...
But few understand how to weave them together into a secure production environment.

Happy to share what I learned in the Digilians DevOps Intensive Program! We took a deep dive into the core building blocks of Kubernetes to see how the pieces actually fit together.

Here is my breakdown of the essentials:

1️⃣ Workloads (The Engine)
✔ Deployments: For stateless apps like your frontend or API.
✔ StatefulSets: For databases that need ordered pods and persistent storage.
✔ DaemonSets: To ensure a specific pod (like a log collector) runs on every single node.

2️⃣ Networking (The Traffic Cops)
✔ Services (ClusterIP): For internal, reliable routing between your microservices.
✔ Ingress: The external HTTP entry point routing outside traffic to the right internal service.

3️⃣ Security & Scaling (The Guards & Muscle)
✔ NetworkPolicy: To enforce strict traffic rules between pods.
✔ RBAC: To assign least-privilege access using Roles and ServiceAccounts.

Let me simplify it another way 👇

If your Kubernetes cluster is a gated city:
Nodes = The plots of land.
Ingress = The city's main highway entrance.
Services = The internal roads connecting neighborhoods.
NetworkPolicy = The security checkpoints ensuring only authorized traffic can travel between specific roads.

The right engineer isn't just someone who can write a Deployment. The right engineer understands how to secure and scale it.

Top tip: Don't leave your Pods open to the world. Always implement a default deny-all NetworkPolicy and explicitly allow only the traffic you need.

#Kubernetes #DevOps #Digilians #K8s #TechTips #CloudNative
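A default deny-all NetworkPolicy is short enough to show in full. Below is a minimal Python sketch that builds one and prints it as JSON (kubectl apply -f accepts JSON as well as YAML). The helper name default_deny_all, the policy name, and the "prod" namespace are placeholders chosen for this example, not something from the program described above.

```python
import json


def default_deny_all(namespace: str) -> dict:
    """Build a NetworkPolicy that selects every pod in the namespace and
    allows no ingress or egress traffic."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "default-deny-all", "namespace": namespace},
        "spec": {
            "podSelector": {},                    # empty selector = all pods
            "policyTypes": ["Ingress", "Egress"], # no rules listed = nothing allowed
        },
    }


# "prod" is a hypothetical namespace used only for this example.
print(json.dumps(default_deny_all("prod"), indent=2))
```

Two caveats: the policy is only enforced if the cluster's network plugin supports NetworkPolicy, and denying egress also blocks DNS lookups, so an explicit allow rule for DNS is usually the first policy added on top of this one.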