Post 19: Real-Time Cloud & DevOps Scenario

Scenario: Your organization’s Kubernetes-based microservices faced a production outage because a misconfigured pod overused CPU and memory, starving other workloads of resources. As a DevOps engineer, your task is to prevent such issues and maintain system stability.

Step-by-Step Solution:

Set Resource Requests and Limits: Define resources.requests and resources.limits in pod specifications to control CPU and memory usage. Example:

```yaml
resources:
  requests:
    memory: "500Mi"
    cpu: "250m"
  limits:
    memory: "1Gi"
    cpu: "500m"
```

Enable Namespace Resource Quotas: Use ResourceQuota objects to restrict the total resource consumption within a namespace. Example:

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: namespace-quota
spec:
  hard:
    requests.cpu: "4"
    requests.memory: "8Gi"
    limits.cpu: "8"
    limits.memory: "16Gi"
```

Leverage the Horizontal Pod Autoscaler (HPA): Use HPA to scale pods dynamically based on CPU, memory, or custom metrics. Example:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: example-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 80
```

Implement Pod Priority and Preemption: Assign priority classes to pods to ensure critical workloads get resources during contention. Example:

```yaml
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: high-priority
value: 1000
globalDefault: false
description: "Priority for critical workloads"
```

Monitor and Analyze Resource Usage: Use tools like Prometheus, Grafana, or the Kubernetes Metrics Server to monitor CPU and memory usage trends, and set up alerts for resource usage thresholds.

Implement Node Affinity and Taints: Use node affinity and taints/tolerations to distribute workloads effectively across nodes, avoiding resource bottlenecks.
Audit Configurations Regularly: Periodically review and update resource configurations for pods and namespaces, and conduct load tests to validate performance under different conditions.

Enable the Cluster Autoscaler: Use the Cluster Autoscaler to add or remove nodes dynamically based on overall resource demand. This ensures sufficient capacity during peak loads.

Outcome: Improved resource allocation prevents a single pod failure from impacting other services. The system becomes more resilient and scales dynamically based on demand.

💬 How do you handle resource contention in your Kubernetes clusters? Let’s discuss strategies in the comments!

✅ Follow Thiruppathi Ayyavoo for daily real-time scenarios in Cloud and DevOps. Together, we learn and grow!

#DevOps #Kubernetes #CloudComputing #ResourceManagement #Containers #HorizontalPodAutoscaler #RealTimeScenarios #CloudEngineering #LinkedInLearning #careerbytecode #thirucloud #linkedin #USA

CareerByteCode
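The node affinity and taints/tolerations step above can be sketched in a single pod spec. This is a minimal illustration, not from the original post: the taint `dedicated=critical:NoSchedule` and the node label `node-pool=critical` are hypothetical names you would replace with your own.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: critical-app
spec:
  # Tolerate a taint applied with, e.g.:
  #   kubectl taint nodes node-1 dedicated=critical:NoSchedule
  tolerations:
    - key: "dedicated"
      operator: "Equal"
      value: "critical"
      effect: "NoSchedule"
  # Require nodes labeled for this workload class
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: "node-pool"
                operator: In
                values: ["critical"]
  containers:
    - name: app
      image: nginx:1.25
```

Together, the taint keeps ordinary pods off the dedicated nodes, and the affinity keeps this pod on them.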
Solutions for High Resource Usage in Kubernetes
Explore top LinkedIn content from expert professionals.
Summary
Solutions for high resource usage in Kubernetes involve managing how computing power and memory are shared among different applications to prevent outages and keep systems running smoothly. By setting rules and using built-in tools, organizations can avoid resource overload and reduce cloud costs.
- Set resource boundaries: Define how much CPU and memory each application can use so one app doesn't overwhelm the system.
- Scale automatically: Use autoscaling features to add or remove resources based on real-time demand, keeping workloads balanced and responsive.
- Rebalance workloads: Employ tools like the descheduler to shift applications away from overloaded servers, making the most of available hardware.
5 Kubernetes features you’re NOT using (but SHOULD BE)

Everyone loves to flex their kubectl skills. But most teams barely scratch the surface of what Kubernetes actually offers. Let’s change that. Here are 5 𝐡𝐢𝐝𝐝𝐞𝐧 𝐠𝐞𝐦𝐬 that silently reduce your downtime and cloud bills:

1. 𝐑𝐞𝐬𝐨𝐮𝐫𝐜𝐞𝐐𝐮𝐨𝐭𝐚𝐬 + 𝐋𝐢𝐦𝐢𝐭𝐑𝐚𝐧𝐠𝐞𝐬
Think you’re controlling resource sprawl? Without these, your devs can (and will) spin up a pod that eats 32GB RAM… for a cron job.
→ Start enforcing project-level limits before finance hunts you down.

2. 𝐏𝐨𝐝𝐃𝐢𝐬𝐫𝐮𝐩𝐭𝐢𝐨𝐧𝐁𝐮𝐝𝐠𝐞𝐭𝐬 (𝐏𝐃𝐁𝐬)
If you’re still seeing “all pods drained” during node upgrades, that’s not bad luck.
→ It’s a missing PDB. Control voluntary disruptions and keep your services alive during maintenance.

3. 𝐕𝐞𝐫𝐭𝐢𝐜𝐚𝐥 𝐏𝐨𝐝 𝐀𝐮𝐭𝐨𝐬𝐜𝐚𝐥𝐞𝐫 (𝐕𝐏𝐀)
HPA gets all the love, but VPA quietly optimizes pod sizes based on real usage.
→ Stop hard-coding resource requests. Let VPA handle right-sizing automatically.

4. 𝐍𝐞𝐭𝐰𝐨𝐫𝐤𝐏𝐨𝐥𝐢𝐜𝐢𝐞𝐬
If you don’t know exactly who can talk to whom inside your cluster… congrats, you’ve built a perfect environment for lateral movement during a breach.
→ Block everything by default. Open up only what’s needed. Zero Trust starts inside the cluster.

5. 𝐄𝐩𝐡𝐞𝐦𝐞𝐫𝐚𝐥 𝐂𝐨𝐧𝐭𝐚𝐢𝐧𝐞𝐫𝐬
Stuck debugging a broken pod?
→ Use kubectl debug to inject a temporary container and troubleshoot without restarting anything. Most teams are still doing painful restarts and log hunts. You don’t have to.

Final Word: If you’re running Kubernetes and not using these, you’re paying extra (for no reason). Extra downtime. Extra cloud bills. Extra firefighting. I’ve seen million-dollar cloud bills with $0 in these defenses. Don’t let yours be next.

Which one are you guilty of not using?

♻️ 𝐑𝐄𝐏𝐎𝐒𝐓 𝐒𝐨 𝐎𝐭𝐡𝐞𝐫𝐬 𝐂𝐚𝐧 𝐋𝐞𝐚𝐫𝐧.
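To make point 2 concrete, a PodDisruptionBudget is only a few lines. This sketch assumes a workload whose pods carry the label `app: my-service` (a hypothetical name, not from the post):

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: my-service-pdb
spec:
  minAvailable: 2          # keep at least 2 pods running during voluntary disruptions
  selector:
    matchLabels:
      app: my-service      # must match the labels of the pods you want to protect
```

With this in place, a node drain during an upgrade will wait rather than evict pods below the budget. And for point 5, an ephemeral debug container is typically attached with something like `kubectl debug -it my-pod --image=busybox --target=my-container` (pod and container names hypothetical).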
One of my big complaints with Kubernetes is the lack of auto-balancing of resources. If a node goes down, pods will reschedule onto the remaining active nodes to meet requirements. But once the node comes back up, Kubernetes won’t rebalance the scheduling onto that node again; it will keep the workloads where they are. This is understandable, as (sadly) a lot of applications running in Kubernetes clusters today probably have no business running in Kubernetes at all (extremely stateful). But Kubernetes is pretty mature, and I’m certainly not the only person to share this complaint about Kubernetes scheduling. This is where the kubernetes-descheduler comes into the picture: https://lnkd.in/gD3Xv7Ca

This application allows for balancing rules and will evict pods from nodes that are over-provisioned. This is very useful for stateless applications that can afford a bit of backend turbulence in order to better utilize hardware. The descheduler uses a variety of sources, such as the Metrics Server and even Prometheus data, to determine the utilization of nodes and the workloads on them. It then uses policies defined through its CRDs to flatten the workloads across the cluster.

The descheduler builds on the scoring strategies built into the Kubernetes scheduler, such as NodeResourcesFit and NodeResourcesBalancedAllocation (https://lnkd.in/eU9zmtPs), which the scheduler uses when making decisions about placing a NEW workload. The descheduler in turn checks whether it can effectively rebalance existing workloads when a policy threshold is violated. Here is an example policy:

```yaml
apiVersion: "descheduler/v1alpha2"
kind: "DeschedulerPolicy"
profiles:
  - name: ProfileName
    pluginConfig:
      - name: "LowNodeUtilization"
        args:
          thresholds:
            "memory": 20
          targetThresholds:
            "memory": 70
    plugins:
      balance:
        enabled:
          - "LowNodeUtilization"
```

This policy will find workloads running on nodes with over 70% memory utilization and rebalance them onto nodes with less than 20% utilization. This helps you get the most out of your horizontal scaling strategy! Happy hacking!
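When the descheduler runs in-cluster (as a Deployment, CronJob, or via its Helm chart), the policy is typically supplied through a ConfigMap mounted as `policy.yaml`. A sketch of that wrapping, with the ConfigMap name and `kube-system` namespace assumed to match the upstream example manifests (adjust to your install):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: descheduler-policy-configmap
  namespace: kube-system
data:
  policy.yaml: |
    apiVersion: "descheduler/v1alpha2"
    kind: "DeschedulerPolicy"
    profiles:
      - name: ProfileName
        pluginConfig:
          - name: "LowNodeUtilization"
            args:
              thresholds:
                "memory": 20
              targetThresholds:
                "memory": 70
        plugins:
          balance:
            enabled:
              - "LowNodeUtilization"
```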