Steps to Debug Kubernetes Issues Locally


Summary

Debugging Kubernetes issues locally means finding and fixing problems in your Kubernetes setup on your own computer or test environment, rather than in production. This process helps you identify why your applications or services aren’t working as expected by systematically checking logs, configurations, and network connections.

  • Check pod logs: Use commands to inspect your pod’s logs and spot errors or misconfigurations that might be causing your application to fail.
  • Review resources and connectivity: Make sure your pods have enough memory and CPU, and test network connectivity between services to rule out resource shortages or networking issues.
  • Confirm service setup: Look at service definitions, labels, and endpoints to ensure your application is reachable and properly routed inside the cluster.
Summarized by AI based on LinkedIn member posts
  • Deepak Agrawal

    Founder & CEO @ Infra360 | DevOps, FinOps & CloudOps Partner for FinTech, SaaS & Enterprises

    18,599 followers

    I use this simple 3-step logs flow that helps me debug almost anything in Kubernetes in under 30 minutes.

    **Step 1 → `kubectl logs <pod>`**
    Ask: "Did the app fail inside the container?" If the pod is up, this is your first stop. Look for stack traces, startup errors, and misconfigurations. But if the logs show nothing (or the pod never started), move on fast.

    **Step 2 → `kubectl describe pod <pod>`**
    Ask: "Did Kubernetes kill the pod?" This one's underrated. It shows you probe failures, CrashLoops, image pull issues, and mount errors. Basically, if K8s is mad at your pod, this will tell you why.

    **Step 3 → `kubectl get events --sort-by=.metadata.creationTimestamp`**
    Ask: "What else is breaking in the cluster?" This is your timeline. It shows broader issues: node pressure, CNI problems, preemptions. If the problem isn't in the logs or the describe output, this one usually holds the clue.

    This is the exact flow we use inside incident war rooms.
    ➤ If the pod is running → check logs.
    ➤ If it's crashing or pending → check describe.
    ➤ If you're still lost → check events.

    Don't waste 45 minutes staring at Grafana hoping something makes sense. Start with the logs. Ask better questions. Fix faster.

    I built a 1-page cheatsheet of this debugging flow. It's part of our SRE onboarding at Infra360. Want it? Drop a "LOGS" in the comments and I'll send it to you.
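    The decision tree above can be captured in a few lines of shell. A minimal sketch, assuming the three status strings in the post; `<pod>` is a placeholder, not a real pod name:

    ```shell
    # Given a pod's STATUS, suggest the first command in the 3-step flow.
    next_step() {
      case "$1" in
        Running)                  echo 'kubectl logs <pod>' ;;
        CrashLoopBackOff|Pending) echo 'kubectl describe pod <pod>' ;;
        *)                        echo 'kubectl get events --sort-by=.metadata.creationTimestamp' ;;
      esac
    }

    next_step Running            # → kubectl logs <pod>
    next_step CrashLoopBackOff   # → kubectl describe pod <pod>
    ```

    The fallback case mirrors the post's "still lost → check events" rule.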

  • Karuna Shreshth

    DevOps Engineer @ O9 Solutions || DevOps Engineer | AWS | EKS | GitLab | Ansible | DevOps & Cloud | Kubernetes | CI/CD | Terraform | GitHub Actions | Docker | Scalability | Reliability | Security | Azure | GCP

    24,240 followers

    Kubernetes Challenging Scenario

    Scenario: Service Not Accessible – the Pod Is Running, but the Application Can't Be Reached

    You've deployed your application in Kubernetes. The pod is in the "Running" state, and you've created a service (ClusterIP, NodePort, or LoadBalancer) to expose it. However, when you try to access the application through the service, it doesn't respond or times out. This can be confusing, especially since the pod looks healthy.

    Solution:

    1. **Verify the service exists and is correctly defined.** List all services: `kubectl get svc`
    2. **Inspect the service details:** `kubectl describe svc <service-name>`. Look for:
       - Correct port mapping
       - Type (ClusterIP, NodePort, LoadBalancer)
       - Selector (does it match the pod labels?)
    3. **Check endpoints.** Services forward traffic only to healthy pods that match their selectors. Check if the service has endpoints: `kubectl get endpoints <service-name>`. If no endpoints are listed, the service isn't connected to any pod.
    4. **Common root causes:**
       - The service selector doesn't match the pod's labels.
       - The pod is not marked "Ready" (failing readiness probes).
       - A NetworkPolicy or firewall is blocking traffic.
       - The container is listening on a different port than the service is exposing.
    5. **Resolution steps:**
       - Ensure pod labels are correct. Check the pod with `kubectl get pods --show-labels` and ensure it includes:
           metadata:
             labels:
               app: myapp
       - Match the service selector:
           spec:
             selector:
               app: myapp
       - Check pod readiness: `kubectl get pods -o wide`. If the pod isn't in the "Ready" state, fix the health or readiness probe.
       - Validate port mapping in the service:
           spec:
             type: NodePort
             ports:
               - port: 80
                 targetPort: 8080
                 nodePort: 30007
         Make sure the container listens on `targetPort` and the service forwards requests correctly.

    **Bonus tip:** Use `curl` or `wget` from inside the cluster (e.g., from another pod) to verify connectivity:
    `kubectl run test --rm -it --image=busybox -- /bin/sh`
    `wget <service-name>:<port>`

    Key takeaway: a running pod doesn't guarantee service availability. Always validate:
    - **Label-selector match**
    - **Pod readiness**
    - **Endpoint connectivity**
    - **Correct port exposure**

    Understanding how services route traffic and depend on pod labels is essential to debugging these issues quickly.

    #Kubernetes #DevOps #K8sTroubleshooting #ServiceDebugging #CrashLoopBackOff #CloudNative #SRE #PlatformEngineering #Containers #PodReady #Microservices #Networking #K8sTips #NodePort #LoadBalancer
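    The label-selector and port relationships described above can be seen in one place in a minimal manifest. This is an illustrative sketch: the name `myapp`, the image, and the port numbers are assumptions, not values from the post.

    ```yaml
    # The Service selector must equal the pod template labels, and targetPort
    # must be the port the container actually listens on.
    apiVersion: v1
    kind: Service
    metadata:
      name: myapp
    spec:
      type: NodePort
      selector:
        app: myapp          # must match the pod label below
      ports:
        - port: 80          # service port inside the cluster
          targetPort: 8080  # container's listening port
          nodePort: 30007   # external port on every node
    ---
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: myapp
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: myapp
      template:
        metadata:
          labels:
            app: myapp      # matched by the Service selector
        spec:
          containers:
            - name: myapp
              image: myapp:latest  # hypothetical image that listens on 8080
              ports:
                - containerPort: 8080
    ```

    If the `selector` and the template `labels` drift apart, the service ends up with no endpoints, which is exactly the failure mode in step 3.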

  • Gwen (Chen) Shapira

    Building Something Amazing

    12,845 followers

    🔎 Pro-tip: `kubectl debug` is a Swiss-army knife for investigations.

    My favorite trick: run pgbench inside the same Pod as Postgres. It eliminates all network noise from the test.

    A few other tricks I keep coming back to:
    👉 Instant toolbox: attach a container with gdb, strace, or tcpdump without modifying your image:
    `kubectl debug mypod --image=busybox -c dbg`
    👉 CrashLoop investigation: clone the pod with the same env and volume mounts, but start a shell instead of the crashing process:
    `kubectl debug mypod --copy-to shell --share-processes -it -- bash`
    👉 Safe canary rollout: create a new pod with a different image but the same configs, secrets, and resources:
    `kubectl debug mypod --copy-to api-next --image=ghcr.io/acme/api:1.4.0`

    📌 Notes:
    - Ephemeral containers are GA since Kubernetes 1.25.
    - `--copy-to` creates a brand-new pod.
    - Debug pods skip your normal lifecycle hooks, so clean up after yourself.

  • Dilawar Javaid

    AWS × 3 certified | AWS Solutions Architect | AWS Data Engineer | Python | Django | Linux | Next.js | React.js | Node.js | Kubernetes | Terraform | IaC | Databricks | Azure | Snowflake | GCP | Cloud Security | Rust

    10,493 followers

    🚨 Struggling with Kubernetes deployments? Let's break down how to debug like a pro! 🔍

    Kubernetes is a game-changer for container orchestration, but let's be honest: deployments don't always go smoothly. 😬 We've all faced those frustrating moments when something just doesn't work as expected. But fear not! Debugging Kubernetes can be your secret weapon for fixing deployment issues and taking your skills to the next level. 🚀 Here's how you can debug a Kubernetes deployment like a pro and turn those headaches into solutions!

    Step 1: Check the pods' status
    🟢 What to do: Use `kubectl get pods` to check the status of your pods. Look for pods that are stuck in a "Pending" or "CrashLoopBackOff" state.
    🟢 Why it matters: This is your first indication of what's going wrong. If your pods aren't starting properly, there's a deeper issue to tackle.

    Step 2: Inspect pod logs
    🟡 What to do: Run `kubectl logs <pod-name>` to retrieve logs from a specific pod. If your container is crashing, these logs are crucial for identifying the root cause.
    🟡 Why it matters: Logs give you detailed insight into what's happening inside the pod, whether it's a misconfiguration, missing environment variables, or something else.

    Step 3: Describe the deployment
    🔵 What to do: Use `kubectl describe deployment <deployment-name>` to get a detailed breakdown of the deployment, including events, pod scheduling issues, and resource constraints.
    🔵 Why it matters: This command helps you spot potential issues with node scheduling, resource limits, or even image pull errors. It's the full story of your deployment's health!

    Step 4: Check for resource limitations
    🔴 What to do: Look for resource issues with `kubectl describe node <node-name>`. Check whether your pods have enough memory and CPU to run properly.
    🔴 Why it matters: Many deployment failures come down to insufficient resources. Scaling your resources or adjusting your pod limits might be all you need to fix the problem!

    Step 5: Review ConfigMaps and Secrets
    🟠 What to do: Check whether your deployment is correctly loading ConfigMaps and Secrets. Use `kubectl get configmap` and `kubectl get secret` to ensure they are properly mounted.
    🟠 Why it matters: Misconfigured environment variables, credentials, or missing files can cause containers to fail unexpectedly. This step helps you ensure the right settings are in place.

    Step 6: Check network connectivity
    🟣 What to do: Use `kubectl exec -it <pod-name> -- /bin/bash` to enter a pod's shell and troubleshoot network connectivity with tools like `curl` or `ping`.
    🟣 Why it matters: If your pods can't communicate with each other or with external services, the entire deployment can break. Ensuring connectivity is critical for debugging.

    💬 Join the discussion

    #Kubernetes #DevOps #CloudComputing #Containerization #K8s #Debugging #TechTips #DigitalTransformation #DilawarJavaid #DeploymentIssues #CloudSolutions #InfrastructureAsCode
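    Step 1 above lends itself to a small filter over `kubectl get pods` output. A minimal sketch; in a live cluster you would pipe the real command into the function, but here a hard-coded sample (hypothetical pod names) stands in so it runs anywhere:

    ```shell
    # Print pod names whose STATUS column is neither Running nor Completed.
    find_bad_pods() {
      awk 'NR > 1 && $3 != "Running" && $3 != "Completed" { print $1 }'
    }

    # Illustrative sample of `kubectl get pods` output.
    sample='NAME    READY   STATUS             RESTARTS   AGE
    api-1   0/1     CrashLoopBackOff   5          10m
    web-1   1/1     Running            0          1h'

    printf '%s\n' "$sample" | find_bad_pods   # → api-1
    ```

    The column positions match the default `kubectl get pods` layout; wide or custom output formats would need a different field index.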

  • Fadhel Smiti

    DevOps Engineer | Certified Kubernetes (CKA) ☸️

    4,702 followers

    Kubernetes common errors:

    1. **CrashLoopBackOff**
       - Description: a pod repeatedly crashes and restarts.
       - Troubleshooting:
         - Check pod logs: `kubectl logs <pod-name>`.
         - Describe the pod for more details: `kubectl describe pod <pod-name>`.
         - Investigate the application's start-up and initialization code.
    2. **ImagePullBackOff**
       - Description: Kubernetes cannot pull the container image from the registry.
       - Troubleshooting:
         - Verify the image name and tag.
         - Check the image registry credentials.
         - Ensure the image exists in the specified registry.
    3. **Pending Pods**
       - Description: pods remain in the "Pending" state and are not scheduled.
       - Troubleshooting:
         - Check node resources (CPU, memory) to ensure there is enough capacity.
         - Ensure the nodes are labeled correctly if using node selectors or affinities.
         - Verify there are no taints on nodes that would prevent scheduling.
    4. **Node Not Ready**
       - Description: one or more nodes are in a "NotReady" state.
       - Troubleshooting:
         - Check node status: `kubectl describe node <node-name>`.
         - Review kubelet logs on the affected node.
         - Ensure the node has network connectivity.
    5. **Service Not Working**
       - Description: services are not accessible or not routing traffic correctly.
       - Troubleshooting:
         - Check the service and endpoints: `kubectl get svc` and `kubectl get endpoints`.
         - Verify network policies and firewall rules.
         - Ensure the pods backing the service are healthy and running.
    6. **Insufficient Resources**
       - Description: pods cannot be scheduled due to insufficient resources.
       - Troubleshooting:
         - Review resource requests and limits in pod specifications.
         - Scale the cluster by adding more nodes.
    7. **PersistentVolumeClaims Pending**
       - Description: PVCs remain in a "Pending" state.
       - Troubleshooting:
         - Check if there are available PVs that match the PVC specifications.
         - Ensure the storage class exists and is configured correctly.
         - Verify that the underlying storage backend is healthy.
    8. **Pod Stuck Terminating**
       - Description: pods get stuck in a "Terminating" state.
       - Troubleshooting:
         - Check for finalizers that might be preventing pod deletion.
         - Review the logs for shutdown hooks or long-running processes.
         - Force delete the pod if necessary: `kubectl delete pod <pod-name> --force --grace-period=0`.
    9. **DNS Resolution Issues**
       - Description: DNS lookups within the cluster fail.
       - Troubleshooting:
         - Check the DNS pod logs (e.g., CoreDNS): `kubectl logs <coredns-pod>`.
         - Ensure the DNS service is running: `kubectl get svc -n kube-system`.
         - Verify network policies and firewall rules do not block DNS traffic.
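    The "Service Not Working" check above boils down to spotting services with no endpoints. A minimal sketch; in practice you would pipe `kubectl get endpoints` into the function, but a hard-coded sample (hypothetical service names) stands in here:

    ```shell
    # Print service names whose ENDPOINTS column is <none>,
    # i.e. services whose selector matches no ready pod.
    no_endpoints() {
      awk 'NR > 1 && $2 == "<none>" { print $1 }'
    }

    # Illustrative sample of `kubectl get endpoints` output.
    sample='NAME       ENDPOINTS           AGE
    web        10.1.0.5:8080       2d
    orphaned   <none>              2d'

    printf '%s\n' "$sample" | no_endpoints   # → orphaned
    ```

    Any service this prints is the place to start checking label-selector mismatches and readiness probes.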

  • EBANGHA EBANE

    AWS Community Builder | Cloud Solutions Architect | Multi-Cloud (AWS, Azure & GCP) | FinOps | DevOps Eng | Chaos Engineer | ML & AI Strategy | RAG Solution| Migration | Terraform | 9x Certified | 30% Cost Reduction

    43,689 followers

    Here are some Kubernetes troubleshooting notes:

    *Common issues:*
    1. Pod not starting:
       - Check pod status (`kubectl get pods`)
       - Verify image name and tag
       - Inspect pod logs (`kubectl logs`)
       - Check node resources (CPU, memory)
    2. Deployment not rolling out:
       - Verify deployment config (`kubectl get deployments`)
       - Check replica count and availability
       - Inspect deployment history (`kubectl rollout history`)
       - Check node affinity/anti-affinity
    3. Service not exposed:
       - Verify service config (`kubectl get svc`)
       - Check endpoint configuration (`kubectl get endpoints`)
       - Inspect logs of the pods backing the service (`kubectl logs`)
       - Check network policies
    4. PersistentVolume (PV) issues:
       - Verify PV config (`kubectl get pv`)
       - Check storage class configuration
       - Inspect PV/PVC events (`kubectl describe pv`)
       - Check node storage capacity

    *Troubleshooting tools:*
    1. `kubectl get` - retrieve information about resources
    2. `kubectl describe` - detailed information about resources
    3. `kubectl logs` - retrieve container logs
    4. `kubectl exec` - execute commands in containers
    5. `kubectl debug` - debug containers
    6. `kubectl top` - resource usage metrics
    7. `kubectl cluster-info` - cluster information

    *Logging and monitoring:*
    1. Kubernetes dashboard
    2. Prometheus and Grafana
    3. Fluentd and Elasticsearch
    4. Logstash and Kibana

    *Networking:*
    1. Verify pod-to-pod communication
    2. Check service exposure (LoadBalancer, Ingress)
    3. Inspect network policies
    4. Verify DNS resolution

    *Security:*
    1. Verify RBAC configuration
    2. Check network policies
    3. Inspect pod security context
    4. Verify image security

    *Node and cluster issues:*
    1. Node not ready:
       - Check node status (`kubectl get nodes`)
       - Verify node resources (CPU, memory)
       - Inspect kubelet logs on the node
    2. Cluster not upgrading:
       - Verify cluster state (`kubectl cluster-info` and `kubectl version`)
       - Check node compatibility
       - Inspect control-plane component logs

    *Best practices:*
    1. Use meaningful resource names
    2. Monitor resource usage
    3. Implement logging and monitoring
    4. Use network policies
    5. Regularly back up and restore

  • Poojitha A S

    DevOps | SRE | Kubernetes | AWS | Azure | MLOps 🔗 Visit my website: poojithaas.com

    7,240 followers

    #DAY112 I'm back from vacation and will resume regular posting!

    #Advanced Kubernetes Commands Every Expert Should Know: Debugging, Dry Runs, and More!

    1. Dry run for resource validation
    A dry run is essential for validating YAML files and configurations before applying them to your cluster.
    `kubectl apply -f mydeployment.yaml --dry-run=client`
    What it does: simulates resource creation locally and flags potential issues without making any changes.

    2. Debugging pods on the fly
    If a pod or container fails, `kubectl debug` lets you create a temporary debugging container within your pod for troubleshooting.
    `kubectl debug -it mypod --image=busybox --target=containerName`
    What it does: starts a debugging session inside the pod using a different image, without interrupting the application. It's great for inspecting logs and files or running isolated commands.

    3. Quickly editing resources with kubectl edit
    Rather than editing YAML files manually, `kubectl edit` allows you to make live changes directly from the command line.
    `kubectl edit deployment mydeployment`
    What it does: opens the resource configuration in your default editor (e.g., vim or nano) for quick editing.

    4. Rolling back deployments
    To quickly revert to a previous version after a failed deployment:
    `kubectl rollout undo deployment/mydeployment`
    What it does: rolls back to the last successful deployment, minimizing downtime.

    5. Tailing logs from a specific pod container
    When debugging, it's crucial to view logs from the container causing issues. Instead of filtering through multiple containers, target the specific one.
    `kubectl logs -f mypod -c mycontainer`
    What it does: streams logs from a specific container inside the pod for easier debugging.

    6. Setting resource limits on the fly
    Use `kubectl set resources` to adjust resource limits for a running deployment, helpful for debugging resource constraints.
    `kubectl set resources deployment mydeployment --limits=cpu=500m,memory=256Mi`
    What it does: sets CPU and memory limits for your deployment so you can see how it performs under different resource conditions.

    7. Getting pod events to track down issues
    Events give you insight into what's happening behind the scenes. Use `kubectl get events` to track issues like scheduling problems or failed probes.
    `kubectl get events --field-selector involvedObject.name=mypod`
    What it does: filters events related to a specific pod, helping you identify problems.

    Pro tip: combine `kubectl describe` with `kubectl get events` for more thorough troubleshooting insights.

    TL;DR: these Kubernetes commands are essential for experts who want to:
    - Simulate and validate changes
    - Debug containers quickly
    - Edit live resources
    - Roll back deployments
    - Gather logs and events for precise troubleshooting

    Master these commands to optimize your Kubernetes operations! 🌟

    #Kubernetes #DevOps #Cloud #K8S #AdvancedCommands #CloudNative #Debugging #DryRun #DevOpsTools #KubernetesTips
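    Before running command 6 against a live deployment, the `--limits` string can be sanity-checked locally. A minimal sketch; the accepted formats below are a deliberate simplification of Kubernetes' full quantity syntax, which allows many more suffixes:

    ```shell
    # Rough client-side check of a --limits=cpu=...,memory=... string.
    # NOTE: real Kubernetes quantities also allow Ki, M, G, etc.; this
    # pattern covers only the common millicpu / Mi / Gi forms.
    valid_limits() {
      printf '%s' "$1" | grep -Eq '^cpu=[0-9]+m?,memory=[0-9]+(Mi|Gi)$'
    }

    valid_limits 'cpu=500m,memory=256Mi' && echo valid     # → valid
    valid_limits 'cpu=lots,memory=some'  || echo invalid   # → invalid
    ```

    Catching a malformed quantity locally is cheaper than waiting for the API server to reject the patch.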

  • Saurav Chaudhary

    🚀 Building Scalable, Resilient, and Automated Systems That Don’t Break!

    49,458 followers

    🧨 "I failed my DevOps interview."

    They asked me: "A pod can't resolve service names. DNS looks fine. What's your next move?"

    I froze. Mumbled something about restarting CoreDNS. They nodded politely. I didn't get the job.

    What they actually wanted to hear:
    ✅ `kubectl exec` into the pod and test with nslookup or dig
    ✅ Check if CoreDNS pods are healthy but stuck
    ✅ Audit the CoreDNS ConfigMap: maybe the plugin chain is misconfigured
    ✅ Validate the forward and loop plugins aren't causing recursive hell
    ✅ Confirm the memory usage: maybe it's a silent OOM loop
    ✅ Fix the order: log → errors → health → forward → cache
    ✅ Apply changes with `kubectl apply` and test in staging

    💡 The real lesson? Most people know CoreDNS resolves names. But only the top 1% know it can silently break itself, even while showing green.

    🛠 What I now teach in my mentorship:
    • How to build mental models for debugging CoreDNS
    • Real scenarios where probes are green but DNS is dead
    • How to detect misconfigured plugin chains
    • Simulating DNS outages and practicing real RCA workflows

    This isn't a YouTube tutorial. This is war-room training, where we don't panic when names stop resolving.

    📍 Book a mock interview. So the next time DNS fails, you're the one they call, not the one they reject.

    #DevOps #Kubernetes #Cloud
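    For reference when auditing the ConfigMap, a typical default CoreDNS Corefile for Kubernetes looks like the sketch below (on kubeadm-style clusters it lives in the `coredns` ConfigMap in `kube-system`). One caveat to the "fix the order" advice: CoreDNS executes plugins in an order fixed at build time, so what matters in the Corefile is which plugins are listed and how each is configured, not their line position.

    ```
    .:53 {
        errors
        health {
            lameduck 5s
        }
        ready
        kubernetes cluster.local in-addr.arpa ip6.arpa {
            pods insecure
            fallthrough in-addr.arpa ip6.arpa
        }
        prometheus :9153
        forward . /etc/resolv.conf
        cache 30
        loop
        reload
    }
    ```

    The `loop` plugin is what catches the recursive-forwarding failure mode mentioned above: it shuts CoreDNS down if it detects its own query coming back, rather than letting it spin silently.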
