Most teams I work with have a pipeline. The problem is what happens inside it. Security tooling gets added reactively — a SonarQube instance nobody reviews, a Trivy scan that's set to warn instead of block, image tags so loosely managed that nobody can tell you what's actually running in production. The tools exist. The gates don't.

This week I built a reference DevSecOps pipeline on AWS to demonstrate what it looks like when the gates are real. A few deliberate choices worth noting:

→ SAST with Bandit, not SonarQube. This is a Python service. Bandit is purpose-built for Python, runs in seconds with zero infrastructure overhead, and produces actionable output. Defaulting to SonarQube for every stack regardless of context is a tool decision masquerading as an architecture decision.

→ Container images tagged with the git commit SHA, not "latest". Every running container is traceable to its exact source commit. When something breaks in production, you know precisely what code is running. This is not optional in any environment that takes incident response seriously.

→ Trivy configured to block, not warn. A scan that warns and proceeds is a reporting tool. A scan that blocks on HIGH severity CVEs with available fixes is a security gate. The distinction matters significantly under pressure.

→ ECS tasks in private subnets. The ALB faces the internet. The containers do not. This is the baseline architecture, not advanced hardening.

The entire infrastructure — VPC, ECS Fargate, ALB, ECR, CodePipeline, CloudWatch — is Terraform. Nothing was configured through a console. Reproducible, auditable, version-controlled.

Full code: https://lnkd.in/gr3G7K-k

I work with engineering teams in telecom, e-commerce and fintech to close the gap between having security tooling and having security gates that actually hold. If that's a conversation worth having, my inbox is open.

#devsecops #cloudnative #aws #terraform #cicd #devops #appsec #platformengineering #fintech #telecom #cloudarchitecture #securityengineering
Building a Real DevSecOps Pipeline on AWS
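For illustration, here is a minimal sketch of what the blocking gates described above can look like in an AWS CodeBuild buildspec. It is not the linked repository's actual pipeline: the source path, the ECR_URI and AWS_REGION variables, and the assumption that Trivy is preinstalled on the build image are all placeholders.

# buildspec.yml (illustrative sketch) - gates run before any image is pushed
version: 0.2

phases:
  pre_build:
    commands:
      # SAST gate: Bandit exits non-zero on findings, which fails the build
      - pip install bandit
      - bandit -r src/ -ll
      # Log in to ECR (ECR_URI and AWS_REGION are assumed environment variables)
      - aws ecr get-login-password --region $AWS_REGION | docker login --username AWS --password-stdin $ECR_URI
  build:
    commands:
      # Tag with the commit SHA, never "latest"
      - docker build -t $ECR_URI:$CODEBUILD_RESOLVED_SOURCE_VERSION .
      # Image scan gate: --exit-code 1 turns Trivy from a report into a blocker
      # (assumes trivy is available on the build image)
      - trivy image --severity HIGH,CRITICAL --ignore-unfixed --exit-code 1 $ECR_URI:$CODEBUILD_RESOLVED_SOURCE_VERSION
  post_build:
    commands:
      # The push only happens if both gates passed
      - docker push $ECR_URI:$CODEBUILD_RESOLVED_SOURCE_VERSION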
Production-Grade End-to-End DevSecOps Project (Udemy)
Runtime: 9 Hours

This is a complete production-style implementation — from code to deployment to monitoring. You won't just learn what to use. You'll understand why, where, and how things connect.

CI/CD That Actually Reflects Real Work
Instead of a single pipeline, you'll implement:
Feature → QA → Production branching strategy
QA pipelines with build, scan, and deploy
Production pipelines with controlled promotion

DevSecOps — Integrated, Not Added Later
You'll integrate:
Gitleaks → prevent secrets from entering the repo
Checkov → scan Terraform, Kubernetes, Dockerfiles
Trivy → scan filesystem + container images
SBOM → understand what actually goes into your builds
SonarQube → enforce code quality with Quality Gates

Secretless Authentication (Modern Standard)
In this course, you'll implement:
GitHub Actions → OIDC → AWS IAM → EKS Access
No access keys. No hardcoding. No risk of leakage.
This is how modern cloud-native systems authenticate securely.

Secret Management Done Right
AWS Secrets Manager as the source of truth
External Secrets Operator (ESO) inside Kubernetes
IRSA to securely fetch secrets inside pods
This ensures:
No secrets in GitHub
No secrets in manifests
Fully dynamic secret injection

Kubernetes — Beyond Stateless Apps
Most tutorials stop at simple deployments. Here, you'll go deeper:
Deploy MySQL using StatefulSets
Configure persistent storage
Handle real-world application dependencies

You'll design and deploy:
Amazon EKS cluster
AWS Load Balancer Controller (ALB)
Route 53 for DNS
ACM for SSL certificates
End result: a fully working HTTPS-enabled production system

Observability (The Missing Layer)
You'll implement:
Prometheus → metrics
Loki → logs
Tempo → traces
Grafana → visualization

Final Layer — Domain & Security
Purchase and configure a custom domain
Map it to your application
Enable HTTPS using ACM
Route traffic securely via ALB
This is the layer that turns a project into a real product.

Course Overview: https://lnkd.in/gAuDmefg
Course Link: https://lnkd.in/gTeUiGda
Coupon Code: DEVOPSSHACK

#udemy #devops #devsecops #eks #githubactions #devopscourse #linux #devopsshack
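The secretless GitHub Actions → OIDC → AWS IAM flow the course advertises generally comes down to a short workflow like the sketch below. The role ARN, region, and cluster name are placeholders, not values from the course.

# Illustrative OIDC-based AWS auth from GitHub Actions; no stored access keys.
name: deploy
on:
  push:
    branches: [main]

permissions:
  id-token: write   # lets the job request an OIDC token from GitHub
  contents: read

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: arn:aws:iam::123456789012:role/github-actions-deployer   # placeholder role
          aws-region: us-east-1                                                    # placeholder region
      - name: Update kubeconfig and deploy
        run: |
          aws eks update-kubeconfig --name my-cluster --region us-east-1
          kubectl get nodes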
Nobody talks about how many DevOps pipelines are one leaked credential away from disaster.

Here's the secure, production-grade way to build this: GitHub → Jenkins → ECR → EKS
No hardcoded secrets. No static keys. No shortcuts.

🏗️ THE FULL ARCHITECTURE
① Developer pushes code to GitHub
② GitHub webhook triggers Jenkins pipeline
③ Jenkins builds a Docker image
④ Image is pushed to Amazon ECR (private registry)
⑤ Jenkins deploys to Amazon EKS using Helm
⑥ Kubernetes pulls the image from ECR and runs it

Simple flow. But the SECURITY layer is what separates a hobbyist setup from a production one.

🔐 THE 5 SECURITY DECISIONS THAT MATTER

1️⃣ Use IRSA — not access keys
Annotate your Jenkins K8s service account with an IAM role ARN. EKS injects an OIDC token → AWS STS returns short-lived credentials automatically. Your Jenkinsfile has zero AWS credentials. Zero.

2️⃣ Store images in private ECR — not Docker Hub
Enable scanOnPush: true on your ECR repo. Fail the build if HIGH or CRITICAL CVEs are found.

3️⃣ Never use :latest in production
Every image must be traceable to its exact source code.

4️⃣ Keep the EKS API private
Set endpointPublicAccess: false. Jenkins must run inside the same VPC. No EKS control plane exposed to the internet.

5️⃣ Scope RBAC tightly
Give Jenkins a namespace-scoped Role — not a ClusterRole. It can create and update deployments. That's it. Nothing more.

🔁 WHAT THE SECURE JENKINSFILE LOOKS LIKE

stage('ECR Push') {
    // No credentials block — IRSA handles it
    sh '''
        aws ecr get-login-password --region $AWS_REGION \
          | docker login --username AWS --password-stdin $ECR_URI
        docker push $ECR_URI:$GIT_COMMIT
    '''
}

stage('Deploy to EKS') {
    // Kubeconfig generated fresh every build
    sh '''
        aws eks update-kubeconfig --name $CLUSTER_NAME --region $AWS_REGION
        helm upgrade --install myapp ./helm \
          --set image.tag=$GIT_COMMIT \
          --atomic --timeout 5m
    '''
}

Notice what's missing: no withCredentials(), no stored kubeconfig, no AWS_ACCESS_KEY_ID anywhere.

✅ QUICK SETUP CHECKLIST
→ ECR repo with scanOnPush enabled
→ EKS OIDC provider enabled (required for IRSA)
→ IAM role with trust policy scoped to the Jenkins service account
→ Jenkins deployed inside the same VPC as EKS
→ GitHub webhook configured to trigger on push to main
→ Helm chart deployed with the --atomic flag for auto-rollback
→ Slack/Teams notification on deploy success or failure

The pipeline that has no credentials to steal is the most secure pipeline. Set this up once and you'll never go back to managing access keys in Jenkins.

📄 I've put together a detailed guide covering all of this — full architecture breakdown, stage-by-stage Jenkinsfile, common mistakes, and a 9-point pre-production checklist.
👉 Link in the first comment below.

💾 Save this for your next EKS project. Tag a teammate who's still storing AWS keys in Jenkins credentials. 🙌

#DevOps #AWS #Kubernetes #Jenkins #EKS #ECR #GitHub #CICD #CloudSecurity #DevSecOps #Docker #SRE #InfrastructureAsCode
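Decision 1 comes down to one annotation on the Jenkins service account, and decision 5 to a namespace-scoped Role. A minimal sketch of both, assuming a hypothetical account ID, role name, and namespaces:

# ServiceAccount for the Jenkins agent pods, wired to an IAM role via IRSA.
# Account ID, role name, and namespaces below are placeholders.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: jenkins-agent
  namespace: jenkins
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::123456789012:role/jenkins-ecr-eks-deployer
---
# Namespace-scoped Role matching decision 5: deployments only, nothing cluster-wide.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: jenkins-deployer
  namespace: myapp
rules:
  - apiGroups: ["apps"]
    resources: ["deployments"]
    verbs: ["get", "list", "watch", "create", "update", "patch"]

In practice a Helm-based deploy usually needs access to a few more resources (services, configmaps, and the secrets Helm uses for release state), so treat the rule list above as a starting point rather than a final policy.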
Day 8 of #DevOpsJourneyToHired 🎉

WEEK 1 COMPLETE + Major Announcement!

🏗️ PROJECT FINALIZED: Secure Cloud-Native E-Commerce Platform
After community feedback and planning, I'm building a production-grade DevSecOps platform that demonstrates enterprise-level skills.

📐 Architecture Highlights:

✅ CI/CD & GitOps
→ GitHub Actions for automation
→ ArgoCD for GitOps deployments
→ Terraform + AWS for IaC

✅ DevSecOps Security
→ Trivy (container image scanning)
→ SonarQube (code quality)
→ Falco (runtime security)
→ Snyk (dependency scanning)

✅ Kubernetes on EKS
→ Multi-node cluster
→ Kyverno for policy enforcement
→ OPA Gatekeeper for admission control

✅ Observability Stack
→ Prometheus + Grafana for metrics
→ Loki for log aggregation
→ Jaeger for distributed tracing

✅ Secrets Management
→ HashiCorp Vault
→ AWS Secrets Manager
→ Terragrunt for state management

🎯 Why This Project Matters:
This isn't a toy app. It's production-ready architecture showing:
- Security-first approach (DevSecOps)
- Cloud-native best practices
- Complete observability
- Enterprise secrets management
- GitOps workflow
- Policy-as-code enforcement

📊 Week 1 By The Numbers:
- Days posted: 7/7 ✅
- Learning hours: 28+
- Applications: 17
- Skills gained: 7
- Project architecture: FINALIZED

🚀 Week 2 Goals:
→ Set up local dev environment
→ Master Docker containerization
→ Build base infrastructure with Terraform
→ Deploy first microservice
→ Continue daily job applications

This is going to take 8-10 weeks to build properly. Following along? The journey starts NOW.

Tomorrow: Docker deep dive begins!

#DevOps #DevSecOps #Kubernetes #AWS #GitOps #ArgoCD #CloudNative #SecurityFirst #ProjectBased
A few months ago, I shared an open-source repo for deploying a free, forever Kubernetes cluster on OCI. (link in comments)

Today I'm sharing what I built on top of it.

🌐 My personal portfolio is now live: https://ederbrito.com.br

But this isn't just a "here's my CV" website. It's the missing piece that actually populates those Grafana datasources I mentioned in the last post, with real traces, metrics, and logs from a production workload.

What's new:

🖥️ Frontend — Next.js 16, TypeScript, Node.js 24, deployed as a multi-arch Docker image (amd64 + arm64)

📡 Full observability instrumented into the app:
OpenTelemetry → Jaeger (distributed tracing)
prom-client → Prometheus (custom metrics at /api/metrics)
Pino structured JSON logs → Loki

⚙️ CI/CD pipeline with GitHub Actions:
TypeScript checks + Trivy vulnerability scanning
Multi-arch Docker build & push to Docker Hub
Automated and controlled deployment of the generated image to Kubernetes

The whole thing (infra + app + observability) costs $0/month if you use Cloudflare for DNS.

What's coming next:
I'll be building microservices in Go and Python that integrate with external APIs, generating richer traces, metrics, and logs across a real distributed system, the kind of multi-service topology where observability actually gets interesting.

The goal is to have a living, evolving system that demonstrates SRE practices end-to-end: from infrastructure provisioning all the way to service-level observability across polyglot workloads.

Still a beta, still improving. But it works, it's observable, and it's open source. Feedback and contributions are welcome. 🙏

Repo: https://lnkd.in/dKs4ihtQ

#SRE #DevOps #Kubernetes #Observability #OpenTelemetry #NextJS #GitHubActions #OracleCloud #OpenSource
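As an illustration of the prom-client integration, a Prometheus scrape job for a /api/metrics endpoint might look like the fragment below. The job name, target address, and port are assumptions; the real cluster likely relies on Kubernetes service discovery rather than a static target.

# prometheus.yml fragment (illustrative; target and job name are placeholders)
scrape_configs:
  - job_name: portfolio-frontend
    metrics_path: /api/metrics      # prom-client endpoint exposed by the Next.js app
    scrape_interval: 30s
    static_configs:
      - targets: ["portfolio.default.svc.cluster.local:3000"]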
🚀 Microservices Challenges – The Reality No One Talks About

Everyone loves to talk about microservices. Scalability. Flexibility. Independent deployments.
But in real systems, the challenges hit you hard — especially in production.

After working on large-scale distributed systems, here are 3 problems that show up every single time:

⚠️ 1. Distributed Transactions (The "It worked locally" problem)
In a monolith:
👉 One DB transaction → commit or rollback → done
In microservices:
👉 Multiple services + multiple databases + async calls
Now ask yourself: what happens if Service A succeeds and Service B fails?
You don't get rollback. You get inconsistent state.
💡 What actually works in real systems:
Saga pattern (orchestration/choreography)
Event-driven compensation
Idempotent APIs (retry-safe)
👉 Lesson: You don't "solve" distributed transactions. You design around failure.

⏱️ 2. Latency (Death by 100 API calls)
One request = Service A → B → C → D → DB → back again
Congrats, your 50ms API just became 800ms+. And under load? Even worse.
💡 What helps:
API aggregation (don't chain blindly)
Caching (Redis is your best friend)
Async processing where possible
Circuit breakers (fail fast > slow failure)
👉 Lesson: Latency is not a bug. It's a design consequence.

🔍 3. Debugging (Welcome to the nightmare)
In a monolith:
👉 Stack trace → fix → done
In microservices:
👉 6 services → 3 logs → 2 timeouts → 1 confused engineer
"Where did it fail?" becomes a real question.
💡 What actually saves you:
Distributed tracing (OpenTelemetry, Zipkin)
Centralized logging (ELK / CloudWatch)
Correlation IDs (non-negotiable)
👉 Lesson: If you don't invest in observability early, you will pay for it later at 3 AM.

🧠 Final Thought
Microservices are powerful — but they come with complexity. Not every system needs them.
👉 If you don't need scale → keep it simple
👉 If you go microservices → design for failure from day one

If you've worked with microservices in production, you already know: the real challenge isn't building them. It's running them reliably.

#Microservices #SystemDesign #Java #Backend #Kafka #DistributedSystems #DevOps #SoftwareEngineering
I made 4 mistakes in one Kubernetes session. I'm publishing all of them.

Not because I enjoy looking incompetent. Because the engineers I respect most have always been honest about the gap between what tutorials show and what terminals actually do.

Here's what happened.

I was building a 3-replica nginx Deployment. Three pods. Load balanced. Fault tolerant. If one dies, traffic reroutes automatically. This is table stakes in production — no serious company runs on single points of failure. I felt good.

Then I tried to expose it. Typed kubectl create service. Hit enter. "Error: resource already exists."

Turned out I'd mixed kubectl create and kubectl apply like they're the same command. They're not.
create says: make this. If it already exists, fail loudly.
apply says: make this. If it already exists, reconcile quietly.
One word. Two completely different philosophies. Some tools are for starting. Others are for adapting. Knowing which one you're holding matters.

I moved the service from ClusterIP to NodePort — port 30007, reachable from outside the cluster. Typed the command. Fast. kubeclt instead of kubectl. The terminal just... waited. I waited back. Three seconds of pure confusion before I spotted it. A typo. Under pressure. In a command I've typed a hundred times.

Here's the thing nobody puts in the docs: the faster you type in a terminal, the slower you actually go. Every pilot, surgeon, and air traffic controller knows this. You slow down so you don't restart.

Fixed the typo. NodePort came up clean.

Then the rolling update. Live traffic. Zero downtime. nginx → nginx:1.25. Kubernetes handles this beautifully — it replaces pods one at a time, waits for each to be healthy before moving to the next.

I went to verify the new image with a JSONPath query. Ran it. Got nothing back. No error. No warning. Just... silence. I checked three times. Then I saw it. Missing dot.
{items[*]...} instead of {.items[*]...}
One character. The difference between a query that works and one that returns nothing — and lies to you about it. Silence in tech is the most dangerous feedback you can get.

Four errors. One hour. One Kubernetes lab. And what I walked away with wasn't frustration. It was this:

Competence isn't the absence of mistakes. It's the muscle memory you build from recovering from them — fast, calmly, without catastrophising.

The engineers who intimidate you? They have a graveyard of typos, conflicts, and silent failures behind them too. They just stopped being embarrassed about it earlier than you.

I'm documenting every session like this. Not because I've figured it out. Because I remember what it felt like to think everyone else had.

What's the error that actually taught you something? Drop it below. I mean it.
The Kubernetes ecosystem is beautiful. Every tool exists to solve a problem that Kubernetes couldn't solve.

You run everything with kubectl. Get pods, describe, logs, exec, delete, apply, 50 times a day across 5 namespaces. It works, but it is slow and painful, especially typing -n namespace in every command.
>> So you use K9s or Lens. A terminal UI that shows your entire cluster in one view. It lets you switch namespaces and clusters, tail logs, exec inside a pod, and do everything you need.

You deploy with kubectl apply from your laptop. Someone changes a deployment directly on the cluster and what is running no longer matches what is in Git. That is drift, and it is silent until prod breaks.
>> So you use ArgoCD. Git becomes the single source of truth, and if anyone touches a deployment manually ArgoCD overrides it back.

Your Kafka consumer has 200,000 messages piling up, CPU is at 5 percent and HPA sees no reason to scale.
>> So you use KEDA. It scales pods on queue depth, SQS message count or Prometheus metrics. Not just CPU. The backlog clears.

HPA adds pods during a spike but nodes are full and new pods sit in Pending.
>> So you use Karpenter. A new node appears in seconds and disappears when the load drops. You only pay for what you use.

Every pod can talk to every other pod by default. Nothing is blocked unless you block it.
>> So you use Network Policies. Your database only accepts traffic from the app. Everything else is denied.

One microservice slows down, retries pile up across 4 others and a cascade begins. You have no visibility because all traffic is invisible.
>> So you use a Service Mesh. Istio or Linkerd gives you mTLS, retries, circuit breaking and traffic metrics without touching a single line of app code.

Your secrets are Base64 encoded in etcd and readable by anyone with kubectl access.
>> So you use the Secrets Store CSI Driver. Secrets live in Vault or AWS Secrets Manager and mount directly into your pod. The secret never lives in Kubernetes.

A developer ships a container running as root, another ships with no resource limits and you find out after the incident.
>> So you use Kyverno. No root containers, no images without a digest, no deployments without limits. Enforced before anything enters the cluster.

Pods are restarting, latency is spiking and memory is climbing but you have no numbers and no way to know when it started.
>> So you use Prometheus and Grafana. Metrics from every pod and node, turned into dashboards. You see the spike and which service caused it.

Grafana shows the spike but not which request triggered it or where it slowed down.
>> So you use Jaeger. It follows one request across every service, shows latency per hop and the exact failure point.

That is the ecosystem. Not a list of tools.

Planning to transition into DevOps/MLOps/AIOps from another domain? My upcoming bootcamp can help. Take a look 👇
https://lnkd.in/gz4CjgFn
25% discount for Indian students
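To make the Network Policies step concrete, here is a minimal sketch of a policy that only lets backend pods reach the database. The labels, namespace, and port are assumptions for the example, not taken from any real cluster.

# Illustrative NetworkPolicy: only the backend may reach the database.
# Labels, namespace, and port are placeholder assumptions.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: db-allow-backend-only
  namespace: prod
spec:
  podSelector:
    matchLabels:
      app: postgres          # the database pods this policy protects
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: backend   # only pods with this label may connect
      ports:
        - protocol: TCP
          port: 5432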
Most homelabs are pets. Mine is cattle.

For years, my home server was a massive, fragile snowflake. I would SSH in late at night, run an update, and pray I didn't break a dependency from six months ago.

At work as a DevOps engineer, I spend my days building resilient, repeatable infrastructure. But at home, I was still practicing "artisanal sysadmin." I decided it was time to close that gap.

I recently finished rebuilding my entire homelab on a single Proxmox node using production-grade patterns.
❌ No manual SSH tweaks.
❌ No in-place OS patching.

Instead, the whole stack is built on three strict rules:

1️⃣ Immutability: While I might update a metadata tag in place, the virtual hardware and OS layer is strictly disposable. Packer builds new versioned images, and Terraform replaces the old VMs entirely.

2️⃣ Stateless Compute: The K3s cluster stores nothing locally. Application workloads are continuously synced by ArgoCD, secrets are pre-seeded, and certificates are restored on bootstrap.

3️⃣ Merge-Driven: Every single change flows through a Pull Request. CI validates, I merge, and the pipeline does the rest.

I wrote up the philosophy behind this architecture and what "immutable" actually looks like in practice in chapter 1 of my new series, Immutable Homelab.

📖 Read the full breakdown here: https://lnkd.in/dXypr3cN
🐙 Check out the code in the GitHub org: https://lnkd.in/d8kqzgkY

#DevOps #Homelab #GitOps #Proxmox #Kubernetes #Terraform #InfrastructureAsCode
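A hedged sketch of what the merge-driven rule can look like as a pull-request validation workflow; the repository layout and the exact checks are assumptions, not the author's actual CI.

# Illustrative PR gate: every change is validated before merge.
name: validate
on:
  pull_request:
    branches: [main]

jobs:
  terraform:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: hashicorp/setup-terraform@v3
      - name: Format and validate
        run: |
          terraform fmt -check -recursive
          terraform init -backend=false
          terraform validate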
Just wrapped up a 3-tier web application deployment on AWS EKS with a full GitOps workflow. Sharing the architecture for anyone building something similar.

Infrastructure Provisioning
Everything is codified with Terraform — AWS Provider v6.x and EKS Module v21.x. One repo provisions the VPC, public and private subnets across AZs, the EKS cluster with managed node groups, and the OIDC provider for IAM roles for service accounts (IRSA). No clicks in the AWS console.

GitOps Delivery
ArgoCD runs inside the cluster, continuously monitoring the Git repository for manifest changes. Developers push to Git, ArgoCD detects the discrepancy between desired and actual state, and syncs automatically. No kubectl apply from laptops, full audit trail in Git, and drift detection out of the box.

Application Architecture (3tirewebapp-dev namespace)
- Frontend tier: React, 3 replicas behind a ClusterIP Service on port 3000
- Backend API tier: Node.js, 3 replicas behind a ClusterIP Service on port 8080
- Database tier: PostgreSQL Deployment with credentials sourced from a Kubernetes Secret on port 5432

Networking and Security
- AWS Load Balancer sits at the edge, routes internet traffic to the frontend Service
- All application tiers run in private subnets — zero direct internet exposure
- Only the load balancer and NAT resources live in public subnets
- East-west traffic flows strictly through Kubernetes Services, so each tier only talks to the next one

Why this design works
Terraform gives reproducible infrastructure. ArgoCD gives declarative, auditable application delivery. The three-tier separation keeps the blast radius small — a frontend bug cannot reach the database directly. And IRSA means no long-lived AWS credentials inside pods.

Building this reinforced how much operational overhead disappears once you commit to Infrastructure as Code and GitOps end to end.

#AWS #EKS #Kubernetes #DevOps #CloudEngineering #Terraform #ArgoCD #GitOps #IaC #CloudNative #CICD
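The GitOps half of a setup like this usually hinges on a single ArgoCD Application pointing at the manifest repo. A sketch along those lines, with the repoURL, path, and project as placeholders rather than the author's real values:

# Illustrative ArgoCD Application for the 3-tier app; repoURL and path are placeholders.
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: 3tirewebapp-dev
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/example-org/3tier-webapp-manifests.git
    targetRevision: main
    path: overlays/dev
  destination:
    server: https://kubernetes.default.svc
    namespace: 3tirewebapp-dev
  syncPolicy:
    automated:
      prune: true       # delete resources removed from Git
      selfHeal: true    # revert manual changes on the cluster (drift correction)
    syncOptions:
      - CreateNamespace=true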
I decided to solve a problem before deployment even began.

The problem is this: building the application first and thinking about infrastructure later. Then production happens. You realize the architecture cannot survive an Availability Zone failure. Credentials are scattered in the wrong places. Infrastructure exists only in someone's memory instead of being defined, reviewed, and version-controlled as code.

I took on this challenge differently.

I completed Phase 1 of deploying a production-grade, cloud-native application on AWS. The application itself (React frontend, Flask backend, and PostgreSQL database) has not been deployed to AWS yet. But the foundation it will run on is already designed with reliability, security, automation, and scalability in mind.

Here's what I built:

⚫ A custom VPC across 3 Availability Zones
Designed with 6 subnets (3 public and 3 private) so application workloads can run privately without public IP exposure.

⚫ Redundant NAT Gateways
One NAT Gateway per Availability Zone, improving availability and reducing the risk of a single AZ failure affecting outbound traffic.

⚫ Terraform remote backend
Terraform state is stored in S3 with DynamoDB state locking, making infrastructure changes safer, trackable, and protected from conflicting updates.

⚫ Private ECR repositories
Configured with immutable image tags and lifecycle policies to prevent accidental image overwrites and reduce unnecessary storage costs.

⚫ Route 53 DNS setup
Configured DNS management for my domain (delegated from Namecheap) and prepared for automated SSL certificate management in a later phase.

⚫ Least-privilege IAM setup for kOps
No root account usage. No hardcoded credentials. Permissions are scoped for the infrastructure workflow.

All resources were created using a single Terraform workflow. No manual console changes. No hidden setup. Everything is defined in code and tracked in Git.

This phase taught me something important: a good deployment does not start when the application is ready. It starts when the infrastructure is designed properly. Because the shortcuts taken in infrastructure often become the incidents teams respond to later.

The next step: building a self-managed Kubernetes cluster on AWS using kOps, with:
📌 3 master/control plane nodes
📌 Private topology
📌 Spot instances for cost optimization
📌 Worker nodes across multiple Availability Zones

I am not just trying to deploy an app. I'm thinking like a Cloud/DevOps Engineer, designing for reliability, security, automation, and real production behavior from the beginning.

If you are building a cloud or DevOps team and want someone who thinks about reliability before it becomes a production problem, I would love to connect.

#CloudEngineering #AWS #DevOps #Terraform #Kubernetes #InfrastructureAsCode #CloudNative #BuildingInPublic #OpenToWork #HiringAlert
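For the kOps phase the post previews, a spot-backed worker instance group spread across Availability Zones is typically declared along these lines. The cluster name, machine type, max price, and subnet names below are placeholder assumptions, not the author's actual values.

# Illustrative kOps InstanceGroup for spot worker nodes across multiple AZs.
# Cluster name, machine type, max price, and subnets are placeholders.
apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  name: nodes-spot
  labels:
    kops.k8s.io/cluster: demo.example.com
spec:
  role: Node
  machineType: t3.medium
  minSize: 2
  maxSize: 6
  maxPrice: "0.03"        # request spot capacity; tune to the instance type's market price
  subnets:
    - us-east-1a
    - us-east-1b
    - us-east-1c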