Built a production-grade cloud-native microservices platform using Kubernetes, featuring CI/CD automation, blue-green deployments, autoscaling, observability, and disaster recovery. The system simulates real production workloads with fault tolerance, backup/restore, and zero-downtime releases.

adharsh277/devops-microservices-platform

DevOps Microservices Platform

End-to-End DevOps | Kubernetes | CI/CD | Blue-Green | Observability | Resilience

FastAPI | Python | Docker | Docker Compose | Kubernetes | Helm | GitHub Actions | Blue-Green Deployment | Canary Deployment | Prometheus | Grafana | k6

📌 Project Overview

This project demonstrates a real-world DevOps implementation for a microservices-based backend platform, focusing on production readiness, not just deployment.

The project covers:

  • Zero-downtime deployments
  • Safe rollbacks
  • Kubernetes orchestration
  • CI/CD automation
  • Observability and metrics
  • Load testing and resilience
  • Autoscaling behavior and analysis

🧱 Architecture Overview

Architecture Diagram

Microservices

  • Auth Service
  • Orders Service
  • Payments Service
  • Notifications Service

High-Level Flow

Client → Kubernetes Service → Pod (Blue / Green Deployment)

Each service is:

  • Independently containerized
  • Deployed via CI/CD
  • Managed by Kubernetes
  • Observable through metrics

🛠 Technology Stack

  • Backend: FastAPI (Python)
  • Containerization: Docker
  • Orchestration: Kubernetes (AKS)
  • CI/CD: GitHub Actions
  • Deployment Strategy: Blue-Green
  • Monitoring: Prometheus, Grafana
  • Load Testing: k6
  • Autoscaling: Kubernetes HPA

🔵 Phase 0: Environment & Foundations

Goal: Never struggle with setup again

What was done

  • Linux CLI workflow
  • Git branching and commits
  • Python virtual environments
  • Docker fundamentals

Outcome

  • Clean, reproducible development environment

🔵 Phase 1: Backend Microservices

Goal: Understand service behavior and boundaries

What was done

  • Multiple FastAPI services
  • Independent APIs
  • Async request handling
  • Clear service separation

Outcome

  • All services respond correctly
  • APIs are independently deployable

Screenshots

Swagger UI Diagram

API UI Response

🔵 Phase 2: Dockerization

Goal: Application runs identically everywhere

What was done

  • Dockerfile per service
  • Environment-based configuration
  • Docker Compose for local orchestration
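Environment-based configuration can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the variable names `SERVICE_NAME`, `PORT`, and `LOG_LEVEL`, their defaults, and the `ServiceConfig` class are all assumptions made for the example.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceConfig:
    """Settings a service reads once at startup (hypothetical fields)."""
    service_name: str
    port: int
    log_level: str


def load_config(env=None):
    """Build a ServiceConfig from environment variables.

    `env` defaults to os.environ; passing a plain dict makes the
    function easy to exercise without touching the real environment.
    """
    env = os.environ if env is None else env
    return ServiceConfig(
        service_name=env.get("SERVICE_NAME", "auth"),  # assumed default
        port=int(env.get("PORT", "8000")),             # assumed default
        log_level=env.get("LOG_LEVEL", "info"),        # assumed default
    )
```

With this shape, the same image can run under Docker Compose or Kubernetes and only the injected environment differs.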

Outcome

  • Full system starts with a single command
  • Zero manual setup steps

🔵 Phase 3: Kubernetes Core

Goal: Production-grade orchestration

What was done

  • Kubernetes Deployments and Services
  • Namespace isolation
  • ConfigMaps and Secrets
  • Declarative infrastructure

Outcome

  • All services run inside Kubernetes
  • Fully declarative cluster setup

Screenshots

kubectl get pods

kubectl get deployments


🔵 Phase 4: CI/CD & Blue-Green Deployment

Goal: Zero-downtime deployments

What was done

  • GitHub Actions CI/CD pipelines
  • Image build and push on commit
  • Automated Kubernetes deployments
  • Blue-Green deployments for all services
  • Instant rollback via service selector switch
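The selector switch behind the rollback can be sketched as below. This is a hedged illustration only: the label keys `app` and `version`, the colour values, and the helper names are assumptions, and the repository's manifests may label things differently.

```python
import json

COLORS = ("blue", "green")


def other_color(active):
    """Return the idle colour for the currently active one."""
    if active not in COLORS:
        raise ValueError(f"unknown colour: {active!r}")
    return "green" if active == "blue" else "blue"


def selector_patch(app, target_color):
    """Build a Service patch that routes traffic to `target_color` pods.

    The label keys `app` and `version` are hypothetical.
    """
    if target_color not in COLORS:
        raise ValueError(f"unknown colour: {target_color!r}")
    return {"spec": {"selector": {"app": app, "version": target_color}}}


def patch_json(app, active_color):
    """JSON body that flips traffic away from `active_color`."""
    return json.dumps(selector_patch(app, other_color(active_color)))
```

A body produced by `patch_json("orders", "blue")` could be passed to `kubectl patch service orders -p '<json>'`. Rolling back is the same call with the colours reversed, which is why it takes seconds: no pods are rebuilt, only the Service selector changes.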

Why Blue-Green

  • No downtime
  • Safe releases
  • Fast rollback during incidents

Outcome

  • One-click deployments
  • Rollback in seconds

Screenshots

GitHub Actions pipeline runs

Blue–Green Deployment Traffic Switch


🔵 Phase 5: Observability

Goal: See failures before users do

What was done

  • Prometheus metrics collection
  • Grafana dashboards
  • Metrics validation under load

Outcome

  • CPU, memory, and pod metrics visible
  • System behavior measurable
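For orientation, Prometheus scrapes metrics in a plain-text exposition format. The sketch below renders one counter in that format using only the standard library. The metric name `http_requests_total`, its labels, and the `render_counter` helper are hypothetical, and a real service would more likely use a Prometheus client library than hand-rolled formatting.

```python
def render_counter(name, help_text, samples):
    """Render a counter in the Prometheus text exposition format.

    `samples` is a list of (labels_dict, value) pairs.
    """
    lines = [
        f"# HELP {name} {help_text}",
        f"# TYPE {name} counter",
    ]
    for labels, value in samples:
        # Sort labels so the output is deterministic.
        label_str = ",".join(f'{k}="{v}"' for k, v in sorted(labels.items()))
        lines.append(f"{name}{{{label_str}}} {value}")
    return "\n".join(lines) + "\n"


payload = render_counter(
    "http_requests_total",          # hypothetical metric name
    "Total HTTP requests.",
    [({"service": "orders", "status": "200"}, 42)],
)
```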

Screenshots

Grafana dashboards

Prometheus metrics

▶️ Grafana Dashboards Explained
Watch the video


🔵 Phase 6: Backup & Disaster Recovery (Design)

Goal: Disaster recovery strategy

What was evaluated

  • Velero for Kubernetes backups
  • Object-storage-based restore model

Design Decision

  • Platform is stateless
  • No databases or persistent volumes
  • Full DR execution intentionally deferred

Why

  • Stateless workloads rely on redeploy, not restore
  • Design clarity is more important than forced demos

Outcome

  • DR approach clearly understood and documented
  • Clear upgrade path for future stateful workloads

🔵 Phase 7: Load, Failure & Resilience

Load Testing

  • k6 used to generate concurrent traffic
  • Latency and error rates measured
  • System remained stable under load
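k6 reports latency percentiles and failure rates itself, and its test scripts are written in JavaScript. The Python sketch below only illustrates the arithmetic behind those two numbers. The sample data is invented for the example, and counting any status of 400 or above as a failure is an assumption.

```python
def percentile(values, pct):
    """Nearest-rank percentile; `pct` is an integer from 1 to 100."""
    if not values:
        raise ValueError("no samples")
    ordered = sorted(values)
    # Integer ceiling of pct * n / 100 avoids float rounding surprises.
    rank = max(1, (pct * len(ordered) + 99) // 100)
    return ordered[rank - 1]


def summarize(results):
    """Summarize (latency_ms, http_status) pairs from a load test run."""
    latencies = [latency for latency, _ in results]
    failures = sum(1 for _, status in results if status >= 400)
    return {
        "requests": len(results),
        "p95_ms": percentile(latencies, 95),
        "error_rate": failures / len(results),
    }


# Invented sample: 19 fast successes and one slow server error.
sample = [(10, 200)] * 19 + [(500, 500)]
summary = summarize(sample)
```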

Screenshots

k6 load testing results


Autoscaling (HPA)

  • CPU-based Horizontal Pod Autoscaler configured
  • Metrics validated
  • HPA correctly did not scale out under load, because the services were not CPU-bound

Key Insight

Autoscaling should happen only when resource pressure exists.
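The non-scaling result follows from the replica calculation documented for the Horizontal Pod Autoscaler: desired replicas = ceil(current replicas × current metric / target metric), skipped when the ratio is within a tolerance of 1.0 (0.1 by default). The sketch below is a simplified model of that rule. The utilization figures and the min/max bounds are hypothetical, and the real controller also accounts for pod readiness, missing metrics, and stabilization windows.

```python
import math


def desired_replicas(current_replicas, current_cpu_pct, target_cpu_pct,
                     min_replicas=1, max_replicas=10, tolerance=0.1):
    """Simplified HPA replica calculation for a CPU utilization target."""
    ratio = current_cpu_pct / target_cpu_pct
    # Within tolerance of the target: leave the replica count alone.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    desired = math.ceil(current_replicas * ratio)
    # Clamp to the configured bounds.
    return max(min_replicas, min(max_replicas, desired))


# Hypothetical numbers: 2 replicas at 20% CPU against a 70% target.
low_cpu = desired_replicas(2, 20, 70, min_replicas=2)
# Hypothetical numbers: 2 replicas at 140% CPU against a 70% target.
high_cpu = desired_replicas(2, 140, 70, min_replicas=2)
```

In the first case the result stays at the minimum of 2, however many requests are arriving: a service that is I/O-bound rather than CPU-bound never crosses the CPU target, so a CPU-based HPA has no reason to add pods. In the second case the count doubles to 4.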


Pod Failure Simulation

  • Live pods deleted manually
  • Kubernetes self-healing observed
  • No service disruption
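The self-healing seen here comes from the controller comparing desired state with observed state and closing the gap. The toy loop below models that idea only. It is not how Kubernetes is implemented, and the pod names and the `reconcile` helper are invented for the example.

```python
import itertools


def reconcile(desired_count, running_pods, new_names):
    """Return the pod list after one reconcile pass (toy model).

    Missing pods are replaced with fresh names drawn from `new_names`;
    surplus pods are trimmed from the end.
    """
    pods = list(running_pods)
    while len(pods) < desired_count:
        pods.append(next(new_names))
    return pods[:desired_count]


# Three replicas desired; "orders-1" was deleted manually.
names = (f"orders-{i}" for i in itertools.count(3))
after_delete = ["orders-0", "orders-2"]
healed = reconcile(3, after_delete, names)
```

Because the Service routes only to pods that are ready, requests keep landing on the surviving replicas while the replacement starts, which matches the absence of disruption noted above.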

🎯 Key Learnings

  • Zero-downtime is about process, not just tools
  • Observability is mandatory in production
  • Autoscaling must be driven by metrics, not assumptions
  • Knowing when not to scale is critical
  • DevOps is decision-making, not tool collection

🏁 Final Status

✅ Project Complete
🚀 Production-oriented DevOps platform

Author: Adharsh U

📄 Research Paper

This research paper was prepared as part of the design and evaluation of this DevOps platform.

Title: Blue–Green and Canary Deployments in Cloud-Native DevOps
Format: PDF
View Research Paper
