Featured Projects
Real-world cloud and DevOps engineering projects demonstrating architecture, automation, and reliability at scale.
AWS DevOps & SRE Control Center
Real-time AWS infrastructure monitoring dashboard with EC2, Lambda, S3, RDS, CloudWatch, IAM, and VPC visibility. Features live metrics, CI/CD pipeline simulation, and GitHub integration.
Enterprise AWS EKS Deployment
Designed and deployed a highly available Kubernetes cluster on AWS EKS spanning public and private subnets. Integrated the AWS Load Balancer Controller, the Horizontal Pod Autoscaler (HPA), and core ingress controllers.
Jenkins Pipeline Shared Library
Created a reusable Jenkins Shared Library in Groovy to standardize CI/CD pipelines across 10+ microservices. Built automated stages for build, containerization, and EKS rollouts.
Prometheus & Grafana Monitoring
Configured full observability for AWS EKS workloads. Set up node exporters, kube-state-metrics, Prometheus alerts, and custom Grafana dashboards.
Automated Terraform IaC Modules
Developed reusable Terraform modules for standard AWS resource provisioning (VPC, EC2, S3, IAM, ASG). Implemented secure remote state locking via DynamoDB.
AWS Cost Optimizer Bot
Wrote Python Lambda functions, triggered by EventBridge Scheduler, that clean up unattached EBS volumes and shut down non-production resources outside working hours.
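The EBS cleanup step could look roughly like the sketch below. It is illustrative only, not the project's actual code: the function names, the `dry_run` flag, and the injected `ec2_client` parameter are assumptions made so the logic can be exercised without AWS access. The `describe_volumes` and `delete_volume` calls are standard boto3 EC2 client methods.

```python
from typing import Any, Dict, List


def find_unattached_volumes(ec2_client: Any) -> List[str]:
    """Return IDs of EBS volumes in the 'available' (unattached) state."""
    volume_ids: List[str] = []
    kwargs: Dict[str, Any] = {
        "Filters": [{"Name": "status", "Values": ["available"]}]
    }
    while True:
        page = ec2_client.describe_volumes(**kwargs)
        volume_ids.extend(v["VolumeId"] for v in page.get("Volumes", []))
        token = page.get("NextToken")
        if not token:
            return volume_ids
        # Follow pagination until no further token is returned.
        kwargs["NextToken"] = token


def lambda_handler(event, context, ec2_client=None, dry_run=True):
    """Hypothetical Lambda entry point; deletes only when dry_run is False."""
    if ec2_client is None:
        import boto3  # imported lazily so the helper runs without AWS installed

        ec2_client = boto3.client("ec2")
    targets = find_unattached_volumes(ec2_client)
    if not dry_run:
        for volume_id in targets:
            ec2_client.delete_volume(VolumeId=volume_id)
    return {"dry_run": dry_run, "volumes": targets}
```

Defaulting to a dry run is a deliberate choice in this sketch: the function reports what it would delete until deletion is explicitly enabled.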
AI & MLOps Initiatives
Production-ready AI agents, Retrieval-Augmented Generation (RAG) copilots, and self-healing infrastructure integrations.
Kubernetes Troubleshooting Agent
A telemetry agent for Kubernetes clusters. It tracks pod restart events, scans active YAML manifests, pulls error logs, and uses local LLMs to suggest step-by-step remediation plans for CrashLoopBackOff, resource starvation, and OOMKilled states.
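As a rough illustration of the detection step that could precede the LLM call, the sketch below inspects a pod object in the JSON shape returned by the Kubernetes API (`status.containerStatuses`). The function name `diagnose_pod` and the `restart_threshold` parameter are hypothetical and not taken from the project.

```python
from typing import Any, Dict, List


def diagnose_pod(pod: Dict[str, Any], restart_threshold: int = 3) -> List[str]:
    """Return human-readable findings for one pod's container statuses."""
    findings: List[str] = []
    name = pod.get("metadata", {}).get("name", "<unknown>")
    for cs in pod.get("status", {}).get("containerStatuses", []):
        container = cs.get("name", "<unknown>")
        waiting = (cs.get("state") or {}).get("waiting") or {}
        last_terminated = (cs.get("lastState") or {}).get("terminated") or {}
        if waiting.get("reason") == "CrashLoopBackOff":
            findings.append(f"{name}/{container}: CrashLoopBackOff")
        if last_terminated.get("reason") == "OOMKilled":
            findings.append(f"{name}/{container}: OOMKilled")
        restarts = cs.get("restartCount", 0)
        if restarts >= restart_threshold:
            findings.append(f"{name}/{container}: {restarts} restarts")
    return findings
```

Findings like these could then be combined with manifests and logs to form the prompt sent to the local model.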
RAG Knowledge Chatbot
An offline Enterprise RAG chatbot that crawls local runbooks and manuals, computes embeddings using nomic-embed-text, stores them in ChromaDB/pgvector, and queries local LLMs for factual DevOps answers.
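The retrieval step of such a chatbot can be sketched in plain Python. This is a simplified stand-in, not the project's implementation: in practice the vectors would come from nomic-embed-text and the similarity search would be handled by ChromaDB or pgvector. The helper names `cosine_similarity` and `top_k` are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(
    query_vec: Sequence[float],
    corpus: Sequence[Tuple[str, Sequence[float]]],
    k: int = 2,
) -> List[Tuple[str, float]]:
    """Return the k (text, score) pairs most similar to the query vector."""
    scored = [(text, cosine_similarity(query_vec, vec)) for text, vec in corpus]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]
```

The top-ranked runbook chunks would then be placed in the LLM prompt as context for the answer.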
DevOps Copilot (LLM Integration)
A code/config explanation assistant integrated across all 15 studios. Automatically parses Helm charts, Terraform configurations, Ansible playbooks, and Dockerfiles to explain SRE implications.
AI-Powered Log Analyzer
Real-time log collector and parser that maps application log streams. It flags critical error stack traces (like Java Maven failures or Nginx proxy breaks) and uses AI to summarize root causes.
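One way the stack-trace flagging might work is sketched below. The regular expressions and the function name `extract_error_blocks` are assumptions chosen for illustration. They cover Java-style traces and Nginx error-log severity tags, and would need tuning for real log formats.

```python
import re
from typing import Iterable, List

# Lines that open an error: Java/Maven-style levels or Nginx severity tags.
ERROR_START = re.compile(r"\bERROR\b|\bFATAL\b|Exception\b|\[(error|crit|alert|emerg)\]")
# Continuation lines that belong to a Java stack trace.
STACK_FRAME = re.compile(r"\s+at\s+\S+|Caused by:")


def extract_error_blocks(lines: Iterable[str]) -> List[List[str]]:
    """Group each error line with the stack-trace lines that follow it."""
    blocks: List[List[str]] = []
    current: List[str] = []
    for raw in lines:
        line = raw.rstrip("\n")
        if STACK_FRAME.match(line):
            if current:  # ignore stray frames with no preceding error line
                current.append(line)
            continue
        if current:
            blocks.append(current)
            current = []
        if ERROR_START.search(line):
            current = [line]
    if current:
        blocks.append(current)
    return blocks
```

Each returned block is a self-contained unit that could be passed to a model for root-cause summarization.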
SRE GenAI Copilot
An automated risk auditor integrated into CI/CD pipelines. It checks Terraform plan files and Kubernetes values files for insufficient scaling capacity and configuration vulnerabilities before rollout.
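A minimal sketch of one such pre-rollout check follows, assuming the plan has been exported to JSON with `terraform show -json`. It reads the `resource_changes` array and each entry's `change.actions` list from that JSON output. The function name `find_destructive_changes` is hypothetical, and the copilot's actual rules are not described in this document.

```python
from typing import Any, Dict, List


def find_destructive_changes(plan: Dict[str, Any]) -> List[str]:
    """List resources that the plan would delete or replace."""
    risky: List[str] = []
    for rc in plan.get("resource_changes", []):
        actions = rc.get("change", {}).get("actions", [])
        if "delete" in actions:
            # A delete paired with a create means the resource is replaced.
            kind = "replace" if "create" in actions else "delete"
            risky.append(f"{rc.get('address', '<unknown>')}: {kind}")
    return risky
```

A pipeline stage could fail the build, or require manual approval, whenever this list is non-empty.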