I'm a Data & AI Engineer who designs and builds end-to-end data solutions, from raw ingestion to AI-powered insights. I specialize in cloud-native data architectures, real-time streaming pipelines, and machine learning integrations built on the modern data stack.
- 🏗️ Building production-grade ETL/ELT pipelines on Azure, AWS & GCP
- 🤖 Developing AI/ML-powered data workflows & RAG (Retrieval-Augmented Generation) systems
- ⚡ Engineering real-time streaming solutions with Apache Spark, Kafka & Flink
- 🗄️ Designing dimensional data models (Star/Snowflake schema) for analytics
- 📊 Optimizing data platforms for scalability, reliability, and performance
- 📚 Microsoft Azure Data Engineer Associate (DP-203): in progress

| Project | Description | Tech Stack |
|---|---|---|
| 🔥 Apache Spark Portfolio | End-to-end Spark data engineering solutions with local vs. global sort optimizations | PySpark, Scala |
| ☁️ Azure Data Engineer (DP-203) | Azure-based ETL/ELT pipelines — preparation for DP-203 certification | Azure Data Factory, Synapse, ADLS |
| 🤖 AI Chat RAG Workflow | Retrieval-Augmented Generation pipeline for intelligent document Q&A | Python, LangChain, OpenAI |
| 📰 News Trend Data Pipeline | Real-time news trend ingestion and analytics pipeline | Python, Airflow, Kafka |
| 🗄️ Dimensional Modeling - NBA | Star schema dimensional model for NBA analytics | SQL, PostgreSQL |
| ☸️ Kubernetes Data Engineer | Containerized data pipeline deployment with Kubernetes | Kubernetes, Docker, Python |
| 📊 SQL Deep Dive | Advanced SQL techniques: window functions, CTEs, optimization | SQL, Jupyter Notebook |

I'm always open to discussing data engineering, AI/ML projects, cloud architecture, or opportunities in consulting and technology.

⭐ "Turning raw data into actionable intelligence, one pipeline at a time."
