Transforming data into actionable insights through advanced analytics and machine learning
Data Scientist passionate about solving complex business problems through data-driven approaches. My work spans the entire data science lifecycle: from exploratory analysis and feature engineering to model development, deployment, and monitoring in production environments.

    class DataScientist:
        """Profile summary expressed as a small Python class."""

        def __init__(self):
            self.name = "Iñaki Rosello"
            self.role = "Data Scientist & CS Student"
            self.location = "Buenos Aires, Argentina"

        def expertise(self):
            # Skills grouped by area of work.
            return {
                "analytics": ["EDA", "Statistical Analysis", "Data Visualization"],
                "modeling": ["Classification", "Clustering", "Recommendation Systems"],
                "mlops": ["Model Deployment", "Monitoring", "CI/CD Pipelines"],
                "business": ["Fraud Detection", "Churn Prediction", "Demand Forecasting"],
            }

        def currently_learning(self):
            return ["Deep Learning", "Advanced MLOps", "Price Optimization"]

Customer Churn Prediction with Full MLOps Pipeline
Production ML system for FinTech churn prediction, optimized for Recall (0.76). The pipeline covers SMOTE balancing, model comparison (Random Forest, XGBoost, Logistic Regression), hyperparameter tuning, SHAP explainability, MLflow tracking, Evidently drift detection, and FastAPI + Docker deployment.
Stack: Scikit-learn, MLflow, Evidently, SHAP, FastAPI, Docker
Retail Demand Forecasting & Price Optimization
Complete data science pipeline for demand estimation and optimal pricing strategy. Integrates econometric analysis (price elasticity) with predictive modeling.
Stack: Python, Pandas, Scikit-learn, Optimization
E-commerce Fraud Detection System
Final project for the EDVAI Bootcamp. Production-ready fraud detection using clustering + classification. Includes an API, Docker containerization, and a Gradio UI.
Stack: Scikit-learn, FastAPI, Docker, Gradio
Hybrid Recommendation Engine
Personal project exploring recommendation systems. Implements content-based + collaborative filtering with efficient similarity search using FAISS & ANNOY.
Stack: FAISS, ANNOY, FastAPI, Gradio
Lead Scoring System for the Equestrian Market
System that automatically classifies visitors as Bronze, Silver, or Gold leads in the equestrian market ($50K+ horses). My role: design and implementation of the cascade Lead Scoring pipeline (4 XGBoost models) and an interactive demo on Gradio/HuggingFace. F2 Gold Lead: ~0.51.
Stack: XGBoost, Scikit-learn, MLflow, DagsHub, Gradio
- Computer Science - Universidad de Buenos Aires (in progress)
- Data Science & MLOps Bootcamp - EDVAI (Completed)
- AlixPartners Case Competition 2025 - Participant
- NoCountry Simulation (S02-26-E45) - EquineLead: Lead Scoring & Recommendation System
- NoCountry Simulation - Data Science Team Member (FinTech Churn)
- Continuous Learning: Deep Learning, Advanced Analytics (EDA, Data Preparation), Production ML
Current Learning Path:
- Deep Learning fundamentals and advanced architectures
- Time series forecasting and demand prediction
- Production-grade MLOps practices and automation
- Price optimization and econometric modeling
- Scalable deployment with Docker and cloud services
Core Competencies:
- Analytics: Exploratory Analysis, Statistical Modeling, Business Intelligence
- Machine Learning: Classification, Clustering, Recommendation Systems, Fraud Detection
- MLOps: Model Deployment, Monitoring & Drift Detection, CI/CD Pipelines, Containerization
- Business Applications: Churn Prediction, Demand Forecasting, Price Optimization
FinTech Churn Prediction - Technical Overview
Business Context: Customer retention prediction for a FinTech platform
Challenge: Highly imbalanced dataset (80% no-churn, 20% churn)
Optimization Goal: Maximize Recall to minimize false negatives (missed churners)
Data Pipeline:
- Preprocessing: Feature engineering on `cleaned_data.csv`
- Scaling: StandardScaler for linear models
- Balancing: SMOTE oversampling on training set
- Split: 80/20 train-test with stratification
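The split and balancing steps above can be sketched without any dependencies. This is a simplified illustration, not the project's code: SMOTE is usually taken from the imbalanced-learn library, while `stratified_split` and `smote_like` below are hypothetical helpers and the data is synthetic. The point it shows is the ordering: split first, then oversample only the training portion, so the test set keeps the original 80/20 class ratio.

```python
import math
import random

def stratified_split(labels, test_frac=0.2, seed=0):
    """Return (train_idx, test_idx) keeping class proportions in both parts."""
    rng = random.Random(seed)
    by_class = {}
    for i, y in enumerate(labels):
        by_class.setdefault(y, []).append(i)
    train, test = [], []
    for idx in by_class.values():
        rng.shuffle(idx)
        n_test = round(len(idx) * test_frac)
        test.extend(idx[:n_test])
        train.extend(idx[n_test:])
    return sorted(train), sorted(test)

def smote_like(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating a randomly picked
    minority sample toward one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbours = sorted((p for p in minority if p is not base),
                            key=lambda p: math.dist(base, p))[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic

# Synthetic toy data: 80 non-churners (0) and 20 churners (1), two features.
data_rng = random.Random(42)
X = [[data_rng.gauss(0, 1), data_rng.gauss(0, 1)] for _ in range(80)] + \
    [[data_rng.gauss(2, 1), data_rng.gauss(2, 1)] for _ in range(20)]
y = [0] * 80 + [1] * 20

train_idx, test_idx = stratified_split(y)
X_train = [X[i] for i in train_idx]
y_train = [y[i] for i in train_idx]

# Oversample the minority class in the training set only.
minority = [x for x, label in zip(X_train, y_train) if label == 1]
n_needed = y_train.count(0) - y_train.count(1)
X_bal = X_train + smote_like(minority, n_needed)
y_bal = y_train + [1] * n_needed
```

After this runs, `y_bal` holds equal counts of both classes, while the rows indexed by `test_idx` are untouched.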
Model Development:
| Model | Hyperparameter Tuning | Recall | F1-Score | ROC AUC |
|---|---|---|---|---|
| Random Forest | RandomizedSearchCV → GridSearchCV | 0.66 | 0.57 | 0.83 |
| XGBoost | RandomizedSearchCV → GridSearchCV | 0.55 | 0.59 | 0.84 |
| Logistic Regression (winner) | RandomizedSearchCV → GridSearchCV | 0.76 | 0.56 | 0.84 |
Winner: Logistic Regression with class_weight='balanced'
Reason: Highest Recall (0.76), meeting business requirement of catching churners
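The selection rule can be made explicit with the figures from the table. A minimal sketch: the `results` dictionary simply copies the reported metrics, and `best_by` is a hypothetical helper.

```python
# Metrics copied from the comparison table.
results = {
    "Random Forest": {"recall": 0.66, "f1": 0.57, "roc_auc": 0.83},
    "XGBoost": {"recall": 0.55, "f1": 0.59, "roc_auc": 0.84},
    "Logistic Regression": {"recall": 0.76, "f1": 0.56, "roc_auc": 0.84},
}

def best_by(metric):
    """Return the model name with the highest value for the given metric."""
    return max(results, key=lambda name: results[name][metric])

winner_by_recall = best_by("recall")
winner_by_f1 = best_by("f1")
```

Ranking by F1 instead would pick XGBoost, so the choice of Logistic Regression follows directly from treating Recall as the primary objective.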
MLOps Implementation:
- Experiment Tracking: MLflow for all runs, parameters, and metrics
- Model Explainability: SHAP for feature importance analysis
- Model Artifacts: Serialized model + scaler with pickle
- Monitoring: Evidently AI for drift detection
- Deployment: FastAPI + Docker containerization
- CI/CD: Automated retraining pipeline
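To give an idea of what a drift check computes, here is a dependency-free sketch of the Population Stability Index (PSI) for one numeric feature. It is a stand-in for illustration only: the project uses Evidently, whose API is not shown here, and the `psi` helper, the bin count, and the `ALERT_THRESHOLD` value are all assumptions.

```python
import math

def psi(reference, current, n_bins=10, eps=1e-6):
    """Population Stability Index between two samples of one feature.
    Bins are equal-width over the range of the reference sample."""
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / n_bins or 1.0  # avoid zero width for constant data

    def fractions(values):
        counts = [0] * n_bins
        for v in values:
            b = int((v - lo) / width)
            counts[min(max(b, 0), n_bins - 1)] += 1  # clamp out-of-range values
        # eps keeps empty bins from producing log(0)
        return [max(c / len(values), eps) for c in counts]

    ref, cur = fractions(reference), fractions(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref, cur))

ALERT_THRESHOLD = 0.25  # hypothetical alerting level

training_values = [i / 100 for i in range(100)]       # reference window
production_values = [v + 0.5 for v in training_values]  # shifted window

no_drift = psi(training_values, training_values)
drift = psi(training_values, production_values)
needs_retraining = drift > ALERT_THRESHOLD
```

A score of zero means the two windows fall into the bins identically. A check like this could be the trigger for the automated retraining step.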
Key Learnings:
- SMOTE effectively handled class imbalance
- The linear model achieved higher Recall than the tree-based models, with comparable ROC AUC
- Optimizing for Recall was crucial for business impact
- SHAP provided actionable insights for stakeholders
Fraud Detection System - Architecture & Approach
Project Type: EDVAI Bootcamp Final Project
Domain: E-commerce Transaction Fraud Detection
Methodology:
- Unsupervised Learning: Clustering to identify fraud patterns
- Supervised Learning: Classification on clustered features
- Ensemble Approach: Combined insights from both techniques
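One way to read "classification on clustered features" is to cluster the transactions first and then hand cluster-derived columns to the classifier. The sketch below shows that idea with a plain k-means written from scratch. It is an assumption about the approach, not the project's code: the helper names, the choice of distance-to-centroid features, and the toy transactions are all hypothetical.

```python
import math

def kmeans(points, k, n_iter=10):
    """Plain Lloyd's algorithm; centroids start at evenly spaced points."""
    step = max((len(points) - 1) // max(k - 1, 1), 1)
    centroids = [list(points[min(i * step, len(points) - 1)]) for i in range(k)]
    for _ in range(n_iter):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: math.dist(p, centroids[c]))
            groups[nearest].append(p)
        for c, members in enumerate(groups):
            if members:  # keep the old centroid if a cluster went empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids

def add_cluster_features(points, centroids):
    """Append the distance to every centroid to each feature vector."""
    return [list(p) + [math.dist(p, c) for c in centroids] for p in points]

# Toy transactions with two already-scaled features (hypothetical values).
transactions = [[1.0, 1.0], [1.2, 0.9], [0.8, 1.1],
                [9.0, 9.0], [9.2, 8.8], [8.9, 9.1]]

centroids = kmeans(transactions, k=2)
augmented = add_cluster_features(transactions, centroids)
```

Each row of `augmented` now carries the original features plus one distance per cluster, which a supervised classifier can consume alongside the fraud labels.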
Production Features:
- RESTful API with FastAPI
- Docker containerization for deployment
- Interactive Gradio UI for demos
- Real-time fraud scoring
Technologies: Scikit-learn, FastAPI, Docker, Gradio
EquineLead: Lead Scoring System - Technical Overview
Project Type: NoCountry Job Simulation - S02-26-E45 | Equestrian Market Intelligence
My Role: Lead Scoring Model Design & Implementation + Interactive Gradio Demo
Business Context: Identifying a $50K horse buyer among thousands of casual visitors is the main bottleneck for equestrian sales teams. My contribution: the predictive pipeline that automatically classifies each user as Bronze, Silver, or Gold lead.
Cascade Architecture (4 Models)
    User Behavior
         │
         ▼
    [Step 1] Purchase intent? (Bronze vs Silver/Gold)
         │
         └──► [Step 2] High-value buyer? (Silver vs Gold)
4 models: P1-horse, P2-horse, P1-prods, P2-prods
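The two-step decision can be sketched as a short function. This is a minimal illustration under stated assumptions: the scorers below are hypothetical stand-ins for the trained XGBoost models, the 0.5 threshold is arbitrary, and only one catalog (horses) is shown, so two of the four models appear.

```python
from typing import Callable, Dict

# A scorer maps a user's features to a probability in [0, 1].
Scorer = Callable[[Dict[str, float]], float]

def classify_lead(user: Dict[str, float], intent_model: Scorer,
                  value_model: Scorer, threshold: float = 0.5) -> str:
    """Two-step cascade: step 2 only runs for users who pass step 1."""
    if intent_model(user) < threshold:
        return "Bronze"
    if value_model(user) < threshold:
        return "Silver"
    return "Gold"

# Hypothetical stand-ins for the step 1 and step 2 models.
def toy_intent(user):
    return 0.9 if user["horses_added_to_cart"] > 0 else 0.1

def toy_value(user):
    return 0.8 if user["max_horse_price_viewed"] >= 50_000 else 0.2

casual = {"horses_added_to_cart": 0, "max_horse_price_viewed": 80_000}
buyer = {"horses_added_to_cart": 1, "max_horse_price_viewed": 12_000}
premium = {"horses_added_to_cart": 2, "max_horse_price_viewed": 75_000}

labels = [classify_lead(u, toy_intent, toy_value) for u in (casual, buyer, premium)]
```

A cascade keeps the second, harder question away from users with no purchase intent, so the Silver vs Gold model is only applied to likely buyers.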
Top Predictive Features:
| Feature | Signal |
|---|---|
| `max_horse_price_viewed` | ★★★ |
| `horses_added_to_cart` | ★★★ |
| `max_revisits_same_horse` | ★★★ |
| `prestige_score` (job profile) | ★★ |
Results (XGBoost Tuned): F2 (Gold Lead) ≈ 0.51
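For readers unfamiliar with the metric, F2 is the F-beta score with beta = 2, which weights Recall more heavily than Precision. A small sketch of the standard formula follows; the precision and recall values in the example are made up for illustration and are not the project's results.

```python
def fbeta(precision: float, recall: float, beta: float = 2.0) -> float:
    """F-beta score; beta > 1 weights recall more heavily than precision."""
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)

# Illustrative values only: low precision, high recall.
f1_example = fbeta(0.25, 0.75, beta=1.0)
f2_example = fbeta(0.25, 0.75, beta=2.0)
```

With these inputs F2 comes out higher than F1, because the high Recall counts for more. That suits lead scoring, where a missed Gold lead is costlier than a false alarm.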
My MLOps Contribution:
- Experiment Tracking: MLflow + DagsHub for model registry and metrics
- Explainability: SHAP for feature importance analysis
- Demo: Interactive Gradio app on HuggingFace Spaces
Live Demo:
Movie Recommendation System - Implementation Details
Approach: Hybrid Recommendation System
Components:
- Content-Based Filtering: TF-IDF on movie metadata
- Collaborative Filtering: User-item interaction matrix
- Similarity Search: FAISS & ANNOY for efficient retrieval
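The content-based component can be illustrated with a tiny TF-IDF and cosine-similarity example in pure Python. This is a sketch under assumptions: the catalog, its metadata, and the helper names are hypothetical, and the exhaustive comparison in `most_similar` is the part that an approximate index such as FAISS or ANNOY would replace at scale.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Map each tokenised document to a {term: tf-idf weight} dict."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({
            # Smoothed idf so terms present in every document keep a weight.
            t: (c / len(doc)) * (math.log((1 + n) / (1 + df[t])) + 1)
            for t, c in tf.items()
        })
    return vectors

def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm = math.sqrt(sum(w * w for w in a.values())) * \
           math.sqrt(sum(w * w for w in b.values()))
    return dot / norm if norm else 0.0

def most_similar(index, vectors, k=1):
    """Brute-force top-k neighbours of item `index` (excluding itself)."""
    scores = [(cosine(vectors[index], v), j)
              for j, v in enumerate(vectors) if j != index]
    return [j for _, j in sorted(scores, reverse=True)[:k]]

# Hypothetical movie metadata, already tokenised.
catalog = [
    "space horror crew monster".split(),
    "space horror marines monster".split(),
    "romance comedy london bookshop".split(),
    "romance comedy london christmas".split(),
]

vectors = tfidf_vectors(catalog)
neighbours_of_first = most_similar(0, vectors)
neighbours_of_third = most_similar(2, vectors)
```

In a hybrid setup, scores like these would be combined with collaborative-filtering scores from the user-item matrix; how the project weights the two is not specified here.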
Performance:
- Sub-second response time for recommendations
- Scalable to millions of items
- API-ready deployment
Technologies: FAISS, ANNOY, FastAPI, Gradio
Data Science Projects:
- EquineLead (NoCountry S02-26-E45) - Cascade Lead Scoring with XGBoost, F2 Gold Lead ~0.51, interactive Gradio demo on HuggingFace
- Churn Prediction (FinTech) - Achieved 0.76 Recall through SMOTE + Logistic Regression optimization
- Fraud Detection System - Built a production-ready fraud classifier with a clustering + classification approach
- Recommendation Engine - Implemented a hybrid RecSys with FAISS/ANNOY for sub-second retrieval
- Demand Forecasting - Created an econometric optimization model for retail pricing strategy
Technical Skills:
- End-to-end data science workflows with imbalanced data handling
- Model selection and hyperparameter optimization (RandomizedSearchCV, GridSearchCV)
- Advanced feature engineering and SMOTE oversampling
- Model interpretability with SHAP and explainable AI
- Production deployment with MLflow tracking and Evidently monitoring
- Drift detection and automated model retraining pipelines
Bootcamp & Competitions:
- EDVAI Data Science & MLOps Bootcamp
- AlixPartners 2025 Case Competition - Demand Forecasting Hackathon
- NoCountry S02-26-E45 - EquineLead: Lead Intelligence System for the Equestrian Market
- NoCountry Job Simulation - FinTech Churn Prediction Project
I'm always open to collaborating on data science projects, discussing new techniques, or connecting with fellow data enthusiasts and professionals!
