coderGtm/ratelimiter

Custom Rate Limiter

A multi-layered, configurable rate limiter built with Spring Boot. Supports three algorithms, per-user/per-IP/global limiting, and ships with Prometheus + Grafana monitoring and k6 load testing scripts.

Architecture

Request → Interceptor → Global Limiter → User Limiter → IP Limiter → Controller
                              │                │              │
                              └── Metrics (Prometheus) ───────┘

Three layers, evaluated in order:

  1. Global — shared across all clients (protects the server)
  2. Per-User — identified via X-User-Id header or Spring Security principal
  3. Per-IP — identified via X-Forwarded-For or remote address

If any layer rejects, the request gets 429 Too Many Requests.
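The layered check above can be sketched in plain Java, without Spring. This is a minimal illustration, not the repository's code: `FixedBudgetLayer`, `LayeredLimiter`, and `ClientKeys` are hypothetical names, each layer is a simple non-refilling counter standing in for a real algorithm, and taking the first `X-Forwarded-For` entry as the client IP is an assumption.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for one layer: a fixed budget of permits per key, never refilled. */
class FixedBudgetLayer {
    private final int budget;
    private final Map<String, AtomicInteger> used = new ConcurrentHashMap<>();

    FixedBudgetLayer(int budget) {
        this.budget = budget;
    }

    /** Consumes one permit for the key; returns false once the budget is spent. */
    boolean tryAcquire(String key) {
        return used.computeIfAbsent(key, k -> new AtomicInteger()).incrementAndGet() <= budget;
    }
}

/** Evaluates global, then per-user, then per-IP, and stops at the first rejection. */
class LayeredLimiter {
    private final FixedBudgetLayer global;
    private final FixedBudgetLayer perUser;
    private final FixedBudgetLayer perIp;

    LayeredLimiter(FixedBudgetLayer global, FixedBudgetLayer perUser, FixedBudgetLayer perIp) {
        this.global = global;
        this.perUser = perUser;
        this.perIp = perIp;
    }

    /** Returns the status an interceptor would send: 200 to continue, 429 on rejection. */
    int check(String userId, String ip) {
        if (!global.tryAcquire("global")) return 429;
        if (!perUser.tryAcquire(userId)) return 429;
        if (!perIp.tryAcquire(ip)) return 429;
        return 200;
    }
}

/** Assumed key resolution: first X-Forwarded-For entry, else the remote address. */
class ClientKeys {
    static String clientIp(String forwardedFor, String remoteAddr) {
        if (forwardedFor == null) return remoteAddr;
        String first = forwardedFor.split(",", 2)[0].trim();
        return first.isEmpty() ? remoteAddr : first;
    }
}
```

In this sketch a request rejected by a later layer has already consumed a permit from the earlier ones. Whether the actual interceptor behaves that way is not stated here.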

Algorithms

| Algorithm | Best For | Behavior |
|---|---|---|
| TOKEN_BUCKET | Bursty traffic | Allows short bursts up to capacity, refills at steady rate |
| SLIDING_WINDOW | Strict rate enforcement | Hard cap per time window, resets when window slides |
| LEAKY_BUCKET | Smooth output rate | Queues requests conceptually, processes at constant rate |
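To make the difference between burst tolerance and a hard cap concrete, here is a minimal sketch of two of the algorithms. The class names, constructor parameters, and the injectable `LongSupplier` clock are assumptions for illustration. `SlidingWindowLog` is one common sliding-window variant that stores a timestamp per accepted request; the repository's implementation may differ.

```java
import java.util.ArrayDeque;
import java.util.function.LongSupplier;

/** Token bucket: starts full, spends one token per request, refills continuously. */
class TokenBucket {
    private final double capacity;
    private final double refillPerSecond;
    private final LongSupplier nanoClock;
    private double tokens;
    private long lastRefill;

    TokenBucket(int capacity, double refillPerSecond, LongSupplier nanoClock) {
        this.capacity = capacity;
        this.refillPerSecond = refillPerSecond;
        this.nanoClock = nanoClock;
        this.tokens = capacity;
        this.lastRefill = nanoClock.getAsLong();
    }

    synchronized boolean tryAcquire() {
        long now = nanoClock.getAsLong();
        double elapsedSeconds = (now - lastRefill) / 1_000_000_000.0;
        // Add the tokens earned since the last call, never exceeding capacity.
        tokens = Math.min(capacity, tokens + elapsedSeconds * refillPerSecond);
        lastRefill = now;
        if (tokens >= 1.0) {
            tokens -= 1.0;
            return true;
        }
        return false;
    }
}

/** Sliding window log: at most `limit` accepted requests in any window of `windowNanos`. */
class SlidingWindowLog {
    private final int limit;
    private final long windowNanos;
    private final LongSupplier nanoClock;
    private final ArrayDeque<Long> accepted = new ArrayDeque<>();

    SlidingWindowLog(int limit, long windowNanos, LongSupplier nanoClock) {
        this.limit = limit;
        this.windowNanos = windowNanos;
        this.nanoClock = nanoClock;
    }

    synchronized boolean tryAcquire() {
        long now = nanoClock.getAsLong();
        // Drop timestamps that have aged out of the window.
        while (!accepted.isEmpty() && now - accepted.peekFirst() >= windowNanos) {
            accepted.pollFirst();
        }
        if (accepted.size() < limit) {
            accepted.addLast(now);
            return true;
        }
        return false;
    }
}
```

In production the clock would typically be `System::nanoTime`; passing it in keeps the refill and window logic testable without sleeping.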

Configure in application.yml:

rate-limiter:
  algorithm: TOKEN_BUCKET  # TOKEN_BUCKET, SLIDING_WINDOW, LEAKY_BUCKET
  limits:
    global:
      requests-per-second: 1000
    per-user:
      requests-per-second: 100
    per-ip:
      requests-per-second: 100

Quick Start

Run locally

./mvnw spring-boot:run

Run with Docker (app + Prometheus + Grafana)

docker-compose up --build

API Endpoints

| Method | Path | Rate Limited | Description |
|---|---|---|---|
| GET | /api/ping | Yes | Health check (use as load test target) |
| GET | /api/public/ping | No | Unprotected health check |
| GET | /api/admin/config | No | View current rate limiter configuration |
| GET | /actuator/prometheus | No | Prometheus metrics endpoint |

Load Testing with k6

Install k6 (macOS, via Homebrew): brew install k6

# Steady 20 req/s baseline
k6 run k6/uniform.js

# Burst: 200 req/s then quiet (TOKEN_BUCKET vs LEAKY_BUCKET comparison)
k6 run k6/burst.js

# Ramp: 0 → 300 req/s over 60s (find the breaking point)
k6 run k6/ramp.js

# Spike: normal → 500 req/s spike → normal
k6 run k6/spike.js

# Multi-user: heavy users get blocked, light users pass
k6 run k6/multi-user.js

# Multi-IP: abusive IPs get blocked, normal IPs pass
k6 run k6/multi-ip.js

# Full demo: all 3 layers activated in sequence (run with Grafana open)
k6 run k6/full-demo.js

Testing

./mvnw test

Unit tests — each algorithm tested in isolation:

  • Allows requests up to capacity, blocks after exhaustion
  • Remaining permits decrease correctly
  • Refill/leak/window-reset behavior after time elapses
  • Factory creates the correct algorithm from enum and string inputs, defaults to token bucket for unknown values
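The factory behavior described in the last bullet could look roughly like the sketch below. The enum constants come from the configuration section; `AlgorithmParser`, its `parse` method, and the case-insensitive, null-tolerant handling are assumptions rather than the repository's API.

```java
import java.util.Locale;

enum Algorithm { TOKEN_BUCKET, SLIDING_WINDOW, LEAKY_BUCKET }

/** Hypothetical helper: resolves a configured string to an algorithm. */
final class AlgorithmParser {
    private AlgorithmParser() {}

    /** Unknown or missing values fall back to TOKEN_BUCKET instead of failing. */
    static Algorithm parse(String raw) {
        if (raw == null) {
            return Algorithm.TOKEN_BUCKET;
        }
        try {
            return Algorithm.valueOf(raw.trim().toUpperCase(Locale.ROOT));
        } catch (IllegalArgumentException e) {
            return Algorithm.TOKEN_BUCKET;
        }
    }
}
```

Falling back silently keeps a typo in application.yml from breaking startup, at the cost of running a different algorithm than intended, so logging the fallback would be a reasonable addition.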

Integration tests — full Spring Boot context with MockMvc:

  • Ping and admin endpoints return expected responses
  • Per-IP blocking: 6th request from the same IP gets 429
  • Per-user blocking: user exceeds quota across different IPs, still gets blocked
  • Independent limits: exhausting one IP doesn't affect another
  • Admin and public endpoints are excluded from rate limiting

Tech Stack

  • Java 25, Spring Boot 4.0
  • Micrometer + Prometheus + Grafana (observability)
  • Caffeine (TTL-based cache for per-user/IP limiters)
  • k6 (load testing)
  • Docker + Docker Compose
  • GitHub Actions CI
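The Caffeine entry above refers to keeping one limiter per user or IP and letting idle ones expire. The idea can be sketched with the standard library alone. This is a stand-in for illustration, not Caffeine's API: `ExpiringRegistry` and its methods are hypothetical, and expiry here is based on time since last access.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/** Stdlib stand-in for a TTL cache: a value expires after being idle for ttlNanos. */
class ExpiringRegistry<V> {
    private static final class Entry<T> {
        final T value;
        final long lastAccess;

        Entry(T value, long lastAccess) {
            this.value = value;
            this.lastAccess = lastAccess;
        }
    }

    private final Map<String, Entry<V>> entries = new ConcurrentHashMap<>();
    private final long ttlNanos;
    private final LongSupplier nanoClock;
    private final Supplier<V> factory;

    ExpiringRegistry(long ttlNanos, LongSupplier nanoClock, Supplier<V> factory) {
        this.ttlNanos = ttlNanos;
        this.nanoClock = nanoClock;
        this.factory = factory;
    }

    /** Returns the value for the key, creating a fresh one if absent or expired. */
    V get(String key) {
        long now = nanoClock.getAsLong();
        return entries.compute(key, (k, old) -> {
            boolean expired = old == null || now - old.lastAccess >= ttlNanos;
            return new Entry<>(expired ? factory.get() : old.value, now);
        }).value;
    }

    /** Drops idle entries so abandoned keys do not accumulate; returns how many remain. */
    int evictExpired() {
        long now = nanoClock.getAsLong();
        entries.values().removeIf(e -> now - e.lastAccess >= ttlNanos);
        return entries.size();
    }
}
```

A dedicated cache library handles eviction scheduling and size bounds for you, which is the usual reason to prefer it over a hand-rolled map like this one.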
