Teacher-student RL prototype for Geometry Dash with a mock training pipeline and a Geode shared-memory bridge.
Implemented:
- Gym-compatible privileged env (`GDPrivilegedEnv`) with mock IPC fallback
- Teacher PPO training entrypoint (SB3)
- Dataset collector writing compressed NPZ shards
- Student distillation trainer with weighted mode sampling
- Geode shared-memory adapter on the Python side
- Geode mod skeleton that writes core telemetry (`obs[0..7]`) + level-complete flag
Not finished yet:
- Real screen-frame capture in the collector
- Input injection from `action_in` into live gameplay
- Full production training/inference loop against real game frames
| Path | Purpose |
|---|---|
| `src/gdrl/env/privileged_env.py` | Gym env, reward logic, mock IPC adapter |
| `src/gdrl/teacher/train_ppo.py` | Teacher PPO training (saves SB3 model) |
| `src/gdrl/data/collector_cli.py` | Rollout collection + NPZ export |
| `src/gdrl/student/train_distill_dataset.py` | Student distillation from NPZ |
| `src/gdrl/student/train_distill.py` | Sanity-only random train step |
| `src/gdrl/env/geode_ipc.py` | Python shared-memory adapter (`gdrl_ipc`) |
| `src/gdrl/env/geode_ipc_test.py` | Fake writer/reader IPC smoke test |
| `src/gdrl/env/live_monitor.py` | Live telemetry monitor for Geode segment |
| `geode_mod/GDRLBridge/src/main.cpp` | Geode mod hook implementation |
| `docs/` | IPC protocol + implementation planning docs |
- Python >=3.10
- Virtualenv recommended
- Python deps from `requirements.txt` (`torch`, `stable-baselines3`, `gymnasium`, etc.)
- Optional Geode path requires Geometry Dash 2.2081 and Geode SDK (as required by `geode_mod/GDRLBridge/CMakeLists.txt`)
python -m venv .venv
source .venv/bin/activate
pip install -U pip
pip install -r requirements.txt
pip install -e .
mkdir -p artifacts
Or:
make setup
- Train teacher:
python -m gdrl.teacher.train_ppo
Expected output model:
artifacts/teacher_mock.zip
- Collect dataset shard from teacher rollouts:
python -m gdrl.data.collector_cli \
--teacher-path artifacts/teacher_mock.zip \
--episodes 3 \
--out artifacts/datasets/mock_shard_000.npz
- Train student from NPZ:
python -m gdrl.student.train_distill_dataset \
--data artifacts/datasets/mock_shard_000.npz \
--epochs 2 \
--out artifacts/student_mock.pt
- Optional quick sanity step (no dataset required):
python -m gdrl.student.train_distill
- Optional env smoke run:
python -m gdrl.env.run_env_smoke --mode mock --steps 10
`src/gdrl/data/collector.py` writes:
- `frames`: `uint8`, shape `[N, 84, 84]`
- `teacher_probs`: `float32`, shape `[N, 2]`
- `modes`: `int32`, shape `[N]`
- `level_ids`: `int32`, shape `[N]`
- `frame_idxs`: `int64`, shape `[N]`
Notes:
- Distillation stacks 4 consecutive frames.
- You need at least 5 frames in a shard or `train_distill_dataset` raises: `Dataset too small for stack_size=4`.
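To make the shard layout and the frame-stacking note concrete, here is a minimal sketch that builds an in-memory shard with the documented keys, round-trips it through compressed NPZ, and stacks 4 consecutive frames. The helper names (`make_fake_shard`, `save_and_reload`, `stack_frames`) are hypothetical, and the sliding-window indexing is an assumption; the real trainer in `train_distill_dataset` may align windows and targets differently.

```python
import io

import numpy as np

STACK_SIZE = 4  # frame-stack depth stated in the notes above


def make_fake_shard(n: int) -> dict:
    """Build random arrays with the documented keys, dtypes, and shapes."""
    rng = np.random.default_rng(0)
    probs = rng.random((n, 2), dtype=np.float32)
    probs /= probs.sum(axis=1, keepdims=True)  # rows sum to ~1, like action probabilities
    return {
        "frames": rng.integers(0, 256, size=(n, 84, 84), dtype=np.uint8),
        "teacher_probs": probs,
        "modes": np.zeros(n, dtype=np.int32),
        "level_ids": np.zeros(n, dtype=np.int32),
        "frame_idxs": np.arange(n, dtype=np.int64),
    }


def save_and_reload(shard: dict) -> dict:
    """Round-trip a shard through a compressed NPZ held in memory."""
    buf = io.BytesIO()
    np.savez_compressed(buf, **shard)
    buf.seek(0)
    with np.load(buf) as data:
        return {key: data[key] for key in data.files}


def stack_frames(frames: np.ndarray, stack_size: int = STACK_SIZE) -> np.ndarray:
    """Return overlapping windows shaped [N - stack_size + 1, stack_size, 84, 84]."""
    n = frames.shape[0]
    # Mirrors the documented minimum: a 4-frame shard is rejected, 5 is accepted.
    if n <= stack_size:
        raise ValueError(f"Dataset too small for stack_size={stack_size}")
    windows = np.lib.stride_tricks.sliding_window_view(frames, stack_size, axis=0)
    # sliding_window_view appends the window axis last; move it next to the batch axis.
    return np.moveaxis(windows, -1, 1)
```

A quick check such as `stack_frames(save_and_reload(make_fake_shard(6))["frames"]).shape` can confirm that a shard is large enough before starting a distillation run.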
Shared memory segment:
- name: `gdrl_ipc` (Python side)
- size: 444 bytes
| Field | Type | Notes |
|---|---|---|
| `version` | `uint32` | Protocol version |
| `tick` | `uint32` | Incremented every update |
| `obs` | `float32[108]` | Privileged observation vector |
| `action_in` | `uint8` | Action byte written by Python |
| `reserved` | `uint8[3]` | `reserved[0]` is level-complete flag |
Action semantics:
- `0`: release/no-click
- `1`: press/hold
Reference: docs/GEODE_IPC_PROTOCOL.md.
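The field table adds up to the stated segment size: 4 + 4 + 108 × 4 + 1 + 3 = 444 bytes. The sketch below checks that arithmetic with Python's `struct` module and decodes a buffer into named fields. It assumes the fields are packed in table order, little-endian, with no padding; `docs/GEODE_IPC_PROTOCOL.md` remains the authoritative layout, and the names `IPC_FORMAT`, `pack_segment`, and `unpack_segment` are hypothetical.

```python
import struct

# Assumed layout: table order, little-endian, no padding.
# uint32 version, uint32 tick, float32 obs[108], uint8 action_in, uint8 reserved[3]
IPC_FORMAT = "<II108fB3B"
IPC_SIZE = struct.calcsize(IPC_FORMAT)  # 4 + 4 + 432 + 1 + 3 = 444
OBS_LEN = 108


def pack_segment(version, tick, obs, action_in, reserved=(0, 0, 0)) -> bytes:
    """Serialize one segment snapshot; handy for fake writers in tests."""
    if len(obs) != OBS_LEN:
        raise ValueError(f"obs must have {OBS_LEN} entries")
    return struct.pack(IPC_FORMAT, version, tick, *obs, action_in, *reserved)


def unpack_segment(buf: bytes) -> dict:
    """Decode the first IPC_SIZE bytes of a buffer into named fields."""
    fields = struct.unpack(IPC_FORMAT, bytes(buf[:IPC_SIZE]))
    return {
        "version": fields[0],
        "tick": fields[1],
        "obs": list(fields[2:2 + OBS_LEN]),
        "action_in": fields[2 + OBS_LEN],
        "level_complete": fields[3 + OBS_LEN] == 1,  # reserved[0]
    }
```

Under the same assumptions, `action_in` would sit at byte offset 440 (4 + 4 + 432), so a Python-side writer only needs to touch a single byte to send an action.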
Terminal A:
python -m gdrl.env.geode_ipc_test writer
Terminal B:
python -m gdrl.env.geode_ipc_test reader
Or run env smoke against shared memory:
python -m gdrl.env.run_env_smoke --mode geode --steps 10
- Build the Geode mod (example CMake flow):
export GEODE_SDK=/path/to/geode-sdk
cmake -S geode_mod/GDRLBridge -B geode_mod/GDRLBridge/build
cmake --build geode_mod/GDRLBridge/build -j
- Install/load the built mod in Geometry Dash via Geode.
- Start a level, then verify the segment appears:
python -m gdrl.env.geode_wait
- Monitor live values:
python -m gdrl.env.live_monitor --print-every 25
- Drive the Gym env from shared memory:
python -m gdrl.env.run_env_smoke --mode geode --steps 20
- If a stale segment remains after crashes:
python -m gdrl.env.geode_shm_cleanup --shm-name gdrl_ipc
Current implementation in `geode_mod/GDRLBridge/src/main.cpp`:
- Opens/creates POSIX shared memory `/gdrl_ipc`
- Writes `version`, increments `tick` every `PlayLayer::postUpdate`
- Fills core telemetry: `obs[0]=x`, `obs[1]=y`, `obs[2]=y_vel`, `obs[3]=x_delta`, `obs[4]=on_ground`, `obs[5]=is_dead`, `obs[6]=speed` (placeholder), `obs[7]=mode`
- Sets `reserved[0]=1` on `levelComplete`
Pending:
- Applying `action_in` to actual jump/hold input (currently read only)
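For readers consuming the telemetry on the Python side, the eight core slots can be mapped to named fields. This is a minimal sketch, not the adapter's actual code: `CoreTelemetry` and `decode_core` are hypothetical names, and it assumes the boolean flags are written as 0.0/1.0 and `mode` as an integer-valued float.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class CoreTelemetry:
    x: float
    y: float
    y_vel: float
    x_delta: float
    on_ground: bool
    is_dead: bool
    speed: float  # placeholder value in the current mod
    mode: int


def decode_core(obs: Sequence[float]) -> CoreTelemetry:
    """Map obs[0..7] to named fields; remaining slots are ignored."""
    if len(obs) < 8:
        raise ValueError("expected at least 8 observation values")
    return CoreTelemetry(
        x=float(obs[0]),
        y=float(obs[1]),
        y_vel=float(obs[2]),
        x_delta=float(obs[3]),
        on_ground=obs[4] > 0.5,   # assumption: flag stored as 0.0 or 1.0
        is_dead=obs[5] > 0.5,     # assumption: flag stored as 0.0 or 1.0
        speed=float(obs[6]),
        mode=int(round(obs[7])),  # assumption: integer-valued float
    )
```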
- `FileNotFoundError` or timeout waiting for segment: make sure a level is running and the Geode mod is loaded.
- Version mismatch in Python adapter: compare `EXPECTED_VERSION` in `src/gdrl/env/geode_ipc.py` with `IPC_VERSION` in `geode_mod/GDRLBridge/src/main.cpp`.
- Dataset trainer says shard is too small: collect more frames by increasing `--episodes`.
- Stale shared-memory segment: run `python -m gdrl.env.geode_shm_cleanup --shm-name gdrl_ipc`.
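The first item above (a `FileNotFoundError` that turns into a timeout) follows the usual pattern of polling for a segment until it exists. The sketch below illustrates that pattern with the standard library's `multiprocessing.shared_memory`; it is an assumption that the project's adapter works this way, and `wait_for_segment` and its parameters are hypothetical. The `opener` argument exists only so the retry logic can be exercised without a running game.

```python
import time
from multiprocessing import shared_memory
from typing import Callable


def _open_existing(name: str) -> shared_memory.SharedMemory:
    # Attach to an existing segment; raises FileNotFoundError if it is absent.
    return shared_memory.SharedMemory(name=name, create=False)


def wait_for_segment(
    name: str = "gdrl_ipc",
    timeout_s: float = 10.0,
    poll_s: float = 0.1,
    opener: Callable[[str], object] = _open_existing,
):
    """Retry opening the segment until it exists or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            return opener(name)
        except FileNotFoundError:
            if time.monotonic() >= deadline:
                raise TimeoutError(
                    f"segment {name!r} did not appear within {timeout_s}s"
                ) from None
            time.sleep(poll_s)
```

On POSIX systems, Python's `SharedMemory` adds the leading slash to the name itself, which would be consistent with the Python side using `gdrl_ipc` while the mod opens `/gdrl_ipc`.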
- `docs/IMPLEMENTATION_PLAN.md`
- `docs/GEODE_IPC_PROTOCOL.md`
- `docs/GEODE_HOOK_PLAN.md`