Alonso Urbano1, David W. Romero2, Max Zimmer1, Sebastian Pokutta1,3
1 Zuse Institute Berlin (ZIB) 2 Cartesia AI 3 Technische Universität Berlin
Minimal reproduction codebase for the paper.
@misc{urbano2026reconrobustsymmetrydiscovery,
title={RECON: Robust symmetry discovery via Explicit Canonical Orientation Normalization},
author={Alonso Urbano and David W. Romero and Max Zimmer and Sebastian Pokutta},
year={2026},
eprint={2505.13289},
archivePrefix={arXiv},
primaryClass={cs.LG}
}
recon/: Python package (images + GEOM/QM9 + OOD + test-time canonicalization).
configs/: JSON configs for dataset building and training runs.
conda env create -f environment.yml
conda activate recon
pip install -e .
All dataset paths in configs and CLI defaults use scratch_root (defaults to ".").
Set it to wherever your data lives, e.g. --scratch_root /data/myuser or edit the
JSON configs directly. Dataset paths like datasets/mnist_processed/mnist6090_train.pkl
are resolved relative to scratch_root.
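To make the path rule above concrete, here is a minimal sketch of resolving a dataset path against scratch_root. The helper name resolve_dataset_path is hypothetical and is not taken from the recon package; the package's own resolution logic may differ in details.

```python
from pathlib import Path


def resolve_dataset_path(relative_path: str, scratch_root: str = ".") -> Path:
    """Hypothetical helper (not part of recon): join a dataset path onto scratch_root."""
    return Path(scratch_root) / relative_path


rel = "datasets/mnist_processed/mnist6090_train.pkl"

# With the default scratch_root ("."), the path stays relative to the working directory.
print(resolve_dataset_path(rel))

# With an explicit scratch_root, the same relative path lands under that directory.
print(resolve_dataset_path(rel, "/data/myuser"))
```

With pathlib, joining an absolute path onto scratch_root returns the absolute path unchanged; whether the recon CLIs behave the same way is not stated here.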
Images:
python3 -m recon.cli.make_dataset_images \
--config configs/datasets/images/mnist6090.json
FashionMNIST variant:
python3 -m recon.cli.make_dataset_images \
--config configs/datasets/images/fashion_gaussian_x3.json
GEOM/QM9:
python3 -m recon.cli.make_dataset_geom_qm9 \
--config configs/datasets/geom/geom_rmsd1.5_n64_F3axes.json
Images (MNIST default):
python3 -m recon.cli.train_images \
--config_path configs/train/images/mnist_ssl_default.json
Images (Fashion default):
python3 -m recon.cli.train_images \
--config_path configs/train/images/fashion_ssl_default.json
GEOM/QM9 (default):
python3 -m recon.cli.train_geom \
--config_path configs/train/geom/geom_ssl_default.json
Images (uses the upright MNIST test set by default for on-the-fly random rotations):
python3 -m recon.cli.ood_images \
--checkpoint_path /path/to/best_theta_model.pt \
--dataset_name mnist
The --data_path_test flag defaults to datasets/mnist/mnist_test.amat (relative to
scratch_root). This must be the vanilla test set; random rotations are applied on the
fly during evaluation.
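As an illustration of what "random rotations applied on the fly" can mean, the following pure-Python sketch rotates a square image by a freshly sampled angle each time it is called. It is not the repository's implementation: the function names, the nearest-neighbor resampling, the zero fill, and the angle range (max_angle_deg) are all assumptions made for this example.

```python
import math
import random


def rotate_image(image, angle_deg):
    """Rotate a square 2D list about its center (nearest neighbor, zero fill)."""
    n = len(image)
    c = (n - 1) / 2.0
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    out = [[0] * n for _ in range(n)]
    for y in range(n):
        for x in range(n):
            # Inverse-map each output pixel back to its source location.
            dx, dy = x - c, y - c
            sx = round(cos_t * dx + sin_t * dy + c)
            sy = round(-sin_t * dx + cos_t * dy + c)
            if 0 <= sx < n and 0 <= sy < n:
                out[y][x] = image[sy][sx]
    return out


def random_rotation(image, rng, max_angle_deg=180.0):
    """Sample an angle uniformly from [-max_angle_deg, max_angle_deg] and rotate.

    The angle range is an assumption for this sketch, not a documented default.
    """
    angle = rng.uniform(-max_angle_deg, max_angle_deg)
    return rotate_image(image, angle)


# The stored test image stays upright; a new rotation is drawn per evaluation call.
upright = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
rng = random.Random(0)
rotated = random_rotation(upright, rng)
```

Because the rotation is sampled at evaluation time, a test file that is already rotated would be rotated twice, which is why the vanilla test set is required.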
GEOM/QM9:
First build an OOD dataset from an ID GEOM dataset:
python3 -m recon.cli.make_ood_geom_dataset \
--base_dataset_path /path/to/id_dataset.pt \
--output_dataset_path /path/to/ood_dataset.pt
Then run the OOD evaluation on it:
python3 -m recon.cli.ood_geom \
--ie_encoder_checkpoint_path /path/to/pretrain_best_encoder_model.pt \
--centering_predictor_checkpoint_path /path/to/ssl_best_model_Gamma_model.pt \
--matrix_fisher_predictor_checkpoint_path /path/to/ssl_best_model_F_model.pt \
--ood_dataset_path /path/to/ood_dataset.pt
Train a ResNet-18 on upright MNIST 12k (no augmentations), then evaluate on a rotated test set under no canonicalization, IE-AE, and RECON:
python3 -m recon.cli.canonicalization_benchmark \
--dataset_name mnist \
--rotated_test_pkl /path/to/mnist6090_test.pkl \
--ssl_checkpoint /path/to/best_theta_model.pt \
--scratch_root /path/to/data_root \
--out_dir ./canon_benchmark_out
To skip classifier training and reuse a saved checkpoint:
python3 -m recon.cli.canonicalization_benchmark \
--dataset_name mnist \
--rotated_test_pkl /path/to/mnist6090_test.pkl \
--ssl_checkpoint /path/to/best_theta_model.pt \
--classifier_checkpoint ./canon_benchmark_out/best_classifier.pt \
--scratch_root /path/to/data_root \
--out_dir ./canon_benchmark_out
Train a GCN on gt-aligned GEOM-QM9 conformers (SMILES classification), then evaluate on the augmented (rotated) test set:
python3 -m recon.cli.canonicalization_benchmark_geom \
--train_dataset /path/to/gt_aligned_train.pt \
--val_dataset /path/to/gt_aligned_val.pt \
--test_dataset /path/to/augmented_test.pt \
--ie_encoder_ckpt /path/to/pretrain_best_encoder_model.pt \
--centering_ckpt /path/to/ssl_best_model_Gamma_model.pt \
--out_dir ./geom_benchmark_out
To skip classifier training:
python3 -m recon.cli.canonicalization_benchmark_geom \
--train_dataset /path/to/gt_aligned_train.pt \
--val_dataset /path/to/gt_aligned_val.pt \
--test_dataset /path/to/augmented_test.pt \
--ie_encoder_ckpt /path/to/pretrain_best_encoder_model.pt \
--centering_ckpt /path/to/ssl_best_model_Gamma_model.pt \
--classifier_checkpoint ./geom_benchmark_out/best_gcn_classifier.pt \
--out_dir ./geom_benchmark_out
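The two GEOM benchmark invocations above differ only in the optional --classifier_checkpoint flag. When scripting repeated runs, the argument list can be assembled programmatically; the sketch below does this using only the flags shown above. The function build_geom_benchmark_cmd is a hypothetical convenience wrapper and is not part of the recon package.

```python
import shlex

# Flags taken verbatim from the commands above.
REQUIRED_FLAGS = (
    "train_dataset",
    "val_dataset",
    "test_dataset",
    "ie_encoder_ckpt",
    "centering_ckpt",
    "out_dir",
)


def build_geom_benchmark_cmd(paths, classifier_checkpoint=None):
    """Hypothetical wrapper: build the argv list for the GEOM benchmark CLI."""
    cmd = ["python3", "-m", "recon.cli.canonicalization_benchmark_geom"]
    for flag in REQUIRED_FLAGS:
        cmd += [f"--{flag}", paths[flag]]
    # Passing a saved classifier skips classifier training.
    if classifier_checkpoint is not None:
        cmd += ["--classifier_checkpoint", classifier_checkpoint]
    return cmd


paths = {
    "train_dataset": "/path/to/gt_aligned_train.pt",
    "val_dataset": "/path/to/gt_aligned_val.pt",
    "test_dataset": "/path/to/augmented_test.pt",
    "ie_encoder_ckpt": "/path/to/pretrain_best_encoder_model.pt",
    "centering_ckpt": "/path/to/ssl_best_model_Gamma_model.pt",
    "out_dir": "./geom_benchmark_out",
}

# Print the command for inspection instead of running it.
print(shlex.join(build_geom_benchmark_cmd(paths)))
```

The returned list can be handed to subprocess.run(cmd, check=True) once the placeholder paths are replaced with real files.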