Feature proposal: Ordered Base64 (OB64) encoding — 22-char sortable representation

## Summary

I'd like to propose adding an **Ordered Base64 (OB64)** encoding to `python-ulid`, which represents a ULID as a **22-character** string that preserves lexicographic sort order — 4 characters shorter than the current Crockford Base32 representation (26 chars).

## Motivation

Crockford Base32 is great, but 26 characters can be a real cost in high-density storage scenarios:

- **Database columns**: a `CHAR(22)` vs `CHAR(26)` primary key saves ~15% in index size per row. At billions of rows this matters.
- **URLs / query strings**: shorter IDs reduce payload size in APIs that embed IDs in paths or parameters.
- **Log lines**: if you emit ULID in every log line, shorter is better.

Standard Base64 / Base64url are the obvious candidates for a more compact encoding, but they are **not** lexicographically ordered — sorting Base64url strings does not sort the underlying binary values. This breaks range queries, cursor pagination, and any system that relies on ID sort order.

## Design

OB64 solves this by choosing a 64-character alphabet that is in **strict ascending ASCII order**, so that string comparison == integer comparison.

**Alphabet** (64 chars, all URL-safe, strictly ascending ASCII):

```
- 0 1 2 3 4 5 6 7 8 9 A B C D E F G H I J K L M N O P Q R S T U V W X Y Z _ a b c d e f g h i j k l m n o p q r s t u v w x y z
```

(`-` = 45, `0–9` = 48–57, `A–Z` = 65–90, `_` = 95, `a–z` = 97–122 — the only 64 printable, URL-safe ASCII characters that can be arranged in strict ascending order.)

**Encoding**: 128 bits packed MSB-first into 22 × 6-bit groups. The final character carries the 2 LSBs of the ULID + 4 zero-padding bits. Because the padding bits are always zero, sort order is preserved end-to-end.

**Comparison table**:

| Encoding | Length | Bits/char | Sorted? | URL-safe? |
|---|---|---|---|---|
| Crockford Base32 (current) | 26 chars | 5 | ✓ | ✓ |
| **OB64 (proposed)** | **22 chars** | **6** | **✓** | **✓** |
| Standard Base64 | 24 chars + `==` | 6 | ✗ | ✗ |
| Base64url (RFC 4648 §5) | 22 chars | 6 | ✗ | ✓ |

## Implementation sketch

```python
ulid = ULID()
ulid.ob64               # → 22-char OB64 string, e.g. '-Oo99nS99brQCjbT_fkYTk'
ULID.from_ob64(s)       # → round-trips back to the same ULID
```

The implementation delegates to C-accelerated `base64.b64encode` + `bytes.translate` (alphabet remapping), so there is no pure-Python bit-manipulation loop. On Apple Silicon / Python 3.13:

- encode: ~3.9 M ops/s (vs ~8.5 M for raw `b64encode`)
- decode: ~2.6 M ops/s (vs ~4.4 M for raw `b64decode`)

The overhead over raw stdlib is exactly the cost of the two additional C calls needed for alphabet remapping.

## Questions for the maintainer

1. Is this within scope for `python-ulid`, or do you prefer to keep the library strictly aligned with the ULID spec (Crockford Base32 only)?
2. If in scope, would you prefer it as a method on `ULID` directly (`ulid.ob64` / `ULID.from_ob64`), or as a separate codec utility (e.g. `ulid.codecs.ob64`)?
3. Any concern about the `-` character being first in the alphabet (all-zero ULID encodes as `----------------------`)?

I have a working implementation with tests, benchmark, and documentation ready — happy to open a PR if there's interest.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Feature proposal: Ordered Base64 (OB64) encoding — 22-char sortable representation #54

Summary

Motivation

Design

Implementation sketch

Questions for the maintainer

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Encoding	Length	Bits/char	Sorted?	URL-safe?
Crockford Base32 (current)	26 chars	5	✓	✓
OB64 (proposed)	22 chars	6	✓	✓
Standard Base64	24 chars + `==`	6	✗	✗
Base64url (RFC 4648 §5)	22 chars	6	✗	✓

Feature proposal: Ordered Base64 (OB64) encoding — 22-char sortable representation #54

Description

Summary

Motivation

Design

Implementation sketch

Questions for the maintainer

Metadata

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Issue actions