feat(flatkv): adds per-DB LtHash tracking to the FlatKV commit store #3074

Open

blindchaser wants to merge 4 commits into main from yiren/lthash-per-db
Conversation

@blindchaser (Contributor)
Summary

Adds per-DB LtHash tracking to each of the four FlatKV data databases (account, code, storage, legacy) alongside the existing global LtHash. The global LtHash is now derived from per-DB hashes via the homomorphic property (sum of per-DB = global), eliminating an independent computation path and making the invariant structural.

Persistence model: authoritative per-DB hashes live in metadataDB (written atomically with the global hash in commitGlobalMetadata); secondary copies are embedded in each DB's LocalMeta for per-DB integrity verification.

  • keys.go: Extends LocalMeta with an optional *lthash.LtHash field. MarshalLocalMeta/UnmarshalLocalMeta accept both 8-byte (version-only, old format) and 2056-byte (version + LtHash) formats. Rejects any other length.
  • store.go: Adds perDBCommittedLtHash and perDBWorkingLtHash maps to CommitStore. loadGlobalMetadata now calls loadPerDBLtHashes to read per-DB hashes from metadataDB; missing keys initialize to zero (fresh start).
  • store_meta.go: commitGlobalMetadata atomically writes per-DB hashes in the same batch as global version and global hash. snapshotLtHashes clones all working hashes (global + per-DB) to committed state. loadPerDBLtHashes reads per-DB keys from metadataDB, initializing to zero on not-found.
  • store_write.go: ApplyChangeSets computes per-DB hashes independently using the already-separated pair slices, then derives the global hash via Reset() + MixIn(). commitBatches embeds the per-DB hash into each DB's LocalMeta. Uses a fixed-size [4]dbPairs array to avoid per-block map allocation.
  • store_catchup.go: WAL replay snapshots per-DB hashes via snapshotLtHashes() after each replayed version.
  • importer.go: Snapshot import uses snapshotLtHashes() to persist per-DB committed state.

Test plan

perdb_lthash_test.go (11 tests):

  • SkewRecovery: Tampers accountDB LocalMeta to version V-1, reopens, verifies catchup produces correct per-DB hashes via full scan.
  • PersistenceAfterReopen: 10-block commit cycle, close, reopen, verifies persisted per-DB hashes match full scan and committed == working.
  • IncrementalEqualsFullScan: 20 blocks with writes, updates, and deletes across all DB types; verifies incremental per-DB hashes match full scan at multiple checkpoints.
  • SumEqualsGlobal: Verifies the homomorphic invariant (sum of per-DB hashes == global hash) after 5 mixed-state blocks.
  • CatchupReplay: Snapshot at V2, commit to V5, reopen from snapshot, verifies per-DB hashes after WAL replay match pre-close values.
  • EmptyBlocks: Verifies 5 empty blocks do not drift per-DB hashes.
  • AfterImport: Snapshot import via Importer, verifies per-DB hashes match full scan and committed == working.
  • Rollback: Commits to V5, rolls back to V3, verifies per-DB hashes match full scan at V3.
  • PersistedInMetadataDB: Reads per-DB keys directly from metadataDB after commit, verifies they match in-memory committed hashes.
  • AfterDirectImport: Large ApplyChangeSets with 20 pairs, verifies per-DB hashes match full scan.

keys_test.go:

  • RoundTripWithLtHash: Verifies 2056-byte LocalMeta round-trip serialization with embedded LtHash.
  • Updated existing tests to assert LtHash == nil for old 8-byte format.

github-actions bot commented Mar 16, 2026

The latest Buf updates on your PR. Results from workflow Buf / buf (pull_request).

Build      Format     Lint       Breaking   Updated (UTC)
✅ passed   ✅ passed   ✅ passed   ✅ passed   Mar 17, 2026, 2:43 PM

codecov bot commented Mar 16, 2026

Codecov Report

❌ Patch coverage is 89.04110% with 8 lines in your changes missing coverage. Please review.
✅ Project coverage is 58.44%. Comparing base (1d93a1a) to head (b6c9641).

Files with missing lines Patch % Lines
sei-db/state_db/sc/flatkv/store.go 76.00% 5 Missing and 1 partial ⚠️
sei-db/state_db/sc/flatkv/keys.go 90.47% 1 Missing and 1 partial ⚠️
Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #3074      +/-   ##
==========================================
+ Coverage   58.42%   58.44%   +0.02%     
==========================================
  Files        2088     2088              
  Lines      172102   172149      +47     
==========================================
+ Hits       100547   100609      +62     
+ Misses      62618    62592      -26     
- Partials     8937     8948      +11     
Flag Coverage Δ
sei-chain-pr 69.20% <89.04%> (?)
sei-db 70.41% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines Coverage Δ
sei-db/state_db/sc/flatkv/store_meta.go 71.42% <100.00%> (+3.86%) ⬆️
sei-db/state_db/sc/flatkv/store_write.go 81.05% <100.00%> (+0.97%) ⬆️
sei-db/state_db/sc/flatkv/keys.go 98.03% <90.47%> (-1.97%) ⬇️
sei-db/state_db/sc/flatkv/store.go 72.18% <76.00%> (-0.55%) ⬇️

... and 2 files with indirect coverage changes


@chatgpt-codex-connector (bot) left a comment
💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 58610cf3c4


Comment on lines +109 to +112
if errorutils.IsNotFound(err) {
	s.perDBCommittedLtHash[dbDir] = lthash.New()
	s.perDBWorkingLtHash[dbDir] = lthash.New()
	continue


P1: Backfill missing per-DB hashes before using them

loadPerDBLtHashes treats missing _meta/hash/* keys as a fresh database and seeds each per-DB hash with zero, but this also happens when opening an existing pre-migration store where global metadata and data already exist. In that upgrade path, the next non-empty ApplyChangeSets computes deltas from zero baselines and then rebuilds workingLtHash from those per-DB hashes, causing the committed global hash to diverge from actual on-disk state. Missing per-DB keys need a migration/backfill path (or guarded error) when the store is not truly empty.


@blindchaser changed the title from "feat: adds per-DB LtHash tracking to the FlatKV commit store" to "feat(flatkv): adds per-DB LtHash tracking to the FlatKV commit store" on Mar 16, 2026
func (s *CommitStore) loadPerDBLtHashes() error {
	for dbDir, metaKey := range perDBLtHashKeys {
		data, err := s.metadataDB.Get([]byte(metaKey))
		if errorutils.IsNotFound(err) {
Contributor:

Would it make sense to emit a warning-level log if we don't find the hash? Something like this:

No lattice hash found for %s DB, initializing to fresh hash.

That way, if hashes diverge because a hash is missing from the metadataDB then the cause of failure is more obvious.

Contributor (author):

addressed


imp.store.committedVersion = imp.version
imp.store.committedLtHash = imp.store.workingLtHash.Clone()
imp.store.snapshotLtHashes()
Contributor:

  1. We don't need the individual perDB committed LT hash
  2. We only need a single committedLtHash, so we don't need this function

Contributor (author):

Agreed, removed snapshotLtHashes(). Per-DB LtHashes are now persisted inside each DB's LocalMeta atomically via commitBatches(), so no separate snapshot step is needed.

// Metadata DB keys
MetaGlobalVersion = "_meta/version"      // Global committed version watermark (8 bytes)
MetaGlobalLtHash  = "_meta/hash"         // Global LtHash (2048 bytes)
MetaAccountLtHash = "_meta/hash/account"
Contributor:

Question: should we store each per-DB hash + version in its corresponding DB, with metadataDB storing only the GlobalHash?

Contributor (author):

Per-DB LtHash + version are now stored in each DB's own LocalMeta.

@blindchaser force-pushed the yiren/lthash-per-db branch from 970933a to b6c9641 on March 17, 2026, 14:42