
perf: optimize block creation #3093

Draft

alpe wants to merge 12 commits into main from alex/execute_perf

Conversation


@alpe alpe commented Feb 19, 2026

Bring moonshot 2 optimizations to main

Performance Optimizations Analysis (alex/bench_phase1_start vs HEAD)

Summary: Targeted optimizations focused on reducing CPU overhead (marshalling/hashing) and I/O latency (store reads) in the block production hot path.

1. Executor: In-Memory Caching for Block Production

  • Change: Executor now caches lastHeaderHash, lastDataHash, and lastSignature in memory after committing a block.
  • Why: Eliminates the need to call store.GetBlockData and store.GetSignature (database I/O) at the start of CreateBlock. This significantly speeds up the "cold start" of a new block round.
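The shape of this fast path can be sketched as follows. This is an illustrative, stdlib-only sketch under assumed names (`executor`, `commit`, `lastBlockInfo` are not the actual ev-node identifiers); the real fields live on the Executor and are warmed in initializeState.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// Hypothetical sketch: CreateBlock consults in-memory fields instead of
// hitting the store every round.
type executor struct {
	lastHeaderHash []byte
	lastDataHash   []byte
	lastSignature  []byte
}

// commit updates the cache after a block is persisted, so the next round
// can skip store.GetBlockData / store.GetSignature.
func (e *executor) commit(headerHash, dataHash, sig []byte) {
	e.lastHeaderHash = headerHash
	e.lastDataHash = dataHash
	e.lastSignature = sig
}

// lastBlockInfo returns cached values when warm; a false flag tells the
// caller to fall back to a store read (cold start).
func (e *executor) lastBlockInfo() (headerHash, dataHash, sig []byte, warm bool) {
	if len(e.lastHeaderHash) == 0 {
		return nil, nil, nil, false
	}
	return e.lastHeaderHash, e.lastDataHash, e.lastSignature, true
}

func main() {
	e := &executor{}
	if _, _, _, warm := e.lastBlockInfo(); warm {
		panic("expected cold cache")
	}
	h := sha256.Sum256([]byte("header"))
	d := sha256.Sum256([]byte("data"))
	e.commit(h[:], d[:], []byte("sig"))
	_, _, _, warm := e.lastBlockInfo()
	fmt.Println(warm) // warm after the first commit
}
```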

2. Executor: Avoid Re-hashing

  • Change: CreateBlock reuses newState.LastHeaderHash (computed during state transition) instead of calling header.Hash() again before broadcasting.
  • Why: header.Hash() triggers a full Protobuf marshal and SHA-256 hash. Reusing the already-computed hash saves CPU cycles.
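A minimal sketch of the redundant work being removed (names illustrative; `hash` stands in for the real marshal-then-SHA-256 in header.Hash()):

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

type header struct{ raw []byte }

// hash stands in for the expensive marshal + SHA-256 in header.Hash().
func (h header) hash() [32]byte {
	return sha256.Sum256(h.raw)
}

func main() {
	h := header{raw: []byte("signed header bytes")}
	newStateLastHeaderHash := h.hash() // computed once during the state transition

	// Before: broadcast called h.hash() a second time.
	// After: the already-computed value is reused.
	broadcastHash := newStateLastHeaderHash
	fmt.Println(broadcastHash == h.hash()) // true
}
```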

3. Executor: Optimized Metrics Collection

  • Change: Replaced data.Size() with data.TxsByteSize() for the BlockSizeBytes metric.
  • Why: data.Size() marshals the entire object to Protobuf to get the size. data.TxsByteSize() simply sums the lengths of the transaction byte slices, avoiding expensive allocations and serialization.
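A sketch of the cheaper metric (assumed shape; the real method lives on types.Data):

```go
package main

import "fmt"

// txsByteSize sums raw transaction lengths instead of marshalling the
// whole object just to measure it.
func txsByteSize(txs [][]byte) int {
	n := 0
	for _, tx := range txs {
		n += len(tx)
	}
	return n
}

func main() {
	txs := [][]byte{[]byte("abc"), []byte("de")}
	fmt.Println(txsByteSize(txs)) // 5
}
```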

4. Types: Cached Public Key Marshalling

  • Change: Signer struct now caches the marshalled public key bytes (marshalledPubKey).
  • Why: ToProto methods (called frequently during P2P exchange and storage) used to call crypto.MarshalPublicKey every time. Caching this eliminates repeated serialization overhead.
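An illustrative version of the cache, using a sync.Once to make the check-and-set explicit (the PR stores the bytes in a plain field, and the review below notes the resulting benign race). `marshal` is a stand-in for crypto.MarshalPublicKey; all names here are hypothetical.

```go
package main

import (
	"fmt"
	"sync"
)

type signer struct {
	pubKey []byte

	once            sync.Once
	marshalledBytes []byte
	marshalErr      error
}

// marshal is a stand-in for the expensive crypto.MarshalPublicKey call.
func marshal(key []byte) ([]byte, error) {
	return append([]byte("marshalled:"), key...), nil
}

// MarshalledPubKey serializes at most once; later calls return the cache.
func (s *signer) MarshalledPubKey() ([]byte, error) {
	s.once.Do(func() {
		s.marshalledBytes, s.marshalErr = marshal(s.pubKey)
	})
	return s.marshalledBytes, s.marshalErr
}

func main() {
	s := &signer{pubKey: []byte("k1")}
	a, _ := s.MarshalledPubKey()
	b, _ := s.MarshalledPubKey() // reuses the cached slice
	fmt.Println(string(a), &a[0] == &b[0])
}
```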

5. Types: Lightweight Sequence Validation

  • Change: Introduced AssertValidSequence in State.
  • Why: Provides a "fast path" validation for self-produced blocks. It checks simple fields (Height, Time, Hash Chaining) but skips the expensive Validate() call, which requires re-marshalling and re-hashing the entire block data to verify the DACommitment.

6. Store/Executor: Pre-serialized writes (SaveBlockDataFromBytes)

  • Change: Added SaveBlockDataFromBytes to the Store interface; signHeader now returns the serialized header bytes.
  • Why: Allows the Executor to write the block to disk using the exact bytes it already generated for signing/broadcasting, avoiding a redundant MarshalBinary call at the storage layer.
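A rough sketch of the write path (all names hypothetical; key prefixes are illustrative). The point is only that the batch accepts blobs the executor already produced, so no second MarshalBinary happens at the storage layer.

```go
package main

import "fmt"

// batch is a toy stand-in for the store's write batch.
type batch struct {
	puts map[string][]byte
}

// saveBlockDataFromBytes persists pre-serialized blobs as-is instead of
// re-marshalling the typed header/data structs.
func (b *batch) saveBlockDataFromBytes(height uint64, headerBlob, dataBlob, sig []byte) {
	b.puts[fmt.Sprintf("/h/%d", height)] = headerBlob
	b.puts[fmt.Sprintf("/d/%d", height)] = dataBlob
	b.puts[fmt.Sprintf("/c/%d", height)] = sig
}

func main() {
	b := &batch{puts: map[string][]byte{}}
	headerBlob := []byte("already-serialized-header") // produced once by signing/broadcast
	b.saveBlockDataFromBytes(7, headerBlob, []byte("data"), []byte("sig"))
	fmt.Println(len(b.puts), string(b.puts["/h/7"]))
}
```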

7. Protobuf: Zero-Allocation Timestamp Conversion

  • Change: In State.ToProto, manually constructed &timestamppb.Timestamp{...}.
  • Why: Replaces timestamppb.New(time), which performs a heap allocation. Small optimization for a high-frequency operation.

8. Executor: Sequential P2P Broadcast

  • Change: Changed Header and Data broadcast from parallel (errgroup) to sequential (Header then Data).
  • Why: Correctness fix that also aids performance stability: peers no longer reject Data because the corresponding Header hasn't arrived yet, which avoids network churn and retries.
Benchmarks:

go test -bench=BenchmarkProduceBlock -count=5 -benchmem -run='^$' ./block/internal/executing/...

This branch:

goos: darwin
goarch: arm64
pkg: github.com/evstack/ev-node/block/internal/executing
cpu: Apple M2 Max
BenchmarkProduceBlock/single_tx-12                 43686             26992 ns/op            7205 B/op         80 allocs/op
BenchmarkProduceBlock/single_tx-12                 45000             27086 ns/op            7305 B/op         80 allocs/op
BenchmarkProduceBlock/single_tx-12                 44564             27181 ns/op            7263 B/op         80 allocs/op
BenchmarkProduceBlock/single_tx-12                 44956             27089 ns/op            7295 B/op         80 allocs/op
BenchmarkProduceBlock/single_tx-12                 44187             27195 ns/op            7234 B/op         80 allocs/op
BenchmarkProduceBlock/100_txs-12                   36262             32719 ns/op           25766 B/op         80 allocs/op
BenchmarkProduceBlock/100_txs-12                   36657             32518 ns/op           25758 B/op         80 allocs/op
BenchmarkProduceBlock/100_txs-12                   37702             32044 ns/op           25739 B/op         80 allocs/op
BenchmarkProduceBlock/100_txs-12                   37425             31974 ns/op           25744 B/op         80 allocs/op
BenchmarkProduceBlock/100_txs-12                   37120             32146 ns/op           25750 B/op         80 allocs/op
BenchmarkProduceBlock/empty_batch-12               46038             26368 ns/op            6992 B/op         70 allocs/op
BenchmarkProduceBlock/empty_batch-12               45804             26304 ns/op            6944 B/op         70 allocs/op
BenchmarkProduceBlock/empty_batch-12               45894             26368 ns/op            6975 B/op         70 allocs/op
BenchmarkProduceBlock/empty_batch-12               46137             26293 ns/op            6986 B/op         70 allocs/op
BenchmarkProduceBlock/empty_batch-12               44925             26837 ns/op            6846 B/op         70 allocs/op
PASS
ok      github.com/evstack/ev-node/block/internal/executing     23.345s

Base (alex/bench_phase1_start):

goos: darwin
goarch: arm64
pkg: github.com/evstack/ev-node/block/internal/executing
cpu: Apple M2 Max
BenchmarkProduceBlock/empty_batch-12               14835             80996 ns/op           11042 B/op        155 allocs/op
BenchmarkProduceBlock/empty_batch-12               14977             80256 ns/op           11031 B/op        154 allocs/op
BenchmarkProduceBlock/empty_batch-12               15104             81055 ns/op           11024 B/op        154 allocs/op
BenchmarkProduceBlock/empty_batch-12               14895             81077 ns/op           11035 B/op        154 allocs/op
BenchmarkProduceBlock/empty_batch-12               14671             80922 ns/op           11048 B/op        154 allocs/op
BenchmarkProduceBlock/single_tx-12                 14695             81520 ns/op           11605 B/op        171 allocs/op
BenchmarkProduceBlock/single_tx-12                 14871             81326 ns/op           11592 B/op        170 allocs/op
BenchmarkProduceBlock/single_tx-12                 14636             81783 ns/op           11605 B/op        170 allocs/op
BenchmarkProduceBlock/single_tx-12                 14800             81486 ns/op           11595 B/op        170 allocs/op
BenchmarkProduceBlock/single_tx-12                 14617             81957 ns/op           11606 B/op        170 allocs/op
BenchmarkProduceBlock/100_txs-12                   12792             93848 ns/op           47152 B/op        277 allocs/op
BenchmarkProduceBlock/100_txs-12                   12963             93155 ns/op           47128 B/op        277 allocs/op
BenchmarkProduceBlock/100_txs-12                   13046             92560 ns/op           47121 B/op        277 allocs/op
BenchmarkProduceBlock/100_txs-12                   12789             93442 ns/op           47141 B/op        277 allocs/op
BenchmarkProduceBlock/100_txs-12                   12891             93534 ns/op           47133 B/op        277 allocs/op
PASS
ok      github.com/evstack/ev-node/block/internal/executing     31.700s

@github-actions

The latest Buf updates on your PR. Results from workflow CI / buf-check (pull_request).

Build: ✅ passed · Format: ⏩ skipped · Lint: ✅ passed · Breaking: ✅ passed · Updated (UTC): Feb 19, 2026, 3:49 PM

@gemini-code-assist

Summary of Changes

Hello @alpe, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request focuses on significant performance improvements for block execution and storage operations. By introducing caching mechanisms, optimizing data serialization, and refining key generation, the changes aim to reduce CPU and memory overhead, particularly in high-throughput scenarios. The modifications also streamline the block production pipeline by avoiding redundant computations and ensuring correct P2P broadcast ordering, contributing to a more efficient and robust node operation.

Highlights

  • Performance Optimizations: Implemented several performance enhancements across block execution, storage, and serialization paths. This includes reducing memory allocations during key generation and protobuf marshalling, and optimizing data access patterns.
  • Caching and Amortization: Introduced in-memory caching for last block data (header hash, data hash, signature) to avoid redundant store reads during block creation. Pending block checks are now amortized to reduce the frequency of expensive store scans.
  • Streamlined Block Production: Refactored block production logic to use a new SaveBlockDataFromBytes method, allowing pre-serialized data to be saved directly. Validation for self-produced blocks now uses a lightweight AssertValidSequence to skip expensive re-computations.
  • P2P Broadcast Ordering: Ensured sequential broadcasting of block headers before block data to the P2P network, addressing potential validation failures due to out-of-order delivery on peers.


Changelog
  • block/internal/executing/executor.go
    • Removed errgroup import to simplify concurrency model.
    • Added hasPendingBlock, lastHeaderHash, lastDataHash, lastSignature, and pendingCheckCounter fields to Executor for caching and amortized checks.
    • Initialized and warmed the new last-block cache fields in initializeState.
    • Modified ProduceBlock to amortize pending limit checks, reducing store scans.
    • Optimized ProduceBlock to use hasPendingBlock flag to conditionally check for pending blocks in the store.
    • Updated signHeader call in ProduceBlock to retrieve serialized header bytes, avoiding re-marshalling.
    • Replaced blockProducer.ValidateBlock with currentState.AssertValidSequence for lightweight validation of self-produced blocks.
    • Removed deletePendingBlock call and explicitly set hasPendingBlock to false after commit.
    • Updated the in-memory last-block cache after successful block production.
    • Changed P2P broadcasting from parallel errgroup to sequential, ensuring header broadcast before data.
    • Implemented a fast path in CreateBlock to use cached last block data, falling back to store reads if the cache is empty.
    • Optimized ApplyBlock to conditionally allocate rawTxs slice and used a dedicated context for transaction execution.
    • Modified signHeader to return both the signature and the serialized header bytes.
    • Updated recordBlockMetrics to use data.TxsByteSize() for more accurate block size metrics.
  • pkg/store/batch.go
    • Updated ds.NewKey calls to ds.RawKey for direct key usage.
    • Introduced SaveBlockDataFromBytes to save pre-serialized block data, avoiding redundant marshalling.
    • Refactored SaveBlockData to utilize the new SaveBlockDataFromBytes method.
  • pkg/store/keys.go
    • Added heightKey helper function for efficient, allocation-minimal key generation for height-based keys.
    • Updated GetHeaderKey, getDataKey, GetSignatureKey, and getStateAtHeightKey to use the new heightKey function.
    • Simplified getHeightKey to directly return a constant string.
  • pkg/store/kv.go
    • Removed unused path import.
    • Refactored GenerateKey to use strings.Builder for more efficient string concatenation and reduced allocations.
  • pkg/store/tracing.go
    • Added tracing instrumentation for the new SaveBlockDataFromBytes method.
  • pkg/store/tracing_test.go
    • Added a mock implementation for the SaveBlockDataFromBytes method for testing purposes.
  • pkg/store/types.go
    • Added SaveBlockDataFromBytes to the Batch interface definition.
  • types/data.go
    • Added TxsByteSize() method to calculate the total byte size of transactions without protobuf marshalling, for cheaper metrics.
  • types/serialization.go
    • Modified SignedHeader.ToProto and SignedData.ToProto to use a cached marshalled public key from the Signer struct.
    • Updated SignedHeader.FromProto and SignedData.FromProto to populate the cached marshalled public key.
    • Optimized State.ToProto to avoid timestamppb.New allocation by directly constructing the timestamppb.Timestamp.
  • types/signer.go
    • Added marshalledPubKey field to Signer struct to cache marshalled public key bytes.
    • Implemented MarshalledPubKey() method to retrieve the cached public key or marshal it if not present.
    • Updated NewSigner to pre-cache the marshalled public key upon creation.
  • types/state.go
    • Added AssertValidSequence method for lightweight structural validation of self-produced blocks, bypassing expensive DACommitment re-computation.
Activity
  • No specific review comments or activity have been recorded for this pull request yet.

@claude

claude bot commented Feb 19, 2026

Claude finished @alpe's task in 4m 7s.


Code Review: Alex/execute_perf

Tasks

  • Gather context and understand the PR
  • Review block/internal/executing/executor.go
  • Review pkg/store/ changes (batch, keys, kv, tracing, types)
  • Review types/ changes (data, serialization, signer, state)
  • Analyze correctness, performance, and concurrency
  • Post final review

Good set of targeted optimizations overall. The approach is sound — caching per-block hashes, amortizing pending checks, and fixing the sequential P2P broadcast ordering. Below are the issues found, ordered by severity.


Critical / High

1. signHeader second return value is dead code — misleading API

block/internal/executing/executor.go:525

signature, _, err := e.signHeader(&header.Header)

signHeader was changed to return (types.Signature, []byte, error) with the comment:

"The caller can reuse headerBytes in SaveBlockDataFromBytes to avoid a redundant MarshalBinary call."

But the bytes are always discarded with _. More importantly, the bytes are the signing payload from AggregatorNodeSignatureBytesProvider, which may differ from header.MarshalBinary() (what SaveBlockDataFromBytes expects for storage). Using them interchangeably would be incorrect unless the provider always returns the full header binary. Either implement the optimization end-to-end (using the bytes in SaveBlockDataFromBytes when they match), or revert signHeader to returning two values and remove the misleading comment.


2. Cache warm silently ignores errors — no logging

block/internal/executing/executor.go:299–308

if state.LastBlockHeight > 0 {
    h, d, err := e.store.GetBlockData(e.ctx, state.LastBlockHeight)
    if err == nil {
        e.lastHeaderHash = h.Hash()
        e.lastDataHash = d.Hash()
        sig, err := e.store.GetSignature(e.ctx, state.LastBlockHeight)
        if err == nil {
            e.lastSignature = *sig
        }
    }
}

A failed GetBlockData at startup (e.g., due to data corruption or aggressive pruning) silently falls back to the cold path. No log message is emitted, so operators have no signal that the cache didn't warm. Even a logger.Debug() here would help diagnose performance regressions.


Medium

3. Amortized pending check can overshoot the limit

block/internal/executing/executor.go:449–466

const pendingCheckInterval uint64 = 64
shouldCheck := e.config.Node.MaxPendingHeadersAndData <= pendingCheckInterval ||
    e.pendingCheckCounter%pendingCheckInterval == 0

The condition MaxPendingHeadersAndData <= pendingCheckInterval is meant to force eager checking when the limit is small enough to overshoot within one interval. But with MaxPendingHeadersAndData = 100 and pendingCheckInterval = 64, 100 <= 64 is false, so checks are amortized every 64 blocks. In this case, the actual count could reach 163 before the next check triggers a rejection — a 63% overshoot. The comment claims it "won't overshoot" but the code doesn't fully enforce that. Either document the trade-off explicitly or tighten the condition.
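The worst-case arithmetic described above can be written out directly (helper name is hypothetical; the eager-path condition mirrors the PR's `limit <= interval` check):

```go
package main

import "fmt"

// worstCasePending computes the maximum pending count reachable before a
// check fires: the last passing check can see `limit` items, and up to
// interval-1 more blocks are produced before the next check.
func worstCasePending(limit, interval uint64) uint64 {
	if limit <= interval {
		return limit // eager path: checked every block, no overshoot
	}
	return limit + interval - 1
}

func main() {
	fmt.Println(worstCasePending(100, 64)) // 163, the overshoot cited above
}
```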


4. AssertValidSequence removes a correctness safety net

types/state.go:92, block/internal/executing/executor.go:533

Replacing blockProducer.ValidateBlock (which calls Validate(), and through it d.DACommitment()) with AssertValidSequence skips recomputing and verifying the DataHash. The justification is that the producer just computed it, but AssertValidSequence would silently accept a block with an incorrectly computed DataHash, masking bugs in CreateBlock. Consider keeping the full validation in testing/debug builds, or adding an explicit unit test proving that CreateBlock always produces a consistent DataHash. At minimum, document this trade-off clearly with a link to a tracking issue.


5. Missing unit tests for AssertValidSequence

types/state.go:92

The new AssertValidSequence method is used in the critical block production path as a replacement for full validation. There are no unit tests added for it that verify it correctly catches all checked conditions (wrong chain ID, wrong height, time regression, wrong last header hash, wrong app hash). Given that skipping DataHash verification is a conscious safety trade-off, the remaining checks should be thoroughly tested.


6. tracingMockBatch.SaveBlockDataFromBytes drops data silently

pkg/store/tracing_test.go:170

func (b *tracingMockBatch) SaveBlockDataFromBytes(header *types.SignedHeader, _, _ []byte, signature *types.Signature) error {
    // Delegate to SaveBlockData for test mocking purposes.
    return b.SaveBlockData(header, nil, signature)
}

This mock delegates to SaveBlockData with data = nil, silently discarding the pre-serialized bytes. If a test ever checks that SaveBlockDataFromBytes correctly persists the data, this mock would give a false negative. The mock should at least store the blobs for assertion, or the comment should more clearly warn that this is a structural stub only.


7. MarshalledPubKey has a potential data race

types/signer.go:19

func (s *Signer) MarshalledPubKey() ([]byte, error) {
    if len(s.marshalledPubKey) > 0 {
        return s.marshalledPubKey, nil
    }
    ...
    s.marshalledPubKey = bz
    return bz, nil
}

If two goroutines both call MarshalledPubKey() on the same *Signer concurrently, the check-and-set is a race (benign in terms of outcome — both write the same value — but still undefined behavior in Go's memory model). In the current codebase this is likely not exercised concurrently, but if SignedHeader.ToProto() is ever called from multiple goroutines on the same header (e.g., during sync + broadcast), it could trigger Go's race detector. Consider using sync.Once or sync/atomic for the cache.


Low / Style

8. Unnecessary copy in SaveBlockDataFromBytes

pkg/store/batch.go:66

signatureHash := *signature
...
if err := b.batch.Put(b.ctx, ds.RawKey(getSignatureKey(height)), signatureHash[:]); err != nil {

signatureHash := *signature dereferences the pointer, assigning the []byte to a local variable, then signatureHash[:] is a no-op re-slice. Just use *signature directly:

if err := b.batch.Put(b.ctx, ds.RawKey(getSignatureKey(height)), *signature); err != nil {

9. Duplicate exported/unexported key functions

pkg/store/keys.go:54–72

func GetHeaderKey(height uint64) string { return heightKey(headerPrefix, height) }
func getHeaderKey(height uint64) string { return heightKey(headerPrefix, height) }

Both exported and unexported variants do the same thing. Either use only the exported form internally (no functional difference) or explain why both are needed. The same pattern repeats for GetDataKey/getDataKey, GetSignatureKey/getSignatureKey. This is a pre-existing pattern but the PR perpetuates it.


10. path import listed as "removed" but path/filepath remains

pkg/store/kv.go

The changelog mentions "Removed unused path import." This is correct — path (stdlib simple path package) was removed. path/filepath is correctly retained and used by rootify. Just noting this for clarity since the changelog wording is slightly ambiguous.


Summary

  • P2P sequential broadcast fix: correct and necessary; good catch
  • In-memory hash/sig cache: correct; single-goroutine access means no race
  • Amortized pending check: functionally correct; overshoot potential should be documented
  • SaveBlockDataFromBytes refactor: clean; correctly avoids double-marshal via delegation
  • AssertValidSequence: risky trade-off; DataHash check omission needs more test coverage
  • MarshalledPubKey caching: correct for current access patterns; fragile if concurrency changes
  • heightKey allocation reduction: clean optimization
  • TxsByteSize(): simple and correct
  • State.ToProto timestamp optimization: correct; behavior matches timestamppb.New


@gemini-code-assist gemini-code-assist bot left a comment


Code Review

The pull request introduces several performance optimizations aimed at reducing redundant store reads and protobuf marshaling/unmarshaling. Key changes include caching lastHeaderHash, lastDataHash, and lastSignature in the Executor struct, amortizing pending limit checks, and introducing SaveBlockDataFromBytes to avoid re-serialization. Additionally, TxsByteSize is added for more efficient metrics reporting, and MarshalledPubKey caches public key serialization. The signHeader function now returns serialized header bytes to prevent redundant marshaling. Overall, these changes should lead to a more efficient block production pipeline.

Comment on lines +293 to +294
if _, err := e.store.GetMetadata(e.ctx, headerKey); err == nil {
e.hasPendingBlock.Store(true)

critical

The headerKey variable is not defined in this scope. It seems to be a global variable or a constant that is not imported or declared within this file. This will lead to a compilation error.

Suggested change
if _, err := e.store.GetMetadata(e.ctx, headerKey); err == nil {
e.hasPendingBlock.Store(true)
if _, err := e.store.GetMetadata(e.ctx, store.PendingBlockKey); err == nil {

"github.com/libp2p/go-libp2p/core/crypto"
"github.com/rs/zerolog"
"golang.org/x/sync/errgroup"


medium

The golang.org/x/sync/errgroup import was removed, but the corresponding usage in ProduceBlock (lines 548-558 in the old code) was replaced with sequential broadcasting. This is a functional change that might impact the overall block production time if broadcasting was previously done concurrently. While the comment on line 598 explains the sequential broadcast, it's worth noting the performance implication of this change.

// For based sequencer, this will return an empty signature
signature, err := e.signHeader(header.Header)
// For based sequencer, this will return an empty signature.
signature, _, err := e.signHeader(&header.Header)

medium

The signHeader function now returns three values, but only two are being captured here. The headerBytes (second return value) is not being used, which could be passed to batch.SaveBlockDataFromBytes to avoid re-marshalling the header. This would further optimize the block production process.

Suggested change
signature, _, err := e.signHeader(&header.Header)
signature, headerBytes, err := e.signHeader(&header.Header)

height := header.Height()
signatureHash := *signature

headerBlob, err := header.MarshalBinary()

medium

The height variable is not used in this function after the refactoring. It can be removed.

signatureHash := *signature

headerBlob, err := header.MarshalBinary()
if err != nil {

medium

The signatureHash variable is not used in this function after the refactoring. It can be removed.

// SaveBlockDataFromBytes saves pre-serialized block data to the batch.
// This avoids re-marshalling header and data when the caller already has the binary blobs.
func (b *DefaultBatch) SaveBlockDataFromBytes(header *types.SignedHeader, headerBlob, dataBlob []byte, signature *types.Signature) error {
height := header.Height()

medium

The height variable is already available from header.Height() and signatureHash from *signature. These lines are redundant and can be removed.

Suggested change
height := header.Height()
height := header.Height()
signatureHash := *signature

@@ -39,29 +39,40 @@ const (
heightPrefix = "t"

medium

The heightPrefix constant is defined but not used in the new heightKey function. It seems like the intention was to use it as the prefix, but it's hardcoded as 't' in the constant definition and then 'h', 'd', 'c', 's' in the heightKey calls. This could lead to confusion or incorrect key generation.

Comment on lines 90 to +91
func getHeightKey() string {
return GenerateKey([]string{heightPrefix})
return "/" + heightPrefix

medium

The getHeightKey function now returns a hardcoded string "/t". This is inconsistent with the heightKey function which dynamically generates keys based on a prefix and height. If heightPrefix is intended to be used here, it should be passed to heightKey or directly used in the string concatenation.

Suggested change
func getHeightKey() string {
return GenerateKey([]string{heightPrefix})
return "/" + heightPrefix
func getHeightKey() string {
return heightKey(heightPrefix, 0) // Or whatever default height is appropriate
}

Comment on lines 43 to +55
func GenerateKey(fields []string) string {
key := "/" + strings.Join(fields, "/")
return path.Clean(key)
// Pre-calculate total size to avoid re-allocation.
n := 0
for _, f := range fields {
n += 1 + len(f) // '/' + field
}
var b strings.Builder
b.Grow(n)
for _, f := range fields {
b.WriteByte('/')
b.WriteString(f)
}
return b.String()

medium

The GenerateKey function was optimized to use strings.Builder for performance. However, the path.Clean call was removed. While strings.Builder is more efficient for concatenation, path.Clean ensures that the generated key is always in a canonical form (e.g., handling double slashes or ./.. components). Removing it might lead to non-canonical keys if the input fields are not perfectly clean, which could cause issues with datastore lookups.

Base automatically changed from alex/bench_phase1_start to main February 19, 2026 16:20
@alpe alpe changed the title Alex/execute perf perf: optimize block creation Feb 19, 2026

@julienrbrt julienrbrt left a comment


nice! mostly lgtm! i think we can delete some old code and delete some struct cache to put them in cached_store instead (better for readability and consistency)

} else {
// Check if there's an already stored block at the newHeight.
// Only hit the store if the in-memory flag indicates a pending block exists.
if e.hasPendingBlock.Load() {

Nice! Makes a lot of sense indeed.


// Cached per-block data to avoid store reads + protobuf deserialization
// in CreateBlock. Updated after each successful block production.
lastHeaderHash types.Hash

I still believe those aren't necessary, because GetBlockData is cached on the second call.
We may simply want to add GetSignature to the pkg/store/cached_store.go for consistency.

// Amortized pending limit check — NumPendingHeaders/NumPendingData call
// advancePastEmptyData which scans the store. Only amortize when the limit
// is large enough that checking every N blocks won't overshoot.
const pendingCheckInterval uint64 = 64

This makes sense. Maybe we should say 64 is an arbitrary but good value?

header.Signature = signature

if err := e.blockProducer.ValidateBlock(ctx, currentState, header, data); err != nil {
// Structural validation only — skip the expensive Validate() / DACommitment()

Hmm, I'm not sure about this: we may have a buggy sequencer config and we won't see it.
AggregatorNodeSignatureBytesProvider is different for ev-abci, and we may want to verify?

Need to check more, but if we don't need them, we should remove SetCustomVerifierForAggregator and ValidateBlock, as they aren't needed nor used.


EDIT: i think we can remove those two indeed, and delete that comment, as ValidateBlock should be removed (with SetCustomVerifierForAggregator as well)

lastSignature = e.lastSignature
} else {
// Cold start fallback: read from store
lastHeader, lastData, err := e.store.GetBlockData(ctx, height-1)

technically not true, without those cached struct, it would have read from cache anyway

}

// AssertValidSequence performs lightweight state-sequence validation for self-produced blocks.
// It skips the expensive Validate() call (which re-computes DACommitment by re-marshaling

if we delete ValidateBlocks, let's clean up those comments.
