
perf: optimize block creation #3093

Draft

alpe wants to merge 12 commits into main from alex/execute_perf

Conversation


@alpe alpe commented Feb 19, 2026

Bring moonshot 2 optimizations to main

Performance Optimizations Analysis (alex/bench_phase1_start vs HEAD)

Summary: Targeted optimizations focused on reducing CPU overhead (marshalling/hashing) and I/O latency (store reads) in the block production hot path.

1. Executor: In-Memory Caching for Block Production

  • Change: Executor now caches lastHeaderHash, lastDataHash, and lastSignature in memory after committing a block.
  • Why: Eliminates the need to call store.GetBlockData and store.GetSignature (database I/O) at the start of CreateBlock. This significantly speeds up the "cold start" of a new block round.
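The shape of this fast path can be sketched as follows. This is an illustrative, stdlib-only sketch under assumed names (`executor`, `commit`, `lastBlockInfo` are not the actual ev-node identifiers); the real fields live on the Executor and are warmed in initializeState.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// Hypothetical sketch: CreateBlock consults in-memory fields instead of
// hitting the store every round.
type executor struct {
	lastHeaderHash []byte
	lastDataHash   []byte
	lastSignature  []byte
}

// commit updates the cache after a block is persisted, so the next round
// can skip store.GetBlockData / store.GetSignature.
func (e *executor) commit(headerHash, dataHash, sig []byte) {
	e.lastHeaderHash = headerHash
	e.lastDataHash = dataHash
	e.lastSignature = sig
}

// lastBlockInfo returns cached values when warm; a false flag tells the
// caller to fall back to a store read (cold start).
func (e *executor) lastBlockInfo() (headerHash, dataHash, sig []byte, warm bool) {
	if len(e.lastHeaderHash) == 0 {
		return nil, nil, nil, false
	}
	return e.lastHeaderHash, e.lastDataHash, e.lastSignature, true
}

func main() {
	e := &executor{}
	if _, _, _, warm := e.lastBlockInfo(); warm {
		panic("expected cold cache")
	}
	h := sha256.Sum256([]byte("header"))
	d := sha256.Sum256([]byte("data"))
	e.commit(h[:], d[:], []byte("sig"))
	_, _, _, warm := e.lastBlockInfo()
	fmt.Println(warm) // warm after the first commit
}
```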

2. Executor: Avoid Re-hashing

  • Change: CreateBlock reuses newState.LastHeaderHash (computed during state transition) instead of calling header.Hash() again before broadcasting.
  • Why: header.Hash() triggers a full Protobuf marshal and SHA-256 hash. Reusing the already-computed hash saves CPU cycles.
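A minimal sketch of the redundant work being removed (names illustrative; `hash` stands in for the real marshal-then-SHA-256 in header.Hash()):

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

type header struct{ raw []byte }

// hash stands in for the expensive marshal + SHA-256 in header.Hash().
func (h header) hash() [32]byte {
	return sha256.Sum256(h.raw)
}

func main() {
	h := header{raw: []byte("signed header bytes")}
	newStateLastHeaderHash := h.hash() // computed once during the state transition

	// Before: broadcast called h.hash() a second time.
	// After: the already-computed value is reused.
	broadcastHash := newStateLastHeaderHash
	fmt.Println(broadcastHash == h.hash()) // true
}
```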

3. Executor: Optimized Metrics Collection

  • Change: Replaced data.Size() with data.TxsByteSize() for the BlockSizeBytes metric.
  • Why: data.Size() marshals the entire object to Protobuf to get the size. data.TxsByteSize() simply sums the lengths of the transaction byte slices, avoiding expensive allocations and serialization.
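A sketch of the cheaper metric (assumed shape; the real method lives on types.Data):

```go
package main

import "fmt"

// txsByteSize sums raw transaction lengths instead of marshalling the
// whole object just to measure it.
func txsByteSize(txs [][]byte) int {
	n := 0
	for _, tx := range txs {
		n += len(tx)
	}
	return n
}

func main() {
	txs := [][]byte{[]byte("abc"), []byte("de")}
	fmt.Println(txsByteSize(txs)) // 5
}
```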

4. Types: Cached Public Key Marshalling

  • Change: Signer struct now caches the marshalled public key bytes (marshalledPubKey).
  • Why: ToProto methods (called frequently during P2P exchange and storage) used to call crypto.MarshalPublicKey every time. Caching this eliminates repeated serialization overhead.
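An illustrative version of the cache, using a sync.Once to make the check-and-set explicit (the PR stores the bytes in a plain field, and the review below notes the resulting benign race). `marshal` is a stand-in for crypto.MarshalPublicKey; all names here are hypothetical.

```go
package main

import (
	"fmt"
	"sync"
)

type signer struct {
	pubKey []byte

	once            sync.Once
	marshalledBytes []byte
	marshalErr      error
}

// marshal is a stand-in for the expensive crypto.MarshalPublicKey call.
func marshal(key []byte) ([]byte, error) {
	return append([]byte("marshalled:"), key...), nil
}

// MarshalledPubKey serializes at most once; later calls return the cache.
func (s *signer) MarshalledPubKey() ([]byte, error) {
	s.once.Do(func() {
		s.marshalledBytes, s.marshalErr = marshal(s.pubKey)
	})
	return s.marshalledBytes, s.marshalErr
}

func main() {
	s := &signer{pubKey: []byte("k1")}
	a, _ := s.MarshalledPubKey()
	b, _ := s.MarshalledPubKey() // reuses the cached slice
	fmt.Println(string(a), &a[0] == &b[0])
}
```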

5. Types: Lightweight Sequence Validation

  • Change: Introduced AssertValidSequence in State.
  • Why: Provides a "fast path" validation for self-produced blocks. It checks simple fields (Height, Time, Hash Chaining) but skips the expensive Validate() call, which requires re-marshalling and re-hashing the entire block data to verify the DACommitment.

6. Store/Executor: Pre-serialized writes (SaveBlockDataFromBytes)

  • Change: Added SaveBlockDataFromBytes to the Store interface; signHeader now returns the serialized header bytes.
  • Why: Allows the Executor to write the block to disk using the exact bytes it already generated for signing/broadcasting, avoiding a redundant MarshalBinary call at the storage layer.
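A rough sketch of the write path (all names hypothetical; key prefixes are illustrative). The point is only that the batch accepts blobs the executor already produced, so no second MarshalBinary happens at the storage layer.

```go
package main

import "fmt"

// batch is a toy stand-in for the store's write batch.
type batch struct {
	puts map[string][]byte
}

// saveBlockDataFromBytes persists pre-serialized blobs as-is instead of
// re-marshalling the typed header/data structs.
func (b *batch) saveBlockDataFromBytes(height uint64, headerBlob, dataBlob, sig []byte) {
	b.puts[fmt.Sprintf("/h/%d", height)] = headerBlob
	b.puts[fmt.Sprintf("/d/%d", height)] = dataBlob
	b.puts[fmt.Sprintf("/c/%d", height)] = sig
}

func main() {
	b := &batch{puts: map[string][]byte{}}
	headerBlob := []byte("already-serialized-header") // produced once by signing/broadcast
	b.saveBlockDataFromBytes(7, headerBlob, []byte("data"), []byte("sig"))
	fmt.Println(len(b.puts), string(b.puts["/h/7"]))
}
```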

7. Protobuf: Zero-Allocation Timestamp Conversion

  • Change: In State.ToProto, manually constructed &timestamppb.Timestamp{...}.
  • Why: Replaces timestamppb.New(time), which performs a heap allocation. Small optimization for a high-frequency operation.

8. Executor: Sequential P2P Broadcast

  • Change: Changed Header and Data broadcast from parallel (errgroup) to sequential (Header then Data).
  • Why: Correctness fix that also aids performance stability: peers no longer reject Data because the corresponding Header hasn't arrived yet, which avoids network churn and retries.
Benchmarks:

go test -bench=BenchmarkProduceBlock -count=5 -benchmem -run='^$' ./block/internal/executing/...

This branch:

goos: darwin
goarch: arm64
pkg: github.com/evstack/ev-node/block/internal/executing
cpu: Apple M2 Max
BenchmarkProduceBlock/single_tx-12                 43686             26992 ns/op            7205 B/op         80 allocs/op
BenchmarkProduceBlock/single_tx-12                 45000             27086 ns/op            7305 B/op         80 allocs/op
BenchmarkProduceBlock/single_tx-12                 44564             27181 ns/op            7263 B/op         80 allocs/op
BenchmarkProduceBlock/single_tx-12                 44956             27089 ns/op            7295 B/op         80 allocs/op
BenchmarkProduceBlock/single_tx-12                 44187             27195 ns/op            7234 B/op         80 allocs/op
BenchmarkProduceBlock/100_txs-12                   36262             32719 ns/op           25766 B/op         80 allocs/op
BenchmarkProduceBlock/100_txs-12                   36657             32518 ns/op           25758 B/op         80 allocs/op
BenchmarkProduceBlock/100_txs-12                   37702             32044 ns/op           25739 B/op         80 allocs/op
BenchmarkProduceBlock/100_txs-12                   37425             31974 ns/op           25744 B/op         80 allocs/op
BenchmarkProduceBlock/100_txs-12                   37120             32146 ns/op           25750 B/op         80 allocs/op
BenchmarkProduceBlock/empty_batch-12               46038             26368 ns/op            6992 B/op         70 allocs/op
BenchmarkProduceBlock/empty_batch-12               45804             26304 ns/op            6944 B/op         70 allocs/op
BenchmarkProduceBlock/empty_batch-12               45894             26368 ns/op            6975 B/op         70 allocs/op
BenchmarkProduceBlock/empty_batch-12               46137             26293 ns/op            6986 B/op         70 allocs/op
BenchmarkProduceBlock/empty_batch-12               44925             26837 ns/op            6846 B/op         70 allocs/op
PASS
ok      github.com/evstack/ev-node/block/internal/executing     23.345s

Base (alex/bench_phase1_start):

goos: darwin
goarch: arm64
pkg: github.com/evstack/ev-node/block/internal/executing
cpu: Apple M2 Max
BenchmarkProduceBlock/empty_batch-12               14835             80996 ns/op           11042 B/op        155 allocs/op
BenchmarkProduceBlock/empty_batch-12               14977             80256 ns/op           11031 B/op        154 allocs/op
BenchmarkProduceBlock/empty_batch-12               15104             81055 ns/op           11024 B/op        154 allocs/op
BenchmarkProduceBlock/empty_batch-12               14895             81077 ns/op           11035 B/op        154 allocs/op
BenchmarkProduceBlock/empty_batch-12               14671             80922 ns/op           11048 B/op        154 allocs/op
BenchmarkProduceBlock/single_tx-12                 14695             81520 ns/op           11605 B/op        171 allocs/op
BenchmarkProduceBlock/single_tx-12                 14871             81326 ns/op           11592 B/op        170 allocs/op
BenchmarkProduceBlock/single_tx-12                 14636             81783 ns/op           11605 B/op        170 allocs/op
BenchmarkProduceBlock/single_tx-12                 14800             81486 ns/op           11595 B/op        170 allocs/op
BenchmarkProduceBlock/single_tx-12                 14617             81957 ns/op           11606 B/op        170 allocs/op
BenchmarkProduceBlock/100_txs-12                   12792             93848 ns/op           47152 B/op        277 allocs/op
BenchmarkProduceBlock/100_txs-12                   12963             93155 ns/op           47128 B/op        277 allocs/op
BenchmarkProduceBlock/100_txs-12                   13046             92560 ns/op           47121 B/op        277 allocs/op
BenchmarkProduceBlock/100_txs-12                   12789             93442 ns/op           47141 B/op        277 allocs/op
BenchmarkProduceBlock/100_txs-12                   12891             93534 ns/op           47133 B/op        277 allocs/op
PASS
ok      github.com/evstack/ev-node/block/internal/executing     31.700s

@github-actions

The latest Buf updates on your PR. Results from workflow CI / buf-check (pull_request).

Build: ✅ passed · Format: ⏩ skipped · Lint: ✅ passed · Breaking: ✅ passed · Updated (UTC): Feb 19, 2026, 3:49 PM

@gemini-code-assist

Summary of Changes

Hello @alpe, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request focuses on significant performance improvements for block execution and storage operations. By introducing caching mechanisms, optimizing data serialization, and refining key generation, the changes aim to reduce CPU and memory overhead, particularly in high-throughput scenarios. The modifications also streamline the block production pipeline by avoiding redundant computations and ensuring correct P2P broadcast ordering, contributing to a more efficient and robust node operation.

Highlights

  • Performance Optimizations: Implemented several performance enhancements across block execution, storage, and serialization paths. This includes reducing memory allocations during key generation and protobuf marshalling, and optimizing data access patterns.
  • Caching and Amortization: Introduced in-memory caching for last block data (header hash, data hash, signature) to avoid redundant store reads during block creation. Pending block checks are now amortized to reduce the frequency of expensive store scans.
  • Streamlined Block Production: Refactored block production logic to use a new SaveBlockDataFromBytes method, allowing pre-serialized data to be saved directly. Validation for self-produced blocks now uses a lightweight AssertValidSequence to skip expensive re-computations.
  • P2P Broadcast Ordering: Ensured sequential broadcasting of block headers before block data to the P2P network, addressing potential validation failures due to out-of-order delivery on peers.


Changelog
  • block/internal/executing/executor.go
    • Removed errgroup import to simplify concurrency model.
    • Added hasPendingBlock, lastHeaderHash, lastDataHash, lastSignature, and pendingCheckCounter fields to Executor for caching and amortized checks.
    • Initialized and warmed the new last-block cache fields in initializeState.
    • Modified ProduceBlock to amortize pending limit checks, reducing store scans.
    • Optimized ProduceBlock to use hasPendingBlock flag to conditionally check for pending blocks in the store.
    • Updated signHeader call in ProduceBlock to retrieve serialized header bytes, avoiding re-marshalling.
    • Replaced blockProducer.ValidateBlock with currentState.AssertValidSequence for lightweight validation of self-produced blocks.
    • Removed deletePendingBlock call and explicitly set hasPendingBlock to false after commit.
    • Updated the in-memory last-block cache after successful block production.
    • Changed P2P broadcasting from parallel errgroup to sequential, ensuring header broadcast before data.
    • Implemented a fast path in CreateBlock to use cached last block data, falling back to store reads if the cache is empty.
    • Optimized ApplyBlock to conditionally allocate rawTxs slice and used a dedicated context for transaction execution.
    • Modified signHeader to return both the signature and the serialized header bytes.
    • Updated recordBlockMetrics to use data.TxsByteSize() for more accurate block size metrics.
  • pkg/store/batch.go
    • Updated ds.NewKey calls to ds.RawKey for direct key usage.
    • Introduced SaveBlockDataFromBytes to save pre-serialized block data, avoiding redundant marshalling.
    • Refactored SaveBlockData to utilize the new SaveBlockDataFromBytes method.
  • pkg/store/keys.go
    • Added heightKey helper function for efficient, allocation-minimal key generation for height-based keys.
    • Updated GetHeaderKey, getDataKey, GetSignatureKey, and getStateAtHeightKey to use the new heightKey function.
    • Simplified getHeightKey to directly return a constant string.
  • pkg/store/kv.go
    • Removed unused path import.
    • Refactored GenerateKey to use strings.Builder for more efficient string concatenation and reduced allocations.
  • pkg/store/tracing.go
    • Added tracing instrumentation for the new SaveBlockDataFromBytes method.
  • pkg/store/tracing_test.go
    • Added a mock implementation for the SaveBlockDataFromBytes method for testing purposes.
  • pkg/store/types.go
    • Added SaveBlockDataFromBytes to the Batch interface definition.
  • types/data.go
    • Added TxsByteSize() method to calculate the total byte size of transactions without protobuf marshalling, for cheaper metrics.
  • types/serialization.go
    • Modified SignedHeader.ToProto and SignedData.ToProto to use a cached marshalled public key from the Signer struct.
    • Updated SignedHeader.FromProto and SignedData.FromProto to populate the cached marshalled public key.
    • Optimized State.ToProto to avoid timestamppb.New allocation by directly constructing the timestamppb.Timestamp.
  • types/signer.go
    • Added marshalledPubKey field to Signer struct to cache marshalled public key bytes.
    • Implemented MarshalledPubKey() method to retrieve the cached public key or marshal it if not present.
    • Updated NewSigner to pre-cache the marshalled public key upon creation.
  • types/state.go
    • Added AssertValidSequence method for lightweight structural validation of self-produced blocks, bypassing expensive DACommitment re-computation.
Activity
  • No specific review comments or activity have been recorded for this pull request yet.

@claude

claude bot commented Feb 19, 2026

Claude finished @alpe's task in 4m 7s.


Code Review: Alex/execute_perf

Tasks

  • Gather context and understand the PR
  • Review block/internal/executing/executor.go
  • Review pkg/store/ changes (batch, keys, kv, tracing, types)
  • Review types/ changes (data, serialization, signer, state)
  • Analyze correctness, performance, and concurrency
  • Post final review

Good set of targeted optimizations overall. The approach is sound — caching per-block hashes, amortizing pending checks, and fixing the sequential P2P broadcast ordering. Below are the issues found, ordered by severity.


Critical / High

1. signHeader second return value is dead code — misleading API

block/internal/executing/executor.go:525

signature, _, err := e.signHeader(&header.Header)

signHeader was changed to return (types.Signature, []byte, error) with the comment:

"The caller can reuse headerBytes in SaveBlockDataFromBytes to avoid a redundant MarshalBinary call."

But the bytes are always discarded with _. More importantly, the bytes are the signing payload from AggregatorNodeSignatureBytesProvider, which may differ from header.MarshalBinary() (what SaveBlockDataFromBytes expects for storage). Using them interchangeably would be incorrect unless the provider always returns the full header binary. Either implement the optimization end-to-end (using the bytes in SaveBlockDataFromBytes when they match), or revert signHeader to returning two values and remove the misleading comment.


2. Cache warm silently ignores errors — no logging

block/internal/executing/executor.go:299–308

if state.LastBlockHeight > 0 {
    h, d, err := e.store.GetBlockData(e.ctx, state.LastBlockHeight)
    if err == nil {
        e.lastHeaderHash = h.Hash()
        e.lastDataHash = d.Hash()
        sig, err := e.store.GetSignature(e.ctx, state.LastBlockHeight)
        if err == nil {
            e.lastSignature = *sig
        }
    }
}

A failed GetBlockData at startup (e.g., due to data corruption or aggressive pruning) silently falls back to the cold path. No log message is emitted, so operators have no signal that the cache didn't warm. Even a logger.Debug() here would help diagnose performance regressions.


Medium

3. Amortized pending check can overshoot the limit

block/internal/executing/executor.go:449–466

const pendingCheckInterval uint64 = 64
shouldCheck := e.config.Node.MaxPendingHeadersAndData <= pendingCheckInterval ||
    e.pendingCheckCounter%pendingCheckInterval == 0

The condition MaxPendingHeadersAndData <= pendingCheckInterval is meant to force eager checking when the limit is small enough to overshoot within one interval. But with MaxPendingHeadersAndData = 100 and pendingCheckInterval = 64, 100 <= 64 is false, so checks are amortized every 64 blocks. In this case, the actual count could reach 163 before the next check triggers a rejection — a 63% overshoot. The comment claims it "won't overshoot" but the code doesn't fully enforce that. Either document the trade-off explicitly or tighten the condition.
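The worst-case arithmetic described above can be written out directly (helper name is hypothetical; the eager-path condition mirrors the PR's `limit <= interval` check):

```go
package main

import "fmt"

// worstCasePending computes the maximum pending count reachable before a
// check fires: the last passing check can see `limit` items, and up to
// interval-1 more blocks are produced before the next check.
func worstCasePending(limit, interval uint64) uint64 {
	if limit <= interval {
		return limit // eager path: checked every block, no overshoot
	}
	return limit + interval - 1
}

func main() {
	fmt.Println(worstCasePending(100, 64)) // 163, the overshoot cited above
}
```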


4. AssertValidSequence removes a correctness safety net

types/state.go:92, block/internal/executing/executor.go:533

Replacing blockProducer.ValidateBlock (which calls Validate(), and through it d.DACommitment()) with AssertValidSequence skips recomputing and verifying the DataHash. The justification is that the producer just computed it, but AssertValidSequence would silently accept a block with an incorrectly computed DataHash, masking bugs in CreateBlock. Consider keeping the full validation in testing/debug builds, or adding an explicit unit test proving that CreateBlock always produces a consistent DataHash. At minimum, document this trade-off clearly with a link to a tracking issue.


5. Missing unit tests for AssertValidSequence

types/state.go:92

The new AssertValidSequence method is used in the critical block production path as a replacement for full validation. There are no unit tests added for it that verify it correctly catches all checked conditions (wrong chain ID, wrong height, time regression, wrong last header hash, wrong app hash). Given that skipping DataHash verification is a conscious safety trade-off, the remaining checks should be thoroughly tested.


6. tracingMockBatch.SaveBlockDataFromBytes drops data silently

pkg/store/tracing_test.go:170

func (b *tracingMockBatch) SaveBlockDataFromBytes(header *types.SignedHeader, _, _ []byte, signature *types.Signature) error {
    // Delegate to SaveBlockData for test mocking purposes.
    return b.SaveBlockData(header, nil, signature)
}

This mock delegates to SaveBlockData with data = nil, silently discarding the pre-serialized bytes. If a test ever checks that SaveBlockDataFromBytes correctly persists the data, this mock would give a false negative. The mock should at least store the blobs for assertion, or the comment should more clearly warn that this is a structural stub only.


7. MarshalledPubKey has a potential data race

types/signer.go:19

func (s *Signer) MarshalledPubKey() ([]byte, error) {
    if len(s.marshalledPubKey) > 0 {
        return s.marshalledPubKey, nil
    }
    ...
    s.marshalledPubKey = bz
    return bz, nil
}

If two goroutines both call MarshalledPubKey() on the same *Signer concurrently, the check-and-set is a race (benign in terms of outcome — both write the same value — but still undefined behavior in Go's memory model). In the current codebase this is likely not exercised concurrently, but if SignedHeader.ToProto() is ever called from multiple goroutines on the same header (e.g., during sync + broadcast), it could trigger Go's race detector. Consider using sync.Once or sync/atomic for the cache.


Low / Style

8. Unnecessary copy in SaveBlockDataFromBytes

pkg/store/batch.go:66

signatureHash := *signature
...
if err := b.batch.Put(b.ctx, ds.RawKey(getSignatureKey(height)), signatureHash[:]); err != nil {

signatureHash := *signature dereferences the pointer, assigning the []byte to a local variable, then signatureHash[:] is a no-op re-slice. Just use *signature directly:

if err := b.batch.Put(b.ctx, ds.RawKey(getSignatureKey(height)), *signature); err != nil {

9. Duplicate exported/unexported key functions

pkg/store/keys.go:54–72

func GetHeaderKey(height uint64) string { return heightKey(headerPrefix, height) }
func getHeaderKey(height uint64) string { return heightKey(headerPrefix, height) }

Both exported and unexported variants do the same thing. Either use only the exported form internally (no functional difference) or explain why both are needed. The same pattern repeats for GetDataKey/getDataKey, GetSignatureKey/getSignatureKey. This is a pre-existing pattern but the PR perpetuates it.


10. path import listed as "removed" but path/filepath remains

pkg/store/kv.go

The changelog mentions "Removed unused path import." This is correct — path (stdlib simple path package) was removed. path/filepath is correctly retained and used by rootify. Just noting this for clarity since the changelog wording is slightly ambiguous.


Summary

  • P2P sequential broadcast fix: correct and necessary; good catch
  • In-memory hash/sig cache: correct; single-goroutine access means no race
  • Amortized pending check: functionally correct; overshoot potential should be documented
  • SaveBlockDataFromBytes refactor: clean; correctly avoids double-marshal via delegation
  • AssertValidSequence: risky trade-off; DataHash check omission needs more test coverage
  • MarshalledPubKey caching: correct for current access patterns; fragile if concurrency changes
  • heightKey allocation reduction: clean optimization
  • TxsByteSize(): simple and correct
  • State.ToProto timestamp optimization: correct; behavior matches timestamppb.New


@gemini-code-assist gemini-code-assist bot left a comment


Code Review

The pull request introduces several performance optimizations aimed at reducing redundant store reads and protobuf marshaling/unmarshaling. Key changes include caching lastHeaderHash, lastDataHash, and lastSignature in the Executor struct, amortizing pending limit checks, and introducing SaveBlockDataFromBytes to avoid re-serialization. Additionally, TxsByteSize is added for more efficient metrics reporting, and MarshalledPubKey caches public key serialization. The signHeader function now returns serialized header bytes to prevent redundant marshaling. Overall, these changes should lead to a more efficient block production pipeline.

Comment on lines +293 to +294
if _, err := e.store.GetMetadata(e.ctx, headerKey); err == nil {
e.hasPendingBlock.Store(true)

critical

The headerKey variable is not defined in this scope. It seems to be a global variable or a constant that is not imported or declared within this file. This will lead to a compilation error.

Suggested change
if _, err := e.store.GetMetadata(e.ctx, headerKey); err == nil {
e.hasPendingBlock.Store(true)
if _, err := e.store.GetMetadata(e.ctx, store.PendingBlockKey); err == nil {

"github.com/libp2p/go-libp2p/core/crypto"
"github.com/rs/zerolog"
"golang.org/x/sync/errgroup"


medium

The golang.org/x/sync/errgroup import was removed, but the corresponding usage in ProduceBlock (lines 548-558 in the old code) was replaced with sequential broadcasting. This is a functional change that might impact the overall block production time if broadcasting was previously done concurrently. While the comment on line 598 explains the sequential broadcast, it's worth noting the performance implication of this change.

// For based sequencer, this will return an empty signature
signature, err := e.signHeader(header.Header)
// For based sequencer, this will return an empty signature.
signature, _, err := e.signHeader(&header.Header)

medium

The signHeader function now returns three values, but only two are being captured here. The headerBytes (second return value) is not being used, which could be passed to batch.SaveBlockDataFromBytes to avoid re-marshalling the header. This would further optimize the block production process.

Suggested change
signature, _, err := e.signHeader(&header.Header)
signature, headerBytes, err := e.signHeader(&header.Header)

height := header.Height()
signatureHash := *signature

headerBlob, err := header.MarshalBinary()

medium

The height variable is not used in this function after the refactoring. It can be removed.

signatureHash := *signature

headerBlob, err := header.MarshalBinary()
if err != nil {

medium

The signatureHash variable is not used in this function after the refactoring. It can be removed.

// SaveBlockDataFromBytes saves pre-serialized block data to the batch.
// This avoids re-marshalling header and data when the caller already has the binary blobs.
func (b *DefaultBatch) SaveBlockDataFromBytes(header *types.SignedHeader, headerBlob, dataBlob []byte, signature *types.Signature) error {
height := header.Height()

medium

The height variable is already available from header.Height() and signatureHash from *signature. These lines are redundant and can be removed.

Suggested change
height := header.Height()
height := header.Height()
signatureHash := *signature

@@ -39,29 +39,40 @@ const (
heightPrefix = "t"

medium

The heightPrefix constant is defined but not used in the new heightKey function. It seems like the intention was to use it as the prefix, but it's hardcoded as 't' in the constant definition and then 'h', 'd', 'c', 's' in the heightKey calls. This could lead to confusion or incorrect key generation.

Comment on lines 90 to +91
func getHeightKey() string {
return GenerateKey([]string{heightPrefix})
return "/" + heightPrefix

medium

The getHeightKey function now returns a hardcoded string "/t". This is inconsistent with the heightKey function which dynamically generates keys based on a prefix and height. If heightPrefix is intended to be used here, it should be passed to heightKey or directly used in the string concatenation.

Suggested change
func getHeightKey() string {
return GenerateKey([]string{heightPrefix})
return "/" + heightPrefix
func getHeightKey() string {
return heightKey(heightPrefix, 0) // Or whatever default height is appropriate
}

Comment on lines 43 to +55
func GenerateKey(fields []string) string {
key := "/" + strings.Join(fields, "/")
return path.Clean(key)
// Pre-calculate total size to avoid re-allocation.
n := 0
for _, f := range fields {
n += 1 + len(f) // '/' + field
}
var b strings.Builder
b.Grow(n)
for _, f := range fields {
b.WriteByte('/')
b.WriteString(f)
}
return b.String()

medium

The GenerateKey function was optimized to use strings.Builder for performance. However, the path.Clean call was removed. While strings.Builder is more efficient for concatenation, path.Clean ensures that the generated key is always in a canonical form (e.g., handling double slashes or ./.. components). Removing it might lead to non-canonical keys if the input fields are not perfectly clean, which could cause issues with datastore lookups.

Base automatically changed from alex/bench_phase1_start to main February 19, 2026 16:20
@alpe alpe changed the title Alex/execute perf perf: optimize block creation Feb 19, 2026

@julienrbrt julienrbrt left a comment


nice! mostly lgtm! i think we can delete some old code and delete some struct cache to put them in cached_store instead (better for readability and consistency)

} else {
// Check if there's an already stored block at the newHeight.
// Only hit the store if the in-memory flag indicates a pending block exists.
if e.hasPendingBlock.Load() {

Nice! Makes a lot of sense indeed.


// Cached per-block data to avoid store reads + protobuf deserialization
// in CreateBlock. Updated after each successful block production.
lastHeaderHash types.Hash

I still believe those aren't necessary, because GetBlockData is cached on the second call.
We may simply want to add GetSignature to the pkg/store/cached_store.go for consistency.

// Amortized pending limit check — NumPendingHeaders/NumPendingData call
// advancePastEmptyData which scans the store. Only amortize when the limit
// is large enough that checking every N blocks won't overshoot.
const pendingCheckInterval uint64 = 64

This makes sense. Maybe we should say 64 is an arbitrary but good value?

header.Signature = signature

if err := e.blockProducer.ValidateBlock(ctx, currentState, header, data); err != nil {
// Structural validation only — skip the expensive Validate() / DACommitment()

Hmm, I'm not sure about this: we may have a buggy sequencer config and we won't see it.
AggregatorNodeSignatureBytesProvider is different for ev-abci, and we may want to verify?

Need to check more, but if we don't need them, we should remove SetCustomVerifierForAggregator and ValidateBlock, as they aren't needed nor used.


EDIT: i think we can remove those two indeed, and delete that comment, as ValidateBlock should be removed (with SetCustomVerifierForAggregator as well)

lastSignature = e.lastSignature
} else {
// Cold start fallback: read from store
lastHeader, lastData, err := e.store.GetBlockData(ctx, height-1)

technically not true, without those cached struct, it would have read from cache anyway

}

// AssertValidSequence performs lightweight state-sequence validation for self-produced blocks.
// It skips the expensive Validate() call (which re-computes DACommitment by re-marshaling

if we delete ValidateBlocks, let's clean up those comments.
