Conversation
The latest Buf updates on your PR. Results from workflow CI / buf-check (pull_request).
Summary of Changes

Hello @alpe, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed. This pull request focuses on significant performance improvements for block execution and storage operations. By introducing caching mechanisms, optimizing data serialization, and refining key generation, the changes aim to reduce CPU and memory overhead, particularly in high-throughput scenarios. The modifications also streamline the block production pipeline by avoiding redundant computations and ensuring correct P2P broadcast ordering, contributing to a more efficient and robust node operation.
Claude finished @alpe's task in 4m 7s — View job

Code Review: Alex/execute_perf

Good set of targeted optimizations overall. The approach is sound — caching per-block hashes, amortizing pending checks, and fixing the sequential P2P broadcast ordering. Below are the issues found, ordered by severity.

Critical / High

1. Serialized header bytes from signHeader are never used

```go
signature, _, err := e.signHeader(&header.Header)
```

But the bytes are always discarded with `_`. A sketch of threading them through follows.
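A minimal sketch of passing the serialized bytes through to the batch write (SaveBlockDataFromBytes is from this PR; the batch handle and the dataBytes variable are assumed for illustration):

```go
signature, headerBytes, err := e.signHeader(&header.Header)
if err != nil {
	return err
}
header.Signature = signature
// Reuse the bytes produced during signing so the store layer does not
// call MarshalBinary on the header a second time.
if err := batch.SaveBlockDataFromBytes(header, headerBytes, dataBytes, &signature); err != nil {
	return err
}
```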
2. Cache warm silently ignores errors — no logging

```go
if state.LastBlockHeight > 0 {
	h, d, err := e.store.GetBlockData(e.ctx, state.LastBlockHeight)
	if err == nil {
		e.lastHeaderHash = h.Hash()
		e.lastDataHash = d.Hash()
		sig, err := e.store.GetSignature(e.ctx, state.LastBlockHeight)
		if err == nil {
			e.lastSignature = *sig
		}
	}
}
```

A failed read here is swallowed without a log line, leaving the cache cold with no indication why.
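A minimal sketch of surfacing the failure instead (assumes an e.logger zerolog field, which this package's imports suggest):

```go
if h, d, err := e.store.GetBlockData(e.ctx, state.LastBlockHeight); err != nil {
	// Don't fail startup, but leave a trace explaining the cold cache.
	e.logger.Warn().Err(err).
		Uint64("height", state.LastBlockHeight).
		Msg("cache warm: failed to load last block data; starting with cold cache")
} else {
	e.lastHeaderHash = h.Hash()
	e.lastDataHash = d.Hash()
}
```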
Medium

3. Amortized pending check can overshoot the limit

```go
const pendingCheckInterval uint64 = 64

shouldCheck := e.config.Node.MaxPendingHeadersAndData <= pendingCheckInterval ||
	e.pendingCheckCounter%pendingCheckInterval == 0
```

The condition only fires every 64th block for large limits, so up to pendingCheckInterval−1 blocks can be produced between checks and the pending count can exceed MaxPendingHeadersAndData by up to 63.
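One way to keep the amortization while never exceeding the limit (a sketch only; e.lastPendingCount is a hypothetical cached count, not a field in this PR):

```go
limit := e.config.Node.MaxPendingHeadersAndData
shouldCheck := limit <= pendingCheckInterval ||
	e.pendingCheckCounter%pendingCheckInterval == 0 ||
	// Re-check every block once the last observed pending count is within
	// one interval of the limit, so the limit itself cannot be overshot.
	e.lastPendingCount+pendingCheckInterval >= limit
```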
4. Replacing …

5. Missing unit tests for …

The new …

6. Test mock bypasses the pre-serialized byte path

```go
func (b *tracingMockBatch) SaveBlockDataFromBytes(header *types.SignedHeader, _, _ []byte, signature *types.Signature) error {
	// Delegate to SaveBlockData for test mocking purposes.
	return b.SaveBlockData(header, nil, signature)
}
```

This mock delegates to SaveBlockData and drops the blob arguments, so the new byte path is never exercised by the tests.
7. Lazy public-key marshalling is not goroutine-safe

```go
func (s *Signer) MarshalledPubKey() ([]byte, error) {
	if len(s.marshalledPubKey) > 0 {
		return s.marshalledPubKey, nil
	}
	...
	s.marshalledPubKey = bz
	return bz, nil
}
```

If two goroutines both call MarshalledPubKey concurrently, the unsynchronized check-then-write on s.marshalledPubKey is a data race.
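A race-free variant using sync.Once (a sketch assuming the fields shown above plus a pubKey field; not the repo's exact struct layout):

```go
import (
	"sync"

	"github.com/libp2p/go-libp2p/core/crypto"
)

type Signer struct {
	pubKey crypto.PubKey

	marshalOnce      sync.Once
	marshalledPubKey []byte
	marshalErr       error
}

func (s *Signer) MarshalledPubKey() ([]byte, error) {
	// sync.Once guarantees the marshalling runs exactly once even under
	// concurrent callers, removing the check-then-write race.
	s.marshalOnce.Do(func() {
		s.marshalledPubKey, s.marshalErr = crypto.MarshalPublicKey(s.pubKey)
	})
	return s.marshalledPubKey, s.marshalErr
}
```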
Low / Style

8. Unnecessary copy in …

```go
signatureHash := *signature
...
if err := b.batch.Put(b.ctx, ds.RawKey(getSignatureKey(height)), signatureHash[:]); err != nil {
```

The copy can be dropped in favor of passing the signature directly:

```go
if err := b.batch.Put(b.ctx, ds.RawKey(getSignatureKey(height)), *signature); err != nil {
```

9. Duplicate exported/unexported key functions
```go
func GetHeaderKey(height uint64) string { return heightKey(headerPrefix, height) }
func getHeaderKey(height uint64) string { return heightKey(headerPrefix, height) }
```

Both exported and unexported variants do the same thing. Either use only the exported form internally (no functional difference) or explain why both are needed. The same pattern repeats for …

10. …

The changelog mentions "Removed unused …"

Summary
Code Review
The pull request introduces several performance optimizations aimed at reducing redundant store reads and protobuf marshaling/unmarshaling. Key changes include caching lastHeaderHash, lastDataHash, and lastSignature in the Executor struct, amortizing pending limit checks, and introducing SaveBlockDataFromBytes to avoid re-serialization. Additionally, TxsByteSize is added for more efficient metrics reporting, and MarshalledPubKey caches public key serialization. The signHeader function now returns serialized header bytes to prevent redundant marshaling. Overall, these changes should lead to a more efficient block production pipeline.
```go
if _, err := e.store.GetMetadata(e.ctx, headerKey); err == nil {
	e.hasPendingBlock.Store(true)
```

The headerKey variable is not defined in this scope. It appears to be a global variable or constant that is neither imported nor declared within this file, which will lead to a compilation error.

Suggested change:

```go
if _, err := e.store.GetMetadata(e.ctx, store.PendingBlockKey); err == nil {
	e.hasPendingBlock.Store(true)
```
| "github.com/libp2p/go-libp2p/core/crypto" | ||
| "github.com/rs/zerolog" | ||
| "golang.org/x/sync/errgroup" | ||
|
|
There was a problem hiding this comment.
The golang.org/x/sync/errgroup import was removed, but the corresponding usage in ProduceBlock (lines 548-558 in the old code) was replaced with sequential broadcasting. This is a functional change that might impact the overall block production time if broadcasting was previously done concurrently. While the comment on line 598 explains the sequential broadcast, it's worth noting the performance implication of this change.
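For reference, a sketch of the concurrent pattern that the sequential version replaces (broadcaster names are illustrative, not necessarily the repo's exact API):

```go
g, gctx := errgroup.WithContext(ctx)
g.Go(func() error { return e.headerBroadcaster.WriteToStoreAndBroadcast(gctx, header) })
g.Go(func() error { return e.dataBroadcaster.WriteToStoreAndBroadcast(gctx, data) })
if err := g.Wait(); err != nil {
	return fmt.Errorf("p2p broadcast: %w", err)
}
```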
```diff
-// For based sequencer, this will return an empty signature
-signature, err := e.signHeader(header.Header)
+// For based sequencer, this will return an empty signature.
+signature, _, err := e.signHeader(&header.Header)
```

The signHeader function now returns three values, but only two are captured here. The headerBytes (second return value) is discarded, even though it could be passed to batch.SaveBlockDataFromBytes to avoid re-marshalling the header. This would further optimize the block production process.

Suggested change:

```go
signature, headerBytes, err := e.signHeader(&header.Header)
```
```go
height := header.Height()
signatureHash := *signature

headerBlob, err := header.MarshalBinary()
if err != nil {
```
```go
// SaveBlockDataFromBytes saves pre-serialized block data to the batch.
// This avoids re-marshalling header and data when the caller already has the binary blobs.
func (b *DefaultBatch) SaveBlockDataFromBytes(header *types.SignedHeader, headerBlob, dataBlob []byte, signature *types.Signature) error {
	height := header.Height()
```
```go
@@ -39,29 +39,40 @@ const (
	heightPrefix = "t"
```

The heightPrefix constant is defined but not used in the new heightKey function. It seems the intention was to use it as the prefix, but it is hardcoded as "t" in the constant definition while the heightKey calls pass "h", "d", "c", and "s". This could lead to confusion or incorrect key generation.
```diff
 func getHeightKey() string {
-	return GenerateKey([]string{heightPrefix})
+	return "/" + heightPrefix
 }
```

The getHeightKey function now returns a hardcoded string "/t". This is inconsistent with the heightKey function, which dynamically generates keys from a prefix and height. If heightPrefix is intended to be used here, it should be passed to heightKey or used directly in the string concatenation.

Suggested change:

```go
func getHeightKey() string {
	return heightKey(heightPrefix, 0) // Or whatever default height is appropriate
}
```
```diff
 func GenerateKey(fields []string) string {
-	key := "/" + strings.Join(fields, "/")
-	return path.Clean(key)
+	// Pre-calculate total size to avoid re-allocation.
+	n := 0
+	for _, f := range fields {
+		n += 1 + len(f) // '/' + field
+	}
+	var b strings.Builder
+	b.Grow(n)
+	for _, f := range fields {
+		b.WriteByte('/')
+		b.WriteString(f)
+	}
+	return b.String()
 }
```

The GenerateKey function was optimized to use strings.Builder for performance. However, the path.Clean call was removed. While strings.Builder is more efficient for concatenation, path.Clean ensures the generated key is always in canonical form (e.g., collapsing double slashes or resolving ./.. components). Removing it might produce non-canonical keys if the input fields are not perfectly clean, which could cause issues with datastore lookups.
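A small self-contained illustration of the behavioral difference, using a hypothetical unclean input:

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

func main() {
	fields := []string{"h", "", "42"} // note the empty field

	// Old implementation: path.Clean canonicalizes the key.
	old := path.Clean("/" + strings.Join(fields, "/"))
	fmt.Println(old) // "/h/42"

	// New strings.Builder implementation: the empty segment survives.
	var b strings.Builder
	for _, f := range fields {
		b.WriteByte('/')
		b.WriteString(f)
	}
	fmt.Println(b.String()) // "/h//42"
}
```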
julienrbrt
left a comment
nice! mostly lgtm! i think we can delete some old code and delete some struct cache to put them in cached_store instead (better for readability and consistency)
```go
} else {
	// Check if there's an already stored block at the newHeight.
	// Only hit the store if the in-memory flag indicates a pending block exists.
	if e.hasPendingBlock.Load() {
```

Nice! Makes a lot of sense indeed.
```go
// Cached per-block data to avoid store reads + protobuf deserialization
// in CreateBlock. Updated after each successful block production.
lastHeaderHash types.Hash
```

I still believe those aren't necessary, because GetBlockData is cached on the second call.
We may simply want to add GetSignature to the pkg/store/cached_store.go for consistency.
```go
// Amortized pending limit check — NumPendingHeaders/NumPendingData call
// advancePastEmptyData which scans the store. Only amortize when the limit
// is large enough that checking every N blocks won't overshoot.
const pendingCheckInterval uint64 = 64
```

This makes sense. Maybe we should say 64 is an arbitrary but good value?
```go
header.Signature = signature

if err := e.blockProducer.ValidateBlock(ctx, currentState, header, data); err != nil {
	// Structural validation only — skip the expensive Validate() / DACommitment()
```

Hum, I'm not sure about this; we may have a buggy sequencer config and we won't see it.
AggregatorNodeSignatureBytesProvider is different for ev-abci, and we may want to verify?
Need to check more, but if we don't need to, we should remove SetCustomVerifierForAggregator and ValidateBlock as they aren't needed nor used.

EDIT: I think we can remove those two indeed, and delete that comment, as ValidateBlock should be removed (along with SetCustomVerifierForAggregator).
```go
	lastSignature = e.lastSignature
} else {
	// Cold start fallback: read from store
	lastHeader, lastData, err := e.store.GetBlockData(ctx, height-1)
```

Technically not true: without those cached structs, it would have read from the cache anyway.
```go
}

// AssertValidSequence performs lightweight state-sequence validation for self-produced blocks.
// It skips the expensive Validate() call (which re-computes DACommitment by re-marshaling
```

If we delete ValidateBlock, let's clean up those comments.
Bring moonshot 2 optimizations to main

Performance Optimizations Analysis (alex/bench_phase1_start vs HEAD)

Summary: Targeted optimizations focused on reducing CPU overhead (marshalling/hashing) and I/O latency (store reads) in the block production hot path.

1. Executor: In-Memory Caching for Block Production
Executor now caches lastHeaderHash, lastDataHash, and lastSignature in memory after committing a block. This avoids store.GetBlockData and store.GetSignature (database I/O) at the start of CreateBlock, which significantly speeds up the "cold start" of a new block round. A sketch of the cache shape follows.
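The field names below come from the PR; the struct layout and the update helper are assumed for illustration:

```go
type Executor struct {
	// ...existing fields...

	// Cached per-block data, refreshed after each committed block so the
	// next CreateBlock round skips store.GetBlockData / store.GetSignature.
	lastHeaderHash types.Hash
	lastDataHash   types.Hash
	lastSignature  types.Signature
}

// rememberLastBlock is a hypothetical helper showing when the cache updates.
func (e *Executor) rememberLastBlock(headerHash, dataHash types.Hash, sig types.Signature) {
	e.lastHeaderHash = headerHash
	e.lastDataHash = dataHash
	e.lastSignature = sig
}
```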
2. Executor: Avoid Re-hashing
CreateBlock reuses newState.LastHeaderHash (computed during state transition) instead of calling header.Hash() again before broadcasting. header.Hash() triggers a full Protobuf marshal and SHA-256 hash; reusing the already-computed hash saves CPU cycles.

3. Executor: Optimized Metrics Collection
Replaced data.Size() with data.TxsByteSize() for the BlockSizeBytes metric. data.Size() marshals the entire object to Protobuf just to measure it, while data.TxsByteSize() simply sums the lengths of the transaction byte slices, avoiding expensive allocations and serialization. A sketch of the cheaper accessor is shown below.
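A minimal sketch, assuming Data.Txs is a slice of byte slices as the description implies:

```go
// TxsByteSize sums raw transaction lengths without any proto marshalling.
func (d *Data) TxsByteSize() int {
	n := 0
	for _, tx := range d.Txs {
		n += len(tx)
	}
	return n
}
```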
4. Types: Cached Public Key Marshalling
The Signer struct now caches the marshalled public key bytes (marshalledPubKey). The ToProto methods (called frequently during P2P exchange and storage) used to call crypto.MarshalPublicKey every time; caching this eliminates repeated serialization overhead.

5. Types: Lightweight Sequence Validation
AssertValidSequence performs a lightweight state-sequence check instead of the full State.Validate() call, which requires re-marshalling and re-hashing the entire block data to verify the DACommitment.

6. Store/Executor: Pre-serialized Writes (SaveBlockDataFromBytes)
Added SaveBlockDataFromBytes to the Store interface; signHeader now returns the serialized header bytes. This removes a redundant MarshalBinary call at the storage layer.

7. Protobuf: Zero-Allocation Timestamp Conversion
State.ToProto manually constructs &timestamppb.Timestamp{...} instead of calling timestamppb.New(time), which performs a heap allocation. A small optimization for a high-frequency operation; a sketch follows.
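A sketch of the manual conversion (this mirrors the fields timestamppb.New fills; s.LastBlockTime is assumed to be the State's time.Time field):

```go
// timestamppb is google.golang.org/protobuf/types/known/timestamppb.
ts := &timestamppb.Timestamp{
	Seconds: s.LastBlockTime.Unix(),
	Nanos:   int32(s.LastBlockTime.Nanosecond()),
}
```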
8. Executor: Sequential P2P Broadcast
Changed the P2P broadcast from concurrent (errgroup) to sequential (Header then Data).

Base branch: alex/bench_phase1_start