Skip to content

Refactor: rewrite multi-tensormap with shared shard core#310

Open
jvjhfhg wants to merge 1 commit intohw-native-sys:mainfrom
jvjhfhg:refactor-tensormap
Open

Refactor: rewrite multi-tensormap with shared shard core#310
jvjhfhg wants to merge 1 commit intohw-native-sys:mainfrom
jvjhfhg:refactor-tensormap

Conversation

@jvjhfhg
Copy link
Collaborator

@jvjhfhg jvjhfhg commented Mar 17, 2026

  • route owner and fallback history through one TensorMapShardImpl template with two fixed configurations
  • derive fallback cleanup from real producer ring retirement instead of storing redundant lifecycle fields on entries
  • add a tracked design note under tensormap_and_ringbuffer/docs describing the committed multi-tensormap rewrite

@gemini-code-assist
Copy link

Summary of Changes

Hello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly refactors the TensorMap implementation to improve performance and isolation by introducing a per-ring architecture. Instead of a single global TensorMap, each processing ring now maintains its own independent set of TensorMap data structures. This change enables more efficient cleanup and lookup operations, particularly through a "break-on-stale" optimization, and better manages tensor dependencies across different rings. The orchestrator has been updated to correctly assign ring IDs to tensors, ensuring proper routing within the new TensorMap structure.

Highlights

  • Per-Ring TensorMap Structure: Introduced PTO2TensorMapRing to manage independent buckets, entry pools, free lists, and task entry heads for each processing ring, enhancing isolation and performance.
  • TensorMap Facade: Refactored PTO2TensorMap to act as a facade, routing all operations (lookup, insert, cleanup) to the appropriate PTO2TensorMapRing instance based on the tensor's ring_id.
  • Tensor Ring ID: Added a uint8_t ring_id field to the Tensor struct, allowing tensors to be explicitly associated with a specific ring, with TENSOR_RING_ID_NONE indicating an unassigned or external tensor.
  • Orchestrator Ring ID Assignment: Modified the orchestrator to automatically assign the current ring's ring_id to OUTPUT tensors and unassigned INOUT tensors during task submission.
  • Break-on-Stale Optimization: Enabled a 'break-on-stale' optimization within PTO2TensorMapRing lookup chains, which leverages the newest-first ordering of entries to stop searching a chain as soon as a stale entry is encountered.
  • Decentralized Cleanup Tracking: Moved the tensormap_last_cleanup tracking variable from the orchestrator's global state into each individual PTO2TensorMapRing instance, further decentralizing and isolating ring-specific state.

🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console.

Changelog
  • src/a2a3/runtime/tensormap_and_ringbuffer/runtime/pto_orchestrator.cpp
    • Removed initialization of orch->tensormap_last_cleanup.
    • Added logic to assign ring_id to INOUT tensors if unassigned, and to OUTPUT tensors.
  • src/a2a3/runtime/tensormap_and_ringbuffer/runtime/pto_orchestrator.h
    • Removed tensormap_last_cleanup member from PTO2OrchestratorState.
  • src/a2a3/runtime/tensormap_and_ringbuffer/runtime/pto_tensormap.cpp
    • Renamed PTO2TensorMap to PTO2TensorMapRing and updated its init, destroy, print_stats, valid_count, lookup, insert, and cleanup_retired methods to operate on a single ring.
    • Created a new PTO2TensorMap struct as a facade, containing an array of PTO2TensorMapRing instances and routing calls based on tensor.ring_id.
    • Modified sync_tensormap to iterate through all PTO2TensorMapRing instances and perform per-ring validity synchronization and cleanup.
    • Implemented the "break-on-stale" optimization in PTO2TensorMapRing::lookup.
  • src/a2a3/runtime/tensormap_and_ringbuffer/runtime/pto_tensormap.h
    • Introduced PTO2TensorMapRing struct with per-ring specific members like buckets, entry_pool, free_entry_list, task_entry_heads, task_window_size, last_task_alive, and last_cleanup.
    • Updated PTO2TensorMap to be a facade, holding an array of PTO2TensorMapRing objects.
    • Adjusted comments to reflect the new per-ring design and the "break-on-stale" optimization.
    • Removed ring_id parameter from sync_validity and cleanup_retired in PTO2TensorMapRing as it now operates on its own ring.
  • src/a2a3/runtime/tensormap_and_ringbuffer/runtime/tensor.h
    • Added uint8_t ring_id field to the Tensor struct.
    • Defined TENSOR_RING_ID_NONE constant.
    • Updated Tensor constructors and print method to include the new ring_id field.
Activity
  • No specific activity (comments, reviews, progress updates) was provided in the context for this pull request.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a significant and well-executed refactoring of TensorMap to a per-ring architecture. By creating PTO2TensorMapRing for per-ring resource management and making PTO2TensorMap a facade, the design improves isolation and enables a key performance optimization in lookup with break-on-stale logic. The changes are consistent and robust across all modified files. My review includes a couple of minor suggestions to simplify memory allocation and initialization by using calloc.

// Per-ring chain: entries are ordered newest-first (head-insert).
// A stale entry means all subsequent entries are also stale — break.
if (!entry_valid(*cur_entry)) {
cur_entry = next_entry;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这里需要还原为老的逻辑,把next置为nullptr,是否立即进行entry清理再看。

@jvjhfhg jvjhfhg force-pushed the refactor-tensormap branch 3 times, most recently from ec51fd1 to 0ef4a89 Compare March 18, 2026 11:00
@jvjhfhg
Copy link
Collaborator Author

jvjhfhg commented Mar 19, 2026

/gemini review

Copy link

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a significant and well-executed refactoring of the TensorMap to support multiple rings. The new design, which separates TensorMap into per-ring instances and a fallback map for external/cross-ring tensors, is clean and robust. The changes are consistent across the codebase, from the Tensor definition to the orchestration logic and even the examples. I have a few minor suggestions to simplify memory allocation initialization code.

@jvjhfhg jvjhfhg force-pushed the refactor-tensormap branch 3 times, most recently from 62022dd to fbf25d0 Compare March 19, 2026 06:56
@jvjhfhg jvjhfhg marked this pull request as draft March 19, 2026 08:12
@jvjhfhg jvjhfhg force-pushed the refactor-tensormap branch from fbf25d0 to 6aac0b0 Compare March 19, 2026 10:33
@jvjhfhg jvjhfhg changed the title Refactor TensorMap to per-ring buckets and pools Refactor: unify tensormap shard implementation Mar 19, 2026
@jvjhfhg jvjhfhg force-pushed the refactor-tensormap branch from 6aac0b0 to 9acb727 Compare March 19, 2026 10:35
@jvjhfhg jvjhfhg changed the title Refactor: unify tensormap shard implementation Refactor: rewrite multi-tensormap with shared shard core Mar 19, 2026
- route owner and fallback history through one TensorMapShardImpl
  template with two fixed configurations
- derive fallback cleanup from real producer ring retirement instead
  of storing redundant lifecycle fields on entries
- add a tracked design note under tensormap_and_ringbuffer/docs
  describing the committed multi-tensormap rewrite
@jvjhfhg jvjhfhg force-pushed the refactor-tensormap branch from 9acb727 to c4051da Compare March 19, 2026 10:36
@jvjhfhg jvjhfhg marked this pull request as ready for review March 19, 2026 10:44
@jvjhfhg
Copy link
Collaborator Author

jvjhfhg commented Mar 19, 2026

/gemini review

Copy link

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a significant refactoring of the multi-tensormap architecture, moving towards a shared shard core implementation for both owner and fallback history. The changes streamline the lifecycle management by deriving cleanup semantics from producer retirement and enforce a unified shard implementation. A new design note (MULTI_TENSORMAP_REWRITE.md) has been added to document these architectural changes and their rationale. Additionally, the Tensor structure has been updated to include a ring_id, and tensor factory functions now incorporate this new field, along with the introduction of a thread-local ring ID for scope management. The orchestration logic has been updated to reflect these new tensor and tensormap behaviors. A high-severity concern has been raised regarding the use of static thread_local for __pto2_ring_id, which could lead to inconsistent ring_id assignments and incorrect tensor routing if not managed carefully with RAII guards.

// Thread-local scope depth for tensor factory functions.
// Incremented/decremented by PTO2ScopeGuard and standalone scope wrappers.
// Tensor ring selection clamps this depth to the runtime's valid ring range.
static thread_local uint8_t __pto2_ring_id = 0;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The introduction of a static thread_local variable __pto2_ring_id for managing the current tensor ring ID ties the ring selection to the thread's scope depth. While thread_local mitigates global state issues across threads, it still represents global mutable state within a single thread. Incorrect nesting of PTO2_SCOPE blocks or direct manipulation of this variable outside of the intended RAII guards could lead to an inconsistent ring_id being assigned to tensors. This could result in tensors being incorrectly routed to owner or fallback shards, leading to subtle and hard-to-debug correctness issues in the tensor mapping logic.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants