Update local-execution to newest task-spec by elliottslaughter · Pull Request #1625 · flexflow/flexflow-train

elliottslaughter · 2026-01-17T01:18:08Z

Runs end-to-end tests, including loss function. Do not include cost estimator, that will be a separate PR.

Changes to local-execution:

ComputationGraphInstance now captures all the necessary state to run a computation
- InitializedComputationGraphInstance has been removed
- Creating the ComputationGraphInstance runs all passes, allocates and initializes the graph
local_task_registry.h is now static: no dynamic map is retained
- This means task IDs are no longer used or required
Irrelevant sections of the old runtime have been removed

Changes to task-spec:

DynamicValueAttrs now stores a DynamicTensorAccessor (variant of GenericTensorAccessorR and GenericTensorAccessorW)
Exposed the mapping computed from labelled_open_kwarg_dataflow_graph_from_dynamic_open_dataflow_graph
Fixes for make_dynamic_open_dataflow_graph_from_cg
Fixes for perform_update_insertion
Fixes for linear op and optimizer wrappers

Follow-up PRs:

Update local cost estimator for latest task-spec #1626

This change is

… equivalent dataflow, expand.

Refactor computation graph instance as DTG. Cut out more stuff that doesn't compile yet.

…taflow_graph_from_cg.

elliottslaughter

Still in progress, left some notes on things I have questions about.

@elliottslaughter made 18 comments and resolved 10 discussions.
Reviewable status: 64 of 81 files reviewed, 21 unresolved discussions (waiting on @lockshaw).

lib/local-execution/src/local-execution/local_task_registry.cc line 158 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

For clarity

Done.

lib/local-execution/src/local-execution/task_execution.cc line 18 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Minor Generally we pass composite arguments (or at least arguments that are logically composite--some types may interally have multiple values, but are intended to be treated as opaque handles) as const &.

In general, for consistency default to const & unless you have a specific reason for not doing so

I guess I was treating this like Copy in Rust (it's a composite type but all the fields are flat POD), but happy to do whatever you want.

lib/local-execution/src/local-execution/task_execution.cc line 126 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

I don't think the fmt::to_string is necessary

I definitely copied this from elsewhere in the codebase. What's the rule for when this is necessary vs not?

lib/local-execution/src/local-execution/computation_graph_instance/computation_graph_instance.cc line 84 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

For readability

Done.

lib/local-execution/src/local-execution/computation_graph_instance/computation_graph_instance.cc line 85 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

For readablity

Done.

lib/local-execution/src/local-execution/computation_graph_instance/computation_graph_instance.cc line 163 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Don't assign reference variables unless you know the cost savings are significant--it just creates too many opportunities for someone to mix up the lifetimes and create really annoying memory bugs

Done.

lib/local-execution/src/local-execution/computation_graph_instance/computation_graph_instance.cc line 185 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Really don't assign the result of nontrivial expressions to refs as it's then really easy to mess up the lifetimes--I'm actually not sure if this code is actually correct or not

Done.

lib/local-execution/src/local-execution/computation_graph_instance/computation_graph_instance.cc line 211 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Update here as well (mostly leaving the additional comment to help me remember to check this next round--normally I don't leave comments on every occurence)

Done.

lib/local-execution/src/local-execution/computation_graph_instance/computation_graph_instance.cc line 236 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Here as well

Done.

lib/local-execution/test/src/local-execution/local_task_argument_accessor.cc line 73 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Update subcase names to match (or at least not contradict) the updated function signature of get_tensor

I'm actually not sure how you want me to do this, unless you're literally asking me to write:

SUBCASE("get_tensor(make_task_tensor_parameter_fwd(TensorSlotName::LHS_INPUT), Permissions::RO)")

Is that it or do you have a shorter version you want to see here?

lib/task-spec/src/task-spec/dynamic_graph/dynamic_layer_guid_t.cc line 9 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Minor: Would make the name a bit clearer

Do you want this to be consistent with your other name? Because it seems you're asking for:

dynamic_layer_guid_t mk_dynamic_layer_guid_for_cg_layer(layer_guid_t);
dynamic_layer_guid_t mk_dynamic_layer_guid_for_parallel_layer(parallel_layer_guid_t);

Personally I would either make it layer/parallel_layer or else cg_layer/pcg_layer, but that's just me.

lib/task-spec/src/task-spec/dynamic_graph/dynamic_open_dataflow_graph.cc line 88 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

I don't really understand this function--if you already have a tensor_guid, what is the output tensor you're trying to find? And why does it return a pair of values?

See if the new version in the updated diff makes more sense.

lib/task-spec/src/task-spec/dynamic_graph/loss_insertion.cc line 49 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Shouldn't loss_attrs go somewhere in here?

I was on the fence about this. We could put loss attrs into the dynamic graph. On the other hand, optimizer attrs aren't stored in the graph, and arguably shouldn't be because they need to change on every iteration.

If you want them in the graph I'll add them.

lib/task-spec/src/task-spec/dynamic_graph/make_dynamic_open_dataflow_graph_from_cg.cc line 42 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Do we ever need to be able to distinguish between a TensorShape and a ParallelTensorShape for this field? If not this solution is fine, just wanted to confirm since by lifting everything to a ParallelTensorShape we do lose this information

Honestly, I don't know. As far as I'm aware everything downstream uses the ParallelTensorShape, but if you want this represented I can put it in.

lib/local-execution/src/local-execution/device_state_initialization.cc line 84 at r2 (raw file):

                               device_idx);
      });
  ASSERT(all_nodes_are_initialized(dg));

We can't assert here because not all nodes end up initialized at the end. I made the same mistake when I was initially writing, but I had to remove it. Perhaps you could write a more principled version that only checks certain kinds of nodes for this property, but it felt like more trouble than it's worth.

lib/local-execution/src/local-execution/local_task_registry.cc line 175 at r1 (raw file):

  std::optional<TaskImplFunction> task_impl_fn =
      get_fwd_task_impl_for_op_attrs(op_attrs);
  if (!task_impl_fn) {

Done.

lib/local-execution/src/local-execution/local_task_registry.cc line 189 at r1 (raw file):

  std::optional<TaskImplFunction> task_impl_fn =
      get_bwd_task_impl_for_op_attrs(op_attrs);
  if (!task_impl_fn) {

Done.

elliottslaughter

Still in progress, left some notes on things I have questions about.

@elliottslaughter made 18 comments and resolved 10 discussions.
Reviewable status: 64 of 81 files reviewed, 21 unresolved discussions (waiting on @lockshaw).

lib/local-execution/src/local-execution/local_task_registry.cc line 158 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

For clarity

Done.

lib/local-execution/src/local-execution/task_execution.cc line 18 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Minor Generally we pass composite arguments (or at least arguments that are logically composite--some types may interally have multiple values, but are intended to be treated as opaque handles) as const &.

In general, for consistency default to const & unless you have a specific reason for not doing so

I guess I was treating this like Copy in Rust (it's a composite type but all the fields are flat POD), but happy to do whatever you want.

lib/local-execution/src/local-execution/task_execution.cc line 126 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

I don't think the fmt::to_string is necessary

I definitely copied this from elsewhere in the codebase. What's the rule for when this is necessary vs not?

lib/local-execution/src/local-execution/computation_graph_instance/computation_graph_instance.cc line 84 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

For readability

Done.

lib/local-execution/src/local-execution/computation_graph_instance/computation_graph_instance.cc line 85 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

For readablity

Done.

lib/local-execution/src/local-execution/computation_graph_instance/computation_graph_instance.cc line 163 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Don't assign reference variables unless you know the cost savings are significant--it just creates too many opportunities for someone to mix up the lifetimes and create really annoying memory bugs

Done.

lib/local-execution/src/local-execution/computation_graph_instance/computation_graph_instance.cc line 185 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Really don't assign the result of nontrivial expressions to refs as it's then really easy to mess up the lifetimes--I'm actually not sure if this code is actually correct or not

Done.

lib/local-execution/src/local-execution/computation_graph_instance/computation_graph_instance.cc line 211 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Update here as well (mostly leaving the additional comment to help me remember to check this next round--normally I don't leave comments on every occurence)

Done.

lib/local-execution/src/local-execution/computation_graph_instance/computation_graph_instance.cc line 236 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Here as well

Done.

lib/local-execution/test/src/local-execution/local_task_argument_accessor.cc line 73 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Update subcase names to match (or at least not contradict) the updated function signature of get_tensor

I'm actually not sure how you want me to do this, unless you're literally asking me to write:

SUBCASE("get_tensor(make_task_tensor_parameter_fwd(TensorSlotName::LHS_INPUT), Permissions::RO)")

Is that it or do you have a shorter version you want to see here?

lib/task-spec/src/task-spec/dynamic_graph/dynamic_layer_guid_t.cc line 9 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Minor: Would make the name a bit clearer

Do you want this to be consistent with your other name? Because it seems you're asking for:

dynamic_layer_guid_t mk_dynamic_layer_guid_for_cg_layer(layer_guid_t);
dynamic_layer_guid_t mk_dynamic_layer_guid_for_parallel_layer(parallel_layer_guid_t);

Personally I would either make it layer/parallel_layer or else cg_layer/pcg_layer, but that's just me.

lib/task-spec/src/task-spec/dynamic_graph/dynamic_open_dataflow_graph.cc line 88 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

I don't really understand this function--if you already have a tensor_guid, what is the output tensor you're trying to find? And why does it return a pair of values?

See if the new version in the updated diff makes more sense.

lib/task-spec/src/task-spec/dynamic_graph/loss_insertion.cc line 49 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Shouldn't loss_attrs go somewhere in here?

I was on the fence about this. We could put loss attrs into the dynamic graph. On the other hand, optimizer attrs aren't stored in the graph, and arguably shouldn't be because they need to change on every iteration.

If you want them in the graph I'll add them.

lib/task-spec/src/task-spec/dynamic_graph/make_dynamic_open_dataflow_graph_from_cg.cc line 42 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Do we ever need to be able to distinguish between a TensorShape and a ParallelTensorShape for this field? If not this solution is fine, just wanted to confirm since by lifting everything to a ParallelTensorShape we do lose this information

Honestly, I don't know. As far as I'm aware everything downstream uses the ParallelTensorShape, but if you want this represented I can put it in.

lib/local-execution/src/local-execution/device_state_initialization.cc line 84 at r2 (raw file):

                               device_idx);
      });
  ASSERT(all_nodes_are_initialized(dg));

We can't assert here because not all nodes end up initialized at the end. I made the same mistake when I was initially writing, but I had to remove it. Perhaps you could write a more principled version that only checks certain kinds of nodes for this property, but it felt like more trouble than it's worth.

lib/local-execution/src/local-execution/local_task_registry.cc line 175 at r1 (raw file):

  std::optional<TaskImplFunction> task_impl_fn =
      get_fwd_task_impl_for_op_attrs(op_attrs);
  if (!task_impl_fn) {

Done.

lib/local-execution/src/local-execution/local_task_registry.cc line 189 at r1 (raw file):

  std::optional<TaskImplFunction> task_impl_fn =
      get_bwd_task_impl_for_op_attrs(op_attrs);
  if (!task_impl_fn) {

Done.

elliottslaughter

Still in progress, left some notes on things I have questions about.

@elliottslaughter made 18 comments and resolved 10 discussions.
Reviewable status: 64 of 81 files reviewed, 21 unresolved discussions (waiting on @lockshaw).

lib/local-execution/src/local-execution/local_task_registry.cc line 158 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

For clarity

Done.

lib/local-execution/src/local-execution/task_execution.cc line 18 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Minor Generally we pass composite arguments (or at least arguments that are logically composite--some types may interally have multiple values, but are intended to be treated as opaque handles) as const &.

In general, for consistency default to const & unless you have a specific reason for not doing so

I guess I was treating this like Copy in Rust (it's a composite type but all the fields are flat POD), but happy to do whatever you want.

lib/local-execution/src/local-execution/task_execution.cc line 126 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

I don't think the fmt::to_string is necessary

I definitely copied this from elsewhere in the codebase. What's the rule for when this is necessary vs not?

lib/local-execution/src/local-execution/computation_graph_instance/computation_graph_instance.cc line 84 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

For readability

Done.

lib/local-execution/src/local-execution/computation_graph_instance/computation_graph_instance.cc line 85 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

For readablity

Done.

lib/local-execution/src/local-execution/computation_graph_instance/computation_graph_instance.cc line 163 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Don't assign reference variables unless you know the cost savings are significant--it just creates too many opportunities for someone to mix up the lifetimes and create really annoying memory bugs

Done.

lib/local-execution/src/local-execution/computation_graph_instance/computation_graph_instance.cc line 185 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Really don't assign the result of nontrivial expressions to refs as it's then really easy to mess up the lifetimes--I'm actually not sure if this code is actually correct or not

Done.

lib/local-execution/src/local-execution/computation_graph_instance/computation_graph_instance.cc line 211 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Update here as well (mostly leaving the additional comment to help me remember to check this next round--normally I don't leave comments on every occurence)

Done.

lib/local-execution/src/local-execution/computation_graph_instance/computation_graph_instance.cc line 236 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Here as well

Done.

lib/local-execution/test/src/local-execution/local_task_argument_accessor.cc line 73 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Update subcase names to match (or at least not contradict) the updated function signature of get_tensor

I'm actually not sure how you want me to do this, unless you're literally asking me to write:

SUBCASE("get_tensor(make_task_tensor_parameter_fwd(TensorSlotName::LHS_INPUT), Permissions::RO)")

Is that it or do you have a shorter version you want to see here?

lib/task-spec/src/task-spec/dynamic_graph/dynamic_layer_guid_t.cc line 9 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Minor: Would make the name a bit clearer

Do you want this to be consistent with your other name? Because it seems you're asking for:

dynamic_layer_guid_t mk_dynamic_layer_guid_for_cg_layer(layer_guid_t);
dynamic_layer_guid_t mk_dynamic_layer_guid_for_parallel_layer(parallel_layer_guid_t);

Personally I would either make it layer/parallel_layer or else cg_layer/pcg_layer, but that's just me.

lib/task-spec/src/task-spec/dynamic_graph/dynamic_open_dataflow_graph.cc line 88 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

I don't really understand this function--if you already have a tensor_guid, what is the output tensor you're trying to find? And why does it return a pair of values?

See if the new version in the updated diff makes more sense.

lib/task-spec/src/task-spec/dynamic_graph/loss_insertion.cc line 49 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Shouldn't loss_attrs go somewhere in here?

I was on the fence about this. We could put loss attrs into the dynamic graph. On the other hand, optimizer attrs aren't stored in the graph, and arguably shouldn't be because they need to change on every iteration.

If you want them in the graph I'll add them.

lib/task-spec/src/task-spec/dynamic_graph/make_dynamic_open_dataflow_graph_from_cg.cc line 42 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Do we ever need to be able to distinguish between a TensorShape and a ParallelTensorShape for this field? If not this solution is fine, just wanted to confirm since by lifting everything to a ParallelTensorShape we do lose this information

Honestly, I don't know. As far as I'm aware everything downstream uses the ParallelTensorShape, but if you want this represented I can put it in.

lib/local-execution/src/local-execution/local_task_registry.cc line 175 at r1 (raw file):

  std::optional<TaskImplFunction> task_impl_fn =
      get_fwd_task_impl_for_op_attrs(op_attrs);
  if (!task_impl_fn) {

Done.

lib/local-execution/src/local-execution/local_task_registry.cc line 189 at r1 (raw file):

  std::optional<TaskImplFunction> task_impl_fn =
      get_bwd_task_impl_for_op_attrs(op_attrs);
  if (!task_impl_fn) {

Done.

lib/local-execution/src/local-execution/device_state_initialization.cc line 84 at r2 (raw file):

                               device_idx);
      });
  ASSERT(all_nodes_are_initialized(dg));

We can't assert here because not all nodes end up initialized at the end. I made the same mistake when I was initially writing, but I had to remove it. Perhaps you could write a more principled version that only checks certain kinds of nodes for this property, but it felt like more trouble than it's worth.

elliottslaughter

Still in progress, left some notes on things I have questions about.

@elliottslaughter made 18 comments and resolved 10 discussions.
Reviewable status: 64 of 81 files reviewed, 21 unresolved discussions (waiting on @lockshaw).

lib/local-execution/src/local-execution/local_task_registry.cc line 158 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

For clarity

Done.

lib/local-execution/src/local-execution/task_execution.cc line 18 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Minor Generally we pass composite arguments (or at least arguments that are logically composite--some types may interally have multiple values, but are intended to be treated as opaque handles) as const &.

In general, for consistency default to const & unless you have a specific reason for not doing so

I guess I was treating this like Copy in Rust (it's a composite type but all the fields are flat POD), but happy to do whatever you want.

lib/local-execution/src/local-execution/task_execution.cc line 126 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

I don't think the fmt::to_string is necessary

I definitely copied this from elsewhere in the codebase. What's the rule for when this is necessary vs not?

lib/local-execution/src/local-execution/computation_graph_instance/computation_graph_instance.cc line 84 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

For readability

Done.

lib/local-execution/src/local-execution/computation_graph_instance/computation_graph_instance.cc line 85 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

For readablity

Done.

lib/local-execution/src/local-execution/computation_graph_instance/computation_graph_instance.cc line 163 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Don't assign reference variables unless you know the cost savings are significant--it just creates too many opportunities for someone to mix up the lifetimes and create really annoying memory bugs

Done.

lib/local-execution/src/local-execution/computation_graph_instance/computation_graph_instance.cc line 185 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Really don't assign the result of nontrivial expressions to refs as it's then really easy to mess up the lifetimes--I'm actually not sure if this code is actually correct or not

Done.

lib/local-execution/src/local-execution/computation_graph_instance/computation_graph_instance.cc line 211 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Update here as well (mostly leaving the additional comment to help me remember to check this next round--normally I don't leave comments on every occurence)

Done.

lib/local-execution/src/local-execution/computation_graph_instance/computation_graph_instance.cc line 236 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Here as well

Done.

lib/local-execution/test/src/local-execution/local_task_argument_accessor.cc line 73 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Update subcase names to match (or at least not contradict) the updated function signature of get_tensor

I'm actually not sure how you want me to do this, unless you're literally asking me to write:

SUBCASE("get_tensor(make_task_tensor_parameter_fwd(TensorSlotName::LHS_INPUT), Permissions::RO)")

Is that it or do you have a shorter version you want to see here?

lib/task-spec/src/task-spec/dynamic_graph/dynamic_layer_guid_t.cc line 9 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Minor: Would make the name a bit clearer

Do you want this to be consistent with your other name? Because it seems you're asking for:

dynamic_layer_guid_t mk_dynamic_layer_guid_for_cg_layer(layer_guid_t);
dynamic_layer_guid_t mk_dynamic_layer_guid_for_parallel_layer(parallel_layer_guid_t);

Personally I would either make it layer/parallel_layer or else cg_layer/pcg_layer, but that's just me.

lib/task-spec/src/task-spec/dynamic_graph/dynamic_open_dataflow_graph.cc line 88 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

I don't really understand this function--if you already have a tensor_guid, what is the output tensor you're trying to find? And why does it return a pair of values?

See if the new version in the updated diff makes more sense.

lib/task-spec/src/task-spec/dynamic_graph/loss_insertion.cc line 49 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Shouldn't loss_attrs go somewhere in here?

I was on the fence about this. We could put loss attrs into the dynamic graph. On the other hand, optimizer attrs aren't stored in the graph, and arguably shouldn't be because they need to change on every iteration.

If you want them in the graph I'll add them.

lib/task-spec/src/task-spec/dynamic_graph/make_dynamic_open_dataflow_graph_from_cg.cc line 42 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Do we ever need to be able to distinguish between a TensorShape and a ParallelTensorShape for this field? If not this solution is fine, just wanted to confirm since by lifting everything to a ParallelTensorShape we do lose this information

Honestly, I don't know. As far as I'm aware everything downstream uses the ParallelTensorShape, but if you want this represented I can put it in.

lib/local-execution/src/local-execution/device_state_initialization.cc line 84 at r2 (raw file):

                               device_idx);
      });
  ASSERT(all_nodes_are_initialized(dg));

We can't assert here because not all nodes end up initialized at the end. I made the same mistake when I was initially writing, but I had to remove it. Perhaps you could write a more principled version that only checks certain kinds of nodes for this property, but it felt like more trouble than it's worth.

lib/local-execution/src/local-execution/local_task_registry.cc line 175 at r1 (raw file):

  std::optional<TaskImplFunction> task_impl_fn =
      get_fwd_task_impl_for_op_attrs(op_attrs);
  if (!task_impl_fn) {

Done.

lib/local-execution/src/local-execution/local_task_registry.cc line 189 at r1 (raw file):

  std::optional<TaskImplFunction> task_impl_fn =
      get_bwd_task_impl_for_op_attrs(op_attrs);
  if (!task_impl_fn) {

Done.

elliottslaughter

Still in progress, left some notes on things I have questions about.

@elliottslaughter made 18 comments and resolved 10 discussions.
Reviewable status: 64 of 81 files reviewed, 21 unresolved discussions (waiting on @lockshaw).

lib/local-execution/src/local-execution/local_task_registry.cc line 158 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

For clarity

Done.

lib/local-execution/src/local-execution/task_execution.cc line 18 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Minor Generally we pass composite arguments (or at least arguments that are logically composite--some types may interally have multiple values, but are intended to be treated as opaque handles) as const &.

In general, for consistency default to const & unless you have a specific reason for not doing so

I guess I was treating this like Copy in Rust (it's a composite type but all the fields are flat POD), but happy to do whatever you want.

lib/local-execution/src/local-execution/task_execution.cc line 126 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

I don't think the fmt::to_string is necessary

I definitely copied this from elsewhere in the codebase. What's the rule for when this is necessary vs not?

lib/local-execution/src/local-execution/computation_graph_instance/computation_graph_instance.cc line 84 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

For readability

Done.

lib/local-execution/src/local-execution/computation_graph_instance/computation_graph_instance.cc line 85 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

For readablity

Done.

lib/local-execution/src/local-execution/computation_graph_instance/computation_graph_instance.cc line 163 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Don't assign reference variables unless you know the cost savings are significant--it just creates too many opportunities for someone to mix up the lifetimes and create really annoying memory bugs

Done.

lib/local-execution/src/local-execution/computation_graph_instance/computation_graph_instance.cc line 185 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Really don't assign the result of nontrivial expressions to refs as it's then really easy to mess up the lifetimes--I'm actually not sure if this code is actually correct or not

Done.

lib/local-execution/src/local-execution/computation_graph_instance/computation_graph_instance.cc line 211 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Update here as well (mostly leaving the additional comment to help me remember to check this next round--normally I don't leave comments on every occurence)

Done.

lib/local-execution/src/local-execution/computation_graph_instance/computation_graph_instance.cc line 236 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Here as well

Done.

lib/local-execution/test/src/local-execution/local_task_argument_accessor.cc line 73 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Update subcase names to match (or at least not contradict) the updated function signature of get_tensor

I'm actually not sure how you want me to do this, unless you're literally asking me to write:

SUBCASE("get_tensor(make_task_tensor_parameter_fwd(TensorSlotName::LHS_INPUT), Permissions::RO)")

Is that it or do you have a shorter version you want to see here?

lib/task-spec/src/task-spec/dynamic_graph/dynamic_layer_guid_t.cc line 9 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Minor: Would make the name a bit clearer

Do you want this to be consistent with your other name? Because it seems you're asking for:

dynamic_layer_guid_t mk_dynamic_layer_guid_for_cg_layer(layer_guid_t);
dynamic_layer_guid_t mk_dynamic_layer_guid_for_parallel_layer(parallel_layer_guid_t);

Personally I would either make it layer/parallel_layer or else cg_layer/pcg_layer, but that's just me.

lib/task-spec/src/task-spec/dynamic_graph/dynamic_open_dataflow_graph.cc line 88 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

I don't really understand this function--if you already have a tensor_guid, what is the output tensor you're trying to find? And why does it return a pair of values?

See if the new version in the updated diff makes more sense.

lib/task-spec/src/task-spec/dynamic_graph/loss_insertion.cc line 49 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Shouldn't loss_attrs go somewhere in here?

I was on the fence about this. We could put loss attrs into the dynamic graph. On the other hand, optimizer attrs aren't stored in the graph, and arguably shouldn't be because they need to change on every iteration.

If you want them in the graph I'll add them.

lib/task-spec/src/task-spec/dynamic_graph/make_dynamic_open_dataflow_graph_from_cg.cc line 42 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Do we ever need to be able to distinguish between a TensorShape and a ParallelTensorShape for this field? If not this solution is fine, just wanted to confirm since by lifting everything to a ParallelTensorShape we do lose this information

Honestly, I don't know. As far as I'm aware everything downstream uses the ParallelTensorShape, but if you want this represented I can put it in.

lib/local-execution/src/local-execution/local_task_registry.cc line 175 at r1 (raw file):

  std::optional<TaskImplFunction> task_impl_fn =
      get_fwd_task_impl_for_op_attrs(op_attrs);
  if (!task_impl_fn) {

Done.

lib/local-execution/src/local-execution/local_task_registry.cc line 189 at r1 (raw file):

  std::optional<TaskImplFunction> task_impl_fn =
      get_bwd_task_impl_for_op_attrs(op_attrs);
  if (!task_impl_fn) {

Done.

lib/local-execution/src/local-execution/device_state_initialization.cc line 84 at r2 (raw file):

                               device_idx);
      });
  ASSERT(all_nodes_are_initialized(dg));

We can't assert here because not all nodes end up initialized at the end. I made the same mistake when I was initially writing, but I had to remove it. Perhaps you could write a more principled version that only checks certain kinds of nodes for this property, but it felt like more trouble than it's worth.

elliottslaughter

Still in progress, left some notes on things I have questions about.

@elliottslaughter made 18 comments and resolved 10 discussions.
Reviewable status: 64 of 81 files reviewed, 21 unresolved discussions (waiting on @lockshaw).

lib/local-execution/src/local-execution/local_task_registry.cc line 158 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

For clarity

Done.

lib/local-execution/src/local-execution/task_execution.cc line 18 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Minor Generally we pass composite arguments (or at least arguments that are logically composite--some types may interally have multiple values, but are intended to be treated as opaque handles) as const &.

In general, for consistency default to const & unless you have a specific reason for not doing so

I guess I was treating this like Copy in Rust (it's a composite type but all the fields are flat POD), but happy to do whatever you want.

lib/local-execution/src/local-execution/task_execution.cc line 126 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

I don't think the fmt::to_string is necessary

I definitely copied this from elsewhere in the codebase. What's the rule for when this is necessary vs not?

lib/local-execution/src/local-execution/computation_graph_instance/computation_graph_instance.cc line 84 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

For readability

Done.

lib/local-execution/src/local-execution/computation_graph_instance/computation_graph_instance.cc line 85 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

For readablity

Done.

lib/local-execution/src/local-execution/computation_graph_instance/computation_graph_instance.cc line 163 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Don't assign reference variables unless you know the cost savings are significant--it just creates too many opportunities for someone to mix up the lifetimes and create really annoying memory bugs

Done.

lib/local-execution/src/local-execution/computation_graph_instance/computation_graph_instance.cc line 185 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Really don't assign the result of nontrivial expressions to refs as it's then really easy to mess up the lifetimes--I'm actually not sure if this code is actually correct or not

Done.

lib/local-execution/src/local-execution/computation_graph_instance/computation_graph_instance.cc line 211 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Update here as well (mostly leaving the additional comment to help me remember to check this next round--normally I don't leave comments on every occurence)

Done.

lib/local-execution/src/local-execution/computation_graph_instance/computation_graph_instance.cc line 236 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Here as well

Done.

lib/local-execution/test/src/local-execution/local_task_argument_accessor.cc line 73 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Update subcase names to match (or at least not contradict) the updated function signature of get_tensor

I'm actually not sure how you want me to do this, unless you're literally asking me to write:

SUBCASE("get_tensor(make_task_tensor_parameter_fwd(TensorSlotName::LHS_INPUT), Permissions::RO)")

Is that it or do you have a shorter version you want to see here?

lib/task-spec/src/task-spec/dynamic_graph/dynamic_layer_guid_t.cc line 9 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Minor: Would make the name a bit clearer

Do you want this to be consistent with your other name? Because it seems you're asking for:

dynamic_layer_guid_t mk_dynamic_layer_guid_for_cg_layer(layer_guid_t);
dynamic_layer_guid_t mk_dynamic_layer_guid_for_parallel_layer(parallel_layer_guid_t);

Personally I would either make it layer/parallel_layer or else cg_layer/pcg_layer, but that's just me.

lib/task-spec/src/task-spec/dynamic_graph/dynamic_open_dataflow_graph.cc line 88 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

I don't really understand this function--if you already have a tensor_guid, what is the output tensor you're trying to find? And why does it return a pair of values?

See if the new version in the updated diff makes more sense.

lib/task-spec/src/task-spec/dynamic_graph/loss_insertion.cc line 49 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Shouldn't loss_attrs go somewhere in here?

I was on the fence about this. We could put loss attrs into the dynamic graph. On the other hand, optimizer attrs aren't stored in the graph, and arguably shouldn't be because they need to change on every iteration.

If you want them in the graph I'll add them.

lib/task-spec/src/task-spec/dynamic_graph/make_dynamic_open_dataflow_graph_from_cg.cc line 42 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Do we ever need to be able to distinguish between a TensorShape and a ParallelTensorShape for this field? If not this solution is fine, just wanted to confirm since by lifting everything to a ParallelTensorShape we do lose this information

Honestly, I don't know. As far as I'm aware everything downstream uses the ParallelTensorShape, but if you want this represented I can put it in.

lib/local-execution/src/local-execution/device_state_initialization.cc line 84 at r2 (raw file):

                               device_idx);
      });
  ASSERT(all_nodes_are_initialized(dg));

We can't assert here because not all nodes end up initialized at the end. I made the same mistake when I was initially writing, but I had to remove it. Perhaps you could write a more principled version that only checks certain kinds of nodes for this property, but it felt like more trouble than it's worth.

lib/local-execution/src/local-execution/local_task_registry.cc line 175 at r1 (raw file):

  std::optional<TaskImplFunction> task_impl_fn =
      get_fwd_task_impl_for_op_attrs(op_attrs);
  if (!task_impl_fn) {

Done.

lib/local-execution/src/local-execution/local_task_registry.cc line 189 at r1 (raw file):

  std::optional<TaskImplFunction> task_impl_fn =
      get_bwd_task_impl_for_op_attrs(op_attrs);
  if (!task_impl_fn) {

Done.

elliottslaughter

Still in progress, left some notes on things I have questions about.

@elliottslaughter made 18 comments and resolved 10 discussions.
Reviewable status: 64 of 81 files reviewed, 21 unresolved discussions (waiting on @lockshaw).

lib/local-execution/src/local-execution/local_task_registry.cc line 158 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

For clarity

Done.

lib/local-execution/src/local-execution/task_execution.cc line 18 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Minor Generally we pass composite arguments (or at least arguments that are logically composite--some types may interally have multiple values, but are intended to be treated as opaque handles) as const &.

In general, for consistency default to const & unless you have a specific reason for not doing so

I guess I was treating this like Copy in Rust (it's a composite type but all the fields are flat POD), but happy to do whatever you want.

lib/local-execution/src/local-execution/task_execution.cc line 126 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

I don't think the fmt::to_string is necessary

I definitely copied this from elsewhere in the codebase. What's the rule for when this is necessary vs not?

lib/local-execution/src/local-execution/computation_graph_instance/computation_graph_instance.cc line 84 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

For readability

Done.

lib/local-execution/src/local-execution/computation_graph_instance/computation_graph_instance.cc line 85 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

For readablity

Done.

lib/local-execution/src/local-execution/computation_graph_instance/computation_graph_instance.cc line 163 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Don't assign reference variables unless you know the cost savings are significant--it just creates too many opportunities for someone to mix up the lifetimes and create really annoying memory bugs

Done.

lib/local-execution/src/local-execution/computation_graph_instance/computation_graph_instance.cc line 185 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Really don't assign the result of nontrivial expressions to refs as it's then really easy to mess up the lifetimes--I'm actually not sure if this code is actually correct or not

Done.

lib/local-execution/src/local-execution/computation_graph_instance/computation_graph_instance.cc line 211 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Update here as well (mostly leaving the additional comment to help me remember to check this next round--normally I don't leave comments on every occurence)

Done.

lib/local-execution/src/local-execution/computation_graph_instance/computation_graph_instance.cc line 236 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Here as well

Done.

lib/local-execution/test/src/local-execution/local_task_argument_accessor.cc line 73 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Update subcase names to match (or at least not contradict) the updated function signature of get_tensor

I'm actually not sure how you want me to do this, unless you're literally asking me to write:

SUBCASE("get_tensor(make_task_tensor_parameter_fwd(TensorSlotName::LHS_INPUT), Permissions::RO)")

Is that it or do you have a shorter version you want to see here?

lib/task-spec/src/task-spec/dynamic_graph/dynamic_layer_guid_t.cc line 9 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Minor: Would make the name a bit clearer

Do you want this to be consistent with your other name? Because it seems you're asking for:

dynamic_layer_guid_t mk_dynamic_layer_guid_for_cg_layer(layer_guid_t);
dynamic_layer_guid_t mk_dynamic_layer_guid_for_parallel_layer(parallel_layer_guid_t);

Personally I would either make it layer/parallel_layer or else cg_layer/pcg_layer, but that's just me.

lib/task-spec/src/task-spec/dynamic_graph/dynamic_open_dataflow_graph.cc line 88 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

I don't really understand this function--if you already have a tensor_guid, what is the output tensor you're trying to find? And why does it return a pair of values?

See if the new version in the updated diff makes more sense.

lib/task-spec/src/task-spec/dynamic_graph/loss_insertion.cc line 49 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Shouldn't loss_attrs go somewhere in here?

I was on the fence about this. We could put loss attrs into the dynamic graph. On the other hand, optimizer attrs aren't stored in the graph, and arguably shouldn't be because they need to change on every iteration.

If you want them in the graph I'll add them.

lib/task-spec/src/task-spec/dynamic_graph/make_dynamic_open_dataflow_graph_from_cg.cc line 42 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Do we ever need to be able to distinguish between a TensorShape and a ParallelTensorShape for this field? If not this solution is fine, just wanted to confirm since by lifting everything to a ParallelTensorShape we do lose this information

Honestly, I don't know. As far as I'm aware everything downstream uses the ParallelTensorShape, but if you want this represented I can put it in.

lib/local-execution/src/local-execution/device_state_initialization.cc line 84 at r2 (raw file):

                               device_idx);
      });
  ASSERT(all_nodes_are_initialized(dg));

We can't assert here because not all nodes end up initialized at the end. I made the same mistake when I was initially writing, but I had to remove it. Perhaps you could write a more principled version that only checks certain kinds of nodes for this property, but it felt like more trouble than it's worth.

lib/local-execution/src/local-execution/local_task_registry.cc line 175 at r1 (raw file):

  std::optional<TaskImplFunction> task_impl_fn =
      get_fwd_task_impl_for_op_attrs(op_attrs);
  if (!task_impl_fn) {

Done.

lib/local-execution/src/local-execution/local_task_registry.cc line 189 at r1 (raw file):

  std::optional<TaskImplFunction> task_impl_fn =
      get_bwd_task_impl_for_op_attrs(op_attrs);
  if (!task_impl_fn) {

Done.

elliottslaughter

@elliottslaughter resolved 10 discussions.
Reviewable status: 64 of 81 files reviewed, 20 unresolved discussions (waiting on @lockshaw).

lockshaw

@lockshaw reviewed 4 files and all commit messages, and made 1 comment.
Reviewable status: 68 of 81 files reviewed, 20 unresolved discussions (waiting on @elliottslaughter).

lib/local-execution/include/local-execution/computation_graph_instance/computation_graph_instance.h line 30 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Minor: Generally we make all constructors explicit unless there's a clear reason not to

Testing publishing a response

elliottslaughter

@elliottslaughter made 10 comments.
Reviewable status: 68 of 81 files reviewed, 20 unresolved discussions (waiting on @lockshaw).

lib/local-execution/src/local-execution/local_task_registry.cc line 158 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

For clarity

Done.

lib/local-execution/src/local-execution/task_execution.cc line 126 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

I don't think the fmt::to_string is necessary

Done.

lib/local-execution/src/local-execution/computation_graph_instance/computation_graph_instance.cc line 84 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

For readability

Done.

lib/local-execution/src/local-execution/computation_graph_instance/computation_graph_instance.cc line 85 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

For readablity

Done.

lib/local-execution/src/local-execution/computation_graph_instance/computation_graph_instance.cc line 163 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Don't assign reference variables unless you know the cost savings are significant--it just creates too many opportunities for someone to mix up the lifetimes and create really annoying memory bugs

Done.

lib/local-execution/src/local-execution/computation_graph_instance/computation_graph_instance.cc line 185 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Really don't assign the result of nontrivial expressions to refs as it's then really easy to mess up the lifetimes--I'm actually not sure if this code is actually correct or not

Done.

lib/local-execution/src/local-execution/computation_graph_instance/computation_graph_instance.cc line 211 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Update here as well (mostly leaving the additional comment to help me remember to check this next round--normally I don't leave comments on every occurence)

Done.

lib/local-execution/src/local-execution/computation_graph_instance/computation_graph_instance.cc line 236 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Here as well

Done.

lib/local-execution/src/local-execution/local_task_registry.cc line 175 at r1 (raw file):

  std::optional<TaskImplFunction> task_impl_fn =
      get_fwd_task_impl_for_op_attrs(op_attrs);
  if (!task_impl_fn) {

Done.

lib/local-execution/src/local-execution/local_task_registry.cc line 189 at r1 (raw file):

  std::optional<TaskImplFunction> task_impl_fn =
      get_bwd_task_impl_for_op_attrs(op_attrs);
  if (!task_impl_fn) {

Done.

elliottslaughter

@elliottslaughter made 1 comment.
Reviewable status: 68 of 81 files reviewed, 20 unresolved discussions (waiting on @lockshaw).

lib/local-execution/include/local-execution/computation_graph_instance/computation_graph_instance.h line 30 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Testing publishing a response

Test reply

elliottslaughter

@elliottslaughter made 1 comment.
Reviewable status: 68 of 81 files reviewed, 20 unresolved discussions (waiting on @lockshaw).

elliottslaughter

@elliottslaughter made 6 comments and resolved 1 discussion.
Reviewable status: 68 of 81 files reviewed, 19 unresolved discussions (waiting on @lockshaw).

lib/local-execution/include/local-execution/task_execution.h line 17 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Minor: Any reason for deleting all of the argument names?

I went back and forth on this one because the rest of the code seems to be inconsistent. What do you prefer?

lib/task-spec/src/task-spec/dynamic_graph/dynamic_layer_guid_t.cc line 6 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Minor: Would make the name a bit clearer

I'm not sure this is the most consistent, particularly considering the parallel case of tensors.

I feel like the two consistent options are:

layer / parallel_layer / loss, or
cg_layer / pcg_layer / loss

We used option 1 for tensors so that feels the most consistent to me, but let me know if you disagree.

lockshaw

@lockshaw reviewed 13 files, made 7 comments, and resolved 11 discussions.
Reviewable status: all files reviewed, 9 unresolved discussions (waiting on @elliottslaughter).

lib/local-execution/include/local-execution/task_execution.h line 17 at r1 (raw file):

Previously, elliottslaughter (Elliott Slaughter) wrote…

I went back and forth on this one because the rest of the code seems to be inconsistent. What do you prefer?

I have moved over to including the argument names because it enables us to use clang-tidy to check the argument name comments (i.e., the /*my_argument=*/... syntax), and because it enables adding doxygen docstrings (I don't think you can document arguments without names in doxygen, at least not easily)

lib/local-execution/test/src/local-execution/local_task_argument_accessor.cc line 73 at r1 (raw file):

Previously, elliottslaughter (Elliott Slaughter) wrote…

I'm actually not sure how you want me to do this, unless you're literally asking me to write:
SUBCASE("get_tensor(make_task_tensor_parameter_fwd(TensorSlotName::LHS_INPUT), Permissions::RO)")
Is that it or do you have a shorter version you want to see here?

You can also just use a prose name, like get_tensor for read-only forward tensor. I just don't want the subcase names to explicitly contradict the signature

lib/task-spec/src/task-spec/dynamic_graph/dynamic_layer_guid_t.cc line 6 at r1 (raw file):

Previously, elliottslaughter (Elliott Slaughter) wrote…

I'm not sure this is the most consistent, particularly considering the parallel case of tensors.

I feel like the two consistent options are:

layer / parallel_layer / loss, or

cg_layer / pcg_layer / loss

We used option 1 for tensors so that feels the most consistent to me, but let me know if you disagree.

Yeah, either is fine, I just don't like layer by itself because it's ambiguous (unfortunately we don't have a good name for a non-parallel layer that's shorter than "non_parallel_layer", hence "cg_layer"). Either parallel_layer or pcg_layer is fine to me

lib/task-spec/src/task-spec/dynamic_graph/dynamic_open_dataflow_graph.cc line 88 at r1 (raw file):

Previously, elliottslaughter (Elliott Slaughter) wrote…

See if the new version in the updated diff makes more sense.

Yup, makes much more sense. Thanks!

lib/task-spec/src/task-spec/dynamic_graph/loss_insertion.cc line 49 at r1 (raw file):

Previously, elliottslaughter (Elliott Slaughter) wrote…

I was on the fence about this. We could put loss attrs into the dynamic graph. On the other hand, optimizer attrs aren't stored in the graph, and arguably shouldn't be because they need to change on every iteration.

If you want them in the graph I'll add them.

Makes sense. I'd say either (1) include loss attrs in the graph, or (2) remove the loss_attrs argument. Either is fine with me, but having a loss_attrs argument that just isn't used feels weird.

lib/task-spec/src/task-spec/dynamic_graph/make_dynamic_open_dataflow_graph_from_cg.cc line 42 at r1 (raw file):

Previously, elliottslaughter (Elliott Slaughter) wrote…

Honestly, I don't know. As far as I'm aware everything downstream uses the ParallelTensorShape, but if you want this represented I can put it in.

Nope, if it's working for now that's fine, just wanted to raise the issue to make sure it got considered/kept in mind as we go forward. We can always change it if we discover a case where it is necessary.

lib/local-execution/src/local-execution/device_state_initialization.cc line 85 at r5 (raw file):

      });
  // FIXME: this assert fails because not all kinds of nodes require
  // initialization ASSERT(all_nodes_are_initialized(dg));

Got it, in that case it seems like having an all_nodes_are_initialized function doesn't make a whole lot of sense as it doesn't tell us anything, so maybe the answer is just to remove that function. Another option would be to keep it, but to have it incorporate what types of nodes don't need to be initialized. Either is fine, just trying to understand (and if convenient, add a check in the code) the postcondition of this function

elliottslaughter

The remaining items should be resolved now. Let me know if there is anything else.

@elliottslaughter made 7 comments and resolved 3 discussions.
Reviewable status: 69 of 90 files reviewed, 6 unresolved discussions (waiting on @lockshaw).

lib/local-execution/src/local-execution/device_state_initialization.cc line 85 at r5 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Got it, in that case it seems like having an all_nodes_are_initialized function doesn't make a whole lot of sense as it doesn't tell us anything, so maybe the answer is just to remove that function. Another option would be to keep it, but to have it incorporate what types of nodes don't need to be initialized. Either is fine, just trying to understand (and if convenient, add a check in the code) the postcondition of this function

I tried a couple different approaches, but I ultimately discovered that even if a node has an initializer task impl defined, it may nevertheless not result in any device state. Therefore, I think it's unreasonable to attempt to define this function, and I have removed it entirely.

lib/local-execution/src/local-execution/task_execution.cc line 53 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

I think this can be converted to a binary_merge_disjoint_maps of the result of a composition of a map_keys and map_values (it might actually be a good idea to make a map_keys_and_values that is simply a composition of the two, as iirc composing map_keys and map_values is a reasonable common operation in the codebase.

Done.

lib/local-execution/test/src/local-execution/local_task_argument_accessor.cc line 73 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

You can also just use a prose name, like get_tensor for read-only forward tensor. I just don't want the subcase names to explicitly contradict the signature

Done.

lib/task-spec/src/task-spec/dynamic_graph/loss_insertion.cc line 49 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Makes sense. I'd say either (1) include loss attrs in the graph, or (2) remove the loss_attrs argument. Either is fine with me, but having a loss_attrs argument that just isn't used feels weird.

I decided to put it into the graph. Rationale being: for Realm, I'm going to need to transport these anyway. Might as well put as much as I can into one place.

See if you like the new code and be sure to check out dynamic graph too.

lib/task-spec/src/task-spec/dynamic_graph/loss_insertion.cc line 66 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Especially when you're passing back tuples with multiple values of the same type, it's usually preferred to make a dtgen struct so that the fields have names, which makes code that uses this function safer to write and easier to read

Done.

lib/task-spec/src/task-spec/dynamic_graph/make_dynamic_open_dataflow_graph_from_cg.cc line 42 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Nope, if it's working for now that's fine, just wanted to raise the issue to make sure it got considered/kept in mind as we go forward. We can always change it if we discover a case where it is necessary.

Done.

lockshaw

@lockshaw reviewed 21 files and all commit messages, made 2 comments, and resolved 6 discussions.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @elliottslaughter).

lib/task-spec/include/task-spec/dynamic_graph/dynamic_node_attrs.dtg.toml line 43 at r7 (raw file):

[[fields]]
name = "loss_attrs"
type = "std::optional<::FlexFlow::LossAttrs>"

It seems like only one of op_attrs or loss_attrs will ever be defined at any given time. Is this correct and, if so, is there any reason to combine the two fields into a variant of some kind to enforce this property?

lib/utils/include/utils/containers/map_keys_and_values.h line 1 at r7 (raw file):

#ifndef _FLEXFLOW_LIB_UTILS_INCLUDE_UTILS_CONTAINERS_MAP_KEYS_AND_VALUES_H

Every .h file should have a corresponding .cc file. In this case, the contents would be explicit specializations to emulate typechecking generics, e.g., https://github.com/flexflow/flexflow-train/blob/f71c384ce5decd5a30859e39f75f1418cb76d91c/lib/utils/src/utils/containers/binary_merge_maps_with.cc

elliottslaughter · 2026-02-12T06:07:06Z

lib/task-spec/include/task-spec/dynamic_graph/dynamic_node_attrs.dtg.toml line 43 at r7 (raw file):

Previously, lockshaw (Colin Unger) wrote…

It seems like only one of op_attrs or loss_attrs will ever be defined at any given time. Is this correct and, if so, is there any reason to combine the two fields into a variant of some kind to enforce this property?

What would you call it? It's not obvious to me what the union of op and loss attrs should be called.

lockshaw · 2026-02-12T06:19:48Z

lib/task-spec/include/task-spec/dynamic_graph/dynamic_node_attrs.dtg.toml line 43 at r7 (raw file):

Previously, elliottslaughter (Elliott Slaughter) wrote…

What would you call it? It's not obvious to me what the union of op and loss attrs should be called.

You could go with TrainingNodeAttrs? In the past I've used the Training prefix to denote the presence of loss, as the loss function only makes sense in the context of training the model

elliottslaughter

@elliottslaughter made 2 comments.
Reviewable status: 75 of 92 files reviewed, 2 unresolved discussions (waiting on @lockshaw).

lib/task-spec/include/task-spec/dynamic_graph/dynamic_node_attrs.dtg.toml line 43 at r7 (raw file):

Previously, lockshaw (Colin Unger) wrote…

You could go with TrainingNodeAttrs? In the past I've used the Training prefix to denote the presence of loss, as the loss function only makes sense in the context of training the model

Done.

lib/utils/include/utils/containers/map_keys_and_values.h line 1 at r7 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Every .h file should have a corresponding .cc file. In this case, the contents would be explicit specializations to emulate typechecking generics, e.g., https://github.com/flexflow/flexflow-train/blob/f71c384ce5decd5a30859e39f75f1418cb76d91c/lib/utils/src/utils/containers/binary_merge_maps_with.cc

Done.

lockshaw

@lockshaw reviewed 17 files and all commit messages, and resolved 2 discussions.
Reviewable status: complete! all files reviewed, all discussions resolved (waiting on @elliottslaughter).

codecov · 2026-02-12T20:07:48Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 0.00%. Comparing base (b8e79cb) to head (4dfe4d6).
⚠️ Report is 1 commits behind head on master.

Additional details and impacted files

@@      Coverage Diff       @@
##   master   #1625   +/-   ##
==============================
==============================

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

elliottslaughter marked this pull request as ready for review January 29, 2026 01:00

elliottslaughter added 29 commits January 30, 2026 13:37

Rip out a bunch of stuff, stub stuff out.

c1a8ab4

Basic flow to create computation graph instance: computation graph to…

9cbf779

… equivalent dataflow, expand.

Initialized computation graph instance owns input tensors.

b3b49ba

Refactor computation graph instance as DTG. Cut out more stuff that doesn't compile yet.

Factor out dynamic_layer_guid_t and some work on make_dynamic_open_da…

4f9bb24

…taflow_graph_from_cg.

Create initialized computation graph instance.

cfe3d55

Assert on state of computation graph.

402140a

Switch to variant to track pre-allocated tensors.

f35ed75

Code to initialize ops, need task argument accessor.

91a7c7c

Fix up LocalTaskArgumentAccessor.

3e8eca6

Fix some headers.

7e979e2

Local executor builds, does not link.

c2b6e40

Rewrite to use existing tensor allocator pass.

ef6f5ec

Fix multiple definition.

3cc4669

Fix up some tests or comment them out.

82991f6

Rip out task registry class, stub the static implementation.

00dd0a1

Implement task IDs.

ad28993

Rename computation graph instance.

b0b6002

Clean up includes.

3924b42

Fix accessor test.

892cfdb

Fixes for various test cases.

bd8c3dc

Remove obsolete sections of task argument accessor test.

d2aede0

First pass attempt to generate topological order for dataflow graph.

049c06d

Finish implementation of graph execution.

fcbd7fa

Refactor and lift topo order to initialization.

78647b0

Clean up and more implementation.

4a7fd2e

Call task implementation.

91e4921

Get everything to build and link.

50a3288

Implement rest of passes.

421f813

Update step.

75f1b61

elliottslaughter commented Feb 11, 2026

View reviewed changes

Make tests pass.

60e62d3

elliottslaughter commented Feb 11, 2026

View reviewed changes

lockshaw reviewed Feb 11, 2026

View reviewed changes

elliottslaughter commented Feb 11, 2026

View reviewed changes

lockshaw requested changes Feb 11, 2026

View reviewed changes

elliottslaughter added 3 commits February 11, 2026 17:19

More responses to feedback.

678b453

Remove device initialization check, since it can't be made reliable.

ed2cb6a

More PR feedback.

170e89a

elliottslaughter commented Feb 12, 2026

View reviewed changes

lockshaw requested changes Feb 12, 2026

View reviewed changes

elliottslaughter added 2 commits February 11, 2026 23:13

Switch to variant for attrs and fix all variant dtgs.

6553688

Add cc file for map_keys_and_values.

87aa36c

elliottslaughter commented Feb 12, 2026

View reviewed changes

Missed a condition.

4dfe4d6

lockshaw approved these changes Feb 12, 2026

View reviewed changes

lockshaw merged commit 5f3a990 into flexflow:master Feb 12, 2026
4 checks passed

elliottslaughter deleted the local-execution branch February 12, 2026 20:10

Conversation

elliottslaughter commented Jan 17, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

elliottslaughter left a comment

Choose a reason for hiding this comment

Uh oh!

elliottslaughter left a comment

Choose a reason for hiding this comment

Uh oh!

elliottslaughter left a comment

Choose a reason for hiding this comment

Uh oh!

elliottslaughter left a comment

Choose a reason for hiding this comment

Uh oh!

elliottslaughter left a comment

Choose a reason for hiding this comment

Uh oh!

elliottslaughter left a comment

Choose a reason for hiding this comment

Uh oh!

elliottslaughter left a comment

Choose a reason for hiding this comment

Uh oh!

elliottslaughter left a comment

Choose a reason for hiding this comment

Uh oh!

lockshaw left a comment

Choose a reason for hiding this comment

Uh oh!

elliottslaughter left a comment

Choose a reason for hiding this comment

Uh oh!

elliottslaughter left a comment

Choose a reason for hiding this comment

Uh oh!

elliottslaughter left a comment

Choose a reason for hiding this comment

Uh oh!

elliottslaughter left a comment

Choose a reason for hiding this comment

Uh oh!

lockshaw left a comment

Choose a reason for hiding this comment

Uh oh!

elliottslaughter left a comment

Choose a reason for hiding this comment

Uh oh!

lockshaw left a comment

Choose a reason for hiding this comment

Uh oh!

elliottslaughter commented Feb 12, 2026

Uh oh!

lockshaw commented Feb 12, 2026

Uh oh!

elliottslaughter left a comment

Choose a reason for hiding this comment

Uh oh!

lockshaw left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

codecov bot commented Feb 12, 2026

Codecov Report

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

elliottslaughter commented Jan 17, 2026 •

edited

Loading