
feat: BED-7498 batch node update#38

Open
bsheth711 wants to merge 20 commits into main from BED-7498-batch-node-update

Conversation

Contributor

@bsheth711 bsheth711 commented Mar 4, 2026

This PR adds the ability to batch update nodes.

A couple of things to note:

  • I ended up going with the UpdateNodes(nodes []*Node) error signature for the Batch interface because it allows two different implementations for updating, depending on the number of nodes being updated. If the update is below a certain threshold, LargeNodeUpdateThreshold (currently set to 1M nodes), we do the typical batched SQL UPDATE...SET.... Above that threshold, we instead do a COPY...FROM... into a temporary table for the transaction, and then MERGE INTO the nodes table. This seems to be marginally faster for 1M+ updates and should scale even better for much larger updates.
  • I added an optional BatchOption parameter to the BatchOperation function signature that allows callers to set a BatchWriteSize for the BatchOperation.
  • I added a helper function in ops.go, func UpdateNodes(ctx context.Context, graphDB graph.Database, nodes []*graph.Node, batchSize ...int) error, that I think should be the first tool users of DAWGS reach for when updating nodes.
  • I added a helper function (s *Node) StripAllPropertiesExcept(except ...string). If a developer is working with fully hydrated nodes in memory but only changes one property before updating all of them, removing all but the properties we care about before sending the nodes to the DB cuts down on a lot of network traffic, and the DB avoids parsing unnecessary information.
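
The keep-list stripping idea above can be sketched with a minimal, self-contained stand-in for graph.Properties. The real type also tracks deleted keys and other state; stripAllExcept and the map representation here are purely illustrative.

```go
package main

import "fmt"

// stripAllExcept is a hypothetical stand-in for Node.StripAllPropertiesExcept:
// it keeps only the properties named in the keep-list so that far less data
// travels over the wire when updating many fully hydrated nodes.
func stripAllExcept(props map[string]any, keep ...string) map[string]any {
	kept := make(map[string]any, len(keep))
	for _, key := range keep {
		if value, ok := props[key]; ok {
			kept[key] = value
		}
	}
	return kept
}

func main() {
	props := map[string]any{"name": "n1", "heat": 0.9, "objectid": "abc"}
	fmt.Println(stripAllExcept(props, "heat")) // only "heat" survives
}
```

The same shape applies on the real Node: strip before handing the slice to the batch update, so serialization cost is proportional to the changed properties rather than the full hydrated node.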

Summary by CodeRabbit

  • New Features

    • Batched bulk node updates for Neo4j and PostgreSQL with per-operation batch-size options.
    • Large-scale staging pipeline to speed PostgreSQL node updates.
    • Public APIs to trigger batched node updates and a Node method to strip properties selectively.
    • Utility to render deleted property keys as a formatted string.
  • Tests

    • New unit and integration tests covering batching logic, large-update staging, query formatting, and node-update behaviors.

@bsheth711 bsheth711 self-assigned this Mar 4, 2026
@bsheth711 bsheth711 added the go Pull requests that update go code label Mar 4, 2026

coderabbitai bot commented Mar 4, 2026

Walkthrough

Adds batch UpdateNodes plumbing and BatchOption support; implements Neo4j dual-buffer and hashing-keyed batched updates; adds PostgreSQL large-node bulk-update via staging + COPY FROM + MERGE; adds cypher utility to stringify deleted properties; updates drivers, graph API, ops, and tests.

Changes

Cohort / File(s) Summary
Cypher PGSQL utils & tests
cypher/models/pgsql/type.go, cypher/models/pgsql/type_test.go
Added DeletedPropertiesToString(properties *graph.Properties) string to format deleted property keys as a quoted set; added unit tests covering empty, single, multiple, and non-deleted cases.
Neo4j: batching internals & tests
drivers/neo4j/batch.go, drivers/neo4j/cypher.go, drivers/neo4j/transaction.go, drivers/neo4j/cypher_internal_test.go, drivers/neo4j/batch_integration_test.go
Introduced UpdateNodes, added nodeUpdateBuffer and per-id buffer, hashing-based updateKey grouping (xxhash), batched UNWIND Cypher generation, new flush paths, buffer flush logic, and unit+integration tests.
Postgres: large-node bulk update & SQL formatters
drivers/pg/batch.go, drivers/pg/query/format.go, drivers/pg/query/format_test.go
Added large-update pipeline (staging table, COPY FROM, MERGE), LargeNodeUpdateRows, NodeUpdateParameters, LargeNodeUpdateThreshold, SQL formatting helpers and tests for generated SQL and staging columns.
Batch API, options & propagation
graph/graph.go, graph/graph_test.go, graph/switch.go, drivers/neo4j/driver.go, drivers/pg/driver.go
Extended Batch with UpdateNodes, added BatchConfig, BatchOption, WithBatchSize, and changed BatchOperation to accept variadic options; options forwarded through switch and drivers; tests for option helper.
Node utilities & tests
graph/node.go, graph/node_test.go
Added StripAllPropertiesExcept(...string) on Node and corresponding tests exercising keep-lists and deleted-property interactions.
Ops wrapper
ops/ops.go
Added UpdateNodes(ctx, graphDB, nodes, batchSize...) helper that invokes graphDB.BatchOperation(...).
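
The variadic option plumbing described in the Batch API cohort follows the standard Go functional-options pattern. The sketch below reuses the names BatchConfig, BatchOption, and WithBatchSize from the summary; the default batch size and applyOptions helper are assumptions for illustration.

```go
package main

import "fmt"

// BatchConfig holds per-operation batch settings.
type BatchConfig struct {
	BatchSize int
}

// BatchOption mutates a BatchConfig before the batch starts.
type BatchOption func(*BatchConfig)

// WithBatchSize overrides the write size for a single BatchOperation.
func WithBatchSize(size int) BatchOption {
	return func(c *BatchConfig) {
		c.BatchSize = size
	}
}

// applyOptions folds options over a default config, as a driver might do
// at the start of BatchOperation(ctx, delegate, options...).
func applyOptions(options ...BatchOption) BatchConfig {
	config := BatchConfig{BatchSize: 2000} // assumed default
	for _, option := range options {
		option(&config)
	}
	return config
}

func main() {
	fmt.Println(applyOptions(WithBatchSize(500)).BatchSize)
	fmt.Println(applyOptions().BatchSize)
}
```

Because the options are variadic, existing BatchOperation call sites compile unchanged, which is why the signature change is backward compatible.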

Sequence Diagram(s)

sequenceDiagram
    participant App
    participant Driver
    participant Batch
    participant Txn as Transaction
    participant DB as Database

    App->>Driver: BatchOperation(ctx, delegate, options...)
    Driver->>Driver: apply BatchOptions (BatchSize)
    Driver->>DB: open connection / start transaction
    Driver->>Batch: new batch (config.BatchSize)
    Driver->>Batch: delegate(batch)

    App->>Batch: UpdateNodes([node1,node2,...])
    Batch->>Batch: accumulate in nodeUpdateBuffer / nodeUpdateByBuffer
    Batch->>Batch: compute updateKey (xxhash) and group params
    Batch->>Batch: if threshold exceeded -> flushNodeUpdateBuffer

    alt Neo4j batched path
        Batch->>Txn: cypherBuildNodeUpdateQueryBatch(params)
        Txn->>DB: EXECUTE UNWIND-based update statements
        DB->>Txn: ack
        Txn->>Batch: logWrites
    else per-id path
        Batch->>Txn: flushNodeUpdateByBuffer -> execute per-id updates
    end

    Driver->>Batch: Commit()
    Batch->>Txn: flush remaining buffers
    Txn->>DB: commit
    DB->>Driver: commit ack
    Driver->>App: return
sequenceDiagram
    participant App
    participant Driver
    participant Batch
    participant Staging as StagingTable
    participant Txn as Transaction
    participant DB as Database

    App->>Driver: BatchOperation(ctx, delegate)
    Driver->>DB: open connection
    Driver->>Batch: new batch

    App->>Batch: UpdateNodes([>LargeNodeUpdateThreshold nodes])
    Batch->>Batch: detect large batch -> largeUpdate flow
    Batch->>Staging: CREATE TEMP TABLE (staging)
    Batch->>Staging: COPY FROM staging (stream rows)
    Staging->>Batch: rows loaded
    Batch->>Txn: Format MERGE from staging -> target node table
    Txn->>DB: EXECUTE MERGE (update kind_ids / properties)
    DB->>Txn: ack
    Txn->>Staging: ON COMMIT DROP (cleanup)
    Driver->>App: return
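
The threshold dispatch and buffer-flush flow in the diagrams can be sketched with a chunked buffer that never grows past the configured write size. All names here (nodeBatch, updateNodes, flush) are illustrative stand-ins; ints stand in for *graph.Node, and the large-update branch is stubbed out.

```go
package main

import "fmt"

// Illustrative mirror of LargeNodeUpdateThreshold from the PR.
const largeNodeUpdateThreshold = 1_000_000

type nodeBatch struct {
	batchWriteSize int
	buffer         []int // stands in for []*graph.Node
	flushes        []int // records each flush size, for demonstration
}

// updateNodes appends in a loop and flushes whenever the buffer fills,
// so a single large call cannot bypass the configured batch size.
func (b *nodeBatch) updateNodes(nodes []int) {
	if len(nodes) > largeNodeUpdateThreshold {
		// A real driver would take the staging COPY FROM + MERGE path here.
		return
	}
	for _, node := range nodes {
		b.buffer = append(b.buffer, node)
		if len(b.buffer) >= b.batchWriteSize {
			b.flush()
		}
	}
}

// flush stands in for flushNodeUpdateBuffer: it would execute the batched
// UNWIND/UPDATE statements, then reset the buffer.
func (b *nodeBatch) flush() {
	b.flushes = append(b.flushes, len(b.buffer))
	b.buffer = b.buffer[:0]
}

func main() {
	b := &nodeBatch{batchWriteSize: 3}
	b.updateNodes([]int{1, 2, 3, 4, 5, 6, 7})
	fmt.Println(b.flushes, len(b.buffer)) // [3 3] 1
}
```

The final Commit() in the diagrams corresponds to one last flush of whatever remains in the buffer.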

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Poem

🐰 I hopped through buffers, hashed each key,
I staged a table beneath a willow tree,
UNWIND and COPY danced beneath the moon,
Kinds and props updated — a soft tune,
A rabbit's patchwork of batch harmony.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Docstring Coverage — ⚠️ Warning — Docstring coverage is 6.25%, below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Description Check — ✅ Passed — Check skipped; CodeRabbit's high-level summary is enabled.
Title Check — ✅ Passed — The title 'feat: BED-7498 batch node update' directly and clearly describes the main change: adding batch node update capability across multiple drivers and the graph interface.


Contributor

@superlinkx superlinkx left a comment


No major logical concerns; I'd like to see a couple of small things patched up, but otherwise this looks good.

@bsheth711 bsheth711 marked this pull request as ready for review March 17, 2026 23:24
@bsheth711
Contributor Author

@coderabbitai full review


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 5

♻️ Duplicate comments (1)
drivers/neo4j/cypher.go (1)

261-263: ⚠️ Potential issue | 🔴 Critical

Bug: Iterating AddedKinds twice instead of DeletedKinds.

The second loop incorrectly iterates over node.AddedKinds.Strings() when it should iterate over node.DeletedKinds.Strings(). This causes nodes with different DeletedKinds but identical AddedKinds to be incorrectly grouped into the same batch, leading to incorrect label removal operations.

🐛 Proposed fix
-	for _, removedKindStr := range node.AddedKinds.Strings() {
+	for _, removedKindStr := range node.DeletedKinds.Strings() {
 		kindSet[removedKindStr] = struct{}{}
 	}
🧹 Nitpick comments (6)
ops/ops.go (2)

606-611: Consider using var declaration for empty slice.

Minor style suggestion for idiomatic Go initialization.

♻️ Proposed change
 func UpdateNodes(ctx context.Context, graphDB graph.Database, nodes []*graph.Node, batchSize ...int) error {
-	options := []graph.BatchOption{}
+	var options []graph.BatchOption

 	if len(batchSize) > 0 {
 		options = append(options, graph.WithBatchSize(batchSize[0]))
 	}

613-619: Simplify the return statement.

The if-else block checking the error can be simplified by returning the error directly.

♻️ Proposed simplification
-	if err := graphDB.BatchOperation(ctx, func(batch graph.Batch) error {
+	return graphDB.BatchOperation(ctx, func(batch graph.Batch) error {
 		return batch.UpdateNodes(nodes)
-	}, options...); err != nil {
-		return err
-	}
-
-	return nil
+	}, options...)
 }
drivers/neo4j/cypher.go (2)

316-327: Missing empty kind check when removing node labels.

Same concern as above - should skip graph.EmptyKind to maintain consistency with the relationship update function and avoid generating invalid Cypher statements.

♻️ Proposed fix
 		if len(batch.nodeKindsToRemove) > 0 {
 			output.WriteString(" remove ")

 			for idx, kindToRemove := range batch.nodeKindsToRemove {
+				if kindToRemove == graph.EmptyKind {
+					continue
+				}
 				if idx > 0 {
 					output.WriteString(",")
 				}

 				output.WriteString("n:")
 				output.WriteString(kindToRemove.String())
 			}
 		}

Note: With this change, you'll also need to track whether any non-empty kinds have been written to correctly handle the comma separator logic.


309-314: Missing empty kind check when adding node labels.

Unlike cypherBuildRelationshipUpdateQueryBatch (lines 174-180) which explicitly skips graph.EmptyKind, this function doesn't check for empty kinds before writing labels. This inconsistency could generate invalid Cypher if an empty kind is present.

♻️ Proposed fix
 		if len(batch.nodeKindsToAdd) > 0 {
 			for _, kindToAdd := range batch.nodeKindsToAdd {
+				if kindToAdd == graph.EmptyKind {
+					continue
+				}
 				output.WriteString(", n:")
 				output.WriteString(kindToAdd.String())
 			}
 		}
drivers/neo4j/batch_integration_test.go (1)

23-25: Avoid overriding environment variables in init().

The init() function unconditionally sets NEO4J_CONNECTION, which will override any value set externally (e.g., in CI/CD pipelines). Consider removing this or making it conditional.

♻️ Proposed fix: Only set if not already configured
 func init() {
-	os.Setenv(Neo4jConnectionStringEnv, "neo4j://neo4j:neo4jj@localhost:7687")
+	if os.Getenv(Neo4jConnectionStringEnv) == "" {
+		os.Setenv(Neo4jConnectionStringEnv, "neo4j://neo4j:neo4jj@localhost:7687")
+	}
 }
graph/graph.go (1)

286-288: Consider expanding the UpdateNodes documentation.

The comment "Updates nodes by ID" is brief. Consider documenting behavior details like whether nodes must have valid IDs, what properties are updated, and how kinds modifications (AddedKinds/DeletedKinds) are handled.

📝 Suggested documentation improvement
-	// Updates nodes by ID
+	// UpdateNodes updates multiple nodes in batch by their IDs. Each node's modified properties,
+	// added kinds, and deleted kinds will be persisted. Nodes must have valid IDs assigned.
 	UpdateNodes(nodes []*Node) error

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 44bf29f5-9c5f-4a1b-8c7d-058900fa49e6

📥 Commits

Reviewing files that changed from the base of the PR and between f9904ca and d9faf72.

📒 Files selected for processing (18)
  • cypher/models/pgsql/type.go
  • cypher/models/pgsql/type_test.go
  • drivers/neo4j/batch.go
  • drivers/neo4j/batch_integration_test.go
  • drivers/neo4j/cypher.go
  • drivers/neo4j/cypher_internal_test.go
  • drivers/neo4j/driver.go
  • drivers/neo4j/transaction.go
  • drivers/pg/batch.go
  • drivers/pg/driver.go
  • drivers/pg/query/format.go
  • drivers/pg/query/format_test.go
  • graph/graph.go
  • graph/graph_test.go
  • graph/node.go
  • graph/node_test.go
  • graph/switch.go
  • ops/ops.go


coderabbitai bot commented Mar 17, 2026

✅ Actions performed

Full review triggered.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 4

♻️ Duplicate comments (6)
graph/node.go (1)

121-131: ⚠️ Potential issue | 🟠 Major

Add a nil guard before dereferencing s.Properties.

At Line 125 and Line 129, s.Properties is used without a nil check. A node created with nil properties will panic here.

🛡️ Proposed fix
 func (s *Node) StripAllPropertiesExcept(except ...string) {
 	newProperties := NewProperties()
+	if s.Properties == nil {
+		s.Properties = newProperties
+		return
+	}
 
 	for _, exclusion := range except {
 		if s.Properties.Exists(exclusion) {
 			newProperties.Set(exclusion, s.Properties.Get(exclusion).Any())
 		}
drivers/pg/batch.go (2)

199-202: ⚠️ Potential issue | 🟠 Major

Flush pending batch buffers before entering largeUpdate.

At Line 200, the code branches directly to largeUpdate. Buffered writes already queued in this batch may not be applied yet, causing ordering inconsistencies for same-batch operations.

🧩 Proposed fix
 func (s *batch) UpdateNodes(nodes []*graph.Node) error {
 	if len(nodes) > LargeNodeUpdateThreshold {
+		if err := s.tryFlush(0); err != nil {
+			return err
+		}
 		return s.largeUpdate(nodes)
 	}

114-125: ⚠️ Potential issue | 🟠 Major

Large-update path still materializes the full COPY payload in memory.

At Line 114-Line 125 and Line 145-Line 197, all rows are encoded into [][]any before CopyFrom. At the 1,000,000 threshold this can create large memory spikes/OOM risk.

♻️ Suggested direction (stream rows lazily)
- nodeRows := NewLargeNodeUpdateRows(len(nodes))
- if err := nodeRows.AppendAll(s.ctx, nodes, s.schemaManager, s.kindIDEncoder); err != nil {
- 	return err
- }
+ nodeSource := NewLargeNodeUpdateCopySource(s.ctx, nodes, s.schemaManager, s.kindIDEncoder)

  if _, err := tx.Conn().CopyFrom(
  	s.ctx,
  	pgx.Identifier{sql.NodeUpdateStagingTable},
  	sql.NodeUpdateStagingColumns,
- 	pgx.CopyFromRows(nodeRows.Rows()),
+ 	nodeSource,
  ); err != nil {
  	return fmt.Errorf("copying nodes into staging table: %w", err)
  }

Add a pgx.CopyFromSource implementation to encode one row at a time.

Also applies to: 145-197

drivers/neo4j/batch_integration_test.go (1)

125-127: ⚠️ Potential issue | 🟠 Major

Assertion is inverted for HeatProperty after update.

HeatProperty is deleted before batch.UpdateNodes(nodes), so this should assert absence, not presence.

🐛 Proposed fix
-				for _, node := range nodes {
-					assert.True(t, node.Properties.Exists(HeatProperty))
-					assert.Equal(t, true, node.Properties.Get(NewProperty).Any())
-				}
+				for _, node := range nodes {
+					assert.False(t, node.Properties.Exists(HeatProperty), "HeatProperty should have been deleted")
+					assert.Equal(t, true, node.Properties.Get(NewProperty).Any())
+				}
drivers/neo4j/batch.go (1)

61-66: ⚠️ Potential issue | 🟠 Major

UpdateNodes currently bypasses batchWriteSize on large slices.

Appending all nodes before checking the threshold allows one call to exceed configured batch limits.

🐛 Proposed fix
 func (s *batchTransaction) UpdateNodes(nodes []*graph.Node) error {
-	s.nodeUpdateBuffer = append(s.nodeUpdateBuffer, nodes...)
-
-	if len(s.nodeUpdateBuffer) > s.batchWriteSize {
-		return s.flushNodeUpdateBuffer()
-	}
-
-	return nil
+	for _, node := range nodes {
+		s.nodeUpdateBuffer = append(s.nodeUpdateBuffer, node)
+
+		if len(s.nodeUpdateBuffer) >= s.batchWriteSize {
+			if err := s.flushNodeUpdateBuffer(); err != nil {
+				return err
+			}
+		}
+	}
+
+	return nil
 }
drivers/neo4j/cypher.go (1)

261-263: ⚠️ Potential issue | 🔴 Critical

Bug: Iterating over AddedKinds instead of DeletedKinds.

This should iterate over node.DeletedKinds to correctly compute the update key. Currently, nodes with different DeletedKinds but identical AddedKinds will be batched together, causing incorrect label removals.

🐛 Proposed fix
-	for _, removedKindStr := range node.AddedKinds.Strings() {
+	for _, removedKindStr := range node.DeletedKinds.Strings() {
 		kindSet[removedKindStr] = struct{}{}
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@drivers/neo4j/cypher.go` around lines 261 - 263, The code is populating
kindSet from the wrong field: change the loop that currently iterates
node.AddedKinds.Strings() to iterate node.DeletedKinds.Strings() so the update
key reflects label removals correctly; specifically update the loop that assigns
to kindSet (the one using node.AddedKinds.Strings()) to use
node.DeletedKinds.Strings() instead so nodes are grouped by their deleted kinds
when computing the update key.
🧹 Nitpick comments (1)
drivers/pg/query/format.go (1)

124-125: Avoid exposing mutable global staging column state.

At Line 125, NodeUpdateStagingColumns is exported as a mutable slice. External mutation can silently break COPY column alignment at runtime. Prefer an unexported backing value with a getter that returns a copy.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@drivers/pg/query/format.go` around lines 124 - 125, Replace the exported
mutable slice NodeUpdateStagingColumns with an unexported backing variable
(e.g., nodeUpdateStagingColumns) and add an exported getter function
NodeUpdateStagingColumns() []string that returns a copy of the slice;
specifically, rename the current var to nodeUpdateStagingColumns and implement
NodeUpdateStagingColumns() to allocate a new slice, copy the contents of
nodeUpdateStagingColumns into it, and return that copy to prevent external
mutation from breaking COPY column alignment.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@cypher/models/pgsql/type.go`:
- Around line 60-69: DeletedPropertiesToString currently calls
properties.DeletedProperties() without guarding for a nil receiver which will
panic; update DeletedPropertiesToString to return "{}" immediately if properties
== nil (mirroring the nil-check used by PropertiesToJSONB/MapOrEmpty), then
proceed to call properties.DeletedProperties(), build the quoted slice and join
as before; also add a unit test that calls DeletedPropertiesToString(nil) and
expects "{}" to ensure nil-safety.

In `@drivers/neo4j/batch_integration_test.go`:
- Around line 23-25: The init() function unconditionally calls
os.Setenv(Neo4jConnectionStringEnv, "...") which overrides any existing
NEO4J_CONNECTION value; remove that unconditional override and instead only set
a default when the env var is empty (e.g., check
os.Getenv(Neo4jConnectionStringEnv) and only call os.Setenv if it's ""), or
delete the init() entirely so tests respect CI/local configuration; update the
code around the init() function and references to Neo4jConnectionStringEnv
accordingly.

In `@graph/graph.go`:
- Around line 369-372: The WithBatchSize option currently sets
BatchConfig.BatchSize unconditionally; guard against non-positive values by
validating the incoming size in WithBatchSize and only assigning it when size >
0 (otherwise leave the existing config.BatchSize unchanged or set a sensible
default). Update the WithBatchSize(func(config *BatchConfig)) to check size <= 0
and return early (or set to a default constant) so BatchConfig.BatchSize is
never set to a non-positive value; reference BatchConfig.BatchSize,
WithBatchSize, and BatchOption when making the change.

In `@ops/ops.go`:
- Around line 609-611: The code silently ignores extra variadic batchSize
values; update the handling around batchSize to explicitly reject multiple
values: if len(batchSize) > 1 return an error (e.g., fmt.Errorf("only one
batchSize allowed, got %d", len(batchSize))) so misuse is visible, and keep the
existing branch that appends graph.WithBatchSize(batchSize[0]) when exactly one
value is provided; modify the enclosing function signature to return an error if
it doesn't already so you can propagate this validation failure.

---

Duplicate comments:
In `@drivers/neo4j/batch_integration_test.go`:
- Around line 125-127: The test currently asserts that HeatProperty exists on
each node after calling batch.UpdateNodes(nodes), but HeatProperty is deleted
before the update; change the assertion to assert that HeatProperty does not
exist by replacing assert.True(t, node.Properties.Exists(HeatProperty)) with an
assertion that it is false (e.g., assert.False or equivalent) while keeping the
NewProperty assertion (assert.Equal(t, true,
node.Properties.Get(NewProperty).Any())) intact; locate the check in the loop
over nodes in the test that calls batch.UpdateNodes(nodes).

In `@drivers/neo4j/batch.go`:
- Around line 61-66: UpdateNodes currently appends the entire nodes slice to
s.nodeUpdateBuffer and only then checks s.batchWriteSize, which can allow a
single call to exceed the batch limit; modify UpdateNodes to append and flush in
chunks so the buffer never grows beyond s.batchWriteSize by repeatedly appending
up to the remaining capacity and calling s.flushNodeUpdateBuffer() as needed
(i.e., loop over the input slice, add a portion to s.nodeUpdateBuffer, flush
when len(s.nodeUpdateBuffer) >= s.batchWriteSize, then continue until all nodes
are consumed), referencing UpdateNodes, s.nodeUpdateBuffer, s.batchWriteSize and
flushNodeUpdateBuffer.

In `@drivers/neo4j/cypher.go`:
- Around line 261-263: The code is populating kindSet from the wrong field:
change the loop that currently iterates node.AddedKinds.Strings() to iterate
node.DeletedKinds.Strings() so the update key reflects label removals correctly;
specifically update the loop that assigns to kindSet (the one using
node.AddedKinds.Strings()) to use node.DeletedKinds.Strings() instead so nodes
are grouped by their deleted kinds when computing the update key.

In `@drivers/pg/batch.go`:
- Around line 199-202: The UpdateNodes method on type batch currently jumps to
s.largeUpdate when node count exceeds LargeNodeUpdateThreshold without flushing
any buffered writes; before invoking s.largeUpdate(nodes) ensure pending batch
buffers are flushed (e.g., call the existing flush/commit method on batch such
as s.flush or s.flushPendingBuffers) so earlier queued writes are applied and
ordering is preserved, then proceed to call s.largeUpdate(nodes).
- Around line 114-125: The current large-update path builds a [][]any payload
via NewLargeNodeUpdateRows + nodeRows.AppendAll and then passes nodeRows.Rows()
into tx.Conn().CopyFrom, which materializes all rows in memory; instead
implement a pgx.CopyFromSource that streams/encodes one row at a time (e.g., add
a type LargeNodeUpdateSource that implements CopyFromSource with Next(),
Values(), Err()), change NewLargeNodeUpdateRows (or add a factory like
NewLargeNodeUpdateSource) to return that source, and call
tx.Conn().CopyFrom(s.ctx, pgx.Identifier{sql.NodeUpdateStagingTable},
sql.NodeUpdateStagingColumns, largeNodeUpdateSource) so rows are produced
lazily; apply the same pattern to the other large-update block that currently
uses nodeRows.Rows().

In `@graph/node.go`:
- Around line 121-131: The method StripAllPropertiesExcept on struct Node
dereferences s.Properties without checking for nil, which will panic for nodes
with nil Properties; update Node.StripAllPropertiesExcept to first guard that
s.Properties is non-nil (e.g., return early or initialize newProperties
appropriately) before calling s.Properties.Exists, s.Properties.Get, or
accessing s.Properties.Deleted, so that NewProperties, property copying
(Exists/Get/Any/Set) and Delete logic only run when s.Properties is present.

---

Nitpick comments:
In `@drivers/pg/query/format.go`:
- Around line 124-125: Replace the exported mutable slice
NodeUpdateStagingColumns with an unexported backing variable (e.g.,
nodeUpdateStagingColumns) and add an exported getter function
NodeUpdateStagingColumns() []string that returns a copy of the slice;
specifically, rename the current var to nodeUpdateStagingColumns and implement
NodeUpdateStagingColumns() to allocate a new slice, copy the contents of
nodeUpdateStagingColumns into it, and return that copy to prevent external
mutation from breaking COPY column alignment.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 07233c5b-1a0e-4303-b794-a862fc49d584

📥 Commits

Reviewing files that changed from the base of the PR and between f9904ca and d9faf72.

📒 Files selected for processing (18)
  • cypher/models/pgsql/type.go
  • cypher/models/pgsql/type_test.go
  • drivers/neo4j/batch.go
  • drivers/neo4j/batch_integration_test.go
  • drivers/neo4j/cypher.go
  • drivers/neo4j/cypher_internal_test.go
  • drivers/neo4j/driver.go
  • drivers/neo4j/transaction.go
  • drivers/pg/batch.go
  • drivers/pg/driver.go
  • drivers/pg/query/format.go
  • drivers/pg/query/format_test.go
  • graph/graph.go
  • graph/graph_test.go
  • graph/node.go
  • graph/node_test.go
  • graph/switch.go
  • ops/ops.go

@coderabbitai coderabbitai bot left a comment

🧹 Nitpick comments (3)
graph/graph.go (1)

286-288: Minor doc comment convention.

Go doc comments for interface methods should start with the method name. Consider:

-	// Updates nodes by ID
+	// UpdateNodes updates nodes by ID.
 	UpdateNodes(nodes []*Node) error
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@graph/graph.go` around lines 286 - 288, The interface method comment "Updates
nodes by ID" should follow Go doc conventions and start with the method name;
update the comment to begin with "UpdateNodes ..." so it reads like "UpdateNodes
updates nodes by ID" (or similar), placing it immediately above the UpdateNodes
method declaration in the interface to satisfy GoDoc style and linters.
ops/ops.go (1)

617-624: Simplify the return statement.

The if err != nil { return err }; return nil pattern can be replaced by directly returning the result.

♻️ Suggested simplification
-	if err := graphDB.BatchOperation(ctx, func(batch graph.Batch) error {
+	return graphDB.BatchOperation(ctx, func(batch graph.Batch) error {
 		return batch.UpdateNodes(nodes)
-	}, options...); err != nil {
-		return err
-	}
-
-	return nil
+	}, options...)
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ops/ops.go` around lines 617 - 624, The code calls
graphDB.BatchOperation(...) and currently does "if err := ...; err != nil {
return err } return nil"; replace that pattern by directly returning the result
of the BatchOperation call. Locate the block containing
graphDB.BatchOperation(ctx, func(batch graph.Batch) error { return
batch.UpdateNodes(nodes) }, options...) and remove the conditional check so the
function simply returns the BatchOperation call's error value.
graph/node.go (1)

115-140: LGTM! Nil guard addressed; minor naming clarity suggestion.

The implementation correctly handles the nil Properties case and the logic for preserving specified properties is sound. Direct map access on s.Properties.Deleted is safe in Go for nil maps.

One optional refinement: the loop variable exclusion is semantically inverted—these are properties to keep (excluded from removal), not properties being excluded. A name like propertyToKeep or key would improve readability.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@graph/node.go` around lines 115 - 140, The loop variable name "exclusion" in
the StripAllPropertiesExcept method is misleading because it represents keys to
keep; rename it to something clearer (e.g., propertyToKeep or key) to improve
readability: update the for loop header and all uses inside the loop (the
Exists/Get/Delete calls) inside the StripAllPropertiesExcept function to use the
new identifier, leaving all logic unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@graph/graph.go`:
- Around line 286-288: The interface method comment "Updates nodes by ID" should
follow Go doc conventions and start with the method name; update the comment to
begin with "UpdateNodes ..." so it reads like "UpdateNodes updates nodes by ID"
(or similar), placing it immediately above the UpdateNodes method declaration in
the interface to satisfy GoDoc style and linters.

In `@graph/node.go`:
- Around line 115-140: The loop variable name "exclusion" in the
StripAllPropertiesExcept method is misleading because it represents keys to
keep; rename it to something clearer (e.g., propertyToKeep or key) to improve
readability: update the for loop header and all uses inside the loop (the
Exists/Get/Delete calls) inside the StripAllPropertiesExcept function to use the
new identifier, leaving all logic unchanged.

In `@ops/ops.go`:
- Around line 617-624: The code calls graphDB.BatchOperation(...) and currently
does "if err := ...; err != nil { return err } return nil"; replace that pattern
by directly returning the result of the BatchOperation call. Locate the block
containing graphDB.BatchOperation(ctx, func(batch graph.Batch) error { return
batch.UpdateNodes(nodes) }, options...) and remove the conditional check so the
function simply returns the BatchOperation call's error value.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 0f34a6b0-f49d-4083-8cf3-0edb4fa04dbb

📥 Commits

Reviewing files that changed from the base of the PR and between d9faf72 and 21082bd.

📒 Files selected for processing (4)
  • cypher/models/pgsql/type.go
  • graph/graph.go
  • graph/node.go
  • ops/ops.go

@zinic zinic force-pushed the BED-7498-batch-node-update branch from ff7a3d6 to 1ac2011 Compare March 19, 2026 00:51
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (2)
drivers/neo4j/cypher.go (2)

308-313: Consider adding an EmptyKind check for consistency.

The relationship update function (cypherBuildRelationshipUpdateQueryBatch, lines 175-176) skips graph.EmptyKind when adding labels, but this function does not. If AddedKinds could ever contain an empty kind, this would produce invalid Cypher like n:.

🔧 Proposed fix for defensive consistency
 if len(batch.nodeKindsToAdd) > 0 {
     for _, kindToAdd := range batch.nodeKindsToAdd {
+        if kindToAdd == nil || kindToAdd.Is(graph.EmptyKind) {
+            continue
+        }
         output.WriteString(", n:")
         output.WriteString(kindToAdd.String())
     }
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@drivers/neo4j/cypher.go` around lines 308 - 313, The loop that appends node
labels from batch.nodeKindsToAdd can emit an empty label (`n:`) if a kind equals
graph.EmptyKind; update the logic in cypherBuildNodeUpdateQueryBatch (the block
iterating over batch.nodeKindsToAdd) to skip any kind where kindToAdd ==
graph.EmptyKind before calling kindToAdd.String(), mirroring the check used in
cypherBuildRelationshipUpdateQueryBatch, so no empty label tokens are written to
output.

315-326: Same EmptyKind consideration applies to removed kinds.

For symmetry with the relationship batch logic and defensive coding, consider adding the same guard here.

🔧 Proposed fix
 if len(batch.nodeKindsToRemove) > 0 {
     output.WriteString(" remove ")

+    firstRemoval := true
     for idx, kindToRemove := range batch.nodeKindsToRemove {
-        if idx > 0 {
+        if kindToRemove == nil || kindToRemove.Is(graph.EmptyKind) {
+            continue
+        }
+        if !firstRemoval {
             output.WriteString(",")
+        } else {
+            firstRemoval = false
         }

         output.WriteString("n:")
         output.WriteString(kindToRemove.String())
     }
 }

Note: If empty kinds are filtered earlier in the pipeline and cannot reach this function, this is purely defensive and can be skipped.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@drivers/neo4j/cypher.go` around lines 315 - 326, The loop that appends
removed node kinds should defensively skip EmptyKind entries like the
relationship batch logic does; in the block iterating batch.nodeKindsToRemove,
add a guard that continues the loop when kindToRemove == EmptyKind (or
equivalent sentinel) so you don't emit "n:" for empty kinds, ensuring you still
handle comma placement correctly when some entries are skipped; use the
batch.nodeKindsToRemove identifier and kindToRemove.String() to locate and
adjust the loop.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@drivers/neo4j/cypher_internal_test.go`:
- Line 50: Fix the typo in the comment containing "Digest 3 and 4 should have a
distinct update key from digests 1 and 2 becaus the query" by changing "becaus"
to "because" so the sentence reads correctly; locate that comment in
cypher_internal_test.go (search for the exact comment text) and update it in
place (no behavior changes required).

---

Nitpick comments:
In `@drivers/neo4j/cypher.go`:
- Around line 308-313: The loop that appends node labels from
batch.nodeKindsToAdd can emit an empty label (`n:`) if a kind equals
graph.EmptyKind; update the logic in cypherBuildNodeUpdateQueryBatch (the block
iterating over batch.nodeKindsToAdd) to skip any kind where kindToAdd ==
graph.EmptyKind before calling kindToAdd.String(), mirroring the check used in
cypherBuildRelationshipUpdateQueryBatch, so no empty label tokens are written to
output.
- Around line 315-326: The loop that appends removed node kinds should
defensively skip EmptyKind entries like the relationship batch logic does; in
the block iterating batch.nodeKindsToRemove, add a guard that continues the loop
when kindToRemove == EmptyKind (or equivalent sentinel) so you don't emit "n:"
for empty kinds, ensuring you still handle comma placement correctly when some
entries are skipped; use the batch.nodeKindsToRemove identifier and
kindToRemove.String() to locate and adjust the loop.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 4c2d5175-a592-400c-8b4b-45bc2d28dddd

📥 Commits

Reviewing files that changed from the base of the PR and between 21082bd and ff7a3d6.

📒 Files selected for processing (2)
  • drivers/neo4j/cypher.go
  • drivers/neo4j/cypher_internal_test.go

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@drivers/neo4j/batch_integration_test.go`:
- Line 64: Fix the typo in the inline comment that currently reads "Regsiter a
cleanup step to wipe the database" by changing "Regsiter" to "Register" so the
comment reads "Register a cleanup step to wipe the database"; update the comment
near the cleanup registration code (the comment containing the exact text
"Regsiter a cleanup step to wipe the database") to correct the spelling.

In `@drivers/neo4j/cypher.go`:
- Around line 308-326: The node batch Cypher generation must skip EmptyKind to
avoid producing labels like "n:" or "remove n:"; update the loops that iterate
batch.nodeKindsToAdd and batch.nodeKindsToRemove in the function building the
Cypher (where output.WriteString is used) to check for and continue on kinds
equal to EmptyKind (the same guard used in the relationship batch logic),
ensuring only non-empty kind.String() values are appended; this mirrors the
handling in the AddKinds/DeleteKinds flow and prevents invalid Cypher when
AddKinds()/DeleteKinds() leave EmptyKind in the slices.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: daae105c-679c-4884-aeeb-f986099ee261

📥 Commits

Reviewing files that changed from the base of the PR and between ff7a3d6 and 1ac2011.

📒 Files selected for processing (3)
  • drivers/neo4j/batch_integration_test.go
  • drivers/neo4j/cypher.go
  • drivers/neo4j/cypher_internal_test.go

@bsheth711 bsheth711 requested a review from superlinkx March 19, 2026 16:40
Contributor

@zinic zinic left a comment

I'm impressed at the plumbing of the copy table method. I figured it would be too ugly to plumb to make it approachable but I can see that I was wrong. This looks really tight.

I'm good with this approach on the PG side, and the Neo4j side is patched up, but I'll wait for one last stamp before this goes in.

Labels

go Pull requests that update go code

3 participants