Skip to content

fix(cysql): correct edge filtering, recursive depth bounds, SQL join predicate placement, fused edge index - BED-6499#45

Open
zinic wants to merge 5 commits intomainfrom
BED-6499
Open

fix(cysql): correct edge filtering, recursive depth bounds, SQL join predicate placement, fused edge index - BED-6499#45
zinic wants to merge 5 commits intomainfrom
BED-6499

Conversation

@zinic
Copy link
Contributor

@zinic zinic commented Mar 23, 2026

Tickets

Summary by CodeRabbit

  • Bug Fixes

    • Corrected edge filtering logic in graph projections
    • Fixed test assertion handling for query execution validation
    • Updated recursive depth constraints to use exclusive upper bounds
  • New Features

    • Enhanced constraint checking for query translation optimization
  • Tests

    • Improved test reporting to track passing test cases alongside total test count
    • Updated query translation test cases with refined join predicates and filter placement
  • Chores

    • Added database indexes on edge table for optimized graph traversals
    • Upgraded Go toolchain to 1.25 and refreshed dependencies
  • Refactor

    • Optimized predicate placement in SQL joins for better query efficiency
    • Refined aggregate function handling with improved type casting for numeric values
    • Simplified internal graph traversal logic

Summary by CodeRabbit

  • Bug Fixes

    • Fixed deleted-node exclusion in graph traversal and tightened recursive depth bounds from inclusive to exclusive.
  • Improvements

    • Reworked SQL translation and join predicate placement to produce more precise queries and improved aggregation input typing for sum/avg/min/max.
    • Partitioned constraint handling to tighten joins and projections.
  • Tests

    • Improved test assertions and subtest pass/fail reporting; updated many translation test cases.
  • Chores

    • Bumped Go toolchain and dependency versions.

@coderabbitai
Copy link

coderabbitai bot commented Mar 23, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review

Walkthrough

This PR moves many node/edge predicates from WHERE into JOIN ONs, tightens recursive traversal bounds (<= → <), adds locality-aware constraint partitioning in the translator, enforces float8 casts for aggregate inputs, tweaks traversal/projection logic, updates tests/assertions, fixes a triplestore edge check, and bumps Go/toolchain deps.

Changes

Cohort / File(s) Summary
Recursive depth & expansion
cypher/models/pgsql/translate/expansion.go, cypher/models/pgsql/test/translation_cases/pattern_expansion.sql, cypher/models/pgsql/test/translation_cases/pattern_binding.sql, cypher/models/pgsql/test/translation_cases/multipart.sql
Changed recursion termination from <= N to < N across recursive CTEs and expansion logic; adjusted related primer/previous-frame handling in shortest-path/expansion builders.
Constraint locality partitioning
cypher/models/pgsql/translate/model.go, cypher/models/pgsql/translate/traversal.go, cypher/models/pgsql/translate/predicate.go
Added helpers to partition constraints into scope-local vs external (flattenConjunction, isLocalToScope, partitionConstraintByLocality); updated traversal root/step builders and predicate emission to place local constraints in JOIN ON and externals in WHERE.
Predicate migration to JOINs (tests updated)
cypher/models/pgsql/test/translation_cases/nodes.sql, cypher/models/pgsql/test/translation_cases/pattern_binding.sql, cypher/models/pgsql/test/translation_cases/stepwise_traversal.sql, cypher/models/pgsql/test/translation_cases/quantifiers.sql, cypher/models/pgsql/test/translation_cases/update.sql, cypher/models/pgsql/test/translation_cases/multipart.sql
Relocated node kind/id and endpoint identity checks into JOIN ON clauses in many generated SQL test cases; adjusted inner-query predicate placements and some unnest comparisons.
Negated pattern subquery rewrite
cypher/models/pgsql/test/translation_cases/nodes.sql
Rewrote negated edge-patterns to use not (WITH ... SELECT count(*) > 0) style subqueries, reorganizing inner joins and predicates for both directed and undirected variants.
Aggregate input casting
cypher/models/pgsql/translate/function.go, cypher/models/pgsql/test/translation_cases/scalar_aggregation.sql
Added explicit float8 casts to aggregate inputs for sum, avg, min, max in translator and updated test SQL to cast JSON extraction results to float8 before aggregation.
Expression/identity rewrite & constraint check
cypher/models/pgsql/translate/expression.go, cypher/models/pgsql/translate/predicate.go
Added HasAnyConstraints method; identity comparisons against NodeCompositeArray now use an unnest() subquery; predicate builder now calls HasAnyConstraints when checking for global constraints.
Constraint optimization & parenthesis changes
cypher/models/pgsql/translate/constraints.go
Wrapped DirectionBoth OR conditions in explicit parenthetical expressions; OptimizePatternConstraintBalance returns early when LeftNodeBound to skip selectivity measurement.
Traversal/projection & walk changes
cypher/models/pgsql/translate/projection.go, cypher/models/walk/walk_pgsql.go
Split pre/post-expansion edge reference ordering in path projection, adjusted parameter concatenation and expansion tracking; simplified SQL AST walk cursor branches for aliased expressions and row column references.
Test runner and assertions
cypher/models/pgsql/test/testcase.go, cypher/models/pgsql/test/validation_integration_test.go
Changed require.Nilrequire.NoError for driver.Run assertions; validation runner now counts total and passed subtests and prints both; build tag line modified (extra slash).
Triplestore edge deletion check
container/triplestore.go
Fixed edge exclusion predicate to correctly check End node deletion membership instead of repeating Start.
Go/toolchain and deps
go.mod
Bumped Go toolchain to 1.25.0; updated several direct and indirect module versions (e.g., cuelang.org/go, roaring/v2, gammazero/deque, golang.org/x/*).

Sequence Diagram(s)

(omitted — changes are internal translation/refactorings without a new multi-component sequential flow warranting a diagram)

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~75 minutes

Poem

🐰 I hopped through joins and moved each clause,
Depths now stop before they reach the fence,
Casts to float8 make aggregates pause,
Constraints split local from the rest — so dense!
A tiny fix, a leap — the queries prance.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 27.27% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly summarizes the main changes: edge filtering fixes, recursive depth bound corrections, SQL join predicate placement improvements, and a fused edge index addition.
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch BED-6499

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 9

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
container/triplestore.go (1)

282-290: ⚠️ Potential issue | 🔴 Critical

Bug: EachAdjacentEdge has the same duplicate next.Start check that was fixed in EachEdge.

Line 284 checks next.Start twice instead of checking both next.Start and next.End. This will incorrectly include edges whose end-node has been deleted.

🐛 Proposed fix
 func (s *triplestoreProjection) EachAdjacentEdge(node uint64, direction graph.Direction, delegate func(next Edge) bool) {
 	s.origin.EachAdjacentEdge(node, direction, func(next Edge) bool {
-		if !s.deletedEdges.Contains(next.ID) && !s.deletedNodes.Contains(next.Start) && !s.deletedNodes.Contains(next.Start) {
+		if !s.deletedEdges.Contains(next.ID) && !s.deletedNodes.Contains(next.Start) && !s.deletedNodes.Contains(next.End) {
 			return delegate(next)
 		}

 		return true
 	})
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@container/triplestore.go` around lines 282 - 290, In EachAdjacentEdge, the
condition mistakenly checks next.Start twice so edges with a deleted end node
slip through; update the predicate inside triplestoreProjection.EachAdjacentEdge
to check both s.deletedNodes.Contains(next.Start) and
s.deletedNodes.Contains(next.End) (and still check
s.deletedEdges.Contains(next.ID)), then call delegate(next) only when none are
deleted; use the same pattern as the corrected EachEdge and keep the
s.origin.EachAdjacentEdge(...) wrapper and delegate return semantics unchanged.
cypher/models/pgsql/translate/traversal.go (1)

43-60: ⚠️ Potential issue | 🟠 Major

Parenthesize the directionless endpoint predicate before conjoining local filters.

For graph.DirectionBoth, traversalStep.LeftNodeJoinCondition and traversalStep.RightNodeJoinCondition are OR expressions. Feeding them straight into OptionalAnd(...) produces SQL like name = 'n3' AND n1.id = e0.end_id OR n1.id = e0.start_id, so the second branch bypasses the local filter. The generated SQL in cypher/models/pgsql/test/translation_cases/nodes.sql (lines 189–208) already shows this regression.

🐛 Proposed fix
 		Joins: []pgsql.Join{{
 			Table: pgsql.TableReference{
 				Name:    pgsql.CompoundIdentifier{pgsql.TableNode},
 				Binding: models.OptionalValue(traversalStep.LeftNode.Identifier),
 			},
 			JoinOperator: pgsql.JoinOperator{
 				JoinType:   pgsql.JoinTypeInner,
-				Constraint: pgsql.OptionalAnd(leftJoinLocal, traversalStep.LeftNodeJoinCondition),
+				Constraint: pgsql.OptionalAnd(leftJoinLocal, pgsql.NewParenthetical(traversalStep.LeftNodeJoinCondition)),
 			},
 		}, {
 			Table: pgsql.TableReference{
 				Name:    pgsql.CompoundIdentifier{pgsql.TableNode},
 				Binding: models.OptionalValue(traversalStep.RightNode.Identifier),
 			},
 			JoinOperator: pgsql.JoinOperator{
 				JoinType:   pgsql.JoinTypeInner,
-				Constraint: pgsql.OptionalAnd(rightJoinLocal, traversalStep.RightNodeJoinCondition),
+				Constraint: pgsql.OptionalAnd(rightJoinLocal, pgsql.NewParenthetical(traversalStep.RightNodeJoinCondition)),
 			},
 		}},
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cypher/models/pgsql/translate/traversal.go` around lines 43 - 60, The bug is
that when traversalStep.LeftNodeJoinCondition or
traversalStep.RightNodeJoinCondition is an OR (e.g., for graph.DirectionBoth) it
must be parenthesized before conjoining with local filters; update the
JoinOperator.Constraint construction to call OptionalAnd(leftJoinLocal,
pgsql.ParenthesizedExpr(traversalStep.LeftNodeJoinCondition)) and
OptionalAnd(rightJoinLocal,
pgsql.ParenthesizedExpr(traversalStep.RightNodeJoinCondition)) (or the
equivalent parenthesizing helper in pgsql) so the OR branch does not bypass the
local ANDed filter.
🧹 Nitpick comments (1)
cypher/models/pgsql/translate/model.go (1)

124-130: Broaden flattenConjunction beyond pointer-form AND nodes.

Right now only *pgsql.BinaryExpression is flattened. If a conjunction reaches this helper as a value pgsql.BinaryExpression or under a Parenthetical, the whole term stays intact and join-local predicates can still get stranded in WHERE.

♻️ Proposed hardening
 func flattenConjunction(expr pgsql.Expression) []pgsql.Expression {
-	if bin, typeOK := expr.(*pgsql.BinaryExpression); !typeOK || bin.Operator != pgsql.OperatorAnd {
-		return []pgsql.Expression{expr}
-	} else {
-		return append(flattenConjunction(bin.LOperand), flattenConjunction(bin.ROperand)...)
-	}
+	switch typed := expr.(type) {
+	case *pgsql.Parenthetical:
+		return flattenConjunction(typed.Expression)
+	case *pgsql.BinaryExpression:
+		if typed.Operator == pgsql.OperatorAnd {
+			return append(flattenConjunction(typed.LOperand), flattenConjunction(typed.ROperand)...)
+		}
+	case pgsql.BinaryExpression:
+		if typed.Operator == pgsql.OperatorAnd {
+			return append(flattenConjunction(typed.LOperand), flattenConjunction(typed.ROperand)...)
+		}
+	}
+
+	return []pgsql.Expression{expr}
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cypher/models/pgsql/translate/model.go` around lines 124 - 130,
flattenConjunction only handles pointer-form *pgsql.BinaryExpression and ignores
non-pointer pgsql.BinaryExpression and Parenthetical wrappers, so update
flattenConjunction to recognize both pointer and non-pointer binary nodes and to
unwrap Parenthetical expressions before checking the operator; e.g., normalize
the expr by stripping Parenthetical layers, then use a type switch to detect
either *pgsql.BinaryExpression or pgsql.BinaryExpression with Operator ==
pgsql.OperatorAnd and recursively flatten bin.LOperand and bin.ROperand,
returning the single expr when it’s not an AND binary node.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@cypher/models/pgsql/test/translation_cases/multipart.sql`:
- Around line 27-33: The file contains a duplicated test case: two identical
Cypher->SQL blocks starting with the comment "-- case: match
(n:NodeKind1)-[:EdgeKind1*1..]->(:NodeKind2)-[:EdgeKind2]->(m:NodeKind1) ..."
and the CTE chain beginning with with s0 as (with s1 as (with recursive
s2(...))); remove one of these duplicate blocks (keep a single instance) —
locate the duplicate by the unique comment text and the CTE root symbol s0/s1/s2
and delete the redundant entire block including its trailing select s0.n2 as m
from s0 where (s0.i0 >= 10);
- Around line 76-99: Remove the duplicate SQL test blocks that repeat earlier
cases: delete the repeated case starting "match
(n:NodeKind1)<-[:EdgeKind1]-(:NodeKind2) where n.objectid ends with '-516'...",
the repeated "match (n:NodeKind1)-[:EdgeKind1]->(m:NodeKind2) where n.enabled =
true..." block, the two repeated "with \"a\" as check, \"b\" as ref ...
collect(tolower(g.samaccountname)) as refmembership, tolower(u.samaccountname)
as samname" blocks (both the variant matching EdgeKind2 with '-' and the '->'
variant), the repeated recursive path case "match p
=(n:NodeKind1)<-[r:EdgeKind1|EdgeKind2*..3]-(u:NodeKind1) ...", the repeated
"match (u:NodeKind1)-[:EdgeKind1]->(g:NodeKind2) with g match
(g)<-[:EdgeKind1]-(u:NodeKind1) return g" block, and the repeated "match
(cg:NodeKind1) where cg.name =~ ... collect (cg.email) as emails ..." block;
keep the first occurrence of each unique case (the initial s0/s1 CTE variants)
and remove subsequent duplicate CTE groups so each described case appears only
once.

In `@cypher/models/pgsql/test/translation_cases/pattern_binding.sql`:
- Around line 71-75: Two duplicate test cases are present: the blocks starting
with "-- case: match p =
(:NodeKind1)-[:EdgeKind1|EdgeKind2]->(e:NodeKind2)-[:EdgeKind2]->(:NodeKind1)
where 'a' in e.values ..." and "-- case: match p =
(n:NodeKind1)-[r]-(m:NodeKind1) return p" (which use s0/s1, e0/e1, n0/n1 and
edges_to_path) are exact copies of earlier cases and should be removed; delete
these two trailing case blocks so each scenario appears only once in the fixture
and leave the original versions earlier in the file intact.
- Line 54: The undirected endpoint OR is evaluated with wrong precedence so the
kind filter only applies to one branch; wrap the OR together and then AND the
kind check so the kind constraint applies to whichever endpoint matches.
Concretely, in the s0 subquery's JOINs for n0 and n1 (references: s0, e0, n0,
n1) change the ON clauses from "n0.kind_ids ... and n0.id = e0.end_id or n0.id =
e0.start_id" to "((n0.id = e0.end_id or n0.id = e0.start_id) and n0.kind_ids
...)" and do the analogous change for n1 so the NodeKind1 predicate is grouped
with the endpoint match.
- Line 42: The final path concatenation currently prepends the last edge by
using [(s2.e1).id] || s2.ep0; change it so the recursive path ep0 comes first
and then the last edge is appended: build the array from s0.ep0 (s1.path)
followed by the single-element array for s2.e1 before calling edges_to_path,
i.e. swap the concatenation order around the edges_to_path call that produces p
so the result is ep0 || array[(s2.e1).id], preserving forward order from s to i
to final node.

In `@cypher/models/pgsql/test/translation_cases/quantifiers.sql`:
- Around line 41-45: Remove the duplicated quantifier test blocks that repeat
the two cases beginning with the comments "-- case: MATCH (m:NodeKind1) WHERE
m.unconstraineddelegation = true ..." and "-- case: MATCH (m:NodeKind1) WHERE
ANY(name in m.serviceprincipalnames WHERE name CONTAINS "PHANTOM") ..."; locate
the repeated WITH ... select ... queries that build s0/s2 (and s0/s2 with arrays
i0/i1) and delete the second occurrence so each case appears only once,
preserving the first instance that defines s0/s2 and their final SELECT
returning m.

In `@cypher/models/pgsql/test/translation_cases/update.sql`:
- Around line 53-54: Remove the duplicate SQL test case block that begins with
the comment "-- case: match (a)-[r]->(:NodeKind1) set a.name = '123',
r.is_special_outbound = true'": the loader in testcase.go appends every "--
case:" block it finds, so keep only the first occurrence and delete the second
identical copy to avoid creating a duplicate test entry.

In `@cypher/models/pgsql/test/validation_integration_test.go`:
- Around line 68-71: Rename the misspelled variable cassesPassed to casesPassed
in the var block alongside casesRun and update every usage/reference to that
identifier in the test (e.g., the increments/assertions where cassesPassed is
read/updated); ensure the declaration reads "casesPassed" and all places that
previously referenced cassesPassed now reference casesPassed so the test
compiles and tracks passed cases correctly.

In `@drivers/pg/query/sql/schema_up.sql`:
- Around line 191-194: Add a new timestamped SQL migration that creates the two
missing indexes for existing databases: create the edge_start_kind_index on edge
(start_id, kind_id) and create the edge_end_kind_index on edge (end_id,
kind_id); ensure the migration uses the same "create index if not exists"
semantics as in schema_up.sql so it is idempotent and will be applied by
incremental migration runners, and name the file with a proper timestamp prefix
so it runs in order alongside other migrations.

---

Outside diff comments:
In `@container/triplestore.go`:
- Around line 282-290: In EachAdjacentEdge, the condition mistakenly checks
next.Start twice so edges with a deleted end node slip through; update the
predicate inside triplestoreProjection.EachAdjacentEdge to check both
s.deletedNodes.Contains(next.Start) and s.deletedNodes.Contains(next.End) (and
still check s.deletedEdges.Contains(next.ID)), then call delegate(next) only
when none are deleted; use the same pattern as the corrected EachEdge and keep
the s.origin.EachAdjacentEdge(...) wrapper and delegate return semantics
unchanged.

In `@cypher/models/pgsql/translate/traversal.go`:
- Around line 43-60: The bug is that when traversalStep.LeftNodeJoinCondition or
traversalStep.RightNodeJoinCondition is an OR (e.g., for graph.DirectionBoth) it
must be parenthesized before conjoining with local filters; update the
JoinOperator.Constraint construction to call OptionalAnd(leftJoinLocal,
pgsql.ParenthesizedExpr(traversalStep.LeftNodeJoinCondition)) and
OptionalAnd(rightJoinLocal,
pgsql.ParenthesizedExpr(traversalStep.RightNodeJoinCondition)) (or the
equivalent parenthesizing helper in pgsql) so the OR branch does not bypass the
local ANDed filter.

---

Nitpick comments:
In `@cypher/models/pgsql/translate/model.go`:
- Around line 124-130: flattenConjunction only handles pointer-form
*pgsql.BinaryExpression and ignores non-pointer pgsql.BinaryExpression and
Parenthetical wrappers, so update flattenConjunction to recognize both pointer
and non-pointer binary nodes and to unwrap Parenthetical expressions before
checking the operator; e.g., normalize the expr by stripping Parenthetical
layers, then use a type switch to detect either *pgsql.BinaryExpression or
pgsql.BinaryExpression with Operator == pgsql.OperatorAnd and recursively
flatten bin.LOperand and bin.ROperand, returning the single expr when it’s not
an AND binary node.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 522ca622-5fe3-419b-90c6-5b6f5d23b4fa

📥 Commits

Reviewing files that changed from the base of the PR and between 21473e6 and 0aae4b7.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (21)
  • container/triplestore.go
  • cypher/models/pgsql/test/testcase.go
  • cypher/models/pgsql/test/translation_cases/multipart.sql
  • cypher/models/pgsql/test/translation_cases/nodes.sql
  • cypher/models/pgsql/test/translation_cases/pattern_binding.sql
  • cypher/models/pgsql/test/translation_cases/pattern_expansion.sql
  • cypher/models/pgsql/test/translation_cases/quantifiers.sql
  • cypher/models/pgsql/test/translation_cases/scalar_aggregation.sql
  • cypher/models/pgsql/test/translation_cases/stepwise_traversal.sql
  • cypher/models/pgsql/test/translation_cases/update.sql
  • cypher/models/pgsql/test/validation_integration_test.go
  • cypher/models/pgsql/translate/constraints.go
  • cypher/models/pgsql/translate/expansion.go
  • cypher/models/pgsql/translate/expression.go
  • cypher/models/pgsql/translate/function.go
  • cypher/models/pgsql/translate/model.go
  • cypher/models/pgsql/translate/predicate.go
  • cypher/models/pgsql/translate/traversal.go
  • cypher/models/walk/walk_pgsql.go
  • drivers/pg/query/sql/schema_up.sql
  • go.mod

@zinic zinic force-pushed the BED-6499 branch 3 times, most recently from 66d540c to 44ddf42 Compare March 23, 2026 18:07
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
cypher/models/pgsql/translate/function.go (1)

309-334: ⚠️ Potential issue | 🟠 Major

Remove float8 casting from min/max arguments.

Per Neo4j's official documentation, min() and max() accept ANY value type—not just numeric values. They support strings, temporal values (DATE, DATETIME, TIME, DURATION), and other comparable types. Casting arguments to float8 violates Cypher semantics and will cause runtime failures for queries like min(n.name) or min(n.created_date). The test cases already reflect this regression by forcing temporal values through numeric casting.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cypher/models/pgsql/translate/function.go` around lines 309 - 334, The
MinFunction and MaxFunction handlers are incorrectly casting their single
argument to pgsql.Float8 via TypeCastExpression, which breaks non-numeric
inputs; update the cypher.MinFunction and cypher.MaxFunction cases to stop
calling TypeCastExpression/using pgsql.Float8 and instead push the original
popped operand as the sole parameter in the pgsql.FunctionCall (i.e., after the
NumArguments check and PopOperand call, use the popped argument directly in
treeTranslator.PushOperand with Function: pgsql.FunctionMin / pgsql.FunctionMax
and Parameters: []pgsql.Expression{argument}); keep existing error handling and
name checks intact.
♻️ Duplicate comments (2)
cypher/models/pgsql/test/translation_cases/multipart.sql (1)

32-33: ⚠️ Potential issue | 🟡 Minor

Remove the duplicated count(distinct(n)) case.

This block is a byte-for-byte copy of the case at Lines 29-30, so it adds no coverage and makes the fixture harder to maintain.

Verify by locating both identical -- case: headers; expected: this case should appear once.

#!/bin/bash
python - <<'PY'
from pathlib import Path
lines = Path("cypher/models/pgsql/test/translation_cases/multipart.sql").read_text().splitlines()
needle = "-- case: match (n:NodeKind1)-[:EdgeKind1*1..]->(:NodeKind2)-[:EdgeKind2]->(m:NodeKind1) where (n:NodeKind1 or n:NodeKind2) and n.enabled = true with m, count(distinct(n)) as p where p >= 10 return m"
for idx, line in enumerate(lines, 1):
    if line == needle:
        print(idx)
PY
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cypher/models/pgsql/test/translation_cases/multipart.sql` around lines 32 -
33, Remove the duplicated test case block that starts with the comment header
"-- case: match
(n:NodeKind1)-[:EdgeKind1*1..]->(:NodeKind2)-[:EdgeKind2]->(m:NodeKind1) where
(n:NodeKind1 or n:NodeKind2) and n.enabled = true with m, count(distinct(n)) as
p where p >= 10 return m"; specifically delete the second byte-for-byte copy of
that case (the entire WITH ... SELECT ... query that builds s0/s1/s2/s3 and
returns s0.n2 as m) so only a single instance of the case remains, then run the
provided verification script or search for the identical "-- case:" header to
confirm there is exactly one occurrence.
cypher/models/pgsql/test/translation_cases/pattern_binding.sql (1)

54-54: ⚠️ Potential issue | 🟠 Major

Parenthesize the n1 endpoint OR in the undirected case.

n0 was fixed, but n1 still parses as (NodeKind1 AND end_id) OR start_id. That lets the start_id branch bind a non-NodeKind1 node, so the expected SQL still weakens match p = (n:NodeKind1)-[r]-(m:NodeKind1).

🛠️ Suggested fix
-n1.kind_ids operator (pg_catalog.@>) array [1]::int2[] and n1.id = e0.end_id or n1.id = e0.start_id
+n1.kind_ids operator (pg_catalog.@>) array [1]::int2[] and (n1.id = e0.end_id or n1.id = e0.start_id)

Verify by inspecting the n1 join; expected: both endpoint branches should be grouped under the same NodeKind1 predicate.

#!/bin/bash
sed -n '53,54p' cypher/models/pgsql/test/translation_cases/pattern_binding.sql
rg -nP 'join node n1 on .*n1\.kind_ids.* and n1\.id = e0\.end_id or n1\.id = e0\.start_id' cypher/models/pgsql/test/translation_cases/pattern_binding.sql
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cypher/models/pgsql/test/translation_cases/pattern_binding.sql` at line 54,
The join for n1 is missing parentheses so the OR binds only to the id checks
instead of grouping both endpoint id branches under the NodeKind1 predicate;
locate the "join node n1 on n1.kind_ids operator (pg_catalog.@>) array
[1]::int2[] and n1.id = e0.end_id or n1.id = e0.start_id" expression in
pattern_binding.sql and wrap the endpoint OR with parentheses so the condition
reads like "join node n1 on n1.kind_ids operator (pg_catalog.@>) array
[1]::int2[] and (n1.id = e0.end_id or n1.id = e0.start_id)" (i.e., ensure the
kind_ids check and both id branches are grouped together).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@container/triplestore.go`:
- Around line 272-275: The projection filters are inconsistent: update
EachAdjacentEdge to mirror EachEdge by skipping edges where
s.deletedEdges.Contains(next.ID) or either endpoint is deleted (use
s.deletedNodes.Contains(next.Start) and s.deletedNodes.Contains(next.End)
instead of checking Start twice), update NumEdges to count only edges that are
not in s.deletedEdges and whose Start and End nodes are not in s.deletedNodes
(e.g., iterate s.origin.EachEdge or filter the stored count accordingly), and
ensure EachAdjacentNode (and any other edge-returning APIs) applies the same
deletedEdges/deletedNodes checks so hidden edges cannot be observed via
adjacency helpers; reference s.origin, s.deletedEdges and s.deletedNodes in each
fix.

In `@cypher/models/pgsql/test/translation_cases/multipart.sql`:
- Line 63: The n3 JOIN ON condition in CTE s3 is missing parentheses so the AND
binds tighter than the OR; update the n3 join (the one referencing n3.kind_ids
and n3.id with e1.end_id/e1.start_id) to parenthesize the end_id branch: wrap
"(n3.kind_ids operator (pg_catalog.@>) array [1]::int2[] and n3.id = e1.end_id)"
and then OR n3.id = e1.start_id, mirroring how n2's ON clause is parenthesized
so the NodeKind1 predicate applies to both endpoint branches of e1.
- Around line 29-30: The aggregate in the s3 -> s0 projection is missing
DISTINCT and overcounts; update the select that defines i0 (currently
"count((s3.n0))::int8 as i0") to use a distinct count of the node composite
(e.g., count(DISTINCT (s3.n0))::int8 as i0) so it matches the Cypher semantics
count(distinct(n)) — locate the select in the CTE that projects "select s3.n2 as
n2, count((s3.n0))::int8 as i0 from s3 group by n2" and change only the
aggregate to use DISTINCT.

In `@cypher/models/pgsql/translate/constraints.go`:
- Around line 29-43: The terminalNodeConstraint function returns an OR-based
binary expression that is not wrapped in pgsql.NewParenthetical, causing
operator-precedence bugs when combined with OptionalAnd; update
terminalNodeConstraint to wrap the OR expression in pgsql.NewParenthetical (same
as the other constraint functions handling DirectionBoth) so the combined
predicate preserves intended grouping — locate terminalNodeConstraint and ensure
the compound pgsql.NewBinaryExpression( ... OperatorOr ... ) is passed into
pgsql.NewParenthetical before returning.

---

Outside diff comments:
In `@cypher/models/pgsql/translate/function.go`:
- Around line 309-334: The MinFunction and MaxFunction handlers are incorrectly
casting their single argument to pgsql.Float8 via TypeCastExpression, which
breaks non-numeric inputs; update the cypher.MinFunction and cypher.MaxFunction
cases to stop calling TypeCastExpression/using pgsql.Float8 and instead push the
original popped operand as the sole parameter in the pgsql.FunctionCall (i.e.,
after the NumArguments check and PopOperand call, use the popped argument
directly in treeTranslator.PushOperand with Function: pgsql.FunctionMin /
pgsql.FunctionMax and Parameters: []pgsql.Expression{argument}); keep existing
error handling and name checks intact.

---

Duplicate comments:
In `@cypher/models/pgsql/test/translation_cases/multipart.sql`:
- Around line 32-33: Remove the duplicated test case block that starts with the
comment header "-- case: match
(n:NodeKind1)-[:EdgeKind1*1..]->(:NodeKind2)-[:EdgeKind2]->(m:NodeKind1) where
(n:NodeKind1 or n:NodeKind2) and n.enabled = true with m, count(distinct(n)) as
p where p >= 10 return m"; specifically delete the second byte-for-byte copy of
that case (the entire WITH ... SELECT ... query that builds s0/s1/s2/s3 and
returns s0.n2 as m) so only a single instance of the case remains, then run the
provided verification script or search for the identical "-- case:" header to
confirm there is exactly one occurrence.

In `@cypher/models/pgsql/test/translation_cases/pattern_binding.sql`:
- Line 54: The join for n1 is missing parentheses so the OR binds only to the id
checks instead of grouping both endpoint id branches under the NodeKind1
predicate; locate the "join node n1 on n1.kind_ids operator (pg_catalog.@>)
array [1]::int2[] and n1.id = e0.end_id or n1.id = e0.start_id" expression in
pattern_binding.sql and wrap the endpoint OR with parentheses so the condition
reads like "join node n1 on n1.kind_ids operator (pg_catalog.@>) array
[1]::int2[] and (n1.id = e0.end_id or n1.id = e0.start_id)" (i.e., ensure the
kind_ids check and both id branches are grouped together).

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 50f5e6d0-2550-4c3b-a239-0f3fe8aab167

📥 Commits

Reviewing files that changed from the base of the PR and between 0aae4b7 and 131b07d.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (22)
  • container/triplestore.go
  • cypher/models/pgsql/test/testcase.go
  • cypher/models/pgsql/test/translation_cases/multipart.sql
  • cypher/models/pgsql/test/translation_cases/nodes.sql
  • cypher/models/pgsql/test/translation_cases/pattern_binding.sql
  • cypher/models/pgsql/test/translation_cases/pattern_expansion.sql
  • cypher/models/pgsql/test/translation_cases/quantifiers.sql
  • cypher/models/pgsql/test/translation_cases/scalar_aggregation.sql
  • cypher/models/pgsql/test/translation_cases/stepwise_traversal.sql
  • cypher/models/pgsql/test/translation_cases/update.sql
  • cypher/models/pgsql/test/validation_integration_test.go
  • cypher/models/pgsql/translate/constraints.go
  • cypher/models/pgsql/translate/expansion.go
  • cypher/models/pgsql/translate/expression.go
  • cypher/models/pgsql/translate/function.go
  • cypher/models/pgsql/translate/model.go
  • cypher/models/pgsql/translate/predicate.go
  • cypher/models/pgsql/translate/projection.go
  • cypher/models/pgsql/translate/traversal.go
  • cypher/models/walk/walk_pgsql.go
  • drivers/pg/query/sql/schema_up.sql
  • go.mod
✅ Files skipped from review due to trivial changes (6)
  • cypher/models/pgsql/translate/expansion.go
  • cypher/models/pgsql/test/testcase.go
  • drivers/pg/query/sql/schema_up.sql
  • cypher/models/pgsql/translate/model.go
  • cypher/models/pgsql/test/translation_cases/update.sql
  • cypher/models/pgsql/test/translation_cases/pattern_expansion.sql
🚧 Files skipped from review as they are similar to previous changes (4)
  • cypher/models/pgsql/translate/predicate.go
  • cypher/models/walk/walk_pgsql.go
  • go.mod
  • cypher/models/pgsql/test/translation_cases/quantifiers.sql

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

♻️ Duplicate comments (4)
container/triplestore.go (1)

282-290: ⚠️ Potential issue | 🟠 Major

EachAdjacentEdge still checks next.Start twice.

This method has the same bug that was just fixed in EachEdge — line 284 checks next.Start twice instead of checking both next.Start and next.End.

 func (s *triplestoreProjection) EachAdjacentEdge(node uint64, direction graph.Direction, delegate func(next Edge) bool) {
 	s.origin.EachAdjacentEdge(node, direction, func(next Edge) bool {
-		if !s.deletedEdges.Contains(next.ID) && !s.deletedNodes.Contains(next.Start) && !s.deletedNodes.Contains(next.Start) {
+		if !s.deletedEdges.Contains(next.ID) && !s.deletedNodes.Contains(next.Start) && !s.deletedNodes.Contains(next.End) {
 			return delegate(next)
 		}

 		return true
 	})
 }

,

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@container/triplestore.go` around lines 282 - 290, In EachAdjacentEdge on
triplestoreProjection you’re mistakenly checking next.Start twice; update the
deletedNodes checks so the second check uses next.End (i.e., ensure the
condition is !s.deletedEdges.Contains(next.ID) &&
!s.deletedNodes.Contains(next.Start) && !s.deletedNodes.Contains(next.End)) so
edges adjacent to deleted end nodes are filtered correctly; modify the anonymous
func passed to s.origin.EachAdjacentEdge accordingly.
cypher/models/pgsql/translate/constraints.go (1)

29-43: ⚠️ Potential issue | 🔴 Critical

Wrap terminalNodeConstraint's DirectionBoth predicate in pgsql.NewParenthetical(...).

This is still the only traversal-side A OR B predicate returned without grouping. Once rightNodeTraversalStepJoinCondition is combined with other right-node filters via pgsql.OptionalAnd(...), the start_id arm can bypass those filters; the updated negative-pattern expectations in nodes.sql already show that broken ... AND ... OR ... shape.

🔧 Suggested fix
 	case graph.DirectionBoth:
-		return pgsql.NewBinaryExpression(
-			pgsql.NewBinaryExpression(
-				pgsql.CompoundIdentifier{nodeIdentifier, pgsql.ColumnID},
-				pgsql.OperatorEquals,
-				pgsql.CompoundIdentifier{edgeIdentifier, pgsql.ColumnEndID},
-			),
-			pgsql.OperatorOr,
-			pgsql.NewBinaryExpression(
-				pgsql.CompoundIdentifier{nodeIdentifier, pgsql.ColumnID},
-				pgsql.OperatorEquals,
-				pgsql.CompoundIdentifier{edgeIdentifier, pgsql.ColumnStartID},
-			),
-		), nil
+		return pgsql.NewParenthetical(
+			pgsql.NewBinaryExpression(
+				pgsql.NewBinaryExpression(
+					pgsql.CompoundIdentifier{nodeIdentifier, pgsql.ColumnID},
+					pgsql.OperatorEquals,
+					pgsql.CompoundIdentifier{edgeIdentifier, pgsql.ColumnEndID},
+				),
+				pgsql.OperatorOr,
+				pgsql.NewBinaryExpression(
+					pgsql.CompoundIdentifier{nodeIdentifier, pgsql.ColumnID},
+					pgsql.OperatorEquals,
+					pgsql.CompoundIdentifier{edgeIdentifier, pgsql.ColumnStartID},
+				),
+			),
+		), nil

Also applies to: 76-90, 152-165

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cypher/models/pgsql/translate/constraints.go` around lines 29 - 43,
terminalNodeConstraint returns a DirectionBoth predicate composed as "A OR B"
without grouping, which can escape surrounding AND filters when combined (see
rightNodeTraversalStepJoinCondition and pgsql.OptionalAnd); wrap the
DirectionBoth predicate in pgsql.NewParenthetical(...) wherever it is built
(e.g., in terminalNodeConstraint and the similar constructions at the other
locations mentioned) so the OR expression is parenthesized and cannot bypass
adjacent filters.
cypher/models/pgsql/test/translation_cases/multipart.sql (2)

29-30: ⚠️ Potential issue | 🟠 Major

Preserve the DISTINCT in this aggregate.

Line 30 still uses plain count((s3.n0)), so the same n is overcounted when it reaches the same m through multiple paths. That does not match the count(distinct(n)) case on Line 29.

Suggested fix
-select s3.n2 as n2, count((s3.n0))::int8 as i0 from s3 group by n2
+select s3.n2 as n2, count(distinct (s3.n0))::int8 as i0 from s3 group by n2
In PostgreSQL, does `count(expr)` count duplicate non-null values, while `count(DISTINCT expr)` removes duplicates?
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cypher/models/pgsql/test/translation_cases/multipart.sql` around lines 29 -
30, The COUNT aggregate lost its DISTINCT qualifier causing duplicate n values
to be overcounted; update the aggregate in the final CTE from count((s3.n0)) (in
the SELECT inside CTE s0 / s3 -> producing i0) to count(DISTINCT (s3.n0)) so it
matches the original intent count(distinct(n)) — locate the COUNT expression
producing i0 (count((s3.n0))::int8 as i0 in the s3 -> s0 selection) and add
DISTINCT to the count.

59-60: ⚠️ Potential issue | 🟠 Major

Parenthesize the n3 endpoint predicate.

Line 60 still parses as (n3.kind_ids @> ... and n3.id = e1.end_id) or n3.id = e1.start_id, so the start_id branch can bind any node kind. That weakens the g:NodeKind1 constraint for the undirected match.

Suggested fix
-join node n3 on n3.kind_ids operator (pg_catalog.@>) array [1]::int2[] and n3.id = e1.end_id or n3.id = e1.start_id
+join node n3 on n3.kind_ids operator (pg_catalog.@>) array [1]::int2[] and (n3.id = e1.end_id or n3.id = e1.start_id)
In PostgreSQL SQL operator precedence, does `AND` bind tighter than `OR` inside a `JOIN ... ON` predicate?
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cypher/models/pgsql/test/translation_cases/multipart.sql` around lines 59 -
60, The ON predicate for the join with node n3 is missing parentheses so the
expression is parsed as (n3.kind_ids @> ... AND n3.id = e1.end_id) OR n3.id =
e1.start_id, weakening the g:NodeKind1 constraint; fix by parenthesizing the id
check so the kind_ids requirement applies to both endpoints—e.g. wrap the OR as
(n3.id = e1.end_id OR n3.id = e1.start_id) or wrap the whole right-hand side so
the join clause for n3 reads n3.kind_ids operator (...) AND (n3.id = e1.end_id
OR n3.id = e1.start_id) in the CTE that builds s3 (references: n3, e1, s1, s3).
🧹 Nitpick comments (1)
cypher/models/pgsql/translate/function.go (1)

309-335: Observation: Min/Max lack outer CastType, unlike Sum/Avg.

The Min and Max functions apply the Float8 input cast but don't set CastType on the resulting FunctionCall, whereas Sum and Avg set CastType: pgsql.Numeric. This appears intentional since min()/max() preserve input type semantics (returning Float8), while sum()/avg() benefit from Numeric precision.

If this asymmetry is intentional for semantic correctness, consider adding a brief comment explaining why Min/Max don't need the outer Numeric cast.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cypher/models/pgsql/translate/function.go` around lines 309 - 335, Min and
Max handlers (cypher.MinFunction, cypher.MaxFunction) cast the argument to
Float8 via TypeCastExpression but do not set the FunctionCall.CastType (unlike
Sum/Avg which set CastType: pgsql.Numeric); add a short clarifying comment above
the Min/Max cases explaining that min()/max() intentionally preserve the input's
Float8 type and therefore do not require the outer pgsql.Numeric CastType, and
mention why Sum/Avg differ (numeric aggregation precision) to prevent future
confusion when editing the FunctionCall construction in these branches.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@cypher/models/pgsql/test/translation_cases/multipart.sql`:
- Around line 47-48: The recursive CTE s2 currently seeds at depth 1 and never
produces the zero-length path for the *0.. traversal, so nodes matching the
'%-516' predicate never populate the exclude collection; to fix, add a zero-hop
seed row to the s2 CTE (the same CTE declared as s2(root_id, next_id, depth,
satisfied, is_cycle, path)) that sets depth = 0, next_id = root_id (or both
start/end = the matching node), path = '{}' (empty array), satisfied = true when
the node matches ((n1.properties ->> 'objectid') like '%-516') and is_cycle =
true if appropriate; keep the existing recursive step unchanged so you still
collect longer paths and ensure s1/s0 aggregate will include the zero-length
node for the exclude check used later.

In `@cypher/models/pgsql/test/translation_cases/update.sql`:
- Around line 50-51: The bug is using UPDATE ... FROM ... RETURNING as row
carriers (CTEs s1/s2) which collapses one-to-many matches; instead, keep the
original match set in s0 and do updates in two phases: 1) collect matched rows
in s0 (already present), 2) perform UPDATE node AS n2 SET ... WHERE id IN
(select distinct (s0.n0).id from s0) RETURNING n2.id, n2.properties into a new
CTE (e.g., updated_nodes), then join s0 to updated_nodes on id to rehydrate one
row per original match, and finally perform the edge UPDATE using ids from that
rehydrated set (not the UPDATE returning rows) storing its RETURNING into
updated_edges; apply the same pattern for the edge update (use distinct ids from
the rehydrated rows, UPDATE edge AS e1 WHERE id IN (...) RETURNING ... into
updated_edges) so no match multiplicity is lost (refer to CTE names s0,
updated_nodes, updated_edges and tables node/edge and columns id, start_id,
end_id, properties).

In `@cypher/models/pgsql/translate/traversal.go`:
- Around line 12-28: The function buildDirectionlessTraversalPatternRoot still
always joins the left node which causes an unused joined node and
row-multiplication when traversalStep.LeftNode is already bound; mirror the
LeftNodeBound handling from buildTraversalPatternRoot: detect when
traversalStep.LeftNodeBound (or when RewriteFrameBindings can satisfy
traversalStep.LeftNodeJoinCondition from the previous frame and
traversalStep.Edge), then join the edge/table from the prior frame and avoid
materializing the left-node table here (only materialize the terminal/right node
and adjust nextSelect/Projection accordingly); update the join creation logic
that uses traversalStep.LeftNode, traversalStep.Edge,
traversalStep.LeftNodeJoinCondition and nextSelect so the left node is not added
to the FROM/JOIN list when LeftNodeBound is true and ensure constraints use the
edge+previous frame identifiers instead.

---

Duplicate comments:
In `@container/triplestore.go`:
- Around line 282-290: In EachAdjacentEdge on triplestoreProjection you’re
mistakenly checking next.Start twice; update the deletedNodes checks so the
second check uses next.End (i.e., ensure the condition is
!s.deletedEdges.Contains(next.ID) && !s.deletedNodes.Contains(next.Start) &&
!s.deletedNodes.Contains(next.End)) so edges adjacent to deleted end nodes are
filtered correctly; modify the anonymous func passed to
s.origin.EachAdjacentEdge accordingly.

In `@cypher/models/pgsql/test/translation_cases/multipart.sql`:
- Around line 29-30: The COUNT aggregate lost its DISTINCT qualifier causing
duplicate n values to be overcounted; update the aggregate in the final CTE from
count((s3.n0)) (in the SELECT inside CTE s0 / s3 -> producing i0) to
count(DISTINCT (s3.n0)) so it matches the original intent count(distinct(n)) —
locate the COUNT expression producing i0 (count((s3.n0))::int8 as i0 in the s3
-> s0 selection) and add DISTINCT to the count.
- Around line 59-60: The ON predicate for the join with node n3 is missing
parentheses so the expression is parsed as (n3.kind_ids @> ... AND n3.id =
e1.end_id) OR n3.id = e1.start_id, weakening the g:NodeKind1 constraint; fix by
parenthesizing the id check so the kind_ids requirement applies to both
endpoints—e.g. wrap the OR as (n3.id = e1.end_id OR n3.id = e1.start_id) or wrap
the whole right-hand side so the join clause for n3 reads n3.kind_ids operator
(...) AND (n3.id = e1.end_id OR n3.id = e1.start_id) in the CTE that builds s3
(references: n3, e1, s1, s3).

In `@cypher/models/pgsql/translate/constraints.go`:
- Around line 29-43: terminalNodeConstraint returns a DirectionBoth predicate
composed as "A OR B" without grouping, which can escape surrounding AND filters
when combined (see rightNodeTraversalStepJoinCondition and pgsql.OptionalAnd);
wrap the DirectionBoth predicate in pgsql.NewParenthetical(...) wherever it is
built (e.g., in terminalNodeConstraint and the similar constructions at the
other locations mentioned) so the OR expression is parenthesized and cannot
bypass adjacent filters.

---

Nitpick comments:
In `@cypher/models/pgsql/translate/function.go`:
- Around line 309-335: Min and Max handlers (cypher.MinFunction,
cypher.MaxFunction) cast the argument to Float8 via TypeCastExpression but do
not set the FunctionCall.CastType (unlike Sum/Avg which set CastType:
pgsql.Numeric); add a short clarifying comment above the Min/Max cases
explaining that min()/max() intentionally preserve the input's Float8 type and
therefore do not require the outer pgsql.Numeric CastType, and mention why
Sum/Avg differ (numeric aggregation precision) to prevent future confusion when
editing the FunctionCall construction in these branches.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 80a70886-2ff9-45fe-9fae-79d607458564

📥 Commits

Reviewing files that changed from the base of the PR and between 131b07d and 44ddf42.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (22)
  • container/triplestore.go
  • cypher/models/pgsql/test/testcase.go
  • cypher/models/pgsql/test/translation_cases/multipart.sql
  • cypher/models/pgsql/test/translation_cases/nodes.sql
  • cypher/models/pgsql/test/translation_cases/pattern_binding.sql
  • cypher/models/pgsql/test/translation_cases/pattern_expansion.sql
  • cypher/models/pgsql/test/translation_cases/quantifiers.sql
  • cypher/models/pgsql/test/translation_cases/scalar_aggregation.sql
  • cypher/models/pgsql/test/translation_cases/stepwise_traversal.sql
  • cypher/models/pgsql/test/translation_cases/update.sql
  • cypher/models/pgsql/test/validation_integration_test.go
  • cypher/models/pgsql/translate/constraints.go
  • cypher/models/pgsql/translate/expansion.go
  • cypher/models/pgsql/translate/expression.go
  • cypher/models/pgsql/translate/function.go
  • cypher/models/pgsql/translate/model.go
  • cypher/models/pgsql/translate/predicate.go
  • cypher/models/pgsql/translate/projection.go
  • cypher/models/pgsql/translate/traversal.go
  • cypher/models/walk/walk_pgsql.go
  • drivers/pg/query/sql/schema_up.sql
  • go.mod
✅ Files skipped from review due to trivial changes (5)
  • cypher/models/pgsql/test/testcase.go
  • drivers/pg/query/sql/schema_up.sql
  • cypher/models/pgsql/translate/expansion.go
  • go.mod
  • cypher/models/pgsql/translate/model.go
🚧 Files skipped from review as they are similar to previous changes (4)
  • cypher/models/walk/walk_pgsql.go
  • cypher/models/pgsql/test/translation_cases/stepwise_traversal.sql
  • cypher/models/pgsql/test/translation_cases/scalar_aggregation.sql
  • cypher/models/pgsql/test/translation_cases/pattern_binding.sql

…oin predicate placement - BED-6499

- fix: correct `triplestoreProjection.EachEdge` to check `next.End` instead of `next.Start` twice, preventing deleted end-nodes from being included in edge iteration
- fix: change recursive CTE depth condition from `<= N` to `< N` across all multipart traversal queries to enforce consistent exclusive upper-bound semantics
- fix: move `not c in exclude` filter out of the JOIN condition and into a final WHERE clause for collect/exclude path patterns, correcting result filtering
- fix: update edge accumulation in chained traversal to avoid reordering the final leg of a path
- fix: node kind grouping for bidirecitonal matches missing a parenthetical wrapper
- refactor: push node kind and property predicates from WHERE clauses into JOIN ON conditions throughout generated SQL for improved query planning
- test: replace `require.Nil` with `require.NoError` in `AssertLive` for clearer assertion semantics
- test: add additional expected SQL output cases for multipart, collect/exclude, and samaccountname translation scenarios
- chore: update go.mod, go.sum
- feat: add new btree index to accelerate fused kind and edge start or end node ID lookups
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

♻️ Duplicate comments (7)
cypher/models/pgsql/translate/constraints.go (1)

152-165: ⚠️ Potential issue | 🔴 Critical

terminalNodeConstraint still missing parenthesization for DirectionBoth.

This function returns an OR-based expression that is not wrapped in pgsql.NewParenthetical, while the analogous leftNodeConstraint and rightEdgeConstraint were fixed. When combined with other predicates via OptionalAnd, SQL operator precedence will cause incorrect filtering.

Suggested fix
 case graph.DirectionBoth:
-	return pgsql.NewBinaryExpression(
+	return pgsql.NewParenthetical(
+		pgsql.NewBinaryExpression(
 			pgsql.NewBinaryExpression(
 				pgsql.CompoundIdentifier{nodeIdentifier, pgsql.ColumnID},
 				pgsql.OperatorEquals,
 				pgsql.CompoundIdentifier{edgeIdentifier, pgsql.ColumnEndID},
 			),
 			pgsql.OperatorOr,
 			pgsql.NewBinaryExpression(
 				pgsql.CompoundIdentifier{nodeIdentifier, pgsql.ColumnID},
 				pgsql.OperatorEquals,
 				pgsql.CompoundIdentifier{edgeIdentifier, pgsql.ColumnStartID},
 			),
-		), nil
+		),
+	), nil
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cypher/models/pgsql/translate/constraints.go` around lines 152 - 165,
terminalNodeConstraint is returning an OR expression for graph.DirectionBoth
without wrapping it in pgsql.NewParenthetical, causing operator precedence
issues when combined via OptionalAnd; update the DirectionBoth branch in
terminalNodeConstraint to wrap the pgsql.NewBinaryExpression(...) OR expression
with pgsql.NewParenthetical(...) (matching how leftNodeConstraint and
rightEdgeConstraint were fixed) so the combined predicate is parenthesized
before returning.
cypher/models/pgsql/test/validation_integration_test.go (1)

68-71: ⚠️ Potential issue | 🟡 Minor

Typo in variable name: cassesPassed should be casesPassed.

🔤 Proposed fix
 		var (
 			casesRun     = 0
-			cassesPassed = 0
+			casesPassed  = 0
 		)

Also update references on lines 89 and 96.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cypher/models/pgsql/test/validation_integration_test.go` around lines 68 -
71, Rename the misspelled variable cassesPassed to casesPassed in the test scope
where it is declared and update every reference to that identifier (e.g.,
increments and assertions) so they use casesPassed instead; ensure the variable
name is consistently used throughout the test (including places that currently
reference cassesPassed).
cypher/models/pgsql/translate/traversal.go (1)

12-28: ⚠️ Potential issue | 🟠 Major

Directionless roots still need a LeftNodeBound branch.

Line 30 already detects a previous frame, but this builder still always emits edge JOIN node left JOIN node right. When the left node is already bound, that can leave the left join unused and multiply rows by node cardinality for bound directionless traversals. Mirror the bound-join shape from buildTraversalPatternRoot here.

Also applies to: 30-62

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cypher/models/pgsql/translate/traversal.go` around lines 12 - 28, The builder
buildDirectionlessTraversalPatternRoot always emits an unbound left Join even
when traversalStep.LeftNodeBound is true, causing an unused left join and row
multiplication; update this function to mirror the bound-join shape used in
buildTraversalPatternRoot: detect traversalStep.LeftNodeBound and, when true,
emit the join order that treats the left node as already bound (skip or convert
the left join into a bound-join shape) while still applying
partitionConstraintByLocality results (leftJoinLocal/leftJoinExternal and
rightJoinLocal/rightJoinExternal) and preserving nextSelect. Ensure you reuse
the same join construction logic/shape from buildTraversalPatternRoot so bound
directionless traversals do not create the extra left JOIN.
cypher/models/pgsql/test/translation_cases/multipart.sql (3)

59-60: ⚠️ Potential issue | 🟠 Major

Group the n3 endpoint disjunction.

n3.kind_ids ... and n3.id = e1.end_id or n3.id = e1.start_id only constrains the end_id branch. The undirected g:NodeKind1 match becomes too permissive on the start_id side.

Suggested fix
-join node n3 on n3.kind_ids operator (pg_catalog.@>) array [1]::int2[] and n3.id = e1.end_id or n3.id = e1.start_id
+join node n3 on n3.kind_ids operator (pg_catalog.@>) array [1]::int2[] and (n3.id = e1.end_id or n3.id = e1.start_id)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cypher/models/pgsql/test/translation_cases/multipart.sql` around lines 59 -
60, The join condition for n3 in the s3 CTE is missing parentheses around the
OR, so the "n3.kind_ids ... and n3.id = e1.end_id or n3.id = e1.start_id" only
applies the kind_ids check to the end_id branch; update the join in the s3 CTE
(the "join node n3 on ..." clause) to group the disjunction so the kind_ids
check applies to both id branches — e.g., change it to require n3.kind_ids ...
AND (n3.id = e1.end_id OR n3.id = e1.start_id) so the undirected match for
g:NodeKind1 is correctly constrained.

47-48: ⚠️ Potential issue | 🟠 Major

*0.. still needs a zero-hop seed.

s2 still starts at depth 1, so the zero-length dc = g match never reaches exclude. That changes the final not c in exclude filter whenever a -516 node should exclude itself.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cypher/models/pgsql/test/translation_cases/multipart.sql` around lines 47 -
48, The recursive CTE s2 currently seeds at depth 1 so zero-hop matches are
omitted; modify the s2 seed to include a zero-length seed (root_id = e0.end_id,
next_id = e0.end_id, depth = 0, satisfied = true, is_cycle = false, path = '{}'
or empty array) or add an explicit initial SELECT that represents the zero-hop
case before the existing union select, so that s0 (and ultimately exclude)
contains the dc nodes equal to g; update any downstream assumptions in s0/s3
that expect non-empty path accordingly.

29-30: ⚠️ Potential issue | 🟠 Major

Use DISTINCT in this aggregate.

The case is count(distinct(n)), but the fixture still expects plain count((s3.n0)). That overcounts duplicate n bindings that reach the same m through multiple paths and blesses the wrong translation.

Suggested fix
-select s3.n2 as n2, count((s3.n0))::int8 as i0 from s3 group by n2
+select s3.n2 as n2, count(distinct (s3.n0))::int8 as i0 from s3 group by n2
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cypher/models/pgsql/test/translation_cases/multipart.sql` around lines 29 -
30, The aggregate in the s3→s0 pipeline currently uses count((s3.n0)) which
overcounts duplicates; change the aggregation in the query fragment "select
s3.n2 as n2, count((s3.n0))::int8 as i0 from s3 group by n2" to use a DISTINCT
qualifier on s3.n0 (e.g. count(distinct (s3.n0))::int8 as i0) so the computed i0
matches count(distinct(n)) and downstream filters in s0 behave correctly.
cypher/models/pgsql/test/translation_cases/pattern_binding.sql (1)

53-54: ⚠️ Potential issue | 🟠 Major

The n1 join still loses precedence.

n0 is parenthesized now, but n1.kind_ids ... and n1.id = e0.end_id or n1.id = e0.start_id still lets the start_id branch skip the NodeKind1 filter.

Suggested fix
-join node n1 on n1.kind_ids operator (pg_catalog.@>) array [1]::int2[] and n1.id = e0.end_id or n1.id = e0.start_id
+join node n1 on n1.kind_ids operator (pg_catalog.@>) array [1]::int2[] and (n1.id = e0.end_id or n1.id = e0.start_id)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cypher/models/pgsql/test/translation_cases/pattern_binding.sql` around lines
53 - 54, The join for n1 loses precedence because the OR isn't parenthesized,
allowing the "n1.id = e0.start_id" branch to bypass the kind_ids filter; update
the n1 join condition to mirror n0 by grouping the kind_ids check with the id
disjunction, e.g. change "join node n1 on n1.kind_ids operator (pg_catalog.@>)
array [1]::int2[] and n1.id = e0.end_id or n1.id = e0.start_id" to use
parentheses so the condition reads like "(n1.kind_ids operator (pg_catalog.@>)
array [1]::int2[] and (n1.id = e0.end_id or n1.id = e0.start_id))", ensuring the
NodeKind1 filter applies to both id branches.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@cypher/models/pgsql/test/translation_cases/nodes.sql`:
- Around line 189-190: terminalNodeConstraint's graph.DirectionBoth branch
currently returns an unwrapped BinaryExpression (OR) which breaks operator
precedence when combined with ANDed property/kind filters; change that branch to
wrap the BinaryExpression in pgsql.NewParenthetical(), mirroring the pattern
used in leftNodeConstraint and rightEdgeConstraint so expressions like
"(n1.properties ->> 'name') = 'n3' and (n1.id = e0.end_id or n1.id =
e0.start_id)" are correctly grouped; update the graph.DirectionBoth case inside
terminalNodeConstraint to return pgsql.NewParenthetical(<the existing
BinaryExpression>) instead of the raw BinaryExpression.

In `@cypher/models/walk/walk_pgsql.go`:
- Around line 255-258: The RowColumnReference branch only walks
typedNode.Identifier but must also walk the column identifier; in the case
handling for pgsql.RowColumnReference (returning &Cursor[pgsql.SyntaxNode]{Node:
node, Branches: []pgsql.SyntaxNode{typedNode.Identifier}}) update the Branches
list to include typedNode.Column as well (i.e., include both
typedNode.Identifier and typedNode.Column) so identifier-collection and rewrite
passes see the column side of row.column references.
- Around line 92-96: The traversal for pgsql.AliasedExpression currently only
adds typedNode.Expression to Cursor.Branches and omits the Alias, causing alias
identifiers to be skipped; update both case handlers for
*pgsql.AliasedExpression (the ones returning &Cursor[pgsql.SyntaxNode]{...}
around the shown diff and the similar block at lines ~104-108) to include
typedNode.Alias in the Branches slice (e.g., Branches:
[]pgsql.SyntaxNode{typedNode.Expression, typedNode.Alias}) so the generic walker
will visit the Alias (pgsql.Identifier) as well.

---

Duplicate comments:
In `@cypher/models/pgsql/test/translation_cases/multipart.sql`:
- Around line 59-60: The join condition for n3 in the s3 CTE is missing
parentheses around the OR, so the "n3.kind_ids ... and n3.id = e1.end_id or
n3.id = e1.start_id" only applies the kind_ids check to the end_id branch;
update the join in the s3 CTE (the "join node n3 on ..." clause) to group the
disjunction so the kind_ids check applies to both id branches — e.g., change it
to require n3.kind_ids ... AND (n3.id = e1.end_id OR n3.id = e1.start_id) so the
undirected match for g:NodeKind1 is correctly constrained.
- Around line 47-48: The recursive CTE s2 currently seeds at depth 1 so zero-hop
matches are omitted; modify the s2 seed to include a zero-length seed (root_id =
e0.end_id, next_id = e0.end_id, depth = 0, satisfied = true, is_cycle = false,
path = '{}' or empty array) or add an explicit initial SELECT that represents
the zero-hop case before the existing union select, so that s0 (and ultimately
exclude) contains the dc nodes equal to g; update any downstream assumptions in
s0/s3 that expect non-empty path accordingly.
- Around line 29-30: The aggregate in the s3→s0 pipeline currently uses
count((s3.n0)) which overcounts duplicates; change the aggregation in the query
fragment "select s3.n2 as n2, count((s3.n0))::int8 as i0 from s3 group by n2" to
use a DISTINCT qualifier on s3.n0 (e.g. count(distinct (s3.n0))::int8 as i0) so
the computed i0 matches count(distinct(n)) and downstream filters in s0 behave
correctly.

In `@cypher/models/pgsql/test/translation_cases/pattern_binding.sql`:
- Around line 53-54: The join for n1 loses precedence because the OR isn't
parenthesized, allowing the "n1.id = e0.start_id" branch to bypass the kind_ids
filter; update the n1 join condition to mirror n0 by grouping the kind_ids check
with the id disjunction, e.g. change "join node n1 on n1.kind_ids operator
(pg_catalog.@>) array [1]::int2[] and n1.id = e0.end_id or n1.id = e0.start_id"
to use parentheses so the condition reads like "(n1.kind_ids operator
(pg_catalog.@>) array [1]::int2[] and (n1.id = e0.end_id or n1.id =
e0.start_id))", ensuring the NodeKind1 filter applies to both id branches.

In `@cypher/models/pgsql/test/validation_integration_test.go`:
- Around line 68-71: Rename the misspelled variable cassesPassed to casesPassed
in the test scope where it is declared and update every reference to that
identifier (e.g., increments and assertions) so they use casesPassed instead;
ensure the variable name is consistently used throughout the test (including
places that currently reference cassesPassed).

In `@cypher/models/pgsql/translate/constraints.go`:
- Around line 152-165: terminalNodeConstraint is returning an OR expression for
graph.DirectionBoth without wrapping it in pgsql.NewParenthetical, causing
operator precedence issues when combined via OptionalAnd; update the
DirectionBoth branch in terminalNodeConstraint to wrap the
pgsql.NewBinaryExpression(...) OR expression with pgsql.NewParenthetical(...)
(matching how leftNodeConstraint and rightEdgeConstraint were fixed) so the
combined predicate is parenthesized before returning.

In `@cypher/models/pgsql/translate/traversal.go`:
- Around line 12-28: The builder buildDirectionlessTraversalPatternRoot always
emits an unbound left Join even when traversalStep.LeftNodeBound is true,
causing an unused left join and row multiplication; update this function to
mirror the bound-join shape used in buildTraversalPatternRoot: detect
traversalStep.LeftNodeBound and, when true, emit the join order that treats the
left node as already bound (skip or convert the left join into a bound-join
shape) while still applying partitionConstraintByLocality results
(leftJoinLocal/leftJoinExternal and rightJoinLocal/rightJoinExternal) and
preserving nextSelect. Ensure you reuse the same join construction logic/shape
from buildTraversalPatternRoot so bound directionless traversals do not create
the extra left JOIN.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 4781e1bd-b6b7-4dc8-ac9a-f19be8ead553

📥 Commits

Reviewing files that changed from the base of the PR and between 44ddf42 and 1967dc9.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (21)
  • container/triplestore.go
  • cypher/models/pgsql/test/testcase.go
  • cypher/models/pgsql/test/translation_cases/multipart.sql
  • cypher/models/pgsql/test/translation_cases/nodes.sql
  • cypher/models/pgsql/test/translation_cases/pattern_binding.sql
  • cypher/models/pgsql/test/translation_cases/pattern_expansion.sql
  • cypher/models/pgsql/test/translation_cases/quantifiers.sql
  • cypher/models/pgsql/test/translation_cases/scalar_aggregation.sql
  • cypher/models/pgsql/test/translation_cases/stepwise_traversal.sql
  • cypher/models/pgsql/test/translation_cases/update.sql
  • cypher/models/pgsql/test/validation_integration_test.go
  • cypher/models/pgsql/translate/constraints.go
  • cypher/models/pgsql/translate/expansion.go
  • cypher/models/pgsql/translate/expression.go
  • cypher/models/pgsql/translate/function.go
  • cypher/models/pgsql/translate/model.go
  • cypher/models/pgsql/translate/predicate.go
  • cypher/models/pgsql/translate/projection.go
  • cypher/models/pgsql/translate/traversal.go
  • cypher/models/walk/walk_pgsql.go
  • go.mod
✅ Files skipped from review due to trivial changes (4)
  • cypher/models/pgsql/translate/expansion.go
  • cypher/models/pgsql/test/testcase.go
  • cypher/models/pgsql/translate/model.go
  • go.mod
🚧 Files skipped from review as they are similar to previous changes (5)
  • container/triplestore.go
  • cypher/models/pgsql/translate/projection.go
  • cypher/models/pgsql/test/translation_cases/stepwise_traversal.sql
  • cypher/models/pgsql/test/translation_cases/update.sql
  • cypher/models/pgsql/test/translation_cases/quantifiers.sql

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (1)
cypher/models/pgsql/translate/traversal.go (1)

30-60: ⚠️ Potential issue | 🟠 Major

Still missing bound-node handling in the directionless root.

Lines 38-60 still always materialize both node tables. When traversalStep.LeftNodeBound or traversalStep.RightNodeBound is true, the corresponding join can be satisfied from the previous frame plus edge, so the extra JOIN node ... stops constraining that alias and multiplies rows. Mirror the bound-node branching from buildTraversalPatternRoot here instead of always joining both n0 and n1.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cypher/models/pgsql/translate/traversal.go` around lines 30 - 60, The current
traversal root always adds JOINs for both node tables (pgsql.TableNode) in
nextSelect.From causing row multiplication when a node is bound; update the
logic in the block that builds nextSelect.From (the code that appends the
pgsql.FromClause with TableEdge and its Joins) to mirror the branching in
buildTraversalPatternRoot: check traversalStep.LeftNodeBound and
traversalStep.RightNodeBound and, for any bound side, avoid materializing the
corresponding JOIN node table and instead satisfy that join from the previous
frame + edge (use previousValidFrame(traversalStep.Frame) and the edge binding),
while for unbound sides retain the existing JOIN with JoinOperator.Constraint
set to pgsql.OptionalAnd(leftJoinLocal, traversalStep.LeftNodeJoinCondition) or
pgsql.OptionalAnd(rightJoinLocal, traversalStep.RightNodeJoinCondition)
respectively; ensure you still set the TableReference.Binding to
traversalStep.LeftNode.Identifier / traversalStep.RightNode.Identifier where
appropriate.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@cypher/models/pgsql/translate/traversal.go`:
- Around line 30-60: The current traversal root always adds JOINs for both node
tables (pgsql.TableNode) in nextSelect.From causing row multiplication when a
node is bound; update the logic in the block that builds nextSelect.From (the
code that appends the pgsql.FromClause with TableEdge and its Joins) to mirror
the branching in buildTraversalPatternRoot: check traversalStep.LeftNodeBound
and traversalStep.RightNodeBound and, for any bound side, avoid materializing
the corresponding JOIN node table and instead satisfy that join from the
previous frame + edge (use previousValidFrame(traversalStep.Frame) and the edge
binding), while for unbound sides retain the existing JOIN with
JoinOperator.Constraint set to pgsql.OptionalAnd(leftJoinLocal,
traversalStep.LeftNodeJoinCondition) or pgsql.OptionalAnd(rightJoinLocal,
traversalStep.RightNodeJoinCondition) respectively; ensure you still set the
TableReference.Binding to traversalStep.LeftNode.Identifier /
traversalStep.RightNode.Identifier where appropriate.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 94c39b79-23cf-4d07-8359-01aedfb06286

📥 Commits

Reviewing files that changed from the base of the PR and between 1967dc9 and 82b228a.

📒 Files selected for processing (4)
  • cypher/models/pgsql/test/translation_cases/multipart.sql
  • cypher/models/pgsql/test/validation_integration_test.go
  • cypher/models/pgsql/translate/expansion.go
  • cypher/models/pgsql/translate/traversal.go
✅ Files skipped from review due to trivial changes (1)
  • cypher/models/pgsql/translate/expansion.go
🚧 Files skipped from review as they are similar to previous changes (2)
  • cypher/models/pgsql/test/validation_integration_test.go
  • cypher/models/pgsql/test/translation_cases/multipart.sql

@zinic zinic force-pushed the BED-6499 branch 8 times, most recently from 37fbefe to e7e8f51 Compare March 23, 2026 20:33
Copy link
Contributor

@urangel urangel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The scope locality utilities are great! Thanks for all of the comments and fixes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants